Oliver Hofmann 317c7f0340 Add Docker production build and update README

- Multi-stage Dockerfile: builds frontend, packages with Python backend
- admin.py serves frontend/dist as StaticFiles in production
- docker-entrypoint.sh runs proxy + admin-api, exits cleanly if either dies
- .dockerignore excludes .env, venv, tests, node_modules
- Split requirements.txt (prod) / requirements-dev.txt (dev+test)
- aiofiles added for StaticFiles support
- start.sh: port checks before startup, venv auto-activation, trap cleanup
- vite.config.js: clearScreen disabled
- README rewritten to reflect current architecture

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-28 08:34:45 +02:00

README.md

Ollama Proxy with API Keys and Quotas

A reverse proxy for Ollama with API-key authentication, quota management, and a web admin interface.

Features

API-key authentication (Bearer token or `sk-` prefix)
Optional expiry date per API key
Quota management with separate daily and monthly limits (tokens and requests)
Token counting via tiktoken; quota windows reset on day and month boundaries in the Europe/Berlin time zone
Web admin interface (manage API keys, Ollama settings, proxy info)
OpenAI-compatible /v1/chat/completions endpoint
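
The time-zone-aware reset boundaries mentioned above can be illustrated with a short sketch. It is not taken from the project's code: the function name `period_starts` is hypothetical, and it only assumes that daily and monthly windows begin at local midnight in Europe/Berlin.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: mirrors the APP_TZ setting; the real project may read it from .env.
APP_TZ = ZoneInfo("Europe/Berlin")


def period_starts(now: datetime) -> tuple[datetime, datetime]:
    """Return the start of the current day and of the current month in APP_TZ."""
    # Convert to local wall-clock time first, so the boundary is Berlin
    # midnight rather than UTC midnight.
    local = now.astimezone(APP_TZ)
    day_start = local.replace(hour=0, minute=0, second=0, microsecond=0)
    month_start = day_start.replace(day=1)
    return day_start, month_start
```

Calling `period_starts(datetime.now(timezone.utc))` returns the two timestamps from which usage would be summed. A request at 22:30 UTC on a summer evening already falls into the next Berlin day, which is why the conversion happens before truncating to midnight.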

Security

Admin interface is password-protected (ADMIN_PASSWORD)
Admin API binds to 127.0.0.1 when run locally (not reachable from outside)
API keys are stored in the DB as SHA-256 hashes; the plaintext is shown only once, at creation
Quota check is atomic via SELECT FOR UPDATE (no TOCTOU race)
CORS origins are configurable via ALLOWED_ORIGINS
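
As an illustration of the hash-only storage described above, here is a minimal sketch using only the standard library. The function names and the key length are assumptions for the example, not the project's actual implementation.

```python
import hashlib
import hmac
import secrets


def hash_api_key(plaintext: str) -> str:
    """Return the SHA-256 hex digest that would be stored instead of the key."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def generate_api_key() -> tuple[str, str]:
    """Return (plaintext key, digest). The plaintext is shown once and never stored."""
    # Assumption: keys carry the sk- prefix followed by random URL-safe text.
    plaintext = "sk-" + secrets.token_urlsafe(32)
    return plaintext, hash_api_key(plaintext)


def key_matches(presented: str, stored_digest: str) -> bool:
    """Compare digests in constant time to avoid leaking timing information."""
    return hmac.compare_digest(hash_api_key(presented), stored_digest)
```

Because only the digest is persisted, a leaked database does not directly reveal usable keys, and a lost key has to be replaced rather than recovered.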

Configuration

Create a .env file in the project directory (template: .env.example):

ADMIN_PASSWORD=change-me
PROXY_HOST=0.0.0.0
PROXY_PORT=8000
ADMIN_PORT=8001
DATABASE_URL=sqlite:///./test.db
OLLAMA_URL=http://localhost:11434
DEFAULT_MODEL=llama3
APP_TZ=Europe/Berlin

Variable	Default	Description
`ADMIN_PASSWORD`	(none)	Password for the admin interface (required)
`PROXY_HOST`	`0.0.0.0`	Bind address of the proxy
`PROXY_PORT`	`8000`	Port of the proxy
`ADMIN_PORT`	`8001`	Port of the admin API
`DATABASE_URL`	`sqlite:///./test.db`	Database connection string
`OLLAMA_URL`	`http://localhost:11434`	Address of the Ollama instance (can also be changed in the UI)
`DEFAULT_MODEL`	`llama3`	Default model for `/v1/chat/completions` (can also be changed in the UI)
`APP_TZ`	`Europe/Berlin`	Time zone for daily/monthly quota resets
`ALLOWED_ORIGINS`	`http://localhost:5173`	Comma-separated CORS origins
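
The following sketch shows one way such variables could be read with the defaults from the table. The `Settings` class and `load_settings` function are hypothetical names for illustration; the project's own loading code may look different.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    admin_password: str
    proxy_host: str
    proxy_port: int
    admin_port: int
    database_url: str
    ollama_url: str
    default_model: str
    app_tz: str
    allowed_origins: list[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, using the documented defaults."""
    password = env.get("ADMIN_PASSWORD", "")
    if not password:
        # ADMIN_PASSWORD has no default and is required.
        raise ValueError("ADMIN_PASSWORD must be set")
    origins = env.get("ALLOWED_ORIGINS", "http://localhost:5173")
    return Settings(
        admin_password=password,
        proxy_host=env.get("PROXY_HOST", "0.0.0.0"),
        proxy_port=int(env.get("PROXY_PORT", "8000")),
        admin_port=int(env.get("ADMIN_PORT", "8001")),
        database_url=env.get("DATABASE_URL", "sqlite:///./test.db"),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        default_model=env.get("DEFAULT_MODEL", "llama3"),
        app_tz=env.get("APP_TZ", "Europe/Berlin"),
        # Split the comma-separated list and drop empty entries.
        allowed_origins=[o.strip() for o in origins.split(",") if o.strip()],
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.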

Development (local)

cp .env.example .env
# Set ADMIN_PASSWORD in .env

./start.sh

The script checks whether the required ports are already in use, automatically activates an existing .venv, initializes the database, and starts the proxy, the admin API, and the Vite dev server.
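
start.sh itself is a shell script. The idea behind its port check can be sketched in Python as follows; the helper name `port_in_use` is hypothetical, and the ports 8000, 8001, and 5173 are assumed from the defaults listed in this README.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports: dict[str, int]) -> list[str]:
    """Return the names of all services whose port is already taken."""
    return [name for name, port in ports.items() if port_in_use(port)]
```

A pre-start check would then be `busy_ports({"proxy": 8000, "admin": 8001, "vite": 5173})`, aborting with a message if the returned list is not empty.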

Admin interface: http://localhost:5173

Prerequisites

Python 3.12+ with virtualenv
Node.js 18+

python -m venv .venv
source .venv/bin/activate
pip install -r backend/requirements-dev.txt

cd frontend && npm install

Production (Docker)

Build the image

docker build -t llm-quota .

Start the container

docker run -d \
  -p 8000:8000 \
  -p 8001:8001 \
  -e ADMIN_PASSWORD=geheim \
  -e OLLAMA_URL=http://host.docker.internal:11434 \
  -e DATABASE_URL=sqlite:///./data/quota.db \
  -v "$(pwd)/data:/app/backend/data" \
  --name llm-quota \
  llm-quota

Port	Service
`8000`	Proxy (for LLM clients)
`8001`	Admin API + admin interface

Admin interface: http://localhost:8001

With PostgreSQL

docker run -d \
  -p 8000:8000 \
  -p 8001:8001 \
  -e ADMIN_PASSWORD=geheim \
  -e DATABASE_URL=postgresql://user:pass@db-host:5432/llm_quota \
  -e OLLAMA_URL=http://ollama:11434 \
  llm-quota

Note: Inside the container, the admin API binds to 0.0.0.0. Port 8001 should not be exposed publicly. Either protect it with a firewall or run it behind a reverse proxy (nginx, Caddy).

Proxy endpoints (port 8000)

All endpoints require a valid API key in the Authorization header.

curl -X POST http://localhost:8000/api/chat \
  -H "Authorization: Bearer sk-xxxxxx" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3","messages":[{"role":"user","content":"Hallo"}]}'

Endpoint	Method	Description
`/api/generate`	POST	Ollama generate
`/api/chat`	POST	Ollama chat
`/api/tags`	GET	Available models
`/api/versions`	GET	Ollama version
`/v1/models`	GET	Models (OpenAI format)
`/v1/chat/completions`	POST	Chat (OpenAI format)
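
For clients written in Python, a request to the OpenAI-style endpoint can be sketched with the standard library alone. The helper names are hypothetical, and the response layout (`choices[0].message.content`) is assumed from the OpenAI chat format rather than confirmed for this proxy.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, prompt: str,
                       model: str = "llama3") -> urllib.request.Request:
    """Prepare a POST to /v1/chat/completions with a Bearer API key."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the assistant text (assumed OpenAI layout)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

With the proxy running locally, `send_chat(build_chat_request("http://localhost:8000", "sk-xxxxxx", "Hallo"))` would return the model's answer, provided the key is valid and within quota.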

Admin API (port 8001)

All endpoints require Authorization: Bearer <ADMIN_PASSWORD>.

Endpoint	Method	Description
`/api/api-keys`	GET	List all API keys
`/api/api-keys`	POST	Create a new API key
`/api/api-keys/{id}/deactivate`	PUT	Deactivate an API key
`/api/api-keys/{id}/quota`	PATCH	Update a key's quota
`/api/settings`	GET/PUT	Ollama URL and default model
`/api/ollama-models`	GET	Available models from Ollama
`/api/proxy-info`	GET	Local proxy endpoint
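
Scripted access to the admin API follows the same pattern as the proxy, with the admin password as the Bearer token. The sketch below uses hypothetical helper names and does not assume any request or response schema beyond the paths and methods in the table.

```python
import json
import urllib.request
from typing import Any, Optional


def build_admin_request(method: str, path: str, admin_password: str,
                        base_url: str = "http://localhost:8001",
                        payload: Optional[dict] = None) -> urllib.request.Request:
    """Prepare an authenticated admin API call; payload is sent as JSON if given."""
    headers = {"Authorization": f"Bearer {admin_password}"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=data,
        headers=headers,
        method=method,
    )


def call_admin(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `call_admin(build_admin_request("GET", "/api/api-keys", "change-me"))` would list the keys, and `build_admin_request("PUT", "/api/api-keys/3/deactivate", "change-me")` prepares a deactivation, where the id `3` is only a placeholder.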

Tests

cd backend
python -m pytest tests/ -v

Project structure

llm_quota/
├── backend/
│   ├── main.py              # Proxy server (port 8000)
│   ├── admin.py             # Admin API + static file serving (port 8001)
│   ├── database.py          # DB connection & session
│   ├── models.py            # SQLAlchemy models (APIKey, Setting, Usage)
│   ├── schemas.py           # Pydantic schemas
│   ├── crud.py              # DB operations, token counting, quota logic
│   ├── init_db.py           # Create tables & seed settings
│   ├── setup_admin.py       # Create the default API key
│   ├── requirements.txt     # Production dependencies
│   ├── requirements-dev.txt # Development and test dependencies
│   └── tests/
│       ├── conftest.py      # Fixtures
│       ├── test_auth.py     # Authentication tests
│       └── test_quota.py    # Quota, token, and expiry tests
├── frontend/
│   └── src/
│       ├── main.jsx         # React admin UI
│       └── styles.css
├── Dockerfile
├── docker-entrypoint.sh
├── .dockerignore
├── .env.example
├── start.sh                 # Development start script
└── .gitignore

License

MIT
