4 changed files with 76 additions and 72 deletions
--- a/DOCKERHUB.en.md
+++ b/DOCKERHUB.en.md
@@ -21,15 +21,11 @@ Ollama does not need to run on the same host — `OLLAMA_URL` can point to any r
 | `8000` | Proxy endpoint (OpenAI API) |
 | `8001` | Admin API + web interface |
-Port 8001 must be exposed because the container serves the admin interface directly on this port. All API endpoints require the `ADMIN_PASSWORD` — without a valid token, only the public frontend files (HTML/JS/CSS of the login page) are accessible. The password is therefore the primary protection.
+Port 8001 must be exposed because the container serves the admin interface directly on this port. To restrict access to the local machine, bind it to `127.0.0.1` — this makes the port reachable only from the host, not from the network:
 Additional hardening: binding to `127.0.0.1` restricts access to the local host and prevents direct network access:
 ```
 ports:
-  - "127.0.0.1:8001:8001"   # local access only
+  - "127.0.0.1:8001:8001"
  # or:
  - "8001:8001"              # network-wide, protected by ADMIN_PASSWORD only
 ```
 ## Environment Variables
--- a/DOCKERHUB.md
+++ b/DOCKERHUB.md
@@ -21,15 +21,11 @@ Ollama muss dabei nicht auf demselben Host laufen — `OLLAMA_URL` kann auf jede
 | `8000` | Proxy-Endpunkt (OpenAI-API) |
 | `8001` | Admin-API + Web-Oberfläche |
-Port 8001 muss exposed werden, da der Container die Admin-Oberfläche selbst auf diesem Port ausliefert. Alle API-Endpunkte erfordern das `ADMIN_PASSWORD` — ein Zugriff ohne gültiges Token liefert nur die öffentlichen Frontend-Dateien (HTML/JS/CSS der Login-Seite). Das Passwort ist damit die primäre Schutzmaßnahme.
+Port 8001 muss exposed werden, da der Container die Admin-Oberfläche selbst auf diesem Port ausliefert. Um den Zugriff auf den lokalen Rechner zu beschränken, die Portbindung auf `127.0.0.1` setzen — so ist der Port nur vom Host erreichbar, nicht aus dem Netzwerk:
 Zusätzliche Härtung: Portbindung auf `127.0.0.1` beschränkt den Zugriff auf den lokalen Host und verhindert direkten Netzwerkzugriff:
 ```
 ports:
-  - "127.0.0.1:8001:8001"   # nur lokal erreichbar
+  - "127.0.0.1:8001:8001"
  # oder:
  - "8001:8001"              # netzwerkweit, Schutz nur durch ADMIN_PASSWORD
 ```
 ## Umgebungsvariablen
--- a/README.md
+++ b/README.md
@@ -8,22 +8,20 @@ Ein Reverse-Proxy für Ollama mit API-Key-Authentifizierung, Quota-Management un
 - Optionales Ablaufdatum pro API-Key
 - Quota-Management mit getrennten Tages- und Monatslimits (Tokens & Requests)
 - Token-Zählung via tiktoken, Reset-Grenzen in der Zeitzone Europe/Berlin
-- Web-Admin-Oberfläche (API-Keys verwalten, Ollama-Einstellungen, Verbrauchsanzeige)
+- Web-Admin-Oberfläche (API-Keys verwalten, Ollama-Einstellungen, Proxy-Info)
-- OpenAI-kompatibler `/v1/chat/completions`-Endpunkt mit Streaming und Tool-Use
+- OpenAI-kompatibler `/v1/chat/completions`-Endpunkt
 - Rotierende Nutzungs-Logs
 - SQLite (Standard) oder PostgreSQL
 - Docker-Image auf DockerHub: `mediaeng/llmproxy`
 ## Sicherheit
-- Admin-Oberfläche passwortgeschützt (`ADMIN_PASSWORD`) — alle API-Endpunkte erfordern den Token
+- Admin-Oberfläche passwortgeschützt (`ADMIN_PASSWORD`)
 - Admin-API bindet lokal auf `127.0.0.1` (nicht von außen erreichbar)
 - API-Keys als SHA-256-Hash in der DB — Plaintext nur einmalig bei Erstellung
 - Quota-Check atomar mit `SELECT FOR UPDATE` (kein TOCTOU-Race)
-- Port 8001 kann optional auf `127.0.0.1` gebunden werden (zusätzliche Härtung)
+- CORS-Origins konfigurierbar via `ALLOWED_ORIGINS`
 ## Konfiguration
-`.env`-Datei im Projektverzeichnis anlegen:
+`.env`-Datei im Projektverzeichnis anlegen (Vorlage: `.env.example`):
 ```env
 ADMIN_PASSWORD=change-me
@@ -34,7 +32,6 @@ DATABASE_URL=sqlite:///./test.db
 OLLAMA_URL=http://localhost:11434
 DEFAULT_MODEL=llama3
 APP_TZ=Europe/Berlin
 LOG_FILE=logs/usage.log
 ```
 | Variable | Standard | Beschreibung |
@@ -43,15 +40,25 @@ LOG_FILE=logs/usage.log
 | `PROXY_HOST` | `0.0.0.0` | Bind-Adresse des Proxys |
 | `PROXY_PORT` | `8000` | Port des Proxys |
 | `ADMIN_PORT` | `8001` | Port der Admin-API |
-| `DATABASE_URL` | `sqlite:///./test.db` | DB-Verbindungsstring (SQLite oder PostgreSQL) |
+| `DATABASE_URL` | `sqlite:///./test.db` | DB-Verbindungsstring |
 | `OLLAMA_URL` | `http://localhost:11434` | Adresse der Ollama-Instanz (auch in der UI änderbar) |
 | `DEFAULT_MODEL` | `llama3` | Standard-Modell für `/v1/chat/completions` (auch in der UI änderbar) |
 | `APP_TZ` | `Europe/Berlin` | Zeitzone für tägliche/monatliche Quota-Resets |
-| `LOG_FILE` | `logs/usage.log` | Pfad der rotierenden Nutzungs-Logdatei |
+| `ALLOWED_ORIGINS` | `http://localhost:5173` | Kommagetrennte CORS-Origins |
 | `ALLOWED_ORIGINS` | `http://localhost:5173` | CORS-Origins (nur für Entwicklung relevant) |
 ## Entwicklung (lokal)
 ```bash
 cp .env.example .env
 # ADMIN_PASSWORD in .env setzen
 ./start.sh
 ```
 Das Script prüft alle Ports auf Belegung, aktiviert automatisch eine vorhandene `.venv`, initialisiert die Datenbank und startet Proxy, Admin-API und Vite-Dev-Server.
 Admin-Oberfläche: `http://localhost:5173`
 ### Voraussetzungen
 - Python 3.12+ mit virtualenv
@@ -61,50 +68,63 @@ LOG_FILE=logs/usage.log
 python -m venv .venv
 source .venv/bin/activate
 pip install -r backend/requirements-dev.txt
 cd frontend && npm install
 ```
 ### Starten
 **Per Script:**
 ```bash
 cp .env.example .env   # ADMIN_PASSWORD setzen
 ./start.sh
 ```
 **Per PyCharm:** Run-Config „Dev" starten (startet Proxy, Admin-API und Vite-Dev-Server gemeinsam).
 Das Script prüft alle Ports auf Belegung, initialisiert die Datenbank und startet alle drei Dienste.
 Admin-Oberfläche: `http://localhost:5173`
 ## Produktion (Docker)
-### Docker Compose (empfohlen)
+### Image bauen
 ```bash
-docker compose up -d
+docker build -t llm-quota .
 ```
-Zieht das Image von DockerHub, lädt Variablen aus `.env` und verwendet die lokale SQLite-Datenbank. Weitere Compose-Varianten (PostgreSQL, Ollama als Container) siehe `DOCKERHUB.md`.
+### Container starten
 ### Image selbst bauen und pushen
 ```bash
-./build_push.sh
+docker run -d \
  -p 8000:8000 \
  -p 127.0.0.1:8001:8001 \
  -e ADMIN_PASSWORD=geheim \
  -e OLLAMA_URL=http://host.docker.internal:11434 \
  -e DATABASE_URL=sqlite:///./data/quota.db \
  -v $(pwd)/data:/app/backend/data \
  --name llm-quota \
  llm-quota
 ```
-Das Script zeigt den aktuellen Git-Tag, bietet an einen neuen zu setzen, baut das Image für `linux/arm64` und pusht zu `mediaeng/llmproxy`.
+| Port | Bindung | Dienst |
 |------|---------|--------|
 | `8000` | `0.0.0.0` — öffentlich | Proxy (für LLM-Clients) |
 | `8001` | `127.0.0.1` — nur lokal am Server | Admin-API + Admin-Oberfläche |
-### Port 8001 (Admin)
+Docker unterscheidet beim Port-Mapping zwischen `0.0.0.0` (alle Interfaces, öffentlich erreichbar) und `127.0.0.1` (nur der Server selbst kann zugreifen). Mit `-p 127.0.0.1:8001:8001` ist Port 8001 am Server verfügbar, aber von außen nicht direkt ansprechbar.
-Port 8001 muss exposed werden, da der Container die Admin-Oberfläche auf diesem Port ausliefert. Alle API-Endpunkte erfordern das `ADMIN_PASSWORD` — der Token ist der primäre Schutz. Optionale zusätzliche Härtung: Bindung auf `127.0.0.1`:
+### Admin-Oberfläche per SSH-Tunnel erreichbar machen
-```yaml
+Der SSH-Tunnel leitet einen lokalen Port auf den Server weiter und nutzt dabei, dass Port 8001 dort auf `127.0.0.1` erreichbar ist:
-ports:
+
-  - "127.0.0.1:8001:8001"   # nur lokal
+```
-  # oder:
+Admin-Laptop:8001  ──SSH──►  Server:127.0.0.1:8001  ──►  Container:8001
-  - "8001:8001"              # netzwerkweit, Schutz durch ADMIN_PASSWORD
+```
 ```bash
 ssh -L 8001:localhost:8001 user@server
 ```
 Danach ist die Admin-Oberfläche auf dem Laptop unter `http://localhost:8001` erreichbar — ohne dass Port 8001 öffentlich exponiert wird.
 ### Mit PostgreSQL
 ```bash
 docker run -d \
  -p 8000:8000 \
  -p 127.0.0.1:8001:8001 \
  -e ADMIN_PASSWORD=geheim \
  -e DATABASE_URL=postgresql://user:pass@db-host:5432/llm_quota \
  -e OLLAMA_URL=http://ollama:11434 \
  llm-quota
 ```
 ## Proxy-Endpunkte (Port 8000)
@@ -112,7 +132,7 @@ ports:
 Alle Endpunkte erfordern einen gültigen API-Key im `Authorization`-Header.
 ```bash
-curl -X POST http://localhost:8000/v1/chat/completions \
+curl -X POST http://localhost:8000/api/chat \
  -H "Authorization: Bearer sk-xxxxxx" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3","messages":[{"role":"user","content":"Hallo"}]}'
@@ -120,12 +140,12 @@ curl -X POST http://localhost:8000/v1/chat/completions \
 | Endpunkt | Methode | Beschreibung |
 |----------|---------|--------------|
-| `/v1/chat/completions` | POST | Chat (OpenAI-Format, Streaming + Tool-Use) |
+| `/api/generate` | POST | Ollama generate |
-| `/v1/models` | GET | Modelle (OpenAI-Format) |
+| `/api/chat` | POST | Ollama chat |
 | `/api/generate` | POST | Ollama generate (nativ) |
 | `/api/chat` | POST | Ollama chat (nativ) |
 | `/api/tags` | GET | Verfügbare Modelle |
 | `/api/versions` | GET | Ollama-Version |
 | `/v1/models` | GET | Modelle (OpenAI-Format) |
 | `/v1/chat/completions` | POST | Chat (OpenAI-Format) |
 ## Admin-API (Port 8001)
@@ -133,12 +153,10 @@ Alle Endpunkte erfordern `Authorization: Bearer <ADMIN_PASSWORD>`.
 | Endpunkt | Methode | Beschreibung |
 |----------|---------|--------------|
-| `/api/api-keys` | GET | Alle API-Keys mit Verbrauchsdaten |
+| `/api/api-keys` | GET | Alle API-Keys auflisten |
 | `/api/api-keys` | POST | Neuen API-Key erstellen |
 | `/api/api-keys/{id}/quota` | PATCH | Limits eines Keys aktualisieren |
 | `/api/api-keys/{id}/activate` | PUT | API-Key aktivieren |
 | `/api/api-keys/{id}/deactivate` | PUT | API-Key deaktivieren |
-| `/api/api-keys/{id}` | DELETE | API-Key löschen |
+| `/api/api-keys/{id}/quota` | PATCH | Quota eines Keys aktualisieren |
 | `/api/settings` | GET/PUT | Ollama-URL und Standard-Modell |
 | `/api/ollama-models` | GET | Verfügbare Modelle von Ollama |
 | `/api/proxy-info` | GET | Lokaler Proxy-Endpunkt |
@@ -162,27 +180,22 @@ llm_quota/
 │   ├── schemas.py           # Pydantic-Schemas
 │   ├── crud.py              # DB-Operationen, Token-Zählung, Quota-Logik
 │   ├── init_db.py           # Tabellen anlegen & Settings seeden
 │   ├── setup_admin.py       # Standard-API-Key erstellen
 │   ├── requirements.txt     # Produktiv-Dependencies
 │   ├── requirements-dev.txt # Test-Dependencies
 │   └── tests/
-│       ├── conftest.py
+│       ├── conftest.py      # Fixtures
-│       ├── test_auth.py
+│       ├── test_auth.py     # Authentifizierungs-Tests
-│       └── test_quota.py
+│       └── test_quota.py    # Quota-, Token- und Ablauf-Tests
 ├── frontend/
 │   └── src/
 │       ├── main.jsx         # React-Admin-UI
 │       └── styles.css
 ├── .idea/runConfigurations/
 │   └── Dev.xml              # PyCharm Run-Config
 ├── Dockerfile
 ├── docker-compose.yml       # Produktiv-Start mit DockerHub-Image
 ├── docker-entrypoint.sh
 ├── .dockerignore
 ├── .env.example
 ├── start.sh                 # Entwicklungs-Startscript
 ├── run_dev.py               # Entwicklungs-Runner für PyCharm
 ├── build_push.sh            # Docker-Build & Push zu DockerHub
 ├── DOCKERHUB.md             # DockerHub-Beschreibung (deutsch)
 ├── DOCKERHUB.en.md          # DockerHub-Beschreibung (englisch)
 └── .gitignore
 ```
--- a/frontend/vite.config.js
+++ b/frontend/vite.config.js
@@ -5,7 +5,6 @@ export default defineConfig({
  plugins: [react()],
  clearScreen: false,
  server: {
    open: true,
    proxy: {
      '/api/api-keys': 'http://localhost:8001',
      '/api/settings': 'http://localhost:8001',