Compare commits
10 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 280b3b0762 | | |
| | 89661dafcc | | |
| | 3ccc7f325d | | |
| | 25f19b6ada | | |
| | 5b97ed0ef7 | | |
| | 222b204d4b | | |
| | 9910e6e062 | | |
| | 6e704da86b | | |
| | bd0dc0478f | | |
| | 92fd66381f | | |
@ -1,11 +1,50 @@
|
|||||||
.git/
|
# Python
|
||||||
|
__pycache__/
|
||||||
|
*.pyc
|
||||||
|
*.pyo
|
||||||
|
*.pyd
|
||||||
.venv/
|
.venv/
|
||||||
venv/
|
venv/
|
||||||
|
*.egg-info/
|
||||||
|
.pytest_cache/
|
||||||
|
|
||||||
|
# Tests
|
||||||
|
backend/tests/
|
||||||
|
|
||||||
|
# Environment & secrets
|
||||||
.env
|
.env
|
||||||
|
*.env.local
|
||||||
|
|
||||||
|
# Databases
|
||||||
|
*.db
|
||||||
|
*.sqlite3
|
||||||
|
|
||||||
|
# Logs
|
||||||
|
logs/
|
||||||
|
*.log
|
||||||
|
|
||||||
|
# IDE & tools
|
||||||
|
.idea/
|
||||||
|
.vscode/
|
||||||
|
.claude/
|
||||||
|
|
||||||
|
# Git
|
||||||
|
.git/
|
||||||
|
.gitignore
|
||||||
|
|
||||||
|
# Frontend
|
||||||
frontend/node_modules/
|
frontend/node_modules/
|
||||||
frontend/dist/
|
frontend/dist/
|
||||||
backend/__pycache__/
|
|
||||||
backend/**/__pycache__/
|
# Docker
|
||||||
backend/*.pyc
|
docker-compose.yml
|
||||||
backend/test.db
|
.dockerignore
|
||||||
backend/tests/
|
|
||||||
|
# Docs
|
||||||
|
*.md
|
||||||
|
|
||||||
|
# Dev & build scripts
|
||||||
|
run_dev.py
|
||||||
|
build_push.sh
|
||||||
|
start.sh
|
||||||
|
test_api.sh
|
||||||
|
|||||||
221
DOCKERHUB.en.md
Normal file
@ -0,0 +1,221 @@
# mediaeng/llmproxy

A lightweight reverse proxy for [Ollama](https://ollama.com) that manages API keys with configurable token and request quotas. Incoming requests in OpenAI-compatible format are authenticated, checked against the quota, and forwarded to the configured Ollama server.
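
The flow described above (authenticate, check the quota, forward) can be sketched roughly as follows. This is an illustration only, not the project's code: `KeyRecord`, `quota_exceeded`, and `handle_request` are hypothetical names, the status codes 401 and 429 are assumptions, and a limit of `None` is assumed to mean "unlimited".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class KeyRecord:
    """Hypothetical per-key state; field names are assumptions for illustration."""
    active: bool
    daily_limit: Optional[int]  # None is assumed to mean "unlimited"
    used_today: int


def quota_exceeded(used: int, limit: Optional[int]) -> bool:
    """Return True once usage has reached the limit; a missing limit never blocks."""
    return limit is not None and used >= limit


def handle_request(keys: Dict[str, KeyRecord], api_key: str) -> int:
    """Return the HTTP status a proxy of this kind might answer with.

    200 stands in for "forward the request to the Ollama server".
    """
    record = keys.get(api_key)
    if record is None or not record.active:
        return 401  # unknown or deactivated key (assumed status code)
    if quota_exceeded(record.used_today, record.daily_limit):
        return 429  # quota exhausted (assumed status code)
    record.used_today += 1  # count this request against the quota
    return 200
```

A real implementation would also have to make the check and the counter update atomic, which this in-memory sketch does not attempt.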

Ollama does not need to run on the same host: `OLLAMA_URL` can point to any reachable server, such as the Docker host itself, another machine on the network, or a remote server.

## Features

- OpenAI-compatible endpoint (`/v1/chat/completions`, `/v1/models`)
- API key management with daily and monthly token/request limits
- Web-based admin interface (port 8001)
- Streaming support (Server-Sent Events)
- Tool use / function calling passthrough
- Rotating usage logs
- SQLite (default) or PostgreSQL
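
Daily and monthly limits imply reset boundaries that depend on a timezone (the `APP_TZ` variable). A minimal sketch of how such boundaries could be computed is shown below; `period_starts` and `needs_reset` are hypothetical helpers, not functions taken from the project.

```python
from datetime import datetime, tzinfo
from typing import Tuple


def period_starts(now: datetime, tz: tzinfo) -> Tuple[datetime, datetime]:
    """Return (start of the current day, start of the current month) in `tz`.

    `now` must be timezone-aware; it is converted to `tz` before truncating.
    """
    local = now.astimezone(tz)
    day_start = local.replace(hour=0, minute=0, second=0, microsecond=0)
    month_start = day_start.replace(day=1)
    return day_start, month_start


def needs_reset(last_reset: datetime, period_start: datetime) -> bool:
    """A counter is stale if it was last reset before the current period began."""
    return last_reset < period_start
```

With the default configuration, a caller would pass `zoneinfo.ZoneInfo("Europe/Berlin")` as `tz`, so that counters roll over at local midnight rather than at midnight UTC.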

## Ports

| Port | Service |
|------|---------|
| `8000` | Proxy endpoint (OpenAI API) |
| `8001` | Admin API + web interface |

Port 8001 must be exposed because the container serves the admin interface directly on this port. All API endpoints require the `ADMIN_PASSWORD`; without a valid token, only the public frontend files (HTML/JS/CSS of the login page) are accessible. The password is therefore the primary protection.

Additional hardening: binding to `127.0.0.1` restricts access to the local host and prevents direct network access:

```
ports:
  - "127.0.0.1:8001:8001"   # local access only
  # or:
  - "8001:8001"             # network-wide, protected by ADMIN_PASSWORD only
```
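
To check which of the two variants a given Compose file uses, a small helper along the following lines can classify a port mapping. It is a sketch under stated assumptions: `host_ip_of` and `is_loopback_only` are hypothetical names, and only the IPv4 short-syntax forms `HOST:CONTAINER` and `IP:HOST:CONTAINER` are handled.

```python
import ipaddress
from typing import Optional


def host_ip_of(mapping: str) -> Optional[str]:
    """Return the host IP of a short-syntax port mapping, or None if none is given."""
    parts = mapping.split("/")[0].split(":")  # drop an optional "/tcp" suffix
    return parts[0] if len(parts) == 3 else None


def is_loopback_only(mapping: str) -> bool:
    """True if the mapping publishes the port on a loopback address only."""
    ip = host_ip_of(mapping)
    return ip is not None and ipaddress.ip_address(ip).is_loopback
```

A mapping without a host IP, such as `"8001:8001"`, is treated as network-wide here, which matches the second variant in the snippet above.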

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `ADMIN_PASSWORD` | – | **Required.** Password for the admin interface |
| `OLLAMA_URL` | `http://localhost:11434` | URL of the Ollama server (without `/v1` suffix) |
| `DEFAULT_MODEL` | `llama3` | Model used when the client does not specify one |
| `DATABASE_URL` | `sqlite:///./test.db` | Database connection string (SQLite or PostgreSQL) |
| `PROXY_HOST` | `0.0.0.0` | Proxy bind address |
| `PROXY_PORT` | `8000` | Proxy port |
| `ADMIN_PORT` | `8001` | Admin API port |
| `APP_TZ` | `Europe/Berlin` | Timezone for daily/monthly quota resets |
| `LOG_FILE` | `logs/usage.log` | Path of the rotating usage log file |
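
As an illustration of how the variables and defaults in this table fit together, the sketch below reads them into a typed object. It is not the project's configuration code: `Settings` and `load_settings` are hypothetical names, and only the variable names and default values are taken from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Typed view of the documented variables (class name is hypothetical)."""
    admin_password: str
    ollama_url: str
    default_model: str
    database_url: str
    proxy_host: str
    proxy_port: int
    admin_port: int
    app_tz: str
    log_file: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    password = env.get("ADMIN_PASSWORD")
    if not password:
        # The table marks ADMIN_PASSWORD as required, so fail early.
        raise ValueError("ADMIN_PASSWORD is required")
    return Settings(
        admin_password=password,
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        default_model=env.get("DEFAULT_MODEL", "llama3"),
        database_url=env.get("DATABASE_URL", "sqlite:///./test.db"),
        proxy_host=env.get("PROXY_HOST", "0.0.0.0"),
        proxy_port=int(env.get("PROXY_PORT", "8000")),
        admin_port=int(env.get("ADMIN_PORT", "8001")),
        app_tz=env.get("APP_TZ", "Europe/Berlin"),
        log_file=env.get("LOG_FILE", "logs/usage.log"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to inspect, for example `load_settings({"ADMIN_PASSWORD": "changeme"})`.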

## Docker Compose – External Ollama, SQLite

Use this when Ollama runs outside of Docker (on the Docker host or any other reachable server). Adjust `OLLAMA_URL` accordingly.

```yaml
services:
  llmproxy:
    image: mediaeng/llmproxy:latest
    restart: unless-stopped
    ports:
      - "8000:8000"
      - "127.0.0.1:8001:8001"
    environment:
      ADMIN_PASSWORD: changeme
      OLLAMA_URL: http://host.docker.internal:11434  # or http://<ip>:11434
      DEFAULT_MODEL: llama3
      APP_TZ: Europe/Berlin
    volumes:
      - llmproxy-data:/app/backend
    # On Linux, add extra_hosts since host.docker.internal is not
    # available automatically:
    # extra_hosts:
    #   - "host.docker.internal:host-gateway"

volumes:
  llmproxy-data:
```

## Docker Compose – External Ollama, PostgreSQL

```yaml
services:
  llmproxy:
    image: mediaeng/llmproxy:latest
    restart: unless-stopped
    ports:
      - "8000:8000"
      - "127.0.0.1:8001:8001"
    environment:
      ADMIN_PASSWORD: changeme
      OLLAMA_URL: http://host.docker.internal:11434  # or http://<ip>:11434
      DEFAULT_MODEL: llama3
      APP_TZ: Europe/Berlin
      DATABASE_URL: postgresql://llmproxy:secret@db:5432/llmproxy
    volumes:
      - llmproxy-data:/app/backend
    depends_on:
      db:
        condition: service_healthy
    # extra_hosts:
    #   - "host.docker.internal:host-gateway"

  db:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_DB: llmproxy
      POSTGRES_USER: llmproxy
      POSTGRES_PASSWORD: secret
    volumes:
      - pg-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U llmproxy"]
      interval: 5s
      timeout: 5s
      retries: 5

volumes:
  llmproxy-data:  # must be declared because the llmproxy service mounts it
  pg-data:
```

## Docker Compose – Ollama as Container, SQLite

Ollama and llmproxy run together in Docker; data is persisted in volumes.

```yaml
services:
  llmproxy:
    image: mediaeng/llmproxy:latest
    restart: unless-stopped
    ports:
      - "8000:8000"
      - "127.0.0.1:8001:8001"
    environment:
      ADMIN_PASSWORD: changeme
      OLLAMA_URL: http://ollama:11434
      DEFAULT_MODEL: llama3
      APP_TZ: Europe/Berlin
    volumes:
      - llmproxy-data:/app/backend
    depends_on:
      - ollama

  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    volumes:
      - ollama-data:/root/.ollama

volumes:
  llmproxy-data:
  ollama-data:
```

## Docker Compose – Ollama as Container, PostgreSQL

For production environments with an external database.

```yaml
services:
  llmproxy:
    image: mediaeng/llmproxy:latest
    restart: unless-stopped
    ports:
      - "8000:8000"
      - "127.0.0.1:8001:8001"
    environment:
      ADMIN_PASSWORD: changeme
      OLLAMA_URL: http://ollama:11434
      DEFAULT_MODEL: llama3
      APP_TZ: Europe/Berlin
      DATABASE_URL: postgresql://llmproxy:secret@db:5432/llmproxy
    depends_on:
      db:
        condition: service_healthy
      ollama:
        condition: service_started

  db:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_DB: llmproxy
      POSTGRES_USER: llmproxy
      POSTGRES_PASSWORD: secret
    volumes:
      - pg-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U llmproxy"]
      interval: 5s
      timeout: 5s
      retries: 5

  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    volumes:
      - ollama-data:/root/.ollama

volumes:
  pg-data:
  ollama-data:
```

## Quick Start

```bash
docker run -d \
  -p 8000:8000 \
  -e ADMIN_PASSWORD=changeme \
  -e OLLAMA_URL=http://host.docker.internal:11434 \
  -v llmproxy-data:/app/backend \
  mediaeng/llmproxy:latest
```

## Client Configuration

Configure the proxy as an OpenAI-compatible endpoint:

```
Base URL: http://<host>:8000/v1
API Key:  <API key created in the admin interface>
```
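
For clients without a ready-made OpenAI integration, the same configuration can be used with a plain HTTP request. The sketch below only builds the request using the Python standard library; `build_chat_request` is a hypothetical helper, and the host, key, and model values passed to it are placeholders.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-compatible chat endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,  # API key from the admin interface
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(req)`; the response body is expected to follow the OpenAI chat completion format, though that is an assumption based on the endpoint being described as OpenAI-compatible.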
17
DOCKERHUB.md
@ -19,7 +19,18 @@ Ollama muss dabei nicht auf demselben Host laufen — `OLLAMA_URL` kann auf jede
|
|||||||
| Port | Dienst |
|
| Port | Dienst |
|
||||||
|------|--------|
|
|------|--------|
|
||||||
| `8000` | Proxy-Endpunkt (OpenAI-API) |
|
| `8000` | Proxy-Endpunkt (OpenAI-API) |
|
||||||
| `8001` | Admin-API + Web-Oberfläche (nicht exponieren) |
|
| `8001` | Admin-API + Web-Oberfläche |
|
||||||
|
|
||||||
|
Port 8001 muss exposed werden, da der Container die Admin-Oberfläche selbst auf diesem Port ausliefert. Alle API-Endpunkte erfordern das `ADMIN_PASSWORD` — ein Zugriff ohne gültiges Token liefert nur die öffentlichen Frontend-Dateien (HTML/JS/CSS der Login-Seite). Das Passwort ist damit die primäre Schutzmaßnahme.
|
||||||
|
|
||||||
|
Zusätzliche Härtung: Portbindung auf `127.0.0.1` beschränkt den Zugriff auf den lokalen Host und verhindert direkten Netzwerkzugriff:
|
||||||
|
|
||||||
|
```
|
||||||
|
ports:
|
||||||
|
- "127.0.0.1:8001:8001" # nur lokal erreichbar
|
||||||
|
# oder:
|
||||||
|
- "8001:8001" # netzwerkweit, Schutz nur durch ADMIN_PASSWORD
|
||||||
|
```
|
||||||
|
|
||||||
## Umgebungsvariablen
|
## Umgebungsvariablen
|
||||||
|
|
||||||
@ -46,6 +57,7 @@ services:
|
|||||||
restart: unless-stopped
|
restart: unless-stopped
|
||||||
ports:
|
ports:
|
||||||
- "8000:8000"
|
- "8000:8000"
|
||||||
|
- "127.0.0.1:8001:8001"
|
||||||
environment:
|
environment:
|
||||||
ADMIN_PASSWORD: changeme
|
ADMIN_PASSWORD: changeme
|
||||||
OLLAMA_URL: http://host.docker.internal:11434 # oder http://<ip>:11434
|
OLLAMA_URL: http://host.docker.internal:11434 # oder http://<ip>:11434
|
||||||
@ -71,6 +83,7 @@ services:
|
|||||||
restart: unless-stopped
|
restart: unless-stopped
|
||||||
ports:
|
ports:
|
||||||
- "8000:8000"
|
- "8000:8000"
|
||||||
|
- "127.0.0.1:8001:8001"
|
||||||
environment:
|
environment:
|
||||||
ADMIN_PASSWORD: changeme
|
ADMIN_PASSWORD: changeme
|
||||||
OLLAMA_URL: http://host.docker.internal:11434 # oder http://<ip>:11434
|
OLLAMA_URL: http://host.docker.internal:11434 # oder http://<ip>:11434
|
||||||
@ -115,6 +128,7 @@ services:
|
|||||||
restart: unless-stopped
|
restart: unless-stopped
|
||||||
ports:
|
ports:
|
||||||
- "8000:8000"
|
- "8000:8000"
|
||||||
|
- "127.0.0.1:8001:8001"
|
||||||
environment:
|
environment:
|
||||||
ADMIN_PASSWORD: changeme
|
ADMIN_PASSWORD: changeme
|
||||||
OLLAMA_URL: http://ollama:11434
|
OLLAMA_URL: http://ollama:11434
|
||||||
@ -147,6 +161,7 @@ services:
|
|||||||
restart: unless-stopped
|
restart: unless-stopped
|
||||||
ports:
|
ports:
|
||||||
- "8000:8000"
|
- "8000:8000"
|
||||||
|
- "127.0.0.1:8001:8001"
|
||||||
environment:
|
environment:
|
||||||
ADMIN_PASSWORD: changeme
|
ADMIN_PASSWORD: changeme
|
||||||
OLLAMA_URL: http://ollama:11434
|
OLLAMA_URL: http://ollama:11434
|
||||||
|
|||||||
131
README.md
@ -8,20 +8,22 @@ Ein Reverse-Proxy für Ollama mit API-Key-Authentifizierung, Quota-Management un
|
|||||||
- Optionales Ablaufdatum pro API-Key
|
- Optionales Ablaufdatum pro API-Key
|
||||||
- Quota-Management mit getrennten Tages- und Monatslimits (Tokens & Requests)
|
- Quota-Management mit getrennten Tages- und Monatslimits (Tokens & Requests)
|
||||||
- Token-Zählung via tiktoken, Reset-Grenzen in der Zeitzone Europe/Berlin
|
- Token-Zählung via tiktoken, Reset-Grenzen in der Zeitzone Europe/Berlin
|
||||||
- Web-Admin-Oberfläche (API-Keys verwalten, Ollama-Einstellungen, Proxy-Info)
|
- Web-Admin-Oberfläche (API-Keys verwalten, Ollama-Einstellungen, Verbrauchsanzeige)
|
||||||
- OpenAI-kompatibler `/v1/chat/completions`-Endpunkt
|
- OpenAI-kompatibler `/v1/chat/completions`-Endpunkt mit Streaming und Tool-Use
|
||||||
|
- Rotierende Nutzungs-Logs
|
||||||
|
- SQLite (Standard) oder PostgreSQL
|
||||||
|
- Docker-Image auf DockerHub: `mediaeng/llmproxy`
|
||||||
|
|
||||||
## Sicherheit
|
## Sicherheit
|
||||||
|
|
||||||
- Admin-Oberfläche passwortgeschützt (`ADMIN_PASSWORD`)
|
- Admin-Oberfläche passwortgeschützt (`ADMIN_PASSWORD`) — alle API-Endpunkte erfordern den Token
|
||||||
- Admin-API bindet lokal auf `127.0.0.1` (nicht von außen erreichbar)
|
|
||||||
- API-Keys als SHA-256-Hash in der DB — Plaintext nur einmalig bei Erstellung
|
- API-Keys als SHA-256-Hash in der DB — Plaintext nur einmalig bei Erstellung
|
||||||
- Quota-Check atomar mit `SELECT FOR UPDATE` (kein TOCTOU-Race)
|
- Quota-Check atomar mit `SELECT FOR UPDATE` (kein TOCTOU-Race)
|
||||||
- CORS-Origins konfigurierbar via `ALLOWED_ORIGINS`
|
- Port 8001 kann optional auf `127.0.0.1` gebunden werden (zusätzliche Härtung)
|
||||||
|
|
||||||
## Konfiguration
|
## Konfiguration
|
||||||
|
|
||||||
`.env`-Datei im Projektverzeichnis anlegen (Vorlage: `.env.example`):
|
`.env`-Datei im Projektverzeichnis anlegen:
|
||||||
|
|
||||||
```env
|
```env
|
||||||
ADMIN_PASSWORD=change-me
|
ADMIN_PASSWORD=change-me
|
||||||
@ -32,6 +34,7 @@ DATABASE_URL=sqlite:///./test.db
|
|||||||
OLLAMA_URL=http://localhost:11434
|
OLLAMA_URL=http://localhost:11434
|
||||||
DEFAULT_MODEL=llama3
|
DEFAULT_MODEL=llama3
|
||||||
APP_TZ=Europe/Berlin
|
APP_TZ=Europe/Berlin
|
||||||
|
LOG_FILE=logs/usage.log
|
||||||
```
|
```
|
||||||
|
|
||||||
| Variable | Standard | Beschreibung |
|
| Variable | Standard | Beschreibung |
|
||||||
@ -40,25 +43,15 @@ APP_TZ=Europe/Berlin
|
|||||||
| `PROXY_HOST` | `0.0.0.0` | Bind-Adresse des Proxys |
|
| `PROXY_HOST` | `0.0.0.0` | Bind-Adresse des Proxys |
|
||||||
| `PROXY_PORT` | `8000` | Port des Proxys |
|
| `PROXY_PORT` | `8000` | Port des Proxys |
|
||||||
| `ADMIN_PORT` | `8001` | Port der Admin-API |
|
| `ADMIN_PORT` | `8001` | Port der Admin-API |
|
||||||
| `DATABASE_URL` | `sqlite:///./test.db` | DB-Verbindungsstring |
|
| `DATABASE_URL` | `sqlite:///./test.db` | DB-Verbindungsstring (SQLite oder PostgreSQL) |
|
||||||
| `OLLAMA_URL` | `http://localhost:11434` | Adresse der Ollama-Instanz (auch in der UI änderbar) |
|
| `OLLAMA_URL` | `http://localhost:11434` | Adresse der Ollama-Instanz (auch in der UI änderbar) |
|
||||||
| `DEFAULT_MODEL` | `llama3` | Standard-Modell für `/v1/chat/completions` (auch in der UI änderbar) |
|
| `DEFAULT_MODEL` | `llama3` | Standard-Modell für `/v1/chat/completions` (auch in der UI änderbar) |
|
||||||
| `APP_TZ` | `Europe/Berlin` | Zeitzone für tägliche/monatliche Quota-Resets |
|
| `APP_TZ` | `Europe/Berlin` | Zeitzone für tägliche/monatliche Quota-Resets |
|
||||||
| `ALLOWED_ORIGINS` | `http://localhost:5173` | Kommagetrennte CORS-Origins |
|
| `LOG_FILE` | `logs/usage.log` | Pfad der rotierenden Nutzungs-Logdatei |
|
||||||
|
| `ALLOWED_ORIGINS` | `http://localhost:5173` | CORS-Origins (nur für Entwicklung relevant) |
|
||||||
|
|
||||||
## Entwicklung (lokal)
|
## Entwicklung (lokal)
|
||||||
|
|
||||||
```bash
|
|
||||||
cp .env.example .env
|
|
||||||
# ADMIN_PASSWORD in .env setzen
|
|
||||||
|
|
||||||
./start.sh
|
|
||||||
```
|
|
||||||
|
|
||||||
Das Script prüft alle Ports auf Belegung, aktiviert automatisch eine vorhandene `.venv`, initialisiert die Datenbank und startet Proxy, Admin-API und Vite-Dev-Server.
|
|
||||||
|
|
||||||
Admin-Oberfläche: `http://localhost:5173`
|
|
||||||
|
|
||||||
### Voraussetzungen
|
### Voraussetzungen
|
||||||
|
|
||||||
- Python 3.12+ mit virtualenv
|
- Python 3.12+ mit virtualenv
|
||||||
@ -68,63 +61,50 @@ Admin-Oberfläche: `http://localhost:5173`
|
|||||||
python -m venv .venv
|
python -m venv .venv
|
||||||
source .venv/bin/activate
|
source .venv/bin/activate
|
||||||
pip install -r backend/requirements-dev.txt
|
pip install -r backend/requirements-dev.txt
|
||||||
|
|
||||||
cd frontend && npm install
|
cd frontend && npm install
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Starten
|
||||||
|
|
||||||
|
**Per Script:**
|
||||||
|
```bash
|
||||||
|
cp .env.example .env # ADMIN_PASSWORD setzen
|
||||||
|
./start.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
**Per PyCharm:** Run-Config „Dev" starten (startet Proxy, Admin-API und Vite-Dev-Server gemeinsam).
|
||||||
|
|
||||||
|
Das Script prüft alle Ports auf Belegung, initialisiert die Datenbank und startet alle drei Dienste.
|
||||||
|
|
||||||
|
Admin-Oberfläche: `http://localhost:5173`
|
||||||
|
|
||||||
## Produktion (Docker)
|
## Produktion (Docker)
|
||||||
|
|
||||||
### Image bauen
|
### Docker Compose (empfohlen)
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
docker build -t llm-quota .
|
docker compose up -d
|
||||||
```
|
```
|
||||||
|
|
||||||
### Container starten
|
Zieht das Image von DockerHub, lädt Variablen aus `.env` und verwendet die lokale SQLite-Datenbank. Weitere Compose-Varianten (PostgreSQL, Ollama als Container) siehe `DOCKERHUB.md`.
|
||||||
|
|
||||||
|
### Image selbst bauen und pushen
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
docker run -d \
|
./build_push.sh
|
||||||
-p 8000:8000 \
|
|
||||||
-p 127.0.0.1:8001:8001 \
|
|
||||||
-e ADMIN_PASSWORD=geheim \
|
|
||||||
-e OLLAMA_URL=http://host.docker.internal:11434 \
|
|
||||||
-e DATABASE_URL=sqlite:///./data/quota.db \
|
|
||||||
-v $(pwd)/data:/app/backend/data \
|
|
||||||
--name llm-quota \
|
|
||||||
llm-quota
|
|
||||||
```
|
```
|
||||||
|
|
||||||
| Port | Bindung | Dienst |
|
Das Script zeigt den aktuellen Git-Tag, bietet an einen neuen zu setzen, baut das Image für `linux/arm64` und pusht zu `mediaeng/llmproxy`.
|
||||||
|------|---------|--------|
|
|
||||||
| `8000` | `0.0.0.0` — öffentlich | Proxy (für LLM-Clients) |
|
|
||||||
| `8001` | `127.0.0.1` — nur lokal am Server | Admin-API + Admin-Oberfläche |
|
|
||||||
|
|
||||||
Docker unterscheidet beim Port-Mapping zwischen `0.0.0.0` (alle Interfaces, öffentlich erreichbar) und `127.0.0.1` (nur der Server selbst kann zugreifen). Mit `-p 127.0.0.1:8001:8001` ist Port 8001 am Server verfügbar, aber von außen nicht direkt ansprechbar.
|
### Port 8001 (Admin)
|
||||||
|
|
||||||
### Admin-Oberfläche per SSH-Tunnel erreichbar machen
|
Port 8001 muss exposed werden, da der Container die Admin-Oberfläche auf diesem Port ausliefert. Alle API-Endpunkte erfordern das `ADMIN_PASSWORD` — der Token ist der primäre Schutz. Optionale zusätzliche Härtung: Bindung auf `127.0.0.1`:
|
||||||
|
|
||||||
Der SSH-Tunnel leitet einen lokalen Port auf den Server weiter und nutzt dabei, dass Port 8001 dort auf `127.0.0.1` erreichbar ist:
|
```yaml
|
||||||
|
ports:
|
||||||
```
|
- "127.0.0.1:8001:8001" # nur lokal
|
||||||
Admin-Laptop:8001 ──SSH──► Server:127.0.0.1:8001 ──► Container:8001
|
# oder:
|
||||||
```
|
- "8001:8001" # netzwerkweit, Schutz durch ADMIN_PASSWORD
|
||||||
|
|
||||||
```bash
|
|
||||||
ssh -L 8001:localhost:8001 user@server
|
|
||||||
```
|
|
||||||
|
|
||||||
Danach ist die Admin-Oberfläche auf dem Laptop unter `http://localhost:8001` erreichbar — ohne dass Port 8001 öffentlich exponiert wird.
|
|
||||||
|
|
||||||
### Mit PostgreSQL
|
|
||||||
|
|
||||||
```bash
|
|
||||||
docker run -d \
|
|
||||||
-p 8000:8000 \
|
|
||||||
-p 127.0.0.1:8001:8001 \
|
|
||||||
-e ADMIN_PASSWORD=geheim \
|
|
||||||
-e DATABASE_URL=postgresql://user:pass@db-host:5432/llm_quota \
|
|
||||||
-e OLLAMA_URL=http://ollama:11434 \
|
|
||||||
llm-quota
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Proxy-Endpunkte (Port 8000)
|
## Proxy-Endpunkte (Port 8000)
|
||||||
@ -132,7 +112,7 @@ docker run -d \
|
|||||||
Alle Endpunkte erfordern einen gültigen API-Key im `Authorization`-Header.
|
Alle Endpunkte erfordern einen gültigen API-Key im `Authorization`-Header.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
curl -X POST http://localhost:8000/api/chat \
|
curl -X POST http://localhost:8000/v1/chat/completions \
|
||||||
-H "Authorization: Bearer sk-xxxxxx" \
|
-H "Authorization: Bearer sk-xxxxxx" \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-d '{"model":"llama3","messages":[{"role":"user","content":"Hallo"}]}'
|
-d '{"model":"llama3","messages":[{"role":"user","content":"Hallo"}]}'
|
||||||
@ -140,12 +120,12 @@ curl -X POST http://localhost:8000/api/chat \
|
|||||||
|
|
||||||
| Endpunkt | Methode | Beschreibung |
|
| Endpunkt | Methode | Beschreibung |
|
||||||
|----------|---------|--------------|
|
|----------|---------|--------------|
|
||||||
| `/api/generate` | POST | Ollama generate |
|
| `/v1/chat/completions` | POST | Chat (OpenAI-Format, Streaming + Tool-Use) |
|
||||||
| `/api/chat` | POST | Ollama chat |
|
| `/v1/models` | GET | Modelle (OpenAI-Format) |
|
||||||
|
| `/api/generate` | POST | Ollama generate (nativ) |
|
||||||
|
| `/api/chat` | POST | Ollama chat (nativ) |
|
||||||
| `/api/tags` | GET | Verfügbare Modelle |
|
| `/api/tags` | GET | Verfügbare Modelle |
|
||||||
| `/api/versions` | GET | Ollama-Version |
|
| `/api/versions` | GET | Ollama-Version |
|
||||||
| `/v1/models` | GET | Modelle (OpenAI-Format) |
|
|
||||||
| `/v1/chat/completions` | POST | Chat (OpenAI-Format) |
|
|
||||||
|
|
||||||
## Admin-API (Port 8001)
|
## Admin-API (Port 8001)
|
||||||
|
|
||||||
@ -153,10 +133,12 @@ Alle Endpunkte erfordern `Authorization: Bearer <ADMIN_PASSWORD>`.
|
|||||||
|
|
||||||
| Endpunkt | Methode | Beschreibung |
|
| Endpunkt | Methode | Beschreibung |
|
||||||
|----------|---------|--------------|
|
|----------|---------|--------------|
|
||||||
| `/api/api-keys` | GET | Alle API-Keys auflisten |
|
| `/api/api-keys` | GET | Alle API-Keys mit Verbrauchsdaten |
|
||||||
| `/api/api-keys` | POST | Neuen API-Key erstellen |
|
| `/api/api-keys` | POST | Neuen API-Key erstellen |
|
||||||
|
| `/api/api-keys/{id}/quota` | PATCH | Limits eines Keys aktualisieren |
|
||||||
|
| `/api/api-keys/{id}/activate` | PUT | API-Key aktivieren |
|
||||||
| `/api/api-keys/{id}/deactivate` | PUT | API-Key deaktivieren |
|
| `/api/api-keys/{id}/deactivate` | PUT | API-Key deaktivieren |
|
||||||
| `/api/api-keys/{id}/quota` | PATCH | Quota eines Keys aktualisieren |
|
| `/api/api-keys/{id}` | DELETE | API-Key löschen |
|
||||||
| `/api/settings` | GET/PUT | Ollama-URL und Standard-Modell |
|
| `/api/settings` | GET/PUT | Ollama-URL und Standard-Modell |
|
||||||
| `/api/ollama-models` | GET | Verfügbare Modelle von Ollama |
|
| `/api/ollama-models` | GET | Verfügbare Modelle von Ollama |
|
||||||
| `/api/proxy-info` | GET | Lokaler Proxy-Endpunkt |
|
| `/api/proxy-info` | GET | Lokaler Proxy-Endpunkt |
|
||||||
@ -180,22 +162,27 @@ llm_quota/
|
|||||||
│ ├── schemas.py # Pydantic-Schemas
|
│ ├── schemas.py # Pydantic-Schemas
|
||||||
│ ├── crud.py # DB-Operationen, Token-Zählung, Quota-Logik
|
│ ├── crud.py # DB-Operationen, Token-Zählung, Quota-Logik
|
||||||
│ ├── init_db.py # Tabellen anlegen & Settings seeden
|
│ ├── init_db.py # Tabellen anlegen & Settings seeden
|
||||||
│ ├── setup_admin.py # Standard-API-Key erstellen
|
|
||||||
│ ├── requirements.txt # Produktiv-Dependencies
|
│ ├── requirements.txt # Produktiv-Dependencies
|
||||||
│ ├── requirements-dev.txt # Test-Dependencies
|
│ ├── requirements-dev.txt # Test-Dependencies
|
||||||
│ └── tests/
|
│ └── tests/
|
||||||
│ ├── conftest.py # Fixtures
|
│ ├── conftest.py
|
||||||
│ ├── test_auth.py # Authentifizierungs-Tests
|
│ ├── test_auth.py
|
||||||
│ └── test_quota.py # Quota-, Token- und Ablauf-Tests
|
│ └── test_quota.py
|
||||||
├── frontend/
|
├── frontend/
|
||||||
│ └── src/
|
│ └── src/
|
||||||
│ ├── main.jsx # React-Admin-UI
|
│ ├── main.jsx # React-Admin-UI
|
||||||
│ └── styles.css
|
│ └── styles.css
|
||||||
|
├── .idea/runConfigurations/
|
||||||
|
│ └── Dev.xml # PyCharm Run-Config
|
||||||
├── Dockerfile
|
├── Dockerfile
|
||||||
|
├── docker-compose.yml # Produktiv-Start mit DockerHub-Image
|
||||||
├── docker-entrypoint.sh
|
├── docker-entrypoint.sh
|
||||||
├── .dockerignore
|
├── .dockerignore
|
||||||
├── .env.example
|
|
||||||
├── start.sh # Entwicklungs-Startscript
|
├── start.sh # Entwicklungs-Startscript
|
||||||
|
├── run_dev.py # Entwicklungs-Runner für PyCharm
|
||||||
|
├── build_push.sh # Docker-Build & Push zu DockerHub
|
||||||
|
├── DOCKERHUB.md # DockerHub-Beschreibung (deutsch)
|
||||||
|
├── DOCKERHUB.en.md # DockerHub-Beschreibung (englisch)
|
||||||
└── .gitignore
|
└── .gitignore
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|||||||
@ -47,6 +47,8 @@ async def read_api_keys(
|
|||||||
item.tokens_used_month = usage.tokens_used_month or 0
|
item.tokens_used_month = usage.tokens_used_month or 0
|
||||||
item.requests_today = usage.requests_today or 0
|
item.requests_today = usage.requests_today or 0
|
||||||
item.requests_month = usage.requests_month or 0
|
item.requests_month = usage.requests_month or 0
|
||||||
|
item.daily_reset_at = usage.daily_reset_at
|
||||||
|
item.monthly_reset_at = usage.monthly_reset_at
|
||||||
result.append(item)
|
result.append(item)
|
||||||
return result
|
return result
|
||||||
|
|
||||||
|
|||||||
@ -58,6 +58,8 @@ class APIKeyWithUsage(APIKey):
|
|||||||
tokens_used_month: int = 0
|
tokens_used_month: int = 0
|
||||||
requests_today: int = 0
|
requests_today: int = 0
|
||||||
requests_month: int = 0
|
requests_month: int = 0
|
||||||
|
daily_reset_at: Optional[datetime] = None
|
||||||
|
monthly_reset_at: Optional[datetime] = None
|
||||||
|
|
||||||
class Config:
|
class Config:
|
||||||
from_attributes = True
|
from_attributes = True
|
||||||
@ -5,13 +5,23 @@ cd "$(dirname "$0")"
|
|||||||
IMAGE=mediaeng/llmproxy
|
IMAGE=mediaeng/llmproxy
|
||||||
PLATFORM=linux/arm64
|
PLATFORM=linux/arm64
|
||||||
|
|
||||||
VERSION=$(git describe --tags --always)
|
CURRENT=$(git describe --tags --always)
|
||||||
if [ -z "$VERSION" ]; then
|
if [ -z "$CURRENT" ]; then
|
||||||
echo "Fehler: git describe liefert kein Ergebnis"
|
echo "Fehler: git describe liefert kein Ergebnis"
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Version : $VERSION"
|
echo "Aktueller Tag: $CURRENT"
|
||||||
|
read -rp "Neuer Tag [${CURRENT}]: " INPUT
|
||||||
|
VERSION="${INPUT:-$CURRENT}"
|
||||||
|
|
||||||
|
if [ "$VERSION" != "$CURRENT" ]; then
|
||||||
|
git tag "$VERSION"
|
||||||
|
git push origin "$VERSION"
|
||||||
|
echo "Tag '$VERSION' gesetzt und gepusht."
|
||||||
|
fi
|
||||||
|
|
||||||
|
echo ""
|
||||||
echo "Image : $IMAGE"
|
echo "Image : $IMAGE"
|
||||||
echo "Platform: $PLATFORM"
|
echo "Platform: $PLATFORM"
|
||||||
echo "Tags : $IMAGE:$VERSION $IMAGE:latest"
|
echo "Tags : $IMAGE:$VERSION $IMAGE:latest"
|
||||||
|
|||||||
@ -1,24 +1,11 @@
|
|||||||
backend:
|
services:
|
||||||
build:
|
llmproxy:
|
||||||
context: ./backend
|
image: mediaeng/llmproxy:latest
|
||||||
dockerfile: Dockerfile
|
restart: unless-stopped
|
||||||
|
env_file: .env
|
||||||
ports:
|
ports:
|
||||||
- "8000:8000"
|
- "${PROXY_PORT:-8000}:${PROXY_PORT:-8000}"
|
||||||
environment:
|
- "127.0.0.1:8001:8001"
|
||||||
- DATABASE_URL=postgresql://ollama:password@db:5432/ollama_proxy
|
|
||||||
- OLLAMA_URL=http://ollama:11434
|
|
||||||
- SECRET_KEY=your-secret-key-change-me
|
|
||||||
depends_on:
|
|
||||||
- db
|
|
||||||
|
|
||||||
db:
|
|
||||||
image: postgres:16
|
|
||||||
environment:
|
|
||||||
- POSTGRES_USER=ollama
|
|
||||||
- POSTGRES_PASSWORD=password
|
|
||||||
- POSTGRES_DB=ollama_proxy
|
|
||||||
volumes:
|
volumes:
|
||||||
- postgres_data:/var/lib/postgresql/data
|
- ./backend/test.db:/app/backend/test.db
|
||||||
|
- ./backend/logs:/app/backend/logs
|
||||||
volumes:
|
|
||||||
postgres_data:
|
|
||||||
|
|||||||
@ -11,17 +11,28 @@ function authHeaders(token) {
|
|||||||
|
|
||||||
const fmtK = (n) => { const k = n / 1000; return k % 1 === 0 ? `${k}k` : `${k.toFixed(1)}k`; };
|
const fmtK = (n) => { const k = n / 1000; return k % 1 === 0 ? `${k}k` : `${k.toFixed(1)}k`; };
|
||||||
|
|
||||||
function QuotaBar({ used, limit, isToken = false }) {
|
function QuotaBar({ used, limit, isToken = false, since = null }) {
|
||||||
if (limit == null) return <span className="quota-unlimited">∞</span>;
|
const fmt = isToken ? fmtK : (n) => n.toLocaleString('de-DE');
|
||||||
|
const sinceLabel = since
|
||||||
|
? new Date(since).toLocaleDateString('de-DE', { day: '2-digit', month: '2-digit' })
|
||||||
|
: null;
|
||||||
|
|
||||||
|
if (limit == null) return (
|
||||||
|
<div className="quota-cell">
|
||||||
|
<span className="quota-unlimited">∞</span>
|
||||||
|
{sinceLabel && <span className="quota-since">seit {sinceLabel}</span>}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
|
||||||
const pct = Math.min(100, (used / limit) * 100);
|
const pct = Math.min(100, (used / limit) * 100);
|
||||||
const color = pct >= 90 ? '#e74c3c' : pct >= 70 ? '#e67e22' : '#27ae60';
|
const color = pct >= 90 ? '#e74c3c' : pct >= 70 ? '#e67e22' : '#27ae60';
|
||||||
const fmt = isToken ? fmtK : (n) => n.toLocaleString('de-DE');
|
|
||||||
return (
|
return (
|
||||||
<div className="quota-cell">
|
<div className="quota-cell">
|
||||||
<span className="quota-label">{fmt(used)} / {fmt(limit)}</span>
|
<span className="quota-label">{fmt(used)} / {fmt(limit)}</span>
|
||||||
<div className="progress-bar">
|
<div className="progress-bar">
|
||||||
<div className="progress-fill" style={{ width: `${pct}%`, backgroundColor: color }} />
|
<div className="progress-fill" style={{ width: `${pct}%`, backgroundColor: color }} />
|
||||||
</div>
|
</div>
|
||||||
|
{sinceLabel && <span className="quota-since">seit {sinceLabel}</span>}
|
||||||
</div>
|
</div>
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
@ -400,10 +411,10 @@ function App() {
|
|||||||
<td>{key.name}</td>
|
<td>{key.name}</td>
|
||||||
<td>{displayKey(key.key_prefix)}</td>
|
<td>{displayKey(key.key_prefix)}</td>
|
||||||
<td>{key.expires_at ? new Date(key.expires_at).toLocaleDateString('de-DE', { timeZone: 'Europe/Berlin' }) : '∞'}</td>
|
<td>{key.expires_at ? new Date(key.expires_at).toLocaleDateString('de-DE', { timeZone: 'Europe/Berlin' }) : '∞'}</td>
|
||||||
<td><QuotaBar used={key.tokens_used_today} limit={key.daily_tokens} isToken /></td>
|
<td><QuotaBar used={key.tokens_used_today} limit={key.daily_tokens} isToken since={key.daily_reset_at} /></td>
|
||||||
<td><QuotaBar used={key.tokens_used_month} limit={key.monthly_tokens} isToken /></td>
|
<td><QuotaBar used={key.tokens_used_month} limit={key.monthly_tokens} isToken since={key.monthly_reset_at} /></td>
|
||||||
<td><QuotaBar used={key.requests_today} limit={key.daily_requests} /></td>
|
<td><QuotaBar used={key.requests_today} limit={key.daily_requests} since={key.daily_reset_at} /></td>
|
||||||
<td><QuotaBar used={key.requests_month} limit={key.monthly_requests} /></td>
|
<td><QuotaBar used={key.requests_month} limit={key.monthly_requests} since={key.monthly_reset_at} /></td>
|
||||||
<td className="action-cell">
|
<td className="action-cell">
|
||||||
<button className="btn-icon btn-icon-edit" data-tooltip="Bearbeiten" onClick={() => handleEdit(key)}>✏</button>
|
<button className="btn-icon btn-icon-edit" data-tooltip="Bearbeiten" onClick={() => handleEdit(key)}>✏</button>
|
||||||
{key.is_active ? (
|
{key.is_active ? (
|
||||||
|
|||||||
@ -246,6 +246,13 @@ tr:hover {
|
|||||||
font-size: 14px;
|
font-size: 14px;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
.quota-since {
|
||||||
|
display: block;
|
||||||
|
font-size: 10px;
|
||||||
|
color: #aaa;
|
||||||
|
margin-top: 2px;
|
||||||
|
}
|
||||||
|
|
||||||
.progress-bar {
|
.progress-bar {
|
||||||
height: 4px;
|
height: 4px;
|
||||||
background: #e2e8f0;
|
background: #e2e8f0;
|
||||||
|
|||||||
@ -5,6 +5,7 @@ export default defineConfig({
|
|||||||
plugins: [react()],
|
plugins: [react()],
|
||||||
clearScreen: false,
|
clearScreen: false,
|
||||||
server: {
|
server: {
|
||||||
|
open: true,
|
||||||
proxy: {
|
proxy: {
|
||||||
'/api/api-keys': 'http://localhost:8001',
|
'/api/api-keys': 'http://localhost:8001',
|
||||||
'/api/settings': 'http://localhost:8001',
|
'/api/settings': 'http://localhost:8001',
|
||||||
|
|||||||