169 lines
4.9 KiB
Markdown
169 lines
4.9 KiB
Markdown
# mediaeng/llmproxy
|
||
|
||
A lightweight reverse proxy for [Ollama](https://ollama.com) that manages API keys with configurable token and request quotas. Incoming requests in OpenAI-compatible or Anthropic-compatible format are authenticated, checked against the quota, and forwarded to the configured Ollama server.
|
||
|
||
## Features
|
||
|
||
- OpenAI-compatible endpoint (`/v1/chat/completions`, `/v1/models`)
|
||
- Anthropic Messages API (`/v1/messages`) — compatible with Claude Code CLI and Anthropic SDK clients
|
||
- API key management with daily and monthly token/request limits
|
||
- Web-based admin interface (port 8001)
|
||
- Model lock: enforces a specific model for all requests (useful for courses and lab sessions)
|
||
- Streaming support (Server-Sent Events)
|
||
- Tool use / function calling passthrough
|
||
- Rotating usage logs
|
||
- SQLite (default) or PostgreSQL
|
||
|
||
## Ports
|
||
|
||
| Port | Service |
|
||
|------|---------|
|
||
| `8000` | Proxy endpoint (OpenAI and Anthropic API) |
|
||
| `8001` | Admin API + web interface |
|
||
|
||
All API endpoints require the `ADMIN_PASSWORD` — without a valid token, only the public frontend files (HTML/JS/CSS of the login page) are accessible. The password is therefore the primary protection.
|
||
|
||
## Environment Variables
|
||
|
||
| Variable | Default | Description |
|
||
|----------|---------|-------------|
|
||
| `ADMIN_PASSWORD` | – | **Required.** Password for the admin interface |
|
||
| `OLLAMA_URL` | `http://localhost:11434` | URL of the Ollama server (without `/v1` suffix) |
|
||
| `DATABASE_URL` | `sqlite:///./test.db` | Database connection string (SQLite or PostgreSQL) |
|
||
| `PROXY_HOST` | `0.0.0.0` | Proxy bind address |
|
||
| `PROXY_PORT` | `8000` | Proxy port |
|
||
| `ADMIN_HOST` | `0.0.0.0` | Admin API bind address (`127.0.0.1` to restrict to local access) |
|
||
| `ADMIN_PORT` | `8001` | Admin API port |
|
||
| `APP_TZ` | `Europe/Berlin` | Timezone for daily/monthly quota resets |
|
||
| `LOG_FILE` | `logs/usage.log` | Path of the rotating usage log file |
|
||
| `ANTHROPIC_DEFAULT_MODEL` | – | Default model for `/v1/messages` (Ollama model name, e.g. `llama3`) |
|
||
|
||
## Docker Compose – Ollama on the Host (Linux, recommended)
|
||
|
||
`network_mode: host` gives the container direct access to the host network stack. Ollama runs on the host and is reachable at `localhost:11434` — not visible from outside. The proxy and admin interface are available directly on host ports 8000 and 8001.
|
||
|
||
```yaml
|
||
services:
|
||
llmproxy:
|
||
image: mediaeng/llmproxy:latest
|
||
container_name: llmproxy
|
||
restart: unless-stopped
|
||
network_mode: host
|
||
env_file: .env
|
||
volumes:
|
||
- llmproxy-data:/app/backend
|
||
|
||
volumes:
|
||
llmproxy-data:
|
||
```
|
||
|
||
`.env`:
|
||
```env
|
||
ADMIN_PASSWORD=changeme
|
||
OLLAMA_URL=http://localhost:11434
|
||
APP_TZ=Europe/Berlin
|
||
ANTHROPIC_DEFAULT_MODEL=llama3
|
||
```
|
||
|
||
## Docker Compose – Ollama as Container, SQLite
|
||
|
||
Ollama and llmproxy run together in Docker. Ollama is not exposed externally.
|
||
|
||
```yaml
|
||
services:
|
||
llmproxy:
|
||
image: mediaeng/llmproxy:latest
|
||
restart: unless-stopped
|
||
ports:
|
||
- "8000:8000"
|
||
- "8001:8001"
|
||
environment:
|
||
ADMIN_PASSWORD: changeme
|
||
OLLAMA_URL: http://ollama:11434
|
||
APP_TZ: Europe/Berlin
|
||
ANTHROPIC_DEFAULT_MODEL: llama3
|
||
volumes:
|
||
- llmproxy-data:/app/backend
|
||
depends_on:
|
||
- ollama
|
||
|
||
ollama:
|
||
image: ollama/ollama:latest
|
||
restart: unless-stopped
|
||
volumes:
|
||
- ollama-data:/root/.ollama
|
||
|
||
volumes:
|
||
llmproxy-data:
|
||
ollama-data:
|
||
```
|
||
|
||
## Docker Compose – Ollama as Container, PostgreSQL
|
||
|
||
For production environments with an external database.
|
||
|
||
```yaml
|
||
services:
|
||
llmproxy:
|
||
image: mediaeng/llmproxy:latest
|
||
restart: unless-stopped
|
||
ports:
|
||
- "8000:8000"
|
||
- "8001:8001"
|
||
environment:
|
||
ADMIN_PASSWORD: changeme
|
||
OLLAMA_URL: http://ollama:11434
|
||
APP_TZ: Europe/Berlin
|
||
DATABASE_URL: postgresql://llmproxy:secret@db:5432/llmproxy
|
||
ANTHROPIC_DEFAULT_MODEL: llama3
|
||
depends_on:
|
||
db:
|
||
condition: service_healthy
|
||
ollama:
|
||
condition: service_started
|
||
|
||
db:
|
||
image: postgres:16-alpine
|
||
restart: unless-stopped
|
||
environment:
|
||
POSTGRES_DB: llmproxy
|
||
POSTGRES_USER: llmproxy
|
||
POSTGRES_PASSWORD: secret
|
||
volumes:
|
||
- pg-data:/var/lib/postgresql/data
|
||
healthcheck:
|
||
test: ["CMD-SHELL", "pg_isready -U llmproxy"]
|
||
interval: 5s
|
||
timeout: 5s
|
||
retries: 5
|
||
|
||
ollama:
|
||
image: ollama/ollama:latest
|
||
restart: unless-stopped
|
||
volumes:
|
||
- ollama-data:/root/.ollama
|
||
|
||
volumes:
|
||
pg-data:
|
||
ollama-data:
|
||
```
|
||
|
||
## Client Configuration
|
||
|
||
**OpenAI-compatible client:**
|
||
```
|
||
Base URL: http://<host>:8000/v1
|
||
API Key: <API key created in the admin interface>
|
||
```
|
||
|
||
**Claude Code CLI:**
|
||
```bash
|
||
ANTHROPIC_BASE_URL=http://<host>:8000 \
|
||
ANTHROPIC_AUTH_TOKEN=<API key> \
|
||
claude
|
||
```
|
||
|
||
## Acknowledgements
|
||
|
||
The Anthropic Messages API endpoint (`/v1/messages`) was inspired by [free-claude-code](https://github.com/Alishahryar1/free-claude-code) by Ali Khokhar, which pursues a similar approach for routing Claude Code requests to alternative LLM backends.
|