mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-20 23:26:41 +02:00
feat: GPU offload, signup limit, load tests & capacity planning
- Route all AI workloads (Ollama, STT, TTS, Image Gen) to GPU server (192.168.178.11) via LAN instead of host.docker.internal
- Upgrade default model to gemma3:12b and max concurrent to 5
- Add daily signup limit service (MAX_DAILY_SIGNUPS env var)
- Add GET /api/v1/auth/signup-status public endpoint
- Add k6 load test suite (web-apps, auth, sync-websocket, ollama)
- Add capacity planning documentation
- Fix: add eslint-config to sveltekit-base and calendar Dockerfiles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 16367384c7
commit 9276d9a212
12 changed files with 683 additions and 14 deletions
58
load-tests/README.md
Normal file
@@ -0,0 +1,58 @@
# Load Tests

k6-based load tests for the Mana infrastructure.

## Setup

```bash
# Install k6 (macOS)
brew install k6

# WebSocket extension (for the sync tests)
# Not needed: k6 has WebSocket support built in
```

## Running the Tests

```bash
# Against the local environment
k6 run load-tests/web-apps.js
k6 run load-tests/auth-api.js
k6 run load-tests/sync-websocket.js
k6 run load-tests/llm-ollama.js

# Against production (use with caution!)
k6 run -e BASE_URL=https://mana.how load-tests/web-apps.js

# With more or less load
k6 run --vus 100 --duration 5m load-tests/web-apps.js

# JSON output for Grafana
k6 run --out json=results.json load-tests/web-apps.js
```

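The `-e BASE_URL=...` flag sets an environment variable that k6 exposes to scripts through the global `__ENV` object. The following is a minimal sketch of how a script might pick its target from that variable. The helper name `resolveBaseUrl` and the fallback `http://localhost:3000` are assumptions for illustration; the default actually used by the scripts in this commit is not shown here.

```javascript
// Hypothetical fallback; the real scripts may use a different local URL.
const DEFAULT_BASE_URL = 'http://localhost:3000';

// Return the base URL without trailing slashes, so paths can be appended safely.
function resolveBaseUrl(env) {
  const raw = (env && env.BASE_URL) || DEFAULT_BASE_URL;
  return raw.replace(/\/+$/, '');
}

// k6 provides the global __ENV; the guard lets the snippet also run outside k6.
const env = typeof __ENV !== 'undefined' ? __ENV : {};
const BASE_URL = resolveBaseUrl(env);
```

Stripping the trailing slash keeps later concatenations such as `BASE_URL + '/api/v1/auth/signup-status'` free of double slashes.
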
## Test Scenarios

| Script | Target | Default VUs | Duration |
|--------|--------|-------------|----------|
| `web-apps.js` | SvelteKit frontends (HTML responses) | 10→50→10 | 5 min |
| `auth-api.js` | Login, register, token validation | 5→20→5 | 4 min |
| `sync-websocket.js` | mana-sync WebSocket connections | 10→30→10 | 5 min |
| `llm-ollama.js` | Ollama chat completions | 1→3→1 | 3 min |

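The `10→50→10` notation in the table reads as a ramp: start at about 10 virtual users, climb to 50, then fall back to 10. In k6 such a profile is usually expressed with the `stages` option, where each stage has a `duration` and a `target` VU count. The sketch below builds such an array. The helper `rampStages` is hypothetical, and the 20% / 60% / 20% split of the total duration is an assumption, since the table only gives the total.

```javascript
// Build a k6 `stages` array for a start -> peak -> start ramp.
// Assumption: 20% of the time ramping up, 60% climbing to peak, 20% ramping down.
function rampStages(startVus, peakVus, totalMinutes) {
  const totalSeconds = totalMinutes * 60;
  const edge = Math.round(totalSeconds * 0.2);
  const middle = totalSeconds - 2 * edge;
  return [
    { duration: `${edge}s`, target: startVus },  // ramp up to the starting level
    { duration: `${middle}s`, target: peakVus }, // climb to peak load
    { duration: `${edge}s`, target: startVus },  // ramp back down
  ];
}

// Profile for web-apps.js from the table: 10→50→10 over 5 min.
const webAppsOptions = { stages: rampStages(10, 50, 5) };

// In a k6 script this object would be exported as `options`:
//   export const options = webAppsOptions;
```
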
## Interpreting Metrics

| Metric | Good | Acceptable | Bad |
|--------|------|------------|-----|
| http_req_duration (p95) | < 200ms | < 1s | > 2s |
| http_req_failed | 0% | < 1% | > 5% |
| ws_connecting (p95) | < 100ms | < 500ms | > 1s |
| iterations | Rising | Stable | Falling |

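The numeric rows of the table can be turned into a small rating helper. This is only a sketch: the names `BANDS` and `rateMetric` are hypothetical, durations are assumed to be in milliseconds and failure rates in percent. The table leaves gaps between "acceptable" and "bad" (for example a p95 between 1s and 2s), so the sketch reports those values as `unclassified` rather than guessing.

```javascript
// Rating bands taken from the table; each entry checks one column.
const BANDS = {
  http_req_duration_p95_ms: {
    good: (v) => v < 200,
    acceptable: (v) => v < 1000,
    bad: (v) => v > 2000,
  },
  http_req_failed_pct: {
    good: (v) => v === 0,
    acceptable: (v) => v < 1,
    bad: (v) => v > 5,
  },
  ws_connecting_p95_ms: {
    good: (v) => v < 100,
    acceptable: (v) => v < 500,
    bad: (v) => v > 1000,
  },
};

// Rate a measured value; values in a gap the table does not cover are "unclassified".
function rateMetric(name, value) {
  const band = BANDS[name];
  if (!band) throw new Error(`unknown metric: ${name}`);
  if (band.good(value)) return 'good';
  if (band.acceptable(value)) return 'acceptable';
  if (band.bad(value)) return 'bad';
  return 'unclassified';
}
```

Inside a k6 script, the "acceptable" column could also be enforced directly through the `thresholds` option, for example `http_req_duration: ['p(95)<1000']` and `http_req_failed: ['rate<0.01']`, which makes the run fail when a limit is exceeded.
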
## Monitoring During Tests

Watch the Grafana dashboard at http://localhost:8080 (or https://grafana.mana.how):
- Container CPU/RAM (cAdvisor)
- PostgreSQL connections
- Redis commands/sec
- Network throughput