managarten

till/managarten

Fork 0

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-15 01:21:09 +02:00

Commit graph

Author	SHA1	Message	Date
Till JS	c67ed0df14	feat(gpu-server): add API key auth, VRAM management, and Piper TTS voices - Add API key authentication to all GPU services (X-API-Key header) - /health and /docs remain public (no key needed) - Shared key configured via GPU_API_KEY env variable - Add VRAM auto-unload for mana-image-gen (5min) and mana-stt (10min) - FLUX.2 pipeline freed after idle, recovering ~13GB VRAM - WhisperX models freed after idle, recovering ~3GB VRAM - Install Piper TTS voices (Thorsten + Kerstin) for local German TTS - Update @manacore/shared-gpu client to support apiKey parameter - Add GPU_API_KEY to .env.development - Document API auth and VRAM management in setup guide Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 21:54:35 +01:00
Till JS	16e0d99c5a	feat(gpu-server): complete GPU server setup with AI services, monitoring, and public access - Set up 5 AI services on Windows GPU server (RTX 3090): - mana-llm (Port 3025): OpenAI-compatible LLM gateway via Ollama - mana-stt (Port 3020): WhisperX with word timestamps + speaker diarization - mana-tts (Port 3022): Kokoro (EN) + Edge TTS (DE) + Piper (local DE) - mana-image-gen (Port 3023): FLUX.2 klein 4B image generation - Ollama (Port 11434): gemma3:4b/12b, qwen2.5-coder:14b, nomic-embed-text - Add @manacore/shared-gpu TypeScript client package with SttClient, TtsClient, ImageClient - Add CUDA-compatible whisper_service using faster-whisper for Windows - Configure public access via Cloudflare Tunnel (gpu-llm/stt/tts/img.mana.how) - Add Loki log aggregator (Docker on Mac Mini) + log shipper on GPU server - Add GPU scrape targets to Prometheus/VictoriaMetrics config - Add Grafana Loki datasource for GPU service logs - Add health check with auto-restart, log rotation, and log shipping - Document complete setup: Always-On config, troubleshooting, architecture Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 21:35:30 +01:00

Author

SHA1

Message

Date

Till JS

c67ed0df14

feat(gpu-server): add API key auth, VRAM management, and Piper TTS voices

- Add API key authentication to all GPU services (X-API-Key header)
  - /health and /docs remain public (no key needed)
  - Shared key configured via GPU_API_KEY env variable
- Add VRAM auto-unload for mana-image-gen (5min) and mana-stt (10min)
  - FLUX.2 pipeline freed after idle, recovering ~13GB VRAM
  - WhisperX models freed after idle, recovering ~3GB VRAM
- Install Piper TTS voices (Thorsten + Kerstin) for local German TTS
- Update @manacore/shared-gpu client to support apiKey parameter
- Add GPU_API_KEY to .env.development
- Document API auth and VRAM management in setup guide

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-27 21:54:35 +01:00

Till JS

16e0d99c5a

feat(gpu-server): complete GPU server setup with AI services, monitoring, and public access

- Set up 5 AI services on Windows GPU server (RTX 3090):
  - mana-llm (Port 3025): OpenAI-compatible LLM gateway via Ollama
  - mana-stt (Port 3020): WhisperX with word timestamps + speaker diarization
  - mana-tts (Port 3022): Kokoro (EN) + Edge TTS (DE) + Piper (local DE)
  - mana-image-gen (Port 3023): FLUX.2 klein 4B image generation
  - Ollama (Port 11434): gemma3:4b/12b, qwen2.5-coder:14b, nomic-embed-text

- Add @manacore/shared-gpu TypeScript client package with SttClient, TtsClient, ImageClient
- Add CUDA-compatible whisper_service using faster-whisper for Windows
- Configure public access via Cloudflare Tunnel (gpu-llm/stt/tts/img.mana.how)
- Add Loki log aggregator (Docker on Mac Mini) + log shipper on GPU server
- Add GPU scrape targets to Prometheus/VictoriaMetrics config
- Add Grafana Loki datasource for GPU service logs
- Add health check with auto-restart, log rotation, and log shipping
- Document complete setup: Always-On config, troubleshooting, architecture

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-27 21:35:30 +01:00

2 commits