feat(gpu-server): add API key auth, VRAM management, and Piper TTS voices

- Add API key authentication to all GPU services (X-API-Key header)
  - /health and /docs remain public (no key needed)
  - Shared key configured via GPU_API_KEY env variable
- Add VRAM auto-unload for mana-image-gen (5min) and mana-stt (10min)
  - FLUX.2 pipeline freed after idle, recovering ~13GB VRAM
  - WhisperX models freed after idle, recovering ~3GB VRAM
- Install Piper TTS voices (Thorsten + Kerstin) for local German TTS
- Update @manacore/shared-gpu client to support apiKey parameter
- Add GPU_API_KEY to .env.development
- Document API auth and VRAM management in setup guide

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-14 18:41:08 +02:00 · 2026-03-27 21:54:35 +01:00 · c67ed0df14
commit c67ed0df14
parent 97ef728eca
7 changed files with 65 additions and 6 deletions
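
The VRAM auto-unload described in the commit message is not part of the diffs shown here, and the services themselves are probably not written in TypeScript. As a rough illustration of the idea only, an idle-based unloader could look like the sketch below. `IdleUnloader`, its method names, and the injected clock values are all hypothetical; only the 5-minute idle window for mana-image-gen comes from the message.

```typescript
/**
 * Hypothetical helper, not part of this commit: loads a resource on first use
 * and frees it once it has been idle for at least `idleMs`.
 */
class IdleUnloader<T> {
  private resource: T | undefined = undefined;
  private lastUsedMs = 0;
  private readonly load: () => T;
  private readonly unload: (resource: T) => void;
  private readonly idleMs: number;

  constructor(load: () => T, unload: (resource: T) => void, idleMs: number) {
    this.load = load;
    this.unload = unload;
    this.idleMs = idleMs;
  }

  /** Returns the resource, loading it if needed, and records the time of use. */
  acquire(nowMs: number = Date.now()): T {
    let resource = this.resource;
    if (resource === undefined) {
      resource = this.load();
      this.resource = resource;
    }
    this.lastUsedMs = nowMs;
    return resource;
  }

  /** Frees the resource if it has been idle long enough. Returns true if it was freed. */
  unloadIfIdle(nowMs: number = Date.now()): boolean {
    const resource = this.resource;
    if (resource === undefined) return false;
    if (nowMs - this.lastUsedMs < this.idleMs) return false;
    this.resource = undefined;
    this.unload(resource);
    return true;
  }

  isLoaded(): boolean {
    return this.resource !== undefined;
  }
}

// Assumed idle window for the image pipeline, taken from the commit message (5min).
const IMAGE_GEN_IDLE_MS = 5 * 60_000;
```

A service would call `acquire()` on every request and run `unloadIfIdle()` from a periodic timer. Passing the clock in as a parameter keeps the behaviour easy to test without waiting for real time to pass.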
--- a/.env.development
+++ b/.env.development
@@ -417,3 +417,8 @@ MUKKE_DATABASE_URL=postgresql://manacore:devpassword@localhost:5432/mukke
 CITYCORNERS_BACKEND_PORT=3025
 CITYCORNERS_DATABASE_URL=postgresql://manacore:devpassword@localhost:5432/citycorners
 CITYCORNERS_WEB_PORT=5196
+
+# GPU Server (Windows PC with RTX 3090)
+GPU_API_KEY=sk-gpu-cf483ede1e05e28fba5e56c94cd3c24e7c245e57816d3e86
+GPU_SERVER_URL=https://gpu.mana.how
+GPU_SERVER_LAN_URL=http://192.168.178.11
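
The three variables added above map directly onto the `baseUrl` and `apiKey` options of `GpuClient`. Because `process.env.GPU_API_KEY` may be `undefined`, a caller might want to fail early rather than send unauthenticated requests. The sketch below shows one way to do that; `gpuConfigFromEnv` and `GpuEnvConfig` are hypothetical names and not part of `@manacore/shared-gpu`.

```typescript
/** Hypothetical shape: the subset of the client config that comes from the environment. */
interface GpuEnvConfig {
  baseUrl: string;
  apiKey: string;
}

/**
 * Hypothetical helper: builds a client config from environment variables and
 * throws if a required variable is missing or empty.
 */
function gpuConfigFromEnv(
  env: Record<string, string | undefined>,
  preferLan: boolean = false,
): GpuEnvConfig {
  const apiKey = env['GPU_API_KEY'];
  if (!apiKey) {
    throw new Error('GPU_API_KEY is not set');
  }
  const urlVar = preferLan ? 'GPU_SERVER_LAN_URL' : 'GPU_SERVER_URL';
  const baseUrl = env[urlVar];
  if (!baseUrl) {
    throw new Error(`${urlVar} is not set`);
  }
  return { baseUrl, apiKey };
}
```

The result could then be passed on as `new GpuClient(gpuConfigFromEnv(process.env))`, assuming the client accepts exactly these two fields as the setup guide suggests.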
--- a/docs/WINDOWS_GPU_SERVER_SETUP.md
+++ b/docs/WINDOWS_GPU_SERVER_SETUP.md
@@ -614,13 +614,19 @@ GPU Server (healthcheck.py → log-shipper.py)
 Shared package in the monorepo (`packages/shared-gpu/`) for all GPU services:

 ```typescript
-import { GpuClient, GPU_PUBLIC_URLS } from '@manacore/shared-gpu';
+import { GpuClient } from '@manacore/shared-gpu';

-// Public (from anywhere)
-const gpu = new GpuClient({ baseUrl: 'https://gpu.mana.how' });
+// Public (from anywhere, with API key)
+const gpu = new GpuClient({
+  baseUrl: 'https://gpu.mana.how',
+  apiKey: process.env.GPU_API_KEY,
+});

 // Or LAN (direct, faster)
-const gpuLan = new GpuClient({ baseUrl: 'http://192.168.178.11' });
+const gpuLan = new GpuClient({
+  baseUrl: 'http://192.168.178.11',
+  apiKey: process.env.GPU_API_KEY,
+});

 // Speech-to-text (with word timestamps + speaker diarization)
 const transcript = await gpu.stt.transcribe(audioBuffer, 'recording.wav', {
@@ -644,6 +650,37 @@ const health = await gpu.healthCheck();

 ---

+## API Authentication
+
+All GPU services require an API key for access to protected endpoints.
+`/health` and `/docs` are public (no key needed).
+
+**API key:** In `.env.development` under `GPU_API_KEY`
+
+**Usage:**
+
+```bash
+# With header
+curl -H "X-API-Key: $GPU_API_KEY" https://gpu-llm.mana.how/v1/models
+
+# Or as a query parameter
+curl "https://gpu-stt.mana.how/models?api_key=$GPU_API_KEY"
+
+# Health (no key needed)
+curl https://gpu-llm.mana.how/health
+```
+
+**Configuration on the GPU server:**
+
+| Service | Env variable | File |
+|---|---|---|
+| mana-llm | `GPU_API_KEY` | `C:\mana\services\mana-llm\.env` |
+| mana-stt | `API_KEYS`, `INTERNAL_API_KEY` | `C:\mana\services\mana-stt\.env` |
+| mana-tts | `API_KEYS`, `INTERNAL_API_KEY` | `C:\mana\services\mana-tts\.env` |
+| mana-image-gen | `GPU_API_KEY` | `C:\mana\services\mana-image-gen\.env` |
+
+---
+
 ## Troubleshooting

 ### Server unreachable (no ping, no SSH)
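
The rule documented in the setup guide diff above (public `/health` and `/docs`, otherwise a key via the `X-API-Key` header or the `api_key` query parameter) can be expressed as a small pure function. This is only an illustration of the rule, not the services' actual implementation, which is not shown in this commit. The name `isAuthorized`, the lower-cased header map, and the choice to prefer the header over the query parameter are all assumptions.

```typescript
// Paths that stay reachable without a key, per the setup guide.
const PUBLIC_PATHS: ReadonlySet<string> = new Set(['/health', '/docs']);

/**
 * Hypothetical check: returns true if the request may proceed.
 * Assumes header names are already lower-cased, as Node's HTTP server does.
 */
function isAuthorized(
  path: string,
  headers: Record<string, string | undefined>,
  query: Record<string, string | undefined>,
  validKeys: ReadonlySet<string>,
): boolean {
  if (PUBLIC_PATHS.has(path)) {
    return true;
  }
  // Assumption: the header wins if both header and query parameter are present.
  const provided = headers['x-api-key'] ?? query['api_key'];
  return provided !== undefined && validKeys.has(provided);
}
```

Keeping the check free of framework types makes it easy to unit-test. A real service would wire the equivalent logic into its own middleware or dependency system.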
--- a/packages/shared-gpu/src/gpu-client.ts
+++ b/packages/shared-gpu/src/gpu-client.ts
@@ -28,8 +28,10 @@ export class GpuClient {
 	public readonly stt: SttClient;
 	public readonly tts: TtsClient;
 	public readonly image: ImageClient;
+	public readonly apiKey?: string;

 	constructor(config: GpuServiceConfig) {
+		this.apiKey = config.apiKey;
 		this.stt = new SttClient(config);
 		this.tts = new TtsClient(config);
 		this.image = new ImageClient(config);
--- a/packages/shared-gpu/src/image-client.ts
+++ b/packages/shared-gpu/src/image-client.ts
@@ -9,10 +9,12 @@ import { resolveServiceUrl } from './resolve-url';
 export class ImageClient {
 	private baseUrl: string;
 	private timeout: number;
+	private apiKey?: string;

 	constructor(config: GpuServiceConfig) {
 		this.baseUrl = resolveServiceUrl(config, 'image');
 		this.timeout = config.timeout ?? 120_000;
+		this.apiKey = config.apiKey;
 	}

 	/** Generate an image from a text prompt. */
@@ -23,7 +25,10 @@ export class ImageClient {
 		try {
 			const response = await fetch(`${this.baseUrl}/generate`, {
 				method: 'POST',
-				headers: { 'Content-Type': 'application/json' },
+				headers: {
+					'Content-Type': 'application/json',
+					...(this.apiKey ? { 'X-API-Key': this.apiKey } : {}),
+				},
 				body: JSON.stringify({
 					prompt: options.prompt,
 					width: options.width ?? 1024,
--- a/packages/shared-gpu/src/stt-client.ts
+++ b/packages/shared-gpu/src/stt-client.ts
@@ -4,10 +4,12 @@ import { resolveServiceUrl } from './resolve-url';
 export class SttClient {
 	private baseUrl: string;
 	private timeout: number;
+	private apiKey?: string;

 	constructor(config: GpuServiceConfig) {
 		this.baseUrl = resolveServiceUrl(config, 'stt');
 		this.timeout = config.timeout ?? 60_000;
+		this.apiKey = config.apiKey;
 	}

 	/** Transcribe audio with optional word timestamps and speaker diarization. */
@@ -34,6 +36,7 @@ export class SttClient {
 		try {
 			const response = await fetch(`${this.baseUrl}/transcribe`, {
 				method: 'POST',
+				headers: this.apiKey ? { 'X-API-Key': this.apiKey } : {},
 				body: formData,
 				signal: controller.signal,
 			});
--- a/packages/shared-gpu/src/tts-client.ts
+++ b/packages/shared-gpu/src/tts-client.ts
@@ -4,10 +4,12 @@ import { resolveServiceUrl } from './resolve-url';
 export class TtsClient {
 	private baseUrl: string;
 	private timeout: number;
+	private apiKey?: string;

 	constructor(config: GpuServiceConfig) {
 		this.baseUrl = resolveServiceUrl(config, 'tts');
 		this.timeout = config.timeout ?? 30_000;
+		this.apiKey = config.apiKey;
 	}

 	/** Synthesize speech. Returns audio as ArrayBuffer. */
@@ -23,7 +25,10 @@ export class TtsClient {
 		try {
 			const response = await fetch(`${this.baseUrl}/synthesize/auto`, {
 				method: 'POST',
-				headers: { 'Content-Type': 'application/json' },
+				headers: {
+					'Content-Type': 'application/json',
+					...(this.apiKey ? { 'X-API-Key': this.apiKey } : {}),
+				},
 				body: JSON.stringify({
 					text: options.text,
 					voice: options.voice,
--- a/packages/shared-gpu/src/types.ts
+++ b/packages/shared-gpu/src/types.ts
@@ -119,6 +119,8 @@ export interface GpuServiceConfig {
 		image?: string;
 		ollama?: string;
 	};
+	/** API key for authenticated access (X-API-Key header) */
+	apiKey?: string;
 	/** Request timeout in ms (default: 30000) */
 	timeout?: number;
 }
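
The same conditional `X-API-Key` expression now appears in `ImageClient`, `SttClient`, and `TtsClient`. One possible follow-up, sketched here under the assumption that a shared helper would be acceptable in this package, is to build the headers in one place. `authHeaders` is a hypothetical name and does not exist in the commit.

```typescript
/**
 * Hypothetical helper: merges extra headers with the X-API-Key header.
 * The key header is added only when an API key is configured, matching the
 * behaviour of the three clients in this commit.
 */
function authHeaders(
  apiKey?: string,
  extra: Record<string, string> = {},
): Record<string, string> {
  return apiKey ? { ...extra, 'X-API-Key': apiKey } : { ...extra };
}

// Usage as it might appear inside a client's fetch call:
//   headers: authHeaders(this.apiKey, { 'Content-Type': 'application/json' })
// For the multipart STT upload, no Content-Type is passed so that fetch can
// set the form boundary itself:
//   headers: authHeaders(this.apiKey)
```

This keeps the header name in a single location, so a later change (for example, to a different header) would touch one function rather than three clients.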