managarten/services/mana-llm/src/api_auth.py
Till JS b8e18b7f82 chore(ai-services): adopt Windows GPU as source of truth for llm/stt/tts
The Windows GPU server has been the actual production home for these
services for some time, and the running code there has drifted ahead of
the repo. This sync pulls the live versions back into the repo so the
Windows box is no longer the only place those changes exist.

Pulled from C:\mana\services\* on mana-server-gpu (192.168.178.11):

mana-llm:
- src/main.py, src/config.py — small fixes (auth wiring, config tweaks)
- src/api_auth.py — NEW (cross-service GPU_API_KEY validator)
- service.pyw — Windows runner used by the ManaLLM scheduled task
  (sets up logging redirect, loads .env, calls uvicorn)

mana-stt:
- app/main.py — substantial cleanup (684→392 lines), drops the
  whisperx-as-separate-backend branching now that whisper_service.py
  rolls whisperx in directly
- app/whisper_service.py — full CUDA + whisperx rewrite (158→358 lines)
- app/auth.py + external_auth.py — significantly expanded auth
- app/vram_manager.py — NEW (shared VRAM accounting helper)
- service.pyw — Windows runner with CUDA pre-init, FFmpeg PATH
  injection, .env loading
- removed: app/whisper_service_cuda.py (folded into whisper_service.py)
- removed: app/whisperx_service.py (folded into whisper_service.py)

mana-tts:
- app/auth.py, external_auth.py — same auth expansion as stt
- app/f5_service.py, kokoro_service.py — Windows tweaks
- app/vram_manager.py — NEW (same shared helper as stt)
- service.pyw — Windows runner

mana-video-gen:
- service.pyw — Windows runner (no other changes; the .py code on the
  GPU box is byte-identical to what's already in the repo)

The service.pyw files contain absolute Windows paths
(C:\mana\services\<svc>) and a hardcoded FFmpeg PATH for the tills user
profile. Kept as-is intentionally — they exist to be deployed to that
one machine and any abstraction layer would just hide what's actually
happening. Anyone redeploying to a different layout will need to edit
the path strings, which is a known and obvious change.

Mac-Mini infrastructure for these services (launchd plists, install
scripts, scripts/mac-mini/setup-{stt,tts}.sh, the Mac-flux2c image-gen
implementation) is still on disk and will be removed in a follow-up
commit, along with replacing mana-image-gen with the Windows
diffusers+CUDA implementation. This commit is just the live-code sync.
2026-04-08 12:46:03 +02:00

"""
Simple API Key Authentication Middleware for GPU Services.

Checks the X-API-Key header or the ?api_key query parameter.
Skips auth for the /health, /docs, /openapi.json, /redoc, and /metrics endpoints.

Environment variables:
    GPU_API_KEY: Required API key (if empty, auth is disabled)
    GPU_REQUIRE_AUTH: Enable/disable auth (default: true if GPU_API_KEY is set)
"""

import hmac
import logging
import os

from fastapi import Request
from fastapi.responses import JSONResponse
from starlette.middleware.base import BaseHTTPMiddleware

logger = logging.getLogger(__name__)

GPU_API_KEY = os.getenv("GPU_API_KEY", "")
GPU_REQUIRE_AUTH = os.getenv("GPU_REQUIRE_AUTH", "true" if GPU_API_KEY else "false").lower() == "true"

# Endpoints that don't require auth
PUBLIC_PATHS = {"/health", "/docs", "/openapi.json", "/redoc", "/metrics"}


class ApiKeyMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        # Skip auth if disabled (an empty key disables auth even if GPU_REQUIRE_AUTH is true)
        if not GPU_REQUIRE_AUTH or not GPU_API_KEY:
            return await call_next(request)

        # Skip auth for public endpoints
        if request.url.path in PUBLIC_PATHS:
            return await call_next(request)

        # Check API key from header or query param (header takes precedence)
        api_key = request.headers.get("X-API-Key") or request.query_params.get("api_key")
        if not api_key:
            return JSONResponse(
                status_code=401,
                content={"detail": "Missing API key. Provide X-API-Key header."},
            )

        # Compare as bytes in constant time so response timing does not leak key contents
        if not hmac.compare_digest(api_key.encode("utf-8"), GPU_API_KEY.encode("utf-8")):
            client_host = request.client.host if request.client else "unknown"
            logger.warning("Invalid API key attempt from %s", client_host)
            return JSONResponse(
                status_code=401,
                content={"detail": "Invalid API key."},
            )

        return await call_next(request)
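
The decision order in `dispatch` (auth disabled, then public path, then key check) can be sketched as a standalone, stdlib-only function, which makes the edge cases easy to see without starting a server. This is a minimal sketch, not part of the synced file: `resolve_require_auth` and `auth_status` are hypothetical helper names, the environment is passed in as a mapping instead of being read at import time, and the constant-time comparison via `hmac.compare_digest` is a choice made for this sketch.

```python
import hmac
from typing import Mapping, Optional

# Same set as PUBLIC_PATHS in api_auth.py.
PUBLIC_PATHS = {"/health", "/docs", "/openapi.json", "/redoc", "/metrics"}


def resolve_require_auth(env: Mapping[str, str]) -> bool:
    """Hypothetical helper: replicate how GPU_REQUIRE_AUTH gets its default."""
    api_key = env.get("GPU_API_KEY", "")
    default = "true" if api_key else "false"
    return env.get("GPU_REQUIRE_AUTH", default).lower() == "true"


def auth_status(
    path: str,
    header_key: Optional[str],
    query_key: Optional[str],
    env: Mapping[str, str],
) -> int:
    """Hypothetical helper: return the HTTP status the middleware would settle on.

    200 means the request is passed through to the app, 401 means it is rejected.
    """
    api_key = env.get("GPU_API_KEY", "")

    # Auth is off when the flag is false OR when no key is configured at all.
    if not resolve_require_auth(env) or not api_key:
        return 200

    # Public endpoints never need a key.
    if path in PUBLIC_PATHS:
        return 200

    # The header wins over the query parameter when both are present.
    supplied = header_key or query_key
    if not supplied:
        return 401

    # Constant-time comparison on bytes (an assumption of this sketch).
    if not hmac.compare_digest(supplied.encode("utf-8"), api_key.encode("utf-8")):
        return 401
    return 200
```

Two cases are worth noting. Setting `GPU_REQUIRE_AUTH=true` while leaving `GPU_API_KEY` empty still lets every request through, because the empty key short-circuits the check. Setting `GPU_REQUIRE_AUTH=false` disables auth even when a key is configured.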