diff --git a/docs/MAC_MINI_SERVER.md b/docs/MAC_MINI_SERVER.md
index 96f5c962f..f741377ad 100644
--- a/docs/MAC_MINI_SERVER.md
+++ b/docs/MAC_MINI_SERVER.md
@@ -318,13 +318,29 @@ Three LaunchAgents ensure automatic operation:
- Checks all services (HTTP + Docker)
- Sends notifications on failures
-### Disabled LaunchAgents
+### Disabled / removed LaunchAgents
-These LaunchAgents have been disabled since the GPU server migration:
-- `homebrew.mxcl.ollama.plist` — LLM runs on the GPU server
-- `com.mana.image-gen.plist` — image generation runs on the GPU server
+Since the GPU server migration, no AI services run on the Mac Mini
+anymore. The associated LaunchAgents are disabled and their repo templates
+have been removed:
+- `homebrew.mxcl.ollama.plist` — LLM runs on the GPU server (`gpu-llm.mana.how`)
+- `com.mana.image-gen.plist` — removed; image-gen runs as
+  Scheduled Task `ManaImageGen` on the GPU server (`gpu-img.mana.how`)
+- `com.mana.mana-stt.plist` — removed; STT as Task `ManaSTT`
+- `com.mana.mana-tts.plist` — removed; TTS as Task `ManaTTS`
+- `com.mana.vllm-voxtral.plist` — removed; vLLM-Voxtral no longer in use
- `com.mana.telegram-ollama-bot.plist` — bot disabled
+If old plists are still installed on a Mac Mini:
+
+```bash
+launchctl unload ~/Library/LaunchAgents/com.mana.image-gen.plist 2>/dev/null
+launchctl unload ~/Library/LaunchAgents/com.mana.mana-stt.plist 2>/dev/null
+launchctl unload ~/Library/LaunchAgents/com.mana.mana-tts.plist 2>/dev/null
+launchctl unload ~/Library/LaunchAgents/com.mana.vllm-voxtral.plist 2>/dev/null
+rm -f ~/Library/LaunchAgents/com.mana.{image-gen,mana-stt,mana-tts,vllm-voxtral}.plist
+```
+
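To double-check that none of the retired agents are still loaded, you can filter `launchctl list` output for their labels. A sketch; the `sample` variable stands in for real `launchctl list` output, which on the Mac Mini you would pipe in instead:

```shell
# Flag retired agents that still show up in launchctl output.
# `sample` is stand-in data; on the Mac Mini use:
#   launchctl list | grep -E 'com\.mana\.(image-gen|mana-stt|mana-tts|vllm-voxtral)'
sample='com.mana.homepage
com.mana.image-gen
com.apple.Finder'
echo "$sample" | grep -E 'com\.mana\.(image-gen|mana-stt|mana-tts|vllm-voxtral)'
```

An empty result (grep exit code 1) means the cleanup above worked.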
### Re-running the setup
If the LaunchAgents need to be set up again:
@@ -684,28 +700,28 @@ docker image prune -a
All AI services (LLM, image generation, STT, TTS) run on the Windows GPU server (RTX 3090, 24 GB VRAM) at `192.168.178.11`. The Mac Mini is a pure hosting server for web, API, DB, and sync.
-| Service | GPU-Server Port | Zugriff aus Docker |
-|---------|----------------|-------------------|
-| Ollama (LLM) | 11434 | `http://192.168.178.11:11434` |
-| STT (Whisper) | 3020 | `http://192.168.178.11:3020` |
-| TTS | 3022 | `http://192.168.178.11:3022` |
-| Image Gen | 3023 | `http://192.168.178.11:3023` |
+| Service | GPU-Server Port | Zugriff aus Docker | Public URL |
+|---------|----------------|-------------------|------------|
+| mana-llm | 3025 | `http://192.168.178.11:3025` | `gpu-llm.mana.how` |
+| mana-stt (Whisper) | 3020 | `http://192.168.178.11:3020` | `gpu-stt.mana.how` |
+| mana-tts | 3022 | `http://192.168.178.11:3022` | `gpu-tts.mana.how` |
+| mana-image-gen | 3023 | `http://192.168.178.11:3023` | `gpu-img.mana.how` |
+| mana-video-gen | 3026 | `http://192.168.178.11:3026` | `gpu-video.mana.how` |
+| Ollama | 11434 | `http://192.168.178.11:11434` | `gpu-ollama.mana.how` |
+
+Repo counterparts: `services/mana-{llm,stt,tts,image-gen,video-gen}/` — the `service.pyw` runners are executed directly on the Windows box as Scheduled Tasks.
All values can be overridden via env vars (`OLLAMA_URL`, `STT_SERVICE_URL`, `TTS_SERVICE_URL`, `IMAGE_GEN_SERVICE_URL`).
Cloud fallback if the GPU server is down: `mana-llm` has `AUTO_FALLBACK_ENABLED=true` (OpenRouter, Groq, Google).
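For example, a service environment could point at the GPU box explicitly. A sketch; the variable names come from the sentence above, the values are the LAN defaults from the table:

```shell
# Explicit overrides pointing services at the GPU server
# (values are the LAN defaults listed in the table above).
export OLLAMA_URL="http://192.168.178.11:11434"
export STT_SERVICE_URL="http://192.168.178.11:3020"
export TTS_SERVICE_URL="http://192.168.178.11:3022"
export IMAGE_GEN_SERVICE_URL="http://192.168.178.11:3023"
echo "$STT_SERVICE_URL"
```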
-### Ollama/FLUX.2 on the Mac Mini (disabled)
+### Ollama/FLUX.2 leftovers on the Mac Mini (disabled)
-Ollama and FLUX.2 used to be installed locally but have been disabled since 2026-03-28. The models are still on the SSD as a backup:
+Ollama and the old Mac Mini FLUX.2 (`flux2.c` MPS) used to be installed locally; they have been disabled since 2026-03-28. The associated repo setup scripts (`scripts/mac-mini/setup-image-gen.sh`, launchd plists) were removed on 2026-04-08; the models may still be on the SSD as a backup:
- `/Volumes/ManaData/ollama/` (~58 GB)
- `/Volumes/ManaData/flux2/` (~15 GB)
-To reactivate if needed:
-```bash
-brew services start ollama
-launchctl load ~/Library/LaunchAgents/com.mana.image-gen.plist
-```
+If you still find them on an old Mac Mini, just delete them — they no longer run and are not needed anywhere.
## External 4TB SSD
diff --git a/scripts/mac-mini/README.md b/scripts/mac-mini/README.md
index f63fbe457..6c02a3351 100644
--- a/scripts/mac-mini/README.md
+++ b/scripts/mac-mini/README.md
@@ -23,7 +23,6 @@ cd ~/projects/mana-monorepo
| Script | Purpose |
|--------|---------|
| `setup-autostart.sh` | Configure automatic startup on boot (run once) |
-| `setup-stt.sh` | Setup STT service (Whisper + Voxtral) |
| `startup.sh` | Main startup script (called by launchd) |
| `health-check.sh` | Check all services health |
| `status.sh` | Show full system status |
@@ -257,29 +256,18 @@ ollama list
ollama pull gemma3:4b
```
-### STT Service (Speech-to-Text)
+### AI Services (STT, TTS, LLM, Image-Gen, Video-Gen)
-The STT service provides Whisper and Voxtral transcription:
+These have moved off the Mac Mini entirely. They run on the Windows GPU
+server (`mana-server-gpu`) as Windows Scheduled Tasks. See
+[`docs/WINDOWS_GPU_SERVER_SETUP.md`](../../docs/WINDOWS_GPU_SERVER_SETUP.md)
+for setup, and the per-service `services/mana-{stt,tts,llm,image-gen,video-gen}/CLAUDE.md`
+files for endpoint details.
-```bash
-# Setup (first time)
-./scripts/mac-mini/setup-stt.sh
+Public URLs (proxied via Cloudflare Tunnel + the Mac Mini gpu-proxy):
-# Check status
-curl http://localhost:3020/health
-
-# Transcribe audio
-curl -X POST http://localhost:3020/transcribe \
- -F "file=@audio.mp3" \
- -F "language=de"
-
-# View logs
-tail -f /tmp/mana-stt.log
-```
-
-**Available endpoints:**
-- `POST /transcribe` - Whisper transcription (recommended)
-- `POST /transcribe/voxtral` - Voxtral transcription
-- `POST /transcribe/auto` - Auto-select model
-- `GET /health` - Health check
-- `GET /models` - List available models
+- `https://gpu-stt.mana.how`
+- `https://gpu-tts.mana.how`
+- `https://gpu-llm.mana.how`
+- `https://gpu-img.mana.how`
+- `https://gpu-video.mana.how`
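The hostnames follow one scheme (`gpu-<name>.mana.how`), so a quick health sweep can be generated from the short service names alone. A sketch; the `curl` line is commented out so the snippet stays offline-safe:

```shell
# Build the health-check URL for each GPU service from its short name
# (names taken from the URL list above).
for svc in stt tts llm img video; do
  url="https://gpu-${svc}.mana.how/health"
  echo "$url"
  # curl -fsS "$url"   # uncomment on a machine with network access
done
```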
diff --git a/scripts/mac-mini/launchd/com.mana.image-gen.plist b/scripts/mac-mini/launchd/com.mana.image-gen.plist
deleted file mode 100644
index b8355adc6..000000000
--- a/scripts/mac-mini/launchd/com.mana.image-gen.plist
+++ /dev/null
@@ -1,53 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-<dict>
-    <key>Label</key>
-    <string>com.mana.image-gen</string>
-    <key>ProgramArguments</key>
-    <array>
-        <string>/Users/mana/projects/mana-monorepo/services/mana-image-gen/.venv/bin/python3</string>
-        <string>-m</string>
-        <string>uvicorn</string>
-        <string>app.main:app</string>
-        <string>--host</string>
-        <string>0.0.0.0</string>
-        <string>--port</string>
-        <string>3025</string>
-    </array>
-    <key>WorkingDirectory</key>
-    <string>/Users/mana/projects/mana-monorepo/services/mana-image-gen</string>
-    <key>EnvironmentVariables</key>
-    <dict>
-        <key>PATH</key>
-        <string>/opt/homebrew/bin:/Users/mana/projects/mana-monorepo/services/mana-image-gen/.venv/bin:/usr/local/bin:/usr/bin:/bin</string>
-        <key>HOME</key>
-        <string>/Users/mana</string>
-        <key>PORT</key>
-        <string>3025</string>
-        <key>FLUX_BINARY</key>
-        <string>/Users/mana/flux2/flux</string>
-        <key>FLUX_MODEL_DIR</key>
-        <string>/Users/mana/flux2/model</string>
-        <key>DEFAULT_STEPS</key>
-        <string>4</string>
-        <key>GENERATION_TIMEOUT</key>
-        <string>300</string>
-        <key>CORS_ORIGINS</key>
-        <string>https://mana.how</string>
-    </dict>
-    <key>RunAtLoad</key>
-    <true/>
-    <key>KeepAlive</key>
-    <dict>
-        <key>SuccessfulExit</key>
-        <false/>
-        <key>Crashed</key>
-        <true/>
-    </dict>
-    <key>StandardOutPath</key>
-    <string>/tmp/mana-image-gen.log</string>
-    <key>StandardErrorPath</key>
-    <string>/tmp/mana-image-gen.error.log</string>
-</dict>
-</plist>
diff --git a/scripts/mac-mini/launchd/com.mana.mana-stt.plist b/scripts/mac-mini/launchd/com.mana.mana-stt.plist
deleted file mode 100644
index 9271a5668..000000000
--- a/scripts/mac-mini/launchd/com.mana.mana-stt.plist
+++ /dev/null
@@ -1,39 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-<dict>
-    <key>Label</key>
-    <string>com.mana.mana-stt</string>
-
-    <key>ProgramArguments</key>
-    <array>
-        <string>/bin/bash</string>
-        <string>-c</string>
-        <string>cd /Users/mana/projects/mana-monorepo/services/mana-stt &amp;&amp; set -a &amp;&amp; source .env &amp;&amp; set +a &amp;&amp; .venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 3020</string>
-    </array>
-
-    <key>WorkingDirectory</key>
-    <string>/Users/mana/projects/mana-monorepo/services/mana-stt</string>
-
-    <key>EnvironmentVariables</key>
-    <dict>
-        <key>PATH</key>
-        <string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
-    </dict>
-
-    <key>RunAtLoad</key>
-    <true/>
-
-    <key>KeepAlive</key>
-    <true/>
-
-    <key>StandardOutPath</key>
-    <string>/Users/mana/logs/mana-stt.log</string>
-
-    <key>StandardErrorPath</key>
-    <string>/Users/mana/logs/mana-stt.error.log</string>
-
-    <key>ThrottleInterval</key>
-    <integer>10</integer>
-</dict>
-</plist>
diff --git a/scripts/mac-mini/launchd/com.mana.mana-tts.plist b/scripts/mac-mini/launchd/com.mana.mana-tts.plist
deleted file mode 100644
index 084e39afb..000000000
--- a/scripts/mac-mini/launchd/com.mana.mana-tts.plist
+++ /dev/null
@@ -1,39 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-<dict>
-    <key>Label</key>
-    <string>com.mana.mana-tts</string>
-
-    <key>ProgramArguments</key>
-    <array>
-        <string>/bin/bash</string>
-        <string>-c</string>
-        <string>cd /Users/mana/projects/mana-monorepo/services/mana-tts &amp;&amp; set -a &amp;&amp; source .env &amp;&amp; set +a &amp;&amp; .venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 3022</string>
-    </array>
-
-    <key>WorkingDirectory</key>
-    <string>/Users/mana/projects/mana-monorepo/services/mana-tts</string>
-
-    <key>EnvironmentVariables</key>
-    <dict>
-        <key>PATH</key>
-        <string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
-    </dict>
-
-    <key>RunAtLoad</key>
-    <true/>
-
-    <key>KeepAlive</key>
-    <true/>
-
-    <key>StandardOutPath</key>
-    <string>/Users/mana/logs/mana-tts.log</string>
-
-    <key>StandardErrorPath</key>
-    <string>/Users/mana/logs/mana-tts.error.log</string>
-
-    <key>ThrottleInterval</key>
-    <integer>10</integer>
-</dict>
-</plist>
diff --git a/scripts/mac-mini/setup-image-gen.sh b/scripts/mac-mini/setup-image-gen.sh
deleted file mode 100755
index 9de8e8c55..000000000
--- a/scripts/mac-mini/setup-image-gen.sh
+++ /dev/null
@@ -1,198 +0,0 @@
-#!/bin/bash
-# Setup script for Mana Image Generation as a launchd service on Mac Mini
-# Run this on the Mac Mini server to install and start the image generation service
-
-set -e
-
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-REPO_DIR="$(cd "$SCRIPT_DIR/../.." && pwd)"
-SERVICE_DIR="$REPO_DIR/services/mana-image-gen"
-PLIST_NAME="com.mana.image-gen"
-PLIST_PATH="$HOME/Library/LaunchAgents/$PLIST_NAME.plist"
-
-# flux2.c paths (in home directory, no sudo required)
-FLUX_BINARY="$HOME/flux2/flux"
-FLUX_MODEL_DIR="$HOME/flux2/model"
-
-echo "=========================================="
-echo "Mana Image Generation - Mac Mini Setup"
-echo "=========================================="
-echo ""
-echo "Service directory: $SERVICE_DIR"
-echo "Plist path: $PLIST_PATH"
-echo "Flux binary: $FLUX_BINARY"
-echo "Flux model: $FLUX_MODEL_DIR"
-echo ""
-
-# Verify service directory exists
-if [[ ! -d "$SERVICE_DIR" ]]; then
- echo "Error: Service directory not found: $SERVICE_DIR"
- exit 1
-fi
-
-# Run main setup if venv doesn't exist or flux2.c not installed
-if [[ ! -d "$SERVICE_DIR/.venv" ]] || [[ ! -x "$FLUX_BINARY" ]]; then
- echo "Running setup (installs flux2.c + Python environment)..."
- echo ""
- "$SERVICE_DIR/setup.sh"
- echo ""
-fi
-
-# Verify flux2.c is available
-if [[ ! -x "$FLUX_BINARY" ]]; then
- echo "Error: flux2.c not found at $FLUX_BINARY"
- echo "Please run setup.sh first to install flux2.c"
- exit 1
-fi
-
-if [[ ! -d "$FLUX_MODEL_DIR" ]]; then
- echo "Error: Model not found at $FLUX_MODEL_DIR"
- echo "Please download the FLUX.2 klein 4B model"
- exit 1
-fi
-
-# Create LaunchAgents directory if needed
-mkdir -p "$HOME/Library/LaunchAgents"
-
-# Unload existing service if running
-if launchctl list | grep -q "$PLIST_NAME"; then
- echo "Stopping existing service..."
- launchctl unload "$PLIST_PATH" 2>/dev/null || true
-fi
-
-# Create plist file
-echo "Creating launchd plist..."
-cat > "$PLIST_PATH" << EOF
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-<dict>
-    <key>Label</key>
-    <string>$PLIST_NAME</string>
-
-    <key>ProgramArguments</key>
-    <array>
-        <string>$SERVICE_DIR/.venv/bin/uvicorn</string>
-        <string>app.main:app</string>
-        <string>--host</string>
-        <string>0.0.0.0</string>
-        <string>--port</string>
-        <string>3025</string>
-    </array>
-
-    <key>WorkingDirectory</key>
-    <string>$SERVICE_DIR</string>
-
-    <key>EnvironmentVariables</key>
-    <dict>
-        <key>PATH</key>
-        <string>/opt/homebrew/bin:$SERVICE_DIR/.venv/bin:/usr/local/bin:/usr/bin:/bin</string>
-        <key>PORT</key>
-        <string>3025</string>
-        <key>FLUX_BINARY</key>
-        <string>$FLUX_BINARY</string>
-        <key>FLUX_MODEL_DIR</key>
-        <string>$FLUX_MODEL_DIR</string>
-        <key>DEFAULT_STEPS</key>
-        <string>4</string>
-        <key>DEFAULT_WIDTH</key>
-        <string>1024</string>
-        <key>DEFAULT_HEIGHT</key>
-        <string>1024</string>
-        <key>GENERATION_TIMEOUT</key>
-        <string>120</string>
-        <key>CORS_ORIGINS</key>
-        <string>https://mana.how,http://localhost:5173</string>
-    </dict>
-
-    <key>RunAtLoad</key>
-    <true/>
-
-    <key>KeepAlive</key>
-    <dict>
-        <key>SuccessfulExit</key>
-        <false/>
-        <key>Crashed</key>
-        <true/>
-    </dict>
-
-    <key>ThrottleInterval</key>
-    <integer>10</integer>
-
-    <key>StandardOutPath</key>
-    <string>/tmp/mana-image-gen.log</string>
-
-    <key>StandardErrorPath</key>
-    <string>/tmp/mana-image-gen.error.log</string>
-</dict>
-</plist>
-EOF
-
-echo "Plist created: $PLIST_PATH"
-
-# Load service
-echo ""
-echo "Loading service..."
-launchctl load "$PLIST_PATH"
-
-# Wait for startup
-echo "Waiting for service to start..."
-sleep 3
-
-# Check if running
-if launchctl list | grep -q "$PLIST_NAME"; then
- echo "Service loaded successfully!"
-else
- echo "Warning: Service may not have loaded correctly."
- echo "Check logs: tail -f /tmp/mana-image-gen.log"
-fi
-
-# Health check
-echo ""
-echo "Running health check..."
-sleep 2
-
-if curl -s http://localhost:3025/health | grep -q "healthy\|degraded"; then
- echo "Health check passed!"
- echo ""
- curl -s http://localhost:3025/health | python3 -m json.tool
-else
- echo "Health check failed. Service may still be starting."
- echo "Try again in a few seconds: curl http://localhost:3025/health"
-fi
-
-echo ""
-echo "=========================================="
-echo "Setup Complete!"
-echo "=========================================="
-echo ""
-echo "Service management commands:"
-echo ""
-echo " # View logs"
-echo " tail -f /tmp/mana-image-gen.log"
-echo ""
-echo " # Stop service"
-echo " launchctl unload $PLIST_PATH"
-echo ""
-echo " # Start service"
-echo " launchctl load $PLIST_PATH"
-echo ""
-echo " # Restart service"
-echo " launchctl unload $PLIST_PATH && launchctl load $PLIST_PATH"
-echo ""
-echo " # Check status"
-echo " launchctl list | grep $PLIST_NAME"
-echo ""
-echo "Test endpoints:"
-echo ""
-echo " # Health check"
-echo " curl http://localhost:3025/health"
-echo ""
-echo " # Model info"
-echo " curl http://localhost:3025/models"
-echo ""
-echo " # Generate image"
-echo " curl -X POST http://localhost:3025/generate \\"
-echo " -H 'Content-Type: application/json' \\"
-echo " -d '{\"prompt\": \"A cat in space\"}'"
-echo ""
diff --git a/scripts/mac-mini/setup-stt.sh b/scripts/mac-mini/setup-stt.sh
deleted file mode 100755
index 398246ce2..000000000
--- a/scripts/mac-mini/setup-stt.sh
+++ /dev/null
@@ -1,153 +0,0 @@
-#!/bin/bash
-# Setup STT Service on Mac Mini
-# Creates launchd service for auto-start
-
-set -e
-
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-REPO_DIR="$(cd "$SCRIPT_DIR/../.." && pwd)"
-STT_DIR="$REPO_DIR/services/mana-stt"
-PLIST_NAME="com.mana.stt"
-PLIST_PATH="$HOME/Library/LaunchAgents/$PLIST_NAME.plist"
-
-echo "=============================================="
-echo " Mana STT Service Setup (Mac Mini)"
-echo "=============================================="
-echo ""
-
-# Check if STT service directory exists
-if [ ! -d "$STT_DIR" ]; then
- echo "Error: STT service directory not found at $STT_DIR"
- exit 1
-fi
-
-# Run the main setup script first
-echo "1. Running STT service setup..."
-cd "$STT_DIR"
-if [ ! -d ".venv" ]; then
- echo " Installing dependencies..."
- ./setup.sh
-else
- echo " Virtual environment already exists"
- echo " Skipping dependency installation"
-fi
-
-# Create launchd plist
-echo ""
-echo "2. Creating launchd service..."
-
-cat > "$PLIST_PATH" << EOF
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-<dict>
-    <key>Label</key>
-    <string>$PLIST_NAME</string>
-
-    <key>ProgramArguments</key>
-    <array>
-        <string>$STT_DIR/.venv/bin/uvicorn</string>
-        <string>app.main:app</string>
-        <string>--host</string>
-        <string>0.0.0.0</string>
-        <string>--port</string>
-        <string>3020</string>
-    </array>
-
-    <key>WorkingDirectory</key>
-    <string>$STT_DIR</string>
-
-    <key>EnvironmentVariables</key>
-    <dict>
-        <key>PATH</key>
-        <string>/opt/homebrew/bin:$STT_DIR/.venv/bin:/usr/local/bin:/usr/bin:/bin</string>
-        <key>PORT</key>
-        <string>3020</string>
-        <key>WHISPER_MODEL</key>
-        <string>large-v3</string>
-        <key>PRELOAD_MODELS</key>
-        <string>false</string>
-        <key>CORS_ORIGINS</key>
-        <string>https://mana.how</string>
-    </dict>
-
-    <key>RunAtLoad</key>
-    <true/>
-
-    <key>KeepAlive</key>
-    <dict>
-        <key>SuccessfulExit</key>
-        <false/>
-        <key>Crashed</key>
-        <true/>
-    </dict>
-
-    <key>ThrottleInterval</key>
-    <integer>10</integer>
-
-    <key>StandardOutPath</key>
-    <string>/tmp/mana-stt.log</string>
-
-    <key>StandardErrorPath</key>
-    <string>/tmp/mana-stt.error.log</string>
-</dict>
-</plist>
-EOF
-
-echo " Created: $PLIST_PATH"
-
-# Unload if already loaded
-echo ""
-echo "3. Loading launchd service..."
-launchctl unload "$PLIST_PATH" 2>/dev/null || true
-launchctl load "$PLIST_PATH"
-
-# Wait for service to start
-sleep 2
-
-# Check if service is running
-echo ""
-echo "4. Checking service status..."
-if launchctl list | grep -q "$PLIST_NAME"; then
- echo " Service is running"
-
- # Check health endpoint
- sleep 3
- if curl -s http://localhost:3020/health > /dev/null 2>&1; then
- echo " Health check passed"
- HEALTH=$(curl -s http://localhost:3020/health)
- echo " $HEALTH"
- else
- echo " Warning: Health check failed (service may still be starting)"
- echo " Check logs: tail -f /tmp/mana-stt.log"
- fi
-else
- echo " Warning: Service may not be running"
- echo " Check logs: tail -f /tmp/mana-stt.error.log"
-fi
-
-echo ""
-echo "=============================================="
-echo " STT Service Setup Complete!"
-echo "=============================================="
-echo ""
-echo "Service URL: http://localhost:3020"
-echo ""
-echo "Useful commands:"
-echo " # View logs"
-echo " tail -f /tmp/mana-stt.log"
-echo ""
-echo " # Restart service"
-echo " launchctl kickstart -k gui/\$(id -u)/$PLIST_NAME"
-echo ""
-echo " # Stop service"
-echo " launchctl unload $PLIST_PATH"
-echo ""
-echo " # Start service"
-echo " launchctl load $PLIST_PATH"
-echo ""
-echo " # Test transcription"
-echo " curl -X POST http://localhost:3020/transcribe \\"
-echo " -F 'file=@audio.mp3' \\"
-echo " -F 'language=de'"
-echo ""
diff --git a/scripts/mac-mini/setup-tts.sh b/scripts/mac-mini/setup-tts.sh
deleted file mode 100755
index 4fe28c23a..000000000
--- a/scripts/mac-mini/setup-tts.sh
+++ /dev/null
@@ -1,172 +0,0 @@
-#!/bin/bash
-# Setup script for Mana TTS as a launchd service on Mac Mini
-# Run this on the Mac Mini server to install and start the TTS service
-
-set -e
-
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-REPO_DIR="$(cd "$SCRIPT_DIR/../.." && pwd)"
-SERVICE_DIR="$REPO_DIR/services/mana-tts"
-PLIST_NAME="com.mana.tts"
-PLIST_PATH="$HOME/Library/LaunchAgents/$PLIST_NAME.plist"
-
-echo "=========================================="
-echo "Mana TTS - Mac Mini Setup"
-echo "=========================================="
-echo ""
-echo "Service directory: $SERVICE_DIR"
-echo "Plist path: $PLIST_PATH"
-echo ""
-
-# Verify service directory exists
-if [[ ! -d "$SERVICE_DIR" ]]; then
- echo "Error: Service directory not found: $SERVICE_DIR"
- exit 1
-fi
-
-# Run main setup if venv doesn't exist
-if [[ ! -d "$SERVICE_DIR/.venv" ]]; then
- echo "Virtual environment not found. Running setup..."
- echo ""
- "$SERVICE_DIR/setup.sh"
- echo ""
-fi
-
-# Create LaunchAgents directory if needed
-mkdir -p "$HOME/Library/LaunchAgents"
-
-# Unload existing service if running
-if launchctl list | grep -q "$PLIST_NAME"; then
- echo "Stopping existing service..."
- launchctl unload "$PLIST_PATH" 2>/dev/null || true
-fi
-
-# Create plist file
-echo "Creating launchd plist..."
-cat > "$PLIST_PATH" << EOF
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-<dict>
-    <key>Label</key>
-    <string>$PLIST_NAME</string>
-
-    <key>ProgramArguments</key>
-    <array>
-        <string>$SERVICE_DIR/.venv/bin/uvicorn</string>
-        <string>app.main:app</string>
-        <string>--host</string>
-        <string>0.0.0.0</string>
-        <string>--port</string>
-        <string>3022</string>
-    </array>
-
-    <key>WorkingDirectory</key>
-    <string>$SERVICE_DIR</string>
-
-    <key>EnvironmentVariables</key>
-    <dict>
-        <key>PATH</key>
-        <string>/opt/homebrew/bin:$SERVICE_DIR/.venv/bin:/usr/local/bin:/usr/bin:/bin</string>
-        <key>PORT</key>
-        <string>3022</string>
-        <key>PRELOAD_MODELS</key>
-        <string>false</string>
-        <key>MAX_TEXT_LENGTH</key>
-        <string>1000</string>
-        <key>CORS_ORIGINS</key>
-        <string>https://mana.how</string>
-    </dict>
-
-    <key>RunAtLoad</key>
-    <true/>
-
-    <key>KeepAlive</key>
-    <dict>
-        <key>SuccessfulExit</key>
-        <false/>
-        <key>Crashed</key>
-        <true/>
-    </dict>
-
-    <key>ThrottleInterval</key>
-    <integer>10</integer>
-
-    <key>StandardOutPath</key>
-    <string>/tmp/mana-tts.log</string>
-
-    <key>StandardErrorPath</key>
-    <string>/tmp/mana-tts.error.log</string>
-</dict>
-</plist>
-EOF
-
-echo "Plist created: $PLIST_PATH"
-
-# Load service
-echo ""
-echo "Loading service..."
-launchctl load "$PLIST_PATH"
-
-# Wait for startup
-echo "Waiting for service to start..."
-sleep 3
-
-# Check if running
-if launchctl list | grep -q "$PLIST_NAME"; then
- echo "Service loaded successfully!"
-else
- echo "Warning: Service may not have loaded correctly."
- echo "Check logs: tail -f /tmp/mana-tts.log"
-fi
-
-# Health check
-echo ""
-echo "Running health check..."
-sleep 2
-
-if curl -s http://localhost:3022/health | grep -q "healthy"; then
- echo "Health check passed!"
- echo ""
- curl -s http://localhost:3022/health | python3 -m json.tool
-else
- echo "Health check failed. Service may still be starting."
- echo "Try again in a few seconds: curl http://localhost:3022/health"
-fi
-
-echo ""
-echo "=========================================="
-echo "Setup Complete!"
-echo "=========================================="
-echo ""
-echo "Service management commands:"
-echo ""
-echo " # View logs"
-echo " tail -f /tmp/mana-tts.log"
-echo ""
-echo " # Stop service"
-echo " launchctl unload $PLIST_PATH"
-echo ""
-echo " # Start service"
-echo " launchctl load $PLIST_PATH"
-echo ""
-echo " # Restart service"
-echo " launchctl unload $PLIST_PATH && launchctl load $PLIST_PATH"
-echo ""
-echo " # Check status"
-echo " launchctl list | grep $PLIST_NAME"
-echo ""
-echo "Test endpoints:"
-echo ""
-echo " # Health check"
-echo " curl http://localhost:3022/health"
-echo ""
-echo " # List voices"
-echo " curl http://localhost:3022/voices"
-echo ""
-echo " # Synthesize with Kokoro"
-echo " curl -X POST http://localhost:3022/synthesize/kokoro \\"
-echo " -H 'Content-Type: application/json' \\"
-echo " -d '{\"text\": \"Hello world\", \"voice\": \"af_heart\"}' \\"
-echo " --output test.wav"
-echo ""
diff --git a/services/mana-stt/CLAUDE.md b/services/mana-stt/CLAUDE.md
index 6d91d86a1..0a98c2386 100644
--- a/services/mana-stt/CLAUDE.md
+++ b/services/mana-stt/CLAUDE.md
@@ -1,79 +1,96 @@
# mana-stt
-Speech-to-Text service for the Mana ecosystem. Runs on the Mac Mini M4 (Apple Silicon) and exposes a small FastAPI surface that wraps multiple Whisper backends plus Mistral's hosted Voxtral API.
+Speech-to-Text microservice. Wraps Whisper (CUDA, with WhisperX for word-level timestamps + diarization), local Voxtral via vLLM, and Mistral's hosted Voxtral API behind a small FastAPI surface. Lives on the Windows GPU server (`mana-server-gpu`, RTX 3090).
+
+> ⚠️ **Earlier history**: this directory used to contain Mac-Mini–targeted
+> code (Whisper Lightning MLX, `com.mana.mana-stt.plist` launchd setup,
+> `setup.sh` with Apple-Silicon checks). That all moved to the Windows
+> GPU box and was removed from the repo. If you're looking for the MLX
+> path, see git history.
## Tech Stack
| Layer | Technology |
|-------|------------|
-| **Runtime** | Python 3.11 + uvicorn |
+| **Runtime** | Python 3.11 + uvicorn (Windows) |
| **Framework** | FastAPI |
-| **Local model** | Whisper Large V3 via [`lightning-whisper-mlx`](https://github.com/mustafaaljadery/lightning-whisper-mlx) (Apple MLX) |
-| **Local model (rich)** | WhisperX for word-level timestamps + diarization |
-| **Cloud model** | Mistral Voxtral Mini API |
-| **Optional** | vLLM Voxtral (GPU) — see `vllm_service.py` |
-| **Auth** | JWT validation via mana-auth (`external_auth.py`) + API key fallback (`auth.py`) |
-| **Process supervision** | launchd via `com.mana.mana-stt.plist` |
+| **Whisper** | `whisperx` on CUDA (large-v3 + word alignment + pyannote diarization) |
+| **Voxtral (local)** | vLLM serving Voxtral 3B/4B/24B (`vllm_service.py`) |
+| **Voxtral (cloud)** | Mistral API (`voxtral_api_service.py`) |
+| **Auth** | Per-key + internal-key API auth (`app/auth.py`, JWT via mana-auth in `app/external_auth.py`) |
+| **VRAM** | Shared `vram_manager.py` accountant — coordinated with mana-tts and mana-image-gen so multiple GPU services don't OOM each other |
+| **Process supervision** | Windows Scheduled Task `ManaSTT` (AtLogOn) |
## Port: 3020
-## Quick Start
+## Where it runs
-```bash
-cd services/mana-stt
-./setup.sh # Create venv + install
-.venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 3020
-```
+| Host | Path on disk | Entrypoint |
+|------|--------------|------------|
+| Windows GPU server (`192.168.178.11`) | `C:\mana\services\mana-stt\` | `service.pyw` via Scheduled Task `ManaSTT` |
-Production runs via launchd on the Mac Mini — `install-service.sh` (single service) or `install-services.sh` (mana-stt + vllm-voxtral together).
+Public URL: `https://gpu-stt.mana.how` (via Cloudflare Tunnel + Mac Mini gpu-proxy).
## API Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Liveness + which backends are loaded |
-| GET | `/models` | List available STT models |
-| POST | `/transcribe` | Whisper MLX (default, fastest local) |
-| POST | `/transcribe/whisperx` | WhisperX with word-level timestamps + diarization |
-| POST | `/transcribe/voxtral` | Local Voxtral (vLLM) |
-| POST | `/transcribe/voxtral/api` | Mistral Voxtral API (cloud) |
-| POST | `/transcribe/auto` | Tries WhisperX first, falls back to Whisper MLX |
+| GET | `/models` | Available STT models |
+| POST | `/transcribe` | Whisper (WhisperX, default) — multipart `file` + optional `language` |
+| POST | `/transcribe/voxtral` | Local Voxtral via vLLM |
+| POST | `/transcribe/auto` | Routing helper — picks the best backend for the input |
-All `/transcribe*` endpoints accept multipart `file` upload + optional `language` form field. Auth via `Authorization: Bearer <token>` or `X-API-Key`.
+All endpoints (except `/health`) require `Authorization: Bearer <token>`. Tokens are validated against `API_KEYS` (per-app keys) or `INTERNAL_API_KEY` (no rate limit), and JWTs from mana-auth are also accepted via `external_auth.py`.
## Backends (`app/`)
| File | What it loads |
|------|---------------|
-| `whisper_service.py` | Whisper Large V3 via MLX (local, default) |
-| `whisper_service_cuda.py` | CUDA Whisper (only used on Windows GPU server) |
-| `whisperx_service.py` | WhisperX with diarization (local, slower, richer output) |
-| `voxtral_service.py` | Local Voxtral via vLLM (optional, needs the second launchd job) |
-| `voxtral_api_service.py` | Mistral hosted Voxtral API (cloud) |
-| `vllm_service.py` | vLLM client primitives shared with Voxtral |
-| `auth.py` | API key auth (fallback path) |
-| `external_auth.py` | JWT auth via mana-auth public key |
+| `whisper_service.py` | WhisperX on CUDA (large-v3 + alignment + pyannote diarization) |
+| `voxtral_service.py` | Local Voxtral via vLLM (slower start, richer multilingual) |
+| `voxtral_api_service.py` | Mistral hosted Voxtral API (cloud, no GPU needed) |
+| `vllm_service.py` | vLLM client primitives shared by Voxtral |
+| `vram_manager.py` | Shared VRAM accounting — same module also used by mana-tts and mana-image-gen |
+| `auth.py` | API-key auth (internal + per-app keys) |
+| `external_auth.py` | JWT validation via mana-auth |
-Backends are loaded lazily during the FastAPI lifespan and reported by `/health`. Missing dependencies (e.g. CUDA on Mac) are tolerated — the service starts without them.
+Backends are loaded lazily during the FastAPI lifespan and reported by `/health`.
-## Configuration
-
-Reads from `services/mana-stt/.env` (loaded by the launchd plist's `set -a; source .env; set +a`). Relevant variables:
+## Configuration (`.env` on the Windows GPU box)
```env
PORT=3020
-MANA_AUTH_URL=http://localhost:3001 # JWKS source for JWT verification
-MISTRAL_API_KEY=... # only needed for /transcribe/voxtral/api
-STT_API_KEY=... # legacy API key fallback
+WHISPER_MODEL=large-v3
+WHISPER_DEVICE=cuda
+WHISPER_COMPUTE_TYPE=float16
+WHISPER_DEFAULT_LANGUAGE=de
+PRELOAD_MODELS=true
+USE_VLLM=false
+HF_TOKEN=... # required for pyannote diarization models
+REQUIRE_AUTH=true
+API_KEYS=sk-app1:app1,sk-app2:app2
+INTERNAL_API_KEY=... # cross-service, no rate limit
+CORS_ORIGINS=https://mana.how,https://chat.mana.how
```
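The `API_KEYS` value is a comma-separated list of `key:app` pairs. A sketch of how that format decomposes (shell for illustration; the service itself parses it in Python, presumably in `app/auth.py`):

```shell
# Split API_KEYS=sk-app1:app1,sk-app2:app2 into one "app/key" line per entry.
API_KEYS="sk-app1:app1,sk-app2:app2"
echo "$API_KEYS" | tr ',' '\n' | while IFS=':' read -r key app; do
  echo "app=$app key=$key"
done
```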
## Operations
-- **Logs**: launchd writes to `~/Library/Logs/mana-stt.{out,err}.log` (see plist)
-- **Metrics**: Prometheus endpoint at `/metrics` if enabled in config; Grafana dashboard JSON checked in at `grafana-dashboard.json`
-- **Restart**: `launchctl kickstart -k gui/$(id -u)/com.mana.mana-stt`
+```powershell
+# Status
+Get-ScheduledTask -TaskName "ManaSTT" | Format-List TaskName, State
+Get-NetTCPConnection -LocalPort 3020 -State Listen
+
+# Restart
+Stop-ScheduledTask -TaskName "ManaSTT"
+Start-ScheduledTask -TaskName "ManaSTT"
+
+# Logs
+Get-Content C:\mana\services\mana-stt\service.log -Tail 50
+```
## Reference
-- `services/mana-stt/README.md` — user-facing setup, model download instructions, language coverage
-- `docs/LOCAL_STT_MODELS.md` — WER comparisons, model size/quality tradeoffs
+- `docs/WINDOWS_GPU_SERVER_SETUP.md` — Windows box setup, scheduled tasks, firewall, Cloudflare tunnel
+- `docs/LOCAL_STT_MODELS.md` — model comparisons (WER, latency, language coverage)
+- `services/mana-stt/grafana-dashboard.json` — Prometheus metrics dashboard
diff --git a/services/mana-stt/README.md b/services/mana-stt/README.md
index 7d76ce525..8e4abf5f1 100644
--- a/services/mana-stt/README.md
+++ b/services/mana-stt/README.md
@@ -1,185 +1,31 @@
# Mana STT Service
-Speech-to-Text API service with **Whisper (Lightning MLX)** and **Voxtral (Mistral API)**.
+Speech-to-Text API service running on the Windows GPU server (`mana-server-gpu`, RTX 3090). Wraps **WhisperX** (CUDA, large-v3 + word alignment + pyannote diarization), local **Voxtral via vLLM**, and the hosted **Mistral Voxtral API**.
-Optimized for Mac Mini M4 (Apple Silicon).
+For architecture, deployment, configuration, and operations see [`CLAUDE.md`](./CLAUDE.md) and [`docs/WINDOWS_GPU_SERVER_SETUP.md`](../../docs/WINDOWS_GPU_SERVER_SETUP.md).
-## Architecture
+## Port: 3020
-```
- ┌─────────────────────┐
- │ mana-stt (3020) │
- │ FastAPI │
- └─────────┬───────────┘
- │
- ┌─────────────────┼─────────────────┐
- ▼ ▼ ▼
- ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
- │ Whisper │ │ Voxtral API │ │ vLLM │
- │ MLX (Local) │ │ (Mistral) │ │ (Optional) │
- └──────────────┘ └──────────────┘ └──────────────┘
-```
+## Public URL
-## Features
-
-- **Whisper Large V3** - Best quality, 99+ languages, German WER 6-9% (local, MLX)
-- **Voxtral Mini** - Mistral API, speaker diarization support (cloud)
-- **Apple Silicon Optimized** - Uses MLX for fast local inference
-- **Automatic Fallback** - Falls back between backends automatically
-- **REST API** - Simple HTTP endpoints for integration
-
-## Quick Start
-
-### Installation
-
-```bash
-cd services/mana-stt
-./setup.sh
-```
-
-### Run Locally
-
-```bash
-source .venv/bin/activate
-uvicorn app.main:app --host 0.0.0.0 --port 3020
-```
-
-### Setup as System Service (Mac Mini)
-
-```bash
-./scripts/mac-mini/setup-stt.sh
-```
+`https://gpu-stt.mana.how` (via Cloudflare Tunnel + Mac Mini gpu-proxy)
## API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
-| `/health` | GET | Health check |
+| `/health` | GET | Health check + which backends are loaded |
| `/models` | GET | List available models |
-| `/transcribe` | POST | Whisper transcription |
-| `/transcribe/voxtral` | POST | Voxtral transcription |
-| `/transcribe/auto` | POST | Auto-select best model |
+| `/transcribe` | POST | Whisper / WhisperX transcription |
+| `/transcribe/voxtral` | POST | Voxtral transcription (local vLLM) |
+| `/transcribe/auto` | POST | Auto-select best backend for the input |
-## Usage Examples
+All endpoints (except `/health`) require `Authorization: Bearer <token>`.
-### Transcribe with Whisper (Recommended)
+## Quick Test
```bash
-curl -X POST http://localhost:3020/transcribe \
- -F "file=@recording.mp3" \
- -F "language=de"
-```
-
-Response:
-```json
-{
- "text": "Das ist ein Beispieltext...",
- "language": "de",
- "model": "whisper-large-v3-turbo"
-}
-```
-
-### Transcribe with Voxtral
-
-```bash
-curl -X POST http://localhost:3020/transcribe/voxtral \
- -F "file=@recording.mp3" \
- -F "language=de"
-```
-
-### Auto-Select Model
-
-```bash
-curl -X POST http://localhost:3020/transcribe/auto \
- -F "file=@recording.mp3" \
- -F "prefer=whisper"
-```
-
-## Configuration
-
-Environment variables:
-
-| Variable | Default | Description |
-|----------|---------|-------------|
-| `PORT` | `3020` | API server port |
-| `WHISPER_MODEL` | `large-v3` | Default Whisper model |
-| `PRELOAD_MODELS` | `false` | Load models on startup |
-| `CORS_ORIGINS` | `https://mana.how,...` | Allowed CORS origins |
-| `MISTRAL_API_KEY` | - | Required for Voxtral API |
-| `USE_VLLM` | `false` | Enable vLLM backend (experimental) |
-| `VLLM_URL` | `http://localhost:8100` | vLLM server URL |
-
-## Supported Audio Formats
-
-- MP3, WAV, M4A, FLAC, OGG, WebM, MP4
-- Max file size: 100MB
-- Any sample rate (automatically resampled to 16kHz)
-
-## Model Comparison
-
-| Model | German WER | Speed | VRAM | License |
-|-------|------------|-------|------|---------|
-| Whisper Large V3 Turbo | 6-9% | Fast | ~6 GB | MIT |
-| Voxtral Mini (3B) | 8-12% | Medium | ~4 GB | Apache 2.0 |
-
-## Logs
-
-```bash
-# Service logs
-tail -f /tmp/mana-stt.log
-
-# Error logs
-tail -f /tmp/mana-stt.error.log
-```
-
-## Troubleshooting
-
-### Model Download Slow
-
-First run downloads ~1.6 GB for Whisper and ~6 GB for Voxtral. Be patient.
-
-### Out of Memory
-
-Reduce batch size or use smaller model:
-```bash
-export WHISPER_MODEL=medium
-```
-
-### MPS Not Available
-
-Ensure PyTorch is installed with MPS support:
-```bash
-pip install torch torchvision torchaudio
-python -c "import torch; print(torch.backends.mps.is_available())"
-```
-
-## Integration
-
-### From Chat Backend (NestJS)
-
-```typescript
-const formData = new FormData();
-formData.append('file', audioBuffer, 'recording.webm');
-formData.append('language', 'de');
-
-const response = await fetch('http://localhost:3020/transcribe', {
- method: 'POST',
- body: formData,
-});
-
-const { text } = await response.json();
-```
-
-### From SvelteKit Web
-
-```typescript
-const formData = new FormData();
-formData.append('file', audioBlob, 'recording.webm');
-
-const response = await fetch('https://gpu-stt.mana.how/transcribe', {
- method: 'POST',
- body: formData,
-});
-
-const { text } = await response.json();
+curl -F "file=@audio.wav" -F "language=de" \
+ -H "Authorization: Bearer $INTERNAL_API_KEY" \
+ https://gpu-stt.mana.how/transcribe
```
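The removed "Supported Audio Formats" section documented MP3/WAV/M4A/FLAC/OGG/WebM/MP4 input with a 100 MB cap. Assuming those limits still hold on the GPU server, a minimal client-side pre-check could look like this (the limits are taken from the old docs, not verified against the current server config):

```python
from pathlib import Path

# Limits from the pre-migration STT docs; treat as assumptions if the
# server configuration has changed since the GPU migration.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".mp4"}
MAX_FILE_SIZE = 100 * 1024 * 1024  # 100 MB

def validate_upload(path: str) -> None:
    """Raise ValueError if the file would likely be rejected by the STT API."""
    p = Path(path)
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported format: {p.suffix}")
    if p.stat().st_size > MAX_FILE_SIZE:
        raise ValueError("file exceeds 100 MB limit")
```

Failing fast in the client avoids uploading a large file only to get a 4xx back; sample rate does not need checking, since the server resamples to 16 kHz.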
diff --git a/services/mana-stt/com.mana.mana-stt.plist b/services/mana-stt/com.mana.mana-stt.plist
deleted file mode 100644
index 9271a5668..000000000
--- a/services/mana-stt/com.mana.mana-stt.plist
+++ /dev/null
@@ -1,39 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-<dict>
-    <key>Label</key>
-    <string>com.mana.mana-stt</string>
-
-    <key>ProgramArguments</key>
-    <array>
-        <string>/bin/bash</string>
-        <string>-c</string>
-        <string>cd /Users/mana/projects/mana-monorepo/services/mana-stt &amp;&amp; set -a &amp;&amp; source .env &amp;&amp; set +a &amp;&amp; .venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 3020</string>
-    </array>
-
-    <key>WorkingDirectory</key>
-    <string>/Users/mana/projects/mana-monorepo/services/mana-stt</string>
-
-    <key>EnvironmentVariables</key>
-    <dict>
-        <key>PATH</key>
-        <string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
-    </dict>
-
-    <key>RunAtLoad</key>
-    <true/>
-
-    <key>KeepAlive</key>
-    <true/>
-
-    <key>StandardOutPath</key>
-    <string>/Users/mana/logs/mana-stt.log</string>
-
-    <key>StandardErrorPath</key>
-    <string>/Users/mana/logs/mana-stt.error.log</string>
-
-    <key>ThrottleInterval</key>
-    <integer>10</integer>
-</dict>
-</plist>
diff --git a/services/mana-stt/com.mana.vllm-voxtral.plist b/services/mana-stt/com.mana.vllm-voxtral.plist
deleted file mode 100644
index 197e41921..000000000
--- a/services/mana-stt/com.mana.vllm-voxtral.plist
+++ /dev/null
@@ -1,41 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-<dict>
-    <key>Label</key>
-    <string>com.mana.vllm-voxtral</string>
-
-    <key>ProgramArguments</key>
-    <array>
-        <string>/bin/bash</string>
-        <string>-c</string>
-        <string>cd /Users/mana/projects/mana-monorepo/services/mana-stt &amp;&amp; ./scripts/start-vllm-voxtral.sh</string>
-    </array>
-
-    <key>WorkingDirectory</key>
-    <string>/Users/mana/projects/mana-monorepo/services/mana-stt</string>
-
-    <key>EnvironmentVariables</key>
-    <dict>
-        <key>PATH</key>
-        <string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
-        <key>VLLM_PORT</key>
-        <string>8100</string>
-    </dict>
-
-    <key>RunAtLoad</key>
-    <true/>
-
-    <key>KeepAlive</key>
-    <true/>
-
-    <key>StandardOutPath</key>
-    <string>/Users/mana/logs/vllm-voxtral.log</string>
-
-    <key>StandardErrorPath</key>
-    <string>/Users/mana/logs/vllm-voxtral.error.log</string>
-
-    <key>ThrottleInterval</key>
-    <integer>30</integer>
-</dict>
-</plist>
diff --git a/services/mana-stt/install-service.sh b/services/mana-stt/install-service.sh
deleted file mode 100755
index 55a5200dc..000000000
--- a/services/mana-stt/install-service.sh
+++ /dev/null
@@ -1,45 +0,0 @@
-#!/bin/bash
-# Install mana-stt as a launchd service on macOS
-# Run this script on the Mac Mini server
-
-set -e
-
-SERVICE_NAME="com.mana.mana-stt"
-PLIST_FILE="$SERVICE_NAME.plist"
-SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
-LAUNCH_AGENTS_DIR="$HOME/Library/LaunchAgents"
-LOG_DIR="$HOME/logs"
-
-echo "Installing mana-stt launchd service..."
-
-# Create logs directory
-mkdir -p "$LOG_DIR"
-
-# Stop existing service if running
-if launchctl list | grep -q "$SERVICE_NAME"; then
- echo "Stopping existing service..."
- launchctl unload "$LAUNCH_AGENTS_DIR/$PLIST_FILE" 2>/dev/null || true
-fi
-
-# Copy plist to LaunchAgents
-cp "$SCRIPT_DIR/$PLIST_FILE" "$LAUNCH_AGENTS_DIR/"
-
-# Load the service
-echo "Loading service..."
-launchctl load "$LAUNCH_AGENTS_DIR/$PLIST_FILE"
-
-# Check status
-sleep 2
-if launchctl list | grep -q "$SERVICE_NAME"; then
- echo "Service installed and running!"
- echo ""
- echo "Useful commands:"
- echo " View logs: tail -f $LOG_DIR/mana-stt.log"
- echo " View errors: tail -f $LOG_DIR/mana-stt.error.log"
- echo " Stop: launchctl unload $LAUNCH_AGENTS_DIR/$PLIST_FILE"
- echo " Start: launchctl load $LAUNCH_AGENTS_DIR/$PLIST_FILE"
- echo " Health check: curl http://localhost:3020/health"
-else
- echo "ERROR: Service failed to start. Check logs at $LOG_DIR/mana-stt.error.log"
- exit 1
-fi
diff --git a/services/mana-stt/install-services.sh b/services/mana-stt/install-services.sh
deleted file mode 100755
index e5cd3dfbb..000000000
--- a/services/mana-stt/install-services.sh
+++ /dev/null
@@ -1,84 +0,0 @@
-#!/bin/bash
-# Install mana-stt and vllm-voxtral as launchd services on macOS
-# Run this script on the Mac Mini server
-
-set -e
-
-SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
-LAUNCH_AGENTS_DIR="$HOME/Library/LaunchAgents"
-LOG_DIR="$HOME/logs"
-
-echo "============================================"
-echo "Installing Mana STT Services"
-echo "============================================"
-echo ""
-
-# Create logs directory
-mkdir -p "$LOG_DIR"
-
-install_service() {
- local service_name="$1"
- local plist_file="$service_name.plist"
-
- echo "Installing $service_name..."
-
- # Stop existing service if running
- if launchctl list | grep -q "$service_name"; then
- echo " Stopping existing service..."
- launchctl unload "$LAUNCH_AGENTS_DIR/$plist_file" 2>/dev/null || true
- fi
-
- # Copy plist to LaunchAgents
- cp "$SCRIPT_DIR/$plist_file" "$LAUNCH_AGENTS_DIR/"
-
- # Load the service
- echo " Loading service..."
- launchctl load "$LAUNCH_AGENTS_DIR/$plist_file"
-
- sleep 2
- if launchctl list | grep -q "$service_name"; then
- echo " ✓ $service_name installed and running"
- else
- echo " ✗ $service_name failed to start"
- return 1
- fi
-}
-
-# Install vLLM first (STT depends on it)
-install_service "com.mana.vllm-voxtral"
-
-# Wait for vLLM to initialize
-echo ""
-echo "Waiting for vLLM server to initialize..."
-for i in {1..30}; do
- if curl -s http://localhost:8100/health > /dev/null 2>&1; then
- echo " ✓ vLLM server is ready"
- break
- fi
- if [ $i -eq 30 ]; then
- echo " ! vLLM server not responding yet (may still be loading model)"
- fi
- sleep 2
-done
-
-# Install STT service
-echo ""
-install_service "com.mana.mana-stt"
-
-echo ""
-echo "============================================"
-echo "Installation complete!"
-echo "============================================"
-echo ""
-echo "Services:"
-echo " vLLM Voxtral: http://localhost:8100"
-echo " Mana STT: http://localhost:3020"
-echo ""
-echo "Useful commands:"
-echo " View vLLM logs: tail -f $LOG_DIR/vllm-voxtral.log"
-echo " View STT logs: tail -f $LOG_DIR/mana-stt.log"
-echo " Health check: curl http://localhost:3020/health"
-echo ""
-echo "Stop all:"
-echo " launchctl unload $LAUNCH_AGENTS_DIR/com.mana.vllm-voxtral.plist"
-echo " launchctl unload $LAUNCH_AGENTS_DIR/com.mana.mana-stt.plist"
diff --git a/services/mana-stt/scripts/setup-vllm.sh b/services/mana-stt/scripts/setup-vllm.sh
deleted file mode 100755
index c6a6ad48f..000000000
--- a/services/mana-stt/scripts/setup-vllm.sh
+++ /dev/null
@@ -1,83 +0,0 @@
-#!/bin/bash
-# Setup vLLM for Voxtral on Mac Mini M4
-#
-# vLLM runs in CPU mode on macOS (no CUDA), but still provides
-# the optimized inference pipeline for Voxtral models.
-#
-# Usage: ./scripts/setup-vllm.sh
-
-set -e
-
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-SERVICE_DIR="$(dirname "$SCRIPT_DIR")"
-VENV_DIR="$SERVICE_DIR/.venv-vllm"
-
-echo "============================================"
-echo "vLLM Setup for Voxtral on Mac Mini M4"
-echo "============================================"
-echo ""
-
-# Check Python version
-PYTHON_VERSION=$(python3 --version 2>&1 | awk '{print $2}')
-PYTHON_MAJOR=$(echo $PYTHON_VERSION | cut -d. -f1)
-PYTHON_MINOR=$(echo $PYTHON_VERSION | cut -d. -f2)
-
-if [[ "$PYTHON_MAJOR" -lt 3 ]] || [[ "$PYTHON_MAJOR" -eq 3 && "$PYTHON_MINOR" -lt 10 ]]; then
- echo "Error: Python 3.10+ required (found $PYTHON_VERSION)"
- exit 1
-fi
-echo "Python version: $PYTHON_VERSION"
-
-# Create separate venv for vLLM (to avoid conflicts with whisper)
-echo ""
-echo "Creating virtual environment for vLLM..."
-python3 -m venv "$VENV_DIR"
-source "$VENV_DIR/bin/activate"
-
-# Upgrade pip
-pip install --upgrade pip --quiet
-
-# Install vLLM with audio support
-echo ""
-echo "Installing vLLM with audio support..."
-echo "This may take a few minutes..."
-
-# Install uv for faster package installation
-pip install uv --quiet
-
-# Install vLLM with audio support (nightly for best Voxtral support)
-uv pip install "vllm[audio]>=0.10.0" --extra-index-url https://wheels.vllm.ai/nightly 2>&1 || {
- echo "Nightly install failed, trying stable..."
- uv pip install "vllm[audio]>=0.10.0"
-}
-
-# Install mistral-common with audio
-uv pip install "mistral-common[audio]>=1.8.1"
-
-echo ""
-echo "============================================"
-echo "Installation complete!"
-echo "============================================"
-echo ""
-echo "To start Voxtral Mini 3B server:"
-echo " source $VENV_DIR/bin/activate"
-echo " vllm serve mistralai/Voxtral-Mini-3B-2507 \\"
-echo " --tokenizer_mode mistral \\"
-echo " --config_format mistral \\"
-echo " --load_format mistral \\"
-echo " --host 0.0.0.0 \\"
-echo " --port 8100"
-echo ""
-echo "To start Voxtral Realtime 4B server:"
-echo " source $VENV_DIR/bin/activate"
-echo " vllm serve mistralai/Voxtral-Mini-4B-Realtime-2602 \\"
-echo " --host 0.0.0.0 \\"
-echo " --port 8100"
-echo ""
-echo "API Endpoint: http://localhost:8100/v1/audio/transcriptions"
-echo ""
-echo "Test with:"
-echo " curl http://localhost:8100/v1/audio/transcriptions \\"
-echo " -F file=@test.mp3 \\"
-echo " -F model=mistralai/Voxtral-Mini-3B-2507 \\"
-echo " -F language=de"
diff --git a/services/mana-stt/scripts/start-vllm-voxtral.sh b/services/mana-stt/scripts/start-vllm-voxtral.sh
deleted file mode 100755
index 70259d59a..000000000
--- a/services/mana-stt/scripts/start-vllm-voxtral.sh
+++ /dev/null
@@ -1,41 +0,0 @@
-#!/bin/bash
-# Start vLLM server for Voxtral
-#
-# Usage: ./scripts/start-vllm-voxtral.sh [model]
-# model: "3b" (default) or "4b" for Realtime
-
-set -e
-
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-SERVICE_DIR="$(dirname "$SCRIPT_DIR")"
-VENV_DIR="$SERVICE_DIR/.venv-vllm"
-MODEL="${1:-3b}"
-PORT="${VLLM_PORT:-8100}"
-
-# Activate venv
-source "$VENV_DIR/bin/activate"
-
-echo "Starting vLLM Voxtral server..."
-echo "Port: $PORT"
-
-if [[ "$MODEL" == "4b" || "$MODEL" == "realtime" ]]; then
- echo "Model: Voxtral Mini 4B Realtime"
- exec vllm serve mistralai/Voxtral-Mini-4B-Realtime-2602 \
- --host 0.0.0.0 \
- --port "$PORT" \
- --max-model-len 4096 \
- --max-num-batched-tokens 4096 \
- --enforce-eager
-else
- echo "Model: Voxtral Mini 3B"
- # CPU mode needs smaller context and batched tokens
- exec vllm serve mistralai/Voxtral-Mini-3B-2507 \
- --tokenizer_mode mistral \
- --config_format mistral \
- --load_format mistral \
- --host 0.0.0.0 \
- --port "$PORT" \
- --max-model-len 4096 \
- --max-num-batched-tokens 4096 \
- --enforce-eager
-fi
diff --git a/services/mana-stt/setup.sh b/services/mana-stt/setup.sh
deleted file mode 100755
index f7c878bd3..000000000
--- a/services/mana-stt/setup.sh
+++ /dev/null
@@ -1,123 +0,0 @@
-#!/bin/bash
-# Mana STT Service Setup Script
-# For Mac Mini M4 (Apple Silicon)
-
-set -e
-
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-VENV_DIR="$SCRIPT_DIR/.venv"
-PYTHON_VERSION="3.11"
-
-echo "=============================================="
-echo " Mana STT Service Setup"
-echo " Whisper (Lightning MLX) + Voxtral"
-echo "=============================================="
-echo ""
-
-# Check if running on macOS
-if [[ "$(uname)" != "Darwin" ]]; then
- echo "Warning: This script is optimized for macOS (Apple Silicon)"
-fi
-
-# Check for Apple Silicon
-if [[ "$(uname -m)" != "arm64" ]]; then
- echo "Warning: Not running on Apple Silicon. MLX optimizations won't work."
-fi
-
-# Check Python version
-echo "1. Checking Python installation..."
-if command -v python3.11 &> /dev/null; then
- PYTHON_CMD="python3.11"
-elif command -v python3 &> /dev/null; then
- PYTHON_CMD="python3"
- PY_VERSION=$($PYTHON_CMD --version 2>&1 | cut -d' ' -f2 | cut -d'.' -f1,2)
- echo " Found Python $PY_VERSION"
-else
- echo "Error: Python 3 not found. Please install Python 3.11+"
- echo " brew install python@3.11"
- exit 1
-fi
-
-# Create virtual environment
-echo ""
-echo "2. Creating virtual environment..."
-if [ -d "$VENV_DIR" ]; then
- echo " Virtual environment already exists at $VENV_DIR"
- read -p " Recreate? (y/N) " -n 1 -r
- echo
- if [[ $REPLY =~ ^[Yy]$ ]]; then
- rm -rf "$VENV_DIR"
- $PYTHON_CMD -m venv "$VENV_DIR"
- echo " Virtual environment recreated"
- fi
-else
- $PYTHON_CMD -m venv "$VENV_DIR"
- echo " Virtual environment created at $VENV_DIR"
-fi
-
-# Activate virtual environment
-source "$VENV_DIR/bin/activate"
-
-# Upgrade pip
-echo ""
-echo "3. Upgrading pip..."
-pip install --upgrade pip wheel setuptools
-
-# Install dependencies
-echo ""
-echo "4. Installing dependencies..."
-echo " This may take several minutes (downloading large models)..."
-
-# Install PyTorch with MPS support first
-pip install torch torchvision torchaudio
-
-# Install MLX for Apple Silicon
-pip install mlx
-
-# Install other dependencies
-pip install -r "$SCRIPT_DIR/requirements.txt"
-
-# Install scipy for audio resampling (needed by Voxtral)
-pip install scipy
-
-echo ""
-echo "5. Verifying installation..."
-
-# Test imports
-python -c "import torch; print(f' PyTorch {torch.__version__} - MPS available: {torch.backends.mps.is_available()}')"
-python -c "import mlx; print(f' MLX installed')" 2>/dev/null || echo " MLX not available (CPU fallback)"
-python -c "import fastapi; print(f' FastAPI {fastapi.__version__}')"
-
-echo ""
-echo "6. Downloading Whisper model (large-v3)..."
-echo " This will download ~2.9 GB on first run..."
-# Pre-download the model
-python -c "
-from lightning_whisper_mlx import LightningWhisperMLX
-print(' Initializing Whisper model...')
-whisper = LightningWhisperMLX(model='large-v3', batch_size=12)
-print(' Whisper model ready!')
-" || echo " Note: Model will be downloaded on first transcription request"
-
-echo ""
-echo "=============================================="
-echo " Setup Complete!"
-echo "=============================================="
-echo ""
-echo "To start the STT service:"
-echo ""
-echo " cd $SCRIPT_DIR"
-echo " source .venv/bin/activate"
-echo " uvicorn app.main:app --host 0.0.0.0 --port 3020"
-echo ""
-echo "Or use the systemd/launchd service (recommended for production):"
-echo ""
-echo " ./scripts/mac-mini/setup-stt.sh"
-echo ""
-echo "API Endpoints:"
-echo " POST /transcribe - Whisper transcription"
-echo " POST /transcribe/voxtral - Voxtral transcription"
-echo " POST /transcribe/auto - Auto-select best model"
-echo " GET /health - Health check"
-echo " GET /models - List available models"
-echo ""
diff --git a/services/mana-tts/CLAUDE.md b/services/mana-tts/CLAUDE.md
index 6951e048b..78319c0da 100644
--- a/services/mana-tts/CLAUDE.md
+++ b/services/mana-tts/CLAUDE.md
@@ -1,125 +1,115 @@
-# CLAUDE.md - Mana TTS Service
+# mana-tts
-## Service Overview
+Text-to-Speech microservice. Wraps Kokoro (English presets), Piper (German, local ONNX), and F5-TTS (voice cloning) behind a small FastAPI surface. Lives on the Windows GPU server (`mana-server-gpu`, RTX 3090).
-Text-to-Speech microservice using MLX-optimized models for Apple Silicon:
+> ⚠️ **Earlier history**: this directory used to contain MLX-optimized
+> Mac-Mini code (`f5-tts-mlx`, `mlx-audio`, `setup.sh` with Apple Silicon
+> checks, `com.mana.mana-tts.plist` launchd setup). All of that moved to
+> the Windows GPU box and was removed from the repo. If you need the
+> MLX path, see git history.
-- **Port**: 3022
-- **Framework**: Python + FastAPI
-- **Models**: Kokoro-82M (fast), F5-TTS (voice cloning)
+## Tech Stack
-## Commands
+| Layer | Technology |
+|-------|------------|
+| **Runtime** | Python 3.11 + uvicorn (Windows) |
+| **Framework** | FastAPI |
+| **English (preset)** | Kokoro-82M (`kokoro_service.py`) |
+| **German (local)** | Piper ONNX with `kerstin_low.onnx` and `thorsten_medium.onnx` voices (`piper_service.py`) |
+| **Voice cloning** | F5-TTS on CUDA (`f5_service.py`) |
+| **Audio I/O** | `soundfile`, `pydub` |
+| **Auth** | Per-app-key + internal-key API auth (`auth.py`) + JWT via mana-auth (`external_auth.py`) |
+| **VRAM** | Shared `vram_manager.py` (same module as mana-stt + mana-image-gen) |
+| **Process supervision** | Windows Scheduled Task `ManaTTS` (AtLogOn) |
-```bash
-# Setup
-./setup.sh
+## Port: 3022
-# Development
-source .venv/bin/activate
-uvicorn app.main:app --host 0.0.0.0 --port 3022 --reload
+## Where it runs
-# Production (Mac Mini)
-../../scripts/mac-mini/setup-tts.sh
+| Host | Path on disk | Entrypoint |
+|------|--------------|------------|
+| Windows GPU server (`192.168.178.11`) | `C:\mana\services\mana-tts\` | `service.pyw` via Scheduled Task `ManaTTS` |
-# Test
-curl http://localhost:3022/health
+Public URL: `https://gpu-tts.mana.how`.
-# English (Kokoro)
-curl -X POST http://localhost:3022/synthesize/kokoro \
- -H "Content-Type: application/json" \
- -d '{"text": "Hello world", "voice": "af_heart"}' \
- --output test_en.wav
+## API Endpoints
-# German (Piper) - use /synthesize/auto
-curl -X POST http://localhost:3022/synthesize/auto \
- -H "Content-Type: application/json" \
- -d '{"text": "Hallo Welt", "voice": "de_kerstin"}' \
- --output test_de.wav
+| Method | Path | Description |
+|--------|------|-------------|
+| GET | `/health` | Liveness + which backends are loaded |
+| GET | `/models` | Available TTS models |
+| GET | `/voices` | List all voices (preset + custom) |
+| POST | `/voices` | Register a custom voice (reference audio + transcript) |
+| DELETE | `/voices/{voice_id}` | Delete a custom voice |
+| POST | `/synthesize/kokoro` | Kokoro synthesis (English presets) |
+| POST | `/synthesize` | F5-TTS voice cloning |
+| POST | `/synthesize/auto` | Routing helper — picks the right backend for the requested voice |
+
+All non-health endpoints require `Authorization: Bearer <token>` (per-app key, internal key, or mana-auth JWT).
+
+## Voices
+
+### Kokoro-82M (English presets)
+~300 MB download. 30+ preset English voices. Fast, no reference audio needed.
+
+### Piper (German, local ONNX)
+~63 MB per voice. 100% local, GDPR-compliant. Available:
+- `de_kerstin` (female, default)
+- `de_thorsten` (male)
+
+Fallback to Edge TTS cloud voices if Piper isn't loaded.
+
+### F5-TTS (voice cloning)
+~6 GB. Requires reference audio + transcript. Higher quality, slower. Custom voices live in `voices/` (reference audio + transcript per voice ID).
+
+## Configuration (`.env` on the Windows GPU box)
+
+```env
+PORT=3022
+PRELOAD_MODELS=false
+MAX_TEXT_LENGTH=1000
+REQUIRE_AUTH=true
+API_KEYS=sk-app1:app1,sk-app2:app2
+INTERNAL_API_KEY=...
+CORS_ORIGINS=https://mana.how,https://chat.mana.how
```
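Assuming `auth.py` consumes the comma-separated `key:app` format shown in `API_KEYS` above (this is inferred from the example value, not confirmed against the source), the parsing logic amounts to roughly:

```python
def parse_api_keys(raw: str) -> dict[str, str]:
    """Parse an API_KEYS value like "sk-app1:app1,sk-app2:app2"
    into a {api_key: app_name} lookup table."""
    keys: dict[str, str] = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas / blank entries
        key, _, app = entry.partition(":")
        keys[key] = app
    return keys
```

For example, `parse_api_keys("sk-app1:app1,sk-app2:app2")` yields `{"sk-app1": "app1", "sk-app2": "app2"}`, which makes the per-app attribution in logs straightforward.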
-## File Structure
+## Code layout
```
services/mana-tts/
├── app/
│ ├── __init__.py
-│ ├── main.py # FastAPI endpoints
-│ ├── kokoro_service.py # Kokoro TTS (English preset voices)
-│ ├── piper_service.py # Piper TTS (German voices, local)
-│ ├── f5_service.py # F5-TTS (voice cloning)
-│ ├── voice_manager.py # Custom voice registry
-│ └── audio_utils.py # Audio format conversion
-├── piper_voices/ # Piper voice models (.onnx)
-├── voices/ # Custom F5 voice storage
-├── mlx_models/ # MLX model cache
-├── setup.sh # Setup script
-├── requirements.txt
-└── README.md
+│ ├── main.py # FastAPI endpoints
+│ ├── kokoro_service.py # Kokoro (English presets)
+│ ├── piper_service.py # Piper (German, local ONNX)
+│ ├── f5_service.py # F5-TTS (voice cloning, CUDA)
+│ ├── voice_manager.py # Custom voice registry
+│ ├── audio_utils.py # Format conversion, resampling
+│ ├── auth.py # API-key auth
+│ ├── external_auth.py # JWT validation via mana-auth
+│ └── vram_manager.py # Shared VRAM accountant
+└── service.pyw # Windows runner (used by ManaTTS scheduled task)
```
-## API Endpoints
+The Piper voice ONNX files live alongside the service on the GPU box (`C:\mana\services\mana-tts\piper_voices\*.onnx`) — too big to commit, downloaded once during setup.
-| Endpoint | Method | Purpose |
-|----------|--------|---------|
-| `/health` | GET | Health check |
-| `/models` | GET | Model info |
-| `/voices` | GET | List all voices |
-| `/voices` | POST | Register custom voice |
-| `/voices/{id}` | DELETE | Delete custom voice |
-| `/synthesize/kokoro` | POST | Kokoro synthesis |
-| `/synthesize` | POST | F5-TTS voice cloning |
-| `/synthesize/auto` | POST | Auto-select model |
+## Operations
-## Models
+```powershell
+# Status
+Get-ScheduledTask -TaskName "ManaTTS" | Format-List TaskName, State
+Get-NetTCPConnection -LocalPort 3022 -State Listen
-### Kokoro-82M (English)
-- ~300 MB download
-- 30+ preset English voices
-- Fast inference
-- No reference audio needed
+# Restart
+Stop-ScheduledTask -TaskName "ManaTTS"
+Start-ScheduledTask -TaskName "ManaTTS"
-### Piper TTS (German)
-- ~63 MB per voice model
-- 100% local, GDPR-compliant
-- Fast inference on CPU
-- Available voices:
- - `de_kerstin` - Female (default)
- - `de_thorsten` - Male
-- Fallback to Edge TTS (cloud) if Piper unavailable:
- - `de_katja` - Female (cloud)
- - `de_conrad` - Male (cloud)
- - `de_amala` - Female young (cloud)
- - `de_florian` - Male young (cloud)
+# Logs
+Get-Content C:\mana\services\mana-tts\service.log -Tail 50
+```
-### F5-TTS (Voice Cloning)
-- ~6 GB download
-- Voice cloning capability
-- Requires reference audio + transcript
-- Higher quality, slower
+## Reference
-## Environment Variables
-
-| Variable | Default | Description |
-|----------|---------|-------------|
-| `PORT` | `3022` | Service port |
-| `PRELOAD_MODELS` | `false` | Load on startup |
-| `MAX_TEXT_LENGTH` | `1000` | Max chars |
-| `CORS_ORIGINS` | (production URLs) | CORS config |
-
-## Key Dependencies
-
-- `fastapi` - Web framework
-- `f5-tts-mlx` - Voice cloning model
-- `mlx-audio` - Kokoro implementation
-- `mlx` - Apple Silicon ML framework
-- `piper-tts` - German TTS (local)
-- `edge-tts` - German TTS fallback (cloud)
-- `soundfile` - Audio I/O
-- `pydub` - MP3 conversion
-
-## Development Notes
-
-- Models load lazily on first request (unless `PRELOAD_MODELS=true`)
-- Custom voices stored in `voices/` with reference audio + transcript
-- Singleton pattern for model instances
-- Audio returned as raw bytes with headers for metadata
+- `docs/WINDOWS_GPU_SERVER_SETUP.md` — Windows box setup, scheduled tasks, firewall, Cloudflare tunnel
+- `docs/PORT_SCHEMA.md` — port assignments across services
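The `voices/` layout referenced under Voices (reference audio + transcript per voice ID) followed this on-disk structure in the pre-migration README. A sketch of writing one voice locally; the filenames (`reference.wav`, `transcript.txt`, `metadata.json`) come from that old layout and should be verified against `voice_manager.py`:

```python
import json
from pathlib import Path

def register_voice_locally(voices_dir: str, voice_id: str,
                           reference_audio: bytes, transcript: str,
                           name: str = "", description: str = "") -> Path:
    """Write a custom F5 voice into the voices/ directory layout."""
    voice_dir = Path(voices_dir) / voice_id
    voice_dir.mkdir(parents=True, exist_ok=True)
    # Reference audio the cloned voice is conditioned on (required).
    (voice_dir / "reference.wav").write_bytes(reference_audio)
    # Exact transcript of what the reference audio says (required).
    (voice_dir / "transcript.txt").write_text(transcript)
    # Display name and description (optional).
    (voice_dir / "metadata.json").write_text(
        json.dumps({"name": name, "description": description}))
    return voice_dir
```

In practice the `POST /voices` endpoint is the supported path; writing the directory by hand is only useful for seeding voices directly on the GPU box.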
diff --git a/services/mana-tts/README.md b/services/mana-tts/README.md
index 15f936ca6..fa99f7039 100644
--- a/services/mana-tts/README.md
+++ b/services/mana-tts/README.md
@@ -1,237 +1,36 @@
# Mana TTS
-Text-to-Speech microservice with voice cloning support, optimized for Apple Silicon.
+Text-to-Speech microservice running on the Windows GPU server (`mana-server-gpu`, RTX 3090). Wraps **Kokoro** (English presets), **Piper** (German, local ONNX), and **F5-TTS** (CUDA voice cloning).
-## Features
+For architecture, deployment, configuration, and operations see [`CLAUDE.md`](./CLAUDE.md) and [`docs/WINDOWS_GPU_SERVER_SETUP.md`](../../docs/WINDOWS_GPU_SERVER_SETUP.md).
-- **Kokoro TTS**: Fast preset voices (~300 MB model)
-- **F5-TTS**: Voice cloning with reference audio (~6 GB model)
-- **MLX Optimized**: Runs efficiently on Apple Silicon
-- **REST API**: FastAPI with OpenAPI documentation
+## Port: 3022
-## Quick Start
+## Public URL
-### Setup
-
-```bash
-# Run setup script
-./setup.sh
-
-# Or manually
-python3.11 -m venv .venv
-source .venv/bin/activate
-pip install -r requirements.txt
-```
-
-### Start Service
-
-```bash
-source .venv/bin/activate
-uvicorn app.main:app --host 0.0.0.0 --port 3022
-```
-
-### Test
-
-```bash
-# Health check
-curl http://localhost:3022/health
-
-# Synthesize with Kokoro
-curl -X POST http://localhost:3022/synthesize/kokoro \
- -H "Content-Type: application/json" \
- -d '{"text": "Hello world", "voice": "af_heart"}' \
- --output test.wav
-
-# Play audio (macOS)
-afplay test.wav
-```
+`https://gpu-tts.mana.how` (via Cloudflare Tunnel + Mac Mini gpu-proxy)
## API Endpoints
-### Health & Info
-
| Endpoint | Method | Description |
|----------|--------|-------------|
-| `/health` | GET | Health check |
-| `/models` | GET | Available models |
-| `/voices` | GET | All available voices |
-
-### Synthesis
-
-| Endpoint | Method | Description |
-|----------|--------|-------------|
-| `/synthesize/kokoro` | POST | Kokoro preset voices |
+| `/health` | GET | Health check + which backends are loaded |
+| `/models` | GET | List available models |
+| `/voices` | GET | List preset + custom voices |
+| `/voices` | POST | Register a custom voice (reference audio + transcript) |
+| `/voices/{id}` | DELETE | Delete a custom voice |
+| `/synthesize/kokoro` | POST | Kokoro (English presets) |
| `/synthesize` | POST | F5-TTS voice cloning |
-| `/synthesize/auto` | POST | Auto-select model |
+| `/synthesize/auto` | POST | Auto-select best backend for the requested voice |
-### Voice Management
+All non-health endpoints require `Authorization: Bearer <token>`.
-| Endpoint | Method | Description |
-|----------|--------|-------------|
-| `/voices` | POST | Register custom voice |
-| `/voices/{id}` | DELETE | Delete custom voice |
-
-## Synthesis Examples
-
-### Kokoro (Fast Preset Voices)
+## Quick Test
```bash
-curl -X POST http://localhost:3022/synthesize/kokoro \
+curl -X POST https://gpu-tts.mana.how/synthesize/kokoro \
+ -H "Authorization: Bearer $INTERNAL_API_KEY" \
-H "Content-Type: application/json" \
- -d '{
- "text": "Welcome to Mana TTS, your personal voice synthesis service.",
- "voice": "af_heart",
- "speed": 1.0,
- "output_format": "wav"
- }' \
- --output output.wav
+ -d '{"text":"Hello world","voice":"af_heart"}' \
+ --output test.wav
```
-
-### F5-TTS (Voice Cloning)
-
-```bash
-# With reference audio upload
-curl -X POST http://localhost:3022/synthesize \
- -F "text=Hello, this is a cloned voice speaking." \
- -F "reference_audio=@reference.wav" \
- -F "reference_text=This is what the reference audio says." \
- -F "output_format=wav" \
- --output cloned.wav
-
-# With registered voice
-curl -X POST http://localhost:3022/synthesize \
- -F "text=Hello from my registered voice." \
- -F "voice_id=my_custom_voice" \
- --output output.wav
-```
-
-### Auto-Select
-
-```bash
-# Uses Kokoro for preset voices, F5-TTS for custom
-curl -X POST http://localhost:3022/synthesize/auto \
- -H "Content-Type: application/json" \
- -d '{"text": "Auto-selected synthesis", "voice": "af_bella"}' \
- --output output.wav
-```
-
-## Available Kokoro Voices
-
-### American Female
-- `af_heart` - Warm, emotional (default)
-- `af_alloy` - Neutral, professional
-- `af_bella` - Friendly, approachable
-- `af_jessica` - Confident, clear
-- `af_nicole` - Bright, energetic
-- `af_nova` - Modern, dynamic
-- `af_sarah` - Warm, conversational
-- ... and more
-
-### American Male
-- `am_adam` - Deep, authoritative
-- `am_echo` - Resonant, clear
-- `am_eric` - Professional, neutral
-- `am_michael` - Warm, trustworthy
-- ... and more
-
-### British Female
-- `bf_alice` - Refined, elegant
-- `bf_emma` - Clear, professional
-- `bf_lily` - Soft, gentle
-
-### British Male
-- `bm_daniel` - Classic, authoritative
-- `bm_fable` - Storyteller, expressive
-- `bm_george` - Traditional, clear
-
-## Voice Registration
-
-Register a custom voice for F5-TTS voice cloning:
-
-```bash
-curl -X POST http://localhost:3022/voices \
- -F "voice_id=my_voice" \
- -F "name=My Custom Voice" \
- -F "description=A sample voice for testing" \
- -F "transcript=Hello, this is the text spoken in the reference audio." \
- -F "reference_audio=@my_reference.wav"
-```
-
-Pre-defined voices can also be placed in the `voices/` directory:
-
-```
-voices/
-└── my_voice/
- ├── reference.wav # Reference audio (required)
- ├── transcript.txt # Transcript of reference (required)
- └── metadata.json # Name and description (optional)
-```
-
-## Configuration
-
-| Variable | Default | Description |
-|----------|---------|-------------|
-| `PORT` | `3022` | API port |
-| `PRELOAD_MODELS` | `false` | Load models on startup |
-| `MAX_TEXT_LENGTH` | `1000` | Max characters per request |
-| `CORS_ORIGINS` | `https://mana.how,...` | Allowed CORS origins |
-| `F5_MODEL` | `lucasnewman/f5-tts-mlx` | F5-TTS model |
-| `KOKORO_MODEL` | `mlx-community/Kokoro-82M-bf16` | Kokoro model |
-
-## Mac Mini Deployment
-
-```bash
-# Install and start as launchd service
-../../scripts/mac-mini/setup-tts.sh
-
-# Service management
-launchctl list | grep com.mana.tts
-launchctl unload ~/Library/LaunchAgents/com.mana.tts.plist
-launchctl load ~/Library/LaunchAgents/com.mana.tts.plist
-
-# View logs
-tail -f /tmp/mana-tts.log
-```
-
-## Requirements
-
-- Python 3.10+
-- macOS with Apple Silicon (recommended)
-- ~7 GB disk space for models
-- 16 GB RAM recommended
-- ffmpeg (for MP3 output)
-
-## Troubleshooting
-
-### Models Not Loading
-
-```bash
-# Check MLX installation
-python -c "import mlx; print(mlx.__version__)"
-
-# Check mlx-audio
-python -c "import mlx_audio; print('OK')"
-
-# Check f5-tts-mlx
-python -c "from f5_tts_mlx import F5TTS; print('OK')"
-```
-
-### MP3 Output Not Working
-
-```bash
-# Install ffmpeg
-brew install ffmpeg
-
-# Verify
-ffmpeg -version
-```
-
-### Memory Issues
-
-- Reduce `MAX_TEXT_LENGTH` for less memory usage
-- Set `PRELOAD_MODELS=false` for lazy loading
-- F5-TTS requires ~6 GB, Kokoro ~500 MB
-
-## API Documentation
-
-When running, visit http://localhost:3022/docs for interactive API documentation.
diff --git a/services/mana-tts/com.mana.mana-tts.plist b/services/mana-tts/com.mana.mana-tts.plist
deleted file mode 100644
index 084e39afb..000000000
--- a/services/mana-tts/com.mana.mana-tts.plist
+++ /dev/null
@@ -1,39 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-<dict>
-    <key>Label</key>
-    <string>com.mana.mana-tts</string>
-
-    <key>ProgramArguments</key>
-    <array>
-        <string>/bin/bash</string>
-        <string>-c</string>
-        <string>cd /Users/mana/projects/mana-monorepo/services/mana-tts &amp;&amp; set -a &amp;&amp; source .env &amp;&amp; set +a &amp;&amp; .venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 3022</string>
-    </array>
-
-    <key>WorkingDirectory</key>
-    <string>/Users/mana/projects/mana-monorepo/services/mana-tts</string>
-
-    <key>EnvironmentVariables</key>
-    <dict>
-        <key>PATH</key>
-        <string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
-    </dict>
-
-    <key>RunAtLoad</key>
-    <true/>
-
-    <key>KeepAlive</key>
-    <true/>
-
-    <key>StandardOutPath</key>
-    <string>/Users/mana/logs/mana-tts.log</string>
-
-    <key>StandardErrorPath</key>
-    <string>/Users/mana/logs/mana-tts.error.log</string>
-
-    <key>ThrottleInterval</key>
-    <integer>10</integer>
-</dict>
-</plist>
diff --git a/services/mana-tts/install-service.sh b/services/mana-tts/install-service.sh
deleted file mode 100755
index 15a153af1..000000000
--- a/services/mana-tts/install-service.sh
+++ /dev/null
@@ -1,45 +0,0 @@
-#!/bin/bash
-# Install mana-tts as a launchd service on macOS
-# Run this script on the Mac Mini server
-
-set -e
-
-SERVICE_NAME="com.mana.mana-tts"
-PLIST_FILE="$SERVICE_NAME.plist"
-SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
-LAUNCH_AGENTS_DIR="$HOME/Library/LaunchAgents"
-LOG_DIR="$HOME/logs"
-
-echo "Installing mana-tts launchd service..."
-
-# Create logs directory
-mkdir -p "$LOG_DIR"
-
-# Stop existing service if running
-if launchctl list | grep -q "$SERVICE_NAME"; then
- echo "Stopping existing service..."
- launchctl unload "$LAUNCH_AGENTS_DIR/$PLIST_FILE" 2>/dev/null || true
-fi
-
-# Copy plist to LaunchAgents
-cp "$SCRIPT_DIR/$PLIST_FILE" "$LAUNCH_AGENTS_DIR/"
-
-# Load the service
-echo "Loading service..."
-launchctl load "$LAUNCH_AGENTS_DIR/$PLIST_FILE"
-
-# Check status
-sleep 2
-if launchctl list | grep -q "$SERVICE_NAME"; then
- echo "Service installed and running!"
- echo ""
- echo "Useful commands:"
- echo " View logs: tail -f $LOG_DIR/mana-tts.log"
- echo " View errors: tail -f $LOG_DIR/mana-tts.error.log"
- echo " Stop: launchctl unload $LAUNCH_AGENTS_DIR/$PLIST_FILE"
- echo " Start: launchctl load $LAUNCH_AGENTS_DIR/$PLIST_FILE"
- echo " Health check: curl http://localhost:3022/health"
-else
- echo "ERROR: Service failed to start. Check logs at $LOG_DIR/mana-tts.error.log"
- exit 1
-fi
diff --git a/services/mana-tts/setup.sh b/services/mana-tts/setup.sh
deleted file mode 100755
index 280bfa625..000000000
--- a/services/mana-tts/setup.sh
+++ /dev/null
@@ -1,150 +0,0 @@
-#!/bin/bash
-# Setup script for Mana TTS service
-# Optimized for Apple Silicon (MLX)
-
-set -e
-
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-VENV_DIR="$SCRIPT_DIR/.venv"
-PYTHON_VERSION="3.11"
-
-echo "=========================================="
-echo "Mana TTS Setup"
-echo "=========================================="
-echo ""
-
-# Check platform
-if [[ "$(uname)" != "Darwin" ]]; then
- echo "Warning: This service is optimized for macOS with Apple Silicon."
- echo "Some features may not work on other platforms."
- echo ""
-fi
-
-# Check for Apple Silicon
-if [[ "$(uname -m)" != "arm64" ]]; then
- echo "Warning: This service is optimized for Apple Silicon (arm64)."
- echo "Performance may be reduced on Intel Macs."
- echo ""
-fi
-
-# Find Python
-if command -v python3.11 &> /dev/null; then
- PYTHON_CMD="python3.11"
-elif command -v python3 &> /dev/null; then
- PYTHON_CMD="python3"
-else
- echo "Error: Python 3 not found. Please install Python 3.11 or later."
- exit 1
-fi
-
-echo "Using Python: $PYTHON_CMD"
-$PYTHON_CMD --version
-echo ""
-
-# Check Python version
-PYTHON_MAJOR=$($PYTHON_CMD -c "import sys; print(sys.version_info.major)")
-PYTHON_MINOR=$($PYTHON_CMD -c "import sys; print(sys.version_info.minor)")
-
-if [[ $PYTHON_MAJOR -lt 3 ]] || [[ $PYTHON_MINOR -lt 10 ]]; then
- echo "Error: Python 3.10 or later required. Found $PYTHON_MAJOR.$PYTHON_MINOR"
- exit 1
-fi
-
-# Create or recreate virtual environment
-if [[ -d "$VENV_DIR" ]]; then
- echo "Virtual environment exists at $VENV_DIR"
- read -p "Recreate it? (y/N) " -n 1 -r
- echo ""
- if [[ $REPLY =~ ^[Yy]$ ]]; then
- echo "Removing existing virtual environment..."
- rm -rf "$VENV_DIR"
- echo "Creating new virtual environment..."
- $PYTHON_CMD -m venv "$VENV_DIR"
- fi
-else
- echo "Creating virtual environment..."
- $PYTHON_CMD -m venv "$VENV_DIR"
-fi
-
-# Activate virtual environment
-echo "Activating virtual environment..."
-source "$VENV_DIR/bin/activate"
-
-# Upgrade pip
-echo ""
-echo "Upgrading pip..."
-pip install --upgrade pip
-
-# Install dependencies
-echo ""
-echo "Installing dependencies..."
-pip install -r "$SCRIPT_DIR/requirements.txt"
-
-# Install ffmpeg check (for MP3 support)
-echo ""
-echo "Checking for ffmpeg (required for MP3 output)..."
-if command -v ffmpeg &> /dev/null; then
- echo "ffmpeg found: $(which ffmpeg)"
-else
- echo "Warning: ffmpeg not found. MP3 output will not work."
- echo "Install with: brew install ffmpeg"
-fi
-
-# Verify installations
-echo ""
-echo "Verifying installations..."
-
-# Test FastAPI
-python -c "import fastapi; print(f'FastAPI {fastapi.__version__}')" || {
- echo "Error: FastAPI not installed correctly"
- exit 1
-}
-
-# Test soundfile
-python -c "import soundfile; print(f'soundfile {soundfile.__version__}')" || {
- echo "Error: soundfile not installed correctly"
- exit 1
-}
-
-# Test MLX (on Apple Silicon)
-if [[ "$(uname -m)" == "arm64" ]]; then
- python -c "import mlx; print(f'MLX {mlx.__version__}')" || {
- echo "Warning: MLX not installed correctly. TTS may not work."
- }
-fi
-
-# Test mlx-audio
-python -c "import mlx_audio; print('mlx-audio installed')" 2>/dev/null || {
- echo "Warning: mlx-audio not imported successfully."
- echo "You may need to install it manually or models won't load."
-}
-
-# Create directories
-echo ""
-echo "Creating required directories..."
-mkdir -p "$SCRIPT_DIR/voices"
-mkdir -p "$SCRIPT_DIR/mlx_models"
-
-echo ""
-echo "=========================================="
-echo "Setup Complete!"
-echo "=========================================="
-echo ""
-echo "To start the service:"
-echo ""
-echo " cd $SCRIPT_DIR"
-echo " source .venv/bin/activate"
-echo " uvicorn app.main:app --host 0.0.0.0 --port 3022"
-echo ""
-echo "Or for development with auto-reload:"
-echo ""
-echo " uvicorn app.main:app --host 0.0.0.0 --port 3022 --reload"
-echo ""
-echo "Test the service:"
-echo ""
-echo " curl http://localhost:3022/health"
-echo ""
-echo "For Mac Mini deployment, run:"
-echo ""
-echo " ./../../scripts/mac-mini/setup-tts.sh"
-echo ""