managarten/services/mana-voice-bot/CLAUDE.md

# mana-voice-bot

German voice-to-voice assistant. Wires together STT (mana-stt), an LLM (Ollama via mana-llm), and TTS (Edge TTS in the cloud today, possibly mana-tts later) into a single end-to-end audio pipeline.

> ⚠️ **Not deployed yet.** This service exists in the repo and runs
> locally for development, but it has no Scheduled Task on the Windows
> GPU server, no launchd plist, no Cloudflare Tunnel hostname, and no
> entry in the production startup scripts. When you're ready to deploy
> it, target the Windows GPU server alongside the other AI services
> (`C:\mana\services\mana-voice-bot\`, Scheduled Task `ManaVoiceBot`,
> `service.pyw` runner, public URL `gpu-voice.mana.how` via the existing
> Mac Mini cloudflared+gpu-proxy chain).

## Tech Stack

| Layer | Technology |
|-------|------------|
| **Runtime** | Python 3.11 + uvicorn |
| **Framework** | FastAPI |
| **STT** | Whisper via mana-stt |
| **LLM** | Ollama via mana-llm (Gemma/Qwen) |
| **TTS** | Edge TTS (Microsoft cloud) — could move to mana-tts later |

## Port: 3024

> The default was `3050` until 2026-04-08. That collided with `mana-sync`
> on the Mac Mini and was a latent footgun for any future deployment
> that put both on the same host. Moved to 3024 to fit in the AI/ML
> port range alongside mana-stt (3020), mana-tts (3022), mana-image-gen
> (3023), and mana-llm (3025).

## Quick Start (local dev)

```bash
cd services/mana-voice-bot
./setup.sh
./start.sh
# or directly:
uvicorn app.main:app --host 0.0.0.0 --port 3024 --reload
```

## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Service health check |
| GET | `/voices` | List German TTS voices |
| GET | `/models` | List available Ollama models |
| POST | `/transcribe` | Audio → text (STT only) |
| POST | `/chat` | Text → text (LLM only) |
| POST | `/chat/audio` | Text → audio (LLM + TTS) |
| POST | `/tts` | Text → audio (TTS only) |
| POST | `/voice` | Audio → audio (full pipeline) |
| POST | `/voice/metadata` | Audio → JSON (full pipeline, no audio response) |
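
A minimal client sketch for smoke-testing a locally running instance. Only the paths and the port come from the table and the port section; the `endpoint` and `get_json` helpers are hypothetical names, and the sketch assumes `/health` answers with a JSON body, which this document does not confirm.

```python
import json
import urllib.request

# Local dev address; the port matches the documented default.
BASE_URL = "http://localhost:3024"


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and a path without doubling or dropping slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(path: str, base_url: str = BASE_URL, timeout: float = 5.0):
    """GET an endpoint and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(endpoint(path, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `get_json("/health")` would be the quickest check. The POST endpoints are left out on purpose because their request bodies are not specified here.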

## Pipeline

```
Audio in → Whisper (STT) → Ollama (LLM) → Edge TTS → Audio out
              ↓                  ↓             ↓
         [German text]      [Response]    [MP3 audio]
```
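
The diagram can be read as three functions chained in sequence. The sketch below shows that shape with stub stages so it runs without any backend; `VoiceResult`, `run_pipeline`, and the `fake_*` functions are illustrative names, not the service's actual code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceResult:
    transcript: str  # German text produced by STT
    reply: str       # LLM response text
    audio: bytes     # synthesized audio (MP3 in the real pipeline)


def run_pipeline(
    audio_in: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> VoiceResult:
    """Run the stages strictly in order; each stage consumes the previous output."""
    transcript = stt(audio_in)
    reply = llm(transcript)
    audio = tts(reply)
    return VoiceResult(transcript=transcript, reply=reply, audio=audio)


# Stub stages standing in for mana-stt, Ollama, and Edge TTS.
def fake_stt(audio: bytes) -> str:
    return "Hallo"


def fake_llm(text: str) -> str:
    return f"Du hast gesagt: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


result = run_pipeline(b"\x00\x01", fake_stt, fake_llm, fake_tts)
print(result.transcript, "->", result.reply)
```

Passing the stages in as callables keeps each backend swappable, which would matter if TTS later moves from Edge TTS to mana-tts.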

## German Voices

| Voice ID | Description |
|----------|-------------|
| `de-DE-ConradNeural` | Male, professional (default) |
| `de-DE-KatjaNeural` | Female, natural |
| `de-DE-AmalaNeural` | Female, friendly |
| `de-DE-BerndNeural` | Male, calm |
| `de-DE-ChristophNeural` | Male, news |
| `de-DE-ElkeNeural` | Female, warm |
| `de-DE-KillianNeural` | Male, casual |
| `de-DE-KlarissaNeural` | Female, cheerful |
| `de-DE-KlausNeural` | Male, storyteller |
| `de-DE-LouisaNeural` | Female, assistant |
| `de-DE-TanjaNeural` | Female, business |

## Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | `3024` | Service port |
| `STT_URL` | `http://localhost:3020` | mana-stt URL |
| `OLLAMA_URL` | `http://localhost:11434` | Ollama URL |
| `DEFAULT_MODEL` | `gemma3:4b` | Default LLM model |
| `DEFAULT_VOICE` | `de-DE-ConradNeural` | Default TTS voice |
| `SYSTEM_PROMPT` | (German assistant) | LLM system prompt |
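
One way to read these variables with their documented defaults is sketched below. `Settings` and `load_settings` are hypothetical names, and `SYSTEM_PROMPT` is omitted because its default text is not given in this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    port: int
    stt_url: str
    ollama_url: str
    default_model: str
    default_voice: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build settings from an environment mapping, using the table's defaults."""
    env = os.environ if env is None else env
    return Settings(
        port=int(env.get("PORT", "3024")),
        stt_url=env.get("STT_URL", "http://localhost:3020"),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        default_model=env.get("DEFAULT_MODEL", "gemma3:4b"),
        default_voice=env.get("DEFAULT_VOICE", "de-DE-ConradNeural"),
    )


defaults = load_settings({})
override = load_settings({"PORT": "4000", "DEFAULT_VOICE": "de-DE-KatjaNeural"})
print(defaults.port, override.port, override.default_voice)
```

Accepting an explicit mapping instead of always reading `os.environ` makes the defaults easy to check in a test.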

## Performance budget

Typical latency on the GPU server:
- STT (Whisper): 0.5–2 s
- LLM (Gemma 4B): 1–5 s
- TTS (Edge): 0.3–0.5 s
- **Total**: roughly 2–7.5 s (the stages run one after another, so the ranges add up)
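
The total follows from adding the per-stage ranges, assuming the stages run strictly one after another with no streaming overlap. A quick check of that arithmetic (`STAGE_LATENCY_S` and `total_latency` are illustrative names):

```python
# Per-stage latency ranges in seconds, copied from the list above.
STAGE_LATENCY_S = {
    "stt": (0.5, 2.0),
    "llm": (1.0, 5.0),
    "tts": (0.3, 0.5),
}


def total_latency(stages):
    """Sum the best-case and worst-case latencies across sequential stages."""
    best = sum(low for low, _ in stages.values())
    worst = sum(high for _, high in stages.values())
    return best, worst


best, worst = total_latency(STAGE_LATENCY_S)
print(f"end-to-end: {best:.1f}-{worst:.1f} s")  # prints "end-to-end: 1.8-7.5 s"
```

The LLM stage dominates the worst case, so a smaller model or shorter responses would likely move the total more than any STT or TTS tuning.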

## When you actually deploy this

1. Copy the directory to `C:\mana\services\mana-voice-bot\` on `mana-server-gpu`
2. Create the venv (`C:\mana\venvs\voice-bot\`) and install requirements
3. Write a `service.pyw` runner mirroring the other AI services (loads `.env`, redirects stdout/stderr to `service.log`, calls `uvicorn.run(... port=3024)`)
4. Create the Windows Scheduled Task `ManaVoiceBot` (AtLogOn) pointing at `service.pyw`
5. Add the firewall rule (`New-NetFirewallRule -DisplayName "Mana-Voice-Bot" -Direction Inbound -LocalPort 3024 -Protocol TCP -Action Allow`)
6. Add the cloudflared ingress rule in `cloudflared-config.yml`: hostname `gpu-voice.mana.how`, service `http://192.168.178.11:3024`
7. Update `docs/WINDOWS_GPU_SERVER_SETUP.md` with the new task
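
Step 3 describes the runner only in outline. A hedged sketch of what such a `service.pyw` could look like follows; the other AI services' runners are not shown in this document, so `parse_env_lines`, `load_env_file`, and `main` are assumptions about their shape, and the `.env` parsing is deliberately simple (plain `KEY=VALUE` lines only).

```python
import os
import sys
from pathlib import Path


def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: Path) -> dict:
    """Load a .env file into os.environ without overriding existing variables."""
    if not path.exists():
        return {}
    values = parse_env_lines(path.read_text(encoding="utf-8").splitlines())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


def main(service_dir: Path) -> None:
    """Load .env, redirect output to service.log, then start uvicorn."""
    load_env_file(service_dir / ".env")

    # A .pyw script has no console, so send all output to a line-buffered log.
    log = open(service_dir / "service.log", "a", buffering=1, encoding="utf-8")
    sys.stdout = log
    sys.stderr = log

    # Make "app.main:app" importable regardless of the task's start directory.
    os.chdir(service_dir)
    sys.path.insert(0, str(service_dir))

    import uvicorn  # third-party; installed into the service venv in step 2

    uvicorn.run("app.main:app", host="0.0.0.0", port=int(os.environ.get("PORT", "3024")))

# A real service.pyw would finish by calling:
#   main(Path(__file__).resolve().parent)
```

The final call is left as a comment so the helpers can be imported and tested without starting the server.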