mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-16 11:39:39 +02:00

Till JS f4347032ca chore(mac-mini): remove all AI service infrastructure (moved to Windows GPU)

The Mac Mini hasn't run mana-llm/stt/tts/image-gen for a while — those
services live on the Windows GPU server now. The Mac-targeted
installers, plists, and platform-checking setup scripts have been
sitting in the repo as cargo-cult, suggesting Mac Mini deployment is
still a real option. It isn't.

Removed (Mac-Mini deployment infrastructure):

services/mana-stt/
- com.mana.mana-stt.plist            (LaunchAgent)
- com.mana.vllm-voxtral.plist        (LaunchAgent for the abandoned local Voxtral experiment)
- install-service.sh                 (single-service launchd installer)
- install-services.sh                (mana-stt + vllm-voxtral installer)
- setup.sh                           (Mac arm64 installer)
- scripts/setup-vllm.sh              (vLLM-Voxtral setup)
- scripts/start-vllm-voxtral.sh

services/mana-tts/
- com.mana.mana-tts.plist
- install-service.sh
- setup.sh                           (Mac arm64 installer)

scripts/mac-mini/
- setup-image-gen.sh                 (Mac flux2.c launchd installer)
- setup-stt.sh
- setup-tts.sh
- launchd/com.mana.image-gen.plist
- launchd/com.mana.mana-stt.plist
- launchd/com.mana.mana-tts.plist

setup-tts-bot.sh stays — it's the Matrix TTS bot installer (Synapse
side), not the mana-tts service.

Updated:
- services/mana-stt/CLAUDE.md, README.md — fully rewritten for the
  Windows GPU reality (CUDA WhisperX, Scheduled Task ManaSTT, .env keys
  matching the actual production .env on the box)
- services/mana-tts/CLAUDE.md, README.md — same treatment, documenting
  Kokoro/Piper/F5-TTS on the Windows GPU under Scheduled Task ManaTTS
- scripts/mac-mini/README.md — dropped the STT setup section, replaced
  with a pointer to docs/WINDOWS_GPU_SERVER_SETUP.md and the per-service
  CLAUDE.md files
- docs/MAC_MINI_SERVER.md — expanded the "deactivated launchagents"
  list to mention the now-removed plists, added the full GPU service
  port table with public URLs, added a cleanup snippet for any old plists
  still installed on a Mac Mini somewhere

2026-04-08 13:06:40 +02:00

4.2 KiB

mana-tts

Text-to-Speech microservice. Wraps Kokoro (English presets), Piper (German, local ONNX), and F5-TTS (voice cloning) behind a small FastAPI surface. Lives on the Windows GPU server (mana-server-gpu, RTX 3090).

⚠️ Earlier history: this directory used to contain MLX-optimized Mac-Mini code (f5-tts-mlx, mlx-audio, `setup.sh` with Apple Silicon checks, `com.mana.mana-tts.plist` launchd setup). The service now runs on the Windows GPU box, and the Mac-specific code was removed from the repo. If you need the MLX path, see the git history.

Tech Stack

| Layer | Technology |
|---|---|
| Runtime | Python 3.11 + uvicorn (Windows) |
| Framework | FastAPI |
| English (preset) | Kokoro-82M (`kokoro_service.py`) |
| German (local) | Piper ONNX with `kerstin_low.onnx` and `thorsten_medium.onnx` voices (`piper_service.py`) |
| Voice cloning | F5-TTS on CUDA (`f5_service.py`) |
| Audio I/O | `soundfile`, `pydub` |
| Auth | Per-key + internal-key API auth (`auth.py`) + JWT via mana-auth (`external_auth.py`) |
| VRAM | Shared `vram_manager.py` (same module as mana-stt + mana-image-gen) |
| Process supervision | Windows Scheduled Task `ManaTTS` (AtLogOn) |

Port: 3022

Where it runs

| Host | Path on disk | Entrypoint |
|---|---|---|
| Windows GPU server (`192.168.178.11`) | `C:\mana\services\mana-tts\` | `service.pyw` via Scheduled Task `ManaTTS` |

Public URL: https://gpu-tts.mana.how

API Endpoints

| Method | Path | Description |
|---|---|---|
| GET | `/health` | Liveness + which backends are loaded |
| GET | `/models` | Available TTS models |
| GET | `/voices` | List all voices (preset + custom) |
| POST | `/voices` | Register a custom voice (reference audio + transcript) |
| DELETE | `/voices/{voice_id}` | Delete a custom voice |
| POST | `/synthesize/kokoro` | Kokoro synthesis (English presets) |
| POST | `/synthesize` | F5-TTS voice cloning |
| POST | `/synthesize/auto` | Routing helper: picks the right backend for the requested voice |

All non-health endpoints require `Authorization: Bearer <token>` (per-app key, internal key, or mana-auth JWT).
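To make the auth requirement concrete, here is a minimal client-side sketch that builds (but does not send) an authenticated request to `/synthesize/kokoro`. The endpoint path, public URL, and header format come from this README; the JSON field names `text` and `voice`, the helper name `build_synthesis_request`, and the voice id `example_voice` are assumptions for illustration only. The real request schema lives in `app/main.py`.

```python
import json
import urllib.request

BASE_URL = "https://gpu-tts.mana.how"  # public URL from this README


def build_synthesis_request(token: str, text: str, voice: str) -> urllib.request.Request:
    """Build an authenticated POST to /synthesize/kokoro without sending it."""
    # ASSUMPTION: the body fields "text" and "voice" are illustrative;
    # check app/main.py for the actual request model.
    body = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/synthesize/kokoro",
        data=body,
        method="POST",
        headers={
            # Per-app key, internal key, or mana-auth JWT all use this header.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_synthesis_request("sk-app1", "Hello from mana-tts", "example_voice")
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the sketch stops short of that so it can be inspected without touching the production box.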

Voices

Kokoro-82M (English presets)

~300 MB download. 30+ preset English voices. Fast, no reference audio needed.

Piper (German, local ONNX)

~63 MB per voice. 100% local, GDPR-compliant. Available:

- `de_kerstin` (female, default)
- `de_thorsten` (male)

If Piper isn't loaded, the service falls back to Edge TTS cloud voices, so the 100% local guarantee holds only while Piper is loaded.

F5-TTS (voice cloning)

~6 GB. Requires reference audio + transcript. Higher quality but slower than the preset backends. Custom voices live in `voices/` (reference audio + transcript per voice ID).
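The three backends above suggest how a routing helper such as `/synthesize/auto` might choose between them. The following is a hypothetical sketch, not the service's actual logic: the function name `pick_backend`, the `Backend` enum, and the precedence order (custom voice first, then Piper, then Kokoro) are all assumptions. Only the two Piper voice ids and the Edge TTS fallback are taken from this README.

```python
from enum import Enum


class Backend(Enum):
    KOKORO = "kokoro"
    PIPER = "piper"
    F5 = "f5-tts"
    EDGE_CLOUD = "edge-tts"


# Piper voice ids listed in this README.
PIPER_VOICES = {"de_kerstin", "de_thorsten"}


def pick_backend(voice_id: str, custom_voices: set[str], piper_loaded: bool = True) -> Backend:
    """Pick a backend for a voice id (hypothetical precedence, see lead-in)."""
    # Registered custom voices need F5-TTS, since they are cloned voices.
    if voice_id in custom_voices:
        return Backend.F5
    # German voices use local Piper, or the cloud fallback if Piper is not loaded.
    if voice_id in PIPER_VOICES:
        return Backend.PIPER if piper_loaded else Backend.EDGE_CLOUD
    # Everything else is treated as a Kokoro English preset.
    return Backend.KOKORO


if __name__ == "__main__":
    print(pick_backend("de_kerstin", set()))
    print(pick_backend("my_clone", {"my_clone"}))
```

A real router would also need to reject unknown voice ids rather than defaulting to Kokoro; the default here only keeps the sketch short.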

Configuration (`.env` on the Windows GPU box)

PORT=3022
PRELOAD_MODELS=false
MAX_TEXT_LENGTH=1000
REQUIRE_AUTH=true
API_KEYS=sk-app1:app1,sk-app2:app2
INTERNAL_API_KEY=...
CORS_ORIGINS=https://mana.how,https://chat.mana.how
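As a rough illustration of how these values might be consumed, the sketch below parses `API_KEYS`, `REQUIRE_AUTH`, and `MAX_TEXT_LENGTH` from strings. It assumes that each `API_KEYS` entry has the form `key:app-name`, which the example value suggests but this README does not state. The helper names are hypothetical and are not taken from `auth.py`.

```python
def parse_api_keys(raw: str) -> dict[str, str]:
    """Parse 'sk-app1:app1,sk-app2:app2' into {key: app_name}.

    ASSUMPTION: entries are 'key:app-name' pairs separated by commas.
    """
    keys: dict[str, str] = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas
        key, sep, app = entry.partition(":")
        if not sep or not key or not app:
            raise ValueError(f"malformed API_KEYS entry: {entry!r}")
        keys[key] = app
    return keys


def parse_bool(raw: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def text_within_limit(text: str, max_text_length: int) -> bool:
    """Return True if the text fits the configured MAX_TEXT_LENGTH."""
    return len(text) <= max_text_length


if __name__ == "__main__":
    print(parse_api_keys("sk-app1:app1,sk-app2:app2"))
    print(parse_bool("false"), text_within_limit("hello", 1000))
```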

Code layout

services/mana-tts/
├── app/
│   ├── __init__.py
│   ├── main.py             # FastAPI endpoints
│   ├── kokoro_service.py   # Kokoro (English presets)
│   ├── piper_service.py    # Piper (German, local ONNX)
│   ├── f5_service.py       # F5-TTS (voice cloning, CUDA)
│   ├── voice_manager.py    # Custom voice registry
│   ├── audio_utils.py      # Format conversion, resampling
│   ├── auth.py             # API-key auth
│   ├── external_auth.py    # JWT validation via mana-auth
│   └── vram_manager.py     # Shared VRAM accountant
└── service.pyw             # Windows runner (used by ManaTTS scheduled task)

The Piper voice ONNX files live alongside the service on the GPU box (`C:\mana\services\mana-tts\piper_voices\*.onnx`). They are too big to commit and are downloaded once during setup.

Operations

# Status
Get-ScheduledTask -TaskName "ManaTTS" | Format-List TaskName, State
Get-NetTCPConnection -LocalPort 3022 -State Listen

# Restart
Stop-ScheduledTask -TaskName "ManaTTS"
Start-ScheduledTask -TaskName "ManaTTS"

# Logs
Get-Content C:\mana\services\mana-tts\service.log -Tail 50
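The port check above has a rough counterpart that can be run from another machine on the LAN. This is a minimal sketch using only the Python standard library: the host and port come from this README, while the helper name `is_listening` is hypothetical. It only confirms that something accepts TCP connections on the port, not that the service is healthy; `/health` is the endpoint for that.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host and port as documented in this README.
    print(is_listening("192.168.178.11", 3022))
```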

Reference

- `docs/WINDOWS_GPU_SERVER_SETUP.md`: Windows box setup, scheduled tasks, firewall, Cloudflare tunnel
- `docs/PORT_SCHEMA.md`: port assignments across services
