managarten/services/mana-tts/CLAUDE.md
Till-JS 5a0815708c 🌐 feat: add i18n support to 6 web apps
Add internationalization (DE + EN) to previously missing apps:
- todo: task management translations
- skilltree: skill/XP system translations
- nutriphi: nutrition tracking translations
- planta: plant care translations
- questions: research app translations
- matrix: chat client translations (layout integration)

Each app includes:
- svelte-i18n setup with SSR support
- localStorage persistence ({app}_locale pattern)
- i18n loading state in +layout.svelte
- German (default) and English translations

Updated CONSISTENCY_REPORT.md to mark i18n task as complete.

Also includes:
- mana-tts service placeholder files
2026-01-29 14:48:35 +01:00

2.6 KiB

CLAUDE.md - Mana TTS Service

Service Overview

Text-to-Speech microservice using MLX-optimized models for Apple Silicon:

  • Port: 3022
  • Framework: Python + FastAPI
  • Models: Kokoro-82M (fast), F5-TTS (voice cloning)

Commands

# Setup
./setup.sh

# Development
source .venv/bin/activate
uvicorn app.main:app --host 0.0.0.0 --port 3022 --reload

# Production (Mac Mini)
../../scripts/mac-mini/setup-tts.sh

# Test
curl http://localhost:3022/health
curl -X POST http://localhost:3022/synthesize/kokoro \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world", "voice": "af_heart"}' \
  --output test.wav

File Structure

services/mana-tts/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI endpoints
│   ├── kokoro_service.py    # Kokoro TTS (preset voices)
│   ├── f5_service.py        # F5-TTS (voice cloning)
│   ├── voice_manager.py     # Custom voice registry
│   └── audio_utils.py       # Audio format conversion
├── voices/                  # Custom voice storage
├── mlx_models/             # Model cache
├── setup.sh                # Setup script
├── requirements.txt
└── README.md

API Endpoints

Endpoint Method Purpose
/health GET Health check
/models GET Model info
/voices GET List all voices
/voices POST Register custom voice
/voices/{id} DELETE Delete custom voice
/synthesize/kokoro POST Kokoro synthesis
/synthesize POST F5-TTS voice cloning
/synthesize/auto POST Auto-select model

Models

Kokoro-82M

  • ~300 MB download
  • 30+ preset voices
  • Fast inference
  • No reference audio needed

F5-TTS

  • ~6 GB download
  • Voice cloning capability
  • Requires reference audio + transcript
  • Higher quality, slower

Environment Variables

Variable Default Description
PORT 3022 Service port
PRELOAD_MODELS false Load on startup
MAX_TEXT_LENGTH 1000 Max chars
CORS_ORIGINS (production URLs) CORS config

Key Dependencies

  • fastapi - Web framework
  • f5-tts-mlx - Voice cloning model
  • mlx-audio - Kokoro implementation
  • mlx - Apple Silicon ML framework
  • soundfile - Audio I/O
  • pydub - MP3 conversion

Development Notes

  • Models load lazily on first request (unless PRELOAD_MODELS=true)
  • Custom voices stored in voices/ with reference audio + transcript
  • Singleton pattern for model instances
  • Audio returned as raw bytes with headers for metadata