mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-14 20:21:09 +02:00

Till JS 8823cc0bf0 feat(profile): voice interview with pre-rendered TTS audio + Orpheus/Zonos backends	2026-04-17 15:22:52 +02:00

Voice-based interview for the profile module: users choose between text, voice (question read aloud + mic for answer), or conversation mode (fully automatic flow with auto-save).

Interview audio:
- 92 pre-rendered MP3 files (23 questions × 4 voices) via Edge TTS
- Voices: Seraphina (DE-f), Florian (DE-m), Leni (CH-f), Jan (CH-m)
- User picks voice via dropdown, persisted in localStorage
- Web Speech API fallback for missing audio files

Profile UI:
- Interview hero block on overview with 3 start modes (text/voice/conversation)
- Voice/conversation toggle + voice picker in interview view
- Mic button on text/textarea/tags inputs for per-question voice input
- Conversation mode: auto-save + auto-advance after STT transcription
- Recording/transcribing/speaking state indicators

mana-tts service:
- New Orpheus TTS backend (German finetune, SNAC codec)
- New Zonos TTS backend (Zyphra, 200k hours, emotion control)
- Endpoints: POST /synthesize/orpheus, POST /synthesize/zonos
- espeak-ng installed on GPU server for Zonos phonemizer
- Compare script for side-by-side voice quality testing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
app	feat(profile): voice interview with pre-rendered TTS audio + Orpheus/Zonos backends	2026-04-17 15:22:52 +02:00
scripts	feat(profile): voice interview with pre-rendered TTS audio + Orpheus/Zonos backends	2026-04-17 15:22:52 +02:00
voices	🌐 feat: add i18n support to 6 web apps	2026-01-29 14:48:35 +01:00
.env.example	chore: complete ManaCore → Mana rename (docs, go modules, plists, images)	2026-04-07 12:26:10 +02:00
CLAUDE.md	feat(profile): voice interview with pre-rendered TTS audio + Orpheus/Zonos backends	2026-04-17 15:22:52 +02:00
README.md	chore(mac-mini): remove all AI service infrastructure (moved to Windows GPU)	2026-04-08 13:06:40 +02:00
requirements.txt	feat(profile): voice interview with pre-rendered TTS audio + Orpheus/Zonos backends	2026-04-17 15:22:52 +02:00
service.pyw	chore(ai-services): adopt Windows GPU as source of truth for llm/stt/tts	2026-04-08 12:46:03 +02:00

README.md

Mana TTS

Text-to-Speech microservice running on the Windows GPU server (mana-server-gpu, RTX 3090). Wraps Kokoro (English presets), Piper (German, local ONNX), and F5-TTS (CUDA voice cloning).

For architecture, deployment, configuration, and operations see CLAUDE.md and docs/WINDOWS_GPU_SERVER_SETUP.md.

Port: 3022

Public URL

https://gpu-tts.mana.how (via Cloudflare Tunnel + Mac Mini gpu-proxy)

API Endpoints

Endpoint	Method	Description
`/health`	GET	Health check + which backends are loaded
`/models`	GET	List available models
`/voices`	GET	List preset + custom voices
`/voices`	POST	Register a custom voice (reference audio + transcript)
`/voices/{id}`	DELETE	Delete a custom voice
`/synthesize/kokoro`	POST	Kokoro (English presets)
`/synthesize`	POST	F5-TTS voice cloning
`/synthesize/auto`	POST	Auto-select best backend for the requested voice

All endpoints except `/health` require an `Authorization: Bearer <token>` header.
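The authentication rule above can be sketched in Python as a small request builder. This is a minimal sketch, not the service's own client: the helper name `build_request` is hypothetical, and it assumes the public URL listed in this README and a token supplied by the caller.

```python
import urllib.request
from typing import Optional

# Public URL taken from this README; swap in another host if needed.
BASE_URL = "https://gpu-tts.mana.how"


def build_request(path: str, token: Optional[str] = None,
                  method: str = "GET") -> urllib.request.Request:
    """Build a request for the TTS service (hypothetical helper).

    Every endpoint except /health gets an Authorization: Bearer header;
    a missing token for those endpoints is rejected before any network call.
    """
    headers = {}
    if path != "/health":
        if not token:
            raise ValueError(f"{path} requires a bearer token")
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(BASE_URL + path, headers=headers, method=method)
```

To send a request, pass the result to `urllib.request.urlopen`, for example `urllib.request.urlopen(build_request("/voices", token), timeout=10)`. The shape of the response bodies is not documented here, so inspect them before relying on any particular field.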

Quick Test

curl -X POST https://gpu-tts.mana.how/synthesize/kokoro \
  -H "Authorization: Bearer $INTERNAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello world","voice":"af_heart"}' \
  --output test.wav
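The same quick test can be sketched in Python with only the standard library. The function names `build_kokoro_request` and `synthesize_to_file` are hypothetical, and the sketch assumes, as the curl command's `--output test.wav` suggests, that the response body is the raw audio.

```python
import json
import urllib.request

# Public URL taken from this README.
BASE_URL = "https://gpu-tts.mana.how"


def build_kokoro_request(text: str, voice: str, token: str) -> urllib.request.Request:
    """Build the POST /synthesize/kokoro request mirroring the curl command."""
    body = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/synthesize/kokoro",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text: str, voice: str, token: str,
                       out_path: str = "test.wav") -> str:
    """Send the request and write the response body to out_path.

    Assumption: the body is audio data, as implied by `--output test.wav`.
    """
    req = build_kokoro_request(text, voice, token)
    with urllib.request.urlopen(req, timeout=60) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
    return out_path
```

Calling `synthesize_to_file("Hello world", "af_heart", os.environ["INTERNAL_API_KEY"])` (after `import os`) would correspond to the curl invocation above.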