mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-19 01:21:24 +02:00
- Add Piper TTS section to mana-tts CLAUDE.md - Document available German voices (local and cloud) - Update matrix-tts-bot CLAUDE.md with new default voice - Add language auto-detection documentation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
194 lines
4.4 KiB
Markdown
194 lines
4.4 KiB
Markdown
# Matrix TTS Bot - Claude Code Guidelines
|
|
|
|
## Overview
|
|
|
|
Matrix TTS Bot converts text messages to speech and sends them back as audio messages. Uses the mana-tts service (port 3022) for synthesis.
|
|
|
|
## Tech Stack
|
|
|
|
- **Framework**: NestJS 10
|
|
- **Matrix**: matrix-bot-sdk
|
|
- **TTS Backend**: mana-tts service (Piper for German, Kokoro for English)
|
|
|
|
## Commands
|
|
|
|
```bash
|
|
# Development
|
|
pnpm install
|
|
pnpm start:dev # Start with hot reload
|
|
|
|
# Build
|
|
pnpm build # Production build
|
|
|
|
# Type check
|
|
pnpm type-check # Check TypeScript types
|
|
```
|
|
|
|
## Project Structure
|
|
|
|
```
|
|
services/matrix-tts-bot/
|
|
├── src/
|
|
│ ├── main.ts # Application entry point (port 3023)
|
|
│ ├── app.module.ts # Root module
|
|
│ ├── health.controller.ts # Health check endpoint
|
|
│ ├── config/
|
|
│ │ └── configuration.ts # Configuration & help text
|
|
│ ├── bot/
|
|
│ │ ├── bot.module.ts
|
|
│ │ └── matrix.service.ts # Matrix client & message handler
|
|
│ └── tts/
|
|
│ ├── tts.module.ts
|
|
│ └── tts.service.ts # mana-tts API client
|
|
├── Dockerfile
|
|
└── package.json
|
|
```
|
|
|
|
## Bot Commands
|
|
|
|
| Command | Description |
|
|
|---------|-------------|
|
|
| `!help` / `!hilfe` | Show help text |
|
|
| `!voice [name]` | Change voice (e.g., `!voice bm_daniel`) |
|
|
| `!voices` | List available voices |
|
|
| `!speed [0.5-2.0]` | Change speech speed |
|
|
| `!status` | Show current settings |
|
|
| (any text) | Convert to speech |
|
|
|
|
## Message Flow
|
|
|
|
1. User sends text message
|
|
2. Bot receives via matrix-bot-sdk
|
|
3. TTS service synthesizes audio
|
|
4. Audio uploaded to Matrix
|
|
5. Audio message sent back to room
|
|
|
|
## Environment Variables
|
|
|
|
```env
|
|
# Server
|
|
PORT=3023
|
|
|
|
# Matrix
|
|
MATRIX_HOMESERVER_URL=http://localhost:8008
|
|
MATRIX_ACCESS_TOKEN=syt_xxx
|
|
MATRIX_ALLOWED_ROOMS=!roomid:matrix.mana.how
|
|
MATRIX_STORAGE_PATH=./data/bot-storage.json
|
|
|
|
# TTS Service
|
|
TTS_URL=http://localhost:3022
|
|
TTS_API_KEY=sk-internal-xxx
|
|
|
|
# Defaults
|
|
DEFAULT_VOICE=de_kerstin
|
|
DEFAULT_SPEED=1.0
|
|
MAX_TEXT_LENGTH=500
|
|
```
|
|
|
|
## TTS API Integration
|
|
|
|
The bot auto-detects language and uses the appropriate endpoint:
|
|
|
|
```typescript
|
|
// German voices use /synthesize/auto (routes to Piper)
|
|
POST /synthesize/auto
|
|
{
|
|
"text": "Hallo Welt",
|
|
"voice": "de_kerstin",
|
|
"speed": 1.0,
|
|
"output_format": "wav"
|
|
}
|
|
|
|
// English voices use /synthesize/kokoro
|
|
POST /synthesize/kokoro
|
|
{
|
|
"text": "Hello world",
|
|
"voice": "af_heart",
|
|
"speed": 1.0,
|
|
"output_format": "wav"
|
|
}
|
|
|
|
// Response: audio/wav binary
|
|
```
|
|
|
|
**Language Detection**: The bot automatically detects German text (via German characters like äöüß or common German words) and switches to a German voice if needed.
|
|
|
|
## Available Voices
|
|
|
|
### German (Local - Piper)
|
|
|
|
| Voice ID | Description |
|
|
|----------|-------------|
|
|
| `de_kerstin` | German female (default) |
|
|
| `de_thorsten` | German male |
|
|
|
|
### German (Cloud - Edge TTS)
|
|
|
|
| Voice ID | Description |
|
|
|----------|-------------|
|
|
| `de_katja` | German female |
|
|
| `de_conrad` | German male |
|
|
| `de_amala` | German female (young) |
|
|
| `de_florian` | German male (young) |
|
|
|
|
### English (Kokoro)
|
|
|
|
| Voice ID | Description |
|
|
|----------|-------------|
|
|
| `af_heart` | American female (warm) |
|
|
| `af_bella` | American female (expressive) |
|
|
| `am_michael` | American male (trustworthy) |
|
|
| `bm_daniel` | British male (classic) |
|
|
| `bf_emma` | British female (professional) |
|
|
|
|
## Docker
|
|
|
|
```bash
|
|
# Build
|
|
docker build -f services/matrix-tts-bot/Dockerfile -t matrix-tts-bot services/matrix-tts-bot
|
|
|
|
# Run
|
|
docker run -p 3023:3023 \
|
|
-e MATRIX_HOMESERVER_URL=http://synapse:8008 \
|
|
-e MATRIX_ACCESS_TOKEN=syt_xxx \
|
|
-e TTS_URL=http://mana-tts:3022 \
|
|
-v matrix-tts-bot-data:/app/data \
|
|
matrix-tts-bot
|
|
```
|
|
|
|
## Health Check
|
|
|
|
```bash
|
|
curl http://localhost:3023/health
|
|
```
|
|
|
|
## Dependencies
|
|
|
|
- **mana-tts**: Must be running on port 3022 (or configured via `TTS_URL`)
|
|
- **Matrix homeserver**: Synapse or compatible homeserver
|
|
|
|
## User Settings
|
|
|
|
Settings are stored in-memory per Matrix user ID:
|
|
- Voice selection persists during bot runtime
|
|
- Speed setting persists during bot runtime
|
|
- Settings reset when bot restarts
|
|
|
|
## Testing
|
|
|
|
```bash
|
|
# 1. Ensure mana-tts is running
|
|
curl http://localhost:3022/health
|
|
|
|
# 2. Start the bot
|
|
cd services/matrix-tts-bot
|
|
pnpm start:dev
|
|
|
|
# 3. Check bot health
|
|
curl http://localhost:3023/health
|
|
|
|
# 4. In Matrix:
|
|
# - Invite bot to a room
|
|
# - Send a text message
|
|
# - Receive audio response
|
|
```
|