mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-23 16:26:42 +02:00

Till JS b0a08ce239 docs(services): add CLAUDE.md for stt + events, fix stale entries, flag port collisions

New service docs:
- services/mana-stt/CLAUDE.md — FastAPI surface with Whisper MLX (local),
  WhisperX (rich), and Voxtral (local + Mistral API). Documents the lazy
  backend loading and the launchd plist setup on the Mac Mini.
- services/mana-events/CLAUDE.md — Hono/Bun service for public RSVP and
  event-sharing. Documents the host (JWT) vs public (token) split, the
  rate-limit sweeper, and the createApp factory pattern that lets unit
  tests run without bootstrapping the production sweeper.

Stale entries fixed:
- mana-auth: dropped "rewritten from NestJS / drop-in replacement" — the
  rewrite is the only mana-auth there is now. Email channel updated from
  Brevo SMTP to self-hosted Stalwart (see docs/MAIL_SERVER.md).
- mana-notify: same Brevo → Stalwart fix in the channel table and env
  var defaults.

PORT_SCHEMA.md flagged as aspirational:
- The doc was dated 2026-03-28 and presented as "single source of truth",
  but cross-checking against actual service source files (config.go,
  main.py, start.sh) shows nothing matches. Added a prominent warning at
  the top with the real ports + two confirmed collisions:
  * mana-image-gen and mana-video-gen both default to PORT 3026
  * mana-voice-bot and mana-sync both default to PORT 3050
  Today these are masked because image-gen + voice-bot live on the
  Windows GPU server while video-gen + sync live on the Mac Mini, but
  the moment they share a host they collide. Either execute the planned
  reorg or pick non-colliding ports and rewrite the doc to match
  reality — flagged as a real follow-up.

2026-04-08 12:23:48 +02:00


mana-stt

Speech-to-Text service for the Mana ecosystem. Runs on the Mac Mini M4 (Apple Silicon) and exposes a small FastAPI surface that wraps multiple Whisper backends plus Mistral's hosted Voxtral API.

Tech Stack

| Layer | Technology |
|-------|------------|
| Runtime | Python 3.11 + uvicorn |
| Framework | FastAPI |
| Local model | Whisper Large V3 via `lightning-whisper-mlx` (Apple MLX) |
| Local model (rich) | WhisperX for word-level timestamps + diarization |
| Cloud model | Mistral Voxtral Mini API |
| Optional | vLLM Voxtral (GPU); see `vllm_service.py` |
| Auth | JWT validation via mana-auth (`external_auth.py`) + API key fallback (`auth.py`) |
| Process supervision | launchd via `com.mana.mana-stt.plist` |
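
The Auth row describes JWT validation with an API key fallback. A minimal sketch of that ordering follows. It assumes a hypothetical `authenticate` helper and a hypothetical `verify_jwt` callable standing in for whatever `external_auth.py` actually does; the real header handling in `auth.py` may differ.

```python
import hmac
from typing import Callable, Mapping, Optional


def authenticate(
    headers: Mapping[str, str],
    verify_jwt: Callable[[str], Optional[str]],
    api_key: Optional[str],
) -> Optional[str]:
    """Return a caller identity, or None if no credential is accepted.

    Hypothetical helper: tries the bearer JWT first, then the API key.
    """
    # Header names are case-insensitive, so normalise them once.
    lowered = {name.lower(): value for name, value in headers.items()}

    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token:
        subject = verify_jwt(token.strip())
        if subject is not None:
            return subject

    # Fallback path: constant-time comparison against the configured key.
    presented = lowered.get("x-api-key")
    if api_key and presented and hmac.compare_digest(
        presented.encode(), api_key.encode()
    ):
        return "api-key"
    return None
```

Passing the verifier in as a callable keeps the sketch independent of any particular JWT library; an invalid JWT does not short-circuit, so a request carrying both credentials can still succeed through the API key.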

Port: 3020

Quick Start

cd services/mana-stt
./setup.sh                                          # Create venv + install
.venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 3020

Production runs via launchd on the Mac Mini: use `install-service.sh` (single service) or `install-services.sh` (mana-stt + vllm-voxtral together).

API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Liveness + which backends are loaded |
| GET | `/models` | List available STT models |
| POST | `/transcribe` | Whisper MLX (default, fastest local) |
| POST | `/transcribe/whisperx` | WhisperX with word-level timestamps + diarization |
| POST | `/transcribe/voxtral` | Local Voxtral (vLLM) |
| POST | `/transcribe/voxtral/api` | Mistral Voxtral API (cloud) |
| POST | `/transcribe/auto` | Tries WhisperX first, falls back to Whisper MLX |
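
The `/transcribe/auto` row describes a try-then-fall-back order. The pattern can be sketched as below, assuming each backend is a plain callable; `transcribe_auto` and the `Backend` type are hypothetical names for illustration, not the service's actual code.

```python
from typing import Callable, Optional, Sequence, Tuple

# A backend takes raw audio bytes plus an optional language code and returns text.
Backend = Callable[[bytes, Optional[str]], str]


def transcribe_auto(
    audio: bytes,
    language: Optional[str],
    backends: Sequence[Tuple[str, Backend]],
) -> Tuple[str, str]:
    """Try each (name, backend) pair in order.

    Returns (name, text) from the first backend that succeeds and raises
    RuntimeError if every backend fails.
    """
    errors = []
    for name, backend in backends:
        try:
            return name, backend(audio, language)
        except Exception as exc:  # a real service would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))
```

Returning the backend name alongside the text lets a caller see which path produced the transcript, which matters here because WhisperX output is richer than plain Whisper MLX output.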

All `/transcribe*` endpoints accept a multipart file upload plus an optional `language` form field. Authenticate with either an `Authorization: Bearer <jwt>` header or an `X-API-Key` header.
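
As a rough illustration of the request shape, the sketch below builds (but does not send) a multipart POST using only the standard library. The upload field name `file` and the helper name `build_transcribe_request` are assumptions; check the route definitions for the real field name.

```python
import urllib.request
import uuid
from typing import Optional


def build_transcribe_request(
    base_url: str,
    audio: bytes,
    filename: str,
    token: str,
    language: Optional[str] = None,
    path: str = "/transcribe",
) -> urllib.request.Request:
    """Build a multipart/form-data POST for one of the /transcribe* endpoints."""
    boundary = uuid.uuid4().hex
    parts = []
    if language is not None:
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="language"\r\n\r\n'
                f"{language}\r\n"
            ).encode()
        )
    # "file" as the upload field name is an assumption.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + audio
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The request can then be sent with `urllib.request.urlopen(req)`. To use the API key path instead, swap the `Authorization` header for `X-API-Key`.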

Backends (`app/`)

| File | What it loads |
|------|---------------|
| `whisper_service.py` | Whisper Large V3 via MLX (local, default) |
| `whisper_service_cuda.py` | CUDA Whisper (only used on Windows GPU server) |
| `whisperx_service.py` | WhisperX with diarization (local, slower, richer output) |
| `voxtral_service.py` | Local Voxtral via vLLM (optional, needs the second launchd job) |
| `voxtral_api_service.py` | Mistral hosted Voxtral API (cloud) |
| `vllm_service.py` | vLLM client primitives shared with Voxtral |
| `auth.py` | API key auth (fallback path) |
| `external_auth.py` | JWT auth via mana-auth public key |

Backends are loaded lazily during the FastAPI lifespan, and `/health` reports which ones loaded. Missing dependencies (e.g. CUDA on the Mac) are tolerated: the service starts without the affected backends.
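
One way to get this tolerant loading behaviour is to import each backend module inside a `try` block and record the outcome. The sketch below uses only the standard library so it runs anywhere; in the service this logic would presumably sit inside the FastAPI lifespan handler. The function names and the `/health` field names are assumptions.

```python
import importlib
from types import ModuleType
from typing import Dict, Mapping, Optional


def load_backends(module_names: Mapping[str, str]) -> Dict[str, Optional[ModuleType]]:
    """Import each backend module, mapping its label to the module or None."""
    loaded: Dict[str, Optional[ModuleType]] = {}
    for label, module_name in module_names.items():
        try:
            loaded[label] = importlib.import_module(module_name)
        except ImportError:
            # Missing optional dependency: record it and keep starting up.
            loaded[label] = None
    return loaded


def health_payload(loaded: Mapping[str, Optional[ModuleType]]) -> dict:
    """Shape of a /health-style response; the field names are assumptions."""
    return {
        "status": "ok",
        "backends": {label: module is not None for label, module in loaded.items()},
    }
```

Catching only `ImportError` keeps genuine bugs inside a backend module visible instead of silently marking the backend as unavailable.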

Configuration

Reads from `services/mana-stt/.env`, which the launchd plist loads with `set -a; source .env; set +a`. Relevant variables:

PORT=3020
MANA_AUTH_URL=http://localhost:3001     # JWKS source for JWT verification
MISTRAL_API_KEY=...                     # only needed for /transcribe/voxtral/api
STT_API_KEY=...                         # legacy API key fallback
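
A minimal sketch of reading these variables in Python follows. The `Settings` dataclass and `load_settings` function are hypothetical, and treating the values shown above as defaults is an assumption rather than documented behaviour.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    port: int
    mana_auth_url: str
    mistral_api_key: Optional[str]
    stt_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables; empty strings count as unset for the keys."""
    return Settings(
        # Defaults mirror the example values above (an assumption).
        port=int(env.get("PORT", "3020")),
        mana_auth_url=env.get("MANA_AUTH_URL", "http://localhost:3001"),
        mistral_api_key=env.get("MISTRAL_API_KEY") or None,
        stt_api_key=env.get("STT_API_KEY") or None,
    )
```

Accepting the environment as a parameter makes the loader easy to exercise in tests without touching the real process environment.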

Operations

- Logs: launchd writes to `~/Library/Logs/mana-stt.{out,err}.log` (see plist)
- Metrics: Prometheus endpoint at `/metrics` if enabled in config; Grafana dashboard JSON checked in at `grafana-dashboard.json`
- Restart: `launchctl kickstart -k gui/$(id -u)/com.mana.mana-stt`

Reference

- `services/mana-stt/README.md`: user-facing setup, model download instructions, language coverage
- `docs/LOCAL_STT_MODELS.md`: WER comparisons, model size/quality tradeoffs
