mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-14 20:01:09 +02:00
📝 docs(mana-stt): document Whisper + Mistral API architecture
- Disable vLLM by default (has issues on macOS CPU)
- Use Mistral API for Voxtral transcription (cloud-based)
- Keep Whisper-MLX for local transcription
- Update README with architecture diagram

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent 7c9c2645e3
commit 21d50d1e0b
2 changed files with 27 additions and 7 deletions
@@ -1,14 +1,31 @@
 # ManaCore STT Service
 
-Speech-to-Text API service with **Whisper (Lightning MLX)** and **Voxtral Mini**.
+Speech-to-Text API service with **Whisper (Lightning MLX)** and **Voxtral (Mistral API)**.
 
 Optimized for Mac Mini M4 (Apple Silicon).
 
+## Architecture
+
+```
+              ┌─────────────────────┐
+              │   mana-stt (3020)   │
+              │       FastAPI       │
+              └─────────┬───────────┘
+                        │
+       ┌────────────────┼────────────────┐
+       ▼                ▼                ▼
+┌──────────────┐ ┌──────────────┐ ┌──────────────┐
+│   Whisper    │ │ Voxtral API  │ │     vLLM     │
+│ MLX (Local)  │ │  (Mistral)   │ │  (Optional)  │
+└──────────────┘ └──────────────┘ └──────────────┘
+```
+
 ## Features
 
-- **Whisper Large V3 Turbo** - Best quality, 99+ languages, German WER 6-9%
-- **Voxtral Mini (3B)** - Mistral AI, Apache 2.0, 8 languages including German
-- **Apple Silicon Optimized** - Uses MLX for 10x faster inference
+- **Whisper Large V3** - Best quality, 99+ languages, German WER 6-9% (local, MLX)
+- **Voxtral Mini** - Mistral API, speaker diarization support (cloud)
+- **Apple Silicon Optimized** - Uses MLX for fast local inference
 - **Automatic Fallback** - Falls back between backends automatically
 - **REST API** - Simple HTTP endpoints for integration
 
 ## Quick Start
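The "Automatic Fallback" feature listed above can be sketched as an ordered try-each-backend loop. This is an illustrative stand-in, not the actual mana-stt internals: the backend names and the `transcribe_with_fallback` signature are assumptions.

```python
# Minimal sketch of automatic fallback across STT backends.
# Names and signatures are illustrative, not mana-stt's real API.

def transcribe_with_fallback(audio_path, backends):
    """Try each (name, backend) pair in order; return the first result."""
    errors = {}
    for name, backend in backends:
        try:
            return name, backend(audio_path)
        except Exception as exc:  # a real service would catch narrower errors
            errors[name] = exc
    raise RuntimeError(f"all backends failed: {errors}")


# Stand-in backends: local Whisper-MLX fails, the Voxtral API succeeds.
def whisper_mlx(path):
    raise RuntimeError("MLX backend unavailable")


def voxtral_api(path):
    return "transcript from Voxtral"


used, text = transcribe_with_fallback(
    "clip.wav",
    [("whisper-mlx", whisper_mlx), ("voxtral-api", voxtral_api)],
)
```

Here `used` ends up as `"voxtral-api"` because the first backend raised and the loop moved on.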
@ -85,9 +102,12 @@ Environment variables:
|
|||
| Variable | Default | Description |
|
||||
|----------|---------|-------------|
|
||||
| `PORT` | `3020` | API server port |
|
||||
| `WHISPER_MODEL` | `large-v3-turbo` | Default Whisper model |
|
||||
| `WHISPER_MODEL` | `large-v3` | Default Whisper model |
|
||||
| `PRELOAD_MODELS` | `false` | Load models on startup |
|
||||
| `CORS_ORIGINS` | `https://mana.how,...` | Allowed CORS origins |
|
||||
| `MISTRAL_API_KEY` | - | Required for Voxtral API |
|
||||
| `USE_VLLM` | `false` | Enable vLLM backend (experimental) |
|
||||
| `VLLM_URL` | `http://localhost:8100` | vLLM server URL |
|
||||
|
||||
## Supported Audio Formats
|
||||
|
||||
|
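The table above maps directly onto `os.getenv` calls with the listed defaults. A sketch of that mapping (illustrative only; the service's real config module may differ, and the truncated `CORS_ORIGINS` default is omitted):

```python
import os

# Illustrative config loading mirroring the environment-variable table.
PORT = int(os.getenv("PORT", "3020"))
WHISPER_MODEL = os.getenv("WHISPER_MODEL", "large-v3")
PRELOAD_MODELS = os.getenv("PRELOAD_MODELS", "false").lower() == "true"
MISTRAL_API_KEY = os.getenv("MISTRAL_API_KEY")  # no default: required for Voxtral
USE_VLLM = os.getenv("USE_VLLM", "false").lower() == "true"
VLLM_URL = os.getenv("VLLM_URL", "http://localhost:8100")
```

With nothing set in the environment, this yields the defaults from the table and leaves `MISTRAL_API_KEY` as `None`, so only the local Whisper-MLX backend would be usable.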
@@ -32,9 +32,9 @@ CORS_ORIGINS = os.getenv(
     "https://mana.how,https://chat.mana.how,http://localhost:5173"
 ).split(",")
 
-# vLLM configuration
+# vLLM configuration (disabled by default - has issues on macOS CPU)
 VLLM_URL = os.getenv("VLLM_URL", "http://localhost:8100")
-USE_VLLM = os.getenv("USE_VLLM", "true").lower() == "true"
+USE_VLLM = os.getenv("USE_VLLM", "false").lower() == "true"
 
 
 # Response models
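The hunk above flips the `USE_VLLM` default from `"true"` to `"false"`. Note that the parsing idiom `os.getenv(...).lower() == "true"` treats only the literal string `true` (case-insensitively) as enabled; everything else, including `"1"`, reads as disabled. A quick check of that behavior, using a hypothetical `DEMO_USE_VLLM` variable:

```python
import os


def env_flag(name: str, default: str = "false") -> bool:
    # Same idiom as USE_VLLM above: only the string "true" (any case) is truthy.
    return os.getenv(name, default).lower() == "true"


os.environ["DEMO_USE_VLLM"] = "True"
assert env_flag("DEMO_USE_VLLM") is True

os.environ["DEMO_USE_VLLM"] = "1"  # "1" is not "true", so this reads as False
assert env_flag("DEMO_USE_VLLM") is False

assert env_flag("DEMO_UNSET_VAR") is False  # falls back to the "false" default
```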