mirror of https://github.com/Memo-2023/mana-monorepo.git
(synced 2026-05-20 19:26:41 +02:00)
feat: add Ollama memory optimization, LLM metrics, and chat streaming
Three improvements to the unified LLM infrastructure:

1. Ollama memory optimization (scripts/mac-mini/configure-ollama.sh):
   - OLLAMA_KEEP_ALIVE=5m → models unload after 5min idle (saves 3-16GB RAM)
   - OLLAMA_NUM_PARALLEL=1 → predictable memory usage
   - OLLAMA_MAX_LOADED_MODELS=1 → max 1 model in RAM at a time

2. Request-level metrics in @manacore/shared-llm:
   - LlmRequestMetrics interface (model, latency, tokens, fallback detection)
   - LlmMetricsCollector class with summary stats (for health endpoints)
   - Optional onMetrics callback in LlmModuleOptions
   - Automatic metrics emission in chatMessages() (success + error)

3. Chat streaming (token-by-token SSE):
   - Backend: POST /chat/completions/stream SSE endpoint
   - OllamaService.createStreamingCompletion() via llm.chatStreamMessages()
   - ChatService.createStreamingCompletion() with upfront credit consumption
   - Web: chatApi.createStreamingCompletion() SSE consumer
   - Chat store: sendMessage() now streams tokens into assistant message
   - UI updates reactively as each token arrives

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
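
The SSE consumer mentioned in item 3 has to cope with events that are split across network chunks. The following is a minimal sketch of that buffering step, not the code from this commit: `parseSseBuffer` and `collectTokens` are hypothetical names, and treating each `data:` payload as raw token text is an assumption about the wire format of `/chat/completions/stream`.

```typescript
// Hypothetical helper, not taken from the repository.
interface SseParseResult {
  events: string[]; // data payloads of all complete events in the buffer
  rest: string; // trailing partial event, to be prepended to the next chunk
}

function parseSseBuffer(buffer: string): SseParseResult {
  // Normalize CRLF so that an event boundary is always "\n\n".
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n");
  // The last block has not been terminated by a blank line yet.
  const rest = blocks.pop() ?? "";
  const events: string[] = [];
  for (const block of blocks) {
    const dataLines = block
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      // Per the SSE format, a single space after the colon is not part of the value.
      .map((line) => line.slice(5).replace(/^ /, ""));
    if (dataLines.length > 0) {
      events.push(dataLines.join("\n"));
    }
  }
  return { events, rest };
}

// Simulates a stream: feeds chunks one by one and returns the tokens in order.
// Assumption: each event's data payload is one token of plain text.
function collectTokens(chunks: string[]): string[] {
  const tokens: string[] = [];
  let pending = "";
  for (const chunk of chunks) {
    const { events, rest } = parseSseBuffer(pending + chunk);
    pending = rest;
    tokens.push(...events);
  }
  return tokens;
}
```

In a real client the chunks would come from the response body reader, and each returned token would be appended to the assistant message so the UI can update as it arrives.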
parent ecda4535d8
commit 56ffcbac39
13 changed files with 462 additions and 29 deletions
@@ -594,8 +594,13 @@ System Settings → Privacy & Security → Full Disk Access
**LaunchAgent:** `~/Library/LaunchAgents/homebrew.mxcl.ollama.plist`

Optimizations already enabled:
- `OLLAMA_KEEP_ALIVE=5m` - unloads models from RAM after 5 minutes of inactivity (saves 3-16 GB)
- `OLLAMA_FLASH_ATTENTION=1` - faster attention computation
- `OLLAMA_KV_CACHE_TYPE=q8_0` - more memory-efficient KV cache
- `OLLAMA_NUM_PARALLEL=1` - at most 1 parallel request (predictable RAM usage)
- `OLLAMA_MAX_LOADED_MODELS=1` - at most 1 model loaded in RAM at a time

Setup script: `./scripts/mac-mini/configure-ollama.sh`
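
Since Ollama runs from a LaunchAgent here, one way these settings could reach the process is the plist's `EnvironmentVariables` dictionary. The sketch below only illustrates that shape; whether `configure-ollama.sh` writes the plist this way is an assumption, and `renderEnvironmentVariables` is a hypothetical helper.

```typescript
// The five settings listed in the documentation above.
const ollamaEnv: Record<string, string> = {
  OLLAMA_KEEP_ALIVE: "5m",
  OLLAMA_FLASH_ATTENTION: "1",
  OLLAMA_KV_CACHE_TYPE: "q8_0",
  OLLAMA_NUM_PARALLEL: "1",
  OLLAMA_MAX_LOADED_MODELS: "1",
};

// Hypothetical helper: renders a launchd EnvironmentVariables fragment.
// The values above contain no XML special characters, so no escaping is done.
function renderEnvironmentVariables(env: Record<string, string>): string {
  const entries = Object.entries(env)
    .map(([key, value]) => `    <key>${key}</key>\n    <string>${value}</string>`)
    .join("\n");
  return `<key>EnvironmentVariables</key>\n<dict>\n${entries}\n</dict>`;
}

const plistFragment = renderEnvironmentVariables(ollamaEnv);
console.log(plistFragment);
```

Keeping the values in one typed map makes it easy to compare the documented settings with what is actually deployed.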

### Storage location