managarten

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-15 13:21:08 +02:00

Author	SHA1	Message	Date
Till JS	45063b88be	feat(mana-llm): add Google Gemini fallback provider with auto-routing	2026-03-23 22:44:09 +01:00

    Add Google Gemini as a fallback provider that activates automatically when Ollama is overloaded or unavailable, ensuring LLM requests always succeed even under load.

    New provider (src/providers/google.py):
    - Full LLMProvider implementation using google-genai SDK
    - Chat completions (streaming + non-streaming)
    - Vision/multimodal support (base64 images)
    - Embeddings via text-embedding-004
    - Model mapping: Ollama models → Gemini equivalents (gemma3:4b → gemini-2.0-flash, llava:7b → gemini-2.0-flash, etc.)

    Auto-fallback routing (src/providers/router.py):
    - Concurrent request tracking for Ollama (OLLAMA_MAX_CONCURRENT=3)
    - When Ollama concurrent > max: route to Google automatically
    - When Ollama fails: retry on Google with model mapping
    - Health check caching (5s TTL) to avoid hammering Ollama
    - Non-Ollama providers (openrouter, groq, together) are never fallback-routed
    - Fallback info included in /health endpoint response

    New config (src/config.py):
    - GOOGLE_API_KEY: enables Google provider
    - GOOGLE_DEFAULT_MODEL: default gemini-2.0-flash
    - AUTO_FALLBACK_ENABLED: toggle fallback (default: true)
    - OLLAMA_MAX_CONCURRENT: concurrent request threshold (default: 3)

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Till-JS	aba79f5c16	fix(mana-llm): fix SSE double data prefix causing message parsing issues	2026-02-02 15:29:11 +01:00

    EventSourceResponse from sse-starlette adds its own 'data:' prefix, so we should yield dicts with a 'data' key instead of pre-formatted SSE strings. This was causing 'data: data:' double prefixes and backticks appearing in chat messages.

    Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Till-JS	d605366460	✨ feat(llm-playground): add model comparison feature	2026-01-31 23:30:16 +01:00

    - Add modality detection (text/vision/code) to models store
    - Create comparison store for parallel multi-model streaming
    - Add ModelModalityFilter and ModelComparisonSelector components
    - Add ComparisonResponseCard with metrics (duration, tokens, t/s)
    - Add ComparisonMessageBubble for side-by-side response view
    - Integrate comparison mode into ChatInput, MessageList, Sidebar
    - Add dev:full script to start mana-llm + playground together
    - Add start.sh script for mana-llm Python service

Till-JS	fdba0e3425	feat(llm-playground): add production deployment with auth	2026-01-30 18:15:02 +01:00

    - Add Dockerfile for multi-stage Docker build
    - Add mana-core-auth integration with login/register pages
    - Add auth store using Svelte 5 runes
    - Add protected route layout with auth guard
    - Add health endpoint for container health checks
    - Add runtime URL injection via hooks.server.ts
    - Add logout button to header
    - Update docker-compose.macmini.yml with llm-playground service
    - Update cloudflared-config.yml with playground.mana.how route
    - Update mana-llm CORS config for playground domain
    - Update generate-env.mjs with auth URL variable

    Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Till-JS	3edbd0cb26	chore: update dependencies and mana-llm improvements	2026-01-30 17:50:58 +01:00

    - Update pnpm-lock.yaml with matrix bot dependencies
    - Add environment variables to generate-env.mjs
    - Improve mana-llm config and ollama provider

    Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Till-JS	1495dbe476	✨ feat(mana-llm): add central LLM abstraction service	2026-01-29 22:01:00 +01:00

    Python/FastAPI service providing unified OpenAI-compatible API for Ollama and cloud LLM providers (OpenRouter, Groq, Together).

    Features:
    - Chat completions with streaming (SSE)
    - Vision/multimodal support
    - Embeddings generation
    - Multi-provider routing (provider/model format)
    - Prometheus metrics
    - Optional Redis caching
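
The auto-fallback routing described in commit 45063b88be can be pictured roughly as follows. This is a minimal sketch, not the code in src/providers/router.py: the `FallbackRouter` class, the `map_model` helper, and the provider call signature `(model, prompt)` are all hypothetical, and whether the threshold check is `>=` or `>` is an assumption. Only the two model pairs, the default model, and the threshold value come from the commit message.

```python
import asyncio

# Values taken from the commit message; everything else here is illustrative.
OLLAMA_TO_GEMINI = {"gemma3:4b": "gemini-2.0-flash", "llava:7b": "gemini-2.0-flash"}
GOOGLE_DEFAULT_MODEL = "gemini-2.0-flash"
OLLAMA_MAX_CONCURRENT = 3


def map_model(ollama_model):
    """Translate an Ollama model name to a Gemini one (hypothetical helper)."""
    return OLLAMA_TO_GEMINI.get(ollama_model, GOOGLE_DEFAULT_MODEL)


class FallbackRouter:
    """Prefers Ollama; routes to Google when Ollama is saturated or fails."""

    def __init__(self, ollama, google, max_concurrent=OLLAMA_MAX_CONCURRENT):
        self.ollama = ollama  # async callable: (model, prompt) -> str
        self.google = google  # async callable: (model, prompt) -> str
        self.max_concurrent = max_concurrent
        self.active = 0       # number of in-flight Ollama requests

    async def chat(self, model, prompt):
        # Saturated: skip Ollama entirely (exact boundary is an assumption).
        if self.active >= self.max_concurrent:
            return await self.google(map_model(model), prompt)

        self.active += 1
        try:
            return await self.ollama(model, prompt)
        except Exception:
            pass  # Ollama failed: fall through and retry on Google
        finally:
            self.active -= 1  # release the slot before any fallback call

        return await self.google(map_model(model), prompt)
```

Releasing the Ollama slot in `finally` before the fallback call keeps a slow Google request from counting against the Ollama concurrency limit. The sketch omits the 5s health-check cache and the rule that non-Ollama providers are never fallback-routed.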

6 commits
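
The double-prefix bug fixed in commit aba79f5c16 is easy to reproduce in isolation. In the sketch below, `frame` is a hypothetical, simplified stand-in for what the response layer does to each yielded item; it is not sse-starlette's implementation, and the `{"delta": ...}` payload shape is an assumption.

```python
import json


def frame(item):
    """Wrap one yielded item in an SSE frame (simplified stand-in)."""
    # A dict contributes its "data" value; anything else is stringified.
    payload = item["data"] if isinstance(item, dict) else str(item)
    return f"data: {payload}\r\n\r\n"


chunk = json.dumps({"delta": "Hello"})

# Before the fix: the generator pre-formats the SSE line itself,
# so the framing layer adds a second "data:" prefix on top.
broken = frame(f"data: {chunk}\n\n")

# After the fix: the generator yields a dict with a "data" key
# and leaves the framing to the response layer.
fixed = frame({"data": chunk})

print(repr(broken))
print(repr(fixed))
```

A client that strips a single `data: ` prefix from the broken frame is left with a payload that still starts with `data: `, which is no longer valid JSON. That is consistent with the parsing issues the commit message describes.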