managarten/services/mana-llm

commit 45063b88be by Till JS: feat(mana-llm): add Google Gemini fallback provider with auto-routing
Add Google Gemini as a fallback provider that activates automatically
when Ollama is overloaded or unavailable, so LLM requests continue to
succeed even under load.

New provider (src/providers/google.py):
- Full LLMProvider implementation using google-genai SDK
- Chat completions (streaming + non-streaming)
- Vision/multimodal support (base64 images)
- Embeddings via text-embedding-004
- Model mapping: Ollama models → Gemini equivalents
  (gemma3:4b → gemini-2.0-flash, llava:7b → gemini-2.0-flash, etc.)
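The mapping above can be sketched as a simple lookup table. Only the gemma3:4b and llava:7b entries are confirmed by the commit message; the dict name, helper function, and the choice of gemini-2.0-flash as the default for unmapped models are illustrative assumptions.

```python
# Hypothetical sketch of the Ollama -> Gemini model mapping.
# Only the gemma3:4b and llava:7b entries appear in the commit message;
# everything else here is an assumption for illustration.
OLLAMA_TO_GEMINI = {
    "gemma3:4b": "gemini-2.0-flash",
    "llava:7b": "gemini-2.0-flash",
}

DEFAULT_GEMINI_MODEL = "gemini-2.0-flash"  # assumed catch-all default


def map_model(ollama_model: str) -> str:
    """Translate an Ollama model name to its Gemini equivalent."""
    return OLLAMA_TO_GEMINI.get(ollama_model, DEFAULT_GEMINI_MODEL)
```

A dict keyed by the exact Ollama model tag keeps the mapping trivially extensible as new local models are added.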

Auto-fallback routing (src/providers/router.py):
- Concurrent request tracking for Ollama (OLLAMA_MAX_CONCURRENT=3)
- When Ollama's in-flight requests exceed the limit: route to Google automatically
- When an Ollama request fails: retry on Google with the model mapping applied
- Health check caching (5s TTL) to avoid hammering Ollama
- Non-Ollama providers (openrouter, groq, together) are never fallback-routed
- Fallback info included in /health endpoint response
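The routing rules above can be sketched roughly as follows. This is a minimal illustration, not the actual src/providers/router.py: the class and method names are invented, and treating "in-flight >= max" as the overload condition is an interpretation of the threshold semantics.

```python
import time


class FallbackRouter:
    """Illustrative sketch of the auto-fallback routing (names are assumed)."""

    def __init__(self, max_concurrent: int = 3, health_ttl: float = 5.0):
        self.max_concurrent = max_concurrent          # OLLAMA_MAX_CONCURRENT
        self.health_ttl = health_ttl                  # 5s health-check cache TTL
        self.ollama_in_flight = 0
        self._health_cache = None                     # (checked_at, healthy)

    def ollama_healthy(self, probe) -> bool:
        """Health check with a short TTL cache to avoid hammering Ollama."""
        now = time.monotonic()
        if self._health_cache and now - self._health_cache[0] < self.health_ttl:
            return self._health_cache[1]
        healthy = probe()
        self._health_cache = (now, healthy)
        return healthy

    def pick_provider(self, requested: str, probe) -> str:
        # Only Ollama participates in fallback; openrouter, groq, together,
        # etc. are returned unchanged.
        if requested != "ollama":
            return requested
        # Overloaded: at or above the concurrency threshold -> Google.
        if self.ollama_in_flight >= self.max_concurrent:
            return "google"
        # Unavailable: failed (cached) health check -> Google.
        if not self.ollama_healthy(probe):
            return "google"
        return "ollama"
```

The key design point is that the health-check cache bounds probe traffic to at most one request per TTL window, so a burst of incoming requests during an Ollama outage does not multiply the load on the failing backend.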

New config (src/config.py):
- GOOGLE_API_KEY: enables Google provider
- GOOGLE_DEFAULT_MODEL: default gemini-2.0-flash
- AUTO_FALLBACK_ENABLED: toggle fallback (default: true)
- OLLAMA_MAX_CONCURRENT: concurrent request threshold (default: 3)
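A rough sketch of how src/config.py might read these settings. The variable names and defaults come from the commit message; the function shape and the "empty key disables the provider" convention are assumptions.

```python
def load_fallback_config(env: dict) -> dict:
    """Illustrative loader for the new settings (not the real src/config.py)."""
    return {
        # Empty string assumed to mean the Google provider stays disabled.
        "google_api_key": env.get("GOOGLE_API_KEY", ""),
        "google_default_model": env.get("GOOGLE_DEFAULT_MODEL", "gemini-2.0-flash"),
        "auto_fallback_enabled": env.get("AUTO_FALLBACK_ENABLED", "true").lower() == "true",
        "ollama_max_concurrent": int(env.get("OLLAMA_MAX_CONCURRENT", "3")),
    }
```

Taking the environment as a plain dict (rather than reading os.environ directly) keeps the defaults easy to exercise in tests.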

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Date: 2026-03-23 22:44:09 +01:00
Path                    Last commit                                                             Date
src                     feat(mana-llm): add Google Gemini fallback provider with auto-routing  2026-03-23 22:44:09 +01:00
tests                   feat(mana-llm): add central LLM abstraction service                    2026-01-29 22:01:00 +01:00
.env.example            feat(mana-llm): add central LLM abstraction service                    2026-01-29 22:01:00 +01:00
.gitignore              feat(mana-llm): add central LLM abstraction service                    2026-01-29 22:01:00 +01:00
CLAUDE.md               feat(mana-llm): add central LLM abstraction service                    2026-01-29 22:01:00 +01:00
docker-compose.dev.yml  feat(mana-llm): add central LLM abstraction service                    2026-01-29 22:01:00 +01:00
docker-compose.yml      feat(mana-llm): add central LLM abstraction service                    2026-01-29 22:01:00 +01:00
Dockerfile              feat(mana-llm): add central LLM abstraction service                    2026-01-29 22:01:00 +01:00
pyproject.toml          feat(mana-llm): add Google Gemini fallback provider with auto-routing  2026-03-23 22:44:09 +01:00
requirements.txt        feat(mana-llm): add central LLM abstraction service                    2026-01-29 22:01:00 +01:00
start.sh                feat(llm-playground): add model comparison feature                     2026-01-31 23:30:16 +01:00