Commit graph

3 commits

Author SHA1 Message Date
Till JS
dff8629e1d feat(mana-llm): M1 — AliasRegistry + aliases.yaml SSOT
First milestone of the LLM-fallback plan (docs/plans/llm-fallback-aliases.md).
Introduces the `mana/<class>` namespace; the registry parses + validates
aliases.yaml at startup and reloads on demand. Schema-rejects empty
chains, missing provider prefixes, alias names outside the reserved
namespace, default→unknown references, etc.

Reload semantics: parse error keeps the previous good state in memory
so a typo + SIGHUP doesn't take the service down.

5 aliases ship with the initial config: fast-text, long-form, structured,
reasoning, vision. Each chain ends with a cloud provider so the system
keeps working when the GPU server is offline.

32 unit tests covering happy path, schema validation, namespace check,
reload safety, and a guard that the shipped aliases.yaml itself parses.
M2 (health-cache + probe-loop) and M3 (router fallback execution) build
on this; aliases are not yet wired into the request path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 20:23:51 +02:00
Till JS
659a7d9774 fix(mana-llm): add google-genai to requirements.txt for Docker builds
google-genai was in pyproject.toml but missing from requirements.txt.
The Dockerfile uses pip install -r requirements.txt, so the Google
provider never loaded in production. Now that the key is set and the
cloud tier upgraded to gemini-2.5-flash, the import fires on startup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:40:30 +02:00
Till-JS
1495dbe476 feat(mana-llm): add central LLM abstraction service
Python/FastAPI service providing unified OpenAI-compatible API for
Ollama and cloud LLM providers (OpenRouter, Groq, Together).

Features:
- Chat completions with streaming (SSE)
- Vision/multimodal support
- Embeddings generation
- Multi-provider routing (provider/model format)
- Prometheus metrics
- Optional Redis caching
2026-01-29 22:01:00 +01:00