managarten

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-14 22:01:09 +02:00

Till JS 8a49e3ffd5 feat(mana-llm): M4 — observability, debug endpoints, SIGHUP reload	2026-04-26 20:52:28 +02:00

- `X-Mana-LLM-Resolved: <provider>/<model>` header on non-streaming responses. Streaming clients read the same info from each chunk's `model` field (SSE headers go out before the chain is walked).
- Three new Prometheus metrics:
  - `mana_llm_alias_resolved_total{alias, target}`: which concrete model an alias resolved to per request.
  - `mana_llm_fallback_total{from_model, to_model, reason}`: each fallback transition.
  - `mana_llm_provider_healthy{provider}`: gauge, mirrors the circuit-breaker.
- New debug endpoints:
  - `GET /v1/aliases`: registry inspection (chain + description per alias), useful for confirming SIGHUP reloads.
  - `GET /v1/health`: full per-provider liveness snapshot (failure counter, last error, unhealthy-until backoff).
- `kill -HUP <pid>` reloads `aliases.yaml`. Parse errors leave the previous good state in memory and log the rejection.
- `ProviderHealthCache.add_listener()` for cache→metrics decoupling: the gauge is updated via a transition-only listener wired in main.py rather than the cache importing prometheus_client itself.
- Request-side metrics now use the requested model string, success-side uses the resolved one. So `mana_llm_llm_requests_total{provider="ollama", model="gemma3:12b"}` reflects actual upstream load even when callers used `mana/long-form` aliases.

16 new observability tests (test_m4_observability.py): listener fire-on-transition semantics, exception-isolation, multi-listener, counter increments, gauge writes, end-to-end alias→metric flow, v1/aliases + v1/health endpoint shape, response.model carries the resolved target after fallback.

Total suite: 115/115 in 1.6s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
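The transition-only listener described in the commit message can be illustrated with a minimal sketch. This is not the repository's implementation: only the name `ProviderHealthCache.add_listener()` comes from the commit message, while the constructor, `record_success`, `record_failure`, `failure_threshold`, and the plain dict standing in for the Prometheus gauge are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical listener signature: (provider name, is_healthy).
Listener = Callable[[str, bool], None]


class ProviderHealthCache:
    """Illustrative stand-in, not the real mana-llm class."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self._threshold = failure_threshold  # assumed parameter
        self._failures: Dict[str, int] = {}
        self._healthy: Dict[str, bool] = {}
        self._listeners: List[Listener] = []

    def add_listener(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def record_success(self, provider: str) -> None:
        self._failures[provider] = 0
        self._set_state(provider, True)

    def record_failure(self, provider: str) -> None:
        self._failures[provider] = self._failures.get(provider, 0) + 1
        if self._failures[provider] >= self._threshold:
            self._set_state(provider, False)

    def _set_state(self, provider: str, healthy: bool) -> None:
        # Providers are assumed healthy until proven otherwise.
        previous = self._healthy.get(provider, True)
        self._healthy[provider] = healthy
        if previous == healthy:
            return  # transition-only: no state change, no notification
        for listener in self._listeners:
            try:
                listener(provider, healthy)
            except Exception:
                # Exception isolation: one broken listener must not
                # stop the others or break the cache itself.
                pass


# A dict stands in for the gauge so the cache never imports a metrics library.
gauge: Dict[str, float] = {}
events: List[Tuple[str, bool]] = []


def broken_listener(provider: str, healthy: bool) -> None:
    raise RuntimeError("listener failure")


cache = ProviderHealthCache(failure_threshold=2)
cache.add_listener(broken_listener)
cache.add_listener(lambda p, h: gauge.__setitem__(p, 1.0 if h else 0.0))
cache.add_listener(lambda p, h: events.append((p, h)))

cache.record_success("ollama")  # already healthy by default: no event
cache.record_failure("ollama")  # below threshold: no event
cache.record_failure("ollama")  # threshold reached: healthy -> unhealthy
cache.record_failure("ollama")  # still unhealthy: no event
cache.record_success("ollama")  # unhealthy -> healthy
```

In this sketch the wiring code owns the metric and the cache only knows about callables, which is one way to get the decoupling the commit message describes. The listeners fire twice in total, once per state change, even though five calls were recorded.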
..
models	feat(mana-llm): add OpenAI-style tools + tool_calls passthrough	2026-04-20 15:22:48 +02:00
providers	feat(mana-llm): M4 — observability, debug endpoints, SIGHUP reload	2026-04-26 20:52:28 +02:00
streaming	fix(mana-llm): surface Gemini finish_reason errors instead of returning ""	2026-04-20 15:15:37 +02:00
utils	feat(mana-llm): M4 — observability, debug endpoints, SIGHUP reload	2026-04-26 20:52:28 +02:00
__init__.py	✨ feat(mana-llm): add central LLM abstraction service	2026-01-29 22:01:00 +01:00
aliases.py	feat(mana-llm): M1 — AliasRegistry + aliases.yaml SSOT	2026-04-26 20:23:51 +02:00
api_auth.py	chore(ai-services): adopt Windows GPU as source of truth for llm/stt/tts	2026-04-08 12:46:03 +02:00
config.py	chore(cloud-tier): upgrade default model gemini-2.0-flash → gemini-2.5-flash	2026-04-16 12:32:03 +02:00
health.py	feat(mana-llm): M4 — observability, debug endpoints, SIGHUP reload	2026-04-26 20:52:28 +02:00
health_probe.py	feat(mana-llm): M2 — ProviderHealthCache + background probe loop	2026-04-26 20:29:57 +02:00
main.py	feat(mana-llm): M4 — observability, debug endpoints, SIGHUP reload	2026-04-26 20:52:28 +02:00