till/managarten

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-18 13:29:39 +02:00

Commit graph

Author

SHA1

Message

Date

Till JS

4b8fede7fc

fix(mana-llm): surface Gemini finish_reason errors instead of returning ""

The google provider called response.text after a chat completion and
passed the resulting string downstream unchanged. When Gemini's content
filter, recitation guard, or max_tokens ceiling fired, response.text
quietly returned "" — which the planner then reported as "no JSON block
found", masking the real cause. Empirically this failed in 45 ms on a
simple Quiz mission.

Introduces providers/errors.py with a small ProviderError hierarchy
(Blocked / Truncated / Auth / RateLimit / Capability). google.py now
inspects response.candidates[0].finish_reason and raises the matching
structured error; the non-streaming path maps it to 422/502/429 via a
new except-branch in main.py, and the streaming path surfaces the kind
as the SSE error type. Capability is wired but not yet used — it lands
with the tool-schema passthrough in the next commit.
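The error handling described above can be sketched roughly as follows. This is a minimal illustration, not the contents of providers/errors.py: the class names, the `kind` strings, the finish-reason table, and the pairing of each kind with 422/502/429 are all assumptions, since the commit message names only the five categories and the three status codes. The finish reason is passed in as a plain string so the sketch runs without the Google SDK.

```python
class ProviderError(Exception):
    """Base class; `kind` is what a streaming path could expose as the SSE error type."""
    kind = "provider_error"


class ProviderBlockedError(ProviderError):
    kind = "blocked"


class ProviderTruncatedError(ProviderError):
    kind = "truncated"


class ProviderAuthError(ProviderError):
    kind = "auth"


class ProviderRateLimitError(ProviderError):
    kind = "rate_limit"


class ProviderCapabilityError(ProviderError):
    kind = "capability"


# Assumed table: which Gemini finish reasons count as which error.
FINISH_REASON_ERRORS = {
    "SAFETY": ProviderBlockedError,
    "RECITATION": ProviderBlockedError,
    "MAX_TOKENS": ProviderTruncatedError,
}

# Assumed pairing of error kind and HTTP status; the commit lists only the codes.
STATUS_BY_KIND = {"blocked": 422, "truncated": 502, "rate_limit": 429}


def check_finish_reason(finish_reason: str, text: str) -> str:
    """Return the text on a normal stop, raise a structured error otherwise."""
    error_cls = FINISH_REASON_ERRORS.get(finish_reason)
    if error_cls is not None:
        raise error_cls(f"Gemini stopped with finish_reason={finish_reason}")
    return text


def status_for(error: ProviderError) -> int:
    """Map an error to an HTTP status; 502 as the fallback is an assumption."""
    return STATUS_BY_KIND.get(error.kind, 502)
```

With a check like this in place, an empty completion caused by a safety block raises an error instead of reaching the planner as "".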

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-20 15:15:37 +02:00

Till-JS

aba79f5c16

fix(mana-llm): fix SSE double data prefix causing message parsing issues

EventSourceResponse from sse-starlette adds its own 'data:' prefix,
so we should yield dicts with a 'data' key instead of pre-formatted
SSE strings. This was causing 'data: data:' double prefixes and
backticks appearing in chat messages.
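The double prefix can be reproduced with a small stand-in for the framing step. `encode_sse` below is a hypothetical, simplified helper written for this illustration. It is not sse-starlette's code and it ignores multi-line data; it only mimics the behaviour the commit describes, where the response layer adds its own `data:` field to whatever the generator yields.

```python
import json


def encode_sse(item):
    """Simplified stand-in for the framing an SSE response layer applies."""
    if isinstance(item, dict):
        # A dict is treated as event fields; the layer writes the prefix itself.
        lines = []
        if "event" in item:
            lines.append(f"event: {item['event']}")
        lines.append(f"data: {item['data']}")
    else:
        # Any other value is treated as the raw data payload.
        lines = [f"data: {item}"]
    return "\n".join(lines) + "\n\n"


chunk = json.dumps({"delta": "Hi"})

# Before the fix: the generator yields an already formatted SSE string,
# so the payload ends up with two prefixes.
buggy = encode_sse(f"data: {chunk}\n\n")

# After the fix: the generator yields a dict with a 'data' key.
fixed = encode_sse({"data": chunk})
```

Here `buggy` starts with `data: data:`, while `fixed` carries a single prefix followed by the JSON payload.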

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-02 15:29:11 +01:00

Till-JS

1495dbe476

✨ feat(mana-llm): add central LLM abstraction service

Python/FastAPI service providing unified OpenAI-compatible API for
Ollama and cloud LLM providers (OpenRouter, Groq, Together).

Features:
- Chat completions with streaming (SSE)
- Vision/multimodal support
- Embeddings generation
- Multi-provider routing (provider/model format)
- Prometheus metrics
- Optional Redis caching
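The provider/model routing format listed above might be parsed along these lines. This is a sketch under stated assumptions: the function name, the lowercase provider keys, the fallback to Ollama for names without a prefix, and the example model names are all hypothetical and not taken from the service.

```python
# Provider keys are assumed spellings of the providers named in the commit.
KNOWN_PROVIDERS = {"ollama", "openrouter", "groq", "together"}
DEFAULT_PROVIDER = "ollama"  # assumption: bare model names go to the local provider


def split_model(model: str) -> tuple[str, str]:
    """Split 'provider/model' into (provider, model name).

    Only the first slash is a separator, because a provider's own
    model names may contain slashes themselves.
    """
    provider, sep, name = model.partition("/")
    if sep and name and provider in KNOWN_PROVIDERS:
        return provider, name
    return DEFAULT_PROVIDER, model
```

For example, `split_model("groq/some-model")` yields `("groq", "some-model")`, and a bare `"some-model"` falls back to the assumed default provider.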

2026-01-29 22:01:00 +01:00

3 commits