managarten/services/mana-llm/src/streaming/sse.py
Till JS 4b8fede7fc fix(mana-llm): surface Gemini finish_reason errors instead of returning ""
The google provider called response.text after a chat completion and
passed the resulting string downstream unchanged. When Gemini's content
filter, recitation guard, or max_tokens ceiling fired, response.text
quietly returned "" — which the planner then reported as "no JSON block
found", masking the real cause. Empirically this failed in 45 ms on a
simple Quiz mission.

Introduces providers/errors.py with a small ProviderError hierarchy
(Blocked / Truncated / Auth / RateLimit / Capability). google.py now
inspects response.candidates[0].finish_reason and raises the matching
structured error; the non-streaming path maps it to 422/502/429 via a
new except-branch in main.py, and the streaming path surfaces the kind
as the SSE error type. Capability is wired but not yet used — it lands
with the tool-schema passthrough in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 15:15:37 +02:00
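
The finish_reason check described in the commit message could look roughly like the sketch below. Only ProviderError, its kind attribute, and the Blocked / Truncated categories come from the source. The subclass names, the kind strings, and the helper check_finish_reason are hypothetical, and finish reasons are modelled as plain strings rather than the Gemini SDK enum.

```python
class ProviderError(Exception):
    """Base class; `kind` is what the SSE path reports as the error type."""

    kind = "provider_error"  # hypothetical value


class BlockedError(ProviderError):
    """Content filter or recitation guard fired (hypothetical class name)."""

    kind = "blocked"  # hypothetical value


class TruncatedError(ProviderError):
    """Generation hit the max_tokens ceiling (hypothetical class name)."""

    kind = "truncated"  # hypothetical value


def check_finish_reason(finish_reason: str, text: str) -> str:
    """Return `text` on a normal stop; raise a structured error otherwise.

    Hypothetical helper: the real google.py reads
    response.candidates[0].finish_reason; here it is passed in as a string.
    """
    if finish_reason in ("SAFETY", "RECITATION"):
        raise BlockedError(f"response blocked: finish_reason={finish_reason}")
    if finish_reason == "MAX_TOKENS":
        raise TruncatedError("response truncated: finish_reason=MAX_TOKENS")
    return text
```

Raising instead of returning "" means the caller sees the cause (blocked or truncated) rather than a downstream "no JSON block found" symptom.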


"""Server-Sent Events (SSE) response handling."""

import json
import logging
from collections.abc import AsyncIterator

from src.models import ChatCompletionRequest, ChatCompletionStreamResponse
from src.providers import ProviderRouter
from src.providers.errors import ProviderError

logger = logging.getLogger(__name__)


async def stream_chat_completion(
    router: ProviderRouter,
    request: ChatCompletionRequest,
) -> AsyncIterator[dict]:
    """
    Stream chat completion responses for SSE.

    Yields dicts that EventSourceResponse will serialize as:
        data: {"choices":[{"delta":{"content":"Hello"}}]}
        data: [DONE]
    """
    try:
        async for chunk in router.chat_completion_stream(request):
            # Yield dict for EventSourceResponse to serialize
            yield {"data": json.dumps(chunk.model_dump(exclude_none=True))}
        # Send final [DONE] marker
        yield {"data": "[DONE]"}
    except ProviderError as e:
        logger.warning(f"Streaming provider error: kind={e.kind} detail={e}")
        error_data = {
            "error": {
                "message": str(e),
                "type": e.kind,
            }
        }
        yield {"data": json.dumps(error_data)}
        yield {"data": "[DONE]"}
    except Exception as e:
        logger.error(f"Streaming error: {e}")
        error_data = {
            "error": {
                "message": str(e),
                "type": "server_error",
            }
        }
        yield {"data": json.dumps(error_data)}
        yield {"data": "[DONE]"}
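
Because errors arrive as an ordinary data event followed by [DONE], a consumer has to inspect each payload for an "error" key. The sketch below shows one way to do that. fake_stream and collect are hypothetical names; fake_stream only mimics the dict shapes yielded above so the example runs without the src package.

```python
import asyncio
import json
from collections.abc import AsyncIterator
from typing import Optional


async def fake_stream(fail: bool) -> AsyncIterator[dict]:
    """Hypothetical stand-in that yields the same dict shapes as the generator above."""
    yield {"data": json.dumps({"choices": [{"delta": {"content": "Hel"}}]})}
    if fail:
        # Mid-stream failure: an error payload, then the usual terminator.
        error = {"error": {"message": "blocked by filter", "type": "blocked"}}
        yield {"data": json.dumps(error)}
    else:
        yield {"data": json.dumps({"choices": [{"delta": {"content": "lo"}}]})}
    yield {"data": "[DONE]"}


async def collect(events: AsyncIterator[dict]) -> tuple[str, Optional[dict]]:
    """Join content deltas and capture the error object, if one was sent."""
    parts: list[str] = []
    error: Optional[dict] = None
    async for event in events:
        payload = event["data"]
        if payload == "[DONE]":
            break
        body = json.loads(payload)
        if "error" in body:
            error = body["error"]
            continue
        for choice in body.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts), error


if __name__ == "__main__":
    print(asyncio.run(collect(fake_stream(fail=False))))
    print(asyncio.run(collect(fake_stream(fail=True))))
```

Any text received before the error is still returned, so the caller can decide whether to show or discard the partial output.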