managarten/services/mana-llm/src/models/responses.py
Till JS e757470cb0 feat(mana-llm): add OpenAI-style tools + tool_calls passthrough
Extends the chat-completions surface so callers can ask any provider
to call named functions and get structured tool_calls back. Wired
through all three provider adapters so the planner and companion can
switch off the fragile JSON-parsing pathway.

- Request: tools[], tool_choice, assistant tool_calls, tool-role
  messages with tool_call_id.
- Response: MessageResponse.tool_calls, Choice.finish_reason adds
  "tool_calls", DeltaContent streams tool_calls.
- Google provider: Tool(function_declarations=...) build, result
  normalised (args dict → JSON string), function_response parts on
  a user turn for tool-role messages.
- OpenAI-compat: 1:1 passthrough of the OpenAI spec.
- Ollama: /api/chat passthrough; model-level capability check via a
  TOOL_CAPABLE_OLLAMA_PATTERNS whitelist (llama3.1+, qwen2.5+,
  mistral, command-r, …) — unsupported models rejected rather than
  silently falling back to prose.
- Router: model_supports_tools() check upfront for both streaming
  and non-streaming paths; ProviderCapabilityError bubbles as 400.

No silent downgrade. Missing tool support = explicit error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 15:22:48 +02:00

120 lines
2.9 KiB
Python

"""Response models for OpenAI-compatible API."""
import time
import uuid
from typing import Literal
from pydantic import BaseModel, Field
class Usage(BaseModel):
"""Token usage information."""
prompt_tokens: int = 0
completion_tokens: int = 0
total_tokens: int = 0
class ToolCallFunction(BaseModel):
"""Function portion of a tool_call the assistant produced."""
name: str
arguments: str # JSON-encoded (OpenAI spec)
class ToolCall(BaseModel):
"""A tool invocation the assistant decided to make."""
id: str
type: Literal["function"] = "function"
function: ToolCallFunction
class MessageResponse(BaseModel):
"""Response message from the model."""
role: Literal["assistant"] = "assistant"
content: str | None = None
tool_calls: list[ToolCall] | None = None
class Choice(BaseModel):
"""A single completion choice."""
index: int = 0
message: MessageResponse
finish_reason: (
Literal["stop", "length", "content_filter", "tool_calls"] | None
) = "stop"
class ChatCompletionResponse(BaseModel):
"""Response from chat completions endpoint (non-streaming)."""
id: str = Field(default_factory=lambda: f"chatcmpl-{uuid.uuid4().hex[:12]}")
object: Literal["chat.completion"] = "chat.completion"
created: int = Field(default_factory=lambda: int(time.time()))
model: str
choices: list[Choice]
usage: Usage = Field(default_factory=Usage)
class DeltaContent(BaseModel):
"""Delta content for streaming responses."""
role: Literal["assistant"] | None = None
content: str | None = None
tool_calls: list[ToolCall] | None = None
class StreamChoice(BaseModel):
"""A single streaming choice."""
index: int = 0
delta: DeltaContent
finish_reason: (
Literal["stop", "length", "content_filter", "tool_calls"] | None
) = None
class ChatCompletionStreamResponse(BaseModel):
"""Response chunk from chat completions endpoint (streaming)."""
id: str = Field(default_factory=lambda: f"chatcmpl-{uuid.uuid4().hex[:12]}")
object: Literal["chat.completion.chunk"] = "chat.completion.chunk"
created: int = Field(default_factory=lambda: int(time.time()))
model: str
choices: list[StreamChoice]
class ModelInfo(BaseModel):
"""Information about a model."""
id: str
object: Literal["model"] = "model"
created: int = Field(default_factory=lambda: int(time.time()))
owned_by: str = "mana-llm"
class ModelsResponse(BaseModel):
"""Response from models endpoint."""
object: Literal["list"] = "list"
data: list[ModelInfo]
class EmbeddingData(BaseModel):
"""A single embedding result."""
object: Literal["embedding"] = "embedding"
index: int = 0
embedding: list[float]
class EmbeddingResponse(BaseModel):
"""Response from embeddings endpoint."""
object: Literal["list"] = "list"
data: list[EmbeddingData]
model: str
usage: Usage = Field(default_factory=Usage)