B1 (token usage) and B2 (server-iteration auto-execution) shipped in
the follow-up session. B3 — extending the LlmBackend interface with
tool-call passthrough and wiring both runners through the orchestrator
instead of direct-fetch — was scoped out after honest re-evaluation:
- Browser-local Gemma can't do tool calling reliably, so the
tier-fallback value is low (the tool tier collapses to
mana-server/cloud anyway).
- BYOK/cloud routing via mana-llm proxy is functionally equivalent
between direct-fetch and orchestrator paths.
- ~6 h of work across 8 files, with no concrete user-facing win.
Kept the entry point documented (sketched below) for whenever a use
case actually needs tier-routing of planner calls.
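
For the record, a minimal sketch of what the B3 surface could look
like, assuming the existing backend exposes a plain completion method.
Everything here except the LlmBackend name is illustrative, not the
actual contract:

    // Illustrative only: ToolSpec/ToolCall and the method shapes are
    // assumptions, not the real @mana interfaces.
    interface ToolSpec {
      name: string;
      description: string;
      parameters: Record<string, unknown>; // JSON Schema for the args
    }

    interface ToolCall {
      id: string;
      name: string;
      args: Record<string, unknown>;
    }

    interface LlmBackend {
      complete(prompt: string): Promise<string>; // existing text path

      // B3 would add structured tool-call passthrough so the
      // orchestrator can tier-route planner calls instead of each
      // runner direct-fetching its provider.
      completeWithTools?(
        prompt: string,
        tools: ToolSpec[],
      ): Promise<{ text?: string; toolCalls: ToolCall[] }>;
    }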
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan for ripping out the fragile text-JSON parser and the propose-approve
flow in one atomic PR. Key shifts:
- LLM uses native function calling: SDK-guaranteed structure, no
parser (sketch after this list)
- Tool policy becomes auto | deny (no propose, no confirm for now)
- Timeline + per-iteration revert replace the proposal inbox as the
review surface; missions run end-to-end without human approval
- Safety via mission-budget, manual-cadence, agent-policy, revert
- No _rationale meta-param (tool name + params are self-explanatory)
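
Roughly what the parser-free path looks like. This sketch uses the
OpenAI Node SDK's tool-calling shape as a stand-in; the actual SDK we
route through may differ, and create_event plus the policy map are
invented for illustration:

    import OpenAI from "openai";

    const client = new OpenAI(); // reads OPENAI_API_KEY from the env

    // auto | deny per tool; no propose/confirm tier anymore
    const policy: Record<string, "auto" | "deny"> = {
      create_event: "auto",
      delete_account: "deny",
    };

    const res = await client.chat.completions.create({
      model: "gpt-4o",
      messages: [{ role: "user", content: "Schedule a sync at 10." }],
      tools: [{
        type: "function",
        function: {
          name: "create_event",
          description: "Create a calendar event",
          parameters: {
            type: "object",
            properties: {
              title: { type: "string" },
              startIso: { type: "string" },
            },
            required: ["title", "startIso"],
          },
        },
      }],
    });

    // SDK-guaranteed structure: no regexing JSON out of prose.
    for (const call of res.choices[0].message.tool_calls ?? []) {
      if (policy[call.function.name] !== "auto") continue; // denied: skip
      const args = JSON.parse(call.function.arguments); // JSON string per spec
      // ...dispatch to the tool registry; record in timeline for revert
    }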
Applies to the webapp runner, the mana-ai server runner, and companion
chat: all three share one runPlannerLoop from @mana/shared-ai after the
migration (rough call shape sketched below).
Net: ~1000 LoC deleted, ~600 added.
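
A guess at the shared entry point after migration. runPlannerLoop and
@mana/shared-ai come from this plan; every option below (backend,
tools, policy, budget, onIteration) is an assumed shape, not the real
signature:

    import { runPlannerLoop } from "@mana/shared-ai";

    // Free variables stand in for real wiring; shapes are guesses.
    declare const backend: unknown;  // the runner's LlmBackend
    declare const tools: unknown[];  // ToolSpec[] exposed to the model
    declare const timeline: { record(step: unknown): void };

    await runPlannerLoop({
      backend,
      tools,
      policy: { "*": "auto" },                           // auto | deny per tool
      budget: { maxIterations: 12, maxTokens: 200_000 }, // mission-budget rail
      onIteration: (step: unknown) => timeline.record(step), // revert hook
    });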
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>