Three Claude-Code-inspired primitives for runPlannerLoop, derived from the
reverse-engineering reports in docs/reports/:
1. **Policy gate** (@mana/tool-registry) — evaluatePolicy() gates every tool
dispatch: denies admin-scope, denies destructive tools not in the user's
opt-in list, rate-limits per tool (default: 30 calls per 60 s), flags prompt-injection
markers in freetext without blocking. Wired into mana-mcp with a
per-user rolling invocation log and POLICY_MODE env (off|log-only|enforce,
default log-only). mana-ai uses detectInjectionMarker only — tool dispatch
there is plan-only, so rate-limit/destructive checks don't apply yet.
2. **Reminder channel** (packages/shared-ai/src/planner/loop.ts): new
reminderChannel callback in PlannerLoopInput. Called once per round with
a LoopState snapshot (round, toolCallCount, usage, lastCall); the returned
strings are wrapped in <reminder> tags and injected as transient system
messages into the current LLM request only. They are never pushed to
messages[]; this is the Claude-Code <system-reminder> pattern that keeps
the KV-cache prefix stable.
3. **Parallel reads** (loop.ts) — isParallelSafe predicate enables
Promise.all dispatch when every tool_call in a round is parallel-safe,
in batches of PARALLEL_TOOL_BATCH_SIZE=10. Any non-safe call downgrades
the whole round to sequential. messages[] always appends in source
order, never completion order, so the debug log stays linear.
Default-off (undefined predicate) preserves pre-M1 behaviour.
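
The parallel-read rule in item 3 can be sketched roughly as follows. This is an illustration under assumptions, not the code in loop.ts: the ToolCall and ToolMessage shapes, the dispatchRound name, and the run callback are hypothetical; only isParallelSafe and PARALLEL_TOOL_BATCH_SIZE come from the description above.

```typescript
// Hypothetical shapes; the real types in loop.ts may differ.
interface ToolCall { id: string; name: string; args: unknown }
interface ToolMessage { role: 'tool'; toolCallId: string; content: string }

const PARALLEL_TOOL_BATCH_SIZE = 10;

async function dispatchRound(
  calls: ToolCall[],
  run: (call: ToolCall) => Promise<string>,
  isParallelSafe?: (call: ToolCall) => boolean,
): Promise<ToolMessage[]> {
  const execute = async (call: ToolCall): Promise<ToolMessage> => ({
    role: 'tool',
    toolCallId: call.id,
    content: await run(call),
  });

  const out: ToolMessage[] = [];
  // An undefined predicate, or a single unsafe call, keeps the round sequential.
  const allSafe =
    isParallelSafe !== undefined && calls.every((call) => isParallelSafe(call));

  if (!allSafe) {
    for (const call of calls) out.push(await execute(call));
    return out;
  }

  for (let i = 0; i < calls.length; i += PARALLEL_TOOL_BATCH_SIZE) {
    const batch = calls.slice(i, i + PARALLEL_TOOL_BATCH_SIZE);
    // Promise.all resolves in input order, so results are appended in
    // source order regardless of which call finishes first.
    out.push(...(await Promise.all(batch.map((call) => execute(call)))));
  }
  return out;
}
```

Because Promise.all preserves input order, no extra sorting step is needed to keep the appended tool messages linear.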
Tests: 21 new in tool-registry (policy), 9 new in shared-ai (5 parallel,
4 reminder). All 74 green, type-check clean across 4 packages.
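
For the reminder channel in item 2, a minimal sketch of the "transient, never persisted" behaviour might look like this. The ChatMessage shape, the buildRequestMessages helper, and the trimmed LoopState (usage and lastCall omitted) are assumptions for illustration, as is the callback returning a string array.

```typescript
// Hypothetical, trimmed shapes; the real PlannerLoopInput types may differ.
interface LoopState { round: number; toolCallCount: number }
interface ChatMessage { role: 'system' | 'user' | 'assistant' | 'tool'; content: string }
type ReminderChannel = (state: LoopState) => string[];

// Builds the message list for one LLM request. The persistent history is
// copied, never mutated, so reminders exist only for this request.
function buildRequestMessages(
  history: readonly ChatMessage[],
  state: LoopState,
  reminderChannel?: ReminderChannel,
): ChatMessage[] {
  const reminders = reminderChannel ? reminderChannel(state) : [];
  const transient = reminders.map(
    (text): ChatMessage => ({ role: 'system', content: `<reminder>${text}</reminder>` }),
  );
  // Appending after the history leaves the shared prefix byte-identical
  // between rounds, which is what lets a prefix cache be reused.
  return [...history, ...transient];
}
```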
Design/plan: docs/plans/agent-loop-improvements-m1.md
Reports: docs/reports/claude-code-architecture.md,
docs/reports/mana-agent-improvements-from-claude-code.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Carries per-round token counts from the mana-llm response body
(prompt_tokens + completion_tokens) back through LlmCompletionResponse
→ PlannerLoopResult. The loop sums across rounds and exposes a single
aggregate on result.usage.
Lets mana-ai's tick re-activate per-agent daily-token budget tracking
— tokensUsed was stubbed to 0 in the migration commit (6) because the
loop didn't surface usage yet. Now recordTokenUsage + agentTokenUsage24h
get real numbers again, and the mana_ai_tokens_used_total Prometheus
counter is accurate.
Additive only: consumers without usage needs ignore the new field,
and providers that don't return usage produce zeros (not undefined —
the loop still exposes the object so downstream branches stay trivial).
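
As a rough sketch of the aggregation described above: sum the per-round counts and fall back to zero when a provider omits them. The TokenUsage and RoundResponse shapes and the sumUsage name are assumptions; only the prompt_tokens and completion_tokens field names come from the text.

```typescript
// Hypothetical shapes; the real LlmCompletionResponse may differ.
interface TokenUsage { promptTokens: number; completionTokens: number }
interface RoundResponse {
  usage?: { prompt_tokens?: number; completion_tokens?: number };
}

// Sums usage across rounds. Missing usage contributes zeros, so the
// result is always a fully populated object, never undefined.
function sumUsage(rounds: readonly RoundResponse[]): TokenUsage {
  const total: TokenUsage = { promptTokens: 0, completionTokens: 0 };
  for (const round of rounds) {
    total.promptTokens += round.usage?.prompt_tokens ?? 0;
    total.completionTokens += round.usage?.completion_tokens ?? 0;
  }
  return total;
}
```

Returning zeros instead of undefined means a budget check such as `tokensUsed + result.usage.promptTokens` needs no null guard.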
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The companion chat had its own ad-hoc 3-round tool-calling pipeline:
build a system prompt with tool descriptions, ask the LLM to emit
fenced `tool` JSON blocks, regex-extract, execute, feed back the result as
a synthetic user message. Same fragility class as the old text-JSON
planner — and now unnecessary since mana-llm speaks native function
calling.
Migrates companion/engine.ts to the shared runPlannerLoop, same as
the mission runner (commit 5a) and the server tick (commit 6). Tools
go to the LLM as proper function-schemas; tool_calls come back
structured; the executor runs them directly under USER_ACTOR.
Extends shared-ai/planner/loop.ts with an optional priorMessages[]
input field so the chat can preserve multi-turn history between
turns (missions don't need this and leave it empty).
Deletes the old llm-tasks/companion-chat.ts LlmTask wrapper. Nothing
else imported it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the mobile-app deletion unblocked `@context/mobile`, five more
pre-existing failures surfaced across shared packages and two services.
All had been silently masked by the postinstall `|| true` for months.
- **shared-ai**: `planner/loop.ts` imported `ToolSchema` from
`../tools/function-schema`, which only imports (not re-exports) the
type. Fixed to import from the source (`../tools/schemas`).
- **shared-logger**: `typeof window !== 'undefined'` blows up under
tsconfigs that don't include the DOM lib (e.g. uload-server's
`bun-types`-only config), because shared-logger is consumed via
source import. Replaced with a `globalThis`-indirected check that
compiles under any lib configuration.
- **shared-hono**: `credits.ts` returned `res.json()` directly as
`Promise<T | null>`. Modern `@types/node` / undici types make
`res.json()` resolve to `unknown`, so the value is now cast to `T` at
the boundary to make the generic contract explicit.
- **uload-server**: `routes/analytics.ts` + `routes/email.ts` still
imported `AuthUser` from a `middleware/jwt-auth` module that was
deleted during the migration to `@mana/shared-hono`. Replaced with
`AuthVariables` from shared-hono, which matches the actual context
shape set by `authMiddleware()`.
- **manavoxel/web**: `guestSeed` collection entries were wrapped in
arrow functions, but `local-store` expects `T[]` directly and
iterates up to `seed.length`, which is 0 for a zero-argument function.
The "guest seed" was silently dead; evaluating `generateGuestWorld()`
once, eagerly, and sharing the result fixes both the type and the runtime.
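
The manavoxel/web failure mode is easy to reproduce in isolation. The sketch below is illustrative only: Voxel, seedCount, and the body of generateGuestWorld are hypothetical stand-ins, and the double cast stands in for the type error that the masked type-check would otherwise have reported.

```typescript
// Hypothetical stand-ins for the real seed data and store logic.
interface Voxel { id: string }

function generateGuestWorld(): Voxel[] {
  return [{ id: 'a' }, { id: 'b' }];
}

// Simplified version of a store that walks its seed by index.
function seedCount(seed: Voxel[]): number {
  let inserted = 0;
  for (let i = 0; i < seed.length; i++) inserted++;
  return inserted;
}

// Broken: a function's `length` is its declared parameter count, so a
// zero-argument arrow function looks like an empty array to the loop.
const lazySeed = () => generateGuestWorld();
const brokenCount = seedCount(lazySeed as unknown as Voxel[]);

// Fixed: evaluate once, eagerly, and pass the array itself.
const eagerSeed = generateGuestWorld();
const fixedCount = seedCount(eagerSeed);
```

Here brokenCount is 0 and fixedCount is 2, which matches the "silently dead" behaviour described in the bullet.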
Verified: `pnpm run type-check` from the repo root now exits 0, with
76/76 tasks successful and no failures. First fully green state since
well before the postinstall `|| true` was introduced.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces the new planner pipeline that both the webapp runner and the
mana-ai tick will switch to in the next commits. Additive for now:
the legacy buildPlannerPrompt + parsePlannerResponse stay exported so
callers can migrate one at a time; they will be removed once the last
consumer is gone.
- planner/loop.ts — runPlannerLoop orchestrates a multi-turn chat
against a caller-supplied LlmClient. Tool-calls from the LLM are
handed to an onToolCall callback and their results fed back as
tool-messages. Parallel tool-calls in one turn execute sequentially
to keep the message log linear for debugging. Stops on assistant
stop, empty tool_calls, or a hard max-rounds ceiling (default 5).
- planner/system-prompt.ts — new buildSystemPrompt. ~40-line German
system frame, no tool listing (the SDK-level tools field carries
the schemas now), no JSON format example, no "please return JSON"
plea. User frame renders mission + linked inputs + last 3
iteration summaries, same as before.
- Five test cases covering the loop: immediate stop, single tool
call with result feedback, parallel calls execute in order, tool
failures propagate as tool-messages the LLM can react to, and
maxRounds ceiling fires with the right stopReason.
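
The loop contract listed above can be sketched as follows. This is a simplified illustration, not the shipped runPlannerLoop: the message shapes, the SketchLlmClient interface, the stop-reason strings, and the error-message format are all assumptions.

```typescript
// Hypothetical shapes; the real LlmClient and message types may differ.
interface SketchToolCall { id: string; name: string; args: unknown }
type SketchMessage =
  | { role: 'system' | 'user'; content: string }
  | { role: 'assistant'; content: string; toolCalls: SketchToolCall[] }
  | { role: 'tool'; toolCallId: string; content: string };
interface SketchLlmClient {
  complete(messages: SketchMessage[]): Promise<{ content: string; toolCalls: SketchToolCall[] }>;
}
type SketchStopReason = 'assistant-stop' | 'max-rounds';

async function runLoopSketch(
  llm: SketchLlmClient,
  seed: SketchMessage[],
  onToolCall: (call: SketchToolCall) => Promise<string>,
  maxRounds = 5,
): Promise<{ messages: SketchMessage[]; stopReason: SketchStopReason }> {
  const messages = [...seed];
  for (let round = 0; round < maxRounds; round++) {
    const reply = await llm.complete(messages);
    messages.push({ role: 'assistant', content: reply.content, toolCalls: reply.toolCalls });
    // No tool calls means the assistant is done.
    if (reply.toolCalls.length === 0) return { messages, stopReason: 'assistant-stop' };

    // Sequential on purpose: the log stays linear and easy to debug.
    for (const call of reply.toolCalls) {
      let content: string;
      try {
        content = await onToolCall(call);
      } catch (err) {
        // Failures become tool messages so the LLM can react to them.
        content = `error: ${err instanceof Error ? err.message : String(err)}`;
      }
      messages.push({ role: 'tool', toolCallId: call.id, content });
    }
  }
  return { messages, stopReason: 'max-rounds' };
}
```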
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>