From 334c36a68edee18715e08c3f7374dcbc2ed90ea8 Mon Sep 17 00:00:00 2001
From: Till JS
Date: Thu, 16 Apr 2026 11:50:21 +0200
Subject: [PATCH] docs: document reasoning loop, research pre-step, debug log, new tools
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Updates apps/mana/CLAUDE.md AI Workbench section with:
- Reasoning loop (5-round auto→propose chain)
- Cross-module proposal inbox in mission detail
- Kontext auto-inject
- Web-research pre-step (RSS via news-research)
- Debug log (local-only _aiDebugLog + AiDebugBlock panel)
- New proposable tools: save_news_article, list_notes, update_note, append_to_note, add_tag_to_note

Adds §23 to COMPANION_BRAIN_ARCHITECTURE.md covering the full architecture: loop algorithm pseudocode, research pre-step rationale (RSS over deep-research), kontext auto-inject privacy boundary, debug log schema + UI + toggle mechanics, and new tool inventory.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 apps/mana/CLAUDE.md                 |  5 +-
 .../COMPANION_BRAIN_ARCHITECTURE.md | 70 +++++++++++++++++++
 2 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/apps/mana/CLAUDE.md b/apps/mana/CLAUDE.md
index 4a899f764..b4526115d 100644
--- a/apps/mana/CLAUDE.md
+++ b/apps/mana/CLAUDE.md
@@ -174,9 +174,12 @@ The companion is a **second actor** that works alongside the human in every modu
 - **Actor attribution** — every event, record, and sync row carries `{ kind, principalId, displayName }` (+ mission/iteration/rationale for AI). `principalId` is the userId / agentId / `system:` sentinel; `displayName` is cached at write time so rename doesn't rewrite history. Factories in `@mana/shared-ai/src/actor.ts`; runtime ambient context in `src/lib/data/events/actor.ts`.
 - **Agents** — named AI personas that own Missions. `/ai-agents` module for CRUD (policy editor, memory, budget, concurrency). Default "Mana" agent auto-bootstrapped on first login; legacy missions backfilled.
`data/ai/agents/{store,queries,bootstrap}.ts`.
 - **AI policy** — per-tool `auto | propose | deny`. Lives on the agent (`agent.policy`). Proposable tool names come from `@mana/shared-ai`'s `AI_PROPOSABLE_TOOL_NAMES`; the mana-ai service runs a boot-time drift guard against the same list. Resolution in `src/lib/data/ai/policy.ts`; executor loads `agent.policy` for every AI write.
-- **Proposal inbox** — drop `` into any module page to render pending proposals inline with approve / freitext-reject buttons. Cards show the owning agent's name + avatar chip. Wired in `/todo`, `/calendar`, `/places`, `/drink`, `/food`.
+- **Proposal inbox** — drop `` into any module page to render pending proposals inline with approve / freitext-reject buttons. Cards show the owning agent's name + avatar chip. Wired in `/todo`, `/calendar`, `/places`, `/drink`, `/food`, `/news`, `/notes`. The mission-detail view also embeds a **cross-module inbox** (``) that shows all pending proposals for that mission across all modules, with a module badge per card, so the user can review and approve without navigating to individual module pages.
+- **Reasoning loop** — the foreground Runner chains up to 5 planner calls per iteration. Read-only tools (`list_notes`, `get_task_stats`, etc.) execute inline under auto policy, and their outputs are fed back as synthetic `ResolvedInput`s for the next planner call. The loop exits when a propose-policy tool is staged (human must approve), the planner returns 0 steps, or the budget is exhausted. This enables "read → reason → act" missions like *"list all notes and tag them"* in a single run. Code: `data/ai/missions/runner.ts` reasoning loop.
 - **Missions** — long-lived autonomous work items at `/ai-missions` with concept + objective + linked inputs + cadence + **owning agent** (AgentPicker in the create flow).
Both the foreground tick AND the server-side `mana-ai` service produce plans under the agent's identity; `data/ai/missions/server-iteration-staging.ts` translates server-source iterations into local Proposals on sync.
 - **Input picker** — `` sources candidates from the `input-index` registry (notes / kontext / goals / tasks / calendar). The Runner resolves via the parallel `input-resolvers` registry. Encrypted tables (notes, tasks, …) decrypt client-side only.
+- **Auto-injected context** — the Runner automatically appends the user's `kontextDoc` singleton (decrypted client-side) to every planner call as a standing-context input, unless already linked manually. For missions whose objective matches research keywords (`recherchier|research|news|…`), a web-research pre-step runs the `news-research` RSS pipeline (`discoverByQuery` + `searchFeeds`) and injects results with explicit `save_news_article` instructions.
+- **Debug log** — per-iteration capture of system/user prompts, raw LLM responses, resolved inputs, and auto-tool outputs. Stored in the local-only Dexie table `_aiDebugLog` (never synced — contains decrypted user content). Toggled via the `mana.ai.debug` localStorage key (on by default in DEV). Rendered as an expandable `` under each iteration card with a copy-as-JSON button. Code: `data/ai/missions/debug.ts`, `components/ai/AiDebugBlock.svelte`.
 - **Scene lens** — workbench scenes can bind to an agent via `scene.viewingAsAgentId` (context menu → "An Agent binden…"). Pure UI lens, not a data-scope change. `SceneAppBar` shows the agent avatar on bound scene tabs.
 - **Workbench timeline** — `/ai-workbench` renders every AI-attributed event grouped by mission iteration with per-**agent** filter, per-module, per-mission. Each bucket header shows agent avatar + name + mission title. Per-bucket **Revert button** undoes the iteration's writes via `data/ai/revert/` (TaskCreated → delete, TaskCompleted → uncomplete, etc., newest-first).
Separate **"Datenzugriff"** tab exposes the server-side decrypt audit (for missions with Key-Grants).

diff --git a/docs/architecture/COMPANION_BRAIN_ARCHITECTURE.md b/docs/architecture/COMPANION_BRAIN_ARCHITECTURE.md
index c6e2a54cf..4dc31829f 100644
--- a/docs/architecture/COMPANION_BRAIN_ARCHITECTURE.md
+++ b/docs/architecture/COMPANION_BRAIN_ARCHITECTURE.md
@@ -1955,3 +1955,73 @@ This is deliberately orthogonal: one agent can appear in many scenes; one scene
 - **Keine Team-Features.** Andere User / geteilte Daten kommen in einem separaten Plan nach dieser Iteration.
 - **Keine Agent-Memory-Self-Modification.** Memory wird nur vom User editiert.
 - **Keine Per-Agent-Encryption-Domains.** Alle Agents sehen alle Daten des einen Users. Mission-Key-Grants bleiben per-Mission.
+
+## 23. Reasoning Loop + Research Pre-Step + Debug Log (ab 2026-04-15)
+
+### 23.1 Reasoning Loop
+
+The foreground Runner (`apps/mana/apps/web/src/lib/data/ai/missions/runner.ts`) wraps the plan→stage pipeline in a loop of up to `MAX_REASONING_LOOP_ITERATIONS` (5) rounds per iteration:
+
+```
+while (budget remaining):
+  plan = planner(mission, loopInputs, availableTools)
+  if plan.steps == 0 → break (agent done)
+  for each step:
+    resolve policy
+    auto → execute inline, collect {autoData, autoMessage}
+    propose → stage proposal, set humanInLoop=true
+    fail → record step as failed
+  if humanInLoop → break (wait for user approval)
+  if no auto-outputs → break (no progress)
+  loopInputs += synthetic ResolvedInput("Zwischenergebnisse Runde N",
+    formatted tool outputs as JSON fenced blocks)
+```
+
+This enables read→reason→act missions ("list notes → tag each one") in a single user-triggered run. The `StageOutcome` type carries `autoData` + `autoMessage` so auto-executed tool payloads thread back into the prompt without a second executor call.
+
+**Budget**: 5 loop iterations = 5 LLM calls max. Planner `maxTokens` raised to 4096 to accommodate batch output (up to ~15 step objects).
System prompt teaches the planner about the loop: "read-only tools auto-execute, write-tools get staged, emit all batch writes in one plan because staging ends the turn."
+
+### 23.2 Research Pre-Step
+
+Before the planner runs, if the mission objective matches `/recherchier|research|news|finde|suche|aktuelle|neueste/i`, the Runner calls the `news-research` module's RSS pipeline:
+
+1. `discoverByQuery(objective, lang)` — finds matching RSS feeds
+2. `searchFeeds(feedUrls, objective, {limit: 10})` — ranks articles by relevance
+3. Results formatted as a `ResolvedInput` with explicit instructions ("für jeden relevanten Artikel rufe `save_news_article(url)` auf", i.e. "for each relevant article, call `save_news_article(url)`")
+
+RSS was chosen over the deep-research pipeline (`/api/v1/research/start-sync`) because it consumes no credits, is faster (~2s vs ~12s), has no SearXNG dependency, and uses the project's own RSS infrastructure. The deep-research pipeline still exists for the questions module.
+
+Failures throw explicitly (0 feeds or 0 articles) — the runner catches and injects a "research failed" `ResolvedInput` with the error message so the planner doesn't hallucinate URLs.
+
+### 23.3 Kontext Auto-Inject
+
+The user's `kontextDoc` singleton is automatically appended to every planner call as a standing-context `ResolvedInput`, unless the mission already links it as an explicit input. It is decrypted client-side only; the server-side mana-ai runner skips this step (encryption barrier; server access would need a Key-Grant).
+
+### 23.4 Debug Log
+
+Per-iteration diagnostic capture in local-only Dexie table `_aiDebugLog` (schema v20, never synced — contains decrypted prompt content). Keyed by `iterationId`, capped at 50 rows.
+
+**Captured per iteration:**
+- `plannerCalls[]` — array (one per loop round): `{systemPrompt, userPrompt, rawResponse, latencyMs}`
+- `loopSteps[]` — auto-executed tool log: `{loopIndex, toolName, params, outputPreview}`
+- `preStep` — web-research outcome or kontext injection state
+- `resolvedInputs[]` — full list the planner saw (grows across loop rounds)
+
+**UI:** `` renders an expandable panel under each iteration card:
+- Summary chip: "2× LLM · 4200ms · 1× Auto-Tool"
+- Collapsible sections: Pre-Step, Resolved Inputs (each individually expandable), Auto-Tool outputs, per-LLM-call prompt+response
+- "📋 JSON" button copies the entire debug entry to clipboard
+
+**Toggle:** `localStorage.setItem('mana.ai.debug', '1')`. Defaults to enabled in DEV builds, disabled in production. Checkbox in mission-detail header exposes the toggle without DevTools.
+
+### 23.5 New Proposable Tools
+
+| Tool | Module | Policy | Purpose |
+|------|--------|--------|---------|
+| `save_news_article` | news | propose | Save URL to reading list via Readability extract |
+| `list_notes` | notes | auto | List notes (id, title, excerpt) for planner context |
+| `update_note` | notes | propose | Full overwrite of title/content (destructive) |
+| `append_to_note` | notes | propose | Append text to end of note (non-destructive) |
+| `add_tag_to_note` | notes | propose | Append `#Tag` idempotently (deduplicates, case-insensitive) |
+
+All propose-policy tools are registered in `@mana/shared-ai` `AI_PROPOSABLE_TOOL_NAMES` and mirrored in `services/mana-ai/src/planner/tools.ts` (checked by the boot-time drift guard). `AiProposalInbox` is mounted on the `/news` and `/notes` pages.
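
The loop pseudocode in §23.1 can be sketched in TypeScript roughly as follows. This is a minimal, synchronous illustration under stated assumptions, not the code in `runner.ts`: the type shapes, the `runReasoningLoop` name, the callback signatures, and the `noteId` / `tag` parameters are all hypothetical, the real Runner is presumably asynchronous, and a `deny` policy is simply treated as a failed step here.

```typescript
// Hypothetical shapes for illustration; the real types may differ.
type Policy = "auto" | "propose" | "deny";

interface PlanStep {
  toolName: string;
  params: Record<string, unknown>;
}

interface ResolvedInput {
  title: string;
  content: string;
}

interface LoopResult {
  rounds: number;          // planner calls actually made
  staged: PlanStep[];      // propose-policy steps waiting for the user
  failed: PlanStep[];      // steps that could not run (here: deny policy)
  inputs: ResolvedInput[]; // everything the planner saw, incl. synthetic inputs
}

const MAX_REASONING_LOOP_ITERATIONS = 5;

function runReasoningLoop(
  planner: (inputs: readonly ResolvedInput[]) => PlanStep[],
  policyOf: (toolName: string) => Policy,
  execute: (step: PlanStep) => unknown,
  initialInputs: readonly ResolvedInput[] = [],
): LoopResult {
  const inputs: ResolvedInput[] = [...initialInputs];
  const staged: PlanStep[] = [];
  const failed: PlanStep[] = [];
  let rounds = 0;

  while (rounds < MAX_REASONING_LOOP_ITERATIONS) {
    rounds += 1;
    const steps = planner(inputs);
    if (steps.length === 0) break; // planner has nothing left to do

    const autoOutputs: { toolName: string; output: unknown }[] = [];
    let humanInLoop = false;

    for (const step of steps) {
      const policy = policyOf(step.toolName);
      if (policy === "auto") {
        // Read-only tool: run inline and keep the output for the next round.
        autoOutputs.push({ toolName: step.toolName, output: execute(step) });
      } else if (policy === "propose") {
        // Write tool: stage it and hand control back to the user.
        staged.push(step);
        humanInLoop = true;
      } else {
        failed.push(step);
      }
    }

    if (humanInLoop) break;              // wait for user approval
    if (autoOutputs.length === 0) break; // no progress this round

    // Feed auto-tool outputs back as a synthetic input for the next call.
    inputs.push({
      title: `Zwischenergebnisse Runde ${rounds}`,
      content: autoOutputs
        .map((o) => `${o.toolName}:\n${JSON.stringify(o.output, null, 2)}`)
        .join("\n\n"),
    });
  }

  return { rounds, staged, failed, inputs };
}

// Usage: round 1 reads notes (auto), round 2 proposes a tag (propose).
const policies = new Map<string, Policy>([
  ["list_notes", "auto"],
  ["add_tag_to_note", "propose"],
]);

const demo = runReasoningLoop(
  (inputs) =>
    inputs.length === 0
      ? [{ toolName: "list_notes", params: {} }]
      : [{ toolName: "add_tag_to_note", params: { noteId: "n1", tag: "#Idee" } }],
  (name) => policies.get(name) ?? "deny",
  () => [{ id: "n1", title: "Einkauf" }],
);

console.log(demo.rounds, demo.staged.length); // 2 1
```

In this sketch, staging any proposal ends the turn, which is why the system prompt described in §23.1 asks the planner to emit all batch writes in a single plan.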