docs: document reasoning loop, research pre-step, debug log, new tools

Updates apps/mana/CLAUDE.md AI Workbench section with:
- Reasoning loop (5-round auto→propose chain)
- Cross-module proposal inbox in mission detail
- Kontext auto-inject
- Web-research pre-step (RSS via news-research)
- Debug log (local-only _aiDebugLog + AiDebugBlock panel)
- New proposable tools: save_news_article, list_notes, update_note,
  append_to_note, add_tag_to_note

Adds §23 to COMPANION_BRAIN_ARCHITECTURE.md covering the full
architecture: loop algorithm pseudocode, research pre-step rationale
(RSS over deep-research), kontext auto-inject privacy boundary,
debug log schema + UI + toggle mechanics, and new tool inventory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Till JS 2026-04-16 11:50:21 +02:00
parent 9161c0b3ab
commit 334c36a68e
2 changed files with 74 additions and 1 deletions

View file

@ -174,9 +174,12 @@ The companion is a **second actor** that works alongside the human in every modu
- **Actor attribution** — every event, record, and sync row carries `{ kind, principalId, displayName }` (+ mission/iteration/rationale for AI). `principalId` is the userId / agentId / `system:<source>` sentinel; `displayName` is cached at write time so rename doesn't rewrite history. Factories in `@mana/shared-ai/src/actor.ts`; runtime ambient context in `src/lib/data/events/actor.ts`.
- **Agents** — named AI personas that own Missions. `/ai-agents` module for CRUD (policy editor, memory, budget, concurrency). Default "Mana" agent auto-bootstrapped on first login; legacy missions backfilled. `data/ai/agents/{store,queries,bootstrap}.ts`.
- **AI policy** — per-tool `auto | propose | deny`. Lives on the agent (`agent.policy`). Proposable tool names come from `@mana/shared-ai`'s `AI_PROPOSABLE_TOOL_NAMES`; the mana-ai service runs a boot-time drift guard against the same list. Resolution in `src/lib/data/ai/policy.ts`; executor loads `agent.policy` for every AI write.
- **Proposal inbox** — drop `<AiProposalInbox module="…" />` into any module page to render pending proposals inline with approve / freitext-reject buttons. Cards show the owning agent's name + avatar chip. Wired in `/todo`, `/calendar`, `/places`, `/drink`, `/food`.
- **Proposal inbox** — drop `<AiProposalInbox module="…" />` into any module page to render pending proposals inline with approve / freitext-reject buttons. Cards show the owning agent's name + avatar chip. Wired in `/todo`, `/calendar`, `/places`, `/drink`, `/food`, `/news`, `/notes`. The mission-detail view also embeds a **cross-module inbox** (`<AiProposalInbox missionId={id} />`): shows all pending proposals for that mission across all modules with a module-badge per card, so the user can review and approve without navigating to individual module pages.
- **Reasoning loop** — the foreground Runner chains up to 5 planner calls per iteration. Read-only tools (`list_notes`, `get_task_stats`, etc.) execute inline as auto-policy, their outputs are fed back as synthetic `ResolvedInput`s for the next planner call. The loop exits when a propose-policy tool is staged (human must approve), the planner returns 0 steps, or the budget exhausts. This enables "read → reason → act" missions like *"list all notes and tag them"* in a single run. Code: `data/ai/missions/runner.ts` reasoning loop.
- **Missions** — long-lived autonomous work items at `/ai-missions` with concept + objective + linked inputs + cadence + **owning agent** (AgentPicker in the create flow). Both the foreground tick AND the server-side `mana-ai` service produce plans under the agent's identity; `data/ai/missions/server-iteration-staging.ts` translates server-source iterations into local Proposals on sync.
- **Input picker**`<MissionInputPicker>` sources candidates from the `input-index` registry (notes / kontext / goals / tasks / calendar). The Runner resolves via the parallel `input-resolvers` registry. Encrypted tables (notes, tasks, …) decrypt client-side only.
- **Auto-injected context** — the Runner automatically appends the user's `kontextDoc` singleton (decrypted client-side) to every planner call as a standing-context input, unless already linked manually. For missions whose objective matches research keywords (`recherchier|research|news|…`), a web-research pre-step runs the `news-research` RSS pipeline (`discoverByQuery` + `searchFeeds`) and injects results with explicit `save_news_article` instructions.
- **Debug log** — per-iteration capture of system/user prompts, raw LLM responses, resolved inputs, and auto-tool outputs. Stored in local-only Dexie table `_aiDebugLog` (never synced — contains decrypted user content). Toggled via `localStorage('mana.ai.debug')` (on by default in DEV). Rendered as expandable `<AiDebugBlock>` under each iteration card with copy-as-JSON button. Code: `data/ai/missions/debug.ts`, `components/ai/AiDebugBlock.svelte`.
- **Scene lens** — workbench scenes can bind to an agent via `scene.viewingAsAgentId` (context menu → "An Agent binden…"). Pure UI lens, not a data-scope change. `SceneAppBar` shows the agent avatar on bound scene tabs.
- **Workbench timeline**`/ai-workbench` renders every AI-attributed event grouped by mission iteration with per-**agent** filter, per-module, per-mission. Each bucket header shows agent avatar + name + mission title. Per-bucket **Revert button** undoes the iteration's writes via `data/ai/revert/` (TaskCreated → delete, TaskCompleted → uncomplete, etc., newest-first). Separate **"Datenzugriff"** tab exposes the server-side decrypt audit (for missions with Key-Grants).

View file

@ -1955,3 +1955,73 @@ This is deliberately orthogonal: one agent can appear in many scenes; one scene
- **Keine Team-Features.** Andere User / geteilte Daten kommen in einem separaten Plan nach dieser Iteration.
- **Keine Agent-Memory-Self-Modification.** Memory wird nur vom User editiert.
- **Keine Per-Agent-Encryption-Domains.** Alle Agents sehen alle Daten des einen Users. Mission-Key-Grants bleiben per-Mission.
## 23. Reasoning Loop + Research Pre-Step + Debug Log (ab 2026-04-15)
### 23.1 Reasoning Loop
The foreground Runner (`apps/mana/apps/web/src/lib/data/ai/missions/runner.ts`) wraps the plan→stage pipeline in a loop of up to `MAX_REASONING_LOOP_ITERATIONS` (5) rounds per iteration:
```
while (budget remaining):
plan = planner(mission, loopInputs, availableTools)
if plan.steps == 0 → break (agent done)
for each step:
resolve policy
auto → execute inline, collect {autoData, autoMessage}
propose → stage proposal, set humanInLoop=true
fail → record step as failed
if humanInLoop → break (wait for user approval)
if no auto-outputs → break (no progress)
loopInputs += synthetic ResolvedInput("Zwischenergebnisse Runde N",
formatted tool outputs as JSON fenced blocks)
```
This enables read→reason→act missions ("list notes → tag each one") in a single user-triggered run. The `StageOutcome` type carries `autoData` + `autoMessage` so auto-executed tool payloads thread back into the prompt without a second executor call.
**Budget**: 5 loop iterations = 5 LLM calls max. Planner `maxTokens` raised to 4096 to accommodate batch output (up to ~15 step objects). System prompt teaches the planner about the loop: "read-only tools auto-execute, write-tools get staged, emit all batch writes in one plan because staging ends the turn."
### 23.2 Research Pre-Step
Before the planner runs, if the mission objective matches `/recherchier|research|news|finde|suche|aktuelle|neueste/i`, the Runner calls the `news-research` module's RSS pipeline:
1. `discoverByQuery(objective, lang)` — finds matching RSS feeds
2. `searchFeeds(feedUrls, objective, {limit: 10})` — ranks articles by relevance
3. Results formatted as a `ResolvedInput` with explicit instructions ("für jeden relevanten Artikel rufe `save_news_article(url)` auf")
Chosen over the deep-research pipeline (`/api/v1/research/start-sync`) because: no credits consumed, faster (~2s vs ~12s), no SearXNG dependency, uses own RSS infrastructure. The deep-research pipeline still exists for the questions module.
Failures throw explicitly (0 feeds or 0 articles) — the runner catches and injects a "research failed" `ResolvedInput` with the error message so the planner doesn't hallucinate URLs.
### 23.3 Kontext Auto-Inject
The user's `kontextDoc` singleton is automatically appended to every planner call as a standing-context `ResolvedInput`, unless the mission already links it as an explicit input. Decrypted client-side only — the server-side mana-ai runner skips this (encryption barrier; needs a Key-Grant for server access).
### 23.4 Debug Log
Per-iteration diagnostic capture in local-only Dexie table `_aiDebugLog` (schema v20, never synced — contains decrypted prompt content). Keyed by `iterationId`, capped at 50 rows.
**Captured per iteration:**
- `plannerCalls[]` — array (one per loop round): `{systemPrompt, userPrompt, rawResponse, latencyMs}`
- `loopSteps[]` — auto-executed tool log: `{loopIndex, toolName, params, outputPreview}`
- `preStep` — web-research outcome or kontext injection state
- `resolvedInputs[]` — full list the planner saw (grows across loop rounds)
**UI:** `<AiDebugBlock iterationId={id}>` renders an expandable panel under each iteration card:
- Summary chip: "2× LLM · 4200ms · 1× Auto-Tool"
- Collapsible sections: Pre-Step, Resolved Inputs (each individually expandable), Auto-Tool outputs, per-LLM-call prompt+response
- "📋 JSON" button copies the entire debug entry to clipboard
**Toggle:** `localStorage.setItem('mana.ai.debug', '1')`. Defaults to enabled in DEV builds, disabled in production. Checkbox in mission-detail header exposes the toggle without DevTools.
### 23.5 New Proposable Tools
| Tool | Module | Policy | Purpose |
|------|--------|--------|---------|
| `save_news_article` | news | propose | Save URL to reading list via Readability extract |
| `list_notes` | notes | auto | List notes (id, title, excerpt) for planner context |
| `update_note` | notes | propose | Full overwrite of title/content (destructive) |
| `append_to_note` | notes | propose | Append text to end of note (non-destructive) |
| `add_tag_to_note` | notes | propose | Append `#Tag` idempotently (deduplicates, case-insensitive) |
All propose-tools registered in `@mana/shared-ai` `AI_PROPOSABLE_TOOL_NAMES` and mirrored in `services/mana-ai/src/planner/tools.ts` (boot-time drift guard). `AiProposalInbox` mounted on `/news` and `/notes` pages.