managarten

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-19 07:01:23 +02:00

Author	SHA1	Message	Date
Till JS	72f7978ed4	feat(agent-loop): expose compactionsDone + compactedReminder producer Closes the loop on M2: when the compactor fires, the LLM needs to know it's now seeing a <compact-summary> instead of raw turns so it doesn't waste a turn asking about lost details or re-executing tools whose responses are gone. shared-ai: - LoopState grows `compactionsDone: number` (cap-1 by current loop policy, but shape kept as count for future multi-compact cycles). - runPlannerLoop populates it on each reminder-channel call. New loop test asserts [0, 1] sequence: round 1 before compaction, round 2 after. mana-ai: - New producer `compactedReminder`: fires severity=info when compactionsDone >= 1, wrapped in a German one-liner ("frag nicht nach verlorenen Details", i.e. "don't ask about lost details"). - Injected FIRST in buildReminderChannel so the LLM frames the rest of the round with "I'm looking at a summary" context. Metric surface stays `{producer='compacted', severity='info'}`. 4 new reminder tests (3 pure producer + 1 composition-ordering) + 1 loop-wiring test. 77 shared-ai, 20 reminders.test.ts: green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:36:21 +02:00
Till JS	3d8214a147	feat(shared-ai): wire compactor into runPlannerLoop (M2.2) PlannerLoopInput grows an optional compactor: compactor?: { maxContextTokens: number; threshold?: number; /* default 0.92, matches Claude Code wU2 */ compact: (messages) => Promise<{ messages, compactedTurns }>; } Before each LLM call the loop checks whether promptTokens+completion has crossed threshold × maxContextTokens. If yes AND we haven't compacted this run yet, the callback runs, its returned messages REPLACE the live history, and compactionsDone flips to 1 so a runaway tool can't re-trigger. Design choices: - Fires at most ONCE per loop run. If the fresh (compacted) history hits the threshold again in the same run, the LLM round budget will hit first; better to terminate than to recursively compact a summary. - No reminder emitted automatically; the caller can wire that via reminderChannel by reading compactionsDone from LoopState (next PR; compactionsDone isn't exposed yet to keep the state surface small). - compactor callback is injectable, not hardcoded to compactHistory() from compact.ts. Lets mana-ai route the compactor LLM call to a cheaper model (Haiku) without changing the loop. - Zero maxContextTokens → skip silently (same contract as shouldCompact()). Also cleaned up the isParallelSafe non-null-assertion warning by hoisting the predicate to a local with proper narrowing. 5 new loop tests: below-threshold no-op, single-fire replacement, once-per-run idempotency, zero-cap bail, no-op when compactor returns 0 turns. 76 shared-ai tests total, green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:25:35 +02:00
Till JS	8f283726b1	feat(agent-loop): activate retryLoopReminder via LoopState.recentCalls Extends LoopState with a sliding window of the last N ExecutedCalls (oldest-first), capped at LOOP_STATE_RECENT_CALLS_WINDOW = 5. The loop maintains the window automatically; reminderChannel producers read it without touching internal state. This activates retryLoopReminder which was shape-only in `faa472be9`. The guard now fires end-to-end: when round >= 3 and the tail-2 calls both returned success:false, the LLM sees a "stop retrying, write a summary instead" <reminder> on the next turn. The tail-2 check rather than window-wide is deliberate — a flaky run with intermittent success (F, F, F, OK, F) is not a retry loop, just flaky tools. Why window=5: retry loops usually manifest within 2-3 consecutive rounds; a 5-deep window gives room for burst-detection and stale-tool heuristics without bloating the reminder channel. Cap keeps the reminder producers O(5) regardless of loop length. Tests: 3 new (sliding-window cap + slide + order in shared-ai, retry composition + budget+retry chain + tail-only heuristic in mana-ai). Total agent-loop tests now 74 across both packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:02:40 +02:00
Till JS	e5d230e599	feat(agent-loop): M1 - policy gate + reminder channel + parallel reads Three Claude-Code-inspired primitives for runPlannerLoop, derived from the reverse-engineering reports in docs/reports/: 1. Policy gate (@mana/tool-registry) - evaluatePolicy() gates every tool dispatch: denies admin-scope, denies destructive tools not in the user's opt-in list, rate-limits per tool (30/60s default), flags prompt-injection markers in freetext without blocking. Wired into mana-mcp with a per-user rolling invocation log and POLICY_MODE env (off|log-only|enforce, default log-only). mana-ai uses detectInjectionMarker only; tool dispatch there is plan-only, so rate-limit/destructive checks don't apply yet. 2. Reminder channel (packages/shared-ai/src/planner/loop.ts) - new reminderChannel callback in PlannerLoopInput. Called once per round with LoopState snapshot (round, toolCallCount, usage, lastCall); returned strings wrap in <reminder> tags and inject as transient system messages into THIS LLM request only. Never pushed to messages[]; this follows the Claude-Code <system-reminder> pattern that keeps the KV-cache prefix stable. 3. Parallel reads (loop.ts) - isParallelSafe predicate enables Promise.all dispatch when every tool_call in a round is parallel-safe, in batches of PARALLEL_TOOL_BATCH_SIZE=10. Any non-safe call downgrades the whole round to sequential. messages[] always appends in source order, never completion order, so the debug log stays linear. Default-off (undefined predicate) preserves pre-M1 behaviour. Tests: 21 new in tool-registry (policy), 9 new in shared-ai (5 parallel, 4 reminder). All 74 green, type-check clean across 4 packages. Design/plan: docs/plans/agent-loop-improvements-m1.md Reports: docs/reports/claude-code-architecture.md, docs/reports/mana-agent-improvements-from-claude-code.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 13:56:40 +02:00
Till JS	5b7564b3a4	test(ai): promote MockLlmClient to a shared @mana/shared-ai export The runPlannerLoop test file and the webapp's mission-runner test each had their own inline scripted LLM mock — same interface, diverged slightly. Consolidates into packages/shared-ai/src/planner/mock-llm.ts and re-exports from the package root so any consumer can drive the loop deterministically. Both existing test files now use the shared client. 5 + 3 tests pass, 44 total in shared-ai still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 18:05:46 +02:00
Till JS	4daca8970b	feat(shared-ai): runPlannerLoop + compact system prompt for function calling Introduces the new planner pipeline both the webapp runner and the mana-ai tick will swap onto in the next commits. Additive for now — the legacy buildPlannerPrompt + parsePlannerResponse stay exported so callers can migrate one at a time; they get removed once the last consumer is gone. - planner/loop.ts — runPlannerLoop orchestrates a multi-turn chat against a caller-supplied LlmClient. Tool-calls from the LLM are handed to an onToolCall callback and their results fed back as tool-messages. Parallel tool-calls in one turn execute sequentially to keep the message log linear for debugging. Stops on assistant stop, empty tool_calls, or a hard max-rounds ceiling (default 5). - planner/system-prompt.ts — new buildSystemPrompt. ~40-line German system frame, no tool listing (the SDK-level tools field carries the schemas now), no JSON format example, no "please return JSON" plea. User frame renders mission + linked inputs + last 3 iteration summaries, same as before. - Five test cases covering the loop: immediate stop, single tool call with result feedback, parallel calls execute in order, tool failures propagate as tool-messages the LLM can react to, and maxRounds ceiling fires with the right stopReason. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 15:31:01 +02:00
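
The once-per-run compaction gate described in 3d8214a147 can be sketched roughly as follows. This is an illustration, not the code in loop.ts: only the field names `maxContextTokens`, `threshold`, `compact`, `messages`, `compactedTurns` and `compactionsDone` come from the commit messages. The `Message` shape, the helper names `crossedThreshold` and `maybeCompact`, the `>=` comparison, and leaving `compactionsDone` untouched when the compactor returns 0 turns are all assumptions.

```typescript
// Hypothetical message shape; the real type in shared-ai is not shown in the log.
interface Message { role: string; content: string }

interface Compactor {
  maxContextTokens: number;
  threshold?: number; // default 0.92 according to the commit message
  compact: (messages: Message[]) => Promise<{ messages: Message[]; compactedTurns: number }>;
}

// Pure gate: should the compactor callback run before the next LLM call?
function crossedThreshold(
  tokensUsed: number,
  compactionsDone: number,
  maxContextTokens: number,
  threshold: number = 0.92,
): boolean {
  if (maxContextTokens <= 0) return false; // zero cap: skip silently
  if (compactionsDone >= 1) return false;  // at most once per loop run
  return tokensUsed >= threshold * maxContextTokens; // ">=" is an assumption
}

// Wrapper: replaces the live history when the gate opens and the compactor
// actually compacted something; otherwise returns the inputs unchanged.
async function maybeCompact(
  messages: Message[],
  tokensUsed: number,
  compactionsDone: number,
  compactor?: Compactor,
): Promise<{ messages: Message[]; compactionsDone: number }> {
  const unchanged = { messages, compactionsDone };
  if (!compactor) return unchanged;
  const open = crossedThreshold(
    tokensUsed, compactionsDone, compactor.maxContextTokens, compactor.threshold,
  );
  if (!open) return unchanged;
  const result = await compactor.compact(messages);
  if (result.compactedTurns === 0) return unchanged; // assumed no-op behaviour
  return { messages: result.messages, compactionsDone: compactionsDone + 1 };
}
```

Because `compact` is injected, a caller could point it at a cheaper model without touching the gate, which is the design choice the commit message calls out.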

6 commits
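
The sliding window and tail-2 retry heuristic from 8f283726b1 might look something like the sketch below. The constant name `LOOP_STATE_RECENT_CALLS_WINDOW`, its value 5, the oldest-first ordering, the `round >= 3` condition and the `success` flag are taken from the commit message; the `ExecutedCall` fields other than `success` and the helper names `pushRecentCall` and `isRetryLoop` are hypothetical.

```typescript
// Hypothetical shape: only `success` is confirmed by the commit message.
interface ExecutedCall { name: string; success: boolean }

const LOOP_STATE_RECENT_CALLS_WINDOW = 5;

// Append a call and keep only the newest N entries, oldest first.
function pushRecentCall(
  window: readonly ExecutedCall[],
  call: ExecutedCall,
): ExecutedCall[] {
  return [...window, call].slice(-LOOP_STATE_RECENT_CALLS_WINDOW);
}

// Tail-2 check: only the two most recent calls decide, so a flaky run with
// intermittent successes does not count as a retry loop.
function isRetryLoop(round: number, recentCalls: readonly ExecutedCall[]): boolean {
  if (round < 3 || recentCalls.length < 2) return false;
  return recentCalls.slice(-2).every((call) => !call.success);
}

// Usage: the flaky sequence from the commit message (F, F, F, OK, F).
const flaky = [false, false, false, true, false].reduce<ExecutedCall[]>(
  (acc, success) => pushRecentCall(acc, { name: "tool", success }),
  [],
);
console.log(isRetryLoop(5, flaky)); // false: the tail is (OK, F)
```

Checking only the tail keeps the producer cheap and avoids flagging a run whose failures are spread out rather than consecutive.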