managarten

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-19 19:41:25 +02:00

Author	SHA1	Message	Date
Till JS	0d613e1846	feat(ai): thread TokenUsage through runPlannerLoop → mana-ai budget Carries per-round token counts from the mana-llm response body (prompt_tokens + completion_tokens) back through LlmCompletionResponse → PlannerLoopResult. The loop sums across rounds and exposes a single aggregate on result.usage. Lets mana-ai's tick re-activate per-agent daily-token budget tracking — tokensUsed was stubbed to 0 in the migration commit (6) because the loop didn't surface usage yet. Now recordTokenUsage + agentTokenUsage24h get real numbers again, and the mana_ai_tokens_used_total Prometheus counter is accurate. Additive only: consumers without usage needs ignore the new field, and providers that don't return usage produce zeros (not undefined — the loop still exposes the object so downstream branches stay trivial). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 18:21:34 +02:00
Till JS	5b7564b3a4	test(ai): promote MockLlmClient to a shared @mana/shared-ai export The runPlannerLoop test file and the webapp's mission-runner test each had their own inline scripted LLM mock — same interface, diverged slightly. Consolidates into packages/shared-ai/src/planner/mock-llm.ts and re-exports from the package root so any consumer can drive the loop deterministically. Both existing test files now use the shared client. 5 + 3 tests pass, 44 total in shared-ai still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 18:05:46 +02:00
Till JS	9f7d2f24b3	feat(companion): chat on runPlannerLoop with native function calling The companion chat had its own ad-hoc 3-round tool-calling pipeline: build a system prompt with tool descriptions, ask the LLM to emit ```tool JSON blocks, regex-extract, execute, feed back the result as a synthetic user message. Same fragility class as the old text-JSON planner — and now unnecessary since mana-llm speaks native function calling. Migrates companion/engine.ts to the shared runPlannerLoop, same as the mission runner (commit 5a) and the server tick (commit 6). Tools go to the LLM as proper function-schemas; tool_calls come back structured; the executor runs them directly under USER_ACTOR. Extends shared-ai/planner/loop.ts with an optional priorMessages[] input field so the chat can preserve multi-turn history between turns (missions don't need this and leave it empty). Deletes the old llm-tasks/companion-chat.ts LlmTask wrapper. Nothing else imported it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 16:45:33 +02:00
Till JS	0077752456	fix(type-check): clear the last five failures — monorepo type-check is now 76/76 green After the mobile-app deletion unblocked `@context/mobile`, five more pre-existing failures surfaced across shared packages and two services. All were silent-masked by the postinstall `|| true` for months. - shared-ai: `planner/loop.ts` imported `ToolSchema` from `../tools/function-schema`, which only imports (not re-exports) the type. Fixed to import from the source (`../tools/schemas`). - shared-logger: `typeof window !== 'undefined'` blows up under tsconfigs that don't include the DOM lib (e.g. uload-server's `bun-types`-only config), because shared-logger is consumed via source import. Replaced with a `globalThis`-indirected check that compiles under any lib configuration. - shared-hono: `credits.ts` returned `res.json()` directly as `Promise<T | null>`. Modern `@types/node` / undici types return `unknown` strictly — cast to `T` at the boundary so the generic contract is explicit. - uload-server: `routes/analytics.ts` + `routes/email.ts` still imported `AuthUser` from a `middleware/jwt-auth` module that was deleted during the migration to `@mana/shared-hono`. Replaced with `AuthVariables` from shared-hono, which matches the actual context shape set by `authMiddleware()`. - manavoxel/web: `guestSeed` collection entries were wrapped in arrow functions, but `local-store` expects `T[]` directly and iterates `seed.length` — which on a function is 0. The "guest seed" was silently dead; eager-evaluating `generateGuestWorld()` once and sharing the result fixes both the type and the runtime. Verified: `pnpm run type-check` from the repo root now exits 0 — 76/76 tasks successful, no failures. First fully green state since well before the postinstall `|| true` was introduced. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 15:53:07 +02:00
Till JS	4daca8970b	feat(shared-ai): runPlannerLoop + compact system prompt for function calling Introduces the new planner pipeline both the webapp runner and the mana-ai tick will swap onto in the next commits. Additive for now — the legacy buildPlannerPrompt + parsePlannerResponse stay exported so callers can migrate one at a time; they get removed once the last consumer is gone. - planner/loop.ts — runPlannerLoop orchestrates a multi-turn chat against a caller-supplied LlmClient. Tool-calls from the LLM are handed to an onToolCall callback and their results fed back as tool-messages. Parallel tool-calls in one turn execute sequentially to keep the message log linear for debugging. Stops on assistant stop, empty tool_calls, or a hard max-rounds ceiling (default 5). - planner/system-prompt.ts — new buildSystemPrompt. ~40-line German system frame, no tool listing (the SDK-level tools field carries the schemas now), no JSON format example, no "please return JSON" plea. User frame renders mission + linked inputs + last 3 iteration summaries, same as before. - Five test cases covering the loop: immediate stop, single tool call with result feedback, parallel calls execute in order, tool failures propagate as tool-messages the LLM can react to, and maxRounds ceiling fires with the right stopReason. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 15:31:01 +02:00
Till JS	efc7641a60	chore(ai): P2 batch — prompt sync, perf, dedup, scope unification Six P2 items from the AI Workbench audit: #7 Prompt ↔ loop budget sync: System prompt now says "1 bis 5 Schritte pro Planungsrunde, bis zu 5 Planungsrunden" — matches MAX_REASONING_LOOP_ITERATIONS. Cross-ref comment added to runner.ts. #9 SceneHeader: useAgents() → useAgent(id): Only loads the single bound agent instead of the full agent list. Eliminates unnecessary Dexie churn on every scene header render. #10 Unified scope filter: New scope-filter.ts with filterByScopeTagMap() (batch, sync) and filterByScopeAsync() (per-record). Both scope-context.ts (AI) and scene-scope.svelte.ts (UI) now import from the shared module — zero duplicated filter logic. #11 Research dedup: Research input ID changed from `news-research-${Date.now()}` to `news-research-${mission.id}` — re-runs overwrite instead of appending duplicates. #12 Kontext injection policy clarified: loadAgentKontextAsResolvedInput no longer falls back to the global singleton. Comment + code aligned: kontext injection is explicit (via input picker), not auto. Dead loadKontextAsResolvedInput kept for potential future opt-in auto-inject feature. Audit doc updated with all items marked DONE. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 16:33:52 +02:00
Till JS	be81d11dc3	feat(ai): SSE streaming for foreground Mission Runner Enable real-time token streaming during the planner "calling-llm" phase so the user sees live progress ("empfange Plan… 128 tokens") instead of a static spinner. The parser still receives the full text once complete — no partial-JSON risk. Changes: - Extract shared SSE parser from playground into @mana/shared-llm/sse-parser - remote.ts: use stream:true when onToken callback is provided - AiPlanInput: add optional onToken field (shared-ai) - ai-plan task: pass onToken through to backend.generate() - runner.ts: throttled (500ms) phaseDetail updates during streaming - Playground: refactored to use shared SSE parser Also includes: AI agent architecture comparison report (docs/reports/) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 12:32:43 +02:00
Till JS	8a5d200c84	fix(ai): bump planner maxTokens 1024→4096 + teach prompt about the loop Debug log from a "tag 4 notes" mission showed the planner's second-round response truncated mid-step: it was proposing one add_tag_to_note per listed note but ran out of tokens halfway through note #2. Parser rejected the malformed JSON → loop exited with 0 staged, user saw nothing to approve. Raising maxTokens to 4096 fits ~15-20 step objects, which covers the batch-tagging / batch-save pattern the reasoning loop is designed for. Also updating the system prompt so the planner actually knows about the loop it's running inside: read-only tools are announced as auto-executing with outputs visible next turn, and a new rule makes explicit that batch jobs must emit all write-steps in one plan (because staging a propose-tool ends the turn). Step count raised 1-5 → 1-10. Prompt snapshot tests still pass (they check structure, not text). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 00:55:18 +02:00
Till JS	d5c351d63e	feat(ai): per-iteration debug log — capture prompt + response + inputs New local-only Dexie table _aiDebugLog (v20, never synced) holds one row per mission iteration with the full system+user prompt, raw LLM response, latency, every ResolvedInput the planner saw, and pre-step state (kontext-injected? web-research-ok-or-error?). Capped at 50 newest rows. aiPlanTask always returns the captured prompt/response on AiPlanOutput.debug; the runner persists it only when isAiDebugEnabled() — toggled via a checkbox in the Mission detail header (defaults to on in DEV builds, off in prod, override via localStorage 'mana.ai.debug'). New <AiDebugBlock> component renders below each iteration card: expandable sections for Pre-Step, Resolved Inputs (each input individually collapsible), System Prompt, User Prompt, Raw Response, plus a "📋 JSON" copy-to-clipboard button for bug reports. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 20:33:17 +02:00
Till JS	0d90b12d1c	feat(shared-ai): extract planner + mission types to @mana/shared-ai Single source of truth for AI Workbench types shared between the webapp (Vite/SvelteKit) and the server-side mana-ai Bun service. Prevents the two runtimes from drifting on prompt shape or mission structure. - `@mana/shared-ai` package: - `actor.ts` — Actor union (user | ai | system) + helpers, mirrors the webapp's runtime type so server-side consumers parse incoming actors without re-declaring - `missions/types.ts` — Mission, MissionCadence, MissionInputRef, MissionIteration, PlanStep, MissionState. Adds optional `iteration.source: 'browser' | 'server'` to distinguish foreground vs server-produced iterations (groundwork for proposal write-back) - `planner/prompt.ts` — `buildPlannerPrompt` pure function - `planner/parser.ts` — `parsePlannerResponse` strict JSON validator - Vitest smoke tests (2) cover prompt → parse round-trip + unknown-tool rejection - Webapp: - `missions/types.ts` re-exports from shared-ai, keeps webapp-local `MISSIONS_TABLE` constant + `planStepStatusFromProposal` bridge - `missions/planner/{types,prompt,parser}.ts` become re-export stubs so existing imports keep working unchanged - Existing webapp tests (60) continue to pass — the wire code didn't move, just its home Next: mana-ai service imports buildPlannerPrompt/parsePlannerResponse from shared-ai + wires mana-llm + writes iteration back as a 'source=server' row (tracked in services/mana-ai/CLAUDE.md). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 00:01:57 +02:00
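
Commit 0077752456 notes that `guestSeed` entries wrapped in arrow functions were silently dead because the store iterates `seed.length`, which is 0 on a zero-parameter function. The following is a minimal sketch of that failure mode, not the repository's code: `countSeeded` and the `Note` type are hypothetical stand-ins for the `local-store` iteration, and `generateGuestWorld` here only borrows the name from the commit message.

```typescript
interface Note {
  id: string;
}

// Hypothetical stand-in for a store that walks its seed by index.
// It only asks for `length`, so a function satisfies the type as well.
function countSeeded(seed: { length: number }): number {
  let seeded = 0;
  for (let i = 0; i < seed.length; i++) {
    seeded++;
  }
  return seeded;
}

// Assumed generator; the real one lives in manavoxel/web.
const generateGuestWorld = (): Note[] => [{ id: 'a' }, { id: 'b' }];

// Wrapped in an arrow function: `length` is the parameter count, i.e. 0.
const lazySeed = () => generateGuestWorld();

// Evaluated once up front: `length` is the number of records.
const eagerSeed = generateGuestWorld();

const lazyCount = countSeeded(lazySeed);
const eagerCount = countSeeded(eagerSeed);

console.log({ lazyCount, eagerCount });
```

With the wrapped seed the loop body never runs, so nothing is seeded and no error is raised. Evaluating the generator eagerly gives the loop a real array to iterate.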

10 commits
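
Commits 4daca8970b and 0d613e1846 describe a planner loop that feeds tool results back to the LLM, runs parallel tool calls sequentially, stops on an empty `tool_calls` list or a max-rounds ceiling (default 5), and sums token usage across rounds with zeros for providers that report none. The sketch below illustrates that control flow under stated assumptions: every type and name (`runPlannerLoopSketch`, `scriptedClient`, the field names) is hypothetical, the client is synchronous for brevity where the real one is presumably async, and the separate "assistant stop" condition is folded into the empty-tool-calls check.

```typescript
// Hypothetical shapes; the real shared-ai types may differ.
interface TokenUsage { promptTokens: number; completionTokens: number }
interface ToolCall { id: string; name: string; args: Record<string, unknown> }
interface Message {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
  toolCallId?: string;
}
interface LlmResponse { content: string; toolCalls: ToolCall[]; usage?: TokenUsage }
interface LlmClient { complete(messages: Message[]): LlmResponse }
interface LoopResult {
  messages: Message[];
  rounds: number;
  stopReason: 'assistant-stop' | 'max-rounds';
  usage: TokenUsage;
}

function runPlannerLoopSketch(
  llm: LlmClient,
  initial: Message[],
  onToolCall: (call: ToolCall) => string,
  maxRounds = 5,
): LoopResult {
  const messages = [...initial];
  // Always an object, never undefined, so callers need no null checks.
  const usage: TokenUsage = { promptTokens: 0, completionTokens: 0 };

  for (let round = 1; round <= maxRounds; round++) {
    const res = llm.complete(messages);
    usage.promptTokens += res.usage?.promptTokens ?? 0;
    usage.completionTokens += res.usage?.completionTokens ?? 0;
    messages.push({ role: 'assistant', content: res.content });

    if (res.toolCalls.length === 0) {
      return { messages, rounds: round, stopReason: 'assistant-stop', usage };
    }
    // Calls from one turn run one after another to keep the log linear.
    for (const call of res.toolCalls) {
      let output: string;
      try {
        output = onToolCall(call);
      } catch (err) {
        // Failures become tool messages the LLM can react to.
        output = `error: ${err instanceof Error ? err.message : String(err)}`;
      }
      messages.push({ role: 'tool', content: output, toolCallId: call.id });
    }
  }
  return { messages, rounds: maxRounds, stopReason: 'max-rounds', usage };
}

// Scripted client in the spirit of a shared mock: replays canned responses.
function scriptedClient(script: LlmResponse[]): LlmClient {
  let next = 0;
  return {
    complete: () => {
      const res = script[next++];
      if (!res) throw new Error('script exhausted');
      return res;
    },
  };
}

const demo = runPlannerLoopSketch(
  scriptedClient([
    {
      content: '',
      toolCalls: [{ id: 'c1', name: 'list_notes', args: {} }],
      usage: { promptTokens: 100, completionTokens: 20 },
    },
    { content: 'done', toolCalls: [] }, // no usage reported, counted as zero
  ]),
  [{ role: 'user', content: 'tag my notes' }],
  (call) => `ran ${call.name}`,
);

console.log(demo.stopReason, demo.rounds, demo.usage);
```

Driving the loop from a scripted client keeps the run deterministic, which is the property the shared mock in commit 5b7564b3a4 is described as providing.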