The Claude-Code wU2 pattern goes live. Every mission run now passes a
compactor into runPlannerLoop that will fire once if cumulative token
usage crosses 92% of MANA_AI_COMPACT_MAX_CTX (default 1_000_000, the
gemini-2.5-flash ceiling). Override via env for deployments on smaller
models; set to 0 to disable entirely.
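The once-per-run trigger described above can be sketched as a small gate (hypothetical names — `makeCompactionGate` and its callers are illustrative assumptions, not the real runPlannerLoop wiring):

```typescript
// Hypothetical sketch of the once-only compaction trigger. The 92%
// threshold and the "0 disables" semantics come from the commit; the
// function and parameter names are assumptions.
function makeCompactionGate(maxCtx: number, threshold = 0.92) {
  let fired = false; // fires at most once per mission run
  return (usedTokens: number): boolean => {
    if (maxCtx <= 0 || fired) return false; // maxCtx = 0 disables entirely
    if (usedTokens < maxCtx * threshold) return false;
    fired = true;
    return true;
  };
}
```

A 1M-context deployment would construct the gate once per run and probe it after each turn's token accounting.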
The compactor reuses the planner's own LlmClient + gemini-2.5-flash
model for now. When mana-llm grows a Haiku tier we'll route the
compactor there — it's pure summarisation and a cheaper model saves
tokens exactly where they matter.
New metrics:
- mana_ai_compactions_triggered_total — counter, one per firing
- mana_ai_compacted_turns — histogram, how many middle turns got
folded each time (< 3 ⇒ maxCtx is probably misconfigured)
Logs print a 60-char tail of the summary.goal so the "what was this
mission doing again" question survives a compaction.
No new tests here — compactHistory and the loop wiring are already
covered by the 22 tests in shared-ai (M2.1 + M2.2). The 57 existing
mana-ai bun tests stay green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends LoopState with a sliding window of the last N ExecutedCalls
(oldest-first), capped at LOOP_STATE_RECENT_CALLS_WINDOW = 5. The loop
maintains the window automatically; reminderChannel producers read it
without touching internal state.
This activates retryLoopReminder, which was shape-only in faa472be9.

The guard now fires end-to-end: when round >= 3 and the tail-2 calls
both returned success:false, the LLM sees a "stop retrying, write a
summary instead" <reminder> on the next turn. The tail-2 check rather
than window-wide is deliberate — a flaky run with intermittent success
(F, F, F, OK, F) is not a retry loop, just flaky tools.
Why window=5: retry loops usually manifest within 2-3 consecutive
rounds; a 5-deep window gives room for burst-detection and
stale-tool heuristics without bloating the reminder channel. Cap
keeps the reminder producers O(5) regardless of loop length.
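The window maintenance and the tail-2 heuristic together can be sketched as follows (a minimal sketch — the `ExecutedCall` shape and helper names are assumptions; only the constant and the semantics come from the commit):

```typescript
// Sliding window of recent tool calls plus the tail-2 retry heuristic.
// LOOP_STATE_RECENT_CALLS_WINDOW matches the commit; everything else
// is an illustrative assumption.
const LOOP_STATE_RECENT_CALLS_WINDOW = 5;

interface ExecutedCall { tool: string; success: boolean }

// Oldest-first window, capped; the loop would call this after every tool call.
function pushCall(window: ExecutedCall[], call: ExecutedCall): ExecutedCall[] {
  return [...window, call].slice(-LOOP_STATE_RECENT_CALLS_WINDOW);
}

// Tail-2 check: only the two most recent calls decide, so intermittent
// success (F, F, F, OK, F) is not flagged as a retry loop.
function isRetryLoop(round: number, window: ExecutedCall[]): boolean {
  if (round < 3 || window.length < 2) return false;
  const [a, b] = window.slice(-2);
  return !a.success && !b.success;
}
```

Keeping the heuristic on the tail rather than the whole window is what lets the same 5-deep buffer serve future burst-detection producers without retuning this one.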
Tests: 3 new (sliding-window cap + slide + order in shared-ai, retry
composition + budget+retry chain + tail-only heuristic in mana-ai).
Total agent-loop tests now 74 across both packages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mana-mcp:
- Policy-gate section: POLICY_MODE semantics, the four decision
rules, where to find soak metrics during log-only burn-in.
- /metrics section pointing at the Prometheus job.
mana-ai:
- New v0.8 status block: reminderChannel wiring, the two live
producers (tokenBudgetReminder active, retryLoopReminder dormant
pending LoopState extension), why POLICY_MODE here is limited to
freetext inspection, why parallel-reads have no effect until the
tool-registry absorbs the full AI_TOOL_CATALOG (M4 of personas).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Opt-in path for missions that want Gemini Deep Research Max (up to 60 min
per task) instead of the shallow RSS pre-research. Because Max runs well
past a single 60-second tick, the state is carried across ticks:
tick N: submit → INSERT mission_research_jobs row → skip planner
tick N+k: poll → still running → skip planner (metric pending_skips)
tick N+m: poll → completed → inject as ResolvedInput, DELETE row, plan
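The three-tick flow above can be sketched as one state-machine function returning the discriminated union the commit describes (planned | skipped | failed); the `JobStore` and poll/submit shapes are assumptions introduced for illustration:

```typescript
// Hedged sketch of the cross-tick state machine. The union mirrors the
// commit's planned | skipped | failed shape; JobStore, ResearchPoll, and
// the callback signatures are hypothetical.
type PlanOutcome =
  | { kind: "planned"; iterationId: string }
  | { kind: "skipped"; reason: "research_submitted" | "research_pending" }
  | { kind: "failed"; reason: string };

type ResearchPoll = { status: "running" } | { status: "completed"; output: string };

interface JobStore {
  get(missionId: string): string | undefined; // presence = "pending" flag
  set(missionId: string, jobId: string): void; // tick N: submit
  delete(missionId: string): void; // delete-on-terminal
}

async function handleDeepResearch(
  missionId: string,
  store: JobStore,
  poll: (jobId: string) => Promise<ResearchPoll>,
  submit: () => Promise<string>,
  plan: (research: string) => Promise<string>,
): Promise<PlanOutcome> {
  const jobId = store.get(missionId);
  if (jobId === undefined) {
    store.set(missionId, await submit()); // tick N: submit, skip planner
    return { kind: "skipped", reason: "research_submitted" };
  }
  const res = await poll(jobId);
  if (res.status === "running") // tick N+k: still running
    return { kind: "skipped", reason: "research_pending" };
  store.delete(missionId); // tick N+m: terminal, row removed
  return { kind: "planned", iterationId: await plan(res.output) };
}
```

Returning `skipped` as its own variant is what keeps "research pending" out of the parse-failure counters.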
- ManaResearchClient talks to mana-research's new internal
/v1/internal/research/async endpoints with X-Service-Key +
X-User-Id. Graceful-null on transport errors so a flaky
mana-research never crashes the tick loop.
- New table mana_ai.mission_research_jobs with PK (user_id, mission_id)
— presence is the "pending" flag; delete-on-terminal keeps queries
trivial.
- handleDeepResearch() encapsulates the state machine; planOneMission
now returns a discriminated union (planned | skipped | failed) so
"research pending" isn't miscounted as a parse failure.
- Opt-in at TWO gates to keep cost in check ($3–7/task, 1500 credits
per run):
1. MANA_AI_DEEP_RESEARCH_ENABLED=true server-side (default off)
2. DEEP_RESEARCH_TRIGGER regex matches the mission objective
(strict: "deep research", "tiefe recherche", "umfassende
recherche", "hintergrundrecherche", "deep dive")
Falls back to shallow RSS when either gate fails or the submit
errors upstream.
- Prom metrics: mana_ai_research_jobs_{submitted,completed,failed}_total
labelled by provider, plus _pending_skips_total.
- docker-compose wires MANA_RESEARCH_URL + the opt-in flag and adds
mana-research to depends_on.
- Full write-up with real API response shape (outputs plural, not
OpenAI-style), step-3 MCP-server plan (security-gated, not built),
ops + kill-switch: docs/reports/gemini-deep-research.md.
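The two-gate opt-in check can be sketched like this (the env flag name comes from the commit; the regex itself is an assumption reconstructed from the listed trigger phrases):

```typescript
// Sketch of the two opt-in gates. MANA_AI_DEEP_RESEARCH_ENABLED is the
// real flag name; the regex is an illustrative reconstruction of the
// listed trigger phrases, not the shipped DEEP_RESEARCH_TRIGGER.
const DEEP_RESEARCH_TRIGGER =
  /deep research|tiefe recherche|umfassende recherche|hintergrundrecherche|deep dive/i;

function deepResearchEnabled(
  objective: string,
  env: Record<string, string | undefined>,
): boolean {
  // Gate 1: server-side kill switch, default off.
  if (env.MANA_AI_DEEP_RESEARCH_ENABLED !== "true") return false;
  // Gate 2: the mission objective must explicitly ask for it.
  return DEEP_RESEARCH_TRIGGER.test(objective);
}
```

Either gate failing means the mission silently falls back to the shallow RSS path, so the $3–7 cost is only ever incurred on an explicit request.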
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
startGoalTracker was only ever called from tests, so DrinkLogged /
TaskCompleted / MealLogged events never incremented currentValue and
GoalReached never fired — the progress bars were cosmetic. Wire it into
the (app)/+layout idle boot next to startStreakTracker, with matching
teardown in onDestroy.
Also drop <AiProposalInbox module="goals"/> into the module ListView so
create_goal / pause_goal / resume_goal / complete_goal proposals are
reviewable inline (previously only visible in the mission-detail view).
Refresh the tool-coverage tables while we're at it: apps/mana/CLAUDE.md
now reflects the real catalog state (59 tools, 19 modules — was 37/12),
and services/mana-ai/CLAUDE.md shows the correct server-side propose
subset (31 tools, 16 modules). Also fixes a stale 'location_log' →
'get_current_location' typo in the places row.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Catches up all docs with the current state of the AI tool system.
services/mana-ai/CLAUDE.md:
- New v0.6 status section documenting NewsResearchClient,
pre-planning research injection, config.manaApiUrl, and the full
28-tool / 11-module inventory (17 propose + 11 auto).
apps/mana/CLAUDE.md:
- New "Tool Coverage" table in the AI Workbench section listing all
tools per module with their policy (propose vs auto).
- New "Templates" subsection documenting the two-section gallery
(agent vs workbench templates), the seed-handler registry, and
the current handlers (meditate, habits, goals).
- Architecture cross-reference updated to include §23.
docs/architecture/COMPANION_BRAIN_ARCHITECTURE.md:
- §23.2 gains a "Server-Side Research (mana-ai, ab v0.6)" subsection
explaining how NewsResearchClient mirrors the client-side research
pre-step: same endpoints, same trigger regex, but HTTP-direct from
the Docker network instead of SvelteKit-internal.
docs/plans/README.md:
- workbench-templates.md added to the roadmap table (T1 shipped).
- Multi-agent description updated to mention 28 tools + server-side
web-research.
- Architecture cross-reference includes §23.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Multi-Agent Workbench shipped end-to-end (commits 1771063df through
7c89eb625). This commit turns the plan doc into a proper history + post-
mortem and captures the deferred Team-Workbench as its own forward plan
so the architectural breadcrumbs don't rot.
docs/plans/multi-agent-workbench.md:
- Status bumped to ✅ Shipped; every phase checkbox flipped.
- Open-questions section rewritten with the decisions that were
actually made (name-unique via store write-time check, per-source
system principalIds, policy fully migrated, scene binding default-
empty with smart suggestion).
- New "Shipping-Historie" (shipping history) table mapping each phase
  to its commit, the number of files touched, and the test outcome.
- New "Lessons Learnt + Follow-Up Ideen" (follow-up ideas) with:
* What went better than expected (L3 Actor cutover, getOrCreate
instead of unique index, displayName caching)
* Thin spots worth revisiting (avatar not on Actor, missing token
counter for budget, no missions list on agent detail, no
drag-reassign, scene binding doesn't drive filters yet)
* Five deferred follow-up projects (team features, agent memory
self-update, agent-to-agent messaging, meta-planner, per-agent
encryption domains)
docs/plans/team-workbench.md (NEW):
- Full forward-looking plan for the deferred Team-Workbench.
- Two use-cases (human multi-user vs multi-agent sharing team
context) with the observation that they share the same infra.
- Decision candidates table (still open — meant as T0 RFC fodder,
not baked in).
- Architecture sketch with data-model deltas over the current
single-user shape.
- Encryption subsection dedicated to the hardest problems: team-key
wrapping per member (reuses Mission-Grant pattern), member-removal
rotation (lazy vs eager), Zero-Knowledge-mode incompatibility.
- T0..T6 phasing (~7 weeks for a clean first-pass).
- Section "Wie Multi-Agent dafür den Weg geebnet hat" ("how
  Multi-Agent paved the way for this") enumerating the four
  invariants the shipped Phases 0-7 deliberately preserved to make
  this plan cheap when it lands.
docs/plans/README.md (NEW):
- Index doc with the AI/Workbench roadmap as an ASCII flow so future
contributors can locate themselves in the sequence without reading
three 400-line plans first.
docs/future/AI_AGENTS_IDEAS.md:
- Header marks Point 1 (encrypted tables) as shipped via the Mission
Grant plan; points 2-8 stay relevant. Cross-link to all three plan
docs so this stays the go-to backlog.
services/mana-ai/CLAUDE.md:
- Design-context header expanded to link to all four related docs
(arch §20-22, both shipped plans, forward team plan, ideas backlog).
No code changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 6 — Multi-Agent observability:
- AI Workbench timeline gets a per-agent filter (dropdown with avatars)
alongside module + mission. TimelineBucket gains agentId +
agentDisplayName, projected off the bucket's first AI actor.
- Bucket header now leads with the agent's avatar + name (lookup via
the live useAgents query so renamed agents reflect instantly) and
falls back to Actor.displayName for deleted agents.
- AiProposalInbox card header replaces the generic Sparkle + "KI
  schlägt vor" ("AI suggests") with an agent chip "🤖 Cashflow
  Watcher schlägt vor" ("Cashflow Watcher suggests"), using the
  cached Actor.displayName. The ghost-agent label is preserved via
  the cached displayName even when the agent record is gone.
Phase 7 — Docs:
- docs/architecture/COMPANION_BRAIN_ARCHITECTURE.md §22 added:
data model, identity flow, tick gate order, Scene-Agent binding
semantics, non-goals.
- services/mana-ai/CLAUDE.md status bumped to v0.5 (Multi-Agent
Workbench) with the per-agent runner features + metrics listed.
- apps/mana/CLAUDE.md AI Workbench section rewritten to cover the
Agent primitive, per-agent policy, scene lens, and the updated
timeline header.
Multi-Agent rollout is code-complete end-to-end:
Phase 0 Plan ✓ Phase 4 Policy-per-agent ✓
Phase 1 Actor identity ✓ Phase 5 Agent UI + Scene lens ✓
Phase 2 Agent CRUD ✓ Phase 6 Observability ✓
Phase 3 Tick agent-aware ✓ Phase 7 Docs ✓
Tests: webapp svelte-check 0 errors, 0 warnings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 2 of Mission Key-Grant. The tick loop now honours a mission's
grant by unwrapping the MDK and passing it + the record allowlist into
the resolvers. Encrypted modules (notes, tasks, calendar, journal,
kontext) resolve server-side instead of returning null.
- crypto/decrypt-value.ts: mirror of webapp AES-GCM wire format
(enc:1:<iv>.<ct>) — read-only, server never wraps
- db/resolvers/encrypted.ts: factory + 5 concrete resolvers. A scope
  violation bumps a metric and writes a structured audit row; decrypt
  failures do the same. Zero-decrypt (no grant, or record absent) =
  silent null, no audit noise.
- db/audit.ts: best-effort append to mana_ai.decrypt_audit; write
failures never cascade into tick failures.
- cron/tick.ts: buildResolverContext unwraps grant per mission; MDK
reference only lives for the scope of planOneMission.
- ResolverContext plumbed through resolveServerInputs; existing goals
resolver unchanged semantically.
- Metrics: mana_ai_decrypts_total{table}, mana_ai_grant_skips_total
{reason}, mana_ai_grant_scope_violations_total{table} (alert > 0).
Missions without a grant still run exactly as before — plaintext
resolvers fire, encrypted ones short-circuit to null. No behaviour
regression for existing users.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wires mana-ai into the existing observability stack so tick throughput,
plan-failure rates, planner latencies, and snapshot refresh health are
visible in Grafana + Prometheus, and the service's uptime surfaces on
status.mana.how under the "Internal" section.
- `src/metrics.ts` — prom-client Registry with `mana_ai_` prefix.
Counters: ticks_total, plans_produced_total, plans_written_back_total,
parse_failures_total, mission_errors_total, snapshots_new/updated,
snapshot_rows_applied_total, http_requests_total.
  Histograms: tick_duration_seconds (0.1–120s),
  planner_request_duration_seconds (0.25–60s),
  http_request_duration_seconds (0.005–10s).
- `src/index.ts` — HTTP middleware labels every request by
method/path/status; `/metrics` serves the Prometheus text format.
- `src/cron/tick.ts` — increments counters + wraps the tick with
`tickDuration.startTimer()`. Snapshot stats fold through.
- `src/planner/client.ts` — wraps `complete()` in a latency histogram
timer so planner tail latency shows up separately from tick duration.
- `docker/prometheus/prometheus.yml` —
1. New `mana-ai` scrape job against `mana-ai:3066/metrics` (30s).
2. `/health` added to the `blackbox-internal` job so uptime shows on
status.mana.how alongside mana-geocoding.
- `scripts/generate-status-page.sh` — friendly label for the new probe:
`mana-ai:3066/health` → "Mana AI Runner" (generator already iterates
`blackbox-internal`, no other changes needed).
- `package.json` — prom-client ^15.1.3
All 17 Bun tests still pass; tsc clean.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Background Hono/Bun service that scans mana_sync for due Missions and
will plan them via mana-llm without requiring an open browser tab.
Complements the foreground `startMissionTick` in the webapp.
v0.1 scope — scaffold that's deployable, boots cleanly, and reads real
data. Execution write-back is tracked as the next PR so we don't commit
a half-baked proposal-sync design.
Shipped:
- Hono app on :3066 with `/health` + service-key-gated `/internal/tick`
- `src/db/missions-projection.ts` — field-level LWW replay of
`sync_changes` for appId='ai' / table='aiMissions' → live Mission
records. Mirrors the webapp's `applyServerChanges` semantics against
Postgres instead of Dexie.
- `src/db/connection.ts` — bounded `postgres.js` pool (max 4, idle 30s)
- `src/cron/tick.ts` — overlap-guarded scheduler, `runTickOnce()` also
reachable via HTTP for CI/ops triggering
- `src/planner/client.ts` — mana-llm HTTP client shape
(OpenAI-compatible `/v1/chat/completions`)
- `src/middleware/service-auth.ts` — X-Service-Key gate, no end-user JWTs
reach this service
- Dockerfile + graceful SIGTERM shutdown (stops timer + releases pool)
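The field-level LWW replay at the heart of the projection can be sketched as follows (the `sync_changes` row shape is an assumption distilled from the description above; the real code replays against Postgres, not an in-memory map):

```typescript
// Sketch of field-level last-writer-wins replay. The row shape and the
// lexically-ordered clock string are assumptions; the shipped projection
// mirrors the webapp's applyServerChanges against Postgres.
interface SyncChange {
  recordId: string;
  field: string;
  value: unknown;
  hlc: string; // hybrid-logical-clock stamp, lexically ordered
}

type FieldMap = Map<string, { value: unknown; hlc: string }>;

// Apply one change iff its clock beats the field's current clock.
function applyChange(records: Map<string, FieldMap>, c: SyncChange): void {
  let rec = records.get(c.recordId);
  if (!rec) records.set(c.recordId, (rec = new Map()));
  const cur = rec.get(c.field);
  if (!cur || c.hlc > cur.hlc) rec.set(c.field, { value: c.value, hlc: c.hlc });
}
```

Per-field clocks are what make the replay order-insensitive: a stale change arriving late simply loses the comparison instead of clobbering a newer write.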
Not yet implemented (documented in CLAUDE.md with design trade-offs):
- Prompt/parser server-side copies — today they live in the webapp.
Recommended next step: extract `@mana/shared-ai` package.
- Input resolvers for notes / kontext / goals — need projections or a
mana-sync internal endpoint
- Plan → Mission-iteration write-back + how proposals get back to the
user's device (leaning option (a): server writes iterations, the
webapp's sync effect translates them into local Proposals)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>