managarten

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-21 12:26:43 +02:00

Author	SHA1	Message	Date
Till JS	f7536bc0b9	feat(shared-ai): route compactor to Haiku-tier model by default (M2.5) compactHistory() now defaults to DEFAULT_COMPACT_MODEL = 'google/gemini-2.5-flash-lite' when the caller doesn't override. Lite is ~3–5x cheaper than gemini-2.5-flash with near-identical summarisation quality — summarisation doesn't need the same tier as reasoning + tool-calling, and the compactor fires exactly when token spend is highest, so the cheaper route saves exactly where it matters. CompactHistoryOptions.model is now optional. All three consumers (mana-ai tick, webapp Companion, webapp Mission runner) drop their explicit gemini-2.5-flash override and let the default apply. This is the pragmatic M2.5: no mana-llm changes. The "tier" abstraction (X-Model-Tier header, env-routed aliases) from the Claude-Code report makes sense only once multiple utility tasks need cheaper routing — topic-detection, classification, command-injection checks. Today only the compactor wants it, and a model constant is the simplest contract that works. 2 new tests (default applied + override honoured). 79 shared-ai tests green, all three consumers type-check clean. One pre-existing unrelated type error in apps/mana/apps/web/src/lib/modules/wardrobe/queries.ts (not touched by this commit). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 18:26:50 +02:00
Till JS	83a4606a9a	feat(mana-ai): wire context-window compactor into mission runner (M2.3) The Claude-Code wU2 pattern goes live. Every mission run now passes a compactor into runPlannerLoop that will fire once if cumulative token usage crosses 92% of MANA_AI_COMPACT_MAX_CTX (default 1_000_000, the gemini-2.5-flash ceiling). Override via env for deployments on smaller models; set to 0 to disable entirely. The compactor reuses the planner's own LlmClient + gemini-2.5-flash model for now. When mana-llm grows a Haiku tier we'll route the compactor there — it's pure summarisation and a cheaper model saves tokens exactly where they matter. New metrics: - mana_ai_compactions_triggered_total — counter, one per firing - mana_ai_compacted_turns — histogram, how many middle turns got folded each time (< 3 ⇒ maxCtx is probably misconfigured) Logs print a 60-char tail of the summary.goal so the "what was this mission doing again" question survives a compaction. No new tests here — compactHistory and the loop wiring are already covered by the 22 tests in shared-ai (M2.1 + M2.2). The 57 existing mana-ai bun tests stay green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:28:20 +02:00
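The trigger rule in the commit above can be sketched as a small pure function. This is only an illustration: `shouldCompact` and `CompactState` are hypothetical names, and treating "crosses 92%" as "at or above 92%" is an assumption. The 92% ratio, the 1_000_000 default, the fire-once behaviour, and "0 disables" come from the commit message.

```typescript
// Hypothetical sketch of the compaction trigger; not the actual runPlannerLoop code.
const COMPACT_THRESHOLD_PERCENT = 92; // ratio stated in the commit message
const DEFAULT_MAX_CTX = 1_000_000; // default of MANA_AI_COMPACT_MAX_CTX per the commit

interface CompactState {
  cumulativeTokens: number; // tokens used so far in this mission run
  alreadyCompacted: boolean; // the compactor fires at most once per run
}

function shouldCompact(state: CompactState, maxCtx: number = DEFAULT_MAX_CTX): boolean {
  if (maxCtx <= 0) return false; // setting the env var to 0 disables compaction
  if (state.alreadyCompacted) return false;
  // Integer comparison avoids floating-point rounding at the boundary.
  return state.cumulativeTokens * 100 >= maxCtx * COMPACT_THRESHOLD_PERCENT;
}

console.log(shouldCompact({ cumulativeTokens: 950_000, alreadyCompacted: false }));
```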
Till JS	faa472be91	feat(mana-ai): first live reminder producers — token budget + retry-loop Wires the M1 reminderChannel into the mana-ai mission runner with two initial producers in services/mana-ai/src/planner/reminders.ts: - tokenBudgetReminder — warns at 75% of the agent's daily cap, emits a stronger "wrap up NOW" message at/above 100%. Uses pretick usage + accumulated round usage so the warning tracks drift during a long plan. - retryLoopReminder — shape is in place (round≥3 + last 2 failures), currently limited to the single lastCall LoopState exposes. Extends cleanly once LoopState carries the full failure window. buildReminderChannel composes active producers; the tick hoists pretickUsage24h so the channel has the baseline. Each round the loop re-evaluates the producers, so usage drift across rounds surfaces on the NEXT turn. Also exports LoopState + ReminderChannel from @mana/shared-ai top-level so consumers don't need to reach into /planner. Tests: 13 new bun tests covering thresholds, pretick+round summing, composition, and per-round re-evaluation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 14:00:04 +02:00
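A minimal sketch of the token-budget producer and the channel composition described above. `LoopStateSketch`, the message wording, and the factory signature are assumptions made for illustration; only the 75% and 100% thresholds, the pretick-plus-round summing, and per-round re-evaluation come from the commit message.

```typescript
// Hypothetical, simplified stand-in for the LoopState exported by @mana/shared-ai.
interface LoopStateSketch {
  round: number;
  roundTokens: number; // tokens accumulated across rounds of the current plan
}

type ReminderProducer = (state: LoopStateSketch) => string | null;

// Warns at 75% of the daily cap and escalates at or above 100%.
function tokenBudgetReminder(pretickUsage24h: number, dailyCap: number): ReminderProducer {
  return (state) => {
    if (dailyCap <= 0) return null; // assumption: no cap means no reminder
    const used = pretickUsage24h + state.roundTokens;
    if (used >= dailyCap) {
      return `Token budget exhausted (${used}/${dailyCap}). Wrap up NOW.`;
    }
    if (used * 4 >= dailyCap * 3) {
      return `Token budget at ${Math.floor((used * 100) / dailyCap)}% of the daily cap.`;
    }
    return null;
  };
}

// Composes producers; the loop would call the result once per round.
function buildReminderChannel(producers: ReminderProducer[]) {
  return (state: LoopStateSketch): string[] =>
    producers.map((p) => p(state)).filter((m): m is string => m !== null);
}

const channel = buildReminderChannel([tokenBudgetReminder(70_000, 100_000)]);
console.log(channel({ round: 1, roundTokens: 6_000 }));
```

Because the producer closes over the pretick baseline and reads round usage from the state it is handed, calling the channel again on a later round picks up usage drift without any extra wiring.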
Till JS	e5d230e599	feat(agent-loop): M1 — policy gate + reminder channel + parallel reads Three Claude-Code-inspired primitives for runPlannerLoop, derived from the reverse-engineering reports in docs/reports/: 1. Policy gate (@mana/tool-registry) — evaluatePolicy() gates every tool dispatch: denies admin-scope, denies destructive tools not in the user's opt-in list, rate-limits per tool (30/60s default), flags prompt-injection markers in freetext without blocking. Wired into mana-mcp with a per-user rolling invocation log and POLICY_MODE env (off|log-only|enforce, default log-only). mana-ai uses detectInjectionMarker only — tool dispatch there is plan-only, so rate-limit/destructive checks don't apply yet. 2. Reminder channel (packages/shared-ai/src/planner/loop.ts) — new reminderChannel callback in PlannerLoopInput. Called once per round with LoopState snapshot (round, toolCallCount, usage, lastCall); returned strings wrap in <reminder> tags and inject as transient system messages into THIS LLM request only. Never pushed to messages[] — the Claude-Code <system-reminder> pattern that keeps the KV-cache prefix stable. 3. Parallel reads (loop.ts) — isParallelSafe predicate enables Promise.all dispatch when every tool_call in a round is parallel-safe, in batches of PARALLEL_TOOL_BATCH_SIZE=10. Any non-safe call downgrades the whole round to sequential. messages[] always appends in source order, never completion order, so the debug log stays linear. Default-off (undefined predicate) preserves pre-M1 behaviour. Tests: 21 new in tool-registry (policy), 9 new in shared-ai (5 parallel, 4 reminder). All 74 green, type-check clean across 4 packages. Design/plan: docs/plans/agent-loop-improvements-m1.md Reports: docs/reports/claude-code-architecture.md, docs/reports/mana-agent-improvements-from-claude-code.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 13:56:40 +02:00
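The parallel-reads rule from item 3 above could look roughly like this. `dispatchRound`, `ToolCall`, and `ToolResult` are hypothetical names for illustration; the batch size of 10, the all-or-nothing downgrade to sequential, the default-off predicate, and source-order results are taken from the commit message.

```typescript
// Hypothetical sketch; the real logic lives in packages/shared-ai/src/planner/loop.ts.
const PARALLEL_TOOL_BATCH_SIZE = 10;

interface ToolCall {
  id: string;
  name: string;
}

interface ToolResult {
  id: string;
  output: string;
}

async function dispatchRound(
  calls: ToolCall[],
  execute: (call: ToolCall) => Promise<ToolResult>,
  isParallelSafe?: (call: ToolCall) => boolean,
): Promise<ToolResult[]> {
  const results: ToolResult[] = [];
  // An undefined predicate keeps the pre-M1 sequential behaviour.
  const allSafe = isParallelSafe !== undefined && calls.every((c) => isParallelSafe(c));

  if (!allSafe) {
    // Any non-safe call downgrades the whole round to sequential dispatch.
    for (const call of calls) results.push(await execute(call));
    return results;
  }

  for (let i = 0; i < calls.length; i += PARALLEL_TOOL_BATCH_SIZE) {
    const batch = calls.slice(i, i + PARALLEL_TOOL_BATCH_SIZE);
    // Promise.all resolves in input order, so results stay in source order
    // regardless of which call finishes first.
    results.push(...(await Promise.all(batch.map((c) => execute(c)))));
  }
  return results;
}
```

Relying on the ordering guarantee of `Promise.all` is what lets the caller append to `messages[]` in source order without sorting by completion time.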
Till JS	2a18cb5ee4	feat(mana-ai): v0.7 — cross-tick Deep Research Max pre-planning Opt-in path for missions that want Gemini Deep Research Max (up to 60 min per task) instead of the shallow RSS pre-research. Because Max runs well past a single 60-second tick, the state is carried across ticks: tick N: submit → INSERT mission_research_jobs row → skip planner tick N+k: poll → still running → skip planner (metric pending_skips) tick N+m: poll → completed → inject as ResolvedInput, DELETE row, plan - ManaResearchClient talks to mana-research's new internal /v1/internal/research/async endpoints with X-Service-Key + X-User-Id. Graceful-null on transport errors so a flaky mana-research never crashes the tick loop. - New table mana_ai.mission_research_jobs with PK (user_id, mission_id) — presence is the "pending" flag; delete-on-terminal keeps queries trivial. - handleDeepResearch() encapsulates the state machine; planOneMission now returns a discriminated union (planned | skipped | failed) so "research pending" isn't miscounted as a parse failure. - Opt-in at TWO gates to keep cost in check ($3–7/task, 1500 credits per run): 1. MANA_AI_DEEP_RESEARCH_ENABLED=true server-side (default off) 2. DEEP_RESEARCH_TRIGGER regex matches the mission objective (strict: "deep research", "tiefe recherche", "umfassende recherche", "hintergrundrecherche", "deep dive") Falls back to shallow RSS when either gate fails or the submit errors upstream. - Prom metrics: mana_ai_research_jobs_{submitted,completed,failed}_total labelled by provider, plus _pending_skips_total. - docker-compose wires MANA_RESEARCH_URL + the opt-in flag and adds mana-research to depends_on. - Full write-up with real API response shape (outputs plural, not OpenAI-style), step-3 MCP-server plan (security-gated, not built), ops + kill-switch: docs/reports/gemini-deep-research.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 17:56:06 +02:00
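The cross-tick flow above reduces to a per-tick decision. The sketch below is a guess at its shape, not the real `handleDeepResearch()`: `nextResearchStep`, `JobState`, and the step names are hypothetical, the regex is assembled from the trigger phrases quoted in the commit rather than copied from the code, and falling back to shallow research after a failed job is an assumption.

```typescript
// Illustrative regex built from the phrases listed in the commit message.
const DEEP_RESEARCH_TRIGGER =
  /deep research|tiefe recherche|umfassende recherche|hintergrundrecherche|deep dive/i;

// "none" means no row exists in mana_ai.mission_research_jobs for this mission.
type JobState = "none" | "running" | "completed" | "failed";

type ResearchStep =
  | "submit-and-skip" // tick N: submit the job, insert the row, skip the planner
  | "skip-pending" // tick N+k: still running, skip the planner
  | "inject-and-plan" // tick N+m: inject the result, delete the row, plan
  | "plan-shallow"; // fall back to shallow RSS pre-research

function nextResearchStep(enabled: boolean, objective: string, job: JobState): ResearchStep {
  if (job === "running") return "skip-pending";
  if (job === "completed") return "inject-and-plan";
  if (job === "failed") return "plan-shallow"; // assumption: not stated in the commit
  // No job yet: both opt-in gates must pass before anything is submitted.
  if (enabled && DEEP_RESEARCH_TRIGGER.test(objective)) return "submit-and-skip";
  return "plan-shallow";
}

console.log(nextResearchStep(true, "Deep dive into EU battery regulation", "none"));
```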
Till JS	1d3794f96c	feat(mana-ai): Prometheus metrics for tool-calls, loop rounds, provider errors Three new counters + one histogram fill the observability gap from the function-calling migration: - mana_ai_tool_calls_total{tool, policy, outcome} — one tick per tool_call the planner produced. `outcome` is `deferred` on the server (stub onToolCall records for later client execution); webapp runner will emit success/failure once it grows its own Prom surface. - mana_ai_planner_rounds (histogram, buckets 1..5) — distribution of rounds consumed per iteration. Runs close to the cap signal a planner struggling with the mission objective. - mana_ai_provider_errors_total{provider, kind} — structured errors surfaced from mana-llm. Kind mirrors the ProviderError hierarchy added in commit 1 of the migration (blocked/truncated/auth/ rate_limit/capability/unknown). Plumbing: - llm-client.ts parses mana-llm's `{detail: {kind, message}}` 4xx/5xx body shape and re-throws as ProviderCallError carrying the kind. - tick.ts observes metrics at the natural emission points — rounds + per-call counter after runPlannerLoop returns, provider_errors in the catch block. Grafana dashboards + status.mana.how already pick up the collectDefaultMetrics prefix, so these metrics land in the existing mana-ai panel without scraper changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 20:48:29 +02:00
Till JS	0d613e1846	feat(ai): thread TokenUsage through runPlannerLoop → mana-ai budget Carries per-round token counts from the mana-llm response body (prompt_tokens + completion_tokens) back through LlmCompletionResponse → PlannerLoopResult. The loop sums across rounds and exposes a single aggregate on result.usage. Lets mana-ai's tick re-activate per-agent daily-token budget tracking — tokensUsed was stubbed to 0 in the migration commit (6) because the loop didn't surface usage yet. Now recordTokenUsage + agentTokenUsage24h get real numbers again, and the mana_ai_tokens_used_total Prometheus counter is accurate. Additive only: consumers without usage needs ignore the new field, and providers that don't return usage produce zeros (not undefined — the loop still exposes the object so downstream branches stay trivial). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 18:21:34 +02:00
Till JS	1cd559ca34	feat(mana-ai): server runner on runPlannerLoop, drops text-JSON parser Migrates the background tick from buildPlannerPrompt + PlannerClient + parsePlannerResponse to the shared runPlannerLoop with native function calling. Structurally identical to the webapp runner (commit 5a) — same catalog, same compact system prompt, same multi-turn chat. Server-specific twist: the ``onToolCall`` callback is a no-op stub (returns {success:true, message:'recorded — pending client application'}). The server has no Dexie access, so it can't actually execute writes; instead it captures the LLM's chosen tool_calls and writes them as PlanStep entries on the iteration. The user's client picks up those planned steps on sync — same shape as before, just sourced from the LLM's native tool_calls instead of a regex-extracted JSON block. Scope trimmed by the SERVER_TOOLS filter: only propose-default (write) tools go to the server planner. Read-only tools (list_, get_) are hidden because stubbing a response would let the LLM hallucinate that it saw real data. Read-then-act chains stay with the foreground runner, which has a real executor. Deleted: planner/client.ts (old PlannerClient; replaced by planner/llm-client.ts). Drift guard in tools.ts collapses into a SERVER_TOOLS = AI_TOOL_CATALOG.filter(propose) derivation — no more hand-maintained duplicate list; the contract test now asserts the inverse round-trip against AI_PROPOSABLE_TOOL_SET. TODO (follow-up): token usage tracking is temporarily set to 0 because runPlannerLoop doesn't expose per-message usage yet. Budget enforcement on the server is effectively disabled until the loop returns that data — the webapp runner is unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 16:39:20 +02:00
Till JS	76577869e1	feat(mana-ai): OpenTelemetry tracing + Grafana Tempo backend Add distributed tracing to the mana-ai background runner so mission execution can be visualized end-to-end in Grafana. Instrumentation (services/mana-ai/): - tracing.ts: OTel provider setup with OTLP/HTTP exporter, withSpan() helper - tick.ts: tick.planMission span with mission/agent/user attributes - client.ts: planner.complete span with LLM model, tokens, latency Infrastructure: - docker/tempo/tempo.yaml: Grafana Tempo config (OTLP HTTP on 4318) - docker-compose: tempo service + tempo_data volume + mana-ai env var - docker/grafana/provisioning/datasources/tempo.yml: auto-provisioned Trace flow: tick.planMission (root span) └── planner.complete (child span) ├── llm.model = "gpt-4o-mini" ├── llm.tokens.total = 1234 └── llm.response.length = 567 Enable: set OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 View: Grafana → Explore → Tempo datasource Also fixes: removed broken @mana/subscriptions workspace ref from arcade. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 15:21:23 +02:00
Till JS	ce57e11950	feat(mana-ai): server-side token budget enforcement per agent Implement rolling 24h token budget enforcement in the mana-ai tick loop. Agents with maxTokensPerDay set are now rate-limited server-side. Changes: - PlannerClient: extract usage.total_tokens from mana-llm response - planOneMission: return {plan, tokensUsed} tuple - tick loop: check getAgentTokenUsage24h() before planning; skip with 'skipped-budget' decision if over limit - tick loop: record token usage after successful plan via recordTokenUsage() INSERT into mana_ai.token_usage - migrate.ts: new mana_ai.token_usage table with rolling window index - metrics.ts: mana_ai_tokens_used_total counter (by agent_id) Budget flow: Agent.maxTokensPerDay = 50000 → tick checks: SELECT SUM(tokens_used) WHERE ts > now()-24h → if sum >= 50000: skip mission, emit skipped-budget metric → else: plan mission, INSERT token_usage row Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:41:31 +02:00
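The budget flow above, restated over an in-memory list instead of the `mana_ai.token_usage` table. `TokenUsageRow`, `usage24h`, and `budgetDecision` are hypothetical names, and millisecond epoch timestamps are an assumption; the rolling 24h window, the `sum >= cap` comparison, and the `skipped-budget` decision come from the commit message.

```typescript
// Hypothetical in-memory stand-in for rows of mana_ai.token_usage.
interface TokenUsageRow {
  ts: number; // epoch milliseconds (assumption)
  tokensUsed: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Mirrors: SELECT SUM(tokens_used) WHERE ts > now() - 24h
function usage24h(rows: TokenUsageRow[], now: number): number {
  return rows
    .filter((r) => r.ts > now - DAY_MS)
    .reduce((sum, r) => sum + r.tokensUsed, 0);
}

function budgetDecision(
  rows: TokenUsageRow[],
  maxTokensPerDay: number | undefined,
  now: number,
): "plan" | "skipped-budget" {
  if (maxTokensPerDay === undefined) return "plan"; // agents without a cap are not limited
  return usage24h(rows, now) >= maxTokensPerDay ? "skipped-budget" : "plan";
}

const now = Date.now();
const rows: TokenUsageRow[] = [
  { ts: now - 25 * 60 * 60 * 1000, tokensUsed: 40_000 }, // outside the window
  { ts: now - 60 * 60 * 1000, tokensUsed: 30_000 },
];
console.log(budgetDecision(rows, 50_000, now));
```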
Till JS	23b8cc13fb	feat(ai-tools): server-side web-research + contacts for agents Two major tool expansions — the Recherche-Agent and Today-Agent can now research the web autonomously (no browser needed), and a future Meeting-Prep agent can read + create contacts. === research_news (server-side execution) === The biggest addition: mana-ai can now call mana-api's news-research endpoints (POST /discover + /search) directly, without a browser. Infrastructure: - services/mana-ai/src/planner/news-research-client.ts — full HTTP client with discover→search pipeline. 15s/30s timeouts. Graceful null on any failure (network, mana-api down, bad response) so the tick never crashes from research errors. - config.manaApiUrl added (default http://localhost:3060); wired in docker-compose.macmini.yml as http://mana-api:3060 + depends_on mana-api with service_healthy condition. Pre-planning research step (cron/tick.ts): - Before the planner prompt is built, the tick checks if the mission's objective or conceptMarkdown matches research keywords (same RESEARCH_TRIGGER regex the webapp uses). When it matches: * NewsResearchClient.research(objective) runs discovery + search * Results are injected as a synthetic ResolvedInput with id '__web-research__' and a formatted markdown context block * The Planner then sees real article URLs/titles/excerpts and can reference them in create_note / save_news_article steps * Log line: "pre-research: N feeds, M articles" Tool registration: - research_news added to AI_PROPOSABLE_TOOL_NAMES + mana-ai tools.ts with params (query, language?, limit?). This lets the planner also explicitly propose a research step as a PlanStep (in addition to the pre-planning auto-injection). === create_contact === - Added to AI_PROPOSABLE_TOOL_NAMES + mana-ai tools.ts with params (firstName required, lastName/email/phone/company/notes optional). - Contacts are encrypted at rest; server planner can plan the step but execution stays on the webapp (same as all propose tools). 
Full server-side contact resolution via Key-Grant is a future enhancement. - get_contacts added to webapp AUTO_TOOLS so agents can inspect existing contacts without nagging (read-only, auto-policy). Module coverage now: ✅ todo (5) ✅ calendar (2) ✅ notes (5) ✅ places (4) ✅ drink (3) ✅ food (2) ✅ news (1) ✅ journal (1) ✅ habits (3) ✅ news-research (1) ✅ contacts (1) 11 modules, 28 tools total (17 propose, 11 auto). Tests: mana-ai 41/41 (drift-guard passes), shared-ai type-check clean, webapp svelte-check 0 errors, 0 warnings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 12:25:45 +02:00
Till JS	f7426ab40f	feat(ai): policy is read from the owning agent (Phase 4) Until now AiPolicy lived as a user-global setting consulted for every AI action. With agents as the principal unit of AI behavior, policy belongs on the agent — different agents can be aggressive about tasks but conservative about calendar edits, etc. Webapp (tools/executor.ts): - When an AI actor invokes a tool, the executor looks up the owning agent via getAgent(actor.principalId) and passes agent.policy into resolvePolicy. Falls back to DEFAULT_AI_POLICY when the agent record is missing (legacy write, deleted agent, race) so no tool call can silently bypass the propose/deny path. - resolvePolicy already accepted an optional policy arg, so the call site change is a single line plus the agent load. Server (mana-ai): - ServerAgent gains an optional policy field, projected off the same plaintext JSONB that the webapp writes. - Tick loop filters AI_AVAILABLE_TOOLS through filterToolsByAgentPolicy before passing them to the planner prompt. Resolution order mirrors the webapp: tools[name] → defaultsByModule → defaultForAi; 'deny' drops the tool so the LLM never even sees it. Phase 5 will surface a per-agent policy editor on the agent-detail UI. Until then all agents inherit DEFAULT_AI_POLICY (baked in during createAgent), which means no behavior change for existing users — every tool that was 'propose' before is still 'propose' now, just reached via agent.policy instead of the user-level singleton. Tests: mana-ai 41/41, webapp svelte-check clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 21:43:04 +02:00
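A minimal sketch of the resolution order named above (tools[name] → defaultsByModule → defaultForAi, with 'deny' dropping the tool). The types here are simplified assumptions, not the real `AiPolicy` shape, and this `filterToolsByAgentPolicy` is an illustrative reimplementation rather than the function from the codebase.

```typescript
type PolicyDecision = "auto" | "propose" | "deny";

// Simplified, hypothetical policy shape.
interface AiPolicySketch {
  tools: Partial<Record<string, PolicyDecision>>;
  defaultsByModule: Partial<Record<string, PolicyDecision>>;
  defaultForAi: PolicyDecision;
}

interface ToolSketch {
  name: string;
  module: string;
}

// Most specific rule wins: per-tool, then per-module, then the global default.
function resolveToolPolicy(policy: AiPolicySketch, tool: ToolSketch): PolicyDecision {
  return policy.tools[tool.name] ?? policy.defaultsByModule[tool.module] ?? policy.defaultForAi;
}

// Denied tools are removed before the catalog reaches the planner.
function filterToolsByAgentPolicy(tools: ToolSketch[], policy: AiPolicySketch): ToolSketch[] {
  return tools.filter((t) => resolveToolPolicy(policy, t) !== "deny");
}

const policy: AiPolicySketch = {
  tools: { create_contact: "deny" },
  defaultsByModule: { calendar: "deny", todo: "propose" },
  defaultForAi: "propose",
};
const catalog: ToolSketch[] = [
  { name: "create_task", module: "todo" },
  { name: "create_event", module: "calendar" },
  { name: "create_contact", module: "contacts" },
  { name: "create_note", module: "notes" },
];
console.log(filterToolsByAgentPolicy(catalog, policy).map((t) => t.name));
```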
Till JS	0af50f0166	feat(mana-ai): agent-aware tick loop + snapshot projection (Phase 3) Third phase of the Multi-Agent Workbench. The background mission runner now respects the owning Agent: agent state gates whether a mission runs, concurrency is capped per-agent, and server-produced iterations carry the agent's identity as their Actor. Data layer: - db/migrate.ts: new mana_ai.agent_snapshots table (mirrors mission_snapshots) with indexes on (user_id, last_applied_at) and a partial index on active agents. - db/agents-projection.ts: refreshAgentSnapshots (incremental LWW replay over sync_changes appId='ai' table='agents') + loadActiveAgents / loadAgent helpers. mergeRaw exported for tests. - db/missions-projection.ts: ServerMission.agentId + projection reads the JSONB field (undefined for legacy missions). Tick integration (cron/tick.ts): - Refreshes both snapshot tables on every pass (parallel). - Per-user in-tick agent cache (Map<userId, Map<agentId, Agent>>) so N missions for one user hit the DB once. - Gate order: agent archived → skip silently; agent paused → skip; per-agent maxConcurrentMissions exhausted this tick → defer to next. All skip paths bump mana_ai_agent_decisions_total{decision}. - Prompt injection: withAgentContext prepends an <agent_context> block to the system prompt with the agent's name + role, and plaintext systemPrompt + memory when available. Ciphertext (enc:1:… blobs) are skipped — server has no key by design. Mirrors the Mission Grant privacy stance: encrypted context belongs to the foreground runner. Iteration writer (db/iteration-writer.ts): - New optional `agent` + `iterationId` + `rationale` inputs. - When agent is present, the sync_changes row is stamped with a makeAgentActor actor (principalId=agentId, displayName=agent.name) so the webapp timeline groups the write under the right agent. 
- Falls back to an AI actor with LEGACY_AI_PRINCIPAL + 'Mana' when the mission has no owning agent; ultimate fallback to the mission-runner system actor when iterationId is also missing. Metrics: - mana_ai_agent_decisions_total{decision=ran|skipped-paused|skipped-archived|skipped-concurrency}. Missions without an agent don't produce this metric — plansWrittenBackTotal is the universal "did we run" counter. Tests: 41/41 (was 35) including 6 new cases for the agent LWW merge. mana-ai type-check clean. Webapp svelte-check: 0 errors (4 unrelated warnings in a different module). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 20:46:57 +02:00
Till JS	a6d51afbc9	feat(mana-ai): encrypted resolver + tick uses Mission Grant to decrypt scoped inputs Phase 2 of Mission Key-Grant. The tick loop now honours a mission's grant by unwrapping the MDK and passing it + the record allowlist into the resolvers. Encrypted modules (notes, tasks, calendar, journal, kontext) resolve server-side instead of returning null. - crypto/decrypt-value.ts: mirror of webapp AES-GCM wire format (enc:1:<iv>.<ct>) — read-only, server never wraps - db/resolvers/encrypted.ts: factory + 5 concrete resolvers. Scope- violation bumps a metric + writes a structured audit row, decrypt failures same. Zero-decrypt (no grant, or record absent) = silent null, no audit noise. - db/audit.ts: best-effort append to mana_ai.decrypt_audit; write failures never cascade into tick failures. - cron/tick.ts: buildResolverContext unwraps grant per mission; MDK reference only lives for the scope of planOneMission. - ResolverContext plumbed through resolveServerInputs; existing goals resolver unchanged semantically. - Metrics: mana_ai_decrypts_total{table}, mana_ai_grant_skips_total {reason}, mana_ai_grant_scope_violations_total{table} (alert > 0). Missions without a grant still run exactly as before — plaintext resolvers fire, encrypted ones short-circuit to null. No behaviour regression for existing users. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:42:31 +02:00
Till JS	0bf01f434e	feat(mana-ai): Prometheus /metrics endpoint + status.mana.how integration Wires mana-ai into the existing observability stack so tick throughput, plan-failure rates, planner latencies, and snapshot refresh health are visible in Grafana + Prometheus, and the service's uptime surfaces on status.mana.how under the "Internal" section. - `src/metrics.ts` — prom-client Registry with `mana_ai_` prefix. Counters: ticks_total, plans_produced_total, plans_written_back_total, parse_failures_total, mission_errors_total, snapshots_new/updated, snapshot_rows_applied_total, http_requests_total. Histograms: tick_duration_seconds (0.1–120s), planner_request_ duration_seconds (0.25–60s), http_request_duration_seconds (0.005–10s). - `src/index.ts` — HTTP middleware labels every request by method/path/status; `/metrics` serves the Prometheus text format. - `src/cron/tick.ts` — increments counters + wraps the tick with `tickDuration.startTimer()`. Snapshot stats fold through. - `src/planner/client.ts` — wraps `complete()` in a latency histogram timer so planner tail latency shows up separately from tick duration. - `docker/prometheus/prometheus.yml` — 1. New `mana-ai` scrape job against `mana-ai:3066/metrics` (30s). 2. `/health` added to the `blackbox-internal` job so uptime shows on status.mana.how alongside mana-geocoding. - `scripts/generate-status-page.sh` — friendly label for the new probe: `mana-ai:3066/health` → "Mana AI Runner" (generator already iterates `blackbox-internal`, no other changes needed). - `package.json` — prom-client ^15.1.3 All 17 Bun tests still pass; tsc clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 01:41:40 +02:00
Till JS	8fd9b7da79	perf(mana-ai): materialize mission snapshots, drop per-tick full replay Replaces the O(N sync_changes) LWW replay in every tick with an incremental snapshot table refresh. Each tick now applies only the delta since the last run, then runs a single indexed SELECT on the snapshot table to find due missions. - `db/migrate.ts` — idempotent migration. Creates `mana_ai` schema and `mana_ai.mission_snapshots` table on boot. Partial index on active+nextRunAt powers the tick's "due" query. - `db/snapshot-refresh.ts` - `refreshSnapshots(sql)` one-pass: joins sync_changes and snapshots on (user_id, mission_id), picks out pairs whose source max created_at exceeds the snapshot cursor. Per-pair refresh wrapped in `withUser` for RLS scoping on the source SELECT. - Bootstrap: missing snapshot rows seed from a full replay of their mission's history; subsequent ticks apply only the delta. - Delete tombstones purge the snapshot row. - `db/missions-projection.ts` `listDueMissions` — single SELECT against `mana_ai.mission_snapshots` with an indexed WHERE. Dropped the legacy cross-user scan + per-user two-phase read (unused now). `mergeAndFilter` stays for its existing test coverage. - `cron/tick.ts` calls `refreshSnapshots` before `listDueMissions` and logs when the refresh actually applied rows. No behaviour change externally. - `index.ts` awaits `migrate()` on boot (top-level `await` — Bun supports it natively). Closes the last item on the AI-Workbench roadmap's "future work" list. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 01:28:24 +02:00
Till JS	a8425941fb	feat(mana-ai): server-side input resolvers (goals for now) Plugs plaintext-safe Mission context into the Planner prompt per tick. Before this, `resolvedInputs: []` was always passed — the LLM only saw the mission's concept + objective. Now goals (the only plaintext category of linked inputs today) resolve and land in the prompt. Privacy constraint is explicit and documented: tables in the webapp's encryption registry (notes, kontext, journal, dreams, …) arrive at `sync_changes.data` as ciphertext — the master key lives in mana-auth KEK-wrapped and never reaches this service. Resolvers for encrypted modules therefore don't exist server-side; missions referencing them should use the foreground runner which decrypts client-side. - `db/resolvers/types.ts` — ServerInputResolver contract - `db/resolvers/record-replay.ts` — single-record LWW replay (tighter WHERE than `missions-projection.ts`, used by all resolvers) - `db/resolvers/goals.ts` — reads `companionGoals` via replayRecord, mirrors the webapp's default goalsResolver output shape - `db/resolvers/index.ts` — registry with `registerServerResolver` / `unregisterServerResolver` / `resolveServerInputs`. Seeds `goals`. Drift-tolerant: missions pointing at unregistered modules silently skip those inputs. - `cron/tick.ts` — wires `resolveServerInputs(sql, m.inputs, m.userId)` into the planner input; updates the outdated "stubbed" comment 5 Bun tests over the registry (handled + unhandled + thrown + mixed cases + seeded default). Future: expand to plaintext tables if/when more land (habits without free-text, dashboard configs, tags), or introduce a decrypt-via-auth sidecar if users opt into server-side access to encrypted content. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 00:42:45 +02:00
Till JS	5e01763caa	feat(ai): close the loop — server write-back + webapp staging effect Completes the off-tab AI pipeline. mana-ai now writes produced plans back to `sync_changes` as a server-sourced Mission iteration; the webapp picks it up on next sync and translates each PlanStep into a local Proposal via the existing createProposal flow. User sees the resulting ghost cards in the matching module's AiProposalInbox with full mission attribution. Server (mana-ai v0.3): - `db/connection.ts` — `withUser(sql, userId, fn)` RLS-scoped tx helper mirroring the Go `withUser` pattern (SET LOCAL app.current_user_id) - `db/iteration-writer.ts` - `planToIteration(plan, id, now)` — shared-ai AiPlanOutput → inline MissionIteration with `source: 'server'` + status='awaiting-review' - `appendServerIteration(sql, input)` — INSERT sync_changes row with op=update, data={iterations: [...]} + field_timestamps + actor JSONB={kind:'system', source:'mission-runner'} - `cron/tick.ts` — after parse success: build iteration, append to mission.iterations, persist via appendServerIteration. Stats now include `plansWrittenBack`. 
Actor union: - `packages/shared-ai/src/actor.ts` + webapp actor: `system.source` gains `'mission-runner'` so the server's own writes are attributed correctly and distinguishable from projection/rule writes Webapp: - `data/ai/missions/server-iteration-staging.ts` - `startServerIterationStaging()` subscribes to aiMissions via Dexie liveQuery; on each Mission update, walks iterations looking for `source='server'` entries that haven't been staged yet - For each such iteration: creates a Proposal per PlanStep under `{kind:'ai', missionId, iterationId, rationale}` so policy + hooks fire correctly - Writes proposalIds back into plan[].proposalId + status='staged' so other tabs and app restarts skip re-staging - Idempotent: in-memory `processedIterations` Set + durable proposalId marker - Wired into (app)/+layout.svelte alongside startMissionTick - 3 unit tests: translate server iteration → proposal, skip already-staged, ignore browser iterations Full pipeline now: user creates Mission in /companion/missions → mana-ai tick picks it up → calls mana-llm → parses plan → writes iteration → synced to webapp → staging effect creates proposals → user approves in /todo (or any module) → task lands with `{actor: ai, missionId, iterationId, rationale}` attribution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 00:29:30 +02:00
Till JS	203fe3ef05	feat(mana-ai): wire shared-ai planner + real mana-llm calls (v0.2) Service now produces plans end-to-end for due missions. Takes the shared prompt/parser from @mana/shared-ai, calls mana-llm's OpenAI-compatible endpoint, parses + validates the response against a server-side tool allow-list. - `src/planner/tools.ts` — hardcoded subset of webapp tools where policy === 'propose'. Mirror of `DEFAULT_AI_POLICY` in the webapp; drift just means the server doesn't suggest newly-added tools (graceful degradation). Contract test between the two lists is a sensible follow-up. - `src/cron/tick.ts` - Iterates due missions, builds the shared Planner prompt per mission, parses the LLM response, logs the resulting plan - Per-mission try/catch so one flaky LLM response doesn't abort the queue; stats now track `plansProduced` + `parseFailures` - `serverMissionToSharedMission()` converts the projection shape to the shared-ai Mission type at the boundary - `resolvedInputs: []` today — the Planner sees concept + objective + iteration history only. Full resolvers (notes/kontext/goals via Postgres replay) land alongside write-back in the next PR. - No write-back yet: the plan is logged but not persisted to `sync_changes`. Write-back needs an RLS-scoped helper mirroring mana-sync's `withUser` pattern — tracked explicitly as the remaining open piece in CLAUDE.md. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 00:06:22 +02:00
Till JS	b9710e6c11	feat(mana-ai): scaffold server-side Mission Runner (v0.1) Background Hono/Bun service that scans mana_sync for due Missions and will plan them via mana-llm without requiring an open browser tab. Complements the foreground `startMissionTick` in the webapp. v0.1 scope — scaffold that's deployable, boots cleanly, and reads real data. Execution write-back is tracked as the next PR so we don't commit a half-baked proposal-sync design. Shipped: - Hono app on :3066 with `/health` + service-key-gated `/internal/tick` - `src/db/missions-projection.ts` — field-level LWW replay of `sync_changes` for appId='ai' / table='aiMissions' → live Mission records. Mirrors the webapp's `applyServerChanges` semantics against Postgres instead of Dexie. - `src/db/connection.ts` — bounded `postgres.js` pool (max 4, idle 30s) - `src/cron/tick.ts` — overlap-guarded scheduler, `runTickOnce()` also reachable via HTTP for CI/ops triggering - `src/planner/client.ts` — mana-llm HTTP client shape (OpenAI-compatible `/v1/chat/completions`) - `src/middleware/service-auth.ts` — X-Service-Key gate, no end-user JWTs reach this service - Dockerfile + graceful SIGTERM shutdown (stops timer + releases pool) Not yet implemented (documented in CLAUDE.md with design trade-offs): - Prompt/parser server-side copies — today they live in the webapp. Recommended next step: extract `@mana/shared-ai` package. - Input resolvers for notes / kontext / goals — need projections or a mana-sync internal endpoint - Plan → Mission-iteration write-back + how proposals get back to the user's device (leaning option (a): server writes iterations, the webapp's sync effect translates them into local Proposals) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 23:48:30 +02:00
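The field-level LWW replay mentioned above can be illustrated with a small in-memory fold. Everything about the row shape here is an assumption (`SyncChangeSketch`, numeric timestamps, ties going to the later row); delete tombstones are left out for brevity. The sketch only demonstrates the general last-writer-wins-per-field idea, not the actual `missions-projection.ts` code.

```typescript
// Hypothetical, simplified shape of a sync_changes row for one record.
interface SyncChangeSketch {
  data: Record<string, unknown>;
  fieldTimestamps: Record<string, number>; // per-field write time (assumption: numeric)
}

// Folds changes into one record; for each field the newest timestamp wins.
function replayRecordSketch(changes: SyncChangeSketch[]): Record<string, unknown> {
  const record: Record<string, unknown> = {};
  const stamps: Record<string, number> = {};

  for (const change of changes) {
    for (const [field, value] of Object.entries(change.data)) {
      const ts = change.fieldTimestamps[field] ?? 0;
      // Ties go to the later row in the list (assumption).
      if (!(field in stamps) || ts >= stamps[field]) {
        record[field] = value;
        stamps[field] = ts;
      }
    }
  }
  return record;
}

const mission = replayRecordSketch([
  { data: { title: "Weekly review", state: "active" }, fieldTimestamps: { title: 1, state: 1 } },
  { data: { title: "Weekly planning" }, fieldTimestamps: { title: 3 } },
  // Arrives late but carries an older timestamp, so it loses on `title`.
  { data: { title: "Stale title", state: "paused" }, fieldTimestamps: { title: 2, state: 2 } },
]);
console.log(mission);
```

Merging per field rather than per row is what lets two devices edit different fields of the same mission without one overwriting the other.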

20 commits