Commit graph

17 commits

Author SHA1 Message Date
Till JS
74bbfda212 feat(ai): Mission Grant consent UI + Workbench audit tab
Phase 3 — user-facing side of the Mission Key-Grant rollout. Users
can now opt into server-side execution, revoke it, and inspect every
decrypt the runner has performed.

Webapp:
- MissionGrantDialog explains the scope (record count, tables, TTL,
  audit visibility, revocation) and calls requestMissionGrant. Error
  paths render distinctly for the ZK, not-configured, and missing-vault
  cases.
- Mission detail shows a Server-Zugriff ("server access") box with a
  status pill (aktiv/abgelaufen/nicht erteilt, i.e. active/expired/not
  granted) plus Neu-erteilen (re-grant) and Zurückziehen (revoke)
  buttons. Only renders for missions with at least one encrypted-table
  input.
- store.ts: setMissionGrant / revokeMissionGrant helpers, Proxy-
  stripped like the rest of the store's writes.
- Workbench adds a Timeline/Datenzugriff ("data access") tab switch.
  The audit tab queries the new GET /api/v1/me/ai-audit endpoint and
  renders decrypt events with color-coded status pills
  (ok/failed/scope-violation) and stable reason strings.
- getManaAiUrl() added to api/config for the audit fetch.

mana-ai:
- GET /api/v1/me/ai-audit (JWT-gated via shared-hono authMiddleware)
  backed by readDecryptAudit() — withUser + RLS double-gate so a user
  can only read their own rows.
- Limit capped at 1000, newest-first.
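The limit handling could be sketched like this (the default of 100 and
the defensive parsing are assumptions for illustration; only the 1000
cap comes from this change):

```typescript
// Hypothetical sketch of the audit endpoint's limit clamping. Only the
// 1000 ceiling is from the commit; the default and parsing are assumed.
const AUDIT_LIMIT_MAX = 1000;
const AUDIT_LIMIT_DEFAULT = 100;

function clampAuditLimit(raw: string | undefined): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  if (!Number.isFinite(parsed) || parsed < 1) return AUDIT_LIMIT_DEFAULT;
  return Math.min(parsed, AUDIT_LIMIT_MAX);
}
```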

Missions without a grant continue to work exactly as before; the
grant UI is purely additive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:53:11 +02:00
Till JS
a6d51afbc9 feat(mana-ai): encrypted resolver + tick uses Mission Grant to decrypt scoped inputs
Phase 2 of Mission Key-Grant. The tick loop now honours a mission's
grant by unwrapping the MDK and passing it + the record allowlist into
the resolvers. Encrypted modules (notes, tasks, calendar, journal,
kontext) resolve server-side instead of returning null.

- crypto/decrypt-value.ts: mirror of webapp AES-GCM wire format
  (enc:1:<iv>.<ct>) — read-only, server never wraps
- db/resolvers/encrypted.ts: factory + 5 concrete resolvers. A scope
  violation bumps a metric and writes a structured audit row; decrypt
  failures do the same. Zero-decrypt (no grant, or record absent) =
  silent null, no audit noise.
- db/audit.ts: best-effort append to mana_ai.decrypt_audit; write
  failures never cascade into tick failures.
- cron/tick.ts: buildResolverContext unwraps grant per mission; MDK
  reference only lives for the scope of planOneMission.
- ResolverContext plumbed through resolveServerInputs; existing goals
  resolver unchanged semantically.
- Metrics: mana_ai_decrypts_total{table},
  mana_ai_grant_skips_total{reason},
  mana_ai_grant_scope_violations_total{table} (alert > 0).
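Decoding the wire format in crypto/decrypt-value.ts could look roughly
like this (a sketch: the enc:1:&lt;iv&gt;.&lt;ct&gt; framing is from this
change, but the base64url encoding and key plumbing are assumptions):

```typescript
import { webcrypto } from "node:crypto";

// Read-only AES-GCM unwrap of the enc:1:<iv>.<ct> wire format. The
// base64url encoding of the IV and ciphertext is an assumption for
// illustration; only the framing comes from the commit.
async function decryptValue(wire: string, key: CryptoKey): Promise<string> {
  if (!wire.startsWith("enc:1:")) throw new Error("not an encrypted value");
  const [ivB64, ctB64] = wire.slice("enc:1:".length).split(".");
  const iv = Buffer.from(ivB64, "base64url");
  const ct = Buffer.from(ctB64, "base64url");
  const plain = await webcrypto.subtle.decrypt({ name: "AES-GCM", iv }, key, ct);
  return new TextDecoder().decode(plain);
}
```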

Missions without a grant still run exactly as before — plaintext
resolvers fire, encrypted ones short-circuit to null. No behaviour
regression for existing users.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:42:31 +02:00
Till JS
9a3025fed8 feat(ai,auth): Mission Grant endpoint + unwrap helper + audit table
Phase 1 of the Mission Key-Grant rollout. Webapp can now request a
wrapped per-mission data key; mana-ai can unwrap and (Phase 2) use it.

mana-auth:
- POST /api/v1/me/ai-mission-grant — HKDF-derives MDK from the user
  master key, RSA-OAEP-2048-wraps with the mana-ai public key, returns
  { wrappedKey, derivation, issuedAt, expiresAt }
- MissionGrantService refuses zero-knowledge users (409 ZK_ACTIVE) and
  returns 503 GRANT_NOT_CONFIGURED when MANA_AI_PUBLIC_KEY_PEM is unset
- TTL clamped to [1h, 30d]

mana-ai:
- configureMissionGrantKey + unwrapMissionGrant with structured failure
  reasons (not-configured / expired / malformed / wrap-rejected)
- mana_ai.decrypt_audit table + RLS policy scoped to
  app.current_user_id — append-only row per server-side decrypt attempt
- MANA_AI_PRIVATE_KEY_PEM env slot; absent = grants silently disabled

No existing behaviour changes: missions without a grant run exactly as
before. Grant flow is wired end-to-end but unused until Phase 2 lands
the encrypted resolver.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:41:59 +02:00
Till JS
0bf01f434e feat(mana-ai): Prometheus /metrics endpoint + status.mana.how integration
Wires mana-ai into the existing observability stack so tick throughput,
plan-failure rates, planner latencies, and snapshot refresh health are
visible in Grafana + Prometheus, and the service's uptime surfaces on
status.mana.how under the "Internal" section.

- `src/metrics.ts` — prom-client Registry with `mana_ai_` prefix.
  Counters: ticks_total, plans_produced_total, plans_written_back_total,
  parse_failures_total, mission_errors_total, snapshots_new/updated,
  snapshot_rows_applied_total, http_requests_total.
  Histograms: tick_duration_seconds (0.1–120s),
  planner_request_duration_seconds (0.25–60s),
  http_request_duration_seconds (0.005–10s).
- `src/index.ts` — HTTP middleware labels every request by
  method/path/status; `/metrics` serves the Prometheus text format.
- `src/cron/tick.ts` — increments counters + wraps the tick with
  `tickDuration.startTimer()`. Snapshot stats fold through.
- `src/planner/client.ts` — wraps `complete()` in a latency histogram
  timer so planner tail latency shows up separately from tick duration.
- `docker/prometheus/prometheus.yml` —
  1. New `mana-ai` scrape job against `mana-ai:3066/metrics` (30s).
  2. `/health` added to the `blackbox-internal` job so uptime shows on
     status.mana.how alongside mana-geocoding.
- `scripts/generate-status-page.sh` — friendly label for the new probe:
  `mana-ai:3066/health` → "Mana AI Runner" (generator already iterates
  `blackbox-internal`, no other changes needed).
- `package.json` — prom-client ^15.1.3
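The two prometheus.yml additions might look roughly like this (a
sketch: the job names, the 30s interval, and the targets come from this
change; surrounding fields such as the blackbox module and any
relabel_configs are assumed or elided):

```yaml
scrape_configs:
  - job_name: mana-ai
    scrape_interval: 30s
    metrics_path: /metrics
    static_configs:
      - targets: ["mana-ai:3066"]

  - job_name: blackbox-internal
    metrics_path: /probe
    params:
      module: [http_2xx]          # assumed module name
    static_configs:
      - targets: ["mana-ai:3066/health"]   # surfaces on status.mana.how
```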

All 17 Bun tests still pass; tsc clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:41:40 +02:00
Till JS
5ca5976fad docs(ai): materialized snapshot shipped, roadmap functionally complete
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:29:31 +02:00
Till JS
8fd9b7da79 perf(mana-ai): materialize mission snapshots, drop per-tick full replay
Replaces the O(N sync_changes) LWW replay in every tick with an
incremental snapshot table refresh. Each tick now applies only the
delta since the last run, then runs a single indexed SELECT on the
snapshot table to find due missions.

- `db/migrate.ts` — idempotent migration. Creates `mana_ai` schema and
  `mana_ai.mission_snapshots` table on boot. Partial index on
  active+nextRunAt powers the tick's "due" query.
- `db/snapshot-refresh.ts`
  - `refreshSnapshots(sql)` one-pass: joins sync_changes and snapshots
    on (user_id, mission_id), picks out pairs whose source max
    created_at exceeds the snapshot cursor. Per-pair refresh wrapped
    in `withUser` for RLS scoping on the source SELECT.
  - Bootstrap: missing snapshot rows seed from a full replay of their
    mission's history; subsequent ticks apply only the delta.
  - Delete tombstones purge the snapshot row.
- `db/missions-projection.ts` `listDueMissions` — single SELECT against
  `mana_ai.mission_snapshots` with an indexed WHERE. Dropped the legacy
  cross-user scan + per-user two-phase read (unused now). `mergeAndFilter`
  stays for its existing test coverage.
- `cron/tick.ts` calls `refreshSnapshots` before `listDueMissions` and
  logs when the refresh actually applied rows. No behaviour change
  externally.
- `index.ts` awaits `migrate()` on boot (top-level `await` — Bun
  supports it natively).

Closes the last item on the AI-Workbench roadmap's "future work" list.
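The cursor comparison at the heart of refreshSnapshots can be sketched
in memory like this (shapes are illustrative, not the service's actual
row types):

```typescript
// A (user_id, mission_id) pair is refreshed only when its newest
// sync_changes row is newer than the snapshot's cursor; pairs with no
// snapshot row qualify too, since they bootstrap from a full replay.
type ChangeRow = { userId: string; missionId: string; createdAt: number };
type SnapshotRow = { userId: string; missionId: string; cursor: number };

function pairsNeedingRefresh(changes: ChangeRow[], snapshots: SnapshotRow[]): string[] {
  const cursors = new Map(
    snapshots.map((s) => [`${s.userId}:${s.missionId}`, s.cursor] as [string, number]),
  );
  const newest = new Map<string, number>();
  for (const c of changes) {
    const key = `${c.userId}:${c.missionId}`;
    newest.set(key, Math.max(newest.get(key) ?? 0, c.createdAt));
  }
  return [...newest.entries()]
    .filter(([key, max]) => max > (cursors.get(key) ?? -1))
    .map(([key]) => key);
}
```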

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:28:24 +02:00
Till JS
a047f6cb7c docs(ai): Revert-per-iteration shipped in Workbench
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:19:16 +02:00
Till JS
9bc77dd3b9 docs(mana-ai): contract test + RLS scoping shipped; narrow remaining work
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:07:10 +02:00
Till JS
ad1659f036 refactor(mana-ai): RLS-scope mission reads via per-user two-phase query
Closes the "cross-user scan" caveat on the mission read path. The
earlier implementation pulled every aiMissions row server-wide and
partitioned by user_id in memory — fine for a pre-launch single-user
deploy, not for cross-user infrastructure.

New flow:

  1. `listMissionUsers(sql)` — one cross-user DISTINCT query. This is
     the ONLY surface that still reads across users; documented as
     requiring BYPASSRLS on the service's DB role (or ownership without
     FORCE).
  2. `listDueMissionsForUser(sql, userId, now)` — RLS-scoped via
     `withUser(sql, userId, tx => ...)` just like the write path in
     `iteration-writer.ts`. Defense-in-depth: even if the SELECT
     misfilters, RLS drops any row whose user_id doesn't match the
     session setting.
  3. `listDueMissions(sql, now)` — two-phase composition of the above.

The LWW merge + due-filter logic is factored out into a pure
`mergeAndFilter(rows, userId, now)`. Fully unit-tested (6 Bun cases):
active-due happy-path, future nextRunAt, non-active state, delete
tombstone, multi-row LWW merge, userId stamping.
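A pure mergeAndFilter could look roughly like this (row-level LWW for
brevity; the real merge is field-level via field_timestamps, and the
shapes here are assumptions):

```typescript
type ChangeRow = {
  missionId: string;
  createdAt: number;
  op: "update" | "delete";
  data: Record<string, unknown>;
};

// Replay a user's change rows in order, drop tombstoned missions, keep
// active ones whose nextRunAt has passed, and stamp userId on the result.
function mergeAndFilter(rows: ChangeRow[], userId: string, now: number) {
  const merged = new Map<string, { deleted: boolean; data: Record<string, unknown> }>();
  for (const row of [...rows].sort((a, b) => a.createdAt - b.createdAt)) {
    if (row.op === "delete") {
      merged.set(row.missionId, { deleted: true, data: {} });
      continue;
    }
    const prev = merged.get(row.missionId);
    merged.set(row.missionId, {
      deleted: false,
      data: { ...(prev?.deleted ? {} : prev?.data), ...row.data },
    });
  }
  return [...merged.entries()]
    .filter(
      ([, m]) =>
        !m.deleted && m.data.state === "active" && (m.data.nextRunAt as number) <= now,
    )
    .map(([id, m]) => ({ id, userId, ...m.data }));
}
```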

Matches the pattern already in use for writes (`db/connection.ts:withUser`
+ `db/iteration-writer.ts`). Docstring on `listMissionUsers` spells out
the remaining BYPASSRLS dependency so ops knows what role the service
needs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:06:17 +02:00
Till JS
4be5e29bd3 feat(shared-ai): canonical proposable-tool list + drift guard on mana-ai
Makes it impossible for the webapp's AI policy and the server's tool
allow-list to drift unnoticed. Adds the missing entries the guard caught
on first run: `complete_tasks_by_title`, `visit_place`, `undo_drink` now
have parameter schemas server-side too.

- `packages/shared-ai/src/policy/proposable-tools.ts`
  - `AI_PROPOSABLE_TOOL_NAMES` as `const` array + literal union type
  - `AI_PROPOSABLE_TOOL_SET` for set-membership checks
- Webapp `DEFAULT_AI_POLICY` derives its `propose` entries from the
  shared list via `Object.fromEntries(...)` — adding a tool there is now
  a one-line change in `@mana/shared-ai`
- mana-ai `AI_AVAILABLE_TOOLS`: module-load assertion compares its
  hardcoded names against `AI_PROPOSABLE_TOOL_SET` and throws with a
  pointed error on drift (extras in one direction, missing in the
  other). Service refuses to start on mismatch — better than silent
  degradation.
- Bun test (`tools.test.ts`) runs the same contract plus sanity checks
  (non-empty description, required params carry docs). Vitest policy
  test adds the symmetric check on the webapp side.
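The module-load guard amounts to a set comparison in both directions
(sketch only: `create_task` is an invented example name, and the real
`AI_PROPOSABLE_TOOL_SET` comes from `@mana/shared-ai` rather than being
inlined):

```typescript
// Stand-in for the canonical set exported by @mana/shared-ai.
const AI_PROPOSABLE_TOOL_SET = new Set([
  "create_task", // illustrative name, not from the commit
  "complete_tasks_by_title",
  "visit_place",
  "undo_drink",
]);

// Throws at module load on any drift, so the service refuses to start
// instead of silently degrading.
function assertNoToolDrift(serverToolNames: string[]): void {
  const server = new Set(serverToolNames);
  const extra = serverToolNames.filter((n) => !AI_PROPOSABLE_TOOL_SET.has(n));
  const missing = [...AI_PROPOSABLE_TOOL_SET].filter((n) => !server.has(n));
  if (extra.length || missing.length) {
    throw new Error(
      `Tool drift vs @mana/shared-ai: extra on server [${extra}], missing on server [${missing}]`,
    );
  }
}
```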

All three runtimes now green: webapp 66/66, shared-ai 2/2,
mana-ai 9/9 Bun tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:52:38 +02:00
Till JS
dccd9c5c4e docs(mana-ai): server-side resolvers shipped; document plaintext-only scope
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:45:39 +02:00
Till JS
a8425941fb feat(mana-ai): server-side input resolvers (goals for now)
Plugs plaintext-safe Mission context into the Planner prompt per tick.
Before this, `resolvedInputs: []` was always passed — the LLM only saw
the mission's concept + objective. Now goals (the only plaintext
category of linked inputs today) resolve and land in the prompt.

Privacy constraint is explicit and documented: tables in the webapp's
encryption registry (notes, kontext, journal, dreams, …) arrive at
`sync_changes.data` as ciphertext — the master key lives KEK-wrapped in
mana-auth and never reaches this service. Resolvers for encrypted
modules therefore don't exist server-side; missions referencing them
should use the foreground runner, which decrypts client-side.

- `db/resolvers/types.ts` — ServerInputResolver contract
- `db/resolvers/record-replay.ts` — single-record LWW replay
  (tighter WHERE than `missions-projection.ts`, used by all resolvers)
- `db/resolvers/goals.ts` — reads `companionGoals` via replayRecord,
  mirrors the webapp's default goalsResolver output shape
- `db/resolvers/index.ts` — registry with `registerServerResolver` /
  `unregisterServerResolver` / `resolveServerInputs`. Seeds `goals`.
  Drift-tolerant: missions pointing at unregistered modules silently
  skip those inputs.
- `cron/tick.ts` — wires `resolveServerInputs(sql, m.inputs, m.userId)`
  into the planner input; updates the outdated "stubbed" comment

5 Bun tests over the registry (handled + unhandled + thrown +
mixed cases + seeded default).

Future: expand to plaintext tables if/when more land (habits without
free-text, dashboard configs, tags), or introduce a decrypt-via-auth
sidecar if users opt into server-side access to encrypted content.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:42:45 +02:00
Till JS
39b24b2c68 docs(ai): mark Step 9 complete — close-the-loop shipped in v0.3
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:30:31 +02:00
Till JS
5e01763caa feat(ai): close the loop — server write-back + webapp staging effect
Completes the off-tab AI pipeline. mana-ai now writes produced plans
back to `sync_changes` as a server-sourced Mission iteration; the webapp
picks it up on next sync and translates each PlanStep into a local
Proposal via the existing createProposal flow. User sees the resulting
ghost cards in the matching module's AiProposalInbox with full mission
attribution.

Server (mana-ai v0.3):
- `db/connection.ts` — `withUser(sql, userId, fn)` RLS-scoped tx helper
  mirroring the Go `withUser` pattern (SET LOCAL app.current_user_id)
- `db/iteration-writer.ts`
  - `planToIteration(plan, id, now)` — shared-ai AiPlanOutput → inline
    MissionIteration with `source: 'server'` + status='awaiting-review'
  - `appendServerIteration(sql, input)` — INSERT sync_changes row with
    op=update, data={iterations: [...]} + field_timestamps + actor
    JSONB={kind:'system', source:'mission-runner'}
- `cron/tick.ts` — after parse success: build iteration, append to
  mission.iterations, persist via appendServerIteration. Stats now
  include `plansWrittenBack`.

Actor union:
- `packages/shared-ai/src/actor.ts` + webapp actor: `system.source` gains
  `'mission-runner'` so the server's own writes are attributed correctly
  and distinguishable from projection/rule writes

Webapp:
- `data/ai/missions/server-iteration-staging.ts`
  - `startServerIterationStaging()` subscribes to aiMissions via Dexie
    liveQuery; on each Mission update, walks iterations looking for
    `source='server'` entries that haven't been staged yet
  - For each such iteration: creates a Proposal per PlanStep under
    `{kind:'ai', missionId, iterationId, rationale}` so policy + hooks
    fire correctly
  - Writes proposalIds back into plan[].proposalId + status='staged' so
    other tabs and app restarts skip re-staging
  - Idempotent: in-memory `processedIterations` Set + durable
    proposalId marker
- Wired into (app)/+layout.svelte alongside startMissionTick
- 3 unit tests: translate server iteration → proposal, skip
  already-staged, ignore browser iterations
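The double idempotency guard can be sketched as a pure predicate (shapes
are assumptions; the staging effect itself also creates the proposals
and writes the markers back):

```typescript
// An in-memory Set covers this session; the durable proposalId marker on
// each plan step covers restarts and other tabs.
type PlanStep = { tool: string; proposalId?: string };
type Iteration = { id: string; source: "server" | "browser"; plan: PlanStep[] };

const processedIterations = new Set<string>();

function shouldStage(it: Iteration): boolean {
  if (it.source !== "server") return false;             // browser iterations are ignored
  if (processedIterations.has(it.id)) return false;     // handled this session
  if (it.plan.every((s) => s.proposalId)) return false; // durably staged earlier
  return true;
}
```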

Full pipeline now: user creates Mission in /companion/missions →
mana-ai tick picks it up → calls mana-llm → parses plan →
writes iteration → synced to webapp → staging effect creates
proposals → user approves in /todo (or any module) → task lands with
`{actor: ai, missionId, iterationId, rationale}` attribution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:29:30 +02:00
Till JS
7e17142bb3 docs(mana-ai): bump status to v0.2 — plans end-to-end, write-back open
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:07:01 +02:00
Till JS
203fe3ef05 feat(mana-ai): wire shared-ai planner + real mana-llm calls (v0.2)
Service now produces plans end-to-end for due missions. Takes the
shared prompt/parser from @mana/shared-ai, calls mana-llm's
OpenAI-compatible endpoint, parses + validates the response against a
server-side tool allow-list.

- `src/planner/tools.ts` — hardcoded subset of webapp tools where
  policy === 'propose'. Mirror of `DEFAULT_AI_POLICY` in the webapp;
  drift just means the server doesn't suggest newly-added tools
  (graceful degradation). Contract test between the two lists is a
  sensible follow-up.
- `src/cron/tick.ts`
  - Iterates due missions, builds the shared Planner prompt per mission,
    parses the LLM response, logs the resulting plan
  - Per-mission try/catch so one flaky LLM response doesn't abort the
    queue; stats now track `plansProduced` + `parseFailures`
  - `serverMissionToSharedMission()` converts the projection shape to
    the shared-ai Mission type at the boundary
- `resolvedInputs: []` today — the Planner sees concept + objective +
  iteration history only. Full resolvers (notes/kontext/goals via
  Postgres replay) land alongside write-back in the next PR.
- No write-back yet: the plan is logged but not persisted to
  `sync_changes`. Write-back needs an RLS-scoped helper mirroring
  mana-sync's `withUser` pattern — tracked explicitly as the remaining
  open piece in CLAUDE.md.
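The per-mission isolation in the tick loop amounts to this (planMission
is a stand-in for the real prompt-build/LLM-call/parse chain):

```typescript
// One flaky LLM response or parse failure bumps a counter and moves on
// instead of aborting the whole queue of due missions.
type TickStats = { plansProduced: number; parseFailures: number };

async function planAll(
  missionIds: string[],
  planMission: (id: string) => Promise<unknown>,
): Promise<TickStats> {
  const stats: TickStats = { plansProduced: 0, parseFailures: 0 };
  for (const id of missionIds) {
    try {
      await planMission(id);
      stats.plansProduced++;
    } catch {
      stats.parseFailures++; // logged in the real service; queue continues
    }
  }
  return stats;
}
```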

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:06:22 +02:00
Till JS
b9710e6c11 feat(mana-ai): scaffold server-side Mission Runner (v0.1)
Background Hono/Bun service that scans mana_sync for due Missions and
will plan them via mana-llm without requiring an open browser tab.
Complements the foreground `startMissionTick` in the webapp.

v0.1 scope — scaffold that's deployable, boots cleanly, and reads real
data. Execution write-back is tracked as the next PR so we don't commit
a half-baked proposal-sync design.

Shipped:
- Hono app on :3066 with `/health` + service-key-gated `/internal/tick`
- `src/db/missions-projection.ts` — field-level LWW replay of
  `sync_changes` for appId='ai' / table='aiMissions' → live Mission
  records. Mirrors the webapp's `applyServerChanges` semantics against
  Postgres instead of Dexie.
- `src/db/connection.ts` — bounded `postgres.js` pool (max 4, idle 30s)
- `src/cron/tick.ts` — overlap-guarded scheduler, `runTickOnce()` also
  reachable via HTTP for CI/ops triggering
- `src/planner/client.ts` — mana-llm HTTP client shape
  (OpenAI-compatible `/v1/chat/completions`)
- `src/middleware/service-auth.ts` — X-Service-Key gate, no end-user JWTs
  reach this service
- Dockerfile + graceful SIGTERM shutdown (stops timer + releases pool)
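The overlap guard on the scheduler can be sketched as a simple in-flight
flag (the run body here is a placeholder for the real tick):

```typescript
// If a tick is still running when the timer (or the HTTP trigger) fires
// again, the new invocation is skipped rather than stacked.
let tickRunning = false;

async function guardedTick(run: () => Promise<void>): Promise<boolean> {
  if (tickRunning) return false; // overlap: skip this invocation
  tickRunning = true;
  try {
    await run();
    return true;
  } finally {
    tickRunning = false;
  }
}
```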

Not yet implemented (documented in CLAUDE.md with design trade-offs):
- Prompt/parser server-side copies — today they live in the webapp.
  Recommended next step: extract `@mana/shared-ai` package.
- Input resolvers for notes / kontext / goals — need projections or a
  mana-sync internal endpoint
- Plan → Mission-iteration write-back + how proposals get back to the
  user's device (leaning option (a): server writes iterations, the
  webapp's sync effect translates them into local Proposals)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:48:30 +02:00