Wire the Mission Key-Grant feature into the production Mac Mini
compose stack so mana-ai can boot and mana-auth can mint grants.
- New mana-ai service block (port 3066) — 256m mem limit, depends on
postgres + mana-llm, tick interval and enable flag configurable via
MANA_AI_TICK_INTERVAL_MS / MANA_AI_TICK_ENABLED. Pulls
MANA_AI_PRIVATE_KEY_PEM from env; absent = grants silently disabled
(key gating sketched below).
- mana-auth environment gains MANA_AI_PUBLIC_KEY_PEM (default empty
so existing deployments without the keypair degrade to 503
GRANT_NOT_CONFIGURED rather than failing to boot).
- mana-auth Dockerfile rewritten to the two-stage pnpm+bun pattern
used by mana-credits/mana-events — required now that mana-auth has
a @mana/shared-ai workspace dep. The previous single-stage
Dockerfile with a service-scoped build context couldn't resolve any
@mana/* imports; it only worked historically because the imports were
satisfied at runtime by a pre-built layer.
- mana-ai Dockerfile copies packages/shared-ai into the installer
stage alongside shared-hono.
The build context for mana-auth flips from services/mana-auth to the
repo root. Existing CI/CD paths (scripts/mac-mini/build-app.sh) pass
through to docker compose build and pick up the new context
automatically — no script edits needed.
Flip-on procedure: on the Mac Mini, set MANA_AI_PUBLIC_KEY_PEM +
MANA_AI_PRIVATE_KEY_PEM in .env (already done, see
secrets/mana-ai/README.md on the host), then rebuild mana-auth +
build mana-ai.
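A rough sketch of the key gating described above — the function name and module are illustrative, not the actual mana-ai code:

```ts
// Hypothetical sketch — the real key loading lives wherever mana-ai sets up its crypto.
import { createPrivateKey, type KeyObject } from "node:crypto";

export function loadMissionGrantKey(): KeyObject | null {
  const pem = process.env.MANA_AI_PRIVATE_KEY_PEM;
  if (!pem?.trim()) {
    // Absent key: grants are silently disabled; the rest of the service boots normally.
    console.warn("MANA_AI_PRIVATE_KEY_PEM not set — mission grants disabled");
    return null;
  }
  return createPrivateKey(pem);
}
```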
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 3 — user-facing side of the Mission Key-Grant rollout. Users
can now opt into server-side execution, revoke it, and inspect every
decrypt the runner has performed.
Webapp:
- MissionGrantDialog explains the scope (record count, tables, TTL,
audit visibility, revocation) and calls requestMissionGrant. Error
paths render distinctly for the ZK-active, not-configured, and
missing-vault cases.
- Mission detail shows a Server-Zugriff (server access) box with a status
pill (aktiv/abgelaufen/nicht erteilt — active/expired/not granted) and
Neu-erteilen (re-grant) + Zurückziehen (revoke) buttons. Only renders
for missions with at least one encrypted-table input.
- store.ts: setMissionGrant / revokeMissionGrant helpers, Proxy-
stripped like the rest of the store's writes.
- Workbench adds a Timeline/Datenzugriff tab switch. Audit tab queries
the new GET /api/v1/me/ai-audit endpoint, renders decrypt events
with color-coded status pills (ok/failed/scope-violation) and
stable reason strings.
- getManaAiUrl() added to api/config for the audit fetch.
mana-ai:
- GET /api/v1/me/ai-audit (JWT-gated via shared-hono authMiddleware)
backed by readDecryptAudit() — withUser + RLS double-gate so a user
can only read their own rows.
- Limit capped at 1000, newest-first.
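A minimal sketch of the route shape (helper names follow the commit text; import paths, the limit default, and the exact row shape are assumptions):

```ts
import { Hono } from "hono";
import { sql, withUser } from "../db/connection";   // hypothetical exports

const app = new Hono<{ Variables: { userId: string } }>();

app.get("/api/v1/me/ai-audit", async (c) => {
  const userId = c.get("userId");                                  // set by the shared-hono authMiddleware
  const limit = Math.min(Number(c.req.query("limit") ?? 100) || 100, 1000);

  // withUser runs the query under SET LOCAL app.current_user_id, so the explicit
  // WHERE and the RLS policy on mana_ai.decrypt_audit double-gate the read.
  const rows = await withUser(sql, userId, (tx) =>
    tx`SELECT * FROM mana_ai.decrypt_audit
       WHERE user_id = ${userId}
       ORDER BY created_at DESC
       LIMIT ${limit}`
  );
  return c.json({ events: rows });
});

export default app;
```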
Missions without a grant continue to work exactly as before; the
grant UI is purely additive.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 2 of Mission Key-Grant. The tick loop now honours a mission's
grant by unwrapping the MDK and passing it + the record allowlist into
the resolvers. Encrypted modules (notes, tasks, calendar, journal,
kontext) resolve server-side instead of returning null.
- crypto/decrypt-value.ts: read-only mirror of the webapp's AES-GCM wire
format (enc:1:<iv>.<ct>) — the server only decrypts, never encrypts
(sketched after this list)
- db/resolvers/encrypted.ts: factory + 5 concrete resolvers. A scope
violation bumps a metric + writes a structured audit row; decrypt
failures do the same. Zero-decrypt (no grant, or record absent) = silent
null, no audit noise.
- db/audit.ts: best-effort append to mana_ai.decrypt_audit; write
failures never cascade into tick failures.
- cron/tick.ts: buildResolverContext unwraps grant per mission; MDK
reference only lives for the scope of planOneMission.
- ResolverContext plumbed through resolveServerInputs; existing goals
resolver unchanged semantically.
- Metrics: mana_ai_decrypts_total{table}, mana_ai_grant_skips_total{reason},
mana_ai_grant_scope_violations_total{table} (alert when > 0).
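Sketch of the decrypt path for the enc:1:<iv>.<ct> wire format — base64 field encoding and the CryptoKey handling are assumptions, the real decrypt-value.ts may differ in detail:

```ts
// Read-only decrypt; base64 encoding of the iv/ct fields is an assumption here.
const WIRE_PREFIX = "enc:1:";

export async function decryptValue(wire: string, mdk: CryptoKey): Promise<string | null> {
  if (!wire.startsWith(WIRE_PREFIX)) return null;        // plaintext — not ours to handle
  const [ivB64, ctB64] = wire.slice(WIRE_PREFIX.length).split(".");
  if (!ivB64 || !ctB64) return null;                      // malformed → surfaces as a decrypt failure upstream

  const iv = Uint8Array.from(atob(ivB64), (ch) => ch.charCodeAt(0));
  const ct = Uint8Array.from(atob(ctB64), (ch) => ch.charCodeAt(0));
  const plain = await crypto.subtle.decrypt({ name: "AES-GCM", iv }, mdk, ct);
  return new TextDecoder().decode(plain);
}
```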
Missions without a grant still run exactly as before — plaintext
resolvers fire, encrypted ones short-circuit to null. No behaviour
regression for existing users.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 1 of the Mission Key-Grant rollout. The webapp can now request a
wrapped per-mission data key; mana-ai can unwrap it and (in Phase 2) use it.
mana-auth:
- POST /api/v1/me/ai-mission-grant — HKDF-derives the MDK from the user
master key, RSA-OAEP-2048-wraps it with the mana-ai public key, returns
{ wrappedKey, derivation, issuedAt, expiresAt } (sketched after this list)
- MissionGrantService refuses zero-knowledge users (409 ZK_ACTIVE) and
returns 503 GRANT_NOT_CONFIGURED when MANA_AI_PUBLIC_KEY_PEM is unset
- TTL clamped to [1h, 30d]
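A minimal sketch of the derive-and-wrap step, assuming a 32-byte MDK, SHA-256 HKDF, an empty salt, and a mission-scoped info string — the service's actual derivation parameters (and how the unwrapped master key is obtained) are not shown here:

```ts
import { constants, hkdfSync, publicEncrypt, type KeyObject } from "node:crypto";

export function mintMissionGrant(masterKey: Buffer, missionId: string, aiPublicKey: KeyObject, ttlMs: number) {
  // Derivation parameters below are illustrative assumptions.
  const info = Buffer.from(`mission-grant:${missionId}`);
  const mdk = Buffer.from(hkdfSync("sha256", masterKey, Buffer.alloc(0), info, 32));

  const wrappedKey = publicEncrypt(
    { key: aiPublicKey, padding: constants.RSA_PKCS1_OAEP_PADDING, oaepHash: "sha256" },
    mdk
  ).toString("base64");

  const issuedAt = Date.now();
  const ttl = Math.min(Math.max(ttlMs, 3_600_000), 30 * 24 * 3_600_000);   // clamp to [1h, 30d]
  return { wrappedKey, derivation: { info: info.toString() }, issuedAt, expiresAt: issuedAt + ttl };
}
```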
mana-ai:
- configureMissionGrantKey + unwrapMissionGrant with structured failure
reasons (not-configured / expired / malformed / wrap-rejected)
- mana_ai.decrypt_audit table + RLS policy scoped to
app.current_user_id — append-only row per server-side decrypt attempt
- MANA_AI_PRIVATE_KEY_PEM env slot; absent = grants silently disabled
No existing behaviour changes: missions without a grant run exactly as
before. Grant flow is wired end-to-end but unused until Phase 2 lands
the encrypted resolver.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wires mana-ai into the existing observability stack so tick throughput,
plan-failure rates, planner latencies, and snapshot refresh health are
visible in Grafana + Prometheus, and the service's uptime surfaces on
status.mana.how under the "Internal" section.
- `src/metrics.ts` — prom-client Registry with `mana_ai_` prefix.
Counters: ticks_total, plans_produced_total, plans_written_back_total,
parse_failures_total, mission_errors_total, snapshots_new/updated,
snapshot_rows_applied_total, http_requests_total.
Histograms: tick_duration_seconds (0.1–120s),
planner_request_duration_seconds (0.25–60s),
http_request_duration_seconds (0.005–10s).
- `src/index.ts` — HTTP middleware labels every request by
method/path/status; `/metrics` serves the Prometheus text format.
- `src/cron/tick.ts` — increments counters + wraps the tick with
`tickDuration.startTimer()`. Snapshot stats fold through.
- `src/planner/client.ts` — wraps `complete()` in a latency histogram
timer so planner tail latency shows up separately from tick duration.
- `docker/prometheus/prometheus.yml` —
1. New `mana-ai` scrape job against `mana-ai:3066/metrics` (30s).
2. `/health` added to the `blackbox-internal` job so uptime shows on
status.mana.how alongside mana-geocoding.
- `scripts/generate-status-page.sh` — friendly label for the new probe:
`mana-ai:3066/health` → "Mana AI Runner" (generator already iterates
`blackbox-internal`, no other changes needed).
- `package.json` — prom-client ^15.1.3
All 17 Bun tests still pass; tsc clean.
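Sketch of the prom-client wiring (only two of the listed metrics shown, buckets abridged; the usage comments are illustrative, not the actual tick code):

```ts
import { Registry, Counter, Histogram } from "prom-client";

export const registry = new Registry();

export const ticksTotal = new Counter({
  name: "mana_ai_ticks_total",
  help: "Tick loop iterations",
  registers: [registry],
});

export const tickDuration = new Histogram({
  name: "mana_ai_tick_duration_seconds",
  help: "Wall-clock duration of one tick",
  buckets: [0.1, 1, 5, 15, 60, 120],
  registers: [registry],
});

// Rough usage in cron/tick.ts and the /metrics route:
//   const stop = tickDuration.startTimer();
//   try { await runTickOnce(); ticksTotal.inc(); } finally { stop(); }
//   app.get("/metrics", async (c) => c.text(await registry.metrics()));
```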
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the O(N sync_changes) LWW replay in every tick with an
incremental snapshot table refresh. Each tick now applies only the
delta since the last run, then runs a single indexed SELECT on the
snapshot table to find due missions.
- `db/migrate.ts` — idempotent migration. Creates the `mana_ai` schema and
`mana_ai.mission_snapshots` table on boot. Partial index on
active+nextRunAt powers the tick's "due" query (sketched below).
- `db/snapshot-refresh.ts`
- `refreshSnapshots(sql)` one-pass: joins sync_changes and snapshots
on (user_id, mission_id), picks out pairs whose source max
created_at exceeds the snapshot cursor. Per-pair refresh wrapped
in `withUser` for RLS scoping on the source SELECT.
- Bootstrap: missing snapshot rows seed from a full replay of their
mission's history; subsequent ticks apply only the delta.
- Delete tombstones purge the snapshot row.
- `db/missions-projection.ts` `listDueMissions` — single SELECT against
`mana_ai.mission_snapshots` with an indexed WHERE. Dropped the legacy
cross-user scan + per-user two-phase read (unused now). `mergeAndFilter`
stays for its existing test coverage.
- `cron/tick.ts` calls `refreshSnapshots` before `listDueMissions` and
logs when the refresh actually applied rows. No behaviour change
externally.
- `index.ts` awaits `migrate()` on boot (top-level `await` — Bun
supports it natively).
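Rough sketch of the migration DDL and the due query it powers — snapshot column names other than the active/next_run_at pair are illustrative guesses:

```ts
import postgres from "postgres";

export async function migrate(sql: postgres.Sql) {
  await sql`CREATE SCHEMA IF NOT EXISTS mana_ai`;
  await sql`CREATE TABLE IF NOT EXISTS mana_ai.mission_snapshots (
    user_id     text NOT NULL,
    mission_id  text NOT NULL,
    data        jsonb NOT NULL,
    active      boolean NOT NULL DEFAULT false,
    next_run_at timestamptz,
    cursor_at   timestamptz,
    PRIMARY KEY (user_id, mission_id)
  )`;
  // Partial index: only active rows are candidates for the tick's "due" query.
  await sql`CREATE INDEX IF NOT EXISTS mission_snapshots_due_idx
    ON mana_ai.mission_snapshots (next_run_at) WHERE active`;
}

export function listDueMissions(sql: postgres.Sql, now: Date) {
  return sql`SELECT * FROM mana_ai.mission_snapshots
             WHERE active AND next_run_at <= ${now}`;
}
```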
Closes the last item on the AI-Workbench roadmap's "future work" list.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes the "cross-user scan" caveat on the mission read path. The
earlier implementation pulled every aiMissions row server-wide and
partitioned by user_id in memory — fine for a pre-launch single-user
deploy, not for cross-user infrastructure.
New flow:
1. `listMissionUsers(sql)` — one cross-user DISTINCT query. This is
the ONLY surface that still reads across users; documented as
requiring BYPASSRLS on the service's DB role (or table ownership
without FORCE ROW LEVEL SECURITY).
2. `listDueMissionsForUser(sql, userId, now)` — RLS-scoped via
`withUser(sql, userId, tx => ...)` just like the write path in
`iteration-writer.ts`. Defense-in-depth: even if the SELECT mis-
filters, RLS drops any row whose user_id doesn't match the session
setting.
3. `listDueMissions(sql, now)` — two-phase composition of the above.
The LWW merge + due-filter logic is factored out into a pure
`mergeAndFilter(rows, userId, now)`. Fully unit-tested (6 Bun cases):
active-due happy-path, future nextRunAt, non-active state, delete
tombstone, multi-row LWW merge, userId stamping.
Matches the pattern already in use for writes (`db/connection.ts:withUser`
+ `db/iteration-writer.ts`). Docstring on `listMissionUsers` spells out
the remaining BYPASSRLS dependency so ops knows what role the service
needs.
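Sketch of the RLS-scoped helper and a per-user read built on it; exact signatures and the sync_changes column names are assumptions, the real code lives in db/connection.ts and db/missions-projection.ts per the notes above:

```ts
import postgres from "postgres";

export function withUser<T>(
  sql: postgres.Sql,
  userId: string,
  fn: (tx: postgres.TransactionSql) => Promise<T>
) {
  return sql.begin(async (tx) => {
    // RLS policies compare rows against this per-transaction setting.
    await tx`SELECT set_config('app.current_user_id', ${userId}, true)`;
    return fn(tx);
  });
}

// Phase 2 of the read path: RLS-scoped fetch of one user's aiMissions history,
// which then feeds the pure mergeAndFilter(rows, userId, now).
export function listMissionRowsForUser(sql: postgres.Sql, userId: string) {
  return withUser(sql, userId, (tx) =>
    tx`SELECT * FROM sync_changes
       WHERE user_id = ${userId} AND table_name = 'aiMissions'
       ORDER BY created_at ASC`
  );
}
```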
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Makes it physically impossible for the webapp's AI policy and the server's
tool allow-list to drift. Adds the missing entries the guard caught on first
run: `complete_tasks_by_title`, `visit_place`, `undo_drink` now have
parameter schemas server-side too.
- `packages/shared-ai/src/policy/proposable-tools.ts`
- `AI_PROPOSABLE_TOOL_NAMES` as `const` array + literal union type
- `AI_PROPOSABLE_TOOL_SET` for set-membership checks
- Webapp `DEFAULT_AI_POLICY` derives its `propose` entries from the
shared list via `Object.fromEntries(...)` — adding a tool there is now
a one-line change in `@mana/shared-ai`
- mana-ai `AI_AVAILABLE_TOOLS`: module-load assertion compares its
hardcoded names against `AI_PROPOSABLE_TOOL_SET` and throws with a
pointed error on drift (extras in one direction, missing in the
other). Service refuses to start on mismatch — better than silent
degradation.
- Bun test (`tools.test.ts`) runs the same contract plus sanity checks
(non-empty description, required params carry docs). Vitest policy
test adds the symmetric check on the webapp side.
All three runtimes now green: webapp 66/66, shared-ai 2/2,
mana-ai 9/9 Bun tests.
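The boot-time guard might look roughly like this (import specifiers and the server-side tools module path are assumptions):

```ts
import { AI_PROPOSABLE_TOOL_NAMES, AI_PROPOSABLE_TOOL_SET } from "@mana/shared-ai";
import { AI_AVAILABLE_TOOLS } from "./tools";   // server-side schemas (src/planner/tools.ts)

const serverNames = Object.keys(AI_AVAILABLE_TOOLS);
const proposable: ReadonlySet<string> = AI_PROPOSABLE_TOOL_SET;

const missing = AI_PROPOSABLE_TOOL_NAMES.filter((n) => !serverNames.includes(n));
const extra = serverNames.filter((n) => !proposable.has(n));

if (missing.length || extra.length) {
  // Refuse to start rather than silently degrade.
  throw new Error(
    `Tool allow-list drift — missing server-side: [${missing.join(", ")}], ` +
      `not proposable in @mana/shared-ai: [${extra.join(", ")}]`
  );
}
```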
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Plugs plaintext-safe Mission context into the Planner prompt per tick.
Before this, `resolvedInputs: []` was always passed — the LLM only saw
the mission's concept + objective. Now goals (the only plaintext
category of linked inputs today) resolve and land in the prompt.
Privacy constraint is explicit and documented: tables in the webapp's
encryption registry (notes, kontext, journal, dreams, …) arrive at
`sync_changes.data` as ciphertext — the master key lives in mana-auth
KEK-wrapped and never reaches this service. Resolvers for encrypted
modules therefore don't exist server-side; missions referencing them
should use the foreground runner which decrypts client-side.
- `db/resolvers/types.ts` — ServerInputResolver contract
- `db/resolvers/record-replay.ts` — single-record LWW replay
(tighter WHERE than `missions-projection.ts`, used by all resolvers)
- `db/resolvers/goals.ts` — reads `companionGoals` via replayRecord,
mirrors the webapp's default goalsResolver output shape
- `db/resolvers/index.ts` — registry with `registerServerResolver` /
`unregisterServerResolver` / `resolveServerInputs`. Seeds `goals`.
Drift-tolerant: missions pointing at unregistered modules silently
skip those inputs.
- `cron/tick.ts` — wires `resolveServerInputs(sql, m.inputs, m.userId)`
into the planner input; updates the outdated "stubbed" comment
5 Bun tests over the registry (handled + unhandled + thrown +
mixed cases + seeded default).
Future: expand to plaintext tables if/when more land (habits without
free-text, dashboard configs, tags), or introduce a decrypt-via-auth
sidecar if users opt into server-side access to encrypted content.
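Sketch of the registry contract — the ServerInputResolver field names are a guess at the real interface:

```ts
import postgres from "postgres";

export interface ServerInputResolver {
  module: string;
  resolve(sql: postgres.Sql, input: { module: string; recordId?: string }, userId: string): Promise<unknown | null>;
}

const resolvers = new Map<string, ServerInputResolver>();

export function registerServerResolver(r: ServerInputResolver) {
  resolvers.set(r.module, r);
}

export function unregisterServerResolver(module: string) {
  resolvers.delete(module);
}

export async function resolveServerInputs(
  sql: postgres.Sql,
  inputs: { module: string; recordId?: string }[],
  userId: string
) {
  const resolved: { module: string; value: unknown }[] = [];
  for (const input of inputs) {
    const resolver = resolvers.get(input.module);
    if (!resolver) continue;          // drift-tolerant: unregistered module → silently skipped
    try {
      const value = await resolver.resolve(sql, input, userId);
      if (value != null) resolved.push({ module: input.module, value });
    } catch {
      continue;                        // one thrown resolver doesn't break the prompt
    }
  }
  return resolved;
}
```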
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Completes the off-tab AI pipeline. mana-ai now writes produced plans
back to `sync_changes` as a server-sourced Mission iteration; the webapp
picks it up on next sync and translates each PlanStep into a local
Proposal via the existing createProposal flow. User sees the resulting
ghost cards in the matching module's AiProposalInbox with full mission
attribution.
Server (mana-ai v0.3):
- `db/connection.ts` — `withUser(sql, userId, fn)` RLS-scoped tx helper
mirroring the Go `withUser` pattern (SET LOCAL app.current_user_id)
- `db/iteration-writer.ts`
- `planToIteration(plan, id, now)` — shared-ai AiPlanOutput → inline
MissionIteration with `source: 'server'` + status='awaiting-review'
- `appendServerIteration(sql, input)` — INSERT sync_changes row with
op=update, data={iterations: [...]} + field_timestamps + actor
JSONB={kind:'system', source:'mission-runner'}
- `cron/tick.ts` — after parse success: build iteration, append to
mission.iterations, persist via appendServerIteration. Stats now
include `plansWrittenBack`.
Actor union:
- `packages/shared-ai/src/actor.ts` + webapp actor: `system.source` gains
`'mission-runner'` so the server's own writes are attributed correctly
and distinguishable from projection/rule writes
Webapp:
- `data/ai/missions/server-iteration-staging.ts`
- `startServerIterationStaging()` subscribes to aiMissions via Dexie
liveQuery; on each Mission update, walks iterations looking for
`source='server'` entries that haven't been staged yet
- For each such iteration: creates a Proposal per PlanStep under
`{kind:'ai', missionId, iterationId, rationale}` so policy + hooks
fire correctly
- Writes proposalIds back into plan[].proposalId + status='staged' so
other tabs and app restarts skip re-staging
- Idempotent: in-memory `processedIterations` Set + durable
proposalId marker
- Wired into (app)/+layout.svelte alongside startMissionTick
- 3 unit tests: translate server iteration → proposal, skip
already-staged, ignore browser iterations
Full pipeline now: user creates Mission in /companion/missions →
mana-ai tick picks it up → calls mana-llm → parses plan →
writes iteration → synced to webapp → staging effect creates
proposals → user approves in /todo (or any module) → task lands with
`{actor: ai, missionId, iterationId, rationale}` attribution.
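Sketch of the write-back INSERT. The sync_changes column names beyond op/data/field_timestamps/actor are assumptions, and in the real code this runs inside the withUser transaction described above so RLS applies to the write:

```ts
import postgres from "postgres";

export async function appendServerIteration(
  tx: postgres.TransactionSql,
  input: { userId: string; missionId: string; iterations: unknown[] }   // existing + new server iteration
) {
  const now = new Date().toISOString();
  await tx`
    INSERT INTO sync_changes (user_id, app_id, table_name, record_id, op, data, field_timestamps, actor)
    VALUES (
      ${input.userId}, 'ai', 'aiMissions', ${input.missionId}, 'update',
      ${tx.json({ iterations: input.iterations })},
      ${tx.json({ iterations: now })},
      ${tx.json({ kind: "system", source: "mission-runner" })}
    )`;
}
```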
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Service now produces plans end-to-end for due missions. Takes the
shared prompt/parser from @mana/shared-ai, calls mana-llm's
OpenAI-compatible endpoint, parses + validates the response against a
server-side tool allow-list.
- `src/planner/tools.ts` — hardcoded subset of webapp tools where
policy === 'propose'. Mirror of `DEFAULT_AI_POLICY` in the webapp;
drift just means the server doesn't suggest newly-added tools
(graceful degradation). Contract test between the two lists is a
sensible follow-up.
- `src/cron/tick.ts`
- Iterates due missions, builds the shared Planner prompt per mission,
parses the LLM response, logs the resulting plan
- Per-mission try/catch so one flaky LLM response doesn't abort the
queue; stats now track `plansProduced` + `parseFailures`
- `serverMissionToSharedMission()` converts the projection shape to
the shared-ai Mission type at the boundary
- `resolvedInputs: []` today — the Planner sees concept + objective +
iteration history only. Full resolvers (notes/kontext/goals via
Postgres replay) land alongside write-back in the next PR.
- No write-back yet: the plan is logged but not persisted to
`sync_changes`. Write-back needs an RLS-scoped helper mirroring
mana-sync's `withUser` pattern — tracked explicitly as the remaining
open piece in CLAUDE.md.
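The mana-llm call itself is an ordinary OpenAI-compatible chat completion; a sketch of the client's complete() (env var name and model value are assumptions):

```ts
export async function complete(prompt: string): Promise<string> {
  const res = await fetch(`${process.env.MANA_LLM_URL}/v1/chat/completions`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({
      model: "default",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok) throw new Error(`mana-llm responded ${res.status}`);
  const body = (await res.json()) as { choices?: { message?: { content?: string } }[] };
  return body.choices?.[0]?.message?.content ?? "";
}
```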
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Background Hono/Bun service that scans mana_sync for due Missions and
will plan them via mana-llm without requiring an open browser tab.
Complements the foreground `startMissionTick` in the webapp.
v0.1 scope — scaffold that's deployable, boots cleanly, and reads real
data. Execution write-back is tracked as the next PR so we don't commit
a half-baked proposal-sync design.
Shipped:
- Hono app on :3066 with `/health` + service-key-gated `/internal/tick`
- `src/db/missions-projection.ts` — field-level LWW replay of
`sync_changes` for appId='ai' / table='aiMissions' → live Mission
records. Mirrors the webapp's `applyServerChanges` semantics against
Postgres instead of Dexie.
- `src/db/connection.ts` — bounded `postgres.js` pool (max 4, idle 30s)
- `src/cron/tick.ts` — overlap-guarded scheduler, `runTickOnce()` also
reachable via HTTP for CI/ops triggering
- `src/planner/client.ts` — mana-llm HTTP client shape
(OpenAI-compatible `/v1/chat/completions`)
- `src/middleware/service-auth.ts` — X-Service-Key gate, no end-user JWTs
reach this service (sketched after this list)
- Dockerfile + graceful SIGTERM shutdown (stops timer + releases pool)
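Sketch of the service-key gate — the env var name is borrowed from the other services' docs and is an assumption for mana-ai:

```ts
import { createMiddleware } from "hono/factory";

export const serviceAuth = createMiddleware(async (c, next) => {
  const expected = process.env.MANA_SERVICE_KEY;   // assumed env name
  if (!expected || c.req.header("X-Service-Key") !== expected) {
    return c.json({ error: "forbidden" }, 403);
  }
  await next();
});
```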
Not yet implemented (documented in CLAUDE.md with design trade-offs):
- Prompt/parser server-side copies — today they live in the webapp.
Recommended next step: extract `@mana/shared-ai` package.
- Input resolvers for notes / kontext / goals — need projections or a
mana-sync internal endpoint
- Plan → Mission-iteration write-back + how proposals get back to the
user's device (leaning towards option (a): server writes iterations, the
webapp's sync effect translates them into local Proposals)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds an opaque JSON `actor` column alongside the existing field_timestamps
so cross-device consumers can distinguish user / ai / system writes. The
server never parses the shape — it just stores and re-emits the blob the
webapp stamped in its Dexie hook.
- `sync/types.go` — Change.Actor as json.RawMessage with omitempty; nil
for pre-actor clients so wire remains backward-compatible
- `store/postgres.go`
- Migrate: CREATE TABLE includes `actor JSONB` for fresh DBs;
ALTER TABLE ADD COLUMN IF NOT EXISTS actor JSONB for existing ones
(idempotent, safe to re-run)
- RecordChange signature takes json.RawMessage; pgx writes nil as NULL
- All three SELECT paths (GetChangesSince, GetAllChangesSince,
StreamAllUserChanges) return actor and Scan it into ChangeRow.Actor
- ChangeRow.Actor added with doc noting "missing = user" consumer rule
- `sync/handler.go` — Change.Actor threaded through HandleSync →
RecordChange, and populated on both changeFromRow (pull/POST replies)
and convertChanges (SSE stream)
- Tests: roundtrip of an AI-actor payload + omitempty verification for
pre-actor clients. All existing tests still pass.
Webapp types still need `actor?: Actor` on SyncChange + PendingChange to
match the wire, and applyServerChanges needs to stamp __lastActor /
__fieldActors from incoming changes for Workbench attribution on other
devices — both tracked as separate follow-ups.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DATA_LAYER_AUDIT.md: new section 8 covering the export/import flow
end-to-end — architecture diagram, .mana format, protocol-stability
commitments we locked in pre-launch (eventId + schemaVersion + op
vocab + tombstones-forever), encryption-boundary argument, file
map, and the remaining backup backlog (M4b, M5, signature,
resumable download, dedup table).
- services/mana-sync/CLAUDE.md: /backup/export row in API table with
explicit note that it sits outside the billing gate, new Backup /
Restore section with format sketch + split between writer.go (pure)
and handler.go (shim), test-coverage line mentions the backup cases,
project-structure tree lists backup/*.go, Security section mentions
RLS still applies to the export path.
No code changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Refactor: HTTP handler becomes a thin shim over a pure WriteBackup(w,
userID, createdAt, iter) function. RowIterator abstracts the store, so
tests feed synthetic ChangeRow slices and production feeds
StreamAllUserChanges. Zero behavior change in production — same bytes
on the wire.
Tests (all pass):
- TestWriteBackup_Roundtrip: three rows across two apps, assert zip has
2 entries, events.jsonl has 3 JSON lines in order, insert omits
fieldTimestamps, update surfaces them, manifest apps are sorted,
eventsSha256 equals a recomputed sha of the decompressed body.
- TestWriteBackup_EmptyUser: empty userID refused up-front.
- TestWriteBackup_NoRows: zero-row export still produces a valid zip
with an empty events.jsonl and a manifest with eventCount=0 and a
non-empty sha (sha of empty input).
- TestWriteBackup_DefaultsSchemaVersionZeroRowsToOne: legacy rows with
schema_version=0 clamp to 1 so the manifest never claims a protocol
version that never existed.
Paired with the vitest zip parser suite on the TS side, this closes
the Go-writes / JS-reads round-trip without needing live mana-sync.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Recovers three files dropped when a parallel terminal session reset the
branch past the original M1 commit:
- cmd/server/main.go: register GET /backup/export outside billingMiddleware
- lib/api/services/backup.ts: browser-side downloadBackup() helper
- settings/my-data/+page.svelte: "Backup & Wiederherstellung" (backup & restore) section
Pairs with the earlier backup handler + schema_version work already on
main (79996f946). With this commit the endpoint is actually reachable
end-to-end and the download button works.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- sync_changes gains schema_version column (default 1, idempotent ADD)
- Change/Changeset carry schemaVersion; the server refuses versions > MaxSupported
- server->client changes now carry eventId + schemaVersion so the
restore path can dedup via eventId and route through a migration
chain keyed on schemaVersion
- backup JSONL gains schemaVersion per line
Pre-M2 clients (which omit the field) are treated as v1 for compatibility.
This is the stability contract we commit to before launch: once v1
events are in the wild, all future builds must replay them forward.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Admins can now grant Cloud Sync to users without charging credits. Gifted
rows carry is_gifted=true plus gifted_by/gifted_at audit columns; the
billing cron skips them, and /activate and /deactivate refuse to touch
them. New endpoints POST/DELETE /api/v1/admin/sync/:userId/gift.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend: Hono/Bun service on port 3042 with JMAP client for Stalwart,
account provisioning (@mana.how addresses on user registration),
thread/message/send/label API endpoints, and JWT + service-key auth.
Frontend: Mail module with 3-column inbox UI (mailboxes, thread list,
detail/compose), local-first encrypted drafts in Dexie, and API-driven
thread fetching. Scoped CSS with theme tokens.
Integration: Dexie v11 schema, mail pgSchema in mana_platform,
mana-auth fire-and-forget hook for account provisioning,
getManaMailUrl() in API config, app registry + branding update.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
**Unit tests (`bun test`, 42 checks, 0 deps)**
- `src/lib/__tests__/category-map.test.ts` locks in the Pelias→
PlaceCategory priority resolution. Covers the ambiguous multi-category
case (food beats retail for restaurants, transit beats professional
for car rentals, transport:rail still maps to transit, …), the simple
single-category paths, the layer-hint fallback, and regression cases
from real Konstanz/Stuttgart/Köln venues observed during deploy
verification.
- `src/lib/__tests__/cache.test.ts` covers LRU eviction order, TTL
expiry, move-to-end on get (so frequently-read entries survive
eviction), size tracking, and typed-value storage.
**Smoke test (`./scripts/smoke-test.sh` or `bun run test:smoke`)**
End-to-end curls against a running service, aimed at post-deploy
verification. Health endpoints, forward (venue + street fallback),
focus biasing, reverse geocoding, cache hit. 9 checks total.
Wired up as `test:smoke` in package.json so it runs alongside the
unit tests. Verified working: 42/42 unit tests green locally, 9/9
smoke checks green against the live Mac Mini deployment.
CLAUDE.md Testing section rewritten to reflect the new test layers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After the 2026-04-11 production deploy, several non-obvious gotchas
surfaced that needed documenting:
- Forward search: autocomplete→search fallback explained, so future-me
knows why the handler hits two Pelias endpoints for address-style
queries.
- Pelias infra: corrected object counts (13.4M actual, not 22M), noted
the libpostal RAM surprise (~1.9 GB, much larger than Pelias docs
suggest), and added real per-container RAM numbers from production.
- pelias.json: document that we dropped placeholder/pip/interpolation
(not just how to run them) and why the cleaner degradation matters.
- Wrapper gotchas section: Bun idleTimeout, Colima bind-mount cache
staleness, and the host.docker.internal-from-blackbox workaround.
- /health/pelias endpoint is now listed in the API table since it's
the integration point with blackbox monitoring.
- Testing section added — explicitly "no automated tests yet", with a
curl-based manual smoke test set a human can run after changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pelias /autocomplete deliberately excludes the address layer as a
performance optimization, so queries like "Marktstätte Konstanz"
(street + locality) return 0 venue matches even though they're clearly
in the index. /search covers all layers including addresses and streets.
Query /autocomplete first (fast, fuzzy, great for venue names), and if
it returns nothing, try /search. Best of both worlds: quick matches for
"Konzil Restaurant" plus reliable matches for street addresses.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two production follow-ups surfaced after the deploy:
1. Pelias API was emitting continuous `ENOTFOUND placeholder`, `pip`,
`interpolation` errors because we declared those services in
pelias.json but never actually ran them (we don't need WOF
admin lookup or street interpolation for the DACH use case).
Removed the stale entries — Pelias degrades cleanly to
libpostal-only parsing, which is what we want.
2. Bun.serve's default idleTimeout is 10s, which is too tight for
cold Pelias queries hitting Elasticsearch. Raise to 60s so
first-query-after-idle doesn't get cut off.
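The relevant Bun.serve setting, for reference (value is in seconds; the app import path is illustrative):

```ts
import app from "./app";   // hypothetical: the Hono wrapper app

Bun.serve({
  port: 3018,
  fetch: app.fetch,
  idleTimeout: 60,   // default 10s is too tight for cold Pelias queries hitting Elasticsearch
});
```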
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
blackbox-exporter can't resolve host.docker.internal on Colima, so
probes of host.docker.internal:4000 and :9200 always fail. Instead,
add a /health/pelias endpoint on the Hono wrapper that proxies to
the Pelias API, and update prometheus.yml to probe the wrapper's
proxied health endpoint.
Also simplifies the status page friendly_name() now that we don't
need to display the host.docker.internal targets.
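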
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Port 4400 collides with mana-infra-landings (status.mana.how nginx)
on the production Mac Mini. libpostal is only reached internally by
pelias-api over the pelias compose network anyway — no host binding
needed. Use expose instead of ports to drop the host mapping.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use node:22-alpine + pnpm to install workspace dependencies, then copy
node_modules into the bun runtime stage. This resolves @mana/shared-hono
which depends on @mana/shared-logger (transitive workspace dep).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bun install doesn't read pnpm-workspace.yaml, so workspace dependencies
like @mana/shared-hono can't be resolved. Switch to pnpm install with
--filter to install only mana-credits and its workspace deps.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous version chained cd + bun install with || fallback, which
left CWD in services/mana-credits after the first attempt and caused the
fallback cd to fail. Use WORKDIR directives instead — each step starts
from a known absolute path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Production deployment + observability for the self-hosted geocoding stack:
**docker-compose.macmini.yml**
- New mana-geocoding container (port 3018, internal-only — no traefik
labels, no Cloudflare route). Uses host.docker.internal to reach the
Pelias API on the host's pelias compose stack. Dockerfile added under
services/mana-geocoding/ using the same Bun/Hono pattern as mana-events.
**Prometheus**
- New blackbox-internal job probing mana-geocoding:3018/health, the
Pelias API on host.docker.internal:4000/v1/status, and Elasticsearch
at host.docker.internal:9200/_cluster/health. Kept separate from
blackbox-api which is reserved for public HTTPS endpoints.
**status.mana.how (generate-status-page.sh)**
- Include blackbox-internal in the metric query and add an "Interne
Dienste" section with its own summary card, right between Infrastruktur
and GPU Dienste. Summary grid goes from 4 to 5 columns with a
900px breakpoint.
- friendly_name() now handles http:// URLs and rewrites container-name
hosts like mana-geocoding:3018/health → "Mana Geocoding",
host.docker.internal:4000 → "Pelias API",
host.docker.internal:9200 → "Pelias Elasticsearch".
**Grafana uptime dashboard**
- Add an "Internal" series to the "Alle Dienste — Uptime-Verlauf" panel
- New "Interne Dienste Status" table panel showing per-instance up/down
- New "Geocoding Ø Latenz" stat panel for probe_duration_seconds
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Expand services/mana-geocoding/CLAUDE.md with:
- The Pelias API patch (geojsonify_place_details.js) that forces the
category field to always be returned, with regeneration instructions
- The priority-ordered Pelias→PlaceCategory mapping and verified
example mappings from the DACH index
- A full initial-import walkthrough covering the non-obvious gotchas
(analysis-icu plugin, dach-latest → planet-latest rename, adminLookup
disabled, leveldbpath, libpostal config object form, boundary.country
single-value constraint)
Also register mana-geocoding in the root services list.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Dockerfile only copied services/mana-sync, but go.mod has a replace
directive pointing to ../../packages/shared-go which needs to be in the
build context. Switch context to repo root and copy both packages.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pelias hides the 'category' field from API responses unless the
caller filters by categories=... explicitly — a default intended for
keyword search that strips category metadata from address queries.
Patch the Pelias API's geojsonify_place_details.js so the category
array is returned on every feature (food, retail, transport, …),
mounted into the container as a read-only volume override.
Rewrite category-map.ts to map Pelias' OSM taxonomy to our 7
PlaceCategories using a priority-ordered list so a restaurant
tagged ['food','retail','nightlife'] resolves to 'food' (the most
specific), not 'shopping'.
Verified with Konstanz test queries:
Konzil Restaurant → food
Bahnhof Konstanz → transit
Physiotherapie-Schule → work
MX-Park → leisure
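The resolution itself is a first-match walk over a priority list — a sketch with a handful of illustrative entries (the category names are a guess at the 7 PlaceCategories, and the real table in category-map.ts is much longer):

```ts
type PlaceCategory = "food" | "shopping" | "transit" | "work" | "leisure" | "health" | "other";

// Earlier entries win: a feature tagged ['food', 'retail', 'nightlife'] resolves to 'food'.
const PRIORITY: Array<[prefix: string, category: PlaceCategory]> = [
  ["food", "food"],
  ["transport", "transit"],     // also covers transport:rail etc. via the prefix match
  ["health", "health"],
  ["entertainment", "leisure"],
  ["recreation", "leisure"],
  ["professional", "work"],
  ["retail", "shopping"],
];

export function mapCategories(peliasCategories: string[]): PlaceCategory {
  for (const [prefix, category] of PRIORITY) {
    if (peliasCategories.some((c) => c === prefix || c.startsWith(`${prefix}:`))) return category;
  }
  return "other";
}
```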
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After importing 22M OSM objects for the DACH extract:
- Disable adminLookup (no WOF data needed for address search)
- Configure leveldb path inside the data volume
- Specify planet-latest.osm.pbf as the import filename
- Convert libpostal service config from string to object form
- Drop boundary.country default — Pelias only accepts a single
country value, and our index only contains DACH data anyway
Verified forward + reverse geocoding work end-to-end for Konstanz
test queries via the mana-geocoding wrapper on port 3018.
Known limitation: OSM category/type (amenity:restaurant etc.) is
not yet populated in Pelias responses — will require whitelisting
those tags in the importer config and re-running the import.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Dockerfile copied only its own package.json, causing bun install to
fail on @mana/shared-hono workspace dependency. Now copies workspace root
package.json and shared-hono/shared-types packages.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New mana-geocoding service (port 3018) wraps a self-hosted Pelias
instance with LRU caching and OSM→PlaceCategory auto-mapping.
All geocoding queries stay within our infrastructure — no user
location data leaves the network.
Places module integration:
- Address autocomplete search in ListView (creates place with
name, coords, address, category in one step)
- Address search + reverse geocoding button in DetailView
- Auto-fill address via reverse geocoding during tracking
- OSM category mapping (amenity:restaurant→food, shop:*→shopping, etc.)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add MANA_CREDITS_URL and MANA_SERVICE_KEY to configuration table
- Document billing gate on sync endpoints (402 behavior, 5min cache, fail-open)
- Add billing/check.go to project structure
- Add stream endpoint to API table
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>