Catches the service-level package.json files that the previous
sweep (4cca25ed0) missed — they don't appear in any dev:*:full
orchestrator but get invoked when someone runs `pnpm --filter
@mana/<service> dev` directly.
Touched: mana-geocoding, mana-mail, mana-subscriptions, mana-mcp,
news-ingester, mana-persona-runner, mana-research, mana-user,
plus apps/memoro (server + audio-server).
mana-ai stays on --watch on purpose: its entry uses an explicit
`Bun.serve({...})` call instead of `export default { port,
fetch }`, plus a SIGTERM/SIGINT handler that calls
`server.stop()`. --hot would replace the module without releasing
the old server reference and produce exactly the EADDRINUSE we're
trying to avoid. If mana-ai gets refactored to the standard
default-export shape, flip its dev script too.
Cold-start fetches from the mana-geocoding container to photon-self
on mana-gpu (over WSL2 mirrored networking) consistently take >10s on
the first probe and ~2s once warm. The previous 8s default caused the
chain to false-mark photon-self unhealthy on every cold path. Queries
then leaked to public photon for the next 30s health-cache window, and
the public-photon answer was pinned in the 7d cache (now shortened to 1h).
Also wires the docker-compose macmini env to honor PROVIDER_TIMEOUT_MS
and CACHE_PUBLIC_TTL_MS overrides so production picks up the new
values without a code rebuild.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pelias was retired from the Mac mini on 2026-04-28; photon-self
(self-hosted Photon on mana-gpu) has been the live primary since then.
This removes the now-dead Pelias adapter, config, tests, and the
services/mana-geocoding/pelias/ stack — the entire compose file, the
geojsonify_place_details.js patch, the setup.sh import script.
Provider chain is now `photon-self → photon → nominatim`. The chain
keeps its `privacy: 'local' | 'public'` split, sensitive-query
blocking, coord quantization, and aggressive caching unchanged.
Three direct calls to nominatim.openstreetmap.org that bypassed
mana-geocoding now route through the wrapper:
- citycorners/add-city + citycorners/cities/[slug]/add use the shared
searchAddress() client (browser → same-origin proxy → mana-geocoding
→ photon-self).
- memoro mobile drops its OSM reverse-geocoding fallback entirely;
Expo's on-device reverse-geocoding stays as the sole path. Routing
through the wrapper would require a memoro-server proxy endpoint —
a follow-up if Expo's quality proves insufficient.
Other behavioral changes:
- CACHE_PUBLIC_TTL_MS dropped from 7d to 1h. The long TTL was a
privacy-amplification trick from the Pelias era; with photon-self
serving the bulk of traffic, a transient cross-LAN blip was pinning
cached fallback answers for days. 1h gives quick recovery.
- /health/pelias renamed to /health/photon-self; prometheus blackbox
config + status-page generator updated.
- mana-geocoding container no longer needs `extra_hosts:
host.docker.internal:host-gateway` (was only there for the
Pelias-on-host-network era).
113 tests passing. CLAUDE.md rewritten to reflect the post-Pelias
architecture.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Decision report: status flipped to MIGRATED; added migration log with
five WSL2 gotchas (bzip2 missing, no official Photon image,
firewall=true blocks cross-LAN, vmIdleTimeout=-1 ineffective,
PowerShell pre-expansion of bash $(...)) and resource snapshot.
- mana-geocoding CLAUDE.md: PHOTON_SELF_API_URL note now reflects live
primary status on mana-gpu since 2026-04-28.
- photon-self/: operator scripts for the weekly DB refresh — update.sh
(atomic-swap with rollback), systemd unit + timer (Sun 03:30 +30min
jitter, Persistent=true), README with re-installation instructions
for DR. Currently installed and enabled on mana-gpu.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
migration
The chain now distinguishes two Photon instances:
photon-self privacy: 'local' (self-hosted on mana-gpu)
photon privacy: 'public' (komoot.io, last-resort fallback)
Both wrap the same `PhotonProvider` class with different config — only
the URL, name, and privacy stance differ. The new ProviderName variant
'photon-self' lets the chain track per-provider health for them
independently (a single 'photon' slot would collide in the health
Map).
Opt-in registration: `photon-self` is only built when
PHOTON_SELF_API_URL is set in the env. When unset (current state),
the chain has the same shape as before, so the change is fully
backward compatible. After the GPU migration, setting the env var is
the only deploy step needed:
PHOTON_SELF_API_URL=http://192.168.178.11:2322
Default chain order updated to:
photon-self,pelias,photon,nominatim
^^^^^^^^^^^ silently skipped if not registered (env unset)
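The registration rule above can be sketched roughly as follows. This is a minimal illustration, not the service's `createChain`: the function name `buildChainOrder`, the injected `env` record, and the choice to drop unknown names from GEOCODING_PROVIDERS are all assumptions made for the example.

```typescript
// Hypothetical sketch: provider names mirror the commit text, everything else is illustrative.
type ProviderName = 'photon-self' | 'pelias' | 'photon' | 'nominatim';

const DEFAULT_ORDER: ProviderName[] = ['photon-self', 'pelias', 'photon', 'nominatim'];

function isProviderName(value: string): value is ProviderName {
  return (DEFAULT_ORDER as string[]).includes(value);
}

function buildChainOrder(env: Record<string, string | undefined>): ProviderName[] {
  // photon-self is registered only when its URL is configured.
  const registered = new Set<ProviderName>(['pelias', 'photon', 'nominatim']);
  if (env['PHOTON_SELF_API_URL']) registered.add('photon-self');

  // GEOCODING_PROVIDERS, when set, replaces the default order and acts as a filter.
  const raw = env['GEOCODING_PROVIDERS'];
  const requested: ProviderName[] = raw
    ? raw.split(',').map((s) => s.trim()).filter(isProviderName)
    : DEFAULT_ORDER;

  // Names that are requested but not registered are skipped silently.
  return requested.filter((name) => registered.has(name));
}
```

With an empty environment the sketch yields `pelias,photon,nominatim`, which matches the "same shape as before" behaviour described above.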
The privacy guarantee is structural: photon-self carries privacy:
'local', so the existing sensitive-query block from the previous
hardening commit now has a real local backend post-migration —
medical/crisis-service queries get real results instead of the
"sensitive_local_unavailable" notice.
Tests: 148 (was 141). New coverage:
- src/__tests__/app.test.ts: createChain registration logic — verifies
photon-self appears iff PHOTON_SELF_API_URL is set, ordering
honored, GEOCODING_PROVIDERS env-var filter respected
- providers/__tests__/photon-normalizer.test.ts: provider field
carries 'photon' or 'photon-self' based on the call argument
Recon of mana-gpu (2026-04-28): Windows 11 Pro Build 26200, 64 GB
RAM (56 GB free), 739 GB disk free, no WSL2/Docker yet, no native
GPU services running. Setup plan documented in
docs/runbooks/photon-on-mana-gpu.md (3–4 h, ~1 h of which is
download/unpack waiting).
quantization + extended cache TTL for public answers
Three independent defenses limit what public geocoding APIs (Photon,
Nominatim) can learn from our outbound traffic:
1. **Sensitive-query block** (`lib/sensitive-query.ts`)
Queries matching the medical/mental-health/crisis-service keyword
list (Hausarzt, Psychiater, Klinikum, HIV, Frauenhaus, …) are
never forwarded to public APIs. The chain detects sensitivity at
the route layer and runs the search in localOnly mode — providers
with `privacy: 'public'` are filtered out before iteration begins.
When no local provider is available (Pelias stopped), a sensitive
query returns ok:true with results:[] and notice:
'sensitive_local_unavailable' so the UI can show a sensible
message instead of "no results".
The keyword list is documented inline. False negatives are the real
risk because they would send a sensitive query to a public API; false
positives at worst produce a 0-result UX hit, which is the better
trade-off.
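A rough sketch of the block, assuming simple case-insensitive substring matching. The keyword subset, the function names, and the matching strategy are assumptions for illustration and do not reproduce `lib/sensitive-query.ts`.

```typescript
// Illustrative sketch only: names and the keyword subset are assumptions.
type Privacy = 'local' | 'public';

interface Provider {
  name: string;
  privacy: Privacy;
}

interface SearchResponse {
  ok: boolean;
  results: unknown[];
  notice?: string;
}

const SENSITIVE_KEYWORDS = ['hausarzt', 'psychiater', 'klinikum', 'hiv', 'frauenhaus'];

// Substring matching over-matches (for example "Archiv" contains "hiv"),
// which is the kind of false positive the text above accepts.
function isSensitiveQuery(query: string): boolean {
  const q = query.toLowerCase();
  return SENSITIVE_KEYWORDS.some((kw) => q.includes(kw));
}

// In localOnly mode, public providers are removed before iteration begins.
function eligibleProviders(providers: Provider[], query: string): Provider[] {
  return isSensitiveQuery(query)
    ? providers.filter((p) => p.privacy === 'local')
    : providers;
}

// Early response for a sensitive query with no local backend, otherwise null.
function earlyResponse(providers: Provider[], query: string): SearchResponse | null {
  if (isSensitiveQuery(query) && eligibleProviders(providers, query).length === 0) {
    return { ok: true, results: [], notice: 'sensitive_local_unavailable' };
  }
  return null;
}
```

Filtering before iteration, rather than checking inside the loop, is what makes it possible to assert that a public provider's search is never invoked for a sensitive query.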
2. **Coordinate quantization** (`lib/privacy.ts`)
Forward-search focus.lat/lon: rounded to 2 decimals (~1.1km).
Enough for the bias to work, hides exact GPS.
Reverse-geocoding lat/lon: rounded to 3 decimals (~110m).
City-block resolution — sufficient for "what's near me?",
avoids reverse-geocoding the user's exact front door.
Pelias always gets full precision; quantization is applied only on
the way out to public APIs. A new `privacy: 'local' | 'public'` field
on the GeocodingProvider interface drives this.
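The rounding rules can be illustrated with a short sketch. The helper names (`roundTo`, `quantizeFor`) and the `kind` parameter are hypothetical; only the decimal counts and the local/public split come from the text above.

```typescript
// Hypothetical helpers: only the 2- and 3-decimal rules are taken from the commit text.
type Privacy = 'local' | 'public';

interface Coords {
  lat: number;
  lon: number;
}

function roundTo(value: number, decimals: number): number {
  const factor = 10 ** decimals;
  return Math.round(value * factor) / factor;
}

function quantizeFor(privacy: Privacy, kind: 'focus' | 'reverse', coords: Coords): Coords {
  // Local providers receive full precision.
  if (privacy === 'local') return coords;
  // Forward-search focus: 2 decimals (~1.1 km). Reverse geocoding: 3 decimals (~110 m).
  const decimals = kind === 'focus' ? 2 : 3;
  return { lat: roundTo(coords.lat, decimals), lon: roundTo(coords.lon, decimals) };
}
```

The distance figures follow from one degree of latitude being roughly 111 km, so 0.01° is about 1.1 km and 0.001° about 110 m.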
3. **Extended cache TTL for public answers**
New `cache.publicTtlMs` config option, default 7 days (vs. 24h
for local-provider answers). LRU cache extended with optional
`ttlOverrideMs` per entry. Same query from N users → 1 outbound
request to Photon/Nominatim. Strongest privacy lever we have
over public providers (we can't change their logging, only the
rate at which we feed them queries).
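A per-entry TTL override might look roughly like this. The sketch leaves out LRU eviction to keep the idea visible; the class name, the injected clock, and the method signatures are assumptions rather than the service's actual cache API.

```typescript
// Minimal sketch of a per-entry TTL override. Eviction by size is omitted on purpose.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private defaultTtlMs: number;
  private now: () => number;

  constructor(defaultTtlMs: number, now: () => number = () => Date.now()) {
    this.defaultTtlMs = defaultTtlMs;
    this.now = now;
  }

  // ttlOverrideMs, when given, replaces the default TTL for this entry only.
  set(key: string, value: V, ttlOverrideMs?: number): void {
    const ttl = ttlOverrideMs ?? this.defaultTtlMs;
    this.entries.set(key, { value, expiresAt: this.now() + ttl });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Expired entries are dropped lazily on read.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}
```

A caller would pass the longer public TTL only when the answer came from a `privacy: 'public'` provider, leaving local answers on the default.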
Threat coverage:
✓ User IP / identity hidden (already true — wrapper is the proxy)
✓ Exact GPS hidden (quantization)
✓ Sensitive query content protected (block)
~ Non-sensitive query content visible (acceptable trade-off)
~ Aggregate profiling reduced ~10–100× (cache)
✗ TLS-level traffic analysis, compelled disclosure (out of scope)
Tests: 141 (was 115). New coverage:
- privacy.test.ts: quantization rules (locks the privacy claim)
- sensitive-query.test.ts: positive matches across categories +
documented false positives we accept
- chain.test.ts: localOnly mode end-to-end including the load-
bearing assertion that public providers' search() must NEVER be
called when the chain is in localOnly mode (no race window)
- cache.test.ts: per-entry ttlOverride longer + shorter than default
Live smoke verified end-to-end:
- "Hausarzt Konstanz" with Pelias down → no public API call,
notice: 'sensitive_local_unavailable'
- "Konstanz" → falls through to Photon, notice: 'fallback_used'
- Reverse with high-precision GPS → Photon receives quantized
coords, returns city-block-level result
First-probe DNS+TLS handshake against Nominatim can take >5s on a
cold start (verified locally: 642ms warm, sometimes 5-8s cold). The
old 5s default false-marked Nominatim unhealthy and the 30s health-
cache then locked us into a fallback-of-fallback gap. 8s gives
enough headroom for cold-start while still cutting off actually-
stuck connections.
Photon and Pelias don't hit this: Photon's CDN is consistently
sub-second and Pelias is on localhost / LAN. Only the public
Nominatim path warranted the bump, but the timeout is a single
setting shared by all providers, so we adjust it globally.
Existing PROVIDER_TIMEOUT_MS env override still wins.
mana-geocoding now tries Pelias first, falls back to public Photon
(komoot.io) and finally to public Nominatim (OSM) when Pelias is
unhealthy or unreachable. The Places module's address lookup keeps
working even when the Pelias container is stopped — which it currently
is on the Mac mini, freeing 3 GB of RAM until Pelias gets migrated to
the GPU server.
Architecture:
ProviderChain ─ tries providers in priority order, stops on first
success. A clean empty-results answer is definitive
(don't burn through public-API budget on a query that
legitimately has no match). Only network errors / 5xx
/ 429 trigger fallthrough.
HealthCache ─ per-provider, 30s TTL. A failed health probe or a
failed search marks the provider unhealthy and skips
it for the rest of the cache window. Lazy refresh —
no background pinger.
RateLimiter ─ single-token FIFO queue, 1100ms gap by default.
Used to enforce Nominatim's 1 req/sec policy. Handles
abort during inter-task wait by releasing the busy
flag so later tasks aren't blocked.
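The failover rules above (first success wins, an empty result is definitive, a thrown error marks the provider unhealthy for the cache window) can be sketched as follows. The interfaces and names are illustrative assumptions; the real chain also distinguishes 5xx/429 from other failures and runs health probes, which this sketch does not model.

```typescript
// Illustrative sketch: any thrown error stands in for "network error / 5xx / 429".
interface SearchResult {
  label: string;
}

interface Provider {
  name: string;
  search(query: string): Promise<SearchResult[]>;
}

class HealthCache {
  private unhealthyUntil = new Map<string, number>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  markUnhealthy(name: string): void {
    this.unhealthyUntil.set(name, this.now() + this.ttlMs);
  }

  // Lazy refresh: a provider becomes eligible again once its window has passed.
  isHealthy(name: string): boolean {
    const until = this.unhealthyUntil.get(name);
    return until === undefined || this.now() >= until;
  }
}

async function searchChain(
  providers: Provider[],
  health: HealthCache,
  query: string,
): Promise<{ provider: string | null; tried: string[]; results: SearchResult[] }> {
  const tried: string[] = [];
  for (const p of providers) {
    if (!health.isHealthy(p.name)) continue;
    tried.push(p.name);
    try {
      // A clean answer stops the chain, even when it is empty.
      const results = await p.search(query);
      return { provider: p.name, tried, results };
    } catch {
      health.markUnhealthy(p.name);
    }
  }
  return { provider: null, tried, results: [] };
}
```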
Provider details:
pelias — primary, self-hosted DACH index, full OSM taxonomy in
`peliasCategories`, no rate limit
photon — public komoot endpoint, GeoJSON shape, raw `osm_key:
osm_value` mapped via lib/osm-category-map.ts. Faster
than Nominatim, no advertised rate limit but be polite.
nominatim — public OSM endpoint, strict 1 req/sec via the limiter,
custom User-Agent required (otherwise 403). Last
resort — fallback for when Photon is also down.
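The limiter that enforces Nominatim's 1 req/sec policy can be approximated by a promise-chained FIFO queue. This is a simplified sketch: it omits the abort handling described above, and the class shape, injected clock, and injected `sleep` are assumptions made so the behaviour is easy to test.

```typescript
// Simplified sketch of a single-slot FIFO limiter. Abort support is intentionally left out.
class RateLimiter {
  private tail: Promise<void> = Promise.resolve();
  private lastStart = -Infinity;
  private intervalMs: number;
  private now: () => number;
  private sleep: (ms: number) => Promise<void>;

  constructor(
    intervalMs: number,
    now: () => number = () => Date.now(),
    sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
  ) {
    this.intervalMs = intervalMs;
    this.now = now;
    this.sleep = sleep;
  }

  schedule<T>(task: () => Promise<T>): Promise<T> {
    const run = this.tail.then(async () => {
      // Wait until at least intervalMs has passed since the previous task started.
      const wait = this.lastStart + this.intervalMs - this.now();
      if (wait > 0) await this.sleep(wait);
      this.lastStart = this.now();
      return task();
    });
    // Keep the queue moving even when a task rejects.
    this.tail = run.then(() => undefined, () => undefined);
    return run;
  }
}
```

A Nominatim provider would wrap each outbound request in `limiter.schedule(...)` with `intervalMs` set from NOMINATIM_INTERVAL_MS.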
Response shape changes (additive only — existing callers keep
working):
- results[].provider: 'pelias' | 'photon' | 'nominatim'
- results[].peliasCategories: only present when Pelias served the
request (was already absent on Pelias-API patch failures)
- top-level provider: <name> + tried: <name[]> on success/error
- new endpoint: GET /health/providers — per-provider snapshot
Configuration via env (defaults shipped):
GEOCODING_PROVIDERS=pelias,photon,nominatim # order matters
PROVIDER_TIMEOUT_MS=5000
PROVIDER_HEALTH_CACHE_MS=30000
PHOTON_API_URL=https://photon.komoot.io
NOMINATIM_API_URL=https://nominatim.openstreetmap.org
NOMINATIM_USER_AGENT=mana-geocoding/1.0 (+https://mana.how; ...)
NOMINATIM_INTERVAL_MS=1100
Testing: 115 tests green (was 42). New coverage:
- osm-category-map.test.ts (47 cases over food/transit/shopping/
leisure/work/other priority resolution)
- rate-limiter.test.ts (FIFO, abort-during-wait, abort-during-sleep)
- chain.test.ts (failover, empty-results-stops, health-cache,
snapshot)
- photon-normalizer.test.ts and nominatim-normalizer.test.ts (lock
the wire-format mapping for both fallback providers)
Live smoke against public Photon verified — both /search and /reverse
return correctly normalized results with provider="photon" when Pelias
is unreachable.
**Unit tests (`bun test`, 42 checks, 0 deps)**
- `src/lib/__tests__/category-map.test.ts` locks in the Pelias→
PlaceCategory priority resolution. Covers the ambiguous multi-category
case (food beats retail for restaurants, transit beats professional
for car rentals, transport:rail still maps to transit, …), the simple
single-category paths, the layer-hint fallback, and regression cases
from real Konstanz/Stuttgart/Köln venues observed during deploy
verification.
- `src/lib/__tests__/cache.test.ts` covers LRU eviction order, TTL
expiry, move-to-end on get (so frequently-read entries survive
eviction), size tracking, and typed-value storage.
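For readers unfamiliar with the move-to-end behaviour these tests lock in, here is a minimal LRU sketch built on `Map` insertion order. It is an illustration under that assumption, not the service's cache implementation, and it leaves out TTL handling.

```typescript
// Minimal LRU sketch: Map iterates in insertion order, so the first key is the oldest.
class LruCache<V> {
  private map = new Map<string, V>();
  private maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get(key: string): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key) as V;
    // Re-insert so a frequently read entry moves to the "newest" end.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      // Evict the least recently used entry.
      const oldest = this.map.keys().next().value;
      if (oldest !== undefined) this.map.delete(oldest);
    }
  }

  get size(): number {
    return this.map.size;
  }
}
```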
**Smoke test (`./scripts/smoke-test.sh` or `bun run test:smoke`)**
End-to-end curls against a running service, aimed at post-deploy
verification. Health endpoints, forward (venue + street fallback),
focus biasing, reverse geocoding, cache hit. 9 checks total.
Wired up as `test:smoke` in package.json so it runs alongside the
unit tests. Verified working: 42/42 unit tests green locally, 9/9
smoke checks green against the live Mac Mini deployment.
CLAUDE.md Testing section rewritten to reflect the new test layers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After the 2026-04-11 production deploy, several non-obvious gotchas
surfaced that needed documenting:
- Forward search: autocomplete→search fallback explained, so future-me
knows why the handler hits two Pelias endpoints for address-style
queries.
- Pelias infra: corrected object counts (13.4M actual, not 22M), noted
the libpostal RAM surprise (~1.9 GB, much larger than Pelias docs
suggest), and added real per-container RAM numbers from production.
- pelias.json: documented that we dropped placeholder/pip/interpolation
(not just how to run them) and why the cleaner degradation matters.
- Wrapper gotchas section: Bun idleTimeout, Colima bind-mount cache
staleness, and the host.docker.internal-from-blackbox workaround.
- /health/pelias endpoint is now listed in the API table since it's
the integration point with blackbox monitoring.
- Testing section added — explicitly "no automated tests yet", with a
curl-based manual smoke test set a human can run after changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pelias /autocomplete deliberately excludes the address layer as a
performance optimization, so queries like "Marktstätte Konstanz"
(street + locality) return 0 venue matches even though they're clearly
in the index. /search covers all layers including addresses and streets.
Query /autocomplete first (fast, fuzzy, great for venue names), and if
it returns nothing, try /search. Best of both worlds: quick matches for
"Konzil Restaurant" plus reliable matches for street addresses.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
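The autocomplete-first, search-second lookup described in the preceding message can be sketched as below. The two lookups are injected as plain functions so the sketch does not assume any particular HTTP client or Pelias response shape; `Feature`, `Lookup`, and `forwardSearch` are hypothetical names.

```typescript
// Hypothetical sketch: the lookups stand in for calls to Pelias /autocomplete and /search.
interface Feature {
  name: string;
}

type Lookup = (text: string) => Promise<Feature[]>;

async function forwardSearch(
  text: string,
  autocomplete: Lookup,
  search: Lookup,
): Promise<Feature[]> {
  // Fast, fuzzy path first: works well for venue names.
  const quick = await autocomplete(text);
  if (quick.length > 0) return quick;
  // Fall back to the slower endpoint that also covers addresses and streets.
  return search(text);
}
```

The second request is only issued when the first returns nothing, so venue-name queries still cost a single round trip.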
Two production follow-ups surfaced after the deploy:
1. Pelias API was emitting continuous `ENOTFOUND placeholder`, `pip`,
`interpolation` errors because we declared those services in
pelias.json but never actually run them (we don't need WOF
admin lookup or street interpolation for the DACH use case).
Removed the stale entries — Pelias degrades cleanly to
libpostal-only parsing, which is what we want.
2. Bun.serve's default idleTimeout is 10s, which is too tight for
cold Pelias queries hitting Elasticsearch. Raised to 60s so the
first query after an idle period doesn't get cut off.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
blackbox-exporter can't resolve host.docker.internal on Colima, so
probes of host.docker.internal:4000 and :9200 always fail. Instead,
add a /health/pelias endpoint on the Hono wrapper that proxies to
the Pelias API, and update prometheus.yml to probe the wrapper's
proxied health endpoint.
Also simplifies the status page friendly_name() now that we don't
need to display the host.docker.internal targets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Port 4400 collides with mana-infra-landings (status.mana.how nginx)
on the production mac mini. libpostal is only reached internally by
pelias-api over the pelias compose network anyway — no host binding
needed. Use expose instead of ports to drop the host mapping.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Production deployment + observability for the self-hosted geocoding stack:
**docker-compose.macmini.yml**
- New mana-geocoding container (port 3018, internal-only — no traefik
labels, no Cloudflare route). Uses host.docker.internal to reach the
Pelias API on the host's pelias compose stack. Dockerfile added under
services/mana-geocoding/ using the same Bun/Hono pattern as mana-events.
**Prometheus**
- New blackbox-internal job probing mana-geocoding:3018/health, the
Pelias API on host.docker.internal:4000/v1/status, and Elasticsearch
at host.docker.internal:9200/_cluster/health. Kept separate from
blackbox-api which is reserved for public HTTPS endpoints.
**status.mana.how (generate-status-page.sh)**
- Include blackbox-internal in the metric query and add an "Interne
Dienste" section with its own summary card, right between Infrastruktur
and GPU Dienste. Summary grid goes from 4 to 5 columns with a
900px breakpoint.
- friendly_name() now handles http:// URLs and rewrites container-name
hosts like mana-geocoding:3018/health → "Mana Geocoding",
host.docker.internal:4000 → "Pelias API",
host.docker.internal:9200 → "Pelias Elasticsearch".
**Grafana uptime dashboard**
- Add an "Internal" series to the "Alle Dienste — Uptime-Verlauf" panel
- New "Interne Dienste Status" table panel showing per-instance up/down
- New "Geocoding Ø Latenz" stat panel for probe_duration_seconds
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Expand services/mana-geocoding/CLAUDE.md with:
- The Pelias API patch (geojsonify_place_details.js) that forces the
category field to always be returned, with regeneration instructions
- The priority-ordered Pelias→PlaceCategory mapping and verified
example mappings from the DACH index
- A full initial-import walkthrough covering the non-obvious gotchas
(analysis-icu plugin, dach-latest → planet-latest rename, adminLookup
disabled, leveldbpath, libpostal config object form, boundary.country
single-value constraint)
Also register mana-geocoding in the root services list.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pelias hides the 'category' field from API responses unless the
caller filters by categories=... explicitly — a default intended for
keyword search that strips category metadata from address queries.
Patch the Pelias API's geojsonify_place_details.js so the category
array is returned on every feature (food, retail, transport, …),
mounted into the container as a read-only volume override.
Rewrite category-map.ts to map Pelias' OSM taxonomy to our 7
PlaceCategories using a priority-ordered list so a restaurant
tagged ['food','retail','nightlife'] resolves to 'food' (the most
specific), not 'shopping'.
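A priority-ordered mapping of this kind can be sketched as a list that is scanned top to bottom. The rule list here is a small illustrative subset inferred from the examples in these messages (food before retail, transit before professional); it is not the contents of `category-map.ts`, and the prefix matching on `transport:rail`-style values is an assumption.

```typescript
// Illustrative subset only. Earlier rules win when several Pelias categories are present.
const PRIORITY: Array<[string, string]> = [
  ['food', 'food'],
  ['transport', 'transit'],
  ['retail', 'shopping'],
  ['professional', 'work'],
];

function mapCategory(peliasCategories: string[]): string {
  for (const [prefix, place] of PRIORITY) {
    // Match either the bare category or a namespaced variant such as "transport:rail".
    const hit = peliasCategories.some((c) => c === prefix || c.startsWith(prefix + ':'));
    if (hit) return place;
  }
  return 'other';
}
```

Because the scan follows the rule list rather than the input order, the result does not depend on how Pelias happens to order the category array.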
Verified with Konstanz test queries:
Konzil Restaurant → food
Bahnhof Konstanz → transit
Physiotherapie-Schule → work
MX-Park → leisure
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After importing 22M OSM objects for the DACH extract:
- Disable adminLookup (no WOF data needed for address search)
- Configure leveldb path inside the data volume
- Specify planet-latest.osm.pbf as the import filename
- Convert libpostal service config from string to object form
- Drop boundary.country default — Pelias only accepts a single
country value, and our index only contains DACH data anyway
Verified forward + reverse geocoding work end-to-end for Konstanz
test queries via the mana-geocoding wrapper on port 3018.
Known limitation: OSM category/type (amenity:restaurant etc.) is
not yet populated in Pelias responses — will require whitelisting
those tags in the importer config and re-running the import.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New mana-geocoding service (port 3018) wraps a self-hosted Pelias
instance with LRU caching and OSM→PlaceCategory auto-mapping.
All geocoding queries stay within our infrastructure — no user
location data leaves the network.
Places module integration:
- Address autocomplete search in ListView (creates place with
name, coords, address, category in one step)
- Address search + reverse geocoding button in DetailView
- Auto-fill address via reverse geocoding during tracking
- OSM category mapping (amenity:restaurant→food, shop:*→shopping, etc.)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>