quantization + extended cache TTL for public answers
Three independent defenses limit what public geocoding APIs (Photon,
Nominatim) can learn from our outbound traffic:
1. **Sensitive-query block** (`lib/sensitive-query.ts`)
Queries matching the medical/mental-health/crisis-service keyword
list (Hausarzt, Psychiater, Klinikum, HIV, Frauenhaus, …) are
never forwarded to public APIs. The chain detects sensitivity at
the route layer and runs the search in localOnly mode — providers
with `privacy: 'public'` are filtered out before iteration begins.
When no local provider is available (Pelias stopped), a sensitive
query returns ok:true with results:[] and notice:
'sensitive_local_unavailable' so the UI can show a sensible
message instead of "no results".
The keyword list is documented inline. False negatives are the
risk; false positives just produce a 0-result UX hit (better
trade-off).
2. **Coordinate quantization** (`lib/privacy.ts`)
Forward-search focus.lat/lon: rounded to 2 decimals (~1.1km).
Enough for the bias to work, hides exact GPS.
Reverse-geocoding lat/lon: rounded to 3 decimals (~110m).
City-block resolution — sufficient for "what's near me?",
avoids reverse-geocoding the user's exact front door.
Pelias always gets full precision; quantization only on the way
out to public APIs. New `privacy: 'local' | 'public'` field on
the GeocodingProvider interface drives this.
3. **Extended cache TTL for public answers**
New `cache.publicTtlMs` config option, default 7 days (vs. 24h
for local-provider answers). LRU cache extended with optional
`ttlOverrideMs` per entry. Same query from N users → 1 outbound
request to Photon/Nominatim. Strongest privacy lever we have
over public providers (we can't change their logging, only the
rate at which we feed them queries).
Threat coverage:
✓ User IP / identity hidden (already true — wrapper is the proxy)
✓ Exact GPS hidden (quantization)
✓ Sensitive query content protected (block)
~ Non-sensitive query content visible (acceptable trade-off)
~ Aggregate profiling reduced ~10–100× (cache)
✗ TLS-level traffic analysis, compelled disclosure (out of scope)
Tests: 141 (was 115). New coverage:
- privacy.test.ts: quantization rules (locks the privacy claim)
- sensitive-query.test.ts: positive matches across categories +
documented false positives we accept
- chain.test.ts: localOnly mode end-to-end including the load-
bearing assertion that public providers' search() must NEVER be
called when the chain is in localOnly mode (no race window)
- cache.test.ts: per-entry ttlOverride longer + shorter than default
Live smoke verified end-to-end:
- "Hausarzt Konstanz" with Pelias down → no public API call,
notice: 'sensitive_local_unavailable'
- "Konstanz" → falls through to Photon, notice: 'fallback_used'
- Reverse with high-precision GPS → Photon receives quantized
coords, returns city-block-level result
First-probe DNS+TLS handshake against Nominatim can take >5s on a
cold start (verified locally: 642ms warm, sometimes 5-8s cold). The
old 5s default false-marked Nominatim unhealthy and the 30s health-
cache then locked us into a fallback-of-fallback gap. 8s gives
enough headroom for cold-start while still cutting off actually-
stuck connections.
Photon and Pelias don't hit this — Photon's CDN is consistently
sub-second and Pelias is on localhost / LAN. Only the public
Nominatim path warranted the bump, but the timeout is per-provider
shared so we adjust it globally.
Existing PROVIDER_TIMEOUT_MS env override still wins.
mana-geocoding now tries Pelias first, falls back to public Photon
(komoot.io) and finally to public Nominatim (OSM) when Pelias is
unhealthy or unreachable. The Places module's address lookup keeps
working even when the Pelias container is stopped — which it currently
is on the Mac mini, freeing 3 GB of RAM until Pelias gets migrated to
the GPU server.
Architecture:
ProviderChain ─ tries providers in priority order, stops on first
success. A clean empty-results answer is definitive
(don't burn through public-API budget on a query that
legitimately has no match). Only network errors / 5xx
/ 429 trigger fallthrough.
HealthCache ─ per-provider, 30s TTL. A failed health probe or a
failed search marks the provider unhealthy and skips
it for the rest of the cache window. Lazy refresh —
no background pinger.
RateLimiter ─ single-token FIFO queue, 1100ms gap by default.
Used to enforce Nominatim's 1 req/sec policy. Handles
abort during inter-task wait by releasing the busy
flag so later tasks aren't blocked.
Provider details:
pelias — primary, self-hosted DACH index, full OSM taxonomy in
`peliasCategories`, no rate limit
photon — public komoot endpoint, GeoJSON shape, raw `osm_key:
osm_value` mapped via lib/osm-category-map.ts. Faster
than Nominatim, no advertised rate limit but be polite.
nominatim — public OSM endpoint, strict 1 req/sec via the limiter,
custom User-Agent required (otherwise 403). Last
resort — fallback for when Photon is also down.
Response shape changes (additive only — existing callers keep
working):
- results[].provider: 'pelias' | 'photon' | 'nominatim'
- results[].peliasCategories: only present when Pelias served the
request (was already absent on Pelias-API patch failures)
- top-level provider: <name> + tried: <name[]> on success/error
- new endpoint: GET /health/providers — per-provider snapshot
Configuration via env (defaults shipped):
GEOCODING_PROVIDERS=pelias,photon,nominatim # order matters
PROVIDER_TIMEOUT_MS=5000
PROVIDER_HEALTH_CACHE_MS=30000
PHOTON_API_URL=https://photon.komoot.io
NOMINATIM_API_URL=https://nominatim.openstreetmap.org
NOMINATIM_USER_AGENT=mana-geocoding/1.0 (+https://mana.how; ...)
NOMINATIM_INTERVAL_MS=1100
Testing: 115 tests green (was 42). New coverage:
- osm-category-map.test.ts (47 cases over food/transit/shopping/
leisure/work/other priority resolution)
- rate-limiter.test.ts (FIFO, abort-during-wait, abort-during-sleep)
- chain.test.ts (failover, empty-results-stops, health-cache,
snapshot)
- photon-normalizer.test.ts and nominatim-normalizer.test.ts (lock
the wire-format mapping for both fallback providers)
Live smoke against public Photon verified — both /search and /reverse
return correctly normalized results with provider="photon" when Pelias
is unreachable.
**Unit tests (`bun test`, 42 checks, 0 deps)**
- `src/lib/__tests__/category-map.test.ts` locks in the Pelias→
PlaceCategory priority resolution. Covers the ambiguous multi-category
case (food beats retail for restaurants, transit beats professional
for car rentals, transport:rail still maps to transit, …), the simple
single-category paths, the layer-hint fallback, and regression cases
from real Konstanz/Stuttgart/Köln venues observed during deploy
verification.
- `src/lib/__tests__/cache.test.ts` covers LRU eviction order, TTL
expiry, move-to-end on get (so frequently-read entries survive
eviction), size tracking, and typed-value storage.
**Smoke test (`./scripts/smoke-test.sh` or `bun run test:smoke`)**
End-to-end curls against a running service, aimed at post-deploy
verification. Health endpoints, forward (venue + street fallback),
focus biasing, reverse geocoding, cache hit. 9 checks total.
Wired up as `test:smoke` in package.json so it runs alongside the
unit tests. Verified working: 42/42 unit tests green locally, 9/9
smoke checks green against the live Mac Mini deployment.
CLAUDE.md Testing section rewritten to reflect the new test layers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After the 2026-04-11 production deploy, several non-obvious gotchas
surfaced that needed documenting:
- Forward search: autocomplete→search fallback explained, so future-me
knows why the handler hits two Pelias endpoints for address-style
queries.
- Pelias infra: corrected object counts (13.4M actual, not 22M), noted
the libpostal RAM surprise (~1.9 GB, much larger than Pelias docs
suggest), and added real per-container RAM numbers from production.
- pelias.json: document that we dropped placeholder/pip/interpolation
(not just how to run them) and why the cleaner degradation matters.
- Wrapper gotchas section: Bun idleTimeout, Colima bind-mount cache
staleness, and the host.docker.internal-from-blackbox workaround.
- /health/pelias endpoint is now listed in the API table since it's
the integration point with blackbox monitoring.
- Testing section added — explicitly "no automated tests yet", with a
curl-based manual smoke test set a human can run after changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pelias /autocomplete deliberately excludes the address layer as a
performance optimization, so queries like "Marktstätte Konstanz"
(street + locality) return 0 venue matches even though they're clearly
in the index. /search covers all layers including addresses and streets.
Query /autocomplete first (fast, fuzzy, great for venue names), and if
it returns nothing, try /search. Best of both worlds: quick matches for
"Konzil Restaurant" plus reliable matches for street addresses.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two production follow-ups surfaced after the deploy:
1. Pelias API was emitting continuous `ENOTFOUND placeholder`, `pip`,
`interpolation` errors because we declared those services in
pelias.json but never actually run them (we don't need WOF
admin lookup or street interpolation for the DACH use case).
Removed the stale entries — Pelias degrades cleanly to
libpostal-only parsing, which is what we want.
2. Bun.serve's default idleTimeout is 10s, which is too tight for
cold Pelias queries hitting Elasticsearch. Raise to 60s so
first-query-after-idle doesn't get cut off.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
blackbox-exporter can't resolve host.docker.internal on Colima, so
probes of host.docker.internal:4000 and :9200 always fail. Instead,
add a /health/pelias endpoint on the Hono wrapper that proxies to
the Pelias API, and update prometheus.yml to probe the wrapper's
proxied health endpoint.
Also simplifies the status page friendly_name() now that we don't
need to display the host.docker.internal targets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Port 4400 collides with mana-infra-landings (status.mana.how nginx)
on the production mac mini. libpostal is only reached internally by
pelias-api over the pelias compose network anyway — no host binding
needed. Use expose instead of ports to drop the host mapping.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Production deployment + observability for the self-hosted geocoding stack:
**docker-compose.macmini.yml**
- New mana-geocoding container (port 3018, internal-only — no traefik
labels, no Cloudflare route). Uses host.docker.internal to reach the
Pelias API on the host's pelias compose stack. Dockerfile added under
services/mana-geocoding/ using the same Bun/Hono pattern as mana-events.
**Prometheus**
- New blackbox-internal job probing mana-geocoding:3018/health, the
Pelias API on host.docker.internal:4000/v1/status, and Elasticsearch
at host.docker.internal:9200/_cluster/health. Kept separate from
blackbox-api which is reserved for public HTTPS endpoints.
**status.mana.how (generate-status-page.sh)**
- Include blackbox-internal in the metric query and add an "Interne
Dienste" section with its own summary card, right between Infrastruktur
and GPU Dienste. Summary grid goes from 4 to 5 columns with a
900px breakpoint.
- friendly_name() now handles http:// URLs and rewrites container-name
hosts like mana-geocoding:3018/health → "Mana Geocoding",
host.docker.internal:4000 → "Pelias API",
host.docker.internal:9200 → "Pelias Elasticsearch".
**Grafana uptime dashboard**
- Add an "Internal" series to the "Alle Dienste — Uptime-Verlauf" panel
- New "Interne Dienste Status" table panel showing per-instance up/down
- New "Geocoding Ø Latenz" stat panel for probe_duration_seconds
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Expand services/mana-geocoding/CLAUDE.md with:
- The Pelias API patch (geojsonify_place_details.js) that forces the
category field to always be returned, with regeneration instructions
- The priority-ordered Pelias→PlaceCategory mapping and verified
example mappings from the DACH index
- A full initial-import walkthrough covering the non-obvious gotchas
(analysis-icu plugin, dach-latest → planet-latest rename, adminLookup
disabled, leveldbpath, libpostal config object form, boundary.country
single-value constraint)
Also register mana-geocoding in the root services list.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pelias hides the 'category' field from API responses unless the
caller filters by categories=... explicitly — a default intended for
keyword search that strips category metadata from address queries.
Patch the Pelias API's geojsonify_place_details.js so the category
array is returned on every feature (food, retail, transport, …),
mounted into the container as a read-only volume override.
Rewrite category-map.ts to map Pelias' OSM taxonomy to our 7
PlaceCategories using a priority-ordered list so a restaurant
tagged ['food','retail','nightlife'] resolves to 'food' (the most
specific), not 'shopping'.
Verified with Konstanz test queries:
Konzil Restaurant → food
Bahnhof Konstanz → transit
Physiotherapie-Schule → work
MX-Park → leisure
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After importing 22M OSM objects for the DACH extract:
- Disable adminLookup (no WOF data needed for address search)
- Configure leveldb path inside the data volume
- Specify planet-latest.osm.pbf as the import filename
- Convert libpostal service config from string to object form
- Drop boundary.country default — Pelias only accepts a single
country value, and our index only contains DACH data anyway
Verified forward + reverse geocoding work end-to-end for Konstanz
test queries via the mana-geocoding wrapper on port 3018.
Known limitation: OSM category/type (amenity:restaurant etc.) is
not yet populated in Pelias responses — will require whitelisting
those tags in the importer config and re-running the import.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New mana-geocoding service (port 3018) wraps a self-hosted Pelias
instance with LRU caching and OSM→PlaceCategory auto-mapping.
All geocoding queries stay within our infrastructure — no user
location data leaves the network.
Places module integration:
- Address autocomplete search in ListView (creates place with
name, coords, address, category in one step)
- Address search + reverse geocoding button in DetailView
- Auto-fill address via reverse geocoding during tracking
- OSM category mapping (amenity:restaurant→food, shop:*→shopping, etc.)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>