managarten/services/mana-geocoding
Till JS bcc21ca785 feat(geocoding): privacy hardening — sensitive-query block + coord
quantization + extended cache TTL for public answers

Three independent defenses limit what public geocoding APIs (Photon,
Nominatim) can learn from our outbound traffic:

1. **Sensitive-query block** (`lib/sensitive-query.ts`)
   Queries matching the medical/mental-health/crisis-service keyword
   list (Hausarzt, Psychiater, Klinikum, HIV, Frauenhaus, …) are
   never forwarded to public APIs. The chain detects sensitivity at
   the route layer and runs the search in localOnly mode — providers
   with `privacy: 'public'` are filtered out before iteration begins.
   When no local provider is available (Pelias stopped), a sensitive
   query returns ok:true with results:[] and notice:
   'sensitive_local_unavailable' so the UI can show a sensible
   message instead of "no results".

   The keyword list is documented inline. False negatives are the
   risk; false positives just produce a 0-result UX hit (better
   trade-off).

2. **Coordinate quantization** (`lib/privacy.ts`)
   Forward-search focus.lat/lon: rounded to 2 decimals (~1.1km).
     Enough for the bias to work, hides exact GPS.
   Reverse-geocoding lat/lon: rounded to 3 decimals (~110m).
     City-block resolution — sufficient for "what's near me?",
     avoids reverse-geocoding the user's exact front door.
   Pelias always gets full precision; quantization only on the way
   out to public APIs. New `privacy: 'local' | 'public'` field on
   the GeocodingProvider interface drives this.

3. **Extended cache TTL for public answers**
   New `cache.publicTtlMs` config option, default 7 days (vs. 24h
   for local-provider answers). LRU cache extended with optional
   `ttlOverrideMs` per entry. Same query from N users → 1 outbound
   request to Photon/Nominatim. Strongest privacy lever we have
   over public providers (we can't change their logging, only the
   rate at which we feed them queries).

Threat coverage:
   ✓ User IP / identity hidden (already true — wrapper is the proxy)
   ✓ Exact GPS hidden (quantization)
   ✓ Sensitive query content protected (block)
   ~ Non-sensitive query content visible (acceptable trade-off)
   ~ Aggregate profiling reduced ~10–100× (cache)
   ✗ TLS-level traffic analysis, compelled disclosure (out of scope)

Tests: 141 (was 115). New coverage:
- privacy.test.ts: quantization rules (locks the privacy claim)
- sensitive-query.test.ts: positive matches across categories +
  documented false positives we accept
- chain.test.ts: localOnly mode end-to-end including the load-
  bearing assertion that public providers' search() must NEVER be
  called when the chain is in localOnly mode (no race window)
- cache.test.ts: per-entry ttlOverride longer + shorter than default

Live smoke verified end-to-end:
- "Hausarzt Konstanz" with Pelias down → no public API call,
  notice: 'sensitive_local_unavailable'
- "Konstanz" → falls through to Photon, notice: 'fallback_used'
- Reverse with high-precision GPS → Photon receives quantized
  coords, returns city-block-level result
2026-04-28 16:04:56 +02:00
..
pelias fix(geocoding): drop unused Pelias services, raise Bun idleTimeout 2026-04-11 17:41:57 +02:00
scripts test(geocoding): add unit tests + end-to-end smoke test script 2026-04-11 20:21:18 +02:00
src feat(geocoding): privacy hardening — sensitive-query block + coord 2026-04-28 16:04:56 +02:00
CLAUDE.md feat(geocoding): privacy hardening — sensitive-query block + coord 2026-04-28 16:04:56 +02:00
Dockerfile feat(monitoring): add mana-geocoding + Pelias to prod compose, Prometheus, Grafana, and status.mana.how 2026-04-11 16:11:01 +02:00
package.json test(geocoding): add unit tests + end-to-end smoke test script 2026-04-11 20:21:18 +02:00
tsconfig.json feat(places): add self-hosted geocoding with Pelias (DACH) 2026-04-10 23:02:25 +02:00