mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-15 01:41:08 +02:00
quantization + extended cache TTL for public answers
Three independent defenses limit what public geocoding APIs (Photon,
Nominatim) can learn from our outbound traffic:
1. **Sensitive-query block** (`lib/sensitive-query.ts`)
Queries matching the medical/mental-health/crisis-service keyword
list (Hausarzt, Psychiater, Klinikum, HIV, Frauenhaus, …) are
never forwarded to public APIs. The chain detects sensitivity at
the route layer and runs the search in localOnly mode — providers
with `privacy: 'public'` are filtered out before iteration begins.
When no local provider is available (Pelias stopped), a sensitive
query returns `ok: true` with `results: []` and
`notice: 'sensitive_local_unavailable'` so the UI can show a sensible
message instead of "no results".
The keyword list is documented inline. False negatives are the real
risk, because a missed keyword lets a sensitive query reach a public
API; false positives only cost a 0-result UX hit, which is the better
trade-off.
2. **Coordinate quantization** (`lib/privacy.ts`)
Forward-search focus.lat/lon: rounded to 2 decimals (~1.1 km).
Enough for the location bias to work while hiding the exact GPS fix.
Reverse-geocoding lat/lon: rounded to 3 decimals (~110 m).
City-block resolution: sufficient for "what's near me?" and
avoids reverse-geocoding the user's exact front door.
Pelias always gets full precision; quantization applies only on the
way out to public APIs. A new `privacy: 'local' | 'public'` field on
the GeocodingProvider interface drives this.
3. **Extended cache TTL for public answers**
New `cache.publicTtlMs` config option, default 7 days (vs. 24h
for local-provider answers). LRU cache extended with optional
`ttlOverrideMs` per entry. Same query from N users → 1 outbound
request to Photon/Nominatim. Strongest privacy lever we have
over public providers (we can't change their logging, only the
rate at which we feed them queries).
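The per-entry TTL override from item 3 could be sketched roughly as below. This is a minimal illustration, not the repository's cache: only `publicTtlMs`, `ttlOverrideMs`, the 24h default and the 7-day public TTL come from the description above. The class name `TtlLruCache`, the `ttlFor` helper, the injectable clock and the `Map`-based LRU are assumptions made for the example.

```typescript
// Hypothetical sketch of an LRU cache with an optional per-entry TTL override.
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // absolute timestamp in ms
}

class TtlLruCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private maxEntries: number;
  private defaultTtlMs: number;
  private now: () => number;

  constructor(maxEntries: number, defaultTtlMs: number, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.defaultTtlMs = defaultTtlMs;
    this.now = now; // injectable clock so expiry can be tested deterministically
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert to mark as most recently used (Map keeps insertion order).
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, ttlOverrideMs?: number): void {
    const ttl = ttlOverrideMs ?? this.defaultTtlMs;
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + ttl });
    if (this.entries.size > this.maxEntries) {
      // The first key in insertion order is the least recently used one.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

const HOUR_MS = 60 * 60 * 1000;
// Defaults taken from the description: 24h for local answers, 7 days for public ones.
const cacheConfig = { defaultTtlMs: 24 * HOUR_MS, publicTtlMs: 7 * 24 * HOUR_MS };

// Assumed helper: pick the override based on which kind of provider answered.
function ttlFor(privacy: 'local' | 'public'): number | undefined {
  return privacy === 'public' ? cacheConfig.publicTtlMs : undefined;
}
```

A caller would then store a Photon answer with `cache.set(key, answer, ttlFor('public'))`, while a Pelias answer omits the override and falls back to the default TTL.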
Threat coverage:
✓ User IP / identity hidden (already true — wrapper is the proxy)
✓ Exact GPS hidden (quantization)
✓ Sensitive query content protected (block)
~ Non-sensitive query content visible (acceptable trade-off)
~ Aggregate profiling reduced ~10–100× (cache)
✗ TLS-level traffic analysis, compelled disclosure (out of scope)
Tests: 141 (was 115). New coverage:
- privacy.test.ts: quantization rules (locks the privacy claim)
- sensitive-query.test.ts: positive matches across categories +
documented false positives we accept
- chain.test.ts: localOnly mode end-to-end including the
load-bearing assertion that public providers' search() must NEVER
be called when the chain is in localOnly mode (no race window)
- cache.test.ts: per-entry ttlOverride longer + shorter than default
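The property that chain.test.ts locks in can be illustrated with a small sketch: public providers are filtered out before iteration starts, so there is no code path on which their `search()` runs in localOnly mode. Everything here is an assumption apart from the `privacy` field, the two notice strings and the keywords quoted above. In particular, the substring matching, the five-keyword subset, the function names, the `ok: false` result when every provider fails, and the rule that `fallback_used` is set whenever a provider other than the first answers are all invented for illustration.

```typescript
type Privacy = 'local' | 'public';

interface GeocodingProvider {
  name: string;
  privacy: Privacy;
  search(query: string): Promise<string[]>;
}

interface ChainResult {
  ok: boolean;
  results: string[];
  notice?: string;
}

// Illustrative subset only; the real list is documented in lib/sensitive-query.ts.
const SENSITIVE_KEYWORDS = ['hausarzt', 'psychiater', 'klinikum', 'hiv', 'frauenhaus'];

// Assumed matching rule: case-insensitive substring. This over-matches
// (e.g. "Archiv" contains "hiv"), which is the accepted false-positive side.
function isSensitiveQuery(query: string): boolean {
  const q = query.toLowerCase();
  return SENSITIVE_KEYWORDS.some((kw) => q.includes(kw));
}

// Public providers are removed up front, not skipped inside the loop.
function eligibleProviders(providers: GeocodingProvider[], localOnly: boolean): GeocodingProvider[] {
  return localOnly ? providers.filter((p) => p.privacy === 'local') : providers;
}

async function searchChain(
  query: string,
  providers: GeocodingProvider[],
  localOnly: boolean,
): Promise<ChainResult> {
  const candidates = eligibleProviders(providers, localOnly);
  for (let i = 0; i < candidates.length; i++) {
    try {
      const results = await candidates[i].search(query);
      // Assumption: any answer from a non-primary provider counts as a fallback.
      return { ok: true, results, notice: i > 0 ? 'fallback_used' : undefined };
    } catch {
      // Provider unavailable: try the next candidate.
    }
  }
  if (localOnly) {
    return { ok: true, results: [], notice: 'sensitive_local_unavailable' };
  }
  return { ok: false, results: [] }; // assumed shape for "every provider failed"
}

// Route layer: decide sensitivity once, then run the chain in the matching mode.
function handleSearch(query: string, providers: GeocodingProvider[]): Promise<ChainResult> {
  return searchChain(query, providers, isSensitiveQuery(query));
}
```

Filtering before the loop, rather than checking `privacy` inside it, is what makes the "never called" assertion easy to state in a test: a spy on the public provider's `search()` should simply record zero calls.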
Live smoke verified end-to-end:
- "Hausarzt Konstanz" with Pelias down → no public API call,
notice: 'sensitive_local_unavailable'
- "Konstanz" → falls through to Photon, notice: 'fallback_used'
- Reverse with high-precision GPS → Photon receives quantized
coords, returns city-block-level result
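The quantization behind the last smoke case can be sketched as follows. The precisions (2 decimals for forward-search focus, 3 for reverse geocoding) and the `privacy` values come from the description above; the function names, the `LatLon` type and the sample coordinates are hypothetical and not taken from `lib/privacy.ts`.

```typescript
type Privacy = 'local' | 'public';
type RequestKind = 'forward' | 'reverse';

interface LatLon {
  lat: number;
  lon: number;
}

const FORWARD_FOCUS_DECIMALS = 2; // ~1.1 km
const REVERSE_DECIMALS = 3; // ~110 m

// Round to a fixed number of decimal places.
function roundTo(value: number, decimals: number): number {
  const factor = 10 ** decimals;
  return Math.round(value * factor) / factor;
}

// Local providers (Pelias) keep full precision; public ones get rounded coordinates.
function coordsForProvider(coords: LatLon, privacy: Privacy, kind: RequestKind): LatLon {
  if (privacy === 'local') return coords;
  const decimals = kind === 'forward' ? FORWARD_FOCUS_DECIMALS : REVERSE_DECIMALS;
  return { lat: roundTo(coords.lat, decimals), lon: roundTo(coords.lon, decimals) };
}

// Made-up high-precision fix used only for illustration.
const gps: LatLon = { lat: 47.663456, lon: 9.175789 };

const reverseForPhoton = coordsForProvider(gps, 'public', 'reverse'); // { lat: 47.663, lon: 9.176 }
const focusForPhoton = coordsForProvider(gps, 'public', 'forward'); // { lat: 47.66, lon: 9.18 }
const forPelias = coordsForProvider(gps, 'local', 'reverse'); // unchanged
```

Keeping the decision inside one function keyed on `privacy` means a provider cannot receive full-precision coordinates unless it is explicitly declared `'local'`.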
| Name |
|---|
| pelias |
| scripts |
| src |
| CLAUDE.md |
| Dockerfile |
| package.json |
| tsconfig.json |