managarten/services/mana-geocoding/CLAUDE.md
Till JS fc49198992 docs(geocoding): post-migration log + Photon weekly-refresh operator scripts
- Decision report: status flipped to MIGRATED; added migration log with
  five WSL2 gotchas (bzip2 missing, no official Photon image,
  firewall=true blocks cross-LAN, vmIdleTimeout=-1 ineffective,
  PowerShell pre-expansion of bash $(...)) and resource snapshot.
- mana-geocoding CLAUDE.md: PHOTON_SELF_API_URL note now reflects live
  primary status on mana-gpu since 2026-04-28.
- photon-self/: operator scripts for the weekly DB refresh — update.sh
  (atomic-swap with rollback), systemd unit + timer (Sun 03:30 +30min
  jitter, Persistent=true), README with re-installation instructions
  for DR. Currently installed and enabled on mana-gpu.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 21:31:37 +02:00

456 lines
20 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# mana-geocoding
Geocoding service for the Places module. **Provider-chain architecture** — tries a self-hosted Pelias first, falls back to public Photon (komoot) and then public Nominatim (OSM) when Pelias is unhealthy or unreachable. All Pelias-served queries stay on our infrastructure; fallback queries leak the search string to a public OSM endpoint.
## Tech Stack
| Layer | Technology |
|-------|------------|
| **Runtime** | Bun |
| **Framework** | Hono |
| **Primary geocoder** | Pelias (self-hosted, Elasticsearch-backed) |
| **Fallback 1** | [Photon](https://photon.komoot.io) (public, no rate limit advertised) |
| **Fallback 2** | [Nominatim](https://nominatim.openstreetmap.org) (public, 1 req/sec strict) |
| **Data** | OpenStreetMap DACH extract (DE/AT/CH) for Pelias; global OSM for the public fallbacks |
| **Caching** | In-memory LRU (5000 entries, 24h TTL) — applies to all provider answers |
## Port: 3018
## Quick Start
```bash
# 1. Start Pelias stack (first time: run setup.sh for data import)
cd services/mana-geocoding/pelias
docker compose up -d
# First time only:
chmod +x setup.sh && ./setup.sh
# 2. Start the Hono wrapper
cd services/mana-geocoding
bun run dev
```
## API Endpoints
All endpoints are public (no auth required) — the service is internal-only, not exposed to the internet.
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/v1/geocode/search?q=...` | Forward geocoding / autocomplete |
| GET | `/api/v1/geocode/reverse?lat=...&lon=...` | Reverse geocoding |
| GET | `/api/v1/geocode/stats` | Cache statistics |
| GET | `/health` | Wrapper health |
| GET | `/health/pelias` | Upstream Pelias health (used by blackbox monitoring) |
### Forward-search strategy
The wrapper queries Pelias `/autocomplete` first (fast, fuzzy, optimised for
venue names like "Konzil Restaurant"). If that returns zero features, it
falls back to `/search`, which covers the address layer that autocomplete
deliberately excludes as a performance optimisation.
This gives the best of both worlds: quick venue matches for free-text
queries AND reliable results for street-style queries like "Marktstätte
Konstanz". See `src/routes/geocode.ts` — the fallback is baked into the
forward handler.
### Search params
| Param | Required | Description |
|-------|----------|-------------|
| `q` | yes | Search query (min 2 chars) |
| `limit` | no | Max results (default 5, max 20) |
| `lang` | no | Language (default `de`) |
| `focus.lat` | no | Bias results towards this latitude |
| `focus.lon` | no | Bias results towards this longitude |
### Reverse params
| Param | Required | Description |
|-------|----------|-------------|
| `lat` | yes | Latitude |
| `lon` | yes | Longitude |
| `lang` | no | Language (default `de`) |
### Response format
```json
{
"results": [
{
"label": "Münster Café, Münsterplatz 3, 78462 Konstanz",
"name": "Münster Café",
"latitude": 47.663,
"longitude": 9.175,
"address": {
"street": "Münsterplatz",
"houseNumber": "3",
"postalCode": "78462",
"city": "Konstanz",
"country": "Germany"
},
"category": "food",
"peliasCategories": ["food", "retail", "nightlife"],
"confidence": 0.95
}
]
}
```
## Category Mapping
Pelias' OSM importer tags each venue with its own taxonomy (`food`, `retail`,
`transport`, `health`, `education`, …). We collapse those into the 7
PlaceCategories used by the Places module, using a **priority-ordered list**
so the most specific signal wins:
| PlaceCategory | Wins if Pelias categories contain |
|---------------|-----------------------------------|
| `food` | `food` (beats retail/nightlife — a restaurant is food) |
| `transit` | `transport`, `transport:public`, `transport:air`, `transport:bus`, `transport:taxi`, `transport:sea` |
| `shopping` | `retail` (when no `food` present) |
| `leisure` | `entertainment`, `nightlife`, `recreation` |
| `work` | `education`, `professional`, `government`, `finance` |
| `other` | `health`, `religion`, everything else |
| `home` | (not auto-detected — set manually by the user) |
**Example mappings verified on the DACH index:**
| OSM venue | Pelias categories | → PlaceCategory |
|-----------|-------------------|-----------------|
| Konzil Konstanz Restaurant | `[food, retail, nightlife]` | `food` |
| Bahnhof Konstanz | `[transport, transport:station]` | `transit` |
| Physiotherapie-Schule | `[education]` | `work` |
| MX-Park (Rennstrecke) | `[recreation]` | `leisure` |
The priority list lives in `src/lib/category-map.ts` — update it if you want
a Pelias category to map somewhere else.
### Critical: the Pelias API patch
By default, Pelias **hides** the `category` field from API responses unless
the caller explicitly passes `?categories=...` — a quirk intended for keyword
filtering that also strips category metadata from normal address queries. We
work around this by mounting a **patched copy** of
`helper/geojsonify_place_details.js` over the upstream one in the `pelias-api`
container (`pelias/geojsonify_place_details.js`). The patch changes
`condition: checkCategoryParam``condition: () => true` so the category
array always flows through to the wrapper.
If you bump the `pelias/api` image, regenerate the patched file:
```bash
cd services/mana-geocoding/pelias
docker run --rm pelias/api:latest cat /code/pelias/api/helper/geojsonify_place_details.js \
| sed 's|condition: checkCategoryParam|condition: () => true|' \
> geojsonify_place_details.js
docker compose up -d --force-recreate api
```
## Configuration
```env
PORT=3018
# --- Provider chain (tried in order) ----------------------------------
# Default order: photon-self,pelias,photon,nominatim
# `photon-self` is silently dropped if PHOTON_SELF_API_URL is unset.
GEOCODING_PROVIDERS=photon-self,pelias,photon,nominatim
PROVIDER_TIMEOUT_MS=8000 # per-provider request timeout (cold-start safe)
PROVIDER_HEALTH_CACHE_MS=30000 # health-cache TTL — skip dead providers
# --- Self-hosted Photon (privacy: 'local', PRIMARY since 2026-04-28) --
# Live on mana-gpu (Windows 11, WSL2-Ubuntu, Docker, Photon Europe-wide
# Java JAR + OpenSearch). Cross-LAN reach via WSL2 mirrored networking.
# Set in .env.macmini; flow into the container via docker-compose env.
PHOTON_SELF_API_URL=http://192.168.178.11:2322
# --- Pelias (legacy, currently stopped — privacy: 'local') ------------
PELIAS_API_URL=http://pelias-api:4000/v1
# --- Public Photon (privacy: 'public', last-resort fallback) ----------
PHOTON_API_URL=https://photon.komoot.io
# --- Nominatim (fallback 2) -------------------------------------------
NOMINATIM_API_URL=https://nominatim.openstreetmap.org
NOMINATIM_USER_AGENT=mana-geocoding/1.0 (+https://mana.how; kontakt@memoro.ai)
NOMINATIM_INTERVAL_MS=1100 # >= 1000 to honor 1 req/sec policy
# --- Misc -------------------------------------------------------------
CORS_ORIGINS=http://localhost:5173,https://mana.how
CACHE_MAX_ENTRIES=5000
CACHE_TTL_MS=86400000 # 24h — used for local-provider answers
CACHE_PUBLIC_TTL_MS=604800000 # 7d — extended TTL for public-API answers (privacy)
```
To **disable a provider**, drop it from `GEOCODING_PROVIDERS`. To run with
no local backend at all, set `GEOCODING_PROVIDERS=photon,nominatim`
the wrapper will block sensitive queries (see Privacy hardening below)
since no `privacy: 'local'` provider is reachable.
The dual-Photon split:
- `photon-self` — self-hosted Photon (mana-gpu), `privacy: 'local'`, eligible
for sensitive queries. Registered iff `PHOTON_SELF_API_URL` is set.
- `photon` — public komoot.io endpoint, `privacy: 'public'`, last-resort
fallback for non-sensitive queries when self-hosted is down.
Both share the same `PhotonProvider` class — only the URL, name, and
privacy stance differ. See the [migration runbook](../../docs/runbooks/photon-on-mana-gpu.md)
and [decision report](../../docs/reports/geocoding-self-hosting-2026-04-28.md)
for the operational story.
## Provider-chain semantics
The `ProviderChain` (`src/providers/chain.ts`) iterates providers in
priority order and stops on the first success. A provider that returns
**zero results successfully** stops the chain — we don't waste public-API
budget on a query that legitimately doesn't match. Only network errors
(unreachable, 5xx, 429) cause fallthrough.
Per-provider health is cached for `PROVIDER_HEALTH_CACHE_MS` (default 30s).
A failed health probe or a failed search marks the provider unhealthy and
skips it for the rest of the cache window. The next request after the cache
expires re-probes lazily — there is no background health pinger.
```
Client (Places module)
→ mana-geocoding (Hono, port 3018)
→ LRU cache (24h TTL) ← hit: ~0 ms
→ Provider chain
1. Pelias ← reachable: 50200 ms (DACH index, fully featured)
2. Photon ← fallback: 200500 ms public, partial features
3. Nominatim ← last resort: 200800 ms + 1 req/sec queue
```
The response body includes `provider: 'pelias' | 'photon' | 'nominatim'`,
`tried: ProviderName[]`, and an optional `notice` (`'fallback_used'` or
`'sensitive_local_unavailable'`) so the caller can render an
"approximate match" hint or explain why a sensitive query returned 0
results.
## Privacy hardening
When a request goes to Pelias, the user's query content + focus point
stay on our infrastructure. When it falls through to Photon or
Nominatim, the query is forwarded to a third party. Three independent
defenses limit what those third parties can learn:
### 1. Sensitive-query block (`src/lib/sensitive-query.ts`)
Queries matching the medical / mental-health / crisis-service keyword
list (`Hausarzt`, `Psychiater`, `Klinikum`, `Suchtberatung`, `HIV`,
`Frauenhaus`, …) are **never forwarded to public APIs**, even if Pelias
is unreachable. The chain detects sensitivity at the route layer and
calls `chain.search(req, signal, { localOnly: true })` — providers with
`privacy: 'public'` are filtered out *before* the iteration begins, so
there is no race window.
When no local provider is available (e.g. Pelias is stopped), a
sensitive query returns `ok: true, results: [], notice:
'sensitive_local_unavailable'`. The UI should show "Diese Suche bleibt
bewusst lokal — kein Treffer im DACH-Index. Versuche eine allgemeinere
Formulierung." rather than "no results".
The keyword list is documented and maintained inline. False negatives
(a sensitive query slipping through) are the primary risk; false
positives just produce a 0-result UX hit, which is the safer
trade-off.
### 2. Coordinate quantization (`src/lib/privacy.ts`)
Coordinates are rounded before forwarding to public providers:
- **Forward-search focus** (`focus.lat/lon`): rounded to 2 decimals
(~1.1 km). Enough for the "results near me" bias without sending
exact GPS.
- **Reverse-geocoding lat/lon**: rounded to 3 decimals (~110 m).
City-block resolution — sufficient for "what's near me?", avoids
logging exact home/workplace coordinates to a third party.
Pelias always gets full-precision coordinates — quantization only
applies on the way out to public APIs.
### 3. Aggressive caching of public-API answers
`config.cache.publicTtlMs` (default 7 days) overrides the default 24h
cache TTL when the response came from a public provider. Same query
from 1000 different users → 1 outbound request to Photon/Nominatim.
This is the strongest privacy lever we have over public providers,
since we can't change their logging behavior — only the rate at which
we feed them queries.
### What this protects + what it doesn't
| Threat | Protected? |
|---|---|
| Public API sees user's IP | ✓ (wrapper is the proxy, only mac-mini IP goes out) |
| Public API sees user identity / JWT | ✓ (wrapper sends no auth headers) |
| Public API sees query content | partial — sensitive queries blocked entirely, others go through |
| Public API sees user's exact GPS | ✓ (quantized to ~1km / ~110m) |
| Aggregate location-intent profiling | partial — cache reduces volume ~10100× |
| TLS-level traffic analysis (timing) | ✗ (not in scope) |
| Compelled disclosure of public-API logs | ✗ (no legal mitigation) |
Residual risk for non-sensitive queries: "third party learns what
queries our backend made, with timestamps, but not who made them."
Acceptable for restaurant/landmark lookups, blocked for medical lookups.
## Pelias Infrastructure
The Pelias stack runs as a separate docker-compose in `pelias/`:
- **elasticsearch** — Index storage (Docker volume, ~5GB for DACH after
indexing 13.4M OSM objects — 10M addresses + 3.3M venues)
- **api** — HTTP API (port 4000), patched for category passthrough
- **libpostal** — Address parsing (internal only, not exposed on host port
because 4400 collides with mana-infra-landings on the Mac Mini)
- **Import containers** — Run once for initial data load, then stopped
**Production RAM usage** (measured on the Mac Mini after the 2026-04-11 deploy):
| Container | RAM |
|---|---|
| pelias-elasticsearch | ~1.2 GB |
| pelias-libpostal | ~1.9 GB (address parser model) |
| pelias-api | ~100 MB |
| mana-geocoding (wrapper) | ~2060 MB |
Total: **~3.2 GB** — larger than the initial ~1.5 GB estimate because
libpostal loads its full address parser into memory up front.
### Initial import (one-time)
The DACH PBF extract is ~5GB and takes 30-45 minutes to index. See
`pelias/setup.sh` for the full pipeline. Key steps, in order:
1. `docker compose up -d` — bring up ES, api, libpostal
2. `docker exec pelias-elasticsearch elasticsearch-plugin install analysis-icu`
then restart — the official ES image doesn't ship `analysis-icu` which
Pelias' schema mapping requires
3. `docker compose --profile import run --rm schema ./bin/create_index`
4. `docker compose --profile import run --rm openstreetmap ./bin/download`
(downloads `dach-latest.osm.pbf` from Geofabrik, ~5GB)
5. **Rename** `dach-latest.osm.pbf``planet-latest.osm.pbf` inside the
pelias-data volume (Pelias' importer expects that filename). The
`pelias.json` config references it as `planet-latest.osm.pbf` too.
6. `docker compose --profile import run --rm openstreetmap ./bin/start`
(22M objects, ~30 min on an M2 Mac mini)
### pelias.json gotchas
A few non-obvious settings required for a self-hosted DACH deployment:
- **`adminLookup.enabled: false`** — Pelias tries to resolve country/region
hierarchies via "Who's On First" data by default. We don't import WOF,
so this must be disabled or import crashes with `unable to locate sqlite
folder`.
- **`leveldbpath: "/data/leveldb"`** — not `/tmp/leveldb`; the container
user (1001) needs write access and `/tmp` is not mounted.
- **`api.services.libpostal: { url: "..." }`** — must be an object, not a
string. The API's Joi schema rejects the string form.
- **Only declare services you actually run.** We used to list `placeholder`,
`pip`, and `interpolation` in `api.services` but never ran the containers;
Pelias logged `ENOTFOUND` errors on every query. Dropping the unused
entries makes Pelias degrade cleanly to libpostal-only parsing (warns
`service disabled` once at startup, then silent).
- **No `defaultParameters.boundary.country`** — Pelias only accepts a
single country value for `boundary.country`. Since our index only
contains DACH data anyway, we drop the filter entirely.
- **`features: { filename: "planet-latest.osm.pbf" }`** — required because
Geofabrik downloads come named `dach-latest.osm.pbf`, but Pelias'
openstreetmap importer looks for `planet-latest.osm.pbf` by default.
### Wrapper gotchas
- **`idleTimeout: 60`** on `Bun.serve` — the default 10 s cuts off cold
queries that hit Elasticsearch and libpostal in sequence. 60 s is
generous for the worst case while still catching actually-stuck
connections.
- **Colima bind-mount cache.** The mac-mini bind-mounts this repo's files
into several monitoring containers. Colima on macOS sometimes serves a
stale view of a bind-mounted file even after the file on disk changes.
After editing `scripts/generate-status-page.sh` (also bind-mounted into
`mana-status-gen`), restart the consuming container so it sees the
fresh content: `docker restart mana-status-gen`.
- **`host.docker.internal` doesn't resolve from blackbox-exporter** on
Colima, so the external monitoring can't probe pelias-api or
elasticsearch directly. Instead, the wrapper exposes `/health/pelias`
which proxies a request to Pelias; Prometheus probes that internal
endpoint inside the docker network. See `prometheus.yml` job
`blackbox-internal`.
## Testing
Two layers:
### Unit tests (`bun test`)
Fast, no dependencies. Locks in the subtle logic:
```bash
cd services/mana-geocoding
bun test
```
- `src/lib/__tests__/category-map.test.ts` — Pelias→PlaceCategory
priority resolution.
- `src/lib/__tests__/osm-category-map.test.ts` — raw OSM-tag→PlaceCategory
mapping used by Photon + Nominatim (since they emit `class:type` rather
than Pelias's curated taxonomy).
- `src/lib/__tests__/cache.test.ts` — LRU eviction order, TTL expiry,
move-to-end on `get`, size tracking.
- `src/lib/__tests__/rate-limiter.test.ts` — single-token rate limiter
(used to enforce Nominatim's 1 req/sec policy). FIFO order, abort
cleanup, busy-flag release on aborted interval-wait.
- `src/providers/__tests__/chain.test.ts` — provider chain failover, health
cache, "stop on empty results" semantics.
- `src/providers/__tests__/photon-normalizer.test.ts` and
`nominatim-normalizer.test.ts` — locking the wire-format mapping for the
two public fallback providers.
As of the 2026-04-28 privacy-hardening rollout: **141 tests, all green**.
### Smoke test (`bun run test:smoke`)
End-to-end curls against a running service. Requires a fully deployed
Pelias stack with the DACH index loaded — run this after a deploy to
confirm the full pipeline is healthy.
```bash
cd services/mana-geocoding
bun run test:smoke # default http://localhost:3018
./scripts/smoke-test.sh http://mana-geocoding:3018 # from another container
```
Asserts: wrapper + pelias health, restaurant→food, station→transit,
street+locality fallback returns results, focus biasing works, reverse
geocoding for Konstanz and München, cache hit on repeat. 9 checks.
## Code Layout
```
src/
├── index.ts # Bootstrap
├── app.ts # Hono app factory + chain wiring
├── config.ts # Environment config (incl. provider list)
├── routes/
│ ├── geocode.ts # Forward + reverse, delegates to chain
│ └── health.ts # /health, /health/pelias, /health/providers
├── providers/
│ ├── types.ts # GeocodingProvider interface, shared shape
│ ├── chain.ts # Failover orchestrator + health cache
│ ├── pelias.ts # Primary: self-hosted DACH Pelias
│ ├── photon.ts # Fallback 1: photon.komoot.io
│ └── nominatim.ts # Fallback 2: nominatim.openstreetmap.org
└── lib/
├── cache.ts # LRU cache with TTL + per-entry override
├── category-map.ts # Pelias-taxonomy → PlaceCategory
├── osm-category-map.ts # Raw OSM `class:type` → PlaceCategory
├── privacy.ts # Coordinate quantization for public APIs
├── rate-limiter.ts # Single-token limiter (used by Nominatim)
└── sensitive-query.ts # Health/crisis keyword detector
pelias/
├── docker-compose.yml # Pelias stack
├── pelias.json # Pelias config (DACH region)
└── setup.sh # Initial data import script
```