managarten/services/mana-geocoding/CLAUDE.md
Till JS f7de9fdf2d docs(geocoding): document the Pelias category patch + import gotchas
Expand services/mana-geocoding/CLAUDE.md with:
- The Pelias API patch (geojsonify_place_details.js) that forces the
  category field to always be returned, with regeneration instructions
- The priority-ordered Pelias→PlaceCategory mapping and verified
  example mappings from the DACH index
- A full initial-import walkthrough covering the non-obvious gotchas
  (analysis-icu plugin, dach-latest → planet-latest rename, adminLookup
  disabled, leveldbpath, libpostal config object form, boundary.country
  single-value constraint)

Also register mana-geocoding in the root services list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 15:50:40 +02:00

7.9 KiB

mana-geocoding

Self-hosted geocoding service. Wraps a local Pelias instance (DACH region) with caching and automatic OSM → PlaceCategory mapping. All geocoding queries stay within our infrastructure — no user location data leaves the network.

Tech Stack

Layer Technology
Runtime Bun
Framework Hono
Geocoding Pelias (self-hosted, Elasticsearch-backed)
Data OpenStreetMap DACH extract (DE/AT/CH)
Caching In-memory LRU (5000 entries, 24h TTL)

Port: 3018

Quick Start

# 1. Start Pelias stack (first time: run setup.sh for data import)
cd services/mana-geocoding/pelias
docker compose up -d
# First time only:
chmod +x setup.sh && ./setup.sh

# 2. Start the Hono wrapper
cd services/mana-geocoding
bun run dev

API Endpoints

All endpoints are public (no auth required) — the service is internal-only, not exposed to the internet.

Method Path Description
GET /api/v1/geocode/search?q=... Forward geocoding / autocomplete
GET /api/v1/geocode/reverse?lat=...&lon=... Reverse geocoding
GET /api/v1/geocode/stats Cache statistics
GET /health Health check

Search params

Param Required Description
q yes Search query (min 2 chars)
limit no Max results (default 5, max 20)
lang no Language (default de)
focus.lat no Bias results towards this latitude
focus.lon no Bias results towards this longitude

Reverse params

Param Required Description
lat yes Latitude
lon yes Longitude
lang no Language (default de)

Response format

{
  "results": [
    {
      "label": "Münster Café, Münsterplatz 3, 78462 Konstanz",
      "name": "Münster Café",
      "latitude": 47.663,
      "longitude": 9.175,
      "address": {
        "street": "Münsterplatz",
        "houseNumber": "3",
        "postalCode": "78462",
        "city": "Konstanz",
        "country": "Germany"
      },
      "category": "food",
      "peliasCategories": ["food", "retail", "nightlife"],
      "confidence": 0.95
    }
  ]
}

Category Mapping

Pelias' OSM importer tags each venue with its own taxonomy (food, retail, transport, health, education, …). We collapse those into the 7 PlaceCategories used by the Places module, using a priority-ordered list so the most specific signal wins:

PlaceCategory Wins if Pelias categories contain
food food (beats retail/nightlife — a restaurant is food)
transit transport, transport:public, transport:air, transport:bus, transport:taxi, transport:sea
shopping retail (when no food present)
leisure entertainment, nightlife, recreation
work education, professional, government, finance
other health, religion, everything else
home (not auto-detected — set manually by the user)

Example mappings verified on the DACH index:

OSM venue Pelias categories → PlaceCategory
Konzil Konstanz Restaurant [food, retail, nightlife] food
Bahnhof Konstanz [transport, transport:station] transit
Physiotherapie-Schule [education] work
MX-Park (Rennstrecke) [recreation] leisure

The priority list lives in src/lib/category-map.ts — update it if you want a Pelias category to map somewhere else.

Critical: the Pelias API patch

By default, Pelias hides the category field from API responses unless the caller explicitly passes ?categories=... — a quirk intended for keyword filtering that also strips category metadata from normal address queries. We work around this by mounting a patched copy of helper/geojsonify_place_details.js over the upstream one in the pelias-api container (pelias/geojsonify_place_details.js). The patch changes condition: checkCategoryParamcondition: () => true so the category array always flows through to the wrapper.

If you bump the pelias/api image, regenerate the patched file:

cd services/mana-geocoding/pelias
docker run --rm pelias/api:latest cat /code/pelias/api/helper/geojsonify_place_details.js \
  | sed 's|condition: checkCategoryParam|condition: () => true|' \
  > geojsonify_place_details.js
docker compose up -d --force-recreate api

Architecture

Client (Places module)
  → mana-geocoding (Hono, port 3018)
    → LRU cache check
    → Pelias API (port 4000) [patched — see above]
      → Elasticsearch (port 9200)

Configuration

PORT=3018
PELIAS_API_URL=http://localhost:4000/v1
CORS_ORIGINS=http://localhost:5173,https://mana.how
CACHE_MAX_ENTRIES=5000
CACHE_TTL_MS=86400000

Pelias Infrastructure

The Pelias stack runs as a separate docker-compose in pelias/:

  • elasticsearch — Index storage (Docker volume, ~5GB for DACH after indexing 22.1M OSM objects — 18.3M addresses + 3.86M venues)
  • api — HTTP API (port 4000), patched for category passthrough
  • libpostal — Address parsing (port 4400)
  • Import containers — Run once for initial data load, then stopped

RAM usage (running): ~1.5GB (elasticsearch 512MB + api + libpostal)

Initial import (one-time)

The DACH PBF extract is ~5GB and takes 30-45 minutes to index. See pelias/setup.sh for the full pipeline. Key steps, in order:

  1. docker compose up -d — bring up ES, api, libpostal
  2. docker exec pelias-elasticsearch elasticsearch-plugin install analysis-icu then restart — the official ES image doesn't ship analysis-icu which Pelias' schema mapping requires
  3. docker compose --profile import run --rm schema ./bin/create_index
  4. docker compose --profile import run --rm openstreetmap ./bin/download (downloads dach-latest.osm.pbf from Geofabrik, ~5GB)
  5. Rename dach-latest.osm.pbfplanet-latest.osm.pbf inside the pelias-data volume (Pelias' importer expects that filename). The pelias.json config references it as planet-latest.osm.pbf too.
  6. docker compose --profile import run --rm openstreetmap ./bin/start (22M objects, ~30 min on an M2 Mac mini)

pelias.json gotchas

A few non-obvious settings required for a self-hosted DACH deployment:

  • adminLookup.enabled: false — Pelias tries to resolve country/region hierarchies via "Who's On First" data by default. We don't import WOF, so this must be disabled or import crashes with unable to locate sqlite folder.
  • leveldbpath: "/data/leveldb" — not /tmp/leveldb; the container user (1001) needs write access and /tmp is not mounted.
  • api.services.libpostal: { url: "..." } — must be an object, not a string. The API's Joi schema rejects the string form.
  • No defaultParameters.boundary.country — Pelias only accepts a single country value for boundary.country. Since our index only contains DACH data anyway, we drop the filter entirely.
  • features: { filename: "planet-latest.osm.pbf" } — required because Geofabrik downloads come named dach-latest.osm.pbf, but Pelias' openstreetmap importer looks for planet-latest.osm.pbf by default.

Code Layout

src/
├── index.ts              # Bootstrap
├── app.ts                # Hono app factory
├── config.ts             # Environment config
├── routes/
│   ├── geocode.ts        # Forward + reverse endpoints with caching
│   └── health.ts
└── lib/
    ├── cache.ts          # LRU cache with TTL
    └── category-map.ts   # OSM → PlaceCategory mapping
pelias/
├── docker-compose.yml    # Pelias stack
├── pelias.json           # Pelias config (DACH region)
└── setup.sh              # Initial data import script