chore(geocoding): remove Pelias + close 3 bypass paths to public Nominatim

Pelias was retired from the Mac mini on 2026-04-28; photon-self
(self-hosted Photon on mana-gpu) has been the live primary since then.
This removes the now-dead Pelias adapter, config, tests, and the
services/mana-geocoding/pelias/ stack: the entire compose file, the
geojsonify_place_details.js patch, and the setup.sh import script.

Provider chain is now `photon-self → photon → nominatim`. The chain
keeps its `privacy: 'local' | 'public'` split, sensitive-query
blocking, coord quantization, and aggressive caching unchanged.
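
For readers new to the wrapper, the privacy split can be pictured roughly
as below. This is a minimal sketch, not the service's code: the `Provider`
shape, `searchViaChain`, `reverseViaChain`, the one-word term list, the
3-decimal grid, and the rule of falling through only on errors are all
assumptions made for illustration.

```typescript
// Illustrative sketch only: every name below is hypothetical, not the repository's API.
type Privacy = 'local' | 'public';

interface Coords {
  lat: number;
  lon: number;
}

interface Provider {
  name: string;
  privacy: Privacy;
  search(query: string): Promise<string[]>;
  reverse(coords: Coords): Promise<string | null>;
}

// Placeholder term list (assumption); the real blocklist lives in the service.
const ASSUMED_SENSITIVE_TERMS = ['praxis'];

function isSensitive(query: string): boolean {
  const q = query.toLowerCase();
  return ASSUMED_SENSITIVE_TERMS.some((term) => q.includes(term));
}

// Assumed precision: 3 decimals, roughly a 100 m grid. The real value is not stated here.
function quantize(c: Coords, decimals = 3): Coords {
  const f = 10 ** decimals;
  return { lat: Math.round(c.lat * f) / f, lon: Math.round(c.lon * f) / f };
}

// Forward search: sensitive queries never reach a 'public' provider.
async function searchViaChain(chain: Provider[], query: string): Promise<string[]> {
  const blockPublic = isSensitive(query);
  for (const p of chain) {
    if (p.privacy === 'public' && blockPublic) continue;
    try {
      return await p.search(query);
    } catch {
      // Provider unreachable: try the next one in the chain.
    }
  }
  return [];
}

// Reverse geocoding: local providers see full precision, public ones a quantized point.
async function reverseViaChain(chain: Provider[], coords: Coords): Promise<string | null> {
  for (const p of chain) {
    const outgoing = p.privacy === 'public' ? quantize(coords) : coords;
    try {
      return await p.reverse(outgoing);
    } catch {
      // Provider unreachable: try the next one in the chain.
    }
  }
  return null;
}
```

In this sketch a sensitive query returns an empty list when the local
provider is down, rather than leaking to a public endpoint.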

Three direct calls to nominatim.openstreetmap.org that bypassed
mana-geocoding now route through the wrapper:

- citycorners/add-city + citycorners/cities/[slug]/add use the shared
  searchAddress() client (browser → same-origin proxy → mana-geocoding
  → photon-self).
- memoro mobile drops its OSM reverse-geocoding fallback entirely;
  Expo's on-device reverse-geocoding stays as the sole path. Routing
  through the wrapper would require a memoro-server proxy endpoint —
  a follow-up if Expo's quality proves insufficient.

Other behavioral changes:

- CACHE_PUBLIC_TTL_MS dropped from 7d to 1h. The long TTL was a
  privacy-amplification trick from the Pelias era; with photon-self
  serving the bulk of traffic, a transient cross-LAN blip was pinning
  cached fallback answers for days. 1h gives quick recovery.
- /health/pelias renamed to /health/photon-self; prometheus blackbox
  config + status-page generator updated.
- mana-geocoding container no longer needs `extra_hosts:
  host.docker.internal:host-gateway` (was only there for the
  Pelias-on-host-network era).
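
The TTL change can be illustrated with a small per-entry-TTL cache. This is
a sketch under assumptions, not the code in this commit: `ttlForSketch` and
`TtlLruSketch` are hypothetical stand-ins for `ttlFor()` and `LRUCache`, the
24h local TTL is an assumed value, and the injectable clock exists only to
make expiry testable. Only the 1h public TTL is taken from this commit.

```typescript
// Illustrative sketch only; names other than CACHE_PUBLIC_TTL_MS are hypothetical.
const CACHE_PUBLIC_TTL_MS = 60 * 60 * 1000; // 1h, per this commit
const ASSUMED_LOCAL_TTL_MS = 24 * 60 * 60 * 1000; // assumption, not from this commit

// Public-fallback answers expire sooner than local-provider answers.
function ttlForSketch(privacy: 'local' | 'public'): number {
  return privacy === 'public' ? CACHE_PUBLIC_TTL_MS : ASSUMED_LOCAL_TTL_MS;
}

interface SketchEntry<T> {
  value: T;
  expiresAt: number;
}

class TtlLruSketch<T> {
  private entries = new Map<string, SketchEntry<T>>();
  private maxSize: number;
  private defaultTtlMs: number;
  private now: () => number;

  constructor(maxSize: number, defaultTtlMs: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.defaultTtlMs = defaultTtlMs;
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired: drop it so the next lookup goes back to the provider chain.
      this.entries.delete(key);
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: T, ttlOverrideMs?: number): void {
    // Delete first so the re-insert goes to the end of the Map's order.
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    const ttl = ttlOverrideMs ?? this.defaultTtlMs;
    this.entries.set(key, { value, expiresAt: this.now() + ttl });
  }
}
```

With a 1h override, a fallback answer cached during a photon-self outage is
dropped after at most an hour, so the next lookup goes through the chain
again.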

113 tests passing. CLAUDE.md rewritten to reflect the post-Pelias
architecture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Till JS 2026-04-28 22:12:26 +02:00
parent 7bca16dfa7
commit 2bbcf14aba
35 changed files with 330 additions and 1262 deletions

@ -1,184 +0,0 @@
/**
* Unit tests for the PeliasPlaceCategory mapping.
*
* This is the subtle part of the service: a Pelias venue often has
* multiple categories (e.g. a restaurant is `['food','retail','nightlife']`)
* and we need to pick the most specific one. The priority list in
* category-map.ts encodes that choice, and these tests lock it in.
*/
import { describe, it, expect } from 'bun:test';
import { mapPeliasToPlaceCategory } from '../category-map';
describe('mapPeliasToPlaceCategory', () => {
describe('priority-ordered multi-category resolution', () => {
it('picks food over retail for a restaurant', () => {
expect(mapPeliasToPlaceCategory(['food', 'retail', 'nightlife'])).toBe('food');
});
it('picks food over retail for a bakery', () => {
// Bakery is tagged food+retail in the Pelias OSM taxonomy
expect(mapPeliasToPlaceCategory(['food', 'retail'])).toBe('food');
});
it('picks food over nightlife for a cafe', () => {
expect(mapPeliasToPlaceCategory(['food', 'nightlife'])).toBe('food');
});
it('picks transit over professional for a car_rental', () => {
// car_rental is tagged transport+professional in Pelias
expect(mapPeliasToPlaceCategory(['transport', 'professional'])).toBe('transit');
});
it('picks transit for a bus_station (multiple transport subcategories)', () => {
expect(mapPeliasToPlaceCategory(['transport', 'transport:public', 'transport:bus'])).toBe(
'transit'
);
});
it('picks transit for a station (transport:rail)', () => {
expect(
mapPeliasToPlaceCategory([
'transport',
'transport:public',
'transport:station',
'transport:rail',
])
).toBe('transit');
});
});
describe('single-category resolution', () => {
it('maps food to food', () => {
expect(mapPeliasToPlaceCategory(['food'])).toBe('food');
});
it('maps retail to shopping', () => {
expect(mapPeliasToPlaceCategory(['retail'])).toBe('shopping');
});
it('maps transport to transit', () => {
expect(mapPeliasToPlaceCategory(['transport'])).toBe('transit');
});
it('maps education to work', () => {
expect(mapPeliasToPlaceCategory(['education'])).toBe('work');
});
it('maps professional to work', () => {
expect(mapPeliasToPlaceCategory(['professional'])).toBe('work');
});
it('maps government to work', () => {
expect(mapPeliasToPlaceCategory(['government'])).toBe('work');
});
it('maps finance to work', () => {
expect(mapPeliasToPlaceCategory(['finance'])).toBe('work');
});
it('maps entertainment to leisure', () => {
expect(mapPeliasToPlaceCategory(['entertainment'])).toBe('leisure');
});
it('maps nightlife to leisure', () => {
expect(mapPeliasToPlaceCategory(['nightlife'])).toBe('leisure');
});
it('maps recreation to leisure', () => {
expect(mapPeliasToPlaceCategory(['recreation'])).toBe('leisure');
});
it('maps health to other', () => {
expect(mapPeliasToPlaceCategory(['health'])).toBe('other');
});
it('maps religion to other', () => {
expect(mapPeliasToPlaceCategory(['religion'])).toBe('other');
});
});
describe('real-world Pelias venue categories', () => {
// These are literal category arrays observed from the Konstanz DACH
// index during the 2026-04-11 deploy verification. Locking them in
// as regression tests so future priority changes can't silently
// break address search in production.
it('Konzil Restaurant Konstanz → food', () => {
expect(mapPeliasToPlaceCategory(['food', 'retail', 'nightlife'])).toBe('food');
});
it('Stuttgart Hauptbahnhof → transit', () => {
expect(
mapPeliasToPlaceCategory([
'transport',
'transport:public',
'transport:station',
'transport:rail',
])
).toBe('transit');
});
it('Physiotherapie-Schule → work', () => {
expect(mapPeliasToPlaceCategory(['education'])).toBe('work');
});
it('MX-Park (Rennstrecke) → leisure', () => {
expect(mapPeliasToPlaceCategory(['recreation'])).toBe('leisure');
});
it('KulturKiosk → work', () => {
// KulturKiosk is tagged professional in Pelias
expect(mapPeliasToPlaceCategory(['professional'])).toBe('work');
});
it('Kölner Domshop → shopping', () => {
expect(mapPeliasToPlaceCategory(['retail'])).toBe('shopping');
});
});
describe('empty / null / unknown categories', () => {
it('returns other for empty array', () => {
expect(mapPeliasToPlaceCategory([])).toBe('other');
});
it('returns other for undefined', () => {
expect(mapPeliasToPlaceCategory(undefined)).toBe('other');
});
it('returns other for null', () => {
expect(mapPeliasToPlaceCategory(null)).toBe('other');
});
it('returns other for unknown category strings', () => {
expect(mapPeliasToPlaceCategory(['random', 'unknown'])).toBe('other');
});
it('picks known category even if unknown ones come first', () => {
expect(mapPeliasToPlaceCategory(['unknown', 'food'])).toBe('food');
});
});
describe('Pelias layer fallback', () => {
it('uses layer hint for venue with no categories', () => {
expect(mapPeliasToPlaceCategory(undefined, 'venue')).toBe('other');
});
it('uses layer hint for address', () => {
expect(mapPeliasToPlaceCategory(undefined, 'address')).toBe('other');
});
it('uses layer hint for street', () => {
expect(mapPeliasToPlaceCategory(undefined, 'street')).toBe('other');
});
it('uses layer hint for locality', () => {
expect(mapPeliasToPlaceCategory(undefined, 'locality')).toBe('other');
});
it('prefers categories over layer hint', () => {
// A venue with food category should be food, not other
expect(mapPeliasToPlaceCategory(['food'], 'venue')).toBe('food');
});
});
});

@ -2,8 +2,6 @@
* Unit tests for the raw-OSM-tag PlaceCategory mapper.
*
* Covers the cases Photon and Nominatim emit for typical DACH queries.
* The Pelias mapper has its own tests in category-map.test.ts; this file
* tests *only* the raw-OSM-tag path used by the public-API fallbacks.
*/
import { describe, expect, it } from 'bun:test';
@ -54,7 +52,7 @@ describe('mapOsmTagToPlaceCategory', () => {
expect(mapOsmTagToPlaceCategory('aeroway', 'aerodrome')).toBe('transit');
});
it('amenity:car_rental → transit', () => {
// Matches Pelias mapper's "car_rental → transit" decision
// car_rental → transit (transport-flavored)
expect(mapOsmTagToPlaceCategory('amenity', 'car_rental')).toBe('transit');
});
});
@ -116,7 +114,7 @@ describe('mapOsmTagToPlaceCategory', () => {
describe('other (health/religion/unknown)', () => {
it('amenity:hospital → other', () => {
// Health goes to other (matches Pelias mapper)
// Health goes to other
expect(mapOsmTagToPlaceCategory('amenity', 'hospital')).toBe('other');
});
it('amenity:pharmacy → other', () => {

@ -1,7 +1,7 @@
/**
* Simple in-memory LRU cache with TTL for geocoding results.
* Geocoding results rarely change, so we cache aggressively to
* reduce load on the Pelias instance.
* Geocoding results rarely change, so we cache to reduce load on
* upstream providers.
*/
interface CacheEntry<T> {
@ -37,11 +37,10 @@ export class LRUCache<T> {
/**
* Insert or update a cache entry.
*
* @param ttlOverrideMs Optional per-entry TTL. Useful when results
* from public-API providers should live longer than results from
* the (frequently-changing) local Pelias index, e.g. 7 days for
* Photon/Nominatim answers, 24 hours for Pelias answers. When
* omitted, the constructor's default TTL applies.
* @param ttlOverrideMs Optional per-entry TTL. The route layer uses
* this so public-fallback answers expire faster than local-provider
* answers; see `ttlFor()` in routes/geocode.ts. When omitted, the
* constructor's default TTL applies.
*/
set(key: string, value: T, ttlOverrideMs?: number): void {
// Delete first so re-insert goes to end

@ -1,89 +1,10 @@
/**
* Maps Pelias categories (OSM taxonomy) to our 7 Places categories.
*
* Pelias' openstreetmap importer tags venues with categories from its
* built-in taxonomy (food, retail, transport, health, education, …).
* We collapse those into the simpler Places enum:
* The 7 Places categories used across the geocoding wrapper and clients.
*
* home · work · food · shopping · transit · leisure · other
*
* A venue can have multiple Pelias categories (e.g. a restaurant is
* tagged `['food', 'retail', 'nightlife']`). We pick the most specific
* one in priority order rather than the first: a restaurant should be
* "food" even though "retail" also matches.
* Provider-specific mappers (see `osm-category-map.ts` for Photon /
* Nominatim) collapse the upstream taxonomy into this shape. `home` is
* never auto-detected; it's set manually by the user.
*/
export type PlaceCategory = 'home' | 'work' | 'food' | 'shopping' | 'transit' | 'leisure' | 'other';
/**
* Priority-ordered: first matching category wins. Earlier entries are
* more specific, so "food" beats "retail", "transport" beats "professional".
*/
const PELIAS_PRIORITY: Array<[string, PlaceCategory]> = [
// Food is strongest signal — a restaurant is food, not retail
['food', 'food'],
// Transit/transport
['transport:public', 'transit'],
['transport:air', 'transit'],
['transport:sea', 'transit'],
['transport:bus', 'transit'],
['transport:taxi', 'transit'],
['transport', 'transit'],
// Shopping — explicit retail markers
['retail', 'shopping'],
// Leisure / entertainment / recreation
['entertainment', 'leisure'],
['nightlife', 'leisure'],
['recreation', 'leisure'],
// Work-ish
['education', 'work'],
['professional', 'work'],
['government', 'work'],
['finance', 'work'],
// Health/religion fall through to other
['health', 'other'],
['religion', 'other'],
];
/**
* Derive a PlaceCategory from a Pelias feature's category array.
*
* @param categories The `category` array from a Pelias feature's properties
* @param peliasLayer The Pelias layer (venue, address, street, …), used as fallback hint
*/
export function mapPeliasToPlaceCategory(
categories?: string[] | null,
peliasLayer?: string
): PlaceCategory {
if (Array.isArray(categories) && categories.length > 0) {
// Walk our priority list and pick the first match
for (const [peliasCat, placeCat] of PELIAS_PRIORITY) {
if (categories.includes(peliasCat)) return placeCat;
}
}
// Fallback: use Pelias layer as a hint. Addresses/streets/regions
// all land in "other" since they aren't really "places" in the
// categorical sense.
if (peliasLayer) {
switch (peliasLayer) {
case 'venue':
return 'other';
case 'address':
case 'street':
return 'other';
case 'neighbourhood':
case 'locality':
case 'region':
case 'country':
return 'other';
}
}
return 'other';
}

@ -2,15 +2,9 @@
* Maps raw OSM `class:type` tags (Photon's `osm_key:osm_value`,
* Nominatim's `class:type`) to our 7 PlaceCategories.
*
* Pelias has a curated multi-category taxonomy (`food`, `retail`,
* `transport`, …) that we map via `category-map.ts`. Photon and Nominatim
* return raw OSM tags instead (`amenity:restaurant`, `shop:supermarket`,
* `public_transport:station`, etc.), so they need a different lookup.
*
* The list below is intentionally narrow: it only covers tags we actually
* see in real Photon/Nominatim responses for DACH queries. Anything else
* falls through to `other`, which matches the Pelias mapper's behavior for
* unknown categories.
* falls through to `other`.
*
* If a query returns a tag we don't handle, that's the signal to add it
* here, not to try to enumerate all 1000+ OSM types.
@ -25,8 +19,8 @@ interface Tag {
/**
* Priority-ordered: first match wins. More-specific entries (with a
* `value`) come before generic key-only entries. Matches Pelias's
* "food beats retail" priority intent.
* `value`) come before generic key-only entries. Same "food beats retail"
* priority intent as the upstream taxonomies.
*/
const OSM_RULES: Array<{ match: Tag; category: PlaceCategory }> = [
// ── Food (highest priority — restaurants are food, even when also
@ -82,7 +76,7 @@ const OSM_RULES: Array<{ match: Tag; category: PlaceCategory }> = [
{ match: { key: 'amenity', value: 'embassy' }, category: 'work' },
{ match: { key: 'office' }, category: 'work' },
// ── Health / religion → other (matches Pelias mapper) ───────────
// ── Health / religion → other ───────────────────────────────────
{ match: { key: 'amenity', value: 'hospital' }, category: 'other' },
{ match: { key: 'amenity', value: 'clinic' }, category: 'other' },
{ match: { key: 'amenity', value: 'doctors' }, category: 'other' },

@ -14,7 +14,7 @@
* not telling Photon "user is at THIS HOUSE". Reverse geocoding
* against the city block instead of the building is acceptable.
*
* Pelias and other LAN-local providers always get the original
* Photon-self and other LAN-local providers always get the original
* full-precision coordinates; quantization only applies on the way
* out to the public internet.
*/

View file

@ -12,7 +12,7 @@
*
* Trade-offs:
* - False positives are OK (a user searching for "Praxis Müller" who
* wanted the dance studio gets 0 results when Pelias is down; not
* wanted the dance studio gets 0 results when photon-self is down; not
* ideal but better than a privacy leak)
* - False negatives are NOT OK (we'd rather over-block than under-block)
* - The list is intentionally narrow: only words with clear medical or