feat(geocoding): support dual-Photon (self-hosted + public) for GPU migration

The chain now distinguishes two Photon instances:
  photon-self  privacy: 'local'   (self-hosted on mana-gpu)
  photon       privacy: 'public'  (komoot.io, last-resort fallback)

Both wrap the same `PhotonProvider` class with different config — only
the URL, name, and privacy stance differ. The new ProviderName variant
'photon-self' lets the chain track per-provider health for them
independently (a single 'photon' slot would collide in the health
Map).
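
The collision can be sketched as follows. This is a minimal illustration that assumes health is kept in a `Map` keyed by provider name; `HealthEntry` and `recordHealth` are hypothetical names, not the chain's actual API.

```typescript
// Hypothetical sketch: per-provider health keyed by provider name.
type ProviderName = 'pelias' | 'photon-self' | 'photon' | 'nominatim';

interface HealthEntry {
  healthy: boolean;
  checkedAt: number; // timestamp in ms
}

const health = new Map<ProviderName, HealthEntry>();

function recordHealth(name: ProviderName, healthy: boolean, now: number): void {
  // A Map holds one entry per key, so two backends sharing the key
  // 'photon' would overwrite each other's state here.
  health.set(name, { healthy, checkedAt: now });
}

// With distinct names, each Photon instance keeps its own entry.
recordHealth('photon-self', false, 1000); // self-hosted instance is down
recordHealth('photon', true, 1000); // public komoot is up

const selfHealthy = health.get('photon-self')?.healthy;
const publicHealthy = health.get('photon')?.healthy;
```

With a single `'photon'` key, the second `recordHealth` call would have replaced the first, and the chain could not tell which backend was down.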

Opt-in registration: `photon-self` is only built when
PHOTON_SELF_API_URL is set in the env. When unset (current state),
the chain has the same shape as before — full backward compat. After
the GPU migration, flipping the env-var on is the only deploy step
needed:
  PHOTON_SELF_API_URL=http://192.168.178.11:2322
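
The opt-in rule boils down to "blank or missing means unset". A small sketch of that rule, using a hypothetical helper `readOptionalUrl` (the commit itself inlines the same expression in `loadConfig()`):

```typescript
// Hypothetical helper: normalizes an env value so that a missing, empty,
// or whitespace-only string all count as "not configured".
function readOptionalUrl(raw: string | undefined): string | undefined {
  return raw?.trim() || undefined;
}

const unset = readOptionalUrl(undefined);
const empty = readOptionalUrl('');
const blank = readOptionalUrl('   ');
const configured = readOptionalUrl(' http://192.168.178.11:2322 ');
```

Only the last call yields a URL, so only that case would register `photon-self`.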

Default chain order updated to:
  photon-self,pelias,photon,nominatim
  ^^^^^^^^^^^ silently skipped if not registered (env unset)
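
A rough sketch of how the skip could work at chain-build time. It assumes the enabled list is resolved against a map of built providers; `BuiltProvider` and `orderChain` are hypothetical names used only for illustration.

```typescript
type ProviderName = 'pelias' | 'photon-self' | 'photon' | 'nominatim';

// Hypothetical stand-in for a constructed provider instance.
interface BuiltProvider {
  name: ProviderName;
}

function orderChain(
  enabled: ProviderName[],
  built: Map<ProviderName, BuiltProvider>
): BuiltProvider[] {
  const chain: BuiltProvider[] = [];
  for (const name of enabled) {
    const provider = built.get(name);
    // Names that were never registered are skipped without an error.
    if (provider) chain.push(provider);
  }
  return chain;
}

// Pre-migration: photon-self was not built because the env-var is unset.
const built = new Map<ProviderName, BuiltProvider>([
  ['pelias', { name: 'pelias' }],
  ['photon', { name: 'photon' }],
  ['nominatim', { name: 'nominatim' }],
]);

const names = orderChain(['photon-self', 'pelias', 'photon', 'nominatim'], built).map(
  (p) => p.name
);
```

Here `names` keeps the configured order minus the unregistered slot, which is why the default list can stay the same before and after the migration.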

The privacy guarantee is structural: photon-self carries privacy:
'local', so the existing sensitive-query block from the previous
hardening commit now has a real local backend post-migration —
medical/crisis-service queries get real results instead of the
"sensitive_local_unavailable" notice.
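
As an illustration of why the guarantee is structural, here is a simplified sketch of privacy-based routing. It assumes the input lists only reachable providers; `ChainEntry`, `Plan`, and `planProviders` are hypothetical names, and the real sensitive-query block from the hardening commit may differ in detail.

```typescript
type Privacy = 'local' | 'public';

interface ChainEntry {
  name: string;
  privacy: Privacy;
}

type Plan =
  | { kind: 'try'; providers: string[] }
  | { kind: 'blocked'; notice: 'sensitive_local_unavailable' };

// Sensitive queries may only go to providers tagged privacy: 'local'.
function planProviders(reachable: ChainEntry[], sensitive: boolean): Plan {
  if (!sensitive) {
    return { kind: 'try', providers: reachable.map((p) => p.name) };
  }
  const local = reachable.filter((p) => p.privacy === 'local');
  if (local.length === 0) {
    return { kind: 'blocked', notice: 'sensitive_local_unavailable' };
  }
  return { kind: 'try', providers: local.map((p) => p.name) };
}

// Pre-migration: only public providers are reachable.
const before = planProviders(
  [
    { name: 'photon', privacy: 'public' },
    { name: 'nominatim', privacy: 'public' },
  ],
  true
);

// Post-migration: the self-hosted instance is reachable and local.
const after = planProviders(
  [
    { name: 'photon-self', privacy: 'local' },
    { name: 'photon', privacy: 'public' },
  ],
  true
);
```

No query-specific allowlist is involved: the provider's `privacy` tag alone decides whether it may see a sensitive query.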

Tests: 148 (was 141). New coverage:
- src/__tests__/app.test.ts: createChain registration logic — verifies
  photon-self appears iff PHOTON_SELF_API_URL is set, ordering
  honored, GEOCODING_PROVIDERS env-var filter respected
- providers/__tests__/photon-normalizer.test.ts: provider field
  carries 'photon' or 'photon-self' based on the call argument

Recon of mana-gpu (2026-04-28): Windows 11 Pro Build 26200, 64 GB
RAM (56 GB free), 739 GB disk free, no WSL2/Docker yet, no native
GPU services running. Setup plan documented in
docs/runbooks/photon-on-mana-gpu.md (3–4 h, ~1 h of which is
download/unpack waiting).
Till JS 2026-04-28 17:19:04 +02:00
parent 104a5a46a0
commit 153ad8049c
8 changed files with 537 additions and 16 deletions

@ -153,14 +153,22 @@ docker compose up -d --force-recreate api
PORT=3018
# --- Provider chain (tried in order) ----------------------------------
GEOCODING_PROVIDERS=pelias,photon,nominatim
# Default order: photon-self,pelias,photon,nominatim
# `photon-self` is silently dropped if PHOTON_SELF_API_URL is unset.
GEOCODING_PROVIDERS=photon-self,pelias,photon,nominatim
PROVIDER_TIMEOUT_MS=8000 # per-provider request timeout (cold-start safe)
PROVIDER_HEALTH_CACHE_MS=30000 # health-cache TTL — skip dead providers
# --- Pelias (primary) -------------------------------------------------
# --- Self-hosted Photon (privacy: 'local', primary post-migration) ----
# Set this to point at the GPU-server-hosted Photon. When unset, the
# `photon-self` slot is not registered and the chain falls back to
# public providers as before.
PHOTON_SELF_API_URL=http://192.168.178.11:2322
# --- Pelias (legacy, currently stopped — privacy: 'local') ------------
PELIAS_API_URL=http://pelias-api:4000/v1
# --- Photon (fallback 1) ----------------------------------------------
# --- Public Photon (privacy: 'public', last-resort fallback) ----------
PHOTON_API_URL=https://photon.komoot.io
# --- Nominatim (fallback 2) -------------------------------------------
@ -176,9 +184,20 @@ CACHE_PUBLIC_TTL_MS=604800000 # 7d — extended TTL for public-API answe
```
To **disable a provider**, drop it from `GEOCODING_PROVIDERS`. To run with
no Pelias at all (e.g. while it's being migrated), set
`GEOCODING_PROVIDERS=photon,nominatim`. The chain ordering is honored
exactly — the first listed provider is tried first.
no local backend at all, set `GEOCODING_PROVIDERS=photon,nominatim`;
the wrapper will then block sensitive queries (see Privacy hardening below)
since no `privacy: 'local'` provider is reachable.
The dual-Photon split:
- `photon-self` — self-hosted Photon (mana-gpu), `privacy: 'local'`, eligible
for sensitive queries. Registered iff `PHOTON_SELF_API_URL` is set.
- `photon` — public komoot.io endpoint, `privacy: 'public'`, last-resort
fallback for non-sensitive queries when self-hosted is down.
Both share the same `PhotonProvider` class — only the URL, name, and
privacy stance differ. See the [migration runbook](../../docs/runbooks/photon-on-mana-gpu.md)
and [decision report](../../docs/reports/geocoding-self-hosting-2026-04-28.md)
for the operational story.
## Provider-chain semantics

@ -0,0 +1,97 @@
/**
* Tests for the chain wiring in `createChain()`. The behavioral assertions
* here are the migration-critical ones. Make sure that:
* - `photon-self` is registered iff `PHOTON_SELF_API_URL` is set
* - `photon-self` carries `privacy: 'local'` (eligible for sensitive queries)
* - the public `photon` slot stays `privacy: 'public'`
* - chain order is honored (self before public)
*/
import { describe, expect, it } from 'bun:test';
import { createChain } from '../app';
import type { Config } from '../config';
function baseConfig(overrides: Partial<Config> = {}): Config {
return {
port: 3018,
pelias: { apiUrl: 'http://127.0.0.1:1' },
photon: { apiUrl: 'https://photon.komoot.io' },
photonSelf: { apiUrl: undefined },
nominatim: {
apiUrl: 'https://nominatim.openstreetmap.org',
userAgent: 'test',
intervalMs: 1100,
},
cors: { origins: [] },
cache: { maxEntries: 100, ttlMs: 1000, publicTtlMs: 7000 },
providers: {
enabled: ['photon-self', 'pelias', 'photon', 'nominatim'],
healthCacheMs: 30_000,
timeoutMs: 8000,
},
...overrides,
};
}
describe('createChain — photon-self registration', () => {
it('does NOT register photon-self when PHOTON_SELF_API_URL is unset', () => {
const chain = createChain(baseConfig());
const snapshot = chain.getHealthSnapshot();
const names = snapshot.map((p) => p.name);
expect(names).not.toContain('photon-self');
});
it('registers photon-self when PHOTON_SELF_API_URL is set', () => {
const chain = createChain(
baseConfig({
photonSelf: { apiUrl: 'http://192.168.178.11:2322' },
})
);
const snapshot = chain.getHealthSnapshot();
const names = snapshot.map((p) => p.name);
expect(names).toContain('photon-self');
});
it('honors order: photon-self before public photon when both are enabled', () => {
const chain = createChain(
baseConfig({
photonSelf: { apiUrl: 'http://192.168.178.11:2322' },
providers: {
enabled: ['photon-self', 'photon', 'nominatim'],
healthCacheMs: 30_000,
timeoutMs: 8000,
},
})
);
const snapshot = chain.getHealthSnapshot();
// First entry is photon-self, then photon (public), then nominatim.
const names = snapshot.map((p) => p.name);
expect(names[0]).toBe('photon-self');
expect(names).toContain('photon');
expect(names).toContain('nominatim');
});
it('a stray empty PHOTON_SELF_API_URL does not register a useless provider', () => {
// The config loader trims and treats '' as undefined, but defend in
// depth: pass an explicit empty string here too.
const chain = createChain(baseConfig({ photonSelf: { apiUrl: '' } }));
const names = chain.getHealthSnapshot().map((p) => p.name);
expect(names).not.toContain('photon-self');
});
it('photon-self is filtered to enabled list (drop if not in GEOCODING_PROVIDERS)', () => {
const chain = createChain(
baseConfig({
photonSelf: { apiUrl: 'http://192.168.178.11:2322' },
providers: {
// User explicitly excludes photon-self via env-var
enabled: ['photon', 'nominatim'],
healthCacheMs: 30_000,
timeoutMs: 8000,
},
})
);
const names = chain.getHealthSnapshot().map((p) => p.name);
expect(names).not.toContain('photon-self');
});
});

@ -55,11 +55,29 @@ export function createChain(config: Config): ProviderChain {
})
);
// Self-hosted Photon (mana-gpu). Only registered when the env-var is set
// — pre-migration this stays absent and the chain falls through to
// public providers as before. Once the GPU server is running Photon,
// flip PHOTON_SELF_API_URL on and this becomes the primary provider.
if (config.photonSelf.apiUrl) {
built.set(
'photon-self',
new PhotonProvider({
apiUrl: config.photonSelf.apiUrl,
timeoutMs: config.providers.timeoutMs,
name: 'photon-self',
privacy: 'local',
})
);
}
built.set(
'photon',
new PhotonProvider({
apiUrl: config.photon.apiUrl,
timeoutMs: config.providers.timeoutMs,
// name + privacy default to 'photon' / 'public' — public komoot
// is the always-on safety net behind self-hosted.
})
);

@ -11,9 +11,18 @@ export interface Config {
apiUrl: string;
};
photon: {
/** Photon base URL (defaults to public komoot endpoint) */
/** Photon base URL; public komoot endpoint by default. Used by
* the `'photon'` provider slot which always has `privacy: 'public'`. */
apiUrl: string;
};
photonSelf: {
/** Self-hosted Photon URL (e.g. `http://192.168.178.11:2322` for the
* GPU server). When set, the wrapper registers a separate
* `'photon-self'` provider with `privacy: 'local'`, eligible for
* sensitive queries. When undefined, the slot is disabled and the
* chain only has the public providers (current pre-migration state). */
apiUrl: string | undefined;
};
nominatim: {
apiUrl: string;
userAgent: string;
@ -57,6 +66,13 @@ export function loadConfig(): Config {
photon: {
apiUrl: process.env.PHOTON_API_URL || 'https://photon.komoot.io',
},
photonSelf: {
// Opt-in: only registered when this env-var is explicitly set
// (e.g. http://192.168.178.11:2322 once the GPU server is up).
// Empty string → treated as unset so a stray "" in .env doesn't
// register a useless provider.
apiUrl: process.env.PHOTON_SELF_API_URL?.trim() || undefined,
},
nominatim: {
apiUrl: process.env.NOMINATIM_API_URL || 'https://nominatim.openstreetmap.org',
userAgent:
@ -73,7 +89,13 @@ export function loadConfig(): Config {
publicTtlMs: parseInt(process.env.CACHE_PUBLIC_TTL_MS || String(7 * 24 * 60 * 60 * 1000), 10),
},
providers: {
// Default order (when GEOCODING_PROVIDERS is unset): try the
// self-hosted Photon first if it's been configured, then public
// providers as fallback. `photon-self` is silently dropped at
// chain-build time if `photonSelf.apiUrl` is undefined, so the
// list is the same shape regardless of migration status.
enabled: parseProviderList(process.env.GEOCODING_PROVIDERS, [
'photon-self',
'pelias',
'photon',
'nominatim',
@ -90,7 +112,7 @@ export function loadConfig(): Config {
function parseProviderList(raw: string | undefined, fallback: ProviderName[]): ProviderName[] {
if (!raw) return fallback;
const valid: ProviderName[] = ['pelias', 'photon', 'nominatim'];
const valid: ProviderName[] = ['pelias', 'photon-self', 'photon', 'nominatim'];
const parsed = raw
.split(',')
.map((s) => s.trim().toLowerCase())

@ -124,4 +124,30 @@ describe('normalizePhotonFeature', () => {
expect(result.latitude).toBeGreaterThan(47);
expect(result.latitude).toBeLessThan(48);
});
it('stamps provider:"photon" by default (back-compat)', () => {
const result = normalizePhotonFeature({
type: 'Feature',
geometry: { type: 'Point', coordinates: [9.17, 47.66] },
properties: { osm_key: 'place', osm_value: 'city', name: 'X' },
});
expect(result.provider).toBe('photon');
});
it('stamps provider:"photon-self" when called with that name (self-hosted path)', () => {
// The dual-Photon migration relies on this: a result from the
// self-hosted instance must NOT look like it came from public
// komoot. UI uses the provider field to decide whether to show
// the "approximate match" badge — fallback_used notice fires only
// for `privacy: 'public'` providers.
const result = normalizePhotonFeature(
{
type: 'Feature',
geometry: { type: 'Point', coordinates: [9.17, 47.66] },
properties: { osm_key: 'place', osm_value: 'city', name: 'X' },
},
'photon-self'
);
expect(result.provider).toBe('photon-self');
});
});

@ -29,13 +29,23 @@ import type {
export interface PhotonConfig {
apiUrl: string;
timeoutMs: number;
/** Override the default provider name. Used when registering a second
* Photon instance pointing at a self-hosted backend (`'photon-self'`)
* alongside the public komoot endpoint (`'photon'`). */
name?: 'photon' | 'photon-self';
/** Override the default privacy stance. Self-hosted Photon on our
* infrastructure is `'local'`; public komoot is `'public'`. */
privacy?: 'local' | 'public';
}
export class PhotonProvider implements GeocodingProvider {
readonly name = 'photon' as const;
readonly privacy = 'public' as const;
readonly name: 'photon' | 'photon-self';
readonly privacy: 'local' | 'public';
constructor(private readonly config: PhotonConfig) {}
constructor(private readonly config: PhotonConfig) {
this.name = config.name ?? 'photon';
this.privacy = config.privacy ?? 'public';
}
async search(req: SearchRequest, signal?: AbortSignal): Promise<ProviderResponse> {
const params = new URLSearchParams({
@ -64,7 +74,10 @@ export class PhotonProvider implements GeocodingProvider {
status: res.status,
};
}
return { ok: true, results: res.features.map(normalizePhotonFeature) };
return {
ok: true,
results: res.features.map((f) => normalizePhotonFeature(f, this.name)),
};
} catch (e) {
return { ok: false, kind: 'unreachable', error: errorMessage(e) };
}
@ -93,7 +106,10 @@ export class PhotonProvider implements GeocodingProvider {
status: res.status,
};
}
return { ok: true, results: res.features.map(normalizePhotonFeature) };
return {
ok: true,
results: res.features.map((f) => normalizePhotonFeature(f, this.name)),
};
} catch (e) {
return { ok: false, kind: 'unreachable', error: errorMessage(e) };
}
@ -161,7 +177,16 @@ interface PhotonFeature {
};
}
export function normalizePhotonFeature(f: PhotonFeature): GeocodingResult {
/**
* @param providerName Which provider tag to stamp on the result. Defaults
* to `'photon'` (public komoot) for backward compat. Pass `'photon-self'`
* to mark results as coming from our self-hosted instance, useful for
* the UI to know "this came from local infra, no privacy compromise".
*/
export function normalizePhotonFeature(
f: PhotonFeature,
providerName: 'photon' | 'photon-self' = 'photon'
): GeocodingResult {
const props = f.properties;
const [lon, lat] = f.geometry.coordinates;
@ -186,7 +211,7 @@ export function normalizePhotonFeature(f: PhotonFeature): GeocodingResult {
// but the consumer side keys off the absence of this field as a
// "result came from a fallback" signal.
confidence: typeof props.importance === 'number' ? props.importance : 0.5,
provider: 'photon',
provider: providerName,
};
}

@ -41,7 +41,21 @@ export interface GeocodingResult {
provider: ProviderName;
}
export type ProviderName = 'pelias' | 'photon' | 'nominatim';
/**
* Provider identifiers. Two of these wrap the same `PhotonProvider`
* class with different configs:
*
* - `photon-self`: self-hosted Photon (typically on mana-gpu),
* `privacy: 'local'`. Eligible for sensitive queries.
* - `photon`: public photon.komoot.io, `privacy: 'public'`. Last-resort
* fallback for non-sensitive queries when the self-hosted instance
* is down.
*
* The split exists because the chain identifies providers by name and
* tracks per-provider health. A single `photon` slot can't simultaneously
* mean two different backends.
*/
export type ProviderName = 'pelias' | 'photon-self' | 'photon' | 'nominatim';
export interface SearchRequest {
q: string;