mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-14 20:21:09 +02:00
docs(geocoding): post-migration log + Photon weekly-refresh operator scripts
- Decision report: status flipped to MIGRATED; added migration log with five WSL2 gotchas (bzip2 missing, no official Photon image, firewall=true blocks cross-LAN, vmIdleTimeout=-1 ineffective, PowerShell pre-expansion of bash $(...)) and resource snapshot.
- mana-geocoding CLAUDE.md: PHOTON_SELF_API_URL note now reflects live primary status on mana-gpu since 2026-04-28.
- photon-self/: operator scripts for the weekly DB refresh: update.sh (atomic-swap with rollback), systemd unit + timer (Sun 03:30 +30min jitter, Persistent=true), README with re-installation instructions for DR. Currently installed and enabled on mana-gpu.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
7ebbf064ce
commit
fc49198992
6 changed files with 254 additions and 5 deletions
|
|
@ -1,6 +1,6 @@
|
||||||
# Geocoding Self-Hosting — Decision Report
|
# Geocoding Self-Hosting — Decision Report
|
||||||
|
|
||||||
**Status:** Recommendation — pending migration
|
**Status:** ✅ MIGRATED — Photon-on-mana-gpu live since 2026-04-28 19:27 CEST
|
||||||
**Date:** 2026-04-28
|
**Date:** 2026-04-28
|
||||||
**Context:** Pelias was retired from the Mac mini on 2026-04-28 (3 GB RAM was crushing the host into 8.6 GB swap). The wrapper now serves all queries through public Photon + Nominatim, with sensitive-query blocking + coord quantization as privacy mitigations. We need a self-hosted geocoder back in the chain so sensitive queries (`Hausarzt`, `Klinikum`, …) don't return zero results when the user actually wants them, and so we don't depend on a third party for routine address lookups.
|
**Context:** Pelias was retired from the Mac mini on 2026-04-28 (3 GB RAM was crushing the host into 8.6 GB swap). The wrapper now serves all queries through public Photon + Nominatim, with sensitive-query blocking + coord quantization as privacy mitigations. We need a self-hosted geocoder back in the chain so sensitive queries (`Hausarzt`, `Klinikum`, …) don't return zero results when the user actually wants them, and so we don't depend on a third party for routine address lookups.
|
||||||
|
|
||||||
|
|
@ -217,3 +217,91 @@ Tests: extend `chain.test.ts` to verify the order pelias-class → photon-class
|
||||||
- [mediagis/nominatim-docker discussion #265](https://github.com/mediagis/nominatim-docker/discussions/265) — Germany-import resource reports (12 GB RAM, ~100 GB disk, 8–12 h)
|
- [mediagis/nominatim-docker discussion #265](https://github.com/mediagis/nominatim-docker/discussions/265) — Germany-import resource reports (12 GB RAM, ~100 GB disk, 8–12 h)
|
||||||
- [Photon OpenSearch wiki page](https://wiki.openstreetmap.org/wiki/Photon) — region scoping, memory tuning
|
- [Photon OpenSearch wiki page](https://wiki.openstreetmap.org/wiki/Photon) — region scoping, memory tuning
|
||||||
- Internal: [`services/mana-geocoding/CLAUDE.md`](../../services/mana-geocoding/CLAUDE.md) for the current Pelias setup we're replacing
|
- Internal: [`services/mana-geocoding/CLAUDE.md`](../../services/mana-geocoding/CLAUDE.md) for the current Pelias setup we're replacing
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Migration log + lessons learned (2026-04-28)
|
||||||
|
|
||||||
|
The migration ran from 17:42 to 19:27 CEST — about 1 h 45 min, almost
|
||||||
|
all of which was unattended download/unpack waiting time (29 GB tarball
|
||||||
|
plus 80 GB unpacked). It went smoother than the runbook estimated, except for
|
||||||
|
five WSL2-specific gotchas:
|
||||||
|
|
||||||
|
### What worked first try
|
||||||
|
|
||||||
|
- **WSL2 install via SSH:** `winget install Microsoft.WSL` followed by
|
||||||
|
`wsl --install Ubuntu-24.04 --no-launch` — fully unattended, no
|
||||||
|
interactive prompts, including the previously-painful first-run user
|
||||||
|
setup (the `--no-launch` flag combined with `--user root` for
|
||||||
|
follow-up commands skipped the wizard entirely).
|
||||||
|
- **Docker Engine in WSL2 (instead of Docker Desktop):** apt install
|
||||||
|
`docker-ce` from the official repo, then run as systemd service.
|
||||||
|
Headless, no GUI session needed — much cleaner for SSH-driven
|
||||||
|
setup than Docker Desktop.
|
||||||
|
- **WSL2 Mirrored Networking** (Win11 22H2+): the Linux distro shares
|
||||||
|
the Windows host's LAN IP. Photon listens on
|
||||||
|
`192.168.178.11:2322` directly — no `netsh interface portproxy`
|
||||||
|
forwarding. Just one Windows Defender Firewall rule and the Mac
|
||||||
|
mini reaches it.
|
||||||
|
- **Photon Europe pre-built tarball** (29 GB compressed → ~80 GB
|
||||||
|
unpacked) downloaded at ~9 MB/s sustained, unpacked at ~80 MB/s.
|
||||||
|
No PBF import, no Elasticsearch tuning, no patch hacks.
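
As a rough cross-check of those throughput figures, the expected wall-clock time can be estimated with integer shell arithmetic. This is a back-of-the-envelope sketch: the helper name `transfer_minutes` is made up for illustration, it treats 1 GB as 1024 MB, and it assumes the ~80 MB/s unpack rate refers to the unpacked output size.

```shell
# Estimate transfer time in whole minutes for SIZE_GB at RATE_MBPS (MB/s).
# Hypothetical helper, not part of the repo; assumes 1 GB = 1024 MB.
transfer_minutes() {
  local size_gb=$1 rate_mbps=$2
  echo $(( size_gb * 1024 / rate_mbps / 60 ))
}

download_min=$(transfer_minutes 29 9)   # 29 GB tarball at ~9 MB/s
unpack_min=$(transfer_minutes 80 80)    # 80 GB unpacked at ~80 MB/s
echo "download ~${download_min} min, unpack ~${unpack_min} min"
```

Under these assumptions the two phases add up to a bit over an hour, which fits the observation that most of the 1 h 45 min was unattended waiting.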
|
||||||
|
|
||||||
|
### Five gotchas worth documenting
|
||||||
|
|
||||||
|
1. **`bzip2` is not installed by default in Ubuntu 24.04 minimal.**
|
||||||
|
`tar -xjf` fails with `bzip2: Cannot exec`. Fix: `apt install bzip2`
|
||||||
|
before unpacking. Took ~15 minutes to spot because the script's
|
||||||
|
`set -e` ended the run quietly right after the failure.
|
||||||
|
|
||||||
|
2. **No official Photon Docker image.** Komoot publishes a JAR but
|
||||||
|
no `komoot/photon` on Docker Hub. Solution: run the JAR inside
|
||||||
|
`eclipse-temurin:21-jre` with the data dir + JAR mounted in.
|
||||||
|
Cleaner than community images (which lag the upstream version).
|
||||||
|
|
||||||
|
3. **`firewall=true` in `.wslconfig` blocks cross-LAN inbound.**
|
||||||
|
The first nginx-on-:2322 cross-LAN test worked. After enabling
|
||||||
|
`firewall=true` (intended to harden Hyper-V firewall), Photon
|
||||||
|
became unreachable from the Mac mini even though the Windows
|
||||||
|
Defender rule allowed it. Removing the line fixed it instantly.
|
||||||
|
The Hyper-V firewall layer in WSL2 is a separate, stricter pass
|
||||||
|
that the Windows-side rule doesn't cover.
|
||||||
|
|
||||||
|
4. **`vmIdleTimeout=-1` does NOT prevent WSL2 idle-shutdown** on
|
||||||
|
Win11 26200. The VM still shuts down ~60 s after the last SSH
|
||||||
|
session closes, killing the Photon container. Workaround that
|
||||||
|
actually works: a Windows Task Scheduler task at boot that runs
|
||||||
|
`wsl -d Ubuntu-24.04 --user root -- /bin/sleep infinity`. Holds
|
||||||
|
the VM open permanently. Survives reboots.
|
||||||
|
|
||||||
|
5. **PowerShell quoting + bash inside `wsl ... -- bash -c "..."`.**
|
||||||
|
`$(dpkg --print-architecture)` and `$(lsb_release -cs)` got
|
||||||
|
pre-expanded by PowerShell on the Windows side, breaking the
|
||||||
|
Docker apt sources line. Fix: write the install script to a file,
|
||||||
|
transfer via scp, run via `wsl ... bash /mnt/c/temp/script.sh`.
|
||||||
|
No quoting layers to fight.
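
The file-based workaround from gotcha 5 works because nothing on the Windows side ever parses the script body. A minimal sketch of the same idea on the bash side: a heredoc with a quoted delimiter keeps `$(...)` literal, so the substitution only runs when the generated script executes. The temp file and the `echo` line are placeholders, and `uname -m` merely stands in for `dpkg --print-architecture`; this is not the actual Docker install script.

```shell
# Write a script whose command substitutions must NOT be expanded yet.
# Quoting the delimiter ('EOF') disables expansion inside the heredoc.
script=$(mktemp)
cat >"$script" <<'EOF'
#!/bin/bash
# Placeholder body: the real install script would build the apt sources line.
echo "arch=$(uname -m)"
EOF

# The file still contains the literal $(...) text:
grep -F '$(uname -m)' "$script"
# Expansion happens only now, when bash runs the file:
result=$(bash "$script")
echo "$result"
```

With an unquoted delimiter (`<<EOF`) the substitution would instead run while the file is being written, which is the bash-side analogue of the PowerShell pre-expansion described above.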
|
||||||
|
|
||||||
|
### Resource snapshot post-migration
|
||||||
|
|
||||||
|
- **mana-gpu:** Photon container 391 MB / 31 GB (1.2 %) memory at
|
||||||
|
steady state, 290 % CPU during initial OpenSearch shard recovery,
|
||||||
|
near-zero CPU at idle. Disk: 80 GB unpacked photon_data + 29 GB
|
||||||
|
tarball still on disk (kept for debugging — can be removed).
|
||||||
|
- **mana-server:** mana-geocoding container unchanged in resource
|
||||||
|
use; chain just routes to a different upstream. Cross-LAN
|
||||||
|
per-request latency added: ~5–15 ms.
|
||||||
|
|
||||||
|
### Cutover verification
|
||||||
|
|
||||||
|
- `provider: "photon-self"` confirmed on both `/search` and `/reverse`
|
||||||
|
endpoints from inside mana-geocoding container and externally via
|
||||||
|
`https://mana.how/api/v1/geocode/...`.
|
||||||
|
- Sensitive query "Hausarzt Konstanz" now returns real results
|
||||||
|
(`Hausarztpraxis am Tannenhof, Am Tannenhof 2, 78464 Konstanz`)
|
||||||
|
instead of the previous `notice: 'sensitive_local_unavailable'`
|
||||||
|
empty response. Privacy stance maintained: the query never leaves
|
||||||
|
our infra.
|
||||||
|
- Public Photon + public Nominatim stay registered as last-resort
|
||||||
|
`privacy: 'public'` fallbacks. Health-snapshot shows them as
|
||||||
|
`healthy: false, ageMs: null` — they're never probed because
|
||||||
|
`photon-self` is healthy.
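
The provider check above can be scripted so it is repeatable after each weekly refresh. The following is only a sketch: `has_provider` is a hypothetical helper, and the flat `"provider"` field in the sample is inferred from the bullet above rather than taken from a captured response.

```shell
# Succeeds if the JSON on stdin contains "provider": "<expected>".
# Hypothetical helper; the response shape is an assumption.
has_provider() {
  local expected=$1
  grep -q "\"provider\"[[:space:]]*:[[:space:]]*\"${expected}\""
}

# Sample payload shaped like the field quoted above (not real output).
sample='{"provider": "photon-self"}'
if printf '%s' "$sample" | has_provider photon-self; then
  echo "served by photon-self"
fi
```

In practice the sample would be replaced by the output of a `curl -fsS` call against the search or reverse endpoint.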
|
||||||
|
|
|
||||||
|
|
@ -159,10 +159,10 @@ GEOCODING_PROVIDERS=photon-self,pelias,photon,nominatim
|
||||||
PROVIDER_TIMEOUT_MS=8000 # per-provider request timeout (cold-start safe)
|
PROVIDER_TIMEOUT_MS=8000 # per-provider request timeout (cold-start safe)
|
||||||
PROVIDER_HEALTH_CACHE_MS=30000 # health-cache TTL — skip dead providers
|
PROVIDER_HEALTH_CACHE_MS=30000 # health-cache TTL — skip dead providers
|
||||||
|
|
||||||
# --- Self-hosted Photon (privacy: 'local', primary post-migration) ----
|
# --- Self-hosted Photon (privacy: 'local', PRIMARY since 2026-04-28) --
|
||||||
# Set this to point at the GPU-server-hosted Photon. When unset, the
|
# Live on mana-gpu (Windows 11, WSL2-Ubuntu, Docker, Photon Europe-wide
|
||||||
# `photon-self` slot is not registered and the chain falls back to
|
# Java JAR + OpenSearch). Cross-LAN reach via WSL2 mirrored networking.
|
||||||
# public providers as before.
|
# Set in .env.macmini; flow into the container via docker-compose env.
|
||||||
PHOTON_SELF_API_URL=http://192.168.178.11:2322
|
PHOTON_SELF_API_URL=http://192.168.178.11:2322
|
||||||
|
|
||||||
# --- Pelias (legacy, currently stopped — privacy: 'local') ------------
|
# --- Pelias (legacy, currently stopped — privacy: 'local') ------------
|
||||||
|
|
|
||||||
78
services/mana-geocoding/photon-self/README.md
Normal file
78
services/mana-geocoding/photon-self/README.md
Normal file
|
|
@ -0,0 +1,78 @@
|
||||||
|
# photon-self — operator files for the GPU-server Photon
|
||||||
|
|
||||||
|
Source-of-truth copies of the scripts and systemd units that run on
|
||||||
|
`mana-gpu` to host the self-hosted Photon. Versioned here so the setup
|
||||||
|
can be rebuilt in DR scenarios without recreating from memory.
|
||||||
|
|
||||||
|
## What lives where
|
||||||
|
|
||||||
|
| File | Where it runs | Purpose |
|
||||||
|
|---|---|---|
|
||||||
|
| `photon-update.sh` | inside the WSL2 Ubuntu distro on mana-gpu, at `/usr/local/bin/photon-update.sh` | Weekly index refresh — download new tarball, atomic swap, restart container, rollback on failure |
|
||||||
|
| `photon-update.service` | `/etc/systemd/system/photon-update.service` | Oneshot wrapper that invokes the script |
|
||||||
|
| `photon-update.timer` | `/etc/systemd/system/photon-update.timer` | Sunday 03:30 + 30-min jitter, persistent across reboots |
|
||||||
|
|
||||||
|
## Re-installation (after a clean Windows reinstall etc.)
|
||||||
|
|
||||||
|
After you've followed [docs/runbooks/photon-on-mana-gpu.md](../../../docs/runbooks/photon-on-mana-gpu.md)
|
||||||
|
to get WSL2 + Docker + the initial Photon container running:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Run inside WSL2 Ubuntu as root:
|
||||||
|
cp /mnt/c/path/to/repo/services/mana-geocoding/photon-self/photon-update.sh \
|
||||||
|
/usr/local/bin/photon-update.sh
|
||||||
|
chmod +x /usr/local/bin/photon-update.sh
|
||||||
|
|
||||||
|
cp /mnt/c/path/to/repo/services/mana-geocoding/photon-self/photon-update.service \
|
||||||
|
/etc/systemd/system/
|
||||||
|
cp /mnt/c/path/to/repo/services/mana-geocoding/photon-self/photon-update.timer \
|
||||||
|
/etc/systemd/system/
|
||||||
|
|
||||||
|
systemctl daemon-reload
|
||||||
|
systemctl enable --now photon-update.timer
|
||||||
|
systemctl list-timers photon-update.timer # verify next run
|
||||||
|
```
|
||||||
|
|
||||||
|
## Manual trigger
|
||||||
|
|
||||||
|
To force a refresh outside the schedule:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Inside WSL2 Ubuntu as root
|
||||||
|
systemctl start photon-update.service
|
||||||
|
journalctl -u photon-update.service -f # watch progress
|
||||||
|
tail -f /var/log/photon-update.log # script-level detail
|
||||||
|
```
|
||||||
|
|
||||||
|
## What the update script does
|
||||||
|
|
||||||
|
```
|
||||||
|
1. curl new tarball → /opt/photon-data/photon-db.tar.bz2.new
|
||||||
|
2. Verify size ≥ 25 GB (sanity guard against truncated downloads)
|
||||||
|
3. tar -xjf into /opt/photon-data/photon_data.new
|
||||||
|
4. docker stop photon
|
||||||
|
5. mv old → .old, mv new → live (atomic-ish — both renames in same FS)
|
||||||
|
6. docker start photon
|
||||||
|
7. Poll /api?q=Konstanz for up to 180 s
|
||||||
|
- On success: rm -rf .old (cleanup)
|
||||||
|
- On failure: rollback (mv live → .bad, mv .old → live, restart)
|
||||||
|
```
|
||||||
|
|
||||||
|
The rollback path is the load-bearing part — a corrupted GraphHopper
|
||||||
|
dump or a Photon version-mismatch can otherwise leave the service in a
|
||||||
|
non-serving state until the operator notices.
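
The swap-and-rollback logic can be exercised without Docker or a 29 GB download. The sketch below is not the installed script: `swap_with_rollback` and `check_marker` are hypothetical names, the container stop/start is left out, and the "health check" only looks for a marker file.

```shell
# Swap NEW into LIVE, run CHECK, and restore the previous LIVE if CHECK fails.
# Hypothetical helper; container stop/start is omitted so it runs anywhere.
swap_with_rollback() {
  local live=$1 new=$2 check=$3
  rm -rf "$live.old" "$live.bad"   # stale dirs would make mv nest instead of rename
  mv "$live" "$live.old"
  mv "$new" "$live"
  if "$check" "$live"; then
    rm -rf "$live.old"             # success: drop the previous version
    return 0
  fi
  mv "$live" "$live.bad"           # failure: keep the bad index for inspection
  mv "$live.old" "$live"
  return 1
}

# Demo with throwaway directories.
work=$(mktemp -d)
mkdir "$work/data" "$work/data.new"
echo v1 >"$work/data/version"
echo v2 >"$work/data.new/version"  # no "ok" marker, so the check fails
check_marker() { [ -f "$1/ok" ]; }

swap_with_rollback "$work/data" "$work/data.new" check_marker || echo "rolled back"
cat "$work/data/version"
```

Because the new index lacks the marker, the demo takes the failure path: the live directory ends up holding the old version again and the rejected one is kept as `.bad`.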
|
||||||
|
|
||||||
|
## Why systemd timer instead of cron
|
||||||
|
|
||||||
|
WSL has supported systemd since the 0.67.6 release, and current Ubuntu images enable it by default.
|
||||||
|
Timers give us:
|
||||||
|
|
||||||
|
- `Persistent=true` — runs missed jobs at next boot if the GPU server
|
||||||
|
was off Sunday morning. Cron just skips them.
|
||||||
|
- `RandomizedDelaySec=30min`: starts the job at a random offset of up to
|
||||||
|
30 min, so it doesn't hit the GraphHopper download server at the same second as other weekly jobs.
|
||||||
|
- `journalctl -u photon-update` — structured logs in one place.
|
||||||
|
- Status-checkable with `systemctl list-timers`.
|
||||||
|
|
||||||
|
The downside (more files on disk than a single crontab entry) is
|
||||||
|
negligible.
|
||||||
10
services/mana-geocoding/photon-self/photon-update.service
Normal file
10
services/mana-geocoding/photon-self/photon-update.service
Normal file
|
|
@ -0,0 +1,10 @@
|
||||||
|
[Unit]
|
||||||
|
Description=Weekly Photon DB refresh from GraphHopper
|
||||||
|
After=docker.service network-online.target
|
||||||
|
Wants=network-online.target
|
||||||
|
|
||||||
|
[Service]
|
||||||
|
Type=oneshot
|
||||||
|
ExecStart=/usr/local/bin/photon-update.sh
|
||||||
|
# Exit status 0 already counts as success, so the line below is a no-op; a failed run marks the unit failed and next week's timer tries again
|
||||||
|
SuccessExitStatus=0
|
||||||
60
services/mana-geocoding/photon-self/photon-update.sh
Executable file
60
services/mana-geocoding/photon-self/photon-update.sh
Executable file
|
|
@ -0,0 +1,60 @@
|
||||||
|
#!/bin/bash
|
||||||
|
# Weekly Photon DB update.
|
||||||
|
# - Downloads the latest tarball from GraphHopper
|
||||||
|
# - Verifies size before swapping (don't replace good data with a partial)
|
||||||
|
# - Atomic swap via mv → restart container
|
||||||
|
# - Keeps one previous version as .old in case of bad index
|
||||||
|
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
DATA_DIR=/opt/photon-data
|
||||||
|
URL=https://download1.graphhopper.com/public/europe/photon-db-europe-1.0-latest.tar.bz2
|
||||||
|
MIN_SIZE=$((25 * 1024 * 1024 * 1024)) # 25 GB minimum, real one is ~29 GB
|
||||||
|
LOG=/var/log/photon-update.log
|
||||||
|
|
||||||
|
exec >>"$LOG" 2>&1
|
||||||
|
echo "=== $(date -Iseconds) — photon-update starting ==="
|
||||||
|
|
||||||
|
cd "$DATA_DIR"
|
||||||
|
|
||||||
|
# Download to .new — don't touch the live tarball
|
||||||
|
curl --fail --location --silent --show-error --output photon-db.tar.bz2.new "$URL"
|
||||||
|
ACTUAL=$(stat -c %s photon-db.tar.bz2.new)
|
||||||
|
if [ "$ACTUAL" -lt "$MIN_SIZE" ]; then
|
||||||
|
echo "FAIL: downloaded tarball only $((ACTUAL / 1024 / 1024)) MB, expected ≥25 GB. Aborting."
|
||||||
|
rm -f photon-db.tar.bz2.new
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
echo "Downloaded $((ACTUAL / 1024 / 1024)) MB OK"
|
||||||
|
|
||||||
|
# Unpack to a fresh dir alongside the live one
|
||||||
|
rm -rf photon_data.new
|
||||||
|
mkdir photon_data.new
|
||||||
|
tar -xjf photon-db.tar.bz2.new -C photon_data.new --strip-components=1
|
||||||
|
|
||||||
|
# Stop the container, swap, restart
|
||||||
|
docker stop photon
|
||||||
|
rm -rf photon_data.old  # a leftover .old dir would make mv nest the live dir inside it
|
||||||
|
mv photon_data photon_data.old || true
|
||||||
|
mv photon_data.new photon_data
|
||||||
|
mv photon-db.tar.bz2 photon-db.tar.bz2.old || true
|
||||||
|
mv photon-db.tar.bz2.new photon-db.tar.bz2
|
||||||
|
docker start photon
|
||||||
|
|
||||||
|
# Wait for it to come up + smoke
|
||||||
|
for i in $(seq 1 90); do
|
||||||
|
if curl -fsS --max-time 2 "http://localhost:2322/api?q=Konstanz" >/dev/null 2>&1; then
|
||||||
|
echo "OK — Photon ready after $i attempts (2 s apart) with new index"
|
||||||
|
# Cleanup old version on success
|
||||||
|
rm -rf photon_data.old photon-db.tar.bz2.old
|
||||||
|
echo "=== $(date -Iseconds) — photon-update complete ==="
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
sleep 2
|
||||||
|
done
|
||||||
|
|
||||||
|
echo "FAIL — Photon did not become ready within 180 s after swap. Rolling back."
|
||||||
|
docker stop photon
|
||||||
|
rm -rf photon_data.bad && mv photon_data photon_data.bad
|
||||||
|
mv photon_data.old photon_data
|
||||||
|
docker start photon
|
||||||
|
exit 1
|
||||||
13
services/mana-geocoding/photon-self/photon-update.timer
Normal file
13
services/mana-geocoding/photon-self/photon-update.timer
Normal file
|
|
@ -0,0 +1,13 @@
|
||||||
|
[Unit]
|
||||||
|
Description=Trigger weekly Photon DB refresh
|
||||||
|
|
||||||
|
[Timer]
|
||||||
|
# Sunday 03:30 — outside likely-usage hours
|
||||||
|
OnCalendar=Sun *-*-* 03:30:00
|
||||||
|
# Run on next boot if the system was off at the scheduled time
|
||||||
|
Persistent=true
|
||||||
|
# Spread across 30 min so we don't hammer GraphHopper at exactly :30:00
|
||||||
|
RandomizedDelaySec=30min
|
||||||
|
|
||||||
|
[Install]
|
||||||
|
WantedBy=timers.target
|
||||||