managarten/docker
Till JS 0bf01f434e feat(mana-ai): Prometheus /metrics endpoint + status.mana.how integration
Wires mana-ai into the existing observability stack so tick throughput,
plan-failure rates, planner latencies, and snapshot refresh health are
visible in Grafana + Prometheus, and the service's uptime surfaces on
status.mana.how under the "Internal" section.

- `src/metrics.ts` — prom-client Registry with `mana_ai_` prefix.
  Counters: ticks_total, plans_produced_total, plans_written_back_total,
  parse_failures_total, mission_errors_total, snapshots_new/updated,
  snapshot_rows_applied_total, http_requests_total.
  Histograms: tick_duration_seconds (0.1–120s),
  planner_request_duration_seconds (0.25–60s),
  http_request_duration_seconds (0.005–10s).
- `src/index.ts` — HTTP middleware labels every request by
  method/path/status; `/metrics` serves the Prometheus text format.
- `src/cron/tick.ts` — increments counters + wraps the tick with
  `tickDuration.startTimer()`. Snapshot refresh stats are passed
  through to the snapshot counters.
- `src/planner/client.ts` — wraps `complete()` in a latency histogram
  timer so planner tail latency shows up separately from tick duration.
- `docker/prometheus/prometheus.yml` —
  1. New `mana-ai` scrape job against `mana-ai:3066/metrics` (30s).
  2. `/health` added to the `blackbox-internal` job so uptime shows on
     status.mana.how alongside mana-geocoding.
- `scripts/generate-status-page.sh` — friendly label for the new probe:
  `mana-ai:3066/health` → "Mana AI Runner" (generator already iterates
  `blackbox-internal`, no other changes needed).
- `package.json` — prom-client ^15.1.3
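
The timer wrapping described for `src/cron/tick.ts` and
`src/planner/client.ts` can be sketched without any dependency. This is
not the code in this commit and not the prom-client API: the class
`SketchHistogram`, the helper `timed`, and the bucket boundaries are
hypothetical stand-ins, chosen only to show how a `startTimer()` call
returns a function that records the elapsed seconds into cumulative
buckets.

```typescript
// Dependency-free stand-in for a latency histogram. This is NOT prom-client;
// SketchHistogram, timed, and the bucket list below are hypothetical.
class SketchHistogram {
  name: string;
  buckets: number[];
  counts: number[]; // counts[i] = observations <= buckets[i] (cumulative)
  sum = 0;
  total = 0;

  constructor(name: string, buckets: number[]) {
    this.name = name;
    this.buckets = buckets;
    this.counts = buckets.map(() => 0);
  }

  // Record one observation in every bucket whose upper bound covers it.
  observe(seconds: number): void {
    this.buckets.forEach((upper, i) => {
      if (seconds <= upper) this.counts[i] = (this.counts[i] ?? 0) + 1;
    });
    this.sum += seconds;
    this.total += 1;
  }

  // Start timing; the returned function stops the timer, records the
  // elapsed seconds, and returns them. `now` yields milliseconds.
  startTimer(now: () => number = () => Date.now()): () => number {
    const startedMs = now();
    return () => {
      const seconds = (now() - startedMs) / 1000;
      this.observe(seconds);
      return seconds;
    };
  }
}

// Assumed bucket boundaries inside the 0.1–120s range listed above.
const tickDuration = new SketchHistogram(
  "mana_ai_tick_duration_seconds",
  [0.1, 1, 10, 120],
);

// Wrap an async unit of work so the timer stops on success and on failure.
async function timed<T>(h: SketchHistogram, work: () => Promise<T>): Promise<T> {
  const end = h.startTimer();
  try {
    return await work();
  } finally {
    end();
  }
}
```

A caller would write something like `await timed(tickDuration, runTick)`,
where `runTick` is a placeholder for the real tick function. Stopping the
timer in `finally` means failed ticks are measured too. `Date.now()` is
not a monotonic clock, so a real implementation would likely use a
monotonic source instead.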

All 17 Bun tests still pass; tsc clean.
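
For reference, the per-request labelling by method/path/status and the
text format served on `/metrics` can be illustrated with a small
dependency-free sketch. It is an assumption-laden illustration, not the
middleware in `src/index.ts`: `SketchCounter`, `escapeLabelValue`, and
the HELP string are hypothetical, and prom-client produces this output
itself in the real service.

```typescript
type Labels = { [name: string]: string };

// Escape a label value for the Prometheus text format:
// backslash, double quote, and newline must be escaped.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Hypothetical labelled counter; one series per distinct label set.
class SketchCounter {
  name: string;
  help: string;
  private series: { [key: string]: { labels: Labels; value: number } } = {};

  constructor(name: string, help: string) {
    this.name = name;
    this.help = help;
  }

  // Increment the series identified by this label set (order-insensitive).
  inc(labels: Labels = {}, by: number = 1): void {
    const key = Object.keys(labels)
      .sort()
      .map((k) => k + "=" + labels[k])
      .join("\u0000");
    const entry = this.series[key] || { labels: labels, value: 0 };
    entry.value += by;
    this.series[key] = entry;
  }

  // Render HELP/TYPE lines followed by one sample line per series.
  render(): string {
    const lines: string[] = [
      "# HELP " + this.name + " " + this.help,
      "# TYPE " + this.name + " counter",
    ];
    Object.keys(this.series).forEach((key) => {
      const entry = this.series[key];
      if (!entry) return;
      const pairs = Object.keys(entry.labels)
        .sort()
        .map((k) => k + '="' + escapeLabelValue(entry.labels[k] ?? "") + '"')
        .join(",");
      lines.push(this.name + (pairs ? "{" + pairs + "}" : "") + " " + entry.value);
    });
    return lines.join("\n") + "\n";
  }
}

// Metric name follows the `mana_ai_` prefix; the HELP text is made up.
const httpRequests = new SketchCounter(
  "mana_ai_http_requests_total",
  "Total HTTP requests",
);
httpRequests.inc({ method: "GET", path: "/health", status: "200" });
```

Labelling by raw path is only safe when the set of paths is small and
fixed, since every distinct label combination becomes its own time
series in Prometheus.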

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:41:40 +02:00
| Name | Last commit | Date |
| --- | --- | --- |
| alert-notifier | feat: rename ManaCore to Mana across entire codebase | 2026-04-05 20:00:13 +02:00 |
| alertmanager | feat: rename ManaCore to Mana across entire codebase | 2026-04-05 20:00:13 +02:00 |
| blackbox | feat(monitoring): add uptime monitoring via Blackbox Exporter | 2026-03-31 17:43:25 +02:00 |
| grafana | refactor: rename zitare -> quotes (Zitate) | 2026-04-14 20:59:16 +02:00 |
| init-db | feat(mail): add mana-mail service and frontend module (Phase 1 MVP) | 2026-04-13 20:35:54 +02:00 |
| loki | feat(gpu-server): complete GPU server setup with AI services, monitoring, and public access | 2026-03-27 21:35:30 +01:00 |
| nginx | refactor: rename zitare -> quotes (Zitate) | 2026-04-14 20:59:16 +02:00 |
| postgres | fix(infra): use postgres -c flags instead of config_file override | 2026-03-24 11:42:42 +01:00 |
| prometheus | feat(mana-ai): Prometheus /metrics endpoint + status.mana.how integration | 2026-04-15 01:41:40 +02:00 |
| promtail | fix(mana-auth) + chore: rewrite /api/v1/auth/login JWT mint, remove Matrix stack | 2026-04-08 16:32:13 +02:00 |
| shared | 🐛 fix(docker): add missing build-shared-packages.sh script for Docker builds | 2025-12-25 20:51:15 +01:00 |
| templates | chore: remove all NestJS backend references, replace with Hono/Bun | 2026-03-31 16:52:25 +02:00 |
| Dockerfile.hono-server | feat(infra): add docker-compose for new Hono services + DB init | 2026-03-28 17:54:24 +01:00 |
| Dockerfile.sveltekit-base | fix(docker): drop packages/shared-config (deleted) from sveltekit-base | 2026-04-09 12:43:17 +02:00 |