managarten/docker
Till JS 169821de1a feat(monitoring): add LLM Grafana dashboard, Prometheus scraping, and alerts
Wire mana-llm service into the monitoring stack:

Prometheus (docker/prometheus/prometheus.yml):
- Add mana-llm scrape job (port 3025, 15s interval)
- Include mana-llm in ServiceDown alert expression
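
For illustration, the scrape job described above might look roughly like the fragment below. This is a sketch, not the committed file: only the job name, port 3025, and the 15s interval come from this message; the target hostname and the metrics path are assumptions.

```yaml
# Hypothetical fragment of docker/prometheus/prometheus.yml
scrape_configs:
  - job_name: mana-llm
    scrape_interval: 15s             # interval stated in the commit message
    metrics_path: /metrics           # Prometheus default; assumed, not confirmed
    static_configs:
      - targets: ['mana-llm:3025']   # hostname assumed; port from the commit message
```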

Alerts (docker/prometheus/alerts.yml):
- New llm_alerts group with 4 rules:
  - LLMServiceDown: mana-llm down > 1 min (critical)
  - LLMHighErrorRate: > 10% errors for 5 min (warning)
  - OllamaProviderDown: > 50% of requests served via Google fallback (warning)
  - LLMSlowResponses: p95 > 30s for 5 min (warning)
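
A rough sketch of how the four rules listed above could be expressed follows. The group name, alert names, thresholds, durations, and severities are taken from this message; the metric names (llm_requests_total, llm_request_duration_seconds_bucket) and the label names (status, provider) are hypothetical placeholders, since the actual metrics exported by mana-llm are not shown here.

```yaml
# Hypothetical fragment of docker/prometheus/alerts.yml
groups:
  - name: llm_alerts
    rules:
      - alert: LLMServiceDown
        expr: up{job="mana-llm"} == 0
        for: 1m
        labels:
          severity: critical

      - alert: LLMHighErrorRate
        # Metric and label names are assumptions.
        expr: |
          sum(rate(llm_requests_total{status="error"}[5m]))
            / sum(rate(llm_requests_total[5m])) > 0.10
        for: 5m
        labels:
          severity: warning

      - alert: OllamaProviderDown
        # Share of requests served by the Google fallback; no duration is
        # stated in the commit message, so "for" is left unset here.
        expr: |
          sum(rate(llm_requests_total{provider="google"}[5m]))
            / sum(rate(llm_requests_total[5m])) > 0.50
        labels:
          severity: warning

      - alert: LLMSlowResponses
        # p95 latency computed from an assumed histogram metric.
        expr: |
          histogram_quantile(
            0.95,
            sum by (le) (rate(llm_request_duration_seconds_bucket[5m]))
          ) > 30
        for: 5m
        labels:
          severity: warning
```

Inferring a provider outage from the fallback ratio, rather than probing Ollama directly, means the alert only fires while traffic is flowing; that trade-off is implied by the rule description, not confirmed by it.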

Grafana Dashboard (docker/grafana/dashboards/mana-llm.json):
- 6 stat panels: status, req/min, error rate, fallback rate, latency, tokens/min
- Requests by Provider (stacked area: Ollama vs Google vs OpenRouter)
- Tokens by Type (prompt vs completion)
- Latency Percentiles (p50, p90, p99)
- Latency by Provider comparison
- Requests by Model breakdown
- Errors by Type
- Google Fallback Rate over time (with threshold coloring)
- Provider Distribution pie chart (24h)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:16:27 +01:00
alert-notifier feat(monitoring): add alerting stack and maintenance scripts 2026-02-12 13:46:57 +01:00
alertmanager feat(monitoring): add alerting stack and maintenance scripts 2026-02-12 13:46:57 +01:00
grafana feat(monitoring): add LLM Grafana dashboard, Prometheus scraping, and alerts 2026-03-24 11:16:27 +01:00
init-db feat(citycorners): add city guide app for Konstanz with full monorepo integration 2026-03-23 10:56:26 +01:00
matrix 🔒 refactor(bots): remove !login command and enforce OIDC-only auth 2026-02-14 11:26:58 +01:00
nginx first implementation 2025-11-27 17:26:18 +01:00
prometheus feat(monitoring): add LLM Grafana dashboard, Prometheus scraping, and alerts 2026-03-24 11:16:27 +01:00
shared 🐛 fix(docker): add missing build-shared-packages.sh script for Docker builds 2025-12-25 20:51:15 +01:00
templates feat(games): add whopixels hosting at whopxl.mana.how 2026-03-20 19:57:50 +01:00
Dockerfile.nestjs-base feat(docker): add shared NestJS builder base image 2026-03-21 10:48:31 +01:00