managarten

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-17 08:19:40 +02:00

Author	SHA1	Message	Date
Till JS	4e370911e8	feat(monitoring): disk metrics via Pushgateway, Loki in Master Overview, Colima move script - check-disk-space.sh now pushes mac_disk_used_percent + mac_colima_disk_used_gb to Pushgateway every hour so vmalert can alert on real macOS disk usage - alerts.yml: replace broken node-exporter disk alerts with Pushgateway-based ones - master-overview.json: add "Recent Errors (Loki)" section with live error log stream, error rate timeseries and top error sources barchart - move-colima-to-external-ssd.sh: guided script to move 200GB Colima VM datadisk from internal SSD to /Volumes/ManaData (3.6TB external SSD) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-30 20:03:33 +02:00
Till JS	143112f77a	feat(observability): add mana-search, mana-media, and Synapse to monitoring - Add Prometheus scraping for mana-search (port 3020, already has metrics) - Add Prometheus scraping for mana-media (port 3015, MetricsModule added) - Add Prometheus scraping for Matrix Synapse (port 9002, already enabled) - Add MetricsModule to mana-media with media_ prefix - Update Dockerfile for mana-media to include shared-nestjs-metrics - Replace hardcoded ServiceDown alert list with dynamic regex (.*-backend\|mana-core-auth\|mana-search\|mana-media\|synapse) - Replace hardcoded backends.json query with dynamic regex - Add Search, Media, Synapse to master-overview and system-overview dashboards Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 10:46:59 +01:00
Till JS	6fa6509fa5	feat(observability): add metrics and monitoring for all 15 backends - Add MetricsModule to 8 backends missing it (photos, zitare, mukke, planta, picture, storage, presi, nutriphi) - Enable Prometheus scraping for all 15 backends in prometheus.yml (was only 6, with 3 commented out and 6 missing entirely) - Update ServiceDown alert rule to cover all 15 backends - Update Grafana dashboards (backends, master-overview, system-overview) with all backend services in health panels - Fix imprecise regex in application-details dashboard Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 09:09:04 +01:00
Till-JS	edbf775f37	📊 feat(grafana): add Total Requests and Requests/sec to Key Metrics - Added Total Requests counter for overall user interaction - Added Requests/sec for current load visibility - Reduced panel width to fit 8 metrics in one row Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 12:32:01 +01:00
Till-JS	e7719eeba0	✨ feat(grafana): enhance Master Overview with Key Metrics on top - Move Key Metrics section to top of dashboard - Add new panels: Services UP, Apps Running, Matrix Bots, Avg Response Time - Reorganize layout for better overview at a glance - Remove CPU/Memory/Disk (no node-exporter), add Redis Keys Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 12:28:53 +01:00
Till-JS	9dfad0128a	📈 feat(monitoring): upgrade to VictoriaMetrics + DuckDB analytics - Replace Prometheus with VictoriaMetrics (2-year retention) - Add DuckDB analytics module for business KPIs (unlimited retention) - Add master overview dashboard combining all metrics - Add business metrics dashboard for user growth tracking - Add backup script for VictoriaMetrics snapshots and DuckDB - Add ADR documentation for monitoring stack decision Analytics API endpoints: - GET /api/v1/analytics/health - Service health - GET /api/v1/analytics/latest - Latest metrics snapshot - GET /api/v1/analytics/growth - User growth over time - GET /api/v1/analytics/monthly - Monthly aggregates - POST /api/v1/analytics/snapshot - Manual snapshot trigger	2026-01-28 12:38:04 +01:00
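
Commit 143112f77a replaces the hardcoded ServiceDown list with a single alternation. The sketch below shows which service names such a pattern would select. It assumes the pattern is applied to a label such as `job` (the log does not name the label), and the sample names are hypothetical. Prometheus anchors label-matcher regexes at both ends, which `re.fullmatch` approximates here.

```python
import re

# Alternation quoted in commit 143112f77a, with the `\|` escapes removed.
SERVICE_DOWN_PATTERN = r".*-backend|mana-core-auth|mana-search|mana-media|synapse"


def is_covered(name: str) -> bool:
    """Return True if `name` matches the pattern as a whole string.

    fullmatch mirrors the implicit ^...$ anchoring of Prometheus
    label matchers; a partial match does not count.
    """
    return re.fullmatch(SERVICE_DOWN_PATTERN, name) is not None


if __name__ == "__main__":
    # Hypothetical service names, for illustration only.
    for name in ["photos-backend", "mana-search", "synapse",
                 "mana-search-worker", "redis"]:
        print(f"{name}: {is_covered(name)}")
```

Because of the anchoring, any new service whose name ends in `-backend` would be picked up without touching the rule, while a name such as `mana-search-worker` would not match.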

6 commits
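
Commit 4e370911e8 describes a shell script that pushes `mac_disk_used_percent` to a Pushgateway every hour. The following is a minimal Python stand-in for that idea, not the contents of check-disk-space.sh. The gateway address, the job name `mac_disk`, and the helper names are assumptions made for illustration; only the metric name comes from the commit message.

```python
import shutil
import urllib.request


def used_percent(total: int, used: int) -> float:
    """Disk usage as a percentage, rounded to two decimals."""
    return round(100.0 * used / total, 2)


def build_payload(metrics: dict[str, float]) -> bytes:
    """Render gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in metrics.items():
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The text format requires a trailing newline.
    return ("\n".join(lines) + "\n").encode("utf-8")


def push(payload: bytes, gateway: str, job: str) -> None:
    """PUT the payload to the Pushgateway, replacing the job's metric group."""
    req = urllib.request.Request(
        f"{gateway}/metrics/job/{job}",
        data=payload,
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        resp.read()


if __name__ == "__main__":
    usage = shutil.disk_usage("/")
    payload = build_payload(
        {"mac_disk_used_percent": used_percent(usage.total, usage.used)}
    )
    print(payload.decode("utf-8"), end="")
    # Assumed address and job name; uncomment once a gateway is reachable.
    # push(payload, "http://localhost:9091", "mac_disk")
```

An alert rule can then evaluate the pushed gauge directly, which is the approach the commit describes for replacing the node-exporter based disk alerts.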