Commit graph

14 commits

Author SHA1 Message Date
Till-JS
fe33f4b355 fix(mana-core-auth): complete production readiness with test fixes
- Fix LoggerService mock in better-auth.service.spec.ts
- Fix name assertion in auth.controller.spec.ts (empty string fallback)
- Fix createRemoteJWKSet mock in jwt-auth.guard.spec.ts
- Add Grafana dashboard for Auth Service monitoring
- Add 10 auth-specific Prometheus alert rules
- Update production readiness plan to 100% complete

All 199 unit tests passing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 14:18:58 +01:00
Till-JS
fdaf6a9c75 🔧 fix(dashboards): fix broken panels and metrics
- Backends: Remove Docker container section (cAdvisor not deployed)
- Backends: Add Auth Service Runtime section with correct auth_ prefixed metrics
- Backends: Rename to "Backends Overview"
- Application Details: Fix Node.js Runtime to use auth_ prefixed metrics
- Application Details: Rename section to "Auth Service Runtime"

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 12:54:07 +01:00
Till-JS
7aa5115c78 📊 feat(monitoring): add node-exporter for host system metrics
- Add node-exporter service to docker-compose for CPU/Memory/Disk monitoring
- Enable node-exporter scrape target in Prometheus config
- Update System Overview dashboard with Host System section:
  - CPU, Memory, Disk usage gauges
  - Total RAM, Total Disk, Uptime, Load stats
  - CPU & Memory over time graph
  - Network I/O graph
- Add Node Exporter to service status panel

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 12:38:44 +01:00
Till-JS
84e9f86db9 🔧 fix(grafana): rewrite System Overview with available metrics
- Removed node_* metrics (node-exporter not deployed)
- Removed container_last_seen (cAdvisor not deployed)
- Added Service Status, Traffic Overview, Database sections
- All panels now use available Prometheus metrics

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 12:33:11 +01:00
Till-JS
edbf775f37 📊 feat(grafana): add Total Requests and Requests/sec to Key Metrics
- Added Total Requests counter for overall user interaction
- Added Requests/sec for current load visibility
- Reduced panel width to fit 8 metrics in one row

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 12:32:01 +01:00
Till-JS
e7719eeba0 feat(grafana): enhance Master Overview with Key Metrics on top
- Move Key Metrics section to top of dashboard
- Add new panels: Services UP, Apps Running, Matrix Bots, Avg Response Time
- Reorganize layout for better overview at a glance
- Remove CPU/Memory/Disk (no node-exporter), add Redis Keys

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 12:28:53 +01:00
Till-JS
9b7d8c36b8 🐛 fix(grafana): correct VictoriaMetrics datasource port (8428 → 9090)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 05:13:48 +01:00
Till-JS
9dfad0128a 📈 feat(monitoring): upgrade to VictoriaMetrics + DuckDB analytics
- Replace Prometheus with VictoriaMetrics (2-year retention)
- Add DuckDB analytics module for business KPIs (unlimited retention)
- Add master overview dashboard combining all metrics
- Add business metrics dashboard for user growth tracking
- Add backup script for VictoriaMetrics snapshots and DuckDB
- Add ADR documentation for monitoring stack decision

Analytics API endpoints:
- GET /api/v1/analytics/health - Service health
- GET /api/v1/analytics/latest - Latest metrics snapshot
- GET /api/v1/analytics/growth - User growth over time
- GET /api/v1/analytics/monthly - Monthly aggregates
- POST /api/v1/analytics/snapshot - Manual snapshot trigger
2026-01-28 12:38:04 +01:00
Till-JS
0cd2bc858a feat(stats): add user statistics to Prometheus metrics and Grafana
- Add user metrics to mana-core-auth MetricsService:
  - auth_users_total: Total registered users
  - auth_users_verified: Email-verified users
  - auth_users_created_today/this_week/this_month
- Create Grafana user-statistics dashboard with:
  - User overview stats (total, verified, verification rate, new today)
  - Registration period breakdown (today/week/month)
  - User growth trends over time
- Enhance telegram-stats-bot /users command:
  - Add yesterday comparison with trends
  - Add week-over-week comparison
  - Add mini bar chart for last 7 days registration
- Include user stats in daily Telegram report
2026-01-26 10:53:57 +01:00
Till-JS
edf13b7102 revert: fix CI by reverting Telegram notifications
Reverting 618c58c5 which broke the CI workflow.
Will re-add notifications after fixing the issue.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 10:40:10 +01:00
Till-JS
618c58c519 feat(ci): add Telegram notifications and Grafana CI/CD dashboard
- Add notify-start job with Telegram notification for build start
- Add notify-complete job with build status and duration notification
- Push CI metrics to Prometheus Pushgateway for Grafana visualization
- Create CI/CD Grafana dashboard with build status, duration, and history
- Add Pushgateway scrape config to Prometheus

Requires TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID, and PUSHGATEWAY_URL secrets.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 10:31:17 +01:00
Till-JS
8c259a008b feat(monitoring): add comprehensive Grafana dashboards and alerting
New dashboards:
- Application Details: Node.js runtime (heap, event loop, GC),
  HTTP details (status codes, methods, top routes), error analysis
- Database Details: PostgreSQL and Redis metrics with detailed breakdowns

Alerting rules (docker/prometheus/alerts.yml):
- Service: down, high/very high error rate, slow response time
- Infrastructure: high CPU/memory/disk usage
- Database: PostgreSQL/Redis down, high connections, low cache hit
- Container: high CPU/memory, restarts

All dashboards include service selector variable for filtering.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 09:47:18 +01:00
Till-JS
4a236a7a1f feat(todo): add Prometheus metrics and update docs
- Add MetricsModule with prom-client for todo backend
- Add MetricsInterceptor for request tracking
- Update COMMANDS.md with presi and storage commands
- Update Grafana dashboards for backend monitoring

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 13:31:44 +01:00
Till-JS
6d86a08d63 feat: add monitoring dashboard (Prometheus + Grafana + Umami + Admin)
Phase 1: Infrastructure
- Add docker/prometheus/prometheus.yml with scrape configs for all services
- Add docker/grafana/provisioning for auto-configured datasources
- Add docker/grafana/dashboards (system-overview, backends-docker)
- Update docker-compose.macmini.yml with monitoring services:
  - prometheus, grafana, node-exporter, cadvisor
  - postgres-exporter, redis-exporter, umami
- Add grafana.mana.how and analytics.mana.how to Caddyfile

Phase 2: Backend Metrics
- Create packages/shared-nestjs-metrics with:
  - MetricsModule (auto /metrics endpoint)
  - MetricsService (Counter, Histogram, Gauge helpers)
  - MetricsMiddleware (auto HTTP request tracking)

Phase 3: Umami Web Analytics
- Add Umami tracking scripts to all landing pages
- Add Umami tracking scripts to all web apps
- Create scripts/mac-mini/setup-umami-db.sh

Phase 4: Admin Dashboard (ManaCore Web)
- Add admin routes: /admin, /admin/users, /admin/system
- Create StatCard, QuickLinks, UserTable components
- Add Admin link to navigation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 15:31:39 +01:00