- Add explicit uid: deploy-tracking to datasource provisioning
- Add instant: true to all Prometheus stat/gauge panel queries
- Pushgateway gauges need instant queries, not range queries
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instrument the CD pipeline to record per-deploy and per-service metrics
(build time, image size, startup time, health status) into PostgreSQL and
push gauges to Pushgateway. Adds a Grafana dashboard with 13 panels covering
deploy frequency, build performance, service health, and history.
New files:
- scripts/mac-mini/init-deploy-tracking.sql (idempotent DDL)
- scripts/deploy-metrics.sh (bash library for CI)
- docker/grafana/provisioning/datasources/deploy-tracking.yml
- docker/grafana/dashboards/deploy-tracking.json
Modified:
- docker/prometheus/prometheus.yml (pushgateway scrape job)
- .github/workflows/cd-macmini.yml (build/health instrumentation)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add PostgreSQL datasource pointing to GlitchTip database
- Add Error Tracking dashboard with 7 panels:
- Total Open Issues (stat)
- Issues by Project (pie chart)
- Total Events (stat)
- Projects Tracked (stat)
- Resolved vs Unresolved (stat)
- New Issues Over Time (stacked bar chart, 30 days)
- Recent Issues (table with 50 latest, color-coded levels)
- Dashboard links to GlitchTip UI for detailed investigation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Infrastructure:
- Add GlitchTip (web + worker) to docker-compose.macmini.yml (port 8020)
- Add glitchtip.mana.how to Cloudflare Tunnel config
- Add glitchtip database to init-db SQL
- Add GLITCHTIP_DSN to .env.development
Shared Package (@manacore/shared-error-tracking):
- initErrorTracking() - Sentry-compatible init with GlitchTip DSN
- captureException(), captureMessage(), setUser(), setTag(), flush()
- SentryExceptionFilter for NestJS (captures 5xx errors only)
- Graceful no-op when DSN is not configured
Integration:
- Add instrument.ts to calendar, contacts, todo backends
- Import instrument.ts before app bootstrap in all 3 main.ts files
- Error tracking auto-initializes when GLITCHTIP_DSN env var is set
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create NestJS backend on port 3020 with 4 modules (space, document, ai, token)
- Add Drizzle schema with 5 tables (spaces, documents, token_transactions, model_prices, user_tokens)
- Rewrite web services (spaces, documents, tokens, ai) to use shared API client instead of Supabase
- Move AI API keys server-side (Azure OpenAI, Google Gemini)
- Add seed script for model prices (gpt-4.1, gemini-pro, gemini-flash)
- Add 70 unit tests across 4 test suites (space, document, token, ai services)
- Add monorepo integration (setup-databases.sh, generate-env.mjs, docker init-db, root scripts)
- Remove @supabase/supabase-js dependency and delete supabase.ts from web app
- Update CLAUDE.md with full API documentation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove !login and !logout commands from all 16+ Matrix bots
- Remove login/logout references from all help/welcome messages
- Disable password login in Synapse (password_config.enabled: false)
- System is now OIDC-only via Mana Core authentication
Users must authenticate via "Sign in with Mana Core" in Element.
Existing bot access tokens remain valid.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Medium priority stability improvements:
Alerting:
- Add vmalert for evaluating Prometheus alert rules
- Add alertmanager for alert routing and grouping
- Add alert-notifier service for Telegram/ntfy notifications
- Enable cadvisor scraping in prometheus config
Disk Monitoring:
- Add check-disk-space.sh for hourly disk monitoring
- Alert on 80% (warning) and 90% (critical) thresholds
- Auto-cleanup Docker when disk is critical
- Add com.manacore.disk-check.plist for LaunchD
Weekly Reports:
- Add weekly-report.sh for system health summary
- Includes: backup status, disk usage, container health,
database stats, error log summary
- Runs every Sunday at 10 AM via LaunchD
Health Check Updates:
- Add checks for vmalert, alertmanager, alert-notifier
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Disable api-gateway and skilltree-web (no working images/Dockerfiles)
- Fix mana-search Dockerfile healthcheck port and endpoint
- Update health-check.sh to skip disabled services
- Fix search service health endpoint (/api/v1/health)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add PlaygroundLogo to shared-branding package
- Add playground to APP_BRANDING, APP_ICONS, and APP_URLS
- Replace custom login/register pages with shared-auth-ui components
- Update authStore with resendVerificationEmail and improved signUp
- Add Caddy reverse proxy entry for playground.mana.how
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix LoggerService mock in better-auth.service.spec.ts
- Fix name assertion in auth.controller.spec.ts (empty string fallback)
- Fix createRemoteJWKSet mock in jwt-auth.guard.spec.ts
- Add Grafana dashboard for Auth Service monitoring
- Add 10 auth-specific Prometheus alert rules
- Update production readiness plan to 100% complete
All 199 unit tests passing.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add node-exporter service to docker-compose for CPU/Memory/Disk monitoring
- Enable node-exporter scrape target in Prometheus config
- Update System Overview dashboard with Host System section:
- CPU, Memory, Disk usage gauges
- Total RAM, Total Disk, Uptime, Load stats
- CPU & Memory over time graph
- Network I/O graph
- Add Node Exporter to service status panel
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Removed node_* metrics (node-exporter not deployed)
- Removed container_last_seen (cAdvisor not deployed)
- Added Service Status, Traffic Overview, Database sections
- All panels now use available Prometheus metrics
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added Total Requests counter for overall user interaction
- Added Requests/sec for current load visibility
- Reduced panel width to fit 8 metrics in one row
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Move Key Metrics section to top of dashboard
- Add new panels: Services UP, Apps Running, Matrix Bots, Avg Response Time
- Reorganize layout for better overview at a glance
- Remove CPU/Memory/Disk (no node-exporter), add Redis Keys
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Commented out:
- node-exporter (container not deployed)
- cadvisor (container not deployed)
- storage/presi/nutriphi-backend (no /metrics endpoint yet)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add SYNAPSE_OIDC_CLIENT_SECRET to mana-core-auth env
- Enable OIDC provider config in homeserver.yaml
- Add matrix.mana.how and element.mana.how to CORS origins
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update Synapse OIDC configuration to use the registered
matrix-synapse client for SSO authentication.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add OIDC Provider plugin to Better Auth configuration
- Add OIDC database tables (oauth_applications, oauth_access_tokens,
oauth_authorization_codes, oauth_consents)
- Configure Synapse as OIDC client in homeserver.yaml
- Update Element Web config for SSO support
- Add seed script for OIDC clients (db:seed:oidc)
- Update Cloudflare tunnel config with Matrix URLs
This enables Single Sign-On between Mana Core Auth and Matrix/Synapse,
allowing users to authenticate via their existing Mana account.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Since bots cannot support E2E encryption and all data is stored locally,
hide the encryption warning banners for better UX.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace Prometheus with VictoriaMetrics (2-year retention)
- Add DuckDB analytics module for business KPIs (unlimited retention)
- Add master overview dashboard combining all metrics
- Add business metrics dashboard for user growth tracking
- Add backup script for VictoriaMetrics snapshots and DuckDB
- Add ADR documentation for monitoring stack decision
Analytics API endpoints:
- GET /api/v1/analytics/health - Service health
- GET /api/v1/analytics/latest - Latest metrics snapshot
- GET /api/v1/analytics/growth - User growth over time
- GET /api/v1/analytics/monthly - Monthly aggregates
- POST /api/v1/analytics/snapshot - Manual snapshot trigger
- Add user metrics to mana-core-auth MetricsService:
- auth_users_total: Total registered users
- auth_users_verified: Email-verified users
- auth_users_created_today/this_week/this_month
- Create Grafana user-statistics dashboard with:
- User overview stats (total, verified, verification rate, new today)
- Registration period breakdown (today/week/month)
- User growth trends over time
- Enhance telegram-stats-bot /users command:
- Add yesterday comparison with trends
- Add week-over-week comparison
- Add mini bar chart for last 7 days registration
- Include user stats in daily Telegram report
Reverting 618c58c5 which broke the CI workflow.
Will re-add notifications after fixing the issue.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add notify-start job with Telegram notification for build start
- Add notify-complete job with build status and duration notification
- Push CI metrics to Prometheus Pushgateway for Grafana visualization
- Create CI/CD Grafana dashboard with build status, duration, and history
- Add Pushgateway scrape config to Prometheus
Requires TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID, and PUSHGATEWAY_URL secrets.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New dashboards:
- Application Details: Node.js runtime (heap, event loop, GC),
HTTP details (status codes, methods, top routes), error analysis
- Database Details: PostgreSQL and Redis metrics with detailed breakdowns
Alerting rules (docker/prometheus/alerts.yml):
- Service: down, high/very high error rate, slow response time
- Infrastructure: high CPU/memory/disk usage
- Database: PostgreSQL/Redis down, high connections, low cache hit
- Container: high CPU/memory, restarts
All dashboards include service selector variable for filtering.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add MetricsModule with prom-client for todo backend
- Add MetricsInterceptor for request tracking
- Update COMMANDS.md with presi and storage commands
- Update Grafana dashboards for backend monitoring
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update all tracking script URLs and admin dashboard links to use the
new stats.mana.how subdomain for Umami web analytics.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Merge till-dev branch containing:
- Planta plant care tracking application
- Clock backend with alarms, timers, world clocks
- Zitare backend with favorites and lists
- Various app improvements and fixes
- Auth system updates
- Infrastructure improvements
Note: Some type-check issues may need resolution after merge.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The todo-backend Dockerfile (and potentially other backends) expect this script
to exist in docker/shared/. This script builds shared packages in dependency
order during Docker image builds.
Fixes CI failure: "ERROR: failed to build: /docker/shared/build-shared-packages.sh: not found"
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add picture-backend and picture-web to CI Docker build matrix
- Add picture services to staging deployment workflow
- Add picture-backend to production deployment workflow
- Create Dockerfile and docker-entrypoint.sh for picture-web
- Fix picture-backend Dockerfile port (3003→3006) and health endpoint
- Add picture routes to Caddyfile.staging
- Add REPLICATE_API_TOKEN and MANA_CORE_SERVICE_KEY env vars
- Add scripts/setup-databases.sh for automatic DB creation and schema push
- Add dev:*:full commands (chat, zitare, contacts, calendar, clock, todo, picture)
- Update docker/init-db to create all databases on first startup
- Add docs/LOCAL_DEVELOPMENT.md with comprehensive local dev guide
- Update CLAUDE.md with new quick start commands
Now developers can run `pnpm dev:chat:full` to automatically:
1. Create the database if missing
2. Push the latest schema
3. Start auth, backend, and web with colored output
- Add vCard/CSV file import with duplicate detection and merge options
- Add Google Contacts OAuth2 integration for importing from Google
- Add vCard/CSV export with format selection and filtering options
- Add connected_accounts table for OAuth token storage
- Add FileUploader, ImportPreview, GoogleImport components
- Add ExportModal with format selection (vCard/CSV)
- Add i18n translations for import/export (DE/EN)