managarten

mirror of https://github.com/Memo-2023/mana-monorepo.git synced 2026-05-14 23:41:08 +02:00

Author	SHA1	Message	Date
Till JS	3f4a100b3b	fix(docker): remove backend-only packages from sveltekit-base shared-errors, shared-logger, shared-llm, notify-client are not needed by SvelteKit web apps. Their presence caused transitive dependency conflicts (astro check failing). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 21:17:46 +01:00
Till JS	9276d9a212	feat: GPU offload, signup limit, load tests & capacity planning - Route all AI workloads (Ollama, STT, TTS, Image Gen) to GPU server (192.168.178.11) via LAN instead of host.docker.internal - Upgrade default model to gemma3:12b and max concurrent to 5 - Add daily signup limit service (MAX_DAILY_SIGNUPS env var) - Add GET /api/v1/auth/signup-status public endpoint - Add k6 load test suite (web-apps, auth, sync-websocket, ollama) - Add capacity planning documentation - Fix: add eslint-config to sveltekit-base and calendar Dockerfiles Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 21:14:24 +01:00
Till JS	16367384c7	fix(docker): use --no-frozen-lockfile in all web Dockerfiles After extensive package restructuring (deletions, consolidations, new packages), the frozen lockfile causes resolution failures in Docker. Use --no-frozen-lockfile until lockfile stabilizes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 21:12:03 +01:00
Till JS	105a7b041f	fix(docker): add missing packages to sveltekit-base Dockerfile Add 8 packages that were created after the base image was defined: subscriptions, credits, shared-hono, shared-storage, shared-landing-ui, shared-llm, notify-client, shared-errors, shared-logger Fixes: Rollup failed to resolve @manacore/subscriptions during web builds. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 20:46:07 +01:00
Till JS	9ba05537ff	fix(docker): update sveltekit-base with renamed/new packages Replace deleted shared-feedback-* and shared-help-* packages with consolidated @manacore/feedback and @manacore/help packages. Add local-store and shared-auth-stores packages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 18:23:44 +01:00
Till JS	79a53cf70a	fix(infra): sync Prometheus + cloudflared ports with current deployment - Prometheus: mana-sync 3010→3051, mana-matrix-bot 4001→4000 - Cloudflared: api.mana.how 3060→3016 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 18:07:12 +01:00
Till JS	099a40bbd1	chore: replace all mana-core-auth references with mana-auth Update docker-compose (dev + macmini), CI/CD workflows, Prometheus, package.json scripts, env generation, database setup, CODEOWNERS, and dependabot to reference the new Hono-based mana-auth service. Delete zombie mana-core-auth directory (already removed from Git). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 18:05:31 +01:00
Till JS	18fae3b66d	feat(infra): add docker-compose for new Hono services + DB init - Add mana-user (3062), mana-subscriptions (3063), mana-analytics (3064) to docker-compose with health checks and traefik labels - Replace old NestJS Tier 3 app backends (~300 lines) with comment placeholder for Hono compute servers (need shared Dockerfile) - Create docker/Dockerfile.hono-server — shared Bun Dockerfile for all 14 app compute servers (ARG APP for build context) - Add 5 new databases to setup-databases.sh: mana_auth, mana_credits, mana_user, mana_subscriptions, mana_analytics, mana_sync Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 17:54:24 +01:00
Till JS	14099cc42c	docs(infra): add PORT_SCHEMA.md + update Prometheus scrape targets Comprehensive port schema documentation as single source of truth. All services assigned to logical ranges: - 3000-3009: Core platform (auth, credits, subscriptions, user, analytics) - 3010-3019: Core infra (sync, media, search, notify, crawler, gateway) - 3020-3029: AI/ML (llm, stt, tts, image-gen, voice-bot) - 3030-3059: App backends - 4000-4099: Matrix stack - 5000-5059: Web frontends - 8000-8099: Monitoring - 9000-9199: Infrastructure exporters All port conflicts resolved. Prometheus targets updated to match. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 03:02:12 +01:00
Till JS	753c685ef7	feat(services): create mana-analytics, remove feedback/analytics/ai from auth Extract feedback, analytics, and AI modules from mana-core-auth into standalone mana-analytics service (Hono + Bun, Port 3064). New service (services/mana-analytics/): - User feedback CRUD with voting - AI-powered feedback title generation via mana-llm - Simplified from DuckDB analytics to pure PostgreSQL - ~550 LOC Removed from mana-core-auth: - feedback/ module (6 files) - analytics/ module (4 files) - ai/ module (3 files) - db/schema/feedback.schema.ts mana-core-auth now contains ONLY pure auth: - Better Auth (JWT, Sessions, 2FA, Passkeys, OIDC, Magic Links) - Organizations/Guilds (membership management) - API Keys, Security, Me (GDPR), Health, Metrics - Ready for Phase 5: Hono rewrite Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 02:29:24 +01:00
Till JS	ced7dd7441	feat(monitoring): add mana-sync, mana-notify, mana-crawler to Prometheus Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 02:18:21 +01:00
Till JS	d7799ec95d	refactor(photos): remove NestJS backend, use local-first + direct mana-media The Photos NestJS backend was a proxy to mana-media that enriched responses with local album/favorite/tag data. Now: - Albums store → local-first via albumCollection + albumItemCollection - Favorites → local-first via favoriteCollection (toggle in IndexedDB) - Photo tags → local-first via photoTagCollection - Photo listing/stats → direct mana-media API calls from frontend - Upload → direct mana-media upload from frontend - Delete → direct mana-media delete from frontend Removed 27 TypeScript files, 1 Docker container, 1 port (3039). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 02:18:03 +01:00
Till JS	dd2f814cf3	refactor(presi): replace NestJS backend with lightweight Hono server The Presi NestJS backend (40 source files, 50 deps) was a CRUD wrapper around decks, slides, and themes — all now handled by local-first sync. Only the share-link feature requires server-side state (public URLs without auth), so a minimal Hono + Bun server replaces the entire NestJS backend: - apps/presi/apps/server/ — Hono server with share routes + GDPR admin Uses @manacore/shared-hono for auth (JWKS), health, admin, errors - Web app API client stripped to share-only (was 270 lines → 90 lines) - Removed from docker-compose, CI/CD, Prometheus, env generation - NestJS backend deleted (40 TS files, 8 test specs, 3038 lines) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 02:08:40 +01:00
Till JS	32939fbfb5	refactor(infra): remove zitare + clock NestJS backends, add shared-hono package Both apps are fully local-first via Dexie.js + mana-sync. Their NestJS backends were pure CRUD wrappers (20 + 31 source files) that are no longer needed. Changes: - Add packages/shared-hono: JWT auth via JWKS (jose), Drizzle DB factory, health route, generic GDPR admin handler, error middleware - Migrate zitare lists page from fetch() to listsStore (local-first) - Rewrite clock timers store from API-based to timerCollection (Dexie) - Update clock +layout.svelte CommandBar search to use local collections - Remove zitare-backend + clock-backend from docker-compose, CI/CD, Prometheus, env generation, setup scripts - Add docs/TECHNOLOGY_AUDIT_2026_03.md with full repo analysis Net result: -2 Docker containers, -2 ports, -2728 lines of code Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 22:43:46 +01:00
Till JS	16e0d99c5a	feat(gpu-server): complete GPU server setup with AI services, monitoring, and public access - Set up 5 AI services on Windows GPU server (RTX 3090): - mana-llm (Port 3025): OpenAI-compatible LLM gateway via Ollama - mana-stt (Port 3020): WhisperX with word timestamps + speaker diarization - mana-tts (Port 3022): Kokoro (EN) + Edge TTS (DE) + Piper (local DE) - mana-image-gen (Port 3023): FLUX.2 klein 4B image generation - Ollama (Port 11434): gemma3:4b/12b, qwen2.5-coder:14b, nomic-embed-text - Add @manacore/shared-gpu TypeScript client package with SttClient, TtsClient, ImageClient - Add CUDA-compatible whisper_service using faster-whisper for Windows - Configure public access via Cloudflare Tunnel (gpu-llm/stt/tts/img.mana.how) - Add Loki log aggregator (Docker on Mac Mini) + log shipper on GPU server - Add GPU scrape targets to Prometheus/VictoriaMetrics config - Add Grafana Loki datasource for GPU service logs - Add health check with auto-restart, log rotation, and log shipping - Document complete setup: Always-On config, troubleshooting, architecture Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 21:35:30 +01:00
Till JS	a31ccc6c62	feat(infra): add api.mana.how route + Prometheus scrape targets for Go services - Cloudflare Tunnel: api.mana.how → localhost:3060 (Go API Gateway) - Prometheus: scrape targets for mana-api-gateway:3060 and mana-matrix-bot:4000 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 21:27:04 +01:00
Till JS	cdfbfcd13e	feat(infra): add sveltekit-base image and build-app script for Mac Mini - Add docker/Dockerfile.sveltekit-base: pre-built base with all 34 shared packages (mirrors nestjs-base pattern), eliminates redundant COPY/build steps from individual web Dockerfiles - Add scripts/mac-mini/build-app.sh: stops monitoring stack before build to free RAM, auto-restarts on exit (trap cleanup) - Migrate todo web Dockerfile to use sveltekit-base:local (47 COPY lines → 2, 4 build steps → 0) - Update CD workflow to build sveltekit-base when deploying web apps Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 12:17:48 +01:00
Till JS	1052469397	feat(infra): extend Dockerfile validator to backends and services Validator now checks 52 Dockerfiles (web + backend + service). Fixed 10 missing COPYs across backends, services, and nestjs-base. Generator also supports backend/service Dockerfiles with markers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 08:57:10 +01:00
Till JS	e3115b302d	feat(infra): add Cloudflare fallback plan + self-hosted landing pages Two infrastructure improvements for tech independence: 1. Cloudflare Fallback Documentation (docs/CLOUDFLARE_FALLBACK.md): - Plan B: WireGuard + Caddy on Hetzner VPS (€3.79/mo) - Complete Caddyfile with all 30+ subdomains - Step-by-step failover checklist (~15 min to switch) - Plan C: Direct IP with ISP 2. Self-Hosted Landing Pages (eliminates Cloudflare Pages dependency): - Nginx container (mana-infra-landings) on port 4400 - Multi-site config: each subdomain → separate dist/ folder - Build script: scripts/mac-mini/build-landings.sh - Cloudflare Tunnel ingress rules for 10 landing page domains - Storage: /Volumes/ManaData/landings/ on external SSD - Domains: it, chats, pics, zitares, presis, clocks, manadeck, nutriphi, citycorners, docs Migration path: Build landings locally, set Cloudflare DNS to tunnel instead of Pages, then decommission CF Pages projects. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 12:07:40 +01:00
Till JS	e06e8cca59	fix(infra): use postgres -c flags instead of config_file override The config_file override replaced the entire default PostgreSQL config including listen_addresses, breaking inter-container communication. Use inline -c flags instead which only override specific parameters. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 11:42:42 +01:00
Till JS	fcd7c82ce4	fix(infra): simplify PostgreSQL backup to pg_dumpall + pg_basebackup pgBackRest as Docker sidecar was overly complex (needs shared WAL directory, stanza management, special entrypoint). Replace with a simpler, proven approach using native PostgreSQL tools: Backup container (postgres:16-alpine): - Hourly: pg_dumpall | gzip (all databases as SQL, ~2 day retention) - Daily 03:00: pg_basebackup -Ft -z (physical backup, 30 day retention) - Auto-cleanup of old backups - Storage: /Volumes/ManaData/backups/postgres/ Also: Remove pgbackrest.conf, simplify postgresql.conf (remove WAL archiving config, keep performance tuning + replication for basebackup) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 11:39:20 +01:00
Till JS	39526918a3	feat(infra): add pgBackRest for PostgreSQL Point-in-Time Recovery Replace simple pg_dumpall with pgBackRest PITR backup system. This enables recovery to any second, not just the last daily dump. Configuration: - docker/postgres/postgresql.conf: WAL archiving + performance tuning (shared_buffers=512MB, effective_cache_size=2GB for 16GB Mac Mini) - docker/postgres/pgbackrest.conf: stanza config + retention policy Docker (docker-compose.macmini.yml): - postgres: mount custom config, enable WAL archiving - postgres-backup: new pgBackRest container - Storage: /Volumes/ManaData/backups/pgbackrest - Retention: 4 full + 14 differential (~4 weeks) - Compression: Zstandard (zst) Backup Schedule: - 03:00 daily: Full backup - Every 6h: Differential (changes since last full) - Every hour: Incremental (changes since last backup) - Continuous: WAL archiving (every 60s) Documentation (docs/POSTGRES_BACKUP.md): - Complete restore procedures (full, PITR, single DB) - First-time setup instructions - Monitoring and alerting integration Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 11:18:33 +01:00
Till JS	169821de1a	feat(monitoring): add LLM Grafana dashboard, Prometheus scraping, and alerts Wire mana-llm service into the monitoring stack: Prometheus (docker/prometheus/prometheus.yml): - Add mana-llm scrape job (port 3025, 15s interval) - Include mana-llm in ServiceDown alert expression Alerts (docker/prometheus/alerts.yml): - New llm_alerts group with 4 rules: - LLMServiceDown: mana-llm down > 1 min (critical) - LLMHighErrorRate: > 10% errors for 5 min (warning) - OllamaProviderDown: > 50% requests via Google fallback (warning) - LLMSlowResponses: p95 > 30s for 5 min (warning) Grafana Dashboard (docker/grafana/dashboards/mana-llm.json): - 6 stat panels: status, req/min, error rate, fallback rate, latency, tokens/min - Requests by Provider (stacked area: Ollama vs Google vs OpenRouter) - Tokens by Type (prompt vs completion) - Latency Percentiles (p50, p90, p99) - Latency by Provider comparison - Requests by Model breakdown - Errors by Type - Google Fallback Rate over time (with threshold coloring) - Provider Distribution pie chart (24h) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 11:16:27 +01:00
Till JS	1c5c2446f6	feat(citycorners): add city guide app for Konstanz with full monorepo integration New project with three apps: - Landing (Astro): static site with SVG illustrations, location data - Backend (NestJS, port 3025): CRUD API for locations + favorites, Drizzle ORM, auth via mana-core-auth - Web (SvelteKit, port 5196): Tailwind 4, PillNav, auth (login/register/SSO), Leaflet map, favorites with optimistic updates, theme/settings Infrastructure: DB init SQL, setup-databases.sh, generate-env.mjs, root package.json scripts, Dockerfiles, docker-compose.macmini.yml (backend:3025, web:5022), Cloudflare wrangler.toml. Branding: registered in shared-branding (AppId, APP_BRANDING, APP_ICONS, MANA_APPS, CitycornersLogo). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 10:56:26 +01:00
Till JS	143112f77a	feat(observability): add mana-search, mana-media, and Synapse to monitoring - Add Prometheus scraping for mana-search (port 3020, already has metrics) - Add Prometheus scraping for mana-media (port 3015, MetricsModule added) - Add Prometheus scraping for Matrix Synapse (port 9002, already enabled) - Add MetricsModule to mana-media with media_ prefix - Update Dockerfile for mana-media to include shared-nestjs-metrics - Replace hardcoded ServiceDown alert list with dynamic regex (.*-backend|mana-core-auth|mana-search|mana-media|synapse) - Replace hardcoded backends.json query with dynamic regex - Add Search, Media, Synapse to master-overview and system-overview dashboards Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 10:46:59 +01:00
Till JS	cc5ba3bb90	chore: remove Hetzner legacy artifacts and update docs for Mac Mini self-hosting Deleted files: - docker/caddy/Caddyfile.production + Caddyfile.staging (Hetzner reverse proxy configs) - scripts/deploy/ (deploy-hetzner.sh, build-and-push.sh, health-check.sh, migrate-db.sh, rollback.sh) - scripts/generate-staging-secrets.sh - cicd/ directory (11 Hetzner CI/CD planning docs) - CI_CD_IMPLEMENTATION_SUMMARY.md, CI_CD_README.md, FILES_CREATED.md, HIVE_MIND_FINAL_REPORT.md Updated docs: - CLAUDE.md: Remove Hetzner Object Storage references, update to MinIO - docs/ANALYTICS.md: Cloudflare Tunnel instead of Caddy - docs/URL_SCHEMA.md: Mac Mini + Cloudflare Tunnel instead of Hetzner IP - .env.development: Remove "Hetzner in production" comments Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 10:12:24 +01:00
Till JS	c8de944c8d	feat(monitoring): add GlitchTip health check and disk space monitoring - Add GlitchTip to health-check.sh monitoring endpoints - Add native disk space checks for / and /Volumes/ManaData with 80%/90% thresholds - Extend Prometheus disk alerts to include /host_mnt/Volumes/ManaData mountpoint - Add ManaData disk usage gauge to Grafana system-overview dashboard Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 09:33:09 +01:00
Till JS	c1ef55fd54	fix(infra): rename LightWrite to Mukke in Caddyfile production config LightWrite was replaced by Mukke on the same ports (5180/3010). Update reverse proxy to use mukke.mana.how and mukke-api.mana.how. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 09:26:54 +01:00
Till JS	6fa6509fa5	feat(observability): add metrics and monitoring for all 15 backends - Add MetricsModule to 8 backends missing it (photos, zitare, mukke, planta, picture, storage, presi, nutriphi) - Enable Prometheus scraping for all 15 backends in prometheus.yml (was only 6, with 3 commented out and 6 missing entirely) - Update ServiceDown alert rule to cover all 15 backends - Update Grafana dashboards (backends, master-overview, system-overview) with all backend services in health panels - Fix imprecise regex in application-details dashboard Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 09:09:04 +01:00
Till JS	420926aef1	fix(infra): add no-cache headers for PWA files in Caddyfile Ensure sw.js, manifest.webmanifest, and registerSW.js are never cached by the browser or CDN so service worker updates are picked up immediately after deploys. Uses a reusable Caddy snippet imported by all web app blocks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 19:10:49 +01:00
Till JS	403b1c7b87	feat(storage): add controller tests, Caddy config, and PWA improvements Controller tests (50 tests, all passing): - FileController: 12 tests (CRUD, upload, download with headers/URL mode) - FolderController: 8 tests (CRUD, move, favorite) - TrashController: 6 tests (restore file/folder, permanent delete, empty) - SearchController: 6 tests (search, empty query, favorites) - ShareController: 7 tests (CRUD, expiresInDays conversion, public token) - TagController: 7 tests (CRUD with optional color) Total test count now: 159 (133 backend + 26 web) Deployment: - Add Caddy reverse proxy entries for storage.mana.how and storage-api.mana.how PWA: - Upgrade to 'full' preset for better offline caching (fonts, external resources) - Add app shortcuts: Dateien, Suche, Favoriten - Improve offline page with links to cached pages Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 12:48:11 +01:00
Till JS	683a4c5331	feat(docker): add shared NestJS builder base image - Add docker/Dockerfile.nestjs-base with all shared packages pre-built - Convert 6 backend Dockerfiles (chat, todo, calendar, clock, contacts, mukke) to inherit from nestjs-base:local - Fix bugs: duplicate shared-nestjs-setup builds (mukke), unnecessary shared-error-tracking rebuild in production stage (chat, clock) - CD pipeline builds base image before services when backends deploy - Net reduction: 317 lines removed, 112 added (-205 lines) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 10:48:31 +01:00
Till JS	d9ccb5e31b	feat(games): add whopixels hosting at whopxl.mana.how Dockerfile, docker-compose service (port 5100), Caddy and cloudflared routing for the WhoPixels game. PORT is now configurable via env var. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 19:57:50 +01:00
Till JS	42fe39c6a2	fix(infra): fix deploy tracking dashboard datasource UIDs and instant queries - Add explicit uid: deploy-tracking to datasource provisioning - Add instant: true to all Prometheus stat/gauge panel queries - Pushgateway gauges need instant queries, not range queries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 17:35:41 +01:00
Till JS	dea632c6c7	fix(caddy): update all reverse proxy ports to match docker containers Many Caddy ports were outdated and pointing to dead services: - mana.how: 5173→5000 - chat: 3000→5010, chat-api: 3002→3030 - todo: 5188→5011 - calendar: 5186→5012, calendar-api: 3016→3032 - clock: 5187→5013, clock-api: 3017→3033 - contacts: 5184→5014 - grafana: 3100→8000, stats: 3200→8010 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 17:09:09 +01:00
Till JS	3f91c4656a	feat(infra): add deploy tracking with PostgreSQL, Pushgateway & Grafana dashboard Instrument the CD pipeline to record per-deploy and per-service metrics (build time, image size, startup time, health status) into PostgreSQL and push gauges to Pushgateway. Adds a Grafana dashboard with 13 panels covering deploy frequency, build performance, service health, and history. New files: - scripts/mac-mini/init-deploy-tracking.sql (idempotent DDL) - scripts/deploy-metrics.sh (bash library for CI) - docker/grafana/provisioning/datasources/deploy-tracking.yml - docker/grafana/dashboards/deploy-tracking.json Modified: - docker/prometheus/prometheus.yml (pushgateway scrape job) - .github/workflows/cd-macmini.yml (build/health instrumentation) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 17:08:03 +01:00
Till JS	f264e9f2ae	feat(grafana): add GlitchTip error tracking dashboard - Add PostgreSQL datasource pointing to GlitchTip database - Add Error Tracking dashboard with 7 panels: - Total Open Issues (stat) - Issues by Project (pie chart) - Total Events (stat) - Projects Tracked (stat) - Resolved vs Unresolved (stat) - New Issues Over Time (stacked bar chart, 30 days) - Recent Issues (table with 50 latest, color-coded levels) - Dashboard links to GlitchTip UI for detailed investigation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 21:14:09 +01:00
Till JS	b11e1284dc	feat(error-tracking): add GlitchTip integration with shared error-tracking package Infrastructure: - Add GlitchTip (web + worker) to docker-compose.macmini.yml (port 8020) - Add glitchtip.mana.how to Cloudflare Tunnel config - Add glitchtip database to init-db SQL - Add GLITCHTIP_DSN to .env.development Shared Package (@manacore/shared-error-tracking): - initErrorTracking() - Sentry-compatible init with GlitchTip DSN - captureException(), captureMessage(), setUser(), setTag(), flush() - SentryExceptionFilter for NestJS (captures 5xx errors only) - Graceful no-op when DSN is not configured Integration: - Add instrument.ts to calendar, contacts, todo backends - Import instrument.ts before app bootstrap in all 3 main.ts files - Error tracking auto-initializes when GLITCHTIP_DSN env var is set Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 13:30:13 +01:00
Till JS	ea4b585f37	feat(context): add NestJS backend, PostgreSQL database, and migrate web app from Supabase to API - Create NestJS backend on port 3020 with 4 modules (space, document, ai, token) - Add Drizzle schema with 5 tables (spaces, documents, token_transactions, model_prices, user_tokens) - Rewrite web services (spaces, documents, tokens, ai) to use shared API client instead of Supabase - Move AI API keys server-side (Azure OpenAI, Google Gemini) - Add seed script for model prices (gpt-4.1, gemini-pro, gemini-flash) - Add 70 unit tests across 4 test suites (space, document, token, ai services) - Add monorepo integration (setup-databases.sh, generate-env.mjs, docker init-db, root scripts) - Remove @supabase/supabase-js dependency and delete supabase.ts from web app - Update CLAUDE.md with full API documentation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 09:28:01 +01:00
Till-JS	6797195bdc	🔧 chore(infra): add lightwrite subdomain configuration - Add lightwrite.mana.how → localhost:5180 - Add lightwrite-api.mana.how → localhost:3010 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-16 11:19:54 +01:00
Till-JS	d81b8aebf2	🔒 refactor(bots): remove !login command and enforce OIDC-only auth - Remove !login and !logout commands from all 16+ Matrix bots - Remove login/logout references from all help/welcome messages - Disable password login in Synapse (password_config.enabled: false) - System is now OIDC-only via Mana Core authentication Users must authenticate via "Sign in with Mana Core" in Element. Existing bot access tokens remain valid. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-14 11:26:58 +01:00
Till-JS	405084b52d	🔧 fix(skilltree): change web port to 5020 (5018 used by zitare)	2026-02-13 23:14:38 +01:00
Till-JS	d49d147257	🔧 chore(caddy): add skilltree reverse proxy config	2026-02-13 23:11:12 +01:00
Till-JS	acc8de36ee	feat(monitoring): add alerting stack and maintenance scripts Medium priority stability improvements: Alerting: - Add vmalert for evaluating Prometheus alert rules - Add alertmanager for alert routing and grouping - Add alert-notifier service for Telegram/ntfy notifications - Enable cadvisor scraping in prometheus config Disk Monitoring: - Add check-disk-space.sh for hourly disk monitoring - Alert on 80% (warning) and 90% (critical) thresholds - Auto-cleanup Docker when disk is critical - Add com.manacore.disk-check.plist for LaunchD Weekly Reports: - Add weekly-report.sh for system health summary - Includes: backup status, disk usage, container health, database stats, error log summary - Runs every Sunday at 10 AM via LaunchD Health Check Updates: - Add checks for vmalert, alertmanager, alert-notifier Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-12 13:46:57 +01:00
Till-JS	d5e18c9c27	🔧 fix(mac-mini): update health checks and disable missing services - Disable api-gateway and skilltree-web (no working images/Dockerfiles) - Fix mana-search Dockerfile healthcheck port and endpoint - Update health-check.sh to skip disabled services - Fix search service health endpoint (/api/v1/health) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-12 13:28:55 +01:00
Till-JS	8525020e8a	feat(playground): integrate shared auth UI for consistent login experience - Add PlaygroundLogo to shared-branding package - Add playground to APP_BRANDING, APP_ICONS, and APP_URLS - Replace custom login/register pages with shared-auth-ui components - Update authStore with resendVerificationEmail and improved signUp - Add Caddy reverse proxy entry for playground.mana.how Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 14:53:51 +01:00
Till-JS	fe33f4b355	✅ fix(mana-core-auth): complete production readiness with test fixes - Fix LoggerService mock in better-auth.service.spec.ts - Fix name assertion in auth.controller.spec.ts (empty string fallback) - Fix createRemoteJWKSet mock in jwt-auth.guard.spec.ts - Add Grafana dashboard for Auth Service monitoring - Add 10 auth-specific Prometheus alert rules - Update production readiness plan to 100% complete All 199 unit tests passing. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 14:18:58 +01:00
Till-JS	fdaf6a9c75	🔧 fix(dashboards): fix broken panels and metrics - Backends: Remove Docker container section (cAdvisor not deployed) - Backends: Add Auth Service Runtime section with correct auth_ prefixed metrics - Backends: Rename to "Backends Overview" - Application Details: Fix Node.js Runtime to use auth_ prefixed metrics - Application Details: Rename section to "Auth Service Runtime" Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 12:54:07 +01:00
Till-JS	7aa5115c78	📊 feat(monitoring): add node-exporter for host system metrics - Add node-exporter service to docker-compose for CPU/Memory/Disk monitoring - Enable node-exporter scrape target in Prometheus config - Update System Overview dashboard with Host System section: - CPU, Memory, Disk usage gauges - Total RAM, Total Disk, Uptime, Load stats - CPU & Memory over time graph - Network I/O graph - Add Node Exporter to service status panel Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 12:38:44 +01:00
Till-JS	84e9f86db9	🔧 fix(grafana): rewrite System Overview with available metrics - Removed node_* metrics (node-exporter not deployed) - Removed container_last_seen (cAdvisor not deployed) - Added Service Status, Traffic Overview, Database sections - All panels now use available Prometheus metrics Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 12:33:11 +01:00

86 commits