All web app subdomains (chat.mana.how, todo.mana.how, etc.) were removed
when the unified app launched, but monitoring configs still referenced them.
Update blackbox targets to use mana.how/route URLs, remove stale API backend
routes from cloudflared, clean up CORS origins, and fix status page generator
to handle route-based URLs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New GPU service for fast text-to-video generation using LTX-Video (~2B params)
on the RTX 3090. Generates 480p clips in 10-30 seconds, uses ~10GB VRAM.
Includes Cloudflare Tunnel route, Prometheus monitoring, and health checks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- inventar-web: fix mangled icon import in settings page
- skilltree-web: create missing lib/services/storage.ts for export/import
- startup.sh: add umami/synapse DB creation + synapse user setup with C locale
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevents the internal SSD from filling up if the external SSD is not
mounted or if `colima delete` wiped the datadisk symlink.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- check-disk-space.sh: always prune dangling images, unused volumes, and
build cache >7 days on every run (not just at critical threshold)
- check-disk-space.sh: auto-remove node_modules if found on server
(never needed — Docker builds inside containers)
- disk-check launchd: reduce interval from 60min to 15min to catch
disk issues faster (yesterday we hit 100% before hourly check caught it)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- check-disk-space.sh now pushes mac_disk_used_percent + mac_colima_disk_used_gb
to Pushgateway every hour so vmalert can alert on real macOS disk usage
- alerts.yml: replace broken node-exporter disk alerts with Pushgateway-based ones
- master-overview.json: add "Recent Errors (Loki)" section with live error log
stream, error rate timeseries and top error sources barchart
- move-colima-to-external-ssd.sh: guided script to move 200GB Colima VM
datadisk from internal SSD to /Volumes/ManaData (3.6TB external SSD)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
colima delete wipes the entire VM disk on every power cycle, forcing
full image rebuilds. colima stop --force is sufficient to clear stale
process state after a hard shutdown.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The startup script runs `colima delete` on hard shutdown recovery,
wiping the colima.yaml mount config. Then `colima start` only added
/Volumes/ManaData but forgot /Users/mana — causing all file bind-mounts
to appear as empty directories (VirtioFS can't see host files).
This was the root cause of Synapse/SearXNG/Alertmanager/Loki crashing
after the power outage. Now both mounts are always passed explicitly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Loki was already running but had no log shipper. Adds Promtail to collect
Docker logs from all 66 containers with automatic tier labeling (infra,
auth, core, app, matrix, games) and a Grafana Logs Explorer dashboard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Kill Docker Desktop if it auto-started
- Clean stale Colima state from hard shutdown (delete --force)
- Start Colima with VZ, 12GB RAM, VirtioFS
- Restore named volumes from backup if missing
- Start containers with --no-build to skip broken Dockerfiles
- Create missing databases
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mac Mini has docker at /usr/local/bin/docker, not in PATH.
Use same DOCKER_CMD pattern as build-app.sh.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
build-app.sh now checks available RAM before builds and only stops
monitoring containers when free memory is below 3 GB threshold.
New memory-baseline.sh script measures per-container and per-category
RAM usage for capacity planning.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove set -e to prevent abort on non-critical errors
- Suppress tar errors for volatile TSDB files (VictoriaMetrics)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Deactivate Ollama, FLUX.2, and Telegram Bot LaunchAgents on Mac Mini
- Remove extra_hosts from mana-llm (no longer needs host.docker.internal)
- Update health-check.sh to monitor GPU server services instead of local
- Update status.sh to show GPU server status instead of native services
- Rewrite MAC_MINI_SERVER.md: remove ~400 lines of Ollama/FLUX/Bot docs,
add GPU server architecture diagram and deactivation notes
- Update CAPACITY_PLANNING.md with post-offload numbers (~80-150 peak users)
Mac Mini is now a pure hosting server (Web, API, DB, Sync).
All AI workloads run on GPU server (RTX 3090) via LAN.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Forgejo v11 on port 3041 (git.mana.how via Cloudflare Tunnel)
- Forgejo Runner for CI/CD (GitHub Actions compatible)
- Built-in Docker registry and LFS support
- Registration disabled (admin-only)
- SSH access on port 2222
- Go Services CI workflow (.forgejo/workflows/go-services.yml)
- Setup script: scripts/mac-mini/setup-forgejo.sh
Replaces GitHub dependency for CI/CD. GitHub can remain as
mirror/backup while Forgejo becomes the primary Git host.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Photos NestJS backend was a proxy to mana-media that enriched
responses with local album/favorite/tag data. Now:
- Albums store → local-first via albumCollection + albumItemCollection
- Favorites → local-first via favoriteCollection (toggle in IndexedDB)
- Photo tags → local-first via photoTagCollection
- Photo listing/stats → direct mana-media API calls from frontend
- Upload → direct mana-media upload from frontend
- Delete → direct mana-media delete from frontend
Removed 27 TypeScript files, 1 Docker container, 1 port (3039).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Presi NestJS backend (40 source files, 50 deps) was a CRUD wrapper
around decks, slides, and themes — all now handled by local-first sync.
Only the share-link feature requires server-side state (public URLs
without auth), so a minimal Hono + Bun server replaces the entire
NestJS backend:
- apps/presi/apps/server/ — Hono server with share routes + GDPR admin
Uses @manacore/shared-hono for auth (JWKS), health, admin, errors
- Web app API client stripped to share-only (was 270 lines → 90 lines)
- Removed from docker-compose, CI/CD, Prometheus, env generation
- NestJS backend deleted (40 TS files, 8 test specs, 3038 lines)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both apps are fully local-first via Dexie.js + mana-sync. Their NestJS
backends were pure CRUD wrappers (20 + 31 source files) that are no
longer needed.
Changes:
- Add packages/shared-hono: JWT auth via JWKS (jose), Drizzle DB factory,
health route, generic GDPR admin handler, error middleware
- Migrate zitare lists page from fetch() to listsStore (local-first)
- Rewrite clock timers store from API-based to timerCollection (Dexie)
- Update clock +layout.svelte CommandBar search to use local collections
- Remove zitare-backend + clock-backend from docker-compose, CI/CD,
Prometheus, env generation, setup scripts
- Add docs/TECHNOLOGY_AUDIT_2026_03.md with full repo analysis
Net result: -2 Docker containers, -2 ports, -2728 lines of code
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace 21 separate NestJS Matrix bot processes (~2.1 GB RAM, ~4.2 GB Docker images)
with a single Go binary using plugin architecture (8.6 MB binary, ~30 MB RAM).
New services:
- services/mana-matrix-bot/ — Go Matrix bot with 21 plugins (mautrix-go, Redis sessions)
- services/mana-api-gateway-go/ — Go API gateway (rate limiting, API keys, credit billing)
Deleted:
- 21 services/matrix-*-bot/ directories
- packages/bot-services/ and packages/matrix-bot-common/
- Legacy deploy scripts and CI build jobs
Updated:
- docker-compose.macmini.yml: new Go services, legacy bots removed
- CI/CD: change detection + build jobs for Go services
- Root package.json: new dev:matrix, build:matrix, test:matrix scripts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docker compose stop with service names can hang due to env var warnings.
Using docker stop/start with container names is more reliable.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add docker/Dockerfile.sveltekit-base: pre-built base with all 34 shared
packages (mirrors nestjs-base pattern), eliminates redundant COPY/build
steps from individual web Dockerfiles
- Add scripts/mac-mini/build-app.sh: stops monitoring stack before build
to free RAM, auto-restarts on exit (trap cleanup)
- Migrate todo web Dockerfile to use sveltekit-base:local (47 COPY lines
→ 2, 4 build steps → 0)
- Update CD workflow to build sveltekit-base when deploying web apps
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Full-stack achievement system for SkillTree with backend (NestJS) and frontend (SvelteKit):
- 26 achievements across 7 categories (XP, Skills, Levels, Activities, Streak, Branches, Special)
- 5 rarity tiers (Common → Legendary) with distinct styling
- Auto-unlock after XP gain, skill creation, and activity logging
- Celebration animation on unlock with sparkle effects
- Achievements page with category filters and progress tracking
- IndexedDB offline support with local condition evaluation
- Backend seeds achievements on startup, checks conditions after mutations
- Stats overview extended with achievement counter
- i18n translations (DE + EN)
Also adds docs/MONETIZATION_REPORT.md with ranked analysis of all apps.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two infrastructure improvements for tech independence:
1. Cloudflare Fallback Documentation (docs/CLOUDFLARE_FALLBACK.md):
- Plan B: WireGuard + Caddy on Hetzner VPS (€3.79/mo)
- Complete Caddyfile with all 30+ subdomains
- Step-by-step failover checklist (~15 min to switch)
- Plan C: Direct IP with ISP
2. Self-Hosted Landing Pages (eliminates Cloudflare Pages dependency):
- Nginx container (mana-infra-landings) on port 4400
- Multi-site config: each subdomain → separate dist/ folder
- Build script: scripts/mac-mini/build-landings.sh
- Cloudflare Tunnel ingress rules for 10 landing page domains
- Storage: /Volumes/ManaData/landings/ on external SSD
- Domains: it, chats, pics, zitares, presis, clocks,
manadeck, nutriphi, citycorners, docs
Migration path: Build landings locally, set Cloudflare DNS to
tunnel instead of Pages, then decommission CF Pages projects.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add GlitchTip to health-check.sh monitoring endpoints
- Add native disk space checks for / and /Volumes/ManaData with 80%/90% thresholds
- Extend Prometheus disk alerts to include /host_mnt/Volumes/ManaData mountpoint
- Add ManaData disk usage gauge to Grafana system-overview dashboard
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instrument the CD pipeline to record per-deploy and per-service metrics
(build time, image size, startup time, health status) into PostgreSQL and
push gauges to Pushgateway. Adds a Grafana dashboard with 13 panels covering
deploy frequency, build performance, service health, and history.
New files:
- scripts/mac-mini/init-deploy-tracking.sql (idempotent DDL)
- scripts/deploy-metrics.sh (bash library for CI)
- docker/grafana/provisioning/datasources/deploy-tracking.yml
- docker/grafana/dashboards/deploy-tracking.json
Modified:
- docker/prometheus/prometheus.yml (pushgateway scrape job)
- .github/workflows/cd-macmini.yml (build/health instrumentation)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Medium priority stability improvements:
Alerting:
- Add vmalert for evaluating Prometheus alert rules
- Add alertmanager for alert routing and grouping
- Add alert-notifier service for Telegram/ntfy notifications
- Enable cadvisor scraping in prometheus config
Disk Monitoring:
- Add check-disk-space.sh for hourly disk monitoring
- Alert on 80% (warning) and 90% (critical) thresholds
- Auto-cleanup Docker when disk is critical
- Add com.manacore.disk-check.plist for LaunchD
Weekly Reports:
- Add weekly-report.sh for system health summary
- Includes: backup status, disk usage, container health,
database stats, error log summary
- Runs every Sunday at 10 AM via LaunchD
Health Check Updates:
- Add checks for vmalert, alertmanager, alert-notifier
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
High priority stability features:
- Add all LaunchD plists to Git for version control
- Handle crash-looping containers (Restarting status) in ensure-containers.sh
- Add database backup script with daily/weekly rotation
- Add Docker log rotation setup (50MB max, 3 files per container)
New files:
- scripts/mac-mini/backup-databases.sh - Daily pg_dump with rotation
- scripts/mac-mini/setup-docker-logging.sh - Configure daemon.json
- scripts/mac-mini/launchd/*.plist - All 8 LaunchD service configs
- scripts/mac-mini/launchd/README.md - Documentation
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Disable api-gateway and skilltree-web (no working images/Dockerfiles)
- Fix mana-search Dockerfile healthcheck port and endpoint
- Update health-check.sh to skip disabled services
- Fix search service health endpoint (/api/v1/health)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ensure-containers-running.sh to detect and auto-start stuck containers
- Add LaunchD plist for automatic container health checks every 5 minutes
- Update health-check.sh with correct ports (3031/5011 for todo, etc.)
- Update deploy.sh health checks to match docker-compose.macmini.yml
- Fix container name references (mana-infra-postgres instead of manacore-postgres)
This prevents 502 errors when containers get stuck in "Created" status.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move queue name constants to separate file (queue-names.ts) to avoid
circular dependency between queue.module.ts and processor files.
The @Processor decorator evaluates at module load time, and importing
constants from queue.module.ts created a circular dependency that
resulted in undefined queue names.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- NestJS bot that converts text messages to speech via mana-tts
- Commands: !voice, !voices, !speed, !status, !help
- User settings stored in-memory (voice, speed per user)
- Docker config for Mac Mini deployment
- Setup script for bot registration
Co-Authored-By: Claude <noreply@anthropic.com>
Add internationalization (DE + EN) to previously missing apps:
- todo: task management translations
- skilltree: skill/XP system translations
- nutriphi: nutrition tracking translations
- planta: plant care translations
- questions: research app translations
- matrix: chat client translations (layout integration)
Each app includes:
- svelte-i18n setup with SSR support
- localStorage persistence ({app}_locale pattern)
- i18n loading state in +layout.svelte
- German (default) and English translations
Updated CONSISTENCY_REPORT.md to mark i18n task as complete.
Also includes:
- mana-tts service placeholder files
- Fix AiHandler to use correct service methods:
- setSessionModel instead of setModel
- clearSessionHistory instead of clearHistory
- compareModels for model comparison
- Fix TodoHandler to use index-based methods:
- completeTaskByIndex instead of completeTask
- deleteTaskByIndex instead of deleteTask
- Add deploy-mana-bot.sh script for full deployment automation
https://claude.ai/code/session_015bwcqVRiFmSydYTjvDJGTc
- Add matrix-mana-bot to docker-compose.macmini.yml
- Add setup-mana-bot.sh script for bot registration
- Add dev:matrix:* scripts to root package.json
- Add devlog entry documenting the new architecture
The gateway bot is now ready for deployment alongside
the existing standalone Matrix bots.
https://claude.ai/code/session_015bwcqVRiFmSydYTjvDJGTc