- Deactivate Ollama, FLUX.2, and Telegram Bot LaunchAgents on Mac Mini
- Remove extra_hosts from mana-llm (no longer needs host.docker.internal)
- Update health-check.sh to monitor GPU server services instead of local
- Update status.sh to show GPU server status instead of native services
- Rewrite MAC_MINI_SERVER.md: remove ~400 lines of Ollama/FLUX/Bot docs,
add GPU server architecture diagram and deactivation notes
- Update CAPACITY_PLANNING.md with post-offload numbers (~80-150 peak users)
Mac Mini is now a pure hosting server (Web, API, DB, Sync).
All AI workloads run on GPU server (RTX 3090) via LAN.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add GlitchTip to health-check.sh monitoring endpoints
- Add native disk space checks for / and /Volumes/ManaData with 80%/90% thresholds
- Extend Prometheus disk alerts to include /host_mnt/Volumes/ManaData mountpoint
- Add ManaData disk usage gauge to Grafana system-overview dashboard
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Medium priority stability improvements:
Alerting:
- Add vmalert for evaluating Prometheus alert rules
- Add alertmanager for alert routing and grouping
- Add alert-notifier service for Telegram/ntfy notifications
- Enable cadvisor scraping in prometheus config
Disk Monitoring:
- Add check-disk-space.sh for hourly disk monitoring
- Alert on 80% (warning) and 90% (critical) thresholds
- Auto-cleanup Docker when disk is critical
- Add com.manacore.disk-check.plist for LaunchD
Weekly Reports:
- Add weekly-report.sh for system health summary
- Includes: backup status, disk usage, container health,
database stats, error log summary
- Runs every Sunday at 10 AM via LaunchD
Health Check Updates:
- Add checks for vmalert, alertmanager, alert-notifier
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Disable api-gateway and skilltree-web (no working images/Dockerfiles)
- Fix mana-search Dockerfile healthcheck port and endpoint
- Update health-check.sh to skip disabled services
- Fix search service health endpoint (/api/v1/health)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ensure-containers-running.sh to detect and auto-start stuck containers
- Add LaunchD plist for automatic container health checks every 5 minutes
- Update health-check.sh with correct ports (3031/5011 for todo, etc.)
- Update deploy.sh health checks to match docker-compose.macmini.yml
- Fix container name references (mana-infra-postgres instead of manacore-postgres)
This prevents 502 errors when containers get stuck in "Created" status.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The backends exclude /health from the api/v1 prefix, so the correct
endpoint is /health not /api/v1/health. Also added missing services
(Contacts, Storage, Presi) and use /health for web apps.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Web apps: check root URL (/) instead of /health (SvelteKit has no health endpoint)
- Todo backend: fix path to /api/v1/health
- Remove redundant PostgreSQL HTTP check (checked via docker exec)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
SSH sessions don't inherit the full PATH, so docker command
wasn't found. Now all scripts explicitly add /usr/local/bin
and /opt/homebrew/bin to PATH.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- setup-autostart.sh: Configure launchd services for boot
- startup.sh: Main startup script (waits for Docker, starts containers)
- health-check.sh: Check all services (runs every 5 min)
- status.sh: Full system status overview
- restart.sh: Restart containers (with --pull and --force options)
- stop.sh: Stop all containers gracefully
- README.md: Complete documentation
Includes optional ntfy.sh push notifications for health check failures.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>