Commit graph

14 commits

Author SHA1 Message Date
Till JS
b45ddbbb83 refactor: remove local AI services from Mac Mini, GPU-only architecture
- Deactivate Ollama, FLUX.2, and Telegram Bot LaunchAgents on Mac Mini
- Remove extra_hosts from mana-llm (no longer needs host.docker.internal)
- Update health-check.sh to monitor GPU server services instead of local
- Update status.sh to show GPU server status instead of native services
- Rewrite MAC_MINI_SERVER.md: remove ~400 lines of Ollama/FLUX/Bot docs,
  add GPU server architecture diagram and deactivation notes
- Update CAPACITY_PLANNING.md with post-offload numbers (~80-150 peak users)

Mac Mini is now a pure hosting server (Web, API, DB, Sync).
All AI workloads run on GPU server (RTX 3090) via LAN.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:23:37 +01:00
Till JS
c8de944c8d feat(monitoring): add GlitchTip health check and disk space monitoring
- Add GlitchTip to health-check.sh monitoring endpoints
- Add native disk space checks for / and /Volumes/ManaData with 80%/90% thresholds
- Extend Prometheus disk alerts to include /host_mnt/Volumes/ManaData mountpoint
- Add ManaData disk usage gauge to Grafana system-overview dashboard

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 09:33:09 +01:00
Till-JS
acc8de36ee feat(monitoring): add alerting stack and maintenance scripts
Medium priority stability improvements:

Alerting:
- Add vmalert for evaluating Prometheus alert rules
- Add alertmanager for alert routing and grouping
- Add alert-notifier service for Telegram/ntfy notifications
- Enable cadvisor scraping in prometheus config

Disk Monitoring:
- Add check-disk-space.sh for hourly disk monitoring
- Alert on 80% (warning) and 90% (critical) thresholds
- Auto-cleanup Docker when disk is critical
- Add com.manacore.disk-check.plist for LaunchD

Weekly Reports:
- Add weekly-report.sh for system health summary
- Includes: backup status, disk usage, container health,
  database stats, error log summary
- Runs every Sunday at 10 AM via LaunchD

Health Check Updates:
- Add checks for vmalert, alertmanager, alert-notifier

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 13:46:57 +01:00
Till-JS
d5e18c9c27 🔧 fix(mac-mini): update health checks and disable missing services
- Disable api-gateway and skilltree-web (no working images/Dockerfiles)
- Fix mana-search Dockerfile healthcheck port and endpoint
- Update health-check.sh to skip disabled services
- Fix search service health endpoint (/api/v1/health)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 13:28:55 +01:00
Till-JS
2fe7f842c6 🔧 fix(mac-mini): add container recovery and update health check ports
- Add ensure-containers-running.sh to detect and auto-start stuck containers
- Add LaunchD plist for automatic container health checks every 5 minutes
- Update health-check.sh with correct ports (3031/5011 for todo, etc.)
- Update deploy.sh health checks to match docker-compose.macmini.yml
- Fix container name references (mana-infra-postgres instead of manacore-postgres)

This prevents 502 errors when containers get stuck in "Created" status.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 12:51:49 +01:00
Claude
7c5e9e3c49
feat(matrix): add Stats Bot and Project Doc Bot services
Complete GDPR-compliant bot suite for Matrix:

matrix-stats-bot (port 3312):
- Analytics reports from Umami
- Commands: !stats, !today, !week, !realtime, !users
- Scheduled daily/weekly reports to Matrix room

matrix-project-doc-bot (port 3313):
- Project documentation with photos, voice, text
- Voice transcription via OpenAI Whisper
- Blog generation with 5 styles (casual, technical, tutorial, social, story)
- Commands: !new, !projects, !switch, !status, !generate, !export
- Uses PostgreSQL + S3 (MinIO) for storage

Changes:
- docker-compose.macmini.yml: Added both Matrix bots
- health-check.sh: Added health checks for both bots

Environment variables required:
- MATRIX_STATS_BOT_TOKEN, MATRIX_PROJECT_DOC_BOT_TOKEN
- OPENAI_API_KEY (for Project Doc Bot)

https://claude.ai/code/session_01E3r5aFW3YLAhEJfsL2ryhv
2026-01-28 00:44:28 +00:00
Claude
aabe328b51
feat(matrix): add Matrix Ollama Bot service
GDPR-compliant replacement for telegram-ollama-bot using Matrix protocol:

New service: services/matrix-ollama-bot/
- NestJS application with matrix-bot-sdk
- Same functionality as telegram-ollama-bot
- Commands: !help, !models, !model, !mode, !clear, !status
- System prompts: default, classify, summarize, translate, code
- Chat history per user (last 10 messages)

Changes:
- docker-compose.macmini.yml: Added matrix-ollama-bot service
- health-check.sh: Added Matrix Ollama Bot health check

Environment variables required:
- MATRIX_OLLAMA_BOT_TOKEN: Bot access token
- MATRIX_OLLAMA_BOT_ROOMS: Optional room restrictions

https://claude.ai/code/session_01E3r5aFW3YLAhEJfsL2ryhv
2026-01-28 00:35:35 +00:00
Claude
3aa9e8608d
feat(matrix): add self-hosted Matrix infrastructure for GDPR compliance
Add complete Matrix/Synapse setup as Telegram bot alternative:

Docker configuration:
- Synapse homeserver (port 8008) with PostgreSQL backend
- Element Web client (port 8087) with ManaCore branding
- DSGVO-compliant data retention policies (1-365 days)
- Prometheus metrics endpoint for monitoring

Config files:
- docker/matrix/homeserver.yaml - Synapse configuration
- docker/matrix/log.config.yaml - Logging with rotation
- docker/matrix/element-config.json - Element Web settings

Scripts & docs:
- scripts/mac-mini/setup-matrix.sh - One-time initialization
- Updated health-check.sh with Matrix services
- Updated MAC_MINI_SERVER.md with Matrix documentation

https://claude.ai/code/session_01E3r5aFW3YLAhEJfsL2ryhv
2026-01-28 00:20:12 +00:00
Till-JS
7252498f32 fix(health-check): correct presi-backend health endpoint path
Use /api/v1/health instead of /api/health.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 15:05:04 +01:00
Till-JS
fe8cb3cebb 🐛 fix(health-check): correct health endpoint paths
The backends exclude /health from the api/v1 prefix, so the correct
endpoint is /health not /api/v1/health. Also added missing services
(Contacts, Storage, Presi) and use /health for web apps.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 03:39:16 +01:00
Till-JS
de6151ae27 feat(mac-mini): add notification system for health checks
- Update health-check.sh with Telegram, Email, and ntfy notification functions
- Add notifications.env.example template for configuration
- Add setup-notifications.sh interactive setup script

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 13:18:04 +01:00
Till-JS
c512592685 fix(mac-mini): correct health check endpoints
- Web apps: check root URL (/) instead of /health (SvelteKit has no health endpoint)
- Todo backend: fix path to /api/v1/health
- Remove redundant PostgreSQL HTTP check (checked via docker exec)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 12:21:40 +01:00
Till-JS
732aa79fab fix(mac-mini): add PATH export for Docker CLI in all scripts
SSH sessions don't inherit the full PATH, so docker command
wasn't found. Now all scripts explicitly add /usr/local/bin
and /opt/homebrew/bin to PATH.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 12:17:24 +01:00
Till-JS
93060dc335 feat(mac-mini): add auto-start and management scripts
- setup-autostart.sh: Configure launchd services for boot
- startup.sh: Main startup script (waits for Docker, starts containers)
- health-check.sh: Check all services (runs every 5 min)
- status.sh: Full system status overview
- restart.sh: Restart containers (with --pull and --force options)
- stop.sh: Stop all containers gracefully
- README.md: Complete documentation

Includes optional ntfy.sh push notifications for health check failures.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 11:48:24 +01:00