managarten/scripts/mac-mini/launchd/README.md
Till-JS acc8de36ee feat(monitoring): add alerting stack and maintenance scripts
Medium priority stability improvements:

Alerting:
- Add vmalert for evaluating Prometheus alert rules
- Add alertmanager for alert routing and grouping
- Add alert-notifier service for Telegram/ntfy notifications
- Enable cadvisor scraping in prometheus config

Disk Monitoring:
- Add check-disk-space.sh for hourly disk monitoring
- Alert on 80% (warning) and 90% (critical) thresholds
- Auto-cleanup Docker when disk is critical
- Add com.manacore.disk-check.plist for LaunchD

Weekly Reports:
- Add weekly-report.sh for system health summary
- Includes: backup status, disk usage, container health,
  database stats, error log summary
- Runs every Sunday at 10 AM via LaunchD

Health Check Updates:
- Add checks for vmalert, alertmanager, alert-notifier

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 13:46:57 +01:00


# LaunchD Services for Mac Mini
These plist files configure automatic services on the Mac Mini server.
## Installation
```bash
# Copy all plists to LaunchAgents
cp *.plist ~/Library/LaunchAgents/
# Load all services
for f in *.plist; do launchctl load ~/Library/LaunchAgents/"$f"; done
```
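A slightly more defensive variant of the steps above, as a sketch: it validates each plist with `plutil -lint` before copying and loading it. The function name `install_plists` and its two arguments are illustrative and not part of this repository.

```bash
# Hypothetical helper (not part of this repo): validate, copy, and load every plist.
# Usage: install_plists [source_dir] [target_dir]
install_plists() {
  local src="${1:-.}"
  local dest="${2:-$HOME/Library/LaunchAgents}"
  local f name
  mkdir -p "$dest"
  for f in "$src"/*.plist; do
    [ -e "$f" ] || continue              # no plists found: the glob stayed literal
    name="$(basename "$f")"
    if ! plutil -lint "$f" >/dev/null; then
      echo "skipping invalid plist: $name" >&2
      continue
    fi
    cp "$f" "$dest/$name"
    launchctl load "$dest/$name"
  done
}
```

Calling `install_plists` with no arguments from this directory mirrors the two commands above, except that a malformed plist is skipped instead of being loaded.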
## Services
| Service | Description | Interval |
|---------|-------------|----------|
| `docker-startup` | Starts Docker containers when the user session starts | At login |
| `ensure-containers` | Detects and restarts stuck/crash-looping containers | Every 5 min |
| `health-check` | Checks all services and sends alerts | Every 5 min |
| `backup-databases` | PostgreSQL backup with daily/weekly rotation | Daily 3 AM |
| `disk-check` | Monitors disk space, alerts on thresholds | Hourly |
| `weekly-report` | Generates system health summary | Sunday 10 AM |
| `ssd-check` | Monitors SSD health | Periodic |
| `mana-stt` | Speech-to-text service (Whisper) | At login |
| `mana-tts` | Text-to-speech service (Kokoro) | At login |
| `image-gen` | Image generation service | At login |
| `telegram-ollama-bot` | Telegram bot with Ollama | At login |
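
The plist files themselves are not reproduced in this README. As a rough sketch of how an interval such as "Hourly" can map onto a launchd job, the function below writes a minimal plist using the standard `StartInterval` key (seconds between runs). The function name, the `example` service, the script path, and the stdout log path are assumptions for illustration; the real files in this directory may use different keys (for example `StartCalendarInterval` for fixed times such as "Daily 3 AM").

```bash
# Hypothetical generator (not part of this repo): write a minimal interval-based plist.
# Usage: write_interval_plist <service> <script_path> <seconds> <output_file>
write_interval_plist() {
  local service="$1" script="$2" seconds="$3" out="$4"
  # Log paths are assumed to follow the /tmp/manacore-<service>*.log pattern.
  cat > "$out" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.manacore.${service}</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <string>${script}</string>
  </array>
  <key>StartInterval</key>
  <integer>${seconds}</integer>
  <key>StandardOutPath</key>
  <string>/tmp/manacore-${service}.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/manacore-${service}.error.log</string>
</dict>
</plist>
EOF
}
```

For an hourly job, `write_interval_plist example /path/to/example.sh 3600 com.manacore.example.plist` would produce a file with `StartInterval` set to 3600.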
## Management Commands
```bash
# Check status
launchctl list | grep manacore
# View logs
tail -f /tmp/manacore-*.log
# Reload a service
launchctl unload ~/Library/LaunchAgents/com.manacore.health-check.plist
launchctl load ~/Library/LaunchAgents/com.manacore.health-check.plist
# Stop a service
launchctl unload ~/Library/LaunchAgents/com.manacore.<service>.plist
```
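The unload/load pair above can be wrapped in a small shell function so a service is reloaded by its short name. This is only a convenience sketch; `reload_service` is a hypothetical name and does not exist in this repository.

```bash
# Hypothetical helper (not part of this repo): reload one service by its short name.
# Usage: reload_service health-check
reload_service() {
  local plist="$HOME/Library/LaunchAgents/com.manacore.$1.plist"
  if [ ! -f "$plist" ]; then
    echo "no such plist: $plist" >&2
    return 1
  fi
  launchctl unload "$plist"
  launchctl load "$plist"
}
```

Checking that the file exists first avoids unloading a service whose plist has been renamed or removed.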
## Troubleshooting
`launchctl list` prints three columns: PID, last exit status, and label. Reading them:
- Status `0` = Last run exited successfully
- Status `1` = Last run had errors (check logs)
- PID `-` = Not currently running / waiting for next interval
- Status `78` = Configuration error

Check error logs:
```bash
cat /tmp/manacore-<service>.error.log
```
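To spot failing jobs quickly, the status column can be filtered with `awk`. The sketch below assumes the three-column `launchctl list` layout (PID, status, label) and that all labels of interest contain `manacore`; `failed_services` is a hypothetical name, not a script in this repository.

```bash
# Hypothetical helper (not part of this repo): read `launchctl list` output on stdin
# and print manacore jobs whose last exit status is non-zero.
failed_services() {
  awk '$3 ~ /manacore/ && $2 != "0" { print $3 " (status " $2 ")" }'
}
```

Run it as `launchctl list | failed_services`, then check the matching `/tmp/manacore-<service>.error.log` for each label it prints.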