managarten/scripts/mac-mini
Till JS 85e38176d8 chore(macmini/scripts): runbook hardening — status diff + ingress walk
Two failures during the 2026-04-07 production outage triage were caused
not by the underlying outage but by `status.sh` and `health-check.sh`
hiding the broken state. Both scripts hardened so the same outage
shape can't reoccur invisibly.

status.sh — compose-vs-running diff
  The old script printed "X containers running / Y total" without
  noticing that some compose-defined containers were never started in
  the first place. The Mac Mini was running 37 of 42 declared
  containers and the script reported "37 running" with no indication
  of the gap — `mana-core-sync` and `mana-api-gateway` were silently
  missing for hours.

  New behaviour: read every service from `docker compose config`,
  diff its `container_name` against `docker ps`, and report each
  declared service whose container is not currently up. The same
  outage state would have been flagged on the very first run.

health-check.sh — public-hostname walk via Cloudflare DNS
  The old script probed ~50 hardcoded `localhost:<port>/health`
  endpoints across Chat, Todo, Calendar, etc. — but the per-app
  HTTP backends those endpoints expected don't exist anymore (the
  ghost-API cleanup removed them entirely). Every probe returned
  HTTP 000 / connection refused, generating a wall of false-positive
  alerts that drowned out the real signal.

  The block was replaced with a dynamic walk of every `hostname:`
  entry in `~/.cloudflared/config.yml`. Each hostname is probed via
  the public Cloudflare tunnel, so DNS gaps, missing tunnel routes,
  502/530 origin failures and timeouts surface as failures the same
  way real users would experience them. On its first run after the
  cleanup it surfaced eighteen previously-invisible hostname failures
  (no DNS, 502, or 530) — every one of them a real production issue.

  DNS resolution intentionally goes through `dig +short HOST @1.1.1.1`
  instead of the local resolver. The Mac Mini's home-router DNS keeps
  a negative cache for hours after the first failed lookup, so newly
  added CNAMEs (like the post-outage sync/media records) appeared as
  "no response" from inside the script for hours even though external
  users saw them resolve immediately. Asking Cloudflare's DNS directly
  gives the script the same view the public internet has.

  The Matrix, Element, GPU-LAN-redundant and monitoring port-by-port
  blocks were removed — the public-hostname walk covers all of them
  via their `*.mana.how` hostnames going through the actual tunnel.

  The "stuck container" detector now ignores `*-init` containers
  (one-shot init pods, Exit 0 = success, intentionally never re-run).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:31:53 +02:00
..
launchd chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
backup-databases.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
build-app.sh chore: remove all NestJS backend references, replace with Hono/Bun 2026-03-31 16:52:25 +02:00
build-landings.sh refactor: rename ManaDeck to Cards across entire monorepo 2026-04-01 11:45:21 +02:00
check-disk-space.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
configure-ollama.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
deploy-v2.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
deploy.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
ensure-containers-running.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
health-check.sh chore(macmini/scripts): runbook hardening — status diff + ingress walk 2026-04-07 22:31:53 +02:00
init-deploy-tracking.sql feat(infra): add deploy tracking with PostgreSQL, Pushgateway & Grafana dashboard 2026-03-20 17:08:03 +01:00
memory-baseline.sh fix(infra): use DOCKER_CMD variable in memory-baseline.sh 2026-03-29 15:12:44 +02:00
migrate-to-colima.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
move-colima-to-external-ssd.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
notifications.env.example chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
push-schemas.sh feat(infra+ui): deploy script v2, schema push, SyncIndicator component 2026-03-28 18:02:06 +01:00
README.md chore: complete ManaCore → Mana rename (docs, go modules, plists, images) 2026-04-07 12:26:10 +02:00
restart.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
setup-autostart.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
setup-cloudflared-service.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
setup-docker-logging.sh feat(mac-mini): add stability improvements 2026-02-12 13:33:44 +01:00
setup-forgejo.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
setup-image-gen.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
setup-matrix.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
setup-notifications.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
setup-ssh-client.sh feat: add SSH access via Cloudflare Tunnel 2026-01-22 19:27:39 +01:00
setup-stt.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
setup-tts-bot.sh feat(matrix): add TTS bot for text-to-speech conversion 2026-01-29 16:03:26 +01:00
setup-tts.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
setup-umami-db.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
startup.sh fix(mac-mini): make startup.sh idempotent and non-destructive 2026-04-07 13:19:46 +02:00
status.sh chore(macmini/scripts): runbook hardening — status diff + ingress walk 2026-04-07 22:31:53 +02:00
stop.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00
tune-tcp.sh feat(skilltree): add achievement system with 26 achievements + monetization report 2026-03-24 12:17:43 +01:00
weekly-report.sh feat: rename ManaCore to Mana across entire codebase 2026-04-05 20:00:13 +02:00

Mac Mini Server Scripts

Scripts for managing the Mana production environment on Mac Mini.

Quick Start (After System Update)

# 1. SSH into Mac Mini (from your local machine)
ssh mac-mini

# 2. Navigate to project
cd ~/projects/mana-monorepo

# 3. Setup auto-start (only needed once)
./scripts/mac-mini/setup-autostart.sh

# 4. Check status
./scripts/mac-mini/status.sh

Scripts Overview

Script Purpose
setup-autostart.sh Configure automatic startup on boot (run once)
setup-stt.sh Setup STT service (Whisper + Voxtral)
startup.sh Main startup script (called by launchd)
health-check.sh Check all services health
status.sh Show full system status
restart.sh Restart all Docker containers
stop.sh Stop all Docker containers
deploy.sh Pull latest images and deploy

First-Time Setup

1. Prerequisites on Mac Mini

# Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install required tools
brew install cloudflared git docker

# Install Docker Desktop
# Download from: https://www.docker.com/products/docker-desktop/

2. Clone Repository

mkdir -p ~/projects
cd ~/projects
git clone https://github.com/Memo-2023/mana-monorepo.git
cd mana-monorepo

3. Configure Cloudflare Tunnel

# Login to Cloudflare
cloudflared tunnel login

# The tunnel is already created (ID: bb0ea86d-8253-4a54-838b-107bb7945be9)
# Credentials should be at: ~/.cloudflared/bb0ea86d-8253-4a54-838b-107bb7945be9.json

4. Configure Environment

# Copy and edit the environment file
cp .env.macmini.example .env.macmini
nano .env.macmini

5. Enable Auto-Start

# This sets up all launchd services
./scripts/mac-mini/setup-autostart.sh

6. Configure Docker Desktop

Open Docker Desktop and enable:

  • Settings > General > Start Docker Desktop when you sign in

Daily Operations

Check Status

./scripts/mac-mini/status.sh

Run Health Check

./scripts/mac-mini/health-check.sh

Restart Services

# Normal restart
./scripts/mac-mini/restart.sh

# Pull latest images and restart
./scripts/mac-mini/restart.sh --pull

# Force recreate containers
./scripts/mac-mini/restart.sh --force

View Logs

# Startup log
tail -f /tmp/mana-startup.log

# Health check log
tail -f /tmp/mana-health.log

# Cloudflare tunnel log
tail -f /tmp/cloudflared.log

# Specific container logs
docker logs -f mana-auth
docker logs -f chat-backend

Stop Services

./scripts/mac-mini/stop.sh

LaunchD Services

Three services are configured to run automatically:

Service Label Purpose
Cloudflared com.cloudflare.cloudflared Tunnel to Cloudflare
Docker Startup com.mana.docker-startup Start containers on boot
Health Check com.mana.health-check Check every 5 minutes
STT Service com.mana.stt Speech-to-Text (Whisper + Voxtral)

Manual Service Control

# Check status
launchctl list | grep -E 'cloudflare|mana'

# Restart a service
launchctl kickstart -k gui/$(id -u)/com.mana.docker-startup

# Stop a service
launchctl unload ~/Library/LaunchAgents/com.mana.docker-startup.plist

# Start a service
launchctl load ~/Library/LaunchAgents/com.mana.docker-startup.plist

Troubleshooting

Docker not starting

# Check if Docker Desktop is running
docker info

# Start Docker Desktop manually
open -a Docker

Cloudflare tunnel not connecting

# Check cloudflared status
pgrep -x cloudflared

# View tunnel logs
tail -50 /tmp/cloudflared.log

# Restart tunnel
launchctl kickstart -k gui/$(id -u)/com.cloudflare.cloudflared

Container health check failing

# Check specific container
docker logs <container-name>

# Restart specific container
docker restart <container-name>

# Check database connectivity
docker exec mana-postgres pg_isready -U postgres

Services not starting on boot

# Re-run setup
./scripts/mac-mini/setup-autostart.sh

# Check launchd errors
launchctl error <exit-code>

# Verify plist files
plutil ~/Library/LaunchAgents/com.mana.*.plist

Push Notifications (Optional)

To receive notifications when health checks fail:

  1. Create a topic at ntfy.sh
  2. Add to your shell profile:
    export NTFY_TOPIC=your-topic-name
    
  3. Subscribe on your phone using the ntfy app

URLs

Once running, services are available at:

Service URL
Unified App https://mana.how
Auth API https://auth.mana.how
API Gateway https://api.mana.how
Forgejo (Git) https://git.mana.how
Grafana https://grafana.mana.how
Status Page https://status.mana.how
GlitchTip https://glitchtip.mana.how
Umami https://stats.mana.how
SSH ssh mac-mini (via cloudflared)

Native Services (non-Docker)

Ollama (LLM)

Ollama runs natively on Mac Mini for LLM inference:

# Check status
curl http://localhost:11434/api/tags

# List models
ollama list

# Pull a model
ollama pull gemma3:4b

STT Service (Speech-to-Text)

The STT service provides Whisper and Voxtral transcription:

# Setup (first time)
./scripts/mac-mini/setup-stt.sh

# Check status
curl http://localhost:3020/health

# Transcribe audio
curl -X POST http://localhost:3020/transcribe \
  -F "file=@audio.mp3" \
  -F "language=de"

# View logs
tail -f /tmp/mana-stt.log

Available endpoints:

  • POST /transcribe - Whisper transcription (recommended)
  • POST /transcribe/voxtral - Voxtral transcription
  • POST /transcribe/auto - Auto-select model
  • GET /health - Health check
  • GET /models - List available models