# Hetzner Production Deployment Guide **Version**: 1.0 **Last Updated**: 2025-12-01 **Scope**: Complete production deployment guide for Manacore monorepo on Hetzner Cloud --- ## Table of Contents 1. [Server Specifications](#1-server-specifications--instance-types) 2. [Network Architecture](#2-network-architecture) 3. [Storage & Backup Strategies](#3-storage--backup-strategies) 4. [Security Hardening](#4-security-hardening-checklist) 5. [Monitoring & Logging](#5-monitoring--logging-solutions) 6. [CI/CD Integration](#6-cicd-integration-patterns) 7. [Cost Optimization](#7-cost-optimization-tips) 8. [Orchestration Choice](#8-orchestration-choice-docker-swarm-vs-kubernetes) 9. [Production Setup Scripts](#9-production-ready-deployment-scripts) 10. [Production Checklist](#10-production-ready-checklist) --- ## 1. Server Specifications & Instance Types ### Recommended Server Types #### Entry-Level Production (Small Applications) **Hetzner CX23**: 2 vCPUs, 4 GB RAM, 40 GB storage, 20 TB traffic - **Price**: €3.49/month - **Use Case**: Single container apps, development/staging environments - **Suitable For**: Individual microservices, low-traffic applications #### Mid-Tier Production (Standard Applications) **Hetzner CPX21**: 3 shared vCPUs, 4 GB RAM, 80 GB storage - **Price**: ~€7/month - **Use Case**: Multi-container applications, small microservices - **Best For**: 2-3 backend services + web apps **Hetzner CX33**: 2 vCPUs, 8 GB RAM, 80 GB storage, 20 TB traffic - **Price**: €5.49/month - **Use Case**: Standard production workloads - **Best For**: Full stack with 5-6 services #### High-Performance Production **CCX Series**: Dedicated vCPUs for CPU-intensive workloads - **CCX42**: 16 vCPU, 64 GB RAM - €101/month - **Use Case**: High-traffic applications, full monorepo deployment - **Best For**: 10+ services with monitoring stack **CAX ARM Series**: 40% better cost efficiency - **CAX21**: 4 ARM vCPUs, 8 GB RAM - ~€8/month - **Use Case**: ARM-compatible Docker images - **Benefit**: Better performance-per-euro ### ARM vs x86 Considerations **ARM64 (CAX) Advantages**: - 40% cost savings - Better performance-per-euro - Modern Docker images support ARM64 **Compatibility Check**: - Node.js: ✅ Full ARM64 support - Python: ✅ Full ARM64 support - Go: ✅ Native ARM64 - PostgreSQL: ✅ Official ARM images - Redis: ✅ Official ARM images **Check Your Dependencies**: ```bash # Test ARM compatibility locally (M1/M2 Mac) docker buildx build --platform linux/arm64 . # Or on AMD64 with QEMU docker run --rm --privileged multiarch/qemu-user-static --reset -p yes docker buildx build --platform linux/arm64 . ``` ### Installation Method **Recommended**: Use **Docker CE App** from Hetzner Cloud Apps during server creation. **Benefits**: - Docker and docker-compose pre-installed - Optimized for Hetzner infrastructure - Eliminates manual installation errors **Alternative** (Manual Installation): ```bash curl -fsSL https://get.docker.com -o get-docker.sh sh get-docker.sh rm get-docker.sh ``` --- ## 2. Network Architecture ### Private Networks **Architecture Overview**: ``` ┌─────────────────┐ ┌─────────────────┐ │ Web Server │────▶│ App Server │ │ (Public IP) │ │ (Private only) │ │ - Traefik │ │ - Backends │ │ - Web Apps │ │ - Processing │ └─────────────────┘ └─────────────────┘ │ │ └───────────┬───────────┘ │ ┌──────▼──────┐ │ Database │ │ (Private) │ │ - PostgreSQL│ │ - Redis │ └─────────────┘ ``` ### Best Practices **1. Configure Private Networks BEFORE Docker Installation** ```bash # Create private network via Hetzner Console or CLI hcloud network create --name production-network --ip-range 10.0.0.0/16 # Create subnet hcloud network add-subnet production-network --network-zone eu-central --type server --ip-range 10.0.1.0/24 # Attach servers to network hcloud server attach-to-network --network production-network --ip 10.0.1.2 ``` **2. Docker Daemon Configuration for Private Networks** **MTU for Private Networks**: 1450 bytes (Hetzner requirement) ```json // /etc/docker/daemon.json { "mtu": 1450, "default-address-pools": [{ "base": "172.17.0.0/12", "size": 24 }], "live-restore": true, "userland-proxy": false, "no-new-privileges": true, "icc": false } ``` **Apply Configuration**: ```bash systemctl restart docker ``` **3. Network Isolation Strategy** - **Public Network**: Only expose necessary services (web apps, APIs) - **Private Network**: All inter-service communication (backends, databases) - **Hetzner Cloud Firewall**: Primary security layer - **UFW (Secondary)**: Host-level firewall ### Floating IPs (High Availability) **Use Cases**: - High availability setups - Zero-downtime deployments - Failover scenarios **Implementation with Docker Swarm**: ```bash # Create floating IP hcloud floating-ip create --type ipv4 --name production-lb --home-location nbg1 # Assign to server hcloud floating-ip assign # Docker service for IP management docker service create \ --name ip-floater \ --mode global \ --constraint 'node.role==manager' \ --mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \ -e HCLOUD_TOKEN=${HCLOUD_TOKEN} \ -e FLOATING_IP=${FLOATING_IP} \ costela/hetzner-ip-floater:latest ``` ### Load Balancers **Hetzner Cloud Load Balancer**: - **Protocol Support**: TCP, HTTP, HTTPS (HTTP/2 by default) - **Health Checks**: Active and passive monitoring - **Instant Configuration**: Changes apply immediately - **Proxy Protocol**: Preserve client IP addresses - **Pricing**: Starting at €5.39/month **Recommended Architecture**: ``` Internet → Hetzner LB → Private Network → Docker Containers ``` **Configuration Options**: 1. **Direct Binding**: App containers bind to private IPs ```yaml services: web: networks: - private ports: - '10.0.1.2:3000:3000' ``` 2. **Traefik Reverse Proxy**: LB routes to Traefik on Docker Swarm ```yaml services: traefik: ports: - '80:80' - '443:443' networks: - public - private ``` 3. **Kubernetes Ingress**: Automatic LB provisioning ```yaml apiVersion: v1 kind: Service metadata: annotations: load-balancer.hetzner.cloud/location: nbg1 spec: type: LoadBalancer ``` --- ## 3. Storage & Backup Strategies ### Block Storage Volumes **Characteristics**: - Attach to **single server only** (not shared) - ext4 or xfs filesystems (ext4 recommended) - Up to 10 TB per volume - Hot-attach/detach support - **€0.05/GB/month** pricing **Docker Volume Best Practices**: ```bash # 1. Create and format volume (first time) mkfs.ext4 -F /dev/disk/by-id/scsi-0HC_Volume_12345 # 2. Mount volume to dedicated path mkdir -p /mnt/volumes/data mount /dev/disk/by-id/scsi-0HC_Volume_12345 /mnt/volumes/data # 3. Add to /etc/fstab for persistence echo '/dev/disk/by-id/scsi-0HC_Volume_12345 /mnt/volumes/data ext4 discard,nofail,defaults 0 0' >> /etc/fstab # 4. Test auto-mount umount /mnt/volumes/data mount -a ``` **Docker Compose Usage**: ```yaml volumes: app-data: driver: local driver_opts: type: none o: bind device: /mnt/volumes/data ``` ### ⚠️ Critical: Hetzner Does NOT Provide Volume Backups **You MUST implement your own backup solution** ### Backup Strategy #### Option 1: Borg Backup with Storage Box (Recommended) **Why Borg?** - Deduplication (saves space) - Compression (lz4, zstd) - Encryption (AES-256) - Incremental backups - Fast recovery **Setup**: ```bash # 1. Install Borg apt install borgbackup # 2. Initialize repository on Storage Box borg init --encryption=repokey \ ssh://u123456@u123456.your-storagebox.de:23/./backups # Store passphrase securely echo "your-encryption-passphrase" > /root/.borg-passphrase chmod 600 /root/.borg-passphrase # 3. Create backup script cat > /usr/local/bin/docker-backup.sh <<'EOF' #!/bin/bash set -e BORG_REPO="ssh://u123456@u123456.your-storagebox.de:23/./backups" export BORG_PASSPHRASE=$(cat /root/.borg-passphrase) # Stop containers for consistency (optional) # docker-compose -f /app/docker-compose.yml stop # Create backup borg create --stats --compression lz4 \ $BORG_REPO::$(date +%Y%m%d-%H%M%S) \ /mnt/volumes/data \ /var/lib/docker/volumes # Prune old backups borg prune \ --keep-daily=7 \ --keep-weekly=4 \ --keep-monthly=6 \ $BORG_REPO # Restart containers # docker-compose -f /app/docker-compose.yml start echo "Backup completed successfully" EOF chmod +x /usr/local/bin/docker-backup.sh # 4. Schedule with cron (daily at 2 AM) echo "0 2 * * * /usr/local/bin/docker-backup.sh >> /var/log/backup.log 2>&1" | crontab - ``` **Restore**: ```bash # List backups borg list ssh://u123456@u123456.your-storagebox.de:23/./backups # Restore specific backup borg extract ssh://u123456@u123456.your-storagebox.de:23/./backups::20251201-020000 ``` #### Option 2: Restic (Alternative) ```bash # Install Restic apt install restic # Initialize repository restic -r sftp:u123456@u123456.your-storagebox.de:backups init # Create backup restic -r sftp:u123456@u123456.your-storagebox.de:backups \ backup /mnt/volumes/data # Restore restic -r sftp:u123456@u123456.your-storagebox.de:backups \ restore latest --target /mnt/volumes/data ``` #### Option 3: Database-Specific Backups **PostgreSQL**: ```bash #!/bin/bash # /usr/local/bin/postgres-backup.sh BACKUP_DIR="/backup/postgres" DATE=$(date +%Y%m%d-%H%M%S) mkdir -p $BACKUP_DIR # Dump all databases docker exec postgres pg_dumpall -U manacore | \ gzip > $BACKUP_DIR/all-databases-$DATE.sql.gz # Retain last 7 days find $BACKUP_DIR -name "*.sql.gz" -mtime +7 -delete echo "PostgreSQL backup completed: $DATE" ``` **Redis**: ```bash #!/bin/bash # Redis automatically creates dump.rdb and appendonly.aof # Just backup these files cp /var/lib/docker/volumes/redis-data/_data/dump.rdb \ /backup/redis/dump-$(date +%Y%m%d).rdb ``` **Schedule Both**: ```cron # /etc/cron.d/database-backups 0 3 * * * root /usr/local/bin/postgres-backup.sh >> /var/log/postgres-backup.log 2>&1 30 3 * * * root /usr/local/bin/redis-backup.sh >> /var/log/redis-backup.log 2>&1 ``` ### Storage Box Usage **Hetzner Storage Box** (NOT for Docker Images): - **Remote storage via**: CIFS/SMB, SSHFS, SFTP, Borg - **Pricing**: Starting at €3.81/month for 100 GB - **Best For**: Backups, media files, logs **Critical Warning**: ❌ **DO NOT store Docker images on Storage Box** - Causes instability (storage can disconnect) - Docker requires 100% available storage - Use only for application data, NOT `/var/lib/docker` **Safe Usage Pattern** (Application Uploads): ```yaml # docker-compose.yml volumes: uploads: driver: local driver_opts: type: cifs o: 'username=u123456,password=${STORAGE_BOX_PASSWORD},addr=u123456.your-storagebox.de' device: '//u123456.your-storagebox.de/uploads' ``` --- ## 4. Security Hardening Checklist ### Initial Server Setup #### 1. SSH Hardening ```bash # Disable root login sed -i 's/#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config # Disable password authentication (SSH keys only) sed -i 's/#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config # Create sudo user adduser deploy usermod -aG sudo deploy usermod -aG docker deploy # Setup SSH keys mkdir -p /home/deploy/.ssh cp ~/.ssh/authorized_keys /home/deploy/.ssh/ chown -R deploy:deploy /home/deploy/.ssh chmod 700 /home/deploy/.ssh chmod 600 /home/deploy/.ssh/authorized_keys # Restart SSH systemctl restart sshd ``` #### 2. Firewall Configuration (Defense in Depth) **Layer 1: Hetzner Cloud Firewall** (Primary): ```bash # Create firewall via Hetzner CLI hcloud firewall create --name production # Allow SSH (from specific IPs only - replace with your IP) hcloud firewall add-rule production \ --direction in \ --protocol tcp \ --port 22 \ --source-ips YOUR_IP/32 # Allow HTTP/HTTPS from anywhere hcloud firewall add-rule production \ --direction in \ --protocol tcp \ --port 80 \ --source-ips 0.0.0.0/0,::/0 hcloud firewall add-rule production \ --direction in \ --protocol tcp \ --port 443 \ --source-ips 0.0.0.0/0,::/0 # Apply to server hcloud firewall apply-to-resource production \ --type server \ --server web-01 ``` **Layer 2: UFW** (Secondary, Host-Level): ```bash # Install UFW apt install ufw # Default policies ufw default deny incoming ufw default allow outgoing # Allow SSH, HTTP, HTTPS ufw allow 22/tcp ufw allow 80/tcp ufw allow 443/tcp # Allow Docker Swarm (if using) ufw allow 2377/tcp # Cluster management ufw allow 7946/tcp # Node communication ufw allow 7946/udp # Node communication ufw allow 4789/udp # Overlay network # Enable firewall ufw enable # Check status ufw status verbose ``` #### 3. Docker-Specific Security ```json // /etc/docker/daemon.json { "live-restore": true, "userland-proxy": false, "no-new-privileges": true, "icc": false, "log-driver": "json-file", "log-opts": { "max-size": "10m", "max-file": "3" }, "metrics-addr": "127.0.0.1:9323", "experimental": true } ``` **Docker Compose Security**: ```yaml services: app: image: myapp:latest read_only: true security_opt: - no-new-privileges:true cap_drop: - ALL cap_add: - NET_BIND_SERVICE tmpfs: - /tmp:noexec,nosuid,size=100m user: '1000:1000' ``` #### 4. Fail2ban Configuration ```bash apt install fail2ban # Create local config cat > /etc/fail2ban/jail.local < 0.9 for: 5m labels: severity: warning annotations: summary: 'High memory usage on {{ $labels.name }}' description: 'Container {{ $labels.name }} memory usage is above 90%.' - alert: HighCPUUsage expr: rate(container_cpu_usage_seconds_total[5m]) > 0.8 for: 5m labels: severity: warning annotations: summary: 'High CPU usage on {{ $labels.name }}' description: 'Container {{ $labels.name }} CPU usage is above 80%.' - name: host interval: 30s rules: - alert: HostOutOfDiskSpace expr: (node_filesystem_avail_bytes / node_filesystem_size_bytes) < 0.1 for: 5m labels: severity: critical annotations: summary: 'Host out of disk space' description: 'Disk space is below 10%.' - alert: HostHighCPULoad expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80 for: 5m labels: severity: warning annotations: summary: 'Host high CPU load' description: 'CPU load is > 80%.' ``` ### Hetzner-Specific Monitoring **Hetzner Cloud Exporter** (Monitor Hetzner Resources): ```bash docker run -d \ --name hcloud-exporter \ -p 9501:9501 \ -e HCLOUD_TOKEN=${HCLOUD_TOKEN} \ promhippie/hcloud_exporter:latest ``` **Add to Prometheus**: ```yaml scrape_configs: - job_name: 'hetzner-cloud' static_configs: - targets: ['hcloud-exporter:9501'] ``` **Available Grafana Dashboards**: - **Hetzner Cloud Servers**: Dashboard ID 16169 - **Hetzner Cloud Servers & Load Balancers**: Dashboard ID 20257 ### Log Management **Loki Configuration** (`docker/loki/loki-config.yml`): ```yaml auth_enabled: false server: http_listen_port: 3100 ingester: lifecycler: address: 127.0.0.1 ring: kvstore: store: inmemory replication_factor: 1 final_sleep: 0s chunk_idle_period: 5m chunk_retain_period: 30s schema_config: configs: - from: 2020-05-15 store: boltdb object_store: filesystem schema: v11 index: prefix: index_ period: 168h storage_config: boltdb: directory: /loki/index filesystem: directory: /loki/chunks limits_config: enforce_metric_name: false reject_old_samples: true reject_old_samples_max_age: 168h chunk_store_config: max_look_back_period: 0s table_manager: retention_deletes_enabled: true retention_period: 720h ``` **Promtail Configuration** (`docker/promtail/promtail-config.yml`): ```yaml server: http_listen_port: 9080 grpc_listen_port: 0 positions: filename: /tmp/positions.yaml clients: - url: http://loki:3100/loki/api/v1/push scrape_configs: - job_name: system static_configs: - targets: - localhost labels: job: varlogs __path__: /var/log/**/*.log - job_name: docker static_configs: - targets: - localhost labels: job: docker __path__: /var/lib/docker/containers/**/*.log ``` **Deploy Monitoring Stack**: ```bash # Start monitoring services docker compose -f docker-compose.monitoring.yml up -d # Check status docker compose -f docker-compose.monitoring.yml ps # Access Grafana http://your-server-ip:3000 ``` --- ## 6. CI/CD Integration Patterns ### GitHub Actions with Hetzner Cloud #### Option 1: Deploy to Existing Server (Recommended) **Workflow**: `.github/workflows/deploy-hetzner.yml` ```yaml name: Deploy to Hetzner on: push: branches: [main] workflow_dispatch: env: REGISTRY: ghcr.io IMAGE_NAME: ${{ github.repository }} jobs: build-and-push: runs-on: ubuntu-latest permissions: contents: read packages: write steps: - name: Checkout code uses: actions/checkout@v4 - name: Set up Docker Buildx uses: docker/setup-buildx-action@v3 - name: Log in to Container Registry uses: docker/login-action@v3 with: registry: ${{ env.REGISTRY }} username: ${{ github.actor }} password: ${{ secrets.GITHUB_TOKEN }} - name: Extract metadata id: meta uses: docker/metadata-action@v5 with: images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }} tags: | type=sha type=ref,event=branch type=semver,pattern={{version}} - name: Build and push Docker images uses: docker/build-push-action@v5 with: context: . file: ./services/mana-core-auth/Dockerfile push: true tags: ${{ steps.meta.outputs.tags }} labels: ${{ steps.meta.outputs.labels }} cache-from: type=gha cache-to: type=gha,mode=max deploy: needs: build-and-push runs-on: ubuntu-latest steps: - name: Checkout code uses: actions/checkout@v4 - name: Deploy to Hetzner uses: appleboy/ssh-action@master with: host: ${{ secrets.HETZNER_HOST }} username: deploy key: ${{ secrets.SSH_PRIVATE_KEY }} script: | cd /app # Pull latest images docker compose -f docker-compose.production.yml pull # Rolling update (zero downtime) docker compose -f docker-compose.production.yml up -d --remove-orphans # Run migrations if needed docker compose -f docker-compose.production.yml exec -T mana-core-auth pnpm migration:run || true # Health check sleep 10 curl -f http://localhost:3001/api/v1/health || exit 1 echo "Deployment completed successfully" - name: Notify on failure if: failure() uses: 8398a7/action-slack@v3 with: status: ${{ job.status }} text: 'Deployment to Hetzner failed!' webhook_url: ${{ secrets.SLACK_WEBHOOK }} ``` #### Option 2: Self-Hosted GitHub Runner on Hetzner **Benefits**: - 3-10x cheaper than GitHub-hosted runners - Faster builds with persistent caching - Full control over environment **Setup**: ```bash # On Hetzner server cd /opt mkdir actions-runner && cd actions-runner # Download runner (check latest version) curl -o actions-runner-linux-x64-2.311.0.tar.gz -L \ https://github.com/actions/runner/releases/download/v2.311.0/actions-runner-linux-x64-2.311.0.tar.gz tar xzf actions-runner-linux-x64-2.311.0.tar.gz # Configure (get token from GitHub repo settings) ./config.sh --url https://github.com/your-org/manacore-monorepo --token YOUR_TOKEN # Install as service sudo ./svc.sh install sudo ./svc.sh start ``` **Use in Workflow**: ```yaml jobs: deploy: runs-on: self-hosted steps: - uses: actions/checkout@v4 - run: docker compose up -d ``` ⚠️ **Important**: Hetzner bills per hour, not per minute. A 30-second run costs the same as a 1-hour run. ### Docker Registry Options #### Option 1: GitHub Container Registry (Recommended) ```yaml - name: Login to GitHub Container Registry uses: docker/login-action@v3 with: registry: ghcr.io username: ${{ github.actor }} password: ${{ secrets.GITHUB_TOKEN }} - name: Build and Push uses: docker/build-push-action@v5 with: push: true tags: ghcr.io/${{ github.repository }}:latest ``` #### Option 2: Docker Hub ```yaml - name: Login to Docker Hub uses: docker/login-action@v3 with: username: ${{ secrets.DOCKERHUB_USERNAME }} password: ${{ secrets.DOCKERHUB_TOKEN }} ``` #### Option 3: Self-Hosted Harbor Registry ```bash # Deploy Harbor on Hetzner docker compose -f harbor-docker-compose.yml up -d ``` ### Deployment Strategies #### Blue-Green Deployment ```yaml - name: Blue-Green Deploy run: | ssh deploy@${{ secrets.HETZNER_HOST }} << 'EOF' cd /app # Start green environment docker compose -f docker-compose.green.yml up -d # Wait for health checks sleep 30 # Switch traffic (update nginx/traefik config) sudo mv /etc/nginx/sites-enabled/blue.conf /etc/nginx/sites-enabled/blue.conf.bak sudo mv /etc/nginx/sites-enabled/green.conf.new /etc/nginx/sites-enabled/green.conf sudo nginx -s reload # Stop blue environment docker compose -f docker-compose.blue.yml down EOF ``` #### Rolling Update (Docker Swarm) ```yaml - name: Deploy to Swarm run: | ssh deploy@${{ secrets.HETZNER_HOST }} << 'EOF' docker service update \ --image ghcr.io/your-org/myapp:${{ github.sha }} \ --update-parallelism 2 \ --update-delay 10s \ --update-failure-action rollback \ myapp EOF ``` --- ## 7. Cost Optimization Tips ### Server Right-Sizing **Progressive Scaling Strategy**: ``` Development/Testing: CX11 (€3.92/month) ↓ Staging: CX23 (€3.49/month) ↓ Production (Small): CPX21 (€7/month) ↓ Production (Medium): CX33 (€28/month) ↓ Production (Large): CCX42 (€101/month) ``` **Cost Calculator**: https://costgoat.com/pricing/hetzner ### Resource Optimization Strategies #### 1. Use ARM Servers (CAX Series) **Cost Savings**: 40% lower operational costs vs x86 **Example**: - **CX21** (x86): 2 vCPU, 4GB RAM - €6/month - **CAX21** (ARM): 4 vCPU, 8GB RAM - ~€8/month - **Better**: More CPUs, more RAM, same price range **Requirements**: - ARM64-compatible Docker images - Test thoroughly before production migration #### 2. Implement Auto-Scaling with Hetzner API ```bash #!/bin/bash # auto-scale.sh LOAD=$(uptime | awk -F'load average:' '{print $2}' | cut -d, -f1 | xargs) THRESHOLD=4.0 if (( $(echo "$LOAD > $THRESHOLD" | bc -l) )); then # Scale up - create new server hcloud server create \ --type cpx21 \ --name web-$(date +%s) \ --image docker-ce \ --ssh-key default echo "Scaled up due to load: $LOAD" else echo "Load normal: $LOAD" fi ``` #### 3. Volume Management ```bash #!/bin/bash # cleanup-volumes.sh # List detached volumes hcloud volume list -o json | jq -r '.[] | select(.server == null) | .id' # Delete old snapshots (>30 days) hcloud snapshot list -o json | \ jq -r '.[] | select(.created | fromdateiso8601 < now - 2592000) | .id' | \ xargs -I {} hcloud snapshot delete {} ``` **Cost Impact**: - Volumes: €0.05/GB/month (even when detached) - Snapshots: €0.01/GB/month - Storage Box: €0.04/GB/month (cheaper for cold storage) #### 4. Network Traffic Optimization **Included Traffic**: 20 TB/month (most plans) **Additional Traffic**: €1.19/TB **Optimization**: - Use private networks for inter-server communication (free) - Enable compression in Nginx/Traefik - Serve static assets from CDN (Cloudflare free) ```nginx # Enable gzip compression gzip on; gzip_vary on; gzip_min_length 1024; gzip_types text/plain text/css text/xml application/json application/javascript; ``` #### 5. Load Balancer Optimization **Pricing**: - Small LB (5K connections): €5.39/month - Large LB (40K connections): €15.49/month **When to Use**: - Multi-server setups only - For single server, use Nginx/Traefik directly (no LB cost) #### 6. Monitoring Costs **Self-Hosted** (Prometheus + Grafana): - Cost: ~€0/month (runs on same server) - Overhead: ~200MB RAM - No external service fees **External Monitoring** (Datadog, New Relic): - Cost: $20-50+/month per host - Only if specific features required ### Total Cost Examples #### Single App Deployment (Minimal) ``` Server (CPX21): €7.00/month Volume (50GB): €2.50/month Snapshot (weekly, 10GB): €0.50/month Storage Box (100GB backup): €3.81/month ───────────────────────────────────────── Total: €13.81/month ``` #### High-Availability Setup (Production) ``` 2x Servers (CPX21): €14.00/month Load Balancer (small): €5.39/month 3x Volumes (50GB each): €7.50/month Storage Box (500GB backup): €10.11/month Private Network: €0.00/month (free) Cloud Firewall: €0.00/month (free) ───────────────────────────────────────── Total: €37.00/month ``` #### Full Monorepo Deployment (All Services) ``` 3x App Servers (CX33): €84.00/month 1x DB Server (CX31): €28.00/month Load Balancer (medium): €10.00/month 5x Volumes (100GB each): €25.00/month Storage Box (1TB backup): €19.00/month Private Network: €0.00/month Cloud Firewall: €0.00/month ───────────────────────────────────────── Total: €166.00/month Equivalent on AWS: $400-600/month Savings: 60-75% ``` ### Cost Monitoring **Track Usage with Hetzner API**: ```bash #!/bin/bash # cost-report.sh # Get current month billing YEAR_MONTH=$(date +%Y-%m) hcloud billing get-month $YEAR_MONTH | jq # Example output: # { # "from": "2025-12-01", # "to": "2025-12-31", # "total_net": "45.67", # "total_gross": "54.35" # } ``` **Set Billing Alerts** (via Hetzner Console): - Alert at €50 - Alert at €100 - Alert at €150 ### Cost Optimization Checklist - [ ] Start with smaller server types - [ ] Evaluate CAX ARM servers for 40% savings - [ ] Use private networks for inter-server traffic (free) - [ ] Delete unused volumes and snapshots regularly - [ ] Use Storage Box for backups (cheaper than volumes) - [ ] Implement auto-scaling for variable workloads - [ ] Monitor resource usage and right-size servers - [ ] Use Hetzner's included 20TB/month traffic - [ ] Self-host monitoring (Prometheus/Grafana) - [ ] Regular cost audits with billing API --- ## 8. Orchestration Choice: Docker Swarm vs Kubernetes ### When to Use Docker Swarm **Best For**: - Small to medium deployments (<50 nodes) - Teams familiar with Docker Compose - Quick setup requirements (<30 minutes to production) - Simple applications without complex networking - Projects prioritizing simplicity over features **Advantages**: - Native Docker integration (same CLI) - Easy migration from docker-compose - Lower learning curve - Faster deployment times - Lower resource overhead (~100MB vs ~1GB for K8s) **Hetzner Setup**: ```bash # Initialize swarm on manager node docker swarm init --advertise-addr 10.0.1.2 # Join worker nodes docker swarm join --token 10.0.1.2:2377 # Deploy stack docker stack deploy -c docker-compose.yml manacore # Scale service docker service scale manacore_chat-backend=3 # Rolling update docker service update \ --image ghcr.io/org/chat-backend:v2 \ manacore_chat-backend ``` ### When to Use Kubernetes (k3s) **Best For**: - Medium to large deployments (>20 nodes) - Complex microservices architectures - Need for advanced networking (service mesh) - Teams requiring extensive ecosystem tools - Enterprise compliance requirements **Advantages on Hetzner**: - k3s optimized for Hetzner's cost structure - 40% lower costs vs MicroK8s - Production-grade availability - Extensive ecosystem (Helm, operators, etc.) - Better for multi-tenant applications **k3s Recommended** over full Kubernetes: - 50% less memory usage - Single binary installation - Hetzner-specific tooling available ### Quick Comparison | Factor | Docker Swarm | k3s on Hetzner | | -------------------------- | ---------------- | ------------------------------- | | **Setup Time** | 15 minutes | 30-60 minutes | | **Learning Curve** | Low | Medium | | **Resource Overhead** | Minimal (~100MB) | Low (~500MB) | | **Ecosystem** | Limited | Extensive | | **Cost (3 nodes)** | ~€21/month | ~€21/month | | **Operational Complexity** | Lower | Higher | | **Max Scale** | ~50 nodes | 1000+ nodes | | **Auto-Scaling** | Manual | HPA (Horizontal Pod Autoscaler) | | **Service Mesh** | No | Yes (Linkerd, Istio) | ### Recommendation for Manacore Monorepo **Start with Docker Swarm**, then migrate to k3s if needed: **Rationale**: 1. **Faster Time to Market**: 15-minute setup vs 1+ week for K8s 2. **Lower Complexity**: Existing Docker Compose knowledge sufficient 3. **Cost Effective**: Same infrastructure cost, lower ops overhead 4. **Sufficient for 90% of Use Cases**: <50 services, <100K requests/day **Migration Path**: ``` Docker Compose (Development) ↓ Docker Swarm (Production) ↓ k3s/Kubernetes (if scaling beyond 50 nodes) ``` --- ## 9. Production-Ready Deployment Scripts ### Complete Server Setup Script ```bash #!/bin/bash # hetzner-production-setup.sh # Complete Hetzner production setup automation set -e echo "=== Hetzner Docker Production Setup ===" # Configuration DEPLOY_USER="deploy" DOCKER_VERSION="24.0" SERVER_IP=$(curl -s ifconfig.me) # Colors for output RED='\033[0;31m' GREEN='\033[0;32m' YELLOW='\033[1;33m' NC='\033[0m' # No Color log_info() { echo -e "${GREEN}[INFO]${NC} $1"; } log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; } log_error() { echo -e "${RED}[ERROR]${NC} $1"; exit 1; } # 1. System Update log_info "Updating system packages..." apt update && apt upgrade -y || log_error "System update failed" # 2. Install Docker (if not pre-installed) if ! command -v docker &> /dev/null; then log_info "Installing Docker..." curl -fsSL https://get.docker.com -o get-docker.sh sh get-docker.sh rm get-docker.sh else log_info "Docker already installed: $(docker --version)" fi # 3. Install Docker Compose if ! command -v docker-compose &> /dev/null; then log_info "Installing Docker Compose..." apt install -y docker-compose-plugin fi # 4. Create deploy user if ! id "$DEPLOY_USER" &> /dev/null; then log_info "Creating deploy user..." adduser --disabled-password --gecos "" $DEPLOY_USER usermod -aG sudo,docker $DEPLOY_USER # Setup SSH keys mkdir -p /home/$DEPLOY_USER/.ssh if [ -f /root/.ssh/authorized_keys ]; then cp /root/.ssh/authorized_keys /home/$DEPLOY_USER/.ssh/ chown -R $DEPLOY_USER:$DEPLOY_USER /home/$DEPLOY_USER/.ssh chmod 700 /home/$DEPLOY_USER/.ssh chmod 600 /home/$DEPLOY_USER/.ssh/authorized_keys log_info "SSH keys copied for $DEPLOY_USER" fi else log_info "User $DEPLOY_USER already exists" fi # 5. Configure Docker daemon log_info "Configuring Docker daemon..." cat > /etc/docker/daemon.json < /etc/fail2ban/jail.local < /opt/monitoring/prometheus/prometheus.yml < /opt/monitoring/grafana/provisioning/datasources/prometheus.yml < /etc/logrotate.d/docker-containers < /dev/null; then log_info "✓ $SERVICE_NAME is healthy" else log_error "✗ $SERVICE_NAME health check failed" fi done # Clean up old images log_info "Cleaning up old Docker images..." docker image prune -f log_info "Deployment completed successfully!" ``` --- ## 10. Production-Ready Checklist ### Infrastructure - [ ] **Server Provisioned**: Appropriate Hetzner server type selected - [ ] **Private Network Configured**: 10.0.0.0/16 network created - [ ] **Floating IP Setup** (if HA required) - [ ] **Load Balancer Configured** (if multi-server) - [ ] **Volumes Mounted**: Block storage attached and formatted - [ ] **Hetzner Cloud Firewall**: Rules configured with IP restrictions - [ ] **DNS Records**: A/AAAA records pointing to server IP ### Storage & Backup - [ ] **Volumes Mounted**: Attached to `/mnt/volumes/*` - [ ] **Storage Box Configured**: Access credentials set - [ ] **Borg Backup Setup**: Repository initialized - [ ] **Automated Backups**: Cron job scheduled (daily at 2 AM) - [ ] **Database Backups**: PostgreSQL/Redis backup scripts created - [ ] **Backup Testing**: Restore procedure tested and documented - [ ] **Retention Policy**: Old backups pruned (7 days, 4 weeks, 6 months) ### Security - [ ] **SSH Key-Only Authentication**: Password auth disabled - [ ] **Root Login Disabled**: PermitRootLogin no - [ ] **UFW Configured**: Host-level firewall enabled - [ ] **fail2ban Installed**: Brute force protection active - [ ] **Automatic Security Updates**: unattended-upgrades enabled - [ ] **Docker Secrets**: Production secrets stored securely - [ ] **Containers Run as Non-Root**: All services use unprivileged users - [ ] **SSL/TLS Configured**: Let's Encrypt certificates active - [ ] **Security Scanning**: Trivy/Hadolint integrated in CI/CD ### Monitoring - [ ] **Prometheus Deployed**: Metrics collection running - [ ] **Grafana Deployed**: Dashboards configured - [ ] **cAdvisor Running**: Container metrics available - [ ] **Node Exporter Running**: Host metrics collected - [ ] **Loki + Promtail**: Centralized logging active - [ ] **Hetzner Cloud Exporter** (optional): Cloud resource monitoring - [ ] **Alert Rules Configured**: Critical alerts defined - [ ] **Alert Notifications**: Email/Slack notifications working - [ ] **Health Checks**: All services have health endpoints ### Deployment - [ ] **Docker Compose Files**: Production files tested - [ ] **Environment Variables**: Secrets properly configured - [ ] **CI/CD Pipeline**: GitHub Actions workflow working - [ ] **Docker Registry**: Images pushed to registry - [ ] **Deployment Strategy**: Blue-green or rolling updates defined - [ ] **Rollback Procedure**: Tested and documented - [ ] **Health Checks**: Pre-deployment and post-deployment checks ### Documentation - [ ] **Deployment Runbook**: Step-by-step deployment guide - [ ] **Rollback Procedure**: Emergency rollback documented - [ ] **Disaster Recovery Plan**: Complete recovery steps - [ ] **On-Call Procedures**: Incident response playbook - [ ] **Architecture Diagram**: Current infrastructure documented - [ ] **Access Documentation**: Server access, credentials locations - [ ] **Monitoring Dashboard**: Team has access to Grafana ### Cost Management - [ ] **Right-Sized Servers**: Appropriate server types selected - [ ] **ARM Servers Evaluated**: CAX series considered for savings - [ ] **Private Networks Used**: Inter-server traffic optimized - [ ] **Unused Resources Cleaned**: Old volumes/snapshots removed - [ ] **Billing Alerts Configured**: Threshold alerts set - [ ] **Cost Monitoring**: Monthly cost reports automated ### Performance - [ ] **Resource Limits Set**: CPU/memory limits defined - [ ] **Database Optimization**: PostgreSQL tuned for workload - [ ] **Redis Caching**: Cache hit ratio monitored - [ ] **CDN Configured**: Static assets served via CDN - [ ] **Compression Enabled**: Gzip/Brotli compression active - [ ] **Load Testing**: Application stress-tested --- ## Conclusion This guide provides a comprehensive production deployment strategy for the Manacore monorepo on Hetzner Cloud infrastructure. Following these practices will result in: - **Cost-Effective**: 60-75% cost savings vs AWS/GCP - **Secure**: Defense-in-depth security strategy - **Reliable**: High availability with failover capabilities - **Observable**: Complete monitoring and logging stack - **Maintainable**: Automated deployments and backups **Estimated Time to Production**: - Initial setup: 4-6 hours - Application deployment: 2-3 hours - Testing and hardening: 4-6 hours - **Total**: ~10-15 hours for complete production deployment **Monthly Operational Cost**: - Single server: €14-28/month - HA setup: €37-50/month - Full monorepo: €166/month --- **Related Documentation**: - `DOCKER_SETUP_ANALYSIS.md` - Current Docker setup analysis - `DOCKER_COMPOSE_PRODUCTION_ARCHITECTURE.md` - Architecture design - `DEPLOYMENT_HETZNER.md` - Deployment options comparison - `CI_CD_SETUP.md` - CI/CD pipeline details