managarten/docs/DEPLOYMENT.md
Till-JS ee42b6cc76 feat: major update with network graphs, themes, todo extensions, and more
## New Features

### Network Graph Visualization (Contacts, Calendar, Todo)
- D3.js force simulation for physics-based layout
- Zoom & pan with mouse/touchpad
- Keyboard shortcuts: +/- zoom, 0 reset, Esc deselect, / search, F focus
- Filtering by tags, company/location/project, connection strength
- Shared components in @manacore/shared-ui

### Central Tags API (mana-core-auth)
- CRUD endpoints for tags
- Schema: tags table with userId, name, color, app
- Shared tag components in @manacore/shared-ui

### Custom Themes System
- Theme editor with live preview and color picker
- Community theme gallery
- Theme sharing (public, unlisted, private)
- Backend API in mana-core-auth

### Todo App Extensions
- Glass-pill design for task input and items
- Settings page with 20+ preferences
- Task edit modal with inline editing
- Statistics page with visualizations
- PWA support with offline capabilities
- Multiple kanban boards

### Contacts App Features
- Duplicate detection
- Photo upload
- Batch operations
- Enhanced favorites page with multiple view modes
- Alphabet view improvements
- Search modal

### Help System
- @manacore/shared-help-content
- @manacore/shared-help-ui
- @manacore/shared-help-types

### Other Features
- Themes page for all apps
- Referral system frontend
- CommandBar (global search)
- Skeleton loaders
- Settings page improvements

## Bug Fixes
- Network graph simulation initialization
- Database schema TEXT for user_id columns (Better Auth compatibility)
- Various styling fixes

## Documentation
- Daily report for 2025-12-10
- CI/CD deployment guide

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 02:37:46 +01:00

762 lines
16 KiB
Markdown

# Deployment Guide
This guide covers the complete deployment process for the manacore-monorepo, including CI/CD setup, Docker orchestration, and production deployment strategies.
## Table of Contents
- [Overview](#overview)
- [Prerequisites](#prerequisites)
- [CI/CD Pipeline](#cicd-pipeline)
- [Docker Setup](#docker-setup)
- [Deployment Environments](#deployment-environments)
- [Deployment Process](#deployment-process)
- [Rollback Procedures](#rollback-procedures)
- [Monitoring and Maintenance](#monitoring-and-maintenance)
- [Troubleshooting](#troubleshooting)
## Overview
The manacore-monorepo uses a comprehensive CI/CD pipeline with the following features:
- **Automated Testing**: PR checks, type checking, linting, and format validation
- **Smart Build Detection**: Only builds affected projects using Turborepo filters
- **Docker Orchestration**: Multi-stage builds for all service types
- **Zero-Downtime Deployments**: Rolling updates with health checks
- **Automated Rollbacks**: Emergency rollback procedures
- **Security Scanning**: Dependency audits and vulnerability checks
### Architecture
```
┌─────────────────┐
│ GitHub PR │
└────────┬────────┘
┌─────────────────┐
│ PR Validation │ ← Lint, Type Check, Build, Test
└────────┬────────┘
┌─────────────────┐
│ Merge to Main │
└────────┬────────┘
┌─────────────────┐
│ Build & Push │ ← Docker images to registry
│ Docker Images │
└────────┬────────┘
┌─────────────────┐
│ Deploy Staging │ ← Automatic deployment
└────────┬────────┘
┌─────────────────┐
│ Manual Approval │ ← Production gate
└────────┬────────┘
┌─────────────────┐
│Deploy Production│ ← With backup & health checks
└─────────────────┘
```
## Prerequisites
### Required Tools
- **Docker**: Version 20.10+
- **Docker Compose**: Version 2.0+
- **Node.js**: Version 20+
- **pnpm**: Version 9.15.0
- **Git**: Version 2.30+
### Required Accounts
- **GitHub**: Repository access and Actions enabled
- **Docker Hub**: For image storage (or alternative registry)
- **Supabase**: For database services
- **Azure**: For OpenAI services
- **Hetzner/Coolify**: For hosting (recommended)
### GitHub Secrets
Configure the following secrets in your GitHub repository (`Settings > Secrets and variables > Actions`):
#### Docker Registry
```
DOCKER_USERNAME=your-docker-username
DOCKER_PASSWORD=your-docker-password
DOCKER_REGISTRY=wuesteon
```
#### Staging Environment
```
STAGING_HOST=staging.manacore.app
STAGING_USER=deploy
STAGING_SSH_KEY=<private-key>
STAGING_POSTGRES_HOST=postgres
STAGING_POSTGRES_PORT=5432
STAGING_POSTGRES_DB=manacore
STAGING_POSTGRES_USER=postgres
STAGING_POSTGRES_PASSWORD=<secure-password>
STAGING_REDIS_HOST=redis
STAGING_REDIS_PORT=6379
STAGING_REDIS_PASSWORD=<secure-password>
STAGING_SUPABASE_URL=https://xxx.supabase.co
STAGING_SUPABASE_ANON_KEY=<anon-key>
STAGING_SUPABASE_SERVICE_ROLE_KEY=<service-role-key>
STAGING_AZURE_OPENAI_ENDPOINT=https://xxx.openai.azure.com
STAGING_AZURE_OPENAI_API_KEY=<api-key>
STAGING_JWT_SECRET=<jwt-secret>
STAGING_JWT_PUBLIC_KEY=<public-key>
STAGING_JWT_PRIVATE_KEY=<private-key>
```
#### Production Environment
```
PRODUCTION_HOST=api.manacore.app
PRODUCTION_USER=deploy
PRODUCTION_SSH_KEY=<private-key>
PRODUCTION_API_URL=https://api.manacore.app
# ... (same structure as staging with production values)
```
#### Turbo Cache (Optional)
```
TURBO_TOKEN=<vercel-token>
TURBO_TEAM=<team-name>
```
#### Code Coverage (Optional)
```
CODECOV_TOKEN=<codecov-token>
```
## CI/CD Pipeline
### Workflow Files
The CI/CD pipeline consists of 6 GitHub Actions workflows:
#### 1. PR Validation (`ci-pull-request.yml`)
**Triggers**: Pull requests to `main` or `develop`
**Steps**:
1. Detect changed projects
2. Run format check
3. Run linting
4. Type checking
5. Build affected projects
6. Run tests with coverage
7. Docker build validation
8. Security scanning
**Required Checks**: Format, Type Check, Build
#### 2. Main Branch CI (`ci-main.yml`)
**Triggers**: Push to `main` branch
**Steps**:
1. Full validation (all projects)
2. Build all projects
3. Build and push Docker images
4. Trigger staging deployment
#### 3. Staging Deployment (`cd-staging.yml`)
**Triggers**: Manual or automated from main CI
**Steps**:
1. SSH to staging server
2. Pull latest Docker images
3. Update environment configuration
4. Deploy services with zero-downtime
5. Run database migrations
6. Health checks
7. Notify on completion
#### 4. Production Deployment (`cd-production.yml`)
**Triggers**: Manual only
**Steps**:
1. Validate deployment request
2. Request manual approval
3. Create database backup
4. Deploy with rolling update
5. Run migrations
6. Health checks
7. Monitor for 5 minutes
8. Run smoke tests
9. Notify on completion
#### 5. Test Coverage (`test-coverage.yml`)
**Triggers**: PRs, pushes to main, weekly schedule
**Steps**:
1. Run all tests with coverage
2. Collect coverage reports
3. Upload to Codecov
4. Generate summary
5. Check coverage thresholds (50% minimum)
#### 6. Dependency Updates (`dependency-update.yml`)
**Triggers**: Weekly schedule, manual
**Steps**:
1. Check for outdated dependencies
2. Run security audit
3. Create issue for critical vulnerabilities
4. Update lock file
5. Create PR with changes
### Change Detection
The pipeline uses `dorny/paths-filter` to detect which projects have changed:
```yaml
filters:
maerchenzauber:
- 'apps/maerchenzauber/**'
- 'packages/**'
chat:
- 'apps/chat/**'
- 'packages/**'
# ... other projects
```
Only affected projects are built and tested, saving time and resources.
## Docker Setup
### Multi-Stage Builds
All Dockerfiles use multi-stage builds for optimal image size:
1. **Builder Stage**: Install dependencies and build
2. **Production Stage**: Copy only production dependencies and built assets
### Service Types
#### NestJS Backend
Template: `docker/templates/Dockerfile.nestjs`
```dockerfile
FROM node:20-alpine AS builder
# Build with all dependencies
FROM node:20-alpine AS production
# Production with minimal footprint
```
**Key Features**:
- Non-root user (`nestjs`)
- Health checks
- Resource limits
- Optimized caching
#### SvelteKit Web
Template: `docker/templates/Dockerfile.sveltekit`
**Key Features**:
- SSR support
- Static asset optimization
- Non-root user
- Health endpoints
#### Astro Landing Pages
Template: `docker/templates/Dockerfile.astro`
**Key Features**:
- Nginx-based serving
- Gzip compression
- Security headers
- Static file caching
### Docker Compose
Two environments are provided:
#### Staging (`docker-compose.staging.yml`)
- Includes PostgreSQL and Redis
- Service discovery via Docker network
- Local development configuration
- Verbose logging
#### Production (`docker-compose.production.yml`)
- External database connections
- Resource limits
- Optimized logging
- Security hardening
## Deployment Environments
### Staging
**Purpose**: Pre-production testing and validation
**URL**: `https://staging.manacore.app`
**Characteristics**:
- Automatic deployment from `main` branch
- Separate database instances
- Full feature parity with production
- Verbose logging enabled
**Access**:
```bash
ssh deploy@staging.manacore.app
cd ~/manacore-staging
docker compose ps
```
### Production
**Purpose**: Live production environment
**URL**: `https://api.manacore.app`
**Characteristics**:
- Manual deployment with approval
- High availability configuration
- Performance optimized
- Enhanced monitoring
- Backup procedures
**Access**:
```bash
ssh deploy@api.manacore.app
cd ~/manacore-production
docker compose ps
```
## Deployment Process
### Automated Staging Deployment
Staging deployment happens automatically when code is merged to `main`:
```bash
# 1. Create PR
git checkout -b feature/my-feature
git push origin feature/my-feature
# 2. PR Validation runs automatically
# - Checks pass
# 3. Merge to main
# - Main CI builds Docker images
# - Pushes to registry
# - Triggers staging deployment
# 4. Staging deployment
# - Pulls latest images
# - Rolling update
# - Health checks
# - Success!
```
### Manual Production Deployment
Production requires manual trigger and approval:
#### Step 1: Trigger Deployment
Go to GitHub Actions > CD - Production Deployment > Run workflow
**Required Inputs**:
- Service: `all` or specific service name
- Environment: `production`
- Confirm: Type `deploy`
#### Step 2: Approval
Workflow pauses for manual approval at `production-approval` environment.
Approve in: GitHub > Settings > Environments > production-approval
#### Step 3: Automated Deployment
Once approved:
1. Creates database backup
2. Tags current deployment
3. Pulls latest images
4. Runs migrations
5. Rolling update (zero-downtime)
6. Health checks
7. 5-minute monitoring
8. Smoke tests
#### Step 4: Verification
```bash
# Check deployment status
./scripts/deploy/health-check.sh production
# View logs
ssh deploy@api.manacore.app
cd ~/manacore-production
docker compose logs -f
```
### Manual Deployment Scripts
For manual deployments or troubleshooting:
#### Build and Push Images
```bash
# Build all services
./scripts/deploy/build-and-push.sh all latest
# Build specific service
./scripts/deploy/build-and-push.sh chat-backend v1.2.3
```
#### Deploy to Server
```bash
# Deploy to staging
export STAGING_HOST=staging.manacore.app
export STAGING_USER=deploy
./scripts/deploy/deploy-hetzner.sh staging all
# Deploy to production
export PRODUCTION_HOST=api.manacore.app
export PRODUCTION_USER=deploy
./scripts/deploy/deploy-hetzner.sh production all
```
#### Health Checks
```bash
# Check staging
./scripts/deploy/health-check.sh staging
# Check production
./scripts/deploy/health-check.sh production
```
#### Database Migrations
```bash
# Run migrations for specific project
./scripts/deploy/migrate-db.sh chat staging
./scripts/deploy/migrate-db.sh mana-core-auth production
```
## Rollback Procedures
### Automated Rollback (Recommended)
```bash
# Rollback staging
./scripts/deploy/rollback.sh staging all
# Rollback production (specific service)
./scripts/deploy/rollback.sh production chat-backend
```
**What the script does**:
1. Confirms rollback with user
2. Checks for previous deployment backup
3. Stops current services
4. Restores previous docker-compose configuration
5. Restores database (if applicable)
6. Starts services with previous version
7. Runs health checks
8. Reports status
### Manual Rollback
If automated rollback fails:
```bash
# SSH to server
ssh deploy@api.manacore.app
cd ~/manacore-production
# List available backups
ls -lt backups/
# Choose backup
BACKUP_DIR=backups/20250127_120000
# Restore configuration
cp $BACKUP_DIR/docker-compose.yml ./docker-compose.yml
cp $BACKUP_DIR/.env.backup ./.env
# Restore database (if needed)
docker compose exec -T postgres psql -U postgres < $BACKUP_DIR/postgres_backup.sql
# Restart services
docker compose up -d
# Check status
docker compose ps
```
## Monitoring and Maintenance
### Log Management
```bash
# View logs for all services
docker compose logs -f
# View logs for specific service
docker compose logs -f mana-core-auth
# View last 100 lines
docker compose logs --tail=100 chat-backend
# Search logs
docker compose logs | grep ERROR
```
### Resource Monitoring
```bash
# Check container resources
docker stats
# Check disk usage
docker system df
# Cleanup unused resources
docker system prune -a
```
### Database Backups
Automated backups are created before each production deployment.
**Manual backup**:
```bash
# Create backup
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
docker compose exec -T postgres pg_dumpall -U postgres > backup_$TIMESTAMP.sql
# Restore from backup
docker compose exec -T postgres psql -U postgres < backup_20250127.sql
```
### Health Monitoring
Set up external monitoring tools to ping health endpoints:
- Mana Core Auth: `https://api.manacore.app/api/v1/health`
- Maerchenzauber: `https://api.manacore.app/health`
- Chat Backend: `https://api.manacore.app/api/health`
Recommended tools:
- UptimeRobot
- Pingdom
- Better Uptime
- Datadog
## Troubleshooting
### Deployment Fails
**Issue**: Deployment workflow fails
**Solutions**:
1. Check workflow logs in GitHub Actions
2. Verify all required secrets are set
3. Ensure SSH access to server works
4. Check Docker registry credentials
```bash
# Test SSH access
ssh deploy@staging.manacore.app 'echo "SSH works"'
# Test Docker login
echo $DOCKER_PASSWORD | docker login -u $DOCKER_USERNAME --password-stdin
```
### Health Checks Fail
**Issue**: Service fails health checks after deployment
**Solutions**:
1. Check service logs
2. Verify environment variables
3. Check database connectivity
4. Verify port mappings
```bash
# Check service logs
docker compose logs --tail=200 mana-core-auth
# Test health endpoint directly
docker compose exec mana-core-auth wget -O - http://localhost:3001/api/v1/health
# Check environment
docker compose exec mana-core-auth env | grep -v PASSWORD
```
### Database Connection Issues
**Issue**: Services can't connect to database
**Solutions**:
1. Verify database is running
2. Check connection strings
3. Verify credentials
4. Check network connectivity
```bash
# Check database status
docker compose exec postgres psql -U postgres -c '\l'
# Test connection from service
docker compose exec mana-core-auth nc -zv postgres 5432
```
### Image Build Failures
**Issue**: Docker build fails in CI
**Solutions**:
1. Check Dockerfile syntax
2. Verify all COPY paths exist
3. Check for build dependency issues
4. Review build logs
```bash
# Test build locally
docker buildx build --file apps/chat/apps/backend/Dockerfile .
# Build with verbose output
docker buildx build --progress=plain --file apps/chat/apps/backend/Dockerfile .
```
### Out of Disk Space
**Issue**: Server runs out of disk space
**Solutions**:
```bash
# Check disk usage
df -h
# Clean Docker resources
docker system prune -a --volumes
# Remove old images
docker image prune -a --filter "until=72h"
# Remove old backups
cd ~/manacore-production/backups
ls -t | tail -n +10 | xargs rm -rf
```
### Services Not Starting
**Issue**: Docker Compose services fail to start
**Solutions**:
```bash
# Check service dependencies
docker compose config
# Start services one by one
docker compose up -d postgres
docker compose up -d redis
docker compose up -d mana-core-auth
# Check startup logs
docker compose logs --tail=100 --follow
```
## Best Practices
### 1. Always Test in Staging First
Never deploy directly to production without testing in staging.
### 2. Use Tagged Releases
Tag important releases:
```bash
git tag -a v1.2.3 -m "Release version 1.2.3"
git push origin v1.2.3
```
### 3. Monitor After Deployment
Watch logs and metrics for at least 30 minutes after production deployment.
### 4. Communicate Deployments
Notify team before production deployments, especially during business hours.
### 5. Keep Backups
Always verify backups are created before production deployments.
### 6. Document Changes
Update CHANGELOG.md with notable changes for each deployment.
### 7. Security
- Rotate secrets regularly
- Keep dependencies updated
- Review security audit reports
- Use least-privilege access
## Support
For deployment issues or questions:
1. Check this documentation
2. Review GitHub Actions logs
3. Check service logs on server
4. Contact DevOps team
**Emergency Contact**: DevOps on-call rotation