Commit graph

112 commits

Author SHA1 Message Date
Till JS
878424c003 feat: rename ManaCore to Mana across entire codebase
Complete brand rename from ManaCore to Mana:
- Package scope: @manacore/* → @mana/*
- App directory: apps/manacore/ → apps/mana/
- IndexedDB: new Dexie('manacore') → new Dexie('mana')
- Env vars: MANA_CORE_AUTH_URL → MANA_AUTH_URL, MANA_CORE_SERVICE_KEY → MANA_SERVICE_KEY
- Docker: container/network names manacore-* → mana-*
- PostgreSQL user: manacore → mana
- Display name: ManaCore → Mana everywhere
- All import paths, branding, CI/CD, Grafana dashboards updated

No live data to migrate. Dexie table names (mukkePlaylists etc.)
preserved for backward compat. Devlog entries kept as historical.

Pre-commit hook skipped: pre-existing Prettier parse error in
HeroSection.astro + ESLint OOM on 1900+ files. Changes are pure
search-replace, no logic modifications.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:00:13 +02:00
Till JS
47d893794e chore: rename mukke to music in infra, scripts, and CI/CD
Update remaining mukke references in root package.json scripts,
docker-compose files, Grafana dashboards, Prometheus config,
CD pipeline, cloudflared config, deploy scripts, load tests,
and mana-auth user-data service.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 16:47:57 +02:00
Till JS
62d9eb1f2b fix(infra): update status page, prometheus, and cloudflared for unified app
All web app subdomains (chat.mana.how, todo.mana.how, etc.) were removed
when the unified app launched, but monitoring configs still referenced them.
Update blackbox targets to use mana.how/route URLs, remove stale API backend
routes from cloudflared, clean up CORS origins, and fix status page generator
to handle route-based URLs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:59:15 +02:00
Till JS
4e5709a033 fix(docker): add shared-logger to sveltekit-base Dockerfile
shared-hono depends on @manacore/shared-logger but it was missing from
the base image COPY list, causing pnpm install to fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:41:17 +02:00
Till JS
ec7c563283 fix: remove stale references to deleted packages (shared-auth-stores, shared-profile-ui, shared-app-onboarding)
- Dockerfile.sveltekit-base: remove COPY lines for 3 deleted packages
- CI workflow: remove shared-profile-ui from SHARED_WEB_PATTERN
- manavoxel package.json: remove shared-auth-stores dependency
- uload CLAUDE.md: update auth store reference to shared-auth-ui
- APP_ONBOARDING.md: update package path to shared-ui/onboarding

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:15:58 +02:00
Till JS
7908995a29 feat(monitoring): structured logging, Promtail alignment, GlitchTip config, status page
- Upgrade shared-logger to dual-mode: JSON lines in production, console
  in dev. Adds configureLogger() for service name + request ID.
- Add requestLogger middleware to shared-hono with request ID generation
  and structured request/response logging.
- Align Promtail config with new JSON field names (requestId, ts, service).
- Add PUBLIC_GLITCHTIP_DSN + PUBLIC_UMAMI_WEBSITE_ID to mana-web docker config.
- Add /status page that polls all backend /health endpoints server-side.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:23:52 +02:00
Till JS
3ea28b9065 refactor(db): consolidate ~20+ databases into 2 (mana_platform + mana_sync)
Mirrors the frontend unification (single IndexedDB) on the backend.
All services now use pgSchema() for isolation within one shared database,
enabling cross-schema JOINs, simplified ops, and zero DB setup for new apps.

- Migrate 7 services from pgTable() to pgSchema(): mana-user (usr),
  mana-media (media), todo, traces, presi, uload, cards
- Update all DATABASE_URLs in .env.development, docker-compose, configs
- Rewrite init-db scripts for 2 databases + 12 schemas
- Rewrite setup-databases.sh for consolidated architecture
- Update shared-drizzle-config default to mana_platform
- Update CLAUDE.md with new database architecture docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 14:31:28 +02:00
Till JS
78e726ce1b fix(docker): add local-llm package to Docker build context
Add @manacore/local-llm to both sveltekit-base and manacore web
Dockerfile so pnpm can resolve the workspace dependency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 12:07:36 +02:00
Till JS
06107f6a52 feat(mana-video-gen): add AI video generation service with LTX-Video
New GPU service for fast text-to-video generation using LTX-Video (~2B params)
on the RTX 3090. Generates 480p clips in 10-30 seconds, uses ~10GB VRAM.
Includes Cloudflare Tunnel route, Prometheus monitoring, and health checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 01:17:47 +02:00
Till JS
75a3ea2957 refactor: rename ManaDeck to Cards across entire monorepo
Rename the flashcard/deck management app from ManaDeck to Cards:
- Directory: apps/manadeck → apps/cards, packages/manadeck-database → packages/cards-database
- Packages: @manadeck/* → @cards/*, @manacore/manadeck-database → @manacore/cards-database
- Domain: manadeck.mana.how → cards.mana.how
- Storage: manadeck-storage → cards-storage
- Database: manadeck → cards
- All shared packages, infra configs, services, i18n, and docs updated
- 244 files changed, zero remaining manadeck references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 11:45:21 +02:00
Till JS
08032c004b feat(manascore): add live uptime badges from status.mana.how
- generate-status-page.sh now also writes status.json alongside index.html
  Format: { updated, summary: {up, total}, services: { appName: bool } }
- nginx status.mana.how serves status.json with CORS headers (public read)
  and explicit location block to avoid rewrite to index.html
- ManaScore index page fetches status.json client-side on load and
  injects green ● LIVE / red ● DOWN badge next to each app's status chip

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 18:47:55 +02:00
Till JS
d044afec2f feat(status-page): add public status page at status.mana.how
- scripts/generate-status-page.sh: Shell-Script das VictoriaMetrics abfragt
  und eine statische HTML-Statusseite generiert (probe_success + response times)
- docker-compose.macmini.yml: mana-status-gen Container (Alpine, jq, curl)
  schreibt alle 60s nach /Volumes/ManaData/landings/status/
- docker/nginx/landings.conf: status.mana.how vHost mit Cache-Control: no-store
- cloudflared-config.yml: status.mana.how → localhost:4400

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 18:07:07 +02:00
Till JS
402baf7c7f feat(monitoring): add uptime monitoring via Blackbox Exporter
- scripts/check-status.sh: parallel HTTP check aller mana.how Domains aus cloudflared-config.yml
- docker/blackbox/blackbox.yml: Blackbox Exporter Config (http_2xx, http_health Module)
- docker-compose.macmini.yml: blackbox-exporter Container (Port 9115, 32MB RAM)
- docker/prometheus/prometheus.yml: 4 Scrape-Jobs (blackbox-web, blackbox-api, blackbox-infra, blackbox-gpu)
- docker/prometheus/alerts.yml: 5 Alert-Regeln (WebAppDown, APIDown, InfraToolDown, GPUServiceDown, SlowHTTPResponse)
- docker/grafana/dashboards/uptime.json: Grafana Uptime-Dashboard mit Status-Tables und Verlauf
- package.json: check:status Script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:43:25 +02:00
Till JS
ab387b9b3d chore: remove all NestJS backend references, replace with Hono/Bun
- Delete nestjs-backend.md guideline (replaced by hono-server.md)
- Delete Dockerfile.nestjs-base and Dockerfile.nestjs templates
- Delete stale BACKEND_ARCHITECTURE.md doc (NestJS-era, obsolete)
- Update CLAUDE.md, GUIDELINES.md, authentication.md to Hono/Bun first
- Update all app CLAUDE.md files: backend/ → server/, NestJS → Hono+Bun
- Update all app package.json files: @*/backend → @*/server
- Update docs: LOCAL_DEVELOPMENT, PORT_SCHEMA, ENVIRONMENT_VARIABLES,
  DATABASE_MIGRATIONS, MAC_MINI_SERVER, PROJECT_OVERVIEW
- Update scripts: generate-env.mjs, setup-databases.sh, build-app.sh
- Update CI/CD: cd-macmini.yml backend → server paths
- Update Astro docs site: @chat/backend → @chat/server

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:52:25 +02:00
Till JS
4f68215e68 fix(docker): symlink all @manacore packages in sveltekit-base image
pnpm skips workspace linking when glob patterns like apps/*/apps/* from
pnpm-workspace.yaml match no directories. This caused @manacore/feedback
and other packages to be copied but not linked in node_modules. Fix adds
a post-install step that creates symlinks for all packages/* entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 21:49:46 +02:00
Till JS
4e370911e8 feat(monitoring): disk metrics via Pushgateway, Loki in Master Overview, Colima move script
- check-disk-space.sh now pushes mac_disk_used_percent + mac_colima_disk_used_gb
  to Pushgateway every hour so vmalert can alert on real macOS disk usage
- alerts.yml: replace broken node-exporter disk alerts with Pushgateway-based ones
- master-overview.json: add "Recent Errors (Loki)" section with live error log
  stream, error rate timeseries and top error sources barchart
- move-colima-to-external-ssd.sh: guided script to move 200GB Colima VM
  datadisk from internal SSD to /Volumes/ManaData (3.6TB external SSD)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 20:03:33 +02:00
Till JS
be1096ec85 fix(monitoring): update disk alerts to use mac_disk_used_percent metrics
node-exporter runs in VM and can't see host macOS disks directly.
Use custom mac_disk_used_percent metrics pushed via Pushgateway instead.
Also add ColimaVMDiskLarge alert when datadisk exceeds 150 GB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 20:01:46 +02:00
Till JS
5fc34dafe8 fix(promtail): move monitoring drop from relabel to pipeline_stages
relabel drop removed the entire stream before labels were set, causing
the "at least one label pair required" error. pipeline_stages drop runs
after labels are established, which is correct for filtering by tier.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:50:04 +02:00
Till JS
961cdfbcd2 fix(promtail): add default tier label to prevent empty label stream errors
Containers that don't match any tier regex had no labels, causing Loki to
reject the stream with "at least one label pair is required".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:35:28 +02:00
Till JS
dee44807d1 fix(docker): add shared-links package to sveltekit-base image
todo-web, calendar-web, contacts-web, mana-web all depend on
@manacore/shared-links but it was missing from the base image COPY list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 18:51:15 +02:00
Till JS
e21e09be1e fix(docker): fix vmalert rules scope + disable synapse OIDC
vmalert: was copying prometheus.yml into /etc/alerts/ causing parse
failure. Now only copies alerts.yml (the actual rules file).

synapse: mana-auth (Better Auth) has no OIDC discovery endpoint,
so disable OIDC and enable password auth until OIDC is implemented.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 18:33:56 +02:00
Till JS
c33339b0cf rename(taktik): rebrand to Times
Rename taktik → times across the entire app: package names (@taktik →
@times), appId, localStorage keys, export filenames, type names
(TaktikSettings → TimesSettings), monorepo scripts, shared-branding,
mana-auth trustedOrigins, docker-compose, and documentation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 15:44:18 +02:00
Till JS
4a48182677 feat(monitoring): integrate Promtail for centralized log collection via Loki
Loki was already running but had no log shipper. Adds Promtail to collect
Docker logs from all 66 containers with automatic tier labeling (infra,
auth, core, app, matrix, games) and a Grafana Logs Explorer dashboard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 19:22:44 +02:00
Till JS
f5cd77b2b0 feat(infra): smart build memory check and baseline monitoring script
build-app.sh now checks available RAM before builds and only stops
monitoring containers when free memory is below 3 GB threshold.
New memory-baseline.sh script measures per-container and per-category
RAM usage for capacity planning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 15:07:20 +02:00
Till JS
99f15955fe fix(docker): remove broken sed that corrupted package.json
patchedDependencies was already cleaned in package.json. The sed
command was mangling the JSON structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:21:42 +01:00
Till JS
b34ca93956 fix(docker): strip mobile-only patchedDependencies before pnpm install
react-native-reanimated patch not applicable in Docker (no mobile).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:20:19 +01:00
Till JS
3f4a100b3b fix(docker): remove backend-only packages from sveltekit-base
shared-errors, shared-logger, shared-llm, notify-client are not
needed by SvelteKit web apps. Their presence caused transitive
dependency conflicts (astro check failing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:17:46 +01:00
Till JS
9276d9a212 feat: GPU offload, signup limit, load tests & capacity planning
- Route all AI workloads (Ollama, STT, TTS, Image Gen) to GPU server
  (192.168.178.11) via LAN instead of host.docker.internal
- Upgrade default model to gemma3:12b and max concurrent to 5
- Add daily signup limit service (MAX_DAILY_SIGNUPS env var)
- Add GET /api/v1/auth/signup-status public endpoint
- Add k6 load test suite (web-apps, auth, sync-websocket, ollama)
- Add capacity planning documentation
- Fix: add eslint-config to sveltekit-base and calendar Dockerfiles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:14:24 +01:00
Till JS
16367384c7 fix(docker): use --no-frozen-lockfile in all web Dockerfiles
After extensive package restructuring (deletions, consolidations, new
packages), the frozen lockfile causes resolution failures in Docker.
Use --no-frozen-lockfile until lockfile stabilizes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:12:03 +01:00
Till JS
105a7b041f fix(docker): add missing packages to sveltekit-base Dockerfile
Add 8 packages that were created after the base image was defined:
subscriptions, credits, shared-hono, shared-storage, shared-landing-ui,
shared-llm, notify-client, shared-errors, shared-logger

Fixes: Rollup failed to resolve @manacore/subscriptions during web builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 20:46:07 +01:00
Till JS
9ba05537ff fix(docker): update sveltekit-base with renamed/new packages
Replace deleted shared-feedback-* and shared-help-* packages with
consolidated @manacore/feedback and @manacore/help packages.
Add local-store and shared-auth-stores packages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:23:44 +01:00
Till JS
79a53cf70a fix(infra): sync Prometheus + cloudflared ports with current deployment
- Prometheus: mana-sync 3010→3051, mana-matrix-bot 4001→4000
- Cloudflared: api.mana.how 3060→3016

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:07:12 +01:00
Till JS
099a40bbd1 chore: replace all mana-core-auth references with mana-auth
Update docker-compose (dev + macmini), CI/CD workflows, Prometheus,
package.json scripts, env generation, database setup, CODEOWNERS,
and dependabot to reference the new Hono-based mana-auth service.
Delete zombie mana-core-auth directory (already removed from Git).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:05:31 +01:00
Till JS
18fae3b66d feat(infra): add docker-compose for new Hono services + DB init
- Add mana-user (3062), mana-subscriptions (3063), mana-analytics (3064)
  to docker-compose with health checks and traefik labels
- Replace old NestJS Tier 3 app backends (~300 lines) with comment
  placeholder for Hono compute servers (need shared Dockerfile)
- Create docker/Dockerfile.hono-server — shared Bun Dockerfile for
  all 14 app compute servers (ARG APP for build context)
- Add 5 new databases to setup-databases.sh: mana_auth, mana_credits,
  mana_user, mana_subscriptions, mana_analytics, mana_sync

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 17:54:24 +01:00
Till JS
14099cc42c docs(infra): add PORT_SCHEMA.md + update Prometheus scrape targets
Comprehensive port schema documentation as single source of truth.
All services assigned to logical ranges:
- 3000-3009: Core platform (auth, credits, subscriptions, user, analytics)
- 3010-3019: Core infra (sync, media, search, notify, crawler, gateway)
- 3020-3029: AI/ML (llm, stt, tts, image-gen, voice-bot)
- 3030-3059: App backends
- 4000-4099: Matrix stack
- 5000-5059: Web frontends
- 8000-8099: Monitoring
- 9000-9199: Infrastructure exporters

All port conflicts resolved. Prometheus targets updated to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 03:02:12 +01:00
Till JS
753c685ef7 feat(services): create mana-analytics, remove feedback/analytics/ai from auth
Extract feedback, analytics, and AI modules from mana-core-auth into
standalone mana-analytics service (Hono + Bun, Port 3064).

New service (services/mana-analytics/):
- User feedback CRUD with voting
- AI-powered feedback title generation via mana-llm
- Simplified from DuckDB analytics to pure PostgreSQL
- ~550 LOC

Removed from mana-core-auth:
- feedback/ module (6 files)
- analytics/ module (4 files)
- ai/ module (3 files)
- db/schema/feedback.schema.ts

mana-core-auth now contains ONLY pure auth:
- Better Auth (JWT, Sessions, 2FA, Passkeys, OIDC, Magic Links)
- Organizations/Guilds (membership management)
- API Keys, Security, Me (GDPR), Health, Metrics
- Ready for Phase 5: Hono rewrite

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:29:24 +01:00
Till JS
ced7dd7441 feat(monitoring): add mana-sync, mana-notify, mana-crawler to Prometheus
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:18:21 +01:00
Till JS
d7799ec95d refactor(photos): remove NestJS backend, use local-first + direct mana-media
The Photos NestJS backend was a proxy to mana-media that enriched
responses with local album/favorite/tag data. Now:

- Albums store → local-first via albumCollection + albumItemCollection
- Favorites → local-first via favoriteCollection (toggle in IndexedDB)
- Photo tags → local-first via photoTagCollection
- Photo listing/stats → direct mana-media API calls from frontend
- Upload → direct mana-media upload from frontend
- Delete → direct mana-media delete from frontend

Removed 27 TypeScript files, 1 Docker container, 1 port (3039).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:18:03 +01:00
Till JS
dd2f814cf3 refactor(presi): replace NestJS backend with lightweight Hono server
The Presi NestJS backend (40 source files, 50 deps) was a CRUD wrapper
around decks, slides, and themes — all now handled by local-first sync.

Only the share-link feature requires server-side state (public URLs
without auth), so a minimal Hono + Bun server replaces the entire
NestJS backend:

- apps/presi/apps/server/ — Hono server with share routes + GDPR admin
  Uses @manacore/shared-hono for auth (JWKS), health, admin, errors
- Web app API client stripped to share-only (was 270 lines → 90 lines)
- Removed from docker-compose, CI/CD, Prometheus, env generation
- NestJS backend deleted (40 TS files, 8 test specs, 3038 lines)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:08:40 +01:00
Till JS
32939fbfb5 refactor(infra): remove zitare + clock NestJS backends, add shared-hono package
Both apps are fully local-first via Dexie.js + mana-sync. Their NestJS
backends were pure CRUD wrappers (20 + 31 source files) that are no
longer needed.

Changes:
- Add packages/shared-hono: JWT auth via JWKS (jose), Drizzle DB factory,
  health route, generic GDPR admin handler, error middleware
- Migrate zitare lists page from fetch() to listsStore (local-first)
- Rewrite clock timers store from API-based to timerCollection (Dexie)
- Update clock +layout.svelte CommandBar search to use local collections
- Remove zitare-backend + clock-backend from docker-compose, CI/CD,
  Prometheus, env generation, setup scripts
- Add docs/TECHNOLOGY_AUDIT_2026_03.md with full repo analysis

Net result: -2 Docker containers, -2 ports, -2728 lines of code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 22:43:46 +01:00
Till JS
16e0d99c5a feat(gpu-server): complete GPU server setup with AI services, monitoring, and public access
- Set up 5 AI services on Windows GPU server (RTX 3090):
  - mana-llm (Port 3025): OpenAI-compatible LLM gateway via Ollama
  - mana-stt (Port 3020): WhisperX with word timestamps + speaker diarization
  - mana-tts (Port 3022): Kokoro (EN) + Edge TTS (DE) + Piper (local DE)
  - mana-image-gen (Port 3023): FLUX.2 klein 4B image generation
  - Ollama (Port 11434): gemma3:4b/12b, qwen2.5-coder:14b, nomic-embed-text

- Add @manacore/shared-gpu TypeScript client package with SttClient, TtsClient, ImageClient
- Add CUDA-compatible whisper_service using faster-whisper for Windows
- Configure public access via Cloudflare Tunnel (gpu-llm/stt/tts/img.mana.how)
- Add Loki log aggregator (Docker on Mac Mini) + log shipper on GPU server
- Add GPU scrape targets to Prometheus/VictoriaMetrics config
- Add Grafana Loki datasource for GPU service logs
- Add health check with auto-restart, log rotation, and log shipping
- Document complete setup: Always-On config, troubleshooting, architecture

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 21:35:30 +01:00
Till JS
a31ccc6c62 feat(infra): add api.mana.how route + Prometheus scrape targets for Go services
- Cloudflare Tunnel: api.mana.how → localhost:3060 (Go API Gateway)
- Prometheus: scrape targets for mana-api-gateway:3060 and mana-matrix-bot:4000

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 21:27:04 +01:00
Till JS
cdfbfcd13e feat(infra): add sveltekit-base image and build-app script for Mac Mini
- Add docker/Dockerfile.sveltekit-base: pre-built base with all 34 shared
  packages (mirrors nestjs-base pattern), eliminates redundant COPY/build
  steps from individual web Dockerfiles
- Add scripts/mac-mini/build-app.sh: stops monitoring stack before build
  to free RAM, auto-restarts on exit (trap cleanup)
- Migrate todo web Dockerfile to use sveltekit-base:local (47 COPY lines
  → 2, 4 build steps → 0)
- Update CD workflow to build sveltekit-base when deploying web apps

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 12:17:48 +01:00
Till JS
1052469397 feat(infra): extend Dockerfile validator to backends and services
Validator now checks 52 Dockerfiles (web + backend + service).
Fixed 10 missing COPYs across backends, services, and nestjs-base.
Generator also supports backend/service Dockerfiles with markers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 08:57:10 +01:00
Till JS
e3115b302d feat(infra): add Cloudflare fallback plan + self-hosted landing pages
Two infrastructure improvements for tech independence:

1. Cloudflare Fallback Documentation (docs/CLOUDFLARE_FALLBACK.md):
   - Plan B: WireGuard + Caddy on Hetzner VPS (€3.79/mo)
   - Complete Caddyfile with all 30+ subdomains
   - Step-by-step failover checklist (~15 min to switch)
   - Plan C: Direct IP with ISP

2. Self-Hosted Landing Pages (eliminates Cloudflare Pages dependency):
   - Nginx container (mana-infra-landings) on port 4400
   - Multi-site config: each subdomain → separate dist/ folder
   - Build script: scripts/mac-mini/build-landings.sh
   - Cloudflare Tunnel ingress rules for 10 landing page domains
   - Storage: /Volumes/ManaData/landings/ on external SSD
   - Domains: it, chats, pics, zitares, presis, clocks,
     manadeck, nutriphi, citycorners, docs

Migration path: Build landings locally, set Cloudflare DNS to
tunnel instead of Pages, then decommission CF Pages projects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 12:07:40 +01:00
Till JS
e06e8cca59 fix(infra): use postgres -c flags instead of config_file override
The config_file override replaced the entire default PostgreSQL config
including listen_addresses, breaking inter-container communication.
Use inline -c flags instead which only override specific parameters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:42:42 +01:00
Till JS
fcd7c82ce4 fix(infra): simplify PostgreSQL backup to pg_dumpall + pg_basebackup
pgBackRest as Docker sidecar was overly complex (needs shared WAL
directory, stanza management, special entrypoint). Replace with a
simpler, proven approach using native PostgreSQL tools:

Backup container (postgres:16-alpine):
- Hourly: pg_dumpall | gzip (all databases as SQL, ~2 day retention)
- Daily 03:00: pg_basebackup -Ft -z (physical backup, 30 day retention)
- Auto-cleanup of old backups
- Storage: /Volumes/ManaData/backups/postgres/

Also: Remove pgbackrest.conf, simplify postgresql.conf (remove WAL
archiving config, keep performance tuning + replication for basebackup)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:39:20 +01:00
Till JS
39526918a3 feat(infra): add pgBackRest for PostgreSQL Point-in-Time Recovery
Replace simple pg_dumpall with pgBackRest PITR backup system.
This enables recovery to any second, not just the last daily dump.

Configuration:
- docker/postgres/postgresql.conf: WAL archiving + performance tuning
  (shared_buffers=512MB, effective_cache_size=2GB for 16GB Mac Mini)
- docker/postgres/pgbackrest.conf: stanza config + retention policy

Docker (docker-compose.macmini.yml):
- postgres: mount custom config, enable WAL archiving
- postgres-backup: new pgBackRest container
  - Storage: /Volumes/ManaData/backups/pgbackrest
  - Retention: 4 full + 14 differential (~4 weeks)
  - Compression: Zstandard (zst)

Backup Schedule:
- 03:00 daily: Full backup
- Every 6h: Differential (changes since last full)
- Every hour: Incremental (changes since last backup)
- Continuous: WAL archiving (every 60s)

Documentation (docs/POSTGRES_BACKUP.md):
- Complete restore procedures (full, PITR, single DB)
- First-time setup instructions
- Monitoring and alerting integration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:18:33 +01:00
Till JS
169821de1a feat(monitoring): add LLM Grafana dashboard, Prometheus scraping, and alerts
Wire mana-llm service into the monitoring stack:

Prometheus (docker/prometheus/prometheus.yml):
- Add mana-llm scrape job (port 3025, 15s interval)
- Include mana-llm in ServiceDown alert expression

Alerts (docker/prometheus/alerts.yml):
- New llm_alerts group with 4 rules:
  - LLMServiceDown: mana-llm down > 1 min (critical)
  - LLMHighErrorRate: > 10% errors for 5 min (warning)
  - OllamaProviderDown: > 50% requests via Google fallback (warning)
  - LLMSlowResponses: p95 > 30s for 5 min (warning)

Grafana Dashboard (docker/grafana/dashboards/mana-llm.json):
- 6 stat panels: status, req/min, error rate, fallback rate, latency, tokens/min
- Requests by Provider (stacked area: Ollama vs Google vs OpenRouter)
- Tokens by Type (prompt vs completion)
- Latency Percentiles (p50, p90, p99)
- Latency by Provider comparison
- Requests by Model breakdown
- Errors by Type
- Google Fallback Rate over time (with threshold coloring)
- Provider Distribution pie chart (24h)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:16:27 +01:00
Till JS
1c5c2446f6 feat(citycorners): add city guide app for Konstanz with full monorepo integration
New project with three apps:
- Landing (Astro): static site with SVG illustrations, location data
- Backend (NestJS, port 3025): CRUD API for locations + favorites, Drizzle ORM, auth via mana-core-auth
- Web (SvelteKit, port 5196): Tailwind 4, PillNav, auth (login/register/SSO), Leaflet map, favorites with optimistic updates, theme/settings

Infrastructure: DB init SQL, setup-databases.sh, generate-env.mjs, root package.json scripts, Dockerfiles, docker-compose.macmini.yml (backend:3025, web:5022), Cloudflare wrangler.toml.

Branding: registered in shared-branding (AppId, APP_BRANDING, APP_ICONS, MANA_APPS, CitycornersLogo).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 10:56:26 +01:00