Removed:
- apps/manacore/ — three Svelte files were byte-identical duplicates of
the apps/mana/ versions, leftover from the 2025 rename. Untracked .env
files in the same dir were also cleared.
- 21 empty apps/*/apps/web-archived/ directories — leftover from the
unification move, never tracked in git.
- services/it-landing/ — empty directory, needlessly matched by the
services/* workspace glob.
- apps/news/apps/server-archived/ — empty.
Fixed:
- scripts/mac-mini/status.sh: COMPOSE_PROJECT_NAME fallback was still
manacore-monorepo from before the rename.
Documented:
- Root CLAUDE.md now describes apps/api/ (the @mana/api unified backend)
as a top-level peer to apps/mana/. It was completely missing from the
trimmed CLAUDE.md, which made the layout look frontend-only.
Two failures during the 2026-04-07 production outage triage were caused
not by the underlying outage itself but by `status.sh` and `health-check.sh`
hiding the broken state. Both scripts were hardened so the same outage
shape can't recur invisibly.
status.sh — compose-vs-running diff
The old script printed "X containers running / Y total" without
noticing that some compose-defined containers were never started in
the first place. The Mac Mini was running 37 of 42 declared
containers and the script reported "37 running" with no indication
of the gap — `mana-core-sync` and `mana-api-gateway` were silently
missing for hours.
New behaviour: read every service from `docker compose config`,
diff its `container_name` against `docker ps`, and report each
declared service whose container is not currently up. The same
outage state would have been flagged on the very first run.
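A minimal sketch of the diff logic (assuming every compose service declares an explicit `container_name`; `report_missing` is a hypothetical helper, not the actual script):

```shell
# report_missing DECLARED RUNNING
# DECLARED: newline-separated container names declared in compose
# RUNNING:  newline-separated names from `docker ps --format '{{.Names}}'`
report_missing() {
  local declared="$1" running="$2" name
  printf '%s\n' "$declared" | while IFS= read -r name; do
    [ -n "$name" ] || continue
    # flag any declared container that is not currently up
    printf '%s\n' "$running" | grep -qxF -- "$name" \
      || echo "MISSING: $name"
  done
}
```

Feeding it `docker compose config --format json | jq -r '.services[].container_name'` and `docker ps --format '{{.Names}}'` reproduces the check; the jq path is an assumption that holds only when each service sets container_name.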
health-check.sh — public-hostname walk via Cloudflare DNS
The old script probed ~50 hardcoded `localhost:<port>/health`
endpoints across Chat, Todo, Calendar, etc. — but the per-app
HTTP backends those endpoints expected don't exist anymore (the
ghost-API cleanup removed them entirely). Every probe returned
HTTP 000 / connection refused, generating a wall of false-positive
alerts that drowned out the real signal.
The block was replaced with a dynamic walk of every `hostname:`
entry in `~/.cloudflared/config.yml`. Each hostname is probed via
the public Cloudflare tunnel, so DNS gaps, missing tunnel routes,
502/530 origin failures and timeouts surface as failures the same
way real users would experience them. On its first run after the
cleanup it surfaced eighteen previously-invisible hostname failures
(no DNS, 502, or 530) — every one of them a real production issue.
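The walk reduces to extracting hostnames and probing each one; a sketch (the awk field positions assume the standard `ingress:` list layout of a cloudflared config):

```shell
# hostnames: print every `hostname:` value from a cloudflared config on stdin
hostnames() { awk '$1 == "hostname:" || $2 == "hostname:" {print $NF}'; }
```

Each extracted name can then be probed publicly, e.g. `hostnames < ~/.cloudflared/config.yml | while read -r h; do curl -s -o /dev/null -w "%{http_code} $h\n" "https://$h/"; done`, so 000/502/530 responses surface exactly as users would see them.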
DNS resolution intentionally goes through `dig +short HOST @1.1.1.1`
instead of the local resolver. The Mac Mini's home-router DNS keeps
a negative cache for hours after the first failed lookup, so newly
added CNAMEs (like the post-outage sync/media records) appeared as
"no response" from inside the script for hours even though external
users saw them resolve immediately. Asking Cloudflare's DNS directly
gives the script the same view the public internet has.
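The check itself boils down to classifying the `dig +short` output; a sketch, with `dns_status` as a hypothetical helper name:

```shell
# dns_status: read `dig +short HOST @1.1.1.1` output on stdin;
# any answer line means the record exists from the public internet's view
dns_status() { if grep -q .; then echo OK; else echo NO_DNS; fi; }
# usage: dig +short sync.mana.how @1.1.1.1 | dns_status
```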
The Matrix, Element, GPU-LAN-redundant and monitoring port-by-port
blocks were removed — the public-hostname walk covers all of them
via their `*.mana.how` hostnames going through the actual tunnel.
The "stuck container" detector now ignores `*-init` containers
(one-shot init pods, Exit 0 = success, intentionally never re-run).
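The ignore rule is just a name-pattern match; a sketch with a hypothetical helper:

```shell
# is_init NAME: one-shot init containers exit 0 by design and are never "stuck"
is_init() { case "$1" in *-init) return 0 ;; *) return 1 ;; esac; }
```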
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds end-to-end browser voice capture for the Memoro module, mirroring the
existing dreams pattern: MediaRecorder → SvelteKit server proxy → mana-stt
on the Windows GPU box via Cloudflare tunnel.
Recording UI lives in /memoro page header (mic button + live timer + cancel +
sticky-permission retry). Server proxy at /api/v1/memoro/transcribe forwards
the blob with the server-held X-API-Key. memosStore.createFromVoice creates a
placeholder memo with processingStatus='processing' and fires transcribeBlob
in the background, which writes the transcript and flips status on completion
(or 'failed' with error in metadata).
Also corrects the mana-stt hostname across the repo: stt-api.mana.how (which
never existed in DNS) → gpu-stt.mana.how (the actual Cloudflare tunnel route
to the Windows GPU box). Adds an ENVIRONMENT_VARIABLES.md section explaining
how to obtain MANA_STT_API_KEY and where the tunnel terminates. Adds tunnel
health probes to the mac-mini health-check script so we catch tunnel-side
breakage in addition to LAN-side.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous startup.sh checked colima status via `colima status | grep running`
and, if that failed, ran `colima stop --force` unconditionally before starting.
This is destructive: a transient status mis-detection can kill a healthy running
VM, and the subsequent start often hangs because of leftover locks/processes.
Triggered today during the ManaCore→Mana rename: reloading the docker-startup
LaunchAgent ran the script, which falsely concluded colima was down, killed the
running VM, and left 12 zombie limactl processes plus a stale disk lock symlink.
The whole production stack (incl. Forgejo) was offline until manual cleanup.
Changes:
- Use `docker info` as the readiness check instead of `colima status` —
it directly tests the thing we care about (docker socket reachable)
- Only do cleanup work when we actually need to start; never SIGKILL a
running VM as a "precaution"
- When we do need to start: reap any zombie limactl/colima processes from
prior failed runs, and clear the stale disk-in-use lock if no process
actually holds it
- Verify successful start with `docker info`, not `colima status`
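The readiness loop can be sketched as follows (retry count and sleep interval are illustrative; the probe command is passed as arguments so the same helper verifies both the initial check and the post-start state):

```shell
# wait_for_docker TRIES PAUSE CMD...: succeed once CMD does (e.g. `docker info`)
wait_for_docker() {
  local tries="$1" pause="$2"; shift 2
  local i=0
  while [ "$i" -lt "$tries" ]; do
    # probe the thing we actually care about: is the docker socket reachable?
    "$@" >/dev/null 2>&1 && { echo ready; return 0; }
    i=$((i + 1))
    sleep "$pause"
  done
  echo "docker not reachable after $tries checks" >&2
  return 1
}
# usage: wait_for_docker 30 5 docker info || exit 1
```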
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All web app subdomains (chat.mana.how, todo.mana.how, etc.) were removed
when the unified app launched, but monitoring configs still referenced them.
Update blackbox targets to use mana.how/route URLs, remove stale API backend
routes from cloudflared, clean up CORS origins, and fix status page generator
to handle route-based URLs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New GPU service for fast text-to-video generation using LTX-Video (~2B params)
on the RTX 3090. Generates 480p clips in 10-30 seconds, uses ~10GB VRAM.
Includes Cloudflare Tunnel route, Prometheus monitoring, and health checks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- inventar-web: fix mangled icon import in settings page
- skilltree-web: create missing lib/services/storage.ts for export/import
- startup.sh: add umami/synapse DB creation + synapse user setup with C locale
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevents the internal SSD from filling up if the external SSD is not
mounted or if `colima delete` wiped the datadisk symlink.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- check-disk-space.sh: always prune dangling images, unused volumes, and
build cache >7 days on every run (not just at critical threshold)
- check-disk-space.sh: auto-remove node_modules if found on server
(never needed — Docker builds inside containers)
- disk-check launchd: reduce interval from 60min to 15min to catch
disk issues faster (yesterday we hit 100% before the hourly check caught it)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- check-disk-space.sh now pushes mac_disk_used_percent + mac_colima_disk_used_gb
to Pushgateway every hour so vmalert can alert on real macOS disk usage
- alerts.yml: replace broken node-exporter disk alerts with Pushgateway-based ones
- master-overview.json: add "Recent Errors (Loki)" section with live error log
stream, error rate timeseries and top error sources barchart
- move-colima-to-external-ssd.sh: guided script to move 200GB Colima VM
datadisk from internal SSD to /Volumes/ManaData (3.6TB external SSD)
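The push itself is one line of Prometheus text exposition POSTed to the Pushgateway's `/metrics/job/<job>` endpoint; a sketch (the URL and job name are assumptions):

```shell
# gauge_payload NAME VALUE: one line of Prometheus text exposition format
gauge_payload() { printf '%s %s\n' "$1" "$2"; }
# gauge_payload mac_disk_used_percent 83 \
#   | curl -s --data-binary @- http://localhost:9091/metrics/job/mac_disk_check
```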
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
colima delete wipes the entire VM disk on every power cycle, forcing
full image rebuilds. colima stop --force is sufficient to clear stale
process state after a hard shutdown.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The startup script runs `colima delete` on hard shutdown recovery,
wiping the colima.yaml mount config. Then `colima start` only added
/Volumes/ManaData but forgot /Users/mana — causing all file bind-mounts
to appear as empty directories (VirtioFS can't see host files).
This was the root cause of Synapse/SearXNG/Alertmanager/Loki crashing
after the power outage. Now both mounts are always passed explicitly.
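Always passing both mounts explicitly can be sketched like this (using colima's `--mount dir:w` syntax; `colima_mount_args` is a hypothetical helper):

```shell
# colima_mount_args DIR...: build writable --mount flags for colima start
colima_mount_args() {
  local out="" d
  for d in "$@"; do out="$out --mount $d:w"; done
  printf '%s\n' "${out# }"
}
# colima start $(colima_mount_args /Users/mana /Volumes/ManaData)
```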
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Loki was already running but had no log shipper. Adds Promtail to collect
Docker logs from all 66 containers with automatic tier labeling (infra,
auth, core, app, matrix, games) and a Grafana Logs Explorer dashboard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Kill Docker Desktop if it auto-started
- Clean stale Colima state from hard shutdown (delete --force)
- Start Colima with VZ, 12GB RAM, VirtioFS
- Restore named volumes from backup if missing
- Start containers with --no-build to skip broken Dockerfiles
- Create missing databases
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Mac Mini has docker at /usr/local/bin/docker, not on PATH.
Use the same DOCKER_CMD pattern as build-app.sh.
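The pattern amounts to a sketch like this:

```shell
# resolve_docker: prefer docker on PATH, else the Mac Mini's install path
resolve_docker() { command -v docker || echo /usr/local/bin/docker; }
DOCKER_CMD="$(resolve_docker)"
```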
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
build-app.sh now checks available RAM before builds and only stops
monitoring containers when free memory is below 3 GB threshold.
New memory-baseline.sh script measures per-container and per-category
RAM usage for capacity planning.
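A sketch of the free-RAM check, assuming macOS `vm_stat` output and a 16 KiB page size (the page size varies by machine, so treat it as an assumption):

```shell
# free_gib: read `vm_stat` output on stdin, print whole GiB of free pages
free_gib() {
  awk -v page=16384 '/^Pages free/ {
    gsub("\\.", "", $3)                 # vm_stat suffixes counts with "."
    printf "%d\n", $3 * page / (1024 * 1024 * 1024)
  }'
}
# [ "$(vm_stat | free_gib)" -ge 3 ] || stop_monitoring_stack  # hypothetical helper
```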
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove set -e to prevent abort on non-critical errors
- Suppress tar errors for volatile TSDB files (VictoriaMetrics)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Deactivate Ollama, FLUX.2, and Telegram Bot LaunchAgents on Mac Mini
- Remove extra_hosts from mana-llm (no longer needs host.docker.internal)
- Update health-check.sh to monitor GPU server services instead of local
- Update status.sh to show GPU server status instead of native services
- Rewrite MAC_MINI_SERVER.md: remove ~400 lines of Ollama/FLUX/Bot docs,
add GPU server architecture diagram and deactivation notes
- Update CAPACITY_PLANNING.md with post-offload numbers (~80-150 peak users)
Mac Mini is now a pure hosting server (Web, API, DB, Sync).
All AI workloads run on GPU server (RTX 3090) via LAN.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Forgejo v11 on port 3041 (git.mana.how via Cloudflare Tunnel)
- Forgejo Runner for CI/CD (GitHub Actions compatible)
- Built-in Docker registry and LFS support
- Registration disabled (admin-only)
- SSH access on port 2222
- Go Services CI workflow (.forgejo/workflows/go-services.yml)
- Setup script: scripts/mac-mini/setup-forgejo.sh
Replaces GitHub dependency for CI/CD. GitHub can remain as
mirror/backup while Forgejo becomes the primary Git host.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Photos NestJS backend was a proxy to mana-media that enriched
responses with local album/favorite/tag data. Now:
- Albums store → local-first via albumCollection + albumItemCollection
- Favorites → local-first via favoriteCollection (toggle in IndexedDB)
- Photo tags → local-first via photoTagCollection
- Photo listing/stats → direct mana-media API calls from frontend
- Upload → direct mana-media upload from frontend
- Delete → direct mana-media delete from frontend
Removed 27 TypeScript files, 1 Docker container, 1 port (3039).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Presi NestJS backend (40 source files, 50 deps) was a CRUD wrapper
around decks, slides, and themes — all now handled by local-first sync.
Only the share-link feature requires server-side state (public URLs
without auth), so a minimal Hono + Bun server replaces the entire
NestJS backend:
- apps/presi/apps/server/ — Hono server with share routes + GDPR admin
Uses @manacore/shared-hono for auth (JWKS), health, admin, errors
- Web app API client stripped to share-only (was 270 lines → 90 lines)
- Removed from docker-compose, CI/CD, Prometheus, env generation
- NestJS backend deleted (40 TS files, 8 test specs, 3038 lines)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both apps are fully local-first via Dexie.js + mana-sync. Their NestJS
backends were pure CRUD wrappers (20 + 31 source files) that are no
longer needed.
Changes:
- Add packages/shared-hono: JWT auth via JWKS (jose), Drizzle DB factory,
health route, generic GDPR admin handler, error middleware
- Migrate zitare lists page from fetch() to listsStore (local-first)
- Rewrite clock timers store from API-based to timerCollection (Dexie)
- Update clock +layout.svelte CommandBar search to use local collections
- Remove zitare-backend + clock-backend from docker-compose, CI/CD,
Prometheus, env generation, setup scripts
- Add docs/TECHNOLOGY_AUDIT_2026_03.md with full repo analysis
Net result: -2 Docker containers, -2 ports, -2728 lines of code
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace 21 separate NestJS Matrix bot processes (~2.1 GB RAM, ~4.2 GB Docker images)
with a single Go binary using plugin architecture (8.6 MB binary, ~30 MB RAM).
New services:
- services/mana-matrix-bot/ — Go Matrix bot with 21 plugins (mautrix-go, Redis sessions)
- services/mana-api-gateway-go/ — Go API gateway (rate limiting, API keys, credit billing)
Deleted:
- 21 services/matrix-*-bot/ directories
- packages/bot-services/ and packages/matrix-bot-common/
- Legacy deploy scripts and CI build jobs
Updated:
- docker-compose.macmini.yml: new Go services, legacy bots removed
- CI/CD: change detection + build jobs for Go services
- Root package.json: new dev:matrix, build:matrix, test:matrix scripts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docker compose stop with service names can hang due to env var warnings.
Using docker stop/start with container names is more reliable.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add docker/Dockerfile.sveltekit-base: pre-built base with all 34 shared
packages (mirrors nestjs-base pattern), eliminates redundant COPY/build
steps from individual web Dockerfiles
- Add scripts/mac-mini/build-app.sh: stops monitoring stack before build
to free RAM, auto-restarts on exit (trap cleanup)
- Migrate todo web Dockerfile to use sveltekit-base:local (47 COPY lines
→ 2, 4 build steps → 0)
- Update CD workflow to build sveltekit-base when deploying web apps
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Full-stack achievement system for SkillTree with backend (NestJS) and frontend (SvelteKit):
- 26 achievements across 7 categories (XP, Skills, Levels, Activities, Streak, Branches, Special)
- 5 rarity tiers (Common → Legendary) with distinct styling
- Auto-unlock after XP gain, skill creation, and activity logging
- Celebration animation on unlock with sparkle effects
- Achievements page with category filters and progress tracking
- IndexedDB offline support with local condition evaluation
- Backend seeds achievements on startup, checks conditions after mutations
- Stats overview extended with achievement counter
- i18n translations (DE + EN)
Also adds docs/MONETIZATION_REPORT.md with ranked analysis of all apps.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two infrastructure improvements for tech independence:
1. Cloudflare Fallback Documentation (docs/CLOUDFLARE_FALLBACK.md):
- Plan B: WireGuard + Caddy on Hetzner VPS (€3.79/mo)
- Complete Caddyfile with all 30+ subdomains
- Step-by-step failover checklist (~15 min to switch)
- Plan C: Direct IP with ISP
2. Self-Hosted Landing Pages (eliminates Cloudflare Pages dependency):
- Nginx container (mana-infra-landings) on port 4400
- Multi-site config: each subdomain → separate dist/ folder
- Build script: scripts/mac-mini/build-landings.sh
- Cloudflare Tunnel ingress rules for 10 landing page domains
- Storage: /Volumes/ManaData/landings/ on external SSD
- Domains: it, chats, pics, zitares, presis, clocks,
manadeck, nutriphi, citycorners, docs
Migration path: Build landings locally, set Cloudflare DNS to
tunnel instead of Pages, then decommission CF Pages projects.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add GlitchTip to health-check.sh monitoring endpoints
- Add native disk space checks for / and /Volumes/ManaData with 80%/90% thresholds
- Extend Prometheus disk alerts to include /host_mnt/Volumes/ManaData mountpoint
- Add ManaData disk usage gauge to Grafana system-overview dashboard
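The native check reduces to parsing `df` and bucketing the used percentage against the two thresholds; a sketch with an illustrative helper name:

```shell
# disk_severity PCT: bucket a used-percent value against the 80/90 thresholds
disk_severity() {
  if [ "$1" -ge 90 ]; then echo critical
  elif [ "$1" -ge 80 ]; then echo warning
  else echo ok
  fi
}
# used=$(df -P /Volumes/ManaData | awk 'NR==2 {gsub("%","",$5); print $5}')
# disk_severity "$used"
```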
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instrument the CD pipeline to record per-deploy and per-service metrics
(build time, image size, startup time, health status) into PostgreSQL and
push gauges to Pushgateway. Adds a Grafana dashboard with 13 panels covering
deploy frequency, build performance, service health, and history.
New files:
- scripts/mac-mini/init-deploy-tracking.sql (idempotent DDL)
- scripts/deploy-metrics.sh (bash library for CI)
- docker/grafana/provisioning/datasources/deploy-tracking.yml
- docker/grafana/dashboards/deploy-tracking.json
Modified:
- docker/prometheus/prometheus.yml (pushgateway scrape job)
- .github/workflows/cd-macmini.yml (build/health instrumentation)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Medium priority stability improvements:
Alerting:
- Add vmalert for evaluating Prometheus alert rules
- Add alertmanager for alert routing and grouping
- Add alert-notifier service for Telegram/ntfy notifications
- Enable cadvisor scraping in prometheus config
Disk Monitoring:
- Add check-disk-space.sh for hourly disk monitoring
- Alert on 80% (warning) and 90% (critical) thresholds
- Auto-cleanup Docker when disk is critical
- Add com.manacore.disk-check.plist for LaunchD
Weekly Reports:
- Add weekly-report.sh for system health summary
- Includes: backup status, disk usage, container health,
database stats, error log summary
- Runs every Sunday at 10 AM via LaunchD
Health Check Updates:
- Add checks for vmalert, alertmanager, alert-notifier
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
High priority stability features:
- Add all LaunchD plists to Git for version control
- Handle crash-looping containers (Restarting status) in ensure-containers.sh
- Add database backup script with daily/weekly rotation
- Add Docker log rotation setup (50MB max, 3 files per container)
New files:
- scripts/mac-mini/backup-databases.sh - Daily pg_dump with rotation
- scripts/mac-mini/setup-docker-logging.sh - Configure daemon.json
- scripts/mac-mini/launchd/*.plist - All 8 LaunchD service configs
- scripts/mac-mini/launchd/README.md - Documentation
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Disable api-gateway and skilltree-web (no working images/Dockerfiles)
- Fix mana-search Dockerfile healthcheck port and endpoint
- Update health-check.sh to skip disabled services
- Fix search service health endpoint (/api/v1/health)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ensure-containers-running.sh to detect and auto-start stuck containers
- Add LaunchD plist for automatic container health checks every 5 minutes
- Update health-check.sh with correct ports (3031/5011 for todo, etc.)
- Update deploy.sh health checks to match docker-compose.macmini.yml
- Fix container name references (mana-infra-postgres instead of manacore-postgres)
This prevents 502 errors when containers get stuck in "Created" status.
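Detection is a status filter over `docker ps -a` output; a sketch (the restart pipeline in the comment is illustrative):

```shell
# stuck_created: read `docker ps -a --format '{{.Names}} {{.Status}}'` lines,
# print names whose status is exactly "Created" (defined but never started)
stuck_created() { awk '$2 == "Created" {print $1}'; }
# docker ps -a --format '{{.Names}} {{.Status}}' | stuck_created | xargs -r docker start
```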
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move queue name constants to separate file (queue-names.ts) to avoid
circular dependency between queue.module.ts and processor files.
The @Processor decorator evaluates at module load time, and importing
constants from queue.module.ts created a circular dependency that
resulted in undefined queue names.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- NestJS bot that converts text messages to speech via mana-tts
- Commands: !voice, !voices, !speed, !status, !help
- User settings stored in-memory (voice, speed per user)
- Docker config for Mac Mini deployment
- Setup script for bot registration
Co-Authored-By: Claude <noreply@anthropic.com>
Add internationalization (DE + EN) to previously missing apps:
- todo: task management translations
- skilltree: skill/XP system translations
- nutriphi: nutrition tracking translations
- planta: plant care translations
- questions: research app translations
- matrix: chat client translations (layout integration)
Each app includes:
- svelte-i18n setup with SSR support
- localStorage persistence ({app}_locale pattern)
- i18n loading state in +layout.svelte
- German (default) and English translations
Updated CONSISTENCY_REPORT.md to mark i18n task as complete.
Also includes:
- mana-tts service placeholder files