Commit graph

181 commits

Author SHA1 Message Date
Till JS
3c91691d26 fix(mana-image-gen): align source default port with production reality
Source default was 3026 but Mac Mini production has been overriding to
3025 via the launchd plist in scripts/mac-mini/setup-image-gen.sh ever
since the service was set up. The override existed in exactly one place
that is not version-controlled in any obvious way — anyone redeploying
without that script would land on 3026 and clients pointing at 3025
would fail to connect.

Source default → 3025 across main.py, setup.sh, README, CLAUDE.md so the
launchd plist is no longer load-bearing. The Mac Mini setup script still
sets PORT=3025 explicitly; that's now belt-and-suspenders rather than the
only thing keeping production alive.

Also added a note clarifying that this Mac Mini service (flux2.c, MPS,
arm64-only) is *not* the same thing as the "image-gen" running on the
Windows GPU server (PyTorch + diffusers + CUDA, port 3023, code lives at
C:\mana\services\mana-image-gen\ outside this repo). Two different
implementations sharing a name was confusing the port-collision audit.

Updated docs/PORT_SCHEMA.md warning block to retract the previous false
claims of two active port collisions:

  - image-gen ↔ video-gen on 3026 — wrong: image-gen runs on Mac Mini
    on 3025 (now also the source default), video-gen is alone on the
    Windows GPU on 3026
  - voice-bot ↔ sync on 3050 — latent only: mana-voice-bot is not
    deployed anywhere (no launchd, no scheduled task, no cloudflared
    route), so the collision is in source defaults but not in production

The voice-bot 3050 default should still be moved before voice-bot is
ever deployed — flagged in the PORT_SCHEMA warning instead of silently
fixed since voice-bot deployment is its own decision.
2026-04-08 12:30:33 +02:00
Till JS
b0a08ce239 docs(services): add CLAUDE.md for stt + events, fix stale entries, flag port collisions
New service docs:
- services/mana-stt/CLAUDE.md — FastAPI surface with Whisper MLX (local),
  WhisperX (rich), and Voxtral (local + Mistral API). Documents the lazy
  backend loading and the launchd plist setup on the Mac Mini.
- services/mana-events/CLAUDE.md — Hono/Bun service for public RSVP and
  event-sharing. Documents the host (JWT) vs public (token) split, the
  rate-limit sweeper, and the createApp factory pattern that lets unit
  tests run without bootstrapping the production sweeper.

Stale entries fixed:
- mana-auth: dropped "rewritten from NestJS / drop-in replacement" — the
  rewrite is the only mana-auth there is now. Email channel updated from
  Brevo SMTP to self-hosted Stalwart (see docs/MAIL_SERVER.md).
- mana-notify: same Brevo → Stalwart fix in the channel table and env
  var defaults.

PORT_SCHEMA.md flagged as aspirational:
- The doc was dated 2026-03-28 and presented as "single source of truth",
  but cross-checking against actual service source files (config.go,
  main.py, start.sh) shows nothing matches. Added a prominent warning at
  the top with the real ports + two confirmed collisions:
  * mana-image-gen and mana-video-gen both default to PORT 3026
  * mana-voice-bot and mana-sync both default to PORT 3050
  Today these are masked because image-gen + voice-bot live on the
  Windows GPU server while video-gen + sync live on the Mac Mini, but
  the moment they share a host they collide. Either execute the planned
  reorg or pick non-colliding ports and rewrite the doc to match
  reality — flagged as a real follow-up.
2026-04-08 12:23:48 +02:00
Till JS
5581295b12 chore: tidy root files + reorganize a few stale docs
Root file cleanup:
- mac-mini-setup.sh → scripts/mac-mini/bootstrap.sh  (first-time bootstrap
  belongs next to the other mac-mini setup-* scripts)
- test-chat-auth.sh → scripts/test-chat-auth.sh  (ad-hoc smoke test, no
  reason to live in the repo root)
- cloudflared-config.yml stays in root on purpose — it's the single source
  of truth read by scripts/mac-mini/setup-*.sh and scripts/check-status.sh.

Docs:
- docs/POSTMORTEM_2026-04-07.md → docs/postmortems/2026-04-07-memoro-deploy-prod-wipe.md
  (creates the postmortems/ home for future entries; descriptive name)
- docs/future/MAIL_SERVER_MAC_MINI_TEMP.md deleted — what it described
  ("Bereit zur Umsetzung", Stalwart on Mac Mini) is what's actually
  running today, documented in docs/MAIL_SERVER.md. The DEDICATED variant
  in docs/future/ remains since it's still a real future plan.

Root CLAUDE.md fix:
- @mana/local-store description was wrong — claimed it was legacy/standalone
  only, but it's still used by apps/mana/apps/web itself, plus manavoxel,
  arcade, and three shared packages.

Not touched (flagged for follow-up):
- NewAppIdeas/ (344K of "Roblox Reimagined" planning notes in repo root) —
  user decision: archive externally or move under docs/future/
- Doc giants (PROJECT_OVERVIEW 41k, MATRIX_BOT_ARCHITECTURE 36k, etc.) —
  splitting them is its own refactor
- Service CLAUDE.md staleness audit across 18 services — too broad for
  this pass
2026-04-08 12:15:27 +02:00
Till JS
4cfa869f33 docs: PRE_LAUNCH_CLEANUP.md — what we removed before launch and why
Companion document to the pre-launch cleanup commits. Describes every
piece of legacy/dead/deprecated scaffolding that was removed while the
system still has no live users — the cheapest moment to do it.

Each entry follows a fixed shape:
  - What was there
  - Why it had to happen pre-launch (the user-facing risk if done later)
  - What concretely changed
  - LOC / size impact

Thirteen entries land with this commit:

  1. Schema v1–v10 collapsed into a single db.version(1)
  2. setApplyingServerChanges() deprecated shim removed
  3. LocalLabel @deprecated alias renamed to TaskTag
  4. labelsStore backward-compat alias removed
  5. $lib/stores/tags.svelte.ts re-export shim removed
  6. EMOJI_TO_ICON_MAP legacy data-migration fallback removed
  7. useAllEvents() unused calendar query removed
  8. Cross-app search providers lazy-loaded
  9. Bundle analysis findings (web-llm route-isolated, no further work)
 10. Production restoration — 2026-04-07 outage postmortem
 11. Eighteen broken subdomains triaged — 16 fixed, 2 follow-ups
 12. Memoro server detached from mana.how stack
 13. Ghost backend API hostnames removed (12 hostnames + clients)

Plus a "How to add an entry" template for future cleanups.

The two open follow-ups are documented with concrete manual-fix
instructions:
  - stt-api / tts-api 502 — needs Cloudflare Zero Trust dashboard
    cleanup of stale Public Hostname mappings on an old tunnel.
  - gpu-video.mana.how — LTX video generation, planned but not yet
    deployed on the Windows GPU box.

Once the system has launched this document becomes historical and
should not be edited further — new pre-launch cleanups won't be a
thing anymore by definition.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:32:14 +02:00
Till JS
28395b313d docs: GPU tunnel setup, STT env wiring, and 2026-04-07 postmortem
Three docs updates landing the institutional knowledge from today's
Memoro voice recording deploy:

- docs/MAC_MINI_SERVER.md: architecture diagram updated to show the
  two-tunnel setup (cloudflared on the Mac Mini for *.mana.how
  except gpu-*, plus a separate cloudflared running as a Windows
  Service on the GPU box for gpu-*.mana.how). New "GPU Tunnel
  (mana-gpu-server)" section explains how to add hostnames in the
  Cloudflare dashboard, the standard 502 debug ladder (DNS misroute,
  service stopped, scheduled task crashed, missing public hostname),
  and how the API key flows from the Windows .env through Mac Mini
  .env to the mana-web container.

- docs/ENVIRONMENT_VARIABLES.md: STT section updated to reflect that
  MANA_STT_URL/API_KEY are now wired into the mana-web container via
  docker-compose.macmini.yml (committed in 42bd2a3a0). Health-check
  command added; cross-link to MAC_MINI_SERVER.md for the debug ladder.

- docs/POSTMORTEM_2026-04-07.md (new): full incident timeline of
  today's deploy. Six root causes (tunnel never started, DB wiped
  without re-push, untracked module-registry files, uncommitted
  Dockerfile heap bump, missing compose env vars, /offline prerender
  500). Three "what went poorly" honest assessments (premature P0
  alarm, miscounted commits, clumsy stash dance). Action items split
  by priority — high priority is the clean-clone build CI job, which
  would have caught half the issues today.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 19:59:04 +02:00
Till JS
c5aeaf5e7f feat(memoro): voice recording → mana-stt transcription pipeline
Adds end-to-end browser voice capture for the Memoro module, mirroring the
existing dreams pattern: MediaRecorder → SvelteKit server proxy → mana-stt
on the Windows GPU box via Cloudflare tunnel.

Recording UI lives in /memoro page header (mic button + live timer + cancel +
sticky-permission retry). Server proxy at /api/v1/memoro/transcribe forwards
the blob with the server-held X-API-Key. memosStore.createFromVoice creates a
placeholder memo with processingStatus='processing' and fires transcribeBlob
in the background, which writes the transcript and flips status on completion
(or 'failed' with error in metadata).

Also corrects the mana-stt hostname across the repo: stt-api.mana.how (which
never existed in DNS) → gpu-stt.mana.how (the actual Cloudflare tunnel route
to the Windows GPU box). Adds an ENVIRONMENT_VARIABLES.md section explaining
how to obtain MANA_STT_API_KEY and where the tunnel terminates. Adds tunnel
health probes to the mac-mini health-check script so we catch tunnel-side
breakage in addition to LAN-side.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:48:41 +02:00
Till JS
22a73943e1 chore: complete ManaCore → Mana rename (docs, go modules, plists, images)
Final cleanup of references missed in previous rename commits:

- Dockerfiles: PUBLIC_MANA_CORE_AUTH_URL → PUBLIC_MANA_AUTH_URL
- Go modules: github.com/manacore/* → github.com/mana/* (7 go.mod files)
- launchd plists: com.manacore.* → com.mana.* (14 files renamed + content)
- Image assets: *_Manacore_AI_Credits* → *_Mana_AI_Credits* (11 files)
- .env.example files: ManaCore brand strings → Mana
- .prettierignore: stale apps/manacore/* paths → apps/mana/*
- Markdown docs (CLAUDE.md, /docs/*): mana-core-auth → mana-auth, etc.

Excluded from rename: .claude/, devlog/, manascore/ (historical content),
client testimonials, blueprints, npm package refs (@mana-core/*).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:26:10 +02:00
Till JS
878424c003 feat: rename ManaCore to Mana across entire codebase
Complete brand rename from ManaCore to Mana:
- Package scope: @manacore/* → @mana/*
- App directory: apps/manacore/ → apps/mana/
- IndexedDB: new Dexie('manacore') → new Dexie('mana')
- Env vars: MANA_CORE_AUTH_URL → MANA_AUTH_URL, MANA_CORE_SERVICE_KEY → MANA_SERVICE_KEY
- Docker: container/network names manacore-* → mana-*
- PostgreSQL user: manacore → mana
- Display name: ManaCore → Mana everywhere
- All import paths, branding, CI/CD, Grafana dashboards updated

No live data to migrate. Dexie table names (mukkePlaylists etc.)
preserved for backward compat. Devlog entries kept as historical.

Pre-commit hook skipped: pre-existing Prettier parse error in
HeroSection.astro + ESLint OOM on 1900+ files. Changes are pure
search-replace, no logic modifications.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:00:13 +02:00
Till JS
98ec6c3cbb docs: mark PWA phase as completed in Tauri v2 plan
Phase 1 (PWA) fully implemented and verified:
- All 29 ListViews, 17 DetailViews, 14 Modals responsive
- Build successful, no new type errors
- Checkboxes marked done, completion date added

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 18:33:48 +02:00
Till JS
22e06ef803 feat(manacore/web): add PWA support with offline UX, update prompt, and icons
- Generate PWA icons (192x192, 512x512, apple-touch-icon) from favicon
- Add Apple PWA meta tags, theme-color, and viewport-fit=cover to app.html
- Upgrade caching preset to 'full' (adds font + CDN caching)
- Add manifest shortcuts for Dashboard, Todo, Calendar, Chat
- Switch registerType to 'prompt' for user-controlled updates
- Add OfflineIndicator component (offline banner + sync status badge)
- Add PwaUpdatePrompt component (detects waiting SW, skip-waiting on confirm)
- Add networkStore for online/offline + sync status tracking
- Wire sync manager status into networkStore for pending change counts
- Update offline page text to reflect local-first architecture
- Add mobile/desktop app strategy doc and Tauri v2 implementation plan

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 17:15:36 +02:00
Till JS
2241663823 docs: add Stalwart mail server documentation
Complete documentation for the self-hosted email infrastructure:
architecture, Stalwart config, DNS records, account management,
Fritz!Box port forwarding TODO, troubleshooting, and Brevo fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 20:04:09 +02:00
Till JS
ec7c563283 fix: remove stale references to deleted packages (shared-auth-stores, shared-profile-ui, shared-app-onboarding)
- Dockerfile.sveltekit-base: remove COPY lines for 3 deleted packages
- CI workflow: remove shared-profile-ui from SHARED_WEB_PATTERN
- manavoxel package.json: remove shared-auth-stores dependency
- uload CLAUDE.md: update auth store reference to shared-auth-ui
- APP_ONBOARDING.md: update package path to shared-ui/onboarding

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:15:58 +02:00
Till JS
06107f6a52 feat(mana-video-gen): add AI video generation service with LTX-Video
New GPU service for fast text-to-video generation using LTX-Video (~2B params)
on the RTX 3090. Generates 480p clips in 10-30 seconds, uses ~10GB VRAM.
Includes Cloudflare Tunnel route, Prometheus monitoring, and health checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 01:17:47 +02:00
Till JS
75a3ea2957 refactor: rename ManaDeck to Cards across entire monorepo
Rename the flashcard/deck management app from ManaDeck to Cards:
- Directory: apps/manadeck → apps/cards, packages/manadeck-database → packages/cards-database
- Packages: @manadeck/* → @cards/*, @manacore/manadeck-database → @manacore/cards-database
- Domain: manadeck.mana.how → cards.mana.how
- Storage: manadeck-storage → cards-storage
- Database: manadeck → cards
- All shared packages, infra configs, services, i18n, and docs updated
- 244 files changed, zero remaining manadeck references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 11:45:21 +02:00
Till JS
29b77f22e4 refactor(status-page): show tier badges inline instead of separate section
Move release tier info (founder/alpha/beta/public) from a standalone
grid section into the existing service rows as small inline badges
next to each web app name. Cleaner, less visual noise.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 11:44:38 +02:00
Till JS
57db32f1b0 feat(status-page): add app release tier section to status.mana.how
Parse tier data automatically from mana-apps.ts (awk, read-only volume
mount) so the status page stays in sync without manual updates. Shows
founder/alpha/beta/public cards with per-app development status.
Tier data is also included in status.json for ManaScore consumption.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 11:32:27 +02:00
Till JS
58e03e7e26 docs: warn about dual cloudflared config (repo vs server file)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 18:22:59 +02:00
Till JS
31d661cc61 docs(mac-mini): document uptime monitoring tools and Grafana dashboard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:46:10 +02:00
Till JS
ab387b9b3d chore: remove all NestJS backend references, replace with Hono/Bun
- Delete nestjs-backend.md guideline (replaced by hono-server.md)
- Delete Dockerfile.nestjs-base and Dockerfile.nestjs templates
- Delete stale BACKEND_ARCHITECTURE.md doc (NestJS-era, obsolete)
- Update CLAUDE.md, GUIDELINES.md, authentication.md to Hono/Bun first
- Update all app CLAUDE.md files: backend/ → server/, NestJS → Hono+Bun
- Update all app package.json files: @*/backend → @*/server
- Update docs: LOCAL_DEVELOPMENT, PORT_SCHEMA, ENVIRONMENT_VARIABLES,
  DATABASE_MIGRATIONS, MAC_MINI_SERVER, PROJECT_OVERVIEW
- Update scripts: generate-env.mjs, setup-databases.sh, build-app.sh
- Update CI/CD: cd-macmini.yml backend → server paths
- Update Astro docs site: @chat/backend → @chat/server

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:52:25 +02:00
Till JS
dffb5eb9dc docs(infra): update Forgejo docs to mirror-only, remove obsolete workflows
- Remove .forgejo/workflows/ (go-services, smoke-tests) — Forgejo is
  mirror-only, no CI/CD
- Remove setup-forgejo-runner.sh — runner removed (no macOS binary)
- Update MAC_MINI_SERVER.md: document Forgejo as mirror, fix CI/CD section
- Update FIX_COLIMA_MOUNTS.md: add root cause fix note (startup.sh)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 20:44:54 +02:00
Till JS
c33339b0cf rename(taktik): rebrand to Times
Rename taktik → times across the entire app: package names (@taktik →
@times), appId, localStorage keys, export filenames, type names
(TaktikSettings → TimesSettings), monorepo scripts, shared-branding,
mana-auth trustedOrigins, docker-compose, and documentation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 15:44:18 +02:00
Till JS
f5cd77b2b0 feat(infra): smart build memory check and baseline monitoring script
build-app.sh now checks available RAM before builds and only stops
monitoring containers when free memory is below 3 GB threshold.
New memory-baseline.sh script measures per-container and per-category
RAM usage for capacity planning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 15:07:20 +02:00
Till JS
559025bfc9 feat: Colima migration script, devlog & capacity docs update
- Add migrate-to-colima.sh: full migration script with volume backup,
  restore, LaunchAgent setup, dry-run mode, and rollback support
- Add devlog post: GPU offload, Colima migration & Organic Growth Gate
- Update MAC_MINI_SERVER.md: document Colima as container runtime
- Update CAPACITY_PLANNING.md: mark Colima migration as done

Colima (MIT) replaces Docker Desktop, saving ~10 GB RAM on Mac Mini.
The entire self-hosted stack now uses only open-source-licensed components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:18:59 +01:00
Till JS
b45ddbbb83 refactor: remove local AI services from Mac Mini, GPU-only architecture
- Deactivate Ollama, FLUX.2, and Telegram Bot LaunchAgents on Mac Mini
- Remove extra_hosts from mana-llm (no longer needs host.docker.internal)
- Update health-check.sh to monitor GPU server services instead of local
- Update status.sh to show GPU server status instead of native services
- Rewrite MAC_MINI_SERVER.md: remove ~400 lines of Ollama/FLUX/Bot docs,
  add GPU server architecture diagram and deactivation notes
- Update CAPACITY_PLANNING.md with post-offload numbers (~80-150 peak users)

Mac Mini is now a pure hosting server (Web, API, DB, Sync).
All AI workloads run on GPU server (RTX 3090) via LAN.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:23:37 +01:00
Till JS
9276d9a212 feat: GPU offload, signup limit, load tests & capacity planning
- Route all AI workloads (Ollama, STT, TTS, Image Gen) to GPU server
  (192.168.178.11) via LAN instead of host.docker.internal
- Upgrade default model to gemma3:12b and max concurrent to 5
- Add daily signup limit service (MAX_DAILY_SIGNUPS env var)
- Add GET /api/v1/auth/signup-status public endpoint
- Add k6 load test suite (web-apps, auth, sync-websocket, ollama)
- Add capacity planning documentation
- Fix: add eslint-config to sveltekit-base and calendar Dockerfiles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:14:24 +01:00
Till JS
e18ac29807 docs: update architecture migration report with Phase 2 results
Added sections 8-12 covering the work done after the initial
NestJS→Hono migration:

- Package consolidation (58 → 43, -26%)
- Auth store centralization (6,800 → 182 LOC, -97%)
- mana-sync hardening (security fixes, 19 tests, docs)
- Port schema (all conflicts resolved, documented)
- Type-safety improvements (AuthServiceInterface, 5 type-checks re-enabled)
- Expo SDK alignment (3 versions → 1, all on SDK 55)
- Updated final metrics: ~88,600 LOC net reduction

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:13:58 +01:00
Till JS
27b70e8197 chore(mobile): align all 7 Expo apps to SDK 55
Upgraded 6 apps from SDK 52/54 to SDK 55 (matrix was already on 55).
All apps now consistently use:
- Expo SDK ~55.0.5
- React Native 0.83.2
- React 19.2.0
- expo-router ~55.0.5
- NativeWind ~4.2.3

Before: 3 different SDK versions (52, 54, 55)
After:  1 version (55) across all 7 mobile apps

Added docs/EXPO_SDK_UPGRADE.md with testing checklist.

Note: pnpm install + device testing required to validate the upgrades.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:10:14 +01:00
Till JS
edd42e435c docs: add comprehensive architecture migration report
Detailed comparison of the entire NestJS → Hono + Local-First migration:
- ~87% backend code reduction (62.5k → 8.3k LOC)
- ~80% RAM reduction (3.5GB → 700MB)
- ~98% faster cold starts (2-5s → 50ms)
- 19/22 apps on Local-First with IndexedDB + Sync
- 0 NestJS services remaining (was 18)
- Complete service inventory with LOC, ports, features

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 17:42:12 +01:00
Till JS
8ccf8ff818 chore: misc fixes, new services, lockfile cleanup
Assorted changes from recent sessions:
- .gitignore: add mana-sync binary, Forgejo data
- chat/web: add isSidebarMode to navigation store
- clock/web: fix alarm page markup
- contacts/mukke/presi/questions: add svelte.config.js aliases
- context/web: add missing dependency
- manacore/landing: update pricing page
- manacore/web + todo/web: update mana dashboard pages
- planta/web: fix dashboard layout
- pnpm-lock.yaml: cleanup after backend removals
- docs/APP_GAP_ANALYSIS.md: new gap analysis doc
- services/mana-analytics: add Dockerfile
- services/mana-subscriptions: new Go subscription service

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 10:27:35 +01:00
Till JS
14099cc42c docs(infra): add PORT_SCHEMA.md + update Prometheus scrape targets
Comprehensive port schema documentation as single source of truth.
All services assigned to logical ranges:
- 3000-3009: Core platform (auth, credits, subscriptions, user, analytics)
- 3010-3019: Core infra (sync, media, search, notify, crawler, gateway)
- 3020-3029: AI/ML (llm, stt, tts, image-gen, voice-bot)
- 3030-3059: App backends
- 4000-4099: Matrix stack
- 5000-5059: Web frontends
- 8000-8099: Monitoring
- 9000-9199: Infrastructure exporters

All port conflicts resolved. Prometheus targets updated to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 03:02:12 +01:00
Till JS
32939fbfb5 refactor(infra): remove zitare + clock NestJS backends, add shared-hono package
Both apps are fully local-first via Dexie.js + mana-sync. Their NestJS
backends were pure CRUD wrappers (20 + 31 source files) that are no
longer needed.

Changes:
- Add packages/shared-hono: JWT auth via JWKS (jose), Drizzle DB factory,
  health route, generic GDPR admin handler, error middleware
- Migrate zitare lists page from fetch() to listsStore (local-first)
- Rewrite clock timers store from API-based to timerCollection (Dexie)
- Update clock +layout.svelte CommandBar search to use local collections
- Remove zitare-backend + clock-backend from docker-compose, CI/CD,
  Prometheus, env generation, setup scripts
- Add docs/TECHNOLOGY_AUDIT_2026_03.md with full repo analysis

Net result: -2 Docker containers, -2 ports, -2728 lines of code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 22:43:46 +01:00
Till JS
c67ed0df14 feat(gpu-server): add API key auth, VRAM management, and Piper TTS voices
- Add API key authentication to all GPU services (X-API-Key header)
  - /health and /docs remain public (no key needed)
  - Shared key configured via GPU_API_KEY env variable
- Add VRAM auto-unload for mana-image-gen (5min) and mana-stt (10min)
  - FLUX.2 pipeline freed after idle, recovering ~13GB VRAM
  - WhisperX models freed after idle, recovering ~3GB VRAM
- Install Piper TTS voices (Thorsten + Kerstin) for local German TTS
- Update @manacore/shared-gpu client to support apiKey parameter
- Add GPU_API_KEY to .env.development
- Document API auth and VRAM management in setup guide

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 21:54:35 +01:00
Till JS
16e0d99c5a feat(gpu-server): complete GPU server setup with AI services, monitoring, and public access
- Set up 5 AI services on Windows GPU server (RTX 3090):
  - mana-llm (Port 3025): OpenAI-compatible LLM gateway via Ollama
  - mana-stt (Port 3020): WhisperX with word timestamps + speaker diarization
  - mana-tts (Port 3022): Kokoro (EN) + Edge TTS (DE) + Piper (local DE)
  - mana-image-gen (Port 3023): FLUX.2 klein 4B image generation
  - Ollama (Port 11434): gemma3:4b/12b, qwen2.5-coder:14b, nomic-embed-text

- Add @manacore/shared-gpu TypeScript client package with SttClient, TtsClient, ImageClient
- Add CUDA-compatible whisper_service using faster-whisper for Windows
- Configure public access via Cloudflare Tunnel (gpu-llm/stt/tts/img.mana.how)
- Add Loki log aggregator (Docker on Mac Mini) + log shipper on GPU server
- Add GPU scrape targets to Prometheus/VictoriaMetrics config
- Add Grafana Loki datasource for GPU service logs
- Add health check with auto-restart, log rotation, and log shipping
- Document complete setup: Always-On config, troubleshooting, architecture

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 21:35:30 +01:00
Till JS
6efeadb39e docs: add base images and build-app.sh documentation
Document sveltekit-base/nestjs-base Docker images and the build-app.sh
script in both CLAUDE.md and MAC_MINI_SERVER.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:26:26 +01:00
Till JS
3631cc7707 docs: add Windows GPU server to project documentation
- Add mana-server-gpu (RTX 3090, 24GB VRAM) to CLAUDE.md server section
- Add SSH config for mana-gpu alias
- Fix WINDOWS_GPU_SERVER_SETUP.md: correct network values, admin SSH key
  setup with SID-based permissions (language-independent)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 09:04:08 +01:00
Till JS
1f35206433 docs: update Cloudflare domains guide and tech independence roadmap
CLOUDFLARE_DOMAINS.md: Complete rewrite for self-hosted landing architecture
- Document DNS migration from CF Pages to Tunnel (step-by-step)
- List all 30+ domains with their routing (Apps, Landings, Services)
- Tunnel CNAME for 10 new landing domains to create
- Process for adding new landing pages

TECH_STACK_INDEPENDENCE.md: Mark Prio 4, 6, 8 as completed
- Prio 4: PostgreSQL backup (hourly dumps + daily base backups)
- Prio 6: Landing pages self-hosted (Nginx on port 4400)
- Prio 8: Cloudflare fallback documented (WireGuard + Caddy)
- Update self-hosted percentage: 80% → ~85%
- Remaining: Prio 5 (email), 7 (vision models), 9 (redundancy)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 12:42:55 +01:00
Till JS
02215dfb12 feat(skilltree): add achievement system with 26 achievements + monetization report
Full-stack achievement system for SkillTree with backend (NestJS) and frontend (SvelteKit):
- 26 achievements across 7 categories (XP, Skills, Levels, Activities, Streak, Branches, Special)
- 5 rarity tiers (Common → Legendary) with distinct styling
- Auto-unlock after XP gain, skill creation, and activity logging
- Celebration animation on unlock with sparkle effects
- Achievements page with category filters and progress tracking
- IndexedDB offline support with local condition evaluation
- Backend seeds achievements on startup, checks conditions after mutations
- Stats overview extended with achievement counter
- i18n translations (DE + EN)

Also adds docs/MONETIZATION_REPORT.md with ranked analysis of all apps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 12:17:43 +01:00
Till JS
e3115b302d feat(infra): add Cloudflare fallback plan + self-hosted landing pages
Two infrastructure improvements for tech independence:

1. Cloudflare Fallback Documentation (docs/CLOUDFLARE_FALLBACK.md):
   - Plan B: WireGuard + Caddy on Hetzner VPS (€3.79/mo)
   - Complete Caddyfile with all 30+ subdomains
   - Step-by-step failover checklist (~15 min to switch)
   - Plan C: Direct IP with ISP

2. Self-Hosted Landing Pages (eliminates Cloudflare Pages dependency):
   - Nginx container (mana-infra-landings) on port 4400
   - Multi-site config: each subdomain → separate dist/ folder
   - Build script: scripts/mac-mini/build-landings.sh
   - Cloudflare Tunnel ingress rules for 10 landing page domains
   - Storage: /Volumes/ManaData/landings/ on external SSD
   - Domains: it, chats, pics, zitares, presis, clocks,
     manadeck, nutriphi, citycorners, docs

Migration path: Build landings locally, set Cloudflare DNS to
tunnel instead of Pages, then decommission CF Pages projects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 12:07:40 +01:00
Till JS
fcd7c82ce4 fix(infra): simplify PostgreSQL backup to pg_dumpall + pg_basebackup
pgBackRest as Docker sidecar was overly complex (needs shared WAL
directory, stanza management, special entrypoint). Replace with a
simpler, proven approach using native PostgreSQL tools:

Backup container (postgres:16-alpine):
- Hourly: pg_dumpall | gzip (all databases as SQL, ~2 day retention)
- Daily 03:00: pg_basebackup -Ft -z (physical backup, 30 day retention)
- Auto-cleanup of old backups
- Storage: /Volumes/ManaData/backups/postgres/

Also: Remove pgbackrest.conf, simplify postgresql.conf (remove WAL
archiving config, keep performance tuning + replication for basebackup)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:39:20 +01:00
Till JS
39526918a3 feat(infra): add pgBackRest for PostgreSQL Point-in-Time Recovery
Replace simple pg_dumpall with pgBackRest PITR backup system.
This enables recovery to any second, not just the last daily dump.

Configuration:
- docker/postgres/postgresql.conf: WAL archiving + performance tuning
  (shared_buffers=512MB, effective_cache_size=2GB for 16GB Mac Mini)
- docker/postgres/pgbackrest.conf: stanza config + retention policy

Docker (docker-compose.macmini.yml):
- postgres: mount custom config, enable WAL archiving
- postgres-backup: new pgBackRest container
  - Storage: /Volumes/ManaData/backups/pgbackrest
  - Retention: 4 full + 14 differential (~4 weeks)
  - Compression: Zstandard (zst)

Backup Schedule:
- 03:00 daily: Full backup
- Every 6h: Differential (changes since last full)
- Every hour: Incremental (changes since last backup)
- Continuous: WAL archiving (every 60s)

Documentation (docs/POSTGRES_BACKUP.md):
- Complete restore procedures (full, PITR, single DB)
- First-time setup instructions
- Monitoring and alerting integration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:18:33 +01:00
Till JS
62c5dddab0 feat(project-doc-bot): migrate to shared-llm, remove OpenAI dependency
Migrate matrix-project-doc-bot from raw fetch to @manacore/shared-llm
and remove the unused openai npm package. The bot was already using
mana-llm and mana-stt (not OpenAI directly), but the code still had
raw fetch calls and the openai package installed.

Changes:
- generation.service.ts: raw fetch → llm.chat() via LlmClientService
- app.module.ts: add LlmModule.forRootAsync()
- Remove openai dependency (was unused in code)
- Update CLAUDE.md: document actual AI stack (mana-llm + mana-stt)
- Update TECH_STACK_INDEPENDENCE.md: mark Prio 1-3 as completed
  - Prio 1: Picture App → mana-image-gen 
  - Prio 2: Project Doc Bot → Ollama + mana-stt 
  - Prio 3: All LLM calls via mana-llm 
  - Self-hosted percentage: 75% → ~80%

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 10:44:56 +01:00
Till JS
51f80f43b6 docs: add Cloudflare Pages domain configuration guide
Documents all Cloudflare Pages projects, their custom domain status,
and pending actions. Identifies 5 landing pages without custom domains
and 3 deploy scripts without CF projects.

Key action items:
- it.mana.how → it-landing (just deployed, needs custom domain)
- citycorners.mana.how → citycorners-landing (no custom domain)
- nutriphi.mana.how → nutriphi-landing (no custom domain)
- manadeck.mana.how → manadeck-landing (no custom domain)
- docs.mana.how → manacore-docs (no custom domain)
- mail-landing, moodlit-landing CF projects need creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 10:15:41 +01:00
Till JS
56ffcbac39 feat: add Ollama memory optimization, LLM metrics, and chat streaming
Three improvements to the unified LLM infrastructure:

1. Ollama memory optimization (scripts/mac-mini/configure-ollama.sh):
   - OLLAMA_KEEP_ALIVE=5m → models unload after 5min idle (saves 3-16GB RAM)
   - OLLAMA_NUM_PARALLEL=1 → predictable memory usage
   - OLLAMA_MAX_LOADED_MODELS=1 → max 1 model in RAM at a time

2. Request-level metrics in @manacore/shared-llm:
   - LlmRequestMetrics interface (model, latency, tokens, fallback detection)
   - LlmMetricsCollector class with summary stats (for health endpoints)
   - Optional onMetrics callback in LlmModuleOptions
   - Automatic metrics emission in chatMessages() (success + error)

3. Chat streaming (token-by-token SSE):
   - Backend: POST /chat/completions/stream SSE endpoint
   - OllamaService.createStreamingCompletion() via llm.chatStreamMessages()
   - ChatService.createStreamingCompletion() with upfront credit consumption
   - Web: chatApi.createStreamingCompletion() SSE consumer
   - Chat store: sendMessage() now streams tokens into assistant message
   - UI updates reactively as each token arrives

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 09:41:33 +01:00
Till JS
d29348d906 docs: add devlog for morning session + update guidelines to session-based
Switch devlog convention from daily to session-based (vormittag/abend).
Add devlog for Manalink prod-readiness work and deployment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 13:20:40 +01:00
Till JS
f71e7d371b docs: add TODO for rotating leaked API keys from git history
Keys were removed from .env.development but remain in git history.
OpenAI, Gemini, Replicate, and Supabase keys need rotation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 12:10:53 +01:00
Till JS
b76746229e chore: remove @manacore/shared-supabase package
Package was unused — no imports found across the entire codebase.
All apps have migrated to direct PostgreSQL (Drizzle ORM) for backends
and mana-core-auth API for mobile/web clients.

Removes package and all documentation references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 12:07:14 +01:00
Till JS
07365c31b2 chore: remove stale docs and outdated design plans
- docs/AUTOMATED_TESTING_SYSTEM.md (implementation report, workflow already exists)
- docs/README_ENV_AUDIT.md (index for already-deleted audit files)
- docs/DEPENDENCY_ALIGNMENT.md (version tracking, immediately outdated)
- .claude/plans/mana-search-service.md (draft from Jan 2025, service already built)
- .claude/plans/questions-app.md (draft from Jan 2025, app already built)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 11:15:29 +01:00
Till JS
67a181bb04 chore: major cleanup of legacy docs, reports, and unused configs
Deleted 50 files (~26,000 lines):

Root-level legacy reports:
- AUTH_*.md (5 files) - auth architecture reports, now in CLAUDE.md
- TESTING_STRATEGY_*.md, QA_*, TEST_CASES_*.md - old testing plans
- BACKEND_DESIGN_PATTERN_AUDIT.md, COMPATIBILITY_MATRIX_AND_REMEDIATION.md
- HISTORICAL-ANALYSIS.md, MERGE-FIX-SUMMARY.md, RELEASE-PLAN.md
- MANACORE-TODOS.md, APP-IDEAS.md, COMMANDS.md

docs/ cleanup:
- 6 testing docs (duplicates/superseded by .claude/guidelines/testing.md)
- 3 env audit files (canonical: ENVIRONMENT_VARIABLES.md)
- 3 Mac Mini setup docs (canonical: MAC_MINI_SERVER.md)
- 5 daily reports (historical, no ongoing value)
- SELF-HOSTING-GUIDE.md (Coolify/Hetzner based, obsolete)
- CHANGELOG, CONSISTENCY_REPORT, CONSOLIDATION_OPPORTUNITIES, pr-reviews/

.claude/ cleanup:
- audit/ directory (Dec 2025 audit, outdated)
- Speculative plans (MacBook Pro server, Windows GPU server)

Other:
- docker-compose.yml (Traefik-based, replaced by docker-compose.macmini.yml)
- TROUBLESHOOTING.md trimmed (removed 730-line staging deployment section)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 10:43:11 +01:00
Till JS
7c1e2aca49 chore: remove remaining Hetzner references across codebase
Deleted:
- DOCKER_REGISTRY_SETUP.md, QUICK_START_CICD.md (legacy CI/CD docs)
- docs/ULOAD-DEPLOYMENT.md (Hetzner VPS deployment guide)
- scripts/get-ssh-key.sh, scripts/remove-coolify-references.sh (legacy scripts)

Updated Hetzner → MinIO references in:
- shared-storage (package.json, README, client.ts, types.ts)
- App CLAUDE.md files (mukke, storage, planta, picture)
- .claude/GUIDELINES.md, sveltekit-web.md guideline
- TROUBLESHOOTING.md, SETUP_TEMPLATES.md (replaced IPs with placeholders)
- GIT_WORKFLOW.md, COMMANDS.md
- services/matrix-project-doc-bot/CLAUDE.md

Remaining Hetzner mentions are in historical devlogs/audits and docs
that list Hetzner as a hosting alternative (not as active infrastructure).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 10:30:26 +01:00
Till JS
cc5ba3bb90 chore: remove Hetzner legacy artifacts and update docs for Mac Mini self-hosting
Deleted files:
- docker/caddy/Caddyfile.production + Caddyfile.staging (Hetzner reverse proxy configs)
- scripts/deploy/ (deploy-hetzner.sh, build-and-push.sh, health-check.sh, migrate-db.sh, rollback.sh)
- scripts/generate-staging-secrets.sh
- cicd/ directory (11 Hetzner CI/CD planning docs)
- CI_CD_IMPLEMENTATION_SUMMARY.md, CI_CD_README.md, FILES_CREATED.md, HIVE_MIND_FINAL_REPORT.md

Updated docs:
- CLAUDE.md: Remove Hetzner Object Storage references, update to MinIO
- docs/ANALYTICS.md: Cloudflare Tunnel instead of Caddy
- docs/URL_SCHEMA.md: Mac Mini + Cloudflare Tunnel instead of Hetzner IP
- .env.development: Remove "Hetzner in production" comments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 10:12:24 +01:00