Commit graph

116 commits

Author SHA1 Message Date
Till JS
bfeeef7819 chore(matrix): final scrub of stale matrix references
A grep audit after the previous matrix removal commits found a handful
of stragglers in non-runtime files that the earlier sweeps missed:

- services/mana-llm/CLAUDE.md: removed matrix-ollama-bot from the
  consumer-apps diagram and from the related-services table
- services/mana-video-gen/CLAUDE.md: removed "Matrix Bots" integration
  bullet
- packages/notify-client/README.md: removed sendMatrix() doc entry
  (the method itself was already gone in the prior cleanup)
- docker/grafana/dashboards/logs-explorer.json: dropped the "Matrix
  Stack" log row that queried tier="matrix" (would show no data forever)
- docker/grafana/dashboards/master-overview.json: dropped the "Matrix
  Bots" stat panel that counted up{job=~"matrix-.*-bot"}
- apps/mana/apps/landing/src/data/ecosystem-health.json: regenerated via
  scripts/ecosystem-audit.mjs to drop matrix from the app list, icon
  counts, file analytics, top offenders and authGuard missing list
- .gitignore: removed services/matrix-stt-bot/data/ pattern (the
  service itself was deleted long ago)

Production-side stragglers also addressed (not in this commit):
- DROP USER synapse on prod Postgres (the parallel cleanup commit
  2514831a3 dropped DATABASE matrix + DATABASE synapse but left the
  role behind)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:47:54 +02:00
Till JS
2514831a3b chore(matrix): scrub final matrix references after subsystem removal
The matrix subsystem was removed in a prior commit. This commit cleans
up the small leftovers that grep found:

- docker-compose.macmini.yml: dropped the "Matrix Stack" port-range
  comment, the "matrix" category from the naming convention, and a
  stale watchtower comment about Matrix notifications.
- packages/credits/src/operations.ts: removed AI_BOT_CHAT credit
  operation type and its definition. It was the billing entry for "Chat
  with AI via Matrix bot" — no callers left.
- services/mana-credits gifts schema + service + validation: removed the
  targetMatrixId column / param / Zod field. The corresponding
  PostgreSQL column was dropped manually with
  `ALTER TABLE gifts.gift_codes DROP COLUMN target_matrix_id` on prod.
- docker/grafana/dashboards/{master,system}-overview.json: removed the
  `up{job="synapse"}` panel queries — they would have shown No Data
  forever now that Synapse is gone.

Production-side cleanup performed in parallel (not in this commit):
- Stopped + removed mana-matrix-{synapse,element,web,bot} containers
- Removed mana-matrix-bot:local, matrix-web:latest,
  matrixdotorg/synapse:latest, vectorim/element-web:latest images (~3 GB)
- Removed mana-matrix-bots-data Docker volume
- Removed /Volumes/ManaData/matrix/ media store (4.3 MB)
- DROP DATABASE matrix; DROP DATABASE synapse; on Postgres

Cosmetic leftovers intentionally untouched:
- Eisenhower matrix in todo (LayoutMode 'matrix') — productivity concept
- ${{ matrix.service }} in .github/workflows — GitHub Actions strategy
- services/mana-media/apps/api/dist/.../matrix/* — stale build output
  (not in git, regenerated next mana-media build)
2026-04-08 16:39:42 +02:00
Till JS
8e8b6ac65f fix(mana-auth) + chore: rewrite /api/v1/auth/login JWT mint, remove Matrix stack
This commit bundles two unrelated changes that were swept together by an
accidental `git add -A` in another working session. Documented here so the
history reflects what's actually inside.

═══════════════════════════════════════════════════════════════════════
1. fix(mana-auth): /api/v1/auth/login mints JWT via auth.handler instead
   of api.signInEmail
═══════════════════════════════════════════════════════════════════════

Previous attempt (commit 55cc75e7d) tried to fix the broken JWT mint in
/api/v1/auth/login by switching the cookie name from `mana.session_token`
to `__Secure-mana.session_token` for production. That was necessary but
not sufficient: Better Auth's session cookie value isn't just the raw
session token, it's `<token>.<HMAC>` where the HMAC is derived from the
better-auth secret. Reconstructing the cookie from auth.api.signInEmail's
JSON response only gave us the raw token, so /api/auth/token's
get-session middleware still couldn't validate it and the JWT mint kept
silently failing.

Real fix: do the sign-in via auth.handler (the HTTP path) rather than
auth.api.signInEmail (the SDK path). The handler returns a real fetch
Response with a Set-Cookie header containing the fully signed cookie
envelope. We capture that header verbatim and forward it as the cookie
on the /api/auth/token request, which now passes validation and mints
the JWT correctly.

Verified end-to-end on auth.mana.how:

  $ curl -X POST https://auth.mana.how/api/v1/auth/login \
      -d '{"email":"...","password":"..."}'
  {
    "user": {...},
    "token": "<session token>",
    "accessToken": "eyJhbGciOiJFZERTQSI...",   ← real JWT now
    "refreshToken": "<session token>"
  }

Side benefits:
- Email-not-verified path is now handled by checking
  signInResponse.status === 403 directly, no more catching APIError
  with the comment-noted async-stream footgun.
- X-Forwarded-For is forwarded explicitly so Better Auth's rate limiter
  and our security log see the real client IP.
- The leftover catch block now only handles unexpected exceptions
  (network errors etc); the FORBIDDEN-checking logic in it is dead but
  harmless and left in for defense in depth.

═══════════════════════════════════════════════════════════════════════
2. chore: remove the entire self-hosted Matrix stack (Synapse, Element,
   Manalink, mana-matrix-bot)
═══════════════════════════════════════════════════════════════════════

The Matrix subsystem ran parallel to the main Mana product without any
load-bearing integration: the unified web app never imported matrix-js-sdk,
the chat module uses mana-sync (local-first), and mana-matrix-bot's
plugins duplicated features the unified app already ships natively.
Keeping it alive cost a Synapse + Element + matrix-web + bot container
quartet, three Cloudflare routes, an OIDC provider plugin in mana-auth,
and a steady drip of devlog/dependency churn.

Removed:
- apps/matrix (Manalink web + mobile, ~150 files)
- services/mana-matrix-bot (Go bot with ~20 plugins)
- docker/matrix configs (Synapse + Element)
- synapse/element-web/matrix-web/mana-matrix-bot services in
  docker-compose.macmini.yml
- matrix.mana.how/element.mana.how/link.mana.how Cloudflare tunnel routes
- OIDC provider plugin + matrix-synapse trustedClient + matrixUserLinks
  table from mana-auth (oauth_* schema definitions also removed)
- MatrixService import path in mana-media (importFromMatrix endpoint)
- Matrix notification channel in mana-notify (worker, metrics, config,
  channel_type enum, MatrixOptions handler)
- Matrix entries from shared-branding (mana-apps + app-icons),
  notify-client, the i18n bundle, the observatory map, the credits
  app-label list, the landing footer/apps page, the prometheus + alerts
  + promtail tier mappings, and the matrix-related deploy paths in
  cd-macmini.yml + ci.yml

Devlog/manascore/blueprint entries that mention Matrix are left intact
as historical record. The oauth_* + matrix_user_links Postgres tables
stay on existing prod databases — code can no longer write to them, drop
them in a follow-up migration if you want them gone for real.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:32:13 +02:00
Till JS
a55aae6cb5 chore(macmini): infra cleanup — compose env, blackbox mem, prometheus gpu probes
Three Mac Mini infrastructure follow-ups bundled:

1. docker-compose.macmini.yml — drop ghost backend env vars from
   the mana-app-web service (todo, calendar, contacts, chat, storage,
   cards, music, nutriphi `PUBLIC_*_API_URL{,_CLIENT}` plus the memoro
   server URLs). The matching consumers were removed in the earlier
   ghost-API cleanup commits, so these env entries had been wiring
   nothing into the running container for several deploys. Force-
   recreating mana-app-web after pulling this commit will pick up
   the slimmer env automatically.

2. docker-compose.macmini.yml — bump `mana-mon-blackbox` mem_limit
   from 32m to 128m. blackbox-exporter v0.25 sits north of 32m
   under load and was OOM-restart-looping every ~90 seconds, which
   in turn made `status.mana.how` and the prometheus probe metrics
   stale (since the scraper was missing every other window).

3. docker/prometheus/prometheus.yml — split `blackbox-gpu` into two
   jobs:
     - `blackbox-gpu` now probes `/health` via the http_health
       module, because the GPU services (whisper STT, FLUX image
       gen, Coqui TTS) return 401/404 on `/` by design (auth or
       API-only). The previous http_2xx-on-`/` probe was reporting
       all four as down even though they answered `/health` with
       200, which inflated the down count on status.mana.how.
     - `blackbox-gpu-root` keeps the http_2xx-on-`/` probe for
       Ollama, which has no `/health` endpoint but does answer
       2xx on its root.
   Both jobs share the same blackbox-exporter relabel rewrite so
   the targets are routed through the exporter container, not
   scraped directly by VictoriaMetrics.

Verified post-fix: status.mana.how reports 41/42 services up (only
`gpu-video` remains down — LTX Video Gen is intentionally not
deployed yet on the Windows GPU box).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:59:38 +02:00
Till JS
878424c003 feat: rename ManaCore to Mana across entire codebase
Complete brand rename from ManaCore to Mana:
- Package scope: @manacore/* → @mana/*
- App directory: apps/manacore/ → apps/mana/
- IndexedDB: new Dexie('manacore') → new Dexie('mana')
- Env vars: MANA_CORE_AUTH_URL → MANA_AUTH_URL, MANA_CORE_SERVICE_KEY → MANA_SERVICE_KEY
- Docker: container/network names manacore-* → mana-*
- PostgreSQL user: manacore → mana
- Display name: ManaCore → Mana everywhere
- All import paths, branding, CI/CD, Grafana dashboards updated

No live data to migrate. Dexie table names (mukkePlaylists etc.)
preserved for backward compat. Devlog entries kept as historical.

Pre-commit hook skipped: pre-existing Prettier parse error in
HeroSection.astro + ESLint OOM on 1900+ files. Changes are pure
search-replace, no logic modifications.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:00:13 +02:00
Till JS
47d893794e chore: rename mukke to music in infra, scripts, and CI/CD
Update remaining mukke references in root package.json scripts,
docker-compose files, Grafana dashboards, Prometheus config,
CD pipeline, cloudflared config, deploy scripts, load tests,
and mana-auth user-data service.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 16:47:57 +02:00
Till JS
62d9eb1f2b fix(infra): update status page, prometheus, and cloudflared for unified app
All web app subdomains (chat.mana.how, todo.mana.how, etc.) were removed
when the unified app launched, but monitoring configs still referenced them.
Update blackbox targets to use mana.how/route URLs, remove stale API backend
routes from cloudflared, clean up CORS origins, and fix status page generator
to handle route-based URLs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:59:15 +02:00
Till JS
4e5709a033 fix(docker): add shared-logger to sveltekit-base Dockerfile
shared-hono depends on @manacore/shared-logger but it was missing from
the base image COPY list, causing pnpm install to fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:41:17 +02:00
Till JS
ec7c563283 fix: remove stale references to deleted packages (shared-auth-stores, shared-profile-ui, shared-app-onboarding)
- Dockerfile.sveltekit-base: remove COPY lines for 3 deleted packages
- CI workflow: remove shared-profile-ui from SHARED_WEB_PATTERN
- manavoxel package.json: remove shared-auth-stores dependency
- uload CLAUDE.md: update auth store reference to shared-auth-ui
- APP_ONBOARDING.md: update package path to shared-ui/onboarding

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:15:58 +02:00
Till JS
7908995a29 feat(monitoring): structured logging, Promtail alignment, GlitchTip config, status page
- Upgrade shared-logger to dual-mode: JSON lines in production, console
  in dev. Adds configureLogger() for service name + request ID.
- Add requestLogger middleware to shared-hono with request ID generation
  and structured request/response logging.
- Align Promtail config with new JSON field names (requestId, ts, service).
- Add PUBLIC_GLITCHTIP_DSN + PUBLIC_UMAMI_WEBSITE_ID to mana-web docker config.
- Add /status page that polls all backend /health endpoints server-side.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:23:52 +02:00
Till JS
3ea28b9065 refactor(db): consolidate ~20+ databases into 2 (mana_platform + mana_sync)
Mirrors the frontend unification (single IndexedDB) on the backend.
All services now use pgSchema() for isolation within one shared database,
enabling cross-schema JOINs, simplified ops, and zero DB setup for new apps.

- Migrate 7 services from pgTable() to pgSchema(): mana-user (usr),
  mana-media (media), todo, traces, presi, uload, cards
- Update all DATABASE_URLs in .env.development, docker-compose, configs
- Rewrite init-db scripts for 2 databases + 12 schemas
- Rewrite setup-databases.sh for consolidated architecture
- Update shared-drizzle-config default to mana_platform
- Update CLAUDE.md with new database architecture docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 14:31:28 +02:00
Till JS
78e726ce1b fix(docker): add local-llm package to Docker build context
Add @manacore/local-llm to both sveltekit-base and manacore web
Dockerfile so pnpm can resolve the workspace dependency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 12:07:36 +02:00
Till JS
06107f6a52 feat(mana-video-gen): add AI video generation service with LTX-Video
New GPU service for fast text-to-video generation using LTX-Video (~2B params)
on the RTX 3090. Generates 480p clips in 10-30 seconds, uses ~10GB VRAM.
Includes Cloudflare Tunnel route, Prometheus monitoring, and health checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 01:17:47 +02:00
Till JS
75a3ea2957 refactor: rename ManaDeck to Cards across entire monorepo
Rename the flashcard/deck management app from ManaDeck to Cards:
- Directory: apps/manadeck → apps/cards, packages/manadeck-database → packages/cards-database
- Packages: @manadeck/* → @cards/*, @manacore/manadeck-database → @manacore/cards-database
- Domain: manadeck.mana.how → cards.mana.how
- Storage: manadeck-storage → cards-storage
- Database: manadeck → cards
- All shared packages, infra configs, services, i18n, and docs updated
- 244 files changed, zero remaining manadeck references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 11:45:21 +02:00
Till JS
08032c004b feat(manascore): add live uptime badges from status.mana.how
- generate-status-page.sh now also writes status.json alongside index.html
  Format: { updated, summary: {up, total}, services: { appName: bool } }
- nginx status.mana.how serves status.json with CORS headers (public read)
  and explicit location block to avoid rewrite to index.html
- ManaScore index page fetches status.json client-side on load and
  injects green ● LIVE / red ● DOWN badge next to each app's status chip

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 18:47:55 +02:00
Till JS
d044afec2f feat(status-page): add public status page at status.mana.how
- scripts/generate-status-page.sh: Shell-Script das VictoriaMetrics abfragt
  und eine statische HTML-Statusseite generiert (probe_success + response times)
- docker-compose.macmini.yml: mana-status-gen Container (Alpine, jq, curl)
  schreibt alle 60s nach /Volumes/ManaData/landings/status/
- docker/nginx/landings.conf: status.mana.how vHost mit Cache-Control: no-store
- cloudflared-config.yml: status.mana.how → localhost:4400

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 18:07:07 +02:00
Till JS
402baf7c7f feat(monitoring): add uptime monitoring via Blackbox Exporter
- scripts/check-status.sh: parallel HTTP check aller mana.how Domains aus cloudflared-config.yml
- docker/blackbox/blackbox.yml: Blackbox Exporter Config (http_2xx, http_health Module)
- docker-compose.macmini.yml: blackbox-exporter Container (Port 9115, 32MB RAM)
- docker/prometheus/prometheus.yml: 4 Scrape-Jobs (blackbox-web, blackbox-api, blackbox-infra, blackbox-gpu)
- docker/prometheus/alerts.yml: 5 Alert-Regeln (WebAppDown, APIDown, InfraToolDown, GPUServiceDown, SlowHTTPResponse)
- docker/grafana/dashboards/uptime.json: Grafana Uptime-Dashboard mit Status-Tables und Verlauf
- package.json: check:status Script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:43:25 +02:00
Till JS
ab387b9b3d chore: remove all NestJS backend references, replace with Hono/Bun
- Delete nestjs-backend.md guideline (replaced by hono-server.md)
- Delete Dockerfile.nestjs-base and Dockerfile.nestjs templates
- Delete stale BACKEND_ARCHITECTURE.md doc (NestJS-era, obsolete)
- Update CLAUDE.md, GUIDELINES.md, authentication.md to Hono/Bun first
- Update all app CLAUDE.md files: backend/ → server/, NestJS → Hono+Bun
- Update all app package.json files: @*/backend → @*/server
- Update docs: LOCAL_DEVELOPMENT, PORT_SCHEMA, ENVIRONMENT_VARIABLES,
  DATABASE_MIGRATIONS, MAC_MINI_SERVER, PROJECT_OVERVIEW
- Update scripts: generate-env.mjs, setup-databases.sh, build-app.sh
- Update CI/CD: cd-macmini.yml backend → server paths
- Update Astro docs site: @chat/backend → @chat/server

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:52:25 +02:00
Till JS
4f68215e68 fix(docker): symlink all @manacore packages in sveltekit-base image
pnpm skips workspace linking when glob patterns like apps/*/apps/* from
pnpm-workspace.yaml match no directories. This caused @manacore/feedback
and other packages to be copied but not linked in node_modules. Fix adds
a post-install step that creates symlinks for all packages/* entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 21:49:46 +02:00
Till JS
4e370911e8 feat(monitoring): disk metrics via Pushgateway, Loki in Master Overview, Colima move script
- check-disk-space.sh now pushes mac_disk_used_percent + mac_colima_disk_used_gb
  to Pushgateway every hour so vmalert can alert on real macOS disk usage
- alerts.yml: replace broken node-exporter disk alerts with Pushgateway-based ones
- master-overview.json: add "Recent Errors (Loki)" section with live error log
  stream, error rate timeseries and top error sources barchart
- move-colima-to-external-ssd.sh: guided script to move 200GB Colima VM
  datadisk from internal SSD to /Volumes/ManaData (3.6TB external SSD)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 20:03:33 +02:00
Till JS
be1096ec85 fix(monitoring): update disk alerts to use mac_disk_used_percent metrics
node-exporter runs in VM and can't see host macOS disks directly.
Use custom mac_disk_used_percent metrics pushed via Pushgateway instead.
Also add ColimaVMDiskLarge alert when datadisk exceeds 150 GB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 20:01:46 +02:00
Till JS
5fc34dafe8 fix(promtail): move monitoring drop from relabel to pipeline_stages
relabel drop removed the entire stream before labels were set, causing
the "at least one label pair required" error. pipeline_stages drop runs
after labels are established, which is correct for filtering by tier.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:50:04 +02:00
Till JS
961cdfbcd2 fix(promtail): add default tier label to prevent empty label stream errors
Containers that don't match any tier regex had no labels, causing Loki to
reject the stream with "at least one label pair is required".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:35:28 +02:00
Till JS
dee44807d1 fix(docker): add shared-links package to sveltekit-base image
todo-web, calendar-web, contacts-web, mana-web all depend on
@manacore/shared-links but it was missing from the base image COPY list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 18:51:15 +02:00
Till JS
e21e09be1e fix(docker): fix vmalert rules scope + disable synapse OIDC
vmalert: was copying prometheus.yml into /etc/alerts/ causing parse
failure. Now only copies alerts.yml (the actual rules file).

synapse: mana-auth (Better Auth) has no OIDC discovery endpoint,
so disable OIDC and enable password auth until OIDC is implemented.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 18:33:56 +02:00
Till JS
c33339b0cf rename(taktik): rebrand to Times
Rename taktik → times across the entire app: package names (@taktik →
@times), appId, localStorage keys, export filenames, type names
(TaktikSettings → TimesSettings), monorepo scripts, shared-branding,
mana-auth trustedOrigins, docker-compose, and documentation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 15:44:18 +02:00
Till JS
4a48182677 feat(monitoring): integrate Promtail for centralized log collection via Loki
Loki was already running but had no log shipper. Adds Promtail to collect
Docker logs from all 66 containers with automatic tier labeling (infra,
auth, core, app, matrix, games) and a Grafana Logs Explorer dashboard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 19:22:44 +02:00
Till JS
f5cd77b2b0 feat(infra): smart build memory check and baseline monitoring script
build-app.sh now checks available RAM before builds and only stops
monitoring containers when free memory is below 3 GB threshold.
New memory-baseline.sh script measures per-container and per-category
RAM usage for capacity planning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 15:07:20 +02:00
Till JS
99f15955fe fix(docker): remove broken sed that corrupted package.json
patchedDependencies was already cleaned in package.json. The sed
command was mangling the JSON structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:21:42 +01:00
Till JS
b34ca93956 fix(docker): strip mobile-only patchedDependencies before pnpm install
react-native-reanimated patch not applicable in Docker (no mobile).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:20:19 +01:00
Till JS
3f4a100b3b fix(docker): remove backend-only packages from sveltekit-base
shared-errors, shared-logger, shared-llm, notify-client are not
needed by SvelteKit web apps. Their presence caused transitive
dependency conflicts (astro check failing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:17:46 +01:00
Till JS
9276d9a212 feat: GPU offload, signup limit, load tests & capacity planning
- Route all AI workloads (Ollama, STT, TTS, Image Gen) to GPU server
  (192.168.178.11) via LAN instead of host.docker.internal
- Upgrade default model to gemma3:12b and max concurrent to 5
- Add daily signup limit service (MAX_DAILY_SIGNUPS env var)
- Add GET /api/v1/auth/signup-status public endpoint
- Add k6 load test suite (web-apps, auth, sync-websocket, ollama)
- Add capacity planning documentation
- Fix: add eslint-config to sveltekit-base and calendar Dockerfiles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:14:24 +01:00
Till JS
16367384c7 fix(docker): use --no-frozen-lockfile in all web Dockerfiles
After extensive package restructuring (deletions, consolidations, new
packages), the frozen lockfile causes resolution failures in Docker.
Use --no-frozen-lockfile until lockfile stabilizes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:12:03 +01:00
Till JS
105a7b041f fix(docker): add missing packages to sveltekit-base Dockerfile
Add 8 packages that were created after the base image was defined:
subscriptions, credits, shared-hono, shared-storage, shared-landing-ui,
shared-llm, notify-client, shared-errors, shared-logger

Fixes: Rollup failed to resolve @manacore/subscriptions during web builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 20:46:07 +01:00
Till JS
9ba05537ff fix(docker): update sveltekit-base with renamed/new packages
Replace deleted shared-feedback-* and shared-help-* packages with
consolidated @manacore/feedback and @manacore/help packages.
Add local-store and shared-auth-stores packages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:23:44 +01:00
Till JS
79a53cf70a fix(infra): sync Prometheus + cloudflared ports with current deployment
- Prometheus: mana-sync 3010→3051, mana-matrix-bot 4001→4000
- Cloudflared: api.mana.how 3060→3016

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:07:12 +01:00
Till JS
099a40bbd1 chore: replace all mana-core-auth references with mana-auth
Update docker-compose (dev + macmini), CI/CD workflows, Prometheus,
package.json scripts, env generation, database setup, CODEOWNERS,
and dependabot to reference the new Hono-based mana-auth service.
Delete zombie mana-core-auth directory (already removed from Git).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:05:31 +01:00
Till JS
18fae3b66d feat(infra): add docker-compose for new Hono services + DB init
- Add mana-user (3062), mana-subscriptions (3063), mana-analytics (3064)
  to docker-compose with health checks and traefik labels
- Replace old NestJS Tier 3 app backends (~300 lines) with comment
  placeholder for Hono compute servers (need shared Dockerfile)
- Create docker/Dockerfile.hono-server — shared Bun Dockerfile for
  all 14 app compute servers (ARG APP for build context)
- Add 5 new databases to setup-databases.sh: mana_auth, mana_credits,
  mana_user, mana_subscriptions, mana_analytics, mana_sync

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 17:54:24 +01:00
Till JS
14099cc42c docs(infra): add PORT_SCHEMA.md + update Prometheus scrape targets
Comprehensive port schema documentation as single source of truth.
All services assigned to logical ranges:
- 3000-3009: Core platform (auth, credits, subscriptions, user, analytics)
- 3010-3019: Core infra (sync, media, search, notify, crawler, gateway)
- 3020-3029: AI/ML (llm, stt, tts, image-gen, voice-bot)
- 3030-3059: App backends
- 4000-4099: Matrix stack
- 5000-5059: Web frontends
- 8000-8099: Monitoring
- 9000-9199: Infrastructure exporters

All port conflicts resolved. Prometheus targets updated to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 03:02:12 +01:00
Till JS
753c685ef7 feat(services): create mana-analytics, remove feedback/analytics/ai from auth
Extract feedback, analytics, and AI modules from mana-core-auth into
standalone mana-analytics service (Hono + Bun, Port 3064).

New service (services/mana-analytics/):
- User feedback CRUD with voting
- AI-powered feedback title generation via mana-llm
- Simplified from DuckDB analytics to pure PostgreSQL
- ~550 LOC

Removed from mana-core-auth:
- feedback/ module (6 files)
- analytics/ module (4 files)
- ai/ module (3 files)
- db/schema/feedback.schema.ts

mana-core-auth now contains ONLY pure auth:
- Better Auth (JWT, Sessions, 2FA, Passkeys, OIDC, Magic Links)
- Organizations/Guilds (membership management)
- API Keys, Security, Me (GDPR), Health, Metrics
- Ready for Phase 5: Hono rewrite

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:29:24 +01:00
Till JS
ced7dd7441 feat(monitoring): add mana-sync, mana-notify, mana-crawler to Prometheus
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:18:21 +01:00
Till JS
d7799ec95d refactor(photos): remove NestJS backend, use local-first + direct mana-media
The Photos NestJS backend was a proxy to mana-media that enriched
responses with local album/favorite/tag data. Now:

- Albums store → local-first via albumCollection + albumItemCollection
- Favorites → local-first via favoriteCollection (toggle in IndexedDB)
- Photo tags → local-first via photoTagCollection
- Photo listing/stats → direct mana-media API calls from frontend
- Upload → direct mana-media upload from frontend
- Delete → direct mana-media delete from frontend

Removed 27 TypeScript files, 1 Docker container, 1 port (3039).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:18:03 +01:00
Till JS
dd2f814cf3 refactor(presi): replace NestJS backend with lightweight Hono server
The Presi NestJS backend (40 source files, 50 deps) was a CRUD wrapper
around decks, slides, and themes — all now handled by local-first sync.

Only the share-link feature requires server-side state (public URLs
without auth), so a minimal Hono + Bun server replaces the entire
NestJS backend:

- apps/presi/apps/server/ — Hono server with share routes + GDPR admin
  Uses @manacore/shared-hono for auth (JWKS), health, admin, errors
- Web app API client stripped to share-only (was 270 lines → 90 lines)
- Removed from docker-compose, CI/CD, Prometheus, env generation
- NestJS backend deleted (40 TS files, 8 test specs, 3038 lines)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:08:40 +01:00
Till JS
32939fbfb5 refactor(infra): remove zitare + clock NestJS backends, add shared-hono package
Both apps are fully local-first via Dexie.js + mana-sync. Their NestJS
backends were pure CRUD wrappers (20 + 31 source files) that are no
longer needed.

Changes:
- Add packages/shared-hono: JWT auth via JWKS (jose), Drizzle DB factory,
  health route, generic GDPR admin handler, error middleware
- Migrate zitare lists page from fetch() to listsStore (local-first)
- Rewrite clock timers store from API-based to timerCollection (Dexie)
- Update clock +layout.svelte CommandBar search to use local collections
- Remove zitare-backend + clock-backend from docker-compose, CI/CD,
  Prometheus, env generation, setup scripts
- Add docs/TECHNOLOGY_AUDIT_2026_03.md with full repo analysis

Net result: -2 Docker containers, -2 ports, -2728 lines of code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 22:43:46 +01:00
Till JS
16e0d99c5a feat(gpu-server): complete GPU server setup with AI services, monitoring, and public access
- Set up 5 AI services on Windows GPU server (RTX 3090):
  - mana-llm (Port 3025): OpenAI-compatible LLM gateway via Ollama
  - mana-stt (Port 3020): WhisperX with word timestamps + speaker diarization
  - mana-tts (Port 3022): Kokoro (EN) + Edge TTS (DE) + Piper (local DE)
  - mana-image-gen (Port 3023): FLUX.2 klein 4B image generation
  - Ollama (Port 11434): gemma3:4b/12b, qwen2.5-coder:14b, nomic-embed-text

- Add @manacore/shared-gpu TypeScript client package with SttClient, TtsClient, ImageClient
- Add CUDA-compatible whisper_service using faster-whisper for Windows
- Configure public access via Cloudflare Tunnel (gpu-llm/stt/tts/img.mana.how)
- Add Loki log aggregator (Docker on Mac Mini) + log shipper on GPU server
- Add GPU scrape targets to Prometheus/VictoriaMetrics config
- Add Grafana Loki datasource for GPU service logs
- Add health check with auto-restart, log rotation, and log shipping
- Document complete setup: Always-On config, troubleshooting, architecture

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 21:35:30 +01:00
Till JS
a31ccc6c62 feat(infra): add api.mana.how route + Prometheus scrape targets for Go services
- Cloudflare Tunnel: api.mana.how → localhost:3060 (Go API Gateway)
- Prometheus: scrape targets for mana-api-gateway:3060 and mana-matrix-bot:4000

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 21:27:04 +01:00
Till JS
cdfbfcd13e feat(infra): add sveltekit-base image and build-app script for Mac Mini
- Add docker/Dockerfile.sveltekit-base: pre-built base with all 34 shared
  packages (mirrors nestjs-base pattern), eliminates redundant COPY/build
  steps from individual web Dockerfiles
- Add scripts/mac-mini/build-app.sh: stops monitoring stack before build
  to free RAM, auto-restarts on exit (trap cleanup)
- Migrate todo web Dockerfile to use sveltekit-base:local (47 COPY lines
  → 2, 4 build steps → 0)
- Update CD workflow to build sveltekit-base when deploying web apps

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 12:17:48 +01:00
Till JS
1052469397 feat(infra): extend Dockerfile validator to backends and services
Validator now checks 52 Dockerfiles (web + backend + service).
Fixed 10 missing COPYs across backends, services, and nestjs-base.
Generator also supports backend/service Dockerfiles with markers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 08:57:10 +01:00
Till JS
e3115b302d feat(infra): add Cloudflare fallback plan + self-hosted landing pages
Two infrastructure improvements for tech independence:

1. Cloudflare Fallback Documentation (docs/CLOUDFLARE_FALLBACK.md):
   - Plan B: WireGuard + Caddy on Hetzner VPS (€3.79/mo)
   - Complete Caddyfile with all 30+ subdomains
   - Step-by-step failover checklist (~15 min to switch)
   - Plan C: Direct IP with ISP

2. Self-Hosted Landing Pages (eliminates Cloudflare Pages dependency):
   - Nginx container (mana-infra-landings) on port 4400
   - Multi-site config: each subdomain → separate dist/ folder
   - Build script: scripts/mac-mini/build-landings.sh
   - Cloudflare Tunnel ingress rules for 10 landing page domains
   - Storage: /Volumes/ManaData/landings/ on external SSD
   - Domains: it, chats, pics, zitares, presis, clocks,
     manadeck, nutriphi, citycorners, docs

Migration path: Build landings locally, set Cloudflare DNS to
tunnel instead of Pages, then decommission CF Pages projects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 12:07:40 +01:00
Till JS
e06e8cca59 fix(infra): use postgres -c flags instead of config_file override
The config_file override replaced the entire default PostgreSQL config
including listen_addresses, breaking inter-container communication.
Use inline -c flags instead which only override specific parameters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:42:42 +01:00