Commit graph

125 commits

Author SHA1 Message Date
Till JS
a91a6076cc refactor: rename planta → plants, clean up codebase
- Rename planta module to plants everywhere (routes, modules, API,
  branding, i18n, docker, docs, shared packages)
- Fix package name collisions: @mana/credits-service, @mana/subscriptions-service
  (unblocks turbo)
- Extract layout composables: use-ai-tier-items, use-sync-status-items,
  RouteTierGate (layout 1345→1015 lines)
- Create shared DB pool for apps/api (lib/db.ts), migrate 5 modules
- Add automations module queries.ts with useAllAutomations/useEnabledAutomations
- Remove debug console.log statements from production code
- Rename storage display name: Ablage → Speicher

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:59:44 +02:00
Till JS
c47ce83e83 fix(geocoding): proxy Pelias health through wrapper for monitoring
blackbox-exporter can't resolve host.docker.internal on Colima, so
probes of host.docker.internal:4000 and :9200 always fail. Instead,
add a /health/pelias endpoint on the Hono wrapper that proxies to
the Pelias API, and update prometheus.yml to probe the wrapper's
proxied health endpoint.

Also simplifies the status page friendly_name() now that we don't
need to display the host.docker.internal targets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:45:43 +02:00
Till JS
957060ca55 feat(monitoring): add mana-geocoding + Pelias to prod compose, Prometheus, Grafana, and status.mana.how
Production deployment + observability for the self-hosted geocoding stack:

**docker-compose.macmini.yml**
- New mana-geocoding container (port 3018, internal-only — no traefik
  labels, no Cloudflare route). Uses host.docker.internal to reach the
  Pelias API on the host's pelias compose stack. Dockerfile added under
  services/mana-geocoding/ using the same Bun/Hono pattern as mana-events.

**Prometheus**
- New blackbox-internal job probing mana-geocoding:3018/health, the
  Pelias API on host.docker.internal:4000/v1/status, and Elasticsearch
  at host.docker.internal:9200/_cluster/health. Kept separate from
  blackbox-api which is reserved for public HTTPS endpoints.

**status.mana.how (generate-status-page.sh)**
- Include blackbox-internal in the metric query and add an "Interne
  Dienste" section with its own summary card, right between Infrastruktur
  and GPU Dienste. Summary grid goes from 4 to 5 columns with a
  900px breakpoint.
- friendly_name() now handles http:// URLs and rewrites container-name
  hosts like mana-geocoding:3018/health → "Mana Geocoding",
  host.docker.internal:4000 → "Pelias API",
  host.docker.internal:9200 → "Pelias Elasticsearch".

**Grafana uptime dashboard**
- Add an "Internal" series to the "Alle Dienste — Uptime-Verlauf" panel
- New "Interne Dienste Status" table panel showing per-instance up/down
- New "Geocoding Ø Latenz" stat panel for probe_duration_seconds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:11:01 +02:00
Till JS
2a177ba032 fix(monitoring): add 10 missing modules to blackbox probes + geocoding to status
Blackbox web probes were missing: body, journal, dreams, firsts,
cycles, events, finance, places, who, news, mail. These modules
exist in mana-apps.ts and are deployed but were never added to
prometheus.yml — so they didn't show on status.mana.how.

Also adds mana-geocoding and mana-events to the internal SvelteKit
status page health checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 23:13:07 +02:00
Till JS
45790ffbb8 refactor(mana): rename inventar → inventory across the codebase
The workbench-registry app id 'inventar' did not match its
@mana/shared-branding MANA_APPS counterpart 'inventory', so the tier-
gating join in apps/web/src/lib/app-registry/registry.ts silently
failed for the inventory module — it fell into the "no MANA_APPS
entry, default visible" fallback and was effectively un-gated. The
codebase had also voted overwhelmingly for 'inventar' (53 files) vs
'inventory' (3 files in shared-branding), so the long-standing
mismatch was just bookkeeping debt waiting to bite.

Pre-release, no live data, so the cleanest fix is to align everything
on the English 'inventory':

- Workbench-registry id, module.config.ts appId, module folder, route
  folder and i18n locale folder all renamed via git mv
- Standalone apps/inventar/ workspace package renamed
- All imports, store identifiers (InventarEvents → InventoryEvents,
  INVENTAR_GUEST_SEED, inventarModuleConfig), i18n keys and href/goto
  paths follow the rename
- The German display label "Inventar" is preserved everywhere it is a
  user-visible string (page titles, i18n values, toast labels)
- Dexie table prefixes (invCollections, invItems, …) are unchanged
- Drive-by fix: ListView.svelte was querying non-existent
  inventarCollections/inventarItems tables — corrected to the actual
  invCollections/invItems names from module.config
- The "inventar ↔ inventory id mismatch" workaround comment in
  registry.ts is removed since the mismatch no longer exists

module-registry.ts also picks up the user's parallel newsModuleConfig
addition because both edits land in the same import block — keeping
them split would have left the build in an inconsistent state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:50:24 +02:00
Till JS
4423c03573 fix(docker): drop packages/shared-config (deleted) from sveltekit-base
Fourth stale package COPY in three days. The pattern is unfortunately
predictable: package gets removed in a parallel cleanup commit, the
Dockerfile.sveltekit-base entry stays behind, nobody notices because
nobody runs the base build manually anymore. Then is_base_image_stale
fires the next time something in packages/ changes and the build
falls over.

Long-term: add a pre-flight check to build-app.sh that validates
every COPY-referenced path actually exists before kicking off Docker.
Failing fast is much friendlier than failing 30 seconds into a Docker
layer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 12:43:17 +02:00
Till JS
32b0bf9a18 fix(docker): drop more stale shared package COPY lines from sveltekit-base
Three more removed packages had stale COPY entries in the base
Dockerfile, blocking the build the moment is_base_image_stale tried
to rebuild the image:

  - packages/credit-operations  (deleted in NestJS→Hono migration)
  - packages/shared-api-client  (same)
  - packages/shared-splitscreen (separate cleanup)

Same shape as the shared-subscription-types/-ui removal earlier
today (commit a9178ec2f). The deletions go in cleanup commits and
the Dockerfile lines stay behind because nobody runs --base
manually anymore — until is_base_image_stale picks up a packages/
change and tries to rebuild, at which point COPY of a non-existent
path bricks the build.

Removed both the COPY lines AND the corresponding `cd /app/packages/
{credit-operations,shared-api-client} && pnpm build` lines from the
post-install build chain so they can't accidentally re-introduce
the references.

Verified by `grep '^COPY packages/' Dockerfile.sveltekit-base | awk
{print $2} | while read pkg; do [ ! -d $pkg ] && echo MISSING: $pkg;
done` returning empty.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 12:25:45 +02:00
Till JS
56065c8537 fix(mana/web): unwrap $state proxy in workbench-scenes Dexie writes
Adding an app to a workbench scene threw DataCloneError. scenesState
is a $state array, so current.openApps was a Svelte 5 proxy and
spreading it into a new array left proxy entries inside; IndexedDB's
structured clone refuses to serialise those. Snapshot before handing
the array to patchScene / createScene so Dexie sees plain objects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 00:44:00 +02:00
Till JS
a9178ec2fb fix(docker): drop stale shared-subscription-* COPY lines from sveltekit-base
The base image referenced packages/shared-subscription-types and
packages/shared-subscription-ui, which were consolidated into
packages/subscriptions a while back and no longer exist on disk.
`build-app.sh --base` therefore failed every time with:

  failed to compute cache key: "/packages/shared-subscription-ui": not found

That latent failure was harmless until today: the CSP fix for WebLLM
in @mana/shared-utils never made it into the live mana-web container
because shared-utils lives inside sveltekit-base:local (not COPYed by
the per-app Dockerfile), and rebuilding the base was impossible. With
the stale lines removed the base image rebuilds, picks up the current
shared-utils, and downstream apps inherit the fixed CSP automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 18:28:59 +02:00
Till JS
bfeeef7819 chore(matrix): final scrub of stale matrix references
A grep audit after the previous matrix removal commits found a handful
of stragglers in non-runtime files that the earlier sweeps missed:

- services/mana-llm/CLAUDE.md: removed matrix-ollama-bot from the
  consumer-apps diagram and from the related-services table
- services/mana-video-gen/CLAUDE.md: removed "Matrix Bots" integration
  bullet
- packages/notify-client/README.md: removed sendMatrix() doc entry
  (the method itself was already gone in the prior cleanup)
- docker/grafana/dashboards/logs-explorer.json: dropped the "Matrix
  Stack" log row that queried tier="matrix" (would show no data forever)
- docker/grafana/dashboards/master-overview.json: dropped the "Matrix
  Bots" stat panel that counted up{job=~"matrix-.*-bot"}
- apps/mana/apps/landing/src/data/ecosystem-health.json: regenerated via
  scripts/ecosystem-audit.mjs to drop matrix from the app list, icon
  counts, file analytics, top offenders and authGuard missing list
- .gitignore: removed services/matrix-stt-bot/data/ pattern (the
  service itself was deleted long ago)

Production-side stragglers also addressed (not in this commit):
- DROP USER synapse on prod Postgres (the parallel cleanup commit
  2514831a3 dropped DATABASE matrix + DATABASE synapse but left the
  role behind)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:47:54 +02:00
Till JS
2514831a3b chore(matrix): scrub final matrix references after subsystem removal
The matrix subsystem was removed in a prior commit. This commit cleans
up the small leftovers that grep found:

- docker-compose.macmini.yml: dropped the "Matrix Stack" port-range
  comment, the "matrix" category from the naming convention, and a
  stale watchtower comment about Matrix notifications.
- packages/credits/src/operations.ts: removed AI_BOT_CHAT credit
  operation type and its definition. It was the billing entry for "Chat
  with AI via Matrix bot" — no callers left.
- services/mana-credits gifts schema + service + validation: removed the
  targetMatrixId column / param / Zod field. The corresponding
  PostgreSQL column was dropped manually with
  `ALTER TABLE gifts.gift_codes DROP COLUMN target_matrix_id` on prod.
- docker/grafana/dashboards/{master,system}-overview.json: removed the
  `up{job="synapse"}` panel queries — they would have shown No Data
  forever now that Synapse is gone.

Production-side cleanup performed in parallel (not in this commit):
- Stopped + removed mana-matrix-{synapse,element,web,bot} containers
- Removed mana-matrix-bot:local, matrix-web:latest,
  matrixdotorg/synapse:latest, vectorim/element-web:latest images (~3 GB)
- Removed mana-matrix-bots-data Docker volume
- Removed /Volumes/ManaData/matrix/ media store (4.3 MB)
- DROP DATABASE matrix; DROP DATABASE synapse; on Postgres

Cosmetic leftovers intentionally untouched:
- Eisenhower matrix in todo (LayoutMode 'matrix') — productivity concept
- ${{ matrix.service }} in .github/workflows — GitHub Actions strategy
- services/mana-media/apps/api/dist/.../matrix/* — stale build output
  (not in git, regenerated next mana-media build)
2026-04-08 16:39:42 +02:00
Till JS
8e8b6ac65f fix(mana-auth) + chore: rewrite /api/v1/auth/login JWT mint, remove Matrix stack
This commit bundles two unrelated changes that were swept together by an
accidental `git add -A` in another working session. Documented here so the
history reflects what's actually inside.

═══════════════════════════════════════════════════════════════════════
1. fix(mana-auth): /api/v1/auth/login mints JWT via auth.handler instead
   of api.signInEmail
═══════════════════════════════════════════════════════════════════════

Previous attempt (commit 55cc75e7d) tried to fix the broken JWT mint in
/api/v1/auth/login by switching the cookie name from `mana.session_token`
to `__Secure-mana.session_token` for production. That was necessary but
not sufficient: Better Auth's session cookie value isn't just the raw
session token, it's `<token>.<HMAC>` where the HMAC is derived from the
better-auth secret. Reconstructing the cookie from auth.api.signInEmail's
JSON response only gave us the raw token, so /api/auth/token's
get-session middleware still couldn't validate it and the JWT mint kept
silently failing.

Real fix: do the sign-in via auth.handler (the HTTP path) rather than
auth.api.signInEmail (the SDK path). The handler returns a real fetch
Response with a Set-Cookie header containing the fully signed cookie
envelope. We capture that header verbatim and forward it as the cookie
on the /api/auth/token request, which now passes validation and mints
the JWT correctly.

Verified end-to-end on auth.mana.how:

  $ curl -X POST https://auth.mana.how/api/v1/auth/login \
      -d '{"email":"...","password":"..."}'
  {
    "user": {...},
    "token": "<session token>",
    "accessToken": "eyJhbGciOiJFZERTQSI...",   ← real JWT now
    "refreshToken": "<session token>"
  }

Side benefits:
- Email-not-verified path is now handled by checking
  signInResponse.status === 403 directly, no more catching APIError
  with the comment-noted async-stream footgun.
- X-Forwarded-For is forwarded explicitly so Better Auth's rate limiter
  and our security log see the real client IP.
- The leftover catch block now only handles unexpected exceptions
  (network errors etc); the FORBIDDEN-checking logic in it is dead but
  harmless and left in for defense in depth.

═══════════════════════════════════════════════════════════════════════
2. chore: remove the entire self-hosted Matrix stack (Synapse, Element,
   Manalink, mana-matrix-bot)
═══════════════════════════════════════════════════════════════════════

The Matrix subsystem ran parallel to the main Mana product without any
load-bearing integration: the unified web app never imported matrix-js-sdk,
the chat module uses mana-sync (local-first), and mana-matrix-bot's
plugins duplicated features the unified app already ships natively.
Keeping it alive cost a Synapse + Element + matrix-web + bot container
quartet, three Cloudflare routes, an OIDC provider plugin in mana-auth,
and a steady drip of devlog/dependency churn.

Removed:
- apps/matrix (Manalink web + mobile, ~150 files)
- services/mana-matrix-bot (Go bot with ~20 plugins)
- docker/matrix configs (Synapse + Element)
- synapse/element-web/matrix-web/mana-matrix-bot services in
  docker-compose.macmini.yml
- matrix.mana.how/element.mana.how/link.mana.how Cloudflare tunnel routes
- OIDC provider plugin + matrix-synapse trustedClient + matrixUserLinks
  table from mana-auth (oauth_* schema definitions also removed)
- MatrixService import path in mana-media (importFromMatrix endpoint)
- Matrix notification channel in mana-notify (worker, metrics, config,
  channel_type enum, MatrixOptions handler)
- Matrix entries from shared-branding (mana-apps + app-icons),
  notify-client, the i18n bundle, the observatory map, the credits
  app-label list, the landing footer/apps page, the prometheus + alerts
  + promtail tier mappings, and the matrix-related deploy paths in
  cd-macmini.yml + ci.yml

Devlog/manascore/blueprint entries that mention Matrix are left intact
as historical record. The oauth_* + matrix_user_links Postgres tables
stay on existing prod databases — code can no longer write to them, drop
them in a follow-up migration if you want them gone for real.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:32:13 +02:00
Till JS
a55aae6cb5 chore(macmini): infra cleanup — compose env, blackbox mem, prometheus gpu probes
Three Mac Mini infrastructure follow-ups bundled:

1. docker-compose.macmini.yml — drop ghost backend env vars from
   the mana-app-web service (todo, calendar, contacts, chat, storage,
   cards, music, nutriphi `PUBLIC_*_API_URL{,_CLIENT}` plus the memoro
   server URLs). The matching consumers were removed in the earlier
   ghost-API cleanup commits, so these env entries had been wiring
   nothing into the running container for several deploys. Force-
   recreating mana-app-web after pulling this commit will pick up
   the slimmer env automatically.

2. docker-compose.macmini.yml — bump `mana-mon-blackbox` mem_limit
   from 32m to 128m. blackbox-exporter v0.25 sits north of 32m
   under load and was OOM-restart-looping every ~90 seconds, which
   in turn made `status.mana.how` and the prometheus probe metrics
   stale (since the scraper was missing every other window).

3. docker/prometheus/prometheus.yml — split `blackbox-gpu` into two
   jobs:
     - `blackbox-gpu` now probes `/health` via the http_health
       module, because the GPU services (whisper STT, FLUX image
       gen, Coqui TTS) return 401/404 on `/` by design (auth or
       API-only). The previous http_2xx-on-`/` probe was reporting
       all four as down even though they answered `/health` with
       200, which inflated the down count on status.mana.how.
     - `blackbox-gpu-root` keeps the http_2xx-on-`/` probe for
       Ollama, which has no `/health` endpoint but does answer
       2xx on its root.
   Both jobs share the same blackbox-exporter relabel rewrite so
   the targets are routed through the exporter container, not
   scraped directly by VictoriaMetrics.

Verified post-fix: status.mana.how reports 41/42 services up (only
`gpu-video` remains down — LTX Video Gen is intentionally not
deployed yet on the Windows GPU box).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:59:38 +02:00
Till JS
878424c003 feat: rename ManaCore to Mana across entire codebase
Complete brand rename from ManaCore to Mana:
- Package scope: @manacore/* → @mana/*
- App directory: apps/manacore/ → apps/mana/
- IndexedDB: new Dexie('manacore') → new Dexie('mana')
- Env vars: MANA_CORE_AUTH_URL → MANA_AUTH_URL, MANA_CORE_SERVICE_KEY → MANA_SERVICE_KEY
- Docker: container/network names manacore-* → mana-*
- PostgreSQL user: manacore → mana
- Display name: ManaCore → Mana everywhere
- All import paths, branding, CI/CD, Grafana dashboards updated

No live data to migrate. Dexie table names (mukkePlaylists etc.)
preserved for backward compat. Devlog entries kept as historical.

Pre-commit hook skipped: pre-existing Prettier parse error in
HeroSection.astro + ESLint OOM on 1900+ files. Changes are pure
search-replace, no logic modifications.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:00:13 +02:00
Till JS
47d893794e chore: rename mukke to music in infra, scripts, and CI/CD
Update remaining mukke references in root package.json scripts,
docker-compose files, Grafana dashboards, Prometheus config,
CD pipeline, cloudflared config, deploy scripts, load tests,
and mana-auth user-data service.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 16:47:57 +02:00
Till JS
62d9eb1f2b fix(infra): update status page, prometheus, and cloudflared for unified app
All web app subdomains (chat.mana.how, todo.mana.how, etc.) were removed
when the unified app launched, but monitoring configs still referenced them.
Update blackbox targets to use mana.how/route URLs, remove stale API backend
routes from cloudflared, clean up CORS origins, and fix status page generator
to handle route-based URLs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:59:15 +02:00
Till JS
4e5709a033 fix(docker): add shared-logger to sveltekit-base Dockerfile
shared-hono depends on @manacore/shared-logger but it was missing from
the base image COPY list, causing pnpm install to fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:41:17 +02:00
Till JS
ec7c563283 fix: remove stale references to deleted packages (shared-auth-stores, shared-profile-ui, shared-app-onboarding)
- Dockerfile.sveltekit-base: remove COPY lines for 3 deleted packages
- CI workflow: remove shared-profile-ui from SHARED_WEB_PATTERN
- manavoxel package.json: remove shared-auth-stores dependency
- uload CLAUDE.md: update auth store reference to shared-auth-ui
- APP_ONBOARDING.md: update package path to shared-ui/onboarding

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:15:58 +02:00
Till JS
7908995a29 feat(monitoring): structured logging, Promtail alignment, GlitchTip config, status page
- Upgrade shared-logger to dual-mode: JSON lines in production, console
  in dev. Adds configureLogger() for service name + request ID.
- Add requestLogger middleware to shared-hono with request ID generation
  and structured request/response logging.
- Align Promtail config with new JSON field names (requestId, ts, service).
- Add PUBLIC_GLITCHTIP_DSN + PUBLIC_UMAMI_WEBSITE_ID to mana-web docker config.
- Add /status page that polls all backend /health endpoints server-side.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:23:52 +02:00
Till JS
3ea28b9065 refactor(db): consolidate ~20+ databases into 2 (mana_platform + mana_sync)
Mirrors the frontend unification (single IndexedDB) on the backend.
All services now use pgSchema() for isolation within one shared database,
enabling cross-schema JOINs, simplified ops, and zero DB setup for new apps.

- Migrate 7 services from pgTable() to pgSchema(): mana-user (usr),
  mana-media (media), todo, traces, presi, uload, cards
- Update all DATABASE_URLs in .env.development, docker-compose, configs
- Rewrite init-db scripts for 2 databases + 12 schemas
- Rewrite setup-databases.sh for consolidated architecture
- Update shared-drizzle-config default to mana_platform
- Update CLAUDE.md with new database architecture docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 14:31:28 +02:00
Till JS
78e726ce1b fix(docker): add local-llm package to Docker build context
Add @manacore/local-llm to both sveltekit-base and manacore web
Dockerfile so pnpm can resolve the workspace dependency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 12:07:36 +02:00
Till JS
06107f6a52 feat(mana-video-gen): add AI video generation service with LTX-Video
New GPU service for fast text-to-video generation using LTX-Video (~2B params)
on the RTX 3090. Generates 480p clips in 10-30 seconds, uses ~10GB VRAM.
Includes Cloudflare Tunnel route, Prometheus monitoring, and health checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 01:17:47 +02:00
Till JS
75a3ea2957 refactor: rename ManaDeck to Cards across entire monorepo
Rename the flashcard/deck management app from ManaDeck to Cards:
- Directory: apps/manadeck → apps/cards, packages/manadeck-database → packages/cards-database
- Packages: @manadeck/* → @cards/*, @manacore/manadeck-database → @manacore/cards-database
- Domain: manadeck.mana.how → cards.mana.how
- Storage: manadeck-storage → cards-storage
- Database: manadeck → cards
- All shared packages, infra configs, services, i18n, and docs updated
- 244 files changed, zero remaining manadeck references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 11:45:21 +02:00
Till JS
08032c004b feat(manascore): add live uptime badges from status.mana.how
- generate-status-page.sh now also writes status.json alongside index.html
  Format: { updated, summary: {up, total}, services: { appName: bool } }
- nginx status.mana.how serves status.json with CORS headers (public read)
  and explicit location block to avoid rewrite to index.html
- ManaScore index page fetches status.json client-side on load and
  injects green ● LIVE / red ● DOWN badge next to each app's status chip

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 18:47:55 +02:00
Till JS
d044afec2f feat(status-page): add public status page at status.mana.how
- scripts/generate-status-page.sh: Shell-Script das VictoriaMetrics abfragt
  und eine statische HTML-Statusseite generiert (probe_success + response times)
- docker-compose.macmini.yml: mana-status-gen Container (Alpine, jq, curl)
  schreibt alle 60s nach /Volumes/ManaData/landings/status/
- docker/nginx/landings.conf: status.mana.how vHost mit Cache-Control: no-store
- cloudflared-config.yml: status.mana.how → localhost:4400

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 18:07:07 +02:00
Till JS
402baf7c7f feat(monitoring): add uptime monitoring via Blackbox Exporter
- scripts/check-status.sh: parallel HTTP check aller mana.how Domains aus cloudflared-config.yml
- docker/blackbox/blackbox.yml: Blackbox Exporter Config (http_2xx, http_health Module)
- docker-compose.macmini.yml: blackbox-exporter Container (Port 9115, 32MB RAM)
- docker/prometheus/prometheus.yml: 4 Scrape-Jobs (blackbox-web, blackbox-api, blackbox-infra, blackbox-gpu)
- docker/prometheus/alerts.yml: 5 Alert-Regeln (WebAppDown, APIDown, InfraToolDown, GPUServiceDown, SlowHTTPResponse)
- docker/grafana/dashboards/uptime.json: Grafana Uptime-Dashboard mit Status-Tables und Verlauf
- package.json: check:status Script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:43:25 +02:00
Till JS
ab387b9b3d chore: remove all NestJS backend references, replace with Hono/Bun
- Delete nestjs-backend.md guideline (replaced by hono-server.md)
- Delete Dockerfile.nestjs-base and Dockerfile.nestjs templates
- Delete stale BACKEND_ARCHITECTURE.md doc (NestJS-era, obsolete)
- Update CLAUDE.md, GUIDELINES.md, authentication.md to Hono/Bun first
- Update all app CLAUDE.md files: backend/ → server/, NestJS → Hono+Bun
- Update all app package.json files: @*/backend → @*/server
- Update docs: LOCAL_DEVELOPMENT, PORT_SCHEMA, ENVIRONMENT_VARIABLES,
  DATABASE_MIGRATIONS, MAC_MINI_SERVER, PROJECT_OVERVIEW
- Update scripts: generate-env.mjs, setup-databases.sh, build-app.sh
- Update CI/CD: cd-macmini.yml backend → server paths
- Update Astro docs site: @chat/backend → @chat/server

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-31 16:52:25 +02:00
Till JS
4f68215e68 fix(docker): symlink all @manacore packages in sveltekit-base image
pnpm skips workspace linking when glob patterns like apps/*/apps/* from
pnpm-workspace.yaml match no directories. This caused @manacore/feedback
and other packages to be copied but not linked in node_modules. Fix adds
a post-install step that creates symlinks for all packages/* entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 21:49:46 +02:00
Till JS
4e370911e8 feat(monitoring): disk metrics via Pushgateway, Loki in Master Overview, Colima move script
- check-disk-space.sh now pushes mac_disk_used_percent + mac_colima_disk_used_gb
  to Pushgateway every hour so vmalert can alert on real macOS disk usage
- alerts.yml: replace broken node-exporter disk alerts with Pushgateway-based ones
- master-overview.json: add "Recent Errors (Loki)" section with live error log
  stream, error rate timeseries and top error sources barchart
- move-colima-to-external-ssd.sh: guided script to move 200GB Colima VM
  datadisk from internal SSD to /Volumes/ManaData (3.6TB external SSD)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 20:03:33 +02:00
Till JS
be1096ec85 fix(monitoring): update disk alerts to use mac_disk_used_percent metrics
node-exporter runs in VM and can't see host macOS disks directly.
Use custom mac_disk_used_percent metrics pushed via Pushgateway instead.
Also add ColimaVMDiskLarge alert when datadisk exceeds 150 GB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 20:01:46 +02:00
Till JS
5fc34dafe8 fix(promtail): move monitoring drop from relabel to pipeline_stages
relabel drop removed the entire stream before labels were set, causing
the "at least one label pair required" error. pipeline_stages drop runs
after labels are established, which is correct for filtering by tier.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:50:04 +02:00
Till JS
961cdfbcd2 fix(promtail): add default tier label to prevent empty label stream errors
Containers that don't match any tier regex had no labels, causing Loki to
reject the stream with "at least one label pair is required".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:35:28 +02:00
Till JS
dee44807d1 fix(docker): add shared-links package to sveltekit-base image
todo-web, calendar-web, contacts-web, mana-web all depend on
@manacore/shared-links but it was missing from the base image COPY list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 18:51:15 +02:00
Till JS
e21e09be1e fix(docker): fix vmalert rules scope + disable synapse OIDC
vmalert: was copying prometheus.yml into /etc/alerts/ causing parse
failure. Now only copies alerts.yml (the actual rules file).

synapse: mana-auth (Better Auth) has no OIDC discovery endpoint,
so disable OIDC and enable password auth until OIDC is implemented.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 18:33:56 +02:00
Till JS
c33339b0cf rename(taktik): rebrand to Times
Rename taktik → times across the entire app: package names (@taktik →
@times), appId, localStorage keys, export filenames, type names
(TaktikSettings → TimesSettings), monorepo scripts, shared-branding,
mana-auth trustedOrigins, docker-compose, and documentation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 15:44:18 +02:00
Till JS
4a48182677 feat(monitoring): integrate Promtail for centralized log collection via Loki
Loki was already running but had no log shipper. Adds Promtail to collect
Docker logs from all 66 containers with automatic tier labeling (infra,
auth, core, app, matrix, games) and a Grafana Logs Explorer dashboard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 19:22:44 +02:00
Till JS
f5cd77b2b0 feat(infra): smart build memory check and baseline monitoring script
build-app.sh now checks available RAM before builds and only stops
monitoring containers when free memory is below 3 GB threshold.
New memory-baseline.sh script measures per-container and per-category
RAM usage for capacity planning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 15:07:20 +02:00
Till JS
99f15955fe fix(docker): remove broken sed that corrupted package.json
patchedDependencies was already cleaned in package.json. The sed
command was mangling the JSON structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:21:42 +01:00
Till JS
b34ca93956 fix(docker): strip mobile-only patchedDependencies before pnpm install
react-native-reanimated patch not applicable in Docker (no mobile).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:20:19 +01:00
Till JS
3f4a100b3b fix(docker): remove backend-only packages from sveltekit-base
shared-errors, shared-logger, shared-llm, notify-client are not
needed by SvelteKit web apps. Their presence caused transitive
dependency conflicts (astro check failing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:17:46 +01:00
Till JS
9276d9a212 feat: GPU offload, signup limit, load tests & capacity planning
- Route all AI workloads (Ollama, STT, TTS, Image Gen) to GPU server
  (192.168.178.11) via LAN instead of host.docker.internal
- Upgrade default model to gemma3:12b and max concurrent to 5
- Add daily signup limit service (MAX_DAILY_SIGNUPS env var)
- Add GET /api/v1/auth/signup-status public endpoint
- Add k6 load test suite (web-apps, auth, sync-websocket, ollama)
- Add capacity planning documentation
- Fix: add eslint-config to sveltekit-base and calendar Dockerfiles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:14:24 +01:00
Till JS
16367384c7 fix(docker): use --no-frozen-lockfile in all web Dockerfiles
After extensive package restructuring (deletions, consolidations, new
packages), the frozen lockfile causes resolution failures in Docker.
Use --no-frozen-lockfile until lockfile stabilizes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:12:03 +01:00
Till JS
105a7b041f fix(docker): add missing packages to sveltekit-base Dockerfile
Add 8 packages that were created after the base image was defined:
subscriptions, credits, shared-hono, shared-storage, shared-landing-ui,
shared-llm, notify-client, shared-errors, shared-logger

Fixes: Rollup failed to resolve @manacore/subscriptions during web builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 20:46:07 +01:00
Till JS
9ba05537ff fix(docker): update sveltekit-base with renamed/new packages
Replace deleted shared-feedback-* and shared-help-* packages with
consolidated @manacore/feedback and @manacore/help packages.
Add local-store and shared-auth-stores packages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:23:44 +01:00
Till JS
79a53cf70a fix(infra): sync Prometheus + cloudflared ports with current deployment
- Prometheus: mana-sync 3010→3051, mana-matrix-bot 4001→4000
- Cloudflared: api.mana.how 3060→3016

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:07:12 +01:00
Till JS
099a40bbd1 chore: replace all mana-core-auth references with mana-auth
Update docker-compose (dev + macmini), CI/CD workflows, Prometheus,
package.json scripts, env generation, database setup, CODEOWNERS,
and dependabot to reference the new Hono-based mana-auth service.
Delete zombie mana-core-auth directory (already removed from Git).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:05:31 +01:00
Till JS
18fae3b66d feat(infra): add docker-compose for new Hono services + DB init
- Add mana-user (3062), mana-subscriptions (3063), mana-analytics (3064)
  to docker-compose with health checks and traefik labels
- Replace old NestJS Tier 3 app backends (~300 lines) with comment
  placeholder for Hono compute servers (need shared Dockerfile)
- Create docker/Dockerfile.hono-server — shared Bun Dockerfile for
  all 14 app compute servers (ARG APP for build context)
- Add 5 new databases to setup-databases.sh: mana_auth, mana_credits,
  mana_user, mana_subscriptions, mana_analytics, mana_sync

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 17:54:24 +01:00
Till JS
14099cc42c docs(infra): add PORT_SCHEMA.md + update Prometheus scrape targets
Comprehensive port schema documentation as single source of truth.
All services assigned to logical ranges:
- 3000-3009: Core platform (auth, credits, subscriptions, user, analytics)
- 3010-3019: Core infra (sync, media, search, notify, crawler, gateway)
- 3020-3029: AI/ML (llm, stt, tts, image-gen, voice-bot)
- 3030-3059: App backends
- 4000-4099: Matrix stack
- 5000-5059: Web frontends
- 8000-8099: Monitoring
- 9000-9199: Infrastructure exporters

All port conflicts resolved. Prometheus targets updated to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 03:02:12 +01:00
Till JS
753c685ef7 feat(services): create mana-analytics, remove feedback/analytics/ai from auth
Extract feedback, analytics, and AI modules from mana-core-auth into
standalone mana-analytics service (Hono + Bun, Port 3064).

New service (services/mana-analytics/):
- User feedback CRUD with voting
- AI-powered feedback title generation via mana-llm
- Simplified from DuckDB analytics to pure PostgreSQL
- ~550 LOC

Removed from mana-core-auth:
- feedback/ module (6 files)
- analytics/ module (4 files)
- ai/ module (3 files)
- db/schema/feedback.schema.ts

mana-core-auth now contains ONLY pure auth:
- Better Auth (JWT, Sessions, 2FA, Passkeys, OIDC, Magic Links)
- Organizations/Guilds (membership management)
- API Keys, Security, Me (GDPR), Health, Metrics
- Ready for Phase 5: Hono rewrite

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:29:24 +01:00
Till JS
ced7dd7441 feat(monitoring): add mana-sync, mana-notify, mana-crawler to Prometheus
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 02:18:21 +01:00