Troubleshooting Guide
Common issues and solutions for the manacore-monorepo.
Table of Contents
- Recursive Turbo Calls
- Build Issues
- Linting Issues
- NestJS Dependency Injection
- Staging Deployment Issues
- GitHub Running Disabled Workflows
- chat-backend Container Unhealthy
- SvelteKit Static Environment Variable Imports
- Orphan Docker Containers
- Client-Side Calling localhost Instead of Public IP
- CORS Blocking Cross-Origin Requests
- Missing Database Schema
- pnpm Symlinks Broken in Docker Container
- Hardcoded localhost URLs in SvelteKit Web Apps
Recursive Turbo Calls
Problem: Infinite Loop / Tasks Running Forever
Symptoms:
- pnpm run build runs for 10+ minutes without completing
- pnpm run lint hangs indefinitely
- pnpm run type-check shows thousands of duplicate task entries
- CI/CD pipelines timeout after 10+ minutes
Root Cause:
Parent workspace packages (e.g., apps/zitare/package.json, apps/presi/package.json) have scripts that call turbo run <task>, creating an infinite recursion loop.
How It Happens
Root turbo → finds "build" script in apps/zitare/package.json
→ runs "turbo run build" in zitare
→ finds "build" script again
→ runs "turbo run build" again
→ (infinite loop!)
❌ WRONG - Causes Infinite Recursion
// apps/zitare/package.json - DON'T DO THIS!
{
"scripts": {
"build": "turbo run build", // ❌ WRONG
"lint": "turbo run lint", // ❌ WRONG
"type-check": "turbo run type-check", // ❌ WRONG
"clean": "turbo run clean" // ❌ WRONG
}
}
// apps/picture/package.json - DON'T DO THIS!
{
"scripts": {
"build": "pnpm run --recursive build", // ❌ WRONG
"lint": "pnpm --filter '@picture/*' run lint" // ❌ WRONG
}
}
✅ CORRECT - Let Root Turbo Handle Orchestration
// apps/zitare/package.json - CORRECT
{
"scripts": {
"dev": "turbo run dev", // ✅ OK for dev (persistent task, scoped)
// No build, lint, type-check scripts - handled by root turbo
"db:push": "pnpm --filter @zitare/backend db:push", // ✅ OK
"db:studio": "pnpm --filter @zitare/backend db:studio" // ✅ OK
}
}
Why dev is the Exception
Using turbo run dev in parent packages is acceptable because:
- It's typically run directly on that package (scoped: pnpm zitare:dev)
- Dev tasks are persistent (long-running) and turbo handles them differently
- Root never orchestrates dev across all packages simultaneously
The Rule
Parent workspace packages must NEVER have scripts that call turbo run <task> for tasks that turbo orchestrates from the root.
Tasks orchestrated from root (defined in turbo.json):
- ✅ build - Root handles this
- ✅ lint - Root handles this
- ✅ type-check - Root handles this
- ✅ test - Root handles this
- ✅ clean - Root handles this
- ❌ dev - Exception (scoped usage is fine)
How to Fix
If you added a recursive script:
- Open the parent package.json (e.g., apps/myapp/package.json)
- Remove the problematic script entirely:
{
"scripts": {
"dev": "turbo run dev",
- "build": "turbo run build",
- "lint": "turbo run lint",
- "type-check": "turbo run type-check",
"db:push": "pnpm --filter @myapp/backend db:push"
}
}
- The root turbo.json already handles orchestration for these tasks
Affected Locations
Parent packages are located at:
- apps/*/package.json (e.g., apps/zitare/package.json)
- games/*/package.json (e.g., games/mana-games/package.json)
Do NOT add turbo scripts here!
Child packages (these are fine):
- apps/*/apps/*/package.json (e.g., apps/zitare/apps/backend/package.json)
- packages/*/package.json (e.g., packages/shared-theme/package.json)
Build Issues
Build Fails with "ELIFECYCLE Command failed"
Check for:
- Recursive turbo calls (see above)
- Missing dependencies in a package
- TypeScript errors in source code
- Import/export mismatches
Debugging:
# Run build and capture full output
pnpm run build 2>&1 | tee build.log
# Search for actual error (not just ELIFECYCLE)
grep -A10 "error during build" build.log
# Build specific package to isolate issue
pnpm --filter @zitare/backend build
Build Times Out in CI
Symptoms:
- CI runs for 10+ minutes
- Timeout before completion
- "No output has been received in the last 10m0s"
Solution:
This is almost always caused by recursive turbo calls. See the Recursive Turbo Calls section above.
Quick fix:
# Locally, check if build completes in reasonable time
time pnpm run build
# Should complete in < 2 minutes for clean build
# Should complete in < 30 seconds for cached build
If it takes longer than 2-3 minutes, you have recursive scripts.
Linting Issues
Lint Hangs or Runs Forever
Same issue as build - recursive turbo calls!
❌ WRONG:
// apps/presi/package.json - DON'T DO THIS!
{
"scripts": {
"lint": "pnpm --filter '@presi/*' run lint" // ❌ Recursive
}
}
✅ CORRECT:
// apps/presi/package.json - Remove the lint script
{
"scripts": {
"dev": "pnpm --filter '@presi/*' run dev"
// No lint script - root turbo handles it
}
}
Run lint from root:
# Lint all packages
pnpm run lint
# Lint specific package
pnpm --filter @presi/backend lint
# Lint specific project
pnpm turbo run lint --filter=presi
ESLint Errors
Common issues:
- Missing ESLint config:
  # Add the shared config
  pnpm add -D @manacore/eslint-config --filter @myapp/backend
- Incompatible ESLint versions:
  # Check versions
  pnpm ls eslint
  # Update to match the root version
  pnpm add -D eslint@latest --filter @myapp/backend
Prevention Checklist
When creating a new app or package:
- DO NOT add build, lint, type-check, or test scripts to parent packages
- DO add these scripts to child packages (apps/myapp/apps/backend/package.json)
- DO use project-specific scripts (e.g., db:push, db:studio)
- DO test locally: pnpm run build should complete in < 2 minutes
- DO refer to CLAUDE.md for patterns
Quick Validation
# Check for problematic patterns in parent packages
for pkg in apps/*/package.json games/*/package.json; do
if grep -q '"build".*turbo run build' "$pkg" 2>/dev/null; then
echo "❌ RECURSIVE SCRIPT FOUND: $pkg"
fi
done
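The loop above only catches the turbo variant. A broader sketch (the function name and the pnpm patterns are assumptions, not existing repo tooling) also flags the pnpm-recursive style shown earlier:

```shell
# Hypothetical helper: flag parent packages whose scripts re-enter turbo
# ("turbo run <task>") or fan out with pnpm for root-orchestrated tasks.
check_recursive_scripts() {
  local pkg task found=0
  for pkg in apps/*/package.json games/*/package.json; do
    [ -f "$pkg" ] || continue
    for task in build lint type-check test clean; do
      if grep -Eq "\"$task\": *\"(turbo run $task|pnpm (run )?--recursive|pnpm --filter)" "$pkg"; then
        echo "❌ RECURSIVE SCRIPT FOUND: $pkg ($task)"
        found=1
      fi
    done
  done
  return "$found"
}

# Run from the monorepo root; the non-zero exit can gate CI:
# check_recursive_scripts || exit 1
```

Note that dev is deliberately excluded from the task list, matching the exception above.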
NestJS Dependency Injection
Problem: "Nest can't resolve dependencies" Error
Symptoms:
- NestJS fails to start with error: Nest can't resolve dependencies of the XService (?)
- The error mentions "argument Function at index [0] is available"
- The module imports look correct but the service still won't inject
Root Cause:
Using type-only imports (import type { X }) for classes that need to be injected. TypeScript erases type-only imports at compile time, so the actual class is not available at runtime for dependency injection.
❌ WRONG - Type-Only Import
// services/mana-core-auth/src/ai/ai.service.ts - DON'T DO THIS!
import { Injectable } from '@nestjs/common';
import type { ConfigService } from '@nestjs/config'; // ❌ Type-only import
@Injectable()
export class AiService {
constructor(private configService: ConfigService) {
// NestJS can't inject ConfigService because it was type-only imported!
}
}
What happens:
- TypeScript compiles the code
- The type keyword tells TypeScript to erase the import at compile time
- The compiled JS has NO import for ConfigService
- At runtime, NestJS can't find the ConfigService class to inject
- Error: "Nest can't resolve dependencies of the AiService (?)"
✅ CORRECT - Regular Import
// services/mana-core-auth/src/ai/ai.service.ts - CORRECT
import { Injectable } from '@nestjs/common';
import { ConfigService } from '@nestjs/config'; // ✅ Regular import
@Injectable()
export class AiService {
constructor(private configService: ConfigService) {
// ConfigService is properly imported and can be injected
}
}
The Rule
For NestJS dependency injection, NEVER use type-only imports (import type { X }) for classes you need to inject.
- ✅ import { ConfigService } - Regular import (works)
- ❌ import type { ConfigService } - Type-only import (breaks DI)
- ✅ import type { MyInterface } - Type-only for interfaces (fine, not injected)
- ✅ import { type MyType, MyClass } - Mixed (MyType erased, MyClass available)
How to Fix
- Find the service with the DI error
- Check all imports for classes used in the constructor
- Remove the type keyword from class imports:
import { Injectable } from '@nestjs/common';
- import type { ConfigService } from '@nestjs/config';
+ import { ConfigService } from '@nestjs/config';
@Injectable()
export class AiService {
constructor(private configService: ConfigService) {}
}
- Rebuild and test:
pnpm --filter mana-core-auth build
pnpm --filter mana-core-auth start:dev
Debugging
If you're still getting DI errors after removing type-only imports:
- Check the module imports the provider's dependencies:
@Module({
imports: [ConfigModule], // ← ConfigService needs ConfigModule
providers: [AiService],
exports: [AiService],
})
export class AiModule {}
- Verify the compiled JavaScript:
# Build the service
pnpm --filter mana-core-auth build
# Check the compiled output
cat services/mana-core-auth/dist/ai/ai.service.js | grep "require"
# Should see:
# const config_1 = require("@nestjs/config"); ✅ Good
# NOT:
# const config_1 = undefined; ❌ Bad (type-only import)
- Check Docker builds:
If the error only happens in Docker but not locally:
# Build Docker image without cache
docker build --no-cache -f services/mana-core-auth/Dockerfile -t test .
# Check the compiled code in the image
docker run --rm --entrypoint cat test /app/dist/ai/ai.service.js
Related Issues
- Commit d69cc607 - Fixed type-only ConfigService import in AiService
- TypeScript import type { X } vs import { type X } - both erase at compile time
- Docker layer caching can hide fixes if source wasn't properly copied
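After fixing one occurrence, it is worth sweeping the rest of the codebase for the same bug. A hedged sketch (the helper name is made up; adjust the class-name suffixes to your conventions) that lists type-only imports of likely DI targets:

```shell
# List type-only imports of classes ending in Service or Repository - the
# usual constructor-injected dependencies - under the given source tree.
find_type_only_di_imports() {
  grep -rn --include="*.ts" -E "import type \{[^}]*(Service|Repository)" "${1:-services}"
}

# Example: find_type_only_di_imports services/mana-core-auth/src
```

Each hit is a candidate for the "Nest can't resolve dependencies" failure described above; interfaces matched by the pattern are false positives and can be ignored.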
Staging Deployment Issues
Overview
This section documents the complete troubleshooting journey for deploying mana-core-auth + chat (backend + web) to staging. It covers GitHub Actions CI/CD simplification, Docker health checks, database setup, and SvelteKit environment variables.
Problem 1: GitHub Running Disabled Workflows
Symptoms:
- Workflows with the .full.yml extension were still running
- test.full.yml was being recognized as a valid workflow
- Multiple unnecessary workflows running on every push
What We Tried:
- ❌ Renaming to the .disabled extension → Still ran
- ❌ Renaming to the .full.yml extension → Still ran (GitHub recognizes any .yml in .github/workflows/)
Solution:
- ✅ Rename to the .yml.bak extension (GitHub ignores non-.yml files)
# Disable a workflow
mv .github/workflows/test.yml .github/workflows/test.yml.bak
# Re-enable a workflow
mv .github/workflows/test.yml.bak .github/workflows/test.yml
Files Changed:
- test.yml → test.yml.bak
- test-coverage.yml → test-coverage.yml.bak
- ci-pull-request.yml → ci-pull-request.yml.bak
- dependency-update.yml → dependency-update.yml.bak
Problem 2: chat-backend Container Unhealthy
Symptoms:
- Deployment failed with: dependency failed to start: container chat-backend-staging is unhealthy
- chat-web wouldn't start because it depends on chat-backend being healthy
Debugging Steps:
# Connect to staging server
ssh -i ~/.ssh/deploy_key deploy@your-server-ip
# Check container status
cd ~/manacore-staging
docker compose ps
# Check logs for the failing container
docker compose logs chat-backend --tail=100
# Test health endpoint manually from inside container
docker compose exec chat-backend wget -q -O - http://localhost:3002/api/v1/health
Root Cause 1: Missing Database
The logs showed:
error: database "chat" does not exist
Fix: Create the database manually:
docker compose exec -T postgres psql -U postgres -c "CREATE DATABASE chat;"
Root Cause 2: Wrong Health Check Path
The docker-compose.staging.yml had:
healthcheck:
test: ['CMD', 'wget', '...', 'http://localhost:3002/api/health'] # ❌ WRONG
But NestJS health endpoint is at /api/v1/health:
healthcheck:
test: ['CMD', 'wget', '...', 'http://localhost:3002/api/v1/health'] # ✅ CORRECT
How to Verify Health Endpoints:
| Service | Port | Health Endpoint |
|---|---|---|
| mana-core-auth | 3001 | /api/v1/health |
| chat-backend | 3002 | /api/v1/health |
| chat-web | 3000 | /health |
# Test from outside the server
curl http://your-server-ip:3001/api/v1/health
curl http://your-server-ip:3002/api/v1/health
curl http://your-server-ip:3000/health
Problem 3: SvelteKit Static Environment Variable Imports
Symptoms:
- Docker build failed with:
PUBLIC_MANA_CORE_AUTH_URL is not exported by $env/static/public - Build error during
npm run buildin Docker
Root Cause:
SvelteKit's $env/static/public imports are resolved at build time, not runtime. When building in Docker, these environment variables don't exist.
❌ WRONG - Static Import (Build Time):
// apps/chat/apps/web/src/lib/stores/auth.svelte.ts
import { PUBLIC_MANA_CORE_AUTH_URL } from '$env/static/public'; // ❌ Fails in Docker
const authUrl = PUBLIC_MANA_CORE_AUTH_URL;
✅ CORRECT - Runtime Environment Variable:
// apps/chat/apps/web/src/lib/stores/auth.svelte.ts
import { browser } from '$app/environment';
function getAuthUrl(): string {
if (browser && typeof window !== 'undefined') {
// Client-side: check for injected env or use default
return (
(window as unknown as { __PUBLIC_MANA_CORE_AUTH_URL__?: string })
.__PUBLIC_MANA_CORE_AUTH_URL__ ||
import.meta.env.PUBLIC_MANA_CORE_AUTH_URL ||
'http://localhost:3001'
);
}
// Server-side: use process.env or default
return process.env.PUBLIC_MANA_CORE_AUTH_URL || 'http://localhost:3001';
}
The Pattern:
- Check if running in browser
- Try window-injected variable (for runtime injection)
- Try import.meta.env (for Vite build time)
- Fall back to process.env (for SSR)
- Use a localhost default for development
Files Fixed:
- apps/chat/apps/web/src/lib/stores/auth.svelte.ts
- apps/chat/apps/web/src/lib/services/feedback.ts
Problem 4: Orphan Docker Containers
Symptoms:
- Old containers from previous deployments still running
- docker compose ps shows unexpected services
Fix:
# Remove orphan containers
docker compose down --remove-orphans
# Bring up fresh
docker compose up -d
# Manually remove specific orphans
docker rm -f manadeck-backend-staging manacore-nginx-staging
Problem 5: Client-Side Calling localhost Instead of Public IP
Symptoms:
- Browser console shows: POST http://localhost:3001/api/v1/auth/register net::ERR_CONNECTION_REFUSED
- API calls work from the server but fail from the browser
- The injected window.__PUBLIC_MANA_CORE_AUTH_URL__ is empty or undefined
Root Cause:
SvelteKit's environment variables work differently on server vs client:
- Server-side (SSR): has access to process.env
- Client-side (browser): does NOT have access to process.env - needs explicit injection
The initial fix using process.env only worked for SSR, so browser code fell back to localhost.
Solution - Runtime Environment Injection:
- Add client URLs to docker-compose.staging.yml:
chat-web:
environment:
# Server-side URLs (Docker internal network)
PUBLIC_BACKEND_URL: http://chat-backend:3002
PUBLIC_MANA_CORE_AUTH_URL: http://mana-core-auth:3001
# Client-side URLs (browser access via public IP)
PUBLIC_BACKEND_URL_CLIENT: http://your-server-ip:3002
PUBLIC_MANA_CORE_AUTH_URL_CLIENT: http://your-server-ip:3001
- Inject into HTML via hooks.server.ts:
// apps/chat/apps/web/src/hooks.server.ts
import type { Handle } from '@sveltejs/kit';
const PUBLIC_MANA_CORE_AUTH_URL_CLIENT =
process.env.PUBLIC_MANA_CORE_AUTH_URL_CLIENT || process.env.PUBLIC_MANA_CORE_AUTH_URL || '';
export const handle: Handle = async ({ event, resolve }) => {
return resolve(event, {
transformPageChunk: ({ html }) => {
const envScript = `<script>
window.__PUBLIC_MANA_CORE_AUTH_URL__ = "${PUBLIC_MANA_CORE_AUTH_URL_CLIENT}";
</script>`;
return html.replace('<head>', `<head>${envScript}`);
},
});
};
- Read from window in client code:
// apps/chat/apps/web/src/lib/stores/auth.svelte.ts
function getAuthUrl(): string {
if (browser && typeof window !== 'undefined') {
const injectedUrl = (window as unknown as { __PUBLIC_MANA_CORE_AUTH_URL__?: string })
.__PUBLIC_MANA_CORE_AUTH_URL__;
return injectedUrl || 'http://localhost:3001';
}
return process.env.PUBLIC_MANA_CORE_AUTH_URL || 'http://localhost:3001';
}
How to Verify:
Open browser DevTools (F12) → Console:
window.__PUBLIC_MANA_CORE_AUTH_URL__;
// Should show: "http://your-server-ip:3001"
Problem 6: CORS Blocking Cross-Origin Requests
Symptoms:
- Browser console shows: Access to fetch at 'http://your-server-ip:3001/...' from origin 'http://your-server-ip:3000' has been blocked by CORS policy
- API calls work via curl but fail from the browser
- Preflight OPTIONS requests fail
Root Cause:
Browser security blocks requests between different origins (port counts as different origin):
- chat-web: http://your-server-ip:3000
- mana-core-auth: http://your-server-ip:3001
Even though they're on the same IP, different ports = different origins = CORS blocked.
Solution:
Add CORS_ORIGINS environment variable to mana-core-auth in docker-compose.staging.yml:
mana-core-auth:
environment:
# ... other env vars ...
# CORS - Allow chat-web and other staging origins
CORS_ORIGINS: http://your-server-ip:3000,http://your-server-ip:3002,http://localhost:3000
CORS Configuration in mana-core-auth:
The service reads CORS_ORIGINS from environment:
// services/mana-core-auth/src/config/configuration.ts
cors: {
origin: process.env.CORS_ORIGINS?.split(',') || [
'http://localhost:3000',
'http://localhost:8081',
],
}
// services/mana-core-auth/src/main.ts
const corsOrigins = configService.get<string[]>('cors.origin') || [];
app.enableCors({
origin: corsOrigins,
credentials: true,
});
How to Verify:
# Test CORS preflight
curl -X OPTIONS http://your-server-ip:3001/api/v1/auth/register \
-H "Origin: http://your-server-ip:3000" \
-H "Access-Control-Request-Method: POST" \
-v
# Should see in response headers:
# Access-Control-Allow-Origin: http://your-server-ip:3000
Problem 7: Missing Database Schema
Symptoms:
- API returns: {"statusCode": 500, "message": "relation \"auth.users\" does not exist"}
- Registration/login endpoints fail with a 500 error
- Health check passes but auth endpoints fail
Root Cause:
The database exists but the schema hasn't been pushed. Drizzle ORM needs to run db:push to create:
- auth schema with tables: users, accounts, sessions, passwords, verification, etc.
- credits schema with tables: balances, transactions, packages, etc.
Why It Happened:
The CD workflow was calling pnpm run db:migrate but that script doesn't exist in the package.json. The correct script is db:push which runs drizzle-kit push.
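This class of failure is easy to pre-empt: before referencing a script in a workflow, confirm the package actually defines it. A small sketch (the helper name is invented; node on PATH is assumed):

```shell
# Print the db:* scripts a package.json actually defines, so CD workflows
# only call scripts that exist.
list_db_scripts() {
  node -e '
    const path = require("path");
    const scripts = require(path.resolve(process.argv[1])).scripts || {};
    console.log(Object.keys(scripts).filter((k) => k.startsWith("db:")).join("\n"));
  ' "$1"
}

# Example: list_db_scripts services/mana-core-auth/package.json
```

If db:migrate is not in the output, the workflow step calling it was always going to fall through to its || fallback.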
Solution:
- Manual fix (immediate):
ssh -i ~/.ssh/deploy_key deploy@your-server-ip
cd ~/manacore-staging
# Push schema to database (--force skips interactive confirmation)
docker compose exec -T mana-core-auth npx drizzle-kit push --force
- Fix CD workflow (permanent):
# .github/workflows/cd-staging.yml - BEFORE
docker compose exec -T mana-core-auth pnpm run db:migrate || echo "Auth migrations skipped"
# .github/workflows/cd-staging.yml - AFTER
docker compose exec -T mana-core-auth npx drizzle-kit push --force || echo "Auth schema push skipped"
How to Verify:
# Check if auth schema exists
docker compose exec -T postgres psql -U postgres -d manacore_auth -c '\dt auth.*'
# Should show 12 tables:
# auth | accounts, invitations, jwks, members, organizations,
# passwords, security_events, sessions, two_factor_auth,
# user_settings, users, verification
Files Changed:
- .github/workflows/cd-staging.yml - Line 253: db:migrate → drizzle-kit push --force
Complete Staging Deployment Checklist
Before Deployment
- Verify docker-compose.staging.yml has correct health check paths
- Verify the CI/CD workflow (cd-staging.yml) has matching health check paths
- Check that required databases exist or that CI creates them
- Verify the CD workflow runs drizzle-kit push --force to create schemas (not db:migrate)
- Verify CORS_ORIGINS includes all frontend origins
- Verify PUBLIC_*_CLIENT env vars have correct public IPs for browser access
During Deployment Failure
- SSH to the server:
  ssh -i ~/.ssh/deploy_key deploy@your-server-ip
  cd ~/manacore-staging
- Check container status:
  docker compose ps
- Check logs for the failing container:
  docker compose logs <container-name> --tail=100
- Common fixes:
  # Create a missing database
  docker compose exec -T postgres psql -U postgres -c "CREATE DATABASE <dbname>;"
  # Restart a service
  docker compose restart <service-name>
  # Force recreate
  docker compose up -d --force-recreate <service-name>
- Verify health:
  curl http://localhost:3001/api/v1/health  # mana-core-auth
  curl http://localhost:3002/api/v1/health  # chat-backend
  curl http://localhost:3000/health         # chat-web
After Deployment
- Verify all health endpoints respond
- Check container logs for errors
- Test actual functionality (login, API calls)
Key Files for Staging Deployment
| File | Purpose |
|---|---|
| docker-compose.staging.yml | Service definitions and health checks |
| .github/workflows/cd-staging.yml | CI/CD deployment workflow |
| .github/workflows/ci-main.yml | Docker image builds on push to main |
Health Check Patterns
Docker Compose (docker-compose.staging.yml):
healthcheck:
test: ['CMD', 'wget', '--no-verbose', '--tries=1', '--spider', 'http://localhost:PORT/ENDPOINT']
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
CI/CD Workflow (cd-staging.yml):
# Check from inside container
docker compose exec -T chat-backend wget -q -O - http://localhost:3002/api/v1/health
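A one-shot check like this fails if the container is still starting. A retry-loop sketch (not the actual workflow code; it uses curl where the container examples above use wget) that CD steps can call before running schema pushes:

```shell
# Poll a health endpoint until it answers or the retry budget is spent.
wait_healthy() {
  local url=$1 tries=${2:-30} delay=${3:-2} i
  for i in $(seq 1 "$tries"); do
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    if [ "$i" -lt "$tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example: abort the deploy if chat-backend never becomes healthy
# wait_healthy http://localhost:3002/api/v1/health 30 2 || exit 1
```

This mirrors what the Docker health check already does, but makes the wait explicit in the workflow so later steps (like drizzle-kit push) don't race the backend.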
Lessons Learned
- GitHub Workflows: Only files ending in .yml or .yaml in .github/workflows/ are recognized. Use the .bak extension to disable.
- NestJS Health Endpoints: All NestJS backends use /api/v1/health, not /api/health.
- Docker Compose Dependencies: When using depends_on: condition: service_healthy, the dependent service won't start until the health check passes.
- Database Creation: Must happen AFTER PostgreSQL is healthy but BEFORE dependent services run migrations.
- SvelteKit Environment Variables: Use runtime patterns (process.env, import.meta.env) instead of $env/static/public for Docker builds.
- Verify Before Commit: Always check both docker-compose.staging.yml AND the CI/CD workflows for matching paths.
- Server vs Client URLs: Docker internal URLs (e.g., http://mana-core-auth:3001) only work server-side. Browsers need public IPs. Use separate _CLIENT env vars for browser access.
- SvelteKit Runtime Injection: Use hooks.server.ts with transformPageChunk to inject environment variables into HTML at runtime. This is the only reliable way to pass server env vars to client code.
- CORS for Multi-Service Apps: When the frontend and backend are on different ports, configure CORS on the backend. Port differences count as different origins (e.g., :3000 vs :3001).
- Environment Variable Flow: docker-compose.yml → container env → process.env (SSR) → hooks.server.ts → window.__VAR__ (browser)
- Database Schema vs Database: Creating a database (CREATE DATABASE) is not enough - Drizzle needs db:push to create schemas and tables. Health checks may pass with an empty database, but API calls will fail with "relation does not exist".
- Drizzle Kit Interactive Mode: drizzle-kit push prompts for confirmation. Use the --force flag in CI/CD to skip interactive mode.
- pnpm Symlinks in Docker: pnpm uses symlinks to a central .pnpm store. When copying node_modules in Docker, you must preserve both the symlinks AND the target directory they point to. See Problem 8.
Problem 8: pnpm Symlinks Broken in Docker Container
Symptoms:
- Container starts but crashes with: Cannot find package 'date-fns' imported from /app/build/server/chunks/_page.svelte-xxx.js
- ERR_MODULE_NOT_FOUND errors for packages that ARE in node_modules
- Works locally but fails in the Docker production stage
- ls node_modules/date-fns shows a symlink pointing to ../../../../../node_modules/.pnpm/...
Root Cause:
pnpm uses symlinks to a central .pnpm store at the monorepo root. When you copy only the app's node_modules in Docker, the symlinks point to paths that don't exist:
# In builder stage (pnpm workspace):
/app/apps/todo/apps/web/node_modules/date-fns → ../../../../../node_modules/.pnpm/date-fns@4.1.0/node_modules/date-fns
# In production stage (old broken approach):
/app/node_modules/date-fns → ../../../../../node_modules/.pnpm/...
# ↑ BROKEN! This path doesn't exist because we only copied to /app/
❌ WRONG - Flattening Directory Structure:
# Production stage
FROM node:20-alpine AS production
WORKDIR /app # ❌ Different from builder structure
# Copy node_modules (symlinks will be broken!)
COPY --from=builder /app/apps/todo/apps/web/node_modules ./node_modules # ❌ BROKEN
COPY --from=builder /app/apps/todo/apps/web/build ./build
COPY --from=builder /app/apps/todo/apps/web/package.json ./
CMD ["node", "build"]
The symlinks in node_modules point to ../../../../../node_modules/.pnpm/... which resolves to a non-existent path from /app/.
✅ CORRECT - Preserve Directory Structure + Copy .pnpm Store:
# Production stage
FROM node:20-alpine AS production
# Keep same directory structure as builder so pnpm symlinks resolve correctly
WORKDIR /app/apps/todo/apps/web # ✅ Same as builder
# Copy the pnpm store that symlinks point to (at /app/node_modules/.pnpm)
COPY --from=builder /app/node_modules/.pnpm /app/node_modules/.pnpm # ✅ Target of symlinks
# Copy the app's node_modules (contains symlinks to the pnpm store)
COPY --from=builder /app/apps/todo/apps/web/node_modules ./node_modules # ✅ Symlinks work now
# Copy built application
COPY --from=builder /app/apps/todo/apps/web/build ./build
COPY --from=builder /app/apps/todo/apps/web/package.json ./
CMD ["node", "build"]
Why This Works:
- WORKDIR /app/apps/todo/apps/web - the production container has the same path as the builder
- Symlinks in ./node_modules/ point to ../../../../../node_modules/.pnpm/...
- From /app/apps/todo/apps/web/node_modules/, going up 5 directories reaches /app/
- /app/node_modules/.pnpm/ exists because we copied it!
How to Debug:
# Check if symlinks are broken in the container
docker exec <container> ls -la node_modules/date-fns
# Shows: date-fns -> ../../../../../node_modules/.pnpm/date-fns@4.1.0/node_modules/date-fns
# Check if the target exists
docker exec <container> ls -la /app/node_modules/.pnpm/date-fns@4.1.0/
# If "No such file or directory" → symlink is broken
# Check the image size (should be ~1GB with .pnpm store, ~50MB without)
docker images | grep todo-web
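The per-package checks above can be automated. A sketch (the helper name is an invention) that walks node_modules and prints every symlink whose target no longer resolves - empty output means the copy preserved the store correctly:

```shell
# Print symlinks under the given directory whose targets do not resolve.
# "-type l" selects symlinks; "! -exec test -e {} ;" keeps only broken ones.
find_broken_symlinks() {
  find "${1:-node_modules}" -type l ! -exec test -e {} \; -print
}

# Example, inside the running container:
# docker exec <container> find /app/apps/todo/apps/web/node_modules \
#   -type l ! -exec test -e {} \; -print
```

This works because test -e follows the symlink: a dangling link fails the test and survives the negation, so only broken links are printed.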
Trade-offs:
| Approach | Image Size | Symlinks | Works |
|---|---|---|---|
| Copy only app's node_modules | ~50MB | Broken | ❌ |
| Copy app's node_modules + .pnpm store | ~1GB | Working | ✅ |
The larger image size is the cost of pnpm's deduplication strategy. In a monorepo, this is actually more efficient than copying all dependencies flat.
Alternative: Use npm Instead of pnpm in Docker:
If image size is critical, you could use npm in the Docker build:
# Alternative approach (not recommended for monorepos)
FROM node:20-alpine AS production
WORKDIR /app
COPY --from=builder /app/apps/todo/apps/web/build ./build
COPY --from=builder /app/apps/todo/apps/web/package.json ./
# Clean install with npm (flattens dependencies)
RUN npm install --omit=dev
CMD ["node", "build"]
⚠️ Warning: This may fail with workspace:* protocol in package.json dependencies. Only works if all dependencies are published to npm.
Affected Files:
- apps/todo/apps/web/Dockerfile
- apps/manacore/apps/web/Dockerfile
- apps/chat/apps/web/Dockerfile
- apps/calendar/apps/web/Dockerfile
- apps/clock/apps/web/Dockerfile
Related Commits:
- fd1c0ee6 - fix(docker): preserve pnpm symlink structure in web Dockerfiles
Problem 9: Hardcoded localhost URLs in SvelteKit Web Apps
Symptoms:
- Browser console shows: POST http://localhost:3001/api/v1/auth/login net::ERR_CONNECTION_REFUSED
- App works locally but auth fails on staging/production
- window.__PUBLIC_MANA_CORE_AUTH_URL__ may be set correctly, but the code doesn't use it
- The source code contains hardcoded URLs like const MANA_AUTH_URL = 'http://localhost:3001'
Root Cause:
Developers hardcode localhost:3001 directly in TypeScript files instead of using the runtime injection pattern. This works locally but breaks in Docker deployments.
Common Locations of Hardcoded URLs:
// ❌ These patterns are WRONG:
// In auth.svelte.ts
const MANA_AUTH_URL = 'http://localhost:3001';
// In user-settings.svelte.ts
const MANA_AUTH_URL = 'http://localhost:3001';
// In feedback.ts or feedback page
apiUrl: 'http://localhost:3001',
// Using build-time env vars (also wrong for Docker)
const MANA_AUTH_URL = import.meta.env.PUBLIC_MANA_CORE_AUTH_URL || 'http://localhost:3001';
Solution:
- Create hooks.server.ts if it doesn't exist (see Problem 5)
- Use the getAuthUrl() pattern in all files:
// ✅ CORRECT pattern
import { browser } from '$app/environment';
function getAuthUrl(): string {
if (browser && typeof window !== 'undefined') {
const injectedUrl = (window as unknown as { __PUBLIC_MANA_CORE_AUTH_URL__?: string })
.__PUBLIC_MANA_CORE_AUTH_URL__;
return injectedUrl || 'http://localhost:3001';
}
return process.env.PUBLIC_MANA_CORE_AUTH_URL || 'http://localhost:3001';
}
// Use getAuthUrl() instead of hardcoded string
const auth = initializeWebAuth({ baseUrl: getAuthUrl() });
How to Find Hardcoded URLs:
# Search for hardcoded localhost:3001 in web apps
grep -r "localhost:3001" apps/*/apps/web/src --include="*.ts" --include="*.svelte"
# Check for the correct pattern (window injection)
grep -r "__PUBLIC_MANA_CORE_AUTH_URL__" apps/*/apps/web/src
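The two greps can be combined into a rough per-app audit (the function name is made up, and note that even fixed apps legitimately mention localhost:3001 as a development fallback, so treat the counts as leads rather than verdicts):

```shell
# For each web app, count files that mention localhost:3001 and files that
# read the injected window variable.
audit_web_apps() {
  local app hard inj
  for app in apps/*/apps/web; do
    [ -d "$app/src" ] || continue
    hard=$(grep -rl "localhost:3001" "$app/src" --include="*.ts" --include="*.svelte" 2>/dev/null | wc -l | tr -d ' ')
    inj=$(grep -rl "__PUBLIC_MANA_CORE_AUTH_URL__" "$app/src" 2>/dev/null | wc -l | tr -d ' ')
    echo "$app: localhost-files=$hard injection-files=$inj"
  done
}

# Run from the monorepo root:
# audit_web_apps
```

An app showing localhost-files > 0 and injection-files = 0 almost certainly still needs the fix.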
Apps Status (as of 2024-12):
| App | Status | Files to Fix |
|---|---|---|
| chat/apps/web | ✅ Fixed | - |
| todo/apps/web | ✅ Fixed | - |
| calendar/apps/web | ✅ Fixed | - |
| clock/apps/web | ✅ Fixed | - |
| contacts/apps/web | ❌ Needs Fix | auth.svelte.ts, user-settings.svelte.ts, feedback.ts |
| manadeck/apps/web | ❌ Needs Fix | user-settings.svelte.ts, feedback.ts |
| manacore/apps/web | ❌ Needs Fix | auth.svelte.ts, user-settings.svelte.ts, feedback.ts, credits.ts |
| zitare/apps/web | ❌ Needs Fix | auth.svelte.ts, user-settings.svelte.ts, feedback.ts |
| picture/apps/web | ❌ Needs Fix | user-settings.svelte.ts |
Complete Fix Checklist for Each App:
- Create/update src/hooks.server.ts with env injection
- Update src/lib/stores/auth.svelte.ts → use the getAuthUrl() pattern
- Update src/lib/stores/user-settings.svelte.ts → use the getAuthUrl() pattern
- Update any feedback.ts or feedback services → use the getAuthUrl() pattern
- Update any other files with hardcoded URLs
- Add PUBLIC_MANA_CORE_AUTH_URL_CLIENT to docker-compose.staging.yml
- Test locally with pnpm dev
- Deploy and test on staging
See Also:
- Problem 5: Client-Side Calling localhost Instead of Public IP
- SvelteKit Web Guidelines - Environment Variables
References
- CLAUDE.md - Turborepo Configuration
- turbo.json - Root task orchestration
- Turborepo Docs
Getting Help
If you encounter an issue not covered here:
- Check the GitHub Issues
- Review recent commits that may have introduced the issue
- Run pnpm clean and pnpm install to reset
- Create a new issue with full error logs