managarten/services/news-ingester
Till JS 52159ee07a fix(news-ingester): disable Readability fallback to break crash loop
JSDOM throws CSS and parser errors from detached parse5 callbacks. These
errors escape every try/catch in the call stack and even Bun's
process.on('uncaughtException') handlers, so the daemon crash-loops on
the first bad page in source #4 (heise) and never makes forward
progress.

Set FULL_TEXT_THRESHOLD_WORDS = 0 so we never call into Readability.
Sources that ship full RSS bodies (Tagesschau, Spiegel, BBC, …) are
unaffected. Title-only sources (Hacker News) keep the row with an
empty content field; the reader already falls back to its "Original
öffnen ↗" ("Open original") link in that case.
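
A minimal sketch of why a threshold of 0 disables the fallback. It assumes
the ingester gates extraction on a strict "word count < threshold"
comparison. Only FULL_TEXT_THRESHOLD_WORDS comes from this commit;
countWords and needsExtraction are hypothetical names used for illustration.

```typescript
// Named in the commit message; 0 means "never extract".
const FULL_TEXT_THRESHOLD_WORDS = 0;

// Hypothetical helper: counts whitespace-separated words in an RSS body.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Hypothetical gate (assumed strict comparison): extraction is requested
// only when the RSS body is shorter than the threshold. A word count is
// never negative, so a threshold of 0 can never trigger the fallback.
function needsExtraction(
  rssBody: string,
  threshold: number = FULL_TEXT_THRESHOLD_WORDS,
): boolean {
  return countWords(rssBody) < threshold;
}
```

Under this assumption, even an empty body (0 words) does not satisfy
0 < 0, which matches the title-only behavior described above.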

Re-enabling extraction in a worker thread is left for a follow-up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:21:09 +02:00
src fix(news-ingester): disable Readability fallback to break crash loop 2026-04-09 16:21:09 +02:00
CLAUDE.md feat(news): backend ingester service + curated feed API 2026-04-09 15:53:26 +02:00
Dockerfile feat(news): backend ingester service + curated feed API 2026-04-09 15:53:26 +02:00
drizzle.config.ts feat(news): backend ingester service + curated feed API 2026-04-09 15:53:26 +02:00
package.json fix(news-ingester): drop unused @mana/shared-hono workspace dep 2026-04-09 16:11:58 +02:00
tsconfig.json feat(news): backend ingester service + curated feed API 2026-04-09 15:53:26 +02:00