Commit graph

4 commits

Author SHA1 Message Date
Till JS
52159ee07a fix(news-ingester): disable Readability fallback to break crash loop
JSDOM throws CSS / parser errors from detached parse5 callbacks that
escape every try/catch in the call stack and even bun's
process.on('uncaughtException') handlers — leaving the daemon stuck
crash-looping past the first bad page in source #4 (heise) without
ever making forward progress.

Set FULL_TEXT_THRESHOLD_WORDS = 0 so we never call into Readability.
Sources that ship full RSS bodies (Tagesschau, Spiegel, BBC, …) are
unaffected. Title-only sources (Hacker News) keep the row with an
empty content field; the reader already falls back to "Original
öffnen ↗" in that case.

Re-enabling extraction in a worker thread is left for a follow-up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:21:09 +02:00
Till JS
dad174a631 fix(news-ingester): silence JSDOM CSS errors + add process-level safety net
JSDOM's CSS parser throws on plenty of real-world pages and the error
escapes every try/catch in the buildRow → ingestSource chain because
it fires from a parse5 callback that runs after JSDOM has returned.
In the prod container this killed the process on the first bad page,
docker restarted it, and it crash-looped on the same first source
forever — no progress past tech.

Two-layer fix: a silent VirtualConsole on every JSDOM instance to
swallow CSS / resource errors at the source, plus process-level
uncaughtException + unhandledRejection handlers that log and continue
so any future async escape can't kill the daemon either.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:15:46 +02:00
Till JS
68d1bda7e5 fix(news-ingester): drop unused @mana/shared-hono workspace dep
Was copied verbatim from mana-credits' template but not actually
imported anywhere in src/. Removing it lets the Docker build's bun
install resolve from npm only — workspace:* refs need the full
monorepo context which the Dockerfile doesn't copy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:11:58 +02:00
Till JS
9ef97a1877 feat(news): backend ingester service + curated feed API
Adds the services/news-ingester Bun service that pulls 25 public RSS/JSON
feeds into news.curated_articles every 15 min, with Mozilla Readability
fallback for thin RSS bodies and 30-day retention. apps/api /feed is
rewritten to read from the new pool table directly instead of the
sync_changes hack, with topics/lang/since/limit/offset query params.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:53:26 +02:00