feat(articles): new read-it-later module — save / read / highlight

Pocket-style module for saving arbitrary web URLs, extracting readable
content server-side via @mana/shared-rss (Readability + JSDOM), and
storing it AES-GCM encrypted in IndexedDB for offline reading.

M1 skeleton: Dexie v33 (articles, articleHighlights, articleTags),
crypto registry entries, module registration, app-registry entry with
orange icon, empty-state ListView. articleTags is a pure junction
into the existing globalTags system (appId 'tags') — same pattern as
noteTags, eventTags, placeTags.

M2 URL save + reader: POST /api/v1/articles/extract (one endpoint,
not two — client caches the preview payload to avoid a double
server fetch). AddUrlForm with scope-aware dedupe, DetailView with
ReaderView typography shell (serif/sans, light/sepia/dark, size
slider), auto-tracked reading progress with scroll restore.
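
A minimal sketch of how the preview caching and scope-aware dedupe could fit together. This is not the code in this commit: `ExtractedPreview`, `normalizeUrl`, `dedupeKey`, `PreviewCache`, and the idea that a scope is an opaque string id are all assumptions made for illustration.

```typescript
// Trimmed, hypothetical payload shape; the real /extract response carries more fields.
interface ExtractedPreview {
  originalUrl: string;
  title: string;
}

// Hypothetical normalisation: drop the fragment and a trailing slash so
// trivially different spellings of one URL compare equal.
function normalizeUrl(raw: string): string {
  const u = new URL(raw);
  u.hash = '';
  if (u.pathname.length > 1 && u.pathname.endsWith('/')) {
    u.pathname = u.pathname.slice(0, -1);
  }
  return u.href;
}

// Assumption: a scope is an opaque string id and dedupe applies per scope.
function dedupeKey(scopeId: string, rawUrl: string): string {
  return `${scopeId}|${normalizeUrl(rawUrl)}`;
}

// Holds the preview payload until the user confirms the save, so the
// confirm step does not need a second server fetch.
class PreviewCache {
  private readonly entries = new Map<string, ExtractedPreview>();

  remember(scopeId: string, preview: ExtractedPreview): void {
    this.entries.set(dedupeKey(scopeId, preview.originalUrl), preview);
  }

  // Returns the cached preview (if any) and removes it from the cache.
  take(scopeId: string, rawUrl: string): ExtractedPreview | undefined {
    const key = dedupeKey(scopeId, rawUrl);
    const hit = this.entries.get(key);
    this.entries.delete(key);
    return hit;
  }
}
```

In this sketch the form would call `remember` after the preview request and `take` on confirm, falling back to a fresh request only when `take` returns `undefined`.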

M3 highlights: TreeWalker-based plain-text offset resolution
(lib/offsets.ts), highlights store, floating HighlightMenu with
create + edit modes, HighlightLayer orchestrator that wraps/unwraps
highlight spans whenever highlights or htmlVersion changes. Four
colours (yellow/green/blue/pink), optional notes, click-to-edit,
dark-mode-aware overlay colours.
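
The core of plain-text offset resolution can be sketched as below. The contents of lib/offsets.ts are not shown in this view, so this is an assumption about the general technique, not the actual implementation. In the browser the text nodes would typically come from `document.createTreeWalker(root, NodeFilter.SHOW_TEXT)`; here they are reduced to an array of lengths so the arithmetic stands alone. `TextPosition` and `resolveOffset` are hypothetical names.

```typescript
// Position inside a sequence of text nodes: which node, and how far into it.
interface TextPosition {
  nodeIndex: number;
  offset: number;
}

// Map a plain-text offset (counted across all text nodes in document order)
// to a node index plus an offset inside that node.
// An offset that falls exactly on a node boundary resolves to the END of the
// earlier node; a real implementation may prefer the start of the next node
// for range starts.
function resolveOffset(nodeLengths: number[], target: number): TextPosition | null {
  if (target < 0) return null;
  let consumed = 0;
  for (let i = 0; i < nodeLengths.length; i++) {
    const len = nodeLengths[i];
    if (target <= consumed + len) {
      return { nodeIndex: i, offset: target - consumed };
    }
    consumed += len;
  }
  // The offset lies past the end of the text.
  return null;
}
```

Resolving both the start and end offsets of a highlight this way yields the two boundary points needed to build a DOM `Range` and wrap it in a span.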

Drive-by: removed stale 'pendingProposals' entry from the plaintext
allowlist — the table was dropped in Dexie v29 and the allowlist
audit was flagging it as a dead entry.

Plan: docs/plans/articles-module.md. M4 (tags + filter + progress),
M5 (news:type='saved' migration), M6 (AI tools), M7 (share target),
M8 (highlights view + stats) still open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Till JS 2026-04-21 16:20:23 +02:00
parent 8f6a4efddd
commit 3357e88a1c
28 changed files with 2819 additions and 1 deletions

@@ -35,6 +35,7 @@ import { guidesRoutes } from './modules/guides/routes';
import { moodlitRoutes } from './modules/moodlit/routes';
import { newsRoutes } from './modules/news/routes';
import { newsResearchRoutes } from './modules/news-research/routes';
import { articlesRoutes } from './modules/articles/routes';
import { tracesRoutes } from './modules/traces/routes';
import { presiRoutes } from './modules/presi/routes';
import { researchRoutes } from './modules/research/routes';
@@ -104,6 +105,7 @@ app.route('/api/v1/guides', guidesRoutes);
app.route('/api/v1/moodlit', moodlitRoutes);
app.route('/api/v1/news', newsRoutes);
app.route('/api/v1/news-research', newsResearchRoutes);
app.route('/api/v1/articles', articlesRoutes);
app.route('/api/v1/traces', tracesRoutes);
app.route('/api/v1/presi', presiRoutes);
app.route('/api/v1/research', researchRoutes);

@@ -0,0 +1,54 @@
/**
 * Articles module: server-side URL extraction.
 *
 * Thin wrapper around `@mana/shared-rss`'s Readability pipeline. The
 * extracted payload is returned to the client, which then encrypts and
 * stores it locally (and syncs via mana-sync). The server keeps no
 * per-user article state; all reading-list data lives in the unified
 * Mana app's IndexedDB.
 *
 * One endpoint (`POST /extract`), not two. News has a `preview` + `save`
 * split for legacy reasons; here both UI paths (AddUrlForm preview + the
 * direct saveFromUrl path) use the same payload. The client caches the
 * preview response and reuses it when the user confirms, avoiding a
 * double server fetch.
 */
import { Hono } from 'hono';
import { extractFromUrl } from '@mana/shared-rss';

const routes = new Hono();

routes.post('/extract', async (c) => {
  const body = await c.req.json<{ url?: string }>().catch(() => ({}) as { url?: string });
  const url = body.url;
  if (!url || typeof url !== 'string') {
    return c.json({ error: 'URL is required' }, 400);
  }

  // Minimal URL shape check: extractFromUrl will no-op on a bad URL but
  // the caller deserves a clear 400 vs a generic 502.
  try {
    new URL(url);
  } catch {
    return c.json({ error: 'Invalid URL' }, 400);
  }

  const extracted = await extractFromUrl(url);
  if (!extracted) {
    return c.json({ error: 'Extraction failed' }, 502);
  }

  return c.json({
    originalUrl: url,
    title: extracted.title,
    excerpt: extracted.excerpt,
    content: extracted.content,
    htmlContent: extracted.htmlContent,
    author: extracted.byline,
    siteName: extracted.siteName,
    wordCount: extracted.wordCount,
    readingTimeMinutes: extracted.readingTimeMinutes,
  });
});

export { routes as articlesRoutes };
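
One possible tightening of the shape check above: `new URL()` also accepts schemes such as `file:` or `javascript:`, and this diff does not show whether `extractFromUrl` rejects them. The sketch below restricts input to http(s). `isFetchableUrl` is a hypothetical helper, not part of this commit, and it does not address other server-side fetch concerns such as private-network hosts.

```typescript
// Hypothetical guard: accept only absolute http(s) URLs.
function isFetchableUrl(raw: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    // Not parseable as an absolute URL at all.
    return false;
  }
  return parsed.protocol === 'http:' || parsed.protocol === 'https:';
}
```

In the handler this could replace the bare `new URL(url)` try/catch, returning the same 400 `Invalid URL` response when the guard fails.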