- src/crawl/robots.ts: parser, fetcher (10s timeout, 24h cache), user-agent substring match, longest-prefix path match with the standard tie-break that Allow overrides Disallow.
- 5xx from robots.txt → conservative full disallow (RFC-compliant).
- 4xx → empty rules; network error → conservative full disallow (network errors are not cached, so a retry is possible).
- src/crawl/policy.ts: robots check runs before the rate limit; effective minimum interval = max(1.1s, robots crawl-delay).
- 16/16 unit tests green (parser, isPathAllowed, crawlDelay, case-insensitive UA match, * fallback, empty rules).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
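The rules listed in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in src/crawl/robots.ts or src/crawl/policy.ts: the `Rule` type, `rulesForStatus`, `effectiveIntervalMs`, and the signature of `isPathAllowed` are assumptions made for the example, and wildcard (`*`, `$`) patterns are deliberately left out.

```typescript
// Hypothetical shapes and names, chosen for illustration only.
type Rule = { allow: boolean; path: string };

// Longest matching prefix wins; on equal length, Allow overrides Disallow.
function isPathAllowed(rules: Rule[], path: string): boolean {
  let best: Rule | undefined;
  for (const rule of rules) {
    // An empty pattern (e.g. a bare "Disallow:") matches nothing.
    if (rule.path === '' || !path.startsWith(rule.path)) continue;
    if (
      best === undefined ||
      rule.path.length > best.path.length ||
      (rule.path.length === best.path.length && rule.allow && !best.allow)
    ) {
      best = rule;
    }
  }
  // No matching rule (including an empty rule set) means the path is allowed.
  return best === undefined ? true : best.allow;
}

// Map the robots.txt fetch outcome to a rule set, as the commit message describes:
// 5xx and network errors -> full disallow, 4xx -> empty rules, otherwise the parsed rules.
function rulesForStatus(status: number | 'network-error', parsed: Rule[]): Rule[] {
  const fullDisallow: Rule[] = [{ allow: false, path: '/' }];
  if (status === 'network-error') return fullDisallow;
  if (status >= 500) return fullDisallow;
  if (status >= 400) return [];
  return parsed;
}

// Effective minimum interval = max(1.1s, robots crawl-delay), in milliseconds.
const MIN_INTERVAL_MS = 1100;
function effectiveIntervalMs(crawlDelaySeconds?: number): number {
  return Math.max(MIN_INTERVAL_MS, (crawlDelaySeconds ?? 0) * 1000);
}
```

In this sketch a 503 on robots.txt blocks every path, a 404 allows every path, and a crawl-delay shorter than 1.1s never lowers the floor.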
8 lines
150 B
TypeScript
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    environment: 'node',
    include: ['src/**/*.test.ts'],
  },
});