Mirror of https://github.com/Memo-2023/mana-monorepo.git (synced 2026-05-15 05:41:09 +02:00)
NestJS-based web crawler service for structured content extraction.

Features:

- Depth-controlled crawling with URL pattern filtering
- robots.txt compliance
- HTML/PDF/Markdown content extraction
- BullMQ job queue for async processing
- Redis caching layer
- Prometheus metrics

| Name |
|---|
| processors |
| constants.ts |
| processor.module.ts |
| queue.module.ts |
| queue.service.ts |
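
The description mentions depth-controlled crawling with URL pattern filtering. As a rough illustration of that idea only, the sketch below walks an in-memory link map breadth-first and applies a depth limit plus include/exclude patterns. The names `CrawlOptions`, `isAllowed`, and `planCrawl` are hypothetical and are not taken from this repository; the actual service may structure this quite differently (for example, by enqueueing each URL as a BullMQ job).

```typescript
// Hypothetical option shape, assumed for illustration; not the repository's API.
interface CrawlOptions {
  maxDepth: number;   // 0 means only the seed URL is visited
  include: RegExp[];  // URL must match at least one (empty list allows all)
  exclude: RegExp[];  // URL must match none of these
}

// Decide whether a URL at a given depth passes the depth and pattern filters.
// Patterns are assumed to be non-global, so RegExp.test() carries no state.
function isAllowed(url: string, depth: number, opts: CrawlOptions): boolean {
  if (depth > opts.maxDepth) return false;
  if (opts.exclude.some((re) => re.test(url))) return false;
  return opts.include.length === 0 || opts.include.some((re) => re.test(url));
}

// Breadth-first traversal over a link map that stands in for fetched pages.
// Returns the URLs in the order they would be visited.
function planCrawl(
  seed: string,
  links: Record<string, string[]>,
  opts: CrawlOptions,
): string[] {
  const visited = new Set<string>();
  const order: string[] = [];
  const queue: Array<{ url: string; depth: number }> = [{ url: seed, depth: 0 }];

  while (queue.length > 0) {
    const { url, depth } = queue.shift()!;
    if (visited.has(url) || !isAllowed(url, depth, opts)) continue;
    visited.add(url);
    order.push(url);
    // Outgoing links are one level deeper than the page that contains them.
    for (const next of links[url] ?? []) {
      queue.push({ url: next, depth: depth + 1 });
    }
  }
  return order;
}

// Example input: a tiny site graph with one excluded page and one page too deep.
const exampleLinks: Record<string, string[]> = {
  "https://example.com/": ["https://example.com/docs", "https://example.com/login"],
  "https://example.com/docs": ["https://example.com/docs/a"],
  "https://example.com/docs/a": ["https://example.com/docs/a/b"],
};

const exampleOrder = planCrawl("https://example.com/", exampleLinks, {
  maxDepth: 2,
  include: [/^https:\/\/example\.com\//],
  exclude: [/\/login$/],
});
console.log(exampleOrder);
```

Keeping the filter in a pure function like `isAllowed` makes the depth and pattern rules easy to unit-test separately from fetching, queueing, and robots.txt handling, which this sketch deliberately leaves out.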