mirror of
https://github.com/Memo-2023/mana-monorepo.git
synced 2026-05-14 23:41:08 +02:00
NestJS-based web crawler service for structured content extraction.

Features:
- Depth-controlled crawling with URL pattern filtering
- robots.txt compliance
- HTML/PDF/Markdown content extraction
- BullMQ job queue for async processing
- Redis caching layer
- Prometheus metrics

| Name | | |
|---|---|---|
| cache | ||
| common/filters | ||
| config | ||
| crawler | ||
| db | ||
| health | ||
| metrics | ||
| parser | ||
| queue | ||
| robots | ||
| app.module.ts | ||
| main.ts | ||
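
The listing does not show the crawler's source, so the following is only a minimal sketch of what "depth-controlled crawling with URL pattern filtering" could look like. Every name here (`CrawlOptions`, `LinkFetcher`, `isAllowed`, `crawl`) is hypothetical and not taken from the repository, and an in-memory link map stands in for real HTTP fetching so the example stays self-contained.

```typescript
// Hypothetical option shape; the actual service's config is not shown in the listing.
interface CrawlOptions {
  maxDepth: number;   // 0 = only the start URL
  include: RegExp[];  // empty list means "allow everything"
  exclude: RegExp[];  // checked after include; any match rejects the URL
}

// Assumed abstraction: returns the outgoing links of a page.
type LinkFetcher = (url: string) => string[];

// Patterns should not use the `g` flag, since RegExp.test is stateful with it.
function isAllowed(url: string, opts: CrawlOptions): boolean {
  const included =
    opts.include.length === 0 || opts.include.some((p) => p.test(url));
  const excluded = opts.exclude.some((p) => p.test(url));
  return included && !excluded;
}

// Breadth-first traversal, one frontier per depth level.
function crawl(
  start: string,
  fetchLinks: LinkFetcher,
  opts: CrawlOptions,
): string[] {
  const visited = new Set<string>();
  const order: string[] = [];
  let frontier: string[] = [start];

  for (let depth = 0; depth <= opts.maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const url of frontier) {
      if (visited.has(url) || !isAllowed(url, opts)) continue;
      visited.add(url);
      order.push(url);
      next.push(...fetchLinks(url)); // candidates for the next depth level
    }
    frontier = next;
  }
  return order;
}

// Usage with a tiny in-memory site graph (example.com URLs are placeholders).
const site: Record<string, string[]> = {
  "https://example.com/": [
    "https://example.com/docs",
    "https://example.com/login",
  ],
  "https://example.com/docs": ["https://example.com/docs/api"],
  "https://example.com/docs/api": ["https://example.com/docs/api/v2"],
};

const pages = crawl("https://example.com/", (u) => site[u] || [], {
  maxDepth: 2,
  include: [/^https:\/\/example\.com\//],
  exclude: [/\/login$/],
});
```

With `maxDepth: 2`, the `/docs/api/v2` page is discovered but never visited, and `/login` is dropped by the exclude pattern. A real crawler would additionally consult robots.txt before each fetch and enqueue work asynchronously, which this sketch leaves out.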