Intelligence & Crawling Resource Map
This map identifies the components responsible for project “Awareness”: the ability to scan the open web for SEO metrics and social brand mentions.
🔍 The Site Auditor (seo_intelligence)
Built as an autonomous “Graph Walker” for technical site health.
The Auditor ViewSet
- trigger: Starts the BFS (Breadth-First Search) mission. It creates an SEOAudit record in a “PENDING” state and hands the task to the Celery worker.
- results: Compiles the sharded data from all crawled pages into a unified “Issue Map.” It groups findings by “Severity” and “Type” (e.g., Performance vs. Content).
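The grouping step behind results can be pictured with a minimal sketch. It is not the project's actual code: the Finding shape, the build_issue_map helper, and the sample severities and types are assumptions made for illustration, and the real per-page shards may be stored differently.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One issue found on one crawled page (hypothetical shape)."""
    url: str
    severity: str  # e.g. "critical", "warning"
    type: str      # e.g. "performance", "content"
    message: str


def build_issue_map(findings):
    """Group findings by severity first, then by type."""
    grouped = defaultdict(lambda: defaultdict(list))
    for f in findings:
        grouped[f.severity][f.type].append({"url": f.url, "message": f.message})
    # Convert to plain dicts so the result serializes cleanly to JSON.
    return {severity: dict(by_type) for severity, by_type in grouped.items()}


# Sample findings, as if collected from three crawled pages.
findings = [
    Finding("https://example.com/", "critical", "performance", "Slow first byte"),
    Finding("https://example.com/about", "warning", "content", "Missing meta description"),
    Finding("https://example.com/blog", "critical", "performance", "Uncompressed images"),
]
issue_map = build_issue_map(findings)
```

Nesting severity above type lets a dashboard render the most urgent group first without re-sorting; swapping the two levels would only require changing the key order in the loop.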
👂 The Brand Aggregator (media_intelligence)
The primary “Listening Post” for social sentiment and brand health.
The Media ViewSet
- targets: Defines the “Keywords of Interest.” When a target is added, the system automatically schedules the first “Deep Crawl” to establish a historical sentiment baseline.
- history: Aggregates SocialMention records into a time-series view. It computes the “Volume-to-Sentiment” ratio which drives the charts in the main dashboard.
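As a rough sketch of the time-series aggregation, the example below buckets mentions by UTC day and reports the two ingredients such a view needs: mention volume and mean polarity. The exact formula for the “Volume-to-Sentiment” ratio is not stated here, so the sketch stops short of computing it; the daily_history helper and the dictionary field names are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_history(mentions):
    """Bucket mentions by UTC day and return rows sorted by date."""
    buckets = defaultdict(list)
    for m in mentions:
        day = m["created_at"].astimezone(timezone.utc).date()
        buckets[day].append(m["polarity_score"])
    return [
        {
            "date": day.isoformat(),
            "volume": len(scores),
            "mean_polarity": sum(scores) / len(scores),
        }
        for day, scores in sorted(buckets.items())
    ]


# Hypothetical mention rows; only the two fields used above are included.
mentions = [
    {"created_at": datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc), "polarity_score": 0.5},
    {"created_at": datetime(2024, 1, 1, 18, 0, tzinfo=timezone.utc), "polarity_score": -0.5},
    {"created_at": datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc), "polarity_score": 1.0},
]
history = daily_history(mentions)
```

Any ratio built from these rows needs a rule for days where the mean polarity is zero, which is one reason to keep volume and sentiment as separate columns until the chart layer combines them.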
🕸️ Native Scraper Engine
The underlying logic that powers our awareness.
- UnifiedSEOAnalyzer: A stateless link-graph traverser. It is “Politeness-Aware,” respecting robots.txt and implementing rate-limiting to prevent IP bans during deep research.
- MediaCrawler: Uses a “Source-Agnostic” pattern. Whether the data comes from Reddit or YouTube, it is normalized into a standard Mention schema before it ever hits our Sentiment Engine.
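A politeness-aware breadth-first walk of the kind described for UnifiedSEOAnalyzer might look like the sketch below. It is an illustration under assumptions, not the engine's code: bfs_crawl, its parameters, the ExampleAuditor user agent, and the in-memory SITE graph are hypothetical, and the link fetcher is injected so the example runs without network access. Only the standard library's urllib.robotparser is used for the robots.txt check.

```python
import time
from collections import deque
from urllib.robotparser import RobotFileParser


def bfs_crawl(start_url, fetch_links, robots, user_agent="ExampleAuditor",
              min_delay=1.0, max_pages=100):
    """Visit pages breadth-first, skipping disallowed URLs and pacing requests."""
    visited = {start_url}
    order = []
    queue = deque([start_url])
    last_request = None
    while queue and len(order) < max_pages:
        url = queue.popleft()
        if not robots.can_fetch(user_agent, url):
            continue  # politeness: honor robots.txt
        if last_request is not None:
            # Rate limit: wait until min_delay seconds have passed since the last request.
            wait = min_delay - (time.monotonic() - last_request)
            if wait > 0:
                time.sleep(wait)
        last_request = time.monotonic()
        order.append(url)
        for link in fetch_links(url):
            if link not in visited:
                visited.add(link)
                queue.append(link)
    return order


# In-memory link graph standing in for real HTTP fetches (hypothetical data).
SITE = {
    "https://example.com/": [
        "https://example.com/a",
        "https://example.com/private/x",
        "https://example.com/b",
    ],
    "https://example.com/a": ["https://example.com/c", "https://example.com/"],
    "https://example.com/b": ["https://example.com/c"],
    "https://example.com/c": [],
}

robots = RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /private"])

# min_delay=0.0 keeps the demo fast; a real crawl would use a positive delay.
order = bfs_crawl("https://example.com/", lambda u: SITE.get(u, []), robots, min_delay=0.0)
```

All crawl state lives in local variables, so each call is independent of the last, which is one way to read “stateless” here. The visited set also guards against cycles such as the link from /a back to the home page.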
📈 Intelligence Models
- CompetitorDomain: A tracking model used for “Differential Benchmarking.” We store the SEO Health of a competitor side-by-side with the project’s own health to provide strategic context.
- SocialMention: The persistent storage for a brand mention. It includes a polarity_score calculated at the moment of ingestion.
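To make the flow from raw source data to a stored SocialMention concrete, here is a minimal sketch of source-agnostic normalization followed by scoring at ingestion. Everything in it is assumed for illustration: the payload field names are simplified stand-ins rather than the real Reddit or YouTube API responses, the dataclasses are not the project's models, and the polarity function is a toy lexicon scorer standing in for the Sentiment Engine.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Toy lexicon; a stand-in for the real Sentiment Engine.
POSITIVE = {"love", "great", "good"}
NEGATIVE = {"hate", "bad", "broken"}


def polarity(text):
    """Return a score in [-1, 1] based on matched lexicon words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    hits = [1 for w in words if w in POSITIVE] + [-1 for w in words if w in NEGATIVE]
    return sum(hits) / len(hits) if hits else 0.0


@dataclass(frozen=True)
class Mention:
    """Source-agnostic shape produced before sentiment scoring."""
    source: str
    text: str
    url: str
    created_at: datetime


@dataclass(frozen=True)
class SocialMention:
    """Stored record: the normalized mention plus its ingestion-time score."""
    mention: Mention
    polarity_score: float


def from_reddit(item):
    created = datetime.fromtimestamp(item["created_utc"], tz=timezone.utc)
    return Mention("reddit", item["body"], item["permalink"], created)


def from_youtube(item):
    created = datetime.fromisoformat(item["published_at"])
    return Mention("youtube", item["text"], item["video_url"], created)


NORMALIZERS = {"reddit": from_reddit, "youtube": from_youtube}


def ingest(source, item):
    """Normalize first, then score once, at the moment of ingestion."""
    mention = NORMALIZERS[source](item)
    return SocialMention(mention, polarity(mention.text))


stored = [
    ingest("reddit", {
        "body": "I love this product, great support!",
        "permalink": "https://reddit.example/r/x/1",
        "created_utc": 1704103200,
    }),
    ingest("youtube", {
        "text": "Update is broken",
        "video_url": "https://video.example/watch?v=1",
        "published_at": "2024-01-02T10:00:00+00:00",
    }),
]
```

Supporting a new source in this sketch means adding one normalizer function and one dictionary entry; the scoring and storage steps do not change.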