Intelligence & Crawling Resource Map

This map identifies the components responsible for project “Awareness”: the ability to scan the open web for SEO metrics and social brand mentions.


🔍 The Site Auditor (seo_intelligence)

Built as an autonomous “Graph Walker” for technical site health.

The Auditor ViewSet

  • trigger: Starts the BFS (Breadth-First Search) mission. It creates an SEOAudit record in a “PENDING” state and hands the task to the Celery worker.
  • results: Compiles the sharded data from all crawled pages into a unified “Issue Map.” It groups findings by “Severity” and “Type” (e.g., Performance vs. Content).
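As a rough illustration of the grouping step in results, the sketch below folds per-page findings into a nested map keyed by severity and then by type. The finding fields (url, severity, type, message) and the function name build_issue_map are assumptions made for this example, not the project's actual schema.

```python
from collections import defaultdict


def build_issue_map(page_findings):
    """Group findings from all crawled pages by severity, then by type.

    Each finding is assumed (hypothetically) to be a dict with the keys
    "url", "severity", "type", and "message".
    """
    issue_map = defaultdict(lambda: defaultdict(list))
    for finding in page_findings:
        issue_map[finding["severity"]][finding["type"]].append(
            {"url": finding["url"], "message": finding["message"]}
        )
    # Convert the nested defaultdicts to plain dicts so the result
    # serializes cleanly (for example, in a JSON response).
    return {severity: dict(by_type) for severity, by_type in issue_map.items()}


# Illustrative data only.
findings = [
    {"url": "/", "severity": "high", "type": "Performance", "message": "Slow first byte"},
    {"url": "/about", "severity": "high", "type": "Content", "message": "Missing title"},
    {"url": "/blog", "severity": "low", "type": "Content", "message": "Short meta description"},
    {"url": "/pricing", "severity": "high", "type": "Performance", "message": "Large images"},
]
issue_map = build_issue_map(findings)
```

With this shape, a dashboard can read issue_map["high"]["Performance"] directly without rescanning every page record.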

👂 The Brand Aggregator (media_intelligence)

The primary “Listening Post” for social sentiment and brand health.

The Media ViewSet

  • targets: Defines the “Keywords of Interest.” When a target is added, the system automatically schedules the first “Deep Crawl” to establish a historical sentiment baseline.
  • history: Aggregates SocialMention records into a time-series view. It computes the “Volume-to-Sentiment” ratio, which drives the charts in the main dashboard.
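The time-series aggregation in history can be pictured as bucketing mentions by day and tracking both the count and the average polarity per bucket. The sketch below assumes each mention exposes a created_at timestamp alongside polarity_score. The created_at name and the daily bucket size are assumptions, and the exact formula for the “Volume-to-Sentiment” ratio is not specified here, so the example stops at the two inputs such a ratio would need.

```python
from collections import defaultdict
from datetime import datetime


def daily_series(mentions):
    """Bucket mentions by calendar day; report volume and mean polarity."""
    buckets = defaultdict(list)
    for mention in mentions:
        buckets[mention["created_at"].date()].append(mention["polarity_score"])
    return [
        {
            "date": day.isoformat(),
            "volume": len(scores),
            "mean_polarity": sum(scores) / len(scores),
        }
        for day, scores in sorted(buckets.items())
    ]


# Illustrative data only.
mentions = [
    {"created_at": datetime(2024, 1, 1, 9, 0), "polarity_score": 0.5},
    {"created_at": datetime(2024, 1, 1, 17, 30), "polarity_score": -0.5},
    {"created_at": datetime(2024, 1, 2, 8, 15), "polarity_score": 1.0},
]
series = daily_series(mentions)
# series[0] -> {"date": "2024-01-01", "volume": 2, "mean_polarity": 0.0}
```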

🕸️ Native Scraper Engine

The underlying logic that powers our awareness.

  • UnifiedSEOAnalyzer: A stateless link-graph traverser. It is “Politeness-Aware,” respecting robots.txt and rate-limiting its requests to avoid IP bans during deep research.
  • MediaCrawler: Uses a “Source-Agnostic” pattern. Whether the data comes from Reddit or YouTube, it is normalized into a standard Mention schema before it ever hits our Sentiment Engine.
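To make the “Politeness-Aware” traversal concrete, here is a minimal sketch of a breadth-first crawl that consults robots.txt rules and pauses between fetches. It is not the UnifiedSEOAnalyzer implementation: the crawl_bfs function, its parameters, and the in-memory site graph are all hypothetical, and link extraction is injected as a callable so the example runs without network access. The robots.txt handling uses the standard library's urllib.robotparser.

```python
import time
from collections import deque
from urllib.robotparser import RobotFileParser


def crawl_bfs(start_url, get_links, robots, user_agent="example-bot",
              delay_seconds=1.0, max_pages=50, sleep=time.sleep):
    """Visit pages breadth-first, skipping disallowed URLs and pausing between fetches."""
    visited = []
    seen = {start_url}
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if not robots.can_fetch(user_agent, url):
            continue  # robots.txt disallows this path for our user agent
        if visited:
            sleep(delay_seconds)  # simple fixed-delay rate limit
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Rules are parsed from literal lines here; a real crawler would fetch
# the site's robots.txt instead.
robots = RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /private"])

# Hypothetical in-memory link graph standing in for real HTTP fetches.
site = {
    "https://example.com/": ["https://example.com/a", "https://example.com/private/x"],
    "https://example.com/a": ["https://example.com/b", "https://example.com/"],
    "https://example.com/b": [],
}
order = crawl_bfs("https://example.com/", lambda u: site.get(u, []), robots, delay_seconds=0)
# order -> ["https://example.com/", "https://example.com/a", "https://example.com/b"]
```

Passing the link extractor and the sleep function as arguments keeps the traversal itself free of I/O, which makes it straightforward to test and to run inside a worker task.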

📈 Intelligence Models

  • CompetitorDomain: A tracking model used for “Differential Benchmarking.” We store the SEO Health of a competitor side-by-side with the project’s own health to provide strategic context.
  • SocialMention: The persistent storage for a brand mention. It includes a polarity_score calculated at the moment of ingestion.
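The combination of source-agnostic normalization and scoring at ingestion could look roughly like the sketch below. Everything in it is illustrative: the Mention fields other than polarity_score, the raw payload keys for each source, and the word-list scorer are placeholders, not the real platform payloads or the project's Sentiment Engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    source: str
    text: str
    url: str
    polarity_score: float


# Toy word lists standing in for a real sentiment model.
POSITIVE = {"great", "love"}
NEGATIVE = {"broken", "hate"}


def toy_polarity(text):
    """Return a score in [-1.0, 1.0] from simple word counts (placeholder logic)."""
    words = text.lower().split()
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


# One normalizer per source; the raw keys are hypothetical.
NORMALIZERS = {
    "reddit": lambda raw: (raw["body"], raw["permalink"]),
    "youtube": lambda raw: (raw["comment_text"], raw["video_url"]),
}


def ingest(source, raw):
    """Normalize a raw payload into a Mention and score it once, at ingestion."""
    text, url = NORMALIZERS[source](raw)
    return Mention(source=source, text=text, url=url, polarity_score=toy_polarity(text))


reddit_mention = ingest("reddit", {"body": "love this great tool", "permalink": "https://example.com/r/1"})
youtube_mention = ingest("youtube", {"comment_text": "hate it, totally broken", "video_url": "https://example.com/v/2"})
```

Because the score is stored on the record, later reads such as the history view do not need to re-run sentiment analysis.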
