Database Architecture & Data Strategy
This manual defines the data structure, relationships, and management protocols for the PlugZero Intelligence platform.
🏗️ 1. Architectural Philosophy: The Hub-and-Spoke Model
PlugZero uses a Hub-and-Spoke database architecture. The central hub is the Project model. Every uploaded file, survey response, scraped webpage, or AI analysis result MUST be associated with a Project.
The Core Entity Relationship (ER) Logic:
- User (Accounts): Extended from AbstractUser. Owns or is a member of a Team.
- Team (Accounts): Grouping mechanism for Projects. Projects can be team-wide.
- Project (Data Ingestion): The container for all intelligence.
- RawDataFile / Survey / ScrapeTarget: These are “Ingress Spokes” that feed raw data into the Project.
- AnalysisResult / Report / ResearchInsight: These are “Egress Spokes” that store processed intelligence.
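The hub-and-spoke rule above can be sketched in plain Python. This is a minimal illustration, not the platform's actual models: the real classes would presumably be Django models with a required foreign key to Project, and the field names (name, path, payload) and the helper spokes_for are hypothetical.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class Project:
    """The hub: every spoke record points back to exactly one Project."""
    name: str
    id: UUID = field(default_factory=uuid4)


@dataclass
class RawDataFile:
    """Ingress spoke: raw data fed into a Project (fields are illustrative)."""
    project: Project  # no default, so a spoke cannot exist without a hub
    path: str


@dataclass
class AnalysisResult:
    """Egress spoke: processed intelligence derived from a Project's data."""
    project: Project
    payload: dict


def spokes_for(project, records):
    """Return the records, of any spoke type, that belong to the given project."""
    return [r for r in records if r.project.id == project.id]
```

Because project has no default value, constructing a spoke without a hub fails immediately, which mirrors the MUST-be-associated rule at the type level.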
📂 2. Key Data Domains
A. The Accounts Engine (accounts/models.py)
PlugZero implements a strictly controlled Role-Based Access Control (RBAC) system.
- User: Custom model using email as the unique identifier.
- TeamMembership: A “Through” model managing roles: OWNER, ADMIN, MEMBER, VIEWER.
- ActivityLog: An append-only audit trail logging every action for compliance.
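A role check over the four TeamMembership roles might look like the sketch below. The privilege ordering (OWNER above ADMIN above MEMBER above VIEWER) and the rule that editing requires MEMBER or higher are assumptions made for illustration; the manual lists the roles but not their permissions. The memberships dictionary stands in for the real through-table lookup.

```python
from enum import IntEnum


class Role(IntEnum):
    # Assumed ordering: a higher value means more privilege.
    VIEWER = 1
    MEMBER = 2
    ADMIN = 3
    OWNER = 4


def has_at_least(user_role: Role, required: Role) -> bool:
    """True if user_role is at least as privileged as the required role."""
    return user_role >= required


def can_edit_project(memberships: dict, user_email: str, team: str) -> bool:
    """Assumed rule: editing requires MEMBER or above; no membership means no access."""
    role = memberships.get((user_email, team))
    return role is not None and has_at_least(role, Role.MEMBER)
```

Keying the lookup on email matches the User model's use of email as the unique identifier.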
B. The Ingestion Engine (data_ingestion/models.py)
- RawDataFile: Stores file metadata and a JSON columns_metadata cache.
- ScraperJob: Logs individual scraping runs. Linked to ScrapedPage for raw text storage.
- Survey: Handles complex logic, quotas (SurveyQuota), and responses. Uses UUIDs for public URLs.
C. The Intelligence Engine (analysis/models.py)
- AnalysisResult: Uses JSONField to store Pandas/Scikit-Learn outputs for fast rendering.
- ResearchInsight: Stores atomic findings. Includes an embedding vector field for semantic search.
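To show how an embedding field can support semantic search, here is a small sketch that ranks insights by cosine similarity to a query vector. The manual does not say which similarity metric or vector store PlugZero uses, so cosine similarity, the in-memory list, and the function names are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_insights(query_embedding, insights, k=3):
    """insights is a list of (text, embedding) pairs; return the k most similar texts."""
    ranked = sorted(
        insights,
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```

A linear scan like this is only reasonable for small collections; a production setup would more likely delegate ranking to the database or a dedicated vector index.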
💾 3. Storage & Integrity Strategy
Data Type Standards
- UUIDs: Used for all public-facing identifiers (Surveys, Reports).
- JSONField: Used for variable schemas to maintain flexibility without frequent migrations.
- DateTime: All records use auto_now_add for auditing.
File vs. Database Storage
- Database: Stores metadata, settings, and small text results.
- Filesystem (media/ folder): Stores the actual raw CSVs, Excel files, and PDF uploads.
CRITICAL: The database stores the path to the file. Deleting a record in the DB does not automatically delete the file on the disk. Handle file deletion explicitly in the application logic.
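One way to handle the cleanup explicitly is sketched below. It is a framework-free illustration: the records dictionary stands in for the database table, and delete_record_and_file and the path key are hypothetical names. In a Django codebase the same logic is often attached to a post_delete signal or an overridden delete() method.

```python
from pathlib import Path


def delete_record_and_file(records: dict, record_id: str, media_root: Path) -> bool:
    """Delete a record and the file it points to.

    Returns True if a file was removed from disk, False if it was already missing.
    Raises KeyError for an unknown record_id.
    """
    record = records[record_id]
    root = media_root.resolve()
    file_path = (root / record["path"]).resolve()
    # Refuse to touch anything outside the media root (e.g. a "../" path).
    if root not in file_path.parents:
        raise ValueError(f"{file_path} is outside {root}")
    del records[record_id]  # remove the "database" row first
    if file_path.is_file():
        file_path.unlink()  # then remove the file itself
        return True
    return False
```

The path check runs before anything is deleted, so a rejected path leaves both the record and the disk untouched.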
🔧 4. Maintenance & Migrations
Standard Operating Procedure (SOP)
Before applying ANY database changes:
- Verify the environment: python manage.py check
- Generate migrations: python manage.py makemigrations
- Apply migrations: python manage.py migrate
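The three steps can be chained in a small wrapper so that a failure in an earlier step prevents the later ones from running. This is a sketch under the assumption that it is executed from the directory containing manage.py; run_sop and SOP_STEPS are hypothetical names, and the runner parameter exists only so the function can be exercised without a real Django project.

```python
import subprocess
import sys

# The SOP steps, in the order they must run.
SOP_STEPS = [
    ["manage.py", "check"],
    ["manage.py", "makemigrations"],
    ["manage.py", "migrate"],
]


def run_sop(runner=subprocess.run, python=sys.executable):
    """Run each step in order and return the commands that completed.

    With the default runner, check=True makes subprocess.run raise
    CalledProcessError on a non-zero exit code, which stops the sequence.
    """
    completed = []
    for step in SOP_STEPS:
        cmd = [python, *step]
        runner(cmd, check=True)
        completed.append(cmd)
    return completed
```

Calling run_sop() with no arguments runs the real commands using the current interpreter.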
Data Retention
Background tasks (Celery) automatically purge old ActivityLog entries or temporary caches based on the data_retention_days setting.
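The core of such a purge is a cutoff computed from data_retention_days. The sketch below shows only that calculation and the keep/drop decision on in-memory entries; the created_at key and both function names are assumptions, and the actual Celery task and its query are not shown in this manual.

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(now: datetime, data_retention_days: int) -> datetime:
    """Entries created before this moment are eligible for purging."""
    return now - timedelta(days=data_retention_days)


def purge_expired(entries, now, data_retention_days):
    """Return the entries to keep: those created at or after the cutoff."""
    cutoff = retention_cutoff(now, data_retention_days)
    return [e for e in entries if e["created_at"] >= cutoff]
```

Passing now in explicitly, for example datetime.now(timezone.utc), keeps the function deterministic and easy to test.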