The Analysis Engine
The analysis app is the math core of PlugZero Analytics. It takes raw data and turns it into the stats and insights you see in the dashboard.
Where the code lives
Most of the math happens in these files:
- plugzero_api/analysis/processors.py: This contains the Python functions for math and machine learning.
- plugzero_api/analysis/views.py: This handles the API requests that trigger the math.
How we process data
When the system runs an analysis, it follows these technical steps:
1. Loading the data
We use a library called Pandas.
- Functions like get_data_for_project() find all the files for a project and merge them into one big table (a DataFrame).
- We clean the data here (removing extra spaces, fixing date formats).
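The loading step can be sketched roughly as follows. This is a minimal illustration, not the actual body of get_data_for_project(): the helper name load_and_clean, the CSV input, and the column names (date, region, revenue) are all assumptions made for the example.

```python
import io
import pandas as pd

def load_and_clean(files, text_columns, date_column):
    """Hypothetical helper: merge several CSV sources and tidy the result."""
    frames = [pd.read_csv(f) for f in files]
    # Stack all files into one table with a fresh 0..n-1 index.
    df = pd.concat(frames, ignore_index=True)
    # Remove extra spaces from column names and from text cells.
    df.columns = df.columns.str.strip()
    for col in text_columns:
        df[col] = df[col].str.strip()
    # Normalise dates; values that cannot be parsed become NaT.
    df[date_column] = pd.to_datetime(df[date_column], errors="coerce")
    return df

# Two in-memory "files" standing in for a project's uploads (made-up data).
file_a = io.StringIO("date, region ,revenue\n2024-01-05, North ,100\n")
file_b = io.StringIO("date, region ,revenue\n2024-02-10,South,250\n")
merged = load_and_clean([file_a, file_b], text_columns=["region"], date_column="date")
print(merged)
```

Using pd.concat keeps every row from every file, which matches the "one big table" description; a project whose files share a key column might need a merge instead.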
2. Basic Math (calculate_basic_stats)
Before doing any AI work, we calculate the basics:
- Mean (Average): The sum of the values divided by how many there are.
- Mode: The most common value.
- Count: How many entries exist.
- Technical Detail: We use pd.to_numeric(..., errors='coerce') to make sure text columns don't break our math. Any value that cannot be parsed as a number becomes NaN and is skipped by the calculations.
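As a rough sketch of the basic statistics step, the snippet below coerces a text column to numbers and computes the mean, mode, and count. The function name basic_stats_sketch and the returned dictionary layout are assumptions for illustration; the real calculate_basic_stats may differ.

```python
import pandas as pd

def basic_stats_sketch(series):
    """Hypothetical stand-in for calculate_basic_stats on one column."""
    # Non-numeric entries such as "n/a" become NaN instead of raising.
    numeric = pd.to_numeric(series, errors="coerce")
    modes = numeric.mode()  # NaN values are ignored by default
    return {
        "mean": float(numeric.mean()),
        "mode": float(modes.iloc[0]) if not modes.empty else None,
        "count": int(numeric.count()),  # counts only non-NaN entries
    }

stats = basic_stats_sketch(pd.Series(["10", "20", "20", "n/a", "30"]))
print(stats)  # {'mean': 20.0, 'mode': 20.0, 'count': 4}
```

Note that the count here excludes the coerced "n/a" entry; whether the dashboard counts such rows is not stated above.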
3. Finding “Weird” Data (Outliers)
We use a machine learning algorithm called Isolation Forest from the scikit-learn (sklearn) library.
- Why?: It can flag rows that are unusual across several columns at once, which a simple comparison against the average would miss.
- The Result: It gives every row a score. Rows with a low score are flagged as “weird” or “outliers” in the dashboard.
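The outlier step might look something like this. It is a minimal sketch on synthetic data using scikit-learn's IsolationForest with default settings; the parameters the project actually uses and the cutoff for flagging a row are not given above, so treat both as assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data: 200 "normal" values near 100, plus one extreme value.
rng = np.random.RandomState(0)
normal = rng.normal(loc=100, scale=5, size=(200, 1))
X = np.vstack([normal, [[500.0]]])

model = IsolationForest(random_state=0)
model.fit(X)

scores = model.score_samples(X)  # lower score = more anomalous
labels = model.predict(X)        # -1 = outlier, 1 = inlier

print("lowest-scoring row:", int(np.argmin(scores)))
print("label of the extreme row:", int(labels[-1]))
```

score_samples gives the per-row score, while predict applies scikit-learn's default threshold to turn scores into outlier flags.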
4. AI Insights (ai_engine.py)
We use Gemini 2.0 Flash to write the summaries.
- We never send raw, messy data to the AI.
- We send a clean JSON summary of the math we already did.
- This prevents the AI from making up numbers (hallucinating).
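To illustrate the idea of sending only a summary, the sketch below packs already-computed results into a JSON string. The function name build_ai_summary and every field name are hypothetical; the schema that ai_engine.py actually sends to Gemini is not shown here, and no API call is made.

```python
import json

def build_ai_summary(column_stats, outlier_count, total_rows):
    """Hypothetical helper: serialise computed results, never raw rows."""
    summary = {
        "total_rows": total_rows,
        "outlier_count": outlier_count,
        "columns": column_stats,
    }
    # sort_keys keeps the output stable, which helps with testing and caching.
    return json.dumps(summary, sort_keys=True)

payload = build_ai_summary(
    {"revenue": {"mean": 20.0, "mode": 20.0, "count": 4}},
    outlier_count=1,
    total_rows=5,
)
print(payload)
```

Because the model only sees numbers that were computed beforehand, any figure it quotes can be checked against this payload.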
Code Example: Running an Analysis
If you want to trigger a new analysis from the frontend, look at the AnalysisViewSet in the backend. You send a POST request to /api/analysis/run/ with a body like this:
{
  "project_id": "123",
  "calculation_type": "swot",
  "variables": ["revenue", "region"]
}
Memory limit: If your dataset is larger than 100,000 rows, do not run the analysis in the web view. You must use the background task system.
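For scripts or tests, the same request can be prepared from Python. This sketch only builds the request object and does not send it; the base URL http://localhost:8000 and both helper names are assumptions, and any authentication the API requires is left out. The row check simply mirrors the 100,000-row limit stated above.

```python
import json
import urllib.request

MAX_WEB_ROWS = 100_000  # limit described in this document

def build_run_request(base_url, project_id, calculation_type, variables):
    """Hypothetical helper: prepare the POST to /api/analysis/run/."""
    body = json.dumps({
        "project_id": project_id,
        "calculation_type": calculation_type,
        "variables": variables,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/analysis/run/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def needs_background_task(row_count):
    """True when the dataset is too large for the web view."""
    return row_count > MAX_WEB_ROWS

request = build_run_request(
    "http://localhost:8000", "123", "swot", ["revenue", "region"]
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; how a job is handed to the background task system is not covered above, so that part is omitted.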