Ingestion & Inflow Resource Map
This map outlines the services responsible for the secure collection and “Semantic Cleanup” of external research data.
📂 The Asset Gateway
The File Ingestor (RawDataFileViewSet)
Handles the physical and logical ingestion of research artifacts.
- create: Triggers the “Deconstruction” of a file. It doesn’t just store the blob; it initiates the background worker that scans for headers and data types.
- preview: A stateless window into the data. It returns a “Micro-Sample” (the first 50 rows), allowing the frontend to render the initial table view without loading the entire multi-megabyte dataset.
- gsheets_import: A bridge to the Google ecosystem. It treats a public URL as a “Live Stream,” pulling data into our internal normalization pipeline.
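The “Micro-Sample” behind preview can be sketched as a streaming read that stops after 50 rows. This is a minimal illustration, assuming CSV input; micro_sample and PREVIEW_ROWS are hypothetical names, not the actual ingestor code.

```python
import csv
import io

PREVIEW_ROWS = 50  # assumed sample size, matching the "first 50 rows" behavior

def micro_sample(raw_bytes: bytes, limit: int = PREVIEW_ROWS) -> list[list[str]]:
    """Return the first `limit` rows of a CSV blob without
    materializing the full, potentially multi-megabyte file."""
    reader = csv.reader(io.StringIO(raw_bytes.decode("utf-8")))
    sample = []
    for i, row in enumerate(reader):
        if i >= limit:
            break  # stop early: the rest of the file is never parsed
        sample.append(row)
    return sample
```

Because the reader is consumed lazily, the cost of a preview is bounded by the sample size, not the file size.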
The Survey Architect (SurveyViewSet)
A complex orchestrator for interactive data collection.
- generate_questions: Uses the “Generative Scaffolding” logic to turn a project intent into a valid Survey schema. It returns a Transient Draft which the user must confirm before it is committed to the database.
- duplicate: Performs a “Recursive Clone.” It copies not just the survey, but all nested logic gates, distribution settings, and question branches to a new UUID.
- distribute: The outbound “Engagement Hub.” It handles the generation of unique, trackable links for participants to ensure data provenance.
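The “Recursive Clone” idea can be illustrated as a deep copy that assigns every node a fresh UUID. The Survey and Question dataclasses below are simplified stand-ins for the real Django models, and duplicate_survey is a hypothetical helper, not the actual duplicate action.

```python
import uuid
from dataclasses import dataclass, field, replace

@dataclass
class Question:
    id: str
    text: str

@dataclass
class Survey:
    id: str
    title: str
    questions: list[Question] = field(default_factory=list)

def duplicate_survey(original: Survey) -> Survey:
    """Clone the survey tree: same content, but every node
    (survey and each nested question) gets a new UUID."""
    return Survey(
        id=str(uuid.uuid4()),
        title=f"{original.title} (copy)",
        questions=[replace(q, id=str(uuid.uuid4())) for q in original.questions],
    )
```

The key property is that no identifier is shared between original and copy, so edits to the clone can never leak back into the source survey.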
🛠️ The Extraction Engine (file_processor.py)
The “Low-Level” machinery that powers the ingestors.
- PDF/DOCX Extractor: A stateless utility that strips formatting to extract core textual data for our Intelligence Engine’s NLP pipeline.
- Semantic Analyzer: The logic that guesses “Who is this column?”. It uses heuristic pattern matching to identify Email addresses, Dates, and Sentiment-heavy text fields.
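The heuristic pattern matching in the Semantic Analyzer can be sketched as a majority-vote regex classifier over sampled column values. The patterns, the 0.8 threshold, and guess_column_type are all assumptions for illustration, not the real file_processor.py logic.

```python
import re

# Deliberately simple patterns for the sketch; production heuristics
# would be more permissive (e.g. multiple date formats).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def guess_column_type(values: list[str], threshold: float = 0.8) -> str:
    """Label a column 'email' or 'date' if at least `threshold`
    of its non-empty sampled values match, else fall back to 'text'."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return "text"
    for label, pattern in (("email", EMAIL_RE), ("date", DATE_RE)):
        hits = sum(1 for v in non_empty if pattern.match(v))
        if hits / len(non_empty) >= threshold:
            return label
    return "text"
```

Using a threshold rather than requiring every value to match keeps the guess robust against a few dirty cells in an otherwise uniform column.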
📋 Distribution Models
- SurveyResponse: The atomic unit of collection. Every response is “Immutable” once submitted—it can be archived but never modified, ensuring the integrity of the research findings.
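The “archived but never modified” guarantee can be enforced at the application layer. This is one possible sketch, assuming a guard on post-submission writes; the class shape, ImmutableError, and method names are illustrative, not the project’s actual model.

```python
class ImmutableError(Exception):
    """Raised when a submitted response is mutated."""

class SurveyResponse:
    def __init__(self, answers: dict):
        self._answers = dict(answers)
        self._submitted = False
        self.archived = False

    def submit(self) -> None:
        # Submission is the point of no return for the answer payload.
        self._submitted = True

    def update_answers(self, answers: dict) -> None:
        if self._submitted:
            raise ImmutableError("responses cannot be modified after submission")
        self._answers.update(answers)

    def archive(self) -> None:
        # Archiving is permitted; it changes visibility, not the data.
        self.archived = True
```

Separating archive from update_answers makes the policy explicit: lifecycle state may change, research data may not.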