An environment for inquiry - complete documentation
Veritheia is formative technology - epistemic infrastructure that makes formation scalable in the midst of information overload. The system provides journey projection spaces where documents are transformed according to user-defined intellectual frameworks, enabling formation through authorship. Users develop intellectual capacity through engagement with projected documents, not through consumption of AI-generated outputs.
Formation Note: The Neurosymbolic Transcendence - The system demonstrates the revolutionary capability where users author their own symbolic systems through natural language. When a researcher writes “Papers are relevant if they provide empirical evidence,” or a teacher writes “Good essays use sensory details,” these natural language statements become the symbolic rules governing document processing. No programming required—the user’s words ARE the symbolic system.
Note: These are core formation patterns - the same infrastructure supports any formative journey that meets the architecture’s authorship constraints.
The system supports formative journeys - intellectual development through engagement with documents. These patterns demonstrate how formation scales through different types of authorship:
FORMATIVE GOAL: Dr. Sarah develops research formation - the accumulated scholarly capacity to conduct systematic literature reviews. She doesn’t receive AI-generated summaries but authors her own understanding through engagement with projected documents.
CONCRETE USER STORY: Dr. Sarah has 3,247 papers but can manually engage with only ~200. Through Veritheia, she develops the intellectual capacity to synthesize insights from the full corpus through her own authorship.
PRECISE PROCESS:
FORMATIVE GOAL: Ms. Priya develops pedagogical formation - the accumulated capacity to create meaningful assignments and evaluate student growth. Students develop authentic voice through constrained composition exercises. Both teacher and students author their understanding through engagement.
CONCRETE USER STORY: Ms. Priya needs to evaluate 30 student essays but can only provide detailed individual feedback to ~10. Through Veritheia, she develops the capacity to support all students’ formation while maintaining authentic assessment.
PRECISE PROCESS:
Level 0: Partition Architecture
Level 1: Journey Projection Spaces
Level 2: Process Engine Framework
// IMMUTABLE INTERFACE (Pattern analogue, not DDD implementation)
public interface IAnalyticalProcess
{
string ProcessType { get; }
Task<ProcessResult> ExecuteAsync(ProcessContext context);
InputDefinition GetInputDefinition();
}
The system follows a composable architectural pattern where components can be combined in different configurations to serve different deployment scenarios. This pattern enables clean separation of concerns while supporting flexible composition of functionality.
ApiService Component: The core business logic component provides the application programming interface for all system operations. This component contains all business logic, data access patterns, and domain services without any presentation or transport concerns. It serves as the foundation that can be composed with different interface components.
Important: The “API” in ApiService refers to Application Programming Interface, not HTTP REST API. This component is a pure business logic library that provides programming interfaces for other components to consume through direct method calls.
Web Component: The user interface component provides the Blazor Server presentation interface. This component imports the ApiService component and calls its programming interface directly. It handles user interaction, authentication, and session management while delegating all business operations to the ApiService component.
ApiGateway Component: The public interface component provides HTTP API endpoints for external system integration. This component imports the ApiService component and exposes its programming interface through HTTP protocols. It serves as the bridge between external systems and the core business logic.
MCPGateway Component: The AI agent interface component provides Model Context Protocol endpoints for AI agent integration. This component imports the ApiService component and exposes its programming interface through MCP protocols. It enables AI agents to access system functionality through standardized MCP interfaces.
Components communicate through direct method calls within the same process. The ApiService component defines the programming interface that other components consume. This pattern eliminates network overhead while maintaining clean architectural boundaries.
Different deployment scenarios can compose these components in various ways. A simple deployment might combine Web and ApiService components. A public API deployment might combine ApiGateway and ApiService components. A full deployment might include all three components.
The composable architecture provides clear extension points for additional components. New interface components can be added that import the ApiService component. The ApiService component can be extended with additional business logic without affecting interface components. This pattern supports progressive enhancement and flexible deployment scenarios.
Level 3: Journey Frameworks (User-defined schemas, evolve per journey)
Example Framework Patterns (Not fixed formats - illustrative structures):
Framework elements might include research questions, definitions, assessment criteria, theoretical orientations, learning objectives, rubrics, safety constraints - but the specific schema is user-defined and can evolve.
Projection rules might include segmentation strategies, embedding contexts, assessment prompts, evaluation stages - but these are configured per journey type and can change as the journey develops.
Note: These are configurable scaffolds that accept multiple schema evolutions per journey type, not required formats.
Level 4: Formative Process Implementations (Extensible, must support formation through authorship)
Known Process Implementations:
SystematicScreeningProcess - Implements the LLAssist methodology for systematic literature review. Enables processing thousands of documents rather than hundreds through dual-phase assessment: Phase 1 extracts key semantics (topics, entities, keywords) from abstracts; Phase 2 evaluates each document against user research questions through relevance assessment (discusses the topic) and contribution assessment (researches the topic directly). Generates scores (0-1 scale), binary decisions, and reasoning chains. Every document receives identical treatment to prevent selective processing bias. Formation occurs through user engagement with these measurements.
ConstrainedCompositionProcess - Embodies EdgePrompt methodology for pedagogical formation within lesson planning journeys.
Extension Requirements: New processes must enable formation through authorship within journey projection spaces
Process Input/Output Patterns: Processes accept researcher frameworks and produce assessments for user evaluation. SystematicScreeningProcess: processes documents from user corpus based on defined research questions and optional screening criteria. Documents are added to corpus via CSV import (title, abstract, authors, venue, keywords) or other methods. Process generates outputs with scores (0-1), binary decisions, and reasoning chains. Enables processing thousands of documents while maintaining user control through evaluation of measurements, not consumption of insights.
Input/Output Formats: CSV input requires columns: title, abstract, authors, publication_venue, keywords. Research questions as plain text, one per line. JSON output contains per-article: extracted semantics (topics, entities, keywords arrays), scores (relevance_score, contribution_score as floats 0.0-1.0), binary decisions (is_relevant, is_contributing), reasoning chains, must-read determination (logical OR of decisions).
User Interface Patterns: The main interface uses enterprise sidebar navigation organizing around journeys, not processes. Sidebar Navigation provides primary access to journeys, personas, and system functions. Main Content Area displays journey-specific interfaces, process results, and configuration screens. Journey Dashboard shows active journeys with persona context, process types, current states, and recent activity. Users create new journeys by selecting persona and process, then configure journey-specific frameworks.
SystematicScreeningProcess requires tabular interface for thousand-document result sets within journey context. Core table displays: Title | Authors | Venue | Relevance Score | Contribution Score | Must-Read | Actions. Filters enable viewing subsets (relevant only, contributing only, must-read only, by venue, by score thresholds). Row selection enables bulk actions. Detail view shows reasoning chains and extracted semantics. Progress interface shows processing status during batch execution. Formation interface enables iterative refinement of research questions based on result patterns. |
Sidebar Navigation (Persistent across all pages):
┌─────────────────────────────────────────────────────────────────────────────┐
│ ┌─────────────┐ │
│ │ VERITHEIA │ Dr. Sarah Chen │ [Researcher ▼] │ [Settings] │ [Logout] │
│ └─────────────┘ │
├─────────────────────────────────────────────────────────────────────────────┤
│ │📊 Dashboard │ Main Content Area (varies by selection) │
│ │ │ │
│ │🚀 Journeys │ ┌─────────────────────────────────────────────────────────┐ │
│ │ • ML Sec... │ │ [Selected content displays here] │ │
│ │ • Research │ │ │ │
│ │ • Market │ │ Dashboard: Journey overview and recent activity │ │
│ │ │ │ Journey: Process interface and results │ │
│ │👤 Personas │ │ Personas: Context management and vocabulary │ │
│ │ • Researcher│ │ Documents: Upload and corpus management │ │
│ │ • Student │ │ Settings: User preferences and configuration │ │
│ │ • Entrepren.│ │ │ │
│ │ │ │ │ │
│ │📄 Documents │ │ │ │
│ │ │ │ │ │
│ │⚙️ Settings │ │ │ │
│ │ │ │ │ │
│ │❓ Help │ │ │ │
│ └─────────────┘ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
Journey Dashboard (When Dashboard selected in sidebar):
┌─────────────────────────────────────────────────────────────────────────────┐
│ │📊 Dashboard │ Active Journeys (3) [+ New Journey] │
│ │ │ │
│ │🚀 Journeys │ 📊 ML Security Literature Review │
│ │ • ML Sec... │ Researcher persona │ SystematicScreeningProcess │
│ │ • Research │ 2,576 docs → 324 must-read │ Processing (67%) │
│ │ • Market │ [Continue] [View Results] [Pause] │
│ │ │ │
│ │👤 Personas │ 🎓 Research Methods Course │
│ │ • Researcher│ Student persona │ ConstrainedCompositionProcess │
│ │ • Student │ Essay framework │ Active │ Last: 1 day ago │
│ │ • Entrepren.│ [Continue] [View Submissions] [Edit Framework] │
│ │ │ │
│ │📄 Documents │ 💼 Startup Market Analysis │
│ │ │ Entrepreneur persona │ SystematicScreeningProcess │
│ │⚙️ Settings │ 847 docs → 89 must-read │ Completed: 3 days ago │
│ │ │ [View Results] [Export] [Archive] │
│ │❓ Help │ │
│ └─────────────┘ Recent Activity │
│ • ML Security: 156 new assessments completed │
│ • Research Methods: 3 new student submissions │
│ • Market Analysis: Results exported to PDF │
└─────────────────────────────────────────────────────────────────────────────┘
Journey Detail View (When specific journey selected in sidebar):
┌─────────────────────────────────────────────────────────────────────────────┐
│ │📊 Dashboard │ ML Security Literature Review │ [⚙️ Configure] [📤 Export] │
│ │ │ │
│ │🚀 Journeys │ ┌─ Process Status ─────────────────────────────────────────┐ │
│ │ • ML Sec... │ │ SystematicScreeningProcess │ Processing (67%) │ │
│ │ • Research │ │ Article 1,724 of 2,576 │ 4.2h elapsed │ 2.8h remaining │ │
│ │ • Market │ │ [Pause] [Resume] [Cancel] │ │
│ │ │ └─────────────────────────────────────────────────────────┘ │
│ │👤 Personas │ │
│ │ • Researcher│ ┌─ Results Summary ────────────────────────────────────────┐ │
│ │ • Student │ │ Total: 2,576 docs │ Processed: 1,724 │ Must-Read: 243 │ │
│ │ • Entrepren.│ │ [View All Results] [View Must-Read Only] [Export CSV] │ │
│ │ │ └─────────────────────────────────────────────────────────┘ │
│ │📄 Documents │ │
│ │ │ ┌─ Framework Configuration ────────────────────────────────┐ │
│ │⚙️ Settings │ │ Research Questions (4) │ Last updated: 2 hours ago │ │
│ │ │ │ • How are LLMs being utilized for threat detection? │ │
│ │❓ Help │ │ • What are the main challenges in AI security? │ │
│ └─────────────┘ │ [Edit Questions] [Add Question] [View Full Framework] │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
Journey Dashboard (Main Interface - Journey-Centric Organization):
┌──────────────────────────────────────────────────────────────────────────────┐
│ Veritheia │ Dr. Sarah Chen │ [Researcher ▼] │ [Settings] │ [Help] │ [Logout] │
├──────────────────────────────────────────────────────────────────────────────┤
│ Active Journeys (3) [+ New Journey] │
├──────────────────────────────────────────────────────────────────────────────┤
│ 📊 ML Security Literature Review │
│ Researcher persona │ SystematicScreeningProcess │ Processing (67%) │
│ 2,576 docs → 324 must-read │ Last activity: 2 hours ago │
│ [Continue] [View Results] [Pause Processing] │
├──────────────────────────────────────────────────────────────────────────────┤
│ 🎓 Research Methods Course │
│ Student persona │ ConstrainedCompositionProcess │ Active │
│ Essay assignment framework │ Last activity: 1 day ago │
│ [Continue] [View Submissions] [Edit Framework] │
├──────────────────────────────────────────────────────────────────────────────┤
│ 💼 Startup Market Analysis │
│ Entrepreneur persona │ SystematicScreeningProcess │ Completed │
│ 847 docs → 89 must-read │ Completed: 3 days ago │
│ [View Results] [Export] [Archive] │
├──────────────────────────────────────────────────────────────────────────────┤
│ Recent Activity │
│ • ML Security: 156 new assessments completed │
│ • Research Methods: 3 new student submissions │
│ • Market Analysis: Results exported to PDF │
└──────────────────────────────────────────────────────────────────────────────┘
New Journey Creation (Process Selection Interface):
┌──────────────────────────────────────────────────────────────────────────────┐
│ Create New Journey [Cancel][×] │
├──────────────────────────────────────────────────────────────────────────────┤
│ Journey Name: [Cybersecurity AI Applications Review____________] │
│ Purpose: [Systematic review of AI/ML applications in cyber defense_______] │
├──────────────────────────────────────────────────────────────────────────────┤
│ Select Persona: │
│ ● Researcher (Domain expertise, investigation methods) │
│ ○ Student (Academic vocabulary, learning-focused patterns) │
│ ○ Entrepreneur (Business terminology, market analysis approaches) │
├──────────────────────────────────────────────────────────────────────────────┤
│ Select Process: │
│ ● SystematicScreeningProcess │
│ 📋 Literature review with dual-phase assessment (relevance + contribution) │
│ 📊 Handles thousands of documents, generates must-read determinations │
│ 📝 Requires: CSV datasets, research questions │
│ │
│ ○ ConstrainedCompositionProcess │
│ ✍️ Pedagogical formation with structured writing frameworks │
│ 👥 Student response evaluation, assignment creation │
│ 📝 Requires: Learning objectives, rubrics, safety constraints │
│ │
│ ○ [Custom Process] (Extension capability) │
├──────────────────────────────────────────────────────────────────────────────┤
│ [Create Journey] [Cancel] │
└──────────────────────────────────────────────────────────────────────────────┘
Systematic Screening Results (2,576 documents → 324 must-read)
┌─────────────────────────────────────────────────────────────────────────────┐
│ [Must-Read Only ▼] [Year: All ▼] [Venue: Scopus ▼] [Threshold≥0.7] [Export] │
├─────────────────────────────────────────────────────────────────────────────┤
│☐ Title │Authors │RQ1│RQ2│RQ3│RQ4│Must│Actions │
├─────────────────────────────────────────────────────────────────────────────┤
│☐ LLMs for Threat Detection... │Chen, L. │0.9│0.8│0.2│0.1│ ✓ │[View][Tag] │
│☐ AI Security Vulnerabilities..│Smith, J. │0.7│0.9│0.3│0.6│ ✓ │[View][Tag] │
│☐ Adversarial ML in Cyber... │Brown, M. │0.3│0.4│0.9│0.2│ ✓ │[View][Tag] │
├─────────────────────────────────────────────────────────────────────────────┤
│ Showing 1-50 of 324 must-read results [1][2]...[7][>] │
└─────────────────────────────────────────────────────────────────────────────┘
Detail View (when row selected):
┌─────────────────────────────────────────────────────────────────────────────┐
│ AI Security in Critical Infrastructure Systems │
│ Authors: Smith, J., Davis, K. │ Venue: IEEE Security & Privacy 2024 │
├─────────────────────────────────────────────────────────────────────────────┤
│ Relevance Assessment (0.9): Discusses strategic AI implementation in │
│ cyber defense contexts as specified in RQ1. │
│ │
│ Contribution Assessment (0.8): Directly researches AI deployment │
│ strategies with empirical evaluation as specified in RQ2. │
├─────────────────────────────────────────────────────────────────────────────┤
│ Extracted Semantics: [AI deployment][risk mitigation][critical systems] │
│ Must-Read: YES (meets both relevance and contribution thresholds) │
└─────────────────────────────────────────────────────────────────────────────┘
Processing Status (during 10+ hour batch execution):
┌─────────────────────────────────────────────────────────────────────────────┐
│ Processing: Article 1,247 of 2,576 (48%) │ Elapsed: 4.2h │ Remaining: 4.8h │
│ Current: "Machine Learning for Intrusion Detection Systems" │
│ [Pause] [Resume] [Cancel] Last saved: 2 minutes ago │
└─────────────────────────────────────────────────────────────────────────────┘
CRITICAL: All user-owned entities use composite primary keys (UserId, Id)
-- CORRECT partition-enforced schema
CREATE TABLE journeys (
user_id UUID NOT NULL,
id UUID NOT NULL,
PRIMARY KEY (user_id, id),
-- other fields...
);
CREATE TABLE journey_document_segments (
user_id UUID NOT NULL,
id UUID NOT NULL,
journey_id UUID NOT NULL,
PRIMARY KEY (user_id, id),
FOREIGN KEY (user_id, journey_id) REFERENCES journeys(user_id, id)
);
-- ALL indexes start with user_id for partition locality
CREATE INDEX idx_segments_user_journey ON journey_document_segments(user_id, journey_id);
Core Tables for Both Journey Types:
journey_frameworks
- User’s theoretical/pedagogical frameworkjourney_document_segments
- Documents projected into journey spacejourney_segment_assessments
- AI measurements within journey contextjourney_formations
- Captured insights and understanding developmentFramework Storage Pattern:
-- Same table structure supports both journey types
CREATE TABLE journey_frameworks (
user_id UUID NOT NULL,
journey_id UUID NOT NULL,
journey_type VARCHAR(100) NOT NULL, -- 'literature_review' | 'educational_assessment'
framework_elements JSONB NOT NULL, -- RQs + definitions OR objectives + rubrics
projection_rules JSONB NOT NULL, -- How to segment/embed/assess documents
PRIMARY KEY (user_id, journey_id)
);
The Process Engine implements neurosymbolic architecture, transcended—the critical differentiator from all legacy systems. Traditional symbolic AI requires programmers to encode rules in formal languages. Veritheia transcends this: users author symbolic systems in natural language (“Papers must provide empirical evidence”), and neural components interpret these natural language rules. The mechanical orchestration ensures these user-authored rules apply to ALL documents identically, regardless of corpus size.
Execution Context (Neurosymbolic-aware):
public class ProcessContext
{
public Guid UserId { get; set; } // ALWAYS required for partition
public Guid JourneyId { get; set; } // Journey boundary enforcement
public string NaturalLanguageFramework { get; set; } // User's authored symbolic system
public ProcessInputs Inputs { get; set; } // Process-specific parameters
}
Neurosymbolic Process Implementation Examples:
Research Formation Process (Direct implementation of LLAssist methodology):
public class SystematicScreeningProcess : IAnalyticalProcess
{
private readonly ICognitiveService _cognitiveService;
public async Task<ProcessResult> ExecuteAsync(ProcessContext context)
{
// User's natural language framework becomes the symbolic system:
// "I'm investigating how LLMs enhance cybersecurity threat detection.
// By 'contextualized AI' I mean systems that leverage domain-specific expertise...
// I need papers that directly contribute to understanding this relationship..."
var documents = await GetAllDocuments(context.UserId, context.JourneyId);
// MECHANICAL ORCHESTRATION: Process ALL documents identically
foreach (var document in documents)
{
// Neural semantic understanding of user's authored symbolic framework
var assessment = await _cognitiveService.ProcessWithUserFramework(
document.Content,
context.NaturalLanguageFramework
);
// Systematic storage - every document gets processed and stored
await StoreAssessment(context.UserId, document.Id, assessment);
}
// Result: All documents processed through user's authored framework
return new ProcessResult { ProcessedCount = documents.Count };
}
}
Pedagogical Formation Process (Direct implementation of EdgePrompt methodology):
public class ConstrainedCompositionProcess : IAnalyticalProcess
{
private readonly ICognitiveService _cognitiveService;
public async Task<ProcessResult> ExecuteAsync(ProcessContext context)
{
// Teacher's natural language framework becomes the symbolic system:
// "My Grade 5 students need to develop descriptive writing using sensory details.
// I want them to write 50-100 words about familiar topics, using vocabulary
// appropriate for their age level. No inappropriate content or violence..."
var studentResponses = context.Inputs.StudentResponses;
// MECHANICAL ORCHESTRATION: Process ALL student responses identically
foreach (var response in studentResponses)
{
// Neural semantic understanding of teacher's authored symbolic framework
var evaluation = await _cognitiveService.ProcessWithUserFramework(
response.Content,
context.NaturalLanguageFramework
);
// Systematic storage - every student gets identical evaluation process
await StoreEvaluation(context.UserId, response.Id, evaluation);
}
// Result: All responses processed through teacher's authored framework
return new ProcessResult { ProcessedCount = studentResponses.Count };
}
}
All data in production and development must originate from authentic neural processing. Embeddings must come from language models—never from random generation, hashing algorithms, or statistical approximations. Assessments must come from neural evaluation—never from keyword matching, rule-based scoring, or default values. Extractions must come from semantic understanding—never from regular expressions, string manipulation, or template matching.
When neural processing cannot occur, the operation must fail completely. No degraded modes. No approximate processing. No “good enough” substitutes. The absence of neural processing is communicated as explicit failure, not masked with fake data.
Test Environment Exception: Integration tests validating data flow paths may use test doubles that generate deterministic fake embeddings when language models are unavailable. These test doubles must:
TestCognitiveAdapter
)Every document in a processing request must receive identical treatment attempt. The first document and the thousandth document must go through exactly the same pipeline with exactly the same operations. No optimization shortcuts. No sampling strategies. No progressive degradation.
When individual documents cannot be processed, continue processing while explicitly tracking each failure. If document 847 of 3,000 causes an error, record the failure and continue with document 848. The user must receive complete transparency: “2,847 processed successfully, 153 failed” with detailed failure logs showing which documents failed and why. Never silently skip documents. Never report completion without full failure disclosure.
Users must receive immediate notification of any failure that affects formation. The notification must identify the specific failure, explain its impact on formation, and provide actionable next steps. No silent retries. No background error handling. No delayed failure reporting.
Failure information must be preserved for the journey record. Users must be able to review what failed, when it failed, and why it mattered for their formation process. This isn’t error logging—it’s formation history.
What is Hard-Coded: User partition architecture, journey projection framework, process engine interface, database schema with composite keys
What is Configurable: Journey frameworks for different types of formation, projection rules for intellectual development, process implementations that support authorship within specific domains
Extension Path: New formative journey types must enable formation through authorship - users must develop intellectual capacity through engagement, not receive AI-generated outputs
The MVP embodies LLAssist and EdgePrompt as concrete implementations within the Veritheia platform. Both are proto-Veritheia systems that demonstrate formative journey patterns - LLAssist for research formation through literature engagement, EdgePrompt for pedagogical formation through assessment cycles. Veritheia provides the unified infrastructure that makes both possible while maintaining intellectual sovereignty through journey projection spaces and strict user partition boundaries.