An environment for inquiry - complete documentation
Veritheia is an epistemic infrastructure that enables users to author understanding through structured engagement with source materials. The architecture implements a four-tier design: Knowledge Database for document storage and retrieval, Process Engine for workflow orchestration, Cognitive System for assessment operations, and Presentation tier for user interaction. Each component enforces the principle that insights emerge from user engagement, not system generation.
+-----------------------------------------------------+
|                  III. PRESENTATION                  |
|          (Client: Desktop, Web, CLI, API)           |
+-----------------------------------------------------+
                           ^
                           | (API Calls)
                           v
+-----------------------------------------------------+      +-----------------------------+
|                 II. PROCESS ENGINE                  |<---->|    IV. COGNITIVE SYSTEM     |
|      (Stateful Workflow Orchestration & Logic)      |      |   (Via Adaptor Interface)   |
+-----------------------------------------------------+      +-----------------------------+
                           ^
                           | (Data Operations)
                           v
+-----------------------------------------------------+
|                I. KNOWLEDGE DATABASE                |
|         (Passive Datastore & Semantic API)          |
+-----------------------------------------------------+
The Knowledge Database provides persistent storage for source documents and derived representations. It maintains three data layers: Raw Corpus (original documents), Processed Representation (embeddings, metadata, relationships), and Knowledge Layer (semantic query API). The database preserves provenance and versioning for all transformations.
The Process Engine executes analytical workflows through the IAnalyticalProcess interface. Each process maintains journey context, including user research questions, conceptual vocabulary, and formation markers. The engine provides platform services (document ingestion, embedding generation, metadata extraction) while ensuring process outputs reflect user interpretation rather than automated analysis.
The Presentation tier implements user interfaces for journey management, journal composition, and process execution. It maintains strict separation between user-authored content and system-provided structure. All displays reflect the user’s developing understanding without imposing system-generated interpretations.
The system employs PostgreSQL with pgvector extension for unified storage of relational and vector data. This design choice eliminates synchronization complexity between separate databases while maintaining query performance through appropriate indexing strategies (B-tree for relational queries, IVFFlat for vector similarity).
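As a concrete but merely illustrative sketch of those indexing strategies, the DDL below could be issued through Npgsql; the table and column names, connection string, and the lists parameter are assumptions for the example and would need tuning to the actual schema and corpus size.

using Npgsql;

// Hypothetical schema names; shown only to illustrate the two index types.
var ddl = """
    CREATE EXTENSION IF NOT EXISTS vector;

    -- B-tree index for relational lookups.
    CREATE INDEX IF NOT EXISTS ix_documents_journey_id
        ON documents (journey_id);

    -- IVFFlat index for approximate nearest-neighbour search over embeddings.
    CREATE INDEX IF NOT EXISTS ix_document_embeddings_vector
        ON document_embeddings USING ivfflat (embedding vector_cosine_ops)
        WITH (lists = 100);
    """;

await using var connection = new NpgsqlConnection("Host=localhost;Database=veritheia");
await connection.OpenAsync();
await using var command = new NpgsqlCommand(ddl, connection);
await command.ExecuteNonQueryAsync();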
The Presentation tier implements a web-based interface that maintains architectural separation from backend services.
The Process Engine interacts with the Cognitive System as an assessment engine, not a decision maker.
Function: The cognitive adapter performs structured assessments in specific roles (librarian, peer reviewer, instructor) while the user maintains interpretive sovereignty.
Key Principle: AI performs assessments but users make decisions. In Systematic Screening, AI measures relevance and contribution, but researchers decide which papers are core, contextual, or peripheral to their inquiry.
Rationale: This ensures that the cognitive system measures and records rather than replacing human judgment, and that each journey generates insights that contribute to formation.
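To make the assessment/decision split concrete, the sketch below separates what the cognitive system returns from what the researcher records; the type names are illustrative assumptions, not the platform's schema.

// What the cognitive system produces: measurements with reasoning, no verdict.
public record RelevanceAssessment(
    Guid DocumentId,
    double RelevanceScore,       // how closely the paper relates to the research questions
    double ContributionScore,    // how much it contributes to the inquiry
    string Reasoning);           // recorded so the user can audit the measurement

// What only the researcher decides, recorded separately from the assessment.
public enum ScreeningDecision { Core, Contextual, Peripheral }

public record UserScreeningDecision(
    Guid DocumentId,
    ScreeningDecision Decision,
    Guid DecidedByUserId,
    DateTimeOffset DecidedAt);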
The data within the Knowledge Database is structured into three distinct, extensible layers. All computation and transformation between these layers is performed by the Process Engine.
Raw Corpus: This layer represents the ground truth. It consists of the original, unmodified source artifacts (e.g., PDF, text files, images) provided by the user.
Processed Representation: This layer holds derived artifacts such as embeddings, extracted metadata, and relationships. All entries in this layer are versioned and include metadata about their provenance (e.g., the model version used for summary generation, the embedding model name).
Knowledge Layer: This layer exposes the processed representations through the semantic query API.
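To make the provenance requirement concrete, a minimal sketch of what a versioned Processed Representation entry might carry is shown below; the type and property names are illustrative, not the platform's actual schema.

public record ProcessedArtifact(
    Guid SourceDocumentId,        // link back to the Raw Corpus artifact
    int Version,                  // incremented for each regeneration of the artifact
    string ArtifactKind,          // e.g. "embedding", "summary", "metadata"
    string ModelName,             // e.g. the embedding model used
    string ModelVersion,          // exact model version, for reproducibility
    DateTimeOffset CreatedAt);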
The Process Engine executes two distinct categories of processes through a unified interface architecture.
These platform services (document ingestion, embedding generation, metadata extraction) never generate insights; they prepare materials for analysis. They are triggered by user actions and serve the inquiry.
The two reference process implementations, Systematic Screening and Guided Composition, show how processes orchestrate intellectual work through the platform services.
Every process produces outputs that are unique to the author—shaped by their questions, guided by their framework, and meaningful only within their journey.
All processes implement a common interface that enables uniform execution, monitoring, and result handling:
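A minimal sketch of that interface follows; the member names are illustrative rather than the canonical definitions (ProcessResult in particular is an assumed type), but the shape reflects the uniform execute-and-report contract described above.

public interface IAnalyticalProcess
{
    // Stable identifier and display metadata used for registration and discovery.
    string ProcessId { get; }
    string DisplayName { get; }

    // Executes the process within a specific journey's context and returns a
    // result the engine can persist, monitor, and surface uniformly.
    Task<ProcessResult> ExecuteAsync(ProcessContext context, CancellationToken cancellationToken);
}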
This architecture ensures that new processes can be added without modifying the core engine.
The architecture is designed for extensibility through a set of formal interfaces.
These connectors are considered extensions and are not part of the default implementation.
The system implements extensibility through composition rather than modification.
All processes implement the IAnalyticalProcess interface. The platform provides two reference implementations that demonstrate the pattern:
Core Platform
├── Process Engine (execution runtime)
├── Platform Services (guaranteed capabilities)
└── Reference Processes
    ├── SystematicScreeningProcess
    └── GuidedCompositionProcess

Extensions
├── Methodological Processes (research methodologies)
├── Developmental Processes (skill progression)
├── Analytical Processes (domain-specific analysis)
├── Compositional Processes (creative workflows)
└── Reflective Processes (contemplative practices)
Extensions are full-stack components that may include process implementations (IAnalyticalProcess or IFormationProcess) together with their own persisted entities and user-facing elements.
Extensions rely on the always-available platform services: document ingestion, embedding generation, and metadata extraction.
These services are provided through dependency injection and maintain consistent interfaces across versions.
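As an illustration of that dependency-injection contract, a hypothetical extension process might receive the platform services through its constructor; the service interface names below (IDocumentIngestion, IEmbeddingGeneration, IMetadataExtraction) and the interface members follow the sketches in this document and are assumptions, not confirmed platform types.

public class YourCustomProcess : IAnalyticalProcess
{
    private readonly IDocumentIngestion _ingestion;
    private readonly IEmbeddingGeneration _embeddings;
    private readonly IMetadataExtraction _metadata;

    // Platform services arrive via constructor injection, so the process
    // depends only on stable interfaces, never on engine internals.
    public YourCustomProcess(
        IDocumentIngestion ingestion,
        IEmbeddingGeneration embeddings,
        IMetadataExtraction metadata)
    {
        _ingestion = ingestion;
        _embeddings = embeddings;
        _metadata = metadata;
    }

    public string ProcessId => "your-custom-process";
    public string DisplayName => "Your Custom Process";

    public Task<ProcessResult> ExecuteAsync(ProcessContext context, CancellationToken cancellationToken)
    {
        // Orchestrate the platform services against the user's journey context here.
        throw new NotImplementedException();
    }
}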
Every process execution receives a ProcessContext containing the journey's defining elements: the user's research questions, conceptual vocabulary, and formation markers.
This context ensures outputs remain personally relevant and meaningful within the specific inquiry.
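A minimal sketch of such a context object, assuming only the journey elements named above (the real ProcessContext may carry more), could look like this:

public record ProcessContext(
    Guid JourneyId,                              // the journey this execution belongs to
    Guid UserId,                                 // the author whose framework shapes the run
    IReadOnlyList<string> ResearchQuestions,     // the user's guiding questions
    IReadOnlyList<string> ConceptualVocabulary,  // terms the user has defined for the inquiry
    IReadOnlyList<string> FormationMarkers);     // markers of the user's developing understanding

Processes, core and extended alike, are then registered with the engine through dependency injection: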
// Core processes (included)
services.AddProcess<SystematicScreeningProcess>();
services.AddProcess<GuidedCompositionProcess>();
// Extended processes (additional)
services.AddProcess<YourCustomProcess>();
Extensions can define their own entities that integrate with the core schema:
public class ProcessSpecificData : BaseEntity
{
    public Guid ProcessExecutionId { get; set; }
    public ProcessExecution ProcessExecution { get; set; }
    // Process-specific properties
}
The platform handles migrations and ensures data consistency across extensions. See EXTENSION-GUIDE.md for implementation details.
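As one way such integration could look, the sketch below registers the extension entity with an EF Core model so platform-managed migrations can pick it up; it assumes EF Core and an Id key inherited from BaseEntity, and is illustrative rather than the platform's actual configuration.

using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Metadata.Builders;

public class ProcessSpecificDataConfiguration : IEntityTypeConfiguration<ProcessSpecificData>
{
    public void Configure(EntityTypeBuilder<ProcessSpecificData> builder)
    {
        builder.HasKey(d => d.Id);               // Id assumed to be inherited from BaseEntity
        builder.HasOne(d => d.ProcessExecution)  // navigation back to the owning execution
               .WithMany()
               .HasForeignKey(d => d.ProcessExecutionId);
    }
}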
The system models users as authors of their own understanding through a journey and journal system.
Users are the constant in the system, maintaining their identity, journeys, and journals over time.
Journeys represent specific instances of users engaging with processes.
Journals capture the narrative of intellectual development.
The system assembles context for the cognitive system from the user's journeys and journals, including research questions, conceptual vocabulary, and formation markers.
Context is managed to fit within cognitive system limits while maintaining narrative coherence and the user’s voice.
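As a rough illustration of that budget management, a hypothetical assembler might include journal entries newest-first until an approximate token budget is exhausted, then restore chronological order; the types and the token estimate below are assumptions, not the platform's actual algorithm.

using System;
using System.Collections.Generic;
using System.Linq;

public record JournalEntry(DateTimeOffset WrittenAt, string Text);

public static class ContextAssembler
{
    // Crude token estimate: roughly four characters per token.
    private static int EstimateTokens(string text) => text.Length / 4;

    public static IReadOnlyList<JournalEntry> SelectWithinBudget(
        IEnumerable<JournalEntry> entries, int tokenBudget)
    {
        var selected = new List<JournalEntry>();
        var used = 0;

        // Prefer the most recent narrative first...
        foreach (var entry in entries.OrderByDescending(e => e.WrittenAt))
        {
            var cost = EstimateTokens(entry.Text);
            if (used + cost > tokenBudget) break;
            selected.Add(entry);
            used += cost;
        }

        // ...then return it in written order so the context reads coherently.
        return selected.OrderBy(e => e.WrittenAt).ToList();
    }
}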
See USER-MODEL.md for detailed specifications.