Clinical Trial Intelligence Agent -- Architecture Guide
Version: 2.0.0
Date: March 22, 2026
Author: Adam Jones
Platform: NVIDIA DGX Spark -- HCLS AI Factory
Table of Contents
- System Overview
- Three-Tier Architecture
- Data Flow
- Component Architecture
- RAG Engine Design
- Workflow Engine Design
- Decision Support Architecture
- Agent Pipeline
- Query Expansion System
- Collection Architecture
- Cross-Agent Integration
- Ingest Pipeline Architecture
- Observability Architecture
- Security Architecture
- Deployment Architecture
1. System Overview
The Clinical Trial Intelligence Agent is a modular, layered system that separates concerns across three tiers (presentation, application, data) with well-defined interfaces between them. The architecture prioritizes graceful degradation: the agent keeps working when Milvus, the LLM, or the peer agents are unavailable, falling back to a reduced capability level (see Section 5.2) rather than failing outright.
Design Principles
- Graceful Degradation: Each tier and component operates independently. Milvus down? Workflows and knowledge base still function. LLM unavailable? Search-only mode. Peer agents offline? Default responses logged.
- Workflow-First Design: All clinical trial intelligence is channeled through 10 typed workflows following a common preprocess/execute/postprocess contract.
- Multi-Collection RAG: Evidence is distributed across 14 specialized collections, with workflow-specific weights ensuring domain relevance.
- Type Safety: Pydantic models enforce data contracts at every boundary (API input/output, workflow I/O, search results).
- Observable: Prometheus metrics at every layer (query, search, workflow, LLM, ingest, system).
2. Three-Tier Architecture
+==============================================================+
| PRESENTATION TIER |
| +------------------+ |
| | Streamlit UI | 5 tabs: Intelligence, Matching, |
| | Port 8128 | Protocol, Competitive, Dashboard |
| | NVIDIA Theme | |
| +--------+---------+ |
| | HTTP/REST (JSON) |
+===========|==================================================+
v
+==============================================================+
| APPLICATION TIER |
| +------------------+ +----------------+ +---------------+ |
| | FastAPI API | | Workflow | | Decision | |
| | Port 8538 | | Engine (10) | | Support (5) | |
| | 26 Endpoints | | | | | |
| | CORS, Auth, Rate | | Protocol | | Confidence | |
| +--------+---------+ | Patient Match | | Complexity | |
| | | Site Select | | Enrollment | |
| +--------+---------+ | Eligibility | | Eligibility | |
| | RAG Engine | | Adaptive | | Competitive | |
| | Multi-collection | | Safety Signal | | Success Rate | |
| | Weighted search | | Regulatory | +---------------+ |
| +--------+---------+ | Competitive | |
| | | Diversity | +---------------+ |
| +--------+---------+ | DCT Planning | | Query | |
| | Agent Pipeline | +----------------+ | Expansion | |
| | Plan-Search- | | 10 maps | |
| | Evaluate-Synth | +----------------+ | 140 aliases | |
| +--------+---------+ | Knowledge | +---------------+ |
| | | Base | |
| | | 40 trials | +---------------+ |
| | | 13 areas | | Cross-Agent | |
| | | 9 agencies | | Integration | |
| | +----------------+ | 4 agents | |
| | +---------------+ |
+===========|==================================================+
v
+==============================================================+
| DATA TIER |
| +------------------+ +----------------+ +---------------+ |
| | Milvus | | etcd | | MinIO | |
| | Port 19530 | | Port 2379 | | Port 9000 | |
| | 14 Collections | | Metadata | | Blob Storage | |
| | IVF_FLAT/COSINE | | | | | |
| | 384-dim BGE | | | | | |
| +------------------+ +----------------+ +---------------+ |
+==============================================================+
Tier Responsibilities
| Tier | Responsibility | Failure Mode |
|---|---|---|
| Presentation | User interaction, visualization, form input | API errors shown as warnings |
| Application | Business logic, search, synthesis, decision support | Degrades per component |
| Data | Vector storage, indexing, metadata, blob storage | Workflows still run from knowledge base |
3. Data Flow
3.1 Query Flow
    User Question
         |
         v
    [Streamlit UI] --> HTTP POST /v1/trial/query
         |
         v
    [FastAPI API]
         |
         v
    [Query Expansion] --> Resolve aliases, expand synonyms
         |
         v
    [Agent Pipeline]
         |
         +-> [Plan] --> Detect workflows, decompose sub-questions
         |
         +-> [Search] --> RAG Engine
         |       |
         |       +-> [Embed query] --> BGE-small-en-v1.5 (384-dim)
         |       |
         |       +-> [Search 14 collections] --> Milvus IVF_FLAT
         |       |
         |       +-> [Weight & merge] --> Workflow-specific weights
         |       |
         |       +-> [Score threshold] --> Filter < 0.4
         |
         +-> [Evaluate] --> Evidence quality, completeness check
         |
         +-> [Workflow Execute] --> One or more of 10 workflows
         |       |
         |       +-> [Decision Support] --> Confidence, complexity, etc.
         |       |
         |       +-> [Cross-Agent] --> Optional oncology/PGx/cardio/biomarker
         |
         +-> [Synthesize] --> LLM (Claude) generates natural language answer
         |
         +-> [Report] --> Format with citations, guidelines, confidence
         |
         v
    [TrialResponse] --> JSON to Streamlit UI
3.2 Ingest Flow
    [External Sources]
         |
         +-> ClinicalTrials.gov API --> ClinicalTrialsParser
         |       |
         |       +-> trial_protocols
         |       +-> trial_eligibility
         |       +-> trial_endpoints
         |       +-> trial_sites
         |       +-> trial_investigators
         |
         +-> PubMed E-utilities API --> PubMedParser
         |       |
         |       +-> trial_literature
         |       +-> trial_results
         |
         +-> FDA/EMA/ICH docs --> RegulatoryParser
                 |
                 +-> trial_regulatory
                 +-> trial_guidelines
                 +-> trial_safety
3.3 Cross-Agent Flow
    [Clinical Trial Agent]
         |
         +-- query_oncology_agent() ---> [:8527] Molecular matches
         |       |
         |       +-- Graceful degradation if unavailable
         |
         +-- query_pgx_agent() --------> [:8107] PGx screening
         |
         +-- query_cardiology_agent() -> [:8126] Cardiac safety
         |
         +-- query_biomarker_agent() --> [:8529] Biomarker enrichment
         |
         v
    [integrate_cross_agent_results()] --> Unified assessment
4. Component Architecture
4.1 Component Dependency Graph
    api/main.py (FastAPI)
      |
      +-> api/routes/trial_clinical.py (22 endpoints)
      |     |
      |     +-> src/agent.py (TrialIntelligenceAgent)
      |     |     |
      |     |     +-> src/rag_engine.py (TrialRAGEngine)
      |     |     |     |
      |     |     |     +-> src/collections.py (14 schemas)
      |     |     |     +-> src/query_expansion.py (10 maps)
      |     |     |
      |     |     +-> src/clinical_workflows.py (10 workflows)
      |     |     |     |
      |     |     |     +-> src/decision_support.py (5 engines)
      |     |     |     +-> src/knowledge.py (domain knowledge)
      |     |     |     +-> src/models.py (data contracts)
      |     |     |
      |     |     +-> src/cross_modal.py (4 agent integrations)
      |     |
      |     +-> src/export.py (report generation)
      |
      +-> api/routes/reports.py (2 endpoints)
      +-> api/routes/events.py (2 endpoints)
      +-> src/metrics.py (Prometheus)
      +-> config/settings.py (TrialSettings)
4.2 Module Roles
| Module | Role | Dependencies |
|---|---|---|
| `agent.py` | Orchestrator: plan, search, evaluate, synthesize | rag_engine, workflows, cross_modal |
| `rag_engine.py` | Multi-collection RAG with weighted retrieval | collections, query_expansion |
| `clinical_workflows.py` | 10 workflow implementations | models, knowledge, decision_support |
| `decision_support.py` | Quantitative scoring engines | models |
| `knowledge.py` | Static domain knowledge (40 trials, 13 areas, etc.) | -- (no dependencies) |
| `models.py` | Pydantic data contracts and enums | pydantic |
| `collections.py` | Milvus schema definitions | models, pymilvus |
| `query_expansion.py` | Synonym resolution and term expansion | models |
| `cross_modal.py` | Cross-agent HTTP integration | config/settings |
| `export.py` | Report formatting and export | models |
| `metrics.py` | Prometheus metric definitions | prometheus_client |
| `scheduler.py` | Timed ingest and maintenance | ingest, settings |
5. RAG Engine Design
5.1 Multi-Collection Search Strategy
The RAG engine searches all 14 collections in parallel, with each collection receiving a workflow-specific weight. Results are merged by weighted score and deduplicated.
    Query
      |
      v
    [Expand Query] --> query_expansion.py
      |
      v
    [Embed] --> BGE-small-en-v1.5 (384-dim vector)
      |
      v
    [Parallel Search] --> 14 collections x top_k results
      |
      +-> trial_protocols    (weight: varies by workflow)
      +-> trial_eligibility  (weight: varies by workflow)
      +-> ...
      +-> genomic_evidence   (weight: varies by workflow)
      |
      v
    [Weight & Merge] --> score = raw_score * collection_weight
      |
      v
    [Threshold Filter] --> remove if score < 0.4
      |
      v
    [Re-rank] --> sort by weighted score descending
      |
      v
    [Return top_k] --> TrialSearchResult list
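The merge, filter, and re-rank steps can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's actual code: the `Hit` tuple, the `merge_hits` name, the document ids, and the `trial_protocols` weight are hypothetical. The sketch also assumes the 0.4 cutoff is applied to the raw similarity, because collection weights in the 0.08 to 0.25 range would otherwise push every weighted score below 0.4; the real engine may order these steps differently.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    collection: str
    doc_id: str
    raw_score: float  # cosine similarity returned by the vector search


def merge_hits(hits, weights, threshold=0.4, top_k=5):
    """Filter, weight, deduplicate, and rank hits from several collections."""
    best = {}  # doc_id -> (weighted_score, Hit)
    for hit in hits:
        if hit.raw_score < threshold:  # assumption: cutoff on the raw score
            continue
        weighted = hit.raw_score * weights.get(hit.collection, 0.0)
        # Deduplicate by document id, keeping the highest weighted score.
        if hit.doc_id not in best or weighted > best[hit.doc_id][0]:
            best[hit.doc_id] = (weighted, hit)
    ranked = sorted(best.values(), key=lambda pair: pair[0], reverse=True)
    return ranked[:top_k]


weights = {"trial_safety": 0.25, "trial_protocols": 0.08}
hits = [
    Hit("trial_protocols", "doc-a", 0.90),
    Hit("trial_safety", "doc-b", 0.70),
    Hit("trial_safety", "doc-c", 0.30),  # below the cutoff, dropped
    Hit("trial_safety", "doc-a", 0.50),  # duplicate id, better weighted score
]
results = merge_hits(hits, weights)
```

With these sample weights, `doc-b` ranks first and `doc-a` second, which shows how a boosted collection can outrank a hit with a higher raw similarity.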
5.2 Graceful Degradation Levels
| Level | Milvus | LLM | Peer Agents | Capability |
|---|---|---|---|---|
| Full | Available | Available | Available | Complete RAG + synthesis + cross-agent |
| Search-only | Available | Unavailable | Any | Vector search with structured results |
| Workflow-only | Unavailable | Available | Any | Knowledge-based workflows + LLM synthesis |
| Minimal | Unavailable | Unavailable | Unavailable | Decision engines + knowledge base queries |
6. Workflow Engine Design
6.1 Template Method Pattern
All workflows follow the BaseTrialWorkflow contract, which enforces a three-phase execution model:
    from abc import ABC

    class BaseTrialWorkflow(ABC):
        workflow_type: TrialWorkflowType

        def run(self, inputs: dict) -> WorkflowResult:
            processed = self.preprocess(inputs)  # validate, normalize
            result = self.execute(processed)     # core logic
            result = self.postprocess(result)    # enrich, add warnings
            return result
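To make the three-phase contract concrete, here is a self-contained sketch of the template method with one concrete subclass. Everything beyond the `run()` skeleton is an assumption for illustration: the `WorkflowResult` dataclass stands in for the real Pydantic model, the example subclass body is made up, and which hooks are abstract in the actual base class is not stated in this guide.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class WorkflowResult:  # stand-in for the real Pydantic model
    summary: str
    warnings: list = field(default_factory=list)


class BaseTrialWorkflow(ABC):
    def run(self, inputs: dict) -> WorkflowResult:
        processed = self.preprocess(inputs)  # validate, normalize
        result = self.execute(processed)     # core logic
        return self.postprocess(result)      # enrich, add warnings

    def preprocess(self, inputs: dict) -> dict:
        # Default hook (assumed): trim whitespace from string values.
        return {k: v.strip() if isinstance(v, str) else v for k, v in inputs.items()}

    @abstractmethod
    def execute(self, inputs: dict) -> WorkflowResult:
        """Subclasses supply the workflow-specific logic."""

    def postprocess(self, result: WorkflowResult) -> WorkflowResult:
        return result  # default hook: pass the result through unchanged


class SafetySignalWorkflow(BaseTrialWorkflow):  # hypothetical subclass body
    def execute(self, inputs: dict) -> WorkflowResult:
        return WorkflowResult(summary=f"Safety review for {inputs['drug']}")

    def postprocess(self, result: WorkflowResult) -> WorkflowResult:
        result.warnings.append("Sketch only: no evidence retrieved")
        return result


result = SafetySignalWorkflow().run({"drug": "  pembrolizumab "})
```

Callers only ever invoke `run()`, so the phase order is fixed in one place and each workflow overrides just the hooks it needs.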
6.2 Workflow Routing
The WorkflowEngine maps TrialWorkflowType enum values to workflow class instances, enabling both automatic detection from query text and explicit routing via the API:
Query Text --> [Detect Keywords] --> TrialWorkflowType
|
v
WorkflowEngine.dispatch(workflow_type, inputs)
|
v
[Specific Workflow Instance].run(inputs)
|
v
WorkflowResult
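A minimal sketch of this routing, assuming simple substring matching for keyword detection. The enum members, keyword lists, and the `EchoWorkflow` placeholder are hypothetical; only `WorkflowEngine.dispatch(workflow_type, inputs)` and the `.run(inputs)` call come from the flow above.

```python
from enum import Enum


class TrialWorkflowType(Enum):  # two of the ten types; member names assumed
    SAFETY_SIGNAL = "safety_signal"
    PATIENT_MATCHING = "patient_matching"


# Hypothetical keyword lists used for automatic detection.
KEYWORDS = {
    TrialWorkflowType.SAFETY_SIGNAL: ("adverse event", "safety signal", "toxicity"),
    TrialWorkflowType.PATIENT_MATCHING: ("eligible", "patient match", "enroll"),
}


def detect_workflows(query: str) -> list:
    """Return every workflow type whose keywords appear in the query."""
    text = query.lower()
    return [wf for wf, words in KEYWORDS.items() if any(w in text for w in words)]


class EchoWorkflow:  # placeholder for a real workflow instance
    def __init__(self, label: str):
        self.label = label

    def run(self, inputs: dict) -> dict:
        return {"workflow": self.label, **inputs}


class WorkflowEngine:
    def __init__(self, registry: dict):
        self._registry = registry  # TrialWorkflowType -> workflow instance

    def dispatch(self, workflow_type, inputs: dict):
        if workflow_type not in self._registry:
            raise ValueError(f"No workflow registered for {workflow_type}")
        return self._registry[workflow_type].run(inputs)


engine = WorkflowEngine({TrialWorkflowType.SAFETY_SIGNAL: EchoWorkflow("safety")})
detected = detect_workflows("Any hepatic toxicity signals for drug X?")
result = engine.dispatch(detected[0], {"query": "hepatic toxicity"})
```

Explicit routing through the API would skip `detect_workflows` and pass the enum value to `dispatch` directly.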
6.3 Collection Weight Boosting
Each workflow overrides the default collection weights to prioritize domain-relevant evidence. For example, the Safety Signal workflow boosts trial_safety to 0.25 (vs. default 0.08), ensuring adverse event data surfaces prominently.
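One way to express such an override is a per-workflow dictionary layered on top of the defaults. In this sketch only the `trial_safety` values (0.08 default, 0.25 boosted) come from the text; the other default weights, the `weights_for` helper, and the decision not to renormalize after boosting are assumptions.

```python
# Illustrative subset of the 14 collections; only trial_safety's 0.08 is documented.
DEFAULT_WEIGHTS = {
    "trial_protocols": 0.08,
    "trial_eligibility": 0.08,
    "trial_safety": 0.08,
}

# Per-workflow overrides; the Safety Signal boost is the documented example.
SAFETY_SIGNAL_OVERRIDES = {"trial_safety": 0.25}


def weights_for(overrides: dict, defaults: dict = DEFAULT_WEIGHTS) -> dict:
    """Return the defaults with workflow-specific overrides applied."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Catch typos early instead of silently adding a new collection.
        raise KeyError(f"Unknown collections: {sorted(unknown)}")
    return {**defaults, **overrides}


safety_weights = weights_for(SAFETY_SIGNAL_OVERRIDES)
```

The defaults dictionary is never mutated, so one workflow's boost cannot leak into another workflow's search.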
7. Decision Support Architecture
7.1 Engine Independence
The decision support engines are stateless, pure-function classes with no external dependencies (no Milvus, no LLM, no network calls), so they operate at every degradation level.
    Decision Engines (all stateless, self-contained)
      |
      +-> ConfidenceCalibrator       --> multi-factor calibration
      +-> ProtocolComplexityScorer   --> Tufts CSDD benchmarks
      +-> EnrollmentPredictor        --> multi-factor prediction
      +-> EligibilityAnalyzer        --> 29-pattern matching
      +-> CompetitiveThreatScorer    --> 4-factor threat model
      +-> HistoricalSuccessEstimator --> 12 areas x 3 phases
7.2 Calibration Pipeline
The Confidence Calibrator is applied as a post-processing step after workflow execution and RAG retrieval, combining four signals into a single calibrated confidence:
raw_confidence (0.30) --> from workflow
evidence_base (0.30) --> from evidence level (A1=1.0, E=0.15)
doc_factor (0.20) --> log(n_docs + 1) / log(12)
agreement (0.20) --> cross-agent consensus
= calibrated confidence (0.0-1.0)
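Read as a weighted sum, the four signals combine as in the sketch below. The weights and the `log(n_docs + 1) / log(12)` document factor are taken from the listing above; the function names, the clamp of the document factor at 1.0 (it reaches 1.0 at 11 documents), and the final clamp to the 0.0-1.0 range are assumptions.

```python
import math

# Signal weights from the calibration listing; they sum to 1.0.
WEIGHTS = {
    "raw_confidence": 0.30,
    "evidence_base": 0.30,
    "doc_factor": 0.20,
    "agreement": 0.20,
}


def doc_factor(n_docs: int) -> float:
    """log(n_docs + 1) / log(12), capped at 1.0 (the cap is an assumption)."""
    return min(1.0, math.log(n_docs + 1) / math.log(12))


def calibrate(raw_confidence: float, evidence_base: float,
              n_docs: int, agreement: float) -> float:
    """Combine the four signals into one confidence value in [0.0, 1.0]."""
    score = (
        WEIGHTS["raw_confidence"] * raw_confidence
        + WEIGHTS["evidence_base"] * evidence_base
        + WEIGHTS["doc_factor"] * doc_factor(n_docs)
        + WEIGHTS["agreement"] * agreement
    )
    return max(0.0, min(1.0, score))


# Example: workflow confidence 0.8, evidence level A1 (1.0),
# 11 supporting documents, half of the peer agents in agreement.
example = calibrate(raw_confidence=0.8, evidence_base=1.0, n_docs=11, agreement=0.5)
```

For this input the terms are 0.24 + 0.30 + 0.20 + 0.10, giving 0.84.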
8. Agent Pipeline
8.1 Five-Stage Pipeline
The TrialIntelligenceAgent implements the VAST AI OS AgentEngine pattern:
| Stage | Method | Input | Output |
|---|---|---|---|
| Plan | `search_plan()` | Query text | SearchPlan (areas, drugs, biomarkers, sub-questions) |
| Search | `rag_engine.query()` | SearchPlan | TrialSearchResult list |
| Evaluate | `evaluate_evidence()` | Search results | Quality scores, completeness flags |
| Synthesize | `synthesize()` | Results + evaluation | Natural language answer (via Claude) |
| Report | `generate_report()` | All above | TrialResponse with citations |
8.2 Evidence Hierarchy
Level 1a: Systematic review of RCTs (highest)
Level 1b: Individual RCT
Level 2a: Systematic review of cohort studies
Level 2b: Individual cohort study
Level 3: Case-control study
Level 4: Case series
Level 5: Expert opinion
Reg: Regulatory guidance (FDA/EMA/ICH)
9. Query Expansion System
9.1 Expansion Pipeline
    Raw Query: "What NSCLC trials use Keytruda with TMB-H?"
         |
         v
    [Entity Alias Resolution]
         NSCLC    -> "non-small cell lung cancer"
         Keytruda -> "pembrolizumab"
         TMB-H    -> "tumor mutational burden high"
         |
         v
    [Therapeutic Area Detection]
         "lung cancer" -> oncology
         |
         v
    [Drug Expansion]
         "pembrolizumab" -> ["Keytruda", "MK-3475", "lambrolizumab", "anti-PD-1"]
         |
         v
    [Biomarker Expansion]
         "TMB" -> ["tumor mutational burden", "mutational load", "TMB-high", ...]
         |
         v
    [Expanded Query]
         "non-small cell lung cancer pembrolizumab anti-PD-1
          tumor mutational burden TMB-high oncology"
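The first and third stages of this pipeline can be sketched with two small lookup tables. The map contents are copied from the example above; the function names and the regex-based, case-insensitive matching are assumptions about how such a resolver might be written, not a description of `query_expansion.py`.

```python
import re

ENTITY_ALIASES = {
    "NSCLC": "non-small cell lung cancer",
    "Keytruda": "pembrolizumab",
    "TMB-H": "tumor mutational burden high",
}

DRUG_SYNONYMS = {
    "pembrolizumab": ["Keytruda", "MK-3475", "lambrolizumab", "anti-PD-1"],
}


def resolve_aliases(query: str, aliases: dict) -> str:
    """Replace each known alias with its canonical form (case-insensitive)."""
    # Longest aliases first, so a longer alias wins over any shorter prefix.
    ordered = sorted(aliases, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(a) for a in ordered) + r")\b",
        re.IGNORECASE,
    )
    lookup = {alias.lower(): full for alias, full in aliases.items()}
    return pattern.sub(lambda m: lookup[m.group(1).lower()], query)


def expand_terms(text: str, synonyms: dict) -> list:
    """Collect synonyms for every canonical term present in the text."""
    lowered = text.lower()
    found = []
    for canonical, alternatives in synonyms.items():
        if canonical.lower() in lowered:
            found.extend(alternatives)
    return found


query = "What NSCLC trials use Keytruda with TMB-H?"
resolved = resolve_aliases(query, ENTITY_ALIASES)
drug_terms = expand_terms(resolved, DRUG_SYNONYMS)
```

Resolving aliases before expanding synonyms means the synonym maps only need to be keyed by canonical names.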
9.2 Map Statistics
- 10 synonym maps covering all clinical trial domains
- 140 entity aliases for instant abbreviation resolution
- 33 drug entries with brand/generic/code name mapping
- 22 biomarker entries with assay and synonym coverage
10. Collection Architecture
10.1 Schema Design Principles
- Shared embedding field: All 14 collections use identical 384-dim FLOAT_VECTOR fields
- Shared index config: IVF_FLAT with COSINE metric and nlist=128
- Domain-specific metadata: Each collection has typed metadata fields (VARCHAR, INT32, FLOAT, BOOL) relevant to its domain
- Auto-generated primary keys: INT64 auto_id for all collections
- Text content fields: VARCHAR with appropriate max_length for full-text and chunk storage
10.2 Estimated Data Distribution
trial_eligibility ████████████████████████████████████████████████████ 50,000
trial_sites ████████████████████████████████ 30,000
trial_endpoints ████████████████████ 20,000
trial_safety ████████████████████ 20,000
trial_literature ██████████ 10,000
trial_protocols █████ 5,000
trial_investigators █████ 5,000
trial_biomarkers ███ 3,000
trial_results ███ 3,000
trial_regulatory ██ 2,000
trial_rwe ██ 2,000
trial_guidelines █ 1,000
trial_adaptive ▌ 500
genomic_evidence ████████████████████████████████████████████████ ~100,000
11. Cross-Agent Integration
11.1 Integration Architecture
+----------------------------+
| Clinical Trial Intelligence |
| Agent |
+---+------+------+------+---+
| | | |
+---------------+ +---+ +---+ +---+---------------+
v v v v v
+-------+-------+ +-------+--+ +--+------+ +--------+-------+
| Oncology | | PGx | | Cardio | | Biomarker |
| Intelligence | | Intelli- | | Intelli-| | Intelligence |
| Agent (:8527) | | gence | | gence | | Agent (:8529) |
| | | (:8107) | | (:8126) | | |
| Molecular | | PGx | | Cardiac | | Enrichment |
| trial matches | | screening| | safety | | strategies |
+---------------+ +----------+ +---------+ +----------------+
11.2 Failure Isolation
Each cross-agent call is wrapped in a try/except with a 30-second timeout. Failure in one agent never blocks the clinical trial agent's response. The integration module (cross_modal.py) returns structured default responses when agents are unavailable, clearly flagged as degraded.
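The isolation pattern described here can be sketched with the standard library alone. The function name, the response shape, the `localhost` URL, and the injectable `opener` parameter (added so the failure path can be exercised without a network) are all hypothetical; only the try/except wrapper, the 30-second timeout, and the flagged default response come from the text.

```python
import json
import urllib.error
import urllib.request


def query_peer_agent(name, url, payload, timeout=30.0,
                     opener=urllib.request.urlopen):
    """POST payload to a peer agent; never raise, return a flagged default."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout) as response:
            return {"agent": name, "degraded": False, "data": json.load(response)}
    except (OSError, ValueError) as exc:
        # OSError covers URLError, refused connections, and timeouts;
        # ValueError covers a reply that is not valid JSON.
        return {"agent": name, "degraded": True, "data": {}, "reason": str(exc)}


def _unreachable(request, timeout):  # stand-in transport for the demo
    raise urllib.error.URLError("connection refused")


fallback = query_peer_agent(
    "oncology", "http://localhost:8527/", {"query": "EGFR"}, opener=_unreachable
)
```

Because the function returns the same dictionary shape in both branches, the caller can merge peer results without special-casing failures and can surface the `degraded` flag in the final response.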
12. Ingest Pipeline Architecture
12.1 Pipeline Hierarchy
    BaseIngestPipeline (abstract)
      |
      +-> ClinicalTrialsParser (ClinicalTrials.gov)
      |     |
      |     +-> XML/JSON parsing
      |     +-> Field extraction per collection
      |     +-> BGE embedding generation
      |     +-> Milvus upsert
      |
      +-> PubMedParser (PubMed/MEDLINE)
      |     |
      |     +-> E-utilities API calls
      |     +-> MeSH term extraction
      |     +-> Literature chunking
      |
      +-> RegulatoryParser (FDA/EMA/ICH)
            |
            +-> Document parsing
            +-> Guideline extraction
            +-> Safety signal parsing
12.2 Scheduler Integration
The scheduler (src/scheduler.py) runs the ingest pipelines on a configurable interval (default: 24 hours) in a daemon thread. It also handles collection maintenance (compaction, index rebuild) and can be enabled or disabled through an environment variable.
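A daemon-thread scheduler of this kind might look like the sketch below. The class name, the `TRIAL_SCHEDULER_ENABLED` variable name, and the choice to wait one full interval before the first run are assumptions; the guide only states the 24-hour default, the daemon thread, and the environment-variable switch.

```python
import logging
import os
import threading

logger = logging.getLogger("trial.scheduler")


class IngestScheduler:
    """Run a job repeatedly on a fixed interval in a daemon thread."""

    def __init__(self, job, interval_seconds=24 * 3600):
        self._job = job
        self._interval = interval_seconds
        self._stop = threading.Event()
        self._thread = None

    def start(self) -> bool:
        # Hypothetical variable name; the real switch is not named in this guide.
        if os.environ.get("TRIAL_SCHEDULER_ENABLED", "true").lower() != "true":
            return False
        self._thread = threading.Thread(
            target=self._loop, name="ingest-scheduler", daemon=True
        )
        self._thread.start()
        return True

    def _loop(self):
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            try:
                self._job()
            except Exception:
                # One failed ingest run must not kill the scheduler thread.
                logger.exception("Scheduled ingest run failed")

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

Waiting on an `Event` instead of calling `time.sleep` lets `stop()` interrupt a long interval immediately.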
13. Observability Architecture
    [Application Code]
      |
      +-> src/metrics.py (Prometheus client)
      |     |
      |     +-> Counters: queries_total, search_total, errors_total
      |     +-> Histograms: query_duration, search_duration
      |     +-> Gauges: collection_records
      |     +-> Info: system_info
      |
      +-> GET /metrics endpoint
            |
            v
    [Prometheus Server] --> [Grafana Dashboard]
All metrics use the trial_ prefix for dashboard filtering. The metrics module gracefully degrades if prometheus_client is not installed (no-op stubs).
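The no-op fallback can be implemented with a guarded import. This sketch assumes the stub only needs the `inc`, `observe`, and `labels` methods and that metric names are formed by prepending `trial_` to the names in the diagram; the `record_query` helper and the help strings are hypothetical.

```python
try:
    from prometheus_client import Counter, Histogram
except ImportError:
    class _NoOpMetric:
        """Accepts the same calls as a real metric and does nothing."""

        def __init__(self, *args, **kwargs):
            pass

        def labels(self, *args, **kwargs):
            return self

        def inc(self, amount=1):
            pass

        def observe(self, value):
            pass

    Counter = Histogram = _NoOpMetric

QUERIES_TOTAL = Counter("trial_queries_total", "Total queries handled")
QUERY_DURATION = Histogram("trial_query_duration", "Query latency in seconds")


def record_query(seconds: float) -> None:
    """Record one completed query; safe with or without prometheus_client."""
    QUERIES_TOTAL.inc()
    QUERY_DURATION.observe(seconds)


record_query(0.12)
```

Application code calls `record_query` the same way in both cases, so no caller needs to check whether metrics are enabled.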
14. Security Architecture
14.1 Security Layers
[Client Request]
|
v
[CORS Middleware] --> Check origin against whitelist
|
v
[Rate Limiter] --> 100 req/min per IP
|
v
[Auth Middleware] --> Validate X-API-Key header (if configured)
|
v
[Pydantic Validation] --> Enforce field constraints, types, ranges
|
v
[Business Logic] --> No SQL injection risk (vector-only DB)
|
v
[Response] --> Sanitized JSON output
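Two of these layers, the per-IP rate limit and the optional API key check, are small enough to sketch. The guide does not say which rate-limiting algorithm is used, so the sliding window below is an assumption, as are the class and function names and the injectable clock; only the 100 requests per minute limit and the "if configured" behavior of the `X-API-Key` check come from the diagram.

```python
import hmac
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window_seconds`."""

    def __init__(self, limit=100, window_seconds=60.0, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # client IP -> accepted request times

    def allow(self, client_ip: str) -> bool:
        now = self._clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True


def api_key_ok(provided, configured) -> bool:
    """Accept every request when no key is configured; otherwise compare keys."""
    if not configured:
        return True
    # compare_digest avoids leaking key contents through timing differences.
    return hmac.compare_digest((provided or "").encode(), configured.encode())
```

In a FastAPI application these checks would typically run in middleware or a dependency and return HTTP 429 or 401 on failure; that wiring is omitted here.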
14.2 Secret Management
- API keys stored in environment variables or a .env file
- TRIAL_ prefix isolates agent-specific config
- No secrets in source code or version control
- HTTPS termination at reverse proxy layer
15. Deployment Architecture
15.1 Docker Composition
+-------------------------------------------------------+
| DGX Spark Host |
| |
| +------------------+ +-------------------+ |
| | clinical-trial- | | clinical-trial- | |
| | agent-api | | agent-ui | |
| | Port: 8538 | | Port: 8128 | |
| | FastAPI + Uvicorn | | Streamlit | |
| +--------+---------+ +---------+---------+ |
| | | |
| +----------+-----------+ |
| | |
| +-------+-------+ |
| | milvus- | |
| | standalone | |
| | Port: 19530 | |
| +---+---+---+---+ |
| | | |
| +-------+ +----+------+ |
| | etcd | | MinIO | |
| | :2379 | | :9000 | |
| +-------+ +-----------+ |
| |
| +------------------+ +-------------------+ |
| | Other Agents | | Monitoring | |
| | Oncology (:8527) | | Prometheus | |
| | PGx (:8107) | | Grafana | |
| | Cardio (:8126) | | | |
| | Biomarker (:8529)| | | |
| +------------------+ +-------------------+ |
+-------------------------------------------------------+
15.2 Resource Requirements
| Component | CPU | Memory | GPU | Storage |
|---|---|---|---|---|
| API Server | 2 cores | 2 GB | None | <100 MB |
| Streamlit UI | 1 core | 512 MB | None | <50 MB |
| Milvus Standalone | 4 cores | 8 GB | None | 10-50 GB |
| BGE Embedding | 2 cores | 1 GB | Optional | ~500 MB model |
Clinical Trial Intelligence Agent v2.0.0 -- Architecture Guide -- March 2026