Skip to content

Clinical Trial Intelligence Agent — Architecture Design Document

Author: Adam Jones Date: March 2026 License: Apache 2.0


1. Executive Summary

The Clinical Trial Intelligence Agent extends the HCLS AI Factory platform to deliver RAG-powered decision support across the entire clinical trial lifecycle. It integrates protocol design optimization, patient-trial matching, site selection, safety signal detection, and competitive landscape analysis into a single intelligence platform, enabling pharmaceutical R&D teams to make evidence-based decisions faster and with greater confidence.

The platform enables cross-functional queries like "Design an adaptive Phase II/III protocol for EGFR-mutant NSCLC with a biomarker-enriched population" that simultaneously search protocol templates, eligibility criteria, endpoint definitions, regulatory precedents, safety databases, and competitive intelligence — returning grounded recommendations with citations to landmark trials and regulatory guidance.

Key Results

Metric Value
Total vectors indexed ~251,500 across 14 Milvus collections (13 owned + 1 read-only)
Clinical workflows 10 (protocol design, patient matching, site selection, eligibility, adaptive, safety signal, regulatory, competitive, diversity, DCT)
Decision support engines 5 + 1 Historical Success Estimator
Landmark trials in knowledge base 40 across 13 therapeutic areas
Regulatory agencies modeled 9 (FDA, EMA, PMDA, NMPA, Health Canada, TGA, MHRA, Swissmedic, ANVISA)
Entity aliases 140 query expansion mappings
API endpoints 26
Lines of code 22,607
Test suite 769 tests (100% pass, 0.47s runtime)

2. Architecture Overview

2.1 Mapping to VAST AI OS

VAST AI OS Component Clinical Trial Agent Role
DataStore Raw files: ClinicalTrials.gov JSON, PubMed XML, FDA regulatory docs, site performance data
DataEngine Event-driven ingest pipelines with 7+ parsers (protocol, eligibility, endpoint, site, safety, regulatory, literature)
DataBase 14 Milvus collections (13 owned + 1 read-only) + knowledge base (40 trials, 13 areas, 9 agencies)
InsightEngine BGE-small embedding + multi-collection RAG + query expansion (10 maps, 140 aliases)
AgentEngine ClinicalTrialAgent (Plan-Search-Evaluate-Synthesize) + Streamlit UI (5 tabs)

2.2 System Diagram

                        ┌─────────────────────────────────┐
                            Streamlit Chat UI (8128)       
                            5 tabs: Intelligence |         
                            Matching | Protocol |          
                            Competitive | Dashboard        
                        └──────────────┬──────────────────┘
                                       
                        ┌──────────────▼──────────────────┐
                             FastAPI REST API (8538)       
                             26 endpoints, CORS, Auth      
                             Rate limiting, Metrics         
                        └──────────────┬──────────────────┘
                                       
                        ┌──────────────▼──────────────────┐
                             ClinicalTrialAgent            
                          Plan  Search  Evaluate        
                          Synthesize                        
                        └──────────────┬──────────────────┘
                                       
        ┌──────────────────────────────┼──────────────────────────────┐
                                                                    
┌───────▼────────┐          ┌──────────▼──────────┐       ┌──────────▼──────────┐
 Workflows (10)            Decision Engines (5)         RAG Engine           
                                                                             
 Protocol Design           Confidence Scorer            BGE-small-en-v1.5    
 Patient Match             Complexity Estimator         (384-dim embedding)  
 Site Selection            Enrollment Predictor                             
 Eligibility               Eligibility Scorer                               
 Adaptive Design           Competitive Ranker           Parallel Search      
 Safety Signal                                          14 Milvus Collections
 Regulatory                + Historical Success         (ThreadPoolExecutor) 
 Competitive                 Estimator                                      
 Diversity                                                                  
 DCT Planning                                           Claude Sonnet 4.6    
└───────┬────────┘          └──────────┬──────────┘       └──────────────────────┘
                                      
┌───────▼──────────────────────────────▼──────────────────────────────┐
                  Milvus 2.4  14 Collections                        
                                                                      
  trial_protocols (5K)       trial_eligibility (50K)                  
  trial_endpoints (20K)      trial_sites (30K)                        
  trial_investigators (5K)   trial_results (3K)                       
  trial_regulatory (2K)      trial_literature (10K)                   
  trial_biomarkers (3K)      trial_safety (20K)                       
  trial_rwe (2K)             trial_adaptive (500)                     
  trial_guidelines (1K)      genomic_evidence (100K) [shared]         
└──────────────────────────────────────────────────────────────────────┘

3. Data Collections — Actual State

All 14 collections (13 owned + 1 shared read-only) are populated and searchable.

3.1 Collection Catalog

# Collection Est. Records Weight Primary Use
1 trial_protocols 5,000 0.10 Protocol design, competitive intelligence
2 trial_eligibility 50,000 0.09 Patient matching, eligibility optimization
3 trial_endpoints 20,000 0.08 Protocol design, adaptive design evaluation
4 trial_sites 30,000 0.07 Site selection, diversity assessment
5 trial_investigators 5,000 0.05 Site selection, competitive intelligence
6 trial_results 3,000 0.09 Protocol design, competitive intelligence
7 trial_regulatory 2,000 0.07 Regulatory document generation
8 trial_literature 10,000 0.08 Evidence synthesis, protocol design
9 trial_biomarkers 3,000 0.07 Patient matching, biomarker strategy
10 trial_safety 20,000 0.08 Safety signal detection, regulatory docs
11 trial_rwe 2,000 0.06 Eligibility optimization, diversity planning
12 trial_adaptive 500 0.05 Adaptive design evaluation
13 trial_guidelines 1,000 0.08 All workflows (regulatory reference)
14 genomic_evidence 100,000 0.03 Cross-modal genomic queries

3.2 Index Configuration (all collections)

Parameter Value
Index type IVF_FLAT
Metric COSINE
nlist 1024 (protocols, eligibility, safety), 256 (others)
nprobe 16
Embedding dim 384 (BGE-small-en-v1.5)

4. Knowledge Base

4.1 Landmark Trials (40 entries)

Each entry includes: NCT ID, trial name, therapeutic area, phase, design type, primary endpoint, key result, regulatory outcome, and lessons learned.

Therapeutic Area Count Example Trials
Oncology 12 KEYNOTE-024, CheckMate-067, DESTINY-Breast03, ELIANA
Cardiology 5 DAPA-HF, EMPEROR-Reduced, PARADIGM-HF
Neurology 4 CLARITY AD, TRAILBLAZER-ALZ, EMERGE/ENGAGE
Immunology 4 TRANSFORM, RINVOQ, SKYRIZI
Rare Disease 3 FIREFISH, SUNFISH, SPRINT
Infectious Disease 3 RECOVERY, SOLIDARITY, COVE
Metabolic 3 SURPASS, SELECT, STEP
Other 6 Various therapeutic areas

4.2 Regulatory Agencies (9 entries)

Agency Jurisdiction Key Pathways
FDA United States Breakthrough Therapy, Accelerated Approval, RMAT, Fast Track
EMA European Union PRIME, Conditional MA, Orphan Designation
PMDA Japan SAKIGAKE, Conditional/Time-Limited Approval
NMPA China Priority Review, Breakthrough Therapy
Health Canada Canada Priority Review, NOC/c
TGA Australia Priority Review, Provisional Approval
MHRA United Kingdom ILAP, Conditional MA
Swissmedic Switzerland Fast-track, Temporary Authorization
ANVISA Brazil Priority Review

4.3 Adaptive Design Templates (9 entries)

Design Type Use Case Regulatory Precedent
Bayesian adaptive randomization Biomarker-driven oncology I-SPY 2
Platform trial Multi-arm, multi-stage RECOVERY
Seamless Phase II/III Dose selection + confirmatory KEYNOTE-024
Group sequential Early stopping for efficacy/futility Most pivotal trials
Response-adaptive Rare disease, small populations SUNFISH
Biomarker-adaptive Enrichment based on interim CheckMate-227
Sample size re-estimation Adaptive enrollment PARADIGM-HF
Master protocol (basket) Histology-independent NCI-MATCH
Master protocol (umbrella) Biomarker-stratified Lung-MAP

5. Clinical Workflows

5.1 Workflow Catalog

# Workflow Clinical Question Key Collections
1 Protocol Design "What endpoints and design would optimize this Phase II/III?" protocols, endpoints, results, adaptive
2 Patient Matching "Which patients meet eligibility for this trial?" eligibility, biomarkers, rwe
3 Site Selection "Which sites will enroll fastest with the best data quality?" sites, investigators, results
4 Eligibility Optimization "How can we broaden eligibility without compromising safety?" eligibility, safety, rwe, guidelines
5 Adaptive Design "Should we use Bayesian adaptive randomization or group sequential?" adaptive, endpoints, results
6 Safety Signal "Are there emerging safety signals in this ongoing trial?" safety, literature, regulatory
7 Regulatory Strategy "What regulatory pathway should we pursue for accelerated approval?" regulatory, guidelines, results
8 Competitive Landscape "Who else is developing therapies for this indication?" protocols, results, literature
9 Diversity Planning "How do we ensure representative enrollment across demographics?" sites, rwe, eligibility
10 DCT Planning "Which trial activities can be decentralized?" sites, guidelines, rwe

5.2 Decision Support Engines

Engine Function Output
Confidence Scorer Assesses evidence strength for recommendations 0-100 confidence score with supporting evidence count
Complexity Estimator Evaluates protocol complexity against benchmarks Complexity index, simplified alternatives
Enrollment Predictor Projects enrollment timelines from site data Months to full enrollment, at-risk sites
Eligibility Scorer Rates eligibility criteria restrictiveness Restrictiveness score, broadening suggestions
Competitive Ranker Ranks competitive trials by threat level Ranked competitor list with differentiation gaps
Historical Success Estimator Predicts trial success probability from historical data Success probability by phase, indication, design

6. Multi-Collection RAG Engine

6.1 Search Flow

User Query: "Design a Phase II protocol for KRAS G12C NSCLC"
    
    ├── 1. Embed query with BGE asymmetric prefix               [< 5 ms]
    
    ├── 2. Parallel search across 14 collections (top-5 each)   [12-18 ms]
       ├── trial_protocols:    KRAS G12C NSCLC trials           (score: 0.82-0.90)
       ├── trial_results:      Sotorasib/adagrasib outcomes     (score: 0.78-0.88)
       ├── trial_endpoints:    NSCLC Phase II endpoints         (score: 0.75-0.85)
       ├── trial_eligibility:  KRAS-selected criteria           (score: 0.74-0.82)
       └── trial_adaptive:     Biomarker-adaptive designs       (score: 0.70-0.80)
    
    ├── 3. Query expansion: "KRAS G12C NSCLC"                  [< 1 ms]
          [KRAS, sotorasib, adagrasib, non-small cell,
           biomarker-selected, targeted therapy, ...]
    
    ├── 4. Weighted merge + deduplicate (cap at 30 results)      [< 1 ms]
    
    ├── 5. Knowledge base augmentation                           [< 1 ms]
          NSCLC  landmark trials, regulatory pathways,
          historical success rates, competitive landscape
    
    └── 6. Stream Claude Sonnet 4.6 response                    [~22-26 sec]
           Grounded protocol recommendation with
           design rationale, endpoint selection,
           and regulatory pathway guidance

Total: ~26 sec (dominated by LLM generation; retrieval is ~25 ms)

6.2 Embedding Strategy

Model: BGE-small-en-v1.5 (BAAI)

Mode Prefix Usage
Query "Represent this sentence for searching relevant passages: " User questions
Document None (raw text) Ingested records

7. Performance Benchmarks

Measured on NVIDIA DGX Spark (GB10 GPU, 128GB unified LPDDR5x memory, 20 ARM cores).

7.1 Search Performance

Operation Latency Notes
Single collection search (top-5) 3-5 ms Milvus IVF_FLAT with cached index
14-collection parallel search (top-5 each) 12-18 ms ThreadPoolExecutor, 70 total results
Query expansion + filtered search 8-12 ms Up to 5 expanded terms, applicable collections
Knowledge base augmentation < 1 ms In-memory dictionary lookup
Full retrieve() pipeline 22-32 ms Embed + search + expand + merge + knowledge

7.2 RAG Query Performance

Operation Latency Notes
Full query (retrieve + Claude generate) ~26 sec Dominated by LLM generation
Streaming query (time to first token) ~3 sec Evidence returned immediately
Response length 1000-2500 chars Grounded answer with citations
Token count 500-1200 tokens Claude Sonnet 4.6 output

7.3 Decision Engine Performance

Engine Latency
Confidence Scorer <20 ms
Complexity Estimator <30 ms
Enrollment Predictor <50 ms
Eligibility Scorer <20 ms
Competitive Ranker <40 ms
Historical Success Estimator <30 ms
All engines combined <200 ms

8. Infrastructure

8.1 Technology Stack

Component Technology
Language Python 3.10+
Vector DB Milvus 2.4, localhost:19530
Embeddings BGE-small-en-v1.5 (BAAI) — 384-dim
LLM Claude Sonnet 4.6 (Anthropic API)
Web UI Streamlit (port 8128, NVIDIA black/green theme)
REST API FastAPI + Uvicorn (port 8538)
Configuration Pydantic BaseSettings
Testing pytest (769 tests)
Hardware target NVIDIA DGX Spark (GB10 GPU, 128GB unified, $4,699)

8.2 Service Ports

Port Service
8538 FastAPI REST API
8128 Streamlit Chat UI
19530 Milvus vector database (shared)

8.3 Dependencies on HCLS AI Factory

Dependency Usage
Milvus 2.4 instance Shared vector database — adds 13 owned collections alongside existing genomic_evidence (read-only)
ANTHROPIC_API_KEY Loaded from rag-chat-pipeline/.env if not set in environment
BGE-small-en-v1.5 Same embedding model as main RAG pipeline

9. Demo Scenarios

9.1 Validated Demo Queries

1. "Design an adaptive Phase II protocol for EGFR-mutant NSCLC with osimertinib resistance" - Searches: protocols (EGFR NSCLC), endpoints (ORR, PFS), adaptive (biomarker-adaptive), results (AURA3, FLAURA) - Expected: Biomarker-enriched, seamless Phase II/III, with co-primary ORR/PFS endpoints

2. "Match patients with BRCA1/2 mutations to open PARP inhibitor trials" - Searches: eligibility (BRCA criteria), biomarkers (BRCA1/2), protocols (PARP inhibitor), sites (recruiting) - Expected: Ranked trial list with eligibility match scores and site proximity

3. "Identify the top 10 enrolling sites for pediatric ALL trials in the US" - Searches: sites (pediatric ALL, US), investigators (pediatric hematology), results (enrollment data) - Expected: Ranked site list with enrollment rates, investigator profiles, and diversity metrics

4. "Are there safety signals for hepatotoxicity in the latest checkpoint inhibitor trials?" - Searches: safety (hepatotoxicity, checkpoint), literature (hepatic AEs), regulatory (safety reports) - Expected: Signal detection with frequency, severity, and regulatory response summary

5. "What is the competitive landscape for GLP-1 receptor agonists in obesity?" - Searches: protocols (GLP-1, obesity), results (semaglutide, tirzepatide), regulatory (obesity approvals) - Expected: Competitive matrix with mechanism, phase, enrollment, and differentiation analysis


10. File Structure (Actual)

clinical_trial_intelligence_agent/
├── src/
   ├── __init__.py
   ├── agent.py                     # Autonomous reasoning pipeline
   ├── models.py                    # Pydantic data models + enums
   ├── collections.py               # 14 Milvus collection schemas
   ├── rag_engine.py                # Multi-collection RAG engine
   ├── clinical_workflows.py        # 10 trial workflows
   ├── decision_support.py          # 5 decision engines + success estimator
   ├── knowledge.py                 # Domain knowledge base (40 trials, 13 areas, 9 agencies)
   ├── query_expansion.py           # 10 expansion maps, 140 aliases
   ├── cross_modal.py               # Cross-agent integration (4 peer agents)
   ├── metrics.py                   # Prometheus metrics
   ├── export.py                    # Report generation (PDF, Markdown, JSON)
   └── ingest/
       ├── base.py                  # BaseIngestPipeline
       ├── protocol_parser.py       # ClinicalTrials.gov REST API v2
       ├── eligibility_parser.py    # Eligibility criteria extraction
       ├── safety_parser.py         # Adverse event parser
       ├── regulatory_parser.py     # FDA/EMA document parser
       └── literature_parser.py     # PubMed E-utilities
├── app/
   └── trial_ui.py                 # Streamlit (5 tabs, NVIDIA theme)
├── api/
   └── main.py                     # FastAPI REST server (26 endpoints)
├── config/
   └── settings.py                 # Pydantic BaseSettings
├── data/
   └── reference/                  # Seed data JSON files
├── scripts/
   ├── setup_collections.py
   ├── seed_knowledge.py
   └── validate_e2e.py
├── tests/                          # 769 tests
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
└── README.md

46 Python files | ~22,607 lines of code | Apache 2.0


11. Implementation Status

Phase Status Details
Phase 1: Architecture Complete Data models, 14 collection schemas, knowledge base, 5 decision engines, RAG engine, agent orchestrator
Phase 2: Data Complete 40 landmark trials, 13 therapeutic areas, 9 regulatory agencies, 9 adaptive design templates, 140 entity aliases
Phase 3: RAG Integration Complete Multi-collection parallel search, knowledge augmentation, Claude Sonnet 4.6 streaming
Phase 4: Workflows Complete 10 clinical workflows with workflow-specific weight boosting
Phase 5: Testing Complete 769 tests, 100% pass, 0.47s runtime
Phase 6: UI + Demo Complete Streamlit UI on port 8128 (5 tabs), NVIDIA theme, demo scenarios validated

Remaining Work

Item Priority Effort
Real-time ClinicalTrials.gov sync (incremental updates) Medium 2-3 days
FDA CDER approval database integration Medium 1-2 days
Multi-region regulatory strategy comparison Low 1 week
Integration with HCLS AI Factory landing page Low 1 hour

12. Relationship to HCLS AI Factory

The Clinical Trial Intelligence Agent demonstrates the translational extension of the HCLS AI Factory architecture. While the core platform discovers drug candidates (Stage 3), this agent accelerates the path from candidate to clinic by optimizing clinical trial design and execution.

  • Same Milvus instance — 13 new owned collections alongside existing genomic_evidence (read-only)
  • Same embedding model — BGE-small-en-v1.5 (384-dim)
  • Same LLM — Claude via Anthropic API
  • Same hardware — NVIDIA DGX Spark ($4,699)
  • Same patterns — Pydantic models, BaseIngestPipeline, knowledge graph, query expansion

The platform closes the loop: genomic variants (Stage 1) inform biomarker-enriched trial designs, while trial outcomes feed back into the RAG knowledge base to improve future recommendations.


13. Credits

  • Adam Jones
  • Apache 2.0 License

Clinical Decision Support Disclaimer

The Clinical Trial Intelligence Agent is a clinical decision support research tool for clinical trial optimization. It is not FDA-cleared and is not intended as a standalone diagnostic device. All recommendations should be reviewed by qualified healthcare professionals and regulatory experts. Apache 2.0 License.