Clinical Trial Intelligence Agent: Architecture Design Document

Author: Adam Jones | Date: March 2026 | License: Apache 2.0

1. Executive Summary

The Clinical Trial Intelligence Agent extends the HCLS AI Factory platform to deliver RAG-powered decision support across the entire clinical trial lifecycle. It integrates protocol design optimization, patient-trial matching, site selection, safety signal detection, and competitive landscape analysis into a single intelligence platform, enabling pharmaceutical R&D teams to make evidence-based decisions faster and with greater confidence.

The platform supports cross-functional queries such as "Design an adaptive Phase II/III protocol for EGFR-mutant NSCLC with a biomarker-enriched population". A single query of this kind searches protocol templates, eligibility criteria, endpoint definitions, regulatory precedents, safety databases, and competitive intelligence in parallel, then returns grounded recommendations with citations to landmark trials and regulatory guidance.

Key Results

Metric	Value
Total vectors indexed	~251,500 across 14 Milvus collections (13 owned + 1 read-only)
Clinical workflows	10 (protocol design, patient matching, site selection, eligibility, adaptive, safety signal, regulatory, competitive, diversity, DCT)
Decision support engines	5, plus 1 Historical Success Estimator
Landmark trials in knowledge base	40 across 13 therapeutic areas
Regulatory agencies modeled	9 (FDA, EMA, PMDA, NMPA, Health Canada, TGA, MHRA, Swissmedic, ANVISA)
Entity aliases	140 query expansion mappings
API endpoints	26
Lines of code	22,607
Test suite	769 tests (100% pass, 0.47s runtime)

2. Architecture Overview

2.1 Mapping to VAST AI OS

VAST AI OS Component	Clinical Trial Agent Role
DataStore	Raw files: ClinicalTrials.gov JSON, PubMed XML, FDA regulatory docs, site performance data
DataEngine	Event-driven ingest pipelines with 7+ parsers (protocol, eligibility, endpoint, site, safety, regulatory, literature)
DataBase	14 Milvus collections (13 owned + 1 read-only) + knowledge base (40 trials, 13 areas, 9 agencies)
InsightEngine	BGE-small embedding + multi-collection RAG + query expansion (10 maps, 140 aliases)
AgentEngine	ClinicalTrialAgent (Plan-Search-Evaluate-Synthesize) + Streamlit UI (5 tabs)

2.2 System Diagram

                        ┌─────────────────────────────────┐
                        │    Streamlit Chat UI (8128)       │
                        │    5 tabs: Intelligence |         │
                        │    Matching | Protocol |          │
                        │    Competitive | Dashboard        │
                        └──────────────┬──────────────────┘
                                       │
                        ┌──────────────▼──────────────────┐
                        │     FastAPI REST API (8538)       │
                        │     26 endpoints, CORS, Auth      │
                        │     Rate limiting, Metrics         │
                        └──────────────┬──────────────────┘
                                       │
                        ┌──────────────▼──────────────────┐
                        │     ClinicalTrialAgent            │
                        │  Plan → Search → Evaluate →       │
                        │  Synthesize                        │
                        └──────────────┬──────────────────┘
                                       │
        ┌──────────────────────────────┼──────────────────────────────┐
        │                              │                              │
┌───────▼────────┐          ┌──────────▼──────────┐       ┌──────────▼──────────┐
│ Workflows (10) │          │ Decision Engines (5) │       │ RAG Engine           │
│                │          │                      │       │                      │
│ Protocol Design│          │ Confidence Scorer    │       │ BGE-small-en-v1.5    │
│ Patient Match  │          │ Complexity Estimator │       │ (384-dim embedding)  │
│ Site Selection │          │ Enrollment Predictor │       │         │            │
│ Eligibility    │          │ Eligibility Scorer   │       │         ▼            │
│ Adaptive Design│          │ Competitive Ranker   │       │ Parallel Search      │
│ Safety Signal  │          │                      │       │ 14 Milvus Collections│
│ Regulatory     │          │ + Historical Success │       │ (ThreadPoolExecutor) │
│ Competitive    │          │   Estimator          │       │         │            │
│ Diversity      │          │                      │       │         ▼            │
│ DCT Planning   │          │                      │       │ Claude Sonnet 4.6    │
└───────┬────────┘          └──────────┬──────────┘       └──────────────────────┘
        │                              │
┌───────▼──────────────────────────────▼──────────────────────────────┐
│                  Milvus 2.4 — 14 Collections                        │
│                                                                      │
│  trial_protocols (5K)       trial_eligibility (50K)                  │
│  trial_endpoints (20K)      trial_sites (30K)                        │
│  trial_investigators (5K)   trial_results (3K)                       │
│  trial_regulatory (2K)      trial_literature (10K)                   │
│  trial_biomarkers (3K)      trial_safety (20K)                       │
│  trial_rwe (2K)             trial_adaptive (500)                     │
│  trial_guidelines (1K)      genomic_evidence (100K) [shared]         │
└──────────────────────────────────────────────────────────────────────┘
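
The Plan → Search → Evaluate → Synthesize loop in the diagram can be read as four small stages that pass a shared state object along. The sketch below is a minimal illustration under stated assumptions, not the actual `ClinicalTrialAgent`: the `ROUTING` table, the `AgentState` fields, the sufficiency thresholds, and the `search_fn` / `generate_fn` callables are all hypothetical stand-ins for the RAG engine and the LLM client.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical routing table: keyword found in the query -> collections to search.
ROUTING = {
    "protocol": ["trial_protocols", "trial_endpoints"],
    "safety": ["trial_safety", "trial_literature"],
}


@dataclass
class AgentState:
    query: str
    collections: list[str] = field(default_factory=list)
    evidence: list[dict] = field(default_factory=list)
    sufficient: bool = False
    answer: str = ""


def plan(state: AgentState) -> None:
    """Plan: choose collections whose routing keyword appears in the query."""
    q = state.query.lower()
    chosen = {c for kw, cols in ROUTING.items() if kw in q for c in cols}
    state.collections = sorted(chosen) or ["trial_protocols"]  # assumed default


def search(state: AgentState, search_fn: Callable[[str, str], list[dict]]) -> None:
    """Search: gather hits from every planned collection."""
    state.evidence = [h for c in state.collections for h in search_fn(c, state.query)]


def evaluate(state: AgentState, min_hits: int = 2, min_score: float = 0.7) -> None:
    """Evaluate: require a minimum number of strong hits (assumed thresholds)."""
    strong = [h for h in state.evidence if h["score"] >= min_score]
    state.sufficient = len(strong) >= min_hits


def synthesize(state: AgentState, generate_fn: Callable[[str], str]) -> None:
    """Synthesize: hand the query and evidence to the LLM, or decline."""
    if not state.sufficient:
        state.answer = "Insufficient evidence retrieved."
        return
    context = "\n".join(h["text"] for h in state.evidence)
    state.answer = generate_fn(f"Question: {state.query}\nEvidence:\n{context}")


def run_agent(query: str, search_fn, generate_fn) -> AgentState:
    state = AgentState(query=query)
    plan(state)
    search(state, search_fn)
    evaluate(state)
    synthesize(state, generate_fn)
    return state
```

Keeping each stage as a separate function makes it straightforward to test the stages in isolation by injecting fake search and generation callables.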

3. Data Collections: Actual State

All 14 collections (13 owned + 1 shared read-only) are populated and searchable.

3.1 Collection Catalog

#	Collection	Est. Records	Weight	Primary Use
1	`trial_protocols`	5,000	0.10	Protocol design, competitive intelligence
2	`trial_eligibility`	50,000	0.09	Patient matching, eligibility optimization
3	`trial_endpoints`	20,000	0.08	Protocol design, adaptive design evaluation
4	`trial_sites`	30,000	0.07	Site selection, diversity assessment
5	`trial_investigators`	5,000	0.05	Site selection, competitive intelligence
6	`trial_results`	3,000	0.09	Protocol design, competitive intelligence
7	`trial_regulatory`	2,000	0.07	Regulatory document generation
8	`trial_literature`	10,000	0.08	Evidence synthesis, protocol design
9	`trial_biomarkers`	3,000	0.07	Patient matching, biomarker strategy
10	`trial_safety`	20,000	0.08	Safety signal detection, regulatory docs
11	`trial_rwe`	2,000	0.06	Eligibility optimization, diversity planning
12	`trial_adaptive`	500	0.05	Adaptive design evaluation
13	`trial_guidelines`	1,000	0.08	All workflows (regulatory reference)
14	`genomic_evidence`	100,000	0.03	Cross-modal genomic queries

3.2 Index Configuration (all collections)

Parameter	Value
Index type	IVF_FLAT
Metric	COSINE
nlist	1024 (protocols, eligibility, safety), 256 (others)
nprobe	16
Embedding dim	384 (BGE-small-en-v1.5)
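
As a rough illustration, the table above can be expressed as the parameter dictionaries that pymilvus accepts in `Collection.create_index(field_name=..., index_params=...)` and `Collection.search(..., param=...)`. The helper names below are hypothetical, and the sketch only builds the dictionaries so it can run without a Milvus server.

```python
# Collections the table assigns nlist=1024; every other collection uses 256.
LARGE_COLLECTIONS = {"trial_protocols", "trial_eligibility", "trial_safety"}

NPROBE = 16

# Search-time parameters in the shape pymilvus expects for IVF indexes.
SEARCH_PARAMS = {"metric_type": "COSINE", "params": {"nprobe": NPROBE}}


def nlist_for(collection_name: str) -> int:
    """Return the nlist value the configuration table gives for a collection."""
    return 1024 if collection_name in LARGE_COLLECTIONS else 256


def index_params(collection_name: str) -> dict:
    """Build IVF_FLAT index parameters for one collection (hypothetical helper)."""
    return {
        "index_type": "IVF_FLAT",
        "metric_type": "COSINE",
        "params": {"nlist": nlist_for(collection_name)},
    }


def probe_fraction(collection_name: str) -> float:
    """Fraction of IVF clusters scanned per query: nprobe / nlist."""
    return NPROBE / nlist_for(collection_name)
```

With these values, each query scans 16 of 1024 clusters (about 1.6%) in the three large collections and 16 of 256 clusters (6.25%) in the rest, which is the recall/latency trade-off the `nlist` and `nprobe` settings control.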

4. Knowledge Base

4.1 Landmark Trials (40 entries)

Each entry includes: NCT ID, trial name, therapeutic area, phase, design type, primary endpoint, key result, regulatory outcome, and lessons learned.

Therapeutic Area	Count	Example Trials
Oncology	12	KEYNOTE-024, CheckMate-067, DESTINY-Breast03, ELIANA
Cardiology	5	DAPA-HF, EMPEROR-Reduced, PARADIGM-HF
Neurology	4	CLARITY AD, TRAILBLAZER-ALZ, EMERGE/ENGAGE
Immunology	4	TRANSFORM, RINVOQ, SKYRIZI
Rare Disease	3	FIREFISH, SUNFISH, SPRINT
Infectious Disease	3	RECOVERY, SOLIDARITY, COVE
Metabolic	3	SURPASS, SELECT, STEP
Other	6	Various therapeutic areas

4.2 Regulatory Agencies (9 entries)

Agency	Jurisdiction	Key Pathways
FDA	United States	Breakthrough Therapy, Accelerated Approval, RMAT, Fast Track
EMA	European Union	PRIME, Conditional MA, Orphan Designation
PMDA	Japan	SAKIGAKE, Conditional/Time-Limited Approval
NMPA	China	Priority Review, Breakthrough Therapy
Health Canada	Canada	Priority Review, NOC/c
TGA	Australia	Priority Review, Provisional Approval
MHRA	United Kingdom	ILAP, Conditional MA
Swissmedic	Switzerland	Fast-track, Temporary Authorization
ANVISA	Brazil	Priority Review

4.3 Adaptive Design Templates (9 entries)

Design Type	Use Case	Regulatory Precedent
Bayesian adaptive randomization	Biomarker-driven oncology	I-SPY 2
Platform trial	Multi-arm, multi-stage	RECOVERY
Seamless Phase II/III	Dose selection + confirmatory	KEYNOTE-024
Group sequential	Early stopping for efficacy/futility	Most pivotal trials
Response-adaptive	Rare disease, small populations	SUNFISH
Biomarker-adaptive	Enrichment based on interim	CheckMate-227
Sample size re-estimation	Adaptive enrollment	PARADIGM-HF
Master protocol (basket)	Histology-independent	NCI-MATCH
Master protocol (umbrella)	Biomarker-stratified	Lung-MAP

5. Clinical Workflows

5.1 Workflow Catalog

#	Workflow	Clinical Question	Key Collections
1	Protocol Design	"What endpoints and design would optimize this Phase II/III?"	protocols, endpoints, results, adaptive
2	Patient Matching	"Which patients meet eligibility for this trial?"	eligibility, biomarkers, rwe
3	Site Selection	"Which sites will enroll fastest with the best data quality?"	sites, investigators, results
4	Eligibility Optimization	"How can we broaden eligibility without compromising safety?"	eligibility, safety, rwe, guidelines
5	Adaptive Design	"Should we use Bayesian adaptive randomization or group sequential?"	adaptive, endpoints, results
6	Safety Signal	"Are there emerging safety signals in this ongoing trial?"	safety, literature, regulatory
7	Regulatory Strategy	"What regulatory pathway should we pursue for accelerated approval?"	regulatory, guidelines, results
8	Competitive Landscape	"Who else is developing therapies for this indication?"	protocols, results, literature
9	Diversity Planning	"How do we ensure representative enrollment across demographics?"	sites, rwe, eligibility
10	DCT Planning	"Which trial activities can be decentralized?"	sites, guidelines, rwe
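
One way to connect the Key Collections column to the per-collection weights in Section 3.1 is to boost a workflow's key collections and renormalize. The sketch below assumes a multiplicative boost followed by renormalization; the boost factor, the function name, and the workflow keys are hypothetical, and only three of the ten workflows are shown.

```python
# Base weights copied from the collection catalog (Section 3.1).
BASE_WEIGHTS = {
    "trial_protocols": 0.10, "trial_eligibility": 0.09, "trial_endpoints": 0.08,
    "trial_sites": 0.07, "trial_investigators": 0.05, "trial_results": 0.09,
    "trial_regulatory": 0.07, "trial_literature": 0.08, "trial_biomarkers": 0.07,
    "trial_safety": 0.08, "trial_rwe": 0.06, "trial_adaptive": 0.05,
    "trial_guidelines": 0.08, "genomic_evidence": 0.03,
}

# Key collections per workflow, taken from the workflow catalog above.
WORKFLOW_COLLECTIONS = {
    "protocol_design": ["trial_protocols", "trial_endpoints", "trial_results", "trial_adaptive"],
    "patient_matching": ["trial_eligibility", "trial_biomarkers", "trial_rwe"],
    "safety_signal": ["trial_safety", "trial_literature", "trial_regulatory"],
}


def boosted_weights(workflow: str, boost: float = 2.0) -> dict[str, float]:
    """Multiply the workflow's key collections by `boost`, then renormalize to 1."""
    key = set(WORKFLOW_COLLECTIONS[workflow])
    raw = {name: w * (boost if name in key else 1.0) for name, w in BASE_WEIGHTS.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}
```

The base weights in the catalog sum to 1.00, so renormalizing after the boost keeps the weights comparable across workflows.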

5.2 Decision Support Engines

Engine	Function	Output
Confidence Scorer	Assesses evidence strength for recommendations	0-100 confidence score with supporting evidence count
Complexity Estimator	Evaluates protocol complexity against benchmarks	Complexity index, simplified alternatives
Enrollment Predictor	Projects enrollment timelines from site data	Months to full enrollment, at-risk sites
Eligibility Scorer	Rates eligibility criteria restrictiveness	Restrictiveness score, broadening suggestions
Competitive Ranker	Ranks competitive trials by threat level	Ranked competitor list with differentiation gaps
Historical Success Estimator	Predicts trial success probability from historical data	Success probability by phase, indication, design
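
To make the input and output shape of one engine concrete, here is a deliberately simple sketch of an enrollment projection. It is not the Enrollment Predictor's actual model: it assumes constant per-site enrollment rates, and the `Site` class, the `project_enrollment` function, and the at-risk threshold are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    patients_per_month: float  # assumed constant enrollment rate


def project_enrollment(
    sites: list[Site], target: int, at_risk_rate: float = 0.5
) -> tuple[int, list[str]]:
    """Return (months to full enrollment, names of at-risk sites).

    Months are the target divided by the combined site rate, rounded up.
    A site is flagged at-risk when its rate is below `at_risk_rate`.
    """
    total_rate = sum(s.patients_per_month for s in sites)
    if total_rate <= 0:
        raise ValueError("combined enrollment rate must be positive")
    months = math.ceil(target / total_rate)
    at_risk = [s.name for s in sites if s.patients_per_month < at_risk_rate]
    return months, at_risk
```

For example, three sites enrolling 2.0, 1.5, and 0.25 patients per month have a combined rate of 3.75, so a 150-patient target takes 40 months and the third site is flagged.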

6. Multi-Collection RAG Engine

6.1 Search Flow

User Query: "Design a Phase II protocol for KRAS G12C NSCLC"
    │
    ├── 1. Embed query with BGE asymmetric prefix               [< 5 ms]
    │
    ├── 2. Parallel search across 14 collections (top-5 each)   [12-18 ms]
    │   ├── trial_protocols:    KRAS G12C NSCLC trials           (score: 0.82-0.90)
    │   ├── trial_results:      Sotorasib/adagrasib outcomes     (score: 0.78-0.88)
    │   ├── trial_endpoints:    NSCLC Phase II endpoints         (score: 0.75-0.85)
    │   ├── trial_eligibility:  KRAS-selected criteria           (score: 0.74-0.82)
    │   └── trial_adaptive:     Biomarker-adaptive designs       (score: 0.70-0.80)
    │
    ├── 3. Query expansion: "KRAS G12C NSCLC" →                 [< 1 ms]
    │      [KRAS, sotorasib, adagrasib, non-small cell,
    │       biomarker-selected, targeted therapy, ...]
    │
    ├── 4. Weighted merge + deduplicate (cap at 30 results)      [< 1 ms]
    │
    ├── 5. Knowledge base augmentation                           [< 1 ms]
    │      NSCLC → landmark trials, regulatory pathways,
    │      historical success rates, competitive landscape
    │
    └── 6. Stream Claude Sonnet 4.6 response                    [~22-26 sec]
           Grounded protocol recommendation with
           design rationale, endpoint selection,
           and regulatory pathway guidance

Total: ~26 sec (dominated by LLM generation; retrieval is ~25 ms)
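
Steps 2 and 4 of the flow (parallel search, then weighted merge with deduplication and a 30-result cap) can be sketched as follows. This is an illustration under assumptions: `search_fn` is a hypothetical stand-in for the per-collection Milvus search, hits are assumed to be dicts with `id` and `score` keys, and multiplying similarity by the collection weight is one plausible reading of "weighted merge", not a confirmed formula.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_search(query_vec, collections, search_fn, top_k=5):
    """Run search_fn(collection, query_vec, top_k) for every collection in parallel."""
    with ThreadPoolExecutor(max_workers=len(collections)) as pool:
        futures = {
            name: pool.submit(search_fn, name, query_vec, top_k) for name in collections
        }
        return {name: fut.result() for name, fut in futures.items()}


def merge_results(per_collection, weights, cap=30):
    """Weight each hit, keep the best-scoring copy of each id, and cap the list."""
    best = {}
    for name, hits in per_collection.items():
        for hit in hits:
            weighted = hit["score"] * weights.get(name, 0.0)  # assumed weighting rule
            scored = {**hit, "collection": name, "weighted": weighted}
            prev = best.get(hit["id"])
            if prev is None or weighted > prev["weighted"]:
                best[hit["id"]] = scored
    ranked = sorted(best.values(), key=lambda h: h["weighted"], reverse=True)
    return ranked[:cap]
```

Because the searches are I/O-bound calls to the vector database, a thread pool is enough to overlap them; with 14 collections and top-5 each, the merge step receives at most 70 hits before deduplication and capping.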

6.2 Embedding Strategy

Model: BGE-small-en-v1.5 (BAAI)

Mode	Prefix	Usage
Query	`"Represent this sentence for searching relevant passages: "`	User questions
Document	None (raw text)	Ingested records
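
The asymmetric prefix in the table amounts to a small text-preparation step before encoding. The sketch below shows only that step; the function name and the `mode` argument are hypothetical, and the returned string would then be passed to whatever encoder wraps BGE-small-en-v1.5.

```python
# Query-side instruction prefix from the table above; documents get no prefix.
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "


def prepare_for_embedding(text: str, mode: str) -> str:
    """Return the string to embed for a 'query' or a 'document'."""
    if mode == "query":
        return QUERY_PREFIX + text.strip()
    if mode == "document":
        return text.strip()
    raise ValueError(f"unknown mode: {mode!r}")
```

Applying the prefix only on the query side matters for consistency: ingested records are embedded as raw text, so adding the prefix to documents as well would change their vectors relative to what is already indexed.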

7. Performance Benchmarks

Measured on NVIDIA DGX Spark (GB10 GPU, 128GB unified LPDDR5x memory, 20 ARM cores).

7.1 Search Performance

Operation	Latency	Notes
Single collection search (top-5)	3-5 ms	Milvus IVF_FLAT with cached index
14-collection parallel search (top-5 each)	12-18 ms	ThreadPoolExecutor, 70 total results
Query expansion + filtered search	8-12 ms	Up to 5 expanded terms, applicable collections
Knowledge base augmentation	< 1 ms	In-memory dictionary lookup
Full retrieve() pipeline	22-32 ms	Embed + search + expand + merge + knowledge
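
The "up to 5 expanded terms" behaviour noted for query expansion can be sketched as a lookup over an alias map with a cap. The alias entries below are illustrative only, drawn from the KRAS G12C example in Section 6.1; the map structure and function name are assumptions rather than the contents of `query_expansion.py`.

```python
# Illustrative alias map (hypothetical structure); keys are lowercase entity names.
ALIASES = {
    "kras g12c": ["sotorasib", "adagrasib", "targeted therapy"],
    "nsclc": ["non-small cell", "biomarker-selected"],
}


def expand_query(query: str, alias_map: dict[str, list[str]], max_terms: int = 5) -> list[str]:
    """Return up to `max_terms` alias terms for entities mentioned in the query."""
    q = query.lower()
    expanded: list[str] = []
    for entity, aliases in alias_map.items():
        if entity not in q:
            continue
        for alias in aliases:
            # Skip terms already present in the query or already collected.
            if alias.lower() not in q and alias not in expanded:
                expanded.append(alias)
    return expanded[:max_terms]
```

Since the lookup is a pass over an in-memory dictionary, it is consistent with the sub-millisecond expansion cost shown in the search flow; the 8-12 ms figure in the table covers the follow-up filtered searches as well.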

7.2 RAG Query Performance

Operation	Latency	Notes
Full query (retrieve + Claude generate)	~26 sec	Dominated by LLM generation
Streaming query (time to first token)	~3 sec	Evidence returned immediately
Response length	1000-2500 chars	Grounded answer with citations
Token count	500-1200 tokens	Claude Sonnet 4.6 output

7.3 Decision Engine Performance

Engine	Latency
Confidence Scorer	<20 ms
Complexity Estimator	<30 ms
Enrollment Predictor	<50 ms
Eligibility Scorer	<20 ms
Competitive Ranker	<40 ms
Historical Success Estimator	<30 ms
All engines combined	<200 ms

8. Infrastructure

8.1 Technology Stack

Component	Technology
Language	Python 3.10+
Vector DB	Milvus 2.4, localhost:19530
Embeddings	BGE-small-en-v1.5 (BAAI) — 384-dim
LLM	Claude Sonnet 4.6 (Anthropic API)
Web UI	Streamlit (port 8128, NVIDIA black/green theme)
REST API	FastAPI + Uvicorn (port 8538)
Configuration	Pydantic BaseSettings
Testing	pytest (769 tests)
Hardware target	NVIDIA DGX Spark (GB10 GPU, 128GB unified, $4,699)

8.2 Service Ports

Port	Service
8538	FastAPI REST API
8128	Streamlit Chat UI
19530	Milvus vector database (shared)

8.3 Dependencies on HCLS AI Factory

Dependency	Usage
Milvus 2.4 instance	Shared vector database — adds 13 owned collections alongside existing `genomic_evidence` (read-only)
`ANTHROPIC_API_KEY`	Loaded from `rag-chat-pipeline/.env` if not set in environment
BGE-small-en-v1.5	Same embedding model as main RAG pipeline
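
The `ANTHROPIC_API_KEY` fallback described in the table (environment first, then `rag-chat-pipeline/.env`) could look like the sketch below. It uses only the standard library rather than the project's Pydantic settings, the function name is hypothetical, and the location of `rag-chat-pipeline/` relative to the working directory is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(
    env_file: Path = Path("rag-chat-pipeline/.env"),  # assumed relative location
    name: str = "ANTHROPIC_API_KEY",
) -> Optional[str]:
    """Return the key from the environment, else from a simple KEY=VALUE .env file."""
    value = os.environ.get(name)
    if value:
        return value
    if env_file.is_file():
        for line in env_file.read_text().splitlines():
            line = line.strip()
            # Ignore blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, val = line.partition("=")
            if key.strip() == name:
                return val.strip().strip('"').strip("'") or None
    return None
```

Returning `None` instead of raising lets the caller decide whether a missing key is fatal, for example by disabling LLM generation while still serving retrieval results.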

9. Demo Scenarios

9.1 Validated Demo Queries

1. "Design an adaptive Phase II protocol for EGFR-mutant NSCLC with osimertinib resistance"
   - Searches: protocols (EGFR NSCLC), endpoints (ORR, PFS), adaptive (biomarker-adaptive), results (AURA3, FLAURA)
   - Expected: Biomarker-enriched, seamless Phase II/III, with co-primary ORR/PFS endpoints

2. "Match patients with BRCA1/2 mutations to open PARP inhibitor trials"
   - Searches: eligibility (BRCA criteria), biomarkers (BRCA1/2), protocols (PARP inhibitor), sites (recruiting)
   - Expected: Ranked trial list with eligibility match scores and site proximity

3. "Identify the top 10 enrolling sites for pediatric ALL trials in the US"
   - Searches: sites (pediatric ALL, US), investigators (pediatric hematology), results (enrollment data)
   - Expected: Ranked site list with enrollment rates, investigator profiles, and diversity metrics

4. "Are there safety signals for hepatotoxicity in the latest checkpoint inhibitor trials?"
   - Searches: safety (hepatotoxicity, checkpoint), literature (hepatic AEs), regulatory (safety reports)
   - Expected: Signal detection with frequency, severity, and regulatory response summary

5. "What is the competitive landscape for GLP-1 receptor agonists in obesity?"
   - Searches: protocols (GLP-1, obesity), results (semaglutide, tirzepatide), regulatory (obesity approvals)
   - Expected: Competitive matrix with mechanism, phase, enrollment, and differentiation analysis

10. File Structure (Actual)

clinical_trial_intelligence_agent/
├── src/
│   ├── __init__.py
│   ├── agent.py                     # Autonomous reasoning pipeline
│   ├── models.py                    # Pydantic data models + enums
│   ├── collections.py               # 14 Milvus collection schemas
│   ├── rag_engine.py                # Multi-collection RAG engine
│   ├── clinical_workflows.py        # 10 trial workflows
│   ├── decision_support.py          # 5 decision engines + success estimator
│   ├── knowledge.py                 # Domain knowledge base (40 trials, 13 areas, 9 agencies)
│   ├── query_expansion.py           # 10 expansion maps, 140 aliases
│   ├── cross_modal.py               # Cross-agent integration (4 peer agents)
│   ├── metrics.py                   # Prometheus metrics
│   ├── export.py                    # Report generation (PDF, Markdown, JSON)
│   └── ingest/
│       ├── base.py                  # BaseIngestPipeline
│       ├── protocol_parser.py       # ClinicalTrials.gov REST API v2
│       ├── eligibility_parser.py    # Eligibility criteria extraction
│       ├── safety_parser.py         # Adverse event parser
│       ├── regulatory_parser.py     # FDA/EMA document parser
│       └── literature_parser.py     # PubMed E-utilities
├── app/
│   └── trial_ui.py                 # Streamlit (5 tabs, NVIDIA theme)
├── api/
│   └── main.py                     # FastAPI REST server (26 endpoints)
├── config/
│   └── settings.py                 # Pydantic BaseSettings
├── data/
│   └── reference/                  # Seed data JSON files
├── scripts/
│   ├── setup_collections.py
│   ├── seed_knowledge.py
│   └── validate_e2e.py
├── tests/                          # 769 tests
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
└── README.md

46 Python files | 22,607 lines of code | Apache 2.0

11. Implementation Status

Phase	Status	Details
Phase 1: Architecture	Complete	Data models, 14 collection schemas, knowledge base, 5 decision engines, RAG engine, agent orchestrator
Phase 2: Data	Complete	40 landmark trials, 13 therapeutic areas, 9 regulatory agencies, 9 adaptive design templates, 140 entity aliases
Phase 3: RAG Integration	Complete	Multi-collection parallel search, knowledge augmentation, Claude Sonnet 4.6 streaming
Phase 4: Workflows	Complete	10 clinical workflows with workflow-specific weight boosting
Phase 5: Testing	Complete	769 tests, 100% pass, 0.47s runtime
Phase 6: UI + Demo	Complete	Streamlit UI on port 8128 (5 tabs), NVIDIA theme, demo scenarios validated

Remaining Work

Item	Priority	Effort
Real-time ClinicalTrials.gov sync (incremental updates)	Medium	2-3 days
FDA CDER approval database integration	Medium	1-2 days
Multi-region regulatory strategy comparison	Low	1 week
Integration with HCLS AI Factory landing page	Low	1 hour

12. Relationship to HCLS AI Factory

The Clinical Trial Intelligence Agent demonstrates the translational extension of the HCLS AI Factory architecture. While the core platform discovers drug candidates (Stage 3), this agent accelerates the path from candidate to clinic by optimizing clinical trial design and execution.

- Same Milvus instance: 13 new owned collections alongside existing genomic_evidence (read-only)
- Same embedding model: BGE-small-en-v1.5 (384-dim)
- Same LLM: Claude via Anthropic API
- Same hardware: NVIDIA DGX Spark ($4,699)
- Same patterns: Pydantic models, BaseIngestPipeline, knowledge graph, query expansion

The platform closes the loop: genomic variants (Stage 1) inform biomarker-enriched trial designs, while trial outcomes feed back into the RAG knowledge base to improve future recommendations.

13. Credits

Adam Jones
Apache 2.0 License

Clinical Decision Support Disclaimer

The Clinical Trial Intelligence Agent is a clinical decision support research tool for clinical trial optimization. It is not FDA-cleared and is not intended as a standalone diagnostic device. All recommendations should be reviewed by qualified healthcare professionals and regulatory experts.