Clinical Trial Intelligence Agent -- Architecture Guide¶

Version: 2.0.0 Date: March 22, 2026 Author: Adam Jones Platform: NVIDIA DGX Spark -- HCLS AI Factory

Table of Contents¶

System Overview
Three-Tier Architecture
Data Flow
Component Architecture
RAG Engine Design
Workflow Engine Design
Decision Support Architecture
Agent Pipeline
Query Expansion System
Collection Architecture
Cross-Agent Integration
Ingest Pipeline Architecture
Observability Architecture
Security Architecture
Deployment Architecture

1. System Overview¶

The Clinical Trial Intelligence Agent is designed as a modular, layered system that separates concerns across three tiers (presentation, application, data) with well-defined interfaces between each layer. The architecture prioritizes graceful degradation: every component operates without Milvus, without the LLM, and without peer agents, ensuring the system delivers value at every connectivity level.

Design Principles¶

Graceful Degradation: Each tier and component operates independently. Milvus down? Workflows and knowledge base still function. LLM unavailable? Search-only mode. Peer agents offline? Default responses logged.
Workflow-First Design: All clinical trial intelligence is channeled through 10 typed workflows following a common preprocess/execute/postprocess contract.
Multi-Collection RAG: Evidence is distributed across 14 specialized collections, with workflow-specific weights ensuring domain relevance.
Type Safety: Pydantic models enforce data contracts at every boundary (API input/output, workflow I/O, search results).
Observable: Prometheus metrics at every layer (query, search, workflow, LLM, ingest, system).

2. Three-Tier Architecture¶

+==============================================================+
|                    PRESENTATION TIER                           |
|  +------------------+                                         |
|  | Streamlit UI     |   5 tabs: Intelligence, Matching,       |
|  | Port 8128        |   Protocol, Competitive, Dashboard      |
|  | NVIDIA Theme     |                                         |
|  +--------+---------+                                         |
|           | HTTP/REST (JSON)                                  |
+===========|==================================================+
            v
+==============================================================+
|                    APPLICATION TIER                            |
|  +------------------+  +----------------+  +---------------+  |
|  | FastAPI API      |  | Workflow       |  | Decision      |  |
|  | Port 8538        |  | Engine (10)    |  | Support (5)   |  |
|  | 26 Endpoints     |  |                |  |               |  |
|  | CORS, Auth, Rate |  | Protocol       |  | Confidence    |  |
|  +--------+---------+  | Patient Match  |  | Complexity    |  |
|           |             | Site Select    |  | Enrollment    |  |
|  +--------+---------+  | Eligibility    |  | Eligibility   |  |
|  | RAG Engine       |  | Adaptive       |  | Competitive   |  |
|  | Multi-collection |  | Safety Signal  |  | Success Rate  |  |
|  | Weighted search  |  | Regulatory     |  +---------------+  |
|  +--------+---------+  | Competitive    |                     |
|           |             | Diversity      |  +---------------+  |
|  +--------+---------+  | DCT Planning   |  | Query         |  |
|  | Agent Pipeline   |  +----------------+  | Expansion     |  |
|  | Plan-Search-     |                      | 10 maps       |  |
|  | Evaluate-Synth   |  +----------------+  | 140 aliases   |  |
|  +--------+---------+  | Knowledge      |  +---------------+  |
|           |             | Base           |                     |
|           |             | 40 trials      |  +---------------+  |
|           |             | 13 areas       |  | Cross-Agent   |  |
|           |             | 9 agencies     |  | Integration   |  |
|           |             +----------------+  | 4 agents      |  |
|           |                                 +---------------+  |
+===========|==================================================+
            v
+==============================================================+
|                       DATA TIER                               |
|  +------------------+  +----------------+  +---------------+  |
|  | Milvus           |  | etcd           |  | MinIO         |  |
|  | Port 19530       |  | Port 2379      |  | Port 9000     |  |
|  | 14 Collections   |  | Metadata       |  | Blob Storage  |  |
|  | IVF_FLAT/COSINE  |  |                |  |               |  |
|  | 384-dim BGE      |  |                |  |               |  |
|  +------------------+  +----------------+  +---------------+  |
+==============================================================+

Tier Responsibilities¶

Tier	Responsibility	Failure Mode
Presentation	User interaction, visualization, form input	API errors shown as warnings
Application	Business logic, search, synthesis, decision support	Degrades per component
Data	Vector storage, indexing, metadata, blob storage	Workflows still run from knowledge base

3. Data Flow¶

3.1 Query Flow¶

User Question
    |
    v
[Streamlit UI] --> HTTP POST /v1/trial/query
    |
    v
[FastAPI API]
    |
    v
[Query Expansion] --> Resolve aliases, expand synonyms
    |
    v
[Agent Pipeline]
    |
    +-> [Plan] --> Detect workflows, decompose sub-questions
    |
    +-> [Search] --> RAG Engine
    |       |
    |       +-> [Embed query] --> BGE-small-en-v1.5 (384-dim)
    |       |
    |       +-> [Search 14 collections] --> Milvus IVF_FLAT
    |       |
    |       +-> [Weight & merge] --> Workflow-specific weights
    |       |
    |       +-> [Score threshold] --> Filter < 0.4
    |
    +-> [Evaluate] --> Evidence quality, completeness check
    |
    +-> [Workflow Execute] --> One or more of 10 workflows
    |       |
    |       +-> [Decision Support] --> Confidence, complexity, etc.
    |       |
    |       +-> [Cross-Agent] --> Optional oncology/PGx/cardio/biomarker
    |
    +-> [Synthesize] --> LLM (Claude) generates natural language answer
    |
    +-> [Report] --> Format with citations, guidelines, confidence
    |
    v
[TrialResponse] --> JSON to Streamlit UI

3.2 Ingest Flow¶

[External Sources]
    |
    +-> ClinicalTrials.gov API --> ClinicalTrialsParser
    |                                    |
    |                                    +-> trial_protocols
    |                                    +-> trial_eligibility
    |                                    +-> trial_endpoints
    |                                    +-> trial_sites
    |                                    +-> trial_investigators
    |
    +-> PubMed E-utilities API --> PubMedParser
    |                                    |
    |                                    +-> trial_literature
    |                                    +-> trial_results
    |
    +-> FDA/EMA/ICH docs --> RegulatoryParser
                                    |
                                    +-> trial_regulatory
                                    +-> trial_guidelines
                                    +-> trial_safety

3.3 Cross-Agent Flow¶

[Clinical Trial Agent]
    |
    +-- query_oncology_agent() --> [:8527] Molecular matches
    |       |
    |       +-- Graceful degradation if unavailable
    |
    +-- query_pgx_agent() ------> [:8107] PGx screening
    |
    +-- query_cardiology_agent() -> [:8126] Cardiac safety
    |
    +-- query_biomarker_agent() --> [:8529] Biomarker enrichment
    |
    v
[integrate_cross_agent_results()] --> Unified assessment

4. Component Architecture¶

4.1 Component Dependency Graph¶

api/main.py (FastAPI)
    |
    +-> api/routes/trial_clinical.py (22 endpoints)
    |       |
    |       +-> src/agent.py (TrialIntelligenceAgent)
    |       |       |
    |       |       +-> src/rag_engine.py (TrialRAGEngine)
    |       |       |       |
    |       |       |       +-> src/collections.py (14 schemas)
    |       |       |       +-> src/query_expansion.py (10 maps)
    |       |       |
    |       |       +-> src/clinical_workflows.py (10 workflows)
    |       |       |       |
    |       |       |       +-> src/decision_support.py (5 engines)
    |       |       |       +-> src/knowledge.py (domain knowledge)
    |       |       |       +-> src/models.py (data contracts)
    |       |       |
    |       |       +-> src/cross_modal.py (4 agent integrations)
    |       |
    |       +-> src/export.py (report generation)
    |
    +-> api/routes/reports.py (2 endpoints)
    +-> api/routes/events.py (2 endpoints)
    +-> src/metrics.py (Prometheus)
    +-> config/settings.py (TrialSettings)

4.2 Module Roles¶

Module	Role	Dependencies
`agent.py`	Orchestrator: plan, search, evaluate, synthesize	rag_engine, workflows, cross_modal
`rag_engine.py`	Multi-collection RAG with weighted retrieval	collections, query_expansion
`clinical_workflows.py`	10 workflow implementations	models, knowledge, decision_support
`decision_support.py`	Quantitative scoring engines	models
`knowledge.py`	Static domain knowledge (40 trials, 13 areas, etc.)	-- (no dependencies)
`models.py`	Pydantic data contracts and enums	pydantic
`collections.py`	Milvus schema definitions	models, pymilvus
`query_expansion.py`	Synonym resolution and term expansion	models
`cross_modal.py`	Cross-agent HTTP integration	config/settings
`export.py`	Report formatting and export	models
`metrics.py`	Prometheus metric definitions	prometheus_client
`scheduler.py`	Timed ingest and maintenance	ingest, settings

5. RAG Engine Design¶

5.1 Multi-Collection Search Strategy¶

The RAG engine searches all 14 collections in parallel, with each collection receiving a workflow-specific weight. Results are merged by weighted score and deduplicated.

Query
  |
  v
[Expand Query] --> query_expansion.py
  |
  v
[Embed] --> BGE-small-en-v1.5 (384-dim vector)
  |
  v
[Parallel Search] --> 14 collections x top_k results
  |
  +-> trial_protocols     (weight: varies by workflow)
  +-> trial_eligibility   (weight: varies by workflow)
  +-> ...
  +-> genomic_evidence    (weight: varies by workflow)
  |
  v
[Weight & Merge] --> score = raw_score * collection_weight
  |
  v
[Threshold Filter] --> remove if score < 0.4
  |
  v
[Re-rank] --> sort by weighted score descending
  |
  v
[Return top_k] --> TrialSearchResult list

5.2 Graceful Degradation Levels¶

Level	Milvus	LLM	Peer Agents	Capability
Full	Available	Available	Available	Complete RAG + synthesis + cross-agent
Search-only	Available	Unavailable	Any	Vector search with structured results
Workflow-only	Unavailable	Available	Any	Knowledge-based workflows + LLM synthesis
Minimal	Unavailable	Unavailable	Unavailable	Decision engines + knowledge base queries

6. Workflow Engine Design¶

6.1 Template Method Pattern¶

All workflows follow the BaseTrialWorkflow contract, which enforces a three-phase execution model:

class BaseTrialWorkflow(ABC):
    workflow_type: TrialWorkflowType

    def run(self, inputs: dict) -> WorkflowResult:
        processed = self.preprocess(inputs)    # validate, normalize
        result = self.execute(processed)        # core logic
        result = self.postprocess(result)       # enrich, add warnings
        return result

6.2 Workflow Routing¶

The WorkflowEngine maps TrialWorkflowType enum values to workflow class instances, enabling both automatic detection from query text and explicit routing via the API:

Query Text --> [Detect Keywords] --> TrialWorkflowType
    |
    v
WorkflowEngine.dispatch(workflow_type, inputs)
    |
    v
[Specific Workflow Instance].run(inputs)
    |
    v
WorkflowResult

6.3 Collection Weight Boosting¶

Each workflow overrides the default collection weights to prioritize domain-relevant evidence. For example, the Safety Signal workflow boosts trial_safety to 0.25 (vs. default 0.08), ensuring adverse event data surfaces prominently.

7. Decision Support Architecture¶

7.1 Engine Independence¶

All five decision support engines are stateless, pure-function classes with no external dependencies (no Milvus, no LLM, no network calls). This ensures they operate at every degradation level.

Decision Engines (all stateless, self-contained)
    |
    +-> ConfidenceCalibrator     --> multi-factor calibration
    +-> ProtocolComplexityScorer --> Tufts CSDD benchmarks
    +-> EnrollmentPredictor      --> multi-factor prediction
    +-> EligibilityAnalyzer      --> 29 pattern matching
    +-> CompetitiveThreatScorer  --> 4-factor threat model
    +-> HistoricalSuccessEstimator -> 12 areas x 3 phases

7.2 Calibration Pipeline¶

The Confidence Calibrator is applied as a post-processing step after workflow execution and RAG retrieval, combining four signals into a single calibrated confidence:

raw_confidence (0.30)  --> from workflow
evidence_base (0.30)   --> from evidence level (A1=1.0, E=0.15)
doc_factor (0.20)      --> log(n_docs + 1) / log(12)
agreement (0.20)       --> cross-agent consensus
                           = calibrated confidence (0.0-1.0)

8. Agent Pipeline¶

8.1 Five-Stage Pipeline¶

The TrialIntelligenceAgent implements the VAST AI OS AgentEngine pattern:

Stage	Method	Input	Output
Plan	`search_plan()`	Query text	SearchPlan (areas, drugs, biomarkers, sub-questions)
Search	`rag_engine.query()`	SearchPlan	TrialSearchResult list
Evaluate	`evaluate_evidence()`	Search results	Quality scores, completeness flags
Synthesize	`synthesize()`	Results + evaluation	Natural language answer (via Claude)
Report	`generate_report()`	All above	TrialResponse with citations

8.2 Evidence Hierarchy¶

Level 1a: Systematic review of RCTs (highest)
Level 1b: Individual RCT
Level 2a: Systematic review of cohort studies
Level 2b: Individual cohort study
Level 3:  Case-control study
Level 4:  Case series
Level 5:  Expert opinion
Reg:      Regulatory guidance (FDA/EMA/ICH)

9. Query Expansion System¶

9.1 Expansion Pipeline¶

Raw Query: "What NSCLC trials use Keytruda with TMB-H?"
    |
    v
[Entity Alias Resolution]
    NSCLC -> "non-small cell lung cancer"
    Keytruda -> "pembrolizumab"
    TMB-H -> "tumor mutational burden high"
    |
    v
[Therapeutic Area Detection]
    "lung cancer" -> oncology
    |
    v
[Drug Expansion]
    "pembrolizumab" -> ["Keytruda", "MK-3475", "lambrolizumab", "anti-PD-1"]
    |
    v
[Biomarker Expansion]
    "TMB" -> ["tumor mutational burden", "mutational load", "TMB-high", ...]
    |
    v
[Expanded Query]
    "non-small cell lung cancer pembrolizumab anti-PD-1
     tumor mutational burden TMB-high oncology"

9.2 Map Statistics¶

10 synonym maps covering all clinical trial domains
140 entity aliases for instant abbreviation resolution
33 drug entries with brand/generic/code name mapping
22 biomarker entries with assay and synonym coverage

10. Collection Architecture¶

10.1 Schema Design Principles¶

Shared embedding field: All 14 collections use identical 384-dim FLOAT_VECTOR fields
Shared index config: IVF_FLAT with COSINE metric and nlist=128
Domain-specific metadata: Each collection has typed metadata fields (VARCHAR, INT32, FLOAT, BOOL) relevant to its domain
Auto-generated primary keys: INT64 auto_id for all collections
Text content fields: VARCHAR with appropriate max_length for full-text and chunk storage

10.2 Estimated Data Distribution¶

trial_eligibility  ████████████████████████████████████████████████████  50,000
trial_sites        ████████████████████████████████                      30,000
trial_endpoints    ████████████████████                                  20,000
trial_safety       ████████████████████                                  20,000
trial_literature   ██████████                                           10,000
trial_protocols    █████                                                 5,000
trial_investigators █████                                                5,000
trial_biomarkers   ███                                                   3,000
trial_results      ███                                                   3,000
trial_regulatory   ██                                                    2,000
trial_rwe          ██                                                    2,000
trial_guidelines   █                                                     1,000
trial_adaptive     ▌                                                       500
genomic_evidence   ████████████████████████████████████████████████ ~100,000

11. Cross-Agent Integration¶

11.1 Integration Architecture¶

                    +----------------------------+
                    | Clinical Trial Intelligence |
                    |         Agent               |
                    +---+------+------+------+---+
                        |      |      |      |
        +---------------+  +---+  +---+  +---+---------------+
        v                  v       v       v                  v
+-------+-------+ +-------+--+ +--+------+ +--------+-------+
| Oncology      | | PGx      | | Cardio  | | Biomarker      |
| Intelligence  | | Intelli- | | Intelli-| | Intelligence   |
| Agent (:8527) | | gence    | | gence   | | Agent (:8529)  |
|               | | (:8107)  | | (:8126) | |                |
| Molecular     | | PGx      | | Cardiac | | Enrichment     |
| trial matches | | screening| | safety  | | strategies     |
+---------------+ +----------+ +---------+ +----------------+

11.2 Failure Isolation¶

Each cross-agent call is wrapped in a try/except with a 30-second timeout. Failure in one agent never blocks the clinical trial agent's response. The integration module (cross_modal.py) returns structured default responses when agents are unavailable, clearly flagged as degraded.

12. Ingest Pipeline Architecture¶

12.1 Pipeline Hierarchy¶

BaseIngestPipeline (abstract)
    |
    +-> ClinicalTrialsParser (ClinicalTrials.gov)
    |       |
    |       +-> XML/JSON parsing
    |       +-> Field extraction per collection
    |       +-> BGE embedding generation
    |       +-> Milvus upsert
    |
    +-> PubMedParser (PubMed/MEDLINE)
    |       |
    |       +-> E-utilities API calls
    |       +-> MeSH term extraction
    |       +-> Literature chunking
    |
    +-> RegulatoryParser (FDA/EMA/ICH)
            |
            +-> Document parsing
            +-> Guideline extraction
            +-> Safety signal parsing

12.2 Scheduler Integration¶

The src/scheduler.py runs ingest pipelines on a configurable interval (default 24 hours) via a daemon thread. It handles collection maintenance (compaction, index rebuild) and supports enable/disable via environment variable.

13. Observability Architecture¶

[Application Code]
    |
    +-> src/metrics.py (Prometheus client)
    |       |
    |       +-> Counters: queries_total, search_total, errors_total
    |       +-> Histograms: query_duration, search_duration
    |       +-> Gauges: collection_records
    |       +-> Info: system_info
    |
    +-> GET /metrics endpoint
            |
            v
    [Prometheus Server] --> [Grafana Dashboard]

All metrics use the trial_ prefix for dashboard filtering. The metrics module gracefully degrades if prometheus_client is not installed (no-op stubs).

14. Security Architecture¶

14.1 Security Layers¶

[Client Request]
    |
    v
[CORS Middleware] --> Check origin against whitelist
    |
    v
[Rate Limiter] --> 100 req/min per IP
    |
    v
[Auth Middleware] --> Validate X-API-Key header (if configured)
    |
    v
[Pydantic Validation] --> Enforce field constraints, types, ranges
    |
    v
[Business Logic] --> No SQL injection risk (vector-only DB)
    |
    v
[Response] --> Sanitized JSON output

14.2 Secret Management¶

API keys stored in environment variables or .env file
TRIAL_ prefix isolates agent-specific config
No secrets in source code or version control
HTTPS termination at reverse proxy layer

15. Deployment Architecture¶

15.1 Docker Composition¶

+-------------------------------------------------------+
|                  DGX Spark Host                        |
|                                                        |
|  +------------------+  +-------------------+           |
|  | clinical-trial-  |  | clinical-trial-   |           |
|  | agent-api        |  | agent-ui          |           |
|  | Port: 8538       |  | Port: 8128        |           |
|  | FastAPI + Uvicorn |  | Streamlit         |           |
|  +--------+---------+  +---------+---------+           |
|           |                      |                     |
|           +----------+-----------+                     |
|                      |                                 |
|              +-------+-------+                         |
|              | milvus-       |                         |
|              | standalone    |                         |
|              | Port: 19530   |                         |
|              +---+---+---+---+                         |
|                  |       |                             |
|          +-------+  +----+------+                      |
|          | etcd  |  | MinIO     |                      |
|          | :2379 |  | :9000     |                      |
|          +-------+  +-----------+                      |
|                                                        |
|  +------------------+  +-------------------+           |
|  | Other Agents     |  | Monitoring        |           |
|  | Oncology (:8527) |  | Prometheus        |           |
|  | PGx (:8107)      |  | Grafana           |           |
|  | Cardio (:8126)   |  |                   |           |
|  | Biomarker (:8529)|  |                   |           |
|  +------------------+  +-------------------+           |
+-------------------------------------------------------+

15.2 Resource Requirements¶

Component	CPU	Memory	GPU	Storage
API Server	2 cores	2 GB	None	<100 MB
Streamlit UI	1 core	512 MB	None	<50 MB
Milvus Standalone	4 cores	8 GB	None	10-50 GB
BGE Embedding	2 cores	1 GB	Optional	~500 MB model

Clinical Trial Intelligence Agent v2.0.0 -- Architecture Guide -- March 2026