Clinical Trial Intelligence Agent -- Demo Guide¶

Version: 2.0.0
Date: March 22, 2026
Author: Adam Jones
Platform: NVIDIA DGX Spark -- HCLS AI Factory

Table of Contents¶

1. Pre-Demo Checklist
2. Starting the System
3. UI Walkthrough: 5 Tabs
4. Demo Scenario 1: Oncology Protocol Design
5. Demo Scenario 2: Patient-Trial Matching
6. Demo Scenario 3: Eligibility Optimization
7. Demo Scenario 4: Competitive Intelligence
8. Demo Scenario 5: Adaptive Design Selection
9. API Demo Queries
10. Talking Points
11. FAQ
12. Recovery Procedures

1. Pre-Demo Checklist¶

Run through this checklist before every demo:

#	Check	Command	Expected
1	Tests pass	`python -m pytest tests/ -q`	769 passed in 0.47s
2	API starts	`curl localhost:8538/health`	`{"status": "ok"}`
3	UI loads	Open `http://localhost:8128`	NVIDIA-themed 5-tab UI
4	Collections exist	`curl localhost:8538/collections`	14 collections listed
5	Knowledge loaded	`curl localhost:8538/v1/trial/knowledge-version`	version 2.0.0
6	Workflows available	`curl localhost:8538/workflows`	10+ workflow types
7	Milvus running	Check port 19530	Connected status in health
8	Browser configured	Close unrelated tabs	Clean browser window
9	Terminal ready	Two terminals open	API + backup commands
10	API key set	Check .env	ANTHROPIC_API_KEY present
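
The HTTP rows of the checklist (2, 4, 5, and 6) can be scripted so they run in one pass. The sketch below is a minimal example that assumes only the port and endpoints listed in the table. The names `http_status` and `run_checks` are illustrative, and the script only confirms that each endpoint answers with HTTP 200; it does not inspect response bodies, whose exact shape is not documented in this guide.

```python
import urllib.error
import urllib.request

API = "http://localhost:8538"  # API port from the checklist above

# Endpoints taken from checklist rows 2, 4, 5, and 6.
CHECKS = {
    "API health": "/health",
    "Collections": "/collections",
    "Knowledge version": "/v1/trial/knowledge-version",
    "Workflows": "/workflows",
}


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def run_checks(get_status=http_status, base=API):
    """Map each check name to True when its endpoint answers HTTP 200."""
    return {name: get_status(base + path) == 200 for name, path in CHECKS.items()}
```

Calling `run_checks()` with the API running returns a dictionary of pass/fail flags that can be printed before the demo. The `get_status` parameter exists so the function can be exercised without a live server.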

2. Starting the System¶

Quick Start (Integrated)¶

cd /home/adam/projects/hcls-ai-factory
./start-factory.sh

Quick Start (Standalone)¶

cd /home/adam/projects/hcls-ai-factory/ai_agent_adds/clinical_trial_intelligence_agent

# Terminal 1: API
uvicorn api.main:app --host 0.0.0.0 --port 8538

# Terminal 2: UI
streamlit run app/trial_ui.py --server.port 8128

Verify Both Services¶

curl -s localhost:8538/health | python -m json.tool
# Open http://localhost:8128 in browser

3. UI Walkthrough: 5 Tabs¶

Tab 1: Trial Intelligence¶

Purpose: RAG-powered Q&A across all 14 collections with workflow-aware routing.

What to show:
- Type a clinical trial question in the text input
- System detects the appropriate workflow automatically
- Results show: answer, citations with collection source and relevance scores, guideline references, confidence score
- Point out the NVIDIA dark theme and responsive layout

Sample queries to type:
- "What are the standard Phase 3 endpoints for NSCLC immunotherapy trials?"
- "Compare adaptive design options for a rare disease Phase 2/3 study"
- "What regulatory pathways are available for breakthrough therapy designation?"

Tab 2: Patient Matching¶

Purpose: Match a patient profile against clinical trial eligibility criteria.

What to show:
- Enter patient demographics (age, sex, diagnosis)
- Add biomarkers (e.g., PD-L1 TPS 80%, EGFR L858R)
- Add genomic variants and current medications
- Submit to see per-criterion match scores
- Highlight the overall match score and confidence level
- Show nearby trial sites if geographic location is provided

Tab 3: Protocol Optimizer¶

Purpose: Protocol complexity scoring and design recommendations.

What to show:
- Enter trial parameters: indication, phase, procedure count, visit count, endpoint count, eligibility criteria count
- Submit to see complexity score and percentile rank
- Point out comparisons to Tufts CSDD industry benchmarks
- Show endpoint recommendations by indication
- Demonstrate historical success rate lookup

Tab 4: Competitive Landscape¶

Purpose: Competitive threat assessment and landscape visualization.

What to show:
- Enter an indication and mechanism of action
- See competitor trial listing with threat scores
- Explain the 4-factor threat model (phase, enrollment, sponsor, differentiation)
- Highlight threat classification (critical/high/moderate/low/minimal)
- Show enrollment progress comparison

Tab 5: Dashboard¶

Purpose: System health and operational metrics.

What to show:
- Collection health indicators (14 collections with record counts)
- Query volume metrics
- Workflow execution breakdown
- Cross-agent integration status
- Knowledge base version and last update

4. Demo Scenario 1: Oncology Protocol Design¶

Story: "A pharma company wants to design a Phase 3 trial for a novel PD-L1/VEGF bispecific antibody in first-line NSCLC. Let's use the agent to generate an evidence-based protocol blueprint."

Step 1: Ask the Question¶

Go to Tab 1 (Trial Intelligence) and type:

"Design a Phase 3 protocol for a PD-L1/VEGF bispecific antibody in first-line metastatic NSCLC with PD-L1 TPS >= 50%"

Expected Output¶

The system should return:

Workflow detected: Protocol Design
Recommended primary endpoint: Progression-Free Survival (PFS) or Overall Survival (OS)
Recommended secondary endpoints: ORR, DoR, OS (if PFS is primary), quality of life
Comparator recommendation: Pembrolizumab monotherapy (based on KEYNOTE-024)
Sample size estimate: 300-600 patients based on expected HR and event rate
Eligibility framework: PD-L1 TPS >= 50%, ECOG 0-1, no prior systemic therapy
Adaptive design suggestion: Group sequential with interim analysis for OS
Regulatory pathway: Potential for Breakthrough Therapy Designation; RTOR eligible
Landmark trial references: KEYNOTE-024 (pembrolizumab), IMpower110 (atezolizumab)
Historical success rate: ~36% for Phase 3 oncology
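
To show where a sample size in this range can come from, the sketch below applies the Schoenfeld approximation for a two-sided log-rank test with 1:1 randomization. This is a standard textbook formula, not necessarily the method the agent uses. The hazard ratio of 0.70, the 90% power, and the 70% event probability are assumptions picked for illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.90):
    """Events needed for a two-sided log-rank test with 1:1 randomization
    (Schoenfeld approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)


def required_patients(events, event_probability):
    """Patients to enroll so that the expected number of events is reached."""
    return math.ceil(events / event_probability)


# Illustrative inputs only: HR 0.70 and a 70% chance of observing an event
# per patient are assumptions, not values produced by the agent.
events = required_events(0.70)
patients = required_patients(events, 0.70)
print(events, patients)
```

With these inputs the calculation gives roughly 330 events and about 470 patients, which falls inside the 300-600 range quoted above. A weaker expected effect (a hazard ratio closer to 1) pushes the requirement up quickly.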

Talking Points¶

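"Notice how the system referenced KEYNOTE-024 as the benchmark trial -- the study that established pembrolizumab monotherapy as a standard of care for this indication"
"The system recommends PFS or OS as the primary endpoint, with OS as a secondary endpoint when PFS is primary -- the endpoints regulators typically look for in first-line NSCLC"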
"The 36% historical success rate is calibrated from BIO/QLS Advisors data across thousands of oncology trials"

5. Demo Scenario 2: Patient-Trial Matching¶

Story: "A 58-year-old male with metastatic NSCLC, PD-L1 TPS 80%, EGFR wild-type, no prior systemic therapy, and ECOG 1 wants to find matching clinical trials."

Step 1: Enter Patient Profile¶

Go to Tab 2 (Patient Matching) and enter:

Field	Value
Age	58
Sex	Male
Diagnosis	Metastatic non-small cell lung cancer
Biomarkers	PD-L1 TPS 80%, EGFR wild-type, ALK negative
Medications	None (treatment-naive)
Genomic variants	KRAS G12C
Comorbidities	Hypertension (controlled)
Location	Boston, MA

Expected Output¶

Multiple trial matches with per-criterion scoring
High match score for first-line IO trials (PD-L1 >= 50%)
KRAS G12C targeted therapy trials flagged
Controlled hypertension should pass cardiac exclusion criteria
Nearby Boston-area trial sites listed

Talking Points¶

"Each eligibility criterion gets its own confidence score -- the system doesn't just say yes or no"
"Notice the KRAS G12C variant triggered an additional match for targeted therapy trials"
"The cross-agent trigger sent this to the oncology agent for molecular match confirmation"
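
For a scripted version of this scenario, the same profile can be sent to the `/v1/trial/match` endpoint shown in section 9.2. The sketch below builds the JSON body using only the four fields that appear in that example (`age`, `sex`, `diagnosis`, `biomarkers`). Passing KRAS G12C as a biomarker string is a choice made for this example. Medications, comorbidities, and location are left out because their field names are not documented in this guide.

```python
import json

# Scenario 2 patient, restricted to the fields shown in the section 9.2 example.
patient = {
    "age": 58,
    "sex": "male",
    "diagnosis": "Metastatic non-small cell lung cancer",
    "biomarkers": [
        "PD-L1 TPS 80%",
        "EGFR wild-type",
        "ALK negative",
        "KRAS G12C",  # assumption: genomic variant passed as a biomarker string
    ],
}

body = json.dumps({"patient": patient}, indent=2)
print(body)
```

The printed body can be saved to a file and posted with `curl -d @patient.json`, using the same headers as the 9.2 example.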

6. Demo Scenario 3: Eligibility Optimization¶

Story: "A trial protocol has 28 eligibility criteria and is struggling with enrollment. The medical team wants to know which criteria could be broadened without compromising safety."

Step 1: Submit Criteria for Analysis¶

Go to Tab 1 (Trial Intelligence) and type:

"Analyze these eligibility criteria for enrollment impact: ECOG 0-1, no prior immunotherapy, no CNS metastases, hemoglobin >= 10 g/dL, no autoimmune disease, creatinine clearance >= 30 mL/min, no prior organ transplant, ejection fraction >= 50%, pregnancy or lactation excluded"

Expected Output¶

The system should return criteria ranked by population impact:

Criterion	Population Impact	Recommendation
Pregnancy or lactation	50%	RETAIN with monitoring
ECOG 0-1	25%	REVIEW: consider ECOG 0-2
No prior immunotherapy	20%	BROADEN: weak justification for some lines
No CNS metastases	15%	REVIEW: stable CNS may be includable
Creatinine clearance >= 30 mL/min	15%	REVIEW: could lower threshold
Ejection fraction >= 50%	8%	RETAIN: cardiac safety justified
No autoimmune disease	8%	REVIEW: mild autoimmune may be safe

Talking Points¶

"The system identified that 'no prior immunotherapy' excludes 20% of the population but has weak scientific justification for this particular trial"
"Each criterion's population impact is calibrated against real-world data -- not guesswork"
"RETAIN recommendations have strong scientific backing; BROADEN recommendations have high impact with weak justification"
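
The ranking and the RETAIN/REVIEW/BROADEN labels can be illustrated with a small sketch. The population-impact figures are copied from the table above. The `justification` ratings and the 20% threshold are hypothetical values chosen to reproduce those labels; they are not the agent's actual scoring logic.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str
    impact: float        # fraction of the population excluded (from the table)
    justification: str   # "strong", "moderate", or "weak" (hypothetical rating)


def recommend(c, high_impact=0.20):
    """Hypothetical rule: keep well-justified criteria, broaden high-impact
    criteria with weak justification, and flag everything else for review."""
    if c.justification == "strong":
        return "RETAIN"
    if c.justification == "weak" and c.impact >= high_impact:
        return "BROADEN"
    return "REVIEW"


criteria = [
    Criterion("Pregnancy or lactation", 0.50, "strong"),
    Criterion("ECOG 0-1", 0.25, "moderate"),
    Criterion("No prior immunotherapy", 0.20, "weak"),
    Criterion("No CNS metastases", 0.15, "moderate"),
    Criterion("Creatinine clearance >= 30", 0.15, "moderate"),
    Criterion("Ejection fraction >= 50%", 0.08, "strong"),
    Criterion("No autoimmune disease", 0.08, "moderate"),
]

# Rank by population impact, largest first. sorted() is stable, so criteria
# with equal impact keep their input order.
ranked = sorted(criteria, key=lambda c: c.impact, reverse=True)
for c in ranked:
    print(f"{c.name:<28} {c.impact:>4.0%}  {recommend(c)}")
```

Keeping impact and justification as separate inputs mirrors the talking point above: a criterion is only a candidate for broadening when it excludes many patients and the case for keeping it is weak.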

7. Demo Scenario 4: Competitive Intelligence¶

Story: "The sponsor wants to understand the competitive landscape for GLP-1 receptor agonists in obesity before starting their Phase 3 program."

Step 1: Query the Landscape¶

Go to Tab 4 (Competitive Landscape) or type in Tab 1:

"What is the competitive landscape for GLP-1 agonists in obesity Phase 3 trials? Include semaglutide, tirzepatide, and survodutide."

Expected Output¶

Competitor profiles for SURMOUNT-1 (tirzepatide), SELECT (semaglutide), and others
Threat scores with classification:
Tirzepatide/Zepbound: Critical (0.85+) -- Phase 3 completed, 22.5% weight loss
Semaglutide/Wegovy: Critical (0.90+) -- Approved, MACE benefit proven
Survodutide: High (0.65) -- Phase 3 ongoing
Enrollment progress comparison
Differentiation analysis (mechanism similarity)

Talking Points¶

"The threat model considers four factors: phase, enrollment, sponsor resources, and mechanism similarity"
"Semaglutide scores as critical because it's already approved AND has cardiovascular outcome data"
"This analysis helps sponsors decide whether to differentiate on efficacy, safety, convenience, or a new endpoint"
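
A four-factor model like this can be sketched as a weighted sum followed by a threshold lookup. Only the factor names and the five class labels come from this guide. The weights, the cut-offs, and the example factor values below are hypothetical, chosen so that the results line up with the example scores shown above (0.85+ is critical, 0.65 is high).

```python
# Hypothetical weights for the four factors; they sum to 1.0.
WEIGHTS = {"phase": 0.35, "enrollment": 0.20, "sponsor": 0.20, "similarity": 0.25}

# Hypothetical cut-offs, checked from highest to lowest.
THRESHOLDS = [(0.80, "critical"), (0.60, "high"), (0.40, "moderate"), (0.20, "low")]


def threat_score(factors):
    """Weighted sum of factor scores, each expected in the range 0..1."""
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


def classify(score):
    """Map a 0..1 score to one of the five threat classes."""
    for cutoff, label in THRESHOLDS:
        if score >= cutoff:
            return label
    return "minimal"


# Illustrative inputs for an approved competitor with a similar mechanism.
approved = {"phase": 1.0, "enrollment": 1.0, "sponsor": 0.9, "similarity": 0.8}
score = threat_score(approved)
print(f"{score:.2f} {classify(score)}")
```

Separating the score from the classification makes it easy to explain during the demo why two competitors in the same class can still differ in their underlying numbers.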

8. Demo Scenario 5: Adaptive Design Selection¶

Story: "A biotech company is developing a novel gene therapy for a rare disease (hemophilia A) and wants to know which adaptive design is best for their Phase 2/3 program."

Step 1: Ask About Adaptive Designs¶

Type in Tab 1:

"What adaptive trial design should I use for a Phase 2/3 seamless gene therapy trial in hemophilia A? Consider the small patient population and uncertain dose-response."

Expected Output¶

Recommended design: Seamless Phase 2/3 with sample size re-estimation
Alternative designs: Response-adaptive randomization, biomarker-adaptive enrichment
Rationale: Small population (rare disease), uncertain effect size, long follow-up needed
Regulatory guidance: FDA Guidance on Adaptive Designs (2019), RMAT designation eligible
Precedent trials: SPRINT_SMA (gene therapy adaptive design), Zolgensma experience
Statistical considerations: Bayesian framework, external historical controls, LTFU requirements
Success rate estimate: ~55% for Phase 3 rare disease (higher than average)

Talking Points¶

"The system recommended seamless Phase 2/3 because it eliminates the inter-trial gap -- critical for rare diseases where every patient matters"
"As a gene therapy, the program may qualify for FDA RMAT designation, which opens the door to expedited development and review"
"The ~55% Phase 3 success rate for rare diseases is higher than the overall average of ~50%, reflecting smaller trials and often dramatic clinical effects"

9. API Demo Queries¶

For technical audiences, demonstrate the REST API directly:

9.1 RAG Query¶

curl -X POST http://localhost:8538/v1/trial/query \
    -H "Content-Type: application/json" \
    -d '{
        "question": "What biomarker strategies are used in NSCLC immunotherapy trials?",
        "workflow_type": "protocol_design",
        "top_k": 5
    }'

9.2 Patient Matching¶

curl -X POST http://localhost:8538/v1/trial/match \
    -H "Content-Type: application/json" \
    -d '{
        "patient": {
            "age": 65,
            "sex": "female",
            "diagnosis": "HER2-low metastatic breast cancer",
            "biomarkers": ["HER2 IHC 1+", "HR-positive", "BRCA wild-type"]
        }
    }'

9.3 Safety Signal Detection¶

curl -X POST http://localhost:8538/v1/trial/safety/signal \
    -H "Content-Type: application/json" \
    -d '{
        "events": [
            {"event_type": "hepatotoxicity", "severity": "grade_3", "frequency": 0.08},
            {"event_type": "neutropenia", "severity": "grade_4", "frequency": 0.12}
        ],
        "trial_id": "NCT00000001"
    }'

9.4 Knowledge Version¶

curl -s http://localhost:8538/v1/trial/knowledge-version | python -m json.tool

9.5 Therapeutic Areas¶

curl -s http://localhost:8538/v1/trial/therapeutic-areas | python -m json.tool
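
For audiences who prefer Python to curl, the same calls can be made with the standard library. The sketch below is a minimal example, assuming the endpoints return JSON (as the `json.tool` pipes above suggest). `build_request` and `call_api` are illustrative helper names, not part of the agent's codebase.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8538"  # API port used throughout this guide


def build_request(path, payload=None):
    """Build a GET request, or a JSON POST when a payload is given."""
    if payload is None:
        return urllib.request.Request(BASE_URL + path)
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(path, payload=None, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same body as the RAG query in 9.1.
query = {
    "question": "What biomarker strategies are used in NSCLC immunotherapy trials?",
    "workflow_type": "protocol_design",
    "top_k": 5,
}
request = build_request("/v1/trial/query", query)
print(request.get_method(), request.full_url)
```

With the API running, `call_api("/v1/trial/query", query)` sends the request, and `call_api("/v1/trial/therapeutic-areas")` covers the GET endpoints.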

10. Talking Points¶

For Executive Audiences¶

"This agent reduces the time to generate an evidence-based protocol blueprint from weeks to seconds"
"40 landmark trials from KEYNOTE-024 to RECOVERY to CLARITY-AD serve as real-world design precedents"
"The system covers 13 therapeutic areas, from oncology to gene therapy, with calibrated success rates"
"Running on DGX Spark, this integrates with genomics, drug discovery, and four peer AI agents"

For Clinical Operations Audiences¶

"Patient-trial matching evaluates every eligibility criterion with confidence scoring"
"Eligibility optimization identifies criteria that exclude patients without scientific justification"
"Site selection considers enrollment history, diversity index, and screen failure rates"
"Enrollment predictions use prevalence, competition, and phase-specific factors"

For Regulatory Affairs Audiences¶

"The system references ICH E6(R3), E9(R1), and E8(R1) guidelines"
"9 regulatory agencies covered with approval pathways and expedited programs"
"Adaptive design recommendations include specific FDA guidance citations"
"Every recommendation includes evidence level classification (A1 through E)"

For Data Science Audiences¶

"14 Milvus vector collections with IVF_FLAT indexing and COSINE similarity"
"384-dimensional BGE-small-en-v1.5 embeddings for clinical trial text"
"Workflow-specific collection weight boosting for domain-relevant retrieval"
"Calibrated confidence model: evidence base (0.3) + raw confidence (0.3) + documents (0.2) + agreement (0.2)"
"769 tests in 0.47 seconds -- 100% pass rate"
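
The calibrated confidence model quoted above is a weighted sum whose weights add up to 1.0. The sketch below shows that arithmetic. The weights come from the talking point. The dictionary keys, the clamping step, and the assumption that every component is already scaled to 0..1 are illustrative.

```python
# Weights quoted in the talking point above; they sum to 1.0.
WEIGHTS = {
    "evidence_base": 0.3,
    "raw_confidence": 0.3,
    "documents": 0.2,
    "agreement": 0.2,
}


def calibrated_confidence(components):
    """Weighted sum of component scores, each clamped to the range 0..1."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, components[name]))
        total += weight * value
    return total


# Illustrative component scores, not output from the agent.
example = {
    "evidence_base": 0.9,
    "raw_confidence": 0.8,
    "documents": 0.5,
    "agreement": 1.0,
}
print(round(calibrated_confidence(example), 2))
```

Because the weights sum to 1.0, the result stays in the 0..1 range, and no single component can contribute more than its weight to the final score.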

11. FAQ¶

Q: Does this replace clinical trial design teams?
A: No. This is a decision support tool that accelerates evidence retrieval and provides calibrated recommendations. All outputs should be reviewed by qualified clinical, regulatory, and statistical professionals.

Q: How current is the trial data?
A: The knowledge base contains 40 curated landmark trials and comprehensive reference data. Scheduled ClinicalTrials.gov updates are available through the ingest pipeline at a configurable interval (default: every 24 hours).

Q: What happens if Milvus is down?
A: The system degrades gracefully. All 10 workflows, 5 decision support engines, and the knowledge base continue to function. Only vector search results are unavailable.

Q: What happens without an LLM API key?
A: The system operates in search-only mode. Vector searches, workflow execution, and decision support engines all function normally. Natural language synthesis is unavailable.

Q: Can this be used for regulatory submissions?
A: The system generates draft documents and evidence summaries. All regulatory submissions require expert review, validation, and formatting per agency-specific requirements.

Q: What therapeutic areas are supported?
A: All 13 major therapeutic areas: oncology, cardiovascular, neuroscience, immunology, infectious disease, rare diseases, metabolic/endocrinology, respiratory, hematology, gastroenterology, dermatology, ophthalmology, and gene/cell therapy.

12. Recovery Procedures¶

If the API Crashes¶

# Restart API
uvicorn api.main:app --host 0.0.0.0 --port 8538

If the UI Freezes¶

# Stop the frozen process (Ctrl+C in its terminal), then restart Streamlit.
# Open browser sessions reconnect automatically.
streamlit run app/trial_ui.py --server.port 8128

If Milvus Is Unreachable¶

# Check Milvus status
docker ps | grep milvus

# Restart if needed
docker restart milvus-standalone

# Verify
curl localhost:8538/health
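
After a restart, the health endpoint may take a few seconds to respond. Instead of re-running curl by hand, a short polling loop can wait for it. The sketch below assumes only the `/health` URL used above. The function names, the retry count, and the delay are illustrative choices.

```python
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:8538/health"


def is_healthy(url=HEALTH_URL, timeout=3.0):
    """True when the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False  # not reachable yet, or answered with an error status


def wait_until_healthy(probe=is_healthy, attempts=10, delay=2.0, sleep=time.sleep):
    """Poll the probe until it succeeds; return False after the last attempt."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no need to sleep after the final attempt
    return False
```

With the defaults, `wait_until_healthy()` allows roughly 20 seconds of retry delay for the service to come back, which is usually enough to keep talking through a slide before returning to the live demo.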

If Searches Return Empty¶

# Re-seed knowledge base
python scripts/seed_knowledge.py

# If searches are still empty, re-create the collections and seed again
python scripts/setup_collections.py
python scripts/seed_knowledge.py

Clinical Trial Intelligence Agent v2.0.0 -- Demo Guide -- March 2026