HCLS AI Factory — Learning Guide: Unified Advanced¶
Author: Adam Jones | License: Apache 2.0 | Date: March 2026
Part 1: Advanced Genomic Analysis¶
Custom Variant Filtering Strategies¶
The default genomics pipeline (Parabricks 4.6 + DeepVariant) produces a comprehensive VCF containing every called variant. Real clinical workflows demand surgical filtering. The strategies below go beyond bcftools view -f PASS.
Tiered Clinical Filtering
A three-tier approach mirrors clinical reporting standards:
| Tier | Criteria | Typical Count |
|---|---|---|
| Tier 1 — Actionable | QUAL >= 100, DP >= 30, GQ >= 40, ClinVar Pathogenic/Likely Pathogenic | ~50-200 |
| Tier 2 — Potentially Significant | QUAL >= 50, DP >= 20, GQ >= 30, ClinVar VUS + AlphaMissense >= 0.8 | ~500-2,000 |
| Tier 3 — Observed | PASS filter, DP >= 10, population AF < 0.01 | ~20,000-50,000 |
# Tier 1: Actionable variants only
bcftools view -f PASS input.vcf.gz \
| bcftools filter -i 'QUAL>=100 && INFO/DP>=30 && FORMAT/GQ>=40' \
| bcftools annotate -a clinvar.vcf.gz -c INFO/CLNSIG \
| bcftools filter -i 'INFO/CLNSIG~"Pathogenic"'
Gene Panel Filtering
When the clinical question targets a specific condition, restrict analysis to curated gene panels:
# Extract regions from a BED file (e.g., hereditary cancer panel — 84 genes)
bcftools view -R cancer_panel.bed -f PASS sample.vcf.gz > panel_variants.vcf
The HCLS AI Factory ships with pre-built BED files for 13 therapeutic areas aligned to the intelligence agents.
Population Frequency Filtering
Filter against gnomAD allele frequencies to surface rare variants:
# Retain variants with AF < 0.001 or absent from gnomAD
bcftools filter -i 'INFO/gnomAD_AF<0.001 || INFO/gnomAD_AF="."' input.vcf.gz
Multi-Sample and Trio Analysis Patterns¶
Trio Analysis (Proband + Parents)
Parabricks 4.6 supports joint calling. For de novo variant detection:
# Joint calling with Parabricks DeepVariant
pbrun deepvariant \
--ref reference.fasta \
--in-bam proband.bam,mother.bam,father.bam \
--out-variants trio_joint.vcf
# De novo filtering: variant present in proband, absent in both parents
bcftools view -s proband trio_joint.vcf \
| bcftools filter -i 'GT="0/1" || GT="1/1"' \
| bcftools isec -C -w1 - mother_only.vcf father_only.vcf
Multi-Sample Cohort Analysis
For cohort studies (e.g., tumor-normal pairs or family groups), use merged VCFs with sample-specific genotype fields. The RAG pipeline can ingest multi-sample VCFs and maintain per-sample lineage in the genomic_evidence Milvus collection.
Structural Variant Detection¶
The default pipeline focuses on SNVs and small indels. Structural variants (SVs) require dedicated tools:
| Tool | SV Types | Integration Point |
|---|---|---|
| Parabricks pbsv | DEL, INS, DUP, INV, BND | Direct GPU-accelerated calling |
| Manta | DEL, INS, DUP, INV, BND | Post-alignment, CPU |
| CNVkit | Copy number variants | BAM coverage analysis |
SVs are annotated against ClinVar SV records and can be ingested into the RAG pipeline using the genomic_evidence collection with variant_type metadata filtering.
ClinVar + AlphaMissense Annotation Deep Dive¶
The platform cross-references two major annotation databases:
ClinVar (4.1M records)
- Variant-level clinical significance: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign
- Review status (0-4 stars) indicates evidence strength
- Condition associations with OMIM and MedGen identifiers
- Submission history tracks reclassification over time
AlphaMissense (71M predictions)
- Machine learning pathogenicity scores (0.0 to 1.0) for all possible single amino acid substitutions
- Thresholds: >= 0.564 likely pathogenic, <= 0.340 likely benign
- Covers missense variants that ClinVar has not yet classified
- Particularly valuable for VUS reclassification support
Cross-Reference Strategy
The RAG pipeline performs a layered annotation:
- Check ClinVar first — if Pathogenic/Likely Pathogenic with >= 2 stars, use as primary evidence
- For VUS or missing ClinVar entries, check AlphaMissense score
- Concordance between ClinVar and AlphaMissense strengthens confidence
- Discordance flags the variant for manual review
VCF Quality Metrics Interpretation¶
| Metric | Field | Good | Marginal | Poor |
|---|---|---|---|---|
| Variant Quality | QUAL | >= 100 | 30-100 | < 30 |
| Read Depth | DP | >= 30x | 15-30x | < 15x |
| Genotype Quality | GQ | >= 40 | 20-40 | < 20 |
| Allele Depth | AD | Ref/Alt balanced for het | Skewed > 4:1 | Single-strand only |
| Mapping Quality | MQ | >= 50 | 30-50 | < 30 |
| Strand Bias | FS | < 60 | 60-200 | > 200 |
Interpreting AD (Allele Depth) for heterozygous calls:
A heterozygous SNV should show roughly 50/50 reference/alternate reads. A call showing AD=95,5 at DP=100 is suspicious — possibly a sequencing artifact or somatic mosaic variant. The expected range for germline heterozygous calls is 30-70% alternate allele fraction.
Parabricks 4.6 Performance Tuning on DGX Spark¶
The NVIDIA DGX Spark (Grace Blackwell GB10, 128GB unified memory) runs the full genomics pipeline. Key tuning parameters:
| Parameter | Default | Tuned | Impact |
|---|---|---|---|
--num-gpus |
1 | 1 (GB10 is single-GPU) | N/A |
--bwa-options |
default | -K 100000000 -Y |
Improves reproducibility |
--run-partition-size |
25M | 50M | Fewer partitions, less overhead |
--memory-limit |
auto | 100G | Reserve 28GB for system + Milvus |
Benchmark on DGX Spark:
| Stage | Time (30x WGS) |
|---|---|
| BWA-MEM2 alignment | ~45 min |
| DeepVariant calling | ~60 min |
| Annotation + VCF post-processing | ~15 min |
| Total | ~120 min |
This compares to 24-48 hours on a 32-core CPU-only server.
Part 2: Advanced RAG Architecture¶
Milvus 2.4 Collection Design Patterns¶
Every intelligence agent in the HCLS AI Factory follows a consistent Milvus collection design. Understanding these patterns is essential for extending or customizing the platform.
Index Configuration
All collections use the same core index parameters:
index_params = {
"index_type": "IVF_FLAT",
"metric_type": "COSINE",
"params": {"nlist": 128}
}
search_params = {
"metric_type": "COSINE",
"params": {"nprobe": 16}
}
- IVF_FLAT: Inverted file index with flat (exact) distance computation within clusters. Chosen over HNSW for its lower memory footprint and deterministic results.
- COSINE: Cosine similarity metric. Normalized embeddings make this equivalent to inner product but more interpretable (0.0 to 1.0 scale).
- nlist=128: Number of Voronoi cells. With collections ranging from hundreds to millions of vectors, 128 provides a good balance between recall and speed.
- nprobe=16: Number of cells to search at query time (12.5% of nlist). Increasing to 32 improves recall by ~2% but doubles search latency.
Schema Pattern
Every collection follows this field layout:
| Field | Type | Purpose |
|---|---|---|
id |
INT64 (auto-increment) | Primary key |
embedding |
FLOAT_VECTOR(384) | BGE-small-en-v1.5 embedding |
text |
VARCHAR(65535) | Source text chunk |
source |
VARCHAR(512) | Provenance (PubMed ID, URL, file path) |
metadata |
VARCHAR(65535) | JSON-encoded domain metadata |
created_at |
VARCHAR(64) | ISO 8601 timestamp |
| Domain-specific fields | Various | Filterable attributes (e.g., cancer_type, target_antigen, gene_symbol) |
Shared genomic_evidence Collection
All 11 agents mount the genomic_evidence collection as read-only. This collection holds 3.56M vectors derived from the Stage 2 RAG pipeline (ClinVar + AlphaMissense annotations). Agents query it to enrich domain-specific analysis with genomic context without duplicating the data.
Collection Sizing Across Agents
| Agent | Owned Collections | Approx. Vectors | Shared |
|---|---|---|---|
| Precision Oncology | 10 | ~10,000+ | +genomic_evidence |
| CAR-T Intelligence | 10 | 6,266 | +genomic_evidence |
| Precision Biomarker | 10 | ~320 | +genomic_evidence |
| Cardiology | 12 | ~5,000+ | +genomic_evidence |
| Neurology | 12 | ~5,000+ | +genomic_evidence |
| Autoimmune | 13 | ~5,000+ | +genomic_evidence |
| Rare Disease | 13 | ~5,000+ | +genomic_evidence |
| Pharmacogenomics | 14 | ~5,000+ | +genomic_evidence |
| Imaging | 10 | 2,814+ | +genomic_evidence |
| Single-Cell | 12 | ~5,000+ | +genomic_evidence |
| Clinical Trial | 13 | ~5,000+ | +genomic_evidence |
BGE-small-en-v1.5 Embedding Model¶
Why BGE-small-en-v1.5?
| Property | Value |
|---|---|
| Dimensions | 384 |
| Model size | 33M parameters (~130 MB) |
| Max sequence length | 512 tokens |
| Embedding speed (DGX Spark) | ~1,200 texts/sec |
| MTEB retrieval score | 51.68 |
The model was chosen for three reasons:
- Efficiency — 384 dimensions keeps memory manageable at scale (3.56M vectors x 384 dims x 4 bytes = ~5.2 GB for genomic_evidence alone)
- Quality — Top-tier retrieval performance among small models; sufficient for domain-specific retrieval when combined with query expansion
- On-device inference — Runs entirely on the DGX Spark without requiring a separate embedding service
Limitations
- 512-token context window means long clinical documents must be chunked (default: 2,500 characters with 200-character overlap)
- General-purpose English model — not fine-tuned for biomedical text. Query expansion compensates for vocabulary gaps.
- Does not understand structured data (tables, lab results) natively. The ingest pipeline converts structured data to natural language descriptions before embedding.
Asymmetric Query Prefix
BGE models use an instruction prefix for queries (but not for documents):
# At query time
query_embedding = model.encode("Represent this sentence for searching relevant passages: " + query)
# At ingest time (no prefix)
doc_embedding = model.encode(document_text)
This asymmetric encoding improves retrieval accuracy by ~3-5% on domain benchmarks.
Multi-Collection Parallel Search Strategy¶
The core architectural pattern across all 11 agents is parallel multi-collection search. A single user query fans out to all collections simultaneously.
Search Flow
User Query
|
v
[Query Expansion] -- add domain synonyms and related terms
|
v
[BGE Embedding] -- 384-dim vector
|
v
[Parallel Search] -- fan out to N collections simultaneously
| | | | ... |
v v v v v
Coll1 Coll2 Coll3 Coll4 ... CollN
| | | | ... |
v v v v v
[Merge + Deduplicate] -- by content hash
|
v
[Score Normalization] -- weight by collection relevance
|
v
[Top-K Selection] -- default: 5 per collection, 30 total max
|
v
[Claude Synthesis] -- grounded response with citations
Weighted Collection Scoring
Each agent assigns weights to its collections. The final relevance score combines cosine similarity with collection weight:
The Autoimmune Agent provides a good example with weights ranging from 0.18 (clinical documents) down to 0.02 (genomic evidence and cross-disease). This ensures that highly relevant clinical notes outrank tangentially related genomic data.
Performance on DGX Spark
| Metric | Value |
|---|---|
| Single-collection search (top-5) | 2-5 ms |
| 11-collection parallel search | 12-16 ms |
| Dual retrieval (comparative mode) | ~365 ms |
| Full RAG with Claude synthesis | ~24 sec |
| End-to-end with streaming | First token ~3 sec |
Total query-to-response time stays under 5 seconds for retrieval; the majority of latency comes from LLM synthesis.
Query Expansion with Domain Keyword Maps¶
Raw user queries often miss domain-specific terminology. Each agent maintains expansion maps that inject synonyms and related terms.
Example: CAR-T Agent Expansion
The CAR-T agent maintains 12 expansion maps covering 169 seed keywords that expand to 1,496 related terms:
| Map Category | Seed Keywords | Expanded Terms |
|---|---|---|
| Target Antigen | CD19, BCMA, CD22 | B-cell maturation antigen, TNFRSF17, ... |
| Toxicity | CRS, ICANS | cytokine release syndrome, neurotoxicity, tocilizumab, ... |
| Manufacturing | transduction, expansion | lentiviral vector, T-cell activation, ... |
Expansion Logic
def expand_query(query: str) -> str:
expanded_terms = []
for keyword, synonyms in expansion_maps.items():
if keyword.lower() in query.lower():
expanded_terms.extend(synonyms[:5]) # Top 5 synonyms
return f"{query} {' '.join(expanded_terms)}"
The expanded query is embedded alongside the original, and both embedding vectors are used in the parallel search. This typically improves recall by 15-25% on domain-specific queries.
Claude LLM Prompt Engineering for Clinical RAG Synthesis¶
All agents use a structured prompt template for Claude synthesis:
System Prompt Structure
You are a {domain} clinical decision support AI.
You must:
1. Ground every claim in the retrieved evidence
2. Cite sources using [Collection:ID] format
3. Flag clinical alerts (contraindications, critical values)
4. Acknowledge uncertainty when evidence is limited
5. Never fabricate references or statistics
Patient context: {patient_context}
Conversation history: {last_N_turns}
Evidence Injection Format
Retrieved evidence is injected between XML-style tags:
<evidence>
[Literature:PMID_12345678] (score: 0.87) EGFR mutations in non-small cell lung
cancer respond to erlotinib with a median PFS of 10.4 months...
[Trials:NCT04012345] (score: 0.82) Phase III trial of osimertinib vs.
comparator in EGFR T790M-positive NSCLC...
</evidence>
Key Prompt Engineering Patterns
- Citation enforcement — The system prompt requires
[Collection:Source]format. Claude rarely hallucinates citations when the format is explicitly specified and evidence is provided. - Confidence calibration — Agents instruct Claude to state "Based on limited evidence..." or "No relevant evidence found..." rather than generating speculative answers.
- Streaming — All agents support SSE streaming for real-time token delivery. The first token typically arrives within 3 seconds.
Cross-Modal Triggers Between Agents¶
Agents communicate through an event-driven architecture. When one agent detects a finding that is relevant to another domain, it publishes a cross-modal event.
Example Trigger Chains
| Source Agent | Trigger Condition | Target Agent | Action |
|---|---|---|---|
| Imaging | Lung-RADS 4A+ nodule | Oncology | Query EGFR/ALK/ROS1 variants in genomic_evidence |
| Oncology | BRCA1/2 pathogenic variant | Clinical Trial | Search for PARP inhibitor trials |
| Rare Disease | HPO phenotype match score > 0.8 | Pharmacogenomics | Check PGx implications for candidate therapies |
| Autoimmune | Flare risk > 0.8 (imminent) | Biomarker | Pull longitudinal CRP/ESR/complement trends |
| Single-Cell | Antigen escape detected | CAR-T | Flag resistance mechanism and alternative targets |
| Cardiology | EF < 35% detected | Oncology | Check cardiotoxicity risk for proposed chemotherapy |
Event Format
{
"event_type": "cross_modal_trigger",
"source_agent": "imaging_intelligence",
"target_agent": "precision_oncology",
"trigger": "lung_rads_4a_detected",
"payload": {
"finding": "Spiculated nodule 18mm RUL",
"query": "EGFR ALK ROS1 KRAS lung adenocarcinoma variants"
}
}
Performance Optimization¶
Achieving Sub-5-Second Search Latency
- Pre-load collections — All collections are loaded into memory at agent startup. Milvus
load()is called during the health check lifecycle. - Connection pooling — A single
MilvusClientconnection is reused across requests. Connection setup (~200ms) is amortized. - Embedding cache — Frequently repeated queries (e.g., gene names, drug names) are cached with an LRU cache (default 1,024 entries).
- Batch embedding — During ingest, texts are embedded in batches of 64 to maximize GPU throughput.
- Collection partitioning — Large collections (genomic_evidence at 3.56M vectors) use partition keys (e.g., chromosome) for targeted search.
Memory Budget on DGX Spark (128GB unified)
| Component | Memory |
|---|---|
| Milvus (all collections loaded) | ~25 GB |
| BGE-small-en-v1.5 model | ~0.5 GB |
| Python agent processes (11 agents) | ~8 GB |
| System + Docker overhead | ~10 GB |
| Parabricks (when running) | ~60 GB |
| Available headroom | ~24 GB |
Part 3: Deep Dive — All 11 Intelligence Agents¶
Each agent follows the same foundational architecture: FastAPI REST server, Streamlit UI, multi-collection Milvus RAG engine, BGE-small-en-v1.5 embeddings (384-dim), and Claude LLM synthesis. What differs is the domain knowledge, clinical workflows, and cross-agent integration points.
3.1 Precision Oncology Agent (:8527)¶
Overview and Clinical Purpose
The Precision Oncology Agent is a closed-loop clinical decision support system designed for molecular tumor board (MTB) workflows. Given a patient's somatic and germline VCF, it identifies actionable variants, matches them against oncology knowledge bases, ranks candidate therapies, surfaces relevant clinical trials, and assembles a structured MTB-ready report. The entire pipeline runs on a single NVIDIA DGX Spark ($4,699, March 2026).
Advanced Configuration Options
| Variable | Default | Description |
|---|---|---|
ONCO_AGENT_PORT |
8527 |
FastAPI listen port |
MILVUS_HOST |
localhost |
Milvus server hostname |
MILVUS_PORT |
19530 |
Milvus gRPC port |
MILVUS_POOL_SIZE |
10 |
Connection pool size; increase for concurrent MTB sessions |
ANTHROPIC_API_KEY |
(required) | Claude API key for LLM synthesis |
LLM_MODEL |
claude-sonnet-4-20250514 |
Claude model ID; upgrade to claude-opus-4-20250514 for complex cases |
EMBEDDING_MODEL |
BAAI/bge-small-en-v1.5 |
Sentence-transformer for 384-dim embeddings |
TOP_K |
15 |
Max chunks retrieved per collection search |
EVIDENCE_THRESHOLD |
0.65 |
Minimum cosine similarity for evidence inclusion |
CROSS_AGENT_TIMEOUT |
30 |
Seconds to wait for cross-agent responses |
Milvus Collections (11)
| Collection | Description | Source |
|---|---|---|
onco_literature |
PubMed/PMC chunks by cancer type | PubMed E-utils |
onco_trials |
Clinical trial summaries with biomarker criteria | ClinicalTrials.gov |
onco_variants |
Actionable somatic/germline variants | CIViC, OncoKB |
onco_biomarkers |
Predictive/prognostic biomarkers and assays | CIViC, literature |
onco_therapies |
Approved and investigational therapies with MOA | OncoKB, FDA labels |
onco_pathways |
Signaling pathways, cross-talk, druggable nodes | KEGG, Reactome |
onco_guidelines |
NCCN/ASCO/ESMO guideline recommendations | Guideline PDFs |
onco_resistance |
Resistance mechanisms and bypass strategies | CIViC, literature |
onco_outcomes |
Real-world treatment outcome records | De-identified RWD |
onco_cases |
De-identified patient case snapshots | Synthetic/RWD |
genomic_evidence |
VCF-derived evidence (shared, read-only) | Stage 1 pipeline |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with component status |
| GET | /collections |
List all loaded collections with record counts |
| GET | /knowledge/stats |
Knowledge base statistics (total vectors, last ingest) |
| GET | /metrics |
Prometheus-compatible metrics export |
| POST | /query |
Free-text oncology RAG query with evidence synthesis |
| POST | /search |
Direct vector similarity search across collections |
| POST | /find-related |
Find related concepts given an entity (gene, drug, pathway) |
| POST | /v1/onco/integrated-assessment |
Full multi-collection integrated assessment |
| POST | /api/ask |
Meta-agent question answering with chain-of-thought |
| POST | /api/cases |
Create a new patient case with cancer type and variants |
| GET | /api/cases/{case_id} |
Retrieve case details and analysis status |
| POST | /api/cases/{case_id}/mtb |
Run full molecular tumor board analysis for a case |
| GET | /api/cases/{case_id}/variants |
List annotated variants for a case |
| POST | /api/trials/match |
Match molecular profile to clinical trials |
| POST | /api/trials/match-case/{case_id} |
Match a stored case to clinical trials |
| POST | /api/therapies/rank |
Rank therapies by evidence level and biomarker fit |
| POST | /api/reports/generate |
Generate MTB report (Markdown, JSON, PDF, FHIR R4) |
| GET | /api/reports/{case_id}/{fmt} |
Download generated report in specified format |
| GET | /api/events |
List cross-agent events (SSE stream) |
| GET | /api/events/{event_id} |
Retrieve a specific event by ID |
Detailed Clinical Workflow Walkthrough
- Variant submission -- Clinician uploads somatic/germline VCF via Streamlit UI or POSTs variant list to
/api/cases. The VCF parser extracts gene, protein change, allele frequency, and quality metrics. - Entity detection -- The agent identifies gene names (EGFR, BRAF, ALK), variant notations (L858R, V600E), and cancer types from free-text or structured input.
- Evidence-level annotation -- Variants are annotated against CIViC and OncoKB with AMP/ASCO/CAP evidence-level tiering (Level IA through Level III). Each variant receives a clinical significance classification.
- Multi-collection retrieval -- Parallel search across all 11 collections: therapies, biomarkers, trials, resistance mechanisms, pathways, guidelines, outcomes, and cases. Each result carries a cosine similarity score.
- TherapyRanker scoring -- Approved and investigational therapies are scored using a composite of evidence level (40%), biomarker match (25%), resistance profile (15%), guideline concordance (10%), and trial availability (10%).
- TrialMatcher execution -- Molecular profile is matched against eligibility criteria in
onco_trials. Results include NCT IDs, phase, enrollment status, and distance to nearest trial sites. - Claude LLM synthesis -- All retrieved evidence is assembled into a structured prompt. Claude generates a narrative summary with inline citations, therapy recommendations, and resistance warnings.
- MTB report generation -- Structured report exported as Markdown, JSON, PDF, or FHIR R4 DiagnosticReport with tiered evidence, therapy rankings, trial matches, and resistance alerts.
Example Query and Response
curl -X POST http://localhost:8527/query \
-H "Content-Type: application/json" \
-d '{"question": "What targeted therapies are available for EGFR L858R in NSCLC?"}'
{
"answer": "EGFR L858R is a Level IA actionable target in NSCLC. Osimertinib (3rd-gen TKI) is the preferred first-line therapy per NCCN 2026 guidelines, with a median PFS of 18.9 months (FLAURA trial). Second-line options include amivantamab + lazertinib for C797S-mediated resistance...",
"evidence": [
{"collection": "onco_therapies", "score": 0.94, "text": "Osimertinib: 3rd-generation EGFR TKI, FDA-approved for EGFR-mutant NSCLC..."},
{"collection": "onco_guidelines", "score": 0.91, "text": "NCCN NSCLC v3.2026: EGFR L858R — Preferred first-line: osimertinib..."},
{"collection": "onco_resistance", "score": 0.87, "text": "C797S mutation confers resistance to osimertinib in 10-15% of cases..."},
{"collection": "onco_trials", "score": 0.83, "text": "NCT04487080: Phase III, osimertinib + savolitinib for MET-amplified progression..."}
],
"therapies_ranked": [
{"rank": 1, "drug": "Osimertinib", "evidence": "Level IA", "score": 0.96},
{"rank": 2, "drug": "Amivantamab + Lazertinib", "evidence": "Level IB", "score": 0.82},
{"rank": 3, "drug": "Erlotinib + Ramucirumab", "evidence": "Level IB", "score": 0.74}
],
"active_trials": 8,
"confidence": 0.93,
"collections_searched": 11,
"processing_time_ms": 3420
}
Cross-Agent Integration
- Receives genomic evidence from Stage 1 pipeline via shared
genomic_evidencecollection (3.56M variants) - Triggers Clinical Trial Agent (:8538) when novel actionable variants are identified, passing variant list and cancer type for trial matching
- Receives cardiotoxicity alerts from Cardiology Agent (:8126) for proposed chemotherapy regimens (anthracyclines, trastuzumab) with pre-treatment ejection fraction data
- Sends antigen expression queries to Single-Cell Agent (:8540) for immunotherapy candidate validation (TME profiling, PD-L1 expression)
- Connects to Drug Discovery pipeline (Stage 3) for candidate molecule docking when no approved therapies match the variant profile
Performance Characteristics
| Operation | Typical Latency |
|---|---|
| Single collection search (top-15) | 80-150 ms |
| Full 11-collection parallel search | 200-400 ms |
| Embedding generation (BGE-small) | 15-25 ms |
| Claude LLM synthesis | 2,000-3,500 ms |
| Full MTB analysis (end-to-end) | 3,500-5,000 ms |
| Report generation (PDF) | 1,500-2,500 ms |
Common Pitfalls and Troubleshooting
- "No actionable variants found" -- Check that the VCF contains PASS-filtered variants with gene annotations. Raw VCF without functional annotation will not match CIViC/OncoKB entries. Run
bcftools view -f PASSfirst. - Therapy ranking returns empty -- Verify that
onco_therapiescollection is loaded (GET /collections). If the collection shows 0 records, re-run the ingest script. - Slow LLM responses (>10s) -- Reduce
TOP_Kfrom 15 to 8 to shrink the context window. Alternatively, switch to a faster Claude model viaLLM_MODEL. - Cross-agent timeout -- If the Cardiology or Trial agent is down, the oncology agent logs a warning but continues without cross-agent enrichment. Check
CROSS_AGENT_TIMEOUTand verify target agents are healthy. - Duplicate case IDs -- Case creation is idempotent on
patient_id+cancer_type. Submitting the same combination returns the existing case rather than creating a duplicate.
3.2 CAR-T Intelligence Agent (:8522)¶
Overview and Clinical Purpose
The CAR-T Intelligence Agent breaks down data silos across the 5 stages of CAR-T cell therapy development: target selection, construct design, manufacturing, clinical evaluation, and post-market surveillance. It features a unique comparative analysis mode that auto-detects "X vs Y" queries and produces structured side-by-side comparisons. The agent covers all 6 FDA-approved CAR-T products, 25 target antigens, and 39+ product aliases, running on a single DGX Spark ($4,699, March 2026).
Advanced Configuration Options
| Variable | Default | Description |
|---|---|---|
CART_AGENT_PORT |
8522 |
FastAPI listen port |
MILVUS_HOST |
localhost |
Milvus server hostname |
MILVUS_PORT |
19530 |
Milvus gRPC port |
MILVUS_POOL_SIZE |
10 |
Connection pool size |
ANTHROPIC_API_KEY |
(required) | Claude API key for LLM synthesis |
LLM_MODEL |
claude-sonnet-4-20250514 |
Claude model; use claude-opus-4-20250514 for nuanced comparative analyses |
EMBEDDING_MODEL |
BAAI/bge-small-en-v1.5 |
384-dim sentence-transformer |
TOP_K |
15 |
Max chunks per collection search |
COMPARATIVE_THRESHOLD |
0.70 |
Minimum confidence to trigger comparative mode |
QUERY_EXPANSION_MAPS |
12 |
Number of domain-specific expansion maps (169 keywords to 1,496 terms) |
Milvus Collections (11)
| Collection | Records | Source |
|---|---|---|
cart_literature |
5,047 | PubMed abstracts via NCBI E-utilities |
cart_trials |
973 | ClinicalTrials.gov API v2 |
cart_constructs |
6 | 6 FDA-approved CAR-T products |
cart_assay_results |
45 | Curated from landmark papers (ELIANA, ZUMA-1, KarMMa, CARTITUDE-1) |
cart_manufacturing |
30 | CMC/process data (transduction, expansion, release, cryo, logistics) |
cart_safety |
-- | Pharmacovigilance, CRS/ICANS profiles |
cart_biomarkers |
-- | CRS prediction, exhaustion monitoring |
cart_regulatory |
-- | Approval timelines, post-marketing requirements |
cart_sequences |
-- | Molecular binding, scFv sequences |
cart_rwe |
-- | Registry outcomes, real-world data |
genomic_evidence |
3.56M | Shared from Stage 2 RAG pipeline (read-only) |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with Milvus and engine status |
| GET | /collections |
List loaded collections with record counts |
| GET | /knowledge/stats |
Knowledge base statistics and last ingest time |
| GET | /metrics |
Prometheus-compatible metrics export |
| POST | /query |
Cross-functional RAG query (standard or comparative, auto-detected) |
| POST | /search |
Direct vector similarity search with collection filtering |
| POST | /find-related |
Find related entities (antigens, products, mechanisms) |
| POST | /v1/cart/integrated-assessment |
Full multi-collection integrated CAR-T assessment |
| POST | /api/ask |
Meta-agent question answering with reasoning chain |
| GET | /api/reports/{patient_id} |
Retrieve generated report for a patient |
| GET | /api/reports/{patient_id}/{fmt} |
Download report in specified format (md, json, pdf, fhir) |
| GET | /api/events |
List cross-agent events |
| GET | /api/events/{event_id} |
Retrieve a specific cross-agent event |
Detailed Clinical Workflow Walkthrough
- Query submission -- User submits a question about any stage of CAR-T development via Streamlit UI or POST to
/query. Input can be free-text or structured with explicit parameters. - Comparative detection -- The engine scans for "vs", "versus", "compare", "difference between" patterns. If detected, it enters comparative mode; otherwise, standard RAG mode.
- Entity resolution (comparative) -- Both entities are parsed and resolved against the knowledge graph containing 25 antigens, 6 FDA-approved products, and 39+ aliases (e.g., "tisa-cel" resolves to "tisagenlecleucel / Kymriah").
- Dual retrieval (comparative) -- Separate searches are run for each entity across all 11 collections. Results are aligned by category (efficacy, safety, manufacturing, cost) for side-by-side comparison.
- Query expansion (standard) -- 12 domain-specific expansion maps transform 169 seed keywords into 1,496 search terms. For example, "CRS management" expands to include "cytokine release syndrome", "tocilizumab", "IL-6 blockade", "grading criteria".
- Parallel multi-collection search -- All 11 collections are searched concurrently. Results are scored by cosine similarity and weighted by collection relevance to the query domain.
- Claude LLM synthesis -- Retrieved evidence is assembled with comparative formatting when applicable. Responses include clickable PubMed IDs (PMID) and ClinicalTrials.gov NCT numbers.
- Report generation -- Exportable as Markdown, JSON, PDF, or FHIR R4 DiagnosticReport.
Example Query and Response
curl -X POST http://localhost:8522/query \
-H "Content-Type: application/json" \
-d '{"question": "Compare 4-1BB vs CD28 costimulatory domains for DLBCL"}'
{
"answer": "4-1BB (CD137) and CD28 costimulatory domains represent the two main signaling architectures in FDA-approved CAR-T products for DLBCL...",
"comparison": {
"entity_a": "4-1BB (CD137)",
"entity_b": "CD28",
"dimensions": [
{"category": "Persistence", "entity_a": "Superior T-cell persistence (>6 months detectable)", "entity_b": "Rapid expansion, shorter persistence (~3 months)"},
{"category": "CRS Rate", "entity_a": "Lower grade 3+ CRS (2-22%)", "entity_b": "Higher grade 3+ CRS (13-15%)"},
{"category": "FDA Products", "entity_a": "Tisagenlecleucel (Kymriah), Lisocabtagene (Breyanzi)", "entity_b": "Axicabtagene (Yescarta), Brexucabtagene (Tecartus)"},
{"category": "ORR (DLBCL)", "entity_a": "52-73%", "entity_b": "72-83%"}
]
},
"evidence": [
{"collection": "cart_constructs", "score": 0.93, "text": "4-1BB signaling promotes memory T-cell differentiation..."},
{"collection": "cart_literature", "score": 0.89, "text": "ZUMA-1: axi-cel (CD28) achieved 83% ORR in r/r DLBCL (PMID: 28982775)..."},
{"collection": "cart_safety", "score": 0.86, "text": "CRS incidence comparison across costimulatory domains..."}
],
"mode": "comparative",
"confidence": 0.91,
"processing_time_ms": 3180
}
Cross-Agent Integration
- Receives antigen escape alerts from Single-Cell Agent (:8540) when off-tumor expression is detected for a target antigen (e.g., CD19 loss in B-ALL relapse)
- Queries Oncology Agent (:8527) for tumor mutational burden and microsatellite instability context that may influence CAR-T eligibility
- Shares CRS/ICANS toxicity profiles with Biomarker Agent (:8529) for real-time monitoring protocols (IL-6, ferritin, CRP thresholds)
- Triggers Clinical Trial Agent (:8538) when a query involves investigational CAR-T constructs not yet FDA-approved
Performance Characteristics
| Operation | Typical Latency |
|---|---|
| Standard single-collection search | 80-150 ms |
| Full 11-collection parallel search | 200-400 ms |
| Comparative dual-entity retrieval | 350-600 ms |
| Query expansion (12 maps) | 5-10 ms |
| Claude LLM synthesis (standard) | 2,000-3,000 ms |
| Claude LLM synthesis (comparative) | 2,500-4,000 ms |
| End-to-end query (standard) | 2,500-3,500 ms |
| End-to-end query (comparative) | 3,000-5,000 ms |
Common Pitfalls and Troubleshooting
- Comparative mode not triggering -- The detection engine requires explicit comparison language ("vs", "versus", "compare", "difference between"). Rephrase ambiguous queries to include one of these trigger words.
- Entity resolution failure -- If a product name is misspelled or uses a non-standard alias, the knowledge graph may not resolve it. Use canonical names (e.g., "axicabtagene ciloleucel" not "axi-cel") or add custom aliases via the configuration.
- Empty
cart_constructscollection -- This is a small curated collection (6 records). If it returns 0, re-runscripts/seed_constructs.pyto reload the 6 FDA-approved product profiles. - Slow comparative queries -- Comparative mode runs two full retrieval passes. If latency exceeds 5s, reduce
TOP_Kfrom 15 to 10 or limit the collection set. - Missing PubMed citations -- The
cart_literaturecollection requires periodic refresh via the NCBI E-utilities ingest script. Stale data will miss recent publications.
3.3 Precision Biomarker Agent (:8529)¶
Overview and Clinical Purpose
The Precision Biomarker Agent interprets patient biomarker panels with genotype awareness. It estimates biological age using PhenoAge/GrimAge algorithms, detects disease trajectories across 6 categories, provides pharmacogenomic profiling for 7 key pharmacogenes, and generates genotype-adjusted reference ranges. All processing runs on a single DGX Spark ($4,699, March 2026).
Advanced Configuration Options
| Variable | Default | Description |
|---|---|---|
BIOMARKER_AGENT_PORT |
8529 |
FastAPI listen port |
MILVUS_HOST |
localhost |
Milvus server hostname |
MILVUS_PORT |
19530 |
Milvus gRPC port |
MILVUS_POOL_SIZE |
10 |
Connection pool size |
ANTHROPIC_API_KEY |
(required) | Claude API key for LLM synthesis |
LLM_MODEL |
claude-sonnet-4-20250514 |
Claude model ID |
EMBEDDING_MODEL |
BAAI/bge-small-en-v1.5 |
384-dim sentence-transformer |
TOP_K |
15 |
Max chunks per collection search |
AGING_ALGORITHM |
phenoage |
Biological age algorithm: phenoage or grimage |
DISEASE_CATEGORIES |
6 |
Number of disease trajectory categories screened |
PGX_GENES |
7 |
Number of pharmacogenes profiled (CYP2D6, CYP2C19, CYP2C9, CYP3A5, SLCO1B1, VKORC1, MTHFR) |
Milvus Collections (11)
| Collection | Description | Records |
|---|---|---|
biomarker_reference |
Reference biomarker definitions and ranges | ~60 |
biomarker_genetic_variants |
Genetic variants affecting biomarker levels | ~30 |
biomarker_pgx_rules |
CPIC pharmacogenomic dosing rules | ~50 |
biomarker_disease_trajectories |
Disease progression stage definitions | ~30 |
biomarker_clinical_evidence |
Published clinical evidence | ~40 |
biomarker_nutrition |
Genotype-aware nutrition guidelines | ~25 |
biomarker_drug_interactions |
Gene-drug interactions | ~35 |
biomarker_aging_markers |
Epigenetic aging clock markers | ~15 |
biomarker_genotype_adjustments |
Genotype-based reference range adjustments | ~20 |
biomarker_monitoring |
Condition-specific monitoring protocols | ~15 |
genomic_evidence |
Shared genomic variants (read-only) | 3.56M |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with Milvus and engine status |
| GET | /collections |
List loaded collections with record counts |
| GET | /knowledge/stats |
Knowledge base statistics |
| GET | /metrics |
Prometheus-compatible metrics export |
| POST | /v1/biomarker/integrated-assessment |
Full multi-collection integrated biomarker assessment |
| POST | /v1/analyze |
Full patient analysis (biomarkers + genotypes combined) |
| POST | /v1/biological-age |
Biological age calculation (PhenoAge/GrimAge) |
| POST | /v1/disease-risk |
Disease risk trajectory analysis across 6 categories |
| POST | /v1/pgx |
Pharmacogenomic profile from star allele diplotypes |
| POST | /v1/query |
Free-text RAG query across all biomarker collections |
| POST | /v1/query/stream |
Streaming RAG query (Server-Sent Events) |
| POST | /v1/report/generate |
Generate clinical report (Markdown, JSON, PDF, FHIR R4) |
| GET | /v1/report/{report_id}/pdf |
Download generated PDF report |
| POST | /v1/report/fhir |
Export analysis as FHIR R4 DiagnosticReport |
| POST | /v1/events/cross-modal |
Receive cross-modal events from other agents |
| POST | /v1/events/biomarker-alert |
Receive biomarker alert events (threshold breaches) |
| GET | /v1/events/cross-modal |
List received cross-modal events |
| GET | /v1/events/biomarker-alert |
List received biomarker alerts |
Detailed Clinical Workflow Walkthrough
- Data submission -- Patient biomarker panel (CRP, HbA1c, LDL, albumin, creatinine, glucose, etc.) and genotype data (star alleles or VCF-derived diplotypes) are submitted via Streamlit UI or POST to
/v1/analyze. - Biological Age Engine -- Computes PhenoAge and GrimAge estimates from 9 routine blood biomarkers (albumin, creatinine, glucose, CRP, lymphocyte %, mean cell volume, red cell distribution width, alkaline phosphatase, white blood cell count). Output: chronological age, biological age, acceleration (positive = aging faster).
- Disease Trajectory Analyzer -- Screens 6 disease categories (cardiovascular, metabolic/diabetes, hepatic, renal, inflammatory, hematologic) using genotype-stratified thresholds. Each trajectory returns a stage (normal, borderline, early, established, advanced) with progression rate.
- Pharmacogenomic Mapper -- Interprets star alleles for CYP2D6, CYP2C19, CYP2C9, CYP3A5, SLCO1B1, VKORC1, MTHFR, plus HLA-B*57:01 screening. Maps diplotypes to metabolizer phenotypes (ultra-rapid, normal, intermediate, poor) with CPIC-level dosing recommendations.
- Genotype Adjustment Engine -- Modifies standard reference ranges based on genomic context. For example: PNPLA3 I148M carriers have lower ALT thresholds for NAFLD screening; TCF7L2 rs7903146 T/T carriers have elevated fasting glucose baselines; APOE e4/e4 carriers have higher LDL reference ceilings.
- Evidence retrieval -- Parallel search across all 11 collections to ground recommendations in published literature, CPIC guidelines, and clinical evidence.
- Report generation -- 12-section clinical report covering biological age summary, disease trajectories (6 categories), pharmacogenomic profile, genotype-adjusted ranges, nutrition recommendations, drug interaction alerts, and monitoring schedule. Exported as PDF and FHIR R4.
Example Query and Response
curl -X POST http://localhost:8529/v1/analyze \
-H "Content-Type: application/json" \
-d '{"biomarkers": {"CRP": 2.1, "HbA1c": 6.8, "LDL": 145, "albumin": 4.2, "creatinine": 0.9, "glucose": 118}, "genotypes": {"CYP2D6": "*1/*4", "CYP2C19": "*1/*2"}}'
{
"biological_age": {
"chronological_age": 55,
"phenoage": 58.3,
"grimage": 57.1,
"acceleration": "+3.3 years",
"percentile": 72,
"interpretation": "Moderately accelerated aging. CRP and glucose are primary contributors."
},
"disease_trajectories": [
{"category": "Metabolic", "stage": "Borderline", "risk_score": 0.62, "key_markers": ["HbA1c: 6.8 (pre-diabetic)", "glucose: 118"]},
{"category": "Cardiovascular", "stage": "Early", "risk_score": 0.48, "key_markers": ["LDL: 145", "CRP: 2.1"]},
{"category": "Inflammatory", "stage": "Borderline", "risk_score": 0.35, "key_markers": ["CRP: 2.1"]}
],
"pharmacogenomics": {
"CYP2D6": {"diplotype": "*1/*4", "phenotype": "Intermediate Metabolizer", "drugs_affected": ["codeine", "tramadol", "tamoxifen"]},
"CYP2C19": {"diplotype": "*1/*2", "phenotype": "Intermediate Metabolizer", "drugs_affected": ["clopidogrel", "omeprazole", "escitalopram"]}
},
"genotype_adjustments": [
{"marker": "CYP2D6", "adjustment": "Codeine: avoid or reduce dose by 50%; consider morphine alternative"}
],
"confidence": 0.88,
"processing_time_ms": 3850
}
Cross-Agent Integration
- Receives inflammation monitoring requests from Autoimmune Agent (:8532) during flare events, tracking CRP, ESR, IL-6, and calprotectin trends
- Provides biological age context to Cardiology Agent (:8126) for ASCVD risk refinement -- accelerated biological age increases the 10-year risk estimate
- Shares pharmacogenomic profiles with Pharmacogenomics Agent (:8107) for detailed CPIC/DPWG dosing guidance
- Sends biomarker alerts to Oncology Agent (:8527) when tumor markers (CEA, CA-125, PSA) exceed surveillance thresholds
- Receives genotype-adjusted reference range requests from Rare Disease Agent (:8134) for rare metabolic conditions
Performance Characteristics
| Operation | Typical Latency |
|---|---|
| Single collection search (top-15) | 80-150 ms |
| Full 11-collection parallel search | 200-400 ms |
| Biological age calculation | 50-100 ms |
| Disease trajectory analysis (6 categories) | 100-200 ms |
| PGx diplotype interpretation | 30-60 ms |
| Claude LLM synthesis | 2,000-3,500 ms |
| Full analysis (end-to-end) | 3,000-5,000 ms |
| PDF report generation | 1,500-2,500 ms |
Common Pitfalls and Troubleshooting
- Biological age returns null -- The PhenoAge algorithm requires at least 7 of the 9 input biomarkers. If fewer are provided, the engine falls back to a partial estimate with a warning flag. Ensure albumin, creatinine, and glucose are included at minimum.
- PGx phenotype "Indeterminate" -- This occurs when the star allele combination is not in the CPIC lookup table. Verify that diplotype notation follows PharmVar conventions (e.g.,
*1/*4not*1/4). - Genotype adjustments not applied -- The adjustment engine requires both the biomarker value and the relevant genotype. Submitting biomarkers without genotypes produces standard (non-adjusted) reference ranges.
- Disease trajectory showing "Unknown" stage -- Some trajectories require specific marker combinations. For example, the hepatic trajectory needs ALT, AST, and albumin together. A single liver enzyme without context is insufficient.
- Slow report generation -- PDF rendering adds 1-2s overhead. For faster iteration, request Markdown format first, then generate PDF only for the final version.
3.4 Cardiology Intelligence Agent (:8126)¶
Overview and Clinical Purpose
The Cardiology Intelligence Agent synthesizes cardiac imaging, electrophysiology, hemodynamics, heart failure management, valvular disease, preventive cardiology, interventional data, and cardio-oncology surveillance into ACC/AHA/ESC guideline-aligned recommendations. It includes 6 validated risk calculators (ASCVD, HEART Score, CHA2DS2-VASc, HAS-BLED, MAGGIC, EuroSCORE II), a GDMT optimizer, and 11 clinical workflows. All processing runs on a single DGX Spark ($4,699, March 2026).
Advanced Configuration Options
| Variable | Default | Description |
|---|---|---|
CARDIO_AGENT_PORT |
8126 |
FastAPI listen port |
MILVUS_HOST |
localhost |
Milvus server hostname |
MILVUS_PORT |
19530 |
Milvus gRPC port |
MILVUS_POOL_SIZE |
10 |
Connection pool size |
ANTHROPIC_API_KEY |
(required) | Claude API key for LLM synthesis |
LLM_MODEL |
claude-sonnet-4-20250514 |
Claude model ID |
EMBEDDING_MODEL |
BAAI/bge-small-en-v1.5 |
384-dim sentence-transformer |
TOP_K |
15 |
Max chunks per collection search |
GDMT_TITRATION_STEPS |
4 |
GDMT medication classes optimized simultaneously (ARNI/ACEi, beta-blocker, MRA, SGLT2i) |
RISK_CALCULATOR_COUNT |
6 |
Number of validated risk calculators |
Milvus Collections (13)
| Collection | Description |
|---|---|
cardio_literature |
Published cardiovascular research, reviews, meta-analyses |
cardio_trials |
Cardiovascular clinical trials and landmark results |
cardio_imaging |
Cardiac imaging protocols, findings, measurements |
cardio_electrophysiology |
ECG interpretation, arrhythmia classification, EP data |
cardio_heart_failure |
HF classification, GDMT protocols, management algorithms |
cardio_valvular |
Valvular heart disease assessment and intervention criteria |
cardio_prevention |
Preventive cardiology: risk stratification, lipid management |
cardio_interventional |
Interventional procedures, techniques, outcomes |
cardio_oncology |
Cardio-oncology surveillance, cardiotoxicity detection |
cardio_devices |
FDA-cleared cardiovascular AI devices, implantables |
cardio_guidelines |
ACC/AHA/ESC/HRS clinical practice guidelines |
cardio_hemodynamics |
Catheterization data, pressure tracings, derived calculations |
genomic_evidence |
Shared genomic evidence (read-only, 3.56M variants) |
6 Risk Calculators
| Calculator | Clinical Use |
|---|---|
| ASCVD (PCE) | 10-year atherosclerotic cardiovascular disease risk |
| HEART Score | Chest pain risk stratification in ED |
| CHA2DS2-VASc | Atrial fibrillation stroke risk |
| HAS-BLED | Anticoagulation bleeding risk |
| MAGGIC | Heart failure mortality risk |
| EuroSCORE II | Cardiac surgical mortality risk |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with Milvus, RAG engine, GDMT optimizer, and calculator status |
| GET | /collections |
List loaded collections with record counts |
| GET | /workflows |
List all 11 clinical workflow definitions |
| GET | /metrics |
Prometheus-compatible metrics export |
| POST | /v1/cardio/integrated-assessment |
Full multi-collection integrated cardiac assessment |
| POST | /v1/cardio/query |
Free-text RAG query across all cardiology collections |
| POST | /v1/cardio/search |
Direct vector similarity search with collection filtering |
| POST | /v1/cardio/find-related |
Find related entities (drugs, conditions, biomarkers) |
| POST | /v1/cardio/risk/ascvd |
ASCVD 10-year risk (Pooled Cohort Equations) |
| POST | /v1/cardio/risk/heart-score |
HEART Score for ED chest pain risk stratification |
| POST | /v1/cardio/risk/cha2ds2-vasc |
CHA2DS2-VASc stroke risk for atrial fibrillation |
| POST | /v1/cardio/risk/has-bled |
HAS-BLED anticoagulation bleeding risk |
| POST | /v1/cardio/risk/maggic |
MAGGIC heart failure mortality risk |
| POST | /v1/cardio/risk/euroscore |
EuroSCORE II cardiac surgical mortality |
| POST | /v1/cardio/gdmt/optimize |
GDMT titration optimization for heart failure |
| POST | /v1/cardio/workflow/cad |
Coronary artery disease assessment workflow |
| POST | /v1/cardio/workflow/heart-failure |
Heart failure classification and management |
| POST | /v1/cardio/workflow/valvular |
Valvular heart disease workflow |
| POST | /v1/cardio/workflow/arrhythmia |
Arrhythmia and EP assessment |
| POST | /v1/cardio/workflow/cardiac-mri |
Cardiac MRI tissue characterization |
| POST | /v1/cardio/workflow/stress-test |
Stress testing protocol |
| POST | /v1/cardio/workflow/prevention |
Cardiovascular prevention and risk stratification |
| POST | /v1/cardio/workflow/cardio-oncology |
Cardio-oncology surveillance |
| POST | /v1/cardio/cross-modal/evaluate |
Cross-modal genomic-cardiac evaluation |
| GET | /v1/cardio/guidelines |
List available clinical guideline references |
| GET | /v1/cardio/conditions |
List supported cardiovascular conditions |
| GET | /v1/cardio/biomarkers |
List tracked cardiac biomarkers |
| GET | /v1/cardio/drugs |
List cardiovascular drugs in knowledge base |
| GET | /v1/cardio/genes |
List cardiomyopathy-associated genes |
| GET | /v1/cardio/knowledge-version |
Knowledge base version and last update timestamp |
| POST | /v1/reports/generate |
Generate clinical report (Markdown, JSON, PDF, FHIR R4) |
| GET | /v1/reports/formats |
List available report export formats |
| GET | /v1/events/stream |
SSE stream for cross-agent events |
| GET | /v1/events/health |
Event system health status |
Detailed Clinical Workflow Walkthrough (11 workflows)
- Coronary Artery Disease (CAD) -- Calcium score interpretation, CAD-RADS classification, plaque characterization, revascularization planning (PCI vs CABG), and ischemia-guided management.
- Heart Failure Management -- HFrEF/HFpEF/HFmrEF classification by ejection fraction, GDMT optimization (4-pillar: ARNI/ACEi, beta-blocker, MRA, SGLT2i), device evaluation (ICD/CRT), and hemodynamic profiling.
- Valvular Heart Disease -- Severity grading (mild/moderate/severe) by echocardiographic criteria, intervention timing per ACC/AHA guidelines, and prosthetic valve follow-up.
- Arrhythmia and EP -- AF management with CHA2DS2-VASc and HAS-BLED scoring, ablation candidacy assessment, and device programming optimization.
- Cardiac MRI -- LGE pattern interpretation, T1/T2 mapping analysis, strain assessment, and tissue characterization for cardiomyopathy diagnosis.
- Stress Testing -- Exercise and pharmacologic stress interpretation, Duke treadmill score, perfusion defect classification, and ischemia quantification.
- Cardiovascular Prevention -- ASCVD risk stratification, lipid management (statin/PCSK9i/ezetimibe selection), lifestyle Rx, and risk enhancer assessment.
- Cardio-Oncology -- Chemotherapy cardiotoxicity surveillance, GLS tracking (global longitudinal strain), troponin/BNP monitoring, and CTRCD detection.
- Acute Decompensated HF -- Hemodynamic profiling (warm-wet/cold-wet/cold-dry/warm-dry), IV diuretic dosing, inotrope selection, and MCS escalation criteria.
- Post-MI Secondary Prevention -- Reperfusion assessment, DAPT strategy duration, beta-blocker/statin/ACEi titration, cardiac rehab referral, and ICD timing.
- Myocarditis/Pericarditis -- Lake Louise CMR criteria evaluation, biopsy indications, NSAIDs/colchicine dosing, and activity restriction guidelines.
Example Query and Response
curl -X POST http://localhost:8126/v1/cardio/risk/heart-score \
-H "Content-Type: application/json" \
-d '{"age": 62, "history": "moderately suspicious", "ecg": "nonspecific ST changes", "troponin": "normal", "risk_factors": 3}'
{
"calculator": "HEART Score",
"score": 6,
"risk_category": "Moderate",
"interpretation": "HEART Score 6/10 (Moderate risk). 30-day MACE rate: 12-17%. Recommend observation, serial troponins, and non-invasive testing within 72 hours.",
"components": {
"history": {"score": 1, "detail": "Moderately suspicious for ACS"},
"ecg": {"score": 1, "detail": "Nonspecific ST-segment changes"},
"age": {"score": 2, "detail": "Age 62 (>= 45 and < 65)"},
"risk_factors": {"score": 2, "detail": "3+ risk factors (>= 3)"},
"troponin": {"score": 0, "detail": "Normal troponin"}
},
"guideline_reference": "ACC/AHA 2021 Chest Pain Guideline, Section 5.3",
"confidence": 0.95,
"processing_time_ms": 280
}
Cross-Agent Integration
- Sends cardiotoxicity alerts to Oncology Agent (:8527) when chemotherapy candidates show cardiac risk (pre-treatment EF, GLS baseline, anthracycline cumulative dose tracking)
- Receives biomarker trends (BNP, troponin, CRP) from Biomarker Agent (:8529) for longitudinal heart failure monitoring
- Queries
genomic_evidencefor cardiomyopathy-associated variants (TTN, LMNA, MYH7, MYBPC3, SCN5A) to integrate genetic risk into clinical assessment - Receives stroke triage events from Neurology Agent (:8528) to trigger cardiac workup (AF screening, echocardiography) after cryptogenic stroke
- Provides cardiotoxicity risk context to CAR-T Agent (:8522) for patients receiving lymphodepleting chemotherapy
Performance Characteristics
| Operation | Typical Latency |
|---|---|
| Risk calculator (any of 6) | 50-150 ms |
| GDMT optimization | 200-500 ms |
| Single collection search (top-15) | 80-150 ms |
| Full 13-collection parallel search | 250-450 ms |
| Claude LLM synthesis | 2,000-3,500 ms |
| Full integrated assessment | 3,500-5,500 ms |
| Report generation (PDF) | 1,500-2,500 ms |
Common Pitfalls and Troubleshooting
- ASCVD risk returns "out of range" -- The Pooled Cohort Equations are validated for ages 40-79. Inputs outside this range receive a warning and extrapolated estimate.
- HEART Score confusion with Framingham -- The HEART Score is for acute chest pain triage in the ED (History, ECG, Age, Risk factors, Troponin), not for long-term cardiovascular risk. Use ASCVD (PCE) for 10-year risk prediction.
- GDMT optimizer returns "contraindicated" -- The optimizer checks for absolute contraindications (hyperkalemia for MRA, bradycardia for beta-blockers, angioedema history for ACEi). Review the contraindication reason in the response before overriding.
- Missing workflow endpoints -- Workflows for acute decompensated HF, post-MI, and myocarditis/pericarditis use the generic
/v1/cardio/queryendpoint with structured parameters rather than dedicated workflow routes. - Cross-modal evaluation empty -- The
/v1/cardio/cross-modal/evaluateendpoint requiresgenomic_evidencecollection to be loaded. Verify viaGET /collectionsthat it shows 3.56M records.
3.5 Neurology Intelligence Agent (:8528)¶
Overview and Clinical Purpose
The Neurology Intelligence Agent supports 8 specialized clinical workflows plus a general neurology RAG query mode (9 total), covering acute stroke triage, dementia evaluation, epilepsy classification, brain tumor grading, MS monitoring, Parkinson's assessment, headache classification, and neuromuscular evaluation. It integrates guidelines from AAN, AHA/ASA, ILAE, ICHD-3, WHO CNS 2021, McDonald 2017, and MDS criteria, with 10 validated clinical scale calculators. All processing runs on a single DGX Spark ($4,699, March 2026).
Advanced Configuration Options
| Variable | Default | Description |
|---|---|---|
NEURO_AGENT_PORT |
8528 |
FastAPI listen port |
MILVUS_HOST |
localhost |
Milvus server hostname |
MILVUS_PORT |
19530 |
Milvus gRPC port |
MILVUS_POOL_SIZE |
10 |
Connection pool size |
ANTHROPIC_API_KEY |
(required) | Claude API key for LLM synthesis |
LLM_MODEL |
claude-sonnet-4-20250514 |
Claude model ID |
EMBEDDING_MODEL |
BAAI/bge-small-en-v1.5 |
384-dim sentence-transformer |
TOP_K |
15 |
Max chunks per collection search |
STROKE_WINDOW_HOURS |
4.5 |
tPA eligibility window in hours from symptom onset |
THROMBECTOMY_WINDOW_HOURS |
24 |
Mechanical thrombectomy eligibility window (DAWN/DEFUSE-3) |
10 Clinical Scale Calculators
| Scale | Purpose |
|---|---|
| NIHSS | National Institutes of Health Stroke Scale |
| GCS | Glasgow Coma Scale |
| MoCA | Montreal Cognitive Assessment |
| MDS-UPDRS Part III | Parkinson's motor examination |
| EDSS | Expanded Disability Status Scale (MS) |
| mRS | Modified Rankin Scale (functional outcome) |
| HIT-6 | Headache Impact Test |
| ALSFRS-R | ALS Functional Rating Scale -- Revised |
| ASPECTS | Alberta Stroke Programme Early CT Score |
| Hoehn-Yahr | Parkinson's disease staging |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with Milvus, RAG engine, and workflow engine status |
| GET | /collections |
List loaded collections with record counts |
| GET | /workflows |
List all 9 workflow definitions |
| GET | /metrics |
Prometheus-compatible metrics export |
| POST | /v1/neuro/integrated-assessment |
Full multi-collection integrated neurological assessment |
| POST | /v1/neuro/query |
Free-text RAG query across all neurology collections |
| POST | /v1/neuro/search |
Direct vector similarity search with collection filtering |
| POST | /v1/neuro/scale/calculate |
Clinical scale calculation (any of 10 scales) |
| POST | /v1/neuro/stroke/triage |
Acute stroke triage with NIHSS, ASPECTS, thrombolysis/thrombectomy eligibility |
| POST | /v1/neuro/dementia/evaluate |
Dementia evaluation with MoCA, ATN staging, differential diagnosis |
| POST | /v1/neuro/epilepsy/classify |
Epilepsy classification (ILAE 2017), syndrome identification, EEG-MRI concordance |
| POST | /v1/neuro/tumor/grade |
Brain tumor grading (WHO CNS 2021) with molecular markers (IDH, MGMT, 1p/19q) |
| POST | /v1/neuro/ms/assess |
MS disease monitoring with EDSS, NEDA-3/4, DMT escalation evaluation |
| POST | /v1/neuro/parkinsons/assess |
Parkinson's assessment with MDS-UPDRS III, Hoehn-Yahr, motor subtype classification |
| POST | /v1/neuro/headache/classify |
Headache classification (ICHD-3), HIT-6 scoring, red flag screening |
| POST | /v1/neuro/neuromuscular/evaluate |
Neuromuscular evaluation with ALSFRS-R, EMG/NCS pattern analysis |
| POST | /v1/neuro/workflow/{workflow_type} |
Generic workflow endpoint for any of 9 workflow types |
| GET | /v1/neuro/domains |
List supported neurological domains |
| GET | /v1/neuro/scales |
List available clinical scales with input requirements |
| GET | /v1/neuro/guidelines |
List integrated clinical guidelines |
| GET | /v1/neuro/knowledge-version |
Knowledge base version and last update timestamp |
| POST | /v1/reports/generate |
Generate clinical report (Markdown, JSON, PDF, FHIR R4) |
| GET | /v1/reports/formats |
List available report export formats |
| GET | /v1/events/stream |
SSE stream for cross-agent events |
| GET | /v1/events/health |
Event system health status |
Detailed Clinical Workflow Walkthrough
- Acute Stroke Triage -- Clinician enters onset time, NIHSS score, and CT findings. The agent calculates time-to-treatment, evaluates tPA eligibility (<4.5 hours per AHA/ASA guidelines), assesses thrombectomy candidacy (<24 hours, DAWN/DEFUSE-3 criteria), computes ASPECTS score for anterior circulation strokes, and localizes vascular territory.
- Dementia Evaluation -- MoCA score, age, and symptom list are submitted. The agent performs ATN biomarker staging (Amyloid/Tau/Neurodegeneration), generates a differential diagnosis ranked by probability (AD, FTD, LBD, VaD, NPH, CJD), recommends further workup (PET, CSF, MRI volumetrics), and suggests treatment options.
- Epilepsy Classification -- Seizure semiology and EEG/MRI findings are entered. The agent classifies seizures per ILAE 2017 (focal, generalized, unknown onset), identifies epilepsy syndromes, performs EEG-MRI concordance analysis for surgical candidacy, and recommends ASMs based on seizure type with PGx context.
- Brain Tumor Grading -- Histology and molecular markers (IDH mutation, MGMT methylation, 1p/19q codeletion, ATRX, H3K27M) are submitted. The agent applies WHO CNS 2021 classification, determines integrated diagnosis (e.g., "Astrocytoma, IDH-mutant, grade 3"), and maps to NCCN treatment guidelines.
- MS Disease Monitoring -- EDSS score, relapse history, and MRI lesion data are entered. The agent assesses NEDA-3/4 status (No Evidence of Disease Activity), evaluates DMT escalation criteria, calculates relapse risk stratification, and tracks lesion burden over time.
- Parkinson's Assessment -- MDS-UPDRS Part III motor items are scored. The agent computes total score, determines Hoehn-Yahr stage, classifies motor subtype (tremor-dominant, PIGD, indeterminate), and optimizes medication regimen (levodopa, dopamine agonists, MAO-B inhibitors).
- Headache Classification -- Attack characteristics (duration, location, quality, associated symptoms) are entered. The agent applies ICHD-3 criteria, computes HIT-6 disability score, screens for red flags (thunderclap, positional, progressive), and recommends preventive therapy if indicated.
- Neuromuscular Evaluation -- Weakness pattern, reflexes, and EMG/NCS findings are submitted. The agent computes ALSFRS-R functional score, analyzes EMG/NCS patterns (neuropathic vs myopathic vs NMJ), guides genetic testing, and recommends specialty referral.
- General Neurology Query -- Free-form RAG-powered Q&A across all neurology knowledge collections, not bound to a specific workflow.
Example Query and Response
curl -X POST http://localhost:8528/v1/neuro/stroke/triage \
-H "Content-Type: application/json" \
-d '{"onset_time": "2026-03-29T08:30:00", "current_time": "2026-03-29T10:45:00", "nihss_score": 14, "ct_findings": "no hemorrhage", "aspects_score": 8}'
{
"triage_result": {
"time_from_onset_minutes": 135,
"tpa_eligible": true,
"tpa_window_remaining_minutes": 135,
"thrombectomy_eligible": true,
"thrombectomy_window_remaining_minutes": 1305,
"nihss_severity": "Moderate-Severe",
"aspects_interpretation": "Score 8/10 -- favorable for intervention (>= 6 threshold)",
"vascular_territory": "MCA (likely M1 or M2 occlusion based on NIHSS pattern)",
"recommendation": "URGENT: Administer IV tPA immediately (within 4.5-hour window). Activate neurointerventional team for mechanical thrombectomy evaluation. Obtain CTA head and neck to confirm large vessel occlusion."
},
"guideline_references": [
"AHA/ASA 2019 Acute Ischemic Stroke Guidelines, Section 3.2 (tPA)",
"DAWN Trial: thrombectomy 6-24 hours with clinical-core mismatch",
"DEFUSE-3 Trial: thrombectomy 6-16 hours with perfusion mismatch"
],
"evidence": [
{"collection": "neuro_stroke", "score": 0.93, "text": "IV alteplase within 4.5 hours of symptom onset..."},
{"collection": "neuro_guidelines", "score": 0.91, "text": "AHA/ASA Class I recommendation for mechanical thrombectomy..."}
],
"confidence": 0.96,
"processing_time_ms": 1850
}
Cross-Agent Integration
- Queries Imaging Agent (:8524) for MRI findings (MS lesion counts, brain volumetrics, DWI/ADC maps for stroke)
- Receives pharmacogenomic context from PGx Agent (:8107) for antiepileptic drug selection (HLA-B*15:02 and carbamazepine, CYP2C9 and phenytoin)
- Shares stroke triage events with Cardiology Agent (:8126) for cardiac workup -- AF screening, echocardiography, and Holter monitoring after cryptogenic stroke
- Queries
genomic_evidencefor neurogenetic variants (APP, PSEN1/2, GBA, LRRK2, PARK2, SMN1) to integrate genetic risk into clinical assessment - Receives tumor molecular markers from Oncology Agent (:8527) for integrated brain tumor classification
Performance Characteristics
| Operation | Typical Latency |
|---|---|
| Clinical scale calculation (any of 10) | 20-80 ms |
| Single collection search (top-15) | 80-150 ms |
| Full multi-collection parallel search | 250-450 ms |
| Stroke triage (end-to-end, no LLM) | 200-500 ms |
| Claude LLM synthesis | 2,000-3,500 ms |
| Full workflow assessment (end-to-end) | 3,000-5,500 ms |
| Report generation (PDF) | 1,500-2,500 ms |
Common Pitfalls and Troubleshooting
- Stroke triage returns "window expired" -- Verify that
onset_timeandcurrent_timeare in ISO 8601 format with timezone. Ifcurrent_timeis omitted, the server uses UTC now, which may differ from local time. - NIHSS score mismatch -- The agent accepts either a pre-computed total score or individual item scores. If both are provided, individual items take precedence and the total is recalculated. Ensure all 15 items are included for accurate scoring.
- MoCA score interpretation varies by education -- The agent applies the standard +1 point adjustment for education <= 12 years. Provide the raw score without manual adjustment; the agent handles the correction.
- MS NEDA-3 assessment incomplete -- NEDA-3 requires three inputs: relapse count (0), new/enlarging T2 lesions (0), and disability progression (stable EDSS). Missing any component results in a partial assessment.
- Workflow type not found -- Valid workflow types are:
acute_stroke_triage,dementia_evaluation,epilepsy_focus,brain_tumor_grading,ms_monitoring,parkinsons_assessment,headache_classification,neuromuscular_evaluation,general. Use the exact ID fromGET /workflows.
3.6 Precision Autoimmune Agent (:8532)¶
Overview and Clinical Purpose
The Precision Autoimmune Agent addresses the diagnostic odyssey that autoimmune patients face -- often 3-6 years across multiple specialists before diagnosis. It interprets autoantibody panels, HLA typing, biomarker trends, and genomic data across 13 autoimmune conditions. It features disease activity scoring (DAS28-CRP, SLEDAI-2K, CDAI, BASDAI), flare prediction, biologic therapy recommendations with PGx context, and diagnostic odyssey analysis. Also detects the POTS/hEDS/MCAS triad and overlap syndromes. All processing runs on a single DGX Spark ($4,699, March 2026).
Advanced Configuration Options
| Variable | Default | Description |
|---|---|---|
AUTOIMMUNE_AGENT_PORT |
8532 |
FastAPI listen port |
MILVUS_HOST |
localhost |
Milvus server hostname |
MILVUS_PORT |
19530 |
Milvus gRPC port |
MILVUS_POOL_SIZE |
10 |
Connection pool size |
ANTHROPIC_API_KEY |
(required) | Claude API key for LLM synthesis |
LLM_MODEL |
claude-sonnet-4-20250514 |
Claude model ID; upgrade to claude-opus-4-20250514 for complex diagnostic odyssey analyses |
EMBEDDING_MODEL |
BAAI/bge-small-en-v1.5 |
384-dim sentence-transformer |
TOP_K |
15 |
Max chunks per collection search |
FLARE_RISK_THRESHOLD |
0.65 |
Risk score threshold for flare alerts (0-1 scale) |
HLA_ODDS_RATIO_MIN |
2.0 |
Minimum odds ratio for HLA-disease associations to flag |
COLLECTION_WEIGHTS |
(see table) | Weighted relevance for each collection in composite scoring |
13 Supported Conditions
Rheumatoid arthritis, SLE, multiple sclerosis, type 1 diabetes, inflammatory bowel disease, psoriasis/psoriatic arthritis, ankylosing spondylitis, Sjogren's syndrome, systemic sclerosis, myasthenia gravis, celiac disease, Graves' disease, Hashimoto's thyroiditis.
Milvus Collections (14)
| Collection | Weight | Description |
|---|---|---|
autoimmune_clinical_documents |
0.18 | Progress notes, discharge summaries |
autoimmune_patient_labs |
0.14 | Lab results with flags and reference ranges |
autoimmune_autoantibody_panels |
0.12 | ANA, anti-dsDNA, anti-CCP, RF, and 14+ types |
autoimmune_hla_associations |
0.08 | HLA allele-disease risk (50+ alleles with odds ratios) |
autoimmune_disease_criteria |
0.08 | ACR/EULAR classification criteria |
autoimmune_disease_activity |
0.07 | DAS28-CRP, SLEDAI-2K, CDAI, BASDAI scoring |
autoimmune_flare_patterns |
0.06 | Biomarker patterns preceding flares |
autoimmune_biologic_therapies |
0.06 | TNF inhibitors, anti-CD20, IL-6R, JAK inhibitors |
autoimmune_pgx_rules |
0.04 | Pharmacogenomic rules (CYP2C19, FCGR3A, HLA) |
autoimmune_clinical_trials |
0.05 | Active and recent clinical trials |
autoimmune_literature |
0.05 | Published literature with year filtering |
autoimmune_patient_timelines |
0.03 | Longitudinal patient event timelines |
autoimmune_cross_disease |
0.02 | Overlap syndromes, cross-disease mechanisms |
genomic_evidence |
0.02 | Shared genomic evidence (read-only) |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with Milvus and engine status |
| GET | /healthz |
Kubernetes-style liveness probe |
| GET | /collections |
List loaded collections with record counts |
| GET | /metrics |
Prometheus-compatible metrics export |
| POST | /v1/autoimmune/integrated-assessment |
Full multi-collection integrated autoimmune assessment |
| POST | /query |
Free-text RAG query across all autoimmune collections |
| POST | /query/stream |
Streaming RAG query (Server-Sent Events) |
| POST | /search |
Direct vector similarity search with collection filtering |
| POST | /analyze |
Full patient analysis (antibodies + HLA + biomarkers combined) |
| POST | /differential |
Differential diagnosis from symptom/antibody profile |
| POST | /ingest/upload |
Upload and ingest clinical documents (PDF, DOCX, text) |
| POST | /ingest/demo-data |
Load demo patient dataset for testing |
| POST | /collections/create |
Create custom collections for institution-specific data |
| POST | /export |
Export analysis results (Markdown, JSON, PDF, FHIR R4) |
Detailed Clinical Workflow Walkthrough
- Document ingestion -- Clinical documents (progress notes, lab reports, imaging reports) are uploaded via
/ingest/uploador Streamlit UI. Documents are chunked, embedded with BGE-small-en-v1.5, and stored inautoimmune_clinical_documents. - Autoantibody panel interpretation -- Antibody results (ANA, anti-dsDNA, anti-CCP, RF, anti-SSA/SSB, anti-Scl-70, anti-Jo-1, ANCA, anti-centromere, anti-Smith, anti-RNP, anti-TPO, TSI, anti-AChR, anti-tTG) are mapped to disease associations with published sensitivity and specificity. For example: anti-CCP positive with RF positive has 95% specificity for RA.
- HLA typing analysis -- HLA alleles are evaluated against 50+ allele-disease associations with odds ratios. Key associations: HLA-B27:05 and ankylosing spondylitis (OR=87.4), HLA-DRB104:01 and RA (OR=4.2), HLA-DQ2/DQ8 and celiac disease (OR=7.0).
- Disease activity scoring -- Standardized scores are calculated based on disease:
- RA: DAS28-CRP (remission <2.6, low 2.6-3.2, moderate 3.2-5.1, high >5.1)
- SLE: SLEDAI-2K (no activity 0, mild 1-5, moderate 6-10, high 11-19, very high >=20)
- IBD: CDAI for Crohn's (remission <150, mild 150-219, moderate 220-450, severe >450)
- AS: BASDAI (low <4, high >=4)
- Flare prediction -- Analyzes longitudinal biomarker patterns (CRP, ESR, IL-6, complement C3/C4, calprotectin, ferritin) against known flare precursors. Returns a 0-1 risk score with contributing factors. Scores above 0.65 trigger an alert to the Biomarker Agent.
- Biologic therapy recommendations -- Filtered by confirmed diagnosis, disease activity level, prior therapy failures, and PGx context (FCGR3A V/F polymorphism for rituximab response, TPMT for azathioprine dosing). Recommendations include TNF inhibitors, anti-CD20, IL-6R blockers, JAK inhibitors, and IL-17/23 inhibitors.
- Diagnostic odyssey analysis -- Surfaces patterns missed across fragmented multi-specialist records. Identifies symptom constellations that span years and multiple providers, flags potential overlap syndromes (e.g., RA + Sjogren's, SLE + APS), and detects the POTS/hEDS/MCAS triad.
Example Query and Response
curl -X POST http://localhost:8532/analyze \
-H "Content-Type: application/json" \
-d '{"antibodies": {"ANA": "positive", "anti_dsDNA": 85, "anti_Smith": "positive"}, "hla": ["DRB1*15:01"], "biomarkers": {"CRP": 18.5, "ESR": 42, "C3": 62, "C4": 7, "anti_dsDNA_titer": 120}}'
{
"primary_diagnosis": {
"disease": "Systemic Lupus Erythematosus (SLE)",
"confidence": 0.92,
"acr_eular_score": 18,
"classification_criteria": "ACR/EULAR 2019 SLE Classification (threshold >= 10)"
},
"disease_activity": {
"score_type": "SLEDAI-2K",
"score": 12,
"level": "High Activity",
"active_domains": ["immunologic (anti-dsDNA, low complement)", "serositis risk"]
},
"flare_risk": {
"score": 0.78,
"risk_level": "High",
"contributing_factors": ["Rising anti-dsDNA (120 IU/mL)", "Low C4 (7 mg/dL)", "Low C3 (62 mg/dL)"],
"recommendation": "Consider preemptive dose adjustment; monitor weekly for 4 weeks"
},
"hla_associations": [
{"allele": "DRB1*15:01", "disease": "SLE", "odds_ratio": 2.1, "significance": "Moderate risk association"}
],
"therapy_recommendations": [
{"rank": 1, "drug": "Belimumab (anti-BLyS)", "rationale": "Add-on for active SLE with high anti-dsDNA and low complement"},
{"rank": 2, "drug": "Mycophenolate mofetil", "rationale": "Steroid-sparing immunosuppression for moderate-severe SLE"},
{"rank": 3, "drug": "Anifrolumab (anti-IFNAR1)", "rationale": "Type I IFN pathway blockade for refractory skin/joint SLE"}
],
"evidence": [
{"collection": "autoimmune_disease_criteria", "score": 0.94, "text": "ACR/EULAR 2019: Anti-dsDNA + low complement = 10 points..."},
{"collection": "autoimmune_biologic_therapies", "score": 0.89, "text": "BLISS-76: Belimumab + SOC improved SRI-4 response vs placebo..."}
],
"processing_time_ms": 4120
}
Cross-Agent Integration
- Receives longitudinal biomarker trends from Biomarker Agent (:8529) for disease monitoring -- CRP, ESR, complement, calprotectin tracked over time with trend analysis
- Requests imaging assessment from Imaging Agent (:8524) for joint erosion scoring (MRI/ultrasound), organ evaluation (renal, pulmonary), and skin lesion characterization
- Publishes diagnosis events to other agents via SSE event bus -- new autoimmune diagnoses trigger downstream assessments in Cardiology (pericarditis risk), Neurology (CNS lupus, MS overlap), and PGx (immunosuppressant metabolism)
- Queries
genomic_evidencefor HLA alleles and autoimmune-associated variants (CTLA4, PTPN22, IL2RA, STAT4) to integrate genetic risk - Sends flare alerts to Biomarker Agent (:8529) when biomarker patterns indicate impending flare, triggering intensified monitoring protocols
Performance Characteristics
| Operation | Typical Latency |
|---|---|
| Single collection search (top-15) | 80-150 ms |
| Full 14-collection weighted search | 300-500 ms |
| Disease activity calculation | 30-80 ms |
| Flare risk prediction | 50-120 ms |
| Autoantibody panel interpretation | 40-100 ms |
| HLA association lookup | 20-50 ms |
| Claude LLM synthesis | 2,500-4,000 ms |
| Full analysis (end-to-end) | 3,500-5,500 ms |
| Document ingestion (per page) | 500-1,000 ms |
Common Pitfalls and Troubleshooting
- ANA "positive" without titer -- The agent accepts qualitative ("positive"/"negative") or quantitative (titer, e.g., "1:640") ANA values. Quantitative values enable more precise disease association scoring. Include titer and pattern (homogeneous, speckled, nucleolar, centromere) when available.
- Disease activity score returns "insufficient data" -- Each score requires specific inputs. DAS28-CRP needs tender joint count (28 joints), swollen joint count (28 joints), CRP (mg/L), and patient global assessment (0-100 VAS). Missing any component prevents calculation.
- HLA alleles not recognized -- Use standard nomenclature (e.g., "B27:05" not "B27" or "HLA-B27"). The agent accepts both 2-field (B27:05) and 1-field (B*27) resolution but 2-field provides more accurate odds ratios.
- Flare prediction unreliable with single timepoint -- The prediction model works best with longitudinal data (3+ timepoints). A single snapshot can estimate current risk but cannot detect trajectory-based patterns (rising anti-dsDNA, falling complement).
- Overlap syndromes missed -- The diagnostic odyssey analysis requires documents from multiple visits and specialties. Single-visit data limits the agent's ability to detect cross-specialist patterns. Upload historical records for comprehensive analysis.
- Demo data ingestion fails -- Ensure the
/data/demodirectory exists and contains the sample patient files. Runcurl http://localhost:8532/healthfirst to verify Milvus connectivity before ingesting.
3.7 Rare Disease Diagnostic Agent (:8134)¶
Overview and Clinical Purpose
The Rare Disease Diagnostic Agent provides differential diagnosis, ACMG/AMP variant interpretation (implementing 23 ACMG criteria), HPO-based phenotype matching with information content (IC) weighted similarity scoring, therapeutic option search (orphan drugs, gene therapy, enzyme replacement), and clinical trial eligibility assessment. It covers 13 disease categories with 88 diseases in its knowledge base and tracks 25 approved or investigational gene therapies.
Advanced Configuration Options
All environment variables use the RD_ prefix. Key settings:
| Variable | Default | Description |
|---|---|---|
RD_MILVUS_HOST |
localhost |
Milvus vector database host |
RD_MILVUS_PORT |
19530 |
Milvus port |
RD_API_PORT |
8134 |
FastAPI REST server port |
RD_STREAMLIT_PORT |
8544 |
5-tab Streamlit UI port |
RD_LLM_MODEL |
claude-sonnet-4-6 |
Claude model for synthesis |
RD_ANTHROPIC_API_KEY |
(none) | Required for LLM; search-only mode without it |
RD_SCORE_THRESHOLD |
0.4 |
Minimum cosine similarity for evidence retrieval |
RD_TOP_K_PHENOTYPES |
50 |
Max results from rd_phenotypes per query |
RD_TOP_K_VARIANTS |
100 |
Max results from rd_variants per query |
RD_INGEST_SCHEDULE_HOURS |
24 |
Auto-ingest cycle (set RD_INGEST_ENABLED=true) |
RD_MAX_CONVERSATION_CONTEXT |
3 |
Prior exchanges injected for multi-turn |
RD_CITATION_HIGH_THRESHOLD |
0.75 |
Score above which citations are marked high-confidence |
RD_API_KEY |
(empty) | Set to require X-API-Key header on all endpoints |
RD_CROSS_AGENT_TIMEOUT |
30 |
Seconds before cross-agent HTTP calls time out |
Collection weight tuning: 14 RD_WEIGHT_* variables (e.g., RD_WEIGHT_PHENOTYPES=0.12) control the relative influence of each collection during multi-collection fusion. Weights must sum to approximately 1.0 (tolerance 0.05).
Milvus Collections (14)
| Collection | Description |
|---|---|
rd_phenotypes |
HPO-coded phenotype descriptions |
rd_diseases |
Disease definitions (OMIM, Orphanet) |
rd_genes |
Gene-disease associations |
rd_variants |
Pathogenic variant records |
rd_literature |
Rare disease literature |
rd_trials |
Rare disease clinical trials |
rd_therapies |
Orphan drugs, gene therapy, ERT |
rd_case_reports |
Published case reports |
rd_guidelines |
Clinical management guidelines |
rd_pathways |
Molecular pathways |
rd_registries |
Patient registries |
rd_natural_history |
Natural history studies |
rd_newborn_screening |
Newborn screening panels |
genomic_evidence |
Shared genomic evidence (read-only) |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with component readiness and vector counts |
| GET | /collections |
Collection names and record counts |
| GET | /workflows |
11 available diagnostic workflow definitions |
| GET | /metrics |
Prometheus-compatible metrics export |
| POST | /v1/diagnostic/query |
RAG Q&A across all rare disease collections |
| POST | /v1/diagnostic/search |
Multi-collection evidence search (no LLM) |
| POST | /v1/diagnostic/diagnose |
Differential diagnosis from HPO terms |
| POST | /v1/diagnostic/variants/interpret |
ACMG/AMP variant classification |
| POST | /v1/diagnostic/phenotype/match |
HPO-to-disease phenotype matching |
| POST | /v1/diagnostic/therapy/search |
Therapeutic option search (ERT, SRT, gene therapy) |
| POST | /v1/diagnostic/trial/match |
Clinical trial eligibility assessment |
| POST | /v1/diagnostic/workflow/{type} |
Generic workflow dispatch (11 types) |
| POST | /v1/diagnostic/integrated-assessment |
Cross-agent multi-agent assessment |
| GET | /v1/diagnostic/disease-categories |
Reference catalog of 13 disease categories |
| GET | /v1/diagnostic/gene-therapies |
25 approved/investigational gene therapies |
| GET | /v1/diagnostic/acmg-criteria |
23 ACMG criteria reference |
| GET | /v1/diagnostic/hpo-categories |
HPO top-level ontology terms |
| GET | /v1/diagnostic/knowledge-version |
Knowledge base version metadata |
| POST | /v1/reports/generate |
Multi-format report generation |
| GET | /v1/reports/formats |
Supported export formats |
| GET | /v1/events/stream |
Server-sent event stream |
Detailed Clinical Workflow Walkthrough
-
Phenotype entry -- Clinician enters patient phenotypes as HPO terms (e.g., HP:0001250 Seizure, HP:0001263 Global developmental delay) via the 5-tab Streamlit UI or the
/v1/diagnostic/diagnoseAPI. Age of onset category (antenatal, neonatal, infantile, childhood, juvenile, adult) is provided as context. -
HPO-to-disease matching -- The agent embeds the HPO term descriptions using BGE-small-en-v1.5, searches the
rd_phenotypescollection (top_k=50), and computes IC-weighted Resnik similarity against OMIM and Orphanet disease phenotype profiles. Each disease receives a composite phenotype match score. -
Differential diagnosis ranking -- Candidate diseases are ranked by phenotype match score. The LLM synthesizes evidence from
rd_diseases,rd_literature, andrd_case_reportsto assign confidence levels (high/moderate/low) and highlight discriminating features. -
Variant interpretation -- If VCF data is available, candidate variants in genes associated with top-ranked diseases are classified using 23 ACMG/AMP criteria: PVS1, PM1-PM6, PP1-PP5, BA1, BS1-BS4, BP1-BP7. ClinVar, gnomAD allele frequency, AlphaMissense scores, and in-silico predictors are integrated.
-
Therapeutic search -- For confirmed or suspected diagnoses, the agent queries
rd_therapiesfor orphan drugs (FDA/EMA approved), gene therapies (25 tracked), enzyme replacement therapies, substrate reduction therapies, and investigational compounds. -
Trial eligibility -- The agent cross-references patient age, gene, diagnosis, and variant status against
rd_trialsand triggers the Clinical Trial Agent for expanded matching. -
Report export -- Final report exported as Markdown, JSON, PDF, FHIR R4 DiagnosticReport, or GA4GH Phenopacket v2.
Example Query and Response
curl -X POST http://localhost:8134/v1/diagnostic/diagnose \
-H "Content-Type: application/json" \
-d '{"hpo_terms": ["HP:0001250", "HP:0001263", "HP:0002079"], "age_of_onset": "infantile"}'
{
"status": "completed",
"differential_diagnosis": [
{
"rank": 1,
"disease": "Dravet syndrome",
"omim": "607208",
"orphanet": "ORPHA:33069",
"phenotype_match_score": 0.87,
"confidence": "high",
"key_gene": "SCN1A",
"discriminating_features": ["Febrile seizures progressing to afebrile", "Temperature sensitivity"]
},
{
"rank": 2,
"disease": "CLN2 disease (late-infantile neuronal ceroid lipofuscinosis)",
"omim": "204500",
"orphanet": "ORPHA:228346",
"phenotype_match_score": 0.74,
"confidence": "moderate",
"key_gene": "TPP1",
"discriminating_features": ["Progressive vision loss", "Cerebellar atrophy on MRI"]
}
],
"hpo_terms_matched": 3,
"diseases_evaluated": 88,
"evidence_sources": 42,
"processing_time_ms": 2340.5
}
Cross-Agent Integration
| Direction | Partner Agent | Trigger/Scenario |
|---|---|---|
| Outbound | Pharmacogenomics (:8107) | Queries drug metabolism context when rare disease therapeutics have PGx implications (e.g., ERT dosing) |
| Outbound | Clinical Trial (:8538) | Triggers expanded trial matching for rare disease patients via /v1/trial/workflow/patient_matching/run |
| Outbound | Imaging (:8524) | Requests imaging assessment for rare diseases with organ involvement (e.g., lysosomal storage diseases) |
| Outbound | Cardiology (:8126) | Queries cardiac phenotype context for cardiomyopathy-associated rare diseases (e.g., Fabry, Pompe) |
| Inbound | Biomarker (:8529) | Receives phenotype-genotype correlation requests |
| Bidirectional | All agents | Publishes diagnosis events via SSE event bus; reads genomic_evidence (shared) |
Performance Characteristics
| Operation | Typical Latency | Notes |
|---|---|---|
| Phenotype match (3 HPO terms) | 800-1,200 ms | Vector search across rd_phenotypes (top_k=50) |
| Differential diagnosis (full) | 2,000-3,500 ms | Multi-collection search + LLM synthesis |
| ACMG variant interpretation | 1,500-2,500 ms | ClinVar + gnomAD + AlphaMissense lookup + LLM classification |
| Therapeutic search | 600-1,000 ms | rd_therapies search only |
| Report generation (PDF) | 3,000-5,000 ms | LLM narrative + template rendering |
Common Pitfalls and Troubleshooting
- HPO term formatting: Terms must use the
HP:NNNNNNNformat (7 digits, zero-padded). Using free-text symptom descriptions instead of HPO codes bypasses the IC-weighted matching and falls back to text similarity, which is less precise. - ACMG criteria incomplete: The agent implements 23 of 28 ACMG criteria. Functional assay evidence (PS3) and segregation data (PP1 with quantitative scoring) require manual review.
- Low phenotype match scores: If all differential diagnoses score below 0.5, consider whether the phenotype is incomplete. Adding more HPO terms (especially specific sub-terms rather than general parent terms) significantly improves matching.
- Gene therapy list currency: The 25 gene therapies are current as of March 2026. New FDA/EMA approvals require a data refresh via
scripts/seed_knowledge.py. - Cross-agent timeout: If the PGx or Trial agent is down, the integrated assessment degrades gracefully (returns partial results). Check
RD_CROSS_AGENT_TIMEOUTif timeouts are frequent.
3.8 Pharmacogenomics Intelligence Agent (:8107)¶
Overview and Clinical Purpose
The Pharmacogenomics (PGx) Intelligence Agent translates patient genotype data into actionable prescribing guidance. It interprets star alleles for 14 pharmacogenes, applies 9 CPIC guideline algorithms, screens for 12 HLA-mediated adverse drug reaction associations, models phenoconversion from drug-drug interactions, and provides genotype-guided dosing algorithms including the IWPC warfarin dosing calculator.
Advanced Configuration Options
All environment variables use the PGX_ prefix. Key settings:
| Variable | Default | Description |
|---|---|---|
PGX_MILVUS_HOST |
localhost |
Milvus vector database host |
PGX_MILVUS_PORT |
19530 |
Milvus port |
PGX_API_PORT |
8107 |
FastAPI REST server port |
PGX_STREAMLIT_PORT |
8507 |
Streamlit UI port |
PGX_LLM_MODEL |
claude-sonnet-4-6 |
Claude model for synthesis |
PGX_ANTHROPIC_API_KEY |
(none) | Required for LLM features |
PGX_SCORE_THRESHOLD |
0.4 |
Minimum cosine similarity for evidence retrieval |
PGX_TOP_K_PER_COLLECTION |
5 |
Default results per collection |
PGX_INGEST_SCHEDULE_HOURS |
168 |
Weekly auto-ingest cycle (7 days) |
PGX_API_KEY |
(empty) | Set to require X-API-Key header |
PGX_CROSS_AGENT_TIMEOUT |
30 |
Cross-agent HTTP timeout in seconds |
PGX_RAG_PIPELINE_ROOT |
/app/rag-chat-pipeline |
Path to shared RAG pipeline |
Collection weight tuning: 15 PGX_WEIGHT_* variables control multi-collection fusion. PGX_WEIGHT_DRUG_GUIDELINES defaults to 0.14 (highest weight) because CPIC/DPWG guideline matching is the most clinically relevant signal.
Milvus Collections (15)
| Collection | Description |
|---|---|
pgx_gene_reference |
Pharmacogene star allele definitions and activity scores |
pgx_drug_guidelines |
CPIC/DPWG clinical prescribing guidelines |
pgx_drug_interactions |
Drug-gene interaction records (PharmGKB) |
pgx_hla_hypersensitivity |
HLA-mediated adverse drug reaction screening |
pgx_phenoconversion |
Metabolic phenoconversion via DDI |
pgx_dosing_algorithms |
Genotype-guided dosing algorithms and formulas |
pgx_clinical_evidence |
Published PGx clinical evidence and outcomes |
pgx_population_data |
Population-specific allele frequency data |
pgx_clinical_trials |
PGx-related clinical trials |
pgx_fda_labels |
FDA pharmacogenomic labeling information |
pgx_drug_alternatives |
Genotype-guided therapeutic alternatives |
pgx_patient_profiles |
Patient diplotype-phenotype profiles |
pgx_implementation |
Clinical PGx implementation programs |
pgx_education |
PGx educational resources and guidelines |
genomic_evidence |
Shared genomic evidence (read-only) |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with Milvus and LLM status |
| GET | /collections |
Collection names and record counts |
| GET | /metrics |
Prometheus-compatible metrics |
| POST | /query |
RAG Q&A across all PGx collections |
| POST | /search |
Multi-collection evidence search (no LLM) |
| POST | /profile |
Full PGx profile from diplotype map |
| POST | /dosing |
Drug-specific dosing guidance |
| POST | /hla-screen |
HLA hypersensitivity screening |
| POST | /v1/pgx/drug-check |
Single-drug PGx check with CPIC alerts |
| POST | /v1/pgx/medication-review |
Polypharmacy review with DDI detection |
| POST | /v1/pgx/dosing/warfarin |
IWPC warfarin dosing algorithm |
| POST | /v1/pgx/hla-screen |
HLA adverse reaction screening |
| POST | /v1/pgx/phenoconversion |
Phenoconversion analysis from concomitant drugs |
| POST | /v1/pgx/integrated-assessment |
Cross-agent multi-agent assessment |
| GET | /v1/pgx/genes |
List all 14 pharmacogenes with metadata |
| GET | /v1/pgx/drugs |
List all tracked drugs by therapeutic category |
| POST | /v1/reports/generate |
Multi-format report generation |
| GET | /v1/reports/formats |
Supported export formats |
| GET | /v1/events/stream |
Server-sent event stream |
Detailed Clinical Workflow Walkthrough
-
Diplotype submission -- Clinician submits patient diplotypes (e.g.,
CYP2D6: *1/*4,CYP2C19: *1/*2) from VCF-based star allele calling (Stargazer, Cyrius) or laboratory report. The/profileor/v1/pgx/drug-checkendpoint receives the data. -
Metabolizer phenotype assignment -- Each diplotype is mapped to an activity score and metabolizer phenotype: ultra-rapid metabolizer (UM), normal metabolizer (NM), intermediate metabolizer (IM), or poor metabolizer (PM). The mapping follows CPIC standardized terms.
-
CPIC/DPWG guideline matching -- For each gene-drug pair in the patient's medication list, the agent queries
pgx_drug_guidelinesfor applicable CPIC Level A/B recommendations. 9 CPIC algorithms are implemented covering CYP2D6, CYP2C19, CYP2C9, CYP3A5, DPYD, TPMT, NUDT15, SLCO1B1, and UGT1A1. -
Phenoconversion check -- Current medications are screened against CYP inhibitor/inducer databases. A strong CYP2D6 inhibitor (e.g., fluoxetine, paroxetine) in a genotypic NM converts the effective phenotype to PM. The
/v1/pgx/phenoconversionendpoint returns the precipitant drug, inhibitor strength, converted phenotype, and affected substrate drugs. -
HLA screening -- 12 HLA-drug associations are checked. Key associations include HLA-B57:01/abacavir (mandatory screening), HLA-B58:01/allopurinol, HLA-B15:02/carbamazepine, and HLA-A31:01/carbamazepine. Population-specific prevalence data is returned.
-
Dosing algorithm -- For warfarin, the IWPC algorithm (Klein et al., NEJM 2009) calculates predicted weekly dose incorporating VKORC1, CYP2C9, age, height, weight, race, amiodarone use, and enzyme inducer status. Safety bounds clamp output to 3-105 mg/week.
-
Alternative recommendations -- For PM/UM patients, the agent retrieves genotype-appropriate alternative drugs from
pgx_drug_alternativesand generates a structured report.
Example Query and Response
curl -X POST http://localhost:8107/v1/pgx/drug-check \
-H "Content-Type: application/json" \
-d '{"drug": "codeine", "gene": "CYP2D6", "phenotype": "poor_metabolizer", "diplotype": "*4/*4"}'
{
"drug": "codeine",
"gene": "CYP2D6",
"phenotype": "poor_metabolizer",
"alerts": [
{
"alert_level": "critical",
"gene": "CYP2D6",
"drug": "codeine",
"phenotype": "poor_metabolizer",
"recommendation": "CPIC guideline exists for CYP2D6-codeine. Patient phenotype: poor_metabolizer. Consider alternative therapy or significant dose reduction.",
"guideline_body": "CPIC",
"cpic_level": "A",
"alternative_drugs": ["morphine", "non-opioid analgesics", "acetaminophen"]
}
],
"knowledge_context": "CYP2D6 encodes cytochrome P450 2D6, which metabolizes codeine to morphine via O-demethylation. Poor metabolizers (*4/*4) produce insufficient morphine for analgesic effect.",
"processing_time_ms": 45.2
}
Cross-Agent Integration
| Direction | Partner Agent | Trigger/Scenario |
|---|---|---|
| Outbound | Oncology (:8527) | Queries planned chemotherapy context for metabolism assessment (e.g., tamoxifen and CYP2D6) |
| Outbound | Cardiology (:8126) | Queries cardiac drug interaction context (e.g., clopidogrel and CYP2C19) |
| Outbound | Neurology (:8528) | Queries neurotoxic drug interaction context (e.g., carbamazepine and HLA-B*15:02) |
| Outbound | Clinical Trial (:8538) | Queries PGx-guided clinical trials |
| Inbound | All prescribing agents | Any agent recommending drug therapy can query PGx for metabolism context |
| Inbound | Biomarker (:8529) | Receives requests for genotype-adjusted biomarker reference ranges |
Performance Characteristics
| Operation | Typical Latency | Notes |
|---|---|---|
| Single drug check | 30-80 ms | Knowledge graph lookup only, no vector search |
| Polypharmacy review (5 drugs) | 100-250 ms | Iterates CYP inhibitor/inducer databases |
| Warfarin IWPC dosing | 10-30 ms | Pure mathematical calculation |
| HLA screening | 20-60 ms | Dictionary lookup against 12 associations |
| Phenoconversion analysis | 40-120 ms | CYP inhibitor/inducer scan per drug |
| Full RAG query | 1,500-3,000 ms | 15-collection vector search + LLM synthesis |
Common Pitfalls and Troubleshooting
- Star allele formatting: Diplotypes must use the
*N/*Nformat (e.g.,*1/*4). Suballeles like*1aand*5are supported for SLCO1B1. Submitting raw VCF positions instead of called star alleles will not produce useful results. - Phenoconversion missed: The phenoconversion engine only detects interactions from the concomitant drug list explicitly provided. It does not infer medications from other clinical data. Ensure the full medication list is submitted.
- Population-specific allele frequencies: HLA prevalence data varies significantly by ancestry. The agent reports population-specific frequencies (e.g., HLA-B*57:01 prevalence: 6-8% European, <1% East Asian) but does not automatically select the relevant population.
- IWPC warfarin clamping: Doses below 3 mg/week or above 105 mg/week are clamped to safety bounds. If the algorithm produces extreme values, verify input genotypes and clinical parameters.
- Missing CPIC guideline: Not all gene-drug pairs have CPIC Level A guidelines. The agent returns an
infoalert listing which drugs do have guidelines for the queried gene.
3.9 Imaging Intelligence Agent (:8524)¶
Overview and Clinical Purpose
The Imaging Intelligence Agent provides clinical decision support for radiology through a RAG knowledge system, 4 NVIDIA NIM microservices (VISTA-3D with 127 anatomical structures, MAISI, VILA-M3, Llama-3), and 4 reference clinical workflows. It supports Orthanc DICOM auto-ingestion, cross-modal genomics enrichment, NVIDIA FLARE federated learning, and FHIR R4 interoperability.
Advanced Configuration Options
All environment variables use the IMAGING_ prefix. Key settings:
| Variable | Default | Description |
|---|---|---|
IMAGING_MILVUS_HOST |
localhost |
Milvus vector database host |
IMAGING_MILVUS_PORT |
19530 |
Milvus port |
IMAGING_API_PORT |
8524 |
FastAPI REST server port |
IMAGING_STREAMLIT_PORT |
8525 |
9-tab Streamlit UI port |
IMAGING_LLM_MODEL |
claude-sonnet-4-20250514 |
Claude model for synthesis |
IMAGING_NIM_MODE |
local |
NIM mode: local, cloud, or mock |
IMAGING_NIM_LLM_URL |
http://localhost:8520/v1 |
Local Llama-3 NIM endpoint |
IMAGING_NIM_VISTA3D_URL |
http://localhost:8530 |
VISTA-3D segmentation endpoint |
IMAGING_NIM_MAISI_URL |
http://localhost:8531 |
MAISI synthetic CT endpoint |
IMAGING_NIM_VILAM3_URL |
http://localhost:8532 |
VILA-M3 vision-language endpoint |
IMAGING_NIM_ALLOW_MOCK_FALLBACK |
true |
Fall back to mock responses if NIM unavailable |
IMAGING_NVIDIA_API_KEY |
(none) | Required for cloud NIM mode |
IMAGING_ORTHANC_URL |
http://localhost:8042 |
Orthanc DICOM server |
IMAGING_DICOM_AUTO_INGEST |
false |
Enable automatic DICOM webhook processing |
IMAGING_DICOM_WATCH_INTERVAL |
5 |
Seconds between Orthanc /changes polls |
IMAGING_CROSS_MODAL_ENABLED |
false |
Enable imaging-to-genomics cross-modal triggers |
IMAGING_PREVIEW_DEFAULT_FPS |
8 |
Frames per second for NIfTI preview videos |
NVIDIA NIM Integration
| NIM Service | Port | Capability |
|---|---|---|
| VISTA-3D | 8530 | 3D medical image segmentation (127 anatomical structures) |
| MAISI | 8531 | Synthetic CT volume generation with paired segmentation masks |
| VILA-M3 | 8532 | Vision-language model for radiology image understanding |
| Llama-3 8B | 8520 | Clinical reasoning and report generation |
Milvus Collections (11)
| Collection | Content |
|---|---|
imaging_literature |
2,678 PubMed research papers |
imaging_trials |
12 clinical trials |
imaging_findings |
Imaging finding templates and patterns |
imaging_protocols |
Acquisition protocols and parameters |
imaging_devices |
FDA-cleared AI/ML medical devices |
imaging_anatomy |
Anatomical structure references |
imaging_benchmarks |
Model performance benchmarks |
imaging_guidelines |
ACR, RSNA, NCCN guidelines |
imaging_report_templates |
Structured radiology report templates |
imaging_datasets |
Public imaging datasets (TCIA, PhysioNet) |
genomic_evidence |
Shared from Stage 2 (read-only, 3.56M) |
4 Reference Workflows
| Workflow | Modality | Target Latency |
|---|---|---|
| CT Head Hemorrhage Triage | CT | < 90 sec |
| CT Chest Lung Nodule Tracking | CT | < 5 min |
| CXR Rapid Findings | X-ray | < 30 sec |
| MRI Brain MS Lesion Tracking | MRI | < 5 min |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with collection stats and NIM status |
| GET | / |
Service info with links to docs and health |
| GET | /collections |
Collection names, counts, and labels |
| GET | /metrics |
Prometheus metrics (Counter, Histogram) |
| GET | /knowledge/stats |
Imaging domain knowledge graph statistics |
| POST | /query |
Full RAG query: multi-collection retrieval + LLM synthesis |
| POST | /search |
Evidence-only search (no LLM synthesis) |
| POST | /find-related |
Cross-collection entity linking |
| POST | /api/ask |
Meta-agent question answering |
| POST | /nim/vista3d/segment |
VISTA-3D 3D segmentation (127 structures) |
| POST | /nim/maisi/generate |
MAISI synthetic CT generation |
| POST | /nim/vilam3/analyze |
VILA-M3 image understanding |
| POST | /nim/llm/generate |
Llama-3 text generation |
| GET | /nim/status |
NIM service availability check |
| POST | /workflow/{name}/run |
Execute a clinical workflow |
| GET | /workflows |
List available workflows |
| POST | /events/dicom-webhook |
DICOM auto-ingest webhook (Orthanc) |
| GET | /events/stream |
SSE event stream |
| POST | /reports/generate |
Multi-format report generation |
| GET | /reports/formats |
Supported export formats |
| GET | /preview/{study_id} |
NIfTI slice preview (MP4/GIF) |
| GET | /demo-cases |
Pre-loaded demo imaging cases |
| POST | /protocol/optimize |
Protocol optimization advisor |
| POST | /dose/estimate |
Radiation dose intelligence |
Detailed Clinical Workflow Walkthrough (CT Head Hemorrhage Triage)
-
DICOM arrival -- A CT head study arrives at Orthanc. If
DICOM_AUTO_INGESTis enabled, the/events/dicom-webhookendpoint is triggered with the study UID, modality, and body part. -
Study retrieval -- The agent fetches the DICOM series from Orthanc, converts to NIfTI format, and generates a slice preview video (8 FPS MP4).
-
VISTA-3D segmentation -- The NIfTI volume is sent to the VISTA-3D NIM at port 8530. The model segments 127 anatomical structures, identifying brain parenchyma, ventricles, midline structures, and extra-axial spaces.
-
Hemorrhage detection -- The workflow applies density thresholds and morphological analysis on the segmented volume to detect hyperdense regions consistent with acute hemorrhage. Classification includes epidural, subdural, subarachnoid, intraparenchymal, and intraventricular subtypes.
-
RAG evidence enrichment -- The agent searches
imaging_literature,imaging_guidelines, andimaging_findingsfor evidence relevant to the detected pattern. Genomic evidence fromgenomic_evidenceis queried if cross-modal is enabled (e.g., coagulation factor variants). -
Report generation -- A structured radiology report is generated using the Llama-3 NIM or Claude, populated with findings, measurements, classification, and relevant literature citations. FHIR R4 DiagnosticReport format is available.
-
Triage alert -- If hemorrhage is detected, a high-priority SSE event is published to the event stream for downstream consumers.
Example Query and Response
curl -X POST http://localhost:8524/query \
-H "Content-Type: application/json" \
-d '{"question": "What is the sensitivity of AI-assisted CXR for pneumothorax detection?", "top_k": 5}'
{
"question": "What is the sensitivity of AI-assisted CXR for pneumothorax detection?",
"answer": "AI-assisted chest X-ray systems demonstrate sensitivity of 85-95% for pneumothorax detection, with specificity of 90-98%. FDA-cleared devices include Annalise.ai CXR (sensitivity 94.2%), Qure.ai qXR (sensitivity 91.8%), and Zebra Medical Vision (sensitivity 89.5%). Performance is highest for moderate-to-large pneumothoraces (>2 cm) and decreases for small or loculated collections. Current ACR guidelines recommend AI as a triage aid rather than a standalone diagnostic tool.",
"evidence_count": 18,
"collections_searched": 8,
"search_time_ms": 1245.3,
"nim_services_used": ["llm"]
}
Cross-Agent Integration
| Direction | Partner Agent | Trigger/Scenario |
|---|---|---|
| Outbound | Oncology (:8527) | Lung-RADS 4A+ nodules trigger genomic_evidence query for EGFR/ALK/ROS1/KRAS variants |
| Outbound | Cardiology (:8126) | Cardiac CT findings shared for integrated cardiovascular assessment |
| Inbound | Neurology (:8528) | Receives MRI brain requests for MS lesion tracking workflow |
| Inbound | Rare Disease (:8134) | Receives imaging assessment requests for organ-involved rare diseases |
| Bidirectional | All agents | Publishes imaging events via SSE; reads genomic_evidence (shared) |
Performance Characteristics
| Operation | Typical Latency | Notes |
|---|---|---|
| RAG query (full) | 1,500-3,000 ms | 11-collection search + LLM synthesis |
| Evidence search (no LLM) | 400-800 ms | Vector search only |
| VISTA-3D segmentation | 30-90 sec | GPU-dependent; 127-class volumetric segmentation |
| MAISI synthetic generation | 60-180 sec | Full CT volume synthesis |
| VILA-M3 image analysis | 5-15 sec | Single image understanding |
| CXR Rapid Findings workflow | < 30 sec | End-to-end triage pipeline |
| NIfTI preview generation | 2-5 sec | MP4/GIF slice animation |
Common Pitfalls and Troubleshooting
- NIM service unavailable: If NIM containers are not running, set
IMAGING_NIM_ALLOW_MOCK_FALLBACK=trueto get structured mock responses for development. CheckGET /nim/statusfor per-service availability. - DICOM webhook not triggering: Verify
IMAGING_DICOM_AUTO_INGEST=trueand that Orthanc is configured with a Lua script pointing itsOnStableStudycallback tohttp://imaging-agent:8524/events/dicom-webhook. - Cross-modal disabled by default: The
IMAGING_CROSS_MODAL_ENABLEDflag defaults tofalse. Enable it to allow imaging findings to automatically query genomic evidence. - Large DICOM series: Series with >500 slices may exceed the 10MB request size limit. Increase
IMAGING_MAX_REQUEST_SIZE_MBor process via the Orthanc webhook path (which streams data). - Cloud vs local NIM: When
IMAGING_NIM_MODE=cloud, requests go tointegrate.api.nvidia.comand require a validIMAGING_NVIDIA_API_KEY. Cloud mode uses different model identifiers (e.g.,meta/llama-3.1-8b-instruct).
3.10 Single-Cell Intelligence Agent (:8540)¶
Overview and Clinical Purpose
The Single-Cell Intelligence Agent provides cell-type annotation, tumor microenvironment (TME) profiling, drug response prediction, subclonal architecture analysis, spatial transcriptomics niche mapping, trajectory inference, ligand-receptor interaction analysis, biomarker discovery, CAR-T target validation, and treatment monitoring. Its knowledge base covers 57 cell types, 30 drugs, 75 markers, 4 TME phenotypes (hot, cold, excluded, immunosuppressive), and 4 spatial platforms (Visium, MERFISH, Xenium, CosMx). It supports 10 specialized workflows plus general RAG query.
Advanced Configuration Options
All environment variables use the SC_ prefix. Key settings:
| Variable | Default | Description |
|---|---|---|
SC_MILVUS_HOST |
localhost |
Milvus vector database host |
SC_MILVUS_PORT |
19530 |
Milvus port |
SC_API_PORT |
8540 |
FastAPI REST server port |
SC_STREAMLIT_PORT |
8130 |
Streamlit UI port |
SC_LLM_MODEL |
claude-sonnet-4-6 |
Claude model for synthesis |
SC_ANTHROPIC_API_KEY |
(none) | Required for LLM features |
SC_SCORE_THRESHOLD |
0.4 |
Minimum cosine similarity |
SC_TOP_K_CELL_TYPES |
50 |
Max results from sc_cell_types per query |
SC_TOP_K_MARKERS |
40 |
Max results from sc_markers per query |
SC_TOP_K_SPATIAL |
30 |
Max results from sc_spatial per query |
SC_INGEST_SCHEDULE_HOURS |
24 |
Auto-ingest cycle |
SC_GPU_MEMORY_LIMIT_GB |
120 |
GPU memory limit for RAPIDS-based analysis |
SC_CELLXGENE_API_URL |
(none) | Optional CellxGene Census API endpoint |
SC_CROSS_AGENT_TIMEOUT |
30 |
Cross-agent HTTP timeout in seconds |
Collection weight tuning: 12 SC_WEIGHT_* variables control multi-collection fusion. SC_WEIGHT_CELL_TYPES defaults to 0.14 (highest) followed by SC_WEIGHT_MARKERS at 0.12, reflecting the primacy of cell identity in single-cell analysis.
Milvus Collections (13)
| Collection | Description |
|---|---|
sc_cell_types |
Cell type reference profiles and markers (57 types) |
sc_markers |
Marker gene signatures per cell type (75 markers) |
sc_literature |
Single-cell genomics literature |
sc_trials |
Clinical trials involving single-cell analysis |
sc_drugs |
Drug response signatures (GDSC/DepMap, 30 drugs) |
sc_pathways |
Pathway activity signatures |
sc_spatial |
Spatial transcriptomics references (4 platforms) |
sc_trajectories |
Differentiation trajectory templates |
sc_interactions |
Ligand-receptor pair databases (CellPhoneDB/NicheNet) |
sc_tme |
TME classification profiles (4 phenotypes) |
sc_clinical |
Clinical correlation data |
sc_methods |
Computational method references |
genomic_evidence |
Shared genomic evidence (read-only) |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with component readiness |
| GET | /collections |
Collection names and record counts |
| GET | /workflows |
10 available workflow definitions |
| GET | /metrics |
Prometheus-compatible metrics |
| POST | /v1/sc/query |
RAG Q&A across all single-cell collections |
| POST | /v1/sc/search |
Multi-collection evidence search (no LLM) |
| POST | /v1/sc/annotate |
Cell type annotation from marker expression |
| POST | /v1/sc/tme-profile |
TME classification from cell proportions |
| POST | /v1/sc/drug-response |
Drug response prediction per cell type |
| POST | /v1/sc/cart-validate |
CAR-T target validation (on/off-tumor expression) |
| POST | /v1/sc/spatial-niche |
Spatial niche mapping (Visium/MERFISH/Xenium/CosMx) |
| POST | /v1/sc/trajectory |
Trajectory inference from pseudotime data |
| POST | /v1/sc/interactions |
Ligand-receptor interaction analysis |
| POST | /v1/sc/biomarker-discovery |
Biomarker discovery from differential expression |
| POST | /v1/sc/subclonal |
Subclonal architecture and CNV analysis |
| POST | /v1/sc/treatment-monitor |
Treatment response monitoring over time |
| POST | /v1/sc/workflow/{type} |
Generic workflow dispatch (10 types) |
| POST | /v1/sc/integrated-assessment |
Cross-agent multi-agent assessment |
| GET | /v1/sc/cell-types |
Reference catalog of 57 cell types |
| GET | /v1/sc/markers |
Reference catalog of 75 markers |
| GET | /v1/sc/tme-classes |
TME classification definitions (4 phenotypes) |
| GET | /v1/sc/spatial-platforms |
Supported spatial platforms |
| POST | /v1/reports/generate |
Multi-format report generation |
| GET | /v1/reports/formats |
Supported export formats |
| GET | /v1/events/stream |
Server-sent event stream |
Detailed Clinical Workflow Walkthrough (TME-Guided Immunotherapy Selection)
-
Expression data submission -- Clinician submits a gene expression marker panel or cell type proportions from a tumor biopsy via the
/v1/sc/tme-profileendpoint. Input can be raw marker values (e.g.,CD3D: 5.2, CD8A: 4.1, GZMB: 3.8) or pre-computed cell type proportions. -
Cell type annotation -- If raw markers are provided, the agent performs multi-strategy annotation: reference-based matching against the 57-cell-type atlas, marker-based scoring using the 75-marker database, and LLM-augmented classification for ambiguous profiles. Confidence levels (high/medium/low) are assigned.
-
TME classification -- Cell type proportions are analyzed against the 4 TME phenotype profiles:
- Hot (immune-inflamed): high CD8+ T cells (>15%), high PD-L1 expression, active IFN-gamma signaling
- Cold (immune-desert): low immune infiltrate (<5%), absent T cell signatures
- Excluded (immune-excluded): immune cells at tumor periphery, high stromal content
-
Immunosuppressive: high Treg (>8%), high M2 macrophage, elevated TGF-beta/IL-10
-
Drug response prediction -- Based on the TME phenotype and cell-type-resolved expression, the agent queries
sc_drugsfor GDSC/DepMap sensitivity signatures across 30 drugs. Checkpoint inhibitors (pembrolizumab, nivolumab, atezolizumab) are ranked for hot TMEs; combination strategies are suggested for cold/excluded TMEs. -
Subclonal architecture -- If CNV data is available, the agent performs subclonal detection to identify resistant subpopulations and assess antigen escape risk. Clones with loss of heterozygosity at HLA loci are flagged.
-
Spatial context -- For spatial transcriptomics data (Visium, MERFISH, Xenium, CosMx), the agent maps immune cell niches relative to tumor boundaries, quantifying immune exclusion zones and tertiary lymphoid structures.
-
Report generation -- Results exported as Markdown, JSON, PDF, or FHIR R4.
Example Query and Response
curl -X POST http://localhost:8540/v1/sc/tme-profile \
-H "Content-Type: application/json" \
-d '{"cell_type_proportions": {"CD8_T_cell": 0.15, "Macrophage": 0.25, "Treg": 0.08, "Fibroblast": 0.30, "Tumor": 0.18, "NK_cell": 0.04}}'
{
"status": "completed",
"tme_classification": "immunosuppressive",
"confidence": "high",
"rationale": "Elevated Treg proportion (8%) combined with high macrophage content (25%) suggests immunosuppressive microenvironment. CD8+ T cell infiltration (15%) is present but likely functionally exhausted based on the Treg:CD8 ratio.",
"immunotherapy_recommendations": [
{
"drug": "ipilimumab + nivolumab",
"rationale": "Combination anti-CTLA-4/PD-1 to overcome Treg-mediated suppression",
"evidence_level": "Level 1A (CheckMate 067)"
},
{
"drug": "anti-CCR4 (mogamulizumab)",
"rationale": "Treg depletion strategy for immunosuppressive TME",
"evidence_level": "Level 2B (Phase II data)"
}
],
"escape_risk": "moderate",
"collections_searched": 8,
"processing_time_ms": 1850.3
}
Cross-Agent Integration
| Direction | Partner Agent | Trigger/Scenario |
|---|---|---|
| Outbound | CAR-T (:8522) | Sends antigen escape alerts when off-tumor expression detected in CAR-T target validation |
| Outbound | Oncology (:8527) | Provides TME context for immunotherapy selection and combination strategy |
| Outbound | Biomarker (:8529) | Shares cell-type-resolved biomarker panels for panel enrichment |
| Outbound | Drug Discovery | Queries compound integration for cell-type-selective drug candidates |
| Inbound | Imaging (:8524) | Receives spatial-imaging correlation requests |
| Inbound | Pharmacogenomics (:8107) | Receives requests for cell-type-resolved drug sensitivity data |
| Bidirectional | All agents | Reads genomic_evidence (shared); publishes events via SSE |
Performance Characteristics
| Operation | Typical Latency | Notes |
|---|---|---|
| Cell type annotation (marker panel) | 500-1,200 ms | Multi-strategy: reference + marker + LLM |
| TME profiling | 800-1,500 ms | Proportion analysis + evidence retrieval |
| Drug response prediction | 600-1,200 ms | GDSC/DepMap signature matching |
| CAR-T target validation | 400-800 ms | Expression ratio analysis |
| Spatial niche mapping | 1,000-2,500 ms | Platform-specific spatial analysis |
| Full RAG query | 1,500-3,500 ms | 13-collection search + LLM synthesis |
| Subclonal architecture | 2,000-4,000 ms | CNV detection + clone assignment |
Common Pitfalls and Troubleshooting
- Cell type proportion normalization: Proportions submitted to
/v1/sc/tme-profileshould sum to approximately 1.0. The agent normalizes internally, but significantly unbalanced inputs (e.g., sum > 2.0) may indicate double-counting. - Marker gene naming: Gene symbols must follow HGNC nomenclature (e.g.,
CD3DnotCD3). The agent attempts alias resolution viaENTITY_ALIASESbut non-standard names may fail silently. - TME classification edge cases: Tumors with mixed hot/cold regions (spatially heterogeneous) may be classified inconsistently depending on the biopsy site. Spatial transcriptomics data is recommended for heterogeneous tumors.
- GDSC/DepMap coverage: Drug response predictions are limited to the 30 drugs in the sensitivity database. Novel agents or combination regimens fall back to LLM-based inference from literature.
- GPU memory for RAPIDS: If
SC_GPU_MEMORY_LIMIT_GBis set too low for large expression matrices, RAPIDS operations will fall back to CPU, significantly increasing latency.
3.11 Clinical Trial Intelligence Agent (:8538)¶
Overview and Clinical Purpose
The Clinical Trial Intelligence Agent provides AI-driven support across the full clinical trial lifecycle: protocol optimization, patient-trial matching, site selection with 7-factor scoring, eligibility optimization, adaptive design evaluation, safety signal detection using PRR/ROR disproportionality analysis, regulatory document generation, competitive intelligence, diversity assessment, and decentralized trial planning. It supports 10 specialized workflows plus general RAG query capability, and its knowledge base includes 40 landmark trial references.
Advanced Configuration Options
All environment variables use the TRIAL_ prefix. Key settings:
| Variable | Default | Description |
|---|---|---|
TRIAL_MILVUS_HOST |
localhost |
Milvus vector database host |
TRIAL_MILVUS_PORT |
19530 |
Milvus port |
TRIAL_API_PORT |
8538 |
FastAPI REST server port |
TRIAL_STREAMLIT_PORT |
8128 |
Streamlit UI port |
TRIAL_LLM_MODEL |
claude-sonnet-4-6 |
Claude model for synthesis |
TRIAL_ANTHROPIC_API_KEY |
(none) | Required for LLM features |
TRIAL_SCORE_THRESHOLD |
0.4 |
Minimum cosine similarity |
TRIAL_TOP_K_PER_COLLECTION |
5 |
Default results per collection |
TRIAL_INGEST_SCHEDULE_HOURS |
24 |
Auto-ingest cycle |
TRIAL_API_KEY |
(empty) | Set to require X-API-Key header |
TRIAL_CROSS_AGENT_TIMEOUT |
30 |
Cross-agent HTTP timeout in seconds |
TRIAL_CLINICALTRIALS_API_KEY |
(none) | Optional ClinicalTrials.gov API key |
The agent connects to 8 partner agents for cross-agent integration, with dedicated URL variables: TRIAL_ONCOLOGY_AGENT_URL, TRIAL_PGX_AGENT_URL, TRIAL_CARDIOLOGY_AGENT_URL, TRIAL_BIOMARKER_AGENT_URL, TRIAL_RARE_DISEASE_AGENT_URL, TRIAL_NEUROLOGY_AGENT_URL, TRIAL_SINGLE_CELL_AGENT_URL, TRIAL_IMAGING_AGENT_URL.
Milvus Collections (14)
| Collection | Description |
|---|---|
trial_protocols |
Protocol designs, amendments, SOAs |
trial_eligibility |
Inclusion/exclusion criteria |
trial_endpoints |
Primary, secondary, exploratory endpoints |
trial_sites |
Site feasibility, enrollment history |
trial_investigators |
Investigator profiles, experience |
trial_results |
Published trial results, CSRs (40 landmark trials) |
trial_regulatory |
FDA/EMA guidance, IND/NDA data |
trial_literature |
PubMed clinical trial publications |
trial_biomarkers |
Biomarker-driven trial designs |
trial_safety |
Safety data, DSMB reports, AE databases |
trial_rwe |
Real-world evidence, registries |
trial_adaptive |
Adaptive design references |
trial_guidelines |
ICH, FDA, EMA guidelines |
genomic_evidence |
Shared genomic evidence (cross-agent) |
10 Specialized Workflows
| Workflow | Description |
|---|---|
| Protocol Design | Complexity scoring, SOA review, endpoint optimization |
| Patient Matching | AI eligibility screening with genomic/biomarker integration |
| Site Selection | 7-factor feasibility scoring, enrollment forecasting, diversity metrics |
| Eligibility Optimization | Population impact modeling, competitor benchmarking |
| Adaptive Design | Bayesian interim analysis, dose-response, futility assessment |
| Safety Signal | Disproportionality analysis (PRR/ROR), causality assessment |
| Regulatory Docs | IND, CSR, briefing doc generation (FDA/EMA/PMDA) |
| Competitive Intel | Landscape analysis, enrollment race tracking |
| Diversity Assessment | FDA diversity guidance compliance, gap analysis |
| Decentralized Planning | DCT component feasibility, hybrid model design |
Complete API Endpoint Table
| Method | Path | Description |
|---|---|---|
| GET | /health |
Service health with component readiness |
| GET | /collections |
Collection names and record counts |
| GET | /workflows |
10 workflow definitions + general query |
| GET | /metrics |
Prometheus-compatible metrics |
| POST | /v1/trial/query |
RAG Q&A across all trial collections |
| POST | /v1/trial/search |
Multi-collection evidence search (no LLM) |
| POST | /v1/trial/workflow/protocol_design/run |
Protocol design complexity scoring |
| POST | /v1/trial/workflow/patient_matching/run |
Patient-trial matching with genomics |
| POST | /v1/trial/workflow/site_selection/run |
7-factor site feasibility scoring |
| POST | /v1/trial/workflow/eligibility_optimization/run |
Population impact modeling |
| POST | /v1/trial/workflow/adaptive_design/run |
Bayesian adaptive design evaluation |
| POST | /v1/trial/workflow/safety_signal/run |
PRR/ROR disproportionality analysis |
| POST | /v1/trial/workflow/regulatory_docs/run |
Regulatory document generation |
| POST | /v1/trial/workflow/competitive_intel/run |
Competitive landscape analysis |
| POST | /v1/trial/workflow/diversity_assessment/run |
FDA diversity guidance compliance |
| POST | /v1/trial/workflow/decentralized_planning/run |
DCT feasibility assessment |
| POST | /v1/trial/integrated-assessment |
Cross-agent multi-agent assessment |
| GET | /v1/trial/therapeutic-areas |
Reference catalog of therapeutic areas |
| GET | /v1/trial/phases |
Trial phase definitions |
| GET | /v1/trial/landmark-trials |
40 landmark trial references |
| GET | /v1/trial/regulatory-agencies |
Supported regulatory agencies (FDA/EMA/PMDA) |
| POST | /v1/reports/generate |
Multi-format report generation |
| GET | /v1/reports/formats |
Supported export formats |
| GET | /v1/events/stream |
Server-sent event stream |
Detailed Clinical Workflow Walkthrough (Safety Signal Detection)
-
Adverse event submission -- Safety team submits adverse event data via
/v1/trial/workflow/safety_signal/run. Input includes the drug name, a list of adverse events with counts (exposed and unexposed), and optional background rate data. -
Disproportionality analysis -- The agent computes two standard pharmacovigilance metrics:
- PRR (Proportional Reporting Ratio): ratio of the proportion of a specific AE for the drug vs. all other drugs in the database. PRR > 2 with chi-squared > 4 and N >= 3 triggers a signal.
-
ROR (Reporting Odds Ratio): odds ratio of the AE in exposed vs. unexposed populations. The 95% confidence interval lower bound > 1 indicates a statistically significant signal.
-
Historical comparator retrieval -- The agent searches
trial_safetyfor historical AE rates from similar drug classes and indications. Published DSMB reports and FDA safety communications are retrieved fromtrial_regulatory. -
Causality assessment -- Evidence from
trial_literatureandtrial_resultsis used to assess temporal relationship, dose-response, biological plausibility, and consistency across studies. The LLM synthesizes a causality narrative referencing the WHO-UMC and Naranjo scales. -
Benchmark against landmark trials -- The detected signal is compared against AE profiles from relevant landmark trials in
trial_results(40 stored). For example, hepatotoxicity in a checkpoint inhibitor trial would be benchmarked against CheckMate 067 and KEYNOTE 024 safety data. -
Report generation -- A structured safety signal report is generated with PRR/ROR calculations, confidence intervals, historical comparator data, causality assessment, and recommended actions (continue monitoring, dose modification, or study hold).
Example Query and Response
curl -X POST http://localhost:8538/v1/trial/workflow/safety_signal/run \
-H "Content-Type: application/json" \
-d '{
"drug": "experimental_agent_X",
"adverse_events": [
{"event": "hepatotoxicity", "count": 8, "total_exposed": 200},
{"event": "rash", "count": 15, "total_exposed": 200},
{"event": "fatigue", "count": 45, "total_exposed": 200}
],
"comparator_class": "checkpoint_inhibitor"
}'
{
"status": "completed",
"workflow": "safety_signal",
"signals_detected": [
{
"event": "hepatotoxicity",
"prr": 3.2,
"prr_ci_lower": 1.8,
"ror": 3.5,
"ror_ci_lower": 1.6,
"signal_strength": "moderate",
"historical_rate_class": "2-5% for checkpoint inhibitors",
"observed_rate": "4.0%",
"assessment": "Signal detected. Observed hepatotoxicity rate (4.0%) is within the expected range for checkpoint inhibitors (2-5%) but PRR exceeds threshold. Recommend enhanced liver function monitoring at 2-week intervals.",
"causality_score": "possible (Naranjo: 4)"
},
{
"event": "rash",
"prr": 1.4,
"signal_strength": "none",
"assessment": "No signal. Rash rate (7.5%) is below the expected range for checkpoint inhibitors (10-20%)."
}
],
"landmark_comparators": ["CheckMate 067", "KEYNOTE 024", "IMpower 150"],
"evidence_sources": 28,
"processing_time_ms": 3120.7
}
7-Factor Site Selection Scoring
The site selection workflow evaluates investigative sites using a composite score across 7 factors:
| Factor | Weight | Data Sources |
|---|---|---|
| Disease prevalence in catchment area | 0.20 | trial_sites, census data |
| Prior enrollment performance | 0.20 | trial_investigators, historical enrollment |
| Investigator experience (publications, prior trials) | 0.15 | trial_investigators, trial_literature |
| Regulatory readiness (IRB turnaround, startup time) | 0.15 | trial_sites |
| Diversity index (demographic representation) | 0.10 | trial_sites, FDA diversity guidance |
| Competitor trial burden (active competing studies) | 0.10 | trial_protocols, competitive intelligence |
| Infrastructure score (lab, imaging, pharmacy) | 0.10 | trial_sites |
Cross-Agent Integration
| Direction | Partner Agent | Trigger/Scenario |
|---|---|---|
| Inbound | Oncology (:8527) | Receives trial matching requests when actionable variants found |
| Inbound | Rare Disease (:8134) | Receives rare disease trial eligibility queries |
| Inbound | Biomarker (:8529) | Receives biomarker-driven trial design queries |
| Inbound | PGx (:8107) | Receives queries for PGx-guided clinical trials |
| Outbound | Oncology (:8527) | Queries tumor profiling for patient eligibility context |
| Outbound | PGx (:8107) | Queries metabolizer status for trial-specific dosing arms |
| Outbound | Cardiology (:8126) | Queries cardiac safety context for cardiovascular endpoint trials |
| Outbound | Biomarker (:8529) | Queries biomarker interpretation for eligibility criteria |
| Bidirectional | All agents | Reads genomic_evidence (shared); publishes events via SSE |
Performance Characteristics
| Operation | Typical Latency | Notes |
|---|---|---|
| RAG query | 1,500-3,000 ms | 14-collection search + LLM synthesis |
| Patient-trial matching | 2,000-4,000 ms | Multi-collection search + eligibility scoring |
| Protocol design review | 2,500-4,500 ms | Complexity scoring + benchmark comparison |
| Safety signal (PRR/ROR) | 1,500-3,500 ms | Statistical calculation + evidence retrieval |
| Site selection (7-factor) | 3,000-5,000 ms | Multi-factor scoring + enrollment forecasting |
| Regulatory document generation | 5,000-10,000 ms | Template population + LLM narrative |
| Competitive intelligence | 2,000-4,000 ms | Landscape analysis across trial_protocols |
Common Pitfalls and Troubleshooting
- Patient matching requires complete profiles: The patient-trial matching workflow performs best with age, diagnosis, biomarker status, prior therapy history, and ECOG performance status. Partial profiles produce fewer matches with lower confidence.
- PRR/ROR minimum counts: Disproportionality analysis requires at least 3 events (N >= 3) to generate a statistically meaningful signal. Submitting events with count < 3 will return "insufficient data" rather than a false signal.
- Landmark trial coverage: The 40 landmark trials are curated references. New landmark trials (e.g., recently published Phase III results) require manual addition via
scripts/seed_knowledge.py. - ClinicalTrials.gov rate limiting: If
TRIAL_CLINICALTRIALS_API_KEYis not set, the ClinicalTrials.gov v2 API rate-limits to 3 requests/second. Set the key for production workloads. - Regulatory document templates: Generated IND/CSR documents are templates requiring clinical review. They auto-populate statistical sections, safety summaries, and study schemas but do not replace regulatory affairs expertise.
- Cross-agent integration breadth: This agent connects to 8 partner agents (more than any other agent). If multiple partner agents are unavailable, the integrated assessment still returns partial results from available agents.
Part 4: Advanced Drug Discovery¶
The 10-Stage Therapeutic Discovery Pipeline¶
The Drug Discovery Pipeline transforms a clinically actionable variant into ranked therapeutic candidates. Each stage feeds the next, producing an auditable chain of evidence from gene to molecule.
| Stage | Name | Technology | Typical Time |
|---|---|---|---|
| 1 | Variant Reception | VCF parser + ClinVar lookup | <1 s |
| 2 | Target Identification | UniProt mapping, PDB structure fetch | 2-5 s |
| 3 | Binding Site Analysis | fpocket cavity detection, druggability scoring | 5-10 s |
| 4 | Seed Compound Retrieval | ChEMBL/DrugBank similarity search | 3-8 s |
| 5 | Molecular Generation | BioNeMo MolMIM conditional generation | 15-30 s |
| 6 | Chemical Feasibility | RDKit Lipinski, QED, SA score | 1-2 s |
| 7 | Pediatric Safety Filtering | 6-filter pediatric safety gate | <1 s |
| 8 | Binding Prediction | BioNeMo DiffDock pose sampling | 30-90 s |
| 9 | Composite Scoring | Weighted multi-objective ranking | <1 s |
| 10 | Report Generation | Claude LLM narrative + evidence assembly | 5-10 s |
End-to-end time per variant: 60-150 seconds depending on the number of generated molecules and DiffDock sampling depth.
MolMIM Molecular Generation¶
MolMIM (Molecular Masked Inverse Modeling) is a BioNeMo NIM that generates novel molecules conditioned on a seed SMILES string. The HCLS AI Factory sends generation requests to the local MolMIM endpoint.
Seed SMILES Selection
The pipeline selects seed compounds from the top ChEMBL/DrugBank hits in Stage 4. Selection criteria:
- Known activity against the target gene product (IC50 or Ki reported)
- Molecular weight between 200-500 Da (optimal generation range)
- Existing SAR data available for the scaffold class
Scaffold Constraints
MolMIM supports scaffold-constrained generation, where a substructure is preserved while peripheral groups are modified:
generation_params = {
"algorithm": "CMA-ES",
"num_molecules": 30,
"property_name": "QED",
"min_similarity": 0.4,
"particles": 30,
"iterations": 10,
"smi": seed_smiles # e.g., "CC(=O)Oc1ccccc1C(=O)O"
}
response = requests.post(f"{MOLMIM_URL}/generate", json=generation_params)
The min_similarity parameter (Tanimoto on Morgan fingerprints) controls how far generated molecules can drift from the seed. Lower values (0.3) explore more chemical space; higher values (0.7) stay closer to known actives.
Diversity Control
To avoid redundant candidates, the pipeline applies MaxMin diversity picking after generation. From 30 raw candidates, the top 10 most structurally diverse molecules are retained for downstream evaluation.
DiffDock Binding Prediction¶
DiffDock is a diffusion-based molecular docking model that predicts binding poses without requiring a pre-defined search box. The HCLS AI Factory uses the BioNeMo NIM deployment.
Inputs and Outputs
- Input: Protein structure (PDB), ligand (SDF/MOL2), number of poses
- Output: Ranked poses with confidence scores, per-pose RMSD estimates, contact residue lists
Confidence Score Interpretation
| Confidence Range | Interpretation | Action |
|---|---|---|
| > 0.8 | High confidence binding | Strong candidate |
| 0.5 - 0.8 | Moderate confidence | Proceed with caution |
| < 0.5 | Low confidence | Likely non-binder, discard |
Binding Energy Conversion
DiffDock confidence scores are converted to approximate binding energies using a calibrated regression:
A confidence of 0.78 yields approximately -11.4 kcal/mol, consistent with the VCP demo results.
Contact Residue Analysis
For each predicted pose, the pipeline extracts residues within 4.0 Angstroms of the ligand. These contact residues are cross-referenced against:
- Known active site residues from UniProt annotations
- Mutation sites from the patient's VCF
- Conserved residues across orthologs
RDKit Chemical Property Analysis¶
Every generated molecule passes through a battery of RDKit property calculations before scoring.
Lipinski's Rule of Five
Molecules violating more than one Lipinski rule are flagged but not automatically discarded (many approved oncology drugs violate Lipinski):
| Property | Threshold | Calculation |
|---|---|---|
| Molecular Weight | <= 500 Da | Descriptors.MolWt(mol) |
| LogP | <= 5.0 | Descriptors.MolLogP(mol) |
| H-Bond Donors | <= 5 | Descriptors.NumHDonors(mol) |
| H-Bond Acceptors | <= 10 | Descriptors.NumHAcceptors(mol) |
Quantitative Estimate of Drug-likeness (QED)
QED combines eight molecular descriptors into a single 0-1 score. The HCLS AI Factory uses the weighted QED variant:
Typical thresholds: QED >= 0.5 passes initial screen; QED >= 0.7 is considered favorable.
Synthetic Accessibility (SA) Score
The SA score (1-10 scale, lower is easier to synthesize) helps prioritize molecules that can realistically be made:
from rdkit.Chem import RDConfig
from rdkit.Contrib.SA_Score import sascorer
sa = sascorer.calculateScore(mol) # 1.0 (trivial) to 10.0 (very hard)
Molecules with SA > 6.0 are flagged as synthetically challenging.
MMFF Conformer Generation
For 3D analysis and docking preparation, the pipeline generates low-energy conformers:
from rdkit.Chem import AllChem
AllChem.EmbedMultipleConfs(mol, numConfs=50, pruneRmsThresh=0.5)
AllChem.MMFFOptimizeMoleculeConfs(mol)
The lowest-energy conformer is selected as the docking input for DiffDock.
Pediatric Safety Filters¶
The HCLS AI Factory applies six pediatric-specific safety filters. A molecule must pass all six to be considered a viable pediatric candidate. These filters go beyond standard adult drug safety screens to address the unique vulnerabilities of developing organ systems.
Filter 1: Blood-Brain Barrier (BBB) Penetration Risk
- Threshold: MW > 500 Da triggers concern
- Rationale: Large molecules that cross the BBB pose neurotoxicity risk in developing brains
- Implementation:
Descriptors.MolWt(mol) > 500 - Action: Flag, do not auto-reject (some CNS-targeted therapies require BBB penetration)
Filter 2: Cardiac Safety (hERG Liability)
- Threshold: Predicted hERG IC50 < 10 uM
- Rationale: QT prolongation risk is higher in pediatric patients with smaller cardiac mass
- Implementation: Structure-based hERG prediction using Morgan fingerprint similarity to known hERG blockers
- Action: Hard reject if predicted IC50 < 1 uM; flag if 1-10 uM
Filter 3: Hepatotoxicity Risk
- Threshold: LogP > 5.0
- Rationale: Highly lipophilic compounds accumulate in hepatic tissue; pediatric livers have immature metabolic capacity
- Implementation:
Descriptors.MolLogP(mol) > 5.0 - Action: Hard reject
Filter 4: Teratogenicity Screen
- Threshold: SMARTS pattern match against known teratogenic substructures
- Rationale: Adolescent patients of reproductive age require teratogenicity screening
- Implementation: Library of 47 SMARTS patterns derived from FDA pregnancy category X drugs
- Action: Hard reject on match with known high-risk pharmacophores
Filter 5: Oral Bioavailability
- Threshold: TPSA > 140 Angstroms squared
- Rationale: Pediatric formulation strongly prefers oral dosing; high TPSA reduces absorption
- Implementation:
Descriptors.TPSA(mol) > 140 - Action: Flag (injectable formulations remain possible but less preferred)
Filter 6: Gastrointestinal Tolerability
- Threshold: Rotatable bonds > 10
- Rationale: Flexible molecules often cause GI distress; pediatric patients are more sensitive
- Implementation:
Descriptors.NumRotatableBonds(mol) > 10 - Action: Flag
Composite Scoring Methodology¶
Candidates that survive safety filtering receive a composite score combining three weighted components:
composite_score = (docking_weight * normalized_docking) +
(generation_weight * normalized_generation) +
(qed_weight * normalized_qed)
Default Weights:
| Component | Weight | Source | Normalization |
|---|---|---|---|
| Docking Score | 0.4 | DiffDock confidence | Min-max across candidate set |
| Generation Score | 0.3 | MolMIM similarity to seed | Already 0-1 (Tanimoto) |
| QED Score | 0.3 | RDKit QED | Already 0-1 |
Weights are configurable per therapeutic area. Oncology workflows may increase docking weight to 0.5; rare disease workflows may increase generation weight to favor novelty.
VCP Demo Results¶
The VCP (Valosin-Containing Protein) variant demonstration showcases the full pipeline:
- Input Variant: VCP p.R155H (rs121909334), associated with IBMPFD
- Target Structure: PDB 5KIU, ATPase domain
- Best Candidate Docking Score: -11.4 kcal/mol (DiffDock confidence 0.78)
- Best Candidate QED: 0.81
- Candidates Generated: 30 via MolMIM, 10 after diversity picking
- Candidates Passing Safety: 7 of 10 passed all 6 pediatric filters
- Top Composite Score: 0.84
- Pipeline Runtime: 127 seconds end-to-end
Part 5: Cross-Agent Coordination¶
Shared Genomic Evidence Architecture¶
All 11 intelligence agents share access to the genomic_evidence Milvus collection. This collection contains 35,678 curated variant-evidence vectors derived from ClinVar, AlphaMissense, and the patient's annotated VCF.
Collection Schema:
| Field | Type | Description |
|---|---|---|
| id | INT64 (primary) | Auto-generated unique ID |
| text | VARCHAR(4096) | Evidence text (variant description + clinical significance) |
| embedding | FLOAT_VECTOR(384) | BGE-small-en-v1.5 embedding |
| source | VARCHAR(256) | Origin database (ClinVar, AlphaMissense, VCF) |
| gene | VARCHAR(64) | Gene symbol |
| variant_id | VARCHAR(128) | rs number or HGVS notation |
| clinical_significance | VARCHAR(128) | ClinVar classification |
| therapeutic_area | VARCHAR(128) | Mapped therapeutic area(s) |
Access Pattern: Read-only. Agents query genomic_evidence using cosine similarity search with domain-specific query expansion. Write access is restricted to the RAG ingestion pipeline (Stage 2).
Query Expansion by Agent Type:
Each agent type prepends domain-specific context to the raw user query before embedding:
- Oncology Agent: Appends "cancer somatic mutation driver gene tumor"
- Cardiology Agent: Appends "cardiac arrhythmia cardiomyopathy channelopathy"
- Neurology Agent: Appends "neurodegeneration epilepsy demyelination"
- Pharmacogenomics Agent: Appends "drug metabolism CYP450 pharmacokinetic"
This expansion biases the similarity search toward domain-relevant evidence without requiring separate collections.
Cross-Modal Trigger Examples¶
Agents communicate through structured trigger events. When one agent's analysis produces a finding that is relevant to another agent's domain, it emits a trigger that the orchestrator routes to the appropriate downstream agent.
Imaging Finding to Genomic Variant Query
Scenario: The Imaging Intelligence Agent detects a suspicious mass pattern in a brain MRI.
{
"trigger_type": "imaging_to_genomic",
"source_agent": "imaging_intelligence",
"target_agent": "precision_oncology",
"payload": {
"finding": "ring-enhancing lesion, temporal lobe",
"modality": "MRI_T1_contrast",
"urgency": "high",
"suggested_query": "IDH1 IDH2 ATRX TP53 glioma variants"
}
}
The Oncology Agent receives this trigger and queries genomic_evidence for variants in IDH1, IDH2, ATRX, and TP53. Results are correlated with the imaging finding to produce an integrated assessment.
Pharmacogenomic Alert to Dosing Recommendation
Scenario: The Pharmacogenomics Agent identifies a CYP2D6 poor metabolizer genotype.
{
"trigger_type": "pgx_to_dosing",
"source_agent": "pharmacogenomics_intelligence",
"target_agent": "precision_oncology",
"payload": {
"gene": "CYP2D6",
"phenotype": "poor_metabolizer",
"star_alleles": ["*4", "*4"],
"affected_drugs": ["tamoxifen", "codeine", "ondansetron"],
"recommendation": "avoid_or_reduce_dose"
}
}
The receiving agent adjusts therapeutic recommendations, avoiding CYP2D6-metabolized prodrugs and suggesting alternatives.
Rare Disease Variant to Family Cascade
Scenario: The Rare Disease Diagnostic Agent identifies a pathogenic variant in a recessive disorder gene.
{
"trigger_type": "rare_to_cascade",
"source_agent": "rare_disease_diagnostic",
"target_agent": "precision_biomarker",
"payload": {
"variant": "CFTR p.F508del",
"zygosity": "heterozygous",
"inheritance": "autosomal_recessive",
"cascade_recommendation": "test_parents_and_siblings",
"carrier_frequency": "1_in_25_European"
}
}
The Biomarker Agent prepares a carrier screening panel and family counseling summary.
Multi-Agent Query Orchestration Patterns¶
Pattern 1: Fan-Out / Fan-In
A single patient query is broadcast to multiple agents simultaneously. Each agent returns domain-specific findings, and the orchestrator merges results:
User Query --> Orchestrator --> [Oncology, Cardiology, PGx, Biomarker]
| | | |
v v v v
Findings Findings Findings Findings
\ | | /
--> Merged Report <---
Pattern 2: Sequential Pipeline
Results flow through agents in a defined order, each enriching the analysis:
Variant --> Biomarker Agent --> Oncology Agent --> Drug Discovery --> Clinical Trial Agent
(classify variant) (assess cancer risk) (find candidates) (match trials)
Pattern 3: Conditional Routing
The orchestrator examines variant annotations and routes to the most relevant agent:
- Variants in cardiac genes (SCN5A, KCNQ1, MYH7) route to the Cardiology Agent
- Variants in neurological genes (APP, PSEN1, HTT) route to the Neurology Agent
- Variants with oncogenic annotations route to the Oncology Agent
- Unclassified rare variants route to the Rare Disease Agent
Building Custom Cross-Agent Workflows¶
To define a custom workflow, create a JSON workflow definition:
{
"workflow_name": "pediatric_cns_tumor_workup",
"description": "Integrated CNS tumor assessment for pediatric patients",
"steps": [
{
"agent": "imaging_intelligence",
"action": "analyze_neuroimaging",
"timeout_seconds": 30
},
{
"agent": "precision_oncology",
"action": "variant_assessment",
"depends_on": ["imaging_intelligence"],
"timeout_seconds": 20
},
{
"agent": "pharmacogenomics_intelligence",
"action": "drug_interaction_check",
"parallel_with": ["precision_oncology"],
"timeout_seconds": 15
},
{
"agent": "clinical_trial_intelligence",
"action": "match_trials",
"depends_on": ["precision_oncology"],
"timeout_seconds": 20
}
],
"output": "merged_report"
}
Register the workflow via the orchestrator API:
curl -X POST http://localhost:8080/api/workflows \
-H "Content-Type: application/json" \
-d @pediatric_cns_tumor_workup.json
Part 6: Deployment & Operations¶
Docker Compose Architecture¶
The HCLS AI Factory runs as 21 interconnected services defined in docker-compose.dgx-spark.yml. Services are organized into four tiers:
Tier 1: Infrastructure (always-on)
| Service | Image | Port | Purpose |
|---|---|---|---|
| etcd | quay.io/coreos/etcd:v3.5.x | 2379 | Milvus metadata store |
| minio | minio/minio:latest | 9000, 9001 | Milvus object storage |
| milvus-standalone | milvusdb/milvus:v2.4.x | 19530, 9091 | Vector database |
Tier 2: Core Services
| Service | Port | Purpose |
|---|---|---|
| landing-page | 8080 | Flask hub with health dashboard |
| rag-api | 8888 | RAG query engine |
| genomics-portal | 8085 | Genomics web interface |
| drug-discovery-api | 8890 | Drug discovery pipeline API |
Tier 3: Intelligence Agents
| Service | Port | Purpose |
|---|---|---|
| precision-oncology-agent | 8501 | Oncology analysis |
| cart-intelligence-agent | 8502 | CAR-T therapy design |
| imaging-intelligence-agent | 8503 | Medical imaging analysis |
| precision-autoimmune-agent | 8504 | Autoimmune assessment |
| precision-biomarker-agent | 8505 | Biomarker interpretation |
| cardiology-intelligence-agent | 8506 | Cardiac variant analysis |
| pharmacogenomics-intelligence-agent | 8507 | PGx and drug interaction |
| neurology-intelligence-agent | 8508 | Neurological assessment |
| rare-disease-diagnostic-agent | 8509 | Rare disease diagnosis |
| single-cell-intelligence-agent | 8510 | Single-cell analysis |
| clinical-trial-intelligence-agent | 8511 | Clinical trial matching |
Tier 4: Monitoring
| Service | Port | Purpose |
|---|---|---|
| prometheus | 9090 | Metrics collection |
| grafana | 3000 | Dashboards and alerting |
Health Monitoring¶
The health-monitor.sh script (19 KB) provides comprehensive health monitoring across 22 service endpoints. It runs via cron every 5 minutes.
Monitored Checks per Service:
- TCP port reachability (timeout: 5 seconds)
- HTTP health endpoint response (expected: 200 OK)
- Process memory consumption (threshold: 90% of allocated)
- Response latency (threshold: 10 seconds)
Auto-Restart Behavior:
When a service fails 3 consecutive health checks, the monitor executes:
# For Docker services
docker restart <service_name>
# For native processes
kill -TERM <pid> && sleep 2 && <start_command>
Watchdog Mode:
The health monitor itself is supervised by a systemd watchdog. If health-monitor.sh fails to write a heartbeat file within 10 minutes, systemd restarts it.
Log Locations:
- Health check results:
logs/cron-health.log - Service-specific logs:
logs/<service-name>.log - Auto-restart events:
logs/health-restarts.log
Milvus Administration¶
The platform manages 139 Milvus collections across all agents and core services.
Collection Categories:
| Category | Count | Example |
|---|---|---|
| Genomic evidence | 1 | genomic_evidence (35,678 vectors) |
| Agent knowledge bases | 55 | oncology_guidelines, cart_protocols |
| ClinVar annotations | 12 | clinvar_pathogenic, clinvar_vus |
| Drug/compound data | 8 | drug_interactions, chembl_targets |
| Clinical trials | 14 | trial_eligibility, trial_endpoints |
| Literature | 22 | pubmed_oncology, pubmed_cardiology |
| Patient data (demo) | 5 | demo_patient_variants |
| Imaging features | 10 | imaging_features_mri, imaging_features_ct |
| Other | 12 | system_prompts, workflow_templates |
Backup Strategy:
# Full backup of all collections (run weekly)
python3 -c "
from pymilvus import connections, utility
connections.connect('default', host='localhost', port='19530')
collections = utility.list_collections()
for coll in collections:
utility.do_bulk_insert(coll, files=[], backup=True)
"
# Incremental backup via Milvus backup tool
milvus-backup create --name weekly_$(date +%Y%m%d)
Flush Rate Tuning:
By default, Milvus flushes segments every 1 second. For bulk ingestion (e.g., loading 3.56M annotated variants), increase the flush interval:
from pymilvus import Collection
collection = Collection("genomic_evidence")
collection.set_properties({"collection.flush.interval": "30"}) # 30 seconds
Reset to default after ingestion completes.
Agent Startup Sequence¶
Each intelligence agent follows a deterministic startup sequence:
- Configuration Load (< 1 s): Read environment variables and config files
- Embedding Model Load (5-15 s): Load BGE-small-en-v1.5 into GPU memory (~130 MB)
- Milvus Connection (1-3 s): Connect to Milvus standalone, verify collection availability
- Collection Warm-up (2-5 s): Load frequently-queried collections into memory
- Health Endpoint Registration (< 1 s): Register with the health monitor
- API Ready (< 1 s): Begin accepting requests
Total cold start time: 10-25 seconds per agent. When running all 11 agents, stagger starts by 3 seconds each to avoid GPU memory contention during embedding model loading.
Performance Tuning¶
GPU Memory Management
The DGX Spark provides 128 GB unified memory. Recommended allocation:
| Component | Memory | Notes |
|---|---|---|
| Milvus indexes | 16-24 GB | Depends on loaded collections |
| BGE-small embedding (x11) | ~1.4 GB | 130 MB per agent instance |
| MolMIM NIM | 4-8 GB | Active during drug discovery |
| DiffDock NIM | 4-8 GB | Active during drug discovery |
| Parabricks | 32-64 GB | Active during genomics pipeline |
| System / OS | 8-16 GB | Linux kernel, Docker overhead |
Concurrent Agent Tuning
For systems with limited memory, reduce the number of concurrent agents:
# In docker-compose.dgx-spark.yml, add resource limits
services:
precision-oncology-agent:
deploy:
resources:
limits:
memory: 4G
reservations:
memory: 2G
Milvus Index Tuning
The default IVF_FLAT index uses nlist=1024 and nprobe=16. Adjust for your workload:
| Scenario | nlist | nprobe | Trade-off |
|---|---|---|---|
| Low latency (< 10 ms) | 1024 | 10 | Slightly lower recall |
| Balanced (default) | 1024 | 16 | Good recall, ~15 ms |
| High recall | 2048 | 32 | Higher latency (~30 ms) |
| Maximum recall | 4096 | 64 | ~50 ms, near-exhaustive |
search_params = {"metric_type": "COSINE", "params": {"nprobe": 32}}
results = collection.search(
data=[query_embedding],
anns_field="embedding",
param=search_params,
limit=20
)
HTTPS via Caddy¶
For production deployments, use Caddy as a reverse proxy with automatic TLS:
# Caddyfile
hcls-ai-factory.example.com {
handle /api/rag/* {
reverse_proxy localhost:8888
}
handle /api/genomics/* {
reverse_proxy localhost:8085
}
handle /api/drugs/* {
reverse_proxy localhost:8890
}
handle /agents/* {
reverse_proxy localhost:8501-8511
}
handle {
reverse_proxy localhost:8080
}
header {
X-Content-Type-Options nosniff
X-Frame-Options DENY
Referrer-Policy strict-origin-when-cross-origin
}
}
Security¶
Portal Authentication
The landing page (localhost:8080) supports configurable authentication:
# In landing-page/app.py
PORTAL_AUTH_ENABLED = os.getenv("PORTAL_AUTH_ENABLED", "false").lower() == "true"
PORTAL_USERNAME = os.getenv("PORTAL_USERNAME", "admin")
PORTAL_PASSWORD_HASH = os.getenv("PORTAL_PASSWORD_HASH") # bcrypt hash
XSS Protection
All agent Streamlit interfaces and the Flask landing page implement output sanitization:
- HTML entities escaped in all rendered text
- Content-Security-Policy headers set to
default-src 'self' - User inputs validated and sanitized before Milvus query construction
Pickle Hardening
The platform avoids pickle.loads() on untrusted data. Model artifacts use SafeTensors format. Where pickle is unavoidable (legacy sklearn models), a restricted unpickler is used:
import pickle
import io
ALLOWED_CLASSES = {"numpy.ndarray", "sklearn.ensemble._forest.RandomForestClassifier"}
class RestrictedUnpickler(pickle.Unpickler):
def find_class(self, module, name):
full_name = f"{module}.{name}"
if full_name not in ALLOWED_CLASSES:
raise pickle.UnpicklingError(f"Blocked: {full_name}")
return super().find_class(module, name)
def safe_load(data: bytes):
return RestrictedUnpickler(io.BytesIO(data)).load()
Part 7: Extending the Platform¶
Adding a New Intelligence Agent¶
Follow these steps to add a custom intelligence agent (e.g., a "Reproductive Health Agent"):
Step 1: Create Milvus Collections
Define the knowledge collections your agent needs:
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType, connections
connections.connect("default", host="localhost", port="19530")
fields = [
FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=4096),
FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=384),
FieldSchema(name="source", dtype=DataType.VARCHAR, max_length=256),
FieldSchema(name="category", dtype=DataType.VARCHAR, max_length=128),
]
schema = CollectionSchema(fields, description="Reproductive health knowledge base")
collection = Collection("reproductive_health_knowledge", schema)
# Create index
index_params = {"index_type": "IVF_FLAT", "metric_type": "COSINE", "params": {"nlist": 1024}}
collection.create_index("embedding", index_params)
Step 2: Ingest Domain Knowledge
Prepare knowledge sources and embed them:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("BAAI/bge-small-en-v1.5")
documents = [
{"text": "Preeclampsia affects 2-8% of pregnancies...", "source": "ACOG", "category": "guidelines"},
# ... more documents
]
for doc in documents:
embedding = model.encode(doc["text"]).tolist()
collection.insert([[doc["text"]], [embedding], [doc["source"]], [doc["category"]]])
collection.flush()
Step 3: Create the Agent API
Use the standard agent template (Streamlit + FastAPI):
reproductive-health-agent/
app.py # Streamlit UI
api.py # FastAPI health + query endpoints
knowledge/ # Ingestion scripts and raw data
requirements.txt
Dockerfile
The api.py must expose:
GET /health-- returns{"status": "healthy", "collections": [...], "uptime": ...}POST /query-- accepts{"query": "...", "patient_context": {...}}and returns findings
Step 4: Register in the UI
Add the agent to the landing page configuration:
# In landing-page/config.py
AGENTS = {
# ... existing agents ...
"reproductive_health": {
"name": "Reproductive Health Agent",
"port": 8512,
"icon": "baby",
"description": "Maternal-fetal medicine and reproductive genomics",
"collections": ["reproductive_health_knowledge", "genomic_evidence"]
}
}
Step 5: Add to Docker Compose
reproductive-health-agent:
build: ./reproductive-health-agent
ports:
- "8512:8512"
environment:
- MILVUS_HOST=milvus-standalone
- MILVUS_PORT=19530
- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
depends_on:
- milvus-standalone
restart: unless-stopped
Step 6: Write Tests
At minimum, every agent needs:
- Unit tests for query processing and response formatting
- Integration test confirming Milvus connectivity and collection access
- End-to-end test with a sample query and expected output structure
def test_reproductive_health_query():
response = client.post("/query", json={"query": "BRCA1 and pregnancy risk"})
assert response.status_code == 200
assert "findings" in response.json()
assert len(response.json()["findings"]) > 0
Creating Custom Milvus Collections with Dynamic Fields¶
Milvus supports dynamic fields for flexible schema evolution:
schema = CollectionSchema(fields, enable_dynamic_field=True)
collection = Collection("flexible_collection", schema)
# Insert with extra fields not in the schema
collection.insert([{
"text": "Example document",
"embedding": [0.1] * 384,
"custom_score": 0.95, # dynamic field
"reviewer": "Dr. Smith" # dynamic field
}])
Dynamic fields are stored as JSON and can be filtered in search expressions:
results = collection.search(
data=[query_embedding],
anns_field="embedding",
param=search_params,
limit=10,
expr='custom_score > 0.8 and reviewer == "Dr. Smith"'
)
Integrating New Knowledge Sources¶
PubMed Integration
Use the NCBI E-utilities API to fetch and ingest PubMed abstracts:
import requests
def fetch_pubmed_abstracts(query: str, max_results: int = 100):
# Search for PMIDs
search_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
search_resp = requests.get(search_url, params={
"db": "pubmed", "term": query, "retmax": max_results, "retmode": "json"
})
pmids = search_resp.json()["esearchresult"]["idlist"]
# Fetch abstracts
fetch_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
fetch_resp = requests.get(fetch_url, params={
"db": "pubmed", "id": ",".join(pmids), "rettype": "abstract", "retmode": "xml"
})
return parse_abstracts(fetch_resp.text)
ClinicalTrials.gov Integration
The ClinicalTrials.gov API v2 provides structured trial data:
def fetch_trials(condition: str, status: str = "RECRUITING"):
url = "https://clinicaltrials.gov/api/v2/studies"
params = {
"query.cond": condition,
"filter.overallStatus": status,
"pageSize": 50,
"fields": "NCTId,BriefTitle,EligibilityCriteria,InterventionName"
}
response = requests.get(url, params=params)
return response.json()["studies"]
Ingest trial data into agent-specific collections for real-time trial matching.
Custom Clinical Workflows¶
Beyond the built-in workflows, you can define condition-specific analysis pipelines:
from hcls_common.workflow import WorkflowEngine
workflow = WorkflowEngine()
@workflow.step("variant_triage")
def triage_variants(context):
"""Filter and prioritize variants by clinical significance."""
variants = context["variants"]
return [v for v in variants if v["clinical_significance"] in ("Pathogenic", "Likely_pathogenic")]
@workflow.step("literature_search")
def search_literature(context):
"""Query PubMed and internal knowledge bases for supporting evidence."""
genes = set(v["gene"] for v in context["triaged_variants"])
evidence = {}
for gene in genes:
evidence[gene] = query_milvus(f"{gene} clinical significance treatment")
return evidence
@workflow.step("generate_report")
def generate_report(context):
"""Produce a clinical summary using Claude."""
prompt = build_clinical_prompt(context["triaged_variants"], context["evidence"])
return call_claude(prompt)
# Execute
result = workflow.run(
steps=["variant_triage", "literature_search", "generate_report"],
initial_context={"variants": patient_variants}
)
Contributing to the Open-Source Project¶
The HCLS AI Factory is released under the Apache 2.0 License. Contributions are welcome.
Getting Started:
- Fork the repository at
https://github.com/ajones1923/hcls-ai-factory - Create a feature branch:
git checkout -b feature/your-feature-name - Follow the existing code style (Black formatting, type hints, docstrings)
- Add tests for any new functionality
- Run the test suite:
pytest tests/ -v - Submit a pull request with a clear description of changes
Areas Seeking Contributions:
- New intelligence agents for underserved therapeutic areas
- Improved embedding models for biomedical text
- Additional safety filters for specific patient populations
- Performance optimizations for large-scale variant sets
- Documentation and tutorials
- Internationalization of clinical content
Code Review Process:
All pull requests require:
- Passing CI checks (linting, tests, type checking)
- At least one maintainer review
- No reduction in test coverage
- Updated documentation if user-facing behavior changes
Clinical Decision Support Disclaimer
The HCLS AI Factory platform and its components are clinical decision support research tools. They are not FDA-cleared and are not intended as standalone diagnostic devices. All recommendations should be reviewed by qualified healthcare professionals. Apache 2.0 License.