HCLS AI Factory — Learning Guide: Unified Advanced¶

Author: Adam Jones | License: Apache 2.0 | Date: March 2026

Part 1: Advanced Genomic Analysis¶

Custom Variant Filtering Strategies¶

The default genomics pipeline (Parabricks 4.6 + DeepVariant) produces a comprehensive VCF containing every called variant. Real clinical workflows demand surgical filtering. The strategies below go beyond bcftools view -f PASS.

Tiered Clinical Filtering

A three-tier approach mirrors clinical reporting standards:

Tier	Criteria	Typical Count
Tier 1 — Actionable	QUAL >= 100, DP >= 30, GQ >= 40, ClinVar Pathogenic/Likely Pathogenic	~50-200
Tier 2 — Potentially Significant	QUAL >= 50, DP >= 20, GQ >= 30, ClinVar VUS + AlphaMissense >= 0.8	~500-2,000
Tier 3 — Observed	PASS filter, DP >= 10, population AF < 0.01	~20,000-50,000

# Tier 1: Actionable variants only
bcftools view -f PASS input.vcf.gz \
  | bcftools filter -i 'QUAL>=100 && INFO/DP>=30 && FORMAT/GQ>=40' \
  | bcftools annotate -a clinvar.vcf.gz -c INFO/CLNSIG \
  | bcftools filter -i 'INFO/CLNSIG~"Pathogenic"'

Gene Panel Filtering

When the clinical question targets a specific condition, restrict analysis to curated gene panels:

# Extract regions from a BED file (e.g., hereditary cancer panel — 84 genes)
bcftools view -R cancer_panel.bed -f PASS sample.vcf.gz > panel_variants.vcf

The HCLS AI Factory ships with pre-built BED files for 13 therapeutic areas aligned to the intelligence agents.

Population Frequency Filtering

Filter against gnomAD allele frequencies to surface rare variants:

# Retain variants with AF < 0.001 or absent from gnomAD
bcftools filter -i 'INFO/gnomAD_AF<0.001 || INFO/gnomAD_AF="."' input.vcf.gz

Multi-Sample and Trio Analysis Patterns¶

Trio Analysis (Proband + Parents)

Parabricks 4.6 supports joint calling. For de novo variant detection:

# Joint calling with Parabricks DeepVariant
pbrun deepvariant \
  --ref reference.fasta \
  --in-bam proband.bam,mother.bam,father.bam \
  --out-variants trio_joint.vcf

# De novo filtering: variant present in proband, absent in both parents
bcftools view -s proband trio_joint.vcf \
  | bcftools filter -i 'GT="0/1" || GT="1/1"' \
  | bcftools isec -C -w1 - mother_only.vcf father_only.vcf

Multi-Sample Cohort Analysis

For cohort studies (e.g., tumor-normal pairs or family groups), use merged VCFs with sample-specific genotype fields. The RAG pipeline can ingest multi-sample VCFs and maintain per-sample lineage in the genomic_evidence Milvus collection.

Structural Variant Detection¶

The default pipeline focuses on SNVs and small indels. Structural variants (SVs) require dedicated tools:

Tool	SV Types	Integration Point
Parabricks pbsv	DEL, INS, DUP, INV, BND	Direct GPU-accelerated calling
Manta	DEL, INS, DUP, INV, BND	Post-alignment, CPU
CNVkit	Copy number variants	BAM coverage analysis

SVs are annotated against ClinVar SV records and can be ingested into the RAG pipeline using the genomic_evidence collection with variant_type metadata filtering.

ClinVar + AlphaMissense Annotation Deep Dive¶

The platform cross-references two major annotation databases:

ClinVar (4.1M records)

Variant-level clinical significance: Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign
Review status (0-4 stars) indicates evidence strength
Condition associations with OMIM and MedGen identifiers
Submission history tracks reclassification over time

AlphaMissense (71M predictions)

Machine learning pathogenicity scores (0.0 to 1.0) for all possible single amino acid substitutions
Thresholds: >= 0.564 likely pathogenic, <= 0.340 likely benign
Covers missense variants that ClinVar has not yet classified
Particularly valuable for VUS reclassification support

Cross-Reference Strategy

The RAG pipeline performs a layered annotation:

Check ClinVar first — if Pathogenic/Likely Pathogenic with >= 2 stars, use as primary evidence
For VUS or missing ClinVar entries, check AlphaMissense score
Concordance between ClinVar and AlphaMissense strengthens confidence
Discordance flags the variant for manual review

VCF Quality Metrics Interpretation¶

Metric	Field	Good	Marginal	Poor
Variant Quality	QUAL	>= 100	30-100	< 30
Read Depth	DP	>= 30x	15-30x	< 15x
Genotype Quality	GQ	>= 40	20-40	< 20
Allele Depth	AD	Ref/Alt balanced for het	Skewed > 4:1	Single-strand only
Mapping Quality	MQ	>= 50	30-50	< 30
Strand Bias	FS	< 60	60-200	> 200

Interpreting AD (Allele Depth) for heterozygous calls:

A heterozygous SNV should show roughly 50/50 reference/alternate reads. A call showing AD=95,5 at DP=100 is suspicious — possibly a sequencing artifact or somatic mosaic variant. The expected range for germline heterozygous calls is 30-70% alternate allele fraction.

Parabricks 4.6 Performance Tuning on DGX Spark¶

The NVIDIA DGX Spark (Grace Blackwell GB10, 128GB unified memory) runs the full genomics pipeline. Key tuning parameters:

Parameter	Default	Tuned	Impact
`--num-gpus`	1	1 (GB10 is single-GPU)	N/A
`--bwa-options`	default	`-K 100000000 -Y`	Improves reproducibility
`--run-partition-size`	25M	50M	Fewer partitions, less overhead
`--memory-limit`	auto	100G	Reserve 28GB for system + Milvus

Benchmark on DGX Spark:

Stage	Time (30x WGS)
BWA-MEM2 alignment	~45 min
DeepVariant calling	~60 min
Annotation + VCF post-processing	~15 min
Total	~120 min

This compares to 24-48 hours on a 32-core CPU-only server.

Part 2: Advanced RAG Architecture¶

Milvus 2.4 Collection Design Patterns¶

Every intelligence agent in the HCLS AI Factory follows a consistent Milvus collection design. Understanding these patterns is essential for extending or customizing the platform.

Index Configuration

All collections use the same core index parameters:

index_params = {
    "index_type": "IVF_FLAT",
    "metric_type": "COSINE",
    "params": {"nlist": 128}
}

search_params = {
    "metric_type": "COSINE",
    "params": {"nprobe": 16}
}

IVF_FLAT: Inverted file index with flat (exact) distance computation within clusters. Chosen over HNSW for its lower memory footprint and deterministic results.
COSINE: Cosine similarity metric. Normalized embeddings make this equivalent to inner product but more interpretable (0.0 to 1.0 scale).
nlist=128: Number of Voronoi cells. With collections ranging from hundreds to millions of vectors, 128 provides a good balance between recall and speed.
nprobe=16: Number of cells to search at query time (12.5% of nlist). Increasing to 32 improves recall by ~2% but doubles search latency.

Schema Pattern

Every collection follows this field layout:

Field	Type	Purpose
`id`	INT64 (auto-increment)	Primary key
`embedding`	FLOAT_VECTOR(384)	BGE-small-en-v1.5 embedding
`text`	VARCHAR(65535)	Source text chunk
`source`	VARCHAR(512)	Provenance (PubMed ID, URL, file path)
`metadata`	VARCHAR(65535)	JSON-encoded domain metadata
`created_at`	VARCHAR(64)	ISO 8601 timestamp
Domain-specific fields	Various	Filterable attributes (e.g., `cancer_type`, `target_antigen`, `gene_symbol`)

Shared genomic_evidence Collection

All 11 agents mount the genomic_evidence collection as read-only. This collection holds 3.56M vectors derived from the Stage 2 RAG pipeline (ClinVar + AlphaMissense annotations). Agents query it to enrich domain-specific analysis with genomic context without duplicating the data.

Collection Sizing Across Agents

Agent	Owned Collections	Approx. Vectors	Shared
Precision Oncology	10	~10,000+	+genomic_evidence
CAR-T Intelligence	10	6,266	+genomic_evidence
Precision Biomarker	10	~320	+genomic_evidence
Cardiology	12	~5,000+	+genomic_evidence
Neurology	12	~5,000+	+genomic_evidence
Autoimmune	13	~5,000+	+genomic_evidence
Rare Disease	13	~5,000+	+genomic_evidence
Pharmacogenomics	14	~5,000+	+genomic_evidence
Imaging	10	2,814+	+genomic_evidence
Single-Cell	12	~5,000+	+genomic_evidence
Clinical Trial	13	~5,000+	+genomic_evidence

BGE-small-en-v1.5 Embedding Model¶

Why BGE-small-en-v1.5?

Property	Value
Dimensions	384
Model size	33M parameters (~130 MB)
Max sequence length	512 tokens
Embedding speed (DGX Spark)	~1,200 texts/sec
MTEB retrieval score	51.68

The model was chosen for three reasons:

Efficiency — 384 dimensions keeps memory manageable at scale (3.56M vectors x 384 dims x 4 bytes = ~5.2 GB for genomic_evidence alone)
Quality — Top-tier retrieval performance among small models; sufficient for domain-specific retrieval when combined with query expansion
On-device inference — Runs entirely on the DGX Spark without requiring a separate embedding service

Limitations

512-token context window means long clinical documents must be chunked (default: 2,500 characters with 200-character overlap)
General-purpose English model — not fine-tuned for biomedical text. Query expansion compensates for vocabulary gaps.
Does not understand structured data (tables, lab results) natively. The ingest pipeline converts structured data to natural language descriptions before embedding.

Asymmetric Query Prefix

BGE models use an instruction prefix for queries (but not for documents):

# At query time
query_embedding = model.encode("Represent this sentence for searching relevant passages: " + query)

# At ingest time (no prefix)
doc_embedding = model.encode(document_text)

This asymmetric encoding improves retrieval accuracy by ~3-5% on domain benchmarks.

Multi-Collection Parallel Search Strategy¶

The core architectural pattern across all 11 agents is parallel multi-collection search. A single user query fans out to all collections simultaneously.

Search Flow

User Query
    |
    v
[Query Expansion] -- add domain synonyms and related terms
    |
    v
[BGE Embedding] -- 384-dim vector
    |
    v
[Parallel Search] -- fan out to N collections simultaneously
    |   |   |   |   ... |
    v   v   v   v       v
 Coll1 Coll2 Coll3 Coll4 ... CollN
    |   |   |   |   ... |
    v   v   v   v       v
[Merge + Deduplicate] -- by content hash
    |
    v
[Score Normalization] -- weight by collection relevance
    |
    v
[Top-K Selection] -- default: 5 per collection, 30 total max
    |
    v
[Claude Synthesis] -- grounded response with citations

Weighted Collection Scoring

Each agent assigns weights to its collections. The final relevance score combines cosine similarity with collection weight:

final_score = cosine_similarity * collection_weight

The Autoimmune Agent provides a good example with weights ranging from 0.18 (clinical documents) down to 0.02 (genomic evidence and cross-disease). This ensures that highly relevant clinical notes outrank tangentially related genomic data.

Performance on DGX Spark

Metric	Value
Single-collection search (top-5)	2-5 ms
11-collection parallel search	12-16 ms
Dual retrieval (comparative mode)	~365 ms
Full RAG with Claude synthesis	~24 sec
End-to-end with streaming	First token ~3 sec

Total query-to-response time stays under 5 seconds for retrieval; the majority of latency comes from LLM synthesis.

Query Expansion with Domain Keyword Maps¶

Raw user queries often miss domain-specific terminology. Each agent maintains expansion maps that inject synonyms and related terms.

Example: CAR-T Agent Expansion

The CAR-T agent maintains 12 expansion maps covering 169 seed keywords that expand to 1,496 related terms:

Map Category	Seed Keywords	Expanded Terms
Target Antigen	CD19, BCMA, CD22	B-cell maturation antigen, TNFRSF17, ...
Toxicity	CRS, ICANS	cytokine release syndrome, neurotoxicity, tocilizumab, ...
Manufacturing	transduction, expansion	lentiviral vector, T-cell activation, ...

Expansion Logic

def expand_query(query: str) -> str:
    expanded_terms = []
    for keyword, synonyms in expansion_maps.items():
        if keyword.lower() in query.lower():
            expanded_terms.extend(synonyms[:5])  # Top 5 synonyms
    return f"{query} {' '.join(expanded_terms)}"

The expanded query is embedded alongside the original, and both embedding vectors are used in the parallel search. This typically improves recall by 15-25% on domain-specific queries.

Claude LLM Prompt Engineering for Clinical RAG Synthesis¶

All agents use a structured prompt template for Claude synthesis:

System Prompt Structure

You are a {domain} clinical decision support AI.
You must:
1. Ground every claim in the retrieved evidence
2. Cite sources using [Collection:ID] format
3. Flag clinical alerts (contraindications, critical values)
4. Acknowledge uncertainty when evidence is limited
5. Never fabricate references or statistics

Patient context: {patient_context}
Conversation history: {last_N_turns}

Evidence Injection Format

Retrieved evidence is injected between XML-style tags:

<evidence>
[Literature:PMID_12345678] (score: 0.87) EGFR mutations in non-small cell lung
cancer respond to erlotinib with a median PFS of 10.4 months...

[Trials:NCT04012345] (score: 0.82) Phase III trial of osimertinib vs.
comparator in EGFR T790M-positive NSCLC...
</evidence>

Key Prompt Engineering Patterns

Citation enforcement — The system prompt requires [Collection:Source] format. Claude rarely hallucinates citations when the format is explicitly specified and evidence is provided.
Confidence calibration — Agents instruct Claude to state "Based on limited evidence..." or "No relevant evidence found..." rather than generating speculative answers.
Streaming — All agents support SSE streaming for real-time token delivery. The first token typically arrives within 3 seconds.

Agents communicate through an event-driven architecture. When one agent detects a finding that is relevant to another domain, it publishes a cross-modal event.

Example Trigger Chains

Source Agent	Trigger Condition	Target Agent	Action
Imaging	Lung-RADS 4A+ nodule	Oncology	Query EGFR/ALK/ROS1 variants in genomic_evidence
Oncology	BRCA1/2 pathogenic variant	Clinical Trial	Search for PARP inhibitor trials
Rare Disease	HPO phenotype match score > 0.8	Pharmacogenomics	Check PGx implications for candidate therapies
Autoimmune	Flare risk > 0.8 (imminent)	Biomarker	Pull longitudinal CRP/ESR/complement trends
Single-Cell	Antigen escape detected	CAR-T	Flag resistance mechanism and alternative targets
Cardiology	EF < 35% detected	Oncology	Check cardiotoxicity risk for proposed chemotherapy

Event Format

{
  "event_type": "cross_modal_trigger",
  "source_agent": "imaging_intelligence",
  "target_agent": "precision_oncology",
  "trigger": "lung_rads_4a_detected",
  "payload": {
    "finding": "Spiculated nodule 18mm RUL",
    "query": "EGFR ALK ROS1 KRAS lung adenocarcinoma variants"
  }
}

Performance Optimization¶

Achieving Sub-5-Second Search Latency

Pre-load collections — All collections are loaded into memory at agent startup. Milvus load() is called during the health check lifecycle.
Connection pooling — A single MilvusClient connection is reused across requests. Connection setup (~200ms) is amortized.
Embedding cache — Frequently repeated queries (e.g., gene names, drug names) are cached with an LRU cache (default 1,024 entries).
Batch embedding — During ingest, texts are embedded in batches of 64 to maximize GPU throughput.
Collection partitioning — Large collections (genomic_evidence at 3.56M vectors) use partition keys (e.g., chromosome) for targeted search.

Memory Budget on DGX Spark (128GB unified)

Component	Memory
Milvus (all collections loaded)	~25 GB
BGE-small-en-v1.5 model	~0.5 GB
Python agent processes (11 agents)	~8 GB
System + Docker overhead	~10 GB
Parabricks (when running)	~60 GB
Available headroom	~24 GB

Part 3: Deep Dive — All 11 Intelligence Agents¶

Each agent follows the same foundational architecture: FastAPI REST server, Streamlit UI, multi-collection Milvus RAG engine, BGE-small-en-v1.5 embeddings (384-dim), and Claude LLM synthesis. What differs is the domain knowledge, clinical workflows, and cross-agent integration points.

3.1 Precision Oncology Agent (:8527)¶

Overview and Clinical Purpose

The Precision Oncology Agent is a closed-loop clinical decision support system designed for molecular tumor board (MTB) workflows. Given a patient's somatic and germline VCF, it identifies actionable variants, matches them against oncology knowledge bases, ranks candidate therapies, surfaces relevant clinical trials, and assembles a structured MTB-ready report. The entire pipeline runs on a single NVIDIA DGX Spark ($4,699, March 2026).

Advanced Configuration Options

Variable	Default	Description
`ONCO_AGENT_PORT`	`8527`	FastAPI listen port
`MILVUS_HOST`	`localhost`	Milvus server hostname
`MILVUS_PORT`	`19530`	Milvus gRPC port
`MILVUS_POOL_SIZE`	`10`	Connection pool size; increase for concurrent MTB sessions
`ANTHROPIC_API_KEY`	(required)	Claude API key for LLM synthesis
`LLM_MODEL`	`claude-sonnet-4-20250514`	Claude model ID; upgrade to `claude-opus-4-20250514` for complex cases
`EMBEDDING_MODEL`	`BAAI/bge-small-en-v1.5`	Sentence-transformer for 384-dim embeddings
`TOP_K`	`15`	Max chunks retrieved per collection search
`EVIDENCE_THRESHOLD`	`0.65`	Minimum cosine similarity for evidence inclusion
`CROSS_AGENT_TIMEOUT`	`30`	Seconds to wait for cross-agent responses

Milvus Collections (11)

Collection	Description	Source
`onco_literature`	PubMed/PMC chunks by cancer type	PubMed E-utils
`onco_trials`	Clinical trial summaries with biomarker criteria	ClinicalTrials.gov
`onco_variants`	Actionable somatic/germline variants	CIViC, OncoKB
`onco_biomarkers`	Predictive/prognostic biomarkers and assays	CIViC, literature
`onco_therapies`	Approved and investigational therapies with MOA	OncoKB, FDA labels
`onco_pathways`	Signaling pathways, cross-talk, druggable nodes	KEGG, Reactome
`onco_guidelines`	NCCN/ASCO/ESMO guideline recommendations	Guideline PDFs
`onco_resistance`	Resistance mechanisms and bypass strategies	CIViC, literature
`onco_outcomes`	Real-world treatment outcome records	De-identified RWD
`onco_cases`	De-identified patient case snapshots	Synthetic/RWD
`genomic_evidence`	VCF-derived evidence (shared, read-only)	Stage 1 pipeline

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with component status
GET	`/collections`	List all loaded collections with record counts
GET	`/knowledge/stats`	Knowledge base statistics (total vectors, last ingest)
GET	`/metrics`	Prometheus-compatible metrics export
POST	`/query`	Free-text oncology RAG query with evidence synthesis
POST	`/search`	Direct vector similarity search across collections
POST	`/find-related`	Find related concepts given an entity (gene, drug, pathway)
POST	`/v1/onco/integrated-assessment`	Full multi-collection integrated assessment
POST	`/api/ask`	Meta-agent question answering with chain-of-thought
POST	`/api/cases`	Create a new patient case with cancer type and variants
GET	`/api/cases/{case_id}`	Retrieve case details and analysis status
POST	`/api/cases/{case_id}/mtb`	Run full molecular tumor board analysis for a case
GET	`/api/cases/{case_id}/variants`	List annotated variants for a case
POST	`/api/trials/match`	Match molecular profile to clinical trials
POST	`/api/trials/match-case/{case_id}`	Match a stored case to clinical trials
POST	`/api/therapies/rank`	Rank therapies by evidence level and biomarker fit
POST	`/api/reports/generate`	Generate MTB report (Markdown, JSON, PDF, FHIR R4)
GET	`/api/reports/{case_id}/{fmt}`	Download generated report in specified format
GET	`/api/events`	List cross-agent events (SSE stream)
GET	`/api/events/{event_id}`	Retrieve a specific event by ID

Detailed Clinical Workflow Walkthrough

Variant submission -- Clinician uploads somatic/germline VCF via Streamlit UI or POSTs variant list to /api/cases. The VCF parser extracts gene, protein change, allele frequency, and quality metrics.
Entity detection -- The agent identifies gene names (EGFR, BRAF, ALK), variant notations (L858R, V600E), and cancer types from free-text or structured input.
Evidence-level annotation -- Variants are annotated against CIViC and OncoKB with AMP/ASCO/CAP evidence-level tiering (Level IA through Level III). Each variant receives a clinical significance classification.
Multi-collection retrieval -- Parallel search across all 11 collections: therapies, biomarkers, trials, resistance mechanisms, pathways, guidelines, outcomes, and cases. Each result carries a cosine similarity score.
TherapyRanker scoring -- Approved and investigational therapies are scored using a composite of evidence level (40%), biomarker match (25%), resistance profile (15%), guideline concordance (10%), and trial availability (10%).
TrialMatcher execution -- Molecular profile is matched against eligibility criteria in onco_trials. Results include NCT IDs, phase, enrollment status, and distance to nearest trial sites.
Claude LLM synthesis -- All retrieved evidence is assembled into a structured prompt. Claude generates a narrative summary with inline citations, therapy recommendations, and resistance warnings.
MTB report generation -- Structured report exported as Markdown, JSON, PDF, or FHIR R4 DiagnosticReport with tiered evidence, therapy rankings, trial matches, and resistance alerts.

Example Query and Response

curl -X POST http://localhost:8527/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What targeted therapies are available for EGFR L858R in NSCLC?"}'

{
  "answer": "EGFR L858R is a Level IA actionable target in NSCLC. Osimertinib (3rd-gen TKI) is the preferred first-line therapy per NCCN 2026 guidelines, with a median PFS of 18.9 months (FLAURA trial). Second-line options include amivantamab + lazertinib for C797S-mediated resistance...",
  "evidence": [
    {"collection": "onco_therapies", "score": 0.94, "text": "Osimertinib: 3rd-generation EGFR TKI, FDA-approved for EGFR-mutant NSCLC..."},
    {"collection": "onco_guidelines", "score": 0.91, "text": "NCCN NSCLC v3.2026: EGFR L858R — Preferred first-line: osimertinib..."},
    {"collection": "onco_resistance", "score": 0.87, "text": "C797S mutation confers resistance to osimertinib in 10-15% of cases..."},
    {"collection": "onco_trials", "score": 0.83, "text": "NCT04487080: Phase III, osimertinib + savolitinib for MET-amplified progression..."}
  ],
  "therapies_ranked": [
    {"rank": 1, "drug": "Osimertinib", "evidence": "Level IA", "score": 0.96},
    {"rank": 2, "drug": "Amivantamab + Lazertinib", "evidence": "Level IB", "score": 0.82},
    {"rank": 3, "drug": "Erlotinib + Ramucirumab", "evidence": "Level IB", "score": 0.74}
  ],
  "active_trials": 8,
  "confidence": 0.93,
  "collections_searched": 11,
  "processing_time_ms": 3420
}

Cross-Agent Integration

Receives genomic evidence from Stage 1 pipeline via shared genomic_evidence collection (3.56M variants)
Triggers Clinical Trial Agent (:8538) when novel actionable variants are identified, passing variant list and cancer type for trial matching
Receives cardiotoxicity alerts from Cardiology Agent (:8126) for proposed chemotherapy regimens (anthracyclines, trastuzumab) with pre-treatment ejection fraction data
Sends antigen expression queries to Single-Cell Agent (:8540) for immunotherapy candidate validation (TME profiling, PD-L1 expression)
Connects to Drug Discovery pipeline (Stage 3) for candidate molecule docking when no approved therapies match the variant profile

Performance Characteristics

Operation	Typical Latency
Single collection search (top-15)	80-150 ms
Full 11-collection parallel search	200-400 ms
Embedding generation (BGE-small)	15-25 ms
Claude LLM synthesis	2,000-3,500 ms
Full MTB analysis (end-to-end)	3,500-5,000 ms
Report generation (PDF)	1,500-2,500 ms

Common Pitfalls and Troubleshooting

"No actionable variants found" -- Check that the VCF contains PASS-filtered variants with gene annotations. Raw VCF without functional annotation will not match CIViC/OncoKB entries. Run bcftools view -f PASS first.
Therapy ranking returns empty -- Verify that onco_therapies collection is loaded (GET /collections). If the collection shows 0 records, re-run the ingest script.
Slow LLM responses (>10s) -- Reduce TOP_K from 15 to 8 to shrink the context window. Alternatively, switch to a faster Claude model via LLM_MODEL.
Cross-agent timeout -- If the Cardiology or Trial agent is down, the oncology agent logs a warning but continues without cross-agent enrichment. Check CROSS_AGENT_TIMEOUT and verify target agents are healthy.
Duplicate case IDs -- Case creation is idempotent on patient_id + cancer_type. Submitting the same combination returns the existing case rather than creating a duplicate.

3.2 CAR-T Intelligence Agent (:8522)¶

Overview and Clinical Purpose

The CAR-T Intelligence Agent breaks down data silos across the 5 stages of CAR-T cell therapy development: target selection, construct design, manufacturing, clinical evaluation, and post-market surveillance. It features a unique comparative analysis mode that auto-detects "X vs Y" queries and produces structured side-by-side comparisons. The agent covers all 6 FDA-approved CAR-T products, 25 target antigens, and 39+ product aliases, running on a single DGX Spark ($4,699, March 2026).

Advanced Configuration Options

Variable	Default	Description
`CART_AGENT_PORT`	`8522`	FastAPI listen port
`MILVUS_HOST`	`localhost`	Milvus server hostname
`MILVUS_PORT`	`19530`	Milvus gRPC port
`MILVUS_POOL_SIZE`	`10`	Connection pool size
`ANTHROPIC_API_KEY`	(required)	Claude API key for LLM synthesis
`LLM_MODEL`	`claude-sonnet-4-20250514`	Claude model; use `claude-opus-4-20250514` for nuanced comparative analyses
`EMBEDDING_MODEL`	`BAAI/bge-small-en-v1.5`	384-dim sentence-transformer
`TOP_K`	`15`	Max chunks per collection search
`COMPARATIVE_THRESHOLD`	`0.70`	Minimum confidence to trigger comparative mode
`QUERY_EXPANSION_MAPS`	`12`	Number of domain-specific expansion maps (169 keywords to 1,496 terms)

Milvus Collections (11)

Collection	Records	Source
`cart_literature`	5,047	PubMed abstracts via NCBI E-utilities
`cart_trials`	973	ClinicalTrials.gov API v2
`cart_constructs`	6	6 FDA-approved CAR-T products
`cart_assay_results`	45	Curated from landmark papers (ELIANA, ZUMA-1, KarMMa, CARTITUDE-1)
`cart_manufacturing`	30	CMC/process data (transduction, expansion, release, cryo, logistics)
`cart_safety`	--	Pharmacovigilance, CRS/ICANS profiles
`cart_biomarkers`	--	CRS prediction, exhaustion monitoring
`cart_regulatory`	--	Approval timelines, post-marketing requirements
`cart_sequences`	--	Molecular binding, scFv sequences
`cart_rwe`	--	Registry outcomes, real-world data
`genomic_evidence`	3.56M	Shared from Stage 2 RAG pipeline (read-only)

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with Milvus and engine status
GET	`/collections`	List loaded collections with record counts
GET	`/knowledge/stats`	Knowledge base statistics and last ingest time
GET	`/metrics`	Prometheus-compatible metrics export
POST	`/query`	Cross-functional RAG query (standard or comparative, auto-detected)
POST	`/search`	Direct vector similarity search with collection filtering
POST	`/find-related`	Find related entities (antigens, products, mechanisms)
POST	`/v1/cart/integrated-assessment`	Full multi-collection integrated CAR-T assessment
POST	`/api/ask`	Meta-agent question answering with reasoning chain
GET	`/api/reports/{patient_id}`	Retrieve generated report for a patient
GET	`/api/reports/{patient_id}/{fmt}`	Download report in specified format (md, json, pdf, fhir)
GET	`/api/events`	List cross-agent events
GET	`/api/events/{event_id}`	Retrieve a specific cross-agent event

Detailed Clinical Workflow Walkthrough

Query submission -- User submits a question about any stage of CAR-T development via Streamlit UI or POST to /query. Input can be free-text or structured with explicit parameters.
Comparative detection -- The engine scans for "vs", "versus", "compare", "difference between" patterns. If detected, it enters comparative mode; otherwise, standard RAG mode.
Entity resolution (comparative) -- Both entities are parsed and resolved against the knowledge graph containing 25 antigens, 6 FDA-approved products, and 39+ aliases (e.g., "tisa-cel" resolves to "tisagenlecleucel / Kymriah").
Dual retrieval (comparative) -- Separate searches are run for each entity across all 11 collections. Results are aligned by category (efficacy, safety, manufacturing, cost) for side-by-side comparison.
Query expansion (standard) -- 12 domain-specific expansion maps transform 169 seed keywords into 1,496 search terms. For example, "CRS management" expands to include "cytokine release syndrome", "tocilizumab", "IL-6 blockade", "grading criteria".
Parallel multi-collection search -- All 11 collections are searched concurrently. Results are scored by cosine similarity and weighted by collection relevance to the query domain.
Claude LLM synthesis -- Retrieved evidence is assembled with comparative formatting when applicable. Responses include clickable PubMed IDs (PMID) and ClinicalTrials.gov NCT numbers.
Report generation -- Exportable as Markdown, JSON, PDF, or FHIR R4 DiagnosticReport.

Example Query and Response

curl -X POST http://localhost:8522/query \
  -H "Content-Type: application/json" \
  -d '{"question": "Compare 4-1BB vs CD28 costimulatory domains for DLBCL"}'

{
  "answer": "4-1BB (CD137) and CD28 costimulatory domains represent the two main signaling architectures in FDA-approved CAR-T products for DLBCL...",
  "comparison": {
    "entity_a": "4-1BB (CD137)",
    "entity_b": "CD28",
    "dimensions": [
      {"category": "Persistence", "entity_a": "Superior T-cell persistence (>6 months detectable)", "entity_b": "Rapid expansion, shorter persistence (~3 months)"},
      {"category": "CRS Rate", "entity_a": "Lower grade 3+ CRS (2-22%)", "entity_b": "Higher grade 3+ CRS (13-15%)"},
      {"category": "FDA Products", "entity_a": "Tisagenlecleucel (Kymriah), Lisocabtagene (Breyanzi)", "entity_b": "Axicabtagene (Yescarta), Brexucabtagene (Tecartus)"},
      {"category": "ORR (DLBCL)", "entity_a": "52-73%", "entity_b": "72-83%"}
    ]
  },
  "evidence": [
    {"collection": "cart_constructs", "score": 0.93, "text": "4-1BB signaling promotes memory T-cell differentiation..."},
    {"collection": "cart_literature", "score": 0.89, "text": "ZUMA-1: axi-cel (CD28) achieved 83% ORR in r/r DLBCL (PMID: 28982775)..."},
    {"collection": "cart_safety", "score": 0.86, "text": "CRS incidence comparison across costimulatory domains..."}
  ],
  "mode": "comparative",
  "confidence": 0.91,
  "processing_time_ms": 3180
}

Cross-Agent Integration

Receives antigen escape alerts from Single-Cell Agent (:8540) when off-tumor expression is detected for a target antigen (e.g., CD19 loss in B-ALL relapse)
Queries Oncology Agent (:8527) for tumor mutational burden and microsatellite instability context that may influence CAR-T eligibility
Shares CRS/ICANS toxicity profiles with Biomarker Agent (:8529) for real-time monitoring protocols (IL-6, ferritin, CRP thresholds)
Triggers Clinical Trial Agent (:8538) when a query involves investigational CAR-T constructs not yet FDA-approved

Performance Characteristics

Operation	Typical Latency
Standard single-collection search	80-150 ms
Full 11-collection parallel search	200-400 ms
Comparative dual-entity retrieval	350-600 ms
Query expansion (12 maps)	5-10 ms
Claude LLM synthesis (standard)	2,000-3,000 ms
Claude LLM synthesis (comparative)	2,500-4,000 ms
End-to-end query (standard)	2,500-3,500 ms
End-to-end query (comparative)	3,000-5,000 ms

Common Pitfalls and Troubleshooting

Comparative mode not triggering -- The detection engine requires explicit comparison language ("vs", "versus", "compare", "difference between"). Rephrase ambiguous queries to include one of these trigger words.
Entity resolution failure -- If a product name is misspelled or uses a non-standard alias, the knowledge graph may not resolve it. Use canonical names (e.g., "axicabtagene ciloleucel" not "axi-cel") or add custom aliases via the configuration.
Empty cart_constructs collection -- This is a small curated collection (6 records). If it returns 0, re-run scripts/seed_constructs.py to reload the 6 FDA-approved product profiles.
Slow comparative queries -- Comparative mode runs two full retrieval passes. If latency exceeds 5s, reduce TOP_K from 15 to 10 or limit the collection set.
Missing PubMed citations -- The cart_literature collection requires periodic refresh via the NCBI E-utilities ingest script. Stale data will miss recent publications.

3.3 Precision Biomarker Agent (:8529)¶

Overview and Clinical Purpose

The Precision Biomarker Agent interprets patient biomarker panels with genotype awareness. It estimates biological age using PhenoAge/GrimAge algorithms, detects disease trajectories across 6 categories, provides pharmacogenomic profiling for 7 key pharmacogenes, and generates genotype-adjusted reference ranges. All processing runs on a single DGX Spark ($4,699, March 2026).

Advanced Configuration Options

Variable	Default	Description
`BIOMARKER_AGENT_PORT`	`8529`	FastAPI listen port
`MILVUS_HOST`	`localhost`	Milvus server hostname
`MILVUS_PORT`	`19530`	Milvus gRPC port
`MILVUS_POOL_SIZE`	`10`	Connection pool size
`ANTHROPIC_API_KEY`	(required)	Claude API key for LLM synthesis
`LLM_MODEL`	`claude-sonnet-4-20250514`	Claude model ID
`EMBEDDING_MODEL`	`BAAI/bge-small-en-v1.5`	384-dim sentence-transformer
`TOP_K`	`15`	Max chunks per collection search
`AGING_ALGORITHM`	`phenoage`	Biological age algorithm: `phenoage` or `grimage`
`DISEASE_CATEGORIES`	`6`	Number of disease trajectory categories screened
`PGX_GENES`	`7`	Number of pharmacogenes profiled (CYP2D6, CYP2C19, CYP2C9, CYP3A5, SLCO1B1, VKORC1, MTHFR)

Milvus Collections (11)

Collection	Description	Records
`biomarker_reference`	Reference biomarker definitions and ranges	~60
`biomarker_genetic_variants`	Genetic variants affecting biomarker levels	~30
`biomarker_pgx_rules`	CPIC pharmacogenomic dosing rules	~50
`biomarker_disease_trajectories`	Disease progression stage definitions	~30
`biomarker_clinical_evidence`	Published clinical evidence	~40
`biomarker_nutrition`	Genotype-aware nutrition guidelines	~25
`biomarker_drug_interactions`	Gene-drug interactions	~35
`biomarker_aging_markers`	Epigenetic aging clock markers	~15
`biomarker_genotype_adjustments`	Genotype-based reference range adjustments	~20
`biomarker_monitoring`	Condition-specific monitoring protocols	~15
`genomic_evidence`	Shared genomic variants (read-only)	3.56M

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with Milvus and engine status
GET	`/collections`	List loaded collections with record counts
GET	`/knowledge/stats`	Knowledge base statistics
GET	`/metrics`	Prometheus-compatible metrics export
POST	`/v1/biomarker/integrated-assessment`	Full multi-collection integrated biomarker assessment
POST	`/v1/analyze`	Full patient analysis (biomarkers + genotypes combined)
POST	`/v1/biological-age`	Biological age calculation (PhenoAge/GrimAge)
POST	`/v1/disease-risk`	Disease risk trajectory analysis across 6 categories
POST	`/v1/pgx`	Pharmacogenomic profile from star allele diplotypes
POST	`/v1/query`	Free-text RAG query across all biomarker collections
POST	`/v1/query/stream`	Streaming RAG query (Server-Sent Events)
POST	`/v1/report/generate`	Generate clinical report (Markdown, JSON, PDF, FHIR R4)
GET	`/v1/report/{report_id}/pdf`	Download generated PDF report
POST	`/v1/report/fhir`	Export analysis as FHIR R4 DiagnosticReport
POST	`/v1/events/cross-modal`	Receive cross-modal events from other agents
POST	`/v1/events/biomarker-alert`	Receive biomarker alert events (threshold breaches)
GET	`/v1/events/cross-modal`	List received cross-modal events
GET	`/v1/events/biomarker-alert`	List received biomarker alerts

Detailed Clinical Workflow Walkthrough

Data submission -- Patient biomarker panel (CRP, HbA1c, LDL, albumin, creatinine, glucose, etc.) and genotype data (star alleles or VCF-derived diplotypes) are submitted via Streamlit UI or POST to /v1/analyze.
Biological Age Engine -- Computes PhenoAge and GrimAge estimates from 9 routine blood biomarkers (albumin, creatinine, glucose, CRP, lymphocyte %, mean cell volume, red cell distribution width, alkaline phosphatase, white blood cell count). Output: chronological age, biological age, acceleration (positive = aging faster).
Disease Trajectory Analyzer -- Screens 6 disease categories (cardiovascular, metabolic/diabetes, hepatic, renal, inflammatory, hematologic) using genotype-stratified thresholds. Each trajectory returns a stage (normal, borderline, early, established, advanced) with progression rate.
Pharmacogenomic Mapper -- Interprets star alleles for CYP2D6, CYP2C19, CYP2C9, CYP3A5, SLCO1B1, VKORC1, MTHFR, plus HLA-B*57:01 screening. Maps diplotypes to metabolizer phenotypes (ultra-rapid, normal, intermediate, poor) with CPIC-level dosing recommendations.
Genotype Adjustment Engine -- Modifies standard reference ranges based on genomic context. For example: PNPLA3 I148M carriers have lower ALT thresholds for NAFLD screening; TCF7L2 rs7903146 T/T carriers have elevated fasting glucose baselines; APOE e4/e4 carriers have higher LDL reference ceilings.
Evidence retrieval -- Parallel search across all 11 collections to ground recommendations in published literature, CPIC guidelines, and clinical evidence.
Report generation -- 12-section clinical report covering biological age summary, disease trajectories (6 categories), pharmacogenomic profile, genotype-adjusted ranges, nutrition recommendations, drug interaction alerts, and monitoring schedule. Exported as PDF and FHIR R4.

Example Query and Response

curl -X POST http://localhost:8529/v1/analyze \
  -H "Content-Type: application/json" \
  -d '{"biomarkers": {"CRP": 2.1, "HbA1c": 6.8, "LDL": 145, "albumin": 4.2, "creatinine": 0.9, "glucose": 118}, "genotypes": {"CYP2D6": "*1/*4", "CYP2C19": "*1/*2"}}'

{
  "biological_age": {
    "chronological_age": 55,
    "phenoage": 58.3,
    "grimage": 57.1,
    "acceleration": "+3.3 years",
    "percentile": 72,
    "interpretation": "Moderately accelerated aging. CRP and glucose are primary contributors."
  },
  "disease_trajectories": [
    {"category": "Metabolic", "stage": "Borderline", "risk_score": 0.62, "key_markers": ["HbA1c: 6.8 (pre-diabetic)", "glucose: 118"]},
    {"category": "Cardiovascular", "stage": "Early", "risk_score": 0.48, "key_markers": ["LDL: 145", "CRP: 2.1"]},
    {"category": "Inflammatory", "stage": "Borderline", "risk_score": 0.35, "key_markers": ["CRP: 2.1"]}
  ],
  "pharmacogenomics": {
    "CYP2D6": {"diplotype": "*1/*4", "phenotype": "Intermediate Metabolizer", "drugs_affected": ["codeine", "tramadol", "tamoxifen"]},
    "CYP2C19": {"diplotype": "*1/*2", "phenotype": "Intermediate Metabolizer", "drugs_affected": ["clopidogrel", "omeprazole", "escitalopram"]}
  },
  "genotype_adjustments": [
    {"marker": "CYP2D6", "adjustment": "Codeine: avoid or reduce dose by 50%; consider morphine alternative"}
  ],
  "confidence": 0.88,
  "processing_time_ms": 3850
}

Cross-Agent Integration

Receives inflammation monitoring requests from Autoimmune Agent (:8532) during flare events, tracking CRP, ESR, IL-6, and calprotectin trends
Provides biological age context to Cardiology Agent (:8126) for ASCVD risk refinement -- accelerated biological age increases the 10-year risk estimate
Shares pharmacogenomic profiles with Pharmacogenomics Agent (:8107) for detailed CPIC/DPWG dosing guidance
Sends biomarker alerts to Oncology Agent (:8527) when tumor markers (CEA, CA-125, PSA) exceed surveillance thresholds
Receives genotype-adjusted reference range requests from Rare Disease Agent (:8134) for rare metabolic conditions

Performance Characteristics

Operation	Typical Latency
Single collection search (top-15)	80-150 ms
Full 11-collection parallel search	200-400 ms
Biological age calculation	50-100 ms
Disease trajectory analysis (6 categories)	100-200 ms
PGx diplotype interpretation	30-60 ms
Claude LLM synthesis	2,000-3,500 ms
Full analysis (end-to-end)	3,000-5,000 ms
PDF report generation	1,500-2,500 ms

Common Pitfalls and Troubleshooting

Biological age returns null -- The PhenoAge algorithm requires at least 7 of the 9 input biomarkers. If fewer are provided, the engine falls back to a partial estimate with a warning flag. Ensure albumin, creatinine, and glucose are included at minimum.
PGx phenotype "Indeterminate" -- This occurs when the star allele combination is not in the CPIC lookup table. Verify that diplotype notation follows PharmVar conventions (e.g., *1/*4 not *1/4).
Genotype adjustments not applied -- The adjustment engine requires both the biomarker value and the relevant genotype. Submitting biomarkers without genotypes produces standard (non-adjusted) reference ranges.
Disease trajectory showing "Unknown" stage -- Some trajectories require specific marker combinations. For example, the hepatic trajectory needs ALT, AST, and albumin together. A single liver enzyme without context is insufficient.
Slow report generation -- PDF rendering adds 1-2s overhead. For faster iteration, request Markdown format first, then generate PDF only for the final version.

3.4 Cardiology Intelligence Agent (:8126)¶

Overview and Clinical Purpose

The Cardiology Intelligence Agent synthesizes cardiac imaging, electrophysiology, hemodynamics, heart failure management, valvular disease, preventive cardiology, interventional data, and cardio-oncology surveillance into ACC/AHA/ESC guideline-aligned recommendations. It includes 6 validated risk calculators (ASCVD, HEART Score, CHA2DS2-VASc, HAS-BLED, MAGGIC, EuroSCORE II), a GDMT optimizer, and 11 clinical workflows. All processing runs on a single DGX Spark ($4,699, March 2026).

Advanced Configuration Options

Variable	Default	Description
`CARDIO_AGENT_PORT`	`8126`	FastAPI listen port
`MILVUS_HOST`	`localhost`	Milvus server hostname
`MILVUS_PORT`	`19530`	Milvus gRPC port
`MILVUS_POOL_SIZE`	`10`	Connection pool size
`ANTHROPIC_API_KEY`	(required)	Claude API key for LLM synthesis
`LLM_MODEL`	`claude-sonnet-4-20250514`	Claude model ID
`EMBEDDING_MODEL`	`BAAI/bge-small-en-v1.5`	384-dim sentence-transformer
`TOP_K`	`15`	Max chunks per collection search
`GDMT_TITRATION_STEPS`	`4`	GDMT medication classes optimized simultaneously (ARNI/ACEi, beta-blocker, MRA, SGLT2i)
`RISK_CALCULATOR_COUNT`	`6`	Number of validated risk calculators

Milvus Collections (13)

Collection	Description
`cardio_literature`	Published cardiovascular research, reviews, meta-analyses
`cardio_trials`	Cardiovascular clinical trials and landmark results
`cardio_imaging`	Cardiac imaging protocols, findings, measurements
`cardio_electrophysiology`	ECG interpretation, arrhythmia classification, EP data
`cardio_heart_failure`	HF classification, GDMT protocols, management algorithms
`cardio_valvular`	Valvular heart disease assessment and intervention criteria
`cardio_prevention`	Preventive cardiology: risk stratification, lipid management
`cardio_interventional`	Interventional procedures, techniques, outcomes
`cardio_oncology`	Cardio-oncology surveillance, cardiotoxicity detection
`cardio_devices`	FDA-cleared cardiovascular AI devices, implantables
`cardio_guidelines`	ACC/AHA/ESC/HRS clinical practice guidelines
`cardio_hemodynamics`	Catheterization data, pressure tracings, derived calculations
`genomic_evidence`	Shared genomic evidence (read-only, 3.56M variants)

6 Risk Calculators

Calculator	Clinical Use
ASCVD (PCE)	10-year atherosclerotic cardiovascular disease risk
HEART Score	Chest pain risk stratification in ED
CHA2DS2-VASc	Atrial fibrillation stroke risk
HAS-BLED	Anticoagulation bleeding risk
MAGGIC	Heart failure mortality risk
EuroSCORE II	Cardiac surgical mortality risk

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with Milvus, RAG engine, GDMT optimizer, and calculator status
GET	`/collections`	List loaded collections with record counts
GET	`/workflows`	List all 11 clinical workflow definitions
GET	`/metrics`	Prometheus-compatible metrics export
POST	`/v1/cardio/integrated-assessment`	Full multi-collection integrated cardiac assessment
POST	`/v1/cardio/query`	Free-text RAG query across all cardiology collections
POST	`/v1/cardio/search`	Direct vector similarity search with collection filtering
POST	`/v1/cardio/find-related`	Find related entities (drugs, conditions, biomarkers)
POST	`/v1/cardio/risk/ascvd`	ASCVD 10-year risk (Pooled Cohort Equations)
POST	`/v1/cardio/risk/heart-score`	HEART Score for ED chest pain risk stratification
POST	`/v1/cardio/risk/cha2ds2-vasc`	CHA2DS2-VASc stroke risk for atrial fibrillation
POST	`/v1/cardio/risk/has-bled`	HAS-BLED anticoagulation bleeding risk
POST	`/v1/cardio/risk/maggic`	MAGGIC heart failure mortality risk
POST	`/v1/cardio/risk/euroscore`	EuroSCORE II cardiac surgical mortality
POST	`/v1/cardio/gdmt/optimize`	GDMT titration optimization for heart failure
POST	`/v1/cardio/workflow/cad`	Coronary artery disease assessment workflow
POST	`/v1/cardio/workflow/heart-failure`	Heart failure classification and management
POST	`/v1/cardio/workflow/valvular`	Valvular heart disease workflow
POST	`/v1/cardio/workflow/arrhythmia`	Arrhythmia and EP assessment
POST	`/v1/cardio/workflow/cardiac-mri`	Cardiac MRI tissue characterization
POST	`/v1/cardio/workflow/stress-test`	Stress testing protocol
POST	`/v1/cardio/workflow/prevention`	Cardiovascular prevention and risk stratification
POST	`/v1/cardio/workflow/cardio-oncology`	Cardio-oncology surveillance
POST	`/v1/cardio/cross-modal/evaluate`	Cross-modal genomic-cardiac evaluation
GET	`/v1/cardio/guidelines`	List available clinical guideline references
GET	`/v1/cardio/conditions`	List supported cardiovascular conditions
GET	`/v1/cardio/biomarkers`	List tracked cardiac biomarkers
GET	`/v1/cardio/drugs`	List cardiovascular drugs in knowledge base
GET	`/v1/cardio/genes`	List cardiomyopathy-associated genes
GET	`/v1/cardio/knowledge-version`	Knowledge base version and last update timestamp
POST	`/v1/reports/generate`	Generate clinical report (Markdown, JSON, PDF, FHIR R4)
GET	`/v1/reports/formats`	List available report export formats
GET	`/v1/events/stream`	SSE stream for cross-agent events
GET	`/v1/events/health`	Event system health status

Detailed Clinical Workflow Walkthrough (11 workflows)

Coronary Artery Disease (CAD) -- Calcium score interpretation, CAD-RADS classification, plaque characterization, revascularization planning (PCI vs CABG), and ischemia-guided management.
Heart Failure Management -- HFrEF/HFpEF/HFmrEF classification by ejection fraction, GDMT optimization (4-pillar: ARNI/ACEi, beta-blocker, MRA, SGLT2i), device evaluation (ICD/CRT), and hemodynamic profiling.
Valvular Heart Disease -- Severity grading (mild/moderate/severe) by echocardiographic criteria, intervention timing per ACC/AHA guidelines, and prosthetic valve follow-up.
Arrhythmia and EP -- AF management with CHA2DS2-VASc and HAS-BLED scoring, ablation candidacy assessment, and device programming optimization.
Cardiac MRI -- LGE pattern interpretation, T1/T2 mapping analysis, strain assessment, and tissue characterization for cardiomyopathy diagnosis.
Stress Testing -- Exercise and pharmacologic stress interpretation, Duke treadmill score, perfusion defect classification, and ischemia quantification.
Cardiovascular Prevention -- ASCVD risk stratification, lipid management (statin/PCSK9i/ezetimibe selection), lifestyle Rx, and risk enhancer assessment.
Cardio-Oncology -- Chemotherapy cardiotoxicity surveillance, GLS tracking (global longitudinal strain), troponin/BNP monitoring, and CTRCD detection.
Acute Decompensated HF -- Hemodynamic profiling (warm-wet/cold-wet/cold-dry/warm-dry), IV diuretic dosing, inotrope selection, and MCS escalation criteria.
Post-MI Secondary Prevention -- Reperfusion assessment, DAPT strategy duration, beta-blocker/statin/ACEi titration, cardiac rehab referral, and ICD timing.
Myocarditis/Pericarditis -- Lake Louise CMR criteria evaluation, biopsy indications, NSAIDs/colchicine dosing, and activity restriction guidelines.

Example Query and Response

curl -X POST http://localhost:8126/v1/cardio/risk/heart-score \
  -H "Content-Type: application/json" \
  -d '{"age": 62, "history": "moderately suspicious", "ecg": "nonspecific ST changes", "troponin": "normal", "risk_factors": 3}'

{
  "calculator": "HEART Score",
  "score": 6,
  "risk_category": "Moderate",
  "interpretation": "HEART Score 6/10 (Moderate risk). 30-day MACE rate: 12-17%. Recommend observation, serial troponins, and non-invasive testing within 72 hours.",
  "components": {
    "history": {"score": 1, "detail": "Moderately suspicious for ACS"},
    "ecg": {"score": 1, "detail": "Nonspecific ST-segment changes"},
    "age": {"score": 2, "detail": "Age 62 (>= 45 and < 65)"},
    "risk_factors": {"score": 2, "detail": "3+ risk factors (>= 3)"},
    "troponin": {"score": 0, "detail": "Normal troponin"}
  },
  "guideline_reference": "ACC/AHA 2021 Chest Pain Guideline, Section 5.3",
  "confidence": 0.95,
  "processing_time_ms": 280
}

Cross-Agent Integration

Sends cardiotoxicity alerts to Oncology Agent (:8527) when chemotherapy candidates show cardiac risk (pre-treatment EF, GLS baseline, anthracycline cumulative dose tracking)
Receives biomarker trends (BNP, troponin, CRP) from Biomarker Agent (:8529) for longitudinal heart failure monitoring
Queries genomic_evidence for cardiomyopathy-associated variants (TTN, LMNA, MYH7, MYBPC3, SCN5A) to integrate genetic risk into clinical assessment
Receives stroke triage events from Neurology Agent (:8528) to trigger cardiac workup (AF screening, echocardiography) after cryptogenic stroke
Provides cardiotoxicity risk context to CAR-T Agent (:8522) for patients receiving lymphodepleting chemotherapy

Performance Characteristics

Operation	Typical Latency
Risk calculator (any of 6)	50-150 ms
GDMT optimization	200-500 ms
Single collection search (top-15)	80-150 ms
Full 13-collection parallel search	250-450 ms
Claude LLM synthesis	2,000-3,500 ms
Full integrated assessment	3,500-5,500 ms
Report generation (PDF)	1,500-2,500 ms

Common Pitfalls and Troubleshooting

ASCVD risk returns "out of range" -- The Pooled Cohort Equations are validated for ages 40-79. Inputs outside this range receive a warning and extrapolated estimate.
HEART Score confusion with Framingham -- The HEART Score is for acute chest pain triage in the ED (History, ECG, Age, Risk factors, Troponin), not for long-term cardiovascular risk. Use ASCVD (PCE) for 10-year risk prediction.
GDMT optimizer returns "contraindicated" -- The optimizer checks for absolute contraindications (hyperkalemia for MRA, bradycardia for beta-blockers, angioedema history for ACEi). Review the contraindication reason in the response before overriding.
Missing workflow endpoints -- Workflows for acute decompensated HF, post-MI, and myocarditis/pericarditis use the generic /v1/cardio/query endpoint with structured parameters rather than dedicated workflow routes.
Cross-modal evaluation empty -- The /v1/cardio/cross-modal/evaluate endpoint requires genomic_evidence collection to be loaded. Verify via GET /collections that it shows 3.56M records.

3.5 Neurology Intelligence Agent (:8528)¶

Overview and Clinical Purpose

The Neurology Intelligence Agent supports 8 specialized clinical workflows plus a general neurology RAG query mode (9 total), covering acute stroke triage, dementia evaluation, epilepsy classification, brain tumor grading, MS monitoring, Parkinson's assessment, headache classification, and neuromuscular evaluation. It integrates guidelines from AAN, AHA/ASA, ILAE, ICHD-3, WHO CNS 2021, McDonald 2017, and MDS criteria, with 10 validated clinical scale calculators. All processing runs on a single DGX Spark ($4,699, March 2026).

Advanced Configuration Options

Variable	Default	Description
`NEURO_AGENT_PORT`	`8528`	FastAPI listen port
`MILVUS_HOST`	`localhost`	Milvus server hostname
`MILVUS_PORT`	`19530`	Milvus gRPC port
`MILVUS_POOL_SIZE`	`10`	Connection pool size
`ANTHROPIC_API_KEY`	(required)	Claude API key for LLM synthesis
`LLM_MODEL`	`claude-sonnet-4-20250514`	Claude model ID
`EMBEDDING_MODEL`	`BAAI/bge-small-en-v1.5`	384-dim sentence-transformer
`TOP_K`	`15`	Max chunks per collection search
`STROKE_WINDOW_HOURS`	`4.5`	tPA eligibility window in hours from symptom onset
`THROMBECTOMY_WINDOW_HOURS`	`24`	Mechanical thrombectomy eligibility window (DAWN/DEFUSE-3)

10 Clinical Scale Calculators

Scale	Purpose
NIHSS	National Institutes of Health Stroke Scale
GCS	Glasgow Coma Scale
MoCA	Montreal Cognitive Assessment
MDS-UPDRS Part III	Parkinson's motor examination
EDSS	Expanded Disability Status Scale (MS)
mRS	Modified Rankin Scale (functional outcome)
HIT-6	Headache Impact Test
ALSFRS-R	ALS Functional Rating Scale -- Revised
ASPECTS	Alberta Stroke Programme Early CT Score
Hoehn-Yahr	Parkinson's disease staging

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with Milvus, RAG engine, and workflow engine status
GET	`/collections`	List loaded collections with record counts
GET	`/workflows`	List all 9 workflow definitions
GET	`/metrics`	Prometheus-compatible metrics export
POST	`/v1/neuro/integrated-assessment`	Full multi-collection integrated neurological assessment
POST	`/v1/neuro/query`	Free-text RAG query across all neurology collections
POST	`/v1/neuro/search`	Direct vector similarity search with collection filtering
POST	`/v1/neuro/scale/calculate`	Clinical scale calculation (any of 10 scales)
POST	`/v1/neuro/stroke/triage`	Acute stroke triage with NIHSS, ASPECTS, thrombolysis/thrombectomy eligibility
POST	`/v1/neuro/dementia/evaluate`	Dementia evaluation with MoCA, ATN staging, differential diagnosis
POST	`/v1/neuro/epilepsy/classify`	Epilepsy classification (ILAE 2017), syndrome identification, EEG-MRI concordance
POST	`/v1/neuro/tumor/grade`	Brain tumor grading (WHO CNS 2021) with molecular markers (IDH, MGMT, 1p/19q)
POST	`/v1/neuro/ms/assess`	MS disease monitoring with EDSS, NEDA-3/4, DMT escalation evaluation
POST	`/v1/neuro/parkinsons/assess`	Parkinson's assessment with MDS-UPDRS III, Hoehn-Yahr, motor subtype classification
POST	`/v1/neuro/headache/classify`	Headache classification (ICHD-3), HIT-6 scoring, red flag screening
POST	`/v1/neuro/neuromuscular/evaluate`	Neuromuscular evaluation with ALSFRS-R, EMG/NCS pattern analysis
POST	`/v1/neuro/workflow/{workflow_type}`	Generic workflow endpoint for any of 9 workflow types
GET	`/v1/neuro/domains`	List supported neurological domains
GET	`/v1/neuro/scales`	List available clinical scales with input requirements
GET	`/v1/neuro/guidelines`	List integrated clinical guidelines
GET	`/v1/neuro/knowledge-version`	Knowledge base version and last update timestamp
POST	`/v1/reports/generate`	Generate clinical report (Markdown, JSON, PDF, FHIR R4)
GET	`/v1/reports/formats`	List available report export formats
GET	`/v1/events/stream`	SSE stream for cross-agent events
GET	`/v1/events/health`	Event system health status

Detailed Clinical Workflow Walkthrough

Acute Stroke Triage -- Clinician enters onset time, NIHSS score, and CT findings. The agent calculates time-to-treatment, evaluates tPA eligibility (<4.5 hours per AHA/ASA guidelines), assesses thrombectomy candidacy (<24 hours, DAWN/DEFUSE-3 criteria), computes ASPECTS score for anterior circulation strokes, and localizes vascular territory.
Dementia Evaluation -- MoCA score, age, and symptom list are submitted. The agent performs ATN biomarker staging (Amyloid/Tau/Neurodegeneration), generates a differential diagnosis ranked by probability (AD, FTD, LBD, VaD, NPH, CJD), recommends further workup (PET, CSF, MRI volumetrics), and suggests treatment options.
Epilepsy Classification -- Seizure semiology and EEG/MRI findings are entered. The agent classifies seizures per ILAE 2017 (focal, generalized, unknown onset), identifies epilepsy syndromes, performs EEG-MRI concordance analysis for surgical candidacy, and recommends ASMs based on seizure type with PGx context.
Brain Tumor Grading -- Histology and molecular markers (IDH mutation, MGMT methylation, 1p/19q codeletion, ATRX, H3K27M) are submitted. The agent applies WHO CNS 2021 classification, determines integrated diagnosis (e.g., "Astrocytoma, IDH-mutant, grade 3"), and maps to NCCN treatment guidelines.
MS Disease Monitoring -- EDSS score, relapse history, and MRI lesion data are entered. The agent assesses NEDA-3/4 status (No Evidence of Disease Activity), evaluates DMT escalation criteria, calculates relapse risk stratification, and tracks lesion burden over time.
Parkinson's Assessment -- MDS-UPDRS Part III motor items are scored. The agent computes total score, determines Hoehn-Yahr stage, classifies motor subtype (tremor-dominant, PIGD, indeterminate), and optimizes medication regimen (levodopa, dopamine agonists, MAO-B inhibitors).
Headache Classification -- Attack characteristics (duration, location, quality, associated symptoms) are entered. The agent applies ICHD-3 criteria, computes HIT-6 disability score, screens for red flags (thunderclap, positional, progressive), and recommends preventive therapy if indicated.
Neuromuscular Evaluation -- Weakness pattern, reflexes, and EMG/NCS findings are submitted. The agent computes ALSFRS-R functional score, analyzes EMG/NCS patterns (neuropathic vs myopathic vs NMJ), guides genetic testing, and recommends specialty referral.
General Neurology Query -- Free-form RAG-powered Q&A across all neurology knowledge collections, not bound to a specific workflow.

Example Query and Response

curl -X POST http://localhost:8528/v1/neuro/stroke/triage \
  -H "Content-Type: application/json" \
  -d '{"onset_time": "2026-03-29T08:30:00", "current_time": "2026-03-29T10:45:00", "nihss_score": 14, "ct_findings": "no hemorrhage", "aspects_score": 8}'

{
  "triage_result": {
    "time_from_onset_minutes": 135,
    "tpa_eligible": true,
    "tpa_window_remaining_minutes": 135,
    "thrombectomy_eligible": true,
    "thrombectomy_window_remaining_minutes": 1305,
    "nihss_severity": "Moderate-Severe",
    "aspects_interpretation": "Score 8/10 -- favorable for intervention (>= 6 threshold)",
    "vascular_territory": "MCA (likely M1 or M2 occlusion based on NIHSS pattern)",
    "recommendation": "URGENT: Administer IV tPA immediately (within 4.5-hour window). Activate neurointerventional team for mechanical thrombectomy evaluation. Obtain CTA head and neck to confirm large vessel occlusion."
  },
  "guideline_references": [
    "AHA/ASA 2019 Acute Ischemic Stroke Guidelines, Section 3.2 (tPA)",
    "DAWN Trial: thrombectomy 6-24 hours with clinical-core mismatch",
    "DEFUSE-3 Trial: thrombectomy 6-16 hours with perfusion mismatch"
  ],
  "evidence": [
    {"collection": "neuro_stroke", "score": 0.93, "text": "IV alteplase within 4.5 hours of symptom onset..."},
    {"collection": "neuro_guidelines", "score": 0.91, "text": "AHA/ASA Class I recommendation for mechanical thrombectomy..."}
  ],
  "confidence": 0.96,
  "processing_time_ms": 1850
}

Cross-Agent Integration

Queries Imaging Agent (:8524) for MRI findings (MS lesion counts, brain volumetrics, DWI/ADC maps for stroke)
Receives pharmacogenomic context from PGx Agent (:8107) for antiepileptic drug selection (HLA-B*15:02 and carbamazepine, CYP2C9 and phenytoin)
Shares stroke triage events with Cardiology Agent (:8126) for cardiac workup -- AF screening, echocardiography, and Holter monitoring after cryptogenic stroke
Queries genomic_evidence for neurogenetic variants (APP, PSEN1/2, GBA, LRRK2, PARK2, SMN1) to integrate genetic risk into clinical assessment
Receives tumor molecular markers from Oncology Agent (:8527) for integrated brain tumor classification

Performance Characteristics

Operation	Typical Latency
Clinical scale calculation (any of 10)	20-80 ms
Single collection search (top-15)	80-150 ms
Full multi-collection parallel search	250-450 ms
Stroke triage (end-to-end, no LLM)	200-500 ms
Claude LLM synthesis	2,000-3,500 ms
Full workflow assessment (end-to-end)	3,000-5,500 ms
Report generation (PDF)	1,500-2,500 ms

Common Pitfalls and Troubleshooting

Stroke triage returns "window expired" -- Verify that onset_time and current_time are in ISO 8601 format with timezone. If current_time is omitted, the server uses UTC now, which may differ from local time.
NIHSS score mismatch -- The agent accepts either a pre-computed total score or individual item scores. If both are provided, individual items take precedence and the total is recalculated. Ensure all 15 items are included for accurate scoring.
MoCA score interpretation varies by education -- The agent applies the standard +1 point adjustment for education <= 12 years. Provide the raw score without manual adjustment; the agent handles the correction.
MS NEDA-3 assessment incomplete -- NEDA-3 requires three inputs: relapse count (0), new/enlarging T2 lesions (0), and disability progression (stable EDSS). Missing any component results in a partial assessment.
Workflow type not found -- Valid workflow types are: acute_stroke_triage, dementia_evaluation, epilepsy_focus, brain_tumor_grading, ms_monitoring, parkinsons_assessment, headache_classification, neuromuscular_evaluation, general. Use the exact ID from GET /workflows.

3.6 Precision Autoimmune Agent (:8532)¶

Overview and Clinical Purpose

The Precision Autoimmune Agent addresses the diagnostic odyssey that autoimmune patients face -- often 3-6 years across multiple specialists before diagnosis. It interprets autoantibody panels, HLA typing, biomarker trends, and genomic data across 13 autoimmune conditions. It features disease activity scoring (DAS28-CRP, SLEDAI-2K, CDAI, BASDAI), flare prediction, biologic therapy recommendations with PGx context, and diagnostic odyssey analysis. Also detects the POTS/hEDS/MCAS triad and overlap syndromes. All processing runs on a single DGX Spark ($4,699, March 2026).

Advanced Configuration Options

Variable	Default	Description
`AUTOIMMUNE_AGENT_PORT`	`8532`	FastAPI listen port
`MILVUS_HOST`	`localhost`	Milvus server hostname
`MILVUS_PORT`	`19530`	Milvus gRPC port
`MILVUS_POOL_SIZE`	`10`	Connection pool size
`ANTHROPIC_API_KEY`	(required)	Claude API key for LLM synthesis
`LLM_MODEL`	`claude-sonnet-4-20250514`	Claude model ID; upgrade to `claude-opus-4-20250514` for complex diagnostic odyssey analyses
`EMBEDDING_MODEL`	`BAAI/bge-small-en-v1.5`	384-dim sentence-transformer
`TOP_K`	`15`	Max chunks per collection search
`FLARE_RISK_THRESHOLD`	`0.65`	Risk score threshold for flare alerts (0-1 scale)
`HLA_ODDS_RATIO_MIN`	`2.0`	Minimum odds ratio for HLA-disease associations to flag
`COLLECTION_WEIGHTS`	(see table)	Weighted relevance for each collection in composite scoring

13 Supported Conditions

Rheumatoid arthritis, SLE, multiple sclerosis, type 1 diabetes, inflammatory bowel disease, psoriasis/psoriatic arthritis, ankylosing spondylitis, Sjogren's syndrome, systemic sclerosis, myasthenia gravis, celiac disease, Graves' disease, Hashimoto's thyroiditis.

Milvus Collections (14)

Collection	Weight	Description
`autoimmune_clinical_documents`	0.18	Progress notes, discharge summaries
`autoimmune_patient_labs`	0.14	Lab results with flags and reference ranges
`autoimmune_autoantibody_panels`	0.12	ANA, anti-dsDNA, anti-CCP, RF, and 14+ types
`autoimmune_hla_associations`	0.08	HLA allele-disease risk (50+ alleles with odds ratios)
`autoimmune_disease_criteria`	0.08	ACR/EULAR classification criteria
`autoimmune_disease_activity`	0.07	DAS28-CRP, SLEDAI-2K, CDAI, BASDAI scoring
`autoimmune_flare_patterns`	0.06	Biomarker patterns preceding flares
`autoimmune_biologic_therapies`	0.06	TNF inhibitors, anti-CD20, IL-6R, JAK inhibitors
`autoimmune_pgx_rules`	0.04	Pharmacogenomic rules (CYP2C19, FCGR3A, HLA)
`autoimmune_clinical_trials`	0.05	Active and recent clinical trials
`autoimmune_literature`	0.05	Published literature with year filtering
`autoimmune_patient_timelines`	0.03	Longitudinal patient event timelines
`autoimmune_cross_disease`	0.02	Overlap syndromes, cross-disease mechanisms
`genomic_evidence`	0.02	Shared genomic evidence (read-only)

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with Milvus and engine status
GET	`/healthz`	Kubernetes-style liveness probe
GET	`/collections`	List loaded collections with record counts
GET	`/metrics`	Prometheus-compatible metrics export
POST	`/v1/autoimmune/integrated-assessment`	Full multi-collection integrated autoimmune assessment
POST	`/query`	Free-text RAG query across all autoimmune collections
POST	`/query/stream`	Streaming RAG query (Server-Sent Events)
POST	`/search`	Direct vector similarity search with collection filtering
POST	`/analyze`	Full patient analysis (antibodies + HLA + biomarkers combined)
POST	`/differential`	Differential diagnosis from symptom/antibody profile
POST	`/ingest/upload`	Upload and ingest clinical documents (PDF, DOCX, text)
POST	`/ingest/demo-data`	Load demo patient dataset for testing
POST	`/collections/create`	Create custom collections for institution-specific data
POST	`/export`	Export analysis results (Markdown, JSON, PDF, FHIR R4)

Detailed Clinical Workflow Walkthrough

Document ingestion -- Clinical documents (progress notes, lab reports, imaging reports) are uploaded via /ingest/upload or Streamlit UI. Documents are chunked, embedded with BGE-small-en-v1.5, and stored in autoimmune_clinical_documents.
Autoantibody panel interpretation -- Antibody results (ANA, anti-dsDNA, anti-CCP, RF, anti-SSA/SSB, anti-Scl-70, anti-Jo-1, ANCA, anti-centromere, anti-Smith, anti-RNP, anti-TPO, TSI, anti-AChR, anti-tTG) are mapped to disease associations with published sensitivity and specificity. For example: anti-CCP positive with RF positive has 95% specificity for RA.
HLA typing analysis -- HLA alleles are evaluated against 50+ allele-disease associations with odds ratios. Key associations: HLA-B27:05 and ankylosing spondylitis (OR=87.4), HLA-DRB104:01 and RA (OR=4.2), HLA-DQ2/DQ8 and celiac disease (OR=7.0).
Disease activity scoring -- Standardized scores are calculated based on disease:
RA: DAS28-CRP (remission <2.6, low 2.6-3.2, moderate 3.2-5.1, high >5.1)
SLE: SLEDAI-2K (no activity 0, mild 1-5, moderate 6-10, high 11-19, very high >=20)
IBD: CDAI for Crohn's (remission <150, mild 150-219, moderate 220-450, severe >450)
AS: BASDAI (low <4, high >=4)
Flare prediction -- Analyzes longitudinal biomarker patterns (CRP, ESR, IL-6, complement C3/C4, calprotectin, ferritin) against known flare precursors. Returns a 0-1 risk score with contributing factors. Scores above 0.65 trigger an alert to the Biomarker Agent.
Biologic therapy recommendations -- Filtered by confirmed diagnosis, disease activity level, prior therapy failures, and PGx context (FCGR3A V/F polymorphism for rituximab response, TPMT for azathioprine dosing). Recommendations include TNF inhibitors, anti-CD20, IL-6R blockers, JAK inhibitors, and IL-17/23 inhibitors.
Diagnostic odyssey analysis -- Surfaces patterns missed across fragmented multi-specialist records. Identifies symptom constellations that span years and multiple providers, flags potential overlap syndromes (e.g., RA + Sjogren's, SLE + APS), and detects the POTS/hEDS/MCAS triad.

Example Query and Response

curl -X POST http://localhost:8532/analyze \
  -H "Content-Type: application/json" \
  -d '{"antibodies": {"ANA": "positive", "anti_dsDNA": 85, "anti_Smith": "positive"}, "hla": ["DRB1*15:01"], "biomarkers": {"CRP": 18.5, "ESR": 42, "C3": 62, "C4": 7, "anti_dsDNA_titer": 120}}'

{
  "primary_diagnosis": {
    "disease": "Systemic Lupus Erythematosus (SLE)",
    "confidence": 0.92,
    "acr_eular_score": 18,
    "classification_criteria": "ACR/EULAR 2019 SLE Classification (threshold >= 10)"
  },
  "disease_activity": {
    "score_type": "SLEDAI-2K",
    "score": 12,
    "level": "High Activity",
    "active_domains": ["immunologic (anti-dsDNA, low complement)", "serositis risk"]
  },
  "flare_risk": {
    "score": 0.78,
    "risk_level": "High",
    "contributing_factors": ["Rising anti-dsDNA (120 IU/mL)", "Low C4 (7 mg/dL)", "Low C3 (62 mg/dL)"],
    "recommendation": "Consider preemptive dose adjustment; monitor weekly for 4 weeks"
  },
  "hla_associations": [
    {"allele": "DRB1*15:01", "disease": "SLE", "odds_ratio": 2.1, "significance": "Moderate risk association"}
  ],
  "therapy_recommendations": [
    {"rank": 1, "drug": "Belimumab (anti-BLyS)", "rationale": "Add-on for active SLE with high anti-dsDNA and low complement"},
    {"rank": 2, "drug": "Mycophenolate mofetil", "rationale": "Steroid-sparing immunosuppression for moderate-severe SLE"},
    {"rank": 3, "drug": "Anifrolumab (anti-IFNAR1)", "rationale": "Type I IFN pathway blockade for refractory skin/joint SLE"}
  ],
  "evidence": [
    {"collection": "autoimmune_disease_criteria", "score": 0.94, "text": "ACR/EULAR 2019: Anti-dsDNA + low complement = 10 points..."},
    {"collection": "autoimmune_biologic_therapies", "score": 0.89, "text": "BLISS-76: Belimumab + SOC improved SRI-4 response vs placebo..."}
  ],
  "processing_time_ms": 4120
}

Cross-Agent Integration

Receives longitudinal biomarker trends from Biomarker Agent (:8529) for disease monitoring -- CRP, ESR, complement, calprotectin tracked over time with trend analysis
Requests imaging assessment from Imaging Agent (:8524) for joint erosion scoring (MRI/ultrasound), organ evaluation (renal, pulmonary), and skin lesion characterization
Publishes diagnosis events to other agents via SSE event bus -- new autoimmune diagnoses trigger downstream assessments in Cardiology (pericarditis risk), Neurology (CNS lupus, MS overlap), and PGx (immunosuppressant metabolism)
Queries genomic_evidence for HLA alleles and autoimmune-associated variants (CTLA4, PTPN22, IL2RA, STAT4) to integrate genetic risk
Sends flare alerts to Biomarker Agent (:8529) when biomarker patterns indicate impending flare, triggering intensified monitoring protocols

Performance Characteristics

Operation	Typical Latency
Single collection search (top-15)	80-150 ms
Full 14-collection weighted search	300-500 ms
Disease activity calculation	30-80 ms
Flare risk prediction	50-120 ms
Autoantibody panel interpretation	40-100 ms
HLA association lookup	20-50 ms
Claude LLM synthesis	2,500-4,000 ms
Full analysis (end-to-end)	3,500-5,500 ms
Document ingestion (per page)	500-1,000 ms

Common Pitfalls and Troubleshooting

ANA "positive" without titer -- The agent accepts qualitative ("positive"/"negative") or quantitative (titer, e.g., "1:640") ANA values. Quantitative values enable more precise disease association scoring. Include titer and pattern (homogeneous, speckled, nucleolar, centromere) when available.
Disease activity score returns "insufficient data" -- Each score requires specific inputs. DAS28-CRP needs tender joint count (28 joints), swollen joint count (28 joints), CRP (mg/L), and patient global assessment (0-100 VAS). Missing any component prevents calculation.
HLA alleles not recognized -- Use standard nomenclature (e.g., "B27:05" not "B27" or "HLA-B27"). The agent accepts both 2-field (B27:05) and 1-field (B*27) resolution but 2-field provides more accurate odds ratios.
Flare prediction unreliable with single timepoint -- The prediction model works best with longitudinal data (3+ timepoints). A single snapshot can estimate current risk but cannot detect trajectory-based patterns (rising anti-dsDNA, falling complement).
Overlap syndromes missed -- The diagnostic odyssey analysis requires documents from multiple visits and specialties. Single-visit data limits the agent's ability to detect cross-specialist patterns. Upload historical records for comprehensive analysis.
Demo data ingestion fails -- Ensure the /data/demo directory exists and contains the sample patient files. Run curl http://localhost:8532/health first to verify Milvus connectivity before ingesting.

3.7 Rare Disease Diagnostic Agent (:8134)¶

Overview and Clinical Purpose

The Rare Disease Diagnostic Agent provides differential diagnosis, ACMG/AMP variant interpretation (implementing 23 ACMG criteria), HPO-based phenotype matching with information content (IC) weighted similarity scoring, therapeutic option search (orphan drugs, gene therapy, enzyme replacement), and clinical trial eligibility assessment. It covers 13 disease categories with 88 diseases in its knowledge base and tracks 25 approved or investigational gene therapies.

Advanced Configuration Options

All environment variables use the RD_ prefix. Key settings:

Variable	Default	Description
`RD_MILVUS_HOST`	`localhost`	Milvus vector database host
`RD_MILVUS_PORT`	`19530`	Milvus port
`RD_API_PORT`	`8134`	FastAPI REST server port
`RD_STREAMLIT_PORT`	`8544`	5-tab Streamlit UI port
`RD_LLM_MODEL`	`claude-sonnet-4-6`	Claude model for synthesis
`RD_ANTHROPIC_API_KEY`	(none)	Required for LLM; search-only mode without it
`RD_SCORE_THRESHOLD`	`0.4`	Minimum cosine similarity for evidence retrieval
`RD_TOP_K_PHENOTYPES`	`50`	Max results from `rd_phenotypes` per query
`RD_TOP_K_VARIANTS`	`100`	Max results from `rd_variants` per query
`RD_INGEST_SCHEDULE_HOURS`	`24`	Auto-ingest cycle (set `RD_INGEST_ENABLED=true`)
`RD_MAX_CONVERSATION_CONTEXT`	`3`	Prior exchanges injected for multi-turn
`RD_CITATION_HIGH_THRESHOLD`	`0.75`	Score above which citations are marked high-confidence
`RD_API_KEY`	(empty)	Set to require X-API-Key header on all endpoints
`RD_CROSS_AGENT_TIMEOUT`	`30`	Seconds before cross-agent HTTP calls time out

Collection weight tuning: 14 RD_WEIGHT_* variables (e.g., RD_WEIGHT_PHENOTYPES=0.12) control the relative influence of each collection during multi-collection fusion. Weights must sum to approximately 1.0 (tolerance 0.05).

Milvus Collections (14)

Collection	Description
`rd_phenotypes`	HPO-coded phenotype descriptions
`rd_diseases`	Disease definitions (OMIM, Orphanet)
`rd_genes`	Gene-disease associations
`rd_variants`	Pathogenic variant records
`rd_literature`	Rare disease literature
`rd_trials`	Rare disease clinical trials
`rd_therapies`	Orphan drugs, gene therapy, ERT
`rd_case_reports`	Published case reports
`rd_guidelines`	Clinical management guidelines
`rd_pathways`	Molecular pathways
`rd_registries`	Patient registries
`rd_natural_history`	Natural history studies
`rd_newborn_screening`	Newborn screening panels
`genomic_evidence`	Shared genomic evidence (read-only)

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with component readiness and vector counts
GET	`/collections`	Collection names and record counts
GET	`/workflows`	11 available diagnostic workflow definitions
GET	`/metrics`	Prometheus-compatible metrics export
POST	`/v1/diagnostic/query`	RAG Q&A across all rare disease collections
POST	`/v1/diagnostic/search`	Multi-collection evidence search (no LLM)
POST	`/v1/diagnostic/diagnose`	Differential diagnosis from HPO terms
POST	`/v1/diagnostic/variants/interpret`	ACMG/AMP variant classification
POST	`/v1/diagnostic/phenotype/match`	HPO-to-disease phenotype matching
POST	`/v1/diagnostic/therapy/search`	Therapeutic option search (ERT, SRT, gene therapy)
POST	`/v1/diagnostic/trial/match`	Clinical trial eligibility assessment
POST	`/v1/diagnostic/workflow/{type}`	Generic workflow dispatch (11 types)
POST	`/v1/diagnostic/integrated-assessment`	Cross-agent multi-agent assessment
GET	`/v1/diagnostic/disease-categories`	Reference catalog of 13 disease categories
GET	`/v1/diagnostic/gene-therapies`	25 approved/investigational gene therapies
GET	`/v1/diagnostic/acmg-criteria`	23 ACMG criteria reference
GET	`/v1/diagnostic/hpo-categories`	HPO top-level ontology terms
GET	`/v1/diagnostic/knowledge-version`	Knowledge base version metadata
POST	`/v1/reports/generate`	Multi-format report generation
GET	`/v1/reports/formats`	Supported export formats
GET	`/v1/events/stream`	Server-sent event stream

Detailed Clinical Workflow Walkthrough

Phenotype entry -- Clinician enters patient phenotypes as HPO terms (e.g., HP:0001250 Seizure, HP:0001263 Global developmental delay) via the 5-tab Streamlit UI or the /v1/diagnostic/diagnose API. Age of onset category (antenatal, neonatal, infantile, childhood, juvenile, adult) is provided as context.
HPO-to-disease matching -- The agent embeds the HPO term descriptions using BGE-small-en-v1.5, searches the rd_phenotypes collection (top_k=50), and computes IC-weighted Resnik similarity against OMIM and Orphanet disease phenotype profiles. Each disease receives a composite phenotype match score.
Differential diagnosis ranking -- Candidate diseases are ranked by phenotype match score. The LLM synthesizes evidence from rd_diseases, rd_literature, and rd_case_reports to assign confidence levels (high/moderate/low) and highlight discriminating features.
Variant interpretation -- If VCF data is available, candidate variants in genes associated with top-ranked diseases are classified using 23 ACMG/AMP criteria: PVS1, PM1-PM6, PP1-PP5, BA1, BS1-BS4, BP1-BP7. ClinVar, gnomAD allele frequency, AlphaMissense scores, and in-silico predictors are integrated.
Therapeutic search -- For confirmed or suspected diagnoses, the agent queries rd_therapies for orphan drugs (FDA/EMA approved), gene therapies (25 tracked), enzyme replacement therapies, substrate reduction therapies, and investigational compounds.
Trial eligibility -- The agent cross-references patient age, gene, diagnosis, and variant status against rd_trials and triggers the Clinical Trial Agent for expanded matching.
Report export -- Final report exported as Markdown, JSON, PDF, FHIR R4 DiagnosticReport, or GA4GH Phenopacket v2.

Example Query and Response

curl -X POST http://localhost:8134/v1/diagnostic/diagnose \
  -H "Content-Type: application/json" \
  -d '{"hpo_terms": ["HP:0001250", "HP:0001263", "HP:0002079"], "age_of_onset": "infantile"}'

{
  "status": "completed",
  "differential_diagnosis": [
    {
      "rank": 1,
      "disease": "Dravet syndrome",
      "omim": "607208",
      "orphanet": "ORPHA:33069",
      "phenotype_match_score": 0.87,
      "confidence": "high",
      "key_gene": "SCN1A",
      "discriminating_features": ["Febrile seizures progressing to afebrile", "Temperature sensitivity"]
    },
    {
      "rank": 2,
      "disease": "CLN2 disease (late-infantile neuronal ceroid lipofuscinosis)",
      "omim": "204500",
      "orphanet": "ORPHA:228346",
      "phenotype_match_score": 0.74,
      "confidence": "moderate",
      "key_gene": "TPP1",
      "discriminating_features": ["Progressive vision loss", "Cerebellar atrophy on MRI"]
    }
  ],
  "hpo_terms_matched": 3,
  "diseases_evaluated": 88,
  "evidence_sources": 42,
  "processing_time_ms": 2340.5
}

Cross-Agent Integration

Direction	Partner Agent	Trigger/Scenario
Outbound	Pharmacogenomics (:8107)	Queries drug metabolism context when rare disease therapeutics have PGx implications (e.g., ERT dosing)
Outbound	Clinical Trial (:8538)	Triggers expanded trial matching for rare disease patients via `/v1/trial/workflow/patient_matching/run`
Outbound	Imaging (:8524)	Requests imaging assessment for rare diseases with organ involvement (e.g., lysosomal storage diseases)
Outbound	Cardiology (:8126)	Queries cardiac phenotype context for cardiomyopathy-associated rare diseases (e.g., Fabry, Pompe)
Inbound	Biomarker (:8529)	Receives phenotype-genotype correlation requests
Bidirectional	All agents	Publishes diagnosis events via SSE event bus; reads `genomic_evidence` (shared)

Performance Characteristics

Operation	Typical Latency	Notes
Phenotype match (3 HPO terms)	800-1,200 ms	Vector search across `rd_phenotypes` (top_k=50)
Differential diagnosis (full)	2,000-3,500 ms	Multi-collection search + LLM synthesis
ACMG variant interpretation	1,500-2,500 ms	ClinVar + gnomAD + AlphaMissense lookup + LLM classification
Therapeutic search	600-1,000 ms	`rd_therapies` search only
Report generation (PDF)	3,000-5,000 ms	LLM narrative + template rendering

Common Pitfalls and Troubleshooting

HPO term formatting: Terms must use the HP:NNNNNNN format (7 digits, zero-padded). Using free-text symptom descriptions instead of HPO codes bypasses the IC-weighted matching and falls back to text similarity, which is less precise.
ACMG criteria incomplete: The agent implements 23 of 28 ACMG criteria. Functional assay evidence (PS3) and segregation data (PP1 with quantitative scoring) require manual review.
Low phenotype match scores: If all differential diagnoses score below 0.5, consider whether the phenotype is incomplete. Adding more HPO terms (especially specific sub-terms rather than general parent terms) significantly improves matching.
Gene therapy list currency: The 25 gene therapies are current as of March 2026. New FDA/EMA approvals require a data refresh via scripts/seed_knowledge.py.
Cross-agent timeout: If the PGx or Trial agent is down, the integrated assessment degrades gracefully (returns partial results). Check RD_CROSS_AGENT_TIMEOUT if timeouts are frequent.

3.8 Pharmacogenomics Intelligence Agent (:8107)¶

Overview and Clinical Purpose

The Pharmacogenomics (PGx) Intelligence Agent translates patient genotype data into actionable prescribing guidance. It interprets star alleles for 14 pharmacogenes, applies 9 CPIC guideline algorithms, screens for 12 HLA-mediated adverse drug reaction associations, models phenoconversion from drug-drug interactions, and provides genotype-guided dosing algorithms including the IWPC warfarin dosing calculator.

Advanced Configuration Options

All environment variables use the PGX_ prefix. Key settings:

Variable	Default	Description
`PGX_MILVUS_HOST`	`localhost`	Milvus vector database host
`PGX_MILVUS_PORT`	`19530`	Milvus port
`PGX_API_PORT`	`8107`	FastAPI REST server port
`PGX_STREAMLIT_PORT`	`8507`	Streamlit UI port
`PGX_LLM_MODEL`	`claude-sonnet-4-6`	Claude model for synthesis
`PGX_ANTHROPIC_API_KEY`	(none)	Required for LLM features
`PGX_SCORE_THRESHOLD`	`0.4`	Minimum cosine similarity for evidence retrieval
`PGX_TOP_K_PER_COLLECTION`	`5`	Default results per collection
`PGX_INGEST_SCHEDULE_HOURS`	`168`	Weekly auto-ingest cycle (7 days)
`PGX_API_KEY`	(empty)	Set to require X-API-Key header
`PGX_CROSS_AGENT_TIMEOUT`	`30`	Cross-agent HTTP timeout in seconds
`PGX_RAG_PIPELINE_ROOT`	`/app/rag-chat-pipeline`	Path to shared RAG pipeline

Collection weight tuning: 15 PGX_WEIGHT_* variables control multi-collection fusion. PGX_WEIGHT_DRUG_GUIDELINES defaults to 0.14 (highest weight) because CPIC/DPWG guideline matching is the most clinically relevant signal.

Milvus Collections (15)

Collection	Description
`pgx_gene_reference`	Pharmacogene star allele definitions and activity scores
`pgx_drug_guidelines`	CPIC/DPWG clinical prescribing guidelines
`pgx_drug_interactions`	Drug-gene interaction records (PharmGKB)
`pgx_hla_hypersensitivity`	HLA-mediated adverse drug reaction screening
`pgx_phenoconversion`	Metabolic phenoconversion via DDI
`pgx_dosing_algorithms`	Genotype-guided dosing algorithms and formulas
`pgx_clinical_evidence`	Published PGx clinical evidence and outcomes
`pgx_population_data`	Population-specific allele frequency data
`pgx_clinical_trials`	PGx-related clinical trials
`pgx_fda_labels`	FDA pharmacogenomic labeling information
`pgx_drug_alternatives`	Genotype-guided therapeutic alternatives
`pgx_patient_profiles`	Patient diplotype-phenotype profiles
`pgx_implementation`	Clinical PGx implementation programs
`pgx_education`	PGx educational resources and guidelines
`genomic_evidence`	Shared genomic evidence (read-only)

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with Milvus and LLM status
GET	`/collections`	Collection names and record counts
GET	`/metrics`	Prometheus-compatible metrics
POST	`/query`	RAG Q&A across all PGx collections
POST	`/search`	Multi-collection evidence search (no LLM)
POST	`/profile`	Full PGx profile from diplotype map
POST	`/dosing`	Drug-specific dosing guidance
POST	`/hla-screen`	HLA hypersensitivity screening
POST	`/v1/pgx/drug-check`	Single-drug PGx check with CPIC alerts
POST	`/v1/pgx/medication-review`	Polypharmacy review with DDI detection
POST	`/v1/pgx/dosing/warfarin`	IWPC warfarin dosing algorithm
POST	`/v1/pgx/hla-screen`	HLA adverse reaction screening
POST	`/v1/pgx/phenoconversion`	Phenoconversion analysis from concomitant drugs
POST	`/v1/pgx/integrated-assessment`	Cross-agent multi-agent assessment
GET	`/v1/pgx/genes`	List all 14 pharmacogenes with metadata
GET	`/v1/pgx/drugs`	List all tracked drugs by therapeutic category
POST	`/v1/reports/generate`	Multi-format report generation
GET	`/v1/reports/formats`	Supported export formats
GET	`/v1/events/stream`	Server-sent event stream

Detailed Clinical Workflow Walkthrough

Diplotype submission -- Clinician submits patient diplotypes (e.g., CYP2D6: *1/*4, CYP2C19: *1/*2) from VCF-based star allele calling (Stargazer, Cyrius) or laboratory report. The /profile or /v1/pgx/drug-check endpoint receives the data.
Metabolizer phenotype assignment -- Each diplotype is mapped to an activity score and metabolizer phenotype: ultra-rapid metabolizer (UM), normal metabolizer (NM), intermediate metabolizer (IM), or poor metabolizer (PM). The mapping follows CPIC standardized terms.
CPIC/DPWG guideline matching -- For each gene-drug pair in the patient's medication list, the agent queries pgx_drug_guidelines for applicable CPIC Level A/B recommendations. 9 CPIC algorithms are implemented covering CYP2D6, CYP2C19, CYP2C9, CYP3A5, DPYD, TPMT, NUDT15, SLCO1B1, and UGT1A1.
Phenoconversion check -- Current medications are screened against CYP inhibitor/inducer databases. A strong CYP2D6 inhibitor (e.g., fluoxetine, paroxetine) in a genotypic NM converts the effective phenotype to PM. The /v1/pgx/phenoconversion endpoint returns the precipitant drug, inhibitor strength, converted phenotype, and affected substrate drugs.
HLA screening -- 12 HLA-drug associations are checked. Key associations include HLA-B57:01/abacavir (mandatory screening), HLA-B58:01/allopurinol, HLA-B15:02/carbamazepine, and HLA-A31:01/carbamazepine. Population-specific prevalence data is returned.
Dosing algorithm -- For warfarin, the IWPC algorithm (Klein et al., NEJM 2009) calculates predicted weekly dose incorporating VKORC1, CYP2C9, age, height, weight, race, amiodarone use, and enzyme inducer status. Safety bounds clamp output to 3-105 mg/week.
Alternative recommendations -- For PM/UM patients, the agent retrieves genotype-appropriate alternative drugs from pgx_drug_alternatives and generates a structured report.

Example Query and Response

curl -X POST http://localhost:8107/v1/pgx/drug-check \
  -H "Content-Type: application/json" \
  -d '{"drug": "codeine", "gene": "CYP2D6", "phenotype": "poor_metabolizer", "diplotype": "*4/*4"}'

{
  "drug": "codeine",
  "gene": "CYP2D6",
  "phenotype": "poor_metabolizer",
  "alerts": [
    {
      "alert_level": "critical",
      "gene": "CYP2D6",
      "drug": "codeine",
      "phenotype": "poor_metabolizer",
      "recommendation": "CPIC guideline exists for CYP2D6-codeine. Patient phenotype: poor_metabolizer. Consider alternative therapy or significant dose reduction.",
      "guideline_body": "CPIC",
      "cpic_level": "A",
      "alternative_drugs": ["morphine", "non-opioid analgesics", "acetaminophen"]
    }
  ],
  "knowledge_context": "CYP2D6 encodes cytochrome P450 2D6, which metabolizes codeine to morphine via O-demethylation. Poor metabolizers (*4/*4) produce insufficient morphine for analgesic effect.",
  "processing_time_ms": 45.2
}

Cross-Agent Integration

Direction	Partner Agent	Trigger/Scenario
Outbound	Oncology (:8527)	Queries planned chemotherapy context for metabolism assessment (e.g., tamoxifen and CYP2D6)
Outbound	Cardiology (:8126)	Queries cardiac drug interaction context (e.g., clopidogrel and CYP2C19)
Outbound	Neurology (:8528)	Queries neurotoxic drug interaction context (e.g., carbamazepine and HLA-B*15:02)
Outbound	Clinical Trial (:8538)	Queries PGx-guided clinical trials
Inbound	All prescribing agents	Any agent recommending drug therapy can query PGx for metabolism context
Inbound	Biomarker (:8529)	Receives requests for genotype-adjusted biomarker reference ranges

Performance Characteristics

Operation	Typical Latency	Notes
Single drug check	30-80 ms	Knowledge graph lookup only, no vector search
Polypharmacy review (5 drugs)	100-250 ms	Iterates CYP inhibitor/inducer databases
Warfarin IWPC dosing	10-30 ms	Pure mathematical calculation
HLA screening	20-60 ms	Dictionary lookup against 12 associations
Phenoconversion analysis	40-120 ms	CYP inhibitor/inducer scan per drug
Full RAG query	1,500-3,000 ms	15-collection vector search + LLM synthesis

Common Pitfalls and Troubleshooting

Star allele formatting: Diplotypes must use the *N/*N format (e.g., *1/*4). Suballeles like *1a and *5 are supported for SLCO1B1. Submitting raw VCF positions instead of called star alleles will not produce useful results.
Phenoconversion missed: The phenoconversion engine only detects interactions from the concomitant drug list explicitly provided. It does not infer medications from other clinical data. Ensure the full medication list is submitted.
Population-specific allele frequencies: HLA prevalence data varies significantly by ancestry. The agent reports population-specific frequencies (e.g., HLA-B*57:01 prevalence: 6-8% European, <1% East Asian) but does not automatically select the relevant population.
IWPC warfarin clamping: Doses below 3 mg/week or above 105 mg/week are clamped to safety bounds. If the algorithm produces extreme values, verify input genotypes and clinical parameters.
Missing CPIC guideline: Not all gene-drug pairs have CPIC Level A guidelines. The agent returns an info alert listing which drugs do have guidelines for the queried gene.

3.9 Imaging Intelligence Agent (:8524)¶

Overview and Clinical Purpose

The Imaging Intelligence Agent provides clinical decision support for radiology through a RAG knowledge system, 4 NVIDIA NIM microservices (VISTA-3D with 127 anatomical structures, MAISI, VILA-M3, Llama-3), and 4 reference clinical workflows. It supports Orthanc DICOM auto-ingestion, cross-modal genomics enrichment, NVIDIA FLARE federated learning, and FHIR R4 interoperability.

Advanced Configuration Options

All environment variables use the IMAGING_ prefix. Key settings:

Variable	Default	Description
`IMAGING_MILVUS_HOST`	`localhost`	Milvus vector database host
`IMAGING_MILVUS_PORT`	`19530`	Milvus port
`IMAGING_API_PORT`	`8524`	FastAPI REST server port
`IMAGING_STREAMLIT_PORT`	`8525`	9-tab Streamlit UI port
`IMAGING_LLM_MODEL`	`claude-sonnet-4-20250514`	Claude model for synthesis
`IMAGING_NIM_MODE`	`local`	NIM mode: `local`, `cloud`, or `mock`
`IMAGING_NIM_LLM_URL`	`http://localhost:8520/v1`	Local Llama-3 NIM endpoint
`IMAGING_NIM_VISTA3D_URL`	`http://localhost:8530`	VISTA-3D segmentation endpoint
`IMAGING_NIM_MAISI_URL`	`http://localhost:8531`	MAISI synthetic CT endpoint
`IMAGING_NIM_VILAM3_URL`	`http://localhost:8532`	VILA-M3 vision-language endpoint
`IMAGING_NIM_ALLOW_MOCK_FALLBACK`	`true`	Fall back to mock responses if NIM unavailable
`IMAGING_NVIDIA_API_KEY`	(none)	Required for cloud NIM mode
`IMAGING_ORTHANC_URL`	`http://localhost:8042`	Orthanc DICOM server
`IMAGING_DICOM_AUTO_INGEST`	`false`	Enable automatic DICOM webhook processing
`IMAGING_DICOM_WATCH_INTERVAL`	`5`	Seconds between Orthanc `/changes` polls
`IMAGING_CROSS_MODAL_ENABLED`	`false`	Enable imaging-to-genomics cross-modal triggers
`IMAGING_PREVIEW_DEFAULT_FPS`	`8`	Frames per second for NIfTI preview videos

NVIDIA NIM Integration

NIM Service	Port	Capability
VISTA-3D	8530	3D medical image segmentation (127 anatomical structures)
MAISI	8531	Synthetic CT volume generation with paired segmentation masks
VILA-M3	8532	Vision-language model for radiology image understanding
Llama-3 8B	8520	Clinical reasoning and report generation

Milvus Collections (11)

Collection	Content
`imaging_literature`	2,678 PubMed research papers
`imaging_trials`	12 clinical trials
`imaging_findings`	Imaging finding templates and patterns
`imaging_protocols`	Acquisition protocols and parameters
`imaging_devices`	FDA-cleared AI/ML medical devices
`imaging_anatomy`	Anatomical structure references
`imaging_benchmarks`	Model performance benchmarks
`imaging_guidelines`	ACR, RSNA, NCCN guidelines
`imaging_report_templates`	Structured radiology report templates
`imaging_datasets`	Public imaging datasets (TCIA, PhysioNet)
`genomic_evidence`	Shared from Stage 2 (read-only, 3.56M)

4 Reference Workflows

Workflow	Modality	Target Latency
CT Head Hemorrhage Triage	CT	< 90 sec
CT Chest Lung Nodule Tracking	CT	< 5 min
CXR Rapid Findings	X-ray	< 30 sec
MRI Brain MS Lesion Tracking	MRI	< 5 min

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with collection stats and NIM status
GET	`/`	Service info with links to docs and health
GET	`/collections`	Collection names, counts, and labels
GET	`/metrics`	Prometheus metrics (Counter, Histogram)
GET	`/knowledge/stats`	Imaging domain knowledge graph statistics
POST	`/query`	Full RAG query: multi-collection retrieval + LLM synthesis
POST	`/search`	Evidence-only search (no LLM synthesis)
POST	`/find-related`	Cross-collection entity linking
POST	`/api/ask`	Meta-agent question answering
POST	`/nim/vista3d/segment`	VISTA-3D 3D segmentation (127 structures)
POST	`/nim/maisi/generate`	MAISI synthetic CT generation
POST	`/nim/vilam3/analyze`	VILA-M3 image understanding
POST	`/nim/llm/generate`	Llama-3 text generation
GET	`/nim/status`	NIM service availability check
POST	`/workflow/{name}/run`	Execute a clinical workflow
GET	`/workflows`	List available workflows
POST	`/events/dicom-webhook`	DICOM auto-ingest webhook (Orthanc)
GET	`/events/stream`	SSE event stream
POST	`/reports/generate`	Multi-format report generation
GET	`/reports/formats`	Supported export formats
GET	`/preview/{study_id}`	NIfTI slice preview (MP4/GIF)
GET	`/demo-cases`	Pre-loaded demo imaging cases
POST	`/protocol/optimize`	Protocol optimization advisor
POST	`/dose/estimate`	Radiation dose intelligence

Detailed Clinical Workflow Walkthrough (CT Head Hemorrhage Triage)

DICOM arrival -- A CT head study arrives at Orthanc. If DICOM_AUTO_INGEST is enabled, the /events/dicom-webhook endpoint is triggered with the study UID, modality, and body part.
Study retrieval -- The agent fetches the DICOM series from Orthanc, converts to NIfTI format, and generates a slice preview video (8 FPS MP4).
VISTA-3D segmentation -- The NIfTI volume is sent to the VISTA-3D NIM at port 8530. The model segments 127 anatomical structures, identifying brain parenchyma, ventricles, midline structures, and extra-axial spaces.
Hemorrhage detection -- The workflow applies density thresholds and morphological analysis on the segmented volume to detect hyperdense regions consistent with acute hemorrhage. Classification includes epidural, subdural, subarachnoid, intraparenchymal, and intraventricular subtypes.
RAG evidence enrichment -- The agent searches imaging_literature, imaging_guidelines, and imaging_findings for evidence relevant to the detected pattern. Genomic evidence from genomic_evidence is queried if cross-modal is enabled (e.g., coagulation factor variants).
Report generation -- A structured radiology report is generated using the Llama-3 NIM or Claude, populated with findings, measurements, classification, and relevant literature citations. FHIR R4 DiagnosticReport format is available.
Triage alert -- If hemorrhage is detected, a high-priority SSE event is published to the event stream for downstream consumers.

Example Query and Response

curl -X POST http://localhost:8524/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the sensitivity of AI-assisted CXR for pneumothorax detection?", "top_k": 5}'

{
  "question": "What is the sensitivity of AI-assisted CXR for pneumothorax detection?",
  "answer": "AI-assisted chest X-ray systems demonstrate sensitivity of 85-95% for pneumothorax detection, with specificity of 90-98%. FDA-cleared devices include Annalise.ai CXR (sensitivity 94.2%), Qure.ai qXR (sensitivity 91.8%), and Zebra Medical Vision (sensitivity 89.5%). Performance is highest for moderate-to-large pneumothoraces (>2 cm) and decreases for small or loculated collections. Current ACR guidelines recommend AI as a triage aid rather than a standalone diagnostic tool.",
  "evidence_count": 18,
  "collections_searched": 8,
  "search_time_ms": 1245.3,
  "nim_services_used": ["llm"]
}

Cross-Agent Integration

Direction	Partner Agent	Trigger/Scenario
Outbound	Oncology (:8527)	Lung-RADS 4A+ nodules trigger genomic_evidence query for EGFR/ALK/ROS1/KRAS variants
Outbound	Cardiology (:8126)	Cardiac CT findings shared for integrated cardiovascular assessment
Inbound	Neurology (:8528)	Receives MRI brain requests for MS lesion tracking workflow
Inbound	Rare Disease (:8134)	Receives imaging assessment requests for organ-involved rare diseases
Bidirectional	All agents	Publishes imaging events via SSE; reads `genomic_evidence` (shared)

Performance Characteristics

Operation	Typical Latency	Notes
RAG query (full)	1,500-3,000 ms	11-collection search + LLM synthesis
Evidence search (no LLM)	400-800 ms	Vector search only
VISTA-3D segmentation	30-90 sec	GPU-dependent; 127-class volumetric segmentation
MAISI synthetic generation	60-180 sec	Full CT volume synthesis
VILA-M3 image analysis	5-15 sec	Single image understanding
CXR Rapid Findings workflow	< 30 sec	End-to-end triage pipeline
NIfTI preview generation	2-5 sec	MP4/GIF slice animation

Common Pitfalls and Troubleshooting

NIM service unavailable: If NIM containers are not running, set IMAGING_NIM_ALLOW_MOCK_FALLBACK=true to get structured mock responses for development. Check GET /nim/status for per-service availability.
DICOM webhook not triggering: Verify IMAGING_DICOM_AUTO_INGEST=true and that Orthanc is configured with a Lua script pointing its OnStableStudy callback to http://imaging-agent:8524/events/dicom-webhook.
Cross-modal disabled by default: The IMAGING_CROSS_MODAL_ENABLED flag defaults to false. Enable it to allow imaging findings to automatically query genomic evidence.
Large DICOM series: Series with >500 slices may exceed the 10MB request size limit. Increase IMAGING_MAX_REQUEST_SIZE_MB or process via the Orthanc webhook path (which streams data).
Cloud vs local NIM: When IMAGING_NIM_MODE=cloud, requests go to integrate.api.nvidia.com and require a valid IMAGING_NVIDIA_API_KEY. Cloud mode uses different model identifiers (e.g., meta/llama-3.1-8b-instruct).

3.10 Single-Cell Intelligence Agent (:8540)¶

Overview and Clinical Purpose

The Single-Cell Intelligence Agent provides cell-type annotation, tumor microenvironment (TME) profiling, drug response prediction, subclonal architecture analysis, spatial transcriptomics niche mapping, trajectory inference, ligand-receptor interaction analysis, biomarker discovery, CAR-T target validation, and treatment monitoring. Its knowledge base covers 57 cell types, 30 drugs, 75 markers, 4 TME phenotypes (hot, cold, excluded, immunosuppressive), and 4 spatial platforms (Visium, MERFISH, Xenium, CosMx). It supports 10 specialized workflows plus general RAG query.

Advanced Configuration Options

All environment variables use the SC_ prefix. Key settings:

Variable	Default	Description
`SC_MILVUS_HOST`	`localhost`	Milvus vector database host
`SC_MILVUS_PORT`	`19530`	Milvus port
`SC_API_PORT`	`8540`	FastAPI REST server port
`SC_STREAMLIT_PORT`	`8130`	Streamlit UI port
`SC_LLM_MODEL`	`claude-sonnet-4-6`	Claude model for synthesis
`SC_ANTHROPIC_API_KEY`	(none)	Required for LLM features
`SC_SCORE_THRESHOLD`	`0.4`	Minimum cosine similarity
`SC_TOP_K_CELL_TYPES`	`50`	Max results from `sc_cell_types` per query
`SC_TOP_K_MARKERS`	`40`	Max results from `sc_markers` per query
`SC_TOP_K_SPATIAL`	`30`	Max results from `sc_spatial` per query
`SC_INGEST_SCHEDULE_HOURS`	`24`	Auto-ingest cycle
`SC_GPU_MEMORY_LIMIT_GB`	`120`	GPU memory limit for RAPIDS-based analysis
`SC_CELLXGENE_API_URL`	(none)	Optional CellxGene Census API endpoint
`SC_CROSS_AGENT_TIMEOUT`	`30`	Cross-agent HTTP timeout in seconds

Collection weight tuning: 12 SC_WEIGHT_* variables control multi-collection fusion. SC_WEIGHT_CELL_TYPES defaults to 0.14 (highest) followed by SC_WEIGHT_MARKERS at 0.12, reflecting the primacy of cell identity in single-cell analysis.

Milvus Collections (13)

Collection	Description
`sc_cell_types`	Cell type reference profiles and markers (57 types)
`sc_markers`	Marker gene signatures per cell type (75 markers)
`sc_literature`	Single-cell genomics literature
`sc_trials`	Clinical trials involving single-cell analysis
`sc_drugs`	Drug response signatures (GDSC/DepMap, 30 drugs)
`sc_pathways`	Pathway activity signatures
`sc_spatial`	Spatial transcriptomics references (4 platforms)
`sc_trajectories`	Differentiation trajectory templates
`sc_interactions`	Ligand-receptor pair databases (CellPhoneDB/NicheNet)
`sc_tme`	TME classification profiles (4 phenotypes)
`sc_clinical`	Clinical correlation data
`sc_methods`	Computational method references
`genomic_evidence`	Shared genomic evidence (read-only)

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with component readiness
GET	`/collections`	Collection names and record counts
GET	`/workflows`	10 available workflow definitions
GET	`/metrics`	Prometheus-compatible metrics
POST	`/v1/sc/query`	RAG Q&A across all single-cell collections
POST	`/v1/sc/search`	Multi-collection evidence search (no LLM)
POST	`/v1/sc/annotate`	Cell type annotation from marker expression
POST	`/v1/sc/tme-profile`	TME classification from cell proportions
POST	`/v1/sc/drug-response`	Drug response prediction per cell type
POST	`/v1/sc/cart-validate`	CAR-T target validation (on/off-tumor expression)
POST	`/v1/sc/spatial-niche`	Spatial niche mapping (Visium/MERFISH/Xenium/CosMx)
POST	`/v1/sc/trajectory`	Trajectory inference from pseudotime data
POST	`/v1/sc/interactions`	Ligand-receptor interaction analysis
POST	`/v1/sc/biomarker-discovery`	Biomarker discovery from differential expression
POST	`/v1/sc/subclonal`	Subclonal architecture and CNV analysis
POST	`/v1/sc/treatment-monitor`	Treatment response monitoring over time
POST	`/v1/sc/workflow/{type}`	Generic workflow dispatch (10 types)
POST	`/v1/sc/integrated-assessment`	Cross-agent multi-agent assessment
GET	`/v1/sc/cell-types`	Reference catalog of 57 cell types
GET	`/v1/sc/markers`	Reference catalog of 75 markers
GET	`/v1/sc/tme-classes`	TME classification definitions (4 phenotypes)
GET	`/v1/sc/spatial-platforms`	Supported spatial platforms
POST	`/v1/reports/generate`	Multi-format report generation
GET	`/v1/reports/formats`	Supported export formats
GET	`/v1/events/stream`	Server-sent event stream

Detailed Clinical Workflow Walkthrough (TME-Guided Immunotherapy Selection)

Expression data submission -- Clinician submits a gene expression marker panel or cell type proportions from a tumor biopsy via the /v1/sc/tme-profile endpoint. Input can be raw marker values (e.g., CD3D: 5.2, CD8A: 4.1, GZMB: 3.8) or pre-computed cell type proportions.
Cell type annotation -- If raw markers are provided, the agent performs multi-strategy annotation: reference-based matching against the 57-cell-type atlas, marker-based scoring using the 75-marker database, and LLM-augmented classification for ambiguous profiles. Confidence levels (high/medium/low) are assigned.
TME classification -- Cell type proportions are analyzed against the 4 TME phenotype profiles:
Hot (immune-inflamed): high CD8+ T cells (>15%), high PD-L1 expression, active IFN-gamma signaling
Cold (immune-desert): low immune infiltrate (<5%), absent T cell signatures
Excluded (immune-excluded): immune cells at tumor periphery, high stromal content
Immunosuppressive: high Treg (>8%), high M2 macrophage, elevated TGF-beta/IL-10
Drug response prediction -- Based on the TME phenotype and cell-type-resolved expression, the agent queries sc_drugs for GDSC/DepMap sensitivity signatures across 30 drugs. Checkpoint inhibitors (pembrolizumab, nivolumab, atezolizumab) are ranked for hot TMEs; combination strategies are suggested for cold/excluded TMEs.
Subclonal architecture -- If CNV data is available, the agent performs subclonal detection to identify resistant subpopulations and assess antigen escape risk. Clones with loss of heterozygosity at HLA loci are flagged.
Spatial context -- For spatial transcriptomics data (Visium, MERFISH, Xenium, CosMx), the agent maps immune cell niches relative to tumor boundaries, quantifying immune exclusion zones and tertiary lymphoid structures.
Report generation -- Results exported as Markdown, JSON, PDF, or FHIR R4.

Example Query and Response

curl -X POST http://localhost:8540/v1/sc/tme-profile \
  -H "Content-Type: application/json" \
  -d '{"cell_type_proportions": {"CD8_T_cell": 0.15, "Macrophage": 0.25, "Treg": 0.08, "Fibroblast": 0.30, "Tumor": 0.18, "NK_cell": 0.04}}'

{
  "status": "completed",
  "tme_classification": "immunosuppressive",
  "confidence": "high",
  "rationale": "Elevated Treg proportion (8%) combined with high macrophage content (25%) suggests immunosuppressive microenvironment. CD8+ T cell infiltration (15%) is present but likely functionally exhausted based on the Treg:CD8 ratio.",
  "immunotherapy_recommendations": [
    {
      "drug": "ipilimumab + nivolumab",
      "rationale": "Combination anti-CTLA-4/PD-1 to overcome Treg-mediated suppression",
      "evidence_level": "Level 1A (CheckMate 067)"
    },
    {
      "drug": "anti-CCR4 (mogamulizumab)",
      "rationale": "Treg depletion strategy for immunosuppressive TME",
      "evidence_level": "Level 2B (Phase II data)"
    }
  ],
  "escape_risk": "moderate",
  "collections_searched": 8,
  "processing_time_ms": 1850.3
}

Cross-Agent Integration

Direction	Partner Agent	Trigger/Scenario
Outbound	CAR-T (:8522)	Sends antigen escape alerts when off-tumor expression detected in CAR-T target validation
Outbound	Oncology (:8527)	Provides TME context for immunotherapy selection and combination strategy
Outbound	Biomarker (:8529)	Shares cell-type-resolved biomarker panels for panel enrichment
Outbound	Drug Discovery	Queries compound integration for cell-type-selective drug candidates
Inbound	Imaging (:8524)	Receives spatial-imaging correlation requests
Inbound	Pharmacogenomics (:8107)	Receives requests for cell-type-resolved drug sensitivity data
Bidirectional	All agents	Reads `genomic_evidence` (shared); publishes events via SSE

Performance Characteristics

Operation	Typical Latency	Notes
Cell type annotation (marker panel)	500-1,200 ms	Multi-strategy: reference + marker + LLM
TME profiling	800-1,500 ms	Proportion analysis + evidence retrieval
Drug response prediction	600-1,200 ms	GDSC/DepMap signature matching
CAR-T target validation	400-800 ms	Expression ratio analysis
Spatial niche mapping	1,000-2,500 ms	Platform-specific spatial analysis
Full RAG query	1,500-3,500 ms	13-collection search + LLM synthesis
Subclonal architecture	2,000-4,000 ms	CNV detection + clone assignment

Common Pitfalls and Troubleshooting

Cell type proportion normalization: Proportions submitted to /v1/sc/tme-profile should sum to approximately 1.0. The agent normalizes internally, but significantly unbalanced inputs (e.g., sum > 2.0) may indicate double-counting.
Marker gene naming: Gene symbols must follow HGNC nomenclature (e.g., CD3D not CD3). The agent attempts alias resolution via ENTITY_ALIASES but non-standard names may fail silently.
TME classification edge cases: Tumors with mixed hot/cold regions (spatially heterogeneous) may be classified inconsistently depending on the biopsy site. Spatial transcriptomics data is recommended for heterogeneous tumors.
GDSC/DepMap coverage: Drug response predictions are limited to the 30 drugs in the sensitivity database. Novel agents or combination regimens fall back to LLM-based inference from literature.
GPU memory for RAPIDS: If SC_GPU_MEMORY_LIMIT_GB is set too low for large expression matrices, RAPIDS operations will fall back to CPU, significantly increasing latency.

3.11 Clinical Trial Intelligence Agent (:8538)¶

Overview and Clinical Purpose

The Clinical Trial Intelligence Agent provides AI-driven support across the full clinical trial lifecycle: protocol optimization, patient-trial matching, site selection with 7-factor scoring, eligibility optimization, adaptive design evaluation, safety signal detection using PRR/ROR disproportionality analysis, regulatory document generation, competitive intelligence, diversity assessment, and decentralized trial planning. It supports 10 specialized workflows plus general RAG query capability, and its knowledge base includes 40 landmark trial references.

Advanced Configuration Options

All environment variables use the TRIAL_ prefix. Key settings:

Variable	Default	Description
`TRIAL_MILVUS_HOST`	`localhost`	Milvus vector database host
`TRIAL_MILVUS_PORT`	`19530`	Milvus port
`TRIAL_API_PORT`	`8538`	FastAPI REST server port
`TRIAL_STREAMLIT_PORT`	`8128`	Streamlit UI port
`TRIAL_LLM_MODEL`	`claude-sonnet-4-6`	Claude model for synthesis
`TRIAL_ANTHROPIC_API_KEY`	(none)	Required for LLM features
`TRIAL_SCORE_THRESHOLD`	`0.4`	Minimum cosine similarity
`TRIAL_TOP_K_PER_COLLECTION`	`5`	Default results per collection
`TRIAL_INGEST_SCHEDULE_HOURS`	`24`	Auto-ingest cycle
`TRIAL_API_KEY`	(empty)	Set to require X-API-Key header
`TRIAL_CROSS_AGENT_TIMEOUT`	`30`	Cross-agent HTTP timeout in seconds
`TRIAL_CLINICALTRIALS_API_KEY`	(none)	Optional ClinicalTrials.gov API key

The agent connects to 8 partner agents for cross-agent integration, with dedicated URL variables: TRIAL_ONCOLOGY_AGENT_URL, TRIAL_PGX_AGENT_URL, TRIAL_CARDIOLOGY_AGENT_URL, TRIAL_BIOMARKER_AGENT_URL, TRIAL_RARE_DISEASE_AGENT_URL, TRIAL_NEUROLOGY_AGENT_URL, TRIAL_SINGLE_CELL_AGENT_URL, TRIAL_IMAGING_AGENT_URL.

Milvus Collections (14)

Collection	Description
`trial_protocols`	Protocol designs, amendments, SOAs
`trial_eligibility`	Inclusion/exclusion criteria
`trial_endpoints`	Primary, secondary, exploratory endpoints
`trial_sites`	Site feasibility, enrollment history
`trial_investigators`	Investigator profiles, experience
`trial_results`	Published trial results, CSRs (40 landmark trials)
`trial_regulatory`	FDA/EMA guidance, IND/NDA data
`trial_literature`	PubMed clinical trial publications
`trial_biomarkers`	Biomarker-driven trial designs
`trial_safety`	Safety data, DSMB reports, AE databases
`trial_rwe`	Real-world evidence, registries
`trial_adaptive`	Adaptive design references
`trial_guidelines`	ICH, FDA, EMA guidelines
`genomic_evidence`	Shared genomic evidence (cross-agent)

10 Specialized Workflows

Workflow	Description
Protocol Design	Complexity scoring, SOA review, endpoint optimization
Patient Matching	AI eligibility screening with genomic/biomarker integration
Site Selection	7-factor feasibility scoring, enrollment forecasting, diversity metrics
Eligibility Optimization	Population impact modeling, competitor benchmarking
Adaptive Design	Bayesian interim analysis, dose-response, futility assessment
Safety Signal	Disproportionality analysis (PRR/ROR), causality assessment
Regulatory Docs	IND, CSR, briefing doc generation (FDA/EMA/PMDA)
Competitive Intel	Landscape analysis, enrollment race tracking
Diversity Assessment	FDA diversity guidance compliance, gap analysis
Decentralized Planning	DCT component feasibility, hybrid model design

Complete API Endpoint Table

Method	Path	Description
GET	`/health`	Service health with component readiness
GET	`/collections`	Collection names and record counts
GET	`/workflows`	10 workflow definitions + general query
GET	`/metrics`	Prometheus-compatible metrics
POST	`/v1/trial/query`	RAG Q&A across all trial collections
POST	`/v1/trial/search`	Multi-collection evidence search (no LLM)
POST	`/v1/trial/workflow/protocol_design/run`	Protocol design complexity scoring
POST	`/v1/trial/workflow/patient_matching/run`	Patient-trial matching with genomics
POST	`/v1/trial/workflow/site_selection/run`	7-factor site feasibility scoring
POST	`/v1/trial/workflow/eligibility_optimization/run`	Population impact modeling
POST	`/v1/trial/workflow/adaptive_design/run`	Bayesian adaptive design evaluation
POST	`/v1/trial/workflow/safety_signal/run`	PRR/ROR disproportionality analysis
POST	`/v1/trial/workflow/regulatory_docs/run`	Regulatory document generation
POST	`/v1/trial/workflow/competitive_intel/run`	Competitive landscape analysis
POST	`/v1/trial/workflow/diversity_assessment/run`	FDA diversity guidance compliance
POST	`/v1/trial/workflow/decentralized_planning/run`	DCT feasibility assessment
POST	`/v1/trial/integrated-assessment`	Cross-agent multi-agent assessment
GET	`/v1/trial/therapeutic-areas`	Reference catalog of therapeutic areas
GET	`/v1/trial/phases`	Trial phase definitions
GET	`/v1/trial/landmark-trials`	40 landmark trial references
GET	`/v1/trial/regulatory-agencies`	Supported regulatory agencies (FDA/EMA/PMDA)
POST	`/v1/reports/generate`	Multi-format report generation
GET	`/v1/reports/formats`	Supported export formats
GET	`/v1/events/stream`	Server-sent event stream

Detailed Clinical Workflow Walkthrough (Safety Signal Detection)

Adverse event submission -- Safety team submits adverse event data via /v1/trial/workflow/safety_signal/run. Input includes the drug name, a list of adverse events with counts (exposed and unexposed), and optional background rate data.
Disproportionality analysis -- The agent computes two standard pharmacovigilance metrics:
PRR (Proportional Reporting Ratio): ratio of the proportion of a specific AE for the drug vs. all other drugs in the database. PRR > 2 with chi-squared > 4 and N >= 3 triggers a signal.
ROR (Reporting Odds Ratio): odds ratio of the AE in exposed vs. unexposed populations. The 95% confidence interval lower bound > 1 indicates a statistically significant signal.
Historical comparator retrieval -- The agent searches trial_safety for historical AE rates from similar drug classes and indications. Published DSMB reports and FDA safety communications are retrieved from trial_regulatory.
Causality assessment -- Evidence from trial_literature and trial_results is used to assess temporal relationship, dose-response, biological plausibility, and consistency across studies. The LLM synthesizes a causality narrative referencing the WHO-UMC and Naranjo scales.
Benchmark against landmark trials -- The detected signal is compared against AE profiles from relevant landmark trials in trial_results (40 stored). For example, hepatotoxicity in a checkpoint inhibitor trial would be benchmarked against CheckMate 067 and KEYNOTE 024 safety data.
Report generation -- A structured safety signal report is generated with PRR/ROR calculations, confidence intervals, historical comparator data, causality assessment, and recommended actions (continue monitoring, dose modification, or study hold).

Example Query and Response

curl -X POST http://localhost:8538/v1/trial/workflow/safety_signal/run \
  -H "Content-Type: application/json" \
  -d '{
    "drug": "experimental_agent_X",
    "adverse_events": [
      {"event": "hepatotoxicity", "count": 8, "total_exposed": 200},
      {"event": "rash", "count": 15, "total_exposed": 200},
      {"event": "fatigue", "count": 45, "total_exposed": 200}
    ],
    "comparator_class": "checkpoint_inhibitor"
  }'

{
  "status": "completed",
  "workflow": "safety_signal",
  "signals_detected": [
    {
      "event": "hepatotoxicity",
      "prr": 3.2,
      "prr_ci_lower": 1.8,
      "ror": 3.5,
      "ror_ci_lower": 1.6,
      "signal_strength": "moderate",
      "historical_rate_class": "2-5% for checkpoint inhibitors",
      "observed_rate": "4.0%",
      "assessment": "Signal detected. Observed hepatotoxicity rate (4.0%) is within the expected range for checkpoint inhibitors (2-5%) but PRR exceeds threshold. Recommend enhanced liver function monitoring at 2-week intervals.",
      "causality_score": "possible (Naranjo: 4)"
    },
    {
      "event": "rash",
      "prr": 1.4,
      "signal_strength": "none",
      "assessment": "No signal. Rash rate (7.5%) is below the expected range for checkpoint inhibitors (10-20%)."
    }
  ],
  "landmark_comparators": ["CheckMate 067", "KEYNOTE 024", "IMpower 150"],
  "evidence_sources": 28,
  "processing_time_ms": 3120.7
}

7-Factor Site Selection Scoring

The site selection workflow evaluates investigative sites using a composite score across 7 factors:

Factor	Weight	Data Sources
Disease prevalence in catchment area	0.20	`trial_sites`, census data
Prior enrollment performance	0.20	`trial_investigators`, historical enrollment
Investigator experience (publications, prior trials)	0.15	`trial_investigators`, `trial_literature`
Regulatory readiness (IRB turnaround, startup time)	0.15	`trial_sites`
Diversity index (demographic representation)	0.10	`trial_sites`, FDA diversity guidance
Competitor trial burden (active competing studies)	0.10	`trial_protocols`, competitive intelligence
Infrastructure score (lab, imaging, pharmacy)	0.10	`trial_sites`

Cross-Agent Integration

Direction	Partner Agent	Trigger/Scenario
Inbound	Oncology (:8527)	Receives trial matching requests when actionable variants found
Inbound	Rare Disease (:8134)	Receives rare disease trial eligibility queries
Inbound	Biomarker (:8529)	Receives biomarker-driven trial design queries
Inbound	PGx (:8107)	Receives queries for PGx-guided clinical trials
Outbound	Oncology (:8527)	Queries tumor profiling for patient eligibility context
Outbound	PGx (:8107)	Queries metabolizer status for trial-specific dosing arms
Outbound	Cardiology (:8126)	Queries cardiac safety context for cardiovascular endpoint trials
Outbound	Biomarker (:8529)	Queries biomarker interpretation for eligibility criteria
Bidirectional	All agents	Reads `genomic_evidence` (shared); publishes events via SSE

Performance Characteristics

Operation	Typical Latency	Notes
RAG query	1,500-3,000 ms	14-collection search + LLM synthesis
Patient-trial matching	2,000-4,000 ms	Multi-collection search + eligibility scoring
Protocol design review	2,500-4,500 ms	Complexity scoring + benchmark comparison
Safety signal (PRR/ROR)	1,500-3,500 ms	Statistical calculation + evidence retrieval
Site selection (7-factor)	3,000-5,000 ms	Multi-factor scoring + enrollment forecasting
Regulatory document generation	5,000-10,000 ms	Template population + LLM narrative
Competitive intelligence	2,000-4,000 ms	Landscape analysis across `trial_protocols`

Common Pitfalls and Troubleshooting

Patient matching requires complete profiles: The patient-trial matching workflow performs best with age, diagnosis, biomarker status, prior therapy history, and ECOG performance status. Partial profiles produce fewer matches with lower confidence.
PRR/ROR minimum counts: Disproportionality analysis requires at least 3 events (N >= 3) to generate a statistically meaningful signal. Submitting events with count < 3 will return "insufficient data" rather than a false signal.
Landmark trial coverage: The 40 landmark trials are curated references. New landmark trials (e.g., recently published Phase III results) require manual addition via scripts/seed_knowledge.py.
ClinicalTrials.gov rate limiting: If TRIAL_CLINICALTRIALS_API_KEY is not set, the ClinicalTrials.gov v2 API rate-limits to 3 requests/second. Set the key for production workloads.
Regulatory document templates: Generated IND/CSR documents are templates requiring clinical review. They auto-populate statistical sections, safety summaries, and study schemas but do not replace regulatory affairs expertise.
Cross-agent integration breadth: This agent connects to 8 partner agents (more than any other agent). If multiple partner agents are unavailable, the integrated assessment still returns partial results from available agents.

Part 4: Advanced Drug Discovery¶

The 10-Stage Therapeutic Discovery Pipeline¶

The Drug Discovery Pipeline transforms a clinically actionable variant into ranked therapeutic candidates. Each stage feeds the next, producing an auditable chain of evidence from gene to molecule.

Stage	Name	Technology	Typical Time
1	Variant Reception	VCF parser + ClinVar lookup	<1 s
2	Target Identification	UniProt mapping, PDB structure fetch	2-5 s
3	Binding Site Analysis	fpocket cavity detection, druggability scoring	5-10 s
4	Seed Compound Retrieval	ChEMBL/DrugBank similarity search	3-8 s
5	Molecular Generation	BioNeMo MolMIM conditional generation	15-30 s
6	Chemical Feasibility	RDKit Lipinski, QED, SA score	1-2 s
7	Pediatric Safety Filtering	6-filter pediatric safety gate	<1 s
8	Binding Prediction	BioNeMo DiffDock pose sampling	30-90 s
9	Composite Scoring	Weighted multi-objective ranking	<1 s
10	Report Generation	Claude LLM narrative + evidence assembly	5-10 s

End-to-end time per variant: 60-150 seconds depending on the number of generated molecules and DiffDock sampling depth.

MolMIM Molecular Generation¶

MolMIM (Molecular Masked Inverse Modeling) is a BioNeMo NIM that generates novel molecules conditioned on a seed SMILES string. The HCLS AI Factory sends generation requests to the local MolMIM endpoint.

Seed SMILES Selection

The pipeline selects seed compounds from the top ChEMBL/DrugBank hits in Stage 4. Selection criteria:

Known activity against the target gene product (IC50 or Ki reported)
Molecular weight between 200-500 Da (optimal generation range)
Existing SAR data available for the scaffold class

Scaffold Constraints

MolMIM supports scaffold-constrained generation, where a substructure is preserved while peripheral groups are modified:

generation_params = {
    "algorithm": "CMA-ES",
    "num_molecules": 30,
    "property_name": "QED",
    "min_similarity": 0.4,
    "particles": 30,
    "iterations": 10,
    "smi": seed_smiles  # e.g., "CC(=O)Oc1ccccc1C(=O)O"
}
response = requests.post(f"{MOLMIM_URL}/generate", json=generation_params)

The min_similarity parameter (Tanimoto on Morgan fingerprints) controls how far generated molecules can drift from the seed. Lower values (0.3) explore more chemical space; higher values (0.7) stay closer to known actives.

Diversity Control

To avoid redundant candidates, the pipeline applies MaxMin diversity picking after generation. From 30 raw candidates, the top 10 most structurally diverse molecules are retained for downstream evaluation.

DiffDock Binding Prediction¶

DiffDock is a diffusion-based molecular docking model that predicts binding poses without requiring a pre-defined search box. The HCLS AI Factory uses the BioNeMo NIM deployment.

Inputs and Outputs

Input: Protein structure (PDB), ligand (SDF/MOL2), number of poses
Output: Ranked poses with confidence scores, per-pose RMSD estimates, contact residue lists

Confidence Score Interpretation

Confidence Range	Interpretation	Action
> 0.8	High confidence binding	Strong candidate
0.5 - 0.8	Moderate confidence	Proceed with caution
< 0.5	Low confidence	Likely non-binder, discard

Binding Energy Conversion

DiffDock confidence scores are converted to approximate binding energies using a calibrated regression:

docking_score_kcal = -2.0 + (confidence * -12.0)

A confidence of 0.78 yields approximately -11.4 kcal/mol, consistent with the VCP demo results.

Contact Residue Analysis

For each predicted pose, the pipeline extracts residues within 4.0 Angstroms of the ligand. These contact residues are cross-referenced against:

Known active site residues from UniProt annotations
Mutation sites from the patient's VCF
Conserved residues across orthologs

RDKit Chemical Property Analysis¶

Every generated molecule passes through a battery of RDKit property calculations before scoring.

Lipinski's Rule of Five

Molecules violating more than one Lipinski rule are flagged but not automatically discarded (many approved oncology drugs violate Lipinski):

Property	Threshold	Calculation
Molecular Weight	<= 500 Da	`Descriptors.MolWt(mol)`
LogP	<= 5.0	`Descriptors.MolLogP(mol)`
H-Bond Donors	<= 5	`Descriptors.NumHDonors(mol)`
H-Bond Acceptors	<= 10	`Descriptors.NumHAcceptors(mol)`

Quantitative Estimate of Drug-likeness (QED)

QED combines eight molecular descriptors into a single 0-1 score. The HCLS AI Factory uses the weighted QED variant:

from rdkit.Chem.QED import qed
score = qed(mol)  # 0.0 (least drug-like) to 1.0 (most drug-like)

Typical thresholds: QED >= 0.5 passes initial screen; QED >= 0.7 is considered favorable.

Synthetic Accessibility (SA) Score

The SA score (1-10 scale, lower is easier to synthesize) helps prioritize molecules that can realistically be made:

from rdkit.Chem import RDConfig
from rdkit.Contrib.SA_Score import sascorer
sa = sascorer.calculateScore(mol)  # 1.0 (trivial) to 10.0 (very hard)

Molecules with SA > 6.0 are flagged as synthetically challenging.

MMFF Conformer Generation

For 3D analysis and docking preparation, the pipeline generates low-energy conformers:

from rdkit.Chem import AllChem
AllChem.EmbedMultipleConfs(mol, numConfs=50, pruneRmsThresh=0.5)
AllChem.MMFFOptimizeMoleculeConfs(mol)

The lowest-energy conformer is selected as the docking input for DiffDock.

Pediatric Safety Filters¶

The HCLS AI Factory applies six pediatric-specific safety filters. A molecule must pass all six to be considered a viable pediatric candidate. These filters go beyond standard adult drug safety screens to address the unique vulnerabilities of developing organ systems.

Filter 1: Blood-Brain Barrier (BBB) Penetration Risk

Threshold: MW > 500 Da triggers concern
Rationale: Large molecules that cross the BBB pose neurotoxicity risk in developing brains
Implementation: Descriptors.MolWt(mol) > 500
Action: Flag, do not auto-reject (some CNS-targeted therapies require BBB penetration)

Filter 2: Cardiac Safety (hERG Liability)

Threshold: Predicted hERG IC50 < 10 uM
Rationale: QT prolongation risk is higher in pediatric patients with smaller cardiac mass
Implementation: Structure-based hERG prediction using Morgan fingerprint similarity to known hERG blockers
Action: Hard reject if predicted IC50 < 1 uM; flag if 1-10 uM

Filter 3: Hepatotoxicity Risk

Threshold: LogP > 5.0
Rationale: Highly lipophilic compounds accumulate in hepatic tissue; pediatric livers have immature metabolic capacity
Implementation: Descriptors.MolLogP(mol) > 5.0
Action: Hard reject

Filter 4: Teratogenicity Screen

Threshold: SMARTS pattern match against known teratogenic substructures
Rationale: Adolescent patients of reproductive age require teratogenicity screening
Implementation: Library of 47 SMARTS patterns derived from FDA pregnancy category X drugs
Action: Hard reject on match with known high-risk pharmacophores

Filter 5: Oral Bioavailability

Threshold: TPSA > 140 Angstroms squared
Rationale: Pediatric formulation strongly prefers oral dosing; high TPSA reduces absorption
Implementation: Descriptors.TPSA(mol) > 140
Action: Flag (injectable formulations remain possible but less preferred)

Filter 6: Gastrointestinal Tolerability

Threshold: Rotatable bonds > 10
Rationale: Flexible molecules often cause GI distress; pediatric patients are more sensitive
Implementation: Descriptors.NumRotatableBonds(mol) > 10
Action: Flag

Composite Scoring Methodology¶

Candidates that survive safety filtering receive a composite score combining three weighted components:

composite_score = (docking_weight * normalized_docking) +
                  (generation_weight * normalized_generation) +
                  (qed_weight * normalized_qed)

Default Weights:

Component	Weight	Source	Normalization
Docking Score	0.4	DiffDock confidence	Min-max across candidate set
Generation Score	0.3	MolMIM similarity to seed	Already 0-1 (Tanimoto)
QED Score	0.3	RDKit QED	Already 0-1

Weights are configurable per therapeutic area. Oncology workflows may increase docking weight to 0.5; rare disease workflows may increase generation weight to favor novelty.

VCP Demo Results¶

The VCP (Valosin-Containing Protein) variant demonstration showcases the full pipeline:

Input Variant: VCP p.R155H (rs121909334), associated with IBMPFD
Target Structure: PDB 5KIU, ATPase domain
Best Candidate Docking Score: -11.4 kcal/mol (DiffDock confidence 0.78)
Best Candidate QED: 0.81
Candidates Generated: 30 via MolMIM, 10 after diversity picking
Candidates Passing Safety: 7 of 10 passed all 6 pediatric filters
Top Composite Score: 0.84
Pipeline Runtime: 127 seconds end-to-end

Part 5: Cross-Agent Coordination¶

Shared Genomic Evidence Architecture¶

All 11 intelligence agents share access to the genomic_evidence Milvus collection. This collection contains 35,678 curated variant-evidence vectors derived from ClinVar, AlphaMissense, and the patient's annotated VCF.

Collection Schema:

Field	Type	Description
id	INT64 (primary)	Auto-generated unique ID
text	VARCHAR(4096)	Evidence text (variant description + clinical significance)
embedding	FLOAT_VECTOR(384)	BGE-small-en-v1.5 embedding
source	VARCHAR(256)	Origin database (ClinVar, AlphaMissense, VCF)
gene	VARCHAR(64)	Gene symbol
variant_id	VARCHAR(128)	rs number or HGVS notation
clinical_significance	VARCHAR(128)	ClinVar classification
therapeutic_area	VARCHAR(128)	Mapped therapeutic area(s)

Access Pattern: Read-only. Agents query genomic_evidence using cosine similarity search with domain-specific query expansion. Write access is restricted to the RAG ingestion pipeline (Stage 2).

Query Expansion by Agent Type:

Each agent type prepends domain-specific context to the raw user query before embedding:

Oncology Agent: Appends "cancer somatic mutation driver gene tumor"
Cardiology Agent: Appends "cardiac arrhythmia cardiomyopathy channelopathy"
Neurology Agent: Appends "neurodegeneration epilepsy demyelination"
Pharmacogenomics Agent: Appends "drug metabolism CYP450 pharmacokinetic"

This expansion biases the similarity search toward domain-relevant evidence without requiring separate collections.

Agents communicate through structured trigger events. When one agent's analysis produces a finding that is relevant to another agent's domain, it emits a trigger that the orchestrator routes to the appropriate downstream agent.

Imaging Finding to Genomic Variant Query

Scenario: The Imaging Intelligence Agent detects a suspicious mass pattern in a brain MRI.

{
  "trigger_type": "imaging_to_genomic",
  "source_agent": "imaging_intelligence",
  "target_agent": "precision_oncology",
  "payload": {
    "finding": "ring-enhancing lesion, temporal lobe",
    "modality": "MRI_T1_contrast",
    "urgency": "high",
    "suggested_query": "IDH1 IDH2 ATRX TP53 glioma variants"
  }
}

The Oncology Agent receives this trigger and queries genomic_evidence for variants in IDH1, IDH2, ATRX, and TP53. Results are correlated with the imaging finding to produce an integrated assessment.

Pharmacogenomic Alert to Dosing Recommendation

Scenario: The Pharmacogenomics Agent identifies a CYP2D6 poor metabolizer genotype.

{
  "trigger_type": "pgx_to_dosing",
  "source_agent": "pharmacogenomics_intelligence",
  "target_agent": "precision_oncology",
  "payload": {
    "gene": "CYP2D6",
    "phenotype": "poor_metabolizer",
    "star_alleles": ["*4", "*4"],
    "affected_drugs": ["tamoxifen", "codeine", "ondansetron"],
    "recommendation": "avoid_or_reduce_dose"
  }
}

The receiving agent adjusts therapeutic recommendations, avoiding CYP2D6-metabolized prodrugs and suggesting alternatives.

Rare Disease Variant to Family Cascade

Scenario: The Rare Disease Diagnostic Agent identifies a pathogenic variant in a recessive disorder gene.

{
  "trigger_type": "rare_to_cascade",
  "source_agent": "rare_disease_diagnostic",
  "target_agent": "precision_biomarker",
  "payload": {
    "variant": "CFTR p.F508del",
    "zygosity": "heterozygous",
    "inheritance": "autosomal_recessive",
    "cascade_recommendation": "test_parents_and_siblings",
    "carrier_frequency": "1_in_25_European"
  }
}

The Biomarker Agent prepares a carrier screening panel and family counseling summary.

Multi-Agent Query Orchestration Patterns¶

Pattern 1: Fan-Out / Fan-In

A single patient query is broadcast to multiple agents simultaneously. Each agent returns domain-specific findings, and the orchestrator merges results:

User Query --> Orchestrator --> [Oncology, Cardiology, PGx, Biomarker]
                                      |         |        |       |
                                      v         v        v       v
                                 Findings   Findings  Findings  Findings
                                      \        |        |       /
                                       --> Merged Report <---

Pattern 2: Sequential Pipeline

Results flow through agents in a defined order, each enriching the analysis:

Variant --> Biomarker Agent --> Oncology Agent --> Drug Discovery --> Clinical Trial Agent
         (classify variant)  (assess cancer risk)  (find candidates)  (match trials)

Pattern 3: Conditional Routing

The orchestrator examines variant annotations and routes to the most relevant agent:

Variants in cardiac genes (SCN5A, KCNQ1, MYH7) route to the Cardiology Agent
Variants in neurological genes (APP, PSEN1, HTT) route to the Neurology Agent
Variants with oncogenic annotations route to the Oncology Agent
Unclassified rare variants route to the Rare Disease Agent

Building Custom Cross-Agent Workflows¶

To define a custom workflow, create a JSON workflow definition:

{
  "workflow_name": "pediatric_cns_tumor_workup",
  "description": "Integrated CNS tumor assessment for pediatric patients",
  "steps": [
    {
      "agent": "imaging_intelligence",
      "action": "analyze_neuroimaging",
      "timeout_seconds": 30
    },
    {
      "agent": "precision_oncology",
      "action": "variant_assessment",
      "depends_on": ["imaging_intelligence"],
      "timeout_seconds": 20
    },
    {
      "agent": "pharmacogenomics_intelligence",
      "action": "drug_interaction_check",
      "parallel_with": ["precision_oncology"],
      "timeout_seconds": 15
    },
    {
      "agent": "clinical_trial_intelligence",
      "action": "match_trials",
      "depends_on": ["precision_oncology"],
      "timeout_seconds": 20
    }
  ],
  "output": "merged_report"
}

Register the workflow via the orchestrator API:

curl -X POST http://localhost:8080/api/workflows \
  -H "Content-Type: application/json" \
  -d @pediatric_cns_tumor_workup.json

Part 6: Deployment & Operations¶

Docker Compose Architecture¶

The HCLS AI Factory runs as 21 interconnected services defined in docker-compose.dgx-spark.yml. Services are organized into four tiers:

Tier 1: Infrastructure (always-on)

Service	Image	Port	Purpose
etcd	quay.io/coreos/etcd:v3.5.x	2379	Milvus metadata store
minio	minio/minio:latest	9000, 9001	Milvus object storage
milvus-standalone	milvusdb/milvus:v2.4.x	19530, 9091	Vector database

Tier 2: Core Services

Service	Port	Purpose
landing-page	8080	Flask hub with health dashboard
rag-api	8888	RAG query engine
genomics-portal	8085	Genomics web interface
drug-discovery-api	8890	Drug discovery pipeline API

Tier 3: Intelligence Agents

Service	Port	Purpose
precision-oncology-agent	8501	Oncology analysis
cart-intelligence-agent	8502	CAR-T therapy design
imaging-intelligence-agent	8503	Medical imaging analysis
precision-autoimmune-agent	8504	Autoimmune assessment
precision-biomarker-agent	8505	Biomarker interpretation
cardiology-intelligence-agent	8506	Cardiac variant analysis
pharmacogenomics-intelligence-agent	8507	PGx and drug interaction
neurology-intelligence-agent	8508	Neurological assessment
rare-disease-diagnostic-agent	8509	Rare disease diagnosis
single-cell-intelligence-agent	8510	Single-cell analysis
clinical-trial-intelligence-agent	8511	Clinical trial matching

Tier 4: Monitoring

Service	Port	Purpose
prometheus	9090	Metrics collection
grafana	3000	Dashboards and alerting

Health Monitoring¶

The health-monitor.sh script (19 KB) provides comprehensive health monitoring across 22 service endpoints. It runs via cron every 5 minutes.

Monitored Checks per Service:

TCP port reachability (timeout: 5 seconds)
HTTP health endpoint response (expected: 200 OK)
Process memory consumption (threshold: 90% of allocated)
Response latency (threshold: 10 seconds)

Auto-Restart Behavior:

When a service fails 3 consecutive health checks, the monitor executes:

# For Docker services
docker restart <service_name>

# For native processes
kill -TERM <pid> && sleep 2 && <start_command>

Watchdog Mode:

The health monitor itself is supervised by a systemd watchdog. If health-monitor.sh fails to write a heartbeat file within 10 minutes, systemd restarts it.

Log Locations:

Health check results: logs/cron-health.log
Service-specific logs: logs/<service-name>.log
Auto-restart events: logs/health-restarts.log

Milvus Administration¶

The platform manages 139 Milvus collections across all agents and core services.

Collection Categories:

Category	Count	Example
Genomic evidence	1	`genomic_evidence` (35,678 vectors)
Agent knowledge bases	55	`oncology_guidelines`, `cart_protocols`
ClinVar annotations	12	`clinvar_pathogenic`, `clinvar_vus`
Drug/compound data	8	`drug_interactions`, `chembl_targets`
Clinical trials	14	`trial_eligibility`, `trial_endpoints`
Literature	22	`pubmed_oncology`, `pubmed_cardiology`
Patient data (demo)	5	`demo_patient_variants`
Imaging features	10	`imaging_features_mri`, `imaging_features_ct`
Other	12	`system_prompts`, `workflow_templates`

Backup Strategy:

# Full backup of all collections (run weekly)
python3 -c "
from pymilvus import connections, utility
connections.connect('default', host='localhost', port='19530')
collections = utility.list_collections()
for coll in collections:
    utility.do_bulk_insert(coll, files=[], backup=True)
"

# Incremental backup via Milvus backup tool
milvus-backup create --name weekly_$(date +%Y%m%d)

Flush Rate Tuning:

By default, Milvus flushes segments every 1 second. For bulk ingestion (e.g., loading 3.56M annotated variants), increase the flush interval:

from pymilvus import Collection
collection = Collection("genomic_evidence")
collection.set_properties({"collection.flush.interval": "30"})  # 30 seconds

Reset to default after ingestion completes.

Agent Startup Sequence¶

Each intelligence agent follows a deterministic startup sequence:

Configuration Load (< 1 s): Read environment variables and config files
Embedding Model Load (5-15 s): Load BGE-small-en-v1.5 into GPU memory (~130 MB)
Milvus Connection (1-3 s): Connect to Milvus standalone, verify collection availability
Collection Warm-up (2-5 s): Load frequently-queried collections into memory
Health Endpoint Registration (< 1 s): Register with the health monitor
API Ready (< 1 s): Begin accepting requests

Total cold start time: 10-25 seconds per agent. When running all 11 agents, stagger starts by 3 seconds each to avoid GPU memory contention during embedding model loading.

Performance Tuning¶

GPU Memory Management

The DGX Spark provides 128 GB unified memory. Recommended allocation:

Component	Memory	Notes
Milvus indexes	16-24 GB	Depends on loaded collections
BGE-small embedding (x11)	~1.4 GB	130 MB per agent instance
MolMIM NIM	4-8 GB	Active during drug discovery
DiffDock NIM	4-8 GB	Active during drug discovery
Parabricks	32-64 GB	Active during genomics pipeline
System / OS	8-16 GB	Linux kernel, Docker overhead

Concurrent Agent Tuning

For systems with limited memory, reduce the number of concurrent agents:

# In docker-compose.dgx-spark.yml, add resource limits
services:
  precision-oncology-agent:
    deploy:
      resources:
        limits:
          memory: 4G
        reservations:
          memory: 2G

Milvus Index Tuning

The default IVF_FLAT index uses nlist=1024 and nprobe=16. Adjust for your workload:

Scenario	nlist	nprobe	Trade-off
Low latency (< 10 ms)	1024	10	Slightly lower recall
Balanced (default)	1024	16	Good recall, ~15 ms
High recall	2048	32	Higher latency (~30 ms)
Maximum recall	4096	64	~50 ms, near-exhaustive

search_params = {"metric_type": "COSINE", "params": {"nprobe": 32}}
results = collection.search(
    data=[query_embedding],
    anns_field="embedding",
    param=search_params,
    limit=20
)

HTTPS via Caddy¶

For production deployments, use Caddy as a reverse proxy with automatic TLS:

# Caddyfile
hcls-ai-factory.example.com {
    handle /api/rag/* {
        reverse_proxy localhost:8888
    }
    handle /api/genomics/* {
        reverse_proxy localhost:8085
    }
    handle /api/drugs/* {
        reverse_proxy localhost:8890
    }
    handle /agents/* {
        reverse_proxy localhost:8501-8511
    }
    handle {
        reverse_proxy localhost:8080
    }
    header {
        X-Content-Type-Options nosniff
        X-Frame-Options DENY
        Referrer-Policy strict-origin-when-cross-origin
    }
}

Security¶

Portal Authentication

The landing page (localhost:8080) supports configurable authentication:

# In landing-page/app.py
PORTAL_AUTH_ENABLED = os.getenv("PORTAL_AUTH_ENABLED", "false").lower() == "true"
PORTAL_USERNAME = os.getenv("PORTAL_USERNAME", "admin")
PORTAL_PASSWORD_HASH = os.getenv("PORTAL_PASSWORD_HASH")  # bcrypt hash

XSS Protection

All agent Streamlit interfaces and the Flask landing page implement output sanitization:

HTML entities escaped in all rendered text
Content-Security-Policy headers set to default-src 'self'
User inputs validated and sanitized before Milvus query construction

Pickle Hardening

The platform avoids pickle.loads() on untrusted data. Model artifacts use SafeTensors format. Where pickle is unavoidable (legacy sklearn models), a restricted unpickler is used:

import pickle
import io

ALLOWED_CLASSES = {"numpy.ndarray", "sklearn.ensemble._forest.RandomForestClassifier"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        full_name = f"{module}.{name}"
        if full_name not in ALLOWED_CLASSES:
            raise pickle.UnpicklingError(f"Blocked: {full_name}")
        return super().find_class(module, name)

def safe_load(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

Part 7: Extending the Platform¶

Adding a New Intelligence Agent¶

Follow these steps to add a custom intelligence agent (e.g., a "Reproductive Health Agent"):

Step 1: Create Milvus Collections

Define the knowledge collections your agent needs:

from pymilvus import Collection, CollectionSchema, FieldSchema, DataType, connections

connections.connect("default", host="localhost", port="19530")

fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=4096),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=384),
    FieldSchema(name="source", dtype=DataType.VARCHAR, max_length=256),
    FieldSchema(name="category", dtype=DataType.VARCHAR, max_length=128),
]

schema = CollectionSchema(fields, description="Reproductive health knowledge base")
collection = Collection("reproductive_health_knowledge", schema)

# Create index
index_params = {"index_type": "IVF_FLAT", "metric_type": "COSINE", "params": {"nlist": 1024}}
collection.create_index("embedding", index_params)

Step 2: Ingest Domain Knowledge

Prepare knowledge sources and embed them:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

documents = [
    {"text": "Preeclampsia affects 2-8% of pregnancies...", "source": "ACOG", "category": "guidelines"},
    # ... more documents
]

for doc in documents:
    embedding = model.encode(doc["text"]).tolist()
    collection.insert([[doc["text"]], [embedding], [doc["source"]], [doc["category"]]])

collection.flush()

Step 3: Create the Agent API

Use the standard agent template (Streamlit + FastAPI):

reproductive-health-agent/
    app.py              # Streamlit UI
    api.py              # FastAPI health + query endpoints
    knowledge/          # Ingestion scripts and raw data
    requirements.txt
    Dockerfile

The api.py must expose:

GET /health -- returns {"status": "healthy", "collections": [...], "uptime": ...}
POST /query -- accepts {"query": "...", "patient_context": {...}} and returns findings

Step 4: Register in the UI

Add the agent to the landing page configuration:

# In landing-page/config.py
AGENTS = {
    # ... existing agents ...
    "reproductive_health": {
        "name": "Reproductive Health Agent",
        "port": 8512,
        "icon": "baby",
        "description": "Maternal-fetal medicine and reproductive genomics",
        "collections": ["reproductive_health_knowledge", "genomic_evidence"]
    }
}

Step 5: Add to Docker Compose

reproductive-health-agent:
  build: ./reproductive-health-agent
  ports:
    - "8512:8512"
  environment:
    - MILVUS_HOST=milvus-standalone
    - MILVUS_PORT=19530
    - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
  depends_on:
    - milvus-standalone
  restart: unless-stopped

Step 6: Write Tests

At minimum, every agent needs:

Unit tests for query processing and response formatting
Integration test confirming Milvus connectivity and collection access
End-to-end test with a sample query and expected output structure

def test_reproductive_health_query():
    response = client.post("/query", json={"query": "BRCA1 and pregnancy risk"})
    assert response.status_code == 200
    assert "findings" in response.json()
    assert len(response.json()["findings"]) > 0

Creating Custom Milvus Collections with Dynamic Fields¶

Milvus supports dynamic fields for flexible schema evolution:

schema = CollectionSchema(fields, enable_dynamic_field=True)
collection = Collection("flexible_collection", schema)

# Insert with extra fields not in the schema
collection.insert([{
    "text": "Example document",
    "embedding": [0.1] * 384,
    "custom_score": 0.95,       # dynamic field
    "reviewer": "Dr. Smith"     # dynamic field
}])

Dynamic fields are stored as JSON and can be filtered in search expressions:

results = collection.search(
    data=[query_embedding],
    anns_field="embedding",
    param=search_params,
    limit=10,
    expr='custom_score > 0.8 and reviewer == "Dr. Smith"'
)

Integrating New Knowledge Sources¶

PubMed Integration

Use the NCBI E-utilities API to fetch and ingest PubMed abstracts:

import requests

def fetch_pubmed_abstracts(query: str, max_results: int = 100):
    # Search for PMIDs
    search_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    search_resp = requests.get(search_url, params={
        "db": "pubmed", "term": query, "retmax": max_results, "retmode": "json"
    })
    pmids = search_resp.json()["esearchresult"]["idlist"]

    # Fetch abstracts
    fetch_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
    fetch_resp = requests.get(fetch_url, params={
        "db": "pubmed", "id": ",".join(pmids), "rettype": "abstract", "retmode": "xml"
    })
    return parse_abstracts(fetch_resp.text)

ClinicalTrials.gov Integration

The ClinicalTrials.gov API v2 provides structured trial data:

def fetch_trials(condition: str, status: str = "RECRUITING"):
    url = "https://clinicaltrials.gov/api/v2/studies"
    params = {
        "query.cond": condition,
        "filter.overallStatus": status,
        "pageSize": 50,
        "fields": "NCTId,BriefTitle,EligibilityCriteria,InterventionName"
    }
    response = requests.get(url, params=params)
    return response.json()["studies"]

Ingest trial data into agent-specific collections for real-time trial matching.

Custom Clinical Workflows¶

Beyond the built-in workflows, you can define condition-specific analysis pipelines:

from hcls_common.workflow import WorkflowEngine

workflow = WorkflowEngine()

@workflow.step("variant_triage")
def triage_variants(context):
    """Filter and prioritize variants by clinical significance."""
    variants = context["variants"]
    return [v for v in variants if v["clinical_significance"] in ("Pathogenic", "Likely_pathogenic")]

@workflow.step("literature_search")
def search_literature(context):
    """Query PubMed and internal knowledge bases for supporting evidence."""
    genes = set(v["gene"] for v in context["triaged_variants"])
    evidence = {}
    for gene in genes:
        evidence[gene] = query_milvus(f"{gene} clinical significance treatment")
    return evidence

@workflow.step("generate_report")
def generate_report(context):
    """Produce a clinical summary using Claude."""
    prompt = build_clinical_prompt(context["triaged_variants"], context["evidence"])
    return call_claude(prompt)

# Execute
result = workflow.run(
    steps=["variant_triage", "literature_search", "generate_report"],
    initial_context={"variants": patient_variants}
)

Contributing to the Open-Source Project¶

The HCLS AI Factory is released under the Apache 2.0 License. Contributions are welcome.

Getting Started:

Fork the repository at https://github.com/ajones1923/hcls-ai-factory
Create a feature branch: git checkout -b feature/your-feature-name
Follow the existing code style (Black formatting, type hints, docstrings)
Add tests for any new functionality
Run the test suite: pytest tests/ -v
Submit a pull request with a clear description of changes

Areas Seeking Contributions:

New intelligence agents for underserved therapeutic areas
Improved embedding models for biomedical text
Additional safety filters for specific patient populations
Performance optimizations for large-scale variant sets
Documentation and tutorials
Internationalization of clinical content

Code Review Process:

All pull requests require:

Passing CI checks (linting, tests, type checking)
At least one maintainer review
No reduction in test coverage
Updated documentation if user-facing behavior changes

Clinical Decision Support Disclaimer

The HCLS AI Factory platform and its components are clinical decision support research tools. They are not FDA-cleared and are not intended as standalone diagnostic devices. All recommendations should be reviewed by qualified healthcare professionals. Apache 2.0 License.

HCLS AI Factory — Learning Guide: Unified Advanced¶

Part 1: Advanced Genomic Analysis¶

Custom Variant Filtering Strategies¶

Multi-Sample and Trio Analysis Patterns¶

Structural Variant Detection¶

ClinVar + AlphaMissense Annotation Deep Dive¶

VCF Quality Metrics Interpretation¶

Parabricks 4.6 Performance Tuning on DGX Spark¶

Part 2: Advanced RAG Architecture¶

Milvus 2.4 Collection Design Patterns¶

BGE-small-en-v1.5 Embedding Model¶

Multi-Collection Parallel Search Strategy¶

Query Expansion with Domain Keyword Maps¶

Claude LLM Prompt Engineering for Clinical RAG Synthesis¶

Cross-Modal Triggers Between Agents¶

Performance Optimization¶

Part 3: Deep Dive — All 11 Intelligence Agents¶

3.1 Precision Oncology Agent (:8527)¶

3.2 CAR-T Intelligence Agent (:8522)¶

3.3 Precision Biomarker Agent (:8529)¶

3.4 Cardiology Intelligence Agent (:8126)¶

3.5 Neurology Intelligence Agent (:8528)¶

3.6 Precision Autoimmune Agent (:8532)¶

3.7 Rare Disease Diagnostic Agent (:8134)¶

3.8 Pharmacogenomics Intelligence Agent (:8107)¶

3.9 Imaging Intelligence Agent (:8524)¶

3.10 Single-Cell Intelligence Agent (:8540)¶

3.11 Clinical Trial Intelligence Agent (:8538)¶

Part 4: Advanced Drug Discovery¶

The 10-Stage Therapeutic Discovery Pipeline¶

MolMIM Molecular Generation¶

DiffDock Binding Prediction¶

RDKit Chemical Property Analysis¶

Pediatric Safety Filters¶

Composite Scoring Methodology¶

VCP Demo Results¶

Part 5: Cross-Agent Coordination¶

Shared Genomic Evidence Architecture¶

Cross-Modal Trigger Examples¶

Multi-Agent Query Orchestration Patterns¶

Building Custom Cross-Agent Workflows¶

Part 6: Deployment & Operations¶

Docker Compose Architecture¶

Health Monitoring¶

Milvus Administration¶

Agent Startup Sequence¶

Performance Tuning¶

HTTPS via Caddy¶

Security¶

Part 7: Extending the Platform¶

Adding a New Intelligence Agent¶

Creating Custom Milvus Collections with Dynamic Fields¶

Integrating New Knowledge Sources¶

Custom Clinical Workflows¶

Contributing to the Open-Source Project¶