Cardiology Intelligence Agent: Architecture Design Document
Author: Adam Jones | Date: March 2026 | License: Apache 2.0
1. Executive Summary
The Cardiology Intelligence Agent extends the HCLS AI Factory platform to deliver RAG-powered cardiovascular clinical decision support. It synthesizes cardiac imaging, electrophysiology, hemodynamics, heart failure management, valvular disease, preventive cardiology, interventional data, and cardio-oncology surveillance into guideline-aligned clinical recommendations using ACC/AHA/ESC evidence.
The system implements 6 validated cardiovascular risk calculators (ASCVD, MAGGIC, EuroSCORE II, CHA2DS2-VASc, HAS-BLED, HEART), optimizes guideline-directed medical therapy (GDMT) for heart failure across 7 therapy classes, and provides 11 clinical workflows covering the highest-impact cardiovascular use cases — all backed by 13 Milvus vector collections containing cardiac imaging protocols, ECG data, hemodynamic measurements, genomic-cardiac correlations, and clinical guidelines.
The platform enables cross-modal queries like "This patient has an LVEF of 28%, LBBB on ECG, and a LMNA pathogenic variant — what is the optimal management strategy?" that simultaneously search imaging protocols, guideline recommendations, genomic correlations, and clinical evidence.
Key Results
| Metric | Value |
|---|---|
| Total Python LOC | 28,189 |
| Milvus collections | 13 (12 cardiology-specific + 1 shared genomic_evidence) |
| Cardiovascular conditions | 45 in knowledge graph |
| Cardiac biomarkers | 29 |
| Drug classes | 32 |
| Cardiovascular genes | 56 |
| Imaging protocols | 27 (12 echo, 7 CMR, 5 nuclear, 3 CT) |
| Guideline documents | 20 (ACC/AHA/ESC) |
| Guideline recommendations | 63 structured with class/level of evidence |
| Risk calculators | 6 validated scoring systems |
| Clinical workflows | 11 |
| Cross-modal imaging triggers | 18 genomic trigger patterns |
| GDMT therapy classes | 7 (including finerenone, omecamtiv, sotagliflozin) |
| Entity aliases | 167 abbreviation mappings |
| Test suite | 1,966 tests (100% pass, <1.2s runtime) |
2. Architecture Overview
2.1 Mapping to VAST AI OS
| VAST AI OS Component | Cardiology Agent Role |
|---|---|
| DataStore | Raw files: PubMed XML, ClinicalTrials.gov JSON, imaging protocol specs, ECG templates, guideline PDFs |
| DataEngine | 7 ingest parsers (PubMed, trials, imaging, ECG, guideline, device, hemodynamics) |
| DataBase | 13 Milvus collections + knowledge graph (45 conditions, 56 genes, 29 biomarkers, 32 drug classes) |
| InsightEngine | BGE-small embedding + multi-collection RAG + 6 risk calculators + GDMT optimizer + cross-modal triggers |
| AgentEngine | CardiologyAgent orchestrator + Streamlit UI (10 tabs) + FastAPI REST |
2.2 System Diagram
                         EXTERNAL USERS
                                |
                     +----------+----------+
                     |                     |
               +-----+------+       +------+-----+
               | Streamlit  |       | REST API   |
               | UI :8536   |       | :8126      |
               +-----+------+       +------+-----+
                     |                     |
                     +----------+----------+
                                |
                     +----------+----------+
                     | Agent Orchestrator  |
                     +----------+----------+
                                |
           +--------------------+--------------------+
           |                    |                    |
    +------+------+     +-------+-------+    +-------+-------+
    | Query       |     | Workflow      |    | Clinical      |
    | Expansion   |     | Engine (11)   |    | Engines       |
    | 167 aliases |     |               |    |               |
    +------+------+     +-------+-------+    | 6 Risk Calcs  |
           |                    |            | GDMT Optimizer|
           |                    |            | Cross-Modal   |
           |                    |            +-------+-------+
           +--------------------+--------------------+
                                |
                     +----------+----------+
                     | RAG Engine          |
                     +----------+----------+
                                |
           +--------------------+--------------------+
           |                    |                    |
    +------+------+     +-------+-------+    +-------+-------+
    | Knowledge   |     | Milvus        |    | LLM           |
    | Graph       |     | Vector DB     |    | (Claude 4.6)  |
    | 45 conds    |     | 13 Collections|    |               |
    | 56 genes    |     |               |    |               |
    +-------------+     +---------------+    +---------------+
3. Data Collections: Actual State
3.1 Collection Catalog
| # | Collection | Est. Records | Weight | Primary Use |
|---|---|---|---|---|
| 1 | `cardio_literature` | 8,000 | 0.12 | Evidence synthesis, guideline grounding |
| 2 | `cardio_trials` | 3,000 | 0.10 | Trial evidence, outcome data |
| 3 | `cardio_imaging` | 2,500 | 0.10 | Echo, CMR, nuclear, CT protocols |
| 4 | `cardio_ecg` | 1,500 | 0.08 | ECG patterns, arrhythmia recognition |
| 5 | `cardio_hemodynamics` | 1,000 | 0.07 | Catheterization, pressure-volume data |
| 6 | `cardio_guidelines` | 1,200 | 0.10 | ACC/AHA/ESC recommendations |
| 7 | `cardio_drugs` | 800 | 0.08 | GDMT protocols, drug interactions |
| 8 | `cardio_devices` | 600 | 0.06 | ICD, CRT, LVAD, TAVR data |
| 9 | `cardio_genomics` | 1,500 | 0.08 | Cardiac gene panels, channelopathies |
| 10 | `cardio_biomarkers` | 500 | 0.06 | Troponin, BNP, hs-CRP interpretation |
| 11 | `cardio_risk_scores` | 200 | 0.05 | Validation cohort data for calculators |
| 12 | `cardio_rehab` | 400 | 0.04 | Cardiac rehabilitation protocols |
| 13 | `genomic_evidence` | 3,560,000 | 0.06 | Shared genomic variant context |
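The weights in the catalog sum to 1.00, which suggests they act as per-collection multipliers when hits from different collections are merged. The following is a minimal sketch of that idea, assuming a simple multiplicative re-ranking. The function `merge_hits` and the tuple layout are hypothetical and are not taken from `rag_engine.py`.

```python
# Collection weights copied from the catalog above.
COLLECTION_WEIGHTS = {
    "cardio_literature": 0.12, "cardio_trials": 0.10, "cardio_imaging": 0.10,
    "cardio_ecg": 0.08, "cardio_hemodynamics": 0.07, "cardio_guidelines": 0.10,
    "cardio_drugs": 0.08, "cardio_devices": 0.06, "cardio_genomics": 0.08,
    "cardio_biomarkers": 0.06, "cardio_risk_scores": 0.05, "cardio_rehab": 0.04,
    "genomic_evidence": 0.06,
}


def merge_hits(hits, top_k=5):
    """Re-rank (collection, doc_id, cosine_score) tuples by weight * score.

    Assumption: the weighting is multiplicative; the real engine may differ.
    """
    ranked = sorted(
        hits,
        key=lambda hit: COLLECTION_WEIGHTS[hit[0]] * hit[2],
        reverse=True,
    )
    return ranked[:top_k]


# Illustrative hits only; the scores are made up for the example.
hits = [
    ("cardio_rehab", "r1", 0.95),
    ("cardio_literature", "l1", 0.80),
    ("cardio_guidelines", "g1", 0.90),
]
top_two = merge_hits(hits, top_k=2)
print([doc_id for _, doc_id, _ in top_two])
```

With these inputs the literature and guideline hits outrank the rehab hit even though the rehab hit has the highest raw cosine score, because its collection weight is the lowest in the catalog.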
3.2 Index Configuration (all collections)
| Parameter | Value |
|---|---|
| Index type | IVF_FLAT |
| Metric | COSINE |
| nlist | 1024 (literature, trials), 256 (others) |
| nprobe | 16 |
| Embedding dim | 384 (BGE-small-en-v1.5) |
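The table maps directly onto the parameter dictionaries that pymilvus accepts for index creation and search. The sketch below only builds those dictionaries and does not connect to Milvus. The helper name `index_params` and the constant names are assumptions for illustration.

```python
# nlist is larger for the two biggest text collections, per the table above.
LARGE_NLIST_COLLECTIONS = {"cardio_literature", "cardio_trials"}

EMBEDDING_DIM = 384  # BGE-small-en-v1.5


def index_params(collection_name: str) -> dict:
    """Build an IVF_FLAT / COSINE index definition for one collection."""
    nlist = 1024 if collection_name in LARGE_NLIST_COLLECTIONS else 256
    return {
        "index_type": "IVF_FLAT",
        "metric_type": "COSINE",
        "params": {"nlist": nlist},
    }


# Search-time parameters: probe 16 of the nlist clusters per query.
SEARCH_PARAMS = {"metric_type": "COSINE", "params": {"nprobe": 16}}

print(index_params("cardio_trials"))
print(index_params("cardio_ecg"))
```

With a live connection, these dictionaries would typically be passed to `Collection.create_index(field_name, index_params=...)` and as the `param` argument of `Collection.search(...)`. The vector field name depends on the schema in `collections.py` and is not specified here.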
4. Risk Calculators
4.1 ASCVD 10-Year Risk (Pooled Cohort Equations)
| Parameter | Input |
|---|---|
| Age | 40-79 years |
| Sex | Male / Female |
| Race | White / African American |
| Total cholesterol | mg/dL |
| HDL cholesterol | mg/dL |
| Systolic BP | mmHg |
| BP treatment | Yes / No |
| Diabetes | Yes / No |
| Smoking | Yes / No |
| Output | 10-year ASCVD risk (%), risk category (low/borderline/intermediate/high) |
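The input range and the four output categories can be sketched as follows. The risk equation itself is not reproduced here. The cut-points (<5%, 5 to <7.5%, 7.5 to <20%, 20% and above) are the ones used in ACC/AHA primary-prevention guidance, and it is an assumption that `risk_calculators.py` uses the same boundaries. Both function names are hypothetical.

```python
def validate_age(age: int) -> None:
    """The Pooled Cohort Equations are defined for ages 40-79 only."""
    if not 40 <= age <= 79:
        raise ValueError(f"age {age} is outside the validated 40-79 range")


def ascvd_category(risk_pct: float) -> str:
    """Map a 10-year risk percentage to one of the four output categories."""
    if risk_pct < 5.0:
        return "low"
    if risk_pct < 7.5:
        return "borderline"
    if risk_pct < 20.0:
        return "intermediate"
    return "high"


validate_age(55)
print(ascvd_category(6.2))
```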
4.2 MAGGIC Heart Failure Mortality
| Field | Details |
|---|---|
| Inputs (13 variables) | Age, sex, LVEF, NYHA class, SBP, BMI, creatinine, smoking, diabetes, COPD, HF duration, beta-blocker, ACE-I/ARB |
| Output | 1-year and 3-year mortality risk (%), risk stratification |
4.3 EuroSCORE II (Cardiac Surgery)
Logistic regression model with 18 variables predicting operative mortality for cardiac surgery. Covers patient factors (age, sex, renal function, extracardiac arteriopathy, mobility, cardiac surgery history, chronic lung disease, endocarditis, neurological status, diabetes on insulin) and cardiac factors (NYHA class, CCS angina, LVEF, recent MI, pulmonary hypertension, urgency, weight of procedure, surgery on thoracic aorta).
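A logistic model of this kind turns a weighted sum of encoded inputs into a probability via p = 1 / (1 + e^(-y)), where y is the intercept plus the sum of coefficient times value. The sketch below shows only that mechanism. The coefficients are deliberately not included and must come from the published model, and the function name `logistic_mortality` is hypothetical.

```python
import math


def logistic_mortality(intercept: float, coefficients: dict, patient: dict) -> float:
    """Return predicted mortality in [0, 1] from a logistic model.

    coefficients: variable name -> beta (supplied by the caller)
    patient:      variable name -> encoded value (0/1 flags or transformed numbers)
    """
    missing = set(coefficients) - set(patient)
    if missing:
        raise ValueError(f"missing inputs: {sorted(missing)}")
    linear = intercept + sum(
        beta * patient[name] for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))
```

Requiring every coefficient to have a matching input makes a silently omitted variable an error instead of an implicit zero, which matters when all 18 variables are expected.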
4.4 CHA2DS2-VASc (Stroke Risk in AF)
| Component | Points |
|---|---|
| Congestive heart failure | 1 |
| Hypertension | 1 |
| Age >= 75 | 2 |
| Diabetes mellitus | 1 |
| Stroke/TIA/thromboembolism | 2 |
| Vascular disease | 1 |
| Age 65-74 | 1 |
| Sex category (female) | 1 |
| Output | Score (0-9), annual stroke risk (%), anticoagulation recommendation |
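The point table translates into a short scoring function. This is a sketch of the arithmetic only; the function and parameter names are hypothetical, and the mapping from score to annual stroke risk and anticoagulation advice is not shown.

```python
def cha2ds2_vasc(*, age: int, female: bool, chf: bool, hypertension: bool,
                 diabetes: bool, stroke_tia_te: bool, vascular_disease: bool) -> int:
    """Sum CHA2DS2-VASc points using the component table above."""
    # One point each: CHF, hypertension, diabetes, vascular disease, female sex.
    score = sum([chf, hypertension, diabetes, vascular_disease, female])
    # Two points for prior stroke/TIA/thromboembolism.
    score += 2 if stroke_tia_te else 0
    # Age bands are mutually exclusive: 2 points at 75+, 1 point at 65-74.
    if age >= 75:
        score += 2
    elif age >= 65:
        score += 1
    return score


example = cha2ds2_vasc(age=70, female=True, chf=False, hypertension=True,
                       diabetes=False, stroke_tia_te=False, vascular_disease=False)
print(example)
```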
4.5 HAS-BLED (Bleeding Risk)
| Component | Points |
|---|---|
| Hypertension (uncontrolled, SBP >160) | 1 |
| Abnormal renal/liver function | 1-2 |
| Stroke history | 1 |
| Bleeding predisposition | 1 |
| Labile INR | 1 |
| Elderly (>65) | 1 |
| Drugs/alcohol | 1-2 |
| Output | Score (0-9), annual major bleeding risk (%), risk category |
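The two "1-2" rows each combine two independent one-point items (renal and liver function; drugs and alcohol), so the score is a count of nine yes/no flags. A minimal sketch with hypothetical parameter names follows; the bleeding-risk percentages and categories are not shown.

```python
def has_bled(*, uncontrolled_htn: bool, abnormal_renal: bool, abnormal_liver: bool,
             stroke: bool, bleeding: bool, labile_inr: bool, age: int,
             drugs: bool, alcohol: bool) -> int:
    """Count HAS-BLED points: one point per positive item, maximum 9."""
    flags = (
        uncontrolled_htn,   # H: SBP > 160
        abnormal_renal,     # A: renal function
        abnormal_liver,     # A: liver function
        stroke,             # S
        bleeding,           # B: history or predisposition
        labile_inr,         # L
        age > 65,           # E: elderly
        drugs,              # D: antiplatelets / NSAIDs
        alcohol,            # D: alcohol
    )
    return sum(bool(flag) for flag in flags)


example = has_bled(uncontrolled_htn=True, abnormal_renal=False, abnormal_liver=False,
                   stroke=False, bleeding=False, labile_inr=False, age=72,
                   drugs=False, alcohol=False)
print(example)
```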
4.6 HEART Score (Chest Pain)
| Component | Points |
|---|---|
| History | 0-2 |
| ECG | 0-2 |
| Age | 0-2 |
| Risk factors | 0-2 |
| Troponin | 0-2 |
| Output | Score (0-10), 6-week MACE risk (%), disposition recommendation (discharge / observe / intervene) |
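A sketch of the summation and disposition step is shown below. It assumes the conventional HEART bands (0-3 low, 4-6 moderate, 7-10 high) and the conventional age bands (under 45, 45-64, 65 and over); whether the agent uses exactly these cut-offs is an assumption. History, ECG, risk-factor and troponin points are taken as already graded 0-2. All names are hypothetical.

```python
def heart_age_points(age: int) -> int:
    """Age component: 0 below 45, 1 for 45-64, 2 for 65 and over."""
    if age >= 65:
        return 2
    if age >= 45:
        return 1
    return 0


def heart_score(history: int, ecg: int, age: int, risk_factors: int, troponin: int):
    """Return (total, disposition) from four graded components plus age in years."""
    graded = {"history": history, "ecg": ecg,
              "risk_factors": risk_factors, "troponin": troponin}
    for name, points in graded.items():
        if points not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1 or 2, got {points}")
    total = sum(graded.values()) + heart_age_points(age)
    if total <= 3:
        disposition = "discharge"
    elif total <= 6:
        disposition = "observe"
    else:
        disposition = "intervene"
    return total, disposition


print(heart_score(history=2, ecg=0, age=58, risk_factors=2, troponin=0))
```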
5. GDMT Optimizer
The GDMT Optimizer implements guideline-directed medical therapy titration for heart failure with reduced ejection fraction (HFrEF) across 7 therapy classes:
| Therapy Class | Target | Key Agents |
|---|---|---|
| ACEi/ARB/ARNI | Blood pressure, remodeling | Sacubitril/valsartan (ARNI preferred) |
| Beta-blocker | Heart rate, remodeling | Carvedilol, metoprolol succinate, bisoprolol |
| MRA | Potassium-sparing diuresis | Spironolactone, eplerenone |
| SGLT2i | Cardiovascular mortality | Dapagliflozin, empagliflozin |
| Finerenone | Non-steroidal MRA (CKD+HF) | Finerenone (FIDELIO/FIGARO evidence) |
| Omecamtiv mecarbil | Cardiac myosin activator | Omecamtiv (GALACTIC-HF evidence) |
| Sotagliflozin | Dual SGLT1/2 inhibitor | Sotagliflozin (SOLOIST/SCORED evidence) |
The optimizer checks contraindications, current doses vs. target doses, lab monitoring requirements (potassium, creatinine, eGFR), and generates a step-by-step titration plan.
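The checks described above can be pictured as a per-class decision: hold if contraindicated, start if absent, uptitrate if below target, otherwise no change. The sketch below is a simplified illustration and not the logic in `gdmt_optimizer.py`. The `Therapy` structure and function names are hypothetical, lab gating is reduced to a pre-computed contraindication string, and the dose figures in the example are illustrative and should be checked against current guideline tables.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Therapy:
    therapy_class: str
    agent: str
    current_mg: float               # total daily dose; 0 means not prescribed
    target_mg: float                # total daily target dose
    contraindication: Optional[str] = None  # set by upstream lab/history checks


def titration_step(t: Therapy) -> str:
    """Return one plan line for a single therapy class."""
    if t.contraindication:
        return f"{t.therapy_class}: hold ({t.contraindication})"
    if t.current_mg <= 0:
        return f"{t.therapy_class}: start {t.agent}"
    if t.current_mg < t.target_mg:
        pct = round(100 * t.current_mg / t.target_mg)
        return f"{t.therapy_class}: uptitrate {t.agent} ({pct}% of target)"
    return f"{t.therapy_class}: at target"


def titration_plan(therapies: List[Therapy]) -> List[str]:
    return [titration_step(t) for t in therapies]


plan = titration_plan([
    Therapy("Beta-blocker", "metoprolol succinate", current_mg=50, target_mg=200),
    Therapy("SGLT2i", "dapagliflozin", current_mg=0, target_mg=10),
    Therapy("MRA", "spironolactone", current_mg=0, target_mg=25,
            contraindication="potassium above threshold"),
])
for line in plan:
    print(line)
```

Checking the contraindication first means a class on hold never produces a start or uptitrate step, regardless of the current dose.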
6. Clinical Workflows
| # | Workflow | Clinical Question |
|---|---|---|
| 1 | Chest Pain Triage | "Risk-stratify this chest pain presentation using HEART score" |
| 2 | Heart Failure Management | "Optimize GDMT for this HFrEF patient" |
| 3 | Atrial Fibrillation | "CHA2DS2-VASc and HAS-BLED for anticoagulation decision" |
| 4 | Valvular Assessment | "Is this patient a TAVR or surgical AVR candidate?" |
| 5 | Cardiac Imaging | "Which imaging modality is optimal for this clinical scenario?" |
| 6 | Preventive Cardiology | "ASCVD risk and statin therapy recommendation" |
| 7 | Cardiomyopathy Genetics | "Evaluate this DCM/HCM genetic panel result" |
| 8 | Device Therapy | "Does this patient meet criteria for ICD or CRT?" |
| 9 | Perioperative Risk | "Estimate surgical risk with EuroSCORE II" |
| 10 | Cardio-Oncology | "Monitor for cardiotoxicity during anthracycline therapy" |
| 11 | Cardiac Rehabilitation | "Design a rehab protocol for post-CABG recovery" |
7. Cross-Modal Imaging Triggers
The agent detects 18 genomic patterns that trigger specific cardiac imaging recommendations. The table lists representative examples:
| Genomic Finding | Triggered Imaging | Rationale |
|---|---|---|
| LMNA pathogenic variant | CMR with late gadolinium enhancement | DCM + conduction disease risk |
| MYH7/MYBPC3 variant | Echocardiogram + CMR | HCM screening |
| TTN truncating variant | Echocardiogram + CMR | DCM assessment |
| SCN5A variant | 12-lead ECG + signal-averaged ECG | Brugada syndrome |
| KCNQ1/KCNH2 variant | 12-lead ECG + exercise stress | Long QT syndrome |
| PKP2/DSP variant | CMR + signal-averaged ECG | ARVC evaluation |
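The rows above amount to a lookup from gene symbol to a recommended study and rationale. A minimal sketch of that lookup follows, keyed on gene symbol only. That is a simplification, since the table qualifies some rows by variant type (pathogenic, truncating), and the names `IMAGING_TRIGGERS` and `imaging_for_variants` are hypothetical.

```python
# (recommended study, rationale) per gene, copied from the table above.
IMAGING_TRIGGERS = {
    "LMNA": ("CMR with late gadolinium enhancement", "DCM + conduction disease risk"),
    "MYH7": ("Echocardiogram + CMR", "HCM screening"),
    "MYBPC3": ("Echocardiogram + CMR", "HCM screening"),
    "TTN": ("Echocardiogram + CMR", "DCM assessment"),
    "SCN5A": ("12-lead ECG + signal-averaged ECG", "Brugada syndrome"),
    "KCNQ1": ("12-lead ECG + exercise stress", "Long QT syndrome"),
    "KCNH2": ("12-lead ECG + exercise stress", "Long QT syndrome"),
    "PKP2": ("CMR + signal-averaged ECG", "ARVC evaluation"),
    "DSP": ("CMR + signal-averaged ECG", "ARVC evaluation"),
}


def imaging_for_variants(genes):
    """Return (gene, study, rationale) for every gene that has a trigger."""
    recommendations = []
    for gene in genes:
        symbol = gene.upper()
        hit = IMAGING_TRIGGERS.get(symbol)
        if hit is not None:
            recommendations.append((symbol, *hit))
    return recommendations


print(imaging_for_variants(["lmna", "BRCA1", "SCN5A"]))
```

Genes without an entry are skipped silently here; a production version would more likely log them so that unmatched findings are visible.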
8. Multi-Collection RAG Engine
8.1 Search Flow
User Query: "Optimize GDMT for 62M, LVEF 25%, NYHA III, CKD stage 3"
│
├── 1. Embed query with BGE asymmetric prefix [< 5 ms]
│
├── 2. Parallel search across 13 collections (top-5 each) [12-18 ms]
│ ├── cardio_guidelines: HFrEF GDMT recommendations (score: 0.84-0.92)
│ ├── cardio_drugs: ARNI/BB/MRA/SGLT2i protocols (score: 0.80-0.88)
│ ├── cardio_literature: DAPA-HF, EMPEROR-Reduced data (score: 0.78-0.86)
│ ├── cardio_biomarkers: BNP/NT-proBNP monitoring (score: 0.72-0.80)
│ └── cardio_trials: HFrEF landmark trials (score: 0.70-0.82)
│
├── 3. Query expansion + knowledge augmentation [< 1 ms]
│
├── 4. GDMT Optimizer: titration plan generation [< 50 ms]
│
└── 5. Stream Claude Sonnet 4.6 response [~22-26 sec]
        GDMT plan with dose targets, monitoring schedule,
        CKD-specific adjustments, and guideline citations
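Steps 1 and 2 of the flow can be sketched as a prefix-then-fan-out pattern. The prefix string is the query instruction published for the BGE v1.5 English retrieval models. Everything else is an assumption for illustration: `fan_out_search` is a hypothetical helper, and `fake_search` stands in for the embed-and-search call against Milvus.

```python
from concurrent.futures import ThreadPoolExecutor

# Query-side instruction for BGE v1.5 English models (passages are embedded without it).
BGE_QUERY_PREFIX = "Represent this sentence for searching relevant passages: "


def fan_out_search(query, collections, search_fn, top_k=5):
    """Run search_fn(collection, prefixed_query, top_k) for each collection concurrently.

    search_fn is a hypothetical callable returning a list of (doc_id, score).
    """
    prefixed = BGE_QUERY_PREFIX + query
    with ThreadPoolExecutor(max_workers=max(1, len(collections))) as pool:
        futures = {
            name: pool.submit(search_fn, name, prefixed, top_k)
            for name in collections
        }
        # Collect results per collection; result() re-raises any search error.
        return {name: future.result() for name, future in futures.items()}


def fake_search(collection, text, top_k):
    """Stand-in for a Milvus search; returns one dummy hit per collection."""
    return [(f"{collection}-doc0", 0.5)][:top_k]


results = fan_out_search(
    "Optimize GDMT for 62M, LVEF 25%, NYHA III, CKD stage 3",
    ["cardio_guidelines", "cardio_drugs"],
    fake_search,
)
print(results)
```

Threads suit this step because each search is an I/O-bound network call, so the total latency approaches that of the slowest collection and not the sum of all thirteen.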
9. Performance Benchmarks
Measured on NVIDIA DGX Spark (GB10 GPU, 128GB unified LPDDR5x memory, 20 ARM cores).
9.1 Risk Calculator Performance
| Calculator | Latency | Validated Against |
|---|---|---|
| ASCVD (Pooled Cohort) | <10 ms | ACC/AHA risk calculator |
| MAGGIC | <15 ms | Original MAGGIC publication |
| EuroSCORE II | <20 ms | euroscore.org |
| CHA2DS2-VASc | <5 ms | ESC guidelines |
| HAS-BLED | <5 ms | ESC guidelines |
| HEART | <5 ms | HEART Pathway validation study |
| All 6 calculators | <60 ms | |
9.2 RAG Query Performance
| Operation | Latency |
|---|---|
| Full query (retrieve + Claude generate) | ~24 sec |
| Streaming query (time to first token) | ~3 sec |
| 13-collection parallel search | 12-18 ms |
| GDMT optimization | <50 ms |
10. Infrastructure
10.1 Technology Stack
| Component | Technology |
|---|---|
| Language | Python 3.10+ |
| Vector DB | Milvus 2.4, localhost:19530 |
| Embeddings | BGE-small-en-v1.5 (BAAI) — 384-dim |
| LLM | Claude Sonnet 4.6 (Anthropic API) |
| Web UI | Streamlit (port 8536, NVIDIA black/green theme) |
| REST API | FastAPI + Uvicorn (port 8126) |
| Configuration | Pydantic BaseSettings |
| Hardware target | NVIDIA DGX Spark (GB10 GPU, 128GB unified, $4,699) |
10.2 Service Ports
| Port | Service |
|---|---|
| 8126 | FastAPI REST API |
| 8536 | Streamlit Chat UI |
| 19530 | Milvus vector database (shared) |
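The ports and endpoints above are the kind of values that `config/settings.py` would expose. The project uses Pydantic BaseSettings; the sketch below swaps in a standard-library dataclass so that it runs without extra dependencies. The `CARDIO_*` environment variable names are hypothetical, and only `ANTHROPIC_API_KEY` appears in this document.

```python
import os
from dataclasses import dataclass, field


def _env(name: str, default: str) -> str:
    """Read an environment variable, falling back to the documented default."""
    return os.environ.get(name, default)


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the stack and port tables; CARDIO_* names are assumptions.
    milvus_host: str = field(default_factory=lambda: _env("CARDIO_MILVUS_HOST", "localhost"))
    milvus_port: int = field(default_factory=lambda: int(_env("CARDIO_MILVUS_PORT", "19530")))
    api_port: int = field(default_factory=lambda: int(_env("CARDIO_API_PORT", "8126")))
    ui_port: int = field(default_factory=lambda: int(_env("CARDIO_UI_PORT", "8536")))
    anthropic_api_key: str = field(default_factory=lambda: _env("ANTHROPIC_API_KEY", ""))


settings = Settings()
print(settings.milvus_host, settings.milvus_port, settings.api_port, settings.ui_port)
```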
10.3 Dependencies on HCLS AI Factory
| Dependency | Usage |
|---|---|
| Milvus 2.4 instance | Shared vector database — adds 12 owned collections alongside existing genomic_evidence (3.56M vectors, read-only) |
| `ANTHROPIC_API_KEY` | Shared Anthropic API key |
| BGE-small-en-v1.5 | Same embedding model as main RAG pipeline |
11. Knowledge Graph
11.1 Cardiovascular Conditions (45 entries)
Organized into 8 categories: heart failure (8), arrhythmia (7), coronary artery disease (5), valvular disease (6), cardiomyopathy (5), congenital heart disease (4), vascular disease (5), pericardial disease (5).
11.2 Cardiac Genes (56 entries)
| Category | Count | Key Genes |
|---|---|---|
| Cardiomyopathy | 18 | MYH7, MYBPC3, TTN, LMNA, DES, PLN, RBM20, FLNC |
| Arrhythmia (channelopathy) | 14 | SCN5A, KCNQ1, KCNH2, KCNJ2, RYR2, CASQ2, CALM1-3 |
| ARVC | 6 | PKP2, DSP, DSG2, DSC2, JUP, TMEM43 |
| Aortopathy | 5 | FBN1, TGFBR1, TGFBR2, SMAD3, ACTA2 |
| Lipid metabolism | 5 | LDLR, PCSK9, APOB, LDLRAP1, ABCG5 |
| Cardiac development | 4 | NKX2-5, GATA4, TBX5, TBX20 |
| Other | 4 | TNNT2, TNNI3, ACTC1, MYL2 |
11.3 Guidelines (63 structured recommendations)
All recommendations include ACC/AHA class (I, IIa, IIb, III) and level of evidence (A, B-R, B-NR, C-LD, C-EO).
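A structured recommendation can be represented as a small validated record. The project uses Pydantic models and enums in `models.py`; the sketch below uses a standard-library dataclass as a stand-in, with hypothetical field names. Only the permitted class and level-of-evidence values come from the text above.

```python
from dataclasses import dataclass

VALID_CLASSES = ("I", "IIa", "IIb", "III")
VALID_LOE = ("A", "B-R", "B-NR", "C-LD", "C-EO")


@dataclass(frozen=True)
class Recommendation:
    text: str
    cor: str   # class of recommendation
    loe: str   # level of evidence

    def __post_init__(self):
        # Reject values outside the ACC/AHA vocabulary listed above.
        if self.cor not in VALID_CLASSES:
            raise ValueError(f"unknown class of recommendation: {self.cor!r}")
        if self.loe not in VALID_LOE:
            raise ValueError(f"unknown level of evidence: {self.loe!r}")

    def label(self) -> str:
        return f"Class {self.cor}, LOE {self.loe}"


rec = Recommendation(text="(placeholder recommendation text)", cor="IIa", loe="B-NR")
print(rec.label())
```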
12. Demo Scenarios
12.1 Validated Demo Queries
1. "62-year-old male, LVEF 25%, NYHA Class III, on lisinopril and metoprolol; optimize GDMT"
   - GDMT Optimizer: Switch lisinopril to sacubitril/valsartan, add dapagliflozin, add spironolactone
   - Titration plan with target doses, lab monitoring schedule
2. "Calculate ASCVD risk: 55F, TC 240, HDL 45, SBP 138, on BP meds, non-diabetic, non-smoker, white"
   - ASCVD Calculator: 10-year risk with statin recommendation per ACC/AHA guidelines
3. "New-onset AF, CHA2DS2-VASc 4, HAS-BLED 2; anticoagulation strategy"
   - Dual calculator with anticoagulation recommendation, DOAC vs. warfarin guidance
4. "LMNA p.R190W variant found in DCM patient; what cardiac workup is needed?"
   - Cross-modal trigger: CMR + ICD evaluation + family screening cascade
   - Genomic knowledge: LMNA → high risk of sudden cardiac death, conduction disease
5. "Chest pain, HEART score inputs: typical history, normal ECG, age 58, 3 risk factors, troponin normal"
   - HEART Score: Score calculation with 6-week MACE risk and disposition recommendation
13. File Structure (Actual)
cardiology_intelligence_agent/
├── src/
│ ├── agent.py # Agent orchestrator
│ ├── models.py # Pydantic models + 16 enums
│ ├── collections.py # 13 Milvus collection schemas
│ ├── rag_engine.py # Multi-collection RAG (1,589 LOC)
│ ├── clinical_workflows.py # 11 workflows (2,445 LOC)
│ ├── risk_calculators.py # 6 validated calculators (2,397 LOC)
│ ├── gdmt_optimizer.py # GDMT titration engine (2,457 LOC)
│ ├── cross_modal.py # 18 imaging trigger patterns (1,734 LOC)
│ ├── knowledge.py # Knowledge graph (1,431 LOC)
│ ├── query_expansion.py # 167 aliases (2,025 LOC)
│ ├── metrics.py # Prometheus metrics
│ ├── export.py # Report generation
│ └── ingest/
│ ├── pubmed_parser.py
│ ├── trials_parser.py
│ ├── imaging_parser.py
│ ├── ecg_parser.py
│ ├── guideline_parser.py
│ ├── device_parser.py
│ └── hemodynamics_parser.py
├── app/
│ └── cardio_ui.py # Streamlit (10 tabs, NVIDIA theme)
├── api/
│ └── main.py # FastAPI REST server
├── config/
│ └── settings.py # Pydantic BaseSettings
├── tests/ # 1,966 tests
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
└── README.md
43 files | ~28,189 lines of code | Apache 2.0
14. Implementation Status
| Phase | Status | Details |
|---|---|---|
| Phase 1: Architecture | Complete | 13 collections, knowledge graph, 6 risk calculators, GDMT optimizer, 11 workflows |
| Phase 2: Data | Complete | 45 conditions, 56 genes, 29 biomarkers, 32 drug classes, 27 imaging protocols, 63 recommendations |
| Phase 3: RAG Integration | Complete | Multi-collection parallel search, Claude Sonnet 4.6 streaming |
| Phase 4: Testing | Complete | 1,966 tests, 100% pass, <1.2s runtime |
| Phase 5: UI + Demo | Complete | 10-tab Streamlit UI, 5 demo scenarios validated |
15. Relationship to HCLS AI Factory
The Cardiology Intelligence Agent demonstrates the multi-modal clinical extension of the HCLS AI Factory architecture, integrating imaging, electrophysiology, hemodynamics, and genomics into a single decision support platform.
- Same Milvus instance: 12 new owned collections alongside existing `genomic_evidence` (3.56M vectors, read-only)
- Same embedding model: BGE-small-en-v1.5 (384-dim)
- Same LLM: Claude via Anthropic API
- Same hardware: NVIDIA DGX Spark ($4,699)
- Same patterns: Pydantic models, BaseIngestPipeline, knowledge graph, query expansion
The cross-modal trigger system uniquely bridges genomics (Stage 1) and cardiac imaging, enabling queries like "This TTN truncating variant patient needs echocardiographic follow-up" — connecting molecular findings to clinical imaging protocols automatically.
16. Credits
- Adam Jones
- Apache 2.0 License
Clinical Decision Support Disclaimer
The Cardiology Intelligence Agent is a clinical decision support research tool for cardiovascular medicine. It is not FDA-cleared and is not intended as a standalone diagnostic device. All recommendations should be reviewed by qualified healthcare professionals. Risk calculator outputs should be validated against institutional protocols. Apache 2.0 License.