Skip to content

HCLS AI Factory — Executive Bullets

One-page reference for executives, stakeholders, and demo audiences.

License: Apache 2.0 | Date: March 2026


What It Is

The HCLS AI Factory transforms patient DNA into ranked novel drug candidates in under 5 hours on a single NVIDIA DGX Spark ($4,699). Three GPU-accelerated engines -- Genomic Foundation, Precision Intelligence (11 agents), and Therapeutic Discovery -- run end-to-end with no manual intervention. Eleven domain-specialized intelligence agents provide comprehensive clinical decision support across oncology, cardiology, neurology, rare disease, pharmacogenomics, autoimmune disease, medical imaging, CAR-T therapy, biomarker analysis, single-cell genomics, and clinical trial operations.


The Problem

  • CPU-based genomics pipelines take 12-36 hours for a single 30x WGS sample
  • Variant annotation is fragmented across disconnected databases and manual curation
  • The gap from identified variant to drug lead compound is months of manual work
  • Clinical decision support is siloed by specialty -- no integrated platform connects genomics, clinical reasoning, and drug discovery
  • Access requires $100K+ infrastructure and multiple specialist teams

The Solution — Three Engines, 11 Agents

Engine 1: Genomic Foundation Engine (120-240 min)

  • NVIDIA Parabricks 4.6 -- 10-20x faster than CPU
  • BWA-MEM2 alignment: 20-45 min (vs. 12-24 hours on CPU)
  • Google DeepVariant: 10-35 min, >99% accuracy
  • Input: ~200 GB FASTQ (30x WGS, HG002)
  • Output: ~11.7 million variants, 3.56 million annotated variant embeddings in Milvus

Engine 2: Precision Intelligence Engine (Interactive)

  • 10 intelligence agents sharing read-only access to 3.56M annotated variant vectors
  • 139 Milvus collections containing ~47,691 agent-owned vectors across all domains
  • Anthropic Claude (RAG-grounded reasoning) powers each agent
  • 201 genes across 13 therapeutic areas, 171 druggable targets (85%)
  • Output: Validated target gene with full evidence chain, clinical reports (PDF, FHIR R4)

The 8 Specialized Agents:

Agent Key Capabilities
Precision Oncology Molecular tumor board, CIViC/OncoKB annotation, AMP/ASCO/CAP evidence tiers, therapy ranking
Cardiology Intelligence 6 risk calculators (ASCVD, HEART, CHA2DS2-VASc, HAS-BLED, MAGGIC, EuroSCORE II), GDMT optimizer, 8 workflows
Neurology Intelligence 10 clinical scales (NIHSS, GCS, MoCA, MDS-UPDRS, EDSS, mRS, HIT-6, ALSFRS-R, ASPECTS, Hoehn-Yahr), 8 workflows
Rare Disease Diagnostic 88 rare diseases across 13 categories, 23 ACMG criteria, HPO phenotype matching, GA4GH Phenopacket export
Pharmacogenomics 25 pharmacogenes, CPIC/DPWG dosing, phenoconversion detection, HLA hypersensitivity screening
Precision Autoimmune 13 autoimmune conditions, autoantibody panels, HLA typing, disease activity scoring, flare prediction
Precision Biomarker Biological age estimation (PhenoAge/GrimAge), disease trajectory, pharmacogenomic profiling
CAR-T Intelligence Construct comparison (4-1BB vs CD28), manufacturing intelligence, clinical trial matching
Imaging Intelligence NVIDIA NIM (VISTA-3D, MAISI, VILA-M3), DICOM ingestion, Lung-RADS, cross-modal genomics triggers
Single-Cell Intelligence 57 cell types, TME profiling, spatial niche mapping, drug response prediction, CAR-T target validation
Clinical Trial Intelligence Protocol optimization, patient-trial matching, site selection, adaptive design, regulatory documents

Engine 3: Therapeutic Discovery Engine (8-16 min)

  • BioNeMo MolMIM -- generative chemistry (novel molecule design)
  • BioNeMo DiffDock -- molecular docking (binding affinity prediction)
  • RDKit -- drug-likeness scoring (Lipinski, QED, TPSA)
  • Composite scoring: 30% generation + 40% docking + 30% QED
  • Output: 100 ranked novel drug candidates + PDF report

Key Numbers

Metric Value
Total Pipeline Time < 5 hours
Input Data ~200 GB FASTQ (30x WGS)
Variants Called ~11.7 million
Annotated Variants ~3.56 million
Specialized Agents 8 (spanning 11 medical specialties)
Milvus Collections 139 (agent-owned) + shared genomic evidence
Agent Vectors ~47,691 (domain-specific)
Services 21 (engines + agents + infrastructure)
Genes in Knowledge Base 201 (13 therapeutic areas)
Druggable Targets 171 (85%)
Drug Candidates Generated 100 (ranked by composite score)
Test Files 158 (core + all 11 agents)
Hardware Cost $4,699 (DGX Spark)

VCP/FTD Demo Highlights

  • Target: VCP gene -- Frontotemporal Dementia, ALS, IBMPFD
  • Variant: rs188935092 -- ClinVar Pathogenic, AlphaMissense 0.87
  • Seed: CB-5083 (Phase I clinical VCP inhibitor)
  • Result: Top candidate shows +39% composite improvement over seed
  • Docking: -11.4 kcal/mol (vs. -8.1 for CB-5083)
  • QED: 0.81 (vs. 0.62 for CB-5083)
  • All top 10 pass Lipinski's Rule of Five

Technology Stack

Layer Technology
Hardware NVIDIA DGX Spark (GB10 GPU, 128 GB unified, $4,699)
Genomics NVIDIA Parabricks 4.6, DeepVariant (>99% accuracy)
Annotation ClinVar (4.1M records), AlphaMissense (71M predictions), Ensembl VEP
Vector DB Milvus 2.4, BGE-small-en-v1.5, IVF_FLAT, 139 collections
LLM Anthropic Claude (RAG-grounded reasoning across all 11 agents)
Drug Discovery BioNeMo MolMIM, BioNeMo DiffDock, RDKit
Orchestration Nextflow DSL2 (5 modes: full, target, drug, demo, genomics_only)
Monitoring Grafana, Prometheus, DCGM Exporter
License Apache 2.0 (fully open)

Deployment Roadmap

Phase Hardware Scale Cost
1 -- Proof Build DGX Spark 1 patient, Docker Compose, 21 services $4,699
2 -- Departmental DGX B200 Multiple concurrent, Kubernetes $500K-$1M
3 -- Enterprise DGX SuperPOD Thousands concurrent, FLARE federated $7M-$60M+

Cross-Modal Integration

  • Imaging --> Genomics: Lung-RADS 4B+ triggers tumor gene profiling (EGFR, ALK, ROS1, KRAS)
  • Genomics --> Drug Discovery: Pathogenic variants trigger molecule generation
  • Single-Cell --> Oncology: TME profiling informs immunotherapy selection
  • Pharmacogenomics --> All Agents: Genotype-guided dosing across clinical domains
  • NVIDIA FLARE: Federated learning across institutions (data stays local)

Competitive Differentiation

  • Only platform running genomics-to-drug-candidates with 11 clinical intelligence agents on a single desktop GPU
  • End-to-end: No manual handoffs between engines
  • < 5 hours total pipeline time (vs. weeks/months traditional)
  • $4,699 proof build cost (vs. $100K+ for equivalent CPU infrastructure)
  • 11 agents covering oncology, cardiology, neurology, rare disease, pharmacogenomics, autoimmune, biomarker, CAR-T, imaging, single-cell, and clinical trials
  • Open project: Apache 2.0, reproducible, auditable, 158 test files
  • Scalable: Same Nextflow pipelines scale from DGX Spark to SuperPOD

HCLS AI Factory -- Apache 2.0 | March 2026


Clinical Decision Support Disclaimer

The HCLS AI Factory platform and all intelligence agents described in this document are clinical decision support research tools. It is not FDA-cleared and is not intended as a standalone diagnostic device. All recommendations should be reviewed by qualified healthcare professionals. Apache 2.0 License.