From Genomic Foundation to Precision Intelligence to Therapeutic Discovery
11 Intelligence Agents. 3 Engines. One Platform.
Patient DNA to Drug Candidates in Under 5 Hours.
See It in Action
From Patient DNA to New Medicines — a 9-minute walkthrough of the complete pipeline
What It Is
The Healthcare & Life Sciences (HCLS) AI Factory unifies three production-grade AI workflows into a single, continuous system — designed to take raw patient DNA and produce viable drug candidates without the fragmentation and delays that define traditional approaches.
The platform produces computational predictions: promising starting points for laboratory testing, not finished medicines. Real drug development still requires years of laboratory and clinical validation. What the HCLS AI Factory does is compress the earliest stage of discovery, identifying targets and generating promising candidates, from months of work into hours.
Raw FASTQ files flow through NVIDIA Parabricks for GPU-accelerated genomics: alignment and variant calling, with DeepVariant for clinical-grade accuracy, completing in hours instead of days.
Outputs feed directly into an evidence layer where millions of variants can be queried in natural language, grounded in ClinVar, AlphaMissense, structural data, and curated biomedical knowledge.
Validated targets then move into generative drug discovery via NVIDIA BioNeMo, where novel molecules are created, docked, scored, and ranked.
No batch jobs. No manual handoffs. Full lineage from patient DNA to candidate therapeutic.
One workstation. One workflow. Hours, not months.
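As a rough illustration of what "full lineage" across the three stages could look like, here is a minimal sketch in which every stage output carries a content hash and a link to the artifact it was derived from. The `Artifact` class, the stage names, and the payload fields are all hypothetical and are not taken from the HCLS AI Factory codebase.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Artifact:
    """One stage output plus a pointer to the artifact it was derived from."""
    stage: str
    payload: dict
    parent_id: Optional[str] = None

    @property
    def artifact_id(self) -> str:
        # Hash the stage name, payload, and parent link together, so any
        # upstream change yields a different ID for everything downstream.
        body = json.dumps(
            {"stage": self.stage, "payload": self.payload, "parent": self.parent_id},
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()[:16]


def run_stage(stage: str, payload: dict, parent: Optional[Artifact] = None) -> Artifact:
    """Wrap a stage result and link it to its parent artifact (if any)."""
    return Artifact(stage, payload, parent.artifact_id if parent else None)


def lineage(artifact: Artifact, index: dict) -> list:
    """Walk parent links back to the root and return stage names, root first."""
    chain = []
    current = artifact
    while current is not None:
        chain.append(current.stage)
        current = index.get(current.parent_id)
    return list(reversed(chain))


# Hypothetical run: the payloads are placeholders, not real results.
fastq = run_stage("fastq", {"sample": "demo-sample"})
variants = run_stage("variant_calling", {"n_variants": 3}, fastq)
targets = run_stage("evidence", {"gene": "GENE_X"}, variants)
candidates = run_stage("drug_discovery", {"n_candidates": 2}, targets)

index = {a.artifact_id: a for a in (fastq, variants, targets, candidates)}
print(lineage(candidates, index))
# → ['fastq', 'variant_calling', 'evidence', 'drug_discovery']
```

Because each ID depends on the parent ID, changing the input sample changes every downstream ID, which is one simple way to make a candidate traceable back to the data it came from.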
What This Is Not
The intent is clarity, not control.
It is a reference architecture and research platform, not a regulated medical device or diagnostic system.
All workflows, data flows, and reasoning layers are inspectable and reproducible.
It is designed to augment researchers and clinicians, not automate judgment.
The system is modular by design and intended to be adapted, extended, or specialized.
The architecture is deliberately vendor-neutral and infrastructure-agnostic.
Three Engines
End-to-end from raw sequencing data to drug candidates
Genomic Foundation Engine
Patient DNA → 3.5M annotated variant vectors in 2 to 4 hours. GPU-accelerated with NVIDIA Parabricks.
120 – 240 min
Precision Intelligence Network
11 specialized intelligence agents across every major medical domain. RAG-powered with Milvus + Claude.
Interactive
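To make the retrieval half of "RAG-powered" concrete, the sketch below ranks stored variant vectors by cosine similarity to a query vector and builds a grounded prompt from the top hits. It is a pure-Python stand-in, not Milvus or Claude API code; the record IDs, the three-dimensional vectors, and the prompt wording are invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, records, k=2):
    """Return the IDs of the k records most similar to the query vector."""
    scored = [(cosine(query_vec, r["vector"]), r["id"]) for r in records]
    scored.sort(reverse=True)
    return [rid for _, rid in scored[:k]]


def build_prompt(question, hit_ids, records):
    """Assemble a prompt that quotes only the retrieved evidence."""
    by_id = {r["id"]: r for r in records}
    evidence = "\n".join(f"- {rid}: {by_id[rid]['note']}" for rid in hit_ids)
    return f"Answer using only this evidence:\n{evidence}\n\nQuestion: {question}"


# Hypothetical records; a real system would use learned embeddings.
records = [
    {"id": "variant_A", "vector": [1.0, 0.0, 0.0], "note": "placeholder annotation A"},
    {"id": "variant_B", "vector": [0.9, 0.1, 0.0], "note": "placeholder annotation B"},
    {"id": "variant_C", "vector": [0.0, 1.0, 0.0], "note": "placeholder annotation C"},
]

hits = top_k([1.0, 0.0, 0.0], records, k=2)
print(hits)  # → ['variant_A', 'variant_B']
print(build_prompt("Which variants look relevant?", hits, records))
```

In a deployed system the vector search would be delegated to the vector database and the prompt passed to the language model; the shape of the flow (embed, retrieve, ground, ask) is the part this sketch is meant to show.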
Therapeutic Discovery Engine
Validated targets → novel drug candidates. Generative design with BioNeMo MolMIM + DiffDock.
8 – 16 min
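The "scored and ranked" step of candidate generation can be pictured as a weighted composite score, sketched below. This does not call BioNeMo, MolMIM, or DiffDock; the score fields, the weights, and the normalization floor are assumptions chosen for illustration, and the SMILES strings are simple placeholder molecules rather than drug candidates.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    smiles: str
    docking_score: float  # hypothetical: more negative = better predicted binding
    drug_likeness: float  # hypothetical: 0..1, higher = more drug-like


def composite(c, w_dock=0.7, w_like=0.3, dock_floor=-12.0):
    """Blend a normalized docking score with drug-likeness into one number."""
    # Map docking scores from [dock_floor, 0] onto [1, 0] and clamp.
    dock_norm = min(max(c.docking_score / dock_floor, 0.0), 1.0)
    return w_dock * dock_norm + w_like * c.drug_likeness


def rank(candidates):
    """Sort candidates from highest to lowest composite score."""
    return sorted(candidates, key=composite, reverse=True)


# Placeholder molecules with made-up scores.
pool = [
    Candidate("CC(=O)O", docking_score=-3.0, drug_likeness=0.8),
    Candidate("CCO", docking_score=-9.0, drug_likeness=0.5),
    Candidate("c1ccccc1", docking_score=-6.0, drug_likeness=0.9),
]

for c in rank(pool):
    print(f"{c.smiles:10s} {composite(c):.3f}")
```

The weights decide how much predicted binding is allowed to outweigh drug-likeness; in practice they would be tuned per target rather than fixed.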
Intelligence Agents
Domain-specific AI agents extending the core platform with cross-modal evidence linking
CAR-T Intelligence
Cross-functional intelligence across the CAR-T cell therapy lifecycle. 11 collections, 6,266+ vectors, comparative analysis, and deep research mode.
Port 8521
Imaging Intelligence
AI-powered medical imaging with NVIDIA NIM microservices: VISTA-3D segmentation, MAISI generation, VILA-M3 analysis, and FHIR R4 export.
Port 8525
Precision Oncology
Clinical decision support for molecular tumor boards. VCF-to-MTB packets, trial matching, therapy ranking, and FHIR R4 diagnostic bundles.
Port 8526
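Agents that start from VCF input, such as the VCF-to-MTB flow, first need to turn VCF text into structured records. The sketch below parses the eight fixed VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) using only the standard library. The `Variant` class, the sample line, and the `GENE` INFO key are illustrative assumptions, not the platform's actual parser or annotation schema.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: list
    info: dict


def parse_vcf_line(line: str) -> Variant:
    """Parse one tab-separated VCF data line into a Variant."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, _qual, _filter, info = fields[:8]
    info_dict = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue  # "." means no INFO entries
        key, _, value = item.partition("=")
        # Keys without "=" are flags; store them as True.
        info_dict[key] = value if value else True
    return Variant(chrom, int(pos), ref, alt.split(","), info_dict)


def read_variants(lines):
    """Skip header lines (starting with '#') and blank lines, parse the rest."""
    return [parse_vcf_line(l) for l in lines if l.strip() and not l.startswith("#")]


# Hypothetical input: one header block and one made-up record.
sample = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=30;GENE=GENE_X;DB",
]

variants = read_variants(sample)
print(variants[0])
```

For real workloads a maintained VCF library is a safer choice than hand parsing, since it also handles sample columns, header metadata, and escaping rules that this sketch ignores.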
By the Numbers
Real results from a single NVIDIA DGX Spark workstation
Choose Your Path
The HCLS AI Factory serves every stakeholder in precision medicine
Clinician
Find your specialty agent — cardiology, neurology, oncology, and more
Researcher
Explore GPU-accelerated genomics and single-cell analysis
Pharma
Protocol optimization, patient matching, and trial intelligence
Developer
Docker Compose deployment, API docs, and architecture guides
Patient Advocate
Learn how open-source precision medicine changes access for everyone
Traditional vs. HCLS AI Factory
What used to take months now takes hours
| Metric | Traditional Approach | HCLS AI Factory |
|---|---|---|
| Sequence Alignment | 12 – 24 hours | 2 – 3 hours |
| Variant Calling | 8 – 12 hours | 1 – 2 hours |
| Annotation & Interpretation | Days of manual work | Minutes (automated) |
| Target Identification | Weeks of literature review | Minutes (AI-powered) |
| Drug Candidate Design | Months of medicinal chemistry | 8 – 16 minutes |
| Total Time | Weeks to months | < 5 hours |
| Infrastructure Cost | $100K+ (cluster) | $3,999 (DGX Spark) |
| Reproducibility | Variable | Deterministic |
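The "< 5 hours" total can be sanity-checked against the per-engine ranges quoted under Three Engines (120 – 240 min for the Genomic Foundation Engine and 8 – 16 min for the Therapeutic Discovery Engine). The short sketch below sums those two ranges; it assumes the interactive Precision Intelligence stage adds no fixed batch time, which is a simplification.

```python
# Per-engine ranges in minutes, as quoted on this page.
# The interactive stage is left out on the assumption that it adds
# no fixed batch time.
stage_minutes = {
    "genomic_foundation": (120, 240),
    "therapeutic_discovery": (8, 16),
}

low = sum(lo for lo, _ in stage_minutes.values())
high = sum(hi for _, hi in stage_minutes.values())

print(f"Total: {low} to {high} min ({low / 60:.1f} to {high / 60:.1f} hours)")
# → Total: 128 to 256 min (2.1 to 4.3 hours)

budget_minutes = 5 * 60
print("Within 5-hour budget:", high < budget_minutes)
# → Within 5-hour budget: True
```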
Origin
In 2012, I set out to use my high-performance computing background for something that mattered. I started with one conviction: no parent should ever have to lose a child to disease.
That conviction led me to pediatric neuroblastoma. I taught myself biology, genomics, molecular pathways, drug discovery: whatever the work required. I made one commitment early: I would not profit from this. Whatever I built, I would give away freely, so others could build on it and move faster than any one person ever could alone.
Thousands of hours later, this is the result.
— Adam Jones
Why This Is Open
This project is open by design, not as a shortcut or a visibility exercise, but as a deliberate decision about how foundational healthcare infrastructure should be built.
The challenges in precision medicine are no longer primarily scientific — they are architectural. Fragmented pipelines, opaque tooling, and closed systems slow the transition from genomic data to actionable insight.
By openly publishing the HCLS AI Factory, this project provides a reproducible, inspectable reference implementation for end-to-end genomics, AI reasoning, and therapeutic exploration.
Open access to infrastructure knowledge accelerates progress, enables collaboration, and shifts innovation away from re-solving plumbing toward advancing care.
Who This Is For
For people and institutions working at the intersection of healthcare, life sciences, and AI who need systems, not abstractions.
Researchers & Bioinformaticians
Building, extending, or validating secondary genomics pipelines and variant interpretation workflows, with a need for reproducible, inspectable infrastructure.
Clinicians & Translational Teams
Exploring how genomic data, AI reasoning, and therapeutic insights can be integrated into real-world decision-making.
Academic Medical Centers
Teaching, researching, or operationalizing genomics and AI at scale, where transparency, reproducibility, and extensibility are critical.
AI & Systems Engineers
Interested in how real biomedical workloads behave when treated as first-class AI systems, including data flow, orchestration, and reasoning layers.
Platform Builders
Designing future healthcare platforms, with a need for a concrete reference architecture for AI-native pipelines rather than high-level diagrams.
Anyone, Anywhere
This project is not limited to a single specialty, disease, or institution. It is designed to be a shared foundation that can support many domains.
Getting Started
The HCLS AI Factory documentation site provides everything needed to deploy and run a complete precision medicine pipeline — from patient DNA to drug candidates — on a single NVIDIA DGX Spark. Resources include a quick-start checklist, full deployment guide, live demo walkthrough, and detailed technical documentation for each of the three pipeline stages. The site also offers architecture diagrams, a comprehensive project bible, learning guides for both introductory and professional audiences, and all source code under an Apache 2.0 open-source license.
Technology Stack
Take it. Use it. Make it better.
All HCLS AI Factory code is released under the Apache 2.0 license. Deploy the full pipeline on your DGX Spark in minutes.