HCLS AI Factory: An Open-Source Three-Engine Precision Medicine Platform with Eleven Domain-Specialized Intelligence Agents on Desktop GPU Hardware

Adam Jones

March 2026


Abstract

We present the HCLS AI Factory, an open-source platform that transforms raw patient DNA sequencing data into ranked novel drug candidates in under five hours on a single NVIDIA DGX Spark desktop workstation (\$4,699). The platform implements a three-engine pipeline: (1) GPU-accelerated genomic variant calling via NVIDIA Parabricks 4.6, producing 11.7 million variants from 200 GB of whole-genome sequencing data in 120--240 minutes with >99% concordance against the GIAB truth set; (2) retrieval-augmented generation (RAG) over 3.56 million annotated variants using Milvus 2.4 and Anthropic Claude, identifying druggable gene targets across 13 therapeutic areas in under 5 seconds per query; and (3) AI-driven drug discovery using BioNeMo MolMIM and DiffDock, generating and ranking 100 novel molecular candidates with pediatric safety assessment in 8--16 minutes. Eleven domain-specialized intelligence agents extend the platform to precision oncology, CAR-T cell therapy, medical imaging, precision biomarkers, autoimmune disease, cardiology, neurology, pharmacogenomics, rare disease diagnostics, single-cell genomics, and clinical trial operations. Collectively, the agents maintain 139 Milvus vector collections containing approximately 47,691 vectors and are backed by 158 test files. The entire system (21 containerized services, 128 GB unified memory, and a Nextflow DSL2 orchestrator) runs on consumer-grade hardware at a fraction of the cost of traditional approaches (\$50K--\$500K+). All code is released under the Apache 2.0 license. To our knowledge, the HCLS AI Factory is the first open-source platform to integrate genomic variant calling, RAG-grounded target identification, generative drug discovery, and multi-domain clinical intelligence agents into a single end-to-end workflow on a desktop workstation.


1. Introduction

Precision medicine promises therapies tailored to an individual's genetic profile. A single 30x whole-genome sequencing (WGS) run produces approximately 200 GB of raw data and 11.7 million genomic variants. The challenge is not generating this data---modern sequencers produce it reliably---but transforming it into actionable therapeutic hypotheses within a clinically relevant timeframe.

Today's genomic analysis pipelines assemble disconnected components: CPU-based alignment tools that require 12--24 hours, separate variant callers, annotation databases accessed through web APIs, and manual literature review for target identification. Each step must be independently provisioned, and the gap between "variants identified" and "drug target nominated" is filled by months of manual curation. Traditional end-to-end timelines span 6--18 months at infrastructure costs of \$50K--\$500K+ [1, 2].

This fragmentation introduces three structural problems. First, a compute bottleneck: CPU-based BWA-MEM alignment of a 30x WGS sample takes 12--24 hours on a 32-core server, and DeepVariant on CPU adds another 8--12 hours. The genomics stage alone consumes 1--2 days of wall time. Second, annotation fragmentation: clinical variant databases (ClinVar [3]), AI pathogenicity predictors (AlphaMissense [4]), and functional annotation tools (Ensembl VEP [5]) exist as separate resources requiring bespoke ETL pipelines. Third, a target-to-drug gap: even after identifying a pathogenic variant in a druggable gene, the path to a lead compound requires separate molecular modeling tools, docking servers, and medicinal chemistry expertise.

GPU-accelerated computing offers an opportunity to collapse these bottlenecks. The NVIDIA DGX Spark (\$4,699) packages a GB10 Grace Blackwell Superchip with 128 GB unified LPDDR5x memory, 20 ARM cores, and NVLink-C2C interconnect into a desktop form factor. The same GPU that accelerates genomic alignment can run vector similarity search, molecular generation, and molecular docking---eliminating data transfer overhead and enabling sequential execution of all three stages on a single machine.

In this paper, we present the HCLS AI Factory, an open-source platform that exploits this convergence. Our contributions are:

  1. An integrated three-engine pipeline that processes raw FASTQ sequencing data through GPU-accelerated variant calling, RAG-grounded target identification, and AI-driven drug discovery---producing 100 ranked novel drug candidates in under 5 hours.
  2. A demonstration on the VCP gene target for Frontotemporal Dementia, where the top AI-generated candidate achieves a 39% composite improvement over the CB-5083 seed compound, with docking affinity of -11.4 kcal/mol (vs. -8.1) and QED of 0.81 (vs. 0.62).
  3. Eleven domain-specialized intelligence agents (Precision Oncology, CAR-T, Precision Biomarker, Cardiology, Neurology, Precision Autoimmune, Rare Disease Diagnostic, Pharmacogenomics, Imaging, Single-Cell, and Clinical Trial) that extend the platform to comprehensive clinical decision support, backed by 158 test files and 139 Milvus collections containing approximately 47,691 vectors.
  4. Pediatric safety assessment integrated into the drug discovery pipeline, applying six molecular filters derived from pediatric pharmacokinetic considerations.
  5. Full reproducibility on consumer-grade hardware: all code is Apache 2.0, the platform runs on a \$4,699 workstation, and the reference dataset (GIAB HG002) is publicly available.

2. Related Work

CPU-based genomic pipelines. The canonical genomics workflow pairs BWA-MEM [6] for read alignment with GATK HaplotypeCaller [7] for variant calling. On a 32-core server, this pipeline processes a 30x WGS sample in 24--48 hours. DeepVariant [8], a CNN-based variant caller from Google, improved accuracy---particularly on indels---but remained CPU-bound until GPU-accelerated implementations became available.

GPU-accelerated genomics. NVIDIA Parabricks [9] provides GPU-accelerated implementations of BWA-MEM2, DeepVariant, and other GATK best-practices tools. Published benchmarks demonstrate 10--50x speedups over CPU implementations on NVIDIA A100 and H100 GPUs. The NVIDIA Clara suite [10] provides additional tools for clinical genomics, but does not extend to drug discovery.

RAG in biomedical NLP. Retrieval-augmented generation [11] has been applied to biomedical question answering in systems such as BioRAG, Med-PaLM [12], and domain-specific chatbots. These systems typically operate over PubMed abstracts or clinical notes. Our approach differs by grounding RAG queries in patient-specific genomic variants annotated with ClinVar and AlphaMissense, producing evidence chains that link each genomic position to its clinical significance and then to druggability.

AI-driven drug discovery. AlphaFold [13] revolutionized protein structure prediction. DiffDock [14] introduced score-based diffusion models for blind molecular docking, eliminating the need for pre-defined binding pockets. MolMIM [15] applies a Mutual Information Machine (MIM) latent-variable model to molecular generation, producing structurally novel analogs from seed compounds. RDKit [16] provides open-source cheminformatics for chemical analysis. However, these tools are typically used in isolation, requiring manual orchestration and domain expertise to connect them.

Clinical decision support agents. Domain-specific knowledge bases and AI toolkits for clinical decision support have emerged in oncology (OncoKB, CIViC), pharmacogenomics (CPIC, PharmGKB), and imaging (MONAI, VISTA-3D). However, these resources operate independently and lack integration with patient-specific genomic data. No existing platform unifies multi-domain clinical intelligence agents with an end-to-end genomics-to-drug-discovery pipeline.

Integrated platforms. Cloud-based platforms such as Terra (Broad Institute), DNAnexus, and Seven Bridges provide managed genomics workflows with scalable compute. These platforms handle Stage 1 (genomics) effectively but do not integrate target identification via RAG, generative drug discovery, or multi-domain clinical intelligence agents. No existing open-source platform provides an end-to-end pipeline from raw FASTQ to ranked drug candidates with domain-specialized decision support. The HCLS AI Factory fills this gap by integrating all three stages and eleven intelligence agents on a single workstation under a unified orchestration framework.


3. System Architecture

3.1 Hardware Platform

The HCLS AI Factory targets the NVIDIA DGX Spark as its reference hardware platform. Table 1 summarizes the hardware specification.

Table 1. NVIDIA DGX Spark hardware specification.

| Component | Specification |
|---|---|
| GPU | NVIDIA GB10 Grace Blackwell Superchip |
| Memory | 128 GB unified LPDDR5x |
| CPU | 20 ARM cores (Grace) |
| Interconnect | NVLink-C2C (GPU--CPU unified memory) |
| Storage | NVMe SSD |
| Form factor | Desktop workstation |
| Price | \$4,699 |

The unified memory architecture is critical: NVLink-C2C enables the GPU and CPU to share the same 128 GB memory pool without transfer bottlenecks. This allows sequential execution of GPU-intensive stages (genomics, docking) and memory-intensive stages (vector indexing, annotation) without resource contention.

3.2 Three-Engine Pipeline Overview

The platform processes data through three sequential engines, summarized in Table 2.

Table 2. Pipeline engine summary.

| Engine | Technology | Duration | Input | Output |
|---|---|---|---|---|
| 1 -- Genomics | Parabricks 4.6 (BWA-MEM2 + DeepVariant) | 120--240 min | FASTQ (~200 GB) | VCF (~11.7M variants) |
| 2 -- RAG/Chat | Milvus 2.4 + BGE-small-en-v1.5 + Claude | Interactive (<5 sec/query) | VCF | Target gene + evidence chain |
| 3 -- Drug Discovery | MolMIM + DiffDock + RDKit | 8--16 min | Target gene + seed compound | 100 ranked drug candidates |

3.3 Service Architecture

The platform runs 21 containerized services managed by Docker Compose:

  • Orchestration: Landing page with health monitor (port 8080), Nextflow DSL2 orchestrator
  • Engine 1: Genomics portal (5000)
  • Engine 2: Milvus vector database (19530), etcd metadata store (2379), MinIO object storage (9000), Attu management UI (8000), RAG API (5001), Streamlit Chat UI (8501)
  • Engine 3: MolMIM NIM (8001), DiffDock NIM (8002), Drug Discovery UI (8505), Integrated Portal (8510)
  • Intelligence Agents: Eleven agent services with FastAPI backends and Streamlit frontends (ports 8107--8544)
  • Monitoring: Grafana dashboards (3000), Prometheus metrics (9099), Node Exporter (9100), DCGM GPU telemetry (9400)

Nextflow DSL2 orchestrates pipeline execution across five modes: full (Engines 1-2-3), target (Engines 2-3 from existing VCF), drug (Engine 3 only), demo (pre-configured VCP/FTD demonstration), and genomics_only (Engine 1). Six execution profiles (standard, docker, singularity, dgx_spark, slurm, test) adapt the pipeline to different infrastructure environments.


4. Stage 1: GPU-Accelerated Genomics

4.1 Alignment (BWA-MEM2)

Stage 1 begins with the alignment of paired-end reads against the GRCh38 human reference genome using BWA-MEM2, accelerated through NVIDIA Parabricks 4.6. The fq2bam module performs alignment, sorting, and duplicate marking in a single GPU-accelerated pass. On the DGX Spark GB10 GPU, alignment achieves 70--90% GPU utilization and completes in 20--45 minutes, producing a coordinate-sorted BAM file with index.

The reference dataset is HG002 (NA24385) from the Genome in a Bottle (GIAB) Consortium [17]---an Ashkenazi Jewish male sample with extensively validated truth sets enabling rigorous accuracy benchmarking. Input characteristics: 30x WGS coverage, 2x250 bp paired-end reads, approximately 200 GB total FASTQ size (R1: 99.4 GB, R2: 99.3 GB), with 800 million to 1.2 billion reads aligned.

4.2 Variant Calling (DeepVariant)

Variant calling uses Google DeepVariant [8] accelerated through Parabricks. DeepVariant applies a convolutional neural network to classify candidate variant sites, achieving >99% concordance with the GIAB truth set---outperforming traditional statistical callers (GATK HaplotypeCaller) on both SNPs and indels. GPU-accelerated execution achieves 80--95% utilization and completes in 10--35 minutes.

The resulting VCF contains approximately 11.7 million variants: 4.2 million SNPs, 1.0 million indels, and 148,762 multi-allelic sites. Of these, 3.56 million pass quality filtering (QUAL > 30), with 35,616 in coding regions. The transition/transversion ratio of 2.07 falls within the expected range (2.0--2.1) for a high-quality whole-genome call set.
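
As a rough illustration of the two checks quoted above (the QUAL > 30 filter and the transition/transversion ratio), the following sketch works on plain VCF data lines. The helper names (`parse_vcf_line`, `quality_filter`, `titv_ratio`) and the toy records are hypothetical and are not taken from the platform's code; treating a missing QUAL as 0 is also an assumption.

```python
from typing import Iterable, List, Tuple

# Transitions swap purines (A<->G) or pyrimidines (C<->T); other SNPs are transversions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
BASES = {"A", "C", "G", "T"}

Record = Tuple[str, str, float]  # (REF, ALT, QUAL)


def parse_vcf_line(line: str) -> Record:
    """Extract REF, ALT and QUAL from one tab-separated VCF data line."""
    fields = line.rstrip("\n").split("\t")
    qual = float(fields[5]) if fields[5] != "." else 0.0  # assumption: missing QUAL -> 0
    return fields[3], fields[4], qual


def quality_filter(records: Iterable[Record], min_qual: float = 30.0) -> List[Record]:
    """Keep records whose QUAL is strictly greater than the threshold."""
    return [r for r in records if r[2] > min_qual]


def titv_ratio(records: Iterable[Record]) -> float:
    """Ti/Tv over biallelic SNPs; indels and multi-allelic sites are skipped."""
    ti = tv = 0
    for ref, alt, _ in records:
        if ref not in BASES or alt not in BASES:
            continue  # not a single-base REF/ALT pair
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    return ti / tv if tv else float("inf")


# Toy records for illustration only (not HG002 data).
toy_lines = [
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT\t45\tPASS\t.",
    "chr1\t300\t.\tA\tC\t60\tPASS\t.",
    "chr1\t400\t.\tG\tA\t12\tPASS\t.",   # fails QUAL > 30
    "chr1\t500\t.\tAT\tA\t80\tPASS\t.",  # indel, ignored by Ti/Tv
]
records = [parse_vcf_line(line) for line in toy_lines]
kept = quality_filter(records)
print(len(kept), titv_ratio(kept))
```

A production pipeline would more likely rely on an established VCF toolkit; the point here is only to make the filter and the ratio concrete.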

4.3 Performance

Table 3 summarizes the GPU acceleration advantage over traditional CPU-based pipelines.

Table 3. Stage 1 performance: GPU vs. CPU.

| Step | CPU Baseline | GPU (DGX Spark) | Speedup |
|---|---|---|---|
| Alignment (BWA-MEM2) | 12--24 hours | 20--45 min | 10--50x |
| Variant Calling (DeepVariant) | 8--12 hours | 10--35 min | 10--50x |
| Total | 24--48 hours | 120--240 min | 10--20x |

Peak GPU memory use during variant calling reaches 54 GB, well within the 128 GB unified memory budget. GPU utilization averages 82% during alignment and 91% during variant calling, indicating that the GPU is kept busy through both steps.


5. Stage 2: RAG-Grounded Target Identification

5.1 Variant Annotation (ClinVar + AlphaMissense + VEP)

Stage 2 annotates the 3.56 million high-quality variants against three complementary databases:

ClinVar [3]: 4.1 million clinical variant records from NCBI, mapping genomic positions to clinical significance classifications (Pathogenic, Likely pathogenic, VUS, Likely benign, Benign). Approximately 35,616 patient variants match ClinVar entries.

AlphaMissense [4]: 71,697,560 AI-predicted pathogenicity scores for missense variants, derived from AlphaFold protein structure features. Classification thresholds: pathogenic > 0.564, ambiguous 0.34--0.564, benign < 0.34. Approximately 6,831 ClinVar-matched variants carry AlphaMissense predictions.

Ensembl VEP [5]: Functional consequence annotation mapping variants to genes, transcripts, and impact levels (HIGH, MODERATE, LOW, MODIFIER). VEP identifies missense variants, stop gains, frameshift variants, and splice site disruptions.
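
The AlphaMissense thresholds quoted above translate directly into a three-way classifier. The sketch below is illustrative only: the function name is hypothetical, and assigning the two boundary values (0.34 and 0.564) to the ambiguous class is an assumption based on the stated range.

```python
PATHOGENIC_ABOVE = 0.564  # scores strictly above this are called pathogenic
BENIGN_BELOW = 0.34       # scores strictly below this are called benign


def classify_alphamissense(score: float) -> str:
    """Map an AlphaMissense score in [0, 1] to a class label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"AlphaMissense score out of range: {score}")
    if score > PATHOGENIC_ABOVE:
        return "pathogenic"
    if score < BENIGN_BELOW:
        return "benign"
    return "ambiguous"  # assumption: boundary values fall in the ambiguous band


for s in (0.10, 0.34, 0.50, 0.564, 0.90):
    print(s, classify_alphamissense(s))
```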

The annotation funnel reduces the search space systematically: 11.7M raw variants --> 3.56M quality-filtered --> 35,616 ClinVar-annotated --> 6,831 AlphaMissense-scored --> 2,412 high-impact pathogenic --> 847 in druggable genes.

5.2 Vector Embedding and Indexing (Milvus, BGE, IVF_FLAT)

Each annotated variant is transformed into a structured text summary incorporating genomic position, alleles, clinical significance, pathogenicity score, gene symbol, and functional consequence. These summaries are embedded using BGE-small-en-v1.5 [18], producing 384-dimensional dense vectors.

The 3.56 million embeddings are indexed in Milvus 2.4 [19] using IVF_FLAT with nlist=1024 and COSINE similarity metric. Each record stores 17 structured fields alongside the embedding vector, enabling hybrid search with metadata filtering. One-time indexing requires approximately 75 minutes (45 min for ClinVar, 30 min for AlphaMissense sampling).

Query-time search uses nprobe=16, retrieving the top-k=20 most similar variant contexts in under 100 ms. Query embedding (BGE inference) adds less than 50 ms, for a total vector search latency under 150 ms.
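
For readers unfamiliar with Milvus, the index and search settings described in this section can be written as the parameter dictionaries that pymilvus accepts. This is a minimal sketch, not the platform's code: the field name `embedding` and the helper names are assumptions, and `collection` stands for any object exposing pymilvus-style `create_index` and `search` methods (for example a `pymilvus.Collection`).

```python
from typing import Any, Optional, Sequence

EMBEDDING_DIM = 384  # BGE-small-en-v1.5 output size
TOP_K = 20

# IVF_FLAT partitions vectors into nlist clusters at build time.
INDEX_PARAMS = {
    "index_type": "IVF_FLAT",
    "metric_type": "COSINE",
    "params": {"nlist": 1024},
}

# At query time only the nprobe closest clusters are scanned (16 of 1024 here).
SEARCH_PARAMS = {"metric_type": "COSINE", "params": {"nprobe": 16}}


def build_index(collection: Any, field_name: str = "embedding") -> None:
    """Create the IVF_FLAT index on the vector field (field name is an assumption)."""
    collection.create_index(field_name=field_name, index_params=INDEX_PARAMS)


def search_variants(
    collection: Any,
    query_vector: Sequence[float],
    field_name: str = "embedding",
    expr: Optional[str] = None,
) -> Any:
    """Return the top-k nearest variant records, optionally filtered by metadata."""
    if len(query_vector) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(query_vector)}")
    return collection.search(
        data=[list(query_vector)],
        anns_field=field_name,
        param=SEARCH_PARAMS,
        limit=TOP_K,
        expr=expr,  # boolean filter over scalar fields, enabling hybrid search
    )
```

With these values a query scans roughly 1.6% of the clusters (16/1024), which is the usual IVF trade-off between recall and latency.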

5.3 RAG Pipeline (Query Expansion, Therapeutic Areas, Claude Synthesis)

User queries undergo domain-specific expansion using 13 therapeutic area keyword maps covering Neurology (36 genes), Oncology (27), Metabolic (22), Infectious Disease (21), Respiratory (13), Rare Disease (12), Hematology (12), GI/Hepatology (12), Pharmacogenomics (11), Ophthalmology (11), Cardiovascular (10), Immunology (9), and Dermatology (9). Expansion increases query coverage by mapping clinical terms to gene symbols, variant types, and pathway identifiers.
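
A keyword-map expansion of this kind might look like the sketch below. The two-entry map is a hypothetical stand-in for the 13 therapeutic area maps (only genes named elsewhere in this paper are used), and the trigger terms and function name are assumptions.

```python
from typing import Dict, List

# Hypothetical miniature of the therapeutic-area keyword maps.
AREA_MAP: Dict[str, Dict[str, List[str]]] = {
    "neurology": {
        "terms": ["dementia", "frontotemporal"],
        "genes": ["VCP"],
    },
    "oncology": {
        "terms": ["lung cancer", "nsclc"],
        "genes": ["EGFR", "ALK", "ROS1", "KRAS"],
    },
}


def expand_query(query: str, area_map: Dict[str, Dict[str, List[str]]] = AREA_MAP) -> str:
    """Append gene symbols for every therapeutic area whose terms appear in the query."""
    lowered = query.lower()
    additions: List[str] = []
    for entry in area_map.values():
        if any(term in lowered for term in entry["terms"]):
            for gene in entry["genes"]:
                if gene not in additions:  # avoid duplicates across areas
                    additions.append(gene)
    return " ".join([query] + additions)


print(expand_query("frontotemporal dementia drug targets"))
print(expand_query("lung cancer resistance"))
```

The expanded string is what would then be embedded, so that clinically phrased questions land near variant summaries that mention only gene symbols.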

Expanded queries are embedded and used for approximate nearest-neighbor search in Milvus. The top-20 retrieved variant contexts are assembled into a RAG prompt and processed by Anthropic Claude (claude-sonnet-4-20250514, temperature=0.3). Claude generates structured target hypotheses comprising: gene name, confidence level, evidence chain (variant --> clinical significance --> druggability), therapeutic area, and recommended next action. The eleven intelligence agents share read-only access to the genomic_evidence collection (3.56M annotated variants from Stage 2), enabling cross-modal queries that connect agent-specific knowledge to patient-specific genomic data.

End-to-end query latency---from natural language input through embedding, vector search, context assembly, and LLM synthesis---is under 5 seconds.

5.4 Knowledge Base Coverage (201 Genes, 85% Druggable)

The RAG pipeline is grounded by a curated knowledge base of 201 genes spanning 13 therapeutic areas. Of these, 171 (85%) are classified as druggable targets based on known binding sites, existing inhibitors, or approved therapeutics. This coverage enables the platform to identify actionable targets across the majority of common disease areas encountered in clinical genomics.


6. Stage 3: AI-Driven Drug Discovery

6.1 10-Stage Pipeline

Stage 3 transforms a target gene hypothesis into 100 ranked novel drug candidates through a fully automated 10-stage pipeline:

  1. Initialize -- Load target hypothesis from Stage 2, validate inputs and API connectivity.
  2. Normalize Target -- Map gene symbol to UniProt ID to PDB structures via programmatic API queries.
  3. Structure Discovery -- Query RCSB PDB for available Cryo-EM and X-ray crystallography structures.
  4. Structure Preparation -- Score structures by resolution (lower is better, 5 Å maximum), inhibitor presence (+3 bonus), druggable pocket count (+0.5 each), and experimental method (Cryo-EM +0.5).
  5. Molecule Generation -- BioNeMo MolMIM generates 100 novel SMILES strings from a seed compound.
  6. Chemistry QC -- RDKit validates chemical feasibility (valence checks, sanitization, ring analysis).
  7. Conformer Generation -- RDKit 3D conformer embedding using ETKDG (experimental-torsion knowledge distance geometry).
  8. Molecular Docking -- BioNeMo DiffDock predicts binding poses and affinities (10 poses per molecule).
  9. Composite Ranking -- Weighted scoring: 30% generation confidence + 40% docking affinity + 30% QED.
  10. Reporting -- PDF report generation via ReportLab with molecular visualizations.
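
The structure-scoring rule in step 4 can be sketched as follows. The bonuses (+3 for a bound inhibitor, +0.5 per druggable pocket, +0.5 for Cryo-EM) and the 5 Å cutoff come from the list above; the resolution term itself is not specified there, so `5.0 - resolution` is an assumption chosen only so that lower resolution values score higher. The structures in the example are made up.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_RESOLUTION = 5.0  # Å; structures with worse (higher) resolution are rejected


@dataclass(frozen=True)
class Structure:
    pdb_id: str
    resolution: float  # Å, lower is better
    has_inhibitor: bool
    pocket_count: int
    method: str        # e.g. "Cryo-EM" or "X-ray"


def score_structure(s: Structure) -> Optional[float]:
    """Return a ranking score, or None if the structure exceeds the resolution cutoff."""
    if s.resolution > MAX_RESOLUTION:
        return None
    score = MAX_RESOLUTION - s.resolution  # assumption: linear resolution term
    if s.has_inhibitor:
        score += 3.0
    score += 0.5 * s.pocket_count
    if s.method.lower() == "cryo-em":
        score += 0.5
    return score


def rank_structures(structures: List[Structure]) -> List[Structure]:
    """Drop rejected structures and sort the rest from best to worst."""
    scored = [(score_structure(s), s) for s in structures]
    kept = [(score, s) for score, s in scored if score is not None]
    return [s for score, s in sorted(kept, key=lambda pair: pair[0], reverse=True)]


# Hypothetical structures for illustration only.
candidates = [
    Structure("demo_a", 2.0, True, 2, "Cryo-EM"),
    Structure("demo_b", 3.5, False, 1, "X-ray"),
    Structure("demo_c", 6.2, True, 3, "Cryo-EM"),
]
for s in rank_structures(candidates):
    print(s.pdb_id, score_structure(s))
```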

6.2 Molecular Generation (MolMIM)

MolMIM [15] is a latent-variable generative model for small molecules, built on a Mutual Information Machine (MIM) autoencoder and deployed as an NVIDIA BioNeMo NIM microservice (port 8001). Given a seed compound's SMILES string, MolMIM generates structurally novel analogs by encoding the seed into its latent space and decoding points sampled near it. The model explores chemical space around the seed while maintaining chemical validity.

In the VCP demonstration, MolMIM generates 100 novel molecules from the CB-5083 seed in 2 minutes 14 seconds, with 98 passing RDKit valence checks (98% chemical validity). The platform supports three NIM execution modes: Cloud (health.api.nvidia.com, required for ARM64 DGX Spark), Local (x86 GPU containers), and Mock (simulated output for testing and CI/CD).

6.3 Molecular Docking (DiffDock)

DiffDock [14] is a score-based generative diffusion model for blind molecular docking deployed as a BioNeMo NIM microservice (port 8002). Unlike traditional grid-based methods (AutoDock Vina, Glide), DiffDock predicts the 3D binding pose and affinity without requiring pre-defined binding pockets---a significant advantage when exploring novel binding modes.

In the VCP demonstration, DiffDock processes 98 valid molecules against the 5FTK protein structure (VCP D2 ATPase domain) in 8 minutes 42 seconds. The mean docking score is -7.4 kcal/mol, with 34 candidates achieving scores below -8.0 kcal/mol (excellent binding) and 78 below -6.0 kcal/mol (good or better).

6.4 Composite Scoring and Ranking

Each candidate is evaluated against three drug-likeness criteria: Lipinski's Rule of Five (MW <= 500, LogP <= 5, HBD <= 5, HBA <= 10), Quantitative Estimate of Drug-likeness (QED > 0.67 = drug-like), and Topological Polar Surface Area (TPSA < 140 Å² for oral bioavailability).
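
The three criteria can be expressed as simple threshold checks over precomputed descriptors. In the sketch below the descriptor values are assumed to have been computed upstream (for example with RDKit); the class and function names are hypothetical, and requiring zero Lipinski violations for a pass is an assumption, since the text gives only the individual limits.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Descriptors:
    mol_weight: float  # Da
    logp: float
    hbd: int           # hydrogen-bond donors
    hba: int           # hydrogen-bond acceptors
    qed: float         # 0..1
    tpsa: float        # Å^2


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Rule-of-Five limits are exceeded."""
    return sum([d.mol_weight > 500, d.logp > 5, d.hbd > 5, d.hba > 10])


def drug_likeness(d: Descriptors) -> Dict[str, bool]:
    """Evaluate the three drug-likeness criteria for one candidate."""
    return {
        "lipinski_pass": lipinski_violations(d) == 0,  # assumption: no violations allowed
        "qed_drug_like": d.qed > 0.67,
        "tpsa_oral": d.tpsa < 140,
    }


# Hypothetical descriptor values for illustration only.
good = Descriptors(mol_weight=420.5, logp=3.2, hbd=2, hba=6, qed=0.81, tpsa=95.0)
poor = Descriptors(mol_weight=612.0, logp=5.6, hbd=6, hba=11, qed=0.40, tpsa=150.0)
print(drug_likeness(good))
print(lipinski_violations(poor), drug_likeness(poor))
```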

The final ranking uses a weighted composite:

Score = 0.30 * S_gen + 0.40 * S_dock + 0.30 * S_QED

where S_gen is the MolMIM generation confidence, S_dock is the normalized docking score (max(0, min(1, (10 + dock_score) / 20))), and S_QED is the RDKit QED score. This formulation balances novelty (generation), predicted efficacy (binding), and pharmaceutical viability (drug-likeness).
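
The composite can be transcribed directly from the formula and the normalization given above. Note that the normalization, taken verbatim, is linear on dock scores in [-10, 10] and clipped to [0, 1] outside that range. How the docking output is signed and scaled before this step is not specified here, so the inputs in the example are arbitrary and the function names are hypothetical.

```python
W_GEN, W_DOCK, W_QED = 0.30, 0.40, 0.30


def normalize_dock(dock_score: float) -> float:
    """S_dock = max(0, min(1, (10 + dock_score) / 20)), as stated in the text."""
    return max(0.0, min(1.0, (10.0 + dock_score) / 20.0))


def composite_score(gen_confidence: float, dock_score: float, qed: float) -> float:
    """Score = 0.30 * S_gen + 0.40 * S_dock + 0.30 * S_QED."""
    return W_GEN * gen_confidence + W_DOCK * normalize_dock(dock_score) + W_QED * qed


# Arbitrary illustrative inputs, not results from the VCP demonstration.
for gen, dock, qed in [(0.5, 0.0, 0.5), (0.9, 4.0, 0.8), (0.7, -12.0, 0.6)]:
    print(gen, dock, qed, round(composite_score(gen, dock, qed), 3))
```

Because the weights sum to 1 and each component lies in [0, 1], the composite is also bounded to [0, 1], which makes candidates directly comparable across runs.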

6.5 Pediatric Safety Assessment

Given the platform's focus on pediatric oncology applications, Stage 3 incorporates a six-filter pediatric safety assessment applied to each candidate molecule. Pediatric patients present unique pharmacokinetic challenges: immature hepatic metabolism (CYP3A4 reaches adult levels by age 1--2), developing blood-brain barrier (more permeable in young children), higher body water percentage (affecting hydrophilic drug distribution), cardiac sensitivity (lower threshold for QT prolongation), and renal maturation (GFR reaches adult levels by age 2).

The six molecular safety filters are:

  1. Molecular Weight / BBB Risk -- MW > 500 Da flags compounds that may exhibit limited CNS penetration for pediatric brain tumors (severity: medium).
  2. Lipophilicity / Hepatotoxicity -- LogP > 5 flags compounds with hepatotoxic potential in immature liver (severity: high).
  3. hERG Cardiac Liability -- hERG IC50 < 10 uM flags compounds with QT prolongation risk, to which pediatric patients are more sensitive (severity: critical).
  4. Teratogenicity -- Compounds with known teratogenic risk require pregnancy prevention programs for adolescent patients (severity: high).
  5. TPSA / Oral Bioavailability -- TPSA > 140 Å² flags compounds with limited oral bioavailability, creating pediatric formulation challenges (severity: medium).
  6. Rotatable Bond Flexibility -- More than 10 rotatable bonds flags variable GI absorption in pediatric patients (severity: low).

Candidates with any critical-severity flag are classified as not pediatric-safe. In the VCP demonstration, 71 of 98 valid candidates (72.4%) pass all pediatric safety filters without critical flags, and 34 pass with no flags of any severity.
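
The six filters and the critical-flag rule can be sketched as a single function over per-molecule properties. Thresholds and severities follow the list above; the class, field, and flag names are hypothetical, and the hERG IC50 and teratogenicity inputs are assumed to come from upstream predictors or annotations that are not described in this section.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Flag = Tuple[str, str]  # (filter name, severity)


@dataclass(frozen=True)
class Candidate:
    mol_weight: float              # Da
    logp: float
    herg_ic50_um: Optional[float]  # uM; None if no prediction is available (assumption)
    teratogenic: bool
    tpsa: float                    # Å^2
    rotatable_bonds: int


def pediatric_flags(c: Candidate) -> List[Flag]:
    """Apply the six pediatric safety filters and return any raised flags."""
    flags: List[Flag] = []
    if c.mol_weight > 500:
        flags.append(("mw_bbb_risk", "medium"))
    if c.logp > 5:
        flags.append(("lipophilicity_hepatotoxicity", "high"))
    if c.herg_ic50_um is not None and c.herg_ic50_um < 10:
        flags.append(("herg_cardiac_liability", "critical"))
    if c.teratogenic:
        flags.append(("teratogenicity", "high"))
    if c.tpsa > 140:
        flags.append(("tpsa_oral_bioavailability", "medium"))
    if c.rotatable_bonds > 10:
        flags.append(("rotatable_bond_flexibility", "low"))
    return flags


def is_pediatric_safe(c: Candidate) -> bool:
    """A candidate is not pediatric-safe if any flag has critical severity."""
    return all(severity != "critical" for _, severity in pediatric_flags(c))


# Hypothetical candidates for illustration only.
clean = Candidate(420.0, 3.1, 25.0, False, 90.0, 5)
risky = Candidate(530.0, 5.4, 4.0, False, 150.0, 12)
print(pediatric_flags(clean), is_pediatric_safe(clean))
print(pediatric_flags(risky), is_pediatric_safe(risky))
```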


7. Intelligence Agents

7.1 Agent Architecture (Shared Pattern for Eleven Agents)

The HCLS AI Factory extends beyond the core three-engine pipeline with eleven domain-specialized intelligence agents. All agents share the same architectural pattern:

  • Vector storage: Milvus 2.4 with BGE-small-en-v1.5 embeddings (384-dim, IVF_FLAT, COSINE)
  • RAG pipeline: Multi-collection weighted retrieval with domain-specific query expansion
  • LLM synthesis: Anthropic Claude (with local LLM fallback) for evidence-grounded response generation
  • Configuration: Pydantic BaseSettings with environment variable prefixes per agent
  • API layer: FastAPI with SSE event streaming for real-time workflow progress
  • UI layer: Streamlit with NVIDIA dark theme for interactive clinical exploration
  • Export: Markdown, JSON, PDF, and FHIR R4 DiagnosticReport formats
  • Cross-modal genomics: Read-only access to the shared genomic_evidence collection (3.56M annotated variants from Stage 2)
  • Monitoring: Prometheus-compatible metrics endpoints

Each agent maintains its own domain-specific Milvus collections but shares read-only access to the genomic_evidence collection, enabling cross-modal queries that connect agent-specific knowledge to patient-specific genomic variants. Cross-modal triggers automate inter-agent communication: for example, a Lung-RADS 4A+ imaging finding triggers a query to the genomic evidence collection for EGFR, ALK, ROS1, and KRAS variants.
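
One way to picture the Lung-RADS trigger is as a function that turns an imaging category into a metadata filter for the shared genomic collection. This is a sketch under stated assumptions: "4A+" is read as categories 4A, 4B, and 4X; the scalar field name `gene` is hypothetical; and the returned string follows Milvus boolean-expression syntax for an `in` filter.

```python
from typing import Optional

TRIGGER_CATEGORIES = {"4A", "4B", "4X"}       # assumption: what "4A+" covers
LUNG_GENES = ["EGFR", "ALK", "ROS1", "KRAS"]  # genes named in the text


def genomic_filter_for_finding(lung_rads_category: str) -> Optional[str]:
    """Return a filter expression for genomic_evidence, or None if no trigger fires."""
    if lung_rads_category.strip().upper() not in TRIGGER_CATEGORIES:
        return None
    quoted = ", ".join(f'"{gene}"' for gene in LUNG_GENES)
    return f"gene in [{quoted}]"  # 'gene' is a hypothetical field name


print(genomic_filter_for_finding("4A"))
print(genomic_filter_for_finding("2"))
```

The resulting expression would be passed as the filter of a vector search, so the imaging agent retrieves only patient variants in the genes relevant to the finding.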

Table 7 summarizes all eleven agents.

Table 7. Intelligence agent summary.

| Agent | Collections | Key Capability | Clinical Domain |
|---|---|---|---|
| Precision Oncology | 11 (10+1) | VCF-to-MTB packet generation | Molecular tumor boards |
| CAR-T Intelligence | 11 (10+1) | Cross-functional therapy lifecycle | Cell therapy development |
| Precision Biomarker | 11 (10+1) | Genotype-aware biomarker interpretation | Laboratory medicine |
| Cardiology Intelligence | 13 (12+1) | Guideline-driven CDSS with risk calculators | Cardiovascular medicine |
| Neurology Intelligence | 14 (13+1) | Clinical scale calculators and triage | Neurological disorders |
| Precision Autoimmune | 14 (13+1) | Diagnostic odyssey analysis | Autoimmune disease |
| Rare Disease Diagnostic | 14 (13+1) | ACMG variant interpretation and HPO matching | Rare disease diagnostics |
| Pharmacogenomics Intelligence | 15 (14+1) | Genotype-to-prescribing guidance | Precision prescribing |
| Imaging Intelligence | 11 (10+1) | NIM-integrated imaging workflows | Diagnostic radiology |
| Single-Cell Intelligence | 12 (11+1) | TME profiling and cell type annotation | Translational research |
| Clinical Trial Intelligence | 14 (13+1) | Protocol optimization and patient matching | Clinical operations |

7.2 Oncology & Immunotherapy Agents

7.2.1 Precision Oncology Agent

The Precision Oncology Agent transforms VCF files from Engine 1 into structured Molecular Tumor Board (MTB) packets. It classifies somatic and germline variants against approximately 40 actionable targets using AMP/ASCO/CAP evidence tiers, performs multi-collection RAG across 11 collections (~1,490+ owned vectors plus 3.56M shared genomic vectors), and generates evidence-level-sorted treatment recommendations with resistance awareness and contraindication flags.

Key subsystems include: a case manager for VCF-to-MTB transformation (< 2 sec case creation, 10--30 sec packet generation), a therapy ranker with resistance-aware scoring, a hybrid deterministic + semantic clinical trial matcher, and structured export as Markdown, JSON, PDF, or FHIR R4. The knowledge graph encompasses 40 targets, 30 therapies, 20 resistance mechanisms, 10 pathways, and 15 biomarkers with 50+ entity aliases.

Table 8. Precision Oncology Agent capabilities.

| Capability | Detail |
|---|---|
| Variant classification | AMP/ASCO/CAP evidence tiers (Level 1--4) |
| Data sources | CIViC, OncoKB, ClinicalTrials.gov, PubMed, NCCN/ASCO/ESMO |
| Therapy ranking | Multi-factor scoring with resistance mechanism awareness |
| MTB packet generation | 10--30 seconds from VCF to structured report |
| Export formats | Markdown, JSON, PDF, FHIR R4 |

7.2.2 CAR-T Intelligence Agent

The CAR-T Intelligence Agent supports cross-functional intelligence across the CAR-T cell therapy development lifecycle: target identification, CAR construct design, vector engineering, in vitro/in vivo testing, and clinical development. It indexes 6,266 vectors across 11 Milvus collections (10 owned + 1 read-only genomic_evidence) with weighted retrieval (literature 0.30, trials 0.25, constructs 0.20, assays 0.15, manufacturing 0.10).
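
Weighted multi-collection retrieval can be sketched as a merge step over per-collection hits. The weights below are the five quoted above; how the agent actually combines a weight with a similarity score is not stated, so multiplying them is an assumption, and the hit identifiers and function name are hypothetical.

```python
from typing import Dict, List, Tuple

COLLECTION_WEIGHTS: Dict[str, float] = {
    "literature": 0.30,
    "trials": 0.25,
    "constructs": 0.20,
    "assays": 0.15,
    "manufacturing": 0.10,
}

Hit = Tuple[str, float]  # (document id, cosine similarity)


def merge_weighted(
    hits_by_collection: Dict[str, List[Hit]], top_k: int = 5
) -> List[Tuple[float, str, str]]:
    """Rank hits from all collections by weight * similarity (assumed combination rule)."""
    scored: List[Tuple[float, str, str]] = []
    for collection, hits in hits_by_collection.items():
        weight = COLLECTION_WEIGHTS.get(collection, 0.0)  # unknown collections get no weight
        for doc_id, similarity in hits:
            scored.append((weight * similarity, collection, doc_id))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical hits for illustration only.
hits = {
    "literature": [("lit-1", 0.80)],
    "trials": [("trial-1", 0.90)],
    "manufacturing": [("mfg-1", 0.99)],
}
for score, collection, doc_id in merge_weighted(hits):
    print(f"{score:.3f}", collection, doc_id)
```

Under this rule a moderately similar literature hit can outrank a near-exact manufacturing hit, which is the intended effect of weighting collections by evidentiary priority.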

A comparative analysis mode auto-detects "X vs. Y" queries (e.g., "Compare 4-1BB vs. CD28 costimulatory domains"), performs dual retrieval with per-entity filtering, and produces structured side-by-side analysis. Multi-collection search latency is 12--16 ms; comparative dual retrieval completes in approximately 365 ms. The knowledge graph covers 25 target antigens, 6 FDA-approved products, 8 toxicity profiles, 10 manufacturing processes, 15 biomarkers, and 6 regulatory histories. The agent supports response biomarker tracking across longitudinal treatment courses.
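
Auto-detection of "X vs. Y" queries could be as simple as a regular expression that extracts the two entities before dual retrieval. The pattern and function name below are assumptions made for illustration; the agent's actual detection logic is not described in this paper.

```python
import re
from typing import Optional, Tuple

# Optional leading "compare", then two entities separated by "vs", "vs." or "versus".
_COMPARISON = re.compile(
    r"^(?:compare\s+)?(?P<a>.+?)\s+(?:vs\.?|versus)\s+(?P<b>.+?)[\s?.!]*$",
    re.IGNORECASE,
)


def detect_comparison(query: str) -> Optional[Tuple[str, str]]:
    """Return the two entities of a comparative query, or None for ordinary queries."""
    match = _COMPARISON.match(query.strip())
    if match is None:
        return None
    return match.group("a").strip(), match.group("b").strip()


print(detect_comparison("Compare 4-1BB vs. CD28 costimulatory domains"))
print(detect_comparison("CD19 versus BCMA?"))
print(detect_comparison("What is the role of CD19 in B-cell malignancies?"))
```

Each extracted entity would then drive its own filtered retrieval, and the two result sets would be presented side by side.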

Table 9. CAR-T Intelligence Agent capabilities.

| Capability | Detail |
|---|---|
| Owned vectors | 6,266 across 10 collections |
| Comparative analysis | Auto-detected dual retrieval (~365 ms) |
| Knowledge graph | 25 antigens, 6 products, 39+ entity aliases |
| Data sources | PubMed (5,047), ClinicalTrials.gov (973), FDA constructs (6) |
| Response biomarker tracking | Longitudinal treatment response monitoring |

7.2.3 Precision Biomarker Agent

The Precision Biomarker Agent provides genotype-aware biomarker interpretation, biological age estimation, disease trajectory detection, and pharmacogenomic profiling. It maintains 11 Milvus collections (10 domain-specific + 1 shared genomic evidence) containing approximately 320 reference vectors across biomarker definitions, genetic variants, pharmacogenomic rules, disease trajectories, clinical evidence, nutrition guidelines, drug interactions, aging markers, genotype adjustments, and monitoring protocols.

Seven analysis modules operate in concert: (1) Biological Age Engine computing PhenoAge and GrimAge surrogates from 9 routine blood biomarkers; (2) Disease Trajectory Analyzer detecting pre-symptomatic progression across 6 disease categories with genotype-stratified thresholds; (3) Pharmacogenomic Mapper interpreting star alleles for 7 pharmacogenes (CYP2D6, CYP2C19, CYP2C9, CYP3A5, SLCO1B1, VKORC1, MTHFR) plus HLA-B*57:01; (4) Genotype Adjustment Engine modifying standard reference ranges based on patient genotype; (5) RAG Evidence Engine for cross-collection semantic search; (6) Nutrition Advisor for genotype-aware supplementation; and (7) Report Generator producing 12-section clinical reports with PDF and FHIR R4 export.

Table 10. Precision Biomarker Agent capabilities.

| Capability | Detail |
|---|---|
| Biological age models | PhenoAge (Levine 2018), GrimAge surrogates |
| Pharmacogene coverage | 7 genes + HLA-B*57:01 |
| Disease trajectory models | 6 disease categories |
| Genotype adjustments | PNPLA3, TCF7L2, APOE, and others |
| Report sections | 12-section clinical report |

7.3 Specialty Medicine Agents

7.3.1 Cardiology Intelligence Agent

The Cardiology Intelligence Agent provides RAG-powered cardiovascular clinical decision support synthesizing cardiac imaging, electrophysiology, hemodynamics, heart failure management, valvular disease, preventive cardiology, interventional data, and cardio-oncology surveillance into guideline-aligned clinical recommendations. It maintains 13 Milvus collections (12 domain-specific + 1 shared genomic evidence) covering literature, trials, imaging, electrophysiology, heart failure, valvular disease, prevention, interventional, cardio-oncology, devices, guidelines, and hemodynamics.

The knowledge graph encompasses 45 cardiovascular conditions, 29 biomarkers, 32 drug classes, 56 genes, 15 imaging modalities, and 51 guideline recommendations from ACC/AHA/ESC/HRS sources. Eleven clinical workflows span coronary artery disease assessment, heart failure classification and GDMT optimization, valvular heart disease quantification, arrhythmia detection and management, cardiac MRI tissue characterization, stress test interpretation, preventive risk stratification, cardio-oncology surveillance, acute decompensated heart failure, post-MI care, and myocarditis/pericarditis assessment.

Six validated risk calculators are integrated: ASCVD (Pooled Cohort Equations), HEART Score for chest pain risk stratification, CHA2DS2-VASc for atrial fibrillation stroke risk, HAS-BLED for anticoagulation bleeding risk, MAGGIC for heart failure mortality, and EuroSCORE II for cardiac surgical mortality.
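
To give a sense of what such a calculator involves, the sketch below implements CHA2DS2-VASc using its standard published point assignments. It is an illustration only, not the agent's implementation, and the function and parameter names are hypothetical.

```python
def cha2ds2_vasc(
    age: int,
    female: bool,
    chf: bool,
    hypertension: bool,
    diabetes: bool,
    stroke_tia: bool,
    vascular_disease: bool,
) -> int:
    """Compute the CHA2DS2-VASc stroke-risk score (0-9) for atrial fibrillation."""
    score = 0
    score += 1 if chf else 0               # C: congestive heart failure
    score += 1 if hypertension else 0      # H: hypertension
    if age >= 75:                          # A2: age >= 75 scores 2
        score += 2
    elif age >= 65:                        # A: age 65-74 scores 1
        score += 1
    score += 1 if diabetes else 0          # D: diabetes mellitus
    score += 2 if stroke_tia else 0        # S2: prior stroke/TIA/thromboembolism
    score += 1 if vascular_disease else 0  # V: vascular disease
    score += 1 if female else 0            # Sc: sex category (female)
    return score


# Example: a 78-year-old woman with hypertension and diabetes.
print(cha2ds2_vasc(age=78, female=True, chf=False, hypertension=True,
                   diabetes=True, stroke_tia=False, vascular_disease=False))
```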

Table 11. Cardiology Intelligence Agent capabilities.

| Capability | Detail |
|---|---|
| Collections | 13 (12 domain + 1 genomic) |
| Clinical workflows | 11 (CAD, HF/GDMT, valvular, arrhythmia, CMR, stress, prevention, cardio-onc, ADHF, post-MI, myocarditis) |
| Risk calculators | 6 (ASCVD, HEART, CHA2DS2-VASc, HAS-BLED, MAGGIC, EuroSCORE II) |
| Knowledge graph | 45 conditions, 29 biomarkers, 32 drug classes, 56 genes |
| Guidelines | ACC/AHA, ESC, HRS with 51 recommendations |

7.3.2 Neurology Intelligence Agent

The Neurology Intelligence Agent provides clinical decision support across 10+ neurological disease domains with 8 structured clinical workflows: acute stroke triage, dementia evaluation, epilepsy classification, brain tumor grading, MS monitoring, Parkinson's assessment, headache classification, and neuromuscular evaluation. It maintains 14 Milvus collections (13 domain-specific + 1 shared genomic evidence) spanning literature, trials, imaging, electrophysiology, degenerative diseases, cerebrovascular disease, epilepsy, neuro-oncology, multiple sclerosis, movement disorders, headache disorders, neuromuscular diseases, and clinical guidelines.

Ten validated clinical scale calculators are integrated: NIHSS (stroke severity), GCS (consciousness), MoCA (cognitive screening), MDS-UPDRS Part III (Parkinson's motor), EDSS (MS disability), mRS (functional outcome), HIT-6 (headache impact), ALSFRS-R (ALS function), ASPECTS (stroke imaging), and Hoehn-Yahr (Parkinson's staging). The knowledge base covers 42 drugs, 38+ genes, 55 imaging protocols, 35 EEG patterns, 15 neurodegenerative diseases, 12 epilepsy syndromes, 6 stroke protocols, and 8 headache classifications, sourced from AAN, AHA/ASA, ILAE, ICHD-3, WHO CNS 2021, McDonald 2017, and MDS criteria.
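
As an example of the simplest of these scales, the Glasgow Coma Scale sums three component scores. The sketch below uses the standard component ranges and the conventional severity bands (13-15 mild, 9-12 moderate, 8 or below severe); it is illustrative only, and the function name is hypothetical.

```python
from typing import Tuple


def glasgow_coma_scale(eye: int, verbal: int, motor: int) -> Tuple[int, str]:
    """Return (total, severity band) for GCS components E (1-4), V (1-5), M (1-6)."""
    limits = {"eye": (eye, 4), "verbal": (verbal, 5), "motor": (motor, 6)}
    for name, (value, maximum) in limits.items():
        if not 1 <= value <= maximum:
            raise ValueError(f"{name} score must be between 1 and {maximum}, got {value}")
    total = eye + verbal + motor  # ranges from 3 to 15
    if total >= 13:
        severity = "mild"
    elif total >= 9:
        severity = "moderate"
    else:
        severity = "severe"
    return total, severity


print(glasgow_coma_scale(4, 5, 6))
print(glasgow_coma_scale(2, 3, 5))
print(glasgow_coma_scale(1, 1, 1))
```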

Table 12. Neurology Intelligence Agent capabilities.

| Capability | Detail |
|---|---|
| Collections | 14 (13 domain + 1 genomic) |
| Clinical workflows | 8 (stroke, dementia, epilepsy, tumor, MS, Parkinson's, headache, neuromuscular) |
| Clinical scales | 10 (NIHSS, GCS, MoCA, MDS-UPDRS III, EDSS, mRS, HIT-6, ALSFRS-R, ASPECTS, Hoehn-Yahr) |
| Knowledge graph | 10+ disease domains, 42 drugs, 38+ genes, 55 imaging protocols |
| Guidelines | AAN, AHA/ASA, ILAE, ICHD-3, WHO CNS 2021, McDonald 2017, MDS |
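
As an illustration of what a clinical scale calculator involves, here is a minimal sketch of the Glasgow Coma Scale (GCS), one of the ten scales listed. The function name and return shape are assumptions for this example; the severity bands are the conventional head-injury groupings.

```python
def gcs_total(eye, verbal, motor):
    """Sum the three GCS components and return (total, severity band).

    Component ranges: eye 1-4, verbal 1-5, motor 1-6 (total 3-15).
    """
    if not (1 <= eye <= 4 and 1 <= verbal <= 5 and 1 <= motor <= 6):
        raise ValueError("GCS component out of range")
    total = eye + verbal + motor
    if total <= 8:
        band = "severe"
    elif total <= 12:
        band = "moderate"
    else:
        band = "mild"
    return total, band


print(gcs_total(eye=3, verbal=4, motor=5))
```

Rejecting out-of-range components up front matters here, since a silently clamped value would produce a plausible but wrong total.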

7.3.3 Precision Autoimmune Agent

The Precision Autoimmune Agent provides multi-collection, RAG-powered clinical decision support for autoimmune disease analysis. It interprets autoantibody panels, HLA typing, biomarker trends, and genomic data to produce integrated assessments that include disease activity scoring, flare prediction, and biologic therapy recommendations with pharmacogenomic context. The agent is designed to surface diagnostic patterns across fragmented clinical records spanning years of multi-specialist visits, addressing the diagnostic odyssey that autoimmune patients commonly face.

The agent maintains 14 Milvus collections (13 domain-specific + 1 shared genomic evidence) with configurable relevance weights, covering clinical documents, patient labs, autoantibody panels, HLA associations, disease criteria, disease activity, flare patterns, biologic therapies, pharmacogenomic rules, clinical trials, literature, patient timelines, and cross-disease mechanisms. It supports 13 autoimmune conditions: rheumatoid arthritis, systemic lupus erythematosus, multiple sclerosis, type 1 diabetes, inflammatory bowel disease, psoriasis/psoriatic arthritis, ankylosing spondylitis, Sjögren's syndrome, systemic sclerosis, myasthenia gravis, celiac disease, Graves' disease, and Hashimoto's thyroiditis, plus POTS/hEDS/MCAS triad detection and overlap syndrome recognition.

Key capabilities include: autoantibody panel interpretation with sensitivity/specificity data for 14+ antibody types, HLA association analysis against 50+ allele-disease associations with odds ratios, disease activity scoring (DAS28-CRP, SLEDAI-2K, CDAI, BASDAI), flare prediction with biomarker pattern analysis, biologic therapy recommendations with pharmacogenomic context, and diagnostic odyssey analysis across fragmented multi-specialist records.

Table 13. Precision Autoimmune Agent capabilities.

| Capability | Detail |
|---|---|
| Collections | 14 (13 domain + 1 genomic) |
| Autoimmune conditions | 13 diseases + POTS/hEDS/MCAS triad |
| Autoantibody panels | 14+ antibody types with sensitivity/specificity |
| HLA associations | 50+ allele-disease pairs with odds ratios |
| Disease activity scores | DAS28-CRP, SLEDAI-2K, CDAI, BASDAI |
| Flare prediction | Biomarker pattern analysis (CRP, ESR, IL-6, C3/C4, calprotectin) |
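
Disease activity scoring can be illustrated with DAS28-CRP, one of the four indices listed. The sketch below uses the standard published formula; the function names are illustrative. The activity cutoffs are those conventionally quoted for DAS28 and are applied here as an assumption, since the source does not state which thresholds the agent uses.

```python
import math


def das28_crp(tender_28, swollen_28, crp_mg_l, patient_global_mm):
    """DAS28-CRP from tender/swollen joint counts (0-28), CRP in mg/L,
    and patient global health on a 0-100 mm visual analogue scale."""
    return (0.56 * math.sqrt(tender_28)
            + 0.28 * math.sqrt(swollen_28)
            + 0.36 * math.log(crp_mg_l + 1)
            + 0.014 * patient_global_mm
            + 0.96)


def activity_band(score):
    """Map a DAS28 score to a conventional activity category (assumed cutoffs)."""
    if score < 2.6:
        return "remission"
    if score <= 3.2:
        return "low"
    if score <= 5.1:
        return "moderate"
    return "high"


score = das28_crp(tender_28=4, swollen_28=4, crp_mg_l=9.0, patient_global_mm=50)
print(round(score, 2), activity_band(score))
```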

7.4 Diagnostics & Genomics Agents

7.4.1 Rare Disease Diagnostic Agent

The Rare Disease Diagnostic Agent provides differential diagnosis, ACMG/AMP variant interpretation, HPO-based phenotype matching, therapeutic option search, and clinical trial eligibility assessment. It maintains 14 Milvus collections (13 domain-specific + 1 shared genomic evidence) covering phenotypes, diseases, genes, variants, literature, trials, therapies, case reports, guidelines, pathways, registries, natural history, and newborn screening.

The knowledge base spans 13 disease categories encompassing 88 rare diseases across metabolic (28), neurological (23), connective tissue (10), hematologic (15), and immunologic (13) domains, plus 8 cancer predisposition syndromes. The agent implements 23 ACMG classification criteria for variant interpretation, 9 diagnostic algorithms, and tracks 12 approved gene therapies. Twenty-three HPO top-level terms enable structured phenotype-driven differential diagnosis with information-content-weighted similarity scoring. Export formats include Markdown, JSON, PDF, FHIR R4 DiagnosticReport, and GA4GH Phenopacket v2 for rare disease data exchange.

Table 14. Rare Disease Diagnostic Agent capabilities.

| Capability | Detail |
|---|---|
| Collections | 14 (13 domain + 1 genomic) |
| Disease categories | 13 covering 88 rare diseases |
| ACMG criteria | 23 classification criteria |
| Gene therapies tracked | 12 approved/recent |
| HPO terms | 23 top-level terms for phenotype matching |
| Export formats | Markdown, JSON, PDF, FHIR R4, GA4GH Phenopacket v2 |
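
The idea behind information-content-weighted phenotype matching can be sketched with a Resnik-style similarity, one common information-content measure; the source does not say which measure the agent uses. The ontology, term names, and disease annotations below are toy data invented for the example, not real HPO content.

```python
import math

# Toy is-a hierarchy (child -> parents). Names are illustrative, not HPO IDs.
PARENTS = {
    "root": [],
    "neuro": ["root"],
    "seizure": ["neuro"],
    "ataxia": ["neuro"],
    "cardiac": ["root"],
    "arrhythmia": ["cardiac"],
}

# Hypothetical disease-to-phenotype annotations.
ANNOTATIONS = {
    "disease_A": {"seizure", "ataxia"},
    "disease_B": {"seizure"},
    "disease_C": {"arrhythmia"},
    "disease_D": {"ataxia", "arrhythmia"},
}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(t) = -log(fraction of diseases annotated with t or a descendant).

    Assumes every term annotates at least one disease, as in the toy data.
    """
    hits = sum(1 for terms in ANNOTATIONS.values()
               if any(term in ancestors(t) for t in terms))
    return -math.log(hits / len(ANNOTATIONS))


def resnik(t1, t2):
    """IC of the most informative common ancestor of two terms."""
    return max(information_content(t) for t in ancestors(t1) & ancestors(t2))


def phenotype_score(patient_terms, disease_terms):
    """Average, over patient terms, of the best Resnik match in the disease."""
    best = [max(resnik(p, d) for d in disease_terms) for p in patient_terms]
    return sum(best) / len(best)


patient = {"seizure", "ataxia"}
ranking = sorted(ANNOTATIONS,
                 key=lambda d: phenotype_score(patient, ANNOTATIONS[d]),
                 reverse=True)
print(ranking[0])
```

Rare, specific terms carry high information content, so a shared specific phenotype contributes more to the ranking than a shared broad category.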

7.4.2 Pharmacogenomics Intelligence Agent

The Pharmacogenomics Intelligence Agent translates patient genotype data into actionable prescribing guidance using CPIC/DPWG guidelines, PharmGKB annotations, FDA labeling, and published clinical evidence. It maintains 15 Milvus collections (14 domain-specific + 1 shared genomic evidence) covering gene references, drug guidelines, drug interactions, HLA hypersensitivity, phenoconversion, dosing algorithms, clinical evidence, population data, clinical trials, FDA labels, drug alternatives, patient profiles, implementation programs, and educational resources.

The knowledge base encompasses 25 pharmacogene entries with comprehensive star allele definitions, 5 metabolizer phenotype classifications, 12 therapeutic drug categories, 12 HLA-drug hypersensitivity associations, 30+ gene-phenotype-specific drug substitutions, and 80+ entity aliases. Key analytical capabilities include: genotype-to-phenotype translation with star allele interpretation, CPIC/DPWG guideline-based dosing recommendations, phenoconversion detection (metabolic phenotype alteration via drug-drug interactions), HLA-mediated adverse drug reaction screening, population-specific allele frequency analysis, and genotype-guided therapeutic alternative identification.

Table 15. Pharmacogenomics Intelligence Agent capabilities.

| Capability | Detail |
|---|---|
| Collections | 15 (14 domain + 1 genomic) |
| Pharmacogenes | 25 gene entries with star allele definitions |
| HLA screening | 12 HLA-drug hypersensitivity associations |
| Dosing algorithms | CPIC/DPWG genotype-guided dosing |
| Phenoconversion | Metabolic phenotype alteration detection |
| Data sources | CPIC, PharmVar, PharmGKB, FDA, PubMed, ClinicalTrials.gov |
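
Genotype-to-phenotype translation and phenoconversion can be sketched for a single gene. The example below uses CYP2D6 with a small subset of CPIC-style allele activity values and activity-score bands. The inhibitor multipliers (0.5 for moderate, 0 for strong) follow one commonly described convention and are an assumption here, as are all names; the agent's actual tables and logic are not shown in the source.

```python
# Illustrative subset of CYP2D6 allele activity values (CPIC-style).
ALLELE_ACTIVITY = {"*1": 1.0, "*2": 1.0, "*10": 0.25, "*41": 0.5, "*4": 0.0, "*5": 0.0}

# Assumed multipliers applied to the activity score for concomitant inhibitors.
INHIBITOR_FACTOR = {"none": 1.0, "moderate": 0.5, "strong": 0.0}


def phenotype_from_score(score):
    """Map a CYP2D6 activity score to a metabolizer phenotype."""
    if score == 0:
        return "poor metabolizer"
    if score <= 1.0:
        return "intermediate metabolizer"
    if score <= 2.25:
        return "normal metabolizer"
    return "ultrarapid metabolizer"


def cyp2d6_phenotype(diplotype, inhibitor="none"):
    """Return (genetic phenotype, phenotype after phenoconversion)."""
    allele_1, allele_2 = diplotype.split("/")
    genetic_score = ALLELE_ACTIVITY[allele_1] + ALLELE_ACTIVITY[allele_2]
    effective_score = genetic_score * INHIBITOR_FACTOR[inhibitor]
    return phenotype_from_score(genetic_score), phenotype_from_score(effective_score)


print(cyp2d6_phenotype("*1/*41", inhibitor="moderate"))
```

Returning both phenotypes keeps the genetic result visible alongside the drug-adjusted one, which is the distinction phenoconversion detection is meant to surface.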

7.4.3 Imaging Intelligence Agent

The Imaging Intelligence Agent provides automated detection, segmentation, longitudinal tracking, and clinical triage across four reference workflows: CT Head Hemorrhage Triage (SegResNet), CT Chest Lung Nodule Tracking (RetinaNet + SegResNet, Lung-RADS classification), CXR Rapid Findings (DenseNet-121, CheXpert pretrained), and MRI Brain MS Lesion Tracking (UNEST).

The agent integrates four NVIDIA NIM microservices: VISTA-3D (port 8530, 132 anatomical segmentation classes), MAISI (port 8531, synthetic CT generation), VILA-M3 (port 8532, vision-language radiology), and Llama-3 8B (port 8520, clinical reasoning). It maintains 11 Milvus collections (10 domain-specific + 1 shared genomic evidence) containing 2,814 imaging-specific vectors plus read-only access to 3.56M genomic vectors. The knowledge graph spans 15 pathologies, 8 imaging modalities, and 15 anatomy regions. Cross-modal genomics integration automatically queries genomic variants (EGFR, ALK, ROS1, KRAS) when high-risk imaging findings (Lung-RADS 4A+) are detected.

Table 16. Imaging Intelligence Agent capabilities.

| Capability | Detail |
|---|---|
| Collections | 11 (10 domain + 1 genomic), 2,814 owned vectors |
| Clinical workflows | 4 (CT head hemorrhage, CT lung nodule, CXR rapid findings, MRI brain MS) |
| NVIDIA NIMs | VISTA-3D, MAISI, VILA-M3, Llama-3 8B |
| Knowledge graph | 15 pathologies, 8 modalities, 15 anatomy regions |
| Cross-modal triggers | Lung-RADS 4A+ triggers genomic EGFR/ALK/ROS1/KRAS query |
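
The cross-modal trigger rule can be sketched as a simple threshold on the Lung-RADS category. The function name and the category ordering (with 4X treated as at least 4A) are assumptions for illustration; the gene list comes from the text above.

```python
# Lung-RADS categories in increasing order of suspicion (ordering is an assumption
# for this sketch; 4X is treated as at least as suspicious as 4B).
LUNG_RADS_ORDER = ["0", "1", "2", "3", "4A", "4B", "4X"]
TRIGGER_GENES = ["EGFR", "ALK", "ROS1", "KRAS"]


def genes_to_query(lung_rads):
    """Return the genes to look up when a finding is Lung-RADS 4A or higher."""
    rank = LUNG_RADS_ORDER.index(lung_rads)  # raises ValueError on unknown input
    if rank >= LUNG_RADS_ORDER.index("4A"):
        return list(TRIGGER_GENES)
    return []


print(genes_to_query("4B"))
print(genes_to_query("3"))
```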

7.4.4 Single-Cell Intelligence Agent

The Single-Cell Intelligence Agent provides RAG-powered single-cell genomics analysis for translational research and clinical decision support. It maintains 12 Milvus collections (11 domain-specific + 1 shared genomic evidence) covering cell types, markers, spatial data, tumor microenvironment profiles, drug response predictions, literature, analytical methods, reference datasets, trajectories, pathways, and clinical correlations.

The knowledge base encompasses 57 cell types with Cell Ontology identifiers, 4 tumor microenvironment profiles (hot, cold, excluded, immunosuppressive) with clinical trial correlations, 30 drugs with IC50 ranges from GDSC/DepMap, 4 spatial transcriptomics platforms (Visium, MERFISH, Xenium, CosMx), 75 marker genes, 10 immune signatures, 3 foundation models (scGPT, Geneformer, scFoundation), 25 ligand-receptor pairs, and 12 cancer TME atlas profiles. Ten analytical workflows span cell type annotation, TME profiling, drug response prediction, subclonal architecture analysis, spatial niche mapping, trajectory inference, ligand-receptor interaction analysis, biomarker discovery, CAR-T target validation, and treatment monitoring.

Table 17. Single-Cell Intelligence Agent capabilities.

| Capability | Detail |
|---|---|
| Collections | 12 (11 domain + 1 genomic) |
| Cell types | 57 with Cell Ontology IDs |
| TME profiles | 4 (hot/cold/excluded/immunosuppressive) |
| Spatial platforms | 4 (Visium, MERFISH, Xenium, CosMx) |
| Analytical workflows | 10 (annotation, TME, drug response, subclonal, spatial, trajectory, L-R, biomarker, CAR-T, monitoring) |
| Data sources | Human Cell Atlas, CellMarker 2.0, GDSC, DepMap, CellPhoneDB, NicheNet |

7.5 Clinical Operations

7.5.1 Clinical Trial Intelligence Agent

The Clinical Trial Intelligence Agent provides AI-driven protocol optimization, patient-trial matching, site selection, eligibility optimization, adaptive design evaluation, safety signal detection, regulatory document generation, competitive intelligence, diversity assessment, and decentralized trial planning. It maintains 14 Milvus collections (13 domain-specific + 1 shared genomic evidence) covering protocols, eligibility criteria, endpoints, sites, investigators, results, regulatory documents, literature, biomarkers, safety data, real-world evidence, adaptive designs, and guidelines.

The knowledge base encompasses 13 therapeutic areas, 7 trial phases, 9 regulatory agencies (FDA, EMA, PMDA, and others), 9 endpoint types, 9 adaptive design frameworks, 9 biomarker strategies, 9 decentralized trial components, 40 landmark trials, and 6 safety signal metrics. Ten structured workflows plus general RAG Q&A cover protocol design (complexity scoring, SOA review), patient matching (genomic/biomarker integration), site selection (feasibility scoring, enrollment forecasting), eligibility optimization (population impact modeling), adaptive design evaluation (Bayesian interim analysis), safety signal detection (PRR/ROR disproportionality analysis), regulatory document generation (IND, CSR, briefing documents for FDA/EMA/PMDA), competitive intelligence (landscape analysis), diversity assessment (FDA guidance compliance), and decentralized trial planning (DCT component feasibility).

Table 18. Clinical Trial Intelligence Agent capabilities.

| Capability | Detail |
|---|---|
| Collections | 14 (13 domain + 1 genomic) |
| Workflows | 10 + general RAG Q&A |
| Regulatory agencies | 9 (FDA, EMA, PMDA, and others) |
| Adaptive designs | 9 frameworks (Bayesian, dose-response, futility) |
| Document generation | IND, CSR, briefing documents |
| Safety analysis | PRR/ROR disproportionality, causality assessment |
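
PRR/ROR disproportionality analysis reduces to arithmetic on a 2x2 contingency table of reports. The sketch below uses the standard definitions and a Wald 95% confidence interval for the ROR; the function name and the example counts are made up for illustration.

```python
import math


def disproportionality(a, b, c, d):
    """Compute PRR, ROR, and a 95% CI for the ROR from a 2x2 report table.

    a: drug of interest, event of interest
    b: drug of interest, all other events
    c: all other drugs, event of interest
    d: all other drugs, all other events
    """
    prr = (a / (a + b)) / (c / (c + d))
    ror = (a * d) / (b * c)
    se_log_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(ror) - 1.96 * se_log_ror),
          math.exp(math.log(ror) + 1.96 * se_log_ror))
    return prr, ror, ci


prr, ror, ci = disproportionality(a=20, b=80, c=100, d=9800)
print(round(prr, 1), round(ror, 1), tuple(round(x, 1) for x in ci))
```

A signal is usually considered only when the lower confidence bound exceeds 1, which is why the interval is returned alongside the point estimates.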

8. Evaluation

8.1 End-to-End Pipeline Timing

Table 19 presents the end-to-end timing for a complete pipeline run on DGX Spark with the VCP/FTD demonstration case.

Table 19. End-to-end pipeline timing (VCP/FTD demonstration).

| Stage | Step | Duration | GPU Utilization | Peak Memory |
|---|---|---|---|---|
| 1 | BWA-MEM2 alignment (fq2bam) | 34 min | 82% | 38 GB |
| 1 | DeepVariant variant calling | 22 min | 91% | 54 GB |
| 2 | Variant annotation | 18 min | 15% (CPU-bound) | 12 GB |
| 2 | Milvus indexing | 24 min | 35% | 22 GB |
| 2 | RAG/Chat (interactive session) | 45 min | 5% | 8 GB |
| 3 | Structure retrieval | 2 min | 0% (network I/O) | 2 GB |
| 3 | MolMIM generation | 2 min 14 sec | 78% | 18 GB |
| 3 | DiffDock docking | 8 min 42 sec | 85% | 24 GB |
| 3 | Pediatric safety assessment | < 1 sec | 0% (CPU) | < 1 GB |
| 3 | Scoring + reporting | 1 min 30 sec | 0% (CPU) | 4 GB |
| Total | | ~4 hr 12 min | | |

All timings are wall-clock measurements on DGX Spark with Ubuntu 22.04 LTS. The total pipeline time of 4 hours 12 minutes falls well within the 5-hour target, with the interactive RAG/Chat session (45 minutes of researcher exploration) being the primary variable component.
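
For readers who want to check the stage budget, the itemized durations in Table 19 can be summed directly, as in the sketch below. The itemized steps add up to roughly 2 hr 37 min, so the reported total of ~4 hr 12 min presumably also includes time that is not itemized in the table (for example data staging or waits between stages); that reading is an assumption, since the source does not break down the remainder.

```python
from datetime import timedelta

# Step durations copied from Table 19 ("< 1 sec" is counted as zero).
STEPS = {
    "BWA-MEM2 alignment (fq2bam)": timedelta(minutes=34),
    "DeepVariant variant calling": timedelta(minutes=22),
    "Variant annotation": timedelta(minutes=18),
    "Milvus indexing": timedelta(minutes=24),
    "RAG/Chat (interactive session)": timedelta(minutes=45),
    "Structure retrieval": timedelta(minutes=2),
    "MolMIM generation": timedelta(minutes=2, seconds=14),
    "DiffDock docking": timedelta(minutes=8, seconds=42),
    "Pediatric safety assessment": timedelta(0),
    "Scoring + reporting": timedelta(minutes=1, seconds=30),
}

itemized_total = sum(STEPS.values(), timedelta())
reported_total = timedelta(hours=4, minutes=12)

print("itemized:", itemized_total)
print("not itemized:", reported_total - itemized_total)
```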

8.2 Variant Calling Accuracy

DeepVariant achieves >99% concordance with the GIAB HG002 truth set, validated by hap.py benchmarking. The VCF contains 11,724,891 total variants with a transition/transversion ratio of 2.07, consistent with high-quality 30x WGS of an Ashkenazi ancestry sample. Quality filtering (QUAL > 30) retains 3,487,216 variants (29.7%), with 35,616 in coding regions suitable for clinical annotation.

8.3 RAG Query Quality

The RAG pipeline produces structured target hypotheses with explicit evidence chains. For the VCP/FTD demonstration, Claude correctly identifies:

  • Genomic evidence: rs188935092 at chr9:35065263 (G>A), heterozygous, QUAL=892
  • Clinical evidence: ClinVar Pathogenic classification (expert panel review)
  • AI prediction: AlphaMissense score 0.87 (pathogenic threshold > 0.564)
  • Functional annotation: VEP missense_variant, MODERATE impact, D2 ATPase domain
  • Druggability: Known drug target with CB-5083 Phase I clinical trial precedent
  • Structural evidence: 4 PDB structures available, including inhibitor-bound 5FTK (2.3 Å)

Vector search (Milvus, nprobe=16, top-k=20) returns relevant contexts with cosine similarity scores of 0.74--0.90 on demonstration queries. Search latency is 12 ms; full RAG query latency including Claude synthesis is approximately 24 seconds.
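
The retrieval step ranks stored embeddings by cosine similarity to the query embedding and keeps the top k. The sketch below shows that ranking as a brute-force pure-Python loop over a toy corpus; it is not the Milvus API and does not model the approximate index that `nprobe` tunes. All names and vectors are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def top_k(query, corpus, k=20):
    """Return the k (score, doc_id) pairs with the highest cosine similarity."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in corpus.items()]
    return sorted(scored, reverse=True)[:k]


# Toy 2-dimensional "embeddings"; real embeddings have hundreds of dimensions.
corpus = {"variant_a": [1.0, 0.0], "variant_b": [0.0, 1.0], "variant_c": [1.0, 1.0]}
hits = top_k([1.0, 0.0], corpus, k=2)
print(hits)
```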

8.4 Drug Candidate Quality (VCP/FTD Case Study)

Table 20 compares the top AI-generated candidate against the CB-5083 seed compound.

Table 20. VCP drug candidate comparison: AI-generated top candidate vs. CB-5083 seed.

| Metric | CB-5083 (Seed) | Top AI Candidate | Improvement |
|---|---|---|---|
| Docking score | -8.1 kcal/mol | -11.4 kcal/mol | +41% binding affinity |
| QED (drug-likeness) | 0.62 | 0.81 | +31% |
| Molecular weight | 487.2 Da | 423.5 Da | -13% (improved oral absorption) |
| Composite score | 0.64 | 0.89 | +39% overall |
| Lipinski compliance | PASS | PASS | -- |

Of the 100 generated molecules, 98 pass chemistry QC (RDKit valence checks). Of those 98 valid molecules, 87 pass Lipinski's Rule of Five (88.8%), 72 have QED > 0.67 (73.5%), and 34 achieve excellent docking scores below -8.0 kcal/mol. The top 10 candidates have docking scores ranging from -8.2 to -11.4 kcal/mol and composite scores of 0.74--0.89.
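
The improvement column in Table 20 and the pass rates above are simple ratios, reproduced in the sketch below as a sanity check. Variable names are illustrative. For the docking score, a more negative value is more favorable, so a relative change of -41% corresponds to the "+41%" reported in the table.

```python
# (seed value, top-candidate value) pairs copied from Table 20.
METRICS = {
    "docking_kcal_mol": (-8.1, -11.4),
    "qed": (0.62, 0.81),
    "molecular_weight_da": (487.2, 423.5),
    "composite": (0.64, 0.89),
}

# Relative change of the candidate versus the seed, in whole percent.
changes = {
    name: round(100 * (candidate - seed) / abs(seed))
    for name, (seed, candidate) in METRICS.items()
}

# Pass rates are computed over the 98 molecules that passed chemistry QC.
valid = 98
lipinski_rate = round(100 * 87 / valid, 1)
qed_rate = round(100 * 72 / valid, 1)

print(changes)
print(lipinski_rate, qed_rate)
```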

These results are computational predictions---promising starting points for laboratory validation, not finished therapeutics. Real drug development requires synthesis, in vitro assays, in vivo models, clinical trials, and regulatory approval. The platform's contribution is collapsing the initial target identification and lead generation phase from months to hours.

8.5 Pediatric Safety Assessment

The six-filter pediatric safety assessment was applied to all 98 valid candidates from the VCP demonstration. Table 21 summarizes the results.

Table 21. Pediatric safety filter results (VCP demonstration, n=98).

| Filter | Threshold | Flagged | Severity |
|---|---|---|---|
| Molecular weight / BBB risk | MW > 500 Da | 11 (11.2%) | Medium |
| Lipophilicity / hepatotoxicity | LogP > 5 | 7 (7.1%) | High |
| hERG cardiac liability | IC50 < 10 µM | 3 (3.1%) | Critical |
| Teratogenicity | Known risk | 0 (0%) | High |
| TPSA / oral bioavailability | TPSA > 140 Å² | 9 (9.2%) | Medium |
| Rotatable bond flexibility | > 10 bonds | 14 (14.3%) | Low |

Of 98 candidates, 71 (72.4%) pass all filters without critical flags and are classified as pediatric-safe. Thirty-four candidates (34.7%) pass with no flags of any severity. Three candidates (3.1%) are classified as not pediatric-safe due to critical hERG cardiac liability flags. The pediatric safety assessment adds negligible computational overhead (< 1 second for all 98 candidates) and integrates directly into the composite ranking pipeline.
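
The six filters in Table 21 are threshold checks on per-molecule descriptors, which is why they add so little runtime. The sketch below applies the table's thresholds to a dictionary of precomputed descriptors. The descriptor keys, function name, and example values are hypothetical, and the sketch reports only the raised flags; it does not reproduce the platform's final pediatric-safe classification, whose exact rule is not given in the source.

```python
# (filter name, predicate over a descriptor dict, severity) from Table 21.
FILTERS = [
    ("Molecular weight / BBB risk", lambda m: m["mw"] > 500, "Medium"),
    ("Lipophilicity / hepatotoxicity", lambda m: m["logp"] > 5, "High"),
    ("hERG cardiac liability", lambda m: m["herg_ic50_um"] < 10, "Critical"),
    ("Teratogenicity", lambda m: m["known_teratogen"], "High"),
    ("TPSA / oral bioavailability", lambda m: m["tpsa"] > 140, "Medium"),
    ("Rotatable bond flexibility", lambda m: m["rotatable_bonds"] > 10, "Low"),
]


def assess(molecule):
    """Return the raised flags and whether any of them is Critical."""
    flags = [(name, severity) for name, check, severity in FILTERS if check(molecule)]
    return {"flags": flags, "critical": any(sev == "Critical" for _, sev in flags)}


# Hypothetical descriptor values for two candidates.
clean = {"mw": 423.5, "logp": 3.1, "herg_ic50_um": 25.0,
         "known_teratogen": False, "tpsa": 95.0, "rotatable_bonds": 6}
risky = dict(clean, mw=520.0, herg_ic50_um=4.0)

print(assess(clean))
print(assess(risky))
```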

8.6 Test Coverage (158 Files, 11 Agents)

Table 22 summarizes the automated test suite across all eleven intelligence agents and the core platform.

Table 22. Intelligence agent and platform test coverage.

| Agent | Test Files | Key Coverage Areas |
|---|---|---|
| CAR-T Intelligence | 7 | Models, knowledge, query expansion, RAG, export, integration |
| Imaging Intelligence | 11 | NIM clients, cross-modal, export, DICOM, workflows, RAG, query expansion |
| Precision Oncology | 9 | Collections, agent, case manager, trial matcher, therapy ranker, knowledge, RAG |
| Precision Biomarker | 16 | Biological age, disease trajectory, PGx, genotype adjustment, critical values, discordance, lab ranges |
| Precision Autoimmune | 7 | Autoimmune core, export, collections, API, diagnostic engine, timeline builder, RAG |
| Cardiology Intelligence | 16 | Risk calculators, GDMT optimizer, clinical workflows, cross-modal, API routes, knowledge, metrics |
| Neurology Intelligence | 12 | Clinical scales, workflows, execution, knowledge, RAG, query expansion, integration |
| Pharmacogenomics Intelligence | 15 | PGx pipeline, phenoconversion, HLA screener, dosing, ingest, API routes, metrics |
| Rare Disease Diagnostic | 12 | Decision support, clinical workflows, execution, knowledge, models, RAG |
| Single-Cell Intelligence | 12 | Decision support, cell types, TME, spatial, trajectories, RAG, workflows |
| Clinical Trial Intelligence | 12 | Decision support, clinical workflows, execution, knowledge, models, RAG |
| Core Platform | 29 | Genomics, RAG pipeline, drug discovery, orchestrator, health monitoring |
| Total | 158 | |

All agent tests execute via pytest under default configuration using mock NIM fallbacks, enabling continuous integration without GPU dependencies.
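
One way such a mock fallback can be structured is sketched below: a factory returns a deterministic stand-in client unless an endpoint is configured, so the same test runs with or without a GPU service. Every name here (`DOCKING_NIM_URL`, the client classes, the `dock` method) is hypothetical; the source does not show the project's actual fixtures.

```python
import os


class MockDockingClient:
    """Deterministic stand-in for a GPU-backed docking service (hypothetical)."""

    def dock(self, smiles):
        # Fake but repeatable score so assertions are stable in CI.
        return -7.0 - (len(smiles) % 5) * 0.1


class RemoteDockingClient:
    """Placeholder for a client that would call a real endpoint."""

    def __init__(self, url):
        self.url = url

    def dock(self, smiles):
        raise NotImplementedError("would send the request to " + self.url)


def get_docking_client():
    """Use the remote client only when an endpoint is configured."""
    url = os.environ.get("DOCKING_NIM_URL")
    return RemoteDockingClient(url) if url else MockDockingClient()


def test_docking_returns_negative_score():
    # pytest collects this function by its test_ prefix.
    client = get_docking_client()
    assert client.dock("CCO") < 0
```

Selecting the client through configuration rather than patching keeps the default test run free of network and GPU dependencies.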


9. Discussion

9.1 Democratization of Precision Medicine

The HCLS AI Factory demonstrates that the complete precision medicine pipeline---from raw DNA to novel drug candidates with comprehensive clinical decision support across eleven medical specialties---can run on hardware costing \$4,699. Traditional approaches require \$50K--\$500K+ in infrastructure (CPU clusters, commercial software licenses, cloud compute) and 6--18 months of elapsed time involving multiple specialist teams.

By consolidating 21 services onto a single workstation with unified memory, we eliminate data transfer bottlenecks between pipeline stages and reduce operational complexity. A single researcher can operate the entire platform, from FASTQ input to ranked drug candidates, in a single session. This has implications for academic medical centers, small biotech companies, and institutions in resource-constrained settings that lack access to cloud genomics platforms or commercial drug discovery suites.

The eleven intelligence agents further democratize access to clinical expertise. For example, the cardiology agent implements ASCVD risk calculation, the HEART Score, and GDMT optimization; the rare disease agent performs ACMG variant classification across 88 diseases; and the pharmacogenomics agent translates genotypes into prescribing guidance for 25 pharmacogenes. These capabilities, which previously required separate specialist consultations, are available on demand through a unified RAG-grounded interface.

The three-phase scaling path (DGX Spark at \$4,699, then DGX B200 at \$500K--\$1M, then DGX SuperPOD at \$7M--\$60M+) ensures that proof-of-concept work on a desktop workstation can scale to departmental and enterprise deployments using the same Nextflow pipelines and Docker containers.

9.2 Limitations

Several limitations should be noted. First, the drug candidates are computational predictions that require experimental validation; favorable docking scores and QED values do not guarantee therapeutic efficacy. Second, the platform currently processes a single sample sequentially; multi-sample parallelism requires hardware beyond the DGX Spark. Third, the annotation databases (ClinVar, AlphaMissense) have known coverage gaps, particularly for rare variants and non-European ancestry populations. Fourth, the 201-gene knowledge base, while covering 13 therapeutic areas, represents a subset of the approximately 20,000 protein-coding genes. Fifth, DiffDock's docking predictions, while faster than traditional methods, have not been validated against experimental binding affinities for the specific VCP candidates generated.

The platform uses BioNeMo Cloud NIM endpoints on ARM64 (DGX Spark), as the x86-only NIM containers cannot run natively. This introduces a dependency on NVIDIA's cloud API availability and network connectivity.

The intelligence agents rely on curated knowledge bases that require periodic updates as clinical guidelines, drug approvals, and genomic databases evolve. The pediatric safety assessment, while based on established pharmacokinetic principles, provides screening-level flags rather than definitive safety determinations; comprehensive pediatric safety evaluation requires in vivo studies and regulatory review.

9.3 Ethical Considerations

Genomic data carries profound privacy implications. The HCLS AI Factory processes data locally on a single workstation, eliminating the need to upload patient genomic data to cloud services (with the exception of Claude API calls for RAG synthesis, which receive variant summaries rather than raw sequence data). The GIAB HG002 reference sample used for demonstration is a consented, de-identified public resource.

Computational drug candidates must not be interpreted as clinical recommendations. The platform explicitly labels outputs as research hypotheses requiring experimental validation. Clinical deployment would require CLIA/CAP certification of the genomics pipeline and FDA clearance of the drug discovery workflow. Intelligence agent recommendations are decision-support tools, not autonomous clinical decisions, and must be reviewed by qualified clinicians.

9.4 Future Work

Several directions for future development are planned:

  • Clinical validation: Benchmarking variant calling against additional GIAB truth sets (HG001, HG003--HG007) and independent clinical sequencing datasets.
  • Regulatory pathway: Pursuing CLIA/CAP laboratory certification for the Parabricks genomics pipeline.
  • Multi-sample support: Enabling concurrent processing of multiple patients via Kubernetes orchestration on multi-GPU systems.
  • Expanded knowledge base: Integrating additional annotation sources (gnomAD, COSMIC, PharmGKB) and expanding gene coverage beyond 201 genes.
  • Federated learning: Deploying NVIDIA FLARE for cross-institutional model training while maintaining data sovereignty: models train locally, only gradient updates are shared, and patient data never leaves the originating institution.
  • Experimental validation: Synthesizing top VCP candidates and testing them in biochemical ATPase activity assays.
  • Institutional partnerships: Expanding to multi-site evaluation across pediatric oncology centers.
  • Agent expansion: Developing additional intelligence agents for infectious disease, endocrinology, and maternal-fetal medicine.


10. Conclusion

We have presented the HCLS AI Factory, an open-source platform that integrates GPU-accelerated genomics, RAG-grounded target identification, and AI-driven drug discovery into a single end-to-end workflow running on a \$4,699 desktop workstation. The platform takes 200 GB of raw sequencing data through 11.7 million variant calls and 3.56 million annotated variant embeddings to 100 ranked novel drug candidates with pediatric safety assessment in under 5 hours, a reduction of more than 99% relative to the 6--18 months required by traditional approaches.

In the VCP/Frontotemporal Dementia demonstration, the top candidate achieves a composite score 39% higher than that of the CB-5083 seed compound, with a docking score of -11.4 kcal/mol and a QED drug-likeness of 0.81. Eleven domain-specialized intelligence agents, spanning precision oncology, CAR-T therapy, biomarker analysis, cardiology, neurology, autoimmune disease, rare disease diagnostics, pharmacogenomics, medical imaging, single-cell genomics, and clinical trial operations, extend the platform with 139 Milvus collections, approximately 47,691 vectors, and 158 test files. The key architectural insight is that modern GPU workstations with unified memory can consolidate what previously required separate compute clusters, cloud platforms, specialist teams, and domain-specific decision support systems. By releasing all code under Apache 2.0 and targeting a \$4,699 hardware platform, we aim to make end-to-end precision medicine, from patient DNA to drug candidates with comprehensive clinical intelligence, accessible to any researcher with a desktop workstation and a sequenced genome.


Acknowledgments

The author thanks NVIDIA for the DGX Spark hardware platform and BioNeMo NIM microservices; Anthropic for the Claude API; the Genome in a Bottle (GIAB) Consortium for the HG002 reference standard and truth sets; the ClinVar, AlphaMissense, and Ensembl VEP teams for open variant annotation databases; the Milvus, RDKit, and Nextflow open-source communities; the CPIC, PharmGKB, CIViC, OncoKB, Orphanet, OMIM, HPO, and Human Cell Atlas teams for curated biomedical knowledge bases; and the broader open-source bioinformatics ecosystem that makes work of this nature possible.


References

[1] M. A. Hamburg and F. S. Collins, "The path to personalized medicine," New England Journal of Medicine, vol. 363, no. 4, pp. 301--304, 2010.

[2] G. S. Ginsburg and K. A. Phillips, "Precision medicine: From science to value," Health Affairs, vol. 37, no. 5, pp. 694--701, 2018.

[3] M. J. Landrum, J. M. Lee, M. Benson, et al., "ClinVar: Improving access to variant interpretations and supporting evidence," Nucleic Acids Research, vol. 46, no. D1, pp. D1062--D1067, 2018.

[4] J. Cheng, G. Novati, J. Pan, et al., "Accurate proteome-wide missense variant effect prediction with AlphaMissense," Science, vol. 381, no. 6664, eadg7492, 2023.

[5] W. McLaren, L. Gil, S. E. Hunt, et al., "The Ensembl Variant Effect Predictor," Genome Biology, vol. 17, no. 1, p. 122, 2016.

[6] H. Li, "Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM," arXiv:1303.3997, 2013.

[7] A. McKenna, M. Hanna, E. Banks, et al., "The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data," Genome Research, vol. 20, no. 9, pp. 1297--1303, 2010.

[8] R. Poplin, P.-C. Chang, D. Alexander, et al., "A universal SNP and small-indel variant caller using deep neural networks," Nature Biotechnology, vol. 36, no. 10, pp. 983--987, 2018.

[9] NVIDIA Corporation, "NVIDIA Parabricks: GPU-accelerated genomics analysis," https://www.nvidia.com/en-us/clara/parabricks/, 2024.

[10] NVIDIA Corporation, "NVIDIA Clara: Healthcare application framework," https://www.nvidia.com/en-us/clara/, 2024.

[11] P. Lewis, E. Perez, A. Piktus, et al., "Retrieval-augmented generation for knowledge-intensive NLP tasks," Advances in Neural Information Processing Systems, vol. 33, pp. 9459--9474, 2020.

[12] K. Singhal, S. Azizi, T. Tu, et al., "Large language models encode clinical knowledge," Nature, vol. 620, pp. 172--180, 2023.

[13] J. Jumper, R. Evans, A. Pritzel, et al., "Highly accurate protein structure prediction with AlphaFold," Nature, vol. 596, pp. 583--589, 2021.

[14] G. Corso, H. Stärk, B. Jing, R. Barzilay, and T. Jaakkola, "DiffDock: Diffusion steps, twists, and turns for molecular docking," International Conference on Learning Representations (ICLR), 2023.

[15] NVIDIA Corporation, "BioNeMo MolMIM: Masked inverse modeling for molecular generation," NVIDIA BioNeMo Framework, 2024.

[16] G. Landrum, "RDKit: Open-source cheminformatics," https://www.rdkit.org/, 2024.

[17] J. M. Zook, J. McDaniel, N. D. Olson, et al., "An open resource for accurately benchmarking small variant and reference calls," Nature Biotechnology, vol. 37, no. 5, pp. 561--566, 2019.

[18] S. Xiao, Z. Liu, P. Zhang, and N. Muennighoff, "C-Pack: Packaged resources to advance general Chinese embedding," arXiv:2309.07597, 2023.

[19] J. Wang, X. Yi, R. Guo, et al., "Milvus: A purpose-built vector data management system," Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 2614--2627, 2021.

[20] P. Di Tommaso, M. Chatzou, E. W. Floden, et al., "Nextflow enables reproducible computational workflows," Nature Biotechnology, vol. 35, no. 4, pp. 316--319, 2017.

[21] S. A. Forbes, D. Beare, H. Boutselakis, et al., "COSMIC: Somatic cancer genetics at high-resolution," Nucleic Acids Research, vol. 45, no. D1, pp. D777--D783, 2017.

[22] C. A. Lipinski, F. Lombardo, B. W. Dominy, and P. J. Feeney, "Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings," Advanced Drug Delivery Reviews, vol. 46, no. 1--3, pp. 3--26, 2001.

[23] G. R. Bickerton, G. V. Paolini, J. Besnard, S. Muresan, and A. L. Hopkins, "Quantifying the chemical beauty of drugs," Nature Chemistry, vol. 4, no. 2, pp. 90--98, 2012.

[24] S. Richards, N. Aziz, S. Bale, et al., "Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the ACMG and AMP," Genetics in Medicine, vol. 17, no. 5, pp. 405--424, 2015.

[25] S. Köhler, S. C. Doelken, C. J. Mungall, et al., "The Human Phenotype Ontology project: Linking molecular biology and disease through phenotype data," Nucleic Acids Research, vol. 42, no. D1, pp. D966--D974, 2014.

[26] M. V. Relling and T. E. Klein, "CPIC: Clinical Pharmacogenetics Implementation Consortium of the Pharmacogenomics Research Network," Clinical Pharmacology & Therapeutics, vol. 89, no. 3, pp. 464--467, 2011.

[27] M. Whirl-Carrillo, R. Huddart, L. Gong, et al., "An evidence-based framework for evaluating pharmacogenomics knowledge for personalized medicine," Clinical Pharmacology & Therapeutics, vol. 110, no. 3, pp. 563--572, 2021.

[28] M. Griffith, N. C. Spies, K. Krysiak, et al., "CIViC is a community knowledgebase for expert crowdsourcing the clinical interpretation of variants in cancer," Nature Genetics, vol. 49, pp. 170--174, 2017.

[29] D. Chakravarty, J. Gao, S. Phillips, et al., "OncoKB: A precision oncology knowledge base," JCO Precision Oncology, vol. 1, pp. 1--16, 2017.

[30] M. J. Cardoso, W. Li, R. Brown, et al., "MONAI: An open-source framework for deep learning in healthcare," arXiv:2211.02701, 2022.


HCLS AI Factory -- Apache 2.0 | March 2026


Clinical Decision Support Disclaimer

The HCLS AI Factory platform and all intelligence agents described in this document are clinical decision support research tools. They are not FDA-cleared and are not intended as standalone diagnostic devices. All recommendations should be reviewed by qualified healthcare professionals. Apache 2.0 License.