
Performance Benchmarks

Measured performance of the HCLS AI Factory on the NVIDIA DGX Spark ($3,999). All timings are end-to-end wall-clock measurements under default configurations.


Summary

| Metric | Value |
|---|---|
| End-to-end time (DNA to Drug Candidates) | < 5 hours |
| Traditional approach | 6-18 months |
| Reduction | ~99% |
| Hardware cost | $3,999 (single workstation) |
| Traditional infrastructure cost | $50K-500K+ (cluster + licenses) |

Hardware: NVIDIA DGX Spark

| Component | Specification |
|---|---|
| GPU | NVIDIA GB10 Grace Blackwell |
| Memory | 128 GB unified LPDDR5x |
| CPU | 20 ARM cores (Grace) |
| Interconnect | NVLink-C2C (GPU-CPU) |
| Storage | NVMe SSD |
| Power | Desktop form factor |

Stage 1: GPU Genomics (FASTQ to VCF)

Pipeline: NVIDIA Parabricks 4.6 with BWA-MEM2 + DeepVariant

| Step | Time | GPU Utilization |
|---|---|---|
| Alignment (BWA-MEM2) | 20-45 min | 85-95% |
| Sorting + deduplication | included (in alignment) | — |
| Indexing (samtools) | 2-5 min | CPU |
| Variant calling (DeepVariant) | 10-35 min | 85-95% |
| Total | 120-240 min | 85-95% |

| Metric | Value |
|---|---|
| Input | ~200 GB paired-end FASTQ (HG002 WGS) |
| Output | ~11.7 million variant calls (VCF) |
| Accuracy | >99% concordance (DeepVariant) |
| Speedup vs. CPU | 10-50x |
| CPU baseline | 24-48 hours |
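
For context, the sketch below shows how the two GPU steps can be driven from Python. The `pbrun fq2bam` and `pbrun deepvariant` flags follow the standard Parabricks CLI; the reference and FASTQ paths are placeholders.

```python
"""Sketch of the Stage 1 GPU steps, assuming Parabricks 4.x is installed.

Paths and sample names are placeholders; verify flags against your version.
"""
import subprocess
import time

REF = "ref/GRCh38.fa"                                   # placeholder reference path
FASTQS = ["HG002_R1.fastq.gz", "HG002_R2.fastq.gz"]     # placeholder WGS reads


def timed_run(cmd: list[str]) -> float:
    """Run a command and return its wall-clock time in minutes."""
    start = time.time()
    subprocess.run(cmd, check=True)
    return (time.time() - start) / 60


# Alignment with BWA-MEM2; fq2bam also sorts and marks duplicates on the GPU,
# which is why the table lists sorting + deduplication as included.
align_min = timed_run([
    "pbrun", "fq2bam",
    "--ref", REF,
    "--in-fq", *FASTQS,
    "--out-bam", "HG002.bam",
])

# DeepVariant variant calling on the deduplicated BAM.
call_min = timed_run([
    "pbrun", "deepvariant",
    "--ref", REF,
    "--in-bam", "HG002.bam",
    "--out-variants", "HG002.vcf.gz",
])

print(f"Alignment: {align_min:.1f} min, variant calling: {call_min:.1f} min")
```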

Stage 2: Evidence RAG (VCF to Target Hypothesis)

Pipeline: Milvus + BGE-small-en-v1.5 + Claude

Vector Database

| Collection | Records | Embedding Time |
|---|---|---|
| ClinVar variants | ~2.7M records | ~45 min (one-time) |
| AlphaMissense predictions | 71M records (sampled) | ~30 min (one-time) |
| Clinker knowledge base | 201 genes, 150+ diseases | ~5 min (one-time) |
| Total searchable vectors | 3.56M | — |
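
A minimal sketch of the one-time embedding build, assuming a local Milvus instance on its default port; the collection name, text field, and example record are illustrative.

```python
"""Sketch of a one-time collection build: batch-embed records, insert into Milvus.

Collection/field names and the example record are illustrative assumptions.
"""
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("BAAI/bge-small-en-v1.5")    # 384-dim embeddings
milvus = MilvusClient(uri="http://localhost:19530")

# Quick-setup collection: 384 dims to match BGE-small; Milvus assigns the IDs.
milvus.create_collection("clinvar_variants", dimension=384, auto_id=True)


def ingest(records: list[str], batch_size: int = 1024) -> None:
    """Embed records in batches and insert them alongside their source text."""
    for i in range(0, len(records), batch_size):
        batch = records[i:i + batch_size]
        vectors = embedder.encode(batch, show_progress_bar=False)
        milvus.insert(
            collection_name="clinvar_variants",
            data=[{"vector": v.tolist(), "text": t} for v, t in zip(vectors, batch)],
        )


ingest(["BRCA1 c.68_69delAG - pathogenic, hereditary breast and ovarian cancer"])
```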

Query Performance

| Operation | Latency |
|---|---|
| Vector embedding (BGE-small-en-v1.5) | < 50 ms |
| Milvus similarity search (top-10) | < 100 ms |
| Claude evidence synthesis | 2-5 sec |
| End-to-end query | < 5 sec |

| Metric | Value |
|---|---|
| Embedding model | BGE-small-en-v1.5 (384 dimensions) |
| LLM | Claude (Anthropic) |
| Therapeutic areas covered | 13 |
| Target genes | 201 |
| Druggability rate | 85% |
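
A minimal sketch of the query path behind these latencies (embed, top-10 Milvus search, Claude synthesis). The collection name, output field, prompt, and model ID are placeholders for the deployed configuration.

```python
"""Sketch of a Stage 2 evidence query: embed -> Milvus top-10 -> Claude synthesis.

Collection/field names, the prompt, and the model ID are illustrative.
"""
from anthropic import Anthropic
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("BAAI/bge-small-en-v1.5")   # < 50 ms per query
milvus = MilvusClient(uri="http://localhost:19530")
claude = Anthropic()                                       # uses ANTHROPIC_API_KEY


def query_evidence(question: str) -> str:
    vector = embedder.encode(question).tolist()

    # Top-10 similarity search against a variant evidence collection (< 100 ms).
    hits = milvus.search(
        collection_name="clinvar_variants",                # illustrative name
        data=[vector],
        limit=10,
        output_fields=["text"],
    )[0]
    evidence = "\n".join(hit["entity"]["text"] for hit in hits)

    # Claude synthesizes the retrieved evidence into a target hypothesis (2-5 sec).
    reply = claude.messages.create(
        model="claude-sonnet-4-5",                         # substitute the deployed model
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"Evidence:\n{evidence}\n\nQuestion: {question}",
        }],
    )
    return reply.content[0].text


print(query_evidence("Which variants support VCP as a drug target?"))
```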

Stage 3: Drug Discovery (Target to Molecules)

Pipeline: BioNeMo MolMIM + DiffDock + RDKit

| Step | Time | Mode |
|---|---|---|
| Structure retrieval (RCSB PDB) | < 5 sec | API |
| Structure preparation | < 30 sec | CPU |
| Molecule generation (MolMIM) | 10-60 sec | Cloud NIM |
| 3D conformer generation (RDKit) | < 30 sec | CPU |
| Molecular docking (DiffDock) | 2-8 min | Cloud NIM |
| Scoring and ranking (QED + Lipinski) | < 10 sec | CPU |
| Report generation | < 30 sec | CPU |
| Total | 8-16 min | — |

| Metric | Value |
|---|---|
| Candidate molecules generated | 10-100 per run |
| Docking poses per molecule | 10 |
| Drug-likeness filter | Lipinski Rule of 5 + QED |
| Seed compound (demo) | CB-5083 (VCP inhibitor) |
| PDB structures used (demo) | 5FTK, 8OOI, 9DIL, 7K56 |
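
The scoring and ranking step can be illustrated with RDKit; the helper names and SMILES inputs below are illustrative, while the Rule-of-5 thresholds and QED ranking correspond to the drug-likeness filter listed above.

```python
"""Sketch of the Stage 3 scoring step: Lipinski Rule-of-5 filter + QED ranking.

The SMILES list is illustrative; real candidates come from MolMIM.
"""
from rdkit import Chem
from rdkit.Chem import AllChem, Descriptors, Lipinski, QED


def passes_rule_of_five(mol: Chem.Mol) -> bool:
    """MW <= 500, logP <= 5, H-bond donors <= 5, H-bond acceptors <= 10."""
    return (
        Descriptors.MolWt(mol) <= 500
        and Descriptors.MolLogP(mol) <= 5
        and Lipinski.NumHDonors(mol) <= 5
        and Lipinski.NumHAcceptors(mol) <= 10
    )


def score_candidates(smiles_list: list[str]) -> list[tuple[str, float]]:
    """Filter by Rule of 5, embed a 3D conformer, and rank by QED (high to low)."""
    ranked = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None or not passes_rule_of_five(mol):
            continue
        conformer = Chem.AddHs(mol)
        AllChem.EmbedMolecule(conformer, randomSeed=42)    # conformer step before docking
        ranked.append((smi, QED.qed(mol)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


print(score_candidates(["CC(=O)Oc1ccccc1C(=O)O", "c1ccccc1"]))
```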

NIM Execution Modes

| Mode | MolMIM | DiffDock | Best For |
|---|---|---|---|
| Cloud | health.api.nvidia.com | health.api.nvidia.com | DGX Spark (ARM64), no local GPU containers needed |
| Local | localhost:8001 | localhost:8002 | x86 workstations with dedicated GPU |
| Mock | Simulated output | Simulated output | Testing, CI/CD, demos without API keys |
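
A minimal sketch of how a client could select one of these modes at runtime; the `NIM_MODE` environment variable name is an assumption, and only the hosts and ports come from the table above.

```python
"""Sketch of runtime endpoint selection across the three NIM modes.

The NIM_MODE variable name is an assumption; hosts/ports come from the table.
"""
import os

ENDPOINTS = {
    "cloud": {"molmim": "https://health.api.nvidia.com",
              "diffdock": "https://health.api.nvidia.com"},
    "local": {"molmim": "http://localhost:8001",
              "diffdock": "http://localhost:8002"},
}


def resolve(service: str) -> str | None:
    """Return the base URL for a service, or None when mock mode is active."""
    mode = os.environ.get("NIM_MODE", "cloud")
    if mode == "mock":
        return None          # mock mode returns simulated output, no API call
    return ENDPOINTS[mode][service]


print(resolve("diffdock"))
```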

Intelligence Agent Benchmarks

CAR-T Intelligence Agent (Port 8521)

| Metric | Value |
|---|---|
| Collections | 10 owned + 1 shared (read-only) |
| Total vectors | 6,266+ |
| Query latency (evidence retrieval) | < 3 sec |
| Comparative analysis | < 8 sec |
| Deep research mode | 10-30 sec |
| PDF export | < 5 sec |
| Test suite | 241 tests, < 1 sec |

Imaging Intelligence Agent (Port 8525)

| Metric | Value |
|---|---|
| Collections | 10 |
| NIM services | 4 (VISTA-3D, MAISI, VILA-M3, Llama-3) |
| Workflow demo execution | 5-15 sec per modality |
| Evidence query | < 5 sec |
| Comparative analysis | < 8 sec |
| FHIR R4 DiagnosticReport export | < 2 sec |
| Test suite | 539 tests, ~3 sec |

Precision Oncology Agent (Port 8526)

| Metric | Value |
|---|---|
| Collections | 11 (10 owned + 1 shared) |
| Case creation | < 2 sec |
| MTB packet generation | 10-30 sec |
| Trial matching | < 5 sec |
| Therapy ranking | < 5 sec |
| FHIR R4 bundle export | < 2 sec |
| Test suite | 516 tests, < 1 sec |

Combined Test Suite

| Agent | Tests | Time |
|---|---|---|
| CAR-T Intelligence | 241 | 0.18 sec |
| Imaging Intelligence | 539 | 3.20 sec |
| Precision Oncology | 516 | 0.40 sec |
| Total | 1,296 | 3.78 sec |
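
A rough way to re-run these suites is to shell out to pytest per agent; the test directory paths below are assumptions about the repository layout, and wall-clock output will include pytest startup overhead.

```python
"""Sketch for re-running the per-agent test suites; directory names are assumptions."""
import subprocess
import time

AGENT_TEST_DIRS = {
    "CAR-T Intelligence": "agents/car_t/tests",
    "Imaging Intelligence": "agents/imaging/tests",
    "Precision Oncology": "agents/precision_oncology/tests",
}

for agent, test_dir in AGENT_TEST_DIRS.items():
    start = time.time()
    subprocess.run(["python", "-m", "pytest", test_dir, "-q"], check=True)
    print(f"{agent}: {time.time() - start:.2f} sec (wall clock, including pytest startup)")
```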

Infrastructure

Service Startup

| Component | Cold Start | Warm Restart |
|---|---|---|
| Milvus | 30-60 sec | 10-15 sec |
| Landing page | < 5 sec | < 2 sec |
| RAG Chat UI | 5-10 sec | < 5 sec |
| Drug Discovery UI | 5-10 sec | < 5 sec |
| Agent UIs | 5-10 sec each | < 5 sec |
| Full platform (all services) | 2-3 min | < 1 min |
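
One way to collect these startup numbers is to poll each service until it answers; the sketch below assumes the agent UIs respond to a plain HTTP GET on their listed ports.

```python
"""Sketch for timing service startup by polling until each port responds.

Assumes the UIs answer plain HTTP GETs; adjust paths and ports to the deployment.
"""
import time
import requests

SERVICES = {
    "CAR-T agent UI": 8521,
    "Imaging agent UI": 8525,
    "Precision Oncology agent UI": 8526,
}


def seconds_until_up(port: int, timeout: float = 300.0) -> float:
    """Poll http://localhost:<port>/ once per second until it returns HTTP 200."""
    start = time.time()
    while time.time() - start < timeout:
        try:
            if requests.get(f"http://localhost:{port}/", timeout=2).status_code == 200:
                return time.time() - start
        except requests.RequestException:
            pass
        time.sleep(1)
    raise TimeoutError(f"port {port} not ready after {timeout:.0f}s")


for name, port in SERVICES.items():
    print(f"{name}: ready in {seconds_until_up(port):.1f} sec")
```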

Resource Usage (Idle)

| Resource | Usage |
|---|---|
| CPU | < 5% (20 ARM cores) |
| Memory | ~8-12 GB (128 GB available) |
| GPU memory | < 2 GB (128 GB unified) |
| Disk (platform + data) | ~400-500 GB |

Resource Usage (Peak — Genomics Pipeline Running)

| Resource | Usage |
|---|---|
| CPU | 40-60% |
| Memory | 30-50 GB |
| GPU | 85-95% utilization |
| Disk I/O | High (FASTQ read + BAM write) |

Scalability

| Dimension | Current | Potential |
|---|---|---|
| Samples per day | 3-6 (sequential) | 10-20 (with pipeline parallelism) |
| Vector database | 3.56M vectors | 100M+ (Milvus scales horizontally) |
| Knowledge base | 201 genes, 13 areas | Expandable with additional collections |
| Concurrent users | 5-10 (single workstation) | 50+ (with load balancing) |
| Agent instances | 3 | Additional agents via plugin architecture |

Methodology

  • All benchmarks measured on NVIDIA DGX Spark with Ubuntu 22.04 LTS
  • Timings are wall clock measurements averaged over 3 runs
  • GPU utilization measured via nvidia-smi and DCGM Exporter (see the sampling sketch below)
  • Query latencies measured end-to-end including network overhead
  • Test suite timings measured via pytest with default configuration
  • "Traditional approach" estimates based on published literature for manual genomics + drug discovery workflows at academic medical centers