HCLS AI Factory — Deployment and Configuration Guide for NVIDIA DGX Spark¶

Open-Source Precision Medicine Platform on NVIDIA DGX Spark

The HCLS AI Factory is a three-stage precision medicine pipeline that transforms raw genomic sequencing data (FASTQ) into actionable drug discovery candidates. Stage 1 performs GPU-accelerated genomics alignment and variant calling with NVIDIA Parabricks. Stage 2 annotates variants against clinical databases, embeds them into a Milvus vector store, and provides a RAG-powered conversational interface using Anthropic Claude. Stage 3 leverages NVIDIA BioNeMo NIM microservices for structure-aware molecule generation and molecular docking, producing ranked drug candidates with composite scores. The entire platform runs on a single NVIDIA DGX Spark desktop workstation — a $3,999 system powered by the GB10 Grace Blackwell Superchip with 128 GB unified LPDDR5x memory. This guide covers the open-source fork: everything you need to clone, configure, and deploy the full stack using Docker Compose.

License: Apache 2.0 | Author: Adam Jones | Date: February 2026

Table of Contents¶

Introduction
Architecture Overview
Prerequisites
Environment Preparation
Repository Setup
Reference Data Preparation
Docker Compose Configuration
Deploy Genomics Pipeline (Stage 1)
Deploy RAG Chat Pipeline (Stage 2)
Deploy Drug Discovery Pipeline (Stage 3)
Nextflow Orchestration
Service Startup and Health
Monitoring and Observability
Security Configuration
Data Management
Performance Tuning
Troubleshooting Guide
VCP/FTD Demo Walkthrough
Scaling Beyond DGX Spark
Appendix A: Complete Configuration Reference
Appendix B: API Reference
Appendix C: Schema Definitions
Appendix D: Docker Image Reference
Appendix E: Validation Checklists
Appendix F: Glossary

1. Introduction¶

1.1 Purpose¶

This document provides step-by-step instructions for deploying the HCLS AI Factory on an NVIDIA DGX Spark workstation. It covers all three pipeline stages — genomics, RAG-powered variant intelligence, and AI-driven drug discovery — using exclusively open-source and publicly available components.

1.2 Scope¶

The guide addresses hardware validation, software installation, container deployment, data preparation, pipeline execution, monitoring, security, and troubleshooting. It targets the open-source HCLS AI Factory that runs entirely on Docker Compose without requiring Kubernetes or multi-node infrastructure.

1.3 Audience¶

Bioinformatics Engineers deploying genomics pipelines on DGX Spark
ML/AI Engineers integrating RAG and BioNeMo NIM microservices
DevOps Engineers managing containerized service stacks
Researchers forking the project for their own precision medicine workflows

1.4 Document Conventions¶

Convention	Meaning
`monospace`	Commands, file paths, code
Bold	UI elements, key terms
Italic	Variable values to be replaced
`$VARIABLE`	Environment variable
`<placeholder>`	User-supplied value

1.5 Genomics and Drug Discovery Primer¶

This section provides essential background for engineers who may not have a biology or chemistry background.

1.5.1 DNA Sequencing¶

DNA sequencing reads the order of nucleotide bases (A, T, C, G) in an organism's genome. Modern short-read sequencers (e.g., Illumina) produce paired-end reads — two sequences from opposite ends of a DNA fragment. The standard demo sample HG002 is a 30x whole-genome sequencing (WGS) dataset with 2x250 bp paired-end reads, producing approximately 200 GB of FASTQ data.

1.5.2 Genomics Pipeline Stages¶

Stage	Input	Tool	Output	Description
Quality Control	FASTQ	FastQC	QC Report	Assess read quality and adapter contamination
Alignment	FASTQ + Reference	BWA-MEM2 (fq2bam)	BAM	Map reads to GRCh38 reference genome
Variant Calling	BAM	DeepVariant	VCF	Identify SNPs and indels vs. reference
Annotation	VCF	VEP + ClinVar + AlphaMissense	Annotated VCF	Add functional, clinical, and pathogenicity data
Embedding	Annotated VCF	BGE-small-en-v1.5	Vectors (384-dim)	Convert variant evidence to dense embeddings

1.5.3 Variant Annotation¶

Variants are annotated from multiple sources:

VEP (Variant Effect Predictor): Assigns functional consequences and impact levels — HIGH, MODERATE, LOW, or MODIFIER.
ClinVar: NCBI database of 4.1 million clinical variant interpretations (Pathogenic, Likely Pathogenic, Benign, etc.).
AlphaMissense: DeepMind model with 71,697,560 missense variant pathogenicity predictions. Thresholds: pathogenic (>0.564), ambiguous (0.34-0.564), benign (<0.34).

1.5.4 Vector Embeddings and RAG¶

Annotated variants are converted to 384-dimensional dense vectors using the BGE-small-en-v1.5 embedding model and stored in Milvus. Retrieval-Augmented Generation (RAG) queries Milvus for relevant genomic evidence, then passes the results as context to Anthropic Claude for natural-language clinical interpretation.

1.5.5 Drug Discovery Pipeline¶

The 10-stage drug discovery pipeline transforms a genomic target into ranked drug candidates:

Stage	Name	Description
1	Initialize	Load configuration, validate target gene and variant
2	Normalize Target	Map gene symbol to UniProt ID and canonical name
3	Structure Discovery	Query RCSB PDB for 3D protein structures, score by resolution and method
4	Structure Preparation	Download PDB files, extract binding site coordinates
5	Molecule Generation	Generate SMILES candidates via MolMIM NIM (Port 8001) using seed molecule
6	Chemistry QC	Filter by Lipinski Rule of Five (MW<=500, LogP<=5, HBD<=5, HBA<=10)
7	Conformer Generation	Generate 3D conformers with RDKit for docking input
8	Molecular Docking	Score binding affinity via DiffDock NIM (Port 8002)
9	Composite Ranking	Rank candidates: 30% generation + 40% docking + 30% QED
10	Reporting	Generate PDF report with structures, scores, and recommendations

1.5.6 End-to-End Data Flow Summary¶

FASTQ (200 GB) ─► Parabricks fq2bam ─► BAM (100 GB) ─► DeepVariant ─► VCF (11.7M variants)
    ─► Annotation (ClinVar + AlphaMissense + VEP) ─► Milvus (384-dim vectors)
    ─► Claude RAG (variant interpretation) ─► Target Hypothesis
    ─► PDB Structure Retrieval ─► MolMIM (molecule generation)
    ─► DiffDock (molecular docking) ─► Composite Ranking ─► PDF Report

2. Architecture Overview¶

2.1 System Components¶

The HCLS AI Factory comprises three application pipeline stages running on a single DGX Spark:

Stage	Name	Function
Stage 1	Genomics Pipeline	FASTQ alignment and variant calling with GPU-accelerated Parabricks
Stage 2	RAG Chat Pipeline	Variant annotation, vector embedding, and Claude-powered conversational AI
Stage 3	Drug Discovery Pipeline	Structure-aware molecule generation, docking, and composite ranking

2.2 Technology Stack¶

Layer	Technology	Version / Details
Hardware	NVIDIA DGX Spark	GB10 GPU, 128 GB unified LPDDR5x, 144 ARM64 cores
OS	DGX OS	Ubuntu-based, ARM64 (aarch64)
Container Runtime	Docker + NVIDIA Container Toolkit	nvidia-docker runtime
Orchestration	Docker Compose	Multi-service deployment
Pipeline Orchestration	Nextflow	DSL2, multiple profiles
GPU Genomics	NVIDIA Parabricks	4.6.0-1
Vector Database	Milvus	2.4 (with etcd + MinIO)
Embedding Model	BGE-small-en-v1.5	384 dimensions
LLM	Anthropic Claude	claude-sonnet-4-20250514
Molecule Generation	BioNeMo MolMIM NIM	1.0
Molecular Docking	BioNeMo DiffDock NIM	1.0
Cheminformatics	RDKit	Python library
Monitoring	Grafana + Prometheus	10.2.2 / v2.48.0
GPU Monitoring	DCGM Exporter	Port 9400
Language	Python	3.10+

2.3 Service Architecture¶

The platform deploys 14 services across 14 ports:

#	Service	Port	Protocol	Description
1	Landing Page	8080	HTTP	Platform entry point and service directory
2	Genomics Portal	5000	HTTP	Genomics pipeline UI and results viewer
3	RAG API	5001	HTTP	REST API for variant queries and RAG
4	Milvus	19530	gRPC	Vector database for genomic evidence
5	Attu	8000	HTTP	Milvus administration UI
6	Streamlit Chat	8501	HTTP	Conversational AI interface for variant analysis
7	MolMIM NIM	8001	HTTP	BioNeMo molecule generation microservice
8	DiffDock NIM	8002	HTTP	BioNeMo molecular docking microservice
9	Discovery UI	8505	HTTP	Drug discovery pipeline interface
10	Discovery Portal	8510	HTTP	Drug discovery results and reporting portal
11	Grafana	3000	HTTP	Monitoring dashboards
12	Prometheus	9099	HTTP	Metrics collection and storage
13	Node Exporter	9100	HTTP	Host system metrics
14	DCGM Exporter	9400	HTTP	NVIDIA GPU metrics

Infrastructure services (not externally exposed):

Service	Port	Purpose
etcd	2379	Milvus metadata store
MinIO	9000	Milvus object storage

2.4 Data Flow¶

┌─────────────────────────────────────────────────────────────────────────────┐
│                        HCLS AI Factory — Data Flow                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  FASTQ ──► Parabricks fq2bam ──► BAM ──► Parabricks DeepVariant ──► VCF   │
│  (200 GB)   (20-45 min)         (100 GB)   (10-35 min)            (11.7M)  │
│                                                                             │
│  VCF ──► ClinVar (4.1M) ──► AlphaMissense (71.7M) ──► VEP ──► Annotated   │
│          (35,616 match)     (6,831 matched)                                 │
│                                                                             │
│  Annotated ──► BGE-small-en-v1.5 ──► Milvus (384-dim, IVF_FLAT) ──►       │
│                                      (COSINE, nlist=1024)                   │
│                                                                             │
│  Milvus ──► Claude (sonnet-4) ──► Target Hypothesis                        │
│             (temp=0.3, 4096 tokens)                                         │
│                                                                             │
│  Target ──► PDB Structures ──► MolMIM (8001) ──► Chemistry QC ──►          │
│                                                   (Lipinski + QED)          │
│                                                                             │
│  Conformers ──► DiffDock (8002) ──► Composite Ranking ──► PDF Report       │
│                                     (0.3*gen + 0.4*dock + 0.3*QED)         │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

3. Prerequisites¶

3.1 Hardware Requirements¶

Component	Specification
System	NVIDIA DGX Spark
GPU	GB10 Grace Blackwell Superchip
Memory	128 GB unified LPDDR5x
CPU	144 ARM64 cores
Architecture	aarch64 (ARM64)
Price	$3,999

Storage requirements:

Dataset / Component	Size
GRCh38 Reference Genome	3.1 GB
FASTQ Input (HG002 30x WGS)	~200 GB
BAM Output (intermediate)	~100 GB
ClinVar Database	~1.2 GB
AlphaMissense Predictions	~4 GB
Milvus Index Data	~2 GB
BioNeMo Model Cache	~10 GB
Total Minimum	~320 GB
Recommended	1 TB NVMe

3.2 Software Requirements¶

Software	Minimum Version	Notes
DGX OS	Latest	Ubuntu-based ARM64
Docker Engine	24.0+	With Compose V2
NVIDIA Container Toolkit	Latest	nvidia-docker runtime
CUDA Toolkit	12.x	Included with DGX OS
Python	3.10+	For pipeline scripts
Nextflow	23.04+	DSL2 support required
Git	2.30+	For repository clone
NGC CLI	Latest	For BioNeMo container pulls

3.3 Network Requirements¶

Internet access for initial setup (container pulls, data downloads)
Outbound HTTPS to api.anthropic.com for Claude API calls
Outbound HTTPS to nvcr.io for NGC container registry
Outbound HTTPS to NCBI, RCSB PDB for reference data downloads
All service ports (listed in Section 2.3) accessible on localhost

3.4 Access Credentials¶

Credential	Purpose	How to Obtain
`ANTHROPIC_API_KEY`	Claude API access	https://console.anthropic.com
`NGC_API_KEY`	NVIDIA NGC container registry	https://ngc.nvidia.com

4. Environment Preparation¶

4.1 DGX Spark Initial Setup¶

Verify the system is a DGX Spark with the expected hardware:

# Verify ARM64 architecture
uname -m
# Expected: aarch64

# Verify CPU cores
nproc
# Expected: 144

# Verify total memory (128 GB)
free -h | grep Mem
# Expected: ~128 GB total

# Verify GPU is detected
nvidia-smi
# Expected: GB10 GPU listed with driver version

4.2 NVIDIA Driver and CUDA Verification¶

# Check NVIDIA driver version
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# Expected: 550.x or later

# Check CUDA version
nvcc --version
# Expected: CUDA 12.x

# Verify GPU compute capability
nvidia-smi --query-gpu=compute_cap --format=csv,noheader

# Run a quick GPU test
nvidia-smi -q | head -30

4.3 Docker Installation and Configuration¶

# Verify Docker is installed
docker --version
# Expected: Docker version 24.0+

# Verify Docker Compose V2
docker compose version
# Expected: Docker Compose version v2.x

# Verify NVIDIA runtime is available
docker info | grep -i runtime
# Expected: nvidia runtime listed

# Test GPU access from a container
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi

Configure Docker daemon for NVIDIA runtime as default:

sudo tee /etc/docker/daemon.json <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-address-pools": [
    {"base": "172.20.0.0/16", "size": 24}
  ],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "3"
  }
}
EOF

sudo systemctl restart docker

4.4 Python Environment Setup¶

# Verify Python version
python3 --version
# Expected: Python 3.10+

# Create virtual environment
python3 -m venv ~/hcls-env
source ~/hcls-env/bin/activate

# Install core dependencies
pip install --upgrade pip
pip install \
  anthropic \
  pymilvus \
  sentence-transformers \
  rdkit-pypi \
  pydantic \
  streamlit \
  fastapi \
  uvicorn \
  requests \
  pandas \
  numpy \
  reportlab \
  biopython \
  nextflow

4.5 NGC CLI Installation¶

# Download NGC CLI for ARM64
wget -O ngc-cli.zip https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/latest/files/ngccli_arm64.zip

# Extract and install
unzip ngc-cli.zip -d ~/ngc-cli
chmod +x ~/ngc-cli/ngc-cli/ngc
export PATH=$PATH:~/ngc-cli/ngc-cli

# Configure NGC CLI
ngc config set
# Enter your NGC API key when prompted

# Verify authentication
ngc registry image list --format_type csv | head -5

5. Repository Setup¶

5.1 Fork and Clone¶

# Fork the repository on GitHub, then clone your fork
git clone https://github.com/<your-username>/hcls-ai-factory.git
cd hcls-ai-factory

# Verify repository structure
ls -la

5.2 Repository Layout¶

hcls-ai-factory/
├── docker-compose.yml              # All 14 services + infrastructure
├── .env.example                    # Template environment configuration
├── nextflow.config                 # Nextflow pipeline configuration
├── main.nf                         # Nextflow DSL2 pipeline definition
├── start-services.sh               # Service startup script
├── requirements.txt                # Python dependencies
│
├── genomics/                       # Stage 1: Genomics Pipeline
│   ├── parabricks/                 # Parabricks configs and scripts
│   │   ├── fq2bam.sh              # BWA-MEM2 alignment wrapper
│   │   └── deepvariant.sh         # DeepVariant variant calling wrapper
│   ├── portal/                     # Genomics Portal (Port 5000)
│   │   └── app.py
│   └── data/                       # Input/output data directory
│       ├── reference/              # GRCh38 reference genome
│       ├── fastq/                  # Input FASTQ files
│       ├── bam/                    # Alignment output
│       └── vcf/                    # Variant call output
│
├── rag/                            # Stage 2: RAG Chat Pipeline
│   ├── api/                        # RAG API (Port 5001)
│   │   └── app.py
│   ├── chat/                       # Streamlit Chat (Port 8501)
│   │   └── app.py
│   ├── embeddings/                 # BGE embedding pipeline
│   │   └── embed_variants.py
│   ├── annotation/                 # Variant annotation pipeline
│   │   ├── clinvar.py
│   │   ├── alphamissense.py
│   │   └── vep.py
│   ├── knowledge/                  # Gene knowledge base
│   │   └── genes.json              # 201 genes, 13 therapeutic areas
│   └── data/
│       ├── clinvar/                # ClinVar database
│       └── alphamissense/          # AlphaMissense predictions
│
├── discovery/                      # Stage 3: Drug Discovery Pipeline
│   ├── pipeline/                   # 10-stage discovery pipeline
│   │   ├── __init__.py
│   │   ├── initialize.py           # Stage 1: Initialize
│   │   ├── normalize.py            # Stage 2: Normalize Target
│   │   ├── structure_discovery.py  # Stage 3: Structure Discovery
│   │   ├── structure_prep.py       # Stage 4: Structure Preparation
│   │   ├── molecule_gen.py         # Stage 5: Molecule Generation
│   │   ├── chemistry_qc.py        # Stage 6: Chemistry QC
│   │   ├── conformer_gen.py        # Stage 7: Conformer Generation
│   │   ├── docking.py              # Stage 8: Molecular Docking
│   │   ├── ranking.py              # Stage 9: Composite Ranking
│   │   └── reporting.py            # Stage 10: Reporting
│   ├── ui/                         # Discovery UI (Port 8505)
│   │   └── app.py
│   ├── portal/                     # Discovery Portal (Port 8510)
│   │   └── app.py
│   └── models/                     # Pydantic data models
│       └── schemas.py
│
├── monitoring/                     # Monitoring stack
│   ├── grafana/
│   │   ├── provisioning/
│   │   └── dashboards/
│   ├── prometheus/
│   │   └── prometheus.yml
│   └── exporters/
│
├── landing/                        # Landing Page (Port 8080)
│   └── index.html
│
├── scripts/                        # Utility scripts
│   ├── run_pipeline.py             # Pipeline launcher
│   ├── download_references.sh      # Reference data downloader
│   └── validate_deployment.sh      # Deployment validator
│
└── docs/                           # Documentation
    └── ...

5.3 Environment Configuration¶

# Copy the example environment file
cp .env.example .env

# Edit with your credentials and paths
nano .env

The .env file should contain:

# === API Keys ===
ANTHROPIC_API_KEY=sk-ant-api03-XXXXXXXXXXXX
NGC_API_KEY=XXXXXXXXXXXX

# === Model Configuration ===
CLAUDE_MODEL=claude-sonnet-4-20250514
CLAUDE_TEMPERATURE=0.3

# === Reference Data ===
REFERENCE_GENOME=/data/reference/GRCh38.fa

# === Milvus Configuration ===
MILVUS_HOST=localhost
MILVUS_PORT=19530

# === BioNeMo NIM URLs ===
MOLMIM_URL=http://localhost:8001
DIFFDOCK_URL=http://localhost:8002

# === Pipeline Configuration ===
PIPELINE_MODE=full
NUM_CANDIDATES=100
MIN_QED=0.3
MIN_DOCK_SCORE=-6.0

# === Monitoring ===
GRAFANA_USER=admin
GRAFANA_PASSWORD=changeme

5.4 Directory Structure for Data¶

# Create data directories
mkdir -p genomics/data/{reference,fastq,bam,vcf}
mkdir -p rag/data/{clinvar,alphamissense}
mkdir -p discovery/data/{structures,molecules,reports}
mkdir -p monitoring/data/{grafana,prometheus}

6. Reference Data Preparation¶

6.1 GRCh38 Reference Genome¶

# Download GRCh38 reference genome (~3.1 GB)
cd genomics/data/reference

wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz

# Decompress
gunzip GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz
mv GCA_000001405.15_GRCh38_no_alt_analysis_set.fna GRCh38.fa

# Index the reference (required by Parabricks)
# Note: Parabricks fq2bam can build its own index, but pre-building saves time
samtools faidx GRCh38.fa

# Verify
ls -lh GRCh38.fa*
# Expected: GRCh38.fa (~3.1 GB), GRCh38.fa.fai

6.2 ClinVar Database¶

# Download ClinVar VCF (~1.2 GB)
cd rag/data/clinvar

wget https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz
wget https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz.tbi

# Verify record count (~4.1M clinical variants)
zcat clinvar.vcf.gz | grep -v '^#' | wc -l
# Expected: ~4,100,000

echo "ClinVar download complete"

6.3 AlphaMissense Database¶

# Download AlphaMissense predictions (~4 GB)
cd rag/data/alphamissense

wget https://storage.googleapis.com/dm_alphamissense/AlphaMissense_hg38.tsv.gz

# Verify record count (~71.7M predictions)
zcat AlphaMissense_hg38.tsv.gz | tail -n +5 | wc -l
# Expected: ~71,697,560

echo "AlphaMissense download complete"

AlphaMissense pathogenicity thresholds:

Classification	Score Range	Description
Pathogenic	> 0.564	Likely damaging to protein function
Ambiguous	0.34 - 0.564	Uncertain significance
Benign	< 0.34	Likely tolerated

6.4 HG002 Sample Data¶

# Download HG002 FASTQ files for demo/testing (~200 GB)
cd genomics/data/fastq

# GIAB HG002 30x WGS, 2x250 bp paired-end
# Note: These are large files — ensure ~200 GB free space
wget ftp://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/NIST_HiSeq_HG002_Homogeneity-10953946/NHGRI_Illumina300X_AJtrio_novoalign_bams/HG002.GRCh38.2x250.fastq.gz

# For a smaller test subset, use a downsampled version if available
echo "HG002 download complete — verify file sizes match expected ~200 GB"
ls -lh *.fastq.gz

7. Docker Compose Configuration¶

7.1 Service Definition Overview¶

The docker-compose.yml defines all 14 application services plus 2 infrastructure services (etcd, MinIO) for Milvus. Services are organized into three groups matching the pipeline stages, plus monitoring.

7.2 docker-compose.yml Structure¶

version: '3.8'

services:
  # ─── Infrastructure ───────────────────────────────────────
  etcd:
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
    ports:
      - "2379:2379"
    volumes:
      - etcd_data:/etcd
    restart: unless-stopped

  minio:
    image: minio/minio:latest
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    ports:
      - "9000:9000"
    volumes:
      - minio_data:/data
    command: server /data
    restart: unless-stopped

  # ─── Milvus Vector Database ──────────────────────────────
  milvus:
    image: milvusdb/milvus:v2.4-latest
    ports:
      - "19530:19530"
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    depends_on:
      - etcd
      - minio
    volumes:
      - milvus_data:/var/lib/milvus
    restart: unless-stopped

  attu:
    image: zilliz/attu:latest
    ports:
      - "8000:3000"
    environment:
      MILVUS_URL: milvus:19530
    depends_on:
      - milvus
    restart: unless-stopped

  # ─── Stage 1: Genomics ──────────────────────────────────
  genomics-portal:
    build: ./genomics/portal
    ports:
      - "5000:5000"
    volumes:
      - ./genomics/data:/data
    environment:
      - REFERENCE_GENOME=/data/reference/GRCh38.fa
    restart: unless-stopped

  # ─── Stage 2: RAG Chat ──────────────────────────────────
  rag-api:
    build: ./rag/api
    ports:
      - "5001:5001"
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - CLAUDE_MODEL=${CLAUDE_MODEL}
      - CLAUDE_TEMPERATURE=${CLAUDE_TEMPERATURE}
      - MILVUS_HOST=milvus
      - MILVUS_PORT=19530
    depends_on:
      - milvus
    restart: unless-stopped

  streamlit-chat:
    build: ./rag/chat
    ports:
      - "8501:8501"
    environment:
      - RAG_API_URL=http://rag-api:5001
    depends_on:
      - rag-api
    restart: unless-stopped

  # ─── Stage 3: Drug Discovery ────────────────────────────
  molmim:
    image: nvcr.io/nvidia/clara/bionemo-molmim:1.0
    ports:
      - "8001:8001"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: unless-stopped

  diffdock:
    image: nvcr.io/nvidia/clara/diffdock:1.0
    ports:
      - "8002:8002"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: unless-stopped

  discovery-ui:
    build: ./discovery/ui
    ports:
      - "8505:8505"
    environment:
      - MOLMIM_URL=http://molmim:8001
      - DIFFDOCK_URL=http://diffdock:8002
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - NUM_CANDIDATES=${NUM_CANDIDATES}
      - MIN_QED=${MIN_QED}
      - MIN_DOCK_SCORE=${MIN_DOCK_SCORE}
    depends_on:
      - molmim
      - diffdock
    restart: unless-stopped

  discovery-portal:
    build: ./discovery/portal
    ports:
      - "8510:8510"
    depends_on:
      - discovery-ui
    restart: unless-stopped

  # ─── Landing Page ───────────────────────────────────────
  landing-page:
    build: ./landing
    ports:
      - "8080:8080"
    restart: unless-stopped

  # ─── Monitoring ─────────────────────────────────────────
  prometheus:
    image: prom/prometheus:v2.48.0
    ports:
      - "9099:9090"
    volumes:
      - ./monitoring/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    restart: unless-stopped

  grafana:
    image: grafana/grafana:10.2.2
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=${GRAFANA_USER}
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - ./monitoring/grafana/provisioning:/etc/grafana/provisioning
      - ./monitoring/grafana/dashboards:/var/lib/grafana/dashboards
      - grafana_data:/var/lib/grafana
    depends_on:
      - prometheus
    restart: unless-stopped

  node-exporter:
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"
    restart: unless-stopped

  dcgm-exporter:
    image: nvcr.io/nvidia/k8s/dcgm-exporter:latest
    ports:
      - "9400:9400"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  etcd_data:
  minio_data:
  milvus_data:
  prometheus_data:
  grafana_data:

7.3 Infrastructure Services¶

Milvus 2.4 requires two backend services:

Service	Image	Port	Purpose
etcd	quay.io/coreos/etcd:v3.5.5	2379	Metadata storage for Milvus
MinIO	minio/minio:latest	9000	Object storage for Milvus segments
Milvus	milvusdb/milvus:v2.4-latest	19530	Vector database

7.4 Volume Mounts and Data Paths¶

Volume	Container Path	Host Purpose
`./genomics/data`	`/data`	Reference genome, FASTQ, BAM, VCF
`./rag/data`	`/data`	ClinVar, AlphaMissense databases
`etcd_data`	`/etcd`	Milvus metadata persistence
`minio_data`	`/data`	Milvus segment persistence
`milvus_data`	`/var/lib/milvus`	Milvus index persistence
`prometheus_data`	`/prometheus`	Prometheus TSDB
`grafana_data`	`/var/lib/grafana`	Grafana state and dashboards

7.5 GPU Resource Allocation¶

The GB10 GPU is shared across GPU-consuming services. Only one GPU-heavy workload should run at a time:

Service	GPU Usage	Peak Memory	Typical Duration
Parabricks fq2bam	70-90% GPU	~40 GB	20-45 min
Parabricks DeepVariant	80-95% GPU	~60 GB	10-35 min
MolMIM NIM	Moderate	~8 GB	Always running
DiffDock NIM	Moderate	~8 GB	Always running
DCGM Exporter	Minimal	Minimal	Always running

8. Deploy Genomics Pipeline (Stage 1)¶

8.1 Parabricks Container Setup¶

# Pull Parabricks container for ARM64
docker pull nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1

# Verify the image
docker images | grep parabricks
# Expected: clara-parabricks   4.6.0-1

8.2 BWA-MEM2 Alignment (fq2bam)¶

The fq2bam tool performs GPU-accelerated read alignment using BWA-MEM2 and produces a sorted, duplicate-marked BAM file.

docker run --rm --gpus all \
  -v $(pwd)/genomics/data:/data \
  nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 \
  pbrun fq2bam \
    --ref /data/reference/GRCh38.fa \
    --in-fq /data/fastq/HG002_R1.fastq.gz /data/fastq/HG002_R2.fastq.gz \
    --out-bam /data/bam/HG002.bam \
    --num-gpus 1

Expected performance:

Metric	Value
Runtime	20-45 minutes
GPU Utilization	70-90%
Peak GPU Memory	~40 GB
Output	Sorted, duplicate-marked BAM (~100 GB)

8.3 DeepVariant Variant Calling¶

docker run --rm --gpus all \
  -v $(pwd)/genomics/data:/data \
  nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 \
  pbrun deepvariant \
    --ref /data/reference/GRCh38.fa \
    --in-bam /data/bam/HG002.bam \
    --out-variants /data/vcf/HG002.vcf.gz \
    --num-gpus 1

Expected performance:

Metric	Value
Runtime	10-35 minutes
GPU Utilization	80-95%
Peak GPU Memory	~60 GB
Output	Compressed VCF (gzipped)

8.4 VCF Output Verification¶

# Count total variants
zcat genomics/data/vcf/HG002.vcf.gz | grep -v '^#' | wc -l
# Expected: ~11,700,000 (11.7M variants)

# Count PASS variants with QUAL > 30
zcat genomics/data/vcf/HG002.vcf.gz | grep -v '^#' | \
  awk '$7 == "PASS" && $6 > 30' | wc -l
# Expected: ~3,500,000 (3.5M)

# Count SNPs vs Indels
zcat genomics/data/vcf/HG002.vcf.gz | grep -v '^#' | \
  awk '{if(length($4)==1 && length($5)==1) print "SNP"; else print "INDEL"}' | \
  sort | uniq -c
# Expected: ~4,200,000 SNPs, ~1,000,000 indels

VCF output summary:

Metric	Expected Value
Total variants	~11.7M
PASS variants (QUAL > 30)	~3.5M
SNPs	~4.2M
Indels	~1.0M
Coding region variants	~35,000

8.5 Genomics Portal (Port 5000)¶

After genomics processing, start the portal:

docker compose up -d genomics-portal

# Verify
curl -s http://localhost:5000/health
# Expected: {"status": "healthy"}

Access the Genomics Portal at http://<dgx-spark-ip>:5000 to browse VCF results.

8.6 Performance Benchmarks¶

Step	Wall Time	GPU Util	Peak Memory	Output Size
fq2bam (alignment)	20-45 min	70-90%	~40 GB	~100 GB BAM
DeepVariant (calling)	10-35 min	80-95%	~60 GB	~1 GB VCF.gz
Total Stage 1	30-80 min	—	—	—

9. Deploy RAG Chat Pipeline (Stage 2)¶

9.1 Milvus Vector Database Setup¶

# Start Milvus and its dependencies
docker compose up -d etcd minio milvus attu

# Wait for Milvus to be ready (30-60 seconds)
sleep 30

# Verify Milvus is running
curl -s http://localhost:19530/v1/health/ready
# Expected: {"status":"ok"}

# Verify Attu UI
curl -s -o /dev/null -w "%{http_code}" http://localhost:8000
# Expected: 200

9.2 Collection Schema¶

Create the genomic_evidence collection with 17 fields:

from pymilvus import connections, Collection, FieldSchema, CollectionSchema, DataType, utility

# Connect to Milvus
connections.connect(host="localhost", port=19530)

# Define schema with 17 fields
fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=384),
    FieldSchema(name="chrom", dtype=DataType.VARCHAR, max_length=10),
    FieldSchema(name="pos", dtype=DataType.INT64),
    FieldSchema(name="ref", dtype=DataType.VARCHAR, max_length=500),
    FieldSchema(name="alt", dtype=DataType.VARCHAR, max_length=500),
    FieldSchema(name="qual", dtype=DataType.FLOAT),
    FieldSchema(name="gene", dtype=DataType.VARCHAR, max_length=100),
    FieldSchema(name="consequence", dtype=DataType.VARCHAR, max_length=200),
    FieldSchema(name="impact", dtype=DataType.VARCHAR, max_length=20),
    FieldSchema(name="genotype", dtype=DataType.VARCHAR, max_length=10),
    FieldSchema(name="text_summary", dtype=DataType.VARCHAR, max_length=5000),
    FieldSchema(name="clinical_significance", dtype=DataType.VARCHAR, max_length=200),
    FieldSchema(name="rsid", dtype=DataType.VARCHAR, max_length=20),
    FieldSchema(name="disease_associations", dtype=DataType.VARCHAR, max_length=2000),
    FieldSchema(name="am_pathogenicity", dtype=DataType.FLOAT),
    FieldSchema(name="am_class", dtype=DataType.VARCHAR, max_length=20),
]

schema = CollectionSchema(fields, description="Genomic evidence for RAG")
collection = Collection("genomic_evidence", schema)

# Create IVF_FLAT index on embedding field
index_params = {
    "metric_type": "COSINE",
    "index_type": "IVF_FLAT",
    "params": {"nlist": 1024}
}
collection.create_index("embedding", index_params)

# Load collection into memory
collection.load()

print(f"Collection created: {collection.name}")
print(f"Schema fields: {len(fields)}")

Collection schema reference:

#	Field	Type	Details
1	id	INT64	Primary key, auto-generated
2	embedding	FLOAT_VECTOR	384 dimensions (BGE-small-en-v1.5)
3	chrom	VARCHAR(10)	Chromosome (chr1-22, chrX, chrY)
4	pos	INT64	Genomic position
5	ref	VARCHAR(500)	Reference allele
6	alt	VARCHAR(500)	Alternate allele
7	qual	FLOAT	Variant quality score
8	gene	VARCHAR(100)	Gene symbol
9	consequence	VARCHAR(200)	VEP functional consequence
10	impact	VARCHAR(20)	HIGH, MODERATE, LOW, MODIFIER
11	genotype	VARCHAR(10)	Sample genotype (e.g., 0/1, 1/1)
12	text_summary	VARCHAR(5000)	Natural-language variant summary
13	clinical_significance	VARCHAR(200)	ClinVar classification
14	rsid	VARCHAR(20)	dbSNP identifier
15	disease_associations	VARCHAR(2000)	Associated diseases/conditions
16	am_pathogenicity	FLOAT	AlphaMissense score (0.0-1.0)
17	am_class	VARCHAR(20)	pathogenic, ambiguous, or benign

9.3 Variant Annotation Pipeline¶

The annotation pipeline enriches VCF variants with data from three sources:

# Run the annotation pipeline
python3 rag/annotation/clinvar.py \
  --vcf genomics/data/vcf/HG002.vcf.gz \
  --clinvar rag/data/clinvar/clinvar.vcf.gz \
  --output rag/data/annotated_clinvar.tsv

python3 rag/annotation/alphamissense.py \
  --vcf genomics/data/vcf/HG002.vcf.gz \
  --am rag/data/alphamissense/AlphaMissense_hg38.tsv.gz \
  --output rag/data/annotated_am.tsv

python3 rag/annotation/vep.py \
  --vcf genomics/data/vcf/HG002.vcf.gz \
  --output rag/data/annotated_vep.tsv

Expected annotation matches:

Source	Total Records	Patient Matches
ClinVar	4,100,000	~35,616
AlphaMissense	71,697,560	~6,831 (ClinVar-matched with predictions)
VEP	Per-variant	All coding variants

9.4 BGE Embedding and Indexing¶

from sentence_transformers import SentenceTransformer
from pymilvus import connections, Collection

# Load embedding model
model = SentenceTransformer('BAAI/bge-small-en-v1.5')  # 384 dimensions

# Connect to Milvus
connections.connect(host="localhost", port=19530)
collection = Collection("genomic_evidence")

# Example: embed and insert a variant
text = "chr9:35065263 G>A in VCP gene. ClinVar: Pathogenic. AlphaMissense: 0.87 (pathogenic). Consequence: missense_variant. Impact: MODERATE."
embedding = model.encode(text).tolist()  # 384-dim vector

# Insert into Milvus
data = [{
    "embedding": embedding,
    "chrom": "chr9",
    "pos": 35065263,
    "ref": "G",
    "alt": "A",
    "qual": 99.0,
    "gene": "VCP",
    "consequence": "missense_variant",
    "impact": "MODERATE",
    "genotype": "0/1",
    "text_summary": text,
    "clinical_significance": "Pathogenic",
    "rsid": "rs188935092",
    "disease_associations": "Inclusion body myopathy with Paget disease and frontotemporal dementia",
    "am_pathogenicity": 0.87,
    "am_class": "pathogenic"
}]

collection.insert(data)
collection.flush()

Milvus index configuration:

Parameter	Value
Embedding Model	BGE-small-en-v1.5
Dimensions	384
Index Type	IVF_FLAT
Metric Type	COSINE
nlist	1024
nprobe (search)	16

9.5 Anthropic Claude Integration¶

import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def query_claude(question: str, context: str) -> str:
    """Send RAG query to Claude with retrieved genomic context."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        temperature=0.3,
        messages=[{
            "role": "user",
            "content": f"""You are a genomics expert. Answer the question using the provided genomic evidence.

Context:
{context}

Question: {question}"""
        }]
    )
    return response.content[0].text

Claude configuration:

Parameter	Value
Model	claude-sonnet-4-20250514
Temperature	0.3
Max Tokens	4096

9.6 Knowledge Base¶

The platform includes a curated knowledge base of 201 genes across 13 therapeutic areas, with 171 genes (85%) classified as druggable.

Metric	Value
Total genes	201
Therapeutic areas	13
Druggable genes	171 (85%)

9.7 RAG API and Streamlit Chat¶

# Start RAG API and Chat services
docker compose up -d rag-api streamlit-chat

# Verify RAG API
curl -s http://localhost:5001/health
# Expected: {"status": "healthy"}

# Verify Streamlit Chat
curl -s -o /dev/null -w "%{http_code}" http://localhost:8501
# Expected: 200

Access the Streamlit Chat at http://<dgx-spark-ip>:8501 for conversational variant analysis.

10. Deploy Drug Discovery Pipeline (Stage 3)¶

10.1 BioNeMo NIM Services¶

# Pull BioNeMo containers (requires NGC authentication)
docker pull nvcr.io/nvidia/clara/bionemo-molmim:1.0
docker pull nvcr.io/nvidia/clara/diffdock:1.0

# Start NIM services
docker compose up -d molmim diffdock

# Wait for models to load (may take 2-5 minutes)
sleep 120

# Verify MolMIM
curl -s http://localhost:8001/v1/health/ready
# Expected: {"status": "ready"}

# Verify DiffDock
curl -s http://localhost:8002/v1/health/ready
# Expected: {"status": "ready"}

10.2 10-Stage Pipeline Detail¶

Stage	Name	Input	Output	Key Operations
1	Initialize	Config + target gene	PipelineConfig	Validate parameters, create run ID
2	Normalize Target	Gene symbol	Normalized target	Map to UniProt, canonical name
3	Structure Discovery	UniProt ID	PDB structure list	Query RCSB PDB, score by resolution
4	Structure Preparation	PDB IDs	Prepared structures	Download PDB, extract binding sites
5	Molecule Generation	Seed SMILES + protein	Generated SMILES	MolMIM NIM (Port 8001)
6	Chemistry QC	SMILES list	Filtered SMILES	Lipinski, QED, TPSA checks
7	Conformer Generation	Filtered SMILES	3D conformers (SDF)	RDKit conformer embedding
8	Molecular Docking	Conformers + protein	Docking scores	DiffDock NIM (Port 8002)
9	Composite Ranking	All scores	Ranked candidates	Weighted composite formula
10	Reporting	Ranked candidates	PDF report	Visualizations, recommendations

10.3 Structure Retrieval and Scoring¶

import requests

def search_pdb_structures(uniprot_id: str) -> list:
    """Search RCSB PDB for protein structures by UniProt ID."""
    url = "https://search.rcsb.org/rcsbsearch/v2/query"
    query = {
        "query": {
            "type": "terminal",
            "service": "text",
            "parameters": {
                "attribute": "rcsb_polymer_entity_container_identifiers.reference_sequence_identifiers.database_accession",
                "operator": "exact_match",
                "value": uniprot_id
            }
        },
        "return_type": "entry"
    }
    response = requests.post(url, json=query)
    return response.json().get("result_set", [])

10.4 Molecule Generation (MolMIM)¶

import requests

def generate_molecules(seed_smiles: str, num_candidates: int = 100) -> list:
    """Generate molecule candidates using MolMIM NIM."""
    response = requests.post(
        "http://localhost:8001/generate",
        json={
            "smiles": seed_smiles,
            "num_molecules": num_candidates,
            "algorithm": "CMA-ES",
            "property_name": "QED",
            "min_similarity": 0.3,
            "particles": 30,
            "iterations": 10
        }
    )
    return response.json()["generated_molecules"]

10.5 Molecular Docking (DiffDock)¶

def dock_molecule(protein_pdb: str, ligand_sdf: str) -> dict:
    """Score binding affinity using DiffDock NIM."""
    response = requests.post(
        "http://localhost:8002/molecular-docking/diffdock/generate",
        json={
            "protein": protein_pdb,
            "ligand": ligand_sdf,
            "num_poses": 10
        }
    )
    return response.json()

10.6 Drug-Likeness Scoring¶

Drug-likeness is assessed using three criteria:

Lipinski Rule of Five:

Property	Threshold	Description
Molecular Weight	<= 500 Da	Size constraint
LogP	<= 5	Lipophilicity
H-Bond Donors (HBD)	<= 5	Polar surface groups
H-Bond Acceptors (HBA)	<= 10	Polar surface groups

Additional thresholds:

Metric	Threshold	Interpretation
QED	> 0.67	Drug-like
TPSA	< 140 Angstrom squared	Good oral bioavailability

from rdkit import Chem
from rdkit.Chem import Descriptors, QED

def assess_drug_likeness(smiles: str) -> dict:
    """Evaluate drug-likeness using Lipinski, QED, and TPSA."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return {"valid": False}

    mw = Descriptors.MolWt(mol)
    logp = Descriptors.MolLogP(mol)
    hbd = Descriptors.NumHDonors(mol)
    hba = Descriptors.NumHAcceptors(mol)
    tpsa = Descriptors.TPSA(mol)
    qed_score = QED.qed(mol)

    lipinski_pass = (mw <= 500 and logp <= 5 and hbd <= 5 and hba <= 10)

    return {
        "valid": True,
        "mw": mw,
        "logp": logp,
        "hbd": hbd,
        "hba": hba,
        "tpsa": tpsa,
        "qed": qed_score,
        "lipinski_pass": lipinski_pass,
        "drug_like": qed_score > 0.67,
        "oral_bioavail": tpsa < 140
    }

10.7 Composite Ranking Formula¶

Candidates are ranked using a weighted composite score:

composite = 0.30 * generation_score + 0.40 * docking_score_normalized + 0.30 * qed_score

Docking score normalization:

def normalize_docking_score(dock_score: float) -> float:
    """Normalize docking score to [0, 1] range.
    More negative = better binding = higher normalized score."""
    return max(0.0, min(1.0, (10.0 + dock_score) / 20.0))

Raw Docking Score	Normalized Score	Interpretation
-10.0 kcal/mol	0.00	Excellent binding
-8.0 kcal/mol	0.10	Strong binding
-6.0 kcal/mol	0.20	Moderate binding
0.0 kcal/mol	0.50	Weak binding
+10.0 kcal/mol	1.00	No binding

Note: The normalization maps more negative (better) docking scores to lower normalized values. In the composite formula, the docking component rewards lower (better) scores.

Composite score weights:

Component	Weight	Source
Generation Score	30%	MolMIM similarity/property score
Docking Score (normalized)	40%	DiffDock binding affinity
QED Score	30%	RDKit quantitative drug-likeness

10.8 Discovery UI and Portal¶

# Start Discovery services
docker compose up -d discovery-ui discovery-portal

# Verify Discovery UI
curl -s -o /dev/null -w "%{http_code}" http://localhost:8505
# Expected: 200

# Verify Discovery Portal
curl -s -o /dev/null -w "%{http_code}" http://localhost:8510
# Expected: 200

Discovery UI (Port 8505): Interactive pipeline execution interface
Discovery Portal (Port 8510): Results browser and reporting portal

10.9 PDF Report Generation¶

The final pipeline stage generates a PDF report containing:

Target gene and variant summary
PDB structure details with binding site analysis
Top-ranked candidates with SMILES, scores, and 2D depictions
Docking poses and binding affinity plots
Lipinski and QED compliance table
Composite score ranking

11. Nextflow Orchestration¶

11.1 DSL2 Pipeline Architecture¶

The HCLS AI Factory uses Nextflow DSL2 for pipeline orchestration. Each pipeline stage is defined as a separate process, with channels connecting inputs and outputs.

11.2 Pipeline Modes¶

Mode	Description	Stages Executed
`full`	Complete end-to-end pipeline	1 + 2 + 3 (all stages)
`target`	Start from target gene (skip genomics)	2 + 3
`drug`	Drug discovery only (pre-existing target)	3 only
`demo`	VCP demo with pre-loaded data	1 + 2 + 3 (demo subset)
`genomics_only`	Genomics pipeline only	1 only

11.3 Execution Profiles¶

Profile	Description	Use Case
`standard`	Local execution, default settings	Development
`docker`	Docker container execution	Standard deployment
`singularity`	Singularity container execution	HPC environments
`dgx_spark`	Optimized for DGX Spark hardware	Production on DGX Spark
`slurm`	SLURM workload manager	Multi-node clusters
`test`	Minimal test data, fast execution	CI/CD testing

11.4 Pipeline Launcher¶

# Run with the pipeline launcher script
python3 scripts/run_pipeline.py \
  --mode full \
  --profile dgx_spark \
  --fastq genomics/data/fastq/ \
  --reference genomics/data/reference/GRCh38.fa

# Or run directly with Nextflow
nextflow run main.nf \
  -profile dgx_spark \
  --mode full \
  --fastq_dir genomics/data/fastq/ \
  --reference genomics/data/reference/GRCh38.fa \
  --outdir results/

11.5 Pipeline Configuration¶

// nextflow.config
params {
    // Pipeline mode
    mode = 'full'

    // Input paths
    fastq_dir = 'genomics/data/fastq'
    reference = 'genomics/data/reference/GRCh38.fa'
    outdir = 'results'

    // Service endpoints
    milvus_host = 'localhost'
    milvus_port = 19530
    molmim_url = 'http://localhost:8001'
    diffdock_url = 'http://localhost:8002'

    // Drug discovery parameters
    num_candidates = 100
    min_qed = 0.67
    min_dock_score = -6.0
}

profiles {
    dgx_spark {
        docker.enabled = true
        docker.runOptions = '--gpus all'
        process {
            executor = 'local'
            memory = '120 GB'
            cpus = 128
        }
    }

    test {
        params.mode = 'demo'
        process {
            memory = '16 GB'
            cpus = 4
        }
    }
}

12. Service Startup and Health¶

12.1 start-services.sh Startup Order¶

Services should be started in dependency order:

#!/bin/bash
# start-services.sh — Start all HCLS AI Factory services

set -e

echo "Starting infrastructure services..."
docker compose up -d etcd minio
sleep 10

echo "Starting Milvus..."
docker compose up -d milvus attu
sleep 30

echo "Starting BioNeMo NIM services..."
docker compose up -d molmim diffdock
sleep 120

echo "Starting application services..."
docker compose up -d genomics-portal rag-api streamlit-chat discovery-ui discovery-portal landing-page

echo "Starting monitoring..."
docker compose up -d prometheus grafana node-exporter dcgm-exporter

echo "All services started. Running health checks..."
sleep 10
bash scripts/validate_deployment.sh

12.2 Landing Page (Port 8080)¶

The landing page at http://<dgx-spark-ip>:8080 provides a directory of all services with links and status indicators.

12.3 Health Check Endpoints¶

Service	Port	Health Endpoint	Expected Response
Genomics Portal	5000	`/health`	`{"status": "healthy"}`
RAG API	5001	`/health`	`{"status": "healthy"}`
Milvus	19530	`/v1/health/ready`	`{"status": "ok"}`
Attu	8000	`/api/health`	HTTP 200
Streamlit Chat	8501	`/healthz`	HTTP 200
MolMIM NIM	8001	`/v1/health/ready`	`{"status": "ready"}`
DiffDock NIM	8002	`/v1/health/ready`	`{"status": "ready"}`
Discovery UI	8505	`/health`	`{"status": "healthy"}`
Discovery Portal	8510	`/health`	`{"status": "healthy"}`
Grafana	3000	`/api/health`	`{"status": "ok"}`
Prometheus	9099	`/-/healthy`	HTTP 200
Node Exporter	9100	`/metrics`	Metrics text
DCGM Exporter	9400	`/metrics`	Metrics text

12.4 Verifying All Services¶

#!/bin/bash
# validate_deployment.sh — Verify all services are running

declare -A SERVICES=(
  ["Landing Page"]="http://localhost:8080"
  ["Genomics Portal"]="http://localhost:5000/health"
  ["RAG API"]="http://localhost:5001/health"
  ["Milvus"]="http://localhost:19530/v1/health/ready"
  ["Attu"]="http://localhost:8000"
  ["Streamlit Chat"]="http://localhost:8501/healthz"
  ["MolMIM"]="http://localhost:8001/v1/health/ready"
  ["DiffDock"]="http://localhost:8002/v1/health/ready"
  ["Discovery UI"]="http://localhost:8505/health"
  ["Discovery Portal"]="http://localhost:8510/health"
  ["Grafana"]="http://localhost:3000/api/health"
  ["Prometheus"]="http://localhost:9099/-/healthy"
  ["Node Exporter"]="http://localhost:9100/metrics"
  ["DCGM Exporter"]="http://localhost:9400/metrics"
)

echo "=== HCLS AI Factory Health Check ==="
for service in "${!SERVICES[@]}"; do
  url="${SERVICES[$service]}"
  status=$(curl -s -o /dev/null -w "%{http_code}" "$url" 2>/dev/null || echo "ERR")
  if [ "$status" == "200" ]; then
    echo "[OK]   $service ($url)"
  else
    echo "[FAIL] $service ($url) — HTTP $status"
  fi
done

13. Monitoring and Observability¶

13.1 Grafana Setup (Port 3000)¶

# Start Grafana
docker compose up -d grafana

# Access at http://<dgx-spark-ip>:3000
# Default credentials: admin / changeme

Default Grafana credentials:

Parameter	Value
Username	admin
Password	changeme

13.2 Prometheus Configuration (Port 9099)¶

# monitoring/prometheus/prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

  - job_name: 'dcgm-exporter'
    static_configs:
      - targets: ['dcgm-exporter:9400']

  - job_name: 'rag-api'
    static_configs:
      - targets: ['rag-api:5001']
    metrics_path: /metrics

  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

13.3 DCGM Exporter (Port 9400)¶

Key GPU metrics exposed by the DCGM Exporter:

Metric	Description
`DCGM_FI_DEV_GPU_UTIL`	GPU utilization percentage
`DCGM_FI_DEV_FB_USED`	GPU framebuffer memory used (MB)
`DCGM_FI_DEV_FB_FREE`	GPU framebuffer memory free (MB)
`DCGM_FI_DEV_GPU_TEMP`	GPU temperature (Celsius)
`DCGM_FI_DEV_POWER_USAGE`	Power consumption (Watts)
`DCGM_FI_DEV_SM_CLOCK`	Streaming multiprocessor clock (MHz)
`DCGM_FI_DEV_MEM_CLOCK`	Memory clock (MHz)

13.4 Node Exporter (Port 9100)¶

The Node Exporter provides host system metrics — CPU, memory, disk, and network utilization — critical for monitoring the DGX Spark ARM64 system.

13.5 Key Dashboard Panels¶

Recommended Grafana dashboard panels:

Panel	Data Source	Purpose
GPU Utilization	DCGM	Track fq2bam and DeepVariant GPU usage
GPU Memory	DCGM	Monitor peak memory during genomics
CPU Utilization	Node Exporter	ARM64 core usage across 144 cores
Memory Usage	Node Exporter	Unified 128 GB LPDDR5x utilization
Disk I/O	Node Exporter	NVMe throughput for FASTQ/BAM processing
Network I/O	Node Exporter	API call throughput
Container Status	Docker	Service health overview

13.6 Alert Configuration¶

# Example alert rules for Prometheus
groups:
  - name: hcls-alerts
    rules:
      - alert: GPUMemoryHigh
        expr: DCGM_FI_DEV_FB_USED / (DCGM_FI_DEV_FB_USED + DCGM_FI_DEV_FB_FREE) > 0.95
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "GPU memory usage above 95%"

      - alert: ServiceDown
        expr: up == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Service {{ $labels.job }} is down"

14. Security Configuration¶

14.1 API Key Management¶

# Store API keys in .env file (not committed to git)
echo ".env" >> .gitignore

# Set restrictive permissions
chmod 600 .env

# Verify .env is in .gitignore
grep -q '.env' .gitignore && echo "OK: .env is gitignored"

Never commit API keys to version control. Use environment variables exclusively:

Variable	Sensitivity	Storage
`ANTHROPIC_API_KEY`	High	`.env` file, `chmod 600`
`NGC_API_KEY`	High	`.env` file, `chmod 600`
`GRAFANA_PASSWORD`	Medium	`.env` file

14.2 Docker Network Isolation¶

Docker Compose creates an isolated bridge network. Only explicitly exposed ports are accessible from the host:

# Verify network isolation
docker network ls | grep hcls
docker network inspect hcls-ai-factory_default

14.3 Container Security¶

Best practices applied to the deployment:

Run application containers as non-root users where possible
Use read-only filesystem mounts for reference data
Limit container capabilities with --cap-drop ALL
Pin container image versions (no latest tags in production)

14.4 Data Access Controls¶

# Set appropriate permissions on data directories
chmod -R 750 genomics/data/
chmod -R 750 rag/data/
chmod -R 750 discovery/data/

# Ensure only the deployment user can access sensitive data
chown -R $(whoami):$(whoami) genomics/data/ rag/data/ discovery/data/

15. Data Management¶

15.1 Storage Layout¶

Directory	Contents	Size	Persistence
`genomics/data/reference/`	GRCh38 genome	3.1 GB	Permanent
`genomics/data/fastq/`	Input FASTQ files	~200 GB	Keep until processed
`genomics/data/bam/`	Alignment output	~100 GB	Delete after VCF
`genomics/data/vcf/`	Variant calls	~1 GB	Permanent
`rag/data/clinvar/`	ClinVar database	~1.2 GB	Permanent
`rag/data/alphamissense/`	AlphaMissense DB	~4 GB	Permanent
`milvus_data` (Docker volume)	Vector index	~2 GB	Permanent
`discovery/data/`	Structures, molecules	Variable	Per-run

15.2 Intermediate File Cleanup¶

BAM files are the largest intermediate output (~100 GB). Once the VCF has been verified, BAM files can be deleted to reclaim storage:

# Verify VCF is complete before deleting BAM
zcat genomics/data/vcf/HG002.vcf.gz | grep -v '^#' | wc -l
# Confirm ~11.7M variants

# Delete intermediate BAM
rm -f genomics/data/bam/HG002.bam genomics/data/bam/HG002.bam.bai
echo "Reclaimed ~100 GB"

15.3 Milvus Data Persistence¶

Milvus data is stored in Docker volumes. To back up:

# Stop Milvus for consistent backup
docker compose stop milvus

# Back up volumes
docker run --rm \
  -v hcls-ai-factory_milvus_data:/data \
  -v $(pwd)/backups:/backup \
  alpine tar czf /backup/milvus_data_$(date +%Y%m%d).tar.gz /data

# Restart
docker compose start milvus

15.4 Backup Procedures¶

# Full backup script
#!/bin/bash
BACKUP_DIR=./backups/$(date +%Y%m%d)
mkdir -p $BACKUP_DIR

# Back up VCF results
cp -r genomics/data/vcf/ $BACKUP_DIR/vcf/

# Back up environment config (without secrets)
grep -v 'API_KEY' .env > $BACKUP_DIR/env_sanitized.txt

# Back up Milvus volumes
docker compose stop milvus
for vol in milvus_data etcd_data minio_data; do
  docker run --rm \
    -v hcls-ai-factory_${vol}:/data \
    -v $(pwd)/$BACKUP_DIR:/backup \
    alpine tar czf /backup/${vol}.tar.gz /data
done
docker compose start milvus

echo "Backup complete: $BACKUP_DIR"

16. Performance Tuning¶

16.1 GPU Memory Management¶

The DGX Spark uses 128 GB unified LPDDR5x memory shared between CPU and GPU. Key considerations:

Parabricks DeepVariant peaks at ~60 GB GPU memory — ensure other GPU services are idle during genomics processing
MolMIM and DiffDock each require ~8 GB — they can co-exist during drug discovery
Monitor with nvidia-smi and DCGM metrics during pipeline runs

# Monitor GPU memory in real-time
watch -n 1 nvidia-smi

# Check unified memory allocation
nvidia-smi --query-gpu=memory.used,memory.free,memory.total --format=csv

16.2 Milvus Index Tuning¶

Parameter	Default	Tuning Guidance
`nlist`	1024	Increase for larger collections (trade build time for search quality)
`nprobe`	16	Increase for higher recall (trade latency for accuracy)
`metric_type`	COSINE	Use COSINE for normalized BGE embeddings

# Search with tuned parameters
search_params = {
    "metric_type": "COSINE",
    "params": {"nprobe": 16}
}

results = collection.search(
    data=[query_embedding],
    anns_field="embedding",
    param=search_params,
    limit=10,
    output_fields=["gene", "clinical_significance", "text_summary"]
)

16.3 Docker Resource Limits¶

# Example resource limits in docker-compose.yml
services:
  rag-api:
    deploy:
      resources:
        limits:
          memory: 16G
          cpus: '16'
        reservations:
          memory: 4G
          cpus: '4'

16.4 NVMe I/O Optimization¶

For FASTQ and BAM processing, I/O throughput is critical:

# Check NVMe performance
fio --name=seqread --rw=read --bs=1M --size=1G --numjobs=4 --runtime=10 --group_reporting

# Ensure data directories are on NVMe
df -h genomics/data/

16.5 Pipeline Concurrency Settings¶

The Nextflow pipeline supports controlled concurrency:

// nextflow.config — concurrency settings
process {
    maxForks = 4          // Maximum parallel processes
    maxRetries = 2        // Retry failed processes
    errorStrategy = 'retry'
}

executor {
    queueSize = 8         // Maximum queued tasks
    pollInterval = '5 sec'
}

17. Troubleshooting Guide¶

17.1 Service Not Starting¶

# Check service logs
docker compose logs <service-name> --tail 50

# Check if port is already in use
ss -tlnp | grep <port>

# Restart a specific service
docker compose restart <service-name>

17.2 GPU Out of Memory¶

# Check current GPU memory usage
nvidia-smi

# Kill any orphaned GPU processes
sudo fuser -v /dev/nvidia*

# Reduce Parabricks memory by limiting GPU threads
# Add --gpu-mem-limit flag if available

# Ensure NIM services are stopped during genomics
docker compose stop molmim diffdock

17.3 Milvus Connection Issues¶

# Verify Milvus dependencies are running
docker compose ps etcd minio milvus

# Check Milvus logs for errors
docker compose logs milvus --tail 100

# Test connectivity
curl -s http://localhost:19530/v1/health/ready

# Reset Milvus if corrupted
docker compose down milvus etcd minio
docker volume rm hcls-ai-factory_milvus_data hcls-ai-factory_etcd_data hcls-ai-factory_minio_data
docker compose up -d etcd minio milvus

17.4 BioNeMo NIM Not Ready¶

# NIM services may take 2-5 minutes to load models
# Check logs for model loading progress
docker compose logs molmim --tail 50
docker compose logs diffdock --tail 50

# Verify GPU is available for NIM
nvidia-smi | grep -i "molmim\|diffdock"

# Restart if stuck
docker compose restart molmim diffdock

17.5 Parabricks Failures¶

Error	Cause	Resolution
`CUDA out of memory`	Insufficient GPU memory	Stop other GPU services first
`Reference index not found`	Missing .fai file	Run `samtools faidx GRCh38.fa`
`Input file not found`	Wrong FASTQ path	Check volume mount paths
`Unsupported GPU`	Driver mismatch	Update NVIDIA driver

17.6 Claude API Errors¶

Error	Cause	Resolution
`401 Unauthorized`	Invalid API key	Verify `ANTHROPIC_API_KEY` in `.env`
`429 Rate Limited`	Too many requests	Implement exponential backoff
`500 Server Error`	Anthropic service issue	Retry after 30 seconds
`Connection refused`	No internet	Check network connectivity

17.7 Docker Issues¶

# Docker daemon not running
sudo systemctl start docker
sudo systemctl enable docker

# Disk space full
docker system prune -a --volumes
df -h /var/lib/docker

# Permission denied
sudo usermod -aG docker $USER
newgrp docker

17.8 Common Error Messages Table¶

Error Message	Service	Resolution
`Connection refused on port 19530`	Milvus	Start etcd + MinIO first, then Milvus
`NVIDIA driver not found`	Docker	Install NVIDIA Container Toolkit
`Model not loaded`	MolMIM/DiffDock	Wait 2-5 minutes for model loading
`Collection not found`	Milvus	Run schema creation script (Section 9.2)
`API key not set`	RAG API	Set `ANTHROPIC_API_KEY` in `.env`
`Out of disk space`	Parabricks	Clean BAM intermediates, expand storage
`Permission denied: /data`	Any	Check volume mount permissions

18. VCP/FTD Demo Walkthrough¶

18.1 Demo Overview¶

The VCP (Valosin-Containing Protein) / FTD (Frontotemporal Dementia) demo showcases the full three-stage pipeline using a known pathogenic variant:

Parameter	Value
Variant	rs188935092
Location	chr9:35065263 G>A
Gene	VCP
ClinVar Classification	Pathogenic
AlphaMissense Score	0.87 (pathogenic, threshold >0.564)
Disease	Inclusion body myopathy with Paget disease and FTD
Seed Molecule	CB-5083 (VCP/p97 inhibitor)
PDB Structures	8OOI, 9DIL, 7K56, 5FTK
Binding Domain	D2 ATPase domain, ~450 cubic angstroms
Druggability Score	0.92

18.2 Pre-Demo Setup¶

# Ensure all services are running
bash scripts/validate_deployment.sh

# Verify Milvus has the VCP variant loaded
python3 -c "
from pymilvus import connections, Collection
connections.connect(host='localhost', port=19530)
col = Collection('genomic_evidence')
col.load()
results = col.query('gene == \"VCP\"', output_fields=['rsid', 'clinical_significance', 'am_pathogenicity'])
print(f'VCP variants found: {len(results)}')
for r in results[:3]:
    print(r)
"

18.3 Running the Demo¶

# Run the demo pipeline mode
python3 scripts/run_pipeline.py --mode demo

# Or via Nextflow
nextflow run main.nf -profile dgx_spark --mode demo

Step-by-step execution:

Stage 1 (Genomics): Process demo FASTQ subset through Parabricks fq2bam and DeepVariant
Stage 2 (RAG): Annotate VCP variant with ClinVar (Pathogenic) and AlphaMissense (0.87), embed into Milvus, query Claude for clinical interpretation
Stage 3 (Drug Discovery): Retrieve PDB structures (8OOI, 9DIL, 7K56, 5FTK), generate molecules from CB-5083 seed via MolMIM, dock with DiffDock, rank by composite score

18.4 Expected Results¶

Metric	Expected Value
Candidates generated	100
Pass Lipinski Rule of Five	87
QED > 0.67 (drug-like)	72
Top docking scores	-8.2 to -11.4 kcal/mol
Composite score range	0.68 - 0.89

Top candidate characteristics:

Property	Range
Molecular Weight	300 - 500 Da
LogP	1.5 - 4.5
QED	0.67 - 0.92
TPSA	40 - 130 squared angstroms
Docking Score	-8.2 to -11.4 kcal/mol
Composite Score	0.68 - 0.89

19. Scaling Beyond DGX Spark¶

19.1 Phase 1 to Phase 3 Roadmap¶

Phase	Hardware	Scale	Use Case
Phase 1	DGX Spark	Single workstation	Development, demos, single-patient analysis
Phase 2	DGX B200	Single server, multi-GPU	Production cohort analysis
Phase 3	DGX SuperPOD	Multi-node cluster	Population-scale genomics

19.2 Kubernetes Migration Path¶

For Phase 2 and beyond, migrate from Docker Compose to Kubernetes:

Replace docker-compose.yml with Helm charts
Use NVIDIA GPU Operator for GPU scheduling
Deploy Milvus Cluster mode (distributed) instead of standalone
Use persistent volume claims (PVCs) for data storage
Implement horizontal pod autoscaling for RAG API

19.3 Multi-GPU Considerations¶

Parabricks supports --num-gpus for multi-GPU parallelism
MolMIM and DiffDock can be replicated across GPUs
Milvus supports distributed deployment with multiple query nodes

19.4 NVIDIA FLARE for Federated Learning¶

For multi-institutional deployments, NVIDIA FLARE enables federated learning across DGX Spark nodes without sharing raw patient data.

20. Appendix A: Complete Configuration Reference¶

20.1 All Environment Variables¶

Variable	Default	Description
`ANTHROPIC_API_KEY`	(required)	Anthropic API key for Claude
`NGC_API_KEY`	(required)	NVIDIA NGC API key
`REFERENCE_GENOME`	`/data/reference/GRCh38.fa`	Path to reference genome
`MILVUS_HOST`	`localhost`	Milvus server hostname
`MILVUS_PORT`	`19530`	Milvus server port
`MOLMIM_URL`	`http://localhost:8001`	MolMIM NIM endpoint
`DIFFDOCK_URL`	`http://localhost:8002`	DiffDock NIM endpoint
`CLAUDE_MODEL`	`claude-sonnet-4-20250514`	Claude model identifier
`CLAUDE_TEMPERATURE`	`0.3`	Claude sampling temperature
`PIPELINE_MODE`	`full`	Pipeline execution mode
`NUM_CANDIDATES`	`100`	Number of molecules to generate
`MIN_QED`	`0.3`	Minimum QED threshold
`MIN_DOCK_SCORE`	`-6.0`	Minimum docking score (kcal/mol)
`GRAFANA_USER`	`admin`	Grafana admin username
`GRAFANA_PASSWORD`	`changeme`	Grafana admin password

20.2 AlphaMissense Thresholds¶

Classification	Score Range
Pathogenic	> 0.564
Ambiguous	0.34 - 0.564
Benign	< 0.34

20.3 Scoring Weights¶

Component	Weight
Generation Score	0.30 (30%)
Docking Score (normalized)	0.40 (40%)
QED Score	0.30 (30%)

20.4 Drug-Likeness Thresholds¶

Property	Threshold	Rule
Molecular Weight	<= 500 Da	Lipinski
LogP	<= 5	Lipinski
H-Bond Donors	<= 5	Lipinski
H-Bond Acceptors	<= 10	Lipinski
QED	> 0.67	Drug-likeness
TPSA	< 140 squared angstroms	Oral bioavailability

20.5 Docking Score Interpretation¶

Score (kcal/mol)	Binding Affinity	Assessment
< -10.0	Excellent	Strong candidate
-8.0 to -10.0	Strong	Viable candidate
-6.0 to -8.0	Moderate	Marginal candidate
> -6.0	Weak	Poor candidate

Normalization formula:

normalized = max(0, min(1, (10 + dock_score) / 20))

21. Appendix B: API Reference¶

21.1 MolMIM API (Port 8001)¶

Generate Molecules:

// POST http://localhost:8001/generate
// Request:
{
  "smiles": "CC1=CC=C(C=C1)C(=O)NC2=CC=CC=C2",
  "num_molecules": 100,
  "algorithm": "CMA-ES",
  "property_name": "QED",
  "min_similarity": 0.3,
  "particles": 30,
  "iterations": 10
}

// Response:
{
  "generated_molecules": [
    {
      "smiles": "CC1=CC=C(C=C1)C(=O)NC2=CC=C(F)C=C2",
      "score": 0.85,
      "similarity": 0.78
    }
  ]
}

Health Check:

GET http://localhost:8001/v1/health/ready
Response: {"status": "ready"}

21.2 DiffDock API (Port 8002)¶

Molecular Docking:

// POST http://localhost:8002/molecular-docking/diffdock/generate
// Request:
{
  "protein": "<PDB file content>",
  "ligand": "<SDF file content>",
  "num_poses": 10
}

// Response:
{
  "poses": [
    {
      "pose_id": 0,
      "confidence": 0.95,
      "score": -9.7,
      "ligand_sdf": "<docked SDF content>"
    }
  ]
}

Health Check:

GET http://localhost:8002/v1/health/ready
Response: {"status": "ready"}

21.3 RAG API Endpoints (Port 5001)¶

Method	Endpoint	Description
GET	`/health`	Service health check
POST	`/query`	RAG query with context retrieval
POST	`/search`	Vector similarity search
GET	`/collections`	List Milvus collections
GET	`/stats`	Collection statistics

RAG Query Example:

// POST http://localhost:5001/query
// Request:
{
  "question": "What pathogenic variants are found in the VCP gene?",
  "top_k": 10,
  "filters": {
    "gene": "VCP",
    "impact": "HIGH"
  }
}

// Response:
{
  "answer": "The VCP gene contains the variant rs188935092...",
  "sources": [
    {
      "gene": "VCP",
      "rsid": "rs188935092",
      "clinical_significance": "Pathogenic",
      "am_pathogenicity": 0.87,
      "similarity_score": 0.94
    }
  ],
  "model": "claude-sonnet-4-20250514",
  "tokens_used": 1847
}

21.4 Health Check Endpoints Summary¶

Service	Endpoint	Method
Genomics Portal	`/health`	GET
RAG API	`/health`	GET
Milvus	`/v1/health/ready`	GET
Attu	`/api/health`	GET
Streamlit Chat	`/healthz`	GET
MolMIM	`/v1/health/ready`	GET
DiffDock	`/v1/health/ready`	GET
Discovery UI	`/health`	GET
Discovery Portal	`/health`	GET
Grafana	`/api/health`	GET
Prometheus	`/-/healthy`	GET
Node Exporter	`/metrics`	GET
DCGM Exporter	`/metrics`	GET

22. Appendix C: Schema Definitions¶

22.1 Milvus Collection Schema¶

Collection: genomic_evidence

#	Field	Data Type	Constraints	Description
1	`id`	INT64	Primary Key, Auto ID	Unique record identifier
2	`embedding`	FLOAT_VECTOR	dim=384	BGE-small-en-v1.5 embedding
3	`chrom`	VARCHAR	max_length=10	Chromosome (chr1-22, chrX, chrY)
4	`pos`	INT64	—	Genomic position (1-based)
5	`ref`	VARCHAR	max_length=500	Reference allele
6	`alt`	VARCHAR	max_length=500	Alternate allele
7	`qual`	FLOAT	—	Variant quality score
8	`gene`	VARCHAR	max_length=100	HGNC gene symbol
9	`consequence`	VARCHAR	max_length=200	VEP consequence term
10	`impact`	VARCHAR	max_length=20	HIGH/MODERATE/LOW/MODIFIER
11	`genotype`	VARCHAR	max_length=10	Sample genotype (0/1, 1/1)
12	`text_summary`	VARCHAR	max_length=5000	Natural-language summary
13	`clinical_significance`	VARCHAR	max_length=200	ClinVar classification
14	`rsid`	VARCHAR	max_length=20	dbSNP RS identifier
15	`disease_associations`	VARCHAR	max_length=2000	Associated diseases
16	`am_pathogenicity`	FLOAT	0.0-1.0	AlphaMissense pathogenicity
17	`am_class`	VARCHAR	max_length=20	pathogenic/ambiguous/benign

Index configuration:

Parameter	Value
Index Type	IVF_FLAT
Metric Type	COSINE
nlist	1024
nprobe (search)	16

22.2 Pydantic Data Models¶

from pydantic import BaseModel, Field
from typing import List, Optional
from enum import Enum

class TargetHypothesis(BaseModel):
    """Genomic target identified from variant analysis."""
    gene: str
    variant_id: str
    rsid: Optional[str]
    clinical_significance: str
    am_pathogenicity: Optional[float]
    am_class: Optional[str]
    therapeutic_area: str
    druggability_score: float
    rationale: str

class StructureInfo(BaseModel):
    """PDB structure information for a target protein."""
    pdb_id: str
    resolution: float
    method: str
    chain: str
    binding_site_volume: Optional[float]

class StructureManifest(BaseModel):
    """Collection of structures for a target."""
    target_gene: str
    uniprot_id: str
    structures: List[StructureInfo]
    selected_structure: str

class MoleculeProperties(BaseModel):
    """Chemical properties of a generated molecule."""
    molecular_weight: float
    logp: float
    hbd: int
    hba: int
    tpsa: float
    qed: float
    lipinski_pass: bool

class GeneratedMolecule(BaseModel):
    """Molecule generated by MolMIM."""
    smiles: str
    generation_score: float
    similarity_to_seed: float
    properties: MoleculeProperties

class DockingResult(BaseModel):
    """Molecular docking result from DiffDock."""
    smiles: str
    dock_score: float  # kcal/mol (negative = better)
    confidence: float
    pose_sdf: str

class RankedCandidate(BaseModel):
    """Final ranked drug candidate with composite score."""
    rank: int
    smiles: str
    generation_score: float
    dock_score: float
    dock_score_normalized: float
    qed: float
    composite_score: float  # 0.3*gen + 0.4*dock + 0.3*qed
    lipinski_pass: bool
    properties: MoleculeProperties

class PipelineConfig(BaseModel):
    """Configuration for a pipeline run."""
    mode: str = "full"
    target_gene: Optional[str]
    seed_smiles: Optional[str]
    num_candidates: int = 100
    min_qed: float = 0.67
    min_dock_score: float = -6.0
    molmim_url: str = "http://localhost:8001"
    diffdock_url: str = "http://localhost:8002"
    claude_model: str = "claude-sonnet-4-20250514"
    claude_temperature: float = 0.3

class PipelineRun(BaseModel):
    """Record of a pipeline execution."""
    run_id: str
    config: PipelineConfig
    target: Optional[TargetHypothesis]
    structures: Optional[StructureManifest]
    candidates_generated: int = 0
    candidates_passed_lipinski: int = 0
    candidates_drug_like: int = 0
    top_composite_score: float = 0.0
    status: str = "initialized"

23. Appendix D: Docker Image Reference¶

23.1 All Container Images¶

Service	Image	Tag	Architecture
Parabricks	nvcr.io/nvidia/clara/clara-parabricks	4.6.0-1	ARM64 (aarch64)
Milvus	milvusdb/milvus	v2.4-latest	ARM64
MolMIM	nvcr.io/nvidia/clara/bionemo-molmim	1.0	ARM64
DiffDock	nvcr.io/nvidia/clara/diffdock	1.0	ARM64
Grafana	grafana/grafana	10.2.2	ARM64
Prometheus	prom/prometheus	v2.48.0	ARM64
Node Exporter	prom/node-exporter	latest	ARM64
DCGM Exporter	nvcr.io/nvidia/k8s/dcgm-exporter	latest	ARM64
etcd	quay.io/coreos/etcd	v3.5.5	ARM64
MinIO	minio/minio	latest	ARM64
Attu	zilliz/attu	latest	ARM64

23.2 ARM64 Compatibility Notes¶

The DGX Spark uses an ARM64 (aarch64) processor. All container images must be ARM64-compatible:

NVIDIA NGC images for Parabricks, BioNeMo, and DCGM include ARM64 variants
Community images (Grafana, Prometheus, MinIO, etcd) provide multi-arch manifests
Custom application images must be built with --platform linux/arm64
If building locally, ensure the base image supports ARM64

# Verify image architecture
docker inspect --format='{{.Architecture}}' <image-name>
# Expected: arm64

# Build for ARM64 explicitly
docker build --platform linux/arm64 -t my-service:latest ./my-service/

24. Appendix E: Validation Checklists¶

24.1 Pre-Deployment Checklist¶

#	Item	Command / Check	Expected
1	DGX Spark hardware	`uname -m`	aarch64
2	GPU detected	`nvidia-smi`	GB10 GPU listed
3	Docker installed	`docker --version`	24.0+
4	Docker Compose V2	`docker compose version`	v2.x
5	NVIDIA runtime	`docker info \\| grep nvidia`	nvidia listed
6	Python version	`python3 --version`	3.10+
7	Disk space	`df -h /`	>= 320 GB free
8	Reference genome	`ls genomics/data/reference/GRCh38.fa`	File exists, ~3.1 GB
9	ClinVar data	`ls rag/data/clinvar/clinvar.vcf.gz`	File exists, ~1.2 GB
10	AlphaMissense data	`ls rag/data/alphamissense/AlphaMissense_hg38.tsv.gz`	File exists, ~4 GB
11	API keys configured	`grep ANTHROPIC_API_KEY .env`	Key set (not empty)
12	NGC key configured	`grep NGC_API_KEY .env`	Key set (not empty)
13	`.env` permissions	`stat -c %a .env`	600
14	`.env` in .gitignore	`grep .env .gitignore`	Present

24.2 Post-Deployment Checklist¶

#	Item	Command / Check	Expected
1	All containers running	`docker compose ps`	14+ services "Up"
2	Landing Page	`curl http://localhost:8080`	HTTP 200
3	Genomics Portal	`curl http://localhost:5000/health`	`{"status":"healthy"}`
4	RAG API	`curl http://localhost:5001/health`	`{"status":"healthy"}`
5	Milvus ready	`curl http://localhost:19530/v1/health/ready`	`{"status":"ok"}`
6	Attu UI	`curl -o /dev/null -w "%{http_code}" http://localhost:8000`	200
7	Streamlit Chat	`curl -o /dev/null -w "%{http_code}" http://localhost:8501`	200
8	MolMIM ready	`curl http://localhost:8001/v1/health/ready`	`{"status":"ready"}`
9	DiffDock ready	`curl http://localhost:8002/v1/health/ready`	`{"status":"ready"}`
10	Discovery UI	`curl http://localhost:8505/health`	`{"status":"healthy"}`
11	Discovery Portal	`curl http://localhost:8510/health`	`{"status":"healthy"}`
12	Grafana	`curl http://localhost:3000/api/health`	`{"status":"ok"}`
13	Prometheus	`curl http://localhost:9099/-/healthy`	HTTP 200
14	DCGM metrics	`curl http://localhost:9400/metrics`	Metrics text
15	Milvus collection	Python: `Collection("genomic_evidence").num_entities`	> 0

24.3 Demo Readiness Checklist¶

#	Item	Check	Expected
1	All services healthy	Run `validate_deployment.sh`	All [OK]
2	VCP variant in Milvus	Query gene="VCP"	rs188935092 found
3	ClinVar annotation	VCP classification	Pathogenic
4	AlphaMissense score	VCP am_pathogenicity	0.87
5	PDB structures accessible	Query RCSB for VCP	8OOI, 9DIL, 7K56, 5FTK
6	MolMIM generates	Test generation from CB-5083	Molecules returned
7	DiffDock docks	Test docking against VCP structure	Scores returned
8	Claude responds	Test RAG query about VCP	Coherent response
9	Grafana dashboards	Login at port 3000	Dashboards visible
10	GPU metrics flowing	Check DCGM in Grafana	GPU util, memory shown

25. Appendix F: Glossary¶

25.1 Genomics Terms¶

Term	Definition
FASTQ	Text-based format for storing nucleotide sequences and quality scores
BAM	Binary Alignment Map — compressed format for aligned sequencing reads
VCF	Variant Call Format — standard format for genomic variants
SNP	Single Nucleotide Polymorphism — single base-pair variant
Indel	Insertion or deletion of nucleotides in the genome
WGS	Whole Genome Sequencing — sequencing of entire genome
GRCh38	Genome Reference Consortium Human Build 38 — current reference genome
GIAB	Genome in a Bottle — NIST benchmark samples (e.g., HG002)
ClinVar	NCBI database of clinically relevant genomic variants
VEP	Variant Effect Predictor — functional annotation tool
AlphaMissense	DeepMind model predicting missense variant pathogenicity
Paired-end	Sequencing both ends of a DNA fragment for improved alignment
Coverage (30x)	Average number of reads covering each position in the genome

25.2 ML/AI Terms¶

Term	Definition
RAG	Retrieval-Augmented Generation — combining search with LLM generation
Embedding	Dense vector representation of text or data
BGE	BAAI General Embedding — sentence transformer model family
IVF_FLAT	Inverted File Index — approximate nearest neighbor search method
COSINE	Cosine similarity — metric for comparing vector directions
NIM	NVIDIA Inference Microservice — containerized model serving
LLM	Large Language Model — e.g., Claude
Vector Database	Database optimized for similarity search on dense vectors
nlist	Number of clusters in IVF index (build-time parameter)
nprobe	Number of clusters to search at query time (recall vs. latency)

25.3 Drug Discovery Terms¶

Term	Definition
SMILES	Simplified Molecular Input Line Entry System — text notation for molecules
PDB	Protein Data Bank — repository of 3D protein structures
Molecular Docking	Computational prediction of ligand-protein binding pose and affinity
QED	Quantitative Estimate of Drug-likeness — composite drug-likeness score (0-1)
Lipinski Rule of Five	Empirical rules predicting oral bioavailability
TPSA	Topological Polar Surface Area — predictor of membrane permeability
LogP	Partition coefficient — measure of lipophilicity
HBD / HBA	Hydrogen Bond Donors / Acceptors
Conformer	3D spatial arrangement of a molecule's atoms
Binding Affinity	Strength of interaction between a drug molecule and its target protein
kcal/mol	Kilocalories per mole — unit for binding energy (more negative = stronger)
MolMIM	Molecule generation model from NVIDIA BioNeMo
DiffDock	Diffusion-based molecular docking model
Druggability	Assessment of whether a protein target can be modulated by a small molecule
CB-5083	VCP/p97 inhibitor used as seed molecule in the VCP demo
RDKit	Open-source cheminformatics toolkit for molecular analysis

This deployment guide is maintained as part of the HCLS AI Factory open-source project. For updates, issues, and contributions, visit the project repository on GitHub.