Quick-Start Checklist
Deploy the HCLS AI Factory in about 30 minutes, not counting the one-time data download in Step 3. This checklist extracts the critical steps from the full Deployment Guide.
Prerequisites
- Hardware: NVIDIA DGX Spark (or equivalent with GB10 GPU, 128 GB unified memory)
- Storage: 500 GB+ free space for genomics data and models
- Network: Internet access for pulling containers and reference data
- Software: Docker 24+, Docker Compose 2.20+, Git
- Download tools: `aria2c`, `pigz` (`sudo apt-get install -y aria2 pigz`)
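The prerequisites above can be checked before cloning. The following is a minimal preflight sketch, not a script shipped with the repository: it assumes the tools are invoked as `docker`, `git`, `aria2c`, and `pigz` on `PATH`, and that data will live on the filesystem holding the current directory. It does not check tool versions or the GPU.

```python
import shutil

REQUIRED_TOOLS = ["docker", "git", "aria2c", "pigz"]  # from the prerequisites list
MIN_FREE_GB = 500  # storage figure from the prerequisites list


def missing_tools(tools):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def free_gb(path="."):
    """Return free space, in GB (10^9 bytes), on the filesystem holding path."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    print("Missing tools:", ", ".join(missing) if missing else "none")
    space = free_gb(".")
    status = "OK" if space >= MIN_FREE_GB else "below the 500 GB target"
    print(f"Free space: {space:.0f} GB ({status})")
```

Run it from the directory where you plan to clone the repository so the free-space figure refers to the right disk.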
Step 1: Clone the Repository
Step 2: Environment Setup
Required variables:
| Variable | Description |
|---|---|
| `ANTHROPIC_API_KEY` | Claude API key for RAG chat |
| `NGC_API_KEY` | NVIDIA NGC key for BioNeMo models |
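Before starting services, it can help to confirm that both variables are actually set. The sketch below is illustrative only: it assumes the variables live in a `.env` file in the repository root as plain `KEY=VALUE` lines, and the helper names `parse_env` and `missing_vars` are hypothetical, not part of the project.

```python
from pathlib import Path

REQUIRED_VARS = ("ANTHROPIC_API_KEY", "NGC_API_KEY")  # from the table above


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_vars(text, required=REQUIRED_VARS):
    """Return required variables that are absent or empty."""
    values = parse_env(text)
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    env_file = Path(".env")  # assumed location: repository root
    text = env_file.read_text() if env_file.exists() else ""
    print("Missing variables:", missing_vars(text) or "none")
```

The parser is deliberately simple and does not handle `export` prefixes or multi-line values.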
Step 3: Download Required Data (Stage 0)
```bash
# Download all data (~500 GB, one-time)
./setup-data.sh --all

# Or download by stage
./setup-data.sh --stage2   # ClinVar + AlphaMissense (~2 GB, fast)
./setup-data.sh --stage1   # HG002 FASTQ + reference (~300 GB, 2-6 hours)

# Check status
./setup-data.sh --status
```
Note: Stage 0 (data acquisition) is a one-time step and the most time-consuming part of setup. See Stage 0: Data Acquisition for troubleshooting FASTQ checksum failures, disk space issues, and resuming interrupted downloads.
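If a FASTQ checksum failure is suspected, a file's digest can be recomputed by hand and compared with the published value. This is a generic sketch rather than the check `setup-data.sh` performs: the hash algorithm and the location of the expected checksums are not stated here, so the algorithm is a parameter (`sha256` by default, `md5` also works where available).

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files are never fully loaded into memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected, algorithm="sha256"):
    """Compare the computed digest with an expected one, ignoring case and whitespace."""
    return file_digest(path, algorithm) == expected.strip().lower()
```

A mismatch on a file of the expected size usually points to a corrupted or interrupted download, in which case re-downloading that one file is cheaper than restarting the stage.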
Step 4: Start Core Services
Expected services:
- `genomics-portal`: Parabricks + DeepVariant (port 5000)
- `rag-api`: RAG engine + Claude integration (port 5001)
- `streamlit-chat`: Chat UI (port 8501)
- `molmim` / `diffdock`: BioNeMo NIMs (ports 8001, 8002)
- `discovery-ui`: Drug discovery interface (port 8505)
- `milvus` / `etcd` / `minio`: Vector database stack (port 19530)
- `grafana`: Monitoring dashboard (port 3000)
- `landing-page`: Service health monitor (port 8080)
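Once the services are up, a quick way to see which ones are accepting connections is to probe the listed ports. The sketch below assumes everything is published on `127.0.0.1`, and it assumes `molmim` maps to 8001 and `diffdock` to 8002 (the list gives the two ports together, so that pairing is a guess). An open port only shows that something is listening, not that the service is healthy.

```python
import socket

# Ports taken from the service list above; the molmim/diffdock split is assumed.
SERVICE_PORTS = {
    "genomics-portal": 5000,
    "rag-api": 5001,
    "streamlit-chat": 8501,
    "molmim": 8001,
    "diffdock": 8002,
    "discovery-ui": 8505,
    "milvus": 19530,
    "grafana": 3000,
    "landing-page": 8080,
}


def port_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def closed_services(ports, host="127.0.0.1"):
    """Return the names of services whose port does not accept connections."""
    return [name for name, port in ports.items() if not port_open(port, host)]


if __name__ == "__main__":
    down = closed_services(SERVICE_PORTS)
    print("Not listening:", ", ".join(down) if down else "none")
```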
Step 5: Verify GPU Access
You should see your GPU(s) listed with available memory.
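One possible way to script this check is to query `nvidia-smi` and parse its CSV output. This is a sketch under assumptions, not the project's own verification command: it runs `nvidia-smi` wherever the script is executed (run it inside a container to test container access), and it keeps the memory field as raw text because unified-memory systems may not report a numeric value.

```python
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"]


def parse_gpus(output):
    """Turn 'name, memory' CSV lines into (name, memory) tuples."""
    gpus = []
    for line in output.splitlines():
        if not line.strip():
            continue
        name, _, memory = line.partition(",")
        gpus.append((name.strip(), memory.strip()))
    return gpus


def list_gpus():
    """Run nvidia-smi and return parsed GPUs, or an empty list if it is unavailable."""
    try:
        result = subprocess.run(
            QUERY, capture_output=True, text=True, check=True, timeout=30
        )
    except (FileNotFoundError, subprocess.CalledProcessError, subprocess.TimeoutExpired):
        return []
    return parse_gpus(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    for name, memory in gpus:
        print(f"{name}: {memory}")
    if not gpus:
        print("No GPU reported by nvidia-smi")
```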
Step 6: Run a Test Pipeline
```bash
# Run the demo pipeline
python run_pipeline.py --mode demo

# Expected output: variant calls in output/demo/
```
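To confirm that the demo run wrote something, the output directory can be listed programmatically. The sketch below only assumes that results land under `output/demo/` as stated above; the file names and formats the pipeline produces are not specified here, so it reports every non-empty file it finds.

```python
from pathlib import Path


def non_empty_files(directory):
    """Return sorted relative paths of non-empty files under directory."""
    root = Path(directory)
    if not root.is_dir():
        return []
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_size > 0
    )


if __name__ == "__main__":
    found = non_empty_files("output/demo")
    if found:
        print("\n".join(found))
    else:
        print("No output files found in output/demo/")
```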
Step 7: Access the UI
Open your browser:
| Service | URL | Purpose |
|---|---|---|
| Streamlit Chat | http://localhost:8501 | Query variants with Claude |
| Grafana | http://localhost:3000 | Monitor pipeline metrics |
| Landing Page | http://localhost:8080 | Service health dashboard |
| RAG API | http://localhost:5001 | REST API for variant queries |
Troubleshooting
Services won't start
GPU not detected
Out of memory
Reduce batch sizes in .env:
Next Steps
- Full deployment: Deployment Guide
- Run the demo: Demo Guide
- Understand the architecture: White Paper
Success Criteria
You're ready when:
- All Docker services show `Up` status
- GPU is visible in containers
- Streamlit chat responds to queries
- Grafana shows pipeline metrics
Total time: ~30 minutes (excluding data download — see Step 3 and DATA_SETUP.md)
Need help? Open an issue on GitHub.