HCLS AI Factory — Intelligence Report
Patient: GEN-2026-0087 | Pipeline Run: HCLS-VCP-2026-0087 | Mode: Full (Stages 1→2→3)
License: Apache 2.0 | Author: Adam Jones | Date: February 2026
| Field | Value |
| --- | --- |
| Patient ID | GEN-2026-0087 |
| Run ID | HCLS-VCP-2026-0087 |
| Pipeline Version | HLS-Pipeline v1.0.0 |
| Pipeline Mode | Full (Genomics → RAG/Chat → Drug Discovery) |
| Hardware | NVIDIA DGX Spark (GB10 GPU, 128 GB unified) |
| Total Pipeline Duration | 4 hours 12 minutes |
| Report Generated | February 2026 |
| Status | COMPLETE: 100 drug candidates generated, 98 chemically valid candidates docked and ranked |
1. Genomics Summary (Stage 1)
| Parameter | Value |
| --- | --- |
| Sample | HG002 (NA24385, GIAB Ashkenazi male) |
| Sequencing | Illumina, 30× WGS, 2×250 bp paired-end |
| FASTQ Size | 198.7 GB (R1: 99.4 GB, R2: 99.3 GB) |
| Reference Genome | GRCh38 (3.1 GB) |
Parabricks Execution
| Step | Tool | Duration | GPU Utilization | Peak Memory |
| --- | --- | --- | --- | --- |
| Alignment | BWA-MEM2 (fq2bam) | 34 min | 82% | 38 GB |
| Variant Calling | Google DeepVariant | 22 min | 91% | 54 GB |
| Total Stage 1 | | 56 min | | |
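The two steps above map onto the `fq2bam` and `deepvariant` subcommands of the Parabricks `pbrun` launcher. The following is a minimal sketch of how the two command lines could be assembled. The file paths are hypothetical placeholders, and the report does not state which additional options this run used.

```python
from typing import List


def fq2bam_cmd(ref: str, fq1: str, fq2: str, out_bam: str) -> List[str]:
    """Build the argv for the GPU alignment step (paired-end FASTQ to BAM)."""
    return ["pbrun", "fq2bam", "--ref", ref, "--in-fq", fq1, fq2, "--out-bam", out_bam]


def deepvariant_cmd(ref: str, in_bam: str, out_vcf: str) -> List[str]:
    """Build the argv for the GPU variant calling step (BAM to VCF)."""
    return ["pbrun", "deepvariant", "--ref", ref, "--in-bam", in_bam, "--out-variants", out_vcf]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    ref = "GRCh38.fa"
    align = fq2bam_cmd(ref, "HG002_R1.fastq.gz", "HG002_R2.fastq.gz", "HG002.bam")
    call = deepvariant_cmd(ref, "HG002.bam", "HG002.vcf")
    print(" ".join(align))
    print(" ".join(call))
```

Building the commands as lists keeps them safe to pass to `subprocess.run` without shell quoting.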
VCF Output
| Metric | Count |
| --- | --- |
| Total Variants Called | 11,724,891 |
| PASS Quality (QUAL>30) | 3,487,216 |
| SNPs | 4,198,433 |
| Indels | 1,012,548 |
| Multi-allelic Sites | 148,762 |
| Coding Region Variants | 35,616 |
| Transition/Transversion Ratio | 2.07 |
Quality Assessment: PASS. The Ts/Tv ratio (2.07) falls within the expected range (2.0 to 2.1), and the variant counts are consistent with 30× WGS of an Ashkenazi-ancestry sample.
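The Ts/Tv check can be illustrated with a short sketch. It counts transitions (A↔G, C↔T) against all other single-base substitutions and compares the ratio with the 2.0 to 2.1 range quoted above. The function names and the toy input are illustrative assumptions, not the pipeline's own code.

```python
from typing import Iterable, Tuple

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ts_tv_ratio(snps: Iterable[Tuple[str, str]]) -> float:
    """Return transitions / transversions for (ref, alt) single-base pairs."""
    pairs = [(ref.upper(), alt.upper()) for ref, alt in snps]
    ts = sum(1 for pair in pairs if pair in TRANSITIONS)
    tv = len(pairs) - ts
    if tv == 0:
        raise ValueError("no transversions: ratio is undefined")
    return ts / tv


def within_expected(ratio: float, low: float = 2.0, high: float = 2.1) -> bool:
    """Check the ratio against the range quoted in the report."""
    return low <= ratio <= high


if __name__ == "__main__":
    toy = [("A", "G"), ("C", "T"), ("A", "C")]  # 2 transitions, 1 transversion
    print(ts_tv_ratio(toy), within_expected(2.07))
```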
2. Annotation & Target Identification (Stage 2)
Annotation Funnel
| Stage | Variants | Filter Applied |
| --- | --- | --- |
| Raw VCF | 11,724,891 | None |
| Quality Filter | 3,487,216 | QUAL > 30 |
| ClinVar Annotated | 35,616 | Clinical significance match |
| AlphaMissense Scored | 6,831 | AI pathogenicity prediction |
| HIGH Impact + Pathogenic | 2,412 | Actionable subset |
| Druggable Gene Targets | 847 | Knowledge base match |
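To make the funnel easier to read, the sketch below computes how much of the previous stage and of the raw VCF each filter retains. The counts are taken from the table above; the helper name is an assumption made for illustration.

```python
from typing import List, Tuple

FUNNEL = [
    ("Raw VCF", 11_724_891),
    ("Quality Filter", 3_487_216),
    ("ClinVar Annotated", 35_616),
    ("AlphaMissense Scored", 6_831),
    ("HIGH Impact + Pathogenic", 2_412),
    ("Druggable Gene Targets", 847),
]


def funnel_retention(stages: List[Tuple[str, int]]) -> List[Tuple[str, float, float]]:
    """Return (stage, fraction of previous stage, fraction of first stage)."""
    rows = []
    first = stages[0][1]
    previous = first
    for name, count in stages:
        rows.append((name, count / previous, count / first))
        previous = count
    return rows


if __name__ == "__main__":
    for name, of_prev, of_raw in funnel_retention(FUNNEL):
        print(f"{name:26s} {of_prev:8.2%} of previous  {of_raw:10.4%} of raw")
```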
Top 5 Target Hypotheses (Claude RAG Analysis)
| Rank | Gene | Variant | ClinVar | AlphaMissense | Therapeutic Area | Druggability |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | VCP | rs188935092 (chr9:35065263 G>A) | Pathogenic | 0.87 | Neurology | 0.92 |
| 2 | EGFR | rs121913229 (chr7:55259515 T>G) | Pathogenic | 0.79 | Oncology | 0.95 |
| 3 | BRCA1 | rs80357914 (chr17:43091434 G>A) | Pathogenic | 0.72 | Oncology | 0.78 |
| 4 | PCSK9 | rs11591147 (chr1:55505647 T>G) | Pathogenic | 0.68 | Cardiovascular | 0.88 |
| 5 | CFTR | rs75527207 (chr7:117559590 G>A) | Pathogenic | 0.81 | Respiratory | 0.71 |
Primary Target Selection: VCP
| Parameter | Value |
| --- | --- |
| Gene | VCP (Valosin-Containing Protein / p97) |
| UniProt | P55072 |
| Function | AAA+ ATPase, ubiquitin-proteasome pathway |
| Diseases | Frontotemporal Dementia (FTD), ALS, IBMPFD |
| Variant | rs188935092 (missense, HIGH impact) |
| ClinVar Classification | Pathogenic (reviewed by expert panel) |
| AlphaMissense Score | 0.87 (pathogenic, >0.564 threshold) |
| Druggability Score | 0.92 (D2 ATPase domain, ~450 ų pocket) |
| Known Inhibitors | CB-5083 (Phase I), NMS-873 |
| Confidence | HIGH (multiple independent evidence sources) |
Evidence Chain
- Genomic: rs188935092 detected at chr9:35065263 (G>A), heterozygous, QUAL=892
- Clinical: ClinVar classifies as Pathogenic for FTD/ALS/IBMPFD (expert panel review)
- AI Prediction: AlphaMissense score 0.87 (pathogenic threshold >0.564)
- Functional: VEP annotates as missense_variant, HIGH impact, D2 ATPase domain
- Druggability: Known drug target — CB-5083 reached Phase I clinical trial
- Structural: 4 PDB structures available including inhibitor-bound 5FTK
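The AlphaMissense step in the evidence chain reduces to a threshold comparison. The sketch below uses the 0.564 pathogenic cutoff quoted above; the 0.34 lower cutoff for the benign class comes from the published AlphaMissense class definitions rather than from this report, and the function name is an assumption.

```python
# Cutoffs: 0.564 is quoted in the report; 0.34 is the published AlphaMissense
# boundary between "likely benign" and "ambiguous" (not stated in the report).
PATHOGENIC_CUTOFF = 0.564
BENIGN_CUTOFF = 0.34


def classify_alphamissense(score: float) -> str:
    """Map an AlphaMissense score in [0, 1] to its class label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score > PATHOGENIC_CUTOFF:
        return "likely_pathogenic"
    if score < BENIGN_CUTOFF:
        return "likely_benign"
    return "ambiguous"


if __name__ == "__main__":
    # Scores of the five target hypotheses listed in the report.
    top5 = {"VCP": 0.87, "EGFR": 0.79, "BRCA1": 0.72, "PCSK9": 0.68, "CFTR": 0.81}
    for gene, score in top5.items():
        print(gene, score, classify_alphamissense(score))
```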
Milvus Vector Search Statistics
| Parameter | Value |
| --- | --- |
| Total Embeddings Indexed | 3,487,216 |
| Embedding Model | BGE-small-en-v1.5 (384-dim) |
| Index Type | IVF_FLAT (nlist=1024) |
| Search nprobe | 16 |
| Query Expansion | Neurology therapeutic area keywords |
| Top-k Retrieved | 20 variant contexts |
| Search Latency | 12 ms |
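The index and search settings above can be written as the parameter dictionaries that pymilvus accepts in `Collection.create_index` and `Collection.search`. This is a configuration sketch only: it does not connect to Milvus, the COSINE metric is taken from the Provenance section, and the helper function is an illustrative assumption.

```python
# Parameter dictionaries in the shape pymilvus expects. Values come from the
# report; nothing here opens a connection to a Milvus server.
EMBEDDING_DIM = 384  # BGE-small-en-v1.5
TOP_K = 20

INDEX_PARAMS = {
    "index_type": "IVF_FLAT",
    "metric_type": "COSINE",
    "params": {"nlist": 1024},  # number of coarse clusters
}

SEARCH_PARAMS = {
    "metric_type": "COSINE",
    "params": {"nprobe": 16},  # clusters scanned per query
}


def probed_fraction(index_params: dict, search_params: dict) -> float:
    """Fraction of IVF clusters scanned per query (recall/latency trade-off)."""
    return search_params["params"]["nprobe"] / index_params["params"]["nlist"]


if __name__ == "__main__":
    print(f"{probed_fraction(INDEX_PARAMS, SEARCH_PARAMS):.4%} of clusters probed")
```

With nprobe=16 and nlist=1024, each query scans about 1.6% of the clusters, which is consistent with the low search latency reported.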
3. Drug Discovery Results (Stage 3)
Structure Evidence
| PDB ID | Resolution | Method | Description | Score |
| --- | --- | --- | --- | --- |
| 5FTK | 2.3 Å | X-ray | VCP D2 + CB-5083 inhibitor | 13.2 (selected) |
| 7K56 | 2.5 Å | Cryo-EM | VCP complex | 10.8 |
| 8OOI | 2.9 Å | Cryo-EM | WT VCP hexamer | 8.9 |
| 9DIL | 3.2 Å | Cryo-EM | Mutant VCP | 7.4 |
Selected Structure: 5FTK, the inhibitor-bound (CB-5083) X-ray structure at 2.3 Å resolution. The binding site lies in the D2 ATPase domain; key residues are ALA464, GLY479, ASP320, and GLY215.
Molecule Generation (MolMIM)
| Parameter | Value |
| --- | --- |
| Seed Compound | CB-5083 (ATP-competitive VCP inhibitor) |
| NIM Endpoint | MolMIM (port 8001) |
| Molecules Generated | 100 |
| Chemically Valid | 98 (2 rejected by RDKit valence check) |
| Generation Time | 2 min 14 sec |
Drug-Likeness Profile
| Metric | Pass | Fail | Pass Rate |
| --- | --- | --- | --- |
| Lipinski Rule of Five | 87 | 11 | 88.8% |
| QED > 0.67 (drug-like) | 72 | 26 | 73.5% |
| QED > 0.49 (moderate+) | 91 | 7 | 92.9% |
| TPSA < 140 Ų | 94 | 4 | 95.9% |
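As a rough illustration of the profile above, the sketch below counts Lipinski violations from precomputed descriptors and reproduces the pass rates from the pass/fail counts. It assumes a strict reading (zero violations to pass), which the report does not confirm, and the hydrogen-bond donor and acceptor counts in the example are hypothetical because the report does not list them.

```python
def lipinski_violations(mw: float, logp: float, hbd: int, hba: int) -> int:
    """Count Rule of Five violations from precomputed descriptors."""
    return sum([mw > 500, logp > 5, hbd > 5, hba > 10])


def passes_lipinski(mw: float, logp: float, hbd: int, hba: int) -> bool:
    """Strict reading (assumption): pass only with zero violations."""
    return lipinski_violations(mw, logp, hbd, hba) == 0


def pass_rate(passed: int, failed: int) -> float:
    """Pass rate in percent, rounded to one decimal as in the table."""
    return round(100 * passed / (passed + failed), 1)


if __name__ == "__main__":
    # MW and LogP of the top-ranked candidate; hbd=2 and hba=6 are hypothetical.
    print(passes_lipinski(mw=423.5, logp=3.2, hbd=2, hba=6))
    for name, p, f in [("Lipinski", 87, 11), ("QED > 0.67", 72, 26),
                       ("QED > 0.49", 91, 7), ("TPSA < 140", 94, 4)]:
        print(name, pass_rate(p, f))
```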
Molecular Docking (DiffDock)
| Parameter | Value |
| --- | --- |
| NIM Endpoint | DiffDock (port 8002) |
| Protein Target | 5FTK (VCP D2 domain) |
| Candidates Docked | 98 (all valid molecules) |
| Docking Time | 8 min 42 sec |
| Mean Dock Score | -7.4 kcal/mol |
| Best Dock Score | -11.4 kcal/mol |
| Scores < -8.0 (excellent) | 34 candidates |
| Scores < -6.0 (good+) | 78 candidates |
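The two score bands above are cumulative: the 78 "good+" candidates include the 34 "excellent" ones. A minimal sketch of the binning follows, using the -8.0 and -6.0 kcal/mol cutoffs from the table. The label for scores outside both bands is a hypothetical name, and the sample scores are the ten best dock scores from the ranked list in this report.

```python
from typing import Iterable

EXCELLENT_CUTOFF = -8.0  # kcal/mol, from the report
GOOD_CUTOFF = -6.0       # kcal/mol, from the report


def count_below(scores: Iterable[float], threshold: float) -> int:
    """Number of dock scores strictly below (more negative than) threshold."""
    return sum(1 for s in scores if s < threshold)


def band(score: float) -> str:
    """Assign a single band; 'below_good' is a hypothetical label."""
    if score < EXCELLENT_CUTOFF:
        return "excellent"
    if score < GOOD_CUTOFF:
        return "good"
    return "below_good"


if __name__ == "__main__":
    top10 = [-11.4, -10.8, -10.2, -9.8, -9.5, -9.1, -8.9, -8.7, -8.5, -8.2]
    print(count_below(top10, EXCELLENT_CUTOFF), band(-7.4))
```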
Composite Scoring
Formula: 30% Generation + 40% Docking + 30% QED
Docking normalization: max(0, min(1, (10 + dock_score) / 20))
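The formula can be transcribed directly, as in the sketch below. One caveat is worth stating plainly: taken literally, the normalization rises with `dock_score` and clamps any score at or below -10 to 0, so it rewards higher rather than more negative values. Applied to the rank 1 row of the table that follows (generation 0.92, dock -11.4, QED 0.81), it yields about 0.52 rather than the reported 0.89. The ranked composites therefore presumably rely on a score scale or sign convention that the report does not state. The function names are assumptions.

```python
def normalize_dock(dock_score: float) -> float:
    """Literal transcription: max(0, min(1, (10 + dock_score) / 20))."""
    return max(0.0, min(1.0, (10.0 + dock_score) / 20.0))


def composite(gen_score: float, dock_score: float, qed: float) -> float:
    """30% generation + 40% normalized docking + 30% QED, as stated."""
    return 0.30 * gen_score + 0.40 * normalize_dock(dock_score) + 0.30 * qed


if __name__ == "__main__":
    # Rank 1 row of the report: gen 0.92, dock -11.4, QED 0.81.
    print(round(normalize_dock(-11.4), 3))         # clamped to 0.0
    print(round(composite(0.92, -11.4, 0.81), 3))  # 0.519
```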
Top 10 Ranked Candidates
| Rank | Composite | Gen Score | Dock (kcal/mol) | QED | MW (Da) | LogP | Lipinski |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 0.89 | 0.92 | -11.4 | 0.81 | 423.5 | 3.2 | PASS |
| 2 | 0.86 | 0.88 | -10.8 | 0.79 | 441.2 | 3.7 | PASS |
| 3 | 0.84 | 0.85 | -10.2 | 0.82 | 398.7 | 2.9 | PASS |
| 4 | 0.82 | 0.91 | -9.8 | 0.74 | 467.1 | 4.1 | PASS |
| 5 | 0.81 | 0.83 | -9.5 | 0.78 | 412.3 | 3.4 | PASS |
| 6 | 0.79 | 0.87 | -9.1 | 0.71 | 455.8 | 3.8 | PASS |
| 7 | 0.78 | 0.80 | -8.9 | 0.76 | 389.2 | 2.7 | PASS |
| 8 | 0.76 | 0.84 | -8.7 | 0.69 | 478.4 | 4.3 | PASS |
| 9 | 0.75 | 0.79 | -8.5 | 0.73 | 401.6 | 3.1 | PASS |
| 10 | 0.74 | 0.82 | -8.2 | 0.68 | 448.9 | 3.9 | PASS |
CB-5083 Seed Comparison
| Metric | CB-5083 (Seed) | Top Candidate | Improvement |
| --- | --- | --- | --- |
| Dock Score | -8.1 kcal/mol | -11.4 kcal/mol | 41% more negative (stronger predicted binding) |
| QED | 0.62 | 0.81 | +31% drug-likeness |
| MW | 487.2 Da | 423.5 Da | -13% (lower MW, generally favorable for oral absorption) |
| Composite | 0.64 | 0.89 | +39% overall |
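The improvement percentages are relative changes against the seed value. The sketch below reproduces them; the function name is an assumption. For the dock score, the change is negative because a more negative score is better, which the table reports as a 41% gain in magnitude.

```python
def relative_change(seed: float, candidate: float) -> float:
    """Percent change of candidate relative to the magnitude of the seed."""
    return (candidate - seed) / abs(seed) * 100.0


if __name__ == "__main__":
    comparisons = {
        "Dock score (kcal/mol)": (-8.1, -11.4),
        "QED": (0.62, 0.81),
        "MW (Da)": (487.2, 423.5),
        "Composite": (0.64, 0.89),
    }
    for name, (seed, cand) in comparisons.items():
        print(f"{name:22s} {relative_change(seed, cand):+.1f}%")
```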
Stage Timing
| Stage | Duration | GPU Util (avg) | Peak Memory |
| --- | --- | --- | --- |
| 1: Genomics (fq2bam) | 34 min | 82% | 38 GB |
| 1: Genomics (DeepVariant) | 22 min | 91% | 54 GB |
| 2: Annotation | 18 min | 15% (CPU-bound) | 12 GB |
| 2: Milvus Indexing | 24 min | 35% | 22 GB |
| 2: RAG/Chat (interactive) | 45 min | 5% | 8 GB |
| 3: Structure Retrieval | 2 min | 0% (network I/O) | 2 GB |
| 3: MolMIM Generation | 2 min 14 sec | 78% | 18 GB |
| 3: DiffDock Docking | 8 min 42 sec | 85% | 24 GB |
| 3: Scoring + Reporting | 1 min 30 sec | 0% (CPU) | 4 GB |
| Total | ~4 hr 12 min | | |
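Summing the per-stage durations is a useful cross-check on the reported total. The sketch below parses the duration strings from the table and adds them up. The listed stages sum to 2 hr 37 min 26 sec, so the reported total of about 4 hr 12 min includes roughly 1 hr 35 min that the table does not itemize; the report does not say what that time covers. The helper names are assumptions.

```python
import re

STAGE_DURATIONS = {
    "1: Genomics (fq2bam)": "34 min",
    "1: Genomics (DeepVariant)": "22 min",
    "2: Annotation": "18 min",
    "2: Milvus Indexing": "24 min",
    "2: RAG/Chat (interactive)": "45 min",
    "3: Structure Retrieval": "2 min",
    "3: MolMIM Generation": "2 min 14 sec",
    "3: DiffDock Docking": "8 min 42 sec",
    "3: Scoring + Reporting": "1 min 30 sec",
}

_UNIT_SECONDS = {"hr": 3600, "min": 60, "sec": 1}


def to_seconds(text: str) -> int:
    """Parse strings such as '2 min 14 sec' or '4 hr 12 min' into seconds."""
    parts = re.findall(r"(\d+)\s*(hr|min|sec)", text)
    return sum(int(value) * _UNIT_SECONDS[unit] for value, unit in parts)


def format_hms(seconds: int) -> str:
    """Format a second count as 'H hr M min S sec'."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours} hr {minutes} min {secs} sec"


itemized = sum(to_seconds(d) for d in STAGE_DURATIONS.values())
reported = to_seconds("4 hr 12 min")

if __name__ == "__main__":
    print("itemized:", format_hms(itemized))
    print("unitemized:", format_hms(reported - itemized))
```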
Service Health at Report Generation
| Service | Port | Status |
| --- | --- | --- |
| Landing Page | 8080 | HEALTHY |
| Genomics Portal | 5000 | HEALTHY |
| Milvus | 19530 | HEALTHY |
| RAG API | 5001 | HEALTHY |
| Streamlit Chat | 8501 | HEALTHY |
| MolMIM NIM | 8001 | HEALTHY |
| DiffDock NIM | 8002 | HEALTHY |
| Discovery UI | 8505 | HEALTHY |
| Grafana | 3000 | HEALTHY |
| Prometheus | 9099 | HEALTHY |
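The report does not say how HEALTHY is determined. As one possible approximation, the sketch below checks whether each port from the table accepts a TCP connection. This is weaker than an application-level health endpoint, and the host name and function names are assumptions.

```python
import socket

# Ports from the service table; "localhost" is an assumed host.
SERVICES = {
    "Landing Page": 8080,
    "Genomics Portal": 5000,
    "Milvus": 19530,
    "RAG API": 5001,
    "Streamlit Chat": 8501,
    "MolMIM NIM": 8001,
    "DiffDock NIM": 8002,
    "Discovery UI": 8505,
    "Grafana": 3000,
    "Prometheus": 9099,
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        status = "REACHABLE" if port_is_open("localhost", port) else "UNREACHABLE"
        print(f"{name:16s} {port:6d} {status}")
```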
5. Clinical Interpretation
Summary
Patient GEN-2026-0087 carries a heterozygous pathogenic missense variant (rs188935092) in the VCP gene. This variant is associated with Frontotemporal Dementia (FTD), ALS, and Inclusion Body Myopathy with Paget Disease and Frontotemporal Dementia (IBMPFD). The variant is classified as Pathogenic by ClinVar expert panel review and scores 0.87 on the AlphaMissense pathogenicity scale.
Drug Discovery Outcome
The AI-driven drug discovery pipeline generated 100 candidate VCP inhibitors, of which 98 were chemically valid and were docked and ranked. The top candidate shows a 39% improvement in composite score over the CB-5083 seed compound. All top 10 candidates pass Lipinski's Rule of Five and have QED scores above 0.67, suggesting oral drug-likeness.
Recommended Actions
- Genetic counseling for FTD/ALS risk assessment
- Experimental validation of top 5 candidates in VCP ATPase activity assays
- ADMET profiling for lead optimization
- Cross-modal follow-up with Imaging Intelligence Agent for neurological assessment
6. Provenance
| Item | Value |
| --- | --- |
| Pipeline | HLS-Pipeline v1.0.0 (Nextflow DSL2) |
| Parabricks | nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 |
| DeepVariant | Google DeepVariant (via Parabricks, >99% accuracy) |
| Reference | GRCh38 (3.1 GB) |
| ClinVar | February 2026 release (4.1M variants) |
| AlphaMissense | v1.0 (71,697,560 predictions) |
| VEP | Ensembl VEP (GRCh38) |
| Milvus | v2.4 (IVF_FLAT, nlist=1024, COSINE) |
| Embedding Model | BGE-small-en-v1.5 (384-dim) |
| LLM | claude-sonnet-4-20250514 (temperature=0.3) |
| MolMIM | nvcr.io/nvidia/clara/bionemo-molmim:1.0 |
| DiffDock | nvcr.io/nvidia/clara/diffdock:1.0 |
| Hardware | NVIDIA DGX Spark (GB10, 128 GB unified) |
| Scoring | 30% gen + 40% dock + 30% QED |
This is a demonstration intelligence report generated by the HCLS AI Factory. All patient data is synthetic. The report illustrates the platform's end-to-end capability from genomic variant calling through drug candidate ranking.