Single-Cell Intelligence Agent -- Architecture Guide
Version: 1.0.0 | Date: 2026-03-22 | Author: Adam Jones
1. Architecture Overview
The Single-Cell Intelligence Agent is built on a layered architecture that separates presentation, API, reasoning, search, clinical decision support, and data storage concerns.
+----------------------------------------------------------+
| PRESENTATION LAYER |
| Streamlit UI (8130) External Clients |
+----------------------------------------------------------+
|
+----------------------------------------------------------+
| API LAYER |
| FastAPI REST (8540) |
| - Auth middleware - Rate limiting |
| - Request validation - CORS |
| - Error handling - Metrics collection |
+----------------------------------------------------------+
|
+----------------------------------------------------------+
| REASONING LAYER |
| SingleCellAgent |
| - Plan: Query classification, search strategy |
| - Execute: Multi-collection RAG search |
| - Evaluate: Evidence quality scoring |
| - Synthesize: LLM-powered response generation |
| - Report: Structured output with citations |
+----------------------------------------------------------+
|
+----------------+----------------+
| | |
+---------+--+ +--------+---+ +------+------+
| RAG ENGINE | | WORKFLOW | | DECISION |
| Parallel | | ENGINE | | SUPPORT |
| search | | 10 clinical| | 4 engines |
| synthesis | | workflows | | |
+------+-----+ +------+-----+ +------+------+
| | |
+------+-----------------+----------------+------+
| DATA LAYER |
| Milvus Vector DB (12 collections) |
| Knowledge Base (57 cell types, 30 drugs, ...) |
| Conversation Store (disk-backed, 24h TTL) |
+-------------------------------------------------+
2. Module Architecture
2.1 Core Modules
single_cell_intelligence_agent/
|-- config/
| |-- settings.py # Pydantic BaseSettings (197 lines)
|-- src/
| |-- agent.py # Autonomous reasoning engine (2,090 lines)
| |-- models.py # Pydantic data models (820 lines)
| |-- collections.py # 12 Milvus collection schemas (1,210 lines)
| |-- rag_engine.py # Multi-collection RAG search (1,490 lines)
| |-- clinical_workflows.py # 10 analysis workflows (1,792 lines)
| |-- decision_support.py # 4 clinical engines (886 lines)
| |-- knowledge.py # Domain knowledge base (1,816 lines)
| |-- query_expansion.py # Synonym expansion (893 lines)
| |-- cross_modal.py # Inter-agent communication (392 lines)
| |-- metrics.py # Prometheus metrics (476 lines)
| |-- export.py # Report generation (588 lines)
| |-- scheduler.py # APScheduler ingest (496 lines)
| |-- ingest/
| |-- base.py # BaseIngestParser ABC (228 lines)
| |-- cellxgene_parser.py # CellxGene data parser (679 lines)
| |-- marker_parser.py # Marker gene parser (286 lines)
| |-- tme_parser.py # TME profile parser (418 lines)
|-- api/
| |-- main.py # FastAPI application (615 lines)
| |-- routes/
| |-- sc_clinical.py # Clinical endpoint routes
| |-- reports.py # Report generation routes
| |-- events.py # SSE event stream routes
|-- app/
| |-- sc_ui.py # Streamlit 5-tab UI (~600 lines)
2.2 Dependency Graph
settings.py
|
v
models.py <---- collections.py
| |
v v
agent.py -------> rag_engine.py
| |
+--------+-------+
| | |
v v v
clinical_ decision_ knowledge.py
workflows support
| |
v v
api/main.py
|
v
app/sc_ui.py
3. GPU Acceleration Pipeline
3.1 RAPIDS Integration Architecture
The agent is architecturally prepared for GPU acceleration via NVIDIA RAPIDS. The integration targets the six computational bottlenecks listed in Section 3.2; the diagram below shows the CPU and GPU paths for the main stages:
Single-Cell Data (AnnData .h5ad)
|
+----+----+
| |
CPU Path GPU Path (RAPIDS)
| |
v v
scikit- cuML
learn UMAP/PCA
| |
v v
scanpy cuGraph
Leiden Leiden/Louvain
| |
v v
scipy cuSPARSE
sparse operations
| |
+----+----+
|
v
Annotation + Analysis
3.2 RAPIDS Component Mapping
| CPU Library | RAPIDS GPU | Operation | Expected Speedup |
|---|---|---|---|
| sklearn.decomposition.PCA | cuml.PCA | Dimensionality reduction | 30-50x |
| umap-learn | cuml.UMAP | Manifold embedding | 50-100x |
| sklearn.neighbors.NearestNeighbors | cuml.NearestNeighbors | kNN graph | 80-120x |
| igraph/leidenalg | cugraph.leiden | Community detection | 20-40x |
| scipy.sparse | cuSPARSE | Sparse matrix ops | 15-25x |
| sklearn.cluster.KMeans | cuml.KMeans | K-means clustering | 40-60x |
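The CPU/GPU pairs in the table above suggest a runtime fallback: try the RAPIDS library first and use the CPU counterpart when it is not installed. The following is a minimal sketch of that idea, not the agent's actual loader; `resolve_backend` and `PCA_CANDIDATES` are hypothetical names introduced here for illustration.

```python
import importlib


def resolve_backend(candidates):
    """Return (module_path, module) for the first importable candidate.

    `candidates` is ordered by preference, GPU library first.
    """
    for module_path in candidates:
        try:
            return module_path, importlib.import_module(module_path)
        except ImportError:
            continue  # library not installed; try the next candidate
    raise ImportError(f"none of {candidates!r} could be imported")


# Hypothetical preference order for PCA: RAPIDS cuML, then scikit-learn.
# Both cuml and sklearn.decomposition expose a class named PCA.
PCA_CANDIDATES = ("cuml", "sklearn.decomposition")
```

With either library installed, `path, module = resolve_backend(PCA_CANDIDATES)` followed by `module.PCA(n_components=50)` would build the reducer on whichever backend was found. Note that a GPU library can also fail at import time for reasons other than `ImportError` (for example a missing CUDA runtime), which this sketch does not handle.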
3.3 GPU Memory Management
# Configuration from settings.py
GPU_MEMORY_LIMIT_GB = 120 # DGX Spark default
# Planned memory allocation strategy:
# - 40% for RAPIDS cuML operations (48 GB)
# - 30% for Milvus GPU index (36 GB)
# - 20% for foundation model inference (24 GB)
# - 10% for system overhead (12 GB)
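The percentages in the comments above can be turned into per-component budgets with a few lines of arithmetic. This is a sketch only: `MEMORY_SHARES` and `memory_budget_gb` are hypothetical names, and the guide does not say how (or whether) the agent enforces these limits.

```python
GPU_MEMORY_LIMIT_GB = 120  # value from the configuration above

# Planned shares from the comments above; they are expected to sum to 1.0.
MEMORY_SHARES = {
    "rapids_cuml": 0.40,
    "milvus_gpu_index": 0.30,
    "foundation_model": 0.20,
    "system_overhead": 0.10,
}


def memory_budget_gb(limit_gb, shares):
    """Split a total memory limit into per-component budgets in GB."""
    total = sum(shares.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"shares sum to {total}, expected 1.0")
    return {name: round(limit_gb * share, 2) for name, share in shares.items()}


budget = memory_budget_gb(GPU_MEMORY_LIMIT_GB, MEMORY_SHARES)
```

For the 120 GB default this reproduces the 48 / 36 / 24 / 12 GB split listed above.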
3.4 GPU-Accelerated Methods Registry
The sc_methods collection includes a gpu_accelerated boolean field to track which analytical methods support GPU execution:
| Method | GPU Support | Library |
|---|---|---|
| UMAP | Yes | cuml.UMAP |
| Leiden clustering | Yes | cugraph.leiden |
| PCA | Yes | cuml.PCA |
| t-SNE | Yes | cuml.TSNE |
| kNN graph | Yes | cuml.NearestNeighbors |
| Differential expression | Partial | RAPIDS cuDF |
| Trajectory inference | No | Monocle3 (R) / scVelo |
| CellChat | No | R-based |
| Scanpy preprocessing | Partial | rapids-singlecell |
4. TME Classification Architecture
4.1 Classification Pipeline
Input: Cell Type Proportions + Gene Expression
|
v
+-------------------+
| Immune Score | Sum of 8 immune cell type fractions
| (CD8_T, CD4_T, | (CD8_T, CD4_T, NK, B_cell,
| NK, B_cell, ...) | Macrophage_M1, Dendritic,
+-------------------+ Plasma, Neutrophil)
|
v
+-------------------+
| Suppressive Score | Weighted combination of:
| (Treg, MDSC, | - Suppressive cell fraction (50%)
| M2 Macrophage) | - Suppressive gene score (50%)
+-------------------+ (IDO1, TGFB1, IL10, VEGFA, ARG1, NOS2)
|
v
+-------------------+
| Checkpoint Score | 6 checkpoint genes normalized:
| (CD274, PDCD1LG2, | CD274, PDCD1LG2, CTLA4,
| CTLA4, LAG3, ... | LAG3, HAVCR2, TIGIT
+-------------------+
|
v
+-------------------+
| Spatial Override | "absent" -> COLD_DESERT
| | "margin" -> EXCLUDED
+-------------------+
|
v
+-------------------+
| Classification |
| Decision Tree |
+-------------------+
|
+---------+---------+-----------+
| | | |
v v v v
HOT COLD EXCLUDED IMMUNO-
INFLAMED DESERT SUPPRESSIVE
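The three scores in the pipeline above can be sketched as plain functions over two dictionaries: cell type proportions and gene expression. The guide does not specify how gene expression is normalized, so this sketch assumes every expression value is already scaled to the range 0-1 and uses a simple mean; the dictionary keys (for example `Macrophage_M2`) and function names are also assumptions.

```python
IMMUNE_CELL_TYPES = (
    "CD8_T", "CD4_T", "NK", "B_cell",
    "Macrophage_M1", "Dendritic", "Plasma", "Neutrophil",
)
SUPPRESSIVE_CELL_TYPES = ("Treg", "MDSC", "Macrophage_M2")  # key names assumed
SUPPRESSIVE_GENES = ("IDO1", "TGFB1", "IL10", "VEGFA", "ARG1", "NOS2")
CHECKPOINT_GENES = ("CD274", "PDCD1LG2", "CTLA4", "LAG3", "HAVCR2", "TIGIT")


def _mean_expression(expression, genes):
    """Mean of pre-scaled gene values; missing genes count as 0 (assumption)."""
    return sum(expression.get(gene, 0.0) for gene in genes) / len(genes)


def immune_score(proportions):
    """Sum of the 8 immune cell type fractions."""
    return sum(proportions.get(ct, 0.0) for ct in IMMUNE_CELL_TYPES)


def suppressive_score(proportions, expression):
    """50% suppressive cell fraction + 50% suppressive gene score."""
    cell_fraction = sum(proportions.get(ct, 0.0) for ct in SUPPRESSIVE_CELL_TYPES)
    gene_score = _mean_expression(expression, SUPPRESSIVE_GENES)
    return 0.5 * cell_fraction + 0.5 * gene_score


def checkpoint_score(expression):
    """Mean of the 6 checkpoint genes (normalization method assumed)."""
    return _mean_expression(expression, CHECKPOINT_GENES)
```

Treating missing cell types and genes as zero keeps the functions total over sparse inputs, at the cost of silently under-scoring samples with incomplete annotation.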
4.2 Classification Decision Tree
IF spatial == "absent" AND immune < 0.05:
    -> COLD_DESERT
IF spatial == "margin" AND immune > 0.05:
    -> EXCLUDED
IF CD8 >= 0.15 AND immune >= 0.25:
    IF suppressive > 0.4:
        -> IMMUNOSUPPRESSIVE
    ELSE:
        -> HOT_INFLAMED
IF immune >= 0.10 AND stromal > 0.20:
    -> EXCLUDED
IF suppressive > 0.3 AND immune >= 0.10:
    -> IMMUNOSUPPRESSIVE
IF immune < 0.10:
    -> COLD_DESERT
IF PD-L1_high AND CD8 >= 0.05:
    -> HOT_INFLAMED
DEFAULT:
    -> COLD_DESERT
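The decision tree above translates directly into a Python function. This sketch assumes the rules are evaluated top to bottom and the first match wins; the function name, parameter names, and default values are hypothetical and not taken from decision_support.py.

```python
def classify_tme(immune, cd8, suppressive, stromal=0.0,
                 spatial=None, pd_l1_high=False):
    """Return the TME class for one sample, using the thresholds above."""
    # Spatial overrides are checked before any score-based rule.
    if spatial == "absent" and immune < 0.05:
        return "COLD_DESERT"
    if spatial == "margin" and immune > 0.05:
        return "EXCLUDED"
    # Strong CD8 and immune presence: hot unless suppression dominates.
    if cd8 >= 0.15 and immune >= 0.25:
        return "IMMUNOSUPPRESSIVE" if suppressive > 0.4 else "HOT_INFLAMED"
    if immune >= 0.10 and stromal > 0.20:
        return "EXCLUDED"
    if suppressive > 0.3 and immune >= 0.10:
        return "IMMUNOSUPPRESSIVE"
    if immune < 0.10:
        return "COLD_DESERT"
    if pd_l1_high and cd8 >= 0.05:
        return "HOT_INFLAMED"
    return "COLD_DESERT"  # default branch
```

For example, `classify_tme(immune=0.30, cd8=0.20, suppressive=0.1)` returns `"HOT_INFLAMED"`, while the same sample with `suppressive=0.5` returns `"IMMUNOSUPPRESSIVE"`.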
4.3 Treatment Recommendation Engine
Each TME class maps to a set of evidence-based treatment recommendations:
| TME Class | Primary Recommendation | Conditional Recommendations |
|---|---|---|
| HOT_INFLAMED | Checkpoint inhibitor (anti-PD-1/PD-L1) | PD-L1 TPS >= 50%: pembrolizumab mono; LAG3+: relatlimab + nivolumab |
| COLD_DESERT | Priming strategies (oncolytic virus, STING agonist) | BiTE or adoptive cell therapy |
| EXCLUDED | Anti-TGFb or anti-VEGF to remove stromal barrier | Anti-CXCL12/CXCR4 for T-cell migration |
| IMMUNOSUPPRESSIVE | Dual checkpoint (anti-PD-1 + anti-CTLA-4) | Anti-CCR8 for Treg depletion; CSF1R inhibitor for M2 repolarization |
5. Spatial Deconvolution Architecture
5.1 Spatial Platform Support
| Platform | Resolution | Genes | Spatial Feature |
|---|---|---|---|
| Visium (10x) | 55 um spots | Whole transcriptome | H&E morphology overlay |
| MERFISH (Vizgen) | Subcellular | 100-500 panel | Single-molecule FISH |
| Xenium (10x) | Subcellular | 100-5000 panel | In situ sequencing |
| CODEX | Single-cell | 40-60 proteins | Protein co-detection |
5.2 Spatial Niche Detection Pipeline
Spatial Coordinates + Gene Expression
|
v
+-------------------+
| Cell Type | Assign cell types to spatial
| Annotation | locations using marker genes
+-------------------+
|
v
+-------------------+
| Spatial | Moran's I statistic for
| Autocorrelation | spatially variable genes
+-------------------+
|
v
+-------------------+
| Niche | k-NN graph on spatial
| Construction | coordinates, community
+-------------------+ detection on cell types
|
v
+-------------------+
| L-R Interaction | Spatially-aware ligand-
| Analysis | receptor scoring
+-------------------+
|
v
+-------------------+
| Clinical | Map niches to clinical
| Interpretation | significance
+-------------------+
5.3 Spatial Data Schema
The sc_spatial collection stores spatial niche data with the following key fields:
- niche_label: Descriptive niche name (e.g., "Tumor-immune interface")
- platform: Spatial technology (Visium, MERFISH, Xenium, CODEX)
- cell_types: Pipe-delimited cell types in the niche
- signature_genes: Spatially variable genes characterizing the niche
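- morans_i: Spatial autocorrelation statistic (Moran's I; stored here in the range 0-1, although the statistic itself can be negative for dispersed patterns)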
- clinical_relevance: Clinical significance text
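The key fields listed above can be mirrored in a small dataclass, which also makes the pipe-delimited `cell_types` convention explicit. This is a sketch under assumptions: the class name, the Python types, and the choice to store `signature_genes` as a single string are not taken from collections.py.

```python
from dataclasses import dataclass


@dataclass
class SpatialNicheRecord:
    """In-memory view of one sc_spatial record (key fields only)."""

    niche_label: str         # e.g. "Tumor-immune interface"
    platform: str            # Visium, MERFISH, Xenium, or CODEX
    cell_types: str          # pipe-delimited, e.g. "CD8_T|Treg"
    signature_genes: str     # storage format assumed to be a string
    morans_i: float
    clinical_relevance: str

    def cell_type_list(self):
        """Split the pipe-delimited cell_types field into clean names."""
        return [ct.strip() for ct in self.cell_types.split("|") if ct.strip()]
```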
6. RAG Search Architecture
6.1 Multi-Collection Parallel Search
User Query
|
v
BGE-small-en-v1.5 Embedding (384-dim)
|
v
+-- ThreadPoolExecutor (max_workers=12) --+
| |
| sc_cell_types (w=0.14, k=50) |
| sc_markers (w=0.12, k=40) |
| sc_spatial (w=0.10, k=30) |
| sc_tme (w=0.10, k=30) |
| sc_drug_response (w=0.09, k=20) |
| sc_literature (w=0.08, k=20) |
| sc_methods (w=0.07, k=15) |
| sc_datasets (w=0.06, k=15) |
| sc_trajectories (w=0.07, k=20) |
| sc_pathways (w=0.07, k=20) |
| sc_clinical (w=0.07, k=15) |
| genomic_evidence (w=0.03, k=20) |
| |
+------------------------------------------+
|
v
Score Aggregation
|-- Weighted score = cosine_similarity * collection_weight
|-- Deduplication across collections
|-- Score threshold filter (>= 0.4)
|
v
Evidence Ranking
|-- Sort by weighted score
|-- Citation relevance: HIGH (>0.75), MEDIUM (>0.60), LOW
|
v
Context Window Construction
|-- Top-K evidence formatted for LLM
|-- Conversation history (3-turn window)
|
v
Claude Sonnet Synthesis
|
v
SCResponse
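The fan-out and score aggregation stages above can be sketched as follows. The weights and top-k values come from the diagram; everything else is an assumption made for illustration. In particular, the 0.4 threshold and the HIGH/MEDIUM cut-offs are applied here to the raw cosine similarity, because the largest possible weighted score is 0.14 and a 0.4 cut on weighted scores would discard every hit. Deduplication keeps the highest-weighted copy of a document, and `search_fn` stands in for the real Milvus search call.

```python
from concurrent.futures import ThreadPoolExecutor

# name: (weight, top_k), copied from the diagram above
COLLECTION_CONFIG = {
    "sc_cell_types": (0.14, 50), "sc_markers": (0.12, 40),
    "sc_spatial": (0.10, 30), "sc_tme": (0.10, 30),
    "sc_drug_response": (0.09, 20), "sc_literature": (0.08, 20),
    "sc_methods": (0.07, 15), "sc_datasets": (0.06, 15),
    "sc_trajectories": (0.07, 20), "sc_pathways": (0.07, 20),
    "sc_clinical": (0.07, 15), "genomic_evidence": (0.03, 20),
}
SCORE_THRESHOLD = 0.4  # assumed to apply to raw cosine similarity


def relevance_label(similarity):
    if similarity > 0.75:
        return "HIGH"
    if similarity > 0.60:
        return "MEDIUM"
    return "LOW"


def search_all(query_vector, search_fn, config=COLLECTION_CONFIG):
    """Run search_fn(name, vector, top_k) -> [(doc_id, similarity)] per collection."""
    with ThreadPoolExecutor(max_workers=len(config)) as pool:
        futures = {
            name: pool.submit(search_fn, name, query_vector, top_k)
            for name, (_, top_k) in config.items()
        }
        return {name: future.result() for name, future in futures.items()}


def aggregate(hits_by_collection, config=COLLECTION_CONFIG):
    """Filter, weight, deduplicate, and rank hits from all collections."""
    best = {}
    for name, hits in hits_by_collection.items():
        weight = config[name][0]
        for doc_id, similarity in hits:
            if similarity < SCORE_THRESHOLD:
                continue
            candidate = {
                "doc_id": doc_id,
                "collection": name,
                "similarity": similarity,
                "weighted_score": similarity * weight,
                "relevance": relevance_label(similarity),
            }
            current = best.get(doc_id)
            if current is None or candidate["weighted_score"] > current["weighted_score"]:
                best[doc_id] = candidate
    return sorted(best.values(), key=lambda h: h["weighted_score"], reverse=True)
```

Threads suit this workload because each search is I/O-bound: the workers spend their time waiting on the vector database rather than holding the interpreter.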
6.2 Workflow-Specific Weight Boosting
When a query is classified as a specific workflow type, the default weights are replaced with a workflow-optimized profile. For example, a TME Profiling query boosts sc_tme from 0.10 to 0.25 and redistributes weight from less relevant collections.
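One simple way to build such a profile is to set the boosted collection's weight and scale the remaining weights proportionally so the total is unchanged. The guide does not say how the agent's profiles are derived (they may be hand-tuned per workflow), so treat this as a sketch of one redistribution rule; `boost_weight` is a hypothetical helper.

```python
DEFAULT_WEIGHTS = {
    "sc_cell_types": 0.14, "sc_markers": 0.12, "sc_spatial": 0.10,
    "sc_tme": 0.10, "sc_drug_response": 0.09, "sc_literature": 0.08,
    "sc_methods": 0.07, "sc_datasets": 0.06, "sc_trajectories": 0.07,
    "sc_pathways": 0.07, "sc_clinical": 0.07, "genomic_evidence": 0.03,
}


def boost_weight(weights, collection, new_weight):
    """Set one collection's weight and rescale the rest to keep the same total."""
    if collection not in weights:
        raise KeyError(collection)
    total = sum(weights.values())
    others = total - weights[collection]
    if not 0 <= new_weight <= total or others <= 0:
        raise ValueError("new_weight must fit within the existing total")
    scale = (total - new_weight) / others
    boosted = {
        name: weight * scale
        for name, weight in weights.items()
        if name != collection
    }
    boosted[collection] = new_weight
    return boosted


tme_profile = boost_weight(DEFAULT_WEIGHTS, "sc_tme", 0.25)
```

Under this rule every other collection shrinks by the same factor (0.75 / 0.90 for the TME example), which differs from a profile that takes weight only from the least relevant collections.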
6.3 Query Expansion
The query_expansion.py module expands queries with:
- Cell type synonyms (e.g., "T cell" -> "T lymphocyte", "CD3+ cell")
- Gene aliases (e.g., "PD-L1" -> "CD274", "B7-H1")
- Disease synonyms (e.g., "lung cancer" -> "NSCLC", "non-small cell lung carcinoma")
- 232 cell type aliases mapped to canonical names
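A dictionary lookup is enough to illustrate the expansion step. The sketch below uses only the three examples listed above; the real synonym tables in query_expansion.py are far larger, and `expand_query` is a hypothetical name. Terms are matched on word boundaries so that "t cell" does not fire inside "mast cell".

```python
import re

# Only the examples from the list above; the real tables are larger.
SYNONYMS = {
    "t cell": ["T lymphocyte", "CD3+ cell"],
    "pd-l1": ["CD274", "B7-H1"],
    "lung cancer": ["NSCLC", "non-small cell lung carcinoma"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Return the original query followed by synonyms of the terms it contains."""
    lowered = query.lower()
    expansions = []
    for term, alternatives in synonyms.items():
        # Require non-word characters (or string edges) around the term.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, lowered):
            expansions.extend(
                alt for alt in alternatives if alt.lower() not in lowered
            )
    return [query] + expansions
```

Returning a list of strings rather than one concatenated query leaves it to the caller to decide whether to embed each variant separately or join them.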
7. Cross-Agent Communication
7.1 Integration Architecture
+------------------+ +------------------+
| Single-Cell | <--> | Genomics Agent |
| Intelligence | | (port 8527) |
| Agent | +------------------+
| (port 8540) |
| | <--> +------------------+
| | | Biomarker Agent |
| | | (port 8529) |
| | +------------------+
| |
| | <--> +------------------+
| | | Oncology Agent |
| | | (port 8528) |
| | +------------------+
| |
| | <--> +------------------+
| | | Trial Agent |
| | | (port 8538) |
+------------------+ +------------------+
|
v
+------------------+
| Shared Milvus |
| genomic_evidence |
| (3.56M records) |
+------------------+
7.2 Communication Protocol
- Protocol: HTTP REST
- Timeout: 30 seconds per agent
- Failure mode: Graceful degradation (response returned without cross-agent data)
- Authentication: Internal network trust (no auth between agents)
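The protocol above (HTTP REST, 30-second timeout, graceful degradation) can be sketched with the standard library. The ports come from the diagram in Section 7.1; the host name, the `/api/query` path, and the function names are hypothetical. The HTTP call is injectable so the degradation logic can be exercised without a network.

```python
import json
import urllib.request

AGENT_TIMEOUT_SECONDS = 30

# Ports from Section 7.1; host and path are placeholders, not real routes.
AGENT_URLS = {
    "genomics": "http://localhost:8527/api/query",
    "oncology": "http://localhost:8528/api/query",
    "biomarker": "http://localhost:8529/api/query",
    "trial": "http://localhost:8538/api/query",
}


def http_post_json(url, payload, timeout=AGENT_TIMEOUT_SECONDS):
    """POST a JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def gather_cross_agent(payload, agent_urls=AGENT_URLS, post=http_post_json):
    """Query every agent; a failing agent is recorded and skipped, never raised."""
    results, failures = {}, {}
    for name, url in agent_urls.items():
        try:
            results[name] = post(url, payload)
        except Exception as exc:  # timeout, refused connection, bad JSON, ...
            failures[name] = repr(exc)
    return results, failures
```

This version calls the agents one after another, so four unreachable agents could add up to two minutes of waiting; issuing the calls in parallel would bound the delay at a single timeout.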
8. Data Storage Architecture
8.1 Milvus Vector Database
- Index type: IVF_FLAT
- Metric: COSINE similarity
- nlist: 128
- Embedding dimension: 384
- Total estimated records: 3,765,000 across 12 collections
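Expressed in the dictionary format that pymilvus accepts, the index settings above look like the sketch below. The search-time `nprobe` value is not given in this guide, so 16 is a placeholder assumption, as is the helper that checks it against `nlist`.

```python
EMBEDDING_DIM = 384  # BGE-small-en-v1.5 output size

# Index parameters from the list above.
INDEX_PARAMS = {
    "index_type": "IVF_FLAT",
    "metric_type": "COSINE",
    "params": {"nlist": 128},
}

# nprobe (clusters scanned per query) is an assumed placeholder value.
SEARCH_PARAMS = {"metric_type": "COSINE", "params": {"nprobe": 16}}


def validate_nprobe(search_params, index_params):
    """IVF indexes cannot probe more clusters than nlist."""
    nprobe = search_params["params"]["nprobe"]
    nlist = index_params["params"]["nlist"]
    if not 1 <= nprobe <= nlist:
        raise ValueError(f"nprobe={nprobe} must be between 1 and nlist={nlist}")
    return nprobe


# With pymilvus these would typically be passed as:
#   collection.create_index(field_name="embedding", index_params=INDEX_PARAMS)
#   collection.search(data=[vector], anns_field="embedding",
#                     param=SEARCH_PARAMS, limit=top_k)
# The field name "embedding" is hypothetical.
```

Raising `nprobe` toward `nlist` trades query latency for recall; at `nprobe == nlist` an IVF_FLAT search scans every cluster and behaves like an exhaustive search.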
8.2 Conversation Store
- Backend: Disk-backed JSON files
- Location: data/cache/conversations/{session_id}.json
- TTL: 24 hours
- Format: {session_id, updated, messages[]}
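A disk-backed store with a 24-hour TTL can be sketched in a few lines. The guide does not state how `updated` is encoded, so this sketch assumes epoch seconds; the function names are hypothetical, and the `now` parameter exists only to make expiry testable.

```python
import json
import time
from pathlib import Path

TTL_SECONDS = 24 * 60 * 60  # 24 hours


def save_conversation(base_dir, session_id, messages, now=None):
    """Write {session_id, updated, messages} to <base_dir>/<session_id>.json."""
    path = Path(base_dir) / f"{session_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "session_id": session_id,
        "updated": time.time() if now is None else now,  # epoch seconds (assumed)
        "messages": messages,
    }
    path.write_text(json.dumps(record), encoding="utf-8")
    return path


def load_conversation(base_dir, session_id, now=None):
    """Return the stored record, or None if it is missing or older than the TTL."""
    path = Path(base_dir) / f"{session_id}.json"
    if not path.exists():
        return None
    record = json.loads(path.read_text(encoding="utf-8"))
    current = time.time() if now is None else now
    if current - record["updated"] > TTL_SECONDS:
        path.unlink()  # expired: remove the file so it is not read again
        return None
    return record
```

Because the session ID becomes part of a file path, a real implementation should validate it (for example, accept only UUID-shaped values) before touching the disk.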
8.3 Knowledge Base
- Backend: In-memory Python dictionaries (loaded from knowledge.py)
- Size: 57 cell types, 30 drugs, 75 markers, 10 signatures, 25 L-R pairs, 12 TME profiles
- Update: Code deployment (static knowledge), scheduled ingest (dynamic data)
9. Security Architecture
External Client
|
v
TLS Termination (nginx/traefik)
|
v
API Key Authentication (X-API-Key header)
|
v
Rate Limiting (100 req/min per IP)
|
v
Request Size Limiting (10 MB)
|
v
CORS Validation (configured origins)
|
v
Pydantic Input Validation
|
v
Application Logic
|
v
Non-root Container (scuser)
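The rate-limiting stage (100 requests per minute per IP) can be illustrated with a sliding-window counter. The guide does not say which algorithm or library the API uses, so this is one possible technique rather than the agent's implementation; the class name is hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests=100, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> request timestamps

    def allow(self, client_ip, now=None):
        """Record the request and return True, or return False if over the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

This limiter keeps its state in process memory and is not thread-safe, so a deployment with several API workers would need a shared store to enforce one limit per IP across all of them.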
10. Deployment Architecture
10.1 Standalone Deployment
docker-compose.yml
|
+-- milvus-etcd (etcd:v3.5.5)
+-- milvus-minio (minio)
+-- milvus-standalone (milvus:v2.4)
+-- sc-api (FastAPI, port 8540)
+-- sc-streamlit (Streamlit, port 8130)
+-- sc-setup (one-shot seed)
10.2 Integrated DGX Spark Deployment
docker-compose.dgx-spark.yml (top-level)
|
+-- Shared Milvus (port 19530)
+-- ... (other agents)
+-- sc-api (port 8540)
+-- sc-streamlit (port 8130)
HCLS AI Factory -- Single-Cell Intelligence Agent Architecture Guide v1.0.0