# CAR-T Intelligence Agent — Project Bible
Purpose: Complete implementation reference for building the HCLS CAR-T Intelligence Agent on NVIDIA DGX Spark. Import this document into a Claude Code session as context for implementation.
License: Apache 2.0 | Author: Adam Jones | Date: February 2026
## Table of Contents
- Project Overview & Goals
- DGX Spark Hardware Reference
- Repository Layout
- Docker Compose Services
- Milvus Collection Schemas
- Pydantic Data Models
- Configuration Reference
- Embedding Strategy
- Multi-Collection RAG Engine
- Knowledge Graph
- Query Expansion
- Comparative Analysis Mode
- Agent Reasoning Pipeline
- Data Ingest Pipelines
- FastAPI REST Server
- Streamlit Chat UI
- Report Export
- Monitoring & Metrics
- Testing Strategy
- HCLS AI Factory Integration
- Implementation Sequence
## 1. Project Overview & Goals
### What This Agent Does
The CAR-T Intelligence Agent is a cross-functional research intelligence system that unifies CAR-T cell therapy knowledge across 11 Milvus collections spanning the entire development lifecycle — from target identification through post-market pharmacovigilance. It uses retrieval-augmented generation to scan curated biomedical collections and generate structured, traceable reports with clickable PubMed and ClinicalTrials.gov citations.
### Five Development Stages
| Stage | Scope |
|---|---|
| Target Identification | Antigen biology, expression profiling, disease association |
| CAR Design | scFv selection, costimulatory domains, signaling architecture |
| Vector Engineering | Transduction, viral vector production, manufacturing processes |
| In Vitro / In Vivo Testing | Cytotoxicity, cytokine assays, animal models, persistence |
| Clinical Development | Trial design, response rates, toxicity management |
### Key Results
| Metric | Value |
|---|---|
| Total vectors indexed | 6,266 across 11 Milvus collections (10 owned + 1 read-only) |
| Multi-collection search latency | 12-16 ms (11 collections, top-5 each, cached) |
| Comparative dual retrieval | ~365 ms (2 × 11 collections, entity-filtered) |
| Full RAG query (search + Claude) | ~24 sec end-to-end |
| Comparative RAG query | ~30 sec end-to-end |
| Cosine similarity scores | 0.74 – 0.90 on demo queries |
| Knowledge graph entities | 70+ (25 targets, 8 toxicities, 10 manufacturing, 15 biomarkers, 6 regulatory) |
| Query expansion | 12 maps, 169 keywords → 1,496 terms |
| Entity aliases | 39+ for comparative resolution |
### Pipeline Pattern
Every query follows the same pattern:
1. User question submitted (Streamlit UI or REST API)
2. Agent plans search strategy (identify antigens, stages, decompose complex queries)
3. BGE-small embeds query with asymmetric instruction prefix
4. Parallel search across 11 Milvus collections (ThreadPoolExecutor)
5. Query expansion adds semantically related terms
6. Knowledge graph augments results with domain context
7. Evidence merged, deduplicated, ranked (top 30)
8. Claude Sonnet 4 generates grounded answer with citations
9. Response streamed to user with evidence panel
### HCLS AI Factory Integration
This agent is one node in the broader HCLS AI Factory. Cross-agent triggers include:
- CAR-T → Genomics (Parabricks): Shared Milvus instance, genomic_evidence collection (3.56M vectors, read-only)
- CAR-T → Drug Discovery (BioNeMo): Target validation feeds molecule generation
- CAR-T → Imaging Agent: Cross-modal phenotype correlation (future)
- CAR-T → Biomarker Agent: CRS prediction, exhaustion monitoring (future)
## 2. DGX Spark Hardware Reference
### Specifications
| Parameter | Value |
|---|---|
| GPU | NVIDIA GB10 Grace Blackwell Superchip |
| Architecture | Blackwell (GPU) + Grace (CPU) via NVLink-C2C |
| GPU Memory | 128 GB unified LPDDR5x (shared CPU+GPU) |
| CPU | 20 ARM Neoverse V2 cores (Grace) |
| Storage | NVMe SSD (local) |
| Networking | 10GbE, WiFi 7, Bluetooth 5.3 |
| Price | $3,999 |
| Power | Desktop form factor, standard AC |
### Why DGX Spark
- 128 GB unified memory eliminates CPU↔GPU transfer bottleneck
- NVLink-C2C provides 900 GB/s bandwidth between Grace CPU and Blackwell GPU
- ARM64 Grace CPU runs all Python services natively
- Single-box deployment — Milvus, embeddings, LLM inference, and UI on one machine
- $3,999 price point — accessible for research labs and departmental pilots
## 3. Repository Layout
```
cart_intelligence_agent/
├── Docs/
│ ├── CART_Intelligence_Agent_Design.md # Architecture design document
│ ├── CART_INTELLIGENCE_AGENT_PROJECT_BIBLE.md # This document
│ ├── CART_Intelligence_Agent_DGX_Spark_Infographic_Prompt.md
│ └── CART_Intelligence_Agent_VAST_AI_OS_Infographic_Prompt.md
├── docs/
│ ├── ARCHITECTURE_GUIDE.md # Architecture walkthrough
│ ├── PROJECT_BIBLE.md # HCLS Factory project bible
│ ├── WHITE_PAPER.md # Technical white paper
│ ├── LEARNING_GUIDE_FOUNDATIONS.md # Foundations guide
│ ├── LEARNING_GUIDE_ADVANCED.md # Advanced guide
│ └── DEMO_GUIDE.md # Demo walkthrough
├── src/
│ ├── __init__.py
│ ├── models.py # Pydantic data models (474 lines)
│ ├── collections.py # Milvus collection schemas + manager (1,004 lines)
│ ├── knowledge.py # Knowledge graph (1,511 lines)
│ ├── query_expansion.py # 12 expansion maps (1,257 lines)
│ ├── rag_engine.py # Multi-collection RAG engine (686 lines)
│ ├── agent.py # Agent reasoning pipeline (262 lines)
│ ├── export.py # Report export: MD, JSON, PDF (903 lines)
│ ├── metrics.py # Prometheus metrics (189 lines)
│ ├── scheduler.py # Data ingestion scheduler (226 lines)
│ ├── ingest/
│ │ ├── __init__.py
│ │ ├── base.py # Base ingest pipeline (184 lines)
│ │ ├── literature_parser.py # PubMed E-utilities ingest (350 lines)
│ │ ├── clinical_trials_parser.py # ClinicalTrials.gov API v2 (403 lines)
│ │ ├── construct_parser.py # CAR construct parser + FDA seed (292 lines)
│ │ ├── assay_parser.py # Assay data parser (163 lines)
│ │ └── manufacturing_parser.py # Manufacturing/CMC parser (111 lines)
│ └── utils/
│ ├── __init__.py
│ └── pubmed_client.py # NCBI E-utilities HTTP client (390 lines)
├── app/
│ └── cart_ui.py # Streamlit chat + comparative UI (1,123 lines)
├── api/
│ └── main.py # FastAPI REST server (555 lines)
├── config/
│ └── settings.py # Pydantic BaseSettings (101 lines)
├── data/
│ ├── reference/
│ │ ├── assay_seed_data.json # 45 curated assay records
│ │ └── manufacturing_seed_data.json # 30 curated manufacturing records
│ └── cache/ # Embedding cache
├── scripts/
│ ├── setup_collections.py # Create collections + seed FDA constructs
│ ├── ingest_pubmed.py # CLI: PubMed ingest
│ ├── ingest_clinical_trials.py # CLI: ClinicalTrials.gov ingest
│ ├── seed_assays.py # CLI: Seed assay data
│ ├── seed_manufacturing.py # CLI: Seed manufacturing data
│ ├── validate_e2e.py # End-to-end validation (5 tests)
│ ├── test_rag_pipeline.py # Full RAG + LLM integration test
│ └── seed_knowledge.py # Knowledge graph export
├── docker-compose.yml # Full deployment (6 services)
├── Dockerfile # Multi-stage build (86 lines)
├── requirements.txt # Python dependencies
├── .env.example # Environment variable template
├── LICENSE # Apache 2.0
└── README.md
```
55 Python files | ~16,748 lines of code | Apache 2.0
## 4. Docker Compose Services
### Service Architecture
| Service | Image | Port | Role |
|---|---|---|---|
| `milvus-etcd` | quay.io/coreos/etcd:v3.5.5 | 2379 | Milvus metadata store |
| `milvus-minio` | minio/minio:v2023.03 | 9000, 9001 | Milvus object storage |
| `milvus-standalone` | milvusdb/milvus:v2.4 | 19530, 9091 | Vector database |
| `cart-streamlit` | Built from Dockerfile | 8521 | Streamlit chat UI |
| `cart-api` | Built from Dockerfile | 8522 | FastAPI REST server |
| `cart-setup` | Built from Dockerfile | — | One-shot collection setup + seed |
### Startup Sequence
```bash
# 1. Configure
cp .env.example .env
# Edit .env: set ANTHROPIC_API_KEY

# 2. Launch all services
docker compose up -d

# 3. Watch setup completion
docker compose logs -f cart-setup

# 4. Verify
curl http://localhost:8522/health
# → {"status": "healthy", "collections": 11, "total_vectors": 6266}
```
### Dockerfile (Multi-Stage)
Stage 1 — Builder:
- Base: python:3.10-slim
- System deps: build-essential, gcc, g++, libxml2-dev, libxslt1-dev
- Virtual environment with all pip dependencies
Stage 2 — Runtime:
- Base: python:3.10-slim
- Copy venv from builder
- Copy source: config/, src/, app/, api/, scripts/, data/
- Non-root user: cartuser
- Healthcheck: curl -f http://localhost:8521/_stcore/health
## 5. Milvus Collection Schemas
### 11 Collections Overview
| Collection | Records | Primary Fields | Source |
|---|---|---|---|
| `cart_literature` | 5,047 | PMID, title, text_chunk, year, cart_stage, target_antigen | PubMed E-utilities |
| `cart_trials` | 973 | NCT ID, title, phase, status, sponsor, target_antigen | ClinicalTrials.gov API v2 |
| `cart_constructs` | 6 | name, target_antigen, scfv_origin, costimulatory, generation | FDA product labels |
| `cart_assays` | 45 | assay_type, construct_id, target_antigen, key_metric, outcome | Landmark publications |
| `cart_manufacturing` | 30 | process_step, vector_type, parameter, target_spec, met_spec | Published CMC data |
| `cart_safety` | 40 | product, event_type, severity_grade, incidence_rate, management | Pharmacovigilance |
| `cart_biomarkers` | 43 | biomarker_name, biomarker_type, clinical_cutoff, evidence_level | CRS/response markers |
| `cart_regulatory` | 25 | product, regulatory_event, date, agency, indication, decision | FDA/EMA milestones |
| `cart_sequences` | 27 | construct_name, scfv_clone, binding_affinity_kd, species_origin | Molecular data |
| `cart_realworld` | 30 | study_type, data_source, product, population_size, outcome_value | Registry/RWE outcomes |
| `genomic_evidence` | 3,561,170 | chrom, pos, gene, consequence, clinical_significance | Shared (read-only) |
### Unified Schema Pattern
All collections share this structure:
```text
# Primary key
id: VARCHAR(100)              # PMID, NCT ID, construct ID, etc.

# Embedding vector (indexed)
embedding: FLOAT_VECTOR(384)  # BGE-small-en-v1.5
# Index: IVF_FLAT, metric: COSINE, nlist: 1024, nprobe: 16

# Text fields
title / text_summary / text_chunk: VARCHAR(500-3000)

# Collection-specific metadata
# VARCHAR for enums, INT64 for years/counts, FLOAT for metrics
```
### Index Configuration
```python
INDEX_PARAMS = {
    "metric_type": "COSINE",
    "index_type": "IVF_FLAT",
    "params": {"nlist": 1024},
}
SEARCH_PARAMS = {
    "metric_type": "COSINE",
    "params": {"nprobe": 16},
}
```
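The unified pattern maps directly onto pymilvus schema objects. A minimal sketch, assuming a trimmed `cart_constructs` field set (the real `collections.py` defines more fields, per the tables above) and reusing `INDEX_PARAMS` from the block above:

```python
from pymilvus import (
    Collection, CollectionSchema, DataType, FieldSchema, connections,
)

connections.connect(host="localhost", port="19530")

fields = [
    FieldSchema("id", DataType.VARCHAR, max_length=100, is_primary=True),
    FieldSchema("embedding", DataType.FLOAT_VECTOR, dim=384),
    FieldSchema("title", DataType.VARCHAR, max_length=500),
    FieldSchema("target_antigen", DataType.VARCHAR, max_length=100),
]
schema = CollectionSchema(fields, description="CAR construct records")
collection = Collection("cart_constructs", schema)

# Apply the IVF_FLAT / COSINE index defined above, then load for search
collection.create_index("embedding", INDEX_PARAMS)
collection.load()
```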
### CARTCollectionManager
| Method | Purpose |
|---|---|
| `create_all_collections()` | Create all 11 collections with IVF_FLAT indexes |
| `get_collection_stats()` | Return record count per collection |
| `insert_batch()` | Bulk insert records with Enum→str conversion and UTF-8 truncation |
| `search()` | Single-collection vector similarity search with optional filters |
| `search_all()` | Parallel multi-collection search via ThreadPoolExecutor (11 threads) |
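A minimal sketch of the `search_all()` fan-out, assuming an established Milvus connection, loaded collections, and the `SEARCH_PARAMS` defined above; `search_one` and the truncated `COLLECTIONS` list are illustrative stand-ins for the manager's internals:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pymilvus import Collection

COLLECTIONS = ["cart_literature", "cart_trials", "cart_constructs"]  # ... all 11

def search_one(name: str, query_vec: list, top_k: int = 5):
    """Search a single collection with the shared COSINE params."""
    hits = Collection(name).search(
        data=[query_vec], anns_field="embedding",
        param=SEARCH_PARAMS, limit=top_k, output_fields=["title"],
    )
    return name, hits[0]

def search_all(query_vec: list, top_k: int = 5) -> dict:
    """Fan out one thread per collection; gather results as futures complete."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(COLLECTIONS)) as pool:
        futures = [pool.submit(search_one, n, query_vec, top_k) for n in COLLECTIONS]
        for fut in as_completed(futures):
            name, hits = fut.result()
            results[name] = hits
    return results
```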
## 6. Pydantic Data Models
### Enumeration Types
```text
CARTStage:       TARGET_ID, CAR_DESIGN, VECTOR_ENG, TESTING, CLINICAL
TrialPhase:      EARLY_1, PHASE_1, PHASE_1_2, PHASE_2, PHASE_2_3, PHASE_3, PHASE_4, NA
TrialStatus:     RECRUITING, ACTIVE, COMPLETED, TERMINATED, WITHDRAWN, SUSPENDED, NOT_YET
CARGeneration:   1st, 2nd, 3rd, 4th, armored, universal
AssayType:       CYTOTOXICITY, CYTOKINE, FLOW_CYTOMETRY, PROLIFERATION, IN_VIVO, PERSISTENCE, EXHAUSTION
ProcessStep:     TRANSDUCTION, EXPANSION, HARVEST, FORMULATION, RELEASE_TESTING, CRYOPRESERVATION
SafetyEventType: CRS, ICANS, CYTOPENIA, INFECTION, SECONDARY_MALIGNANCY, ORGAN_TOXICITY, NEUROLOGIC, CARDIAC
BiomarkerType:   PREDICTIVE, PROGNOSTIC, PHARMACODYNAMIC, MONITORING, RESISTANCE
EvidenceLevel:   VALIDATED, EMERGING, EXPLORATORY
RegulatoryEvent: BLA, BREAKTHROUGH_THERAPY, RMAT, ACCELERATED_APPROVAL, FULL_APPROVAL, LABEL_UPDATE, REMS
RWEStudyType:    RETROSPECTIVE, REGISTRY, CLAIMS, EHR_ANALYSIS, META_ANALYSIS
```
### Collection Models
Each collection has a corresponding Pydantic model with a `to_embedding_text()` method that combines key fields for BGE encoding:
| Model | Collection | Key Fields |
|---|---|---|
| `CARTLiterature` | cart_literature | PMID, title, text_chunk, year, cart_stage, target_antigen, disease, journal |
| `ClinicalTrial` | cart_trials | NCT ID, title, phase, status, sponsor, target_antigen, car_generation, costimulatory |
| `CARConstruct` | cart_constructs | name, target_antigen, scfv_origin, costimulatory_domain, generation, fda_status |
| `AssayResult` | cart_assays | assay_type, construct_id, target_antigen, cell_line, key_metric, outcome |
| `ManufacturingRecord` | cart_manufacturing | process_step, vector_type, parameter, target_spec, met_spec |
| `SafetyRecord` | cart_safety | product, event_type, severity_grade, incidence_rate, management_protocol |
| `BiomarkerRecord` | cart_biomarkers | biomarker_name, biomarker_type, assay_method, clinical_cutoff, evidence_level |
| `RegulatoryRecord` | cart_regulatory | product, regulatory_event, date, agency, indication, decision |
| `SequenceRecord` | cart_sequences | construct_name, scfv_clone, binding_affinity_kd, species_origin |
| `RealWorldRecord` | cart_realworld | study_type, data_source, product, population_size, primary_endpoint |
### Search Result Models
```text
SearchHit:             # Single evidence item (collection, id, score, text, metadata)
CrossCollectionResult: # Merged multi-collection results (hits, knowledge_context, search_time_ms)
ComparativeResult:     # Dual-entity comparison (entity_a, entity_b, evidence_a, evidence_b)
AgentQuery:            # User input (question, target_antigen, cart_stage, include_genomic)
AgentResponse:         # Agent output (question, answer, evidence, knowledge_used, timestamp)
```
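As a concrete illustration, `SearchHit` could be declared as below; the field names follow the comment above, while the exact types in `models.py` may differ:

```python
from typing import Any, Dict
from pydantic import BaseModel, Field

class SearchHit(BaseModel):
    """Single evidence item returned from one Milvus collection."""
    collection: str                                          # e.g. "cart_literature"
    id: str                                                  # PMID, NCT ID, construct ID, ...
    score: float                                             # cosine similarity
    text: str                                                # matched chunk or summary
    metadata: Dict[str, Any] = Field(default_factory=dict)   # collection-specific fields
```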
## 7. Configuration Reference
### CARTSettings (Pydantic BaseSettings)
```python
# Environment prefix: CART_
# Loaded from: .env file or environment variables

# Paths
PROJECT_ROOT: Path       # Auto-detected
DATA_DIR: Path           # PROJECT_ROOT / "data"
RAG_PIPELINE_ROOT: Path  # For shared resources

# Milvus
MILVUS_HOST: str = "localhost"
MILVUS_PORT: int = 19530

# Embeddings
EMBEDDING_MODEL: str = "BAAI/bge-small-en-v1.5"
EMBEDDING_DIMENSION: int = 384
EMBEDDING_BATCH_SIZE: int = 32

# LLM
LLM_PROVIDER: str = "anthropic"
LLM_MODEL: str = "claude-sonnet-4-20250514"
ANTHROPIC_API_KEY: Optional[str] = None

# RAG Search
TOP_K_PER_COLLECTION: int = 5
SCORE_THRESHOLD: float = 0.4

# Collection Weights (sum ≈ 1.0)
WEIGHT_LITERATURE: float = 0.20
WEIGHT_TRIALS: float = 0.16
WEIGHT_CONSTRUCTS: float = 0.10
WEIGHT_ASSAYS: float = 0.09
WEIGHT_MANUFACTURING: float = 0.07
WEIGHT_SAFETY: float = 0.08
WEIGHT_BIOMARKERS: float = 0.08
WEIGHT_REGULATORY: float = 0.06
WEIGHT_SEQUENCES: float = 0.06
WEIGHT_REALWORLD: float = 0.07
WEIGHT_GENOMIC: float = 0.04

# External APIs
NCBI_API_KEY: Optional[str] = None
PUBMED_MAX_RESULTS: int = 5000
CT_GOV_BASE_URL: str = "https://clinicaltrials.gov/api/v2"

# Services
API_HOST: str = "0.0.0.0"
API_PORT: int = 8522
STREAMLIT_PORT: int = 8521

# Scheduling
INGEST_SCHEDULE_HOURS: int = 168  # Weekly refresh
INGEST_ENABLED: bool = False

# RAG Context
MAX_CONVERSATION_CONTEXT: int = 3  # Prior exchanges
CITATION_HIGH_THRESHOLD: float = 0.75
CITATION_MEDIUM_THRESHOLD: float = 0.60
```
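A minimal sketch of the wiring behind this reference, abbreviated to a few fields; the `CART_` prefix means, for example, that `CART_MILVUS_HOST` in `.env` overrides the default:

```python
from typing import Optional
from pydantic_settings import BaseSettings, SettingsConfigDict

class CARTSettings(BaseSettings):
    model_config = SettingsConfigDict(env_prefix="CART_", env_file=".env")

    MILVUS_HOST: str = "localhost"
    MILVUS_PORT: int = 19530
    EMBEDDING_MODEL: str = "BAAI/bge-small-en-v1.5"
    ANTHROPIC_API_KEY: Optional[str] = None
    TOP_K_PER_COLLECTION: int = 5

settings = CARTSettings()  # reads environment variables, then .env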
## 8. Embedding Strategy
### BGE-small-en-v1.5
| Parameter | Value |
|---|---|
| Model | BAAI/bge-small-en-v1.5 |
| Dimensions | 384 |
| Parameters | ~33M |
| Encoding | Asymmetric (query vs document) |
### Asymmetric Encoding
```python
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Query embedding (with instruction prefix)
prefix = "Represent this sentence for searching relevant passages: "
query_embedding = embedder.encode(prefix + question)

# Document embedding (raw text, no prefix)
doc_embedding = embedder.encode(record.to_embedding_text())
```
This asymmetric approach improves retrieval relevance by 5-15% compared to symmetric encoding.
### `to_embedding_text()` Pattern
Each Pydantic model generates its embedding input by combining key fields:
```python
# CARTLiterature
def to_embedding_text(self) -> str:
    parts = [self.title, self.text_chunk]
    if self.target_antigen:
        parts.append(f"Target: {self.target_antigen}")
    if self.disease:
        parts.append(f"Disease: {self.disease}")
    return " ".join(parts)
```
## 9. Multi-Collection RAG Engine
### System Prompt
```python
CART_SYSTEM_PROMPT = """You are a CAR-T cell therapy intelligence agent with deep expertise in:
1. Target Identification — antigen biology, expression profiling
2. CAR Design — scFv selection, costimulatory domains (CD28 vs 4-1BB)
3. Vector Engineering — lentiviral/retroviral production, VCN optimization
4. In Vitro & In Vivo Testing — cytotoxicity, cytokine profiling, persistence
5. Clinical Development — trial design, response rates, toxicity management
6. Manufacturing — leukapheresis, expansion, cryopreservation, release testing
7. Safety & Pharmacovigilance — REMS, FAERS, long-term follow-up
8. Biomarkers — CRS prediction (ferritin, CRP, IL-6), MRD monitoring
9. Regulatory Intelligence — FDA pathways, BLA timelines, breakthrough therapy
10. Molecular Design — scFv binding affinity, CDR sequences, humanization
11. Real-World Evidence — CIBMTR registry, community vs academic outcomes
12. Genomic Evidence — patient variants, ClinVar, AlphaMissense

When answering:
- Cite evidence using clickable markdown links
- Think cross-functionally — connect insights across stages
- Highlight failure modes and resistance mechanisms
- Be specific — cite trial names, products, quantitative data
- Acknowledge uncertainty — distinguish facts from emerging data"""
```
### Collection Configuration
```python
COLLECTION_CONFIG = {
    "cart_literature": {"weight": 0.20, "label": "Literature", "has_target_antigen": True},
    "cart_trials": {"weight": 0.16, "label": "Trial", "has_target_antigen": True},
    "cart_constructs": {"weight": 0.10, "label": "Construct", "has_target_antigen": True},
    "cart_assays": {"weight": 0.09, "label": "Assay", "has_target_antigen": True},
    "cart_manufacturing": {"weight": 0.07, "label": "Manufacturing", "has_target_antigen": False},
    "cart_safety": {"weight": 0.08, "label": "Safety", "has_target_antigen": False},
    "cart_biomarkers": {"weight": 0.08, "label": "Biomarker", "has_target_antigen": True},
    "cart_regulatory": {"weight": 0.06, "label": "Regulatory", "has_target_antigen": False},
    "cart_sequences": {"weight": 0.06, "label": "Sequence", "has_target_antigen": True},
    "cart_realworld": {"weight": 0.07, "label": "RealWorld", "has_target_antigen": False},
    "genomic_evidence": {"weight": 0.04, "label": "Genomic", "has_target_antigen": False},
}
```
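These weights drive step 4 of the retrieve pipeline: hits are merged across collections, deduplicated by ID, ranked by `score × weight`, and capped at 30. A sketch, assuming the `SearchHit` model sketched in §6 (the exact tie-breaking in `rag_engine.py` may differ):

```python
from typing import Dict, List

def merge_and_rank(
    per_collection: Dict[str, List[SearchHit]], cap: int = 30
) -> List[SearchHit]:
    """Merge hits from all collections, dedupe by ID, rank by weighted score."""
    best: Dict[str, SearchHit] = {}
    for name, hits in per_collection.items():
        weight = COLLECTION_CONFIG[name]["weight"]
        for hit in hits:
            hit.metadata["weighted_score"] = hit.score * weight
            prev = best.get(hit.id)
            if prev is None or hit.metadata["weighted_score"] > prev.metadata["weighted_score"]:
                best[hit.id] = hit  # keep the strongest hit per ID
    ranked = sorted(best.values(),
                    key=lambda h: h.metadata["weighted_score"], reverse=True)
    return ranked[:cap]
```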
### Core Methods
| Method | Purpose | Returns |
|---|---|---|
| `retrieve()` | Parallel search + expansion + knowledge augmentation | CrossCollectionResult |
| `query()` | Full RAG: retrieve + Claude generation | str (answer) |
| `query_stream()` | Streaming RAG: yields evidence then tokens | Generator[Dict] |
| `find_related()` | Cross-collection entity linking | Dict[str, List[SearchHit]] |
| `retrieve_comparative()` | Dual-entity comparative retrieval | ComparativeResult |
### Retrieve Pipeline
```text
User Query
    ↓
1. Embed query (BGE asymmetric prefix)                   [< 5 ms]
    ↓
2. Parallel search across 11 collections (top-5 each)    [12-16 ms]
   (ThreadPoolExecutor, 11 concurrent threads)
    ↓
3. Query expansion: detect keywords → expand → re-search [8-12 ms]
    ↓
4. Merge + deduplicate + weighted rank (cap at 30)       [< 1 ms]
    ↓
5. Score citations: high (≥0.75) / medium (≥0.60) / low  [< 1 ms]
    ↓
6. Knowledge graph augmentation (all domains)            [< 1 ms]
    ↓
7. Build prompt: evidence + knowledge + question         [< 1 ms]
    ↓
8. Stream Claude Sonnet 4 response                       [~22-24 sec]
```
### Citation Formatting
```python
# Literature → PubMed link
"[Literature:PMID](https://pubmed.ncbi.nlm.nih.gov/PMID/)"

# Trial → ClinicalTrials.gov link
"[Trial:NCT...](https://clinicaltrials.gov/study/NCT...)"
```
## 10. Knowledge Graph
### Components
| Component | Count | Description |
|---|---|---|
| Target Antigens | 25 | Full profiles: protein, UniProt ID, expression, diseases, approved products, resistance, toxicity |
| Toxicity Profiles | 8 | CRS, ICANS, B-cell aplasia, HLH/MAS, cytopenias, TLS, GvHD, on-target/off-tumor |
| Manufacturing Processes | 10 | Transduction, expansion, harvest, formulation, release testing, cryopreservation |
| Biomarker Database | 15+ | CRS prediction (ferritin, CRP, IL-6), exhaustion (PD-1, LAG-3, TIM-3), response markers |
| Regulatory Pathways | 6 | FDA breakthrough, RMAT, accelerated approval, REMS, post-marketing |
| Entity Aliases | 39+ | Product names, generic names, costimulatory domains, biomarker terms, regulatory terms |
### Target Antigens (25 entries)
```text
B-cell:  CD19, CD22, CD20, BCMA, GPRC5D, CD38, CD70
Myeloid: CD33, CD123, FLT3, CLL-1
Solid:   HER2, GPC3, EGFR, EGFRvIII, Mesothelin, Claudin18.2, GD2,
         PSMA, ROR1, DLL3, B7-H3, MUC1, IL13RA2, EpCAM
```
Each target includes: protein name, UniProt ID, expression pattern, associated diseases, FDA-approved products, key trials, known resistance mechanisms, toxicity profile, and normal tissue expression.
### Toxicity Profiles
| Toxicity | Grading | Management |
|---|---|---|
| CRS | CTCAE Grade 1-4 | Tocilizumab, corticosteroids |
| ICANS | ICE score-based | Dexamethasone, supportive |
| B-cell aplasia | On-target depletion | IVIG replacement |
| HLH/MAS | Ferritin + fibrinogen | Anakinra, etoposide |
| Cytopenias | CBC monitoring | G-CSF, transfusions |
| TLS | Labs (uric acid, K+, PO4) | Rasburicase, hydration |
| GvHD | Allogeneic only | Steroids, ruxolitinib |
| On-target/off-tumor | Normal tissue expression | Affinity tuning, safety switches |
### API Functions
```python
get_target_context("CD19")                  # Full CD19 knowledge block
get_toxicity_context("CRS")                 # CRS grading + management
get_manufacturing_context("expansion")      # Expansion process specs
get_biomarker_context("ferritin")           # CRS prediction data
get_regulatory_context("Kymriah")           # FDA approval timeline
resolve_comparison_entity("Kymriah")        # → {"type": "product", "target": "CD19"}
get_comparison_context(entity_a, entity_b)  # Side-by-side knowledge
get_knowledge_stats()                       # {target_antigens: 25, ...}
```
### Entity Resolution Priority
`resolve_comparison_entity()` resolves free-text entities in the order shown in the §12 table: direct match against the 25 CART_TARGETS antigens first, then ENTITY_ALIASES (products, costimulatory domains, biomarker and regulatory terms), then CART_TOXICITIES, then CART_MANUFACTURING. Entities that resolve to nothing trigger the single-query fallback described in §12.
## 11. Query Expansion
### 12 Expansion Map Categories
| Category | Keywords | Expanded Terms | Examples |
|---|---|---|---|
| Target Antigen | 26 | 196 | CD19 → [B-ALL, DLBCL, tisagenlecleucel, axicabtagene, FMC63, ...] |
| Disease | 16 | 143 | multiple myeloma → [MM, RRMM, plasma cell neoplasm, ...] |
| Toxicity | 14 | 136 | CRS → [cytokine release syndrome, tocilizumab, IL-6, ...] |
| Manufacturing | 16 | 181 | transduction → [lentiviral, VCN, MOI, viral vector, ...] |
| Mechanism | 19 | 224 | resistance → [antigen loss, lineage switch, trogocytosis, ...] |
| Construct | 20 | 206 | bispecific → [dual-targeting, tandem, bicistronic, ...] |
| Safety | 15 | 135 | REMS → [risk evaluation, FAERS, adverse event, ...] |
| Biomarker | 14 | 125 | CRS prediction → [ferritin, CRP, IL-6, sIL-2R, ...] |
| Regulatory | 12 | 108 | BLA → [biologics license application, RMAT, ...] |
| Sequence | 10 | 95 | scFv → [single-chain variable fragment, VH, VL, CDR, ...] |
| Real-World | 12 | 110 | registry → [CIBMTR, real-world evidence, post-marketing, ...] |
| Immunogenicity | 10 | 92 | ADA → [anti-drug antibody, neutralizing antibody, ...] |
| Total | 169 | 1,496 | |
### How It Works
```python
def expand_query(question: str) -> List[str]:
    """Detect keywords in user question, return expansion terms.

    Used by RAG engine to broaden recall via semantic re-search.
    Returns up to 10-20 related terms per matched keyword.
    """
```
## 12. Comparative Analysis Mode
### Auto-Detection
Comparative queries are detected by regex matching: `compare`, `vs`, `versus`, `comparing`, and the `X vs Y` pattern.
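A sketch of the detection and parsing helpers under that assumption; the real `_is_comparative()` and `_parse_comparison_entities()` may handle more phrasings:

```python
import re
from typing import Optional, Tuple

COMPARATIVE_RE = re.compile(r"\b(compare|comparing|vs\.?|versus)\b", re.IGNORECASE)
ENTITY_PAIR_RE = re.compile(r"([\w\-\.]+)\s+(?:vs\.?|versus)\s+([\w\-\.]+)", re.IGNORECASE)

def is_comparative(question: str) -> bool:
    """True if the question contains a comparison keyword."""
    return COMPARATIVE_RE.search(question) is not None

def parse_comparison_entities(question: str) -> Optional[Tuple[str, str]]:
    """'Compare 4-1BB vs CD28 costimulatory domains' → ('4-1BB', 'CD28')"""
    match = ENTITY_PAIR_RE.search(question)
    return (match.group(1), match.group(2)) if match else None
```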
### Pipeline
```text
User: "Compare 4-1BB vs CD28 costimulatory domains"
    ↓
1. _is_comparative() detects "vs"                          [< 1 ms]
    ↓
2. _parse_comparison_entities() extracts "4-1BB", "CD28"   [< 1 ms]
    ↓
3. resolve_comparison_entity() for each entity             [< 1 ms]
   "4-1BB" → {"type": "costimulatory", "canonical": "4-1BB (CD137)"}
   "CD28"  → {"type": "costimulatory", "canonical": "CD28"}
    ↓
4. Dual retrieve() — one per entity                        [~365 ms]
   Entity A: retrieve(question, target_antigen=entity_a.target)
   Entity B: retrieve(question, target_antigen=entity_b.target)
    ↓
5. get_comparison_context() — side-by-side knowledge graph [< 1 ms]
    ↓
6. _build_comparative_prompt() — structured comparison     [< 1 ms]
   Evidence grouped by entity (A section, B section)
   Instructions: table + advantages + limitations
    ↓
7. Stream Claude Sonnet 4 (max_tokens=3000)                [~28-30 sec]
   Output: comparison table, pros/cons, clinical context
```
### Supported Entity Types
| Type | Examples | Resolution |
|---|---|---|
| Target Antigens | CD19, BCMA, CD22 | Direct match in CART_TARGETS (25) |
| FDA Products | Kymriah, Yescarta, Carvykti | ENTITY_ALIASES → canonical + target |
| Costimulatory Domains | 4-1BB, CD28 | ENTITY_ALIASES → type: costimulatory |
| Toxicity Profiles | CRS, ICANS | CART_TOXICITIES → type: toxicity |
| Manufacturing | Lentiviral, transduction | CART_MANUFACTURING → type: manufacturing |
### Fallback
If entity parsing fails, the query gracefully falls back to the standard single-query `retrieve()` path.
## 13. Agent Reasoning Pipeline
### CARTIntelligenceAgent
```python
class CARTIntelligenceAgent:
    """Autonomous reasoning agent: Plan → Search → Evaluate → Synthesize → Report"""
```
### SearchPlan
```python
from dataclasses import dataclass
from typing import List

from src.models import CARTStage

@dataclass
class SearchPlan:
    question: str
    identified_topics: List[str]
    target_antigens: List[str]
    relevant_stages: List[CARTStage]
    search_strategy: str        # "broad" | "targeted" | "comparative"
    sub_questions: List[str]    # Decomposed for complex queries
```
### Agent Pipeline
```text
Phase 1: Plan
├── Extract target antigens (regex against 25 known antigens)
├── Identify development stages (keyword match)
├── Determine strategy (broad / targeted / comparative)
└── Decompose complex questions into sub-questions

Phase 2: Search
├── Build AgentQuery with plan parameters
└── Execute retrieve() via RAG engine

Phase 3: Evaluate
├── Sufficient: ≥3 collections with hits + ≥10 total
├── Partial: ≥2 collections + ≥5 total
└── Insufficient: try sub-questions for more evidence

Phase 4: Synthesize
└── LLM generates grounded answer with citations

Phase 5: Report
├── Query metadata (timestamp, collections, evidence count)
├── Analysis (the answer)
├── Evidence sources (by collection, top 5 each)
└── Knowledge graph items referenced
```
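A sketch of Phase 1, assuming the `SearchPlan` dataclass above, the `is_comparative()` helper sketched in §12, and small illustrative subsets of the antigen list and stage-keyword map:

```python
import re
from typing import List

from src.models import CARTStage  # enum from §6

KNOWN_ANTIGENS: List[str] = ["CD19", "CD22", "BCMA", "HER2", "GD2"]  # subset of the 25
STAGE_KEYWORDS = {                 # illustrative subset of the keyword map
    "manufactur": CARTStage.VECTOR_ENG,   # vector production / manufacturing
    "trial": CARTStage.CLINICAL,
    "cytotox": CARTStage.TESTING,
}

def plan(question: str) -> SearchPlan:
    """Phase 1: extract antigens and stages, pick a search strategy."""
    antigens = [a for a in KNOWN_ANTIGENS
                if re.search(rf"\b{re.escape(a)}\b", question, re.IGNORECASE)]
    stages = [s for kw, s in STAGE_KEYWORDS.items() if kw in question.lower()]
    strategy = ("comparative" if is_comparative(question)
                else "targeted" if antigens else "broad")
    return SearchPlan(question=question, identified_topics=[],
                      target_antigens=antigens, relevant_stages=stages,
                      search_strategy=strategy, sub_questions=[])
```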
## 14. Data Ingest Pipelines
### BaseIngestPipeline Pattern
```python
from abc import ABC
from typing import Any, List

class BaseIngestPipeline(ABC):
    def fetch(self, **kwargs) -> Any: ...                     # Retrieve raw data
    def parse(self, raw_data: Any) -> List[Any]: ...          # Validate into Pydantic models
    def embed_and_store(self, records, **kwargs) -> int: ...  # Embed + insert to Milvus
    def run(self, **kwargs) -> int:                           # Orchestrate full pipeline
        return self.embed_and_store(self.parse(self.fetch(**kwargs)))
```
### Five Ingest Pipelines
| Pipeline | Source | Records | Time | Collection |
|---|---|---|---|---|
| PubMed | NCBI E-utilities (esearch + efetch) | 5,047 | ~15 min | cart_literature |
| ClinicalTrials.gov | REST API v2 | 973 | ~3 min | cart_trials |
| FDA Constructs | Seed script (6 products) | 6 | ~5 sec | cart_constructs |
| Assays | Curated JSON (landmark papers) | 45 | ~30 sec | cart_assays |
| Manufacturing | Curated JSON (CMC data) | 30 | ~30 sec | cart_manufacturing |
Additional seed scripts populate: cart_safety (40), cart_biomarkers (43), cart_regulatory (25), cart_sequences (27), cart_realworld (30).
### Batch Processing Safety
```python
from enum import Enum

# Enum → string conversion for Milvus VARCHAR fields
for key, value in record_dict.items():
    if isinstance(value, Enum):
        record_dict[key] = value.value

# UTF-8 byte truncation for Milvus VARCHAR limits
for key, value in record_dict.items():
    if isinstance(value, str):
        encoded = value.encode("utf-8")
        if len(encoded) > 2990 and key in ("text_chunk", "text_summary"):
            record_dict[key] = encoded[:2990].decode("utf-8", errors="ignore")
```
## 15. FastAPI REST Server
### Service Configuration
- Port: 8522
- Command: `uvicorn api.main:app --host 0.0.0.0 --port 8522`
### Endpoints
| Method | Path | Purpose |
|---|---|---|
| GET | `/health` | Service health + collection count + total vectors |
| GET | `/collections` | List all collections with record counts |
| POST | `/query` | Full RAG query (retrieve + Claude generation) |
| POST | `/search` | Evidence-only search (no LLM) |
| POST | `/find-related` | Cross-collection entity linking |
| GET | `/knowledge/stats` | Knowledge graph statistics |
| GET | `/metrics` | Prometheus-compatible metrics |
### Query Request Schema
```json
{
  "question": "Why do CD19 CAR-T therapies fail?",
  "target_antigen": "CD19",
  "collections": ["cart_literature", "cart_trials"],
  "year_min": 2015,
  "year_max": 2025
}
```
### Query Response Schema
```json
{
  "question": "...",
  "answer": "...",
  "evidence": [
    {
      "collection": "Literature",
      "id": "12345678",
      "score": 0.89,
      "text": "...",
      "metadata": {"relevance": "high"}
    }
  ],
  "knowledge_context": "...",
  "collections_searched": 11,
  "search_time_ms": 234.5
}
```
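Putting the two schemas together, a query against a local deployment could look like this (using `requests`, which is illustrative and not pinned in requirements.txt):

```python
import requests  # illustrative HTTP client

payload = {
    "question": "Why do CD19 CAR-T therapies fail?",
    "target_antigen": "CD19",
    "collections": ["cart_literature", "cart_trials"],
    "year_min": 2015,
    "year_max": 2025,
}
resp = requests.post("http://localhost:8522/query", json=payload, timeout=120)
body = resp.json()
print(body["answer"][:500])
print(f'{body["collections_searched"]} collections, {body["search_time_ms"]:.1f} ms search')
```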
## 16. Streamlit Chat UI
### Service Configuration
- Port: 8521
- Command: `streamlit run app/cart_ui.py --server.port 8521`
- Theme: NVIDIA dark (black/green)
### Components
**Sidebar:**
- Collection statistics (10 owned + 1 read-only genomic)
- Deep Research mode toggle (activates agent reasoning pipeline)
- Collection filtering checkboxes
- Year range slider (1990–2030)
- Stage filter dropdown
- Target antigen multi-select

**Main Panel:**
- Chat message history (conversation memory: last 3 exchanges)
- Query input with demo query buttons
- Streaming response display
- Evidence panel (grouped by collection, color-coded badges)
- Citation relevance tags (high/medium/low)
- Clickable PubMed and ClinicalTrials.gov links

**Comparative Mode UI:**
- Entity A header (blue) — evidence cards for entity A
- "— VS —" divider (green)
- Entity B header (purple) — evidence cards for entity B

**Advanced Features:**
- Image upload for claim verification
- Knowledge graph visualization (PyVis network)
- Export buttons (PDF, Markdown, JSON)
## 17. Report Export
### Supported Formats
| Format | Method | Output |
|---|---|---|
| Markdown | `export_markdown()` | Structured report with headers, evidence table, citations |
| JSON | `export_json()` | Machine-readable export with all metadata |
| PDF | `export_pdf()` | Formatted report via ReportLab |
### Report Structure
1. Query Metadata (timestamp, collections searched, evidence count)
2. Analysis (LLM-generated answer)
3. Evidence Sources (by collection, top 5 per collection)
4. Knowledge Graph Items Referenced
5. Citation Links (PubMed, ClinicalTrials.gov)
## 18. Monitoring & Metrics
### Prometheus Metrics
```text
cart_api_requests_total               # Total API requests (counter)
cart_collection_vectors{collection}   # Vectors per collection (gauge)
cart_query_duration_seconds           # Query latency histogram
cart_search_duration_seconds          # Search latency histogram
```
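A sketch of how these metrics could be declared with prometheus-client; the metric names and label match the list above, while registration and call sites are illustrative:

```python
from prometheus_client import Counter, Gauge, Histogram

API_REQUESTS = Counter("cart_api_requests_total", "Total API requests")
COLLECTION_VECTORS = Gauge(
    "cart_collection_vectors", "Vectors per collection", ["collection"])
QUERY_DURATION = Histogram("cart_query_duration_seconds", "Query latency")
SEARCH_DURATION = Histogram("cart_search_duration_seconds", "Search latency")

# Usage inside the API handlers:
API_REQUESTS.inc()
COLLECTION_VECTORS.labels(collection="cart_literature").set(5047)
with QUERY_DURATION.time():
    ...  # full RAG query
```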
### Endpoints
- Metrics: `GET /metrics` (Prometheus scrape format)
- Health: `GET /health` (JSON)
## 19. Testing Strategy
### End-to-End Validation (`validate_e2e.py`)
5 tests:
1. Collection stats — verify all 11 collections have expected record counts
2. Single-collection search — verify vector similarity works
3. Multi-collection search_all() — verify parallel search returns results from multiple collections
4. Filtered search — verify target_antigen == "CD19" filter
5. Demo queries — run all 5 validated demo queries
### RAG Integration Test (`test_rag_pipeline.py`)
Tests full pipeline: embed → search_all → knowledge graph → Claude LLM response generation. Validates both synchronous and streaming modes.
### Demo Queries
1. "Why do CD19 CAR-T therapies fail in relapsed B-ALL?"
2. "Compare 4-1BB vs CD28 costimulatory domains for DLBCL"
3. "What manufacturing parameters predict clinical response?"
4. "BCMA CAR-T resistance mechanisms in multiple myeloma"
5. "How does T-cell exhaustion affect CAR-T persistence?"
## 20. HCLS AI Factory Integration
### Shared Infrastructure
| Resource | Sharing |
|---|---|
| Milvus 2.4 | Shared instance — CAR-T adds 10 owned collections alongside genomic_evidence |
| BGE-small-en-v1.5 | Same embedding model as RAG/Chat pipeline |
| ANTHROPIC_API_KEY | Loaded from rag-chat-pipeline/.env if not set |
| DGX Spark | All services on single GB10 hardware ($3,999) |
### Architectural Insight
The platform is not disease-specific. By changing the knowledge graph, query expansion maps, and collection schemas, the same RAG architecture serves any therapeutic area. The CAR-T agent proves this with a completely different domain (cell therapy manufacturing vs. small molecule drug discovery) running on the same infrastructure.
## 21. Implementation Sequence
### Quick Start
```bash
# 1. Clone
git clone https://github.com/ajones1923/cart-intelligence-agent.git
cd cart-intelligence-agent

# 2. Install
pip install -r requirements.txt

# 3. Configure
export ANTHROPIC_API_KEY=sk-ant-...

# 4. Create collections + seed FDA constructs
python3 scripts/setup_collections.py --seed-constructs

# 5. Ingest PubMed (~15 min)
python3 scripts/ingest_pubmed.py --max-results 5000

# 6. Ingest ClinicalTrials.gov (~3 min)
python3 scripts/ingest_clinical_trials.py --max-results 1500

# 7. Seed assays + manufacturing (~1 min)
python3 scripts/seed_assays.py
python3 scripts/seed_manufacturing.py

# 8. Validate
python3 scripts/validate_e2e.py

# 9. Launch UI
streamlit run app/cart_ui.py --server.port 8521

# 10. Launch API (separate terminal)
uvicorn api.main:app --host 0.0.0.0 --port 8522
```
### Docker Quick Start
```bash
cp .env.example .env
# Edit .env: set ANTHROPIC_API_KEY
docker compose up -d

# UI:  http://localhost:8521
# API: http://localhost:8522/docs
```
### Dependencies
| Package | Version | Purpose |
|---|---|---|
| pydantic | ≥2.0 | Data validation |
| pydantic-settings | ≥2.0 | Environment config |
| pymilvus | ≥2.4.0 | Vector DB client |
| sentence-transformers | ≥2.2.0 | BGE embeddings |
| anthropic | ≥0.18.0 | Claude API |
| streamlit | ≥1.30.0 | Web UI |
| fastapi | ≥0.109.0 | REST API |
| uvicorn[standard] | ≥0.27.0 | ASGI server |
| lxml | ≥5.0.0 | PubMed XML parsing |
| biopython | ≥1.83 | NCBI E-utilities |
| reportlab | ≥4.0.0 | PDF generation |
| pyvis | ≥0.3.0 | Network visualization |
| prometheus-client | ≥0.20.0 | Metrics export |
| apscheduler | ≥3.10.0 | Job scheduling |
End of Project Bible