CAR-T Intelligence Agent — Deployment Guide¶
HCLS AI Factory / ai_agent_adds / cart_intelligence_agent
Version 1.0.0 | March 2026 | Author: Adam Jones
Table of Contents¶
- Overview
- Prerequisites
- Quick Start
- Deployment Modes
- 4a. Docker Lite (Milvus + API Only)
- 4b. Docker Full Stack (All 6 Services)
- 4c. DGX Spark Production
- 4d. Development Mode (Local Python)
- Configuration Reference
- Collection Setup and Seeding
- Data Ingestion (PubMed, ClinicalTrials.gov)
- Networking and Ports
- Storage and Persistence
- Monitoring and Metrics
- Security Hardening
- Health Checks and Troubleshooting
- Backup and Recovery
- Scaling Considerations
- Integration with HCLS AI Factory
- Updating and Maintenance
- Appendix A: Complete docker-compose.yml
- Appendix B: Environment Variable Quick Reference
1. Overview¶
The CAR-T Intelligence Agent is a multi-collection RAG-powered clinical decision support system for CAR-T cell therapy. It combines 11 Milvus vector collections (10 agent-owned + 1 shared), a 71-node knowledge graph, BGE-small-en-v1.5 sentence embeddings (384-dim), and Claude LLM reasoning to deliver evidence-based intelligence spanning literature, clinical trials, constructs, assays, manufacturing, safety, biomarkers, regulatory, sequences, and real-world evidence.
Core Capabilities¶
- Multi-collection RAG search across 11 knowledge domains with weighted relevance scoring and antigen-specific filtering.
- Knowledge graph augmentation with 71 curated nodes (34 targets, 17 toxicities, 20 manufacturing processes).
- Query expansion with 229 keyword categories mapping to 1,961 expansion terms across 12 domain-specific maps for improved recall.
- Deep Research mode with autonomous multi-pass agent pipeline.
- Cross-collection entity linking for holistic product intelligence.
- PDF/Markdown/JSON report export for clinical and research use.
- Prometheus-compatible metrics for operational monitoring.
Architecture at a Glance¶
The agent runs as 6 Docker services on a bridge network (cart-network):
cart-network (bridge)
+--------------------------------------------------+
| |
| milvus-etcd -----> milvus-standalone <-----+ |
| milvus-minio ---/ :19530 :9091 | |
| | |
| cart-api (:8522) --------------------------+ |
| cart-streamlit (:8521) ---------------------+ |
| cart-setup (one-shot) ----------------------+ |
| |
+--------------------------------------------------+
| Service | Container Name | Purpose |
|---|---|---|
milvus-etcd |
cart-milvus-etcd |
Metadata key-value store for Milvus |
milvus-minio |
cart-milvus-minio |
Object storage for Milvus log/index data |
milvus-standalone |
cart-milvus-standalone |
Vector database (Milvus 2.4) |
cart-streamlit |
cart-streamlit |
Streamlit clinical chat UI (port 8521) |
cart-api |
cart-api |
FastAPI REST API (port 8522) |
cart-setup |
cart-setup |
One-shot: create collections + seed all data |
2. Prerequisites¶
Hardware¶
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 4 cores | 8+ cores |
| RAM | 16 GB | 32 GB |
| Disk | 20 GB | 50 GB (with Milvus data) |
| GPU | Not required | NVIDIA GPU (for faster embedding) |
The agent runs entirely on CPU. GPU acceleration is beneficial for initial embedding generation but not required for runtime.
Software¶
| Dependency | Version | Purpose |
|---|---|---|
| Docker Engine | 24.0+ | Container runtime |
| Docker Compose | 2.20+ | Multi-container orchestration |
| Python | 3.10+ | Development mode only |
| NVIDIA DGX Spark | Grace Blackwell | Production deployment target |
API Keys¶
| Key | Required | Source |
|---|---|---|
ANTHROPIC_API_KEY |
Yes (for LLM) | console.anthropic.com |
NCBI_API_KEY |
Optional | ncbi.nlm.nih.gov/account |
The agent functions without an API key for embedding-only search (/search
endpoint), but requires ANTHROPIC_API_KEY for full RAG queries with LLM
synthesis.
3. Quick Start¶
# 1. Clone the repository
cd /path/to/hcls-ai-factory/ai_agent_adds/cart_intelligence_agent
# 2. Create environment file
cp .env.example .env
# Edit .env and add: ANTHROPIC_API_KEY=sk-ant-...
# 3. Start all services
docker compose up -d
# 4. Watch setup progress (one-shot container seeds all collections)
docker compose logs -f cart-setup
# 5. Access the interfaces
# Streamlit UI: http://localhost:8521
# FastAPI docs: http://localhost:8522/docs
# Health check: http://localhost:8522/health
Startup sequence:
1. milvus-etcd + milvus-minio start first (health-checked)
2. milvus-standalone starts after etcd + minio are healthy (~60s)
3. cart-setup runs once: creates 11 collections and seeds 649 records from 13 JSON files
4. cart-streamlit + cart-api start after Milvus is healthy
Total cold-start time: ~3–5 minutes (including Milvus init and embedding model download).
4. Deployment Modes¶
4a. Docker Lite (Milvus + API Only)¶
For headless/API-only deployments without the Streamlit UI:
This starts only the vector database infrastructure and the REST API. Useful for backend integration where another frontend consumes the API.
4b. Docker Full Stack (All 6 Services)¶
The default docker compose up -d launches all 6 services. This is the
recommended deployment for demos and development.
4c. DGX Spark Production¶
On the NVIDIA DGX Spark ($3,999), the CAR-T agent runs as part of the full
HCLS AI Factory stack via the master docker-compose.dgx-spark.yml:
# From the HCLS AI Factory root
cd $HCLS_HOME
docker compose -f docker-compose.dgx-spark.yml up -d
# Or start just the CAR-T services
docker compose -f docker-compose.dgx-spark.yml up -d \
cart-milvus-etcd cart-milvus-minio cart-milvus-standalone \
cart-streamlit cart-api cart-setup
In production, the agent integrates with:
- Landing Page (:8080) — service health dashboard
- Health Monitor (health-monitor.sh) — auto-restart on failure
- Prometheus + Grafana — metrics collection and visualization
- Cross-agent event bus (hcls_common.event_bus) — inter-agent communication
4d. Development Mode (Local Python)¶
For development without Docker:
# 1. Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
# 2. Install dependencies
pip install -r requirements.txt
# 3. Start Milvus (still needs Docker)
docker compose up -d milvus-etcd milvus-minio milvus-standalone
# 4. Set environment variables
export ANTHROPIC_API_KEY=sk-ant-...
export CART_MILVUS_HOST=localhost
export CART_MILVUS_PORT=19530
# 5. Seed collections
python3 scripts/setup_collections.py --drop-existing --seed-constructs
python3 scripts/seed_knowledge.py
python3 scripts/seed_assays.py
python3 scripts/seed_manufacturing.py
python3 scripts/seed_safety.py
python3 scripts/seed_biomarkers.py
python3 scripts/seed_regulatory.py
python3 scripts/seed_sequences.py
python3 scripts/seed_realworld.py
# 6. Launch Streamlit UI
streamlit run app/cart_ui.py --server.port 8521
# 7. (Optional) Launch FastAPI server
uvicorn api.main:app --host 0.0.0.0 --port 8522 --reload
# 8. Run tests
python3 -m pytest tests/ -v
5. Configuration Reference¶
All configuration is managed through config/settings.py using Pydantic
BaseSettings with env_prefix="CART_". Every setting can be overridden via
environment variables prefixed with CART_.
5.1 Path Settings¶
| Variable | Type | Default | Env Override |
|---|---|---|---|
PROJECT_ROOT |
Path | (auto-detected) | -- |
DATA_DIR |
Path | {PROJECT_ROOT}/data |
-- |
CACHE_DIR |
Path | {DATA_DIR}/cache |
-- |
REFERENCE_DIR |
Path | {DATA_DIR}/reference |
-- |
RAG_PIPELINE_ROOT |
Path | /app/rag-chat-pipeline |
CART_RAG_PIPELINE_ROOT |
5.2 Milvus Settings¶
| Variable | Type | Default | Env Override |
|---|---|---|---|
MILVUS_HOST |
str | localhost |
CART_MILVUS_HOST |
MILVUS_PORT |
int | 19530 |
CART_MILVUS_PORT |
5.3 Collection Names¶
| Variable | Default Collection Name |
|---|---|
COLLECTION_LITERATURE |
cart_literature |
COLLECTION_TRIALS |
cart_trials |
COLLECTION_CONSTRUCTS |
cart_constructs |
COLLECTION_ASSAYS |
cart_assays |
COLLECTION_MANUFACTURING |
cart_manufacturing |
COLLECTION_GENOMIC |
genomic_evidence |
COLLECTION_SAFETY |
cart_safety |
COLLECTION_BIOMARKERS |
cart_biomarkers |
COLLECTION_REGULATORY |
cart_regulatory |
COLLECTION_SEQUENCES |
cart_sequences |
COLLECTION_REALWORLD |
cart_realworld |
5.4 Embedding Settings¶
| Variable | Type | Default | Env Override |
|---|---|---|---|
EMBEDDING_MODEL |
str | BAAI/bge-small-en-v1.5 |
CART_EMBEDDING_MODEL |
EMBEDDING_DIMENSION |
int | 384 |
CART_EMBEDDING_DIMENSION |
EMBEDDING_BATCH_SIZE |
int | 32 |
CART_EMBEDDING_BATCH_SIZE |
5.5 LLM Settings¶
| Variable | Type | Default | Env Override |
|---|---|---|---|
LLM_PROVIDER |
str | anthropic |
CART_LLM_PROVIDER |
LLM_MODEL |
str | claude-sonnet-4-6 |
CART_LLM_MODEL |
ANTHROPIC_API_KEY |
str | None |
CART_ANTHROPIC_API_KEY |
5.6 RAG Search Settings¶
| Variable | Type | Default | Env Override |
|---|---|---|---|
TOP_K_PER_COLLECTION |
int | 5 |
CART_TOP_K_PER_COLLECTION |
SCORE_THRESHOLD |
float | 0.4 |
CART_SCORE_THRESHOLD |
Collection Search Weights (must sum to ~1.0):
| Collection | Weight |
|---|---|
| Literature | 0.20 |
| Trials | 0.16 |
| Constructs | 0.10 |
| Assays | 0.09 |
| Safety | 0.08 |
| Biomarkers | 0.08 |
| Manufacturing | 0.07 |
| Real-World | 0.07 |
| Regulatory | 0.06 |
| Sequences | 0.06 |
| Genomic | 0.04 |
5.7 API Settings¶
| Variable | Type | Default | Env Override |
|---|---|---|---|
API_HOST |
str | 0.0.0.0 |
CART_API_HOST |
API_PORT |
int | 8522 |
CART_API_PORT |
STREAMLIT_PORT |
int | 8521 |
CART_STREAMLIT_PORT |
CORS_ORIGINS |
str | http://localhost:8080,http://localhost:8521,http://localhost:8522 |
CART_CORS_ORIGINS |
MAX_REQUEST_SIZE_MB |
int | 10 |
CART_MAX_REQUEST_SIZE_MB |
5.8 Monitoring & Scheduling¶
| Variable | Type | Default | Env Override |
|---|---|---|---|
METRICS_ENABLED |
bool | True |
CART_METRICS_ENABLED |
INGEST_SCHEDULE_HOURS |
int | 168 (weekly) |
CART_INGEST_SCHEDULE_HOURS |
INGEST_ENABLED |
bool | False |
CART_INGEST_ENABLED |
5.9 Conversation & Citation¶
| Variable | Type | Default | Env Override |
|---|---|---|---|
MAX_CONVERSATION_CONTEXT |
int | 3 |
CART_MAX_CONVERSATION_CONTEXT |
CITATION_HIGH_THRESHOLD |
float | 0.75 |
CART_CITATION_HIGH_THRESHOLD |
CITATION_MEDIUM_THRESHOLD |
float | 0.60 |
CART_CITATION_MEDIUM_THRESHOLD |
5.10 PubMed & ClinicalTrials.gov¶
| Variable | Type | Default | Env Override |
|---|---|---|---|
NCBI_API_KEY |
str | None |
CART_NCBI_API_KEY |
PUBMED_MAX_RESULTS |
int | 5000 |
CART_PUBMED_MAX_RESULTS |
CT_GOV_BASE_URL |
str | https://clinicaltrials.gov/api/v2 |
CART_CT_GOV_BASE_URL |
6. Collection Setup and Seeding¶
6.1 Collections¶
The agent operates across 11 Milvus collections:
| Collection | Description | Schema Fields |
|---|---|---|
cart_literature |
Published research papers | title, abstract, target_antigen, year, journal |
cart_trials |
Clinical trial records | nct_id, title, phase, status, target_antigen |
cart_constructs |
CAR construct designs | name, target, costimulatory_domain, generation |
cart_assays |
Laboratory assay protocols | name, assay_type, target, sensitivity |
cart_manufacturing |
Manufacturing processes | process_name, scale, duration, yield |
cart_safety |
Safety/toxicity profiles | event_type, grade, management, product |
cart_biomarkers |
Predictive biomarkers | name, category, clinical_utility |
cart_regulatory |
Regulatory approvals | product, indication, agency, approval_date |
cart_sequences |
CAR/TCR sequences | sequence_type, target, framework |
cart_realworld |
Real-world evidence | study_type, endpoints, population |
genomic_evidence |
Shared genomic variants | (shared with genomics pipeline) |
All agent-owned collections use: - Embedding dimension: 384 (BGE-small-en-v1.5) - Index type: IVF_FLAT - Metric: COSINE - nlist: 128
6.2 Seed Data¶
The data/seed/ directory contains 13 JSON files with 649 curated records:
| File | Records | Description |
|---|---|---|
assay_seed_data.json |
75 | Laboratory assay protocols |
biomarker_seed_data.json |
60 | Predictive and prognostic biomarkers |
constructs_seed_data.json |
41 | CAR construct designs |
immunogenicity_biomarker_seed.json |
20 | Immunogenicity biomarkers |
immunogenicity_sequence_seed.json |
18 | Immunogenicity sequence data |
literature_seed_data.json |
60 | Published research papers |
manufacturing_seed_data.json |
56 | Manufacturing process records |
patent_seed_data.json |
45 | Patent records |
realworld_seed_data.json |
54 | Real-world evidence studies |
regulatory_seed_data.json |
40 | Regulatory approval records |
safety_seed_data.json |
71 | Safety and toxicity profiles |
sequence_seed_data.json |
40 | CAR/TCR sequence records |
trials_seed_data.json |
69 | Clinical trial records |
6.3 Manual Seeding¶
If the cart-setup container didn't run or you need to re-seed:
# Re-run the setup container
docker compose run --rm cart-setup
# Or seed individually
docker compose exec cart-api python scripts/seed_knowledge.py
docker compose exec cart-api python scripts/seed_assays.py
docker compose exec cart-api python scripts/seed_manufacturing.py
# ... etc.
7. Data Ingestion (PubMed, ClinicalTrials.gov)¶
The agent includes automated ingestion pipelines for external data sources.
7.1 PubMed Ingestion¶
# One-time ingest
docker compose exec cart-api python scripts/ingest_pubmed.py \
--query "CAR-T cell therapy" \
--max-results 5000
# With NCBI API key (higher rate limit)
CART_NCBI_API_KEY=your_key docker compose exec cart-api \
python scripts/ingest_pubmed.py --query "CAR-T cell therapy"
7.2 ClinicalTrials.gov Ingestion¶
docker compose exec cart-api python scripts/ingest_trials.py \
--condition "CAR-T" \
--status "RECRUITING,ACTIVE_NOT_RECRUITING"
7.3 Scheduled Ingestion¶
Enable automatic weekly ingestion:
The scheduler runs inside the API container and triggers ingestion pipelines at the configured interval.
8. Networking and Ports¶
8.1 Port Map¶
| Port | Service | Protocol | Purpose |
|---|---|---|---|
| 8521 | cart-streamlit | HTTP | Streamlit chat UI |
| 8522 | cart-api | HTTP | FastAPI REST API |
| 19530 | milvus-standalone | gRPC | Milvus client connections |
| 9091 | milvus-standalone | HTTP | Milvus health/metrics |
| 2379 | milvus-etcd | HTTP | etcd client (internal only) |
| 9000 | milvus-minio | HTTP | MinIO API (internal only) |
| 9001 | milvus-minio | HTTP | MinIO console (internal only) |
8.2 Network Configuration¶
All services communicate over the cart-network Docker bridge network.
Internal services (etcd, minio) are not exposed to the host by default.
To expose MinIO console for debugging:
8.3 CORS Configuration¶
CORS is restricted to 3 origins by default:
http://localhost:8080 # HCLS AI Factory Landing Page
http://localhost:8521 # CAR-T Streamlit UI
http://localhost:8522 # CAR-T API (self)
Override with CART_CORS_ORIGINS (comma-separated):
9. Storage and Persistence¶
9.1 Docker Volumes¶
| Volume | Mount Point | Purpose | Typical Size |
|---|---|---|---|
etcd_data |
/etcd |
Milvus metadata | ~100 MB |
minio_data |
/minio_data |
Milvus index/log data | ~2 GB |
milvus_data |
/var/lib/milvus |
Milvus WAL + data | ~5 GB |
9.2 Application Data¶
| Path | Purpose | Persistence |
|---|---|---|
data/seed/ |
Seed JSON files | Baked into image |
data/cache/ |
Embedding cache | Ephemeral (recreated on restart) |
data/reference/ |
Reference data | Baked into image |
9.3 Volume Backup¶
# Backup all Milvus volumes
docker compose stop
for vol in etcd_data minio_data milvus_data; do
docker run --rm -v cart_intelligence_agent_${vol}:/data \
-v $(pwd)/backups:/backup alpine \
tar czf /backup/${vol}_$(date +%Y%m%d).tar.gz -C /data .
done
docker compose start
10. Monitoring and Metrics¶
10.1 Prometheus Endpoint¶
The API server exposes Prometheus-compatible metrics at GET /metrics:
# HELP cart_api_requests_total Total API requests
# TYPE cart_api_requests_total counter
cart_api_requests_total 42
# HELP cart_api_query_requests_total Total /query requests
# TYPE cart_api_query_requests_total counter
cart_api_query_requests_total 15
# HELP cart_collection_vectors Number of vectors per collection
# TYPE cart_collection_vectors gauge
cart_collection_vectors{collection="cart_literature"} 1200
cart_collection_vectors{collection="cart_trials"} 850
10.2 Health Endpoint¶
10.3 Grafana Integration¶
On the DGX Spark, metrics flow to the central Prometheus instance and are visualized on the HCLS AI Factory Grafana dashboard.
Prometheus scrape config:
- job_name: 'cart-agent'
static_configs:
- targets: ['localhost:8522']
metrics_path: '/metrics'
scrape_interval: 30s
11. Security Hardening¶
11.1 Container Security¶
The Dockerfile implements security best practices:
- Multi-stage build — build dependencies excluded from runtime image
- Non-root user —
cartuserwith minimal permissions - Read-only filesystem — application code is not writable at runtime
- No shell access —
cartuserhas/bin/falseas shell
11.2 API Security¶
- CORS restrictions — limited to 3 explicit origins (not
["*"]) - Request size limits —
MAX_REQUEST_SIZE_MB=10prevents payload abuse - Input validation — Pydantic schemas validate all request bodies
- No API key in responses — API key is never reflected in output
11.3 Network Security¶
- Internal services (etcd, minio) are not exposed to the host
- Milvus gRPC port (19530) should be firewalled in production
- Use a reverse proxy (nginx, Traefik) for TLS termination
11.4 API Key Management¶
# Development: .env file
ANTHROPIC_API_KEY=sk-ant-...
# Production: Docker secrets or environment injection
docker compose run -e ANTHROPIC_API_KEY=$(vault read ...) cart-api
12. Health Checks and Troubleshooting¶
12.1 Service Health Checks¶
| Service | Health Check | Interval | Start Period |
|---|---|---|---|
| milvus-etcd | etcdctl endpoint health |
30s | -- |
| milvus-minio | curl http://localhost:9000/minio/health/live |
30s | -- |
| milvus-standalone | curl http://localhost:9091/healthz |
30s | 60s |
| cart-api | curl http://localhost:8522/health |
30s | 30s |
| cart-streamlit | curl http://localhost:8521/_stcore/health |
30s | 40s |
12.2 Common Issues¶
Milvus fails to start:
# Check etcd and minio are healthy first
docker compose ps
docker compose logs milvus-etcd
docker compose logs milvus-minio
# Common fix: remove stale lock files
docker compose down -v # WARNING: deletes all data
docker compose up -d
"Engine not initialized" on /health:
# The embedding model is still downloading (~90 MB on first run)
docker compose logs cart-api | grep -i "model"
# Wait 1–2 minutes for sentence-transformers to download
"LLM client not available":
# Check API key is set
docker compose exec cart-api env | grep ANTHROPIC
# Verify key is valid
curl -H "x-api-key: $ANTHROPIC_API_KEY" https://api.anthropic.com/v1/messages \
-d '{"model":"claude-sonnet-4-6","max_tokens":10,"messages":[{"role":"user","content":"hi"}]}'
Empty search results:
# Verify collections are seeded
curl http://localhost:8522/collections
# Re-run setup if needed
docker compose run --rm cart-setup
12.3 Log Inspection¶
# All services
docker compose logs -f --tail=100
# Specific service
docker compose logs -f cart-api
docker compose logs -f cart-streamlit
# Setup progress
docker compose logs -f cart-setup
13. Backup and Recovery¶
13.1 Milvus Data Backup¶
# Stop writes (optional, for consistency)
docker compose pause cart-api cart-streamlit
# Backup volumes
docker run --rm \
-v cart_intelligence_agent_milvus_data:/source:ro \
-v $(pwd)/backups:/backup \
alpine tar czf /backup/milvus_$(date +%Y%m%d).tar.gz -C /source .
docker compose unpause cart-api cart-streamlit
13.2 Full Backup¶
# Backup everything: volumes + seed data + config
tar czf cart_backup_$(date +%Y%m%d).tar.gz \
data/seed/ config/ .env docker-compose.yml
13.3 Restore¶
# Restore Milvus volume
docker compose down
docker volume rm cart_intelligence_agent_milvus_data
docker volume create cart_intelligence_agent_milvus_data
docker run --rm \
-v cart_intelligence_agent_milvus_data:/target \
-v $(pwd)/backups:/backup \
alpine tar xzf /backup/milvus_YYYYMMDD.tar.gz -C /target
docker compose up -d
14. Scaling Considerations¶
14.1 Vertical Scaling¶
On the DGX Spark (128 GB unified memory), the agent can handle: - Up to ~500K vectors per collection without performance degradation - Concurrent query load of ~50 requests/second on the API - Embedding batch sizes up to 256 for bulk ingestion
14.2 Horizontal Scaling¶
For higher throughput:
# Scale API workers
cart-api:
command:
- uvicorn
- api.main:app
- --host=0.0.0.0
- --port=8522
- --workers=4 # Increase from 2
14.3 Milvus Scaling¶
For production workloads exceeding single-node Milvus capacity:
- Migrate from milvus-standalone to Milvus distributed mode
- Use external etcd cluster and S3-compatible object storage
- Refer to Milvus documentation for cluster deployment
15. Integration with HCLS AI Factory¶
15.1 Landing Page Registration¶
The CAR-T agent registers with the HCLS AI Factory landing page (:8080)
for service health monitoring:
| Endpoint | Monitored By |
|---|---|
http://localhost:8522/health |
health-monitor.sh |
http://localhost:8521/_stcore/health |
health-monitor.sh |
15.2 Cross-Agent Event Bus¶
The agent publishes events via hcls_common.event_bus:
from hcls_common.event_bus import publish_event, EventType, PipelineStage
publish_event(
EventType.CART_MANUFACTURING_READY,
source_stage=PipelineStage.CART_ANALYSIS,
payload={
"target_antigen": "CD19",
"evidence_count": 42,
"mode": "deep_research",
},
)
15.3 Shared Collections¶
The genomic_evidence collection is shared with the genomics pipeline
(Stage 1). The CAR-T agent reads from this collection but does not write to it.
15.4 Master Docker Compose¶
In production, the CAR-T services are defined in the master
docker-compose.dgx-spark.yml alongside all other HCLS AI Factory services.
16. Updating and Maintenance¶
16.1 Code Updates¶
# Pull latest code
git pull origin main
# Rebuild containers
docker compose build --no-cache
docker compose up -d
16.2 Dependency Updates¶
# Update requirements
pip install --upgrade -r requirements.txt
# Rebuild Docker image
docker compose build cart-api cart-streamlit
docker compose up -d cart-api cart-streamlit
16.3 Knowledge Updates¶
To refresh the knowledge graph and seed data:
# Re-seed with latest data
docker compose run --rm cart-setup
# Or update specific collections
docker compose exec cart-api python scripts/seed_knowledge.py
16.4 Milvus Updates¶
# Backup first
./scripts/backup_milvus.sh
# Update Milvus image
docker compose pull milvus-standalone
docker compose up -d milvus-standalone
Appendix A: Complete docker-compose.yml¶
The complete docker-compose.yml is included at the project root. Key
parameters:
version: "3.8"
services:
milvus-etcd:
image: quay.io/coreos/etcd:v3.5.5
# Health-checked, 4 GB backend quota
milvus-minio:
image: minio/minio:RELEASE.2023-03-20T20-16-18Z
# Health-checked, default credentials
milvus-standalone:
image: milvusdb/milvus:v2.4-latest
ports: ["19530:19530", "9091:9091"]
# Depends on etcd + minio healthy, 60s start period
cart-streamlit:
build: .
ports: ["8521:8521"]
# Default CMD: streamlit run app/cart_ui.py
cart-api:
build: .
ports: ["8522:8522"]
# CMD override: uvicorn api.main:app
cart-setup:
build: .
restart: "no"
# One-shot: create collections + seed all data
volumes:
etcd_data:
minio_data:
milvus_data:
networks:
cart-network:
driver: bridge
Appendix B: Environment Variable Quick Reference¶
All variables use the CART_ prefix for Pydantic settings injection.
| Variable | Required | Default | Description |
|---|---|---|---|
ANTHROPIC_API_KEY |
Yes* | -- | Claude API key (also accepts CART_ANTHROPIC_API_KEY) |
CART_MILVUS_HOST |
No | localhost |
Milvus server hostname |
CART_MILVUS_PORT |
No | 19530 |
Milvus server port |
CART_API_PORT |
No | 8522 |
FastAPI server port |
CART_STREAMLIT_PORT |
No | 8521 |
Streamlit UI port |
CART_LLM_MODEL |
No | claude-sonnet-4-6 |
Claude model ID |
CART_EMBEDDING_MODEL |
No | BAAI/bge-small-en-v1.5 |
Sentence embedding model |
CART_TOP_K_PER_COLLECTION |
No | 5 |
Results per collection |
CART_SCORE_THRESHOLD |
No | 0.4 |
Minimum similarity score |
CART_CORS_ORIGINS |
No | localhost:8080,8521,8522 |
Allowed CORS origins |
CART_METRICS_ENABLED |
No | True |
Enable Prometheus metrics |
CART_INGEST_ENABLED |
No | False |
Enable scheduled ingestion |
CART_INGEST_SCHEDULE_HOURS |
No | 168 |
Ingestion interval (hours) |
CART_MAX_REQUEST_SIZE_MB |
No | 10 |
Max request body size |
CART_RAG_PIPELINE_ROOT |
No | /app/rag-chat-pipeline |
Path to RAG pipeline |
CART_NCBI_API_KEY |
No | -- | NCBI API key (PubMed) |
HCLS_LIB_PATH |
No | /app/lib |
Path to hcls_common library |
* Required for LLM-powered queries. Embedding-only search works without it.
Author: Adam Jones License: Apache 2.0 Repository: github.com/ajones1923/hcls-ai-factory