Precision Oncology Intelligence Agent -- Deployment Guide¶

HCLS AI Factory / ai_agent_adds / precision_oncology_agent

Version 1.0.0 | March 2026 | Author: Adam Jones

Table of Contents¶

Overview
Prerequisites
Quick Start
Deployment Modes
4a. Docker Lite (Milvus + API Only)
4b. Docker Full Stack (All 6 Services)
4c. DGX Spark Production
4d. Development Mode (Local Python)
Configuration Reference
Collection Setup and Seeding
Data Ingestion (CIViC, PubMed, ClinicalTrials.gov)
Networking and Ports
Storage and Persistence
Monitoring and Metrics
Security Hardening
Health Checks and Troubleshooting
Backup and Recovery
Scaling Considerations
Integration with HCLS AI Factory
Updating and Maintenance
Appendix A: Complete docker-compose.yml
Appendix B: Environment Variable Quick Reference

1. Overview¶

The Precision Oncology Intelligence Agent is a RAG-powered clinical decision support system designed for molecular tumor boards (MTBs). It combines a multi-collection Milvus vector database, BGE-small-en-v1.5 sentence embeddings, and Claude LLM reasoning to deliver evidence-based therapy recommendations, clinical trial matching, and resistance mechanism analysis.

Core Capabilities¶

Multi-collection RAG search across 11 knowledge domains (variants, literature, therapies, trials, biomarkers, pathways, guidelines, resistance mechanisms, outcomes, patient cases, and genomic evidence).
Clinical trial matching with composite scoring (biomarker 40%, semantic 25%, phase 20%, status 15%).
Therapy ranking with evidence-level weighting and resistance awareness.
Cross-modal analysis linking genomic, imaging, and clinical data.
FHIR-compatible case management and export.
PDF report generation for tumor board presentations.
Prometheus metrics for operational monitoring.

Architecture at a Glance¶

The agent runs as 6 Docker services on a bridge network (onco-network):

                    onco-network (bridge)
    +--------------------------------------------------+
    |                                                  |
    |  milvus-etcd -----> milvus-standalone <-----+    |
    |  milvus-minio ---/    :19530 :9091          |    |
    |                                             |    |
    |  onco-api (:8527) --------------------------+    |
    |  onco-streamlit (:8526) --------------------+    |
    |  onco-setup (one-shot) ---------------------+    |
    |                                                  |
    +--------------------------------------------------+

Service	Container Name	Purpose
`milvus-etcd`	`onco-milvus-etcd`	Metadata key-value store for Milvus
`milvus-minio`	`onco-milvus-minio`	Object storage for Milvus log/index data
`milvus-standalone`	`onco-milvus-standalone`	Vector database (Milvus 2.4)
`onco-streamlit`	`onco-streamlit`	Streamlit clinical UI
`onco-api`	`onco-api`	FastAPI REST server
`onco-setup`	`onco-setup`	One-shot collection creation + data seeding

Software Components¶

The agent codebase is organized as follows:

agent/
  api/                  # FastAPI application (main.py + routes/)
    routes/             # Endpoint modules: meta_agent, cases, trials, reports, events
  app/                  # Streamlit UI (oncology_ui.py)
  config/               # Pydantic settings (settings.py)
  data/reference/       # 10 seed JSON files (~768 KB total)
  scripts/              # Setup, seed, ingest, and validation scripts
  src/                  # Core modules
    agent.py            # OncoIntelligenceAgent orchestrator
    case_manager.py     # FHIR-compatible case management
    collections.py      # 11 Milvus collection schemas + OncoCollectionManager
    cross_modal.py      # Cross-modal analysis trigger
    export.py           # PDF report generation
    knowledge.py        # In-memory knowledge graph (targets, therapies, resistance)
    metrics.py          # Prometheus metric definitions
    models.py           # Pydantic data models
    query_expansion.py  # Query rewriting and expansion
    rag_engine.py       # Multi-collection RAG search engine
    scheduler.py        # APScheduler-based periodic ingestion
    therapy_ranker.py   # Evidence-weighted therapy ranking
    trial_matcher.py    # Clinical trial matching with composite scoring
    ingest/             # Data source parsers (CIViC, OncoKB, PubMed, guidelines, etc.)
    utils/              # Utilities (pubmed_client.py)
    workflows/          # Multi-step agent workflows
  tests/                # Unit tests (6 test modules)
  Dockerfile            # Multi-stage build (Python 3.10-slim)
  docker-compose.yml    # All 6 services
  requirements.txt      # 27 dependency groups (~57 resolved packages)

2. Prerequisites¶

2.1 Hardware Requirements¶

Tier	RAM	CPU	Disk	GPU	Use Case
Minimum	16 GB	4 cores	50 GB	None	Development, demos
Recommended	32 GB	8 cores	100 GB	NVIDIA GPU (8+ GB VRAM)	Staging, small production
Production	128 GB	20-core Grace CPU	500 GB NVMe	RTX PRO 6000	NVIDIA DGX Spark, full pipeline

Notes: - Milvus standalone requires approximately 4-8 GB RAM depending on index size. - The BGE-small-en-v1.5 embedding model loads approximately 130 MB into memory. - GPU is optional but significantly accelerates embedding generation during bulk seeding and live ingest operations. - Disk requirements increase with ingested data volume. The base seed data is approximately 768 KB; a production deployment with full PubMed and ClinicalTrials.gov ingest may reach 10-50 GB in Milvus storage.

2.2 Software Requirements¶

Component	Minimum Version	Purpose
Docker Engine	24.0+	Container runtime
Docker Compose	2.20+	Multi-service orchestration
Python	3.10+	Local development (if not using Docker)
Git	2.30+	Source code management
curl	7.0+	Health checks and API testing

Verify Docker installation:

docker --version          # Docker version 24.0+
docker compose version    # Docker Compose version v2.20+

2.3 API Keys¶

Key	Required	Source	Purpose
`ANTHROPIC_API_KEY`	Yes (for LLM features)	console.anthropic.com	Claude LLM for RAG answer generation
`NCBI_API_KEY`	No (recommended)	ncbi.nlm.nih.gov/account	PubMed E-utilities (higher rate limits)

Without ANTHROPIC_API_KEY: Vector search, trial matching, and therapy ranking still function. Only the LLM-generated narrative answers (the /query endpoint) will fail.

Without NCBI_API_KEY: PubMed ingest works but is rate-limited to 3 requests/second instead of 10 requests/second.

2.4 Network Requirements¶

Outbound HTTPS (port 443) access to:
api.anthropic.com (Claude API)
eutils.ncbi.nlm.nih.gov (PubMed)
clinicaltrials.gov (ClinicalTrials.gov API v2)
civicdb.org (CIViC variant database)
huggingface.co (initial model download for BGE-small-en-v1.5)
No inbound ports are required beyond the service ports listed in Section 8.

3. Quick Start¶

Five commands to get the agent running with Docker Compose:

# 1. Clone or navigate to the agent directory
cd ai_agent_adds/precision_oncology_agent/agent

# 2. Create your environment file
cp .env.example .env
# Edit .env and set ANTHROPIC_API_KEY=sk-ant-...

# 3. Build the application images
docker compose build

# 4. Start all 6 services
docker compose up -d

# 5. Watch the setup/seed progress
docker compose logs -f onco-setup

If .env.example does not exist, create .env manually:

cat > .env << 'EOF'
ANTHROPIC_API_KEY=sk-ant-your-key-here
EOF

Verify the Deployment¶

Once onco-setup exits with code 0 (typically 2-5 minutes), verify:

# Check all services are running
docker compose ps

# Milvus health
curl -s http://localhost:9091/healthz
# Expected: {"status":"OK"}

# API health
curl -s http://localhost:8527/health | python3 -m json.tool
# Expected: {"status": "healthy", "collections": {...}, ...}

# Open the Streamlit UI
open http://localhost:8526    # macOS
xdg-open http://localhost:8526  # Linux

Quick Test Query¶

curl -s -X POST http://localhost:8527/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What targeted therapies are available for EGFR L858R in NSCLC?", "top_k": 5}' \
  | python3 -m json.tool

4. Deployment Modes¶

4a. Docker Lite (Milvus + API Only)¶

This mode runs only the vector database infrastructure and the FastAPI server, omitting the Streamlit UI. Suitable for headless API-only deployments or when the UI is served separately.

Start Lite mode:

docker compose up -d milvus-etcd milvus-minio milvus-standalone onco-api onco-setup

Resource footprint: - RAM: ~8-12 GB - CPU: 2-4 cores - Disk: 20 GB minimum

Access points: - API: http://localhost:8527 - API docs (Swagger): http://localhost:8527/docs - Milvus gRPC: localhost:19530 - Milvus metrics: http://localhost:9091/healthz

When to use: - Backend-only deployments where a custom frontend connects via the REST API. - CI/CD pipelines that run integration tests against the API. - Resource-constrained environments that cannot spare memory for Streamlit.

4b. Docker Full Stack (All 6 Services)¶

This is the default mode as described in the Quick Start. All six services run together on the onco-network bridge.

Start Full Stack:

docker compose up -d

Service startup order (enforced by depends_on with health checks):

milvus-etcd starts first (healthcheck: etcdctl endpoint health)
milvus-minio starts in parallel with etcd (healthcheck: MinIO /minio/health/live)
milvus-standalone waits for both etcd and MinIO to be healthy
onco-streamlit, onco-api, and onco-setup all wait for Milvus to be healthy
onco-setup runs once (creates collections, seeds data, then exits with code 0)

Resource footprint: - RAM: ~12-16 GB - CPU: 4 cores minimum - Disk: 30-50 GB

Access points: - Streamlit UI: http://localhost:8526 - API: http://localhost:8527 - API docs (Swagger): http://localhost:8527/docs - MinIO Console: http://localhost:9001 (credentials: minioadmin / minioadmin) - Milvus gRPC: localhost:19530 - Milvus metrics/health: http://localhost:9091/healthz - etcd: localhost:2379

4c. DGX Spark Production¶

For deployment on an NVIDIA DGX Spark (128 GB RAM, 20-core Grace CPU, RTX PRO 6000), the agent integrates with the broader HCLS AI Factory pipeline.

Production Environment File¶

Create /etc/onco-agent/.env:

# === Required ===
ANTHROPIC_API_KEY=sk-ant-your-production-key

# === Milvus (use dedicated host if available) ===
ONCO_MILVUS_HOST=milvus-standalone
ONCO_MILVUS_PORT=19530

# === Embedding ===
ONCO_EMBEDDING_MODEL=BAAI/bge-small-en-v1.5
ONCO_EMBEDDING_DIM=384

# === LLM ===
ONCO_LLM_MODEL=claude-sonnet-4-6

# === API ===
ONCO_API_PORT=8527
ONCO_API_BASE_URL=http://onco-api:8527
ONCO_CORS_ORIGINS=http://localhost:8080,http://localhost:8526,http://localhost:8527

# === Security ===
ONCO_MAX_REQUEST_SIZE_MB=10

# === Monitoring ===
ONCO_METRICS_ENABLED=true

# === PubMed (recommended for production ingest) ===
NCBI_API_KEY=your-ncbi-key

# === Scheduler (weekly refresh) ===
ONCO_SCHEDULER_INTERVAL=168h

Production docker-compose Override¶

Create docker-compose.prod.yml to set resource limits and increase workers:

milvus-standalone: memory limit 16G, reservation 8G, log level warn
onco-api: memory limit 8G, reservation 4G, --workers=4
onco-streamlit: memory limit 4G, reservation 2G

docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d

DGX Spark Systemd Service¶

For automatic startup on boot, create a systemd unit at /etc/systemd/system/onco-agent.service with Type=oneshot, RemainAfterExit=yes, After=docker.service, and Requires=docker.service. Set ExecStart to docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d and ExecStop to docker compose down. Then:

sudo systemctl daemon-reload && sudo systemctl enable --now onco-agent

4d. Development Mode (Local Python)¶

Run the agent directly with Python for rapid iteration and debugging.

Step 1: Set Up Python Environment¶

cd ai_agent_adds/precision_oncology_agent/agent

# Create virtual environment
python3.10 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

Step 2: Start Milvus (Docker Required)¶

Even in development mode, Milvus still runs in Docker:

docker compose up -d milvus-etcd milvus-minio milvus-standalone

Wait for Milvus to be healthy:

# Poll until healthy
until curl -sf http://localhost:9091/healthz > /dev/null; do
  echo "Waiting for Milvus..."
  sleep 5
done
echo "Milvus is ready."

Step 3: Create Collections and Seed Data¶

# Set environment variables
export ONCO_MILVUS_HOST=localhost
export ONCO_MILVUS_PORT=19530
export ANTHROPIC_API_KEY=sk-ant-your-key

# Create all 11 collections and seed with reference data
python scripts/setup_collections.py --drop-existing --seed

Alternatively, run the setup and individual seed scripts separately:

# Create collections only
python scripts/setup_collections.py --drop-existing

# Seed individually
python scripts/seed_variants.py
python scripts/seed_literature.py
python scripts/seed_trials.py
python scripts/seed_therapies.py
python scripts/seed_biomarkers.py
python scripts/seed_pathways.py
python scripts/seed_guidelines.py
python scripts/seed_resistance.py
python scripts/seed_outcomes.py
python scripts/seed_cases.py
python scripts/seed_knowledge.py

Step 4: Run the API Server¶

# Option A: Direct uvicorn
uvicorn api.main:app --host 0.0.0.0 --port 8527 --reload

# Option B: Python module
python -m api.main

Step 5: Run the Streamlit UI¶

In a separate terminal:

source .venv/bin/activate
export ONCO_MILVUS_HOST=localhost
export ONCO_API_BASE_URL=http://localhost:8527
streamlit run app/oncology_ui.py --server.port 8526

Step 6: Run Tests¶

# All tests
pytest tests/ -v

# Individual test modules
pytest tests/test_collections.py -v
pytest tests/test_agent.py -v
pytest tests/test_case_manager.py -v
pytest tests/test_trial_matcher.py -v
pytest tests/test_therapy_ranker.py -v
pytest tests/test_knowledge.py -v

5. Configuration Reference¶

All configuration is managed through config/settings.py using Pydantic BaseSettings. Every setting can be overridden via environment variables with the ONCO_ prefix (except ANTHROPIC_API_KEY and NCBI_API_KEY, which use their standard names).

5.1 Connection Settings¶

Variable	Default	Description
`ONCO_MILVUS_HOST`	`localhost`	Milvus server hostname. Set to `milvus-standalone` in Docker.
`ONCO_MILVUS_PORT`	`19530`	Milvus gRPC port.
`ONCO_API_HOST`	`0.0.0.0`	FastAPI bind address.
`ONCO_API_PORT`	`8527`	FastAPI listen port.
`ONCO_STREAMLIT_PORT`	`8526`	Streamlit server port.
`ONCO_API_BASE_URL`	`http://localhost:8527`	Base URL for API calls from the Streamlit UI.

5.2 Embedding Settings¶

Variable	Default	Description
`ONCO_EMBEDDING_MODEL`	`BAAI/bge-small-en-v1.5`	HuggingFace model for sentence embeddings.
`ONCO_EMBEDDING_DIM`	`384`	Embedding vector dimension. Must match the model output.
`ONCO_EMBEDDING_BATCH_SIZE`	`32`	Batch size for embedding generation during ingest.

5.3 LLM Settings¶

Variable	Default	Description
`ANTHROPIC_API_KEY`	(none)	Required for LLM features. Anthropic API key.
`ONCO_LLM_PROVIDER`	`anthropic`	LLM provider identifier.
`ONCO_LLM_MODEL`	`claude-sonnet-4-6`	Claude model to use for RAG answer generation.

5.4 RAG Search Settings¶

Variable	Default	Description
`ONCO_TOP_K`	`5`	Number of top results to retrieve per collection.
`ONCO_SCORE_THRESHOLD`	`0.4`	Minimum cosine similarity score for result inclusion.
`ONCO_MIN_SUFFICIENT_HITS`	`3`	Minimum hits before the agent considers evidence sufficient.
`ONCO_MIN_COLLECTIONS_FOR_SUFFICIENT`	`2`	Minimum collections with hits for sufficient evidence.
`ONCO_MIN_SIMILARITY_SCORE`	`0.30`	Absolute minimum similarity for any hit.

5.5 Collection Weight Settings¶

These weights control the relative importance of each collection in the multi-collection RAG search. They should sum to approximately 1.0.

Variable	Default	Collection
`ONCO_WEIGHT_VARIANTS`	`0.18`	`onco_variants`
`ONCO_WEIGHT_LITERATURE`	`0.16`	`onco_literature`
`ONCO_WEIGHT_THERAPIES`	`0.14`	`onco_therapies`
`ONCO_WEIGHT_GUIDELINES`	`0.12`	`onco_guidelines`
`ONCO_WEIGHT_TRIALS`	`0.10`	`onco_trials`
`ONCO_WEIGHT_BIOMARKERS`	`0.08`	`onco_biomarkers`
`ONCO_WEIGHT_RESISTANCE`	`0.07`	`onco_resistance`
`ONCO_WEIGHT_PATHWAYS`	`0.06`	`onco_pathways`
`ONCO_WEIGHT_OUTCOMES`	`0.04`	`onco_outcomes`
`ONCO_WEIGHT_CASES`	`0.02`	`onco_cases`
`ONCO_WEIGHT_GENOMIC`	`0.03`	`genomic_evidence`

5.6 Trial Matching Weights¶

These weights control the composite scoring for clinical trial matching.

Variable	Default	Description
`ONCO_TRIAL_WEIGHT_BIOMARKER`	`0.40`	Weight for biomarker match score.
`ONCO_TRIAL_WEIGHT_SEMANTIC`	`0.25`	Weight for semantic similarity score.
`ONCO_TRIAL_WEIGHT_PHASE`	`0.20`	Weight for trial phase preference (Phase 3 > Phase 1).
`ONCO_TRIAL_WEIGHT_STATUS`	`0.15`	Weight for trial recruitment status.

5.7 Citation Thresholds¶

Variable	Default	Description
`ONCO_CITATION_STRONG_THRESHOLD`	`0.75`	Similarity score above which a citation is labeled "strong evidence."
`ONCO_CITATION_MODERATE_THRESHOLD`	`0.60`	Similarity score above which a citation is labeled "moderate evidence."

Variable	Default	Description
`ONCO_CROSS_MODAL_ENABLED`	`true`	Enable cross-modal analysis (genomic + imaging linking).
`ONCO_CROSS_MODAL_THRESHOLD`	`0.40`	Minimum score to trigger cross-modal expansion.
`ONCO_GENOMIC_TOP_K`	`5`	Number of genomic evidence results to retrieve.
`ONCO_IMAGING_TOP_K`	`5`	Number of imaging results to retrieve.

5.9 External API Settings¶

Variable	Default	Description
`NCBI_API_KEY`	(none)	NCBI E-utilities API key for PubMed ingest. Optional but recommended.
`ONCO_PUBMED_MAX_RESULTS`	`5000`	Maximum PubMed articles to fetch per ingest run.
`ONCO_CT_GOV_BASE_URL`	`https://clinicaltrials.gov/api/v2`	ClinicalTrials.gov API v2 base URL.
`ONCO_CIVIC_BASE_URL`	`https://civicdb.org/api`	CIViC database API base URL.

5.10 Operational Settings¶

Variable	Default	Description
`ONCO_METRICS_ENABLED`	`true`	Enable Prometheus metrics collection.
`ONCO_SCHEDULER_INTERVAL`	`168h`	Interval for scheduled data refresh (default: 1 week).
`ONCO_CORS_ORIGINS`	`http://localhost:8080,http://localhost:8526,http://localhost:8527`	Comma-separated allowed CORS origins.
`ONCO_MAX_REQUEST_SIZE_MB`	`10`	Maximum HTTP request body size in megabytes.
`ONCO_CONVERSATION_MEMORY_DEPTH`	`3`	Number of conversation turns to retain for context.

5.11 PDF Report Branding¶

Variable	Default	Description
`ONCO_PDF_BRAND_COLOR_R`	`118`	Brand color red component (0-255).
`ONCO_PDF_BRAND_COLOR_G`	`185`	Brand color green component (0-255).
`ONCO_PDF_BRAND_COLOR_B`	`0`	Brand color blue component (0-255).

5.12 Collection Names¶

These are rarely changed but can be overridden if you need custom naming:

Variable	Default
`ONCO_COLLECTION_LITERATURE`	`onco_literature`
`ONCO_COLLECTION_TRIALS`	`onco_trials`
`ONCO_COLLECTION_VARIANTS`	`onco_variants`
`ONCO_COLLECTION_BIOMARKERS`	`onco_biomarkers`
`ONCO_COLLECTION_THERAPIES`	`onco_therapies`
`ONCO_COLLECTION_PATHWAYS`	`onco_pathways`
`ONCO_COLLECTION_GUIDELINES`	`onco_guidelines`
`ONCO_COLLECTION_RESISTANCE`	`onco_resistance`
`ONCO_COLLECTION_OUTCOMES`	`onco_outcomes`
`ONCO_COLLECTION_CASES`	`onco_cases`
`ONCO_COLLECTION_GENOMIC`	`genomic_evidence`

6. Collection Setup and Seeding¶

6.1 Collection Schemas¶

The agent manages 11 Milvus collections. Each uses IVF_FLAT indexing with COSINE similarity on 384-dimensional BGE-small-en-v1.5 embeddings.

#	Collection	Content	Key Fields
1	`onco_variants`	Actionable somatic/germline variants (CIViC, OncoKB)	gene, variant, cancer_type, evidence_level
2	`onco_literature`	PubMed/PMC/preprint chunks by cancer type	title, text_chunk, source_type, year, journal
3	`onco_therapies`	Approved and investigational therapies with MOA	drug_name, mechanism, cancer_type, approval_status
4	`onco_guidelines`	NCCN/ASCO/ESMO guideline recommendations	guideline_body, recommendation, cancer_type
5	`onco_trials`	ClinicalTrials.gov summaries with biomarker criteria	nct_id, title, phase, status, biomarkers
6	`onco_biomarkers`	Predictive/prognostic biomarkers and assays	biomarker_name, test_type, cancer_type
7	`onco_resistance`	Resistance mechanisms and bypass strategies	mechanism, gene, drug, bypass_strategy
8	`onco_pathways`	Signaling pathways, cross-talk, druggable nodes	pathway_name, genes, druggable_nodes
9	`onco_outcomes`	Real-world treatment outcome records	treatment, response, duration, cancer_type
10	`onco_cases`	De-identified patient case snapshots	diagnosis, mutations, treatment_history
11	`genomic_evidence`	Read-only VCF-derived evidence from Stage 1 pipeline	gene, variant, consequence, impact

6.2 Setup Script¶

The scripts/setup_collections.py script creates all collection schemas and optionally seeds them with reference data.

# Create collections only (preserves existing data)
python scripts/setup_collections.py

# Drop and recreate all collections
python scripts/setup_collections.py --drop-existing

# Drop, recreate, and seed with reference data
python scripts/setup_collections.py --drop-existing --seed

# Connect to a specific Milvus host
python scripts/setup_collections.py --host milvus-standalone --port 19530 --drop-existing --seed

6.3 Seed Scripts¶

There are 11 seed scripts, each populating a specific collection from JSON files in data/reference/:

Script	Source File	Collection
`seed_variants.py`	`variant_seed_data.json`	`onco_variants`
`seed_literature.py`	`literature_seed_data.json`	`onco_literature`
`seed_therapies.py`	`therapy_seed_data.json`	`onco_therapies`
`seed_guidelines.py`	`guideline_seed_data.json`	`onco_guidelines`
`seed_trials.py`	`trial_seed_data.json`	`onco_trials`
`seed_biomarkers.py`	`biomarker_seed_data.json`	`onco_biomarkers`
`seed_resistance.py`	`resistance_seed_data.json`	`onco_resistance`
`seed_pathways.py`	`pathway_seed_data.json`	`onco_pathways`
`seed_outcomes.py`	`outcome_seed_data.json`	`onco_outcomes`
`seed_cases.py`	`cases_seed_data.json`	`onco_cases`
`seed_knowledge.py`	(in-memory maps)	Loads knowledge graph into module memory

Seed data files are located in data/reference/ and total approximately 768 KB (10 JSON files).

6.4 Docker-Based Seeding¶

In Docker Compose, the onco-setup service runs seeding automatically:

# Watch seed progress
docker compose logs -f onco-setup

# Re-run seeding (drop collections first)
docker compose rm -f onco-setup
docker compose up onco-setup

The onco-setup container executes this sequence: 1. setup_collections.py --drop-existing (create all 11 schemas) 2. seed_variants.py through seed_outcomes.py (9 seed scripts in order)

After completion, the container exits with code 0 (restart policy: "no").

6.5 Verifying Seed Status¶

# Via API
curl -s http://localhost:8527/collections | python3 -m json.tool

# Via API health endpoint
curl -s http://localhost:8527/health | python3 -m json.tool

# Expected output includes collection counts:
# {
#   "status": "healthy",
#   "collections": {
#     "onco_variants": <count>,
#     "onco_literature": <count>,
#     "onco_therapies": <count>,
#     ...
#   },
#   "total_vectors": <total>,
#   ...
# }

7. Data Ingestion (CIViC, PubMed, ClinicalTrials.gov)¶

Beyond the static seed data, the agent includes 3 live ingest scripts that fetch and parse data from external APIs.

7.1 CIViC Variant Ingest¶

Fetches actionable variants and evidence from the CIViC (Clinical Interpretation of Variants in Cancer) database.

python scripts/ingest_civic.py

Configuration: - ONCO_CIVIC_BASE_URL (default: https://civicdb.org/api) - Uses src/ingest/civic_parser.py to parse CIViC API responses. - Populates onco_variants collection with clinical evidence items.

What it fetches: - Gene-variant-disease associations - Evidence levels (A through E) - Drug sensitivity/resistance annotations - Supporting publications

7.2 PubMed Literature Ingest¶

Fetches recent oncology literature from PubMed via the NCBI E-utilities API.

# Without API key (3 req/sec rate limit)
python scripts/ingest_pubmed.py

# With API key (10 req/sec rate limit)
NCBI_API_KEY=your-key python scripts/ingest_pubmed.py

Configuration: - NCBI_API_KEY (optional, recommended for higher rate limits) - ONCO_PUBMED_MAX_RESULTS (default: 5000) - Uses src/ingest/literature_parser.py and src/utils/pubmed_client.py. - Populates onco_literature collection.

What it fetches: - Abstracts and metadata from recent oncology publications - Chunks text for embedding and indexes by cancer type, gene, and variant

7.3 ClinicalTrials.gov Ingest¶

Fetches active oncology clinical trials from the ClinicalTrials.gov API v2.

python scripts/ingest_clinical_trials.py

Configuration: - ONCO_CT_GOV_BASE_URL (default: https://clinicaltrials.gov/api/v2) - Uses src/ingest/clinical_trials_parser.py. - Populates onco_trials collection.

What it fetches: - Trial summaries, eligibility criteria, and biomarker requirements - Phase, status, and intervention details - Geographic location data

7.4 Scheduled Ingest¶

The agent includes an APScheduler-based scheduler (src/scheduler.py) that can periodically run ingest tasks.

Configuration: - ONCO_SCHEDULER_INTERVAL (default: 168h = 1 week)

To enable scheduled ingest in the API server, the scheduler is initialized during the FastAPI lifespan and runs ingest jobs at the configured interval.

7.5 Custom Ingest Scripts¶

The src/ingest/ directory contains parser modules that can be used to build custom ingest pipelines:

Parser Module	Purpose
`base.py`	Base ingest class with common embedding and insertion logic
`civic_parser.py`	CIViC variant/evidence parsing
`clinical_trials_parser.py`	ClinicalTrials.gov API v2 parsing
`literature_parser.py`	PubMed abstract chunking and parsing
`guideline_parser.py`	NCCN/ASCO/ESMO guideline parsing
`oncokb_parser.py`	OncoKB variant annotation parsing
`outcome_parser.py`	Treatment outcome record parsing
`pathway_parser.py`	Signaling pathway parsing
`resistance_parser.py`	Resistance mechanism parsing

8. Networking and Ports¶

8.1 Port Map¶

Port	Protocol	Service	Container	Description
8526	HTTP	`onco-streamlit`	`onco-streamlit`	Streamlit clinical UI
8527	HTTP	`onco-api`	`onco-api`	FastAPI REST API + Swagger docs
19530	gRPC	`milvus-standalone`	`onco-milvus-standalone`	Milvus vector database client port
9091	HTTP	`milvus-standalone`	`onco-milvus-standalone`	Milvus metrics and health endpoint
9000	HTTP	`milvus-minio`	`onco-milvus-minio`	MinIO S3-compatible API
9001	HTTP	`milvus-minio`	`onco-milvus-minio`	MinIO web console
2379	HTTP	`milvus-etcd`	`onco-milvus-etcd`	etcd client endpoint (internal)

8.2 Docker Network¶

All services communicate over the onco-network Docker bridge network. Container-to-container communication uses service names as hostnames:

onco-api --> milvus-standalone:19530  (Milvus gRPC)
onco-streamlit --> milvus-standalone:19530  (Milvus gRPC)
milvus-standalone --> milvus-etcd:2379  (metadata)
milvus-standalone --> milvus-minio:9000  (object storage)

8.3 Exposing Ports¶

By default, only the following ports are mapped to the host:

8526 (Streamlit UI)
8527 (FastAPI API)
19530 (Milvus gRPC)
9091 (Milvus metrics)

MinIO ports (9000, 9001) and etcd (2379) are not exposed to the host by default. To expose them, add port mappings in a docker-compose.override.yml:

version: "3.8"
services:
  milvus-minio:
    ports:
      - "9000:9000"
      - "9001:9001"
  milvus-etcd:
    ports:
      - "2379:2379"

8.4 Changing Ports¶

Set ONCO_API_PORT and ONCO_STREAMLIT_PORT environment variables. You must also update port mappings in docker-compose.yml or use an override file.

8.5 Firewall Rules (Production)¶

Allow ports 8526 and 8527; block direct Milvus access (19530, 9091) from external networks. Example: sudo ufw allow 8526/tcp && sudo ufw allow 8527/tcp.

9. Storage and Persistence¶

9.1 Docker Volumes¶

The deployment uses three named Docker volumes for data persistence:

Volume	Mount Point	Service	Content
`etcd_data`	`/etcd`	`milvus-etcd`	Milvus metadata (collection schemas, partitions)
`minio_data`	`/minio_data`	`milvus-minio`	Milvus index files and log segments
`milvus_data`	`/var/lib/milvus`	`milvus-standalone`	Milvus WAL, insert logs, and query cache

9.2 Volume Lifecycle¶

Volumes persist across container restarts and docker compose down:

# Stop services (volumes preserved)
docker compose down

# Stop services AND delete volumes (DATA LOSS)
docker compose down -v

# List volumes
docker volume ls | grep -E "etcd_data|minio_data|milvus_data"

# Inspect a volume
docker volume inspect precision_oncology_agent_milvus_data

9.3 Disk Space Estimates¶

Component	Base Size	With Full Ingest
Seed data (JSON)	~768 KB	N/A
etcd metadata	~50 MB	~200 MB
MinIO (Milvus indexes)	~500 MB	~5-20 GB
Milvus data	~500 MB	~5-20 GB
Docker images	~2 GB	~2 GB
Total	~3 GB	~12-42 GB

9.4 Cache Directory¶

The application creates a cache directory at data/cache/ inside the container (path: /app/data/cache/). This directory stores:

Downloaded embedding model files (first run only, ~130 MB)
Temporary processing artifacts

In the Dockerfile, this directory is pre-created and owned by the oncouser non-root user.

9.5 Bind Mounts for Development¶

For development, you may want to bind-mount source code for live reloading:

# docker-compose.override.yml
version: "3.8"
services:
  onco-api:
    volumes:
      - ./api:/app/api
      - ./src:/app/src
      - ./config:/app/config
    command:
      - uvicorn
      - api.main:app
      - --host=0.0.0.0
      - --port=8527
      - --reload

10. Monitoring and Metrics¶

10.1 Prometheus Metrics¶

The agent exposes Prometheus-compatible metrics at GET /metrics on the FastAPI server (port 8527).

Available metrics:

Metric	Type	Description
`onco_agent_up`	Gauge	Service availability (1 = up, 0 = down)
`onco_collection_vectors`	Gauge	Vector count per collection (labeled by `collection`)
`onco_query_duration_seconds`	Histogram	RAG query latency
`onco_search_duration_seconds`	Histogram	Vector search latency
`onco_embedding_duration_seconds`	Histogram	Embedding generation latency
`onco_llm_tokens_total`	Counter	Total LLM tokens consumed
`onco_milvus_operations_total`	Counter	Milvus operation count (by operation type)

Scrape configuration for prometheus.yml:

scrape_configs:
  - job_name: "onco-agent"
    scrape_interval: 30s
    metrics_path: /metrics
    static_configs:
      - targets: ["onco-api:8527"]
        labels:
          service: "oncology-intelligence-agent"

10.2 Milvus Metrics¶

Milvus exposes its own metrics at http://localhost:9091/metrics in Prometheus exposition format.

Key Milvus metrics:

Metric	Description
`milvus_datanode_flush_segments_total`	Segment flush count
`milvus_proxy_search_vectors_count`	Search request count
`milvus_querynode_search_latency`	Query latency histogram

Add to prometheus.yml:

  - job_name: "milvus"
    scrape_interval: 30s
    metrics_path: /metrics
    static_configs:
      - targets: ["milvus-standalone:9091"]

10.3 Grafana Dashboard¶

Add Prometheus as a data source (http://prometheus:9090), then create panels using these PromQL queries:

Panel	PromQL
Avg query latency (5m)	`rate(onco_query_duration_seconds_sum[5m]) / rate(onco_query_duration_seconds_count[5m])`
Total vectors	`sum(onco_collection_vectors)`
Request rate/min	`rate(onco_milvus_operations_total[1m]) * 60`
Service up	`onco_agent_up`

10.4 Disabling Metrics¶

Set ONCO_METRICS_ENABLED=false to disable Prometheus metric collection. The /metrics endpoint will still respond but will return minimal data. When prometheus_client is not installed, the metrics module automatically falls back to no-op stubs with zero overhead.

10.5 Health Dashboard¶

The /health endpoint returns a JSON summary with status ("healthy" or "degraded"), per-collection vector counts, total_vectors, version, and a services map showing boolean availability for milvus, embedder, rag_engine, intelligence_agent, case_manager, trial_matcher, and therapy_ranker. Use this for uptime monitoring and alerting.

11. Security Hardening¶

11.1 CORS Configuration¶

The API server enforces CORS (Cross-Origin Resource Sharing) restrictions. By default, only requests from the Landing Page (:8080), Streamlit UI (:8526), and the API itself (:8527) are allowed.

Configure allowed origins:

ONCO_CORS_ORIGINS="https://mtb.hospital.org,https://admin.hospital.org"

In production, always restrict CORS to known origins. The default localhost values are suitable only for development.

11.2 API Key Protection¶

Critical: Never commit ANTHROPIC_API_KEY or NCBI_API_KEY to version control.

Best practices: - Use .env files (already in .gitignore) - Use Docker secrets or a secrets manager (HashiCorp Vault, AWS Secrets Manager) - Set environment variables directly on the host or in CI/CD pipelines - Rotate API keys regularly

Docker secrets example:

# docker-compose.yml addition
services:
  onco-api:
    secrets:
      - anthropic_key
    environment:
      ANTHROPIC_API_KEY_FILE: /run/secrets/anthropic_key

secrets:
  anthropic_key:
    file: ./secrets/anthropic_api_key.txt

11.3 Request Size Limits¶

The API enforces a maximum request body size via ONCO_MAX_REQUEST_SIZE_MB (default: 10 MB). Requests exceeding this limit receive a 413 Payload TooLarge response.

11.4 Non-Root Container User¶

The Dockerfile creates a dedicated oncouser with no shell access:

RUN useradd -r -s /bin/false oncouser
USER oncouser

All application processes run as this non-root user inside the container.

11.5 TLS/HTTPS Configuration¶

The agent does not natively terminate TLS. For production, place a reverse proxy (nginx, Caddy, or Traefik) in front of the services:

Proxy / to http://127.0.0.1:8526 (Streamlit -- requires WebSocket upgrade headers: Upgrade, Connection)
Proxy /api/ to http://127.0.0.1:8527/ (FastAPI)
Use TLS 1.2+ with valid certificates
Set X-Real-IP, X-Forwarded-For, and X-Forwarded-Proto headers

11.6 MinIO Credentials¶

The default MinIO credentials (minioadmin / minioadmin) should be changed in production:

milvus-minio:
  environment:
    MINIO_ACCESS_KEY: ${MINIO_ACCESS_KEY}
    MINIO_SECRET_KEY: ${MINIO_SECRET_KEY}

11.7 Network Isolation¶

The default configuration exposes only ports 8526, 8527, 19530, and 9091 to the host. For additional isolation, split into public and internal networks (set internal: true on the Milvus infrastructure network) so that etcd and MinIO are not reachable from the host.

12. Health Checks and Troubleshooting¶

12.1 Health Check Endpoints¶

Service	Endpoint	Expected Response
Milvus	`GET http://localhost:9091/healthz`	`{"status":"OK"}`
FastAPI	`GET http://localhost:8527/health`	JSON with `"status": "healthy"`
Streamlit	`GET http://localhost:8526/_stcore/health`	`ok`
MinIO	`GET http://localhost:9000/minio/health/live`	HTTP 200
etcd	`etcdctl endpoint health` (in container)	`is healthy: true`

12.2 Docker Health Check Configuration¶

Health checks are defined in both the Dockerfile and docker-compose.yml:

Service	Check Command	Interval	Timeout	Retries	Start Period
Milvus	`curl -f http://localhost:9091/healthz`	30s	10s	10	60s
FastAPI	`curl -f http://localhost:8527/health`	30s	10s	3	30s
Streamlit	`curl -f http://localhost:8526/_stcore/health`	30s	10s	3	40s
etcd	`etcdctl endpoint health`	30s	20s	5	--
MinIO	`curl -f http://localhost:9000/minio/health/live`	30s	20s	5	--

12.3 Common Issues and Solutions¶

Issue: Milvus fails to start¶

Symptoms: milvus-standalone container restarts repeatedly.

Diagnosis:

docker compose logs milvus-standalone | tail -50
docker compose logs milvus-etcd | tail -20
docker compose logs milvus-minio | tail -20

Common causes: 1. Insufficient memory: Milvus needs at least 4 GB. Check with docker stats. 2. etcd not ready: etcd may take 30-60 seconds. The depends_on with health check should handle this, but check etcd logs. 3. Port conflict: Another service is using port 19530 or 9091.

sudo lsof -i :19530
sudo lsof -i :9091

4. Corrupted data: If volumes contain corrupted data from a previous run:

docker compose down -v  # WARNING: deletes all data
docker compose up -d

Issue: onco-setup exits with non-zero code¶

Symptoms: Seed scripts fail during collection creation or data loading.

Diagnosis:

docker compose logs onco-setup

Common causes: 1. Milvus not ready: The setup container started before Milvus was fully initialized. Re-run:

docker compose rm -f onco-setup
docker compose up onco-setup

2. Collections already exist: Use --drop-existing flag (already set in the default Docker Compose command). 3. Missing seed data files: Verify data/reference/ contains all 10 JSON files.

Issue: API returns 503 "Service not initialised"¶

Symptoms: All API endpoints return 503 errors.

Diagnosis:

docker compose logs onco-api | tail -30
curl -s http://localhost:8527/health

Common causes: 1. Milvus not reachable: Verify ONCO_MILVUS_HOST and ONCO_MILVUS_PORT are correctly set. 2. Embedding model download failed: The first startup downloads BAAI/bge-small-en-v1.5 (~130 MB). Ensure outbound HTTPS access to huggingface.co. 3. Startup timeout: The embedding model download can take several minutes on slow connections. Check the API logs for download progress.

Issue: LLM queries fail but search works¶

Symptoms: /search returns results but /query returns errors.

Cause: ANTHROPIC_API_KEY is not set or is invalid.

Fix:

# Check if the key is set in the container
docker compose exec onco-api env | grep ANTHROPIC

# Update the key
echo "ANTHROPIC_API_KEY=sk-ant-your-new-key" >> .env
docker compose up -d onco-api

Issue: Streamlit UI shows connection error¶

Symptoms: The Streamlit UI displays "Unable to connect to Milvus."

Diagnosis:

docker compose logs onco-streamlit | tail -20

Fix: Ensure ONCO_MILVUS_HOST=milvus-standalone (not localhost) in the Docker environment, since the Streamlit container connects to Milvus over the Docker network.

Issue: Slow embedding generation¶

Symptoms: Queries take 5+ seconds; seed scripts run slowly.

Diagnosis: Check if GPU acceleration is available:

python -c "import torch; print(torch.cuda.is_available())"

Fix: If GPU is available, ensure the Docker image has CUDA support. For CPU-only deployments, increase ONCO_EMBEDDING_BATCH_SIZE to improve throughput at the cost of memory.

12.4 Diagnostic Commands¶

Command	Purpose
`docker compose ps`	Service status
`docker stats --no-stream`	Container resource usage
`curl -s localhost:8527/health \\| python3 -m json.tool`	Full health check
`curl -s localhost:8527/collections \\| python3 -m json.tool`	Collection stats
`curl -s localhost:8527/knowledge/stats \\| python3 -m json.tool`	Knowledge stats
`curl -s localhost:9091/healthz`	Milvus health
`docker compose logs --tail=100 <service>`	View logs
`docker compose exec onco-api bash`	Shell into container

13. Backup and Recovery¶

13.1 Volume Backup¶

Back up the three named Docker volumes to preserve all Milvus data:

#!/bin/bash
# backup-onco-volumes.sh
BACKUP_DIR="/backups/onco-agent/$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BACKUP_DIR"

# Stop services to ensure consistency
docker compose stop

# Backup each volume
for vol in etcd_data minio_data milvus_data; do
    echo "Backing up $vol..."
    docker run --rm \
        -v "precision_oncology_agent_${vol}:/data:ro" \
        -v "$BACKUP_DIR:/backup" \
        alpine tar czf "/backup/${vol}.tar.gz" -C /data .
done

# Restart services
docker compose start

echo "Backup complete: $BACKUP_DIR"
ls -lh "$BACKUP_DIR"

13.2 Volume Restore¶

#!/bin/bash
# restore-onco-volumes.sh <backup-dir>
BACKUP_DIR="$1"
[ -z "$BACKUP_DIR" ] && echo "Usage: $0 /path/to/backup" && exit 1

docker compose down
for vol in etcd_data minio_data milvus_data; do
    docker volume rm "precision_oncology_agent_${vol}" 2>/dev/null
    docker volume create "precision_oncology_agent_${vol}"
    docker run --rm \
        -v "precision_oncology_agent_${vol}:/data" \
        -v "$BACKUP_DIR:/backup:ro" \
        alpine tar xzf "/backup/${vol}.tar.gz" -C /data
done
docker compose up -d

13.3 Re-Seed from Scratch¶

If backups are unavailable, you can reconstruct the entire knowledge base from the seed data and live ingest scripts:

# 1. Remove all data
docker compose down -v

# 2. Start infrastructure
docker compose up -d

# 3. Wait for setup to complete
docker compose logs -f onco-setup

# 4. Run live ingest (optional, for full data)
docker compose exec onco-api python scripts/ingest_civic.py
docker compose exec onco-api python scripts/ingest_pubmed.py
docker compose exec onco-api python scripts/ingest_clinical_trials.py

13.4 Backup Schedule¶

Schedule daily backups via cron (e.g., 0 2 * * *) and weekly full backups on Sundays. Log to /var/log/onco-backup.log.

13.5 Backup Verification¶

After restoring, verify with curl -s localhost:8527/collections, run a test search query, and execute python scripts/validate_e2e.py.

14. Scaling Considerations¶

14.1 Vertical Scaling¶

The simplest scaling approach is to increase resources allocated to existing containers.

API server workers:

# docker-compose.override.yml
services:
  onco-api:
    command:
      - uvicorn
      - api.main:app
      - --host=0.0.0.0
      - --port=8527
      - --workers=8    # Increase from default 2

Each uvicorn worker loads its own copy of the embedding model (~130 MB). Plan memory accordingly: workers * 200 MB + base overhead.

Milvus resource allocation:

services:
  milvus-standalone:
    deploy:
      resources:
        limits:
          memory: 32G
        reservations:
          memory: 16G

14.2 Horizontal Scaling: API Layer¶

The FastAPI server is stateless (all state lives in Milvus). Run multiple replicas behind a load balancer:

services:
  onco-api:
    deploy:
      replicas: 3

Use nginx, HAProxy, or Traefik to distribute requests across the replicas. Remove host port mapping from individual containers when using a load balancer.

14.3 Milvus Scaling¶

For larger deployments, consider migrating from Milvus standalone to Milvus cluster mode:

Deployment	Collections	Vectors	Concurrent Queries
Standalone	11	Up to ~10M	10-50
Cluster	11+	100M+	100+

Milvus cluster mode requires: - Separate etcd cluster (3+ nodes) - Separate MinIO or S3 storage - Query nodes, data nodes, and index nodes

Refer to the Milvus documentation for cluster deployment.

14.4 Embedding Model Scaling¶

GPU acceleration: ~1000 embeddings/sec (RTX 4090) vs. ~100/sec (CPU).
Dedicated embedding service: Run sentence-transformers as a standalone microservice with batching.
Model server: Triton Inference Server for production embedding serving.

14.5 Performance Benchmarks¶

Typical latencies on DGX Spark hardware:

Operation	Latency (p50)	Latency (p95)
Vector search (single collection)	~15 ms	~50 ms
Multi-collection RAG search	~100 ms	~300 ms
Full RAG query (with LLM)	~2 s	~5 s
Embedding generation (single text)	~10 ms (GPU) / ~50 ms (CPU)	~30 ms (GPU) / ~150 ms (CPU)
Trial matching (top 10)	~200 ms	~500 ms

15. Integration with HCLS AI Factory¶

15.1 Pipeline Position¶

The Oncology Intelligence Agent operates as part of the HCLS AI Factory three-stage pipeline:

Stage 1: Genomics Pipeline (Parabricks/DeepVariant)
    FASTQ --> VCF
        |
        v
Stage 2: RAG/Chat Pipeline (Milvus + Claude)
    VCF --> Variant Interpretation
        |
        v
    Oncology Intelligence Agent
    (MTB Decision Support)
        |
        v
Stage 3: Drug Discovery Pipeline (BioNeMo/DiffDock)
    Candidate Molecules --> Docking Scores

15.2 Shared Collections¶

The genomic_evidence collection is read-only and shared with the Stage 1 genomics pipeline. When the genomics pipeline processes a VCF file, it writes annotated variants to this collection. The oncology agent reads from it to provide cross-modal genomic context.

Configuration: The ONCO_COLLECTION_GENOMIC setting (default: genomic_evidence) must match the collection name used by the genomics pipeline.

15.3 Landing Page Integration¶

The HCLS AI Factory Landing Page (port 8080) provides a unified dashboard with health monitoring for all agents. The oncology agent registers at:

Health endpoint: http://onco-api:8527/health
UI link: http://localhost:8526

Ensure ONCO_CORS_ORIGINS includes the landing page origin (http://localhost:8080).

15.4 Docker Compose Integration¶

When running as part of the full HCLS AI Factory stack (docker-compose.dgx-spark.yml), the oncology agent services are defined within the main compose file. Ensure network connectivity between the oncology services and the shared Milvus instance if using a centralized vector database.

15.5 Cross-Agent Communication¶

The oncology agent can receive cross-modal triggers from other HCLS AI Factory agents:

Source Agent	Data Flow	Collection
Biomarker Agent	Biomarker panel results	`onco_biomarkers`
CAR-T Agent	Immunotherapy candidates	`onco_therapies`
Imaging Agent	Radiomics features	Cross-modal trigger
Autoimmune Agent	Immune checkpoint data	`onco_resistance`

15.6 Event Bus¶

The api/routes/events.py module provides an event endpoint for inter-agent communication. Other agents can publish events that the oncology agent processes for knowledge base updates.

16. Updating and Maintenance¶

16.1 Updating the Agent Code¶

# 1. Pull latest code
git pull origin main

# 2. Rebuild Docker images
docker compose build --no-cache

# 3. Restart services (preserves data volumes)
docker compose up -d

# 4. Re-run setup if schema changes occurred
docker compose rm -f onco-setup
docker compose up onco-setup

16.2 Updating Dependencies¶

# 1. Update requirements.txt with new versions
# 2. Rebuild the image
docker compose build --no-cache onco-api onco-streamlit onco-setup

# 3. Restart
docker compose up -d

16.3 Updating Seed Data¶

To refresh the seed data without losing live-ingested data:

# Run individual seed scripts (appends to existing collections)
docker compose exec onco-api python scripts/seed_variants.py

# Or drop and re-seed everything
docker compose exec onco-api python scripts/setup_collections.py --drop-existing --seed

16.4 Updating Milvus¶

To upgrade the Milvus version:

Back up all volumes (see Section 13).

Update the image tag in docker-compose.yml:

milvus-standalone:
  image: milvusdb/milvus:v2.5-latest  # Updated from v2.4

Review the Milvus release notes for breaking changes.

Restart:

docker compose down
docker compose up -d

16.5 Log Rotation¶

Configure Docker log rotation in /etc/docker/daemon.json with "max-size": "50m" and "max-file": "5". Restart Docker after changing.

16.6 Periodic Maintenance Tasks¶

Task	Frequency	Command
Backup volumes	Daily	`backup-onco-volumes.sh`
Refresh PubMed data	Weekly	`python scripts/ingest_pubmed.py`
Refresh ClinicalTrials.gov	Weekly	`python scripts/ingest_clinical_trials.py`
Refresh CIViC variants	Weekly	`python scripts/ingest_civic.py`
Compact Milvus	Monthly	Automatic (configured via etcd compaction)
Rotate API keys	Quarterly	Update `.env` and restart
Update Docker images	As needed	`docker compose build --no-cache`
Review metrics/alerts	Weekly	Grafana dashboard review
Test backup restore	Monthly	Restore to staging environment

16.7 Version Pinning¶

For production stability, pin all image versions:

services:
  milvus-etcd:
    image: quay.io/coreos/etcd:v3.5.5         # Pinned
  milvus-minio:
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z  # Pinned
  milvus-standalone:
    image: milvusdb/milvus:v2.4.0              # Pinned to specific release

Appendix A: Complete docker-compose.yml¶

The canonical docker-compose.yml is maintained at the repository root:

ai_agent_adds/precision_oncology_agent/agent/docker-compose.yml

It defines all 6 services (milvus-etcd, milvus-minio, milvus-standalone, onco-streamlit, onco-api, onco-setup), 3 volumes (etcd_data, minio_data, milvus_data), and the onco-network bridge network.

Key configuration facts for quick reference:

Service	Image	Exposed Ports	Restart Policy
`milvus-etcd`	`quay.io/coreos/etcd:v3.5.5`	(none)	`unless-stopped`
`milvus-minio`	`minio/minio:RELEASE.2023-03-20T20-16-18Z`	(none)	`unless-stopped`
`milvus-standalone`	`milvusdb/milvus:v2.4-latest`	`19530`, `9091`	`unless-stopped`
`onco-streamlit`	Local `Dockerfile`	`8526`	`unless-stopped`
`onco-api`	Local `Dockerfile`	`8527`	`unless-stopped`
`onco-setup`	Local `Dockerfile`	(none)	`no`

Startup order: etcd + MinIO (parallel) --> Milvus (waits for both) --> Streamlit + API + Setup (wait for Milvus healthy).

etcd configuration: - Backend quota: 4 GB (ETCD_QUOTA_BACKEND_BYTES=4294967296) - Auto-compaction: revision mode, retention 1000 - Snapshot count: 50000

MinIO configuration: - Default credentials: minioadmin / minioadmin - Console address: :9001

Milvus configuration: - Security opt: seccomp:unconfined - Health check: start_period: 60s, retries: 10

onco-api: - Command: uvicorn api.main:app --host=0.0.0.0 --port=8527 --workers=2 - Health check: start-period: 30s, retries: 3

onco-setup: - Runs setup_collections.py --drop-existing then 9 seed scripts sequentially - Exits after completion (restart: "no")

Appendix B: Environment Variable Quick Reference¶

Complete list of all environment variables accepted by the Oncology Intelligence Agent, sorted alphabetically within categories.

Required¶

Variable	Example	Description
`ANTHROPIC_API_KEY`	`sk-ant-api03-...`	Anthropic API key for Claude LLM

Connection¶

Variable	Default	Description
`ONCO_API_BASE_URL`	`http://localhost:8527`	API base URL for Streamlit UI
`ONCO_API_HOST`	`0.0.0.0`	FastAPI bind address
`ONCO_API_PORT`	`8527`	FastAPI listen port
`ONCO_MILVUS_HOST`	`localhost`	Milvus hostname
`ONCO_MILVUS_PORT`	`19530`	Milvus gRPC port
`ONCO_STREAMLIT_PORT`	`8526`	Streamlit server port

Embedding & LLM¶

Variable	Default	Description
`ONCO_EMBEDDING_BATCH_SIZE`	`32`	Embedding batch size
`ONCO_EMBEDDING_DIM`	`384`	Embedding vector dimension
`ONCO_EMBEDDING_MODEL`	`BAAI/bge-small-en-v1.5`	HuggingFace embedding model
`ONCO_LLM_MODEL`	`claude-sonnet-4-6`	Claude model name
`ONCO_LLM_PROVIDER`	`anthropic`	LLM provider

RAG Search¶

Variable	Default	Description
`ONCO_MIN_COLLECTIONS_FOR_SUFFICIENT`	`2`	Min collections with hits
`ONCO_MIN_SIMILARITY_SCORE`	`0.30`	Absolute minimum similarity
`ONCO_MIN_SUFFICIENT_HITS`	`3`	Min hits for sufficient evidence
`ONCO_SCORE_THRESHOLD`	`0.4`	Similarity score threshold
`ONCO_TOP_K`	`5`	Results per collection

Collection Weights¶

Variable	Default	Variable	Default
`ONCO_WEIGHT_VARIANTS`	`0.18`	`ONCO_WEIGHT_BIOMARKERS`	`0.08`
`ONCO_WEIGHT_LITERATURE`	`0.16`	`ONCO_WEIGHT_RESISTANCE`	`0.07`
`ONCO_WEIGHT_THERAPIES`	`0.14`	`ONCO_WEIGHT_PATHWAYS`	`0.06`
`ONCO_WEIGHT_GUIDELINES`	`0.12`	`ONCO_WEIGHT_OUTCOMES`	`0.04`
`ONCO_WEIGHT_TRIALS`	`0.10`	`ONCO_WEIGHT_GENOMIC`	`0.03`
		`ONCO_WEIGHT_CASES`	`0.02`

Trial Matching¶

Variable	Default	Description
`ONCO_TRIAL_WEIGHT_BIOMARKER`	`0.40`	Biomarker match weight
`ONCO_TRIAL_WEIGHT_PHASE`	`0.20`	Trial phase weight
`ONCO_TRIAL_WEIGHT_SEMANTIC`	`0.25`	Semantic similarity weight
`ONCO_TRIAL_WEIGHT_STATUS`	`0.15`	Recruitment status weight

Variable	Default	Description
`ONCO_CITATION_MODERATE_THRESHOLD`	`0.60`	Moderate evidence threshold
`ONCO_CITATION_STRONG_THRESHOLD`	`0.75`	Strong evidence threshold
`ONCO_CROSS_MODAL_ENABLED`	`true`	Enable cross-modal analysis
`ONCO_CROSS_MODAL_THRESHOLD`	`0.40`	Cross-modal trigger threshold
`ONCO_GENOMIC_TOP_K`	`5`	Genomic evidence result count
`ONCO_IMAGING_TOP_K`	`5`	Imaging result count

External APIs¶

Variable	Default	Description
`NCBI_API_KEY`	(none)	PubMed API key (optional)
`ONCO_CIVIC_BASE_URL`	`https://civicdb.org/api`	CIViC API base URL
`ONCO_CT_GOV_BASE_URL`	`https://clinicaltrials.gov/api/v2`	ClinicalTrials.gov API URL
`ONCO_PUBMED_MAX_RESULTS`	`5000`	Max PubMed fetch count

Operational¶

Variable	Default	Description
`ONCO_CONVERSATION_MEMORY_DEPTH`	`3`	Conversation turns to retain
`ONCO_CORS_ORIGINS`	`http://localhost:8080,...`	Allowed CORS origins
`ONCO_MAX_REQUEST_SIZE_MB`	`10`	Max request body size
`ONCO_METRICS_ENABLED`	`true`	Enable Prometheus metrics
`ONCO_SCHEDULER_INTERVAL`	`168h`	Data refresh interval

PDF Branding¶

Variable	Default	Description
`ONCO_PDF_BRAND_COLOR_B`	`0`	Brand color blue (0-255)
`ONCO_PDF_BRAND_COLOR_G`	`185`	Brand color green (0-255)
`ONCO_PDF_BRAND_COLOR_R`	`118`	Brand color red (0-255)

Collection Names¶

Variable	Default
`ONCO_COLLECTION_BIOMARKERS`	`onco_biomarkers`
`ONCO_COLLECTION_CASES`	`onco_cases`
`ONCO_COLLECTION_GENOMIC`	`genomic_evidence`
`ONCO_COLLECTION_GUIDELINES`	`onco_guidelines`
`ONCO_COLLECTION_LITERATURE`	`onco_literature`
`ONCO_COLLECTION_OUTCOMES`	`onco_outcomes`
`ONCO_COLLECTION_PATHWAYS`	`onco_pathways`
`ONCO_COLLECTION_RESISTANCE`	`onco_resistance`
`ONCO_COLLECTION_THERAPIES`	`onco_therapies`
`ONCO_COLLECTION_TRIALS`	`onco_trials`
`ONCO_COLLECTION_VARIANTS`	`onco_variants`

Document generated from verified codebase analysis of the Precision Oncology Intelligence Agent. All defaults, file paths, and configuration values were extracted from config/settings.py, docker-compose.yml, Dockerfile, requirements.txt, and the scripts/ directory.

Precision Oncology Intelligence Agent -- Deployment Guide¶

Table of Contents¶

1. Overview¶

Core Capabilities¶

Architecture at a Glance¶

Software Components¶

2. Prerequisites¶

2.1 Hardware Requirements¶

2.2 Software Requirements¶

2.3 API Keys¶

2.4 Network Requirements¶

3. Quick Start¶

Verify the Deployment¶

Quick Test Query¶

4. Deployment Modes¶

4a. Docker Lite (Milvus + API Only)¶

4b. Docker Full Stack (All 6 Services)¶

4c. DGX Spark Production¶

Production Environment File¶

Production docker-compose Override¶

DGX Spark Systemd Service¶

4d. Development Mode (Local Python)¶

Step 1: Set Up Python Environment¶

Step 2: Start Milvus (Docker Required)¶

Step 3: Create Collections and Seed Data¶

Step 4: Run the API Server¶

Step 5: Run the Streamlit UI¶

Step 6: Run Tests¶

5. Configuration Reference¶

5.1 Connection Settings¶

5.2 Embedding Settings¶

5.3 LLM Settings¶

5.4 RAG Search Settings¶

5.5 Collection Weight Settings¶

5.6 Trial Matching Weights¶

5.7 Citation Thresholds¶

5.8 Cross-Modal Settings¶

5.9 External API Settings¶

5.10 Operational Settings¶

5.11 PDF Report Branding¶

5.12 Collection Names¶

6. Collection Setup and Seeding¶

6.1 Collection Schemas¶

6.2 Setup Script¶

6.3 Seed Scripts¶

6.4 Docker-Based Seeding¶

6.5 Verifying Seed Status¶

7. Data Ingestion (CIViC, PubMed, ClinicalTrials.gov)¶

7.1 CIViC Variant Ingest¶

7.2 PubMed Literature Ingest¶

7.3 ClinicalTrials.gov Ingest¶

7.4 Scheduled Ingest¶

7.5 Custom Ingest Scripts¶

8. Networking and Ports¶

8.1 Port Map¶

8.2 Docker Network¶

8.3 Exposing Ports¶

8.4 Changing Ports¶

8.5 Firewall Rules (Production)¶

9. Storage and Persistence¶

9.1 Docker Volumes¶

9.2 Volume Lifecycle¶

9.3 Disk Space Estimates¶

9.4 Cache Directory¶

9.5 Bind Mounts for Development¶

10. Monitoring and Metrics¶

10.1 Prometheus Metrics¶

10.2 Milvus Metrics¶

10.3 Grafana Dashboard¶

10.4 Disabling Metrics¶

10.5 Health Dashboard¶

11. Security Hardening¶

11.1 CORS Configuration¶

11.2 API Key Protection¶

11.3 Request Size Limits¶

11.4 Non-Root Container User¶

11.5 TLS/HTTPS Configuration¶

11.6 MinIO Credentials¶

11.7 Network Isolation¶

12. Health Checks and Troubleshooting¶