Imaging Intelligence Agent -- Advanced Learning Guide¶

Version: 1.0
Date: March 2026
Author: Adam Jones
Audience: Senior engineers, ML researchers, and platform architects extending the Imaging Intelligence Agent

Prerequisites¶

What You Should Know¶

Before working through this guide you should be comfortable with:

Python 3.10+ -- async/await, dataclasses, type hints, ABC pattern
Pydantic v2 -- BaseModel, Field validators, model_dump/model_dump_json
Vector databases -- embedding similarity search, IVF indexes, distance metrics
Medical imaging basics -- DICOM, CT windowing, MRI sequences, modality codes
FastAPI -- request/response models, routers, middleware, dependency injection
Docker Compose -- multi-service orchestration, volumes, networks, health checks
pytest -- fixtures, mocking, parametrize, conftest patterns

Codebase Map¶

agent/
  api/
    main.py                         # FastAPI server (port 8524)
    routes/
      events.py                     # DICOM webhook + workflow routing
      meta_agent.py                 # Multi-step agent endpoint
      nim.py                        # NIM service status/invoke
      preview.py                    # Preview generation endpoints
      protocol.py                   # Protocol optimization
      dose.py                       # Dose comparison endpoints
      reports.py                    # Export endpoints (MD/JSON/PDF/FHIR)
      workflows.py                  # Workflow execution endpoints
      demo_cases.py                 # Pre-built demo case runner
  app/                              # Streamlit chat UI (port 8525)
  config/
    settings.py                     # ImagingSettings (50+ env vars)
    ohif-config.js                  # OHIF Viewer configuration
  src/
    agent.py                        # ImagingIntelligenceAgent orchestrator
    collections.py                  # ImagingCollectionManager (Milvus)
    cross_modal.py                  # Imaging->Genomics trigger (401 lines)
    demo_cases.py                   # Demo case definitions
    export.py                       # 4-format export (691 lines)
    knowledge.py                    # Domain knowledge graph (1843 lines)
    models.py                       # 37 enums + 20 Pydantic models (728 lines)
    protocol_optimizer.py           # AI protocol optimization
    query_expansion.py              # 12 expansion maps (186 lines)
    rag_engine.py                   # ImagingRAGEngine (692 lines)
    scheduler.py                    # APScheduler ingest scheduler
    imaging/
      preview_generator.py          # MP4/GIF slice animations
    ingest/
      base.py                       # BaseIngestPipeline ABC
      literature_parser.py          # PubMed ingest
      clinical_trials_parser.py     # ClinicalTrials.gov ingest
      finding_parser.py             # Imaging finding templates
      protocol_parser.py            # Acquisition protocol ingest
      device_parser.py              # FDA-cleared device ingest
      anatomy_parser.py             # Anatomical structure ingest
      benchmark_parser.py           # Model benchmark ingest
      guideline_parser.py           # Clinical guideline ingest
      report_template_parser.py     # Structured report templates
      dataset_parser.py             # Public dataset registry
      dicom_watcher.py              # Orthanc change-feed watcher
    nim/
      base.py                       # BaseNIMClient (83 lines)
      llm_client.py                 # LlamaLLMClient (439 lines)
      vista3d_client.py             # VISTA3DClient (215 lines)
      maisi_client.py               # MAISIClient (136 lines)
      vilam3_client.py              # VILAM3Client (537 lines)
      service_manager.py            # NIMServiceManager
    workflows/
      base.py                       # BaseImagingWorkflow (83 lines)
      ct_head_hemorrhage.py         # ICH detection (488 lines)
      ct_chest_lung_nodule.py       # Lung nodule tracking (598 lines)
      ct_coronary_angiography.py    # CAD-RADS scoring (357 lines)
      cxr_rapid_findings.py         # CXR classification (658 lines)
      mri_brain_ms_lesion.py        # MS lesion analysis (645 lines)
      mri_prostate_pirads.py        # PI-RADS scoring (280 lines)
  tests/
    conftest.py                     # 10+ fixtures
    test_models.py                  # Data model validation
    test_export.py                  # Export format tests
    test_rag_engine.py              # RAG engine tests
    test_workflows.py               # Workflow tests
    test_agent.py                   # Agent orchestrator tests
    test_knowledge.py               # Knowledge graph tests
    test_nim_clients.py             # NIM client tests
    test_dicom_ingestion.py         # DICOM watcher tests
    test_cross_modal.py             # Cross-modal trigger tests
    test_query_expansion.py         # Query expansion tests
    test_preview_generator.py       # Preview generation tests
    test_preview_api.py             # Preview API endpoint tests
  flare/                            # NVIDIA FLARE federated learning
    job_configs/
      cxr_classification/           # Chest X-ray classification FL job
      ct_segmentation/              # CT organ segmentation FL job
      lung_nodule_detection/        # Lung nodule detection FL job
  docker-compose.yml                # Full stack (11 services)
  docker-compose.lite.yml           # Lite stack (6 services, no GPU)
  Dockerfile                        # Multi-stage build

Chapter 1: Deep Dive into the RAG Engine¶

The ImagingRAGEngine class in src/rag_engine.py (692 lines) is the central retrieval-augmented generation component. It coordinates multi-collection vector search, knowledge graph augmentation, query expansion, comparative analysis, and LLM synthesis.

1.1 Class Architecture¶

class ImagingRAGEngine:
    def __init__(self, collection_manager, embedder, llm_client, nim_service_manager=None):
        self.collection_manager = collection_manager  # ImagingCollectionManager
        self.embedder = embedder                      # SentenceTransformer (BGE-small)
        self.llm_client = llm_client                  # LlamaLLMClient
        self.nim_manager = nim_service_manager         # Optional NIMServiceManager
        self.system_prompt = SYSTEM_PROMPT

The engine accepts four dependencies via constructor injection, making every component mockable for testing.

1.2 The retrieve() Method¶

retrieve() is the core search pipeline. Its execution flow:

Query expansion -- calls expand_query(query) to add synonyms from 12 domain maps
Embedding -- encodes the expanded query via _embed_query() (BGE-small-en-v1.5, 384-dim, L2-normalized)
Knowledge context -- checks query against IMAGING_PATHOLOGIES, IMAGING_MODALITIES, and IMAGING_ANATOMY dictionaries for supplemental context
Collection iteration -- searches each collection in COLLECTION_CONFIG (or a filtered subset)
Filter construction -- builds per-collection Milvus filter expressions from modality, body_region, and year parameters
Weighted scoring -- multiplies each hit's raw similarity score by the collection weight
Sorting -- sorts all hits by weighted score descending
Result assembly -- returns a CrossCollectionResult with hits, knowledge context, timing, and collection count

def retrieve(self, query, top_k_per_collection=5, collections_filter=None,
             year_min=None, year_max=None, modality_filter=None,
             body_region_filter=None) -> CrossCollectionResult:

Key detail: The top_k_per_collection parameter defaults to 5, meaning a full search across all 11 collections can return up to 55 candidate hits before weighted ranking.
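The filter-construction step can be sketched as below. This is a minimal illustration, not the engine's actual code: build_filter_expr is a hypothetical helper, the has_modality and year_field keys mirror the COLLECTION_CONFIG columns in section 1.3, and the body_region filter is left out for brevity. The output uses Milvus's boolean expression syntax (field comparisons joined with "and").

```python
from typing import Optional


def build_filter_expr(
    config: dict,
    modality: Optional[str] = None,
    year_min: Optional[int] = None,
    year_max: Optional[int] = None,
) -> str:
    """Build a Milvus boolean filter for one collection (hypothetical helper)."""
    clauses = []
    # Only filter on modality when the collection actually stores that field.
    if modality and config.get("has_modality"):
        clauses.append(f'modality == "{modality}"')
    # Year bounds apply only to collections that declare a year field.
    year_field = config.get("year_field")
    if year_field:
        if year_min is not None:
            clauses.append(f"{year_field} >= {year_min}")
        if year_max is not None:
            clauses.append(f"{year_field} <= {year_max}")
    return " and ".join(clauses)


literature = {"has_modality": True, "year_field": "year"}
anatomy = {"has_modality": False, "year_field": None}

print(build_filter_expr(literature, modality="ct", year_min=2020))
print(repr(build_filter_expr(anatomy, modality="ct", year_min=2020)))
```

A collection without a modality or year field simply gets an empty expression, so the same query parameters can be passed to every collection.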

1.3 Score Weighting (COLLECTION_CONFIG)¶

Each collection has a weight that reflects its relevance priority:

Collection	Label	Weight	Has Modality	Year Field
`imaging_literature`	Literature	0.18	Yes	`year`
`imaging_findings`	Finding	0.15	Yes	--
`imaging_trials`	Trial	0.12	Yes	`start_year`
`imaging_guidelines`	Guideline	0.10	Yes	`year`
`imaging_protocols`	Protocol	0.08	Yes	--
`imaging_devices`	Device	0.08	Yes	--
`imaging_benchmarks`	Benchmark	0.08	Yes	--
`imaging_anatomy`	Anatomy	0.06	No	--
`imaging_datasets`	Dataset	0.06	Yes	--
`imaging_report_templates`	ReportTemplate	0.05	Yes	--
`genomic_evidence`	Genomic	0.04	No	--

Weights sum to 1.00. Literature and Findings dominate because they contain the most clinically actionable evidence. The genomic collection has the lowest weight (0.04) because it provides cross-modal enrichment context rather than primary imaging evidence.

The weighted score formula is:

final_score = raw_cosine_similarity * collection_weight
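
As a rough illustration of the formula above, the following sketch applies collection weights to a handful of hits and sorts them. The Hit dataclass and rank_hits function are hypothetical stand-ins, and only three of the eleven weights are reproduced.

```python
from dataclasses import dataclass

# Subset of the weights from the COLLECTION_CONFIG table.
COLLECTION_WEIGHTS = {
    "imaging_literature": 0.18,
    "imaging_findings": 0.15,
    "genomic_evidence": 0.04,
}


@dataclass
class Hit:
    collection: str
    record_id: str
    raw_score: float            # similarity returned by the vector search
    weighted_score: float = 0.0


def rank_hits(hits):
    """Apply final_score = raw_score * collection_weight, then sort descending."""
    for hit in hits:
        hit.weighted_score = hit.raw_score * COLLECTION_WEIGHTS[hit.collection]
    return sorted(hits, key=lambda h: h.weighted_score, reverse=True)


hits = [
    Hit("genomic_evidence", "g1", 0.90),
    Hit("imaging_literature", "l1", 0.70),
    Hit("imaging_findings", "f1", 0.80),
]
ranked = rank_hits(hits)
for hit in ranked:
    print(hit.record_id, round(hit.weighted_score, 3))
```

Note how the genomic hit, despite the highest raw similarity (0.90), ranks last once its 0.04 weight is applied.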

1.4 Citation Scoring¶

Citation quality thresholds are configured in config/settings.py:

CITATION_HIGH_THRESHOLD: float = 0.75
CITATION_MEDIUM_THRESHOLD: float = 0.60

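Hits scoring 0.75 or higher are high-confidence citations, hits from 0.60 up to (but not including) 0.75 are medium-confidence, and hits below 0.60 are low-confidence. With the collection weights in section 1.3, a weighted score can never exceed 0.18, so these thresholds are only meaningful when compared against the raw similarity score.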
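
The tiering can be expressed as a small function. This is a sketch under the assumption that the score being classified is the raw similarity; citation_tier is a hypothetical name, not a function from the codebase.

```python
CITATION_HIGH_THRESHOLD = 0.75
CITATION_MEDIUM_THRESHOLD = 0.60


def citation_tier(score: float) -> str:
    """Map a similarity score to a citation confidence tier."""
    if score >= CITATION_HIGH_THRESHOLD:
        return "high"
    if score >= CITATION_MEDIUM_THRESHOLD:
        return "medium"
    return "low"


for s in (0.82, 0.75, 0.61, 0.40):
    print(s, citation_tier(s))
```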

1.5 System Prompt¶

The SYSTEM_PROMPT constant (lines 222-241) defines the LLM's persona across 11 knowledge domains:

CT Analysis
MRI Interpretation
Chest X-ray
Imaging AI Models (MONAI, VISTA-3D, nnU-Net, SwinUNETR, DenseNet)
Clinical Guidelines (ACR, Lung-RADS, BI-RADS, TI-RADS, LI-RADS)
Imaging Protocols
FDA-Cleared Devices (510(k), De Novo)
Radiology Reporting (RadLex, DICOM SR)
Public Datasets (RSNA, TCIA, NIH, LIDC-IDRI, BraTS, CheXpert, MIMIC-CXR)
Quantitative Imaging (RECIST, volume doubling time)
NVIDIA NIMs (VISTA-3D, MAISI, VILA-M3, Llama3)

The prompt enforces two critical rules: (a) always cite evidence from context, and (b) include a research-use-only disclaimer.

1.6 Prompt Construction (_build_prompt)¶

The _build_prompt() method assembles the LLM input as a two-message list:

[
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "<domain knowledge>\n<evidence>\n<conversation>\n<question>"},
]

The user content is built in sections:

- Domain Knowledge -- from the knowledge graph (pathology/modality/anatomy)
- Retrieved Evidence -- top 20 hits with [Label] (score: 0.XXX) text format
- Conversation Context -- prior exchanges for multi-turn sessions
- Question -- the raw user query
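
A minimal sketch of this assembly follows. It is not the actual _build_prompt() body: the "##" section headings, the placeholder SYSTEM_PROMPT text, and the (label, score, text) tuple shape for hits are all assumptions made for illustration.

```python
SYSTEM_PROMPT = "You are an imaging intelligence assistant."  # placeholder text


def build_prompt(question, knowledge="", hits=(), conversation="", max_hits=20):
    """Assemble a two-message chat prompt from the four content sections."""
    sections = []
    if knowledge:
        sections.append(f"## Domain Knowledge\n{knowledge}")
    if hits:
        # Each hit is a (label, score, text) tuple; keep only the top max_hits.
        lines = [f"[{label}] (score: {score:.3f}) {text}"
                 for label, score, text in hits[:max_hits]]
        sections.append("## Retrieved Evidence\n" + "\n".join(lines))
    if conversation:
        sections.append(f"## Conversation Context\n{conversation}")
    sections.append(f"## Question\n{question}")
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "\n\n".join(sections)},
    ]


messages = build_prompt(
    "What is the Lung-RADS category for a 7 mm solid nodule?",
    hits=[("Guideline", 0.8123, "Lung-RADS assessment categories ...")],
)
print(messages[1]["content"])
```

Empty sections are skipped, so a first-turn query with no knowledge-graph match produces only the evidence and question blocks.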

1.7 Query Entry Points¶

Two public methods serve as entry points:

query(question, conversation_context, **kwargs) -> str -- synchronous, full RAG pipeline
query_stream(question, conversation_context, **kwargs) -> Generator[str] -- streaming tokens

Both automatically detect comparative queries via _is_comparative() and route to the comparative pipeline when matched.

Chapter 2: Vector Search Internals¶

2.1 Index Type: IVF_FLAT¶

The Imaging Agent uses IVF_FLAT (an inverted-file index that stores vectors uncompressed, with no quantization), defined in src/collections.py:

INDEX_PARAMS = {
    "metric_type": "COSINE",
    "index_type": "IVF_FLAT",
    "params": {"nlist": 1024},
}

IVF_FLAT partitions vectors into 1024 Voronoi cells (clusters). At search time, only a subset of clusters is scanned, controlled by nprobe.

Why IVF_FLAT over HNSW? For the Imaging Agent's collection sizes (hundreds to low thousands of records per collection), IVF_FLAT offers lower memory overhead than HNSW's graph structure while maintaining exact distance computation within probed clusters. HNSW would be preferred at scale (100K+ vectors per collection).

2.2 Distance Metric: COSINE¶

SEARCH_PARAMS = {
    "metric_type": "COSINE",
    "params": {"nprobe": 16},
}

COSINE vs L2 vs IP comparison:

Metric	Formula	Range	Best For
COSINE	cos(a, b) = (a . b) / (||a|| ||b||)	[-1, 1]	Normalized embeddings (BGE default)
L2 (Euclidean)	||a - b||^2	[0, inf)	Unnormalized, magnitude matters
IP (Inner Product)	a . b	(-inf, inf)	Already-normalized (equivalent to COSINE)

COSINE is a sound choice because the BGE-small-en-v1.5 embeddings are L2-normalized (via normalize_embeddings=True). For normalized vectors, COSINE and IP produce identical scores and rankings. Milvus reports COSINE as a similarity (higher is closer) bounded to [-1, 1] regardless of vector magnitude, which makes threshold comparisons (e.g., SCORE_THRESHOLD = 0.40) easier to interpret.
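
The equivalence of COSINE and IP on normalized vectors can be checked with a few lines of plain Python. The vectors below are made-up three-dimensional examples rather than real BGE embeddings.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


query = normalize([1.0, 2.0, 3.0])
docs = [normalize(v) for v in ([3.0, 2.0, 1.0], [1.0, 2.0, 2.5], [-1.0, 0.0, 1.0])]

ip_scores = [inner_product(query, d) for d in docs]
cos_scores = [cosine_similarity(query, d) for d in docs]

# On unit vectors the two metrics agree up to floating-point error.
for ip, cos in zip(ip_scores, cos_scores):
    print(f"IP={ip:.6f}  COSINE={cos:.6f}")
```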

2.3 nprobe = 16¶

With nlist=1024 clusters and nprobe=16, each search scans 16/1024 = 1.56% of the index. This provides a good recall/latency tradeoff:

nprobe	Clusters Scanned	Recall (approx.)	Relative Latency
1	0.1%	~60%	1x
8	0.8%	~90%	4x
16	1.56%	~95%	8x
64	6.25%	~99%	32x
1024	100%	100% (exact)	512x

For collections under 10K vectors, nprobe=16 effectively achieves near-exact recall because many clusters contain fewer than top_k vectors.
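
The arithmetic behind these figures can be sketched as follows. Both helpers are hypothetical, and expected_candidates assumes clusters of equal size, which real IVF indexes only approximate; whether the scanned candidates are enough for near-exact recall still depends on how the data clusters.

```python
def scanned_fraction(nprobe: int, nlist: int = 1024) -> float:
    """Fraction of IVF clusters visited per search."""
    if not 1 <= nprobe <= nlist:
        raise ValueError("nprobe must be between 1 and nlist")
    return nprobe / nlist


def expected_candidates(n_vectors: int, nprobe: int, nlist: int = 1024) -> float:
    """Vectors compared per search, assuming equally sized clusters."""
    return n_vectors * scanned_fraction(nprobe, nlist)


for nprobe in (1, 8, 16, 64, 1024):
    print(f"nprobe={nprobe:<5} scans {scanned_fraction(nprobe):.2%} of clusters")

# A 5,000-vector collection averages about 5 vectors per cluster.
print(expected_candidates(5000, 16))
```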

2.4 BGE Embedding Model¶

EMBEDDING_MODEL: str = "BAAI/bge-small-en-v1.5"
EMBEDDING_DIMENSION: int = 384
EMBEDDING_BATCH_SIZE: int = 32

BGE-small-en-v1.5 characteristics:

- Parameters: 33M (MiniLM backbone)
- Dimensions: 384
- Max sequence length: 512 tokens
- MTEB score: 62.17 (top-tier for its size class)
- License: MIT

The _embed_query() method normalizes embeddings:

def _embed_query(self, text: str) -> List[float]:
    return self.embedder.encode(text, normalize_embeddings=True).tolist()

2.5 Collection Schemas¶

Each collection defines a FLOAT_VECTOR field with dim=384. The primary key is always a VARCHAR id field. Example from imaging_literature:

EMBEDDING_DIM = 384  # BGE-small-en-v1.5

LITERATURE_FIELDS = [
    FieldSchema(name="id", dtype=DataType.VARCHAR, is_primary=True, max_length=50),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM),
    FieldSchema(name="title", dtype=DataType.VARCHAR, max_length=500),
    FieldSchema(name="text_chunk", dtype=DataType.VARCHAR, max_length=3000),
    # ... additional metadata fields
]

Chapter 3: Adding a New Collection¶

This worked example adds an imaging_conferences collection for storing conference proceedings and presentations.

Step 1: Define the Pydantic Model¶

Add to src/models.py:

class ConferenceRecord(BaseModel):
    """Conference proceeding / presentation abstract."""
    id: str = Field(..., max_length=100)
    text_summary: str = Field(..., max_length=3000)
    title: str = Field("", max_length=500)
    conference_name: str = Field("", max_length=200)  # e.g., RSNA, MICCAI
    year: int = Field(..., ge=2000, le=2030)  # required; a default of 0 would bypass the ge=2000 constraint
    modality: ImagingModality = ImagingModality.CT
    body_region: BodyRegion = BodyRegion.CHEST
    presentation_type: str = Field("", max_length=50)  # poster, oral, keynote
    authors: str = Field("", max_length=500)

    def to_embedding_text(self) -> str:
        parts = [self.title, self.text_summary]
        if self.conference_name:
            parts.append(f"Conference: {self.conference_name}")
        if self.modality:
            parts.append(f"Modality: {self.modality.value}")
        return " ".join(parts)

Step 2: Define the Milvus Schema¶

Add to src/collections.py:

CONFERENCE_FIELDS = [
    FieldSchema(name="id", dtype=DataType.VARCHAR, is_primary=True, max_length=100),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM),
    FieldSchema(name="title", dtype=DataType.VARCHAR, max_length=500),
    FieldSchema(name="text_summary", dtype=DataType.VARCHAR, max_length=3000),
    FieldSchema(name="conference_name", dtype=DataType.VARCHAR, max_length=200),
    FieldSchema(name="year", dtype=DataType.INT64),
    FieldSchema(name="modality", dtype=DataType.VARCHAR, max_length=20),
    FieldSchema(name="body_region", dtype=DataType.VARCHAR, max_length=30),
    FieldSchema(name="presentation_type", dtype=DataType.VARCHAR, max_length=50),
    FieldSchema(name="authors", dtype=DataType.VARCHAR, max_length=500),
]

Step 3: Register the Schema¶

In the ImagingCollectionManager.create_all_collections() method, add:

self._create_collection("imaging_conferences", CONFERENCE_FIELDS)

Step 4: Add to COLLECTION_CONFIG¶

In src/rag_engine.py, add to COLLECTION_CONFIG:

"imaging_conferences": {
    "weight": 0.05,
    "label": "Conference",
    "has_modality": True,
    "year_field": "year",
},

Important: Adjust other weights so they still sum to approximately 1.0.
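
One way to keep the total at 1.0 is to scale the existing weights down proportionally. The sketch below is only one possible policy; add_collection_weight is a hypothetical helper, and hand-tuning individual weights is equally valid.

```python
CURRENT_WEIGHTS = {
    "imaging_literature": 0.18, "imaging_findings": 0.15, "imaging_trials": 0.12,
    "imaging_guidelines": 0.10, "imaging_protocols": 0.08, "imaging_devices": 0.08,
    "imaging_benchmarks": 0.08, "imaging_anatomy": 0.06, "imaging_datasets": 0.06,
    "imaging_report_templates": 0.05, "genomic_evidence": 0.04,
}


def add_collection_weight(weights: dict, name: str, new_weight: float) -> dict:
    """Insert a new weight and rescale the others so the total is 1.0."""
    if not 0.0 < new_weight < 1.0:
        raise ValueError("new_weight must be strictly between 0 and 1")
    scale = (1.0 - new_weight) / sum(weights.values())
    rescaled = {key: value * scale for key, value in weights.items()}
    rescaled[name] = new_weight
    return rescaled


updated = add_collection_weight(CURRENT_WEIGHTS, "imaging_conferences", 0.05)
print(round(sum(updated.values()), 6))
print(round(updated["imaging_literature"], 4))
```

Proportional scaling preserves the relative ordering of the existing collections, so Literature and Findings remain the dominant sources.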

Step 5: Add to conftest.py¶

Update IMAGING_COLLECTION_NAMES in tests/conftest.py:

IMAGING_COLLECTION_NAMES = [
    # ... existing collections ...
    "imaging_conferences",
]

Step 6: Create the Ingest Parser¶

Create src/ingest/conference_parser.py:

from src.ingest.base import BaseIngestPipeline
from src.models import ConferenceRecord

class ConferenceIngestPipeline(BaseIngestPipeline):
    COLLECTION_NAME = "imaging_conferences"

    def fetch(self, **kwargs):
        # Implement API call or file read
        ...

    def parse(self, raw_data):
        records = []
        for item in raw_data:
            records.append(ConferenceRecord(
                id=item["id"],
                text_summary=item["abstract"],
                title=item["title"],
                conference_name=item["conference"],
                year=item["year"],
            ))
        return records

Step 7: Add Query Expansion Terms¶

In src/query_expansion.py, add a new map or extend an existing one:

CONFERENCE_EXPANSION = {
    "rsna": {"rsna annual meeting", "radiological society", "rsna 2025"},
    "miccai": {"miccai conference", "medical image computing"},
    "isbi": {"isbi conference", "biomedical imaging symposium"},
}

Add to ALL_EXPANSION_MAPS.

Step 8: Add Filter Support¶

In ImagingRAGEngine.retrieve(), add imaging_conferences to _has_body_region:

_has_body_region = {
    # ... existing collections ...
    "imaging_conferences",
}

Step 9: Write Tests¶

Create tests/test_conference_ingest.py with at minimum:

- Model validation tests (valid/invalid records)
- Parse method tests with sample data
- Integration test with mock collection manager

Step 10: Update Settings¶

Add collection name to config/settings.py:

COLLECTION_CONFERENCES: str = "imaging_conferences"

Chapter 4: Building a Custom Ingest Pipeline¶

4.1 The BaseIngestPipeline Pattern¶

The abstract base class in src/ingest/base.py defines a three-step pipeline:

class BaseIngestPipeline(ABC):
    COLLECTION_NAME: str = ""

    def __init__(self, collection_manager, embedder):
        self.collection_manager = collection_manager
        self.embedder = embedder

    @abstractmethod
    def fetch(self, **kwargs) -> Any:
        """Retrieve raw data from external source."""
        ...

    @abstractmethod
    def parse(self, raw_data: Any) -> List[BaseModel]:
        """Convert raw data into validated Pydantic models."""
        ...

    def embed_and_store(self, records, collection_name=None, batch_size=32) -> int:
        """Embed text and insert into Milvus in batches."""
        ...

    def run(self, collection_name=None, batch_size=32, **fetch_kwargs) -> int:
        """Orchestrate fetch -> parse -> embed_and_store."""
        ...

4.2 The fetch -> parse -> embed_and_store Flow¶

fetch() -- retrieves raw data. Each parser implements this differently:

- literature_parser.py calls the PubMed E-utilities API
- clinical_trials_parser.py calls the ClinicalTrials.gov v2 API
- device_parser.py reads FDA device listing files
- anatomy_parser.py reads local JSONL anatomy reference files

parse() -- converts raw API/file data into validated Pydantic model instances. Each record must implement to_embedding_text() -> str, which produces the string that gets embedded.

embed_and_store() -- the base class handles this uniformly:

def embed_and_store(self, records, collection_name=None, batch_size=32) -> int:
    """Embed records in batches, insert them into Milvus, and return the row count."""
    # Fall back to the pipeline's own collection when none is given.
    coll_name = collection_name or self.COLLECTION_NAME
    total = 0
    for i in range(0, len(records), batch_size):
        batch = records[i : i + batch_size]
        texts = [r.to_embedding_text() for r in batch]
        embeddings = self.embedder.encode(texts, normalize_embeddings=True).tolist()

        insert_data = []
        for rec, emb in zip(batch, embeddings):
            row = rec.model_dump()
            for k, v in row.items():
                if hasattr(v, "value"):
                    row[k] = v.value  # Convert enums to strings
            row["embedding"] = emb
            insert_data.append(row)

        self.collection_manager.insert_batch(coll_name, insert_data, batch_size=batch_size)
        total += len(insert_data)
    return total
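
To see the whole fetch -> parse -> embed_and_store flow run end to end without Milvus or a real embedding model, the pattern can be reduced to the standard library. Everything in this sketch is a stand-in: ToyRecord, FakeEmbedder, InMemoryStore, ToyPipeline, and NotesPipeline are hypothetical names, and the two-number "embedding" is a placeholder for a real vector.

```python
from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass
from typing import Any, List


@dataclass
class ToyRecord:
    id: str
    text_summary: str

    def to_embedding_text(self) -> str:
        return self.text_summary


class FakeEmbedder:
    """Stand-in for the embedder: returns a 2-dim vector per text."""

    def encode(self, texts: List[str]) -> List[List[float]]:
        return [[float(len(t)), float(t.count(" "))] for t in texts]


class InMemoryStore:
    """Stand-in for the collection manager."""

    def __init__(self):
        self.rows = {}

    def insert_batch(self, collection: str, rows: List[dict]) -> None:
        self.rows.setdefault(collection, []).extend(rows)


class ToyPipeline(ABC):
    COLLECTION_NAME = ""

    def __init__(self, store, embedder):
        self.store, self.embedder = store, embedder

    @abstractmethod
    def fetch(self, **kwargs) -> Any: ...

    @abstractmethod
    def parse(self, raw_data: Any) -> List[ToyRecord]: ...

    def run(self, batch_size: int = 2, **fetch_kwargs) -> int:
        records = self.parse(self.fetch(**fetch_kwargs))
        for i in range(0, len(records), batch_size):
            batch = records[i : i + batch_size]
            vectors = self.embedder.encode([r.to_embedding_text() for r in batch])
            rows = [{**asdict(r), "embedding": v} for r, v in zip(batch, vectors)]
            self.store.insert_batch(self.COLLECTION_NAME, rows)
        return len(records)


class NotesPipeline(ToyPipeline):
    COLLECTION_NAME = "toy_notes"

    def fetch(self, **kwargs):
        return [{"id": "n1", "body": "lung nodule"},
                {"id": "n2", "body": "ms lesion"},
                {"id": "n3", "body": "pe"}]

    def parse(self, raw_data):
        return [ToyRecord(id=d["id"], text_summary=d["body"]) for d in raw_data]


store = InMemoryStore()
inserted = NotesPipeline(store, FakeEmbedder()).run()
print(inserted, len(store.rows["toy_notes"]))
```

Subclasses only supply fetch() and parse(); batching, embedding, and storage stay in the base class, which is the same division of labour the real BaseIngestPipeline uses.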

4.3 Batch Embedding¶

Embedding is done in batches of 32 (configurable via EMBEDDING_BATCH_SIZE). This balances GPU memory usage against throughput. For BGE-small-en-v1.5 on a DGX Spark:

Batch Size	Throughput (records/sec)	GPU Memory
1	~50	~200 MB
16	~400	~400 MB
32	~650	~600 MB
64	~800	~1.0 GB

4.4 Worked Example: FDA Device Parser¶

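import json
import logging
from pathlib import Path

from pydantic import ValidationError

from src.ingest.base import BaseIngestPipeline
from src.models import DeviceRegulatory, ImagingDevice

logger = logging.getLogger(__name__)

class DeviceIngestPipeline(BaseIngestPipeline):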
    COLLECTION_NAME = "imaging_devices"

    def fetch(self, data_dir=None, **kwargs):
        # Read FDA device JSON files from reference directory
        path = Path(data_dir or self.default_data_dir)
        devices = []
        for f in path.glob("*.json"):
            with open(f) as fh:
                devices.extend(json.load(fh))
        return devices

    def parse(self, raw_data):
        records = []
        for item in raw_data:
            try:
                records.append(ImagingDevice(
                    id=item["k_number"],
                    text_summary=item["device_description"],
                    device_name=item["device_name"],
                    manufacturer=item["applicant"],
                    regulatory_status=DeviceRegulatory.CLEARED_510K,
                    clearance_date=item.get("decision_date", ""),
                    modality=self._map_modality(item),
                    intended_use=item.get("intended_use", ""),
                ))
            except ValidationError as e:
                logger.warning(f"Skipping invalid device {item.get('k_number')}: {e}")
        return records

Chapter 5: Extending the Knowledge Graph¶

5.1 Three Domains¶

The knowledge graph in src/knowledge.py (1843 lines) contains three domain dictionaries:

Domain	Variable	Entries	Key Fields
Pathologies	`IMAGING_PATHOLOGIES`	~25	ICD-10, modalities, CT/MRI characteristics, severity criteria, AI models, guidelines
Modalities	`IMAGING_MODALITIES`	~9	Physics, protocols, strengths, limitations, AI applications, DICOM codes
Anatomy	`IMAGING_ANATOMY`	~21	Body region, systems, SNOMED codes, FMA IDs, VISTA-3D labels, common pathologies

5.2 Adding a New Pathology¶

To add a new pathology (e.g., pulmonary embolism):

IMAGING_PATHOLOGIES["pulmonary_embolism"] = {
    "icd10": "I26.99",
    "display_name": "Pulmonary Embolism",
    "modalities": ["ct"],
    "body_region": "chest",
    "subtypes": ["acute", "chronic", "saddle", "segmental", "subsegmental"],
    "ct_characteristics": (
        "Filling defect in pulmonary artery on CTPA. "
        "Acute: central low-attenuation defect with vessel expansion. "
        "Chronic: eccentric mural thrombus with vessel narrowing."
    ),
    "severity_criteria": {
        "critical": "Saddle PE, RV strain, hemodynamic instability",
        "urgent": "Lobar PE, RV/LV ratio > 1.0",
        "significant": "Segmental PE without RV strain",
        "routine": "Subsegmental PE, no hemodynamic compromise",
    },
    "ai_models": ["MONAI PE Detection", "AIDOC PE Triage"],
    "related_guidelines": ["Fleischner Society CTPA", "AHA PE Guidelines"],
    "genomic_links": ["Factor V Leiden", "Prothrombin G20210A"],
}

5.3 Adding a New Modality¶

IMAGING_MODALITIES["nuclear"] = {
    "display_name": "Nuclear Medicine",
    "physics": "Gamma ray detection from administered radiotracers",
    "dicom_code": "NM",
    "typical_studies": ["bone scan", "thyroid scan", "renal scan", "MIBG"],
    # ...
}

5.4 Adding a New Anatomy Entry¶

IMAGING_ANATOMY["aorta"] = {
    "display_name": "Aorta",
    "body_region": "chest",
    "system": "cardiovascular",
    "snomed_code": "15825003",
    "fma_id": "FMA3734",
    "vista3d_label": 52,
    "structures": ["ascending aorta", "aortic arch", "descending aorta"],
    "common_pathologies": ["aneurysm", "dissection", "coarctation"],
    # ...
}

5.5 Keyword Routing¶

The _get_knowledge_context() method in ImagingRAGEngine scans the query text against all three dictionaries:

def _get_knowledge_context(self, query: str) -> str:
    query_lower = query.lower()
    contexts = []

    for key in IMAGING_PATHOLOGIES:
        if key.replace("_", " ") in query_lower:
            contexts.append(get_pathology_context(key))
            break

    for key in IMAGING_MODALITIES:
        if key in query_lower:
            contexts.append(get_modality_context(key))
            break

    for key in IMAGING_ANATOMY:
        if key in query_lower:
            contexts.append(get_anatomy_context(key))
            break

    return "\n\n".join(contexts) if contexts else ""

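The routing uses simple substring matching. Only pathology keys get underscore-to-space normalization, so modality and anatomy keys must appear in the query verbatim. Because matching is by substring, a short key such as "ct" would also fire inside longer words like "detection". Each loop breaks after its first match, so at most one pathology, one modality, and one anatomy entry are injected per query. This prevents knowledge context from overwhelming the LLM prompt.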
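
The difference between substring matching and a stricter word-boundary match can be seen in a small comparison. This is a sketch of an alternative, not the engine's behaviour: first_match_word is a hypothetical variant, and MODALITY_KEYS is an illustrative key list rather than the real IMAGING_MODALITIES dictionary.

```python
import re
from typing import Iterable, Optional


def first_match_substring(query: str, keys: Iterable[str]) -> Optional[str]:
    """Substring matching in the style of _get_knowledge_context()."""
    query_lower = query.lower()
    for key in keys:
        if key.replace("_", " ") in query_lower:
            return key
    return None


def first_match_word(query: str, keys: Iterable[str]) -> Optional[str]:
    """Alternative: require the key to appear as a whole word or phrase."""
    query_lower = query.lower()
    for key in keys:
        pattern = r"\b" + re.escape(key.replace("_", " ")) + r"\b"
        if re.search(pattern, query_lower):
            return key
    return None


MODALITY_KEYS = ["ct", "mri", "pet"]  # illustrative keys only
query = "Best architecture for nodule detection on MRI"

print(first_match_substring(query, MODALITY_KEYS))
print(first_match_word(query, MODALITY_KEYS))
```

In this query the substring version picks "ct" (from "architecture"), while the word-boundary version skips it and finds "mri".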

5.6 Comparison Entity Resolution¶

The resolve_comparison_entity() and get_comparison_context() functions support the comparative analysis pipeline. They check all three dictionaries for entity matches and produce structured comparison tables.

Chapter 6: Query Expansion Engineering¶

6.1 Architecture¶

Query expansion in src/query_expansion.py (186 lines) uses 12 domain-specific expansion maps:

Map	Keys	Example Key -> Synonyms
`MODALITY_EXPANSION`	9	`"ct"` -> `{"computed tomography", "cat scan", "helical ct", ...}`
`BODY_REGION_EXPANSION`	12	`"chest"` -> `{"thorax", "thoracic", "lung", "pulmonary", ...}`
`PATHOLOGY_EXPANSION`	22	`"hemorrhage"` -> `{"bleeding", "hematoma", "ich", ...}`
`AI_TASK_EXPANSION`	8	`"segmentation"` -> `{"segment", "delineation", "contour", ...}`
`SEVERITY_EXPANSION`	3	`"critical"` -> `{"emergent", "stat", "life-threatening", ...}`
`FINDING_EXPANSION`	6	`"consolidation"` -> `{"airspace opacity", "air bronchograms", ...}`
`GUIDELINE_EXPANSION`	9	`"lung_rads"` -> `{"lung-rads", "ldct screening", ...}`
`DEVICE_EXPANSION`	5	`"510k"` -> `{"510(k)", "premarket notification", ...}`
`DATASET_EXPANSION`	7	`"lidc"` -> `{"lidc-idri", "lung image database", ...}`
`MODEL_ARCHITECTURE_EXPANSION`	11	`"vista3d"` -> `{"vista-3d", "versatile imaging segmentation", ...}`
`MEASUREMENT_EXPANSION`	5	`"hounsfield"` -> `{"hu", "density", "attenuation", ...}`
`CONTRAST_EXPANSION`	4	`"gadolinium"` -> `{"gad", "gd", "gbca", "mri contrast", ...}`

6.2 The expand_query() Function¶

def expand_query(query: str) -> Set[str]:
    query_lower = query.lower()
    expanded = set()

    for expansion_map in ALL_EXPANSION_MAPS:
        for key, synonyms in expansion_map.items():
            if key.replace("_", " ") in query_lower or key in query_lower:
                expanded.update(synonyms)

    return expanded

The function returns a set of expansion terms. The RAG engine appends up to 10 terms to the original query:

expanded_terms = expand_query(query)
search_text = query
if expanded_terms:
    search_text = f"{query} {' '.join(list(expanded_terms)[:10])}"
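
Because a Python set has no guaranteed iteration order (string hashing is randomized per interpreter run by default), list(expanded_terms)[:10] may select different terms from one process to the next whenever more than ten terms match. The sketch below shows a deterministic variant that sorts before truncating. This is a suggested alternative rather than the current implementation; TOY_MAPS and build_search_text are hypothetical, and expand_query is re-declared with an extra maps argument so the example is self-contained.

```python
from typing import Dict, List, Set

TOY_MAPS: List[Dict[str, Set[str]]] = [
    {"ct": {"computed tomography", "cat scan"}},
    {"lung_rads": {"lung-rads", "ldct screening"}},
]


def expand_query(query: str, maps: List[Dict[str, Set[str]]]) -> Set[str]:
    """Collect synonyms for every map key found in the query."""
    query_lower = query.lower()
    expanded: Set[str] = set()
    for expansion_map in maps:
        for key, synonyms in expansion_map.items():
            if key.replace("_", " ") in query_lower or key in query_lower:
                expanded.update(synonyms)
    return expanded


def build_search_text(query: str, maps, max_terms: int = 10) -> str:
    """Append up to max_terms expansion terms in a stable (sorted) order."""
    terms = sorted(expand_query(query, maps))[:max_terms]
    return f"{query} {' '.join(terms)}" if terms else query


print(build_search_text("lung rads follow-up on ct", TOY_MAPS, max_terms=2))
```

Alphabetical order is arbitrary; ranking terms by some relevance signal would be another option, but any stable order makes the embedded text, and therefore the search results, reproducible.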

6.3 Adding New Expansion Maps¶

To add a new expansion domain:

Define the map as a module-level dict:

PROTOCOL_EXPANSION = {
    "low_dose": {"low dose ct", "ldct", "dose reduction", "low radiation"},
    "high_resolution": {"hrct", "thin slice", "1mm", "high resolution ct"},
}

Add it to ALL_EXPANSION_MAPS:

ALL_EXPANSION_MAPS = [
    # ... existing maps ...
    PROTOCOL_EXPANSION,
]

6.4 Synonym Expansion vs Modality/Region Context¶

Query expansion and knowledge context serve different purposes:

Query expansion modifies the embedding input to improve vector recall (same concept, different terminology)
Knowledge context injects structured domain knowledge into the LLM prompt for synthesis accuracy

Both fire independently during retrieve().

Chapter 7: The Comparative Analysis System¶

7.1 Detection: _COMPARATIVE_RE¶

The regex pattern detects comparative intent:

_COMPARATIVE_RE = re.compile(
    r"\b(compare|compared to|vs\.?|versus|difference between"
    r"|head.to.head|better than|advantages|disadvantages)\b",
    re.IGNORECASE,
)

Any query matching this pattern gets routed to the comparative pipeline instead of standard retrieval.
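
The pattern can be exercised directly. The regex below is copied from above; is_comparative is a hypothetical free-function wrapper standing in for the engine's _is_comparative() method.

```python
import re

_COMPARATIVE_RE = re.compile(
    r"\b(compare|compared to|vs\.?|versus|difference between"
    r"|head.to.head|better than|advantages|disadvantages)\b",
    re.IGNORECASE,
)


def is_comparative(question: str) -> bool:
    """Return True when the question contains a comparative keyword."""
    return _COMPARATIVE_RE.search(question) is not None


for q in (
    "CT vs MRI for acute stroke",
    "Compare DLIR and iterative reconstruction",
    "What are the Lung-RADS categories?",
    "Differences between CT and MRI",
):
    print(f"{is_comparative(q)!s:<5} {q}")
```

One behaviour worth knowing: the plural "differences between" does not match, because the alternative is the literal phrase "difference between". Such a query is only routed to the comparative pipeline if it contains another keyword.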

7.2 Entity Parsing: _ENTITY_PATTERNS¶

Eight regex patterns extract the two entities from a comparative query:

Pattern	Example Match
`(.+?) vs\.? (.+)`	"CT vs MRI"
`(.+?) versus (.+)`	"CT versus MRI"
`compare (.+?) (?:and|with|to) (.+)`	"compare CT and MRI"
`(.+?) compared to (.+)`	"CT compared to MRI"
`difference(?:s)? between (.+?) and (.+)`	"difference between CT and MRI"
`(.+?) head[\s-]to[\s-]head (.+)`	"CT head-to-head MRI"
`advantages of (.+?) over (.+)`	"advantages of CT over MRI"
`(.+?) better than (.+)`	"CT better than MRI"

Patterns are tried in order; the first match wins. A fallback splits on separator keywords if no regex matches.
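
The first-match-wins loop can be sketched as below, using the eight patterns from the table. Several details are assumptions: the re.IGNORECASE flag, the use of search() rather than match(), the trailing-punctuation cleanup, and returning None where the real code applies its keyword-splitting fallback. parse_comparison_entities is a hypothetical free function.

```python
import re
from typing import Optional, Tuple

_ENTITY_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (
    r"(.+?) vs\.? (.+)",
    r"(.+?) versus (.+)",
    r"compare (.+?) (?:and|with|to) (.+)",
    r"(.+?) compared to (.+)",
    r"difference(?:s)? between (.+?) and (.+)",
    r"(.+?) head[\s-]to[\s-]head (.+)",
    r"advantages of (.+?) over (.+)",
    r"(.+?) better than (.+)",
)]


def parse_comparison_entities(question: str) -> Optional[Tuple[str, str]]:
    """Return the two compared entities, or None when no pattern matches."""
    text = question.strip().rstrip("?.!")
    for pattern in _ENTITY_PATTERNS:  # order matters: first match wins
        match = pattern.search(text)
        if match:
            return match.group(1).strip(), match.group(2).strip()
    return None


print(parse_comparison_entities("CT vs. MRI"))
print(parse_comparison_entities("compare CT and MRI?"))
print(parse_comparison_entities("advantages of photon counting over dual energy"))
```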

7.3 Entity Resolution: IMAGING_ENTITY_MAP¶

The IMAGING_ENTITY_MAP contains 20+ entity definitions across five categories:

Category	Entities
Modalities	ct, mri, pet, ultrasound, xray, mammography
Architectures	cnn, transformer, unet
Techniques	dlir, iterative
Tasks	detection, segmentation, classification
Technologies	photon_counting, dual_energy, conventional_ct

Resolution uses a 5-step cascade:

1. Exact key match in IMAGING_ENTITY_MAP
2. Exact alias match (case-insensitive)
3. Fuzzy substring match in aliases
4. Fall back to knowledge.py entity resolution
5. Return raw text as unknown type
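
A minimal sketch of such a cascade follows. The two-entry map, the field names other than full_name, the kg_lookup callback standing in for the knowledge.py fallback, and the exact meaning of "fuzzy substring" (containment in either direction) are all assumptions made for illustration.

```python
from typing import Callable, Optional

IMAGING_ENTITY_MAP = {  # two illustrative entries only
    "ct": {"full_name": "Computed Tomography", "type": "modality",
           "aliases": ["computed tomography", "cat scan"]},
    "unet": {"full_name": "U-Net", "type": "architecture",
             "aliases": ["u-net", "3d u-net"]},
}


def resolve_entity(raw: str,
                   kg_lookup: Optional[Callable[[str], Optional[dict]]] = None) -> dict:
    text = raw.strip().lower()
    # 1. Exact key match.
    if text in IMAGING_ENTITY_MAP:
        return {"key": text, **IMAGING_ENTITY_MAP[text]}
    # 2. Exact alias match (case-insensitive).
    for key, entry in IMAGING_ENTITY_MAP.items():
        if text in (alias.lower() for alias in entry["aliases"]):
            return {"key": key, **entry}
    # 3. Fuzzy substring match: alias inside the text, or text inside an alias.
    if text:
        for key, entry in IMAGING_ENTITY_MAP.items():
            if any(a.lower() in text or text in a.lower() for a in entry["aliases"]):
                return {"key": key, **entry}
    # 4. Fall back to a knowledge-graph lookup, if one is supplied.
    if kg_lookup is not None:
        found = kg_lookup(text)
        if found is not None:
            return found
    # 5. Give up and return the raw text as an unknown entity.
    return {"key": text, "full_name": raw.strip(), "type": "unknown", "aliases": []}


print(resolve_entity("CT")["full_name"])
print(resolve_entity("a 3D U-Net model")["key"])
print(resolve_entity("elastography")["type"])
```

Ordering the steps from strictest to loosest means a precise match is never shadowed by a fuzzy one.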

7.4 retrieve_comparative() Pipeline¶

def retrieve_comparative(self, question, **kwargs) -> ComparativeResult:
    # 1. Parse entities
    entity_a_str, entity_b_str = self._parse_comparison_entities(question)

    # 2. Resolve entities against IMAGING_ENTITY_MAP
    entity_a_resolved = _resolve_imaging_entity(entity_a_str)
    entity_b_resolved = _resolve_imaging_entity(entity_b_str)

    # 3. Dual retrieval -- augment queries with full entity names
    search_a = f"{question} {entity_a_resolved.get('full_name', entity_a_str)}"
    search_b = f"{question} {entity_b_resolved.get('full_name', entity_b_str)}"
    evidence_a = self.retrieve(search_a, **kwargs)
    evidence_b = self.retrieve(search_b, **kwargs)

    # 4. Find shared evidence (same record_id in both result sets)
    shared = self._find_shared_evidence(evidence_a.hits, evidence_b.hits)

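    # 5. Build knowledge-graph comparison context
    #    (kg_a and kg_b are presumably the knowledge-graph entries for each
    #    entity; their lookup is not shown in this excerpt)
    comparison_context = get_comparison_context(kg_a, kg_b)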

    return ComparativeResult(...)

7.5 Dual-Query Evidence¶

The key insight is that each entity gets its own retrieval pass. The query for entity A is the original question augmented with entity A's full name, and similarly for entity B. This biases each search toward evidence relevant to that specific entity.

Shared evidence (records appearing in both result sets) is surfaced separately, as it often contains direct comparison studies.
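
Finding the overlap is a set intersection on record IDs. The sketch below is an assumption about how _find_shared_evidence might work, with hits modelled as plain dicts carrying a record_id key.

```python
def find_shared_evidence(hits_a, hits_b):
    """Return hits from hits_a whose record_id also appears in hits_b."""
    ids_b = {hit["record_id"] for hit in hits_b}
    seen = set()
    shared = []
    for hit in hits_a:
        record_id = hit["record_id"]
        # Keep the order of hits_a and skip duplicate IDs.
        if record_id in ids_b and record_id not in seen:
            seen.add(record_id)
            shared.append(hit)
    return shared


hits_ct = [{"record_id": "pmid-1", "score": 0.12},
           {"record_id": "pmid-2", "score": 0.10}]
hits_mri = [{"record_id": "pmid-2", "score": 0.11},
            {"record_id": "pmid-3", "score": 0.09}]

print(find_shared_evidence(hits_ct, hits_mri))
```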

7.6 Comparative System Prompt¶

The COMPARATIVE_SYSTEM_PROMPT instructs the LLM to produce a structured 7-section comparison:

Technical Specifications (table)
Clinical Performance (sensitivity, specificity, AUC)
Radiation Dose / Safety
AI Integration Capabilities
Clinical Guidelines & Recommendations
Cost-Effectiveness Considerations
Summary Recommendation

Chapter 8: Clinical Workflow Development¶

8.1 Base Class Architecture¶

src/workflows/base.py (83 lines) defines BaseImagingWorkflow:

class BaseImagingWorkflow(ABC):
    WORKFLOW_NAME: str = "base"
    TARGET_LATENCY_SEC: float = 60.0
    MODALITY: str = ""
    BODY_REGION: str = ""
    MODELS_USED: List[str] = []

    def __init__(self, mock_mode=True, nim_clients=None, mock_overrides=None):
        ...

    @abstractmethod
    def preprocess(self, input_path: str) -> Any: ...

    @abstractmethod
    def infer(self, preprocessed: Any) -> Dict: ...

    @abstractmethod
    def postprocess(self, inference_result: Dict) -> WorkflowResult: ...

    @abstractmethod
    def _mock_inference(self) -> Dict: ...

    def run(self, input_path: str = "") -> WorkflowResult: ...

8.2 The preprocess -> infer -> postprocess Pattern¶

preprocess() -- loads and prepares input data:

- CT workflows: load DICOM/NIfTI, apply windowing, resample to isotropic spacing
- CXR workflow: load DICOM, resize to model input dimensions, normalize pixel values
- MRI workflows: load multi-sequence data, apply bias field correction, register sequences

infer() -- runs model inference:

- Calls NIM clients (VISTA-3D, MAISI, VILA-M3) or MONAI models
- Returns raw inference dict with predictions, probabilities, segmentation masks

postprocess() -- extracts clinical meaning:

- Converts raw model outputs to findings, measurements, classifications
- Applies clinical scoring systems (Lung-RADS, CAD-RADS, PI-RADS)
- Determines severity level
- Returns a structured WorkflowResult

8.3 The run() Orchestrator¶

def run(self, input_path: str = "") -> WorkflowResult:
    start = time.time()
    try:
        if self.mock_mode:
            raw = self._mock_inference()
            if self.mock_overrides:
                raw.update(self.mock_overrides)
        else:
            preprocessed = self.preprocess(input_path)
            raw = self.infer(preprocessed)

        result = self.postprocess(raw)
        result.inference_time_ms = (time.time() - start) * 1000
        result.is_mock = self.mock_mode
        result.workflow_name = self.WORKFLOW_NAME
        return result
    except Exception as e:
        return WorkflowResult(
            workflow_name=self.WORKFLOW_NAME,
            status=WorkflowStatus.FAILED,
            inference_time_ms=(time.time() - start) * 1000,
            is_mock=self.mock_mode,
        )

The mock_overrides parameter allows demo cases to inject specific findings (e.g., force Lung-RADS 4B classification for a demo scenario).
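
The override step is a plain shallow dictionary update. The sketch below isolates that behavior; the helper name `apply_mock_overrides` and the example keys are hypothetical, and unlike run() it returns a copy rather than mutating the mock dict in place.

```python
from typing import Any, Dict, Optional


def apply_mock_overrides(raw: Dict[str, Any],
                         overrides: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of the mock inference dict with top-level keys overridden."""
    merged = dict(raw)
    if overrides:
        # Shallow update: a nested dict or list in `overrides` replaces the
        # original value wholesale instead of being merged into it.
        merged.update(overrides)
    return merged
```

Because the update is shallow, a demo case that wants to change one field of a nested structure has to supply the whole nested value.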

8.4 Six Implemented Workflows¶

Workflow	File	Lines	Modality	Scoring System	Models
CT Head Hemorrhage	`ct_head_hemorrhage.py`	488	CT	Volume/midline shift	3D U-Net, VISTA-3D
CT Lung Nodule	`ct_chest_lung_nodule.py`	598	CT	Lung-RADS v2022	DenseNet, VISTA-3D
CT Coronary Angiography	`ct_coronary_angiography.py`	357	CT	CAD-RADS 2.0	SegResNet
CXR Rapid Findings	`cxr_rapid_findings.py`	658	CXR	Multi-finding classification	DenseNet-121
MRI Brain MS Lesion	`mri_brain_ms_lesion.py`	645	MRI	Lesion count/volume	nnU-Net, FLAIR
MRI Prostate PI-RADS	`mri_prostate_pirads.py`	280	MRI	PI-RADS v2.1	SegResNet

8.5 Adding a New Workflow¶

To add a PET/CT lymphoma staging workflow:

# src/workflows/pet_ct_lymphoma_staging.py

from src.workflows.base import BaseImagingWorkflow
from src.models import FindingSeverity, WorkflowResult, WorkflowStatus

class PETCTLymphomaWorkflow(BaseImagingWorkflow):
    WORKFLOW_NAME = "pet_ct_lymphoma_staging"
    TARGET_LATENCY_SEC = 120.0
    MODALITY = "pet_ct"
    BODY_REGION = "whole_body"
    MODELS_USED = ["VISTA-3D", "SUV quantification"]

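    def preprocess(self, input_path: str):
        # Load PET/CT DICOM series
        # Register PET to CT
        # Extract SUV maps
        pet_vol = ct_vol = None  # placeholders; replace with the loaded volumes
        return {"pet_volume": pet_vol, "ct_volume": ct_vol}

    def infer(self, preprocessed):
        # Run organ segmentation via VISTA-3D
        # Identify FDG-avid regions
        # Calculate SUVmax per region
        return {"regions": [...], "suv_values": {...}}

    def postprocess(self, inference_result) -> WorkflowResult:
        # Apply Deauville scoring (1-5)
        # Map to Lugano classification
        # Determine staging
        regions = inference_result["regions"]
        max_suv = max(r["suv_max"] for r in regions)
        # The mock path supplies the score directly; a real infer() must also
        # provide it, or the score must be derived here from uptake relative
        # to the mediastinum and liver.
        score = inference_result["deauville_score"]
        findings = []  # populate with the finding objects WorkflowResult expects
        return WorkflowResult(
            workflow_name=self.WORKFLOW_NAME,
            status=WorkflowStatus.COMPLETED,
            classification=f"Deauville {score}",
            # _deauville_to_severity is a helper you add to this class
            severity=self._deauville_to_severity(score),
            findings=findings,
            measurements={"suv_max": max_suv},
        )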

    def _mock_inference(self):
        return {
            "regions": [{"name": "mediastinal", "suv_max": 8.2}],
            "deauville_score": 4,
        }

Register it in src/workflows/__init__.py:

from .pet_ct_lymphoma_staging import PETCTLymphomaWorkflow

WORKFLOW_REGISTRY["pet_ct_lymphoma_staging"] = PETCTLymphomaWorkflow

Add routing rules in api/routes/events.py:

WORKFLOW_ROUTING[("PT", "whole_body")] = "pet_ct_lymphoma_staging"

Chapter 9: NIM Integration Deep Dive¶

9.1 BaseNIMClient Architecture¶

src/nim/base.py (83 lines) provides the foundation for all NIM clients:

class BaseNIMClient(ABC):
    def __init__(self, base_url, service_name, mock_enabled=True):
        self.base_url = base_url.rstrip("/")
        self.service_name = service_name
        self.mock_enabled = mock_enabled
        self._available = None
        self._last_check = 0
        self._check_interval = 30.0  # Cached health check interval

Key methods:

health_check() -- HTTP GET to /v1/health/ready
is_available() -- cached health check (30-second TTL)
_request() -- HTTP POST with tenacity retry (3 attempts, exponential backoff)
_invoke_or_mock() -- try real NIM, fall back to mock if unavailable
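
The cached availability check can be sketched in isolation. This is an illustrative stand-in rather than the BaseNIMClient code: the class name `CachedAvailability`, the injectable `probe` callable, and the `clock` parameter are assumptions added so the example runs without a network.

```python
import time
from typing import Callable, Optional


class CachedAvailability:
    """Caches the result of a health probe for a fixed number of seconds."""

    def __init__(self, probe: Callable[[], bool], ttl_sec: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._probe = probe          # e.g. a function doing the HTTP GET
        self._ttl_sec = ttl_sec
        self._clock = clock          # injectable so tests can control time
        self._available: Optional[bool] = None
        self._last_check = 0.0

    def is_available(self) -> bool:
        now = self._clock()
        expired = (now - self._last_check) >= self._ttl_sec
        if self._available is None or expired:
            try:
                self._available = bool(self._probe())
            except Exception:
                # Treat any probe failure as "service unavailable".
                self._available = False
            self._last_check = now
        return self._available
```

Caching keeps a burst of requests from issuing one health probe each; the trade-off is that a service that goes down may still be reported as available for up to one TTL.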

9.2 The Fallback Chain¶

Each NIM client implements a multi-tier fallback:

LLM Client (llm_client.py, 439 lines):

Local NIM (Llama-3-70B) -> NVIDIA Cloud NIM (Llama-3.1-8B) -> Anthropic Claude -> Mock

VILA-M3 Client (vilam3_client.py, 537 lines):

Local VILA-M3 NIM -> NVIDIA Cloud NIM (Llama-3.2-11B-Vision) -> Mock

VISTA-3D Client (vista3d_client.py, 215 lines):

Local VISTA-3D NIM -> Mock (realistic segmentation results)

MAISI Client (maisi_client.py, 136 lines):

Local MAISI NIM -> Mock (synthetic CT metadata)
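
A tiered fallback of this kind reduces to trying an ordered list of backends until one succeeds. The sketch below shows that pattern only; `invoke_with_fallback` and the `(name, callable)` backend shape are hypothetical and do not mirror the real client classes.

```python
from typing import Callable, List, Optional, Tuple

Backend = Tuple[str, Callable[[str], str]]


def invoke_with_fallback(prompt: str, backends: List[Backend]) -> Tuple[str, str]:
    """Try each backend in order; return (backend_name, response) from the first success."""
    last_error: Optional[Exception] = None
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
    raise RuntimeError("all backends failed") from last_error
```

Returning the backend name alongside the response lets callers record which tier actually answered, which matters when a mock response must be flagged as such.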

9.3 Client Architecture Details¶

LlamaLLMClient uses the OpenAI Python client because NIM exposes an OpenAI-compatible /v1/chat/completions endpoint:

class LlamaLLMClient(BaseNIMClient):
    def __init__(self, base_url, mock_enabled=True,
                 anthropic_api_key=None, nvidia_api_key=None,
                 cloud_url="https://integrate.api.nvidia.com/v1",
                 cloud_llm_model="meta/llama-3.1-8b-instruct",
                 local_llm_model="meta/llama3-70b-instruct"):

The health check for the LLM client uses /v1/models instead of /v1/health/ready because the OpenAI-compatible API exposes model listing.

VILAM3Client supports multimodal inputs by base64-encoding images and sending them in the OpenAI vision message format:

def _encode_image(self, image_path: str) -> str:
    path = Path(image_path)
    return base64.b64encode(path.read_bytes()).decode("utf-8")
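
For reference, an OpenAI-style vision message carries the encoded image as a data URL inside an `image_url` content part. The sketch below builds such a message; the function name `build_vision_messages` and the default `image/png` MIME type are assumptions, and the actual VILAM3Client may assemble its payload differently.

```python
import base64
from pathlib import Path
from typing import Any, Dict, List


def build_vision_messages(image_path: str, question: str,
                          mime_type: str = "image/png") -> List[Dict[str, Any]]:
    """Build a chat `messages` list with one text part and one inline image part."""
    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("utf-8")
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {
                "type": "image_url",
                # The image travels inline as a base64 data URL.
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }]
```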

9.4 Mock Mode¶

Mock mode is critical for development, testing, and demos without GPU access. Each client implements _mock_response() returning clinically realistic synthetic data:

VISTA-3D mock: 3 segmented organs with realistic volumes
MAISI mock: 512x512x512 volume metadata, 127 annotated classes
VILA-M3 mock: structured radiology report text
LLM mock: generic clinical assessment text

The mock system is controlled by:

NIM_ALLOW_MOCK_FALLBACK: bool = True in settings
Per-client mock_enabled constructor parameter
NIM_MODE: str = "local" (values: "local", "cloud", "mock")

9.5 NIMServiceManager¶

The NIMServiceManager aggregates all four NIM clients and provides:

check_all_services() -- returns status dict for all services
get_available_services() -- list of services with live endpoints
get_client(name) -- retrieve a specific client by name

Chapter 10: Testing Strategies¶

10.1 Test Suite Overview¶

The test suite consists of 12 modules with 620 tests, all passing:

Module	Tests	Focus
`test_models.py`	~80	Pydantic model validation, enum coverage
`test_export.py`	~60	Markdown, JSON, PDF, FHIR R4 output
`test_rag_engine.py`	~75	Retrieve, query, comparative, scoring
`test_workflows.py`	~65	All 6 workflows, mock inference, postprocess
`test_agent.py`	~55	Agent orchestrator, search planning
`test_knowledge.py`	~50	Knowledge graph lookups, comparison context
`test_nim_clients.py`	~60	All 4 NIM clients, fallback chains
`test_dicom_ingestion.py`	~40	DICOM watcher, Orthanc events
`test_cross_modal.py`	~45	Trigger evaluation, genomic queries
`test_query_expansion.py`	~35	All 12 expansion maps
`test_preview_generator.py`	~30	Slice animation, windowing
`test_preview_api.py`	~25	Preview API endpoints

10.2 The Mock-Everything Approach¶

Every external dependency is mocked in tests/conftest.py. The test suite runs without:

Milvus database
NIM GPU services
Network access (PubMed, ClinicalTrials.gov)
Orthanc DICOM server
Anthropic/NVIDIA API keys

10.3 Core Fixtures¶

conftest.py provides 10+ fixtures:

mock_embedder -- returns 384-dim numpy vectors (random):

@pytest.fixture
def mock_embedder():
    embedder = MagicMock()

    def _encode(texts, normalize_embeddings=True, **kwargs):
        if isinstance(texts, str):
            return np.random.randn(384).astype(np.float32)
        return np.random.randn(len(texts), 384).astype(np.float32)

    embedder.encode = MagicMock(side_effect=_encode)
    return embedder

mock_llm_client -- returns template clinical text:

@pytest.fixture
def mock_llm_client():
    client = MagicMock()
    client.generate.return_value = (
        "Based on the available imaging evidence, the findings are consistent "
        "with normal anatomy without acute pathology."
    )
    client.generate_stream.return_value = iter([
        "Based on ", "the available ", "imaging evidence, ", ...
    ])
    return client

mock_collection_manager -- returns 3 sample search results per collection:

@pytest.fixture
def mock_collection_manager():
    manager = MagicMock()
    manager.search.return_value = [
        {"id": "lit-001", "score": 0.85, "text_chunk": "AI-based hemorrhage detection..."},
        {"id": "lit-002", "score": 0.78, "text_chunk": "VISTA-3D segments 127 structures..."},
        {"id": "lit-003", "score": 0.72, "text_chunk": "Lung-RADS v2022 classifies..."},
    ]
    manager.get_collection_stats.return_value = {name: 100 for name in IMAGING_COLLECTION_NAMES}
    return manager

mock_nim_services -- all 4 NIM clients with mock responses (VISTA-3D segmentation, MAISI generation, VILA-M3 image analysis, LLM generation).

sample_search_hits -- 5 SearchHit objects spanning literature, trials, findings, guidelines, and benchmarks.

sample_evidence -- a CrossCollectionResult with 5 hits and knowledge context.

sample_workflow_result -- a WorkflowResult from CT head hemorrhage (urgent, 12.5 mL volume).

sample_agent_response -- complete AgentResponse for export testing.

10.4 Testing Patterns¶

Workflow tests verify postprocess logic by providing mock inference results and checking classifications:

def test_lung_nodule_4a(mock_lung_nodule_workflow):
    result = mock_lung_nodule_workflow.run()
    assert "4A" in result.classification or "Lung-RADS" in result.classification
    assert result.status == WorkflowStatus.COMPLETED

RAG engine tests verify weighted scoring:

def test_retrieve_weighted_scoring(rag_engine):
    result = rag_engine.retrieve("hemorrhage detection CT")
    # Literature weight (0.18) > Findings weight (0.15)
    # So literature hits should rank higher for equal raw scores
    assert result.hits[0].collection == "imaging_literature"
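
To make the ranking claim in that test concrete, here is a sketch of weighted ranking. It assumes the final score is the raw similarity multiplied by the collection weight; the guide does not state the engine's exact formula, so treat `rank_hits` and the multiplication as illustrative.

```python
from typing import Dict, List

# Default weights for two collections, as listed in the configuration reference.
COLLECTION_WEIGHTS: Dict[str, float] = {
    "imaging_literature": 0.18,
    "imaging_findings": 0.15,
}


def rank_hits(hits: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Sort hits by raw score times collection weight (assumed formula)."""
    def weighted(hit: dict) -> float:
        # Unknown collections get weight 0.0 and sink to the bottom.
        return hit["score"] * weights.get(hit["collection"], 0.0)

    return sorted(hits, key=weighted, reverse=True)
```

With equal raw scores of 0.80, the literature hit scores 0.144 and the findings hit 0.120, so the literature hit ranks first.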

Cross-modal tests verify trigger thresholds:

def test_lung_rads_4b_triggers(trigger, mock_embedder):
    result = WorkflowResult(
        workflow_name="ct_chest_lung_nodule",
        classification="Lung-RADS 4B",
        severity=FindingSeverity.URGENT,
    )
    cross_modal = trigger.evaluate(result)
    assert cross_modal is not None
    assert "Lung-RADS 4B" in cross_modal.trigger_reason

Chapter 11: Export System Deep Dive¶

11.1 Four Export Formats¶

src/export.py (691 lines) provides four export functions:

Function	Output	Use Case
`export_markdown()`	String	Chat display, clipboard, documentation
`export_json()`	String	API responses, downstream integration
`export_pdf()`	File path	Clinical reports, presentations
`export_fhir_r4()`	Dict	EHR integration, interoperability

11.2 Markdown Export¶

def export_markdown(response: AgentResponse) -> str:

Produces a structured Markdown report with sections:

Header with query and timestamp
Analysis (the LLM-synthesized answer)
Evidence table (grouped by collection, top 5 per collection)
Workflow Results (findings, measurements, classification)
NIM Services Used
Research-use-only disclaimer

11.3 PDF Export with ReportLab¶

def export_pdf(response: AgentResponse, output_path: str) -> str:

Generates NVIDIA-branded PDF reports using ReportLab:

Header: NVIDIA green (RGB 118, 185, 0) bar with title
Page setup: Letter size, 40pt margins
Sections: Query, Analysis (markdown-to-PDF conversion), Evidence table, Workflow results
Severity colors: Critical=red, Urgent=orange, Significant=amber, Routine=green, Normal=gray

The _clean_markdown() helper strips Markdown formatting for PDF paragraph rendering, converting **bold** to <b>bold</b> tags compatible with ReportLab's Paragraph.
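
A conversion of this kind can be done with two regular expressions. The sketch below is not the project's _clean_markdown(); the name `clean_markdown_for_pdf` is hypothetical, and stripping heading markers is an assumption about what "strips Markdown formatting" covers.

```python
import re


def clean_markdown_for_pdf(text: str) -> str:
    """Convert **bold** to <b>bold</b> and drop leading '#' heading markers."""
    # Non-greedy match so two bold spans on one line are converted separately.
    text = re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", text)
    # Assumed behavior: remove Markdown heading prefixes at the start of a line.
    text = re.sub(r"^#{1,6}[ \t]*", "", text, flags=re.MULTILINE)
    return text
```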

11.4 FHIR R4 Export¶

The FHIR R4 export produces a standards-compliant DiagnosticReport resource with embedded Observation resources:

Code systems used:

System	URI	Purpose
LOINC	`http://loinc.org`	Observation codes
SNOMED CT	`http://snomed.info/sct`	Finding codes
DICOM	`http://dicom.nema.org/resources/ontology/DCM`	Modality codes
HL7 Interpretation	`http://terminology.hl7.org/CodeSystem/v3-ObservationInterpretation`	Severity mapping

SNOMED finding codes:

Finding	SNOMED Code
hemorrhage	50960005
nodule	416940007
consolidation	95436008
pneumothorax	36118008
effusion	60046008
fracture	125605004
cardiomegaly	8186001
mass	4147007
edema	267038008
normal	17621005

Severity to FHIR Interpretation mapping:

Severity	FHIR Code
critical	HH (critically high)
urgent	H (high)
significant	A (abnormal)
routine	N (normal)
normal	N (normal)

DICOM modality codes:

Agent Modality	DICOM Code
ct	CT
mri	MR
xray	DX
cxr	CR
ultrasound	US
pet	PT
mammography	MG
fluoroscopy	RF
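
The mapping tables above can be combined into an Observation-shaped dict as sketched below. This is a hedged illustration of how the codes fit together, not the output of export_fhir_r4(): the function name, the text-only `code` element, and placing the SNOMED code in `valueCodeableConcept` are all assumptions.

```python
from typing import Any, Dict

SNOMED = "http://snomed.info/sct"
INTERPRETATION = "http://terminology.hl7.org/CodeSystem/v3-ObservationInterpretation"

# Subsets of the mapping tables in this section.
SNOMED_CODES = {"hemorrhage": "50960005", "pneumothorax": "36118008"}
SEVERITY_TO_INTERPRETATION = {
    "critical": "HH", "urgent": "H", "significant": "A",
    "routine": "N", "normal": "N",
}


def finding_to_observation(finding: str, severity: str) -> Dict[str, Any]:
    """Build a minimal Observation dict for one imaging finding."""
    return {
        "resourceType": "Observation",
        "status": "final",
        # Text-only code as a placeholder; the real export uses LOINC here.
        "code": {"text": "Imaging finding"},
        "valueCodeableConcept": {"coding": [{
            "system": SNOMED,
            "code": SNOMED_CODES[finding],
            "display": finding,
        }]},
        "interpretation": [{"coding": [{
            "system": INTERPRETATION,
            "code": SEVERITY_TO_INTERPRETATION[severity],
        }]}],
    }
```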

Chapter 12: Cross-Modal Triggers¶

12.1 Architecture¶

The cross-modal system in src/cross_modal.py (401 lines) bridges imaging findings to the genomic_evidence collection (3.5M vectors) populated by the rag-chat-pipeline.

class CrossModalTrigger:
    GENOMIC_COLLECTION = "genomic_evidence"
    TOP_K_PER_QUERY = 5
    SCORE_THRESHOLD = 0.40

12.2 Five Evaluator Functions¶

Each evaluator maps a workflow result to genomic query templates:

Evaluator	Workflow	Trigger Condition	Genomic Queries
`_evaluate_lung_nodule`	`ct_chest_lung_nodule`	Lung-RADS 4A, 4B, 4X	EGFR, ALK, ROS1, KRAS driver mutations
`_evaluate_cxr`	`cxr_rapid_findings`	Critical/Urgent + consolidation	Antimicrobial resistance, immune response
`_evaluate_brain_lesion`	`mri_brain_ms_lesion`	Classification contains "highly_active"	HLA-DRB1 susceptibility, treatment response
`_evaluate_cardiac`	`ct_coronary_angiography`	CAD-RADS >= 3	LDLR, PCSK9, APOB, polygenic risk
`_evaluate_prostate`	`mri_prostate_pirads`	PI-RADS >= 4	BRCA2, HOXB13, ATM susceptibility

12.3 Severity Threshold Mapping¶

Workflow Result -> Evaluator -> Severity Check -> Genomic Queries -> Milvus Search -> CrossModalResult

Each evaluator parses the classification string from the WorkflowResult and checks against specific high-risk categories:

# Lung nodule: parse Lung-RADS category
high_risk_categories = ["4A", "4B", "4X"]
high_risk = any(cat in classification for cat in high_risk_categories)

# Cardiac CTA: parse CAD-RADS score
high_risk_categories = ["3", "4A", "4B", "5"]
high_risk = any(f"CAD-RADS {cat}" in classification for cat in high_risk_categories)

# Prostate MRI: parse PI-RADS score
high_risk_scores = ["4", "5"]
high_risk = any(f"PI-RADS {score}" in classification for score in high_risk_scores)

12.4 Genomic Query Templates¶

Each trigger fires 2-3 pre-defined genomic queries:

LUNG_CANCER_QUERIES = [
    "lung cancer driver mutations EGFR ALK ROS1 KRAS",
    "non-small cell lung cancer NSCLC targeted therapy genomics",
    "lung adenocarcinoma molecular subtypes precision medicine",
]

CARDIAC_GENOMICS_QUERIES = [
    "cardiovascular disease genetic risk factors LDLR PCSK9 APOB",
    "familial hypercholesterolemia genomic variants",
    "coronary artery disease polygenic risk score",
]

12.5 The _query_genomics() Method¶

def _query_genomics(self, queries, trigger_reason):
    for query in queries:
        embedding = self.embedder.encode(query, normalize_embeddings=True)
        # Convert the NumPy vector to a plain list before passing it to the search call
        embedding_list = embedding.tolist()
        hits = self.collection_manager.search(
            collection_name=self.GENOMIC_COLLECTION,
            query_embedding=embedding_list,
            top_k=self.TOP_K_PER_QUERY,      # 5 per query
            score_threshold=self.SCORE_THRESHOLD,  # 0.40 minimum
        )
        # De-duplicate by ID across queries

Results are de-duplicated by record ID across all queries and assembled into a CrossModalResult with genomic context strings, hit count, and an enrichment summary.
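
De-duplication across queries can be sketched as a dictionary keyed by record ID. The helper below is illustrative: `collect_unique_hits` and the `search` callable are hypothetical, and keeping the highest-scoring copy of a duplicate is an assumption, since the guide only says results are de-duplicated by ID.

```python
from typing import Callable, Dict, Iterable, List


def collect_unique_hits(
    queries: Iterable[str],
    search: Callable[[str], List[dict]],  # hypothetical per-query search callable
) -> List[dict]:
    """Run every query and keep one copy of each record ID."""
    best: Dict[str, dict] = {}
    for query in queries:
        for hit in search(query):
            current = best.get(hit["id"])
            # Assumed tie-break: keep whichever copy scored higher.
            if current is None or hit["score"] > current["score"]:
                best[hit["id"]] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)
```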

12.6 Integration with Agent¶

The ImagingIntelligenceAgent invokes cross-modal triggers after workflow execution:

if self.cross_modal_trigger and workflow_result:
    cross_modal_result = self.cross_modal_trigger.evaluate(workflow_result)
    if cross_modal_result:
        response.cross_modal = cross_modal_result

The cross-modal enrichment is controlled by the CROSS_MODAL_ENABLED setting (default: False for safety in production).

Chapter 13: Production Deployment¶

13.1 Docker Multi-Stage Build¶

The Dockerfile uses a multi-stage build:

Stage 1: Builder -- installs Python dependencies and compiles native extensions

Stage 2: Runtime -- copies only the installed packages and application code

13.2 Compose Topology¶

Full stack (docker-compose.yml) -- 12 services:

orthanc (8042, 4242)          -- DICOM server + web viewer
ohif-viewer (8526)            -- OHIF DICOM viewer
milvus-etcd                   -- Milvus metadata (etcd)
milvus-minio                  -- Milvus object storage (MinIO)
milvus-standalone (19530)     -- Milvus vector database
imaging-streamlit (8525)      -- Streamlit chat UI
imaging-api (8524)            -- FastAPI REST server
imaging-setup                 -- One-shot collection + seed
nim-llm (8520)                -- Meta Llama 3 NIM
nim-vista3d (8530)            -- NVIDIA VISTA-3D NIM
nim-maisi (8531)              -- NVIDIA MAISI NIM
nim-vilam3 (8532)             -- VILA-M3 VLM NIM

Lite stack (docker-compose.lite.yml) -- 6 services (no GPU required):

milvus-etcd                   -- Milvus metadata
milvus-minio                  -- Milvus object storage
milvus-standalone (19530)     -- Milvus vector database
imaging-streamlit (8525)      -- Streamlit chat UI
imaging-api (8524)            -- FastAPI REST server
imaging-setup                 -- One-shot collection + seed

All NIM-dependent features run in mock mode with the Lite stack.

13.3 Port Map¶

Service	Port	Protocol
Streamlit UI	8525	HTTP
FastAPI REST	8524	HTTP
Milvus gRPC	19530	gRPC
Milvus metrics	9091	HTTP
Orthanc REST/Web	8042	HTTP
Orthanc DICOM	4242	DICOM C-STORE
OHIF Viewer	8526	HTTP
NIM LLM	8520	HTTP (OpenAI-compat)
NIM VISTA-3D	8530	HTTP
NIM MAISI	8531	HTTP
NIM VILA-M3	8532	HTTP

13.4 Health Checks¶

Orthanc:

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8042/system"]
  interval: 30s
  timeout: 10s
  retries: 5

Milvus:

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
  interval: 30s
  timeout: 20s
  retries: 3

FastAPI: The /health endpoint returns service status and Milvus connectivity.

NIM services: Each NIM client checks /v1/health/ready (or /v1/models for LLM) with a 30-second cached TTL.

13.5 Monitoring¶

Prometheus metrics exposed by the FastAPI server:

Metric	Type	Labels	Description
`imaging_agent_queries_total`	Counter	`endpoint`	Total RAG queries
`imaging_agent_query_duration_seconds`	Histogram	`endpoint`	Query latency
`imaging_agent_search_hits`	Histogram	--	Evidence hits per query

Histogram buckets for latency: 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0, 30.0 seconds.

13.6 Environment Variables¶

All settings use the IMAGING_ prefix:

IMAGING_MILVUS_HOST=milvus-standalone
IMAGING_MILVUS_PORT=19530
IMAGING_ANTHROPIC_API_KEY=sk-ant-...
IMAGING_NIM_LLM_URL=http://nim-llm:8520/v1
IMAGING_NIM_MODE=local
IMAGING_CROSS_MODAL_ENABLED=true

13.7 DICOM Event Integration¶

The api/routes/events.py module provides webhook endpoints for Orthanc. When a DICOM study arrives:

Orthanc fires a change event to the webhook
The event handler extracts modality and body region from DICOM tags
WORKFLOW_ROUTING maps (modality, body_region) to a workflow name:

WORKFLOW_ROUTING = {
    ("CT", "head"): "ct_head_hemorrhage",
    ("CT", "brain"): "ct_head_hemorrhage",
    ("CT", "chest"): "ct_chest_lung_nodule",
    ("CT", "lung"): "ct_chest_lung_nodule",
    ("CR", "chest"): "cxr_rapid_findings",
    ("DX", "chest"): "cxr_rapid_findings",
    ("MR", "brain"): "mri_brain_ms_lesion",
    ("MR", "head"): "mri_brain_ms_lesion",
}

The matched workflow runs, and its result is stored in an in-memory deque (max 200 entries).
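
The routing lookup and bounded history can be sketched together. This is a simplified stand-in for the event handler: `route_study`, the case normalization, and the shape of the history entries are assumptions, and only three routing entries are reproduced.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple

WORKFLOW_ROUTING: Dict[Tuple[str, str], str] = {
    ("CT", "head"): "ct_head_hemorrhage",
    ("CT", "chest"): "ct_chest_lung_nodule",
    ("CR", "chest"): "cxr_rapid_findings",
}

# A bounded deque silently drops the oldest entry once it is full.
event_history: Deque[dict] = deque(maxlen=200)


def route_study(modality: str, body_region: str) -> Optional[str]:
    """Look up the workflow for a study and record the routing decision."""
    # Assumed normalization: upper-case modality, lower-case body region.
    key = (modality.upper(), body_region.lower())
    workflow = WORKFLOW_ROUTING.get(key)  # None when no workflow matches
    event_history.append(
        {"modality": key[0], "body_region": key[1], "workflow": workflow}
    )
    return workflow
```

Because the history lives in process memory, it is lost on restart and is not shared between API workers.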

Chapter 14: Future Architecture¶

14.1 Federated Learning with NVIDIA FLARE¶

The flare/ directory contains federated learning job configurations for three clinical tasks:

Job	Directory	Description
CXR Classification	`flare/job_configs/cxr_classification/`	Multi-institution chest X-ray classification
CT Segmentation	`flare/job_configs/ct_segmentation/`	Federated organ segmentation
Lung Nodule Detection	`flare/job_configs/lung_nodule_detection/`	Distributed lung nodule CADe

FLARE enables training across hospital sites without centralizing patient data. Each site trains locally and shares only model weight updates (federated averaging).

Integration pattern:

Site 1 (Hospital A)     Site 2 (Hospital B)     Site 3 (Hospital C)
     |                       |                       |
     v                       v                       v
Local Training          Local Training          Local Training
     |                       |                       |
     +--------> FLARE Server <--------+
                    |
                    v
            Aggregated Model
                    |
                    v
           Deploy to NIM Endpoint
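
The aggregation step in the diagram is, in its simplest form, a sample-weighted average of the per-site parameters. The sketch below shows that arithmetic in plain Python; it does not use the NVIDIA FLARE API, and `federated_average` with flat weight lists is a deliberate simplification.

```python
from typing import List, Sequence


def federated_average(site_weights: Sequence[Sequence[float]],
                      site_sample_counts: Sequence[int]) -> List[float]:
    """Average per-site weight vectors, weighting each site by its sample count."""
    if not site_weights or len(site_weights) != len(site_sample_counts):
        raise ValueError("need one sample count per site")
    total = sum(site_sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    n_params = len(site_weights[0])
    averaged = [0.0] * n_params
    for weights, count in zip(site_weights, site_sample_counts):
        if len(weights) != n_params:
            raise ValueError("all sites must share the same parameter shape")
        for i, w in enumerate(weights):
            # Each site contributes in proportion to its share of the data.
            averaged[i] += w * (count / total)
    return averaged
```

A site with three times as many studies pulls the aggregate three times as hard, which is why unbalanced sites can dominate the shared model.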

14.2 Multi-Agent Systems¶

The Imaging Intelligence Agent is designed to operate within the HCLS AI Factory's multi-agent ecosystem:

Biomarker Discovery Agent -- variant interpretation, biomarker identification
Precision Oncology Agent -- treatment recommendation, clinical trial matching
CAR-T Intelligence Agent -- cell therapy optimization
Autoimmune Intelligence Agent -- autoimmune disease analysis

Future cross-agent integration patterns:

Imaging findings triggering oncology treatment recommendations
Genomic variants informing imaging protocol selection
Federated knowledge sharing across agent collections

14.3 Fine-Tuned Embeddings¶

The current BGE-small-en-v1.5 model is a general-purpose text encoder. Future improvements include:

Domain-adapted embeddings:

Fine-tune on RadLex terminology and radiology reports
Use contrastive learning on (query, relevant_passage) pairs from radiology literature
Target improvement: +5-10% recall on radiology-specific queries

Multi-modal embeddings:

Encode DICOM image features alongside text
Use BiomedCLIP or RadFM for joint image-text embeddings
Enable "find studies similar to this image" searches

14.4 Scalability Roadmap¶

Current	Near-Term	Long-Term
IVF_FLAT, 11 collections	HNSW, 15+ collections	Distributed Milvus, 50+ collections
BGE-small (384d)	Fine-tuned BGE (384d)	Multi-modal (768d+)
Mock NIM fallback	Full NIM deployment	Multi-GPU NIM cluster
Single DGX Spark	DGX Spark + Cloud NIM	Multi-node DGX
In-memory event history	PostgreSQL event store	Kafka event streaming

Appendix A: Complete API Reference¶

A.1 Core Query Endpoints¶

POST /query

RAG query with multi-collection search, knowledge augmentation, and LLM synthesis.

Request:

{
    "question": "What is the sensitivity of AI for hemorrhage detection on CT?",
    "modality": "ct",
    "body_region": "head",
    "top_k": 5,
    "include_genomic": true,
    "include_nim": true,
    "collections": null,
    "year_min": 2020,
    "year_max": null,
    "conversation_context": ""
}

Response:

{
    "question": "...",
    "answer": "Based on the available evidence...",
    "evidence_count": 33,
    "collections_searched": 11,
    "search_time_ms": 45.2
}

POST /query/stream

Streaming version of /query. Returns Server-Sent Events (SSE) with incremental tokens.

POST /query/comparative

Explicitly invokes the comparative analysis pipeline for A-vs-B queries.

A.2 Search Endpoints¶

POST /search

Raw vector search without LLM synthesis. Returns ranked evidence hits.

POST /search/collection/{collection_name}

Search a specific collection by name.

GET /collections

List all registered collections with record counts and status.

GET /collections/{name}/stats

Detailed statistics for a single collection.

A.3 Workflow Endpoints¶

POST /workflows/run

Execute a named imaging workflow.

Request:

{
    "workflow_name": "ct_head_hemorrhage",
    "input_path": "/data/studies/CT001",
    "mock_mode": true,
    "mock_overrides": {"classification": "urgent_hemorrhage"}
}

Response:

{
    "workflow_name": "ct_head_hemorrhage",
    "status": "completed",
    "findings": [...],
    "measurements": {"volume_ml": 12.5, "midline_shift_mm": 3.2},
    "classification": "urgent_hemorrhage",
    "severity": "urgent",
    "inference_time_ms": 1250.0,
    "is_mock": true
}

GET /workflows

List available workflows with metadata.

GET /workflows/{name}/info

Workflow details: modality, body region, models used, target latency.

A.4 NIM Service Endpoints¶

GET /nim/status

Status of all 4 NIM services (available/mock/unavailable).

Response:

{
    "vista3d": "mock",
    "maisi": "mock",
    "vila_m3": "mock",
    "llm": "available"
}

POST /nim/vista3d/segment

Invoke VISTA-3D organ segmentation.

POST /nim/maisi/generate

Invoke MAISI synthetic CT generation.

POST /nim/vilam3/analyze

Invoke VILA-M3 image analysis with visual question answering.

A.5 Report/Export Endpoints¶

POST /reports/markdown

Export an agent response as Markdown.

POST /reports/json

Export as structured JSON.

POST /reports/pdf

Export as NVIDIA-branded PDF file.

POST /reports/fhir

Export as FHIR R4 DiagnosticReport resource.

A.6 Preview Endpoints¶

POST /preview/generate

Generate MP4/GIF preview from a DICOM series or NIfTI volume.

Request:

{
    "study_id": "orthanc-study-001",
    "format": "mp4",
    "plane": "axial",
    "window": "lung",
    "fps": 8,
    "max_frames": 200
}

GET /preview/{preview_id}

Retrieve a previously generated preview file.

A.7 Protocol/Dose Endpoints¶

POST /protocol/recommend

AI-optimized protocol recommendation for a clinical scenario.

POST /dose/compare

Compare standard vs AI-optimized radiation dose for a protocol.

A.8 Demo Case Endpoints¶

GET /demo/cases

List available demo cases with metadata.

POST /demo/run/{case_id}

Execute a pre-built demo scenario end-to-end.

A.9 Event Endpoints¶

POST /events/dicom/study-complete

Webhook for Orthanc study-complete events. Triggers automatic workflow routing.

GET /events/history

Retrieve recent DICOM ingestion history (max 200 entries).

A.10 System Endpoints¶

GET /health

Health check with Milvus connectivity status.

GET /metrics

Prometheus metrics endpoint (query counts, latencies, hit distributions).

GET /config

Current configuration (sensitive values redacted).

Appendix B: Configuration Reference¶

All environment variables use the IMAGING_ prefix (controlled by Pydantic env_prefix).

B.1 Path Configuration¶

Variable	Default	Description
`IMAGING_PROJECT_ROOT`	Auto-detected	Project root directory
`IMAGING_DATA_DIR`	`{PROJECT_ROOT}/data`	Data storage directory
`IMAGING_CACHE_DIR`	`{DATA_DIR}/cache`	Cache directory
`IMAGING_REFERENCE_DIR`	`{DATA_DIR}/reference`	Reference data directory
`IMAGING_RAG_PIPELINE_ROOT`	`/home/adam/.../rag-chat-pipeline`	RAG pipeline root

B.2 Milvus Configuration¶

Variable	Default	Description
`IMAGING_MILVUS_HOST`	`localhost`	Milvus server hostname
`IMAGING_MILVUS_PORT`	`19530`	Milvus gRPC port
`IMAGING_COLLECTION_LITERATURE`	`imaging_literature`	Literature collection name
`IMAGING_COLLECTION_TRIALS`	`imaging_trials`	Trials collection name
`IMAGING_COLLECTION_FINDINGS`	`imaging_findings`	Findings collection name
`IMAGING_COLLECTION_PROTOCOLS`	`imaging_protocols`	Protocols collection name
`IMAGING_COLLECTION_DEVICES`	`imaging_devices`	Devices collection name
`IMAGING_COLLECTION_ANATOMY`	`imaging_anatomy`	Anatomy collection name
`IMAGING_COLLECTION_BENCHMARKS`	`imaging_benchmarks`	Benchmarks collection name
`IMAGING_COLLECTION_GUIDELINES`	`imaging_guidelines`	Guidelines collection name
`IMAGING_COLLECTION_REPORT_TEMPLATES`	`imaging_report_templates`	Report templates collection
`IMAGING_COLLECTION_DATASETS`	`imaging_datasets`	Datasets collection name
`IMAGING_COLLECTION_GENOMIC`	`genomic_evidence`	Cross-agent genomic collection

B.3 Embedding Configuration¶

Variable	Default	Description
`IMAGING_EMBEDDING_MODEL`	`BAAI/bge-small-en-v1.5`	Embedding model name
`IMAGING_EMBEDDING_DIMENSION`	`384`	Vector dimension
`IMAGING_EMBEDDING_BATCH_SIZE`	`32`	Batch size for encoding

B.4 LLM Configuration¶

Variable	Default	Description
`IMAGING_LLM_PROVIDER`	`anthropic`	LLM provider
`IMAGING_LLM_MODEL`	`claude-sonnet-4-6`	LLM model name
`IMAGING_ANTHROPIC_API_KEY`	`None`	Anthropic API key

B.5 NIM Configuration¶

Variable	Default	Description
`IMAGING_NIM_LLM_URL`	`http://localhost:8520/v1`	Local LLM NIM endpoint
`IMAGING_NIM_VISTA3D_URL`	`http://localhost:8530`	VISTA-3D NIM endpoint
`IMAGING_NIM_MAISI_URL`	`http://localhost:8531`	MAISI NIM endpoint
`IMAGING_NIM_VILAM3_URL`	`http://localhost:8532`	VILA-M3 NIM endpoint
`IMAGING_NIM_MODE`	`local`	NIM mode: local, cloud, mock
`IMAGING_NIM_ALLOW_MOCK_FALLBACK`	`True`	Allow mock fallback
`IMAGING_NGC_API_KEY`	`None`	NGC API key for NIM pulls
`IMAGING_NIM_LOCAL_LLM_MODEL`	`meta/llama3-70b-instruct`	Local NIM LLM model

B.6 NVIDIA Cloud NIM Configuration¶

Variable	Default	Description
`IMAGING_NVIDIA_API_KEY`	`None`	NVIDIA cloud API key
`IMAGING_NIM_CLOUD_URL`	`https://integrate.api.nvidia.com/v1`	Cloud NIM base URL
`IMAGING_NIM_CLOUD_LLM_MODEL`	`meta/llama-3.1-8b-instruct`	Cloud LLM model
`IMAGING_NIM_CLOUD_VLM_MODEL`	`meta/llama-3.2-11b-vision-instruct`	Cloud VLM model

B.7 RAG Search Configuration¶

Variable	Default	Description
`IMAGING_TOP_K_PER_COLLECTION`	`5`	Results per collection
`IMAGING_SCORE_THRESHOLD`	`0.40`	Minimum similarity score
`IMAGING_WEIGHT_LITERATURE`	`0.18`	Literature collection weight
`IMAGING_WEIGHT_TRIALS`	`0.12`	Trials collection weight
`IMAGING_WEIGHT_FINDINGS`	`0.15`	Findings collection weight
`IMAGING_WEIGHT_PROTOCOLS`	`0.08`	Protocols collection weight
`IMAGING_WEIGHT_DEVICES`	`0.08`	Devices collection weight
`IMAGING_WEIGHT_ANATOMY`	`0.06`	Anatomy collection weight
`IMAGING_WEIGHT_BENCHMARKS`	`0.08`	Benchmarks collection weight
`IMAGING_WEIGHT_GUIDELINES`	`0.10`	Guidelines collection weight
`IMAGING_WEIGHT_REPORT_TEMPLATES`	`0.05`	Report templates weight
`IMAGING_WEIGHT_DATASETS`	`0.06`	Datasets collection weight
`IMAGING_WEIGHT_GENOMIC`	`0.04`	Genomic collection weight

B.8 Citation Scoring¶

Variable	Default	Description
`IMAGING_CITATION_HIGH_THRESHOLD`	`0.75`	High-confidence citation threshold
`IMAGING_CITATION_MEDIUM_THRESHOLD`	`0.60`	Medium-confidence citation threshold
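
The two thresholds above split similarity scores into three confidence bands. A minimal sketch, assuming inclusive lower bounds and the band names "high", "medium", and "low" (neither is confirmed by the configuration table):

```python
def citation_confidence(score: float,
                        high_threshold: float = 0.75,
                        medium_threshold: float = 0.60) -> str:
    """Bucket a similarity score into 'high', 'medium', or 'low'."""
    if score >= high_threshold:
        return "high"
    if score >= medium_threshold:
        return "medium"
    return "low"
```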

B.9 API Server Configuration¶

Variable	Default	Description
`IMAGING_API_HOST`	`0.0.0.0`	API bind host
`IMAGING_API_PORT`	`8524`	API bind port
`IMAGING_STREAMLIT_PORT`	`8525`	Streamlit UI port
`IMAGING_API_BASE_URL`	`http://localhost:8524`	Internal API URL
`IMAGING_CORS_ORIGINS`	`http://localhost:8080,...`	CORS allowed origins
`IMAGING_MAX_REQUEST_SIZE_MB`	`10`	Max request body size

B.10 Orthanc/DICOM Configuration¶

Variable	Default	Description
`IMAGING_ORTHANC_URL`	`http://localhost:8042`	Orthanc REST API URL
`IMAGING_ORTHANC_USERNAME`	`admin`	Orthanc username
`IMAGING_ORTHANC_PASSWORD`	`""`	Orthanc password
`IMAGING_DICOM_AUTO_INGEST`	`False`	Auto-ingest new DICOM studies
`IMAGING_DICOM_WATCH_INTERVAL`	`5`	Orthanc poll interval (seconds)
`IMAGING_DICOM_SERVER_URL`	`http://localhost:8042`	Legacy alias

B.11 OHIF Viewer Configuration¶

Variable	Default	Description
`IMAGING_OHIF_URL`	`http://localhost:8526`	OHIF Viewer URL

B.12 Preview Generation Configuration¶

Variable	Default	Description
`IMAGING_PREVIEW_CACHE_DIR`	`{DATA_DIR}/cache/previews`	Preview cache directory
`IMAGING_PREVIEW_DEFAULT_FPS`	`8`	Default frames per second
`IMAGING_PREVIEW_DEFAULT_FORMAT`	`mp4`	Default format (mp4 or gif)
`IMAGING_PREVIEW_MAX_FRAMES`	`200`	Maximum frames per preview

B.13 Data Source Configuration¶

Variable	Default	Description
`IMAGING_NCBI_API_KEY`	`None`	PubMed API key (optional)
`IMAGING_PUBMED_MAX_RESULTS`	`5000`	Max PubMed results per query
`IMAGING_CT_GOV_BASE_URL`	`https://clinicaltrials.gov/api/v2`	ClinicalTrials.gov API

B.14 Scheduler/Ingest Configuration¶

Variable	Default	Description
`IMAGING_INGEST_SCHEDULE_HOURS`	`168`	Ingest interval (168h = weekly)
`IMAGING_INGEST_ENABLED`	`False`	Enable scheduled ingestion

B.15 Cross-Modal Configuration¶

Variable	Default	Description
`IMAGING_CROSS_MODAL_ENABLED`	`False`	Enable imaging->genomics triggers

B.16 Conversation Memory¶

Variable	Default	Description
`IMAGING_MAX_CONVERSATION_CONTEXT`	`3`	Prior exchanges to inject
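
A bounded window of prior exchanges can be sketched with `collections.deque`, which discards the oldest entry once `maxlen` is reached. This only illustrates the "last N exchanges" idea; the `(question, answer)` tuple shape and the `make_history` helper are assumptions, not the agent's actual memory structure.

```python
from collections import deque

MAX_CONVERSATION_CONTEXT = 3  # default from the table above


def make_history(max_exchanges: int = MAX_CONVERSATION_CONTEXT) -> deque:
    """Create a rolling buffer that keeps only the most recent exchanges."""
    return deque(maxlen=max_exchanges)


history = make_history()
for i in range(1, 6):
    # Each exchange is stored as a (question, answer) pair (assumed shape).
    history.append((f"question {i}", f"answer {i}"))
```

After five appends, only exchanges 3 through 5 remain in `history`.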

B.17 Prometheus Configuration¶

Variable	Default	Description
`IMAGING_METRICS_ENABLED`	`True`	Enable Prometheus metrics

Appendix C: Scoring System Reference¶

C.1 Lung-RADS (ACR Lung-RADS v2022)¶

Used by: ct_chest_lung_nodule workflow

Category	Description	Finding Criteria	Management
0	Incomplete	Additional imaging needed	Complete evaluation
1	Negative	No nodules	Continue annual screening
2	Benign	Perifissural, calcified, or stable nodules	Continue annual screening
3	Probably benign	Solid nodule 6 to <8mm, part-solid <6mm	6-month follow-up
4A	Suspicious	Solid nodule 8 to <15mm, new part-solid 6mm+	3-month LDCT or PET/CT
4B	Very suspicious	Solid nodule >= 15mm, growing solid/part-solid	Tissue sampling
4X	Very suspicious +	4A/4B features + additional suspicious morphology	Tissue sampling
S	Other	Clinically significant non-pulmonary finding	Per finding type

Cross-modal trigger: Categories 4A, 4B, 4X fire the lung cancer genomics trigger.

Enum in models.py:

from enum import Enum


class LungRADS(str, Enum):
    CAT_0 = "0"
    CAT_1 = "1"
    CAT_2 = "2"
    CAT_3 = "3"
    CAT_4A = "4A"
    CAT_4B = "4B"
    CAT_4X = "4X"
    CAT_S = "S"
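
The cross-modal rule for Lung-RADS (categories 4A, 4B, and 4X fire the trigger) can be expressed as a set-membership check on the enum. This is a minimal sketch: `GENOMICS_TRIGGER_CATEGORIES` and `fires_lung_genomics_trigger` are hypothetical names, and the actual evaluator in `cross_modal.py` may be structured differently.

```python
from enum import Enum


class LungRADS(str, Enum):
    CAT_0 = "0"
    CAT_1 = "1"
    CAT_2 = "2"
    CAT_3 = "3"
    CAT_4A = "4A"
    CAT_4B = "4B"
    CAT_4X = "4X"
    CAT_S = "S"


# Categories that fire the lung cancer genomics trigger, per the rule above.
GENOMICS_TRIGGER_CATEGORIES = frozenset(
    {LungRADS.CAT_4A, LungRADS.CAT_4B, LungRADS.CAT_4X}
)


def fires_lung_genomics_trigger(category) -> bool:
    """Accept an enum member or its string value; raise ValueError if unknown."""
    return LungRADS(category) in GENOMICS_TRIGGER_CATEGORIES
```

Passing the value through `LungRADS(...)` means an unrecognized category raises `ValueError` instead of silently returning `False`.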

C.2 BI-RADS (ACR Breast Imaging)¶

Category	Description	Cancer Likelihood	Action
0	Incomplete	N/A	Additional imaging
1	Negative	0%	Routine screening
2	Benign	0%	Routine screening
3	Probably benign	<= 2%	Short-interval follow-up
4	Suspicious	>2% to <95%	Biopsy
5	Highly suggestive	>= 95%	Biopsy
6	Known malignancy	100%	Treatment

C.3 TI-RADS (ACR Thyroid Imaging)¶

Category	Description	FNA Recommendation
TR1	Benign	No FNA
TR2	Not suspicious	No FNA
TR3	Mildly suspicious	FNA if >= 2.5 cm; follow if >= 1.5 cm
TR4	Moderately suspicious	FNA if >= 1.5 cm; follow if >= 1.0 cm
TR5	Highly suspicious	FNA if >= 1.0 cm; follow if >= 0.5 cm
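
The size thresholds in the table translate directly into a lookup. The sketch below encodes only the rows shown above; the function name and the returned labels (including `"no action"` for nodules under both thresholds) are illustrative assumptions.

```python
# (FNA threshold, follow-up threshold) in cm, copied from the table above.
TIRADS_THRESHOLDS_CM = {
    "TR3": (2.5, 1.5),
    "TR4": (1.5, 1.0),
    "TR5": (1.0, 0.5),
}


def tirads_recommendation(category: str, size_cm: float) -> str:
    """Return a coarse recommendation for a nodule (hypothetical helper)."""
    if category in ("TR1", "TR2"):
        return "no FNA"
    fna_cm, follow_cm = TIRADS_THRESHOLDS_CM[category]
    if size_cm >= fna_cm:
        return "FNA"
    if size_cm >= follow_cm:
        return "follow-up"
    return "no action"  # assumed label: below both thresholds
```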

C.4 LI-RADS (ACR Liver Imaging)¶

Used for HCC screening in at-risk populations.

Category	Description	HCC Probability
LR-1	Definitely benign	0%
LR-2	Probably benign	~5%
LR-3	Intermediate probability	~30%
LR-4	Probably HCC	~80%
LR-5	Definitely HCC	~95%
LR-M	Malignant, not HCC specific	Variable
LR-TIV	Tumor in vein	Variable

C.5 CAD-RADS 2.0 (Coronary Artery Disease)¶

Used by: ct_coronary_angiography workflow

Category	Stenosis	Description	Management
0	0%	No stenosis	None
1	1-24%	Minimal stenosis	Preventive therapy
2	25-49%	Mild stenosis	Preventive therapy
3	50-69%	Moderate stenosis	Consider functional testing
4A	70-99%	Severe stenosis	Consider ICA
4B	LM >50% or 3-vessel	Severe stenosis	ICA recommended
5	100%	Total occlusion	Consider ICA
N	N/A	Non-diagnostic	Repeat or alternative

Cross-modal trigger: Categories 3, 4A, 4B, 5 fire the cardiovascular genomics trigger.
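
For categories 0 through 5, the stenosis bands in the table reduce to a chain of range checks. The sketch below assumes an integer maximal-stenosis percentage and deliberately leaves out 4B and N, which depend on vessel distribution and study quality, not on a single percentage. Both function names are hypothetical.

```python
# Categories that fire the cardiovascular genomics trigger, per the rule above.
CARDIO_TRIGGER_CATEGORIES = frozenset({"3", "4A", "4B", "5"})


def cad_rads_from_stenosis(percent: int) -> str:
    """Map a maximal stenosis percentage to a CAD-RADS category (0-5 only)."""
    if not 0 <= percent <= 100:
        raise ValueError(f"stenosis must be 0-100, got {percent}")
    if percent == 0:
        return "0"
    if percent <= 24:
        return "1"
    if percent <= 49:
        return "2"
    if percent <= 69:
        return "3"
    if percent <= 99:
        return "4A"
    return "5"


def fires_cardio_genomics_trigger(category: str) -> bool:
    """Return True for categories listed in the cross-modal rule."""
    return category in CARDIO_TRIGGER_CATEGORIES
```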

C.6 PI-RADS v2.1 (Prostate Imaging)¶

Used by: mri_prostate_pirads workflow

Score	Description	Likelihood of clinically significant prostate cancer (csPCa)
1	Very low	Highly unlikely
2	Low	Unlikely
3	Intermediate	Equivocal
4	High	Likely
5	Very high	Highly likely

Cross-modal trigger: Scores 4 and 5 fire the prostate cancer genomics trigger.

C.7 ASPECTS (Alberta Stroke Programme Early CT Score)¶

Score	Description	Interpretation
10	Normal	No early ischemic changes
7-9	Mild	Limited early ischemic changes
4-6	Moderate	Moderate early ischemic changes
0-3	Severe	Extensive early ischemic changes

Scoring: Start at 10, subtract 1 point for each affected MCA territory region showing early ischemic change (caudate, lentiform, insula, internal capsule, M1-M6).
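
The scoring rule above is a subtraction over ten named regions. A minimal sketch, with illustrative region identifiers and function names (none of them come from the codebase):

```python
# The ten MCA-territory regions listed above (identifier spelling is assumed).
ASPECTS_REGIONS = frozenset(
    {"caudate", "lentiform", "insula", "internal_capsule",
     "M1", "M2", "M3", "M4", "M5", "M6"}
)


def aspects_score(affected_regions) -> int:
    """Start at 10 and subtract one point per distinct affected region."""
    affected = set(affected_regions)
    unknown = affected - ASPECTS_REGIONS
    if unknown:
        raise ValueError(f"unknown ASPECTS regions: {sorted(unknown)}")
    return 10 - len(affected)


def aspects_band(score: int) -> str:
    """Map a score to the description column of the table above."""
    if not 0 <= score <= 10:
        raise ValueError(f"ASPECTS score must be 0-10, got {score}")
    if score == 10:
        return "Normal"
    if score >= 7:
        return "Mild"
    if score >= 4:
        return "Moderate"
    return "Severe"
```

Converting the input to a set ensures a region reported twice is only counted once.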

C.8 CT Windowing Presets¶

Used by: src/imaging/preview_generator.py

Preset	Center (HU)	Width (HU)	Clinical Use
Brain	40	80	Gray/white matter differentiation
Lung	-600	1500	Lung parenchyma, nodules, airways
Bone	400	1800	Fractures, osseous lesions
Abdomen	50	400	Soft tissue organs, masses
Soft Tissue	50	350	Soft tissue detail

WINDOW_PRESETS = {
    "brain":       {"center": 40,   "width": 80},
    "lung":        {"center": -600, "width": 1500},
    "bone":        {"center": 400,  "width": 1800},
    "abdomen":     {"center": 50,   "width": 400},
    "soft_tissue": {"center": 50,   "width": 350},
}

Windowing formula:

lower = center - width/2
upper = center + width/2
pixel_display = clip(pixel_hu, lower, upper)
pixel_normalized = (pixel_display - lower) / width
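
The formula can be written for a single Hounsfield value in plain Python, as sketched below. `apply_window` is an illustrative name; this is not the code in `preview_generator.py`, which presumably operates on whole arrays.

```python
def apply_window(pixel_hu: float, center: float, width: float) -> float:
    """Clip a Hounsfield value to the window and normalize it to [0, 1]."""
    lower = center - width / 2
    upper = center + width / 2
    clipped = min(max(pixel_hu, lower), upper)
    return (clipped - lower) / width


# Brain preset (center 40, width 80) spans 0 to 80 HU.
air = apply_window(-1000, center=40, width=80)   # below the window
mid = apply_window(40, center=40, width=80)      # at the window center
bone = apply_window(500, center=40, width=80)    # above the window
```

Values below the window map to 0.0, values above it map to 1.0, and the window center maps to 0.5.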

C.9 Preview Generation Parameters¶

Parameter	Default	Range	Description
`fps`	8	1-30	Frames per second
`max_frames`	200	1-500	Maximum frame count
`format`	`mp4`	mp4, gif	Output format
`plane`	`axial`	axial, sagittal, coronal	Slice orientation
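
The defaults and ranges in the table can be enforced with a small validated container. This sketch uses a standard-library dataclass to stay dependency-free; `PreviewParams` is a hypothetical name, and the project's actual request model may be a Pydantic class with different field names.

```python
from dataclasses import dataclass

ALLOWED_FORMATS = ("mp4", "gif")
ALLOWED_PLANES = ("axial", "sagittal", "coronal")


@dataclass(frozen=True)
class PreviewParams:
    """Preview settings with the defaults and ranges from the table above."""
    fps: int = 8
    max_frames: int = 200
    format: str = "mp4"
    plane: str = "axial"

    def __post_init__(self):
        # Validate each field against its documented range or allowed values.
        if not 1 <= self.fps <= 30:
            raise ValueError(f"fps must be 1-30, got {self.fps}")
        if not 1 <= self.max_frames <= 500:
            raise ValueError(f"max_frames must be 1-500, got {self.max_frames}")
        if self.format not in ALLOWED_FORMATS:
            raise ValueError(f"format must be one of {ALLOWED_FORMATS}")
        if self.plane not in ALLOWED_PLANES:
            raise ValueError(f"plane must be one of {ALLOWED_PLANES}")
```

Constructing `PreviewParams()` with no arguments yields the documented defaults, and an out-of-range value such as `fps=31` raises `ValueError`.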

Axis mapping:

_AXIS_MAP = {
    "axial": 2,     # Superior-Inferior
    "sagittal": 0,  # Left-Right
    "coronal": 1,   # Anterior-Posterior
}

This guide covers the Imaging Intelligence Agent codebase as of March 2026. For updates, check the repository history and the PROJECT_BIBLE.md in the docs/ directory.
