Hypothesis Orchestrator (SPEC-03)

The orchestrator drives candidate hypotheses through a six-phase lifecycle, from detection through dispatch to completion.

Phase 1 — Emergency + dispatch

  • Red-signal conditions bypass scoring and approval and go straight to dispatch
  • Triggered by EHR import events (e.g., ER admission) or anomalous lab results
  • Queued in TaskQueue at the emergency priority tier
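The bypass above can be pictured as a routing decision in front of a priority queue. The following is only a sketch under stated assumptions: `Tier`, `SketchQueue`, and `route` are hypothetical names, and the real TaskQueue interface is not described in this document.

```python
import heapq
from enum import IntEnum
from itertools import count


class Tier(IntEnum):
    # Hypothetical tiers; lower value is served first.
    EMERGENCY = 0
    NORMAL = 1


class SketchQueue:
    """Stand-in for TaskQueue (assumed behaviour: emergency tier is served first)."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a tier

    def put(self, candidate_id: str, tier: Tier) -> None:
        heapq.heappush(self._heap, (tier, next(self._seq), candidate_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]


def route(candidate_id: str, red_signal: bool, queue: SketchQueue) -> str:
    """Red signals skip scoring and approval; everything else goes to scoring."""
    if red_signal:
        queue.put(candidate_id, Tier.EMERGENCY)
        return "dispatched"
    return "scoring"
```

With this shape, an emergency candidate enqueued after a normal one is still popped first.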

Phase 2 — Feasibility scoring (spec 052)

Rule-based ScoringService (no LLM, deterministic, non-PHI inputs only):

  • Feasibility — source coverage + data availability
  • Impact — patient impact score
  • Novelty — dismissed-sibling count (higher count → lower novelty)
  • Cost estimate — projected LLM + retrieval cost
  • Composite score — weighted sum per DEFAULT_WEIGHTS
  • Reasoning bucket — low / mid / high band per SCORE_LOW_BAND_MAX / SCORE_HIGH_BAND_MIN

Produces frozen ScoringResult. Reads only non-PHI metadata per FR-052-008.
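As an illustration of the composite score and reasoning bucket, here is a minimal sketch. The weight values, band thresholds, the `1 / (1 + n)` novelty shape, and the field names on `ScoringResult` are all assumptions made for the example; the document names `DEFAULT_WEIGHTS`, `SCORE_LOW_BAND_MAX`, and `SCORE_HIGH_BAND_MIN` but does not give their values. The cost estimate is left out for brevity.

```python
from dataclasses import dataclass

# Assumed values, for illustration only.
DEFAULT_WEIGHTS = {"feasibility": 0.4, "impact": 0.4, "novelty": 0.2}
SCORE_LOW_BAND_MAX = 0.33
SCORE_HIGH_BAND_MIN = 0.66


@dataclass(frozen=True)  # frozen: the result cannot be mutated after creation
class ScoringResult:
    feasibility: float
    impact: float
    novelty: float
    composite: float
    bucket: str


def novelty_from_dismissed(dismissed_siblings: int) -> float:
    # Higher dismissed-sibling count gives lower novelty (assumed curve).
    return 1.0 / (1.0 + dismissed_siblings)


def bucket_for(composite: float) -> str:
    if composite <= SCORE_LOW_BAND_MAX:
        return "low"
    if composite >= SCORE_HIGH_BAND_MIN:
        return "high"
    return "mid"


def score(feasibility: float, impact: float, dismissed_siblings: int) -> ScoringResult:
    """Deterministic, rule-based scoring: a weighted sum followed by banding."""
    novelty = novelty_from_dismissed(dismissed_siblings)
    composite = (
        DEFAULT_WEIGHTS["feasibility"] * feasibility
        + DEFAULT_WEIGHTS["impact"] * impact
        + DEFAULT_WEIGHTS["novelty"] * novelty
    )
    return ScoringResult(feasibility, impact, novelty, composite, bucket_for(composite))
```

Because the function has no hidden state and no model call, the same inputs always yield the same `ScoringResult`, which is the property the rule-based design is after.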

Routes:

  • POST /orchestrator/v1/scoring/run-once
  • POST /orchestrator/v1/scoring/run-batch

Phase 3 — Stakeholder approval (spec 053)

Human-in-the-loop approval surface.

  • Who can approve: admin | provider | coordinator (enforced by the new require_stakeholder dependency)
  • Who cannot: patient · nurse
  • Decision + reasoning captured (1-2000 chars, Pydantic extra='forbid')
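The approval rules above can be sketched in plain Python. This is a stand-in for illustration, not the service's code: the real check is a `require_stakeholder` dependency and the real payload is a Pydantic model with `extra='forbid'`, whereas the `Decision` dataclass, `parse_decision`, and the `reasoning` field name below are assumptions.

```python
from dataclasses import dataclass

STAKEHOLDER_ROLES = frozenset({"admin", "provider", "coordinator"})
ALLOWED_FIELDS = frozenset({"reasoning"})  # assumed payload shape


def require_stakeholder(role: str) -> str:
    """Return the role if it may approve or dismiss; otherwise raise."""
    if role not in STAKEHOLDER_ROLES:
        raise PermissionError(f"role {role!r} cannot approve or dismiss candidates")
    return role


@dataclass(frozen=True)
class Decision:
    candidate_id: str
    action: str       # "approve" or "dismiss"
    reasoning: str
    decided_by: str


def parse_decision(candidate_id: str, action: str, role: str, payload: dict) -> Decision:
    require_stakeholder(role)
    if action not in ("approve", "dismiss"):
        raise ValueError(f"unknown action: {action!r}")
    # Mimics extra='forbid': unknown keys are rejected rather than ignored.
    extra = set(payload) - ALLOWED_FIELDS
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    reasoning = payload.get("reasoning")
    if not isinstance(reasoning, str) or not 1 <= len(reasoning) <= 2000:
        raise ValueError("reasoning must be 1-2000 characters")
    return Decision(candidate_id, action, reasoning, role)
```

Rejecting unknown fields outright keeps the audit record limited to exactly what the reviewer submitted.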

Routes:

  • POST /candidates/{id}/approve (30/min)
  • POST /candidates/{id}/dismiss (30/min)
  • GET /candidates/pending-review (consent-scoped, 300/min)

Audit events:

  • orchestrator.candidate.approved
  • orchestrator.candidate.dismissed

Phase 4 — Detectors (spec 034)

3 detector types, each produces a candidate:

Detector      | Signal                                               | Example
Contradiction | 2 active claims for same entity conflict             | "medication A helps" vs "medication A doesn't help"
Gap           | Entity has fewer than N claims, warrants research    | "no claims about drug X's pediatric dosing"
Staleness     | Claim past decay_at threshold                        | "claim about treatment Y older than 2 years"

Detectors write to orchestration_candidates with CandidateStatus.DETECTED + detector_type field.
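To make the gap and staleness signals concrete, here is a small sketch over in-memory data. The `Claim` and `Candidate` shapes, the function names, and the lowercase `detector_type` strings are hypothetical; only `CandidateStatus.DETECTED`, `detector_type`, and `decay_at` come from the text above, and the real detectors write to `orchestration_candidates` rather than returning lists.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class CandidateStatus(Enum):
    DETECTED = "detected"


@dataclass(frozen=True)
class Claim:
    entity: str
    decay_at: datetime


@dataclass(frozen=True)
class Candidate:
    entity: str
    detector_type: str
    status: CandidateStatus = CandidateStatus.DETECTED


def detect_gaps(claims, entities, min_claims: int):
    """Gap: an entity with fewer than min_claims claims warrants research."""
    counts = Counter(c.entity for c in claims)
    return [Candidate(e, "gap") for e in entities if counts[e] < min_claims]


def detect_stale(claims, now: datetime):
    """Staleness: a claim whose decay_at lies in the past."""
    return [Candidate(c.entity, "staleness") for c in claims if c.decay_at < now]
```

A contradiction detector would follow the same pattern, comparing pairs of active claims on one entity; it is omitted because the conflict rule itself is not specified here.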

Phase 5-6 — Scheduler (specs 037, 040)

Per-target interval scheduler with exponential backoff:

  • Each candidate has a next_check_at timestamp
  • The scheduler ticks every N seconds and processes every candidate that is due
  • Failed dispatches retry with exponential backoff (30s → 60s → 120s → ...)
  • Max attempts configurable; exhausted → FAILED state
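The retry schedule above doubles the delay after each failure. A minimal sketch follows, assuming a 30-second base taken from the sequence in the list and a hypothetical `MAX_ATTEMPTS` of 5; the real limit comes from configuration, and the function names are illustrative.

```python
from typing import Optional

BASE_DELAY_S = 30   # first retry delay, from the 30s -> 60s -> 120s sequence
MAX_ATTEMPTS = 5    # assumed value; configurable in the real service


def backoff_delay(failures: int, base: int = BASE_DELAY_S) -> int:
    """Delay in seconds before the next retry, after `failures` failed dispatches."""
    if failures < 1:
        raise ValueError("failures must be >= 1")
    return base * 2 ** (failures - 1)


def next_step(failures: int, max_attempts: int = MAX_ATTEMPTS) -> Optional[int]:
    """Return the retry delay, or None once attempts are exhausted (FAILED state)."""
    if failures >= max_attempts:
        return None
    return backoff_delay(failures)
```

A scheduler tick would add the returned delay to the current time to compute the candidate's next `next_check_at`, or mark the candidate FAILED when `None` comes back.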

Admin routes:

  • POST /scheduler/pause
  • POST /scheduler/resume
  • GET /scheduler/status

End-to-end integration test

apps/research/orchestrator/services/tests/integration/test_state_machine_e2e.py exercises:

  1. DETECTED → SCORING → SCORED → APPROVED → DISPATCHED → COMPLETED (accept path)
  2. Alternate path ending in DISMISSED

Every transition produces an audit event; the test asserts audit order + count.
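The state machine and its audit trail can be sketched as a transition table. The accept path below is taken from the list above; the point where the dismiss path branches (assumed here to be after SCORED), the `FAILED` edge, and the `(from, to)` audit tuples are assumptions, and the emergency bypass is not modeled.

```python
from enum import Enum


class CandidateStatus(Enum):
    DETECTED = "detected"
    SCORING = "scoring"
    SCORED = "scored"
    APPROVED = "approved"
    DISMISSED = "dismissed"
    DISPATCHED = "dispatched"
    COMPLETED = "completed"
    FAILED = "failed"


S = CandidateStatus
ALLOWED = {
    S.DETECTED: {S.SCORING},
    S.SCORING: {S.SCORED},
    S.SCORED: {S.APPROVED, S.DISMISSED},   # assumed branch point
    S.APPROVED: {S.DISPATCHED},
    S.DISPATCHED: {S.COMPLETED, S.FAILED},
}


def transition(current: CandidateStatus, target: CandidateStatus, audit: list) -> CandidateStatus:
    """Move to target if the edge is allowed, recording one audit entry per transition."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    audit.append((current.name, target.name))
    return target


def run_path(path, audit: list) -> CandidateStatus:
    """Walk a sequence of states, starting from the first element."""
    state = path[0]
    for target in path[1:]:
        state = transition(state, target, audit)
    return state
```

Asserting on the audit list after `run_path` mirrors what the integration test checks: one entry per transition, in order.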

Configuration

Per-target intervals, backoff, and SLA deadlines all live in services/config/orchestrator.yml. There is no hot reload; configuration changes require a service restart.

Why split into 6 phases?

  • Human-in-the-loop at the right place — Phase 3 is the single approval gate. Detectors run cheaply; scoring ranks; approval is where the HCP chooses.
  • Emergency bypass — Phase 1 exists so acute clinical signals don't wait in the queue.
  • Auditability — every transition emits an event; forensic review can trace the full candidate journey.
  • Scale independence — detectors + scheduler scale by load; approval scales by HCP availability.