Hypothesis Orchestrator (SPEC-03)
The orchestrator drives candidate hypotheses through a 6-phase lifecycle from detection to dispatch to completion.
Phase 1 — Emergency + dispatch
- Red-signal conditions bypass scoring + approval → direct dispatch
- Triggered by EHR import events (e.g., ER admission) or anomalous lab results
- Priority tier `emergency` in TaskQueue
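The bypass can be pictured as a routing decision taken before scoring. The following is a minimal sketch, not the project's implementation: the `Candidate` fields, the `RED_SIGNALS` set, the `normal` tier, the `route` function, and the heap-backed `TaskQueue` internals are all assumptions made for illustration. Only the `emergency` tier name comes from the text above.

```python
import heapq
import itertools
from dataclasses import dataclass

# Hypothetical tier ordering: a lower rank is dequeued first.
# Only the "emergency" tier name comes from the spec; "normal" is assumed.
TIER_RANK = {"emergency": 0, "normal": 1}

# Hypothetical trigger names, modelled on the examples in the text.
RED_SIGNALS = {"er_admission", "anomalous_lab_result"}


@dataclass
class Candidate:
    candidate_id: str
    trigger: str


class TaskQueue:
    """Toy priority queue: emergency entries always leave before normal ones."""

    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order within a tier

    def put(self, candidate_id: str, tier: str) -> None:
        heapq.heappush(self._heap, (TIER_RANK[tier], next(self._counter), candidate_id))

    def get(self) -> str:
        return heapq.heappop(self._heap)[2]


def route(candidate: Candidate, queue: TaskQueue) -> str:
    """Queue red-signal candidates for direct dispatch; send the rest to scoring."""
    if candidate.trigger in RED_SIGNALS:
        queue.put(candidate.candidate_id, "emergency")
        return "dispatched"
    return "scoring"


queue = TaskQueue()
queue.put("routine-1", "normal")
outcome = route(Candidate("c-42", "er_admission"), queue)
first_out = queue.get()  # the emergency entry overtakes the earlier routine one
```

The counter in the heap tuple is there so that two entries in the same tier never fall back to comparing candidate IDs, which keeps ordering within a tier first-in, first-out.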
Phase 2 — Feasibility scoring (spec 052)
Rule-based ScoringService (no LLM, deterministic, non-PHI inputs only):
- Feasibility — source coverage + data availability
- Impact — patient impact score
- Novelty — dismissed-sibling count (higher count → lower novelty)
- Cost estimate — projected LLM + retrieval cost
- Composite score: weighted sum per `DEFAULT_WEIGHTS`
- Reasoning bucket: low / mid / high band per `SCORE_LOW_BAND_MAX` / `SCORE_HIGH_BAND_MIN`
Produces frozen ScoringResult. Reads only non-PHI metadata per FR-052-008.
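As a rough illustration of the weighted sum and banding, the sketch below uses the names `DEFAULT_WEIGHTS`, `SCORE_LOW_BAND_MAX`, `SCORE_HIGH_BAND_MIN`, and `ScoringResult` from this section. Everything else is an assumption: the numeric values are placeholders, inputs are assumed to be normalised to the range 0 to 1, the novelty mapping is invented for the example, and the cost estimate is left out of the composite because this document does not say how it is weighted.

```python
from dataclasses import dataclass

# Placeholder values: the real weights and band thresholds are not given here.
DEFAULT_WEIGHTS = {"feasibility": 0.4, "impact": 0.4, "novelty": 0.2}
SCORE_LOW_BAND_MAX = 0.33
SCORE_HIGH_BAND_MIN = 0.66


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class ScoringResult:
    feasibility: float
    impact: float
    novelty: float
    composite: float
    bucket: str


def novelty_from_dismissed(dismissed_siblings: int) -> float:
    """Assumed mapping: more dismissed siblings means lower novelty."""
    return 1.0 / (1.0 + dismissed_siblings)


def bucket_for(composite: float) -> str:
    """Map a composite score to a band; boundary handling is an assumption."""
    if composite <= SCORE_LOW_BAND_MAX:
        return "low"
    if composite >= SCORE_HIGH_BAND_MIN:
        return "high"
    return "mid"


def score(feasibility: float, impact: float, dismissed_siblings: int) -> ScoringResult:
    """Deterministic, rule-based scoring: same inputs always give the same result."""
    novelty = novelty_from_dismissed(dismissed_siblings)
    parts = {"feasibility": feasibility, "impact": impact, "novelty": novelty}
    composite = sum(DEFAULT_WEIGHTS[name] * value for name, value in parts.items())
    return ScoringResult(feasibility, impact, novelty, composite, bucket_for(composite))


result = score(feasibility=0.8, impact=0.9, dismissed_siblings=1)
```

With these placeholder weights the example candidate has a novelty of 0.5 and a composite of roughly 0.78, which falls in the `high` band.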
Routes:
- `POST /orchestrator/v1/scoring/run-once`
- `POST /orchestrator/v1/scoring/run-batch`
Phase 3 — Stakeholder approval (spec 053)
Human-in-the-loop approval surface.
- Who can approve: `admin` | `provider` | `coordinator` (new `require_stakeholder` dependency)
- Who cannot: `patient` · `nurse`
- Decision + reasoning captured (1-2000 chars, Pydantic `extra='forbid'`)
Routes:
- `POST /candidates/{id}/approve` (30/min)
- `POST /candidates/{id}/dismiss` (30/min)
- `GET /candidates/pending-review` (consent-scoped, 300/min)
Audit events:
- `orchestrator.candidate.approved`
- `orchestrator.candidate.dismissed`
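The approval rules can be mirrored in plain Python to show what gets rejected. This is only a sketch: the real service is described as using Pydantic with `extra='forbid'` and a `require_stakeholder` dependency, neither of which is reproduced here. The `decide` function, the `APPROVER_ROLES` and `ALLOWED_FIELDS` constants, and the payload shape (a `decision` field in the body rather than separate routes) are hypothetical.

```python
# Roles and audit event names are taken from this section; the rest is assumed.
APPROVER_ROLES = frozenset({"admin", "provider", "coordinator"})
ALLOWED_FIELDS = frozenset({"decision", "reasoning"})
AUDIT_EVENTS = {
    "approve": "orchestrator.candidate.approved",
    "dismiss": "orchestrator.candidate.dismissed",
}


def decide(role: str, payload: dict) -> str:
    """Validate a stakeholder decision and return the audit event it would emit."""
    if role not in APPROVER_ROLES:
        raise PermissionError(f"role {role!r} may not approve or dismiss candidates")

    extra = set(payload) - ALLOWED_FIELDS
    if extra:
        # Mirrors extra='forbid': unknown fields are rejected, not silently dropped.
        raise ValueError(f"unexpected fields: {sorted(extra)}")

    decision = payload.get("decision")
    if decision not in AUDIT_EVENTS:
        raise ValueError("decision must be 'approve' or 'dismiss'")

    reasoning = payload.get("reasoning", "")
    if not isinstance(reasoning, str) or not 1 <= len(reasoning) <= 2000:
        raise ValueError("reasoning must be 1-2000 characters")

    return AUDIT_EVENTS[decision]


event = decide(
    "provider",
    {"decision": "approve", "reasoning": "Conflicting claims need review."},
)
```

Checking the role before the payload means an unauthorised caller learns nothing about the request schema from the error it receives.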
Phase 4 — Detectors (spec 034)
Three detector types; each produces a candidate:
| Detector | Signal | Example |
|---|---|---|
| Contradiction | 2 active claims for same entity conflict | "medication A helps" vs "medication A doesn't help" |
| Gap | Entity has fewer than N claims, warrants research | "no claims about drug X's pediatric dosing" |
| Staleness | Claim past decay_at threshold | "claim about treatment Y older than 2 years" |
Detectors write to orchestration_candidates with CandidateStatus.DETECTED + detector_type field.
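Two of the three detectors reduce to simple filters over claims, sketched below under stated assumptions. `decay_at`, `detector_type`, and `CandidateStatus.DETECTED` are named in this section; the `Claim` and `Candidate` shapes, the `detect_gaps` and `detect_stale` functions, the `min_claims` parameter, and the in-memory lists standing in for the `orchestration_candidates` table are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class CandidateStatus(Enum):
    DETECTED = "detected"


@dataclass(frozen=True)
class Claim:
    entity: str
    decay_at: datetime


@dataclass(frozen=True)
class Candidate:
    entity: str
    detector_type: str
    status: CandidateStatus = CandidateStatus.DETECTED


def detect_gaps(entities, claims, min_claims):
    """Flag entities with fewer than min_claims claims, including entities with none."""
    counts = Counter(claim.entity for claim in claims)
    return [Candidate(e, "gap") for e in entities if counts[e] < min_claims]


def detect_stale(claims, now):
    """Flag every claim whose decay_at is already in the past."""
    return [Candidate(c.entity, "staleness") for c in claims if c.decay_at < now]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
claims = [
    Claim("treatment-y", datetime(2023, 1, 1, tzinfo=timezone.utc)),
    Claim("medication-a", datetime(2026, 1, 1, tzinfo=timezone.utc)),
    Claim("medication-a", datetime(2026, 6, 1, tzinfo=timezone.utc)),
]
gaps = detect_gaps(["drug-x", "treatment-y", "medication-a"], claims, min_claims=2)
stale = detect_stale(claims, now)
```

Here `drug-x` (no claims) and `treatment-y` (one claim) are flagged as gaps, and the single `treatment-y` claim is flagged as stale. The contradiction detector is omitted because deciding that two claims conflict depends on claim semantics this document does not describe.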
Phase 5-6 — Scheduler (specs 037, 040)
Per-target interval scheduler with exponential backoff:
- Each candidate has a `next_check_at` timestamp
- Scheduler ticks every N seconds and processes candidates that are due
- Failed dispatches retry with exponential backoff (30s → 60s → 120s → ...)
- Max attempts configurable; exhausted → `FAILED` state
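The retry schedule above doubles the delay on each failure, which can be sketched as follows. The 30-second base and the `FAILED` state come from the list above; `MAX_ATTEMPTS`, the `RETRY` label, and the function names are assumptions, and the actual values live in configuration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

BASE_DELAY_SECONDS = 30  # first step of the 30s -> 60s -> 120s sequence
MAX_ATTEMPTS = 4         # assumed value; the real limit is configurable


def backoff_delay(failed_attempts: int) -> int:
    """Seconds to wait before the next retry; doubles with every failed attempt."""
    return BASE_DELAY_SECONDS * 2 ** (failed_attempts - 1)


def is_due(next_check_at: datetime, now: datetime) -> bool:
    """A candidate is processed on the first tick at or after next_check_at."""
    return next_check_at <= now


def after_failure(failed_attempts: int, now: datetime) -> Tuple[str, Optional[datetime]]:
    """Return the new state and, if retrying, the new next_check_at."""
    if failed_attempts >= MAX_ATTEMPTS:
        return "FAILED", None
    return "RETRY", now + timedelta(seconds=backoff_delay(failed_attempts))


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
delays = [backoff_delay(n) for n in (1, 2, 3)]
state, next_check_at = after_failure(2, now)
```

After the second failure the candidate is rescheduled 60 seconds out, so the scheduler skips it on every tick until `is_due` returns true.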
Admin routes:
- `POST /scheduler/pause`
- `POST /scheduler/resume`
- `GET /scheduler/status`
End-to-end integration test
apps/research/orchestrator/services/tests/integration/test_state_machine_e2e.py exercises:
- `DETECTED → SCORING → SCORED → APPROVED → DISPATCHED → COMPLETED` (accept path)
- Alternate path ending in `DISMISSED`
Every transition produces an audit event; the test asserts audit order + count.
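The kind of assertion this test makes can be sketched with a small transition table. This is an illustration, not the contents of `test_state_machine_e2e.py`: the `ALLOWED` table, the `CandidateStateMachine` class, and the tuple-based audit log are hypothetical, and placing the `DISMISSED` branch after `SCORED` is an assumption, since the text only says the alternate path ends there.

```python
from enum import Enum


class CandidateStatus(Enum):
    DETECTED = "DETECTED"
    SCORING = "SCORING"
    SCORED = "SCORED"
    APPROVED = "APPROVED"
    DISMISSED = "DISMISSED"
    DISPATCHED = "DISPATCHED"
    COMPLETED = "COMPLETED"


S = CandidateStatus

# Assumed transition table; DISMISSED branching from SCORED is a guess.
ALLOWED = {
    S.DETECTED: {S.SCORING},
    S.SCORING: {S.SCORED},
    S.SCORED: {S.APPROVED, S.DISMISSED},
    S.APPROVED: {S.DISPATCHED},
    S.DISPATCHED: {S.COMPLETED},
}


class CandidateStateMachine:
    def __init__(self) -> None:
        self.status = S.DETECTED
        self.audit_log = []  # one (from, to) entry per transition, in order

    def transition(self, target: CandidateStatus) -> None:
        if target not in ALLOWED.get(self.status, set()):
            raise ValueError(f"illegal transition {self.status.name} -> {target.name}")
        self.audit_log.append((self.status.name, target.name))
        self.status = target


accept = CandidateStateMachine()
for step in (S.SCORING, S.SCORED, S.APPROVED, S.DISPATCHED, S.COMPLETED):
    accept.transition(step)

dismiss = CandidateStateMachine()
for step in (S.SCORING, S.SCORED, S.DISMISSED):
    dismiss.transition(step)
```

Under these assumptions the accept path records five audit entries and the dismiss path three, and any attempt to move on from a terminal state raises an error.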
Configuration
Per-target intervals + backoff + SLA deadlines all live in services/config/orchestrator.yml. No hot-reload — service restart needed.
Why split into 6 phases?
- Human-in-the-loop at the right place — Phase 3 is the single approval gate. Detectors run cheaply; scoring ranks; approval is where the HCP chooses.
- Emergency bypass — Phase 1 exists so acute clinical signals don't wait in the queue.
- Auditability — every transition emits an event; forensic review can trace the full candidate journey.
- Scale independence — detectors + scheduler scale by load; approval scales by HCP availability.