Research Agents

5 agents in research-engine, each following the three-file pattern (FR-KIA-013):

  • prepare.py — normalize inputs · fail-fast validation (raises PrepareError)
  • experiment.py — the work (LLM call or pure logic)
  • reshape.py — output conformance · error mapping
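A minimal sketch of the three-file pattern. `PrepareError` comes from the description above; the function bodies, field names, and `run` composition are illustrative assumptions, not the actual module contents:

```python
class PrepareError(ValueError):
    """Raised by prepare.py when inputs fail fast-path validation."""

def prepare(raw: dict) -> dict:
    # prepare.py: normalize inputs, fail fast on anything malformed
    if "research_question" not in raw:
        raise PrepareError("research_question is required")
    return {"question": raw["research_question"].strip().lower()}

def experiment(inputs: dict) -> dict:
    # experiment.py: the actual work (LLM call or pure logic)
    return {"answer": f"evidence for: {inputs['question']}"}

def reshape(result: dict) -> dict:
    # reshape.py: conform output to the downstream schema, map errors
    return {"claim": result["answer"], "status": "pending"}

def run(raw: dict) -> dict:
    # the three files compose into one linear pipeline
    return reshape(experiment(prepare(raw)))
```

The split keeps validation failures (`PrepareError`) cleanly separated from work failures and schema mapping, so each stage can be tested in isolation.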

Researcher

  • Input: ResearchTask (priority tier: emergency / priority / batched)
  • Output: pending Claim
  • LLM: GPT-4o primary → Gemini fallback
  • Process:
    1. MemoryClient.retrieve(research_question, patient_id) → context
    2. Prompt hygiene: sanitize → fence context in <<<CLINICAL_DATA>>> delimiters → de-ID → LLM call
    3. schema-validated response → emit pending claim w/ supports[]
  • Fail-soft: retrieval error → PubMed-only fallback + consumer.retrieval_fallback audit
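The sanitize → delimit → de-ID chain in step 2 might look like the sketch below. The delimiter token is from the description above; the stripping rule, the regex-based de-ID pass, and all function names are hypothetical stand-ins for the real pipeline:

```python
import re

DELIM_OPEN = "<<<CLINICAL_DATA>>>"
DELIM_CLOSE = "<<<END_CLINICAL_DATA>>>"  # assumed closing token

def sanitize(text: str) -> str:
    # Strip anything resembling the delimiter so retrieved context
    # cannot break out of its fenced region (prompt-injection hygiene).
    return text.replace("<<<", "").replace(">>>", "")

def deidentify(text: str) -> str:
    # Illustrative de-ID pass: redact long digit runs (MRN-like IDs).
    return re.sub(r"\b\d{6,}\b", "[REDACTED]", text)

def build_prompt(question: str, context: str) -> str:
    safe = deidentify(sanitize(context))
    return f"Question: {question}\n{DELIM_OPEN}\n{safe}\n{DELIM_CLOSE}"
```

The ordering matters: sanitizing before delimiting guarantees exactly one opening token survives, and de-ID runs last so nothing reintroduces identifiers inside the fence.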

Critic

  • Input: pending claim + existing active claims for same entity
  • Output: CriticVerdict (promote · supersede · reject + score + reasoning)
  • LLM: GPT-4o
  • Process:
    1. Build evidence comparison prompt
    2. Parse verdict + confidence
    3. If supersede: atomically transition the old claim active → superseded via PATCH w/ supersedes[] query param
  • SLA: 30s via asyncio.wait_for. After 3 consecutive timeouts, the claim transitions to CRITIC_QUARANTINE with a research.critic.quarantine audit; quarantined claims are operator-drained only.
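The SLA and quarantine behavior can be sketched as below. The 30s budget, the 3-timeout threshold, and the CRITIC_QUARANTINE state come from the description above; the class shape, field names, and claim dict are assumptions:

```python
import asyncio

CRITIC_SLA_S = 30
QUARANTINE_AFTER = 3

class CriticRunner:
    """Illustrative sketch of the Critic SLA loop."""

    def __init__(self, sla_s: float = CRITIC_SLA_S):
        self.sla_s = sla_s
        self.consecutive_timeouts = 0

    async def critique(self, claim: dict, llm_call) -> str:
        try:
            verdict = await asyncio.wait_for(llm_call(claim), timeout=self.sla_s)
        except asyncio.TimeoutError:
            self.consecutive_timeouts += 1
            if self.consecutive_timeouts >= QUARANTINE_AFTER:
                # Operator-drained only: no automatic recovery from here
                claim["state"] = "CRITIC_QUARANTINE"
                return "quarantined"
            return "timeout"
        self.consecutive_timeouts = 0  # any success resets the streak
        return verdict
```

Resetting the counter on success is the key detail: only *consecutive* timeouts trip the quarantine, so one slow upstream call amid healthy traffic never escalates.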

Correlator

  • Input: entity cluster with ≥2 active claims
  • Output: Finding w/ 4-value enum (co_occurrence / temporal_trend / dose_response / shared_citation_cluster) OR None
  • LLM: GPT-4o
  • Process:
    1. Build cluster prompt
    2. LLM returns pattern or "no_pattern"
    3. Finding persisted to correlator_findings (180d TTL)
  • Audits: research.correlator.finding or research.correlator.no_pattern
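The 4-value enum and the "no_pattern" path above suggest a parse step like this sketch. The enum values are from the description; the enum class and parser names are assumptions:

```python
from enum import Enum
from typing import Optional

class PatternKind(str, Enum):
    co_occurrence = "co_occurrence"
    temporal_trend = "temporal_trend"
    dose_response = "dose_response"
    shared_citation_cluster = "shared_citation_cluster"

def parse_pattern(llm_output: str) -> Optional[PatternKind]:
    # "no_pattern" (or anything unrecognized) yields None rather than a Finding
    try:
        return PatternKind(llm_output.strip())
    except ValueError:
        return None
```

Mapping every unrecognized string to None keeps the agent fail-soft: a malformed LLM reply degrades to "no finding" instead of a crash.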

Replicator

  • Input: active claim aged > 30 days
  • Output: verdict (eroded / retracted_source / confirmed)
  • LLM: via Researcher (delegates)
  • Process:
    1. Re-run Researcher with same research question
    2. Compare new evidence vs old supports[]
    3. eroded / retracted_source → emit Finding + enqueue critic-role re-critique task via TaskQueueManager
  • Audits: research.replicator.finding · research.replicator.skipped
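Step 2's evidence comparison might reduce to a rule like the one below. The three verdict strings come from the description above; the majority-survival threshold and all parameter names are a hypothetical decision rule, not the actual spec:

```python
def replication_verdict(old_supports, new_supports, retracted):
    """Hypothetical decision rule for the Replicator's verdict enum."""
    # Any retracted source among the original supports[] wins outright
    if set(old_supports) & set(retracted):
        return "retracted_source"
    # Eroded: the fresh run no longer reproduces a majority of old supports[]
    surviving = set(old_supports) & set(new_supports)
    if len(surviving) * 2 < len(old_supports):
        return "eroded"
    return "confirmed"
```

Checking retraction before erosion mirrors the agent's output ordering: a retracted source is a stronger, more actionable signal than thinning evidence.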

Librarian (LLM-free)

  • Input: memory_claims batch (100/tick)
  • Output: Finding (broken_provenance / approaching_decay)
  • LLM: none (pure validator)
  • Process:
    1. For each claim, resolve each supports[] entry against research_sources
    2. Missing source → broken_provenance (takes precedence)
    3. decay_at within 30-day warn window → approaching_decay
    4. Re-check path: previously-flagged claim now healthy → mark_resolved + research.librarian.resolved audit
  • Why LLM-free: validation is a pure function over structured data; no inference needed; cheap to run at scale
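Because validation here is a pure function over structured data, the whole check fits in a few lines. The precedence rule and 30-day warn window are from the description above; the field names and return convention are illustrative:

```python
from datetime import datetime, timedelta, timezone

WARN_WINDOW = timedelta(days=30)

def check_claim(claim: dict, sources: set, now: datetime):
    """Pure validation over structured data: no LLM, no inference."""
    # Broken provenance takes precedence over decay warnings
    if any(src not in sources for src in claim["supports"]):
        return "broken_provenance"
    # decay_at inside the 30-day warn window
    if claim["decay_at"] - now <= WARN_WINDOW:
        return "approaching_decay"
    return None  # healthy; a previously-flagged claim would be mark_resolved
```

Being a pure function, it is trivially unit-testable and cheap enough to run over the full memory_claims table every tick.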

Agent scheduler base

All 3 periodic agents (Correlator + Replicator + Librarian) inherit AgentSchedulerBase (spec 050):

  • Lifespan task + interval tick loop
  • Cooperative pause / resume via admin routes
  • Run-once with run-lock (409 if a run is already in flight)
  • 24h findings window accounting
  • FindingsRepository — insert-only + guarded mark_resolved + 180d TTL

Admin routes under /api/research/v1/{correlator,replicator,librarian}/{status,run-once,cancel-current} — all @Roles('admin')-gated.
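The tick loop, pause/resume, and run-lock semantics can be sketched as below. The 409-on-conflict behavior is from the list above; the method names and status-code return convention are assumptions, not the actual spec 050 surface:

```python
import asyncio

class AgentSchedulerBase:
    """Minimal sketch of the shared periodic-agent base."""

    def __init__(self, interval_s: float):
        self.interval_s = interval_s
        self._paused = False
        self._run_lock = asyncio.Lock()

    async def tick(self) -> None:
        raise NotImplementedError  # subclasses do the real work

    def pause(self) -> None:
        self._paused = True  # cooperative: takes effect at the next tick

    def resume(self) -> None:
        self._paused = False

    async def run_once(self) -> int:
        # 409 semantics: refuse if a run is already in flight
        if self._run_lock.locked():
            return 409
        async with self._run_lock:
            await self.tick()
            return 200

    async def loop(self, ticks: int) -> None:
        # interval tick loop driven by the lifespan task
        for _ in range(ticks):
            if not self._paused:
                await self.run_once()
            await asyncio.sleep(self.interval_s)
```

Routing both the interval loop and the admin run-once route through the same lock is what makes the 409 guard airtight: there is exactly one gate for all entry points.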

Evidence sources (12 active)

| Name | Connector | Auth | Spec |
| --- | --- | --- | --- |
| PubMed | pubmed.py | NCBI api_key optional (10 rps w/ key, 3 rps anon) | 048 |
| NEJM | nejm.py | Bearer key · 5 rps | 049 |
| JAMA · JBJS · Clinical Orthopaedics | lww.py (shared) | Bearer · 5 rps | 051 |
| JOR · JBMR-B | wiley.py (shared) | Bearer · 5 rps | 051 |
| Spine Journal · Acta Biomaterialia · J. Arthroplasty | elsevier.py (shared) | Bearer · 5 rps | 051 |
| NCCN | nccn.py | Bearer · guideline endpoint | 051 |
| Cochrane | cochrane.py | Bearer · review type counter | 051 |
| ClinicalTrials.gov | clinicaltrials_gov.py | Optional X-API-Key | 055 |
| ClinVar | clinvar.py | NCBI E-utilities | 055 |
| DrugBank (planned) | drugbank.py | Bearer (commercial license pending) | 055 |

Every connector shares the same contract: a 24h cache; fail-soft on 429/5xx/transport errors (empty hits + a WARNING log); structured metadata via SourceHit prefix conventions; and no PHI flows outbound (queries are scientific topics only).
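The shared fail-soft contract might look like this sketch. The 429/5xx/transport triple and the empty-hits + WARNING behavior are from the description above; the `fetch` callable, the `(status, hits)` convention, and the logger name are assumptions:

```python
import logging

logger = logging.getLogger("research.connectors")

def fail_soft_search(fetch, query: str) -> list:
    """Run a connector query; degrade to empty hits instead of raising."""
    try:
        status, hits = fetch(query)
    except OSError as exc:  # transport-level failure (DNS, reset, timeout)
        logger.warning("connector transport error: %s", exc)
        return []
    if status == 429 or status >= 500:  # rate-limited or upstream outage
        logger.warning("connector upstream error: HTTP %s", status)
        return []
    return hits
```

Returning `[]` rather than raising keeps one flaky upstream from failing a whole research tick; the WARNING log preserves the signal for operators.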