Knowledge Lifecycle — Add / Change / Delete

The knowledge graph (entities, relationships, claims) is append-only by design — "delete" is expressed as a state transition (retracted / archived), never a hard destroy. This preserves the audit chain (FR-023, Principle IV) while still letting operators pull bad data out of circulation.

Three access paths exist, ordered by typical use. Only the first two can write:

SDK (MemoryClient from patientrx-memory-sdk) — how agents + research-engine services write.
HTTP (memory-store REST API, port 6401) — how apps/api and custom integrations write.
MCP tools — read-only; cannot add, change, or delete.

Every write emits an audit event via AuditClient.append() (hash-chained, append-only).

Write paths at a glance

1. Add

Add an entity (concept, problem, intervention, outcome, population)

HTTP: POST /api/memory/entities

{
  "entity_type": "problem",
  "name": "Type 2 diabetes with stage 3 CKD",
  "taxonomy_category": "problem",
  "ontology_codes": [{"system": "ICD-10", "code": "E11.22"}],
  "patientId": "<uuid or null for public ontology>",
  "origin": "human"
}

SDK: ingest-batch shape (MemoryClient.ingest_batch([entity], [], [])) — the SDK does not expose a single-entity create_entity() method; batch with one item is the normal path.

Auth: @require_scope('memory.write') + consent check. Patient-scoped entities require either patient-self, standing consent, or admin bypass.

Audit: memory.entity.create — payload carries entity id + type + patientId; never the raw name.

Add a claim (evidence statement about an entity)

HTTP: POST /api/memory/claims

{
  "entity_id": "<entity uuid>",
  "claim_text": "Evidence from NEJM 2024 study of 1,200 patients ...",
  "supports": ["<research_source id>"],
  "pii_fields": [],
  "patientId": null,
  "decay_at": "2026-10-22T00:00:00Z",
  "supersedes": "<optional existing claim id to mark superseded atomically>"
}

Lifecycle state on insert: pending (the Critic will promote to active asynchronously — see agents).

Auth: @require_scope('knowledge.write') + entity referential-integrity check (422 if entity_id doesn't exist).

Audit: memory.claim.create — or memory.claim.supersede when supersedes is set.

Add a relationship (e.g., `intervention --treats-> problem`)

HTTP: POST /api/memory/relationships

{
  "source_id": "<entity uuid>",
  "target_id": "<entity uuid>",
  "rel_type": "treats",
  "attributes": {"evidence_class": "RCT"},
  "patientId": null
}

Auth: @require_scope('memory.relationship.write') + referential-integrity on both ids.

Audit: memory.relationship.created.

Add in bulk (ingest)

SDK: MemoryClient.ingest_batch(entities, claims, relationships, chunk_size=500) — tolerates partial failure (FR-043-005), returns per-item outcomes.

HTTP: POST /api/memory/ingest

Audit: one memory.ingest.batch event with counts only; per-item events still emit.
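
The partial-failure behaviour described above can be pictured with a small stand-alone sketch. It does not call the real SDK: `ingest_in_chunks`, `ItemOutcome`, and `fake_write` are hypothetical names used only to show chunking with per-item outcomes, and the actual internals of `MemoryClient.ingest_batch` may differ.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ItemOutcome:
    index: int      # position of the item in the original input
    ok: bool
    error: str = ""


def ingest_in_chunks(
    items: List[dict],
    write_one: Callable[[dict], None],
    chunk_size: int = 500,
) -> List[ItemOutcome]:
    """Write items chunk by chunk; a failing item never aborts the batch."""
    outcomes: List[ItemOutcome] = []
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        for offset, item in enumerate(chunk):
            try:
                write_one(item)
                outcomes.append(ItemOutcome(start + offset, True))
            except Exception as exc:  # record the failure and keep going
                outcomes.append(ItemOutcome(start + offset, False, str(exc)))
    return outcomes


def fake_write(item: dict) -> None:
    # Hypothetical stand-in for the store write: rejects nameless entities.
    if not item.get("name"):
        raise ValueError("name is required")


results = ingest_in_chunks(
    [{"name": "Type 2 diabetes"}, {"name": ""}, {"name": "Metformin"}],
    fake_write,
    chunk_size=2,
)
# Counts-only summary, in the spirit of the batch-level audit event.
summary = {"ok": sum(r.ok for r in results), "failed": sum(not r.ok for r in results)}
```

The caller inspects `results` to retry or report individual failures, while `summary` carries counts only.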

2. Change

There are two flavors of change: content (rare, only when the content was wrong at write time) and lifecycle state (common — promote, supersede, retract).

Change lifecycle state

Every entity and every claim has a lifecycle_state field whose transitions are machine-validated. Invalid transitions return HTTP 422.

HTTP (claim): PATCH /api/memory/claims/{claim_id}/state

{
  "target_state": "superseded",
  "supersedes": ["<new claim id>"]
}

HTTP (entity): PATCH /api/memory/entities/{entity_id}/state

{"target_state": "retracted"}

Claim transitions (enforced in claim_repository.validate_claim_transition()):

Entity transitions (enforced in entity_repository.validate_transition()):

From \ To	active	superseded	retracted	archived
pending	✓	—	✓	✓
active	—	✓	✓	✓
superseded	—	—	—	✓
retracted	—	—	—	✓
archived	—	—	—	(terminal)
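
The transition table above can be encoded as a lookup of allowed target states. This is a minimal sketch, not the repository code: `ALLOWED_TRANSITIONS`, `InvalidTransition`, and `check_transition` are hypothetical names, and the real `validate_transition()` may apply further checks.

```python
# One entry per row of the table: from_state -> set of allowed to_states.
ALLOWED_TRANSITIONS = {
    "pending": {"active", "retracted", "archived"},
    "active": {"superseded", "retracted", "archived"},
    "superseded": {"archived"},
    "retracted": {"archived"},
    "archived": set(),  # terminal
}


class InvalidTransition(ValueError):
    """Raised where the HTTP layer would answer 422."""


def check_transition(from_state: str, to_state: str) -> None:
    allowed = ALLOWED_TRANSITIONS.get(from_state)
    if allowed is None:
        raise InvalidTransition(f"unknown state: {from_state!r}")
    if to_state not in allowed:
        raise InvalidTransition(f"{from_state} -> {to_state} is not allowed")
```

Keeping the table as data means a new state or edge is a one-line change, and the same mapping can back both the validator and any documentation generated from it.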

Auth on PATCH claims/{id}/state: @require_scope('memory.claim.admin') — admin-only. A non-admin caller receives 403 + a memory.govern.unauthorized_access_attempt audit event.

Auth on PATCH entities/{id}/state: memory.write + entity ACL check. Retirement attempts (active → retracted / archived) additionally emit memory.govern.retire_attempt with {actor_id, entity_id, action_detail: "approved"|"denied"}.

Audit: memory.claim.state_transition / memory.entity.state_transition — payload includes from_state, to_state, actor_id (salted hash), patientId, and the supersedes list when set.

Change content

When a claim's text is wrong (as opposed to its conclusion being outdated), emit a new claim and mark the old one superseded atomically:

POST /api/memory/claims
{
  "entity_id": "<same entity id>",
  "claim_text": "<corrected text>",
  "supports": ["<source ids>"],
  "supersedes": "<old claim id>"
}

This produces two audit rows — memory.claim.create for the new claim and memory.claim.supersede for the old — both hash-chained so the correction is tamper-evident.

Direct PATCH of claim_text_encrypted or entity name is not exposed. If truly needed (e.g., a PHI accident), route through /hipaa-incident-response — the incident response skill records the justification, performs a compensating write, and documents the chain in the incident log.

3. Delete

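There is no hard delete for entities or claims. (The one exception in the graph is the decay job in §4.3, which hard-deletes expired hypotheses and non-PHI relationships.) "Delete" is always one of: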

Intent	Target state	Emits
Rejected by Critic / bad evidence	`retracted`	`memory.claim.state_transition` (to retracted)
Superseded by newer claim	`superseded` + new claim	`memory.claim.supersede`
Broken provenance (source retracted)	`retracted`	Librarian `memory.govern.broken_provenance`
Decayed / eroded over time	`archived`	Replicator `memory.govern.eroded`
Operator pulled from circulation	`retracted` → `archived`	`memory.entity.state_transition` x2
Quarantined claim rejected by operator	`retracted`	Operator drain — `memory.claim.state_transition`

Why no hard delete?

Principle IV (Audit by Default) — the audit chain (seq, prevHash, eventHash via RFC 8785 JCS + SHA-256) is append-only. A hard delete would break the chain and trigger a Sev 0 alert from the hourly AuditVerificationJob.
Supersession traceability — get_claim_chain() walks backwards through supersedes links; a missing parent is treated as tamper.
HIPAA 164.316(b) — 6-year retention of audit-relevant artifacts.
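
To see why a hard delete is detectable, here is a minimal sketch of a hash chain using the same three fields (seq, prevHash, eventHash). It is illustrative only: `canonical()` approximates RFC 8785 JCS for simple payloads rather than implementing it fully, and the genesis value, event layout, and function names are assumptions, not the AuditClient implementation.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prevHash for the first event


def canonical(obj: dict) -> bytes:
    # Approximates JCS for simple payloads (ASCII keys, strings, ints):
    # sorted keys, no insignificant whitespace, UTF-8 output.
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def append_event(chain: list, payload: dict) -> dict:
    prev_hash = chain[-1]["eventHash"] if chain else GENESIS
    body = {"seq": len(chain), "prevHash": prev_hash, "payload": payload}
    event = dict(body, eventHash=hashlib.sha256(canonical(body)).hexdigest())
    chain.append(event)
    return event


def verify(chain: list) -> bool:
    prev_hash = GENESIS
    for expected_seq, event in enumerate(chain):
        body = {k: event[k] for k in ("seq", "prevHash", "payload")}
        if event["seq"] != expected_seq or event["prevHash"] != prev_hash:
            return False
        if hashlib.sha256(canonical(body)).hexdigest() != event["eventHash"]:
            return False
        prev_hash = event["eventHash"]
    return True


chain: list = []
for action in ("memory.claim.create", "memory.claim.state_transition", "memory.claim.supersede"):
    append_event(chain, {"action": action})

intact = verify(chain)
del chain[1]          # simulate a hard delete of one audit row
broken = verify(chain)
```

After the deletion, the surviving third event no longer follows its predecessor in sequence or hash, so verification fails.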

Retention does eventually dispose of data: entities and claims follow the patient's data-retention policy, and correlator / replicator / librarian findings have a 180-day TTL. Disposition is logged as memory.govern.retention_disposed, never as a raw DB delete.

4. Confidence & freshness — how they're determined

Confidence and freshness are the two signals that decide whether a claim keeps driving retrieval or gets pulled for review. They are computed by four cooperating mechanisms — two per-claim scorers and two background jobs.

4.1 Confidence score

Claim.confidence_score is a 0.0..1.0 float persisted on the document. Two code paths write it:

At ingest: the Researcher sets an initial confidence_score based on the source and its own retrieval-route score. The value is stored in memory_claims.confidence_score at POST /api/memory/claims time.
At Critic promotion: the Critic agent (apps/research/research-engine/services/agents/critic/experiment.py) prompts the LLM with the claim and its cited sources and returns a CriticVerdict whose score maps to one of promote / supersede / reject / quarantine. The LLM's numeric confidence is passed through to drive the verdict.

Tie-break rule — when two claims conflict on the same entity, CriticJob._lower_confidence(a, b) (apps/research/memory-store/services/jobs/critic.py) picks the lower-confidence one as the loser and emits memory.claim.state_transition to superseded. Ties fall back to t_created (older wins).

Shipped today (spec 063, PR #189):

confidence_score stored on every claim and read by CriticJob._lower_confidence() for pairwise tie-breaks.
Critic LLM verdict drives promote / supersede / reject / quarantine.
Source authority tiers — SourceTier enum + canonical SOURCE_TIER_MAP in patientrx_contracts.enums. Tier-1 (NEJM · JAMA · Cochrane · NCCN), tier-2 (PubMed · Elsevier · Wiley · LWW · ClinicalTrials.gov · ClinVar · DrugBank), tier-3 (patient-note · open-web · clinical-narrative). Unknown connector → tier-3 (fail-safe low). Critic prompt rule #8 weights tier-1 > tier-2 > tier-3 and forbids tier-3 → tier-1 supersession.
Critic-verdict score persists onto the claim — PromotionPipeline._apply_verdict forwards verdict.score as confidence_score on promote + supersede via the memory-store PATCH endpoint, applied atomically with the state-transition transaction.
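
The tier lookup and its fail-safe default can be sketched as follows. The enum shape, the connector key spellings, and the helper names `tier_for` and `may_supersede` are assumptions made for illustration; the canonical map lives in patientrx_contracts.enums, and in the shipped system the supersession rule is a Critic prompt rule rather than a code check.

```python
from enum import IntEnum


class SourceTier(IntEnum):
    TIER_1 = 1
    TIER_2 = 2
    TIER_3 = 3


# Connector keys are illustrative spellings, not the canonical map.
SOURCE_TIER_MAP = {
    "nejm": SourceTier.TIER_1,
    "cochrane": SourceTier.TIER_1,
    "pubmed": SourceTier.TIER_2,
    "clinicaltrials.gov": SourceTier.TIER_2,
    "patient-note": SourceTier.TIER_3,
    "open-web": SourceTier.TIER_3,
}


def tier_for(connector: str) -> SourceTier:
    # Unknown connectors fall back to the lowest-authority tier (fail-safe low).
    return SOURCE_TIER_MAP.get(connector.lower(), SourceTier.TIER_3)


def may_supersede(new_connector: str, old_connector: str) -> bool:
    # Encodes only the hard rule named above: tier-3 may not supersede tier-1.
    return not (
        tier_for(new_connector) == SourceTier.TIER_3
        and tier_for(old_connector) == SourceTier.TIER_1
    )
```

Defaulting unknown connectors to tier-3 means a misconfigured or new source can never outrank a vetted one by accident.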

4.2 Freshness score

Freshness today is a boolean per-field staleness check, not a numeric score. It lives in apps/research/memory-store/services/completeness/computer.py:

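from datetime import datetime

def _is_stale(claim, freshness_days: int, now: datetime) -> bool:
    # Whole days since the claim was written; partial days are truncated.
    age_days = (now - claim.t_created).days
    return age_days > freshness_days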

freshness_days is declared per field, per screen profile (Spec 016 Screen Profile freshnessRules). Example: a vitals reading might carry freshness_days: 1 on an inpatient screen and freshness_days: 30 on a clinic follow-up screen.

The completeness scorer folds stale fields into a 0..1 screen-level score:

screen_completeness = (present_fields + 0.5 × stale_fields) / total_required_fields

So a stale field is half-present — it still exists, but it's discounted until a refresh lands.
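
A worked example of the formula, assuming `present_fields` counts only fresh fields and each required field is represented by the write time of its latest active claim (or `None` when missing). The field names and freshness rules below are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def screen_completeness(
    latest_claim_time: Dict[str, Optional[datetime]],
    freshness_days: Dict[str, int],
    now: datetime,
) -> float:
    """Fresh fields count 1.0, stale fields 0.5, missing fields 0."""
    fresh = stale = 0
    for field, t_created in latest_claim_time.items():
        if t_created is None:
            continue  # required field with no active claim
        if (now - t_created).days > freshness_days[field]:
            stale += 1
        else:
            fresh += 1
    return (fresh + 0.5 * stale) / len(latest_claim_time)


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
score = screen_completeness(
    {
        "vitals": now - timedelta(days=3),    # stale against a 1-day rule
        "hba1c": now - timedelta(days=10),    # fresh against a 90-day rule
        "egfr": now - timedelta(days=20),     # fresh against a 30-day rule
        "medications": None,                  # missing
    },
    {"vitals": 1, "hba1c": 90, "egfr": 30, "medications": 30},
    now,
)
# Two fresh, one stale, one missing: (2 + 0.5 * 1) / 4 = 0.625
```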

What determines staleness per claim:

Input	Source
`claim.t_created`	write time (Mongo field)
`claim.t_invalid`	supersede / retract time (Mongo field)
`freshness_days`	screen profile definition (`apps/api/src/modules/screen-profiles/registry/profiles/*.ts`)
`lifecycle_state` gate	only `active` claims enter the staleness check; `superseded` / `retracted` / `archived` are skipped

Shipped today: per-field boolean staleness per screen profile; half-weight in the completeness score.

Planned (follow-up): a numeric claim-level freshness score (0..1) combining source age, source authority tier, and supersession depth into a single value. Today the boolean staleness check + FreshnessConfidenceJob's age-based gate (see §4.4) together cover the operational need; the single-number scorer is a reporting/UI convenience, not a gap in enforcement.

4.3 Decay / TTL job

apps/research/memory-store/services/jobs/decay.py — runs on a scheduler configured in jobs_config.yml:

Interval: 3600s (1 hour) by default
Scan: memory_claims.find({"decay_at": {"$lt": now}, "lifecycle_state": {"$in": ["active", "pending"]}})
Action:
- PHI-bearing claims → claim_repo.anonymize_expired() (per-field redaction per pii_fields[], not a delete, per FR-042-012)
- Hypotheses + non-PHI relationships → hard delete (they carry no audit weight)
Audit events: memory.job.decay.started / .anonymized / .deleted / .completed / .error (FR-042-009)

decay_at is set at claim ingest time by the Researcher — a clinical-narrative claim might carry a 2-year decay_at; a research-article claim might carry 10 years. It is a privacy-retention horizon, not a confidence decay.
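
The anonymize branch of the sweep can be sketched in memory as below. This covers only expired PHI-bearing claims; the hard-delete branch and the audit events are omitted. The `REDACTED` placeholder, the dict-based claim shape, and the `sweep` function are assumptions, not the behaviour of `claim_repo.anonymize_expired()`.

```python
from datetime import datetime, timezone

REDACTED = "[redacted]"  # assumed placeholder value


def sweep(claims: list, now: datetime) -> dict:
    """Redact pii_fields on expired, sweepable claims in place; report ids."""
    anonymized, skipped = [], []
    for claim in claims:
        expired = claim["decay_at"] < now
        sweepable = claim["lifecycle_state"] in ("active", "pending")
        if not (expired and sweepable and claim["pii_fields"]):
            skipped.append(claim["id"])
            continue
        for field in claim["pii_fields"]:
            if field in claim:
                claim[field] = REDACTED  # per-field redaction; the row itself stays
        anonymized.append(claim["id"])
    return {"anonymized": anonymized, "skipped": skipped}


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
claims = [
    {"id": "c1", "lifecycle_state": "active", "pii_fields": ["claim_text"],
     "claim_text": "free-text narrative", "decay_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": "c2", "lifecycle_state": "active", "pii_fields": ["claim_text"],
     "claim_text": "free-text narrative", "decay_at": datetime(2030, 1, 1, tzinfo=timezone.utc)},
    {"id": "c3", "lifecycle_state": "archived", "pii_fields": ["claim_text"],
     "claim_text": "free-text narrative", "decay_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]
report = sweep(claims, now)
```

Only `c1` is redacted: `c2` has not reached its horizon and `c3` is outside the scanned lifecycle states.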

4.4 Re-research triggers — Replicator + FreshnessConfidenceJob

Two complementary jobs decide when an active claim gets re-surfaced to the Critic.

Replicator (verdict-driven, weekly)

apps/research/research-engine/services/workers/replicator_job.py — runs weekly on a sample of aged active claims:

Re-fetches the cited sources (via the 12 trusted-source connectors).
Scores each claim as one of still_supported / eroded / retracted_source / broken_provenance.

On eroded or retracted_source → calls _enqueue_recritique_task() which writes a new research_tasks row with:

{
  "agent_role": "critic",
  "research_question": "<claim text>",
  "entity_context": ["<entity ids>"],
  "metadata": {"source": "replicator", "original_claim_id": "..."}
}

The next Critic tick dequeues and re-evaluates; typical outcome is a supersede-or-retract transition.

Audit event: memory.govern.eroded / memory.govern.retracted_source + research.task.enqueued.

FreshnessConfidenceJob (threshold-driven, hourly)

apps/research/research-engine/services/workers/freshness_confidence_job.py — spec 063 FR-063-003. Runs hourly, scans memory_claims for:

lifecycle_state == "active"
  AND (confidence_score < confidence_threshold OR t_created < now - max_age_sec)
  AND freshness_recheck_requested_at NOT in backoff window

For each match it enqueues a Critic re-task via TaskQueueManager (same shape as the Replicator's re-task payload) with metadata.trigger = "freshness_confidence" and trigger_reason in {below_confidence, aged, below_confidence_and_aged}. The job then stamps freshness_recheck_requested_at = now on the claim so perpetually-marginal claims don't re-enqueue every tick.

Defaults (configurable in services/config/worker_config.yml):

Setting	Default	Meaning
`freshness_confidence_enabled`	`false` (opt-in)	Flip to `true` after reviewing thresholds
`freshness_confidence_interval_sec`	`3600` (1h)	Tick cadence
`freshness_confidence_threshold`	`0.5`	Claims with `confidence_score` strictly below trigger a re-check
`freshness_confidence_max_age_sec`	`7,776,000` (90d)	Age at which even a high-confidence claim gets pro-forma re-checked
`freshness_confidence_recheck_backoff_sec`	`604,800` (7d)	Per-claim cool-off between re-enqueues
`freshness_confidence_sample_limit`	`50`	Per-tick cap to bound LLM cost

Audit events: research.freshness_confidence.tick (per-tick summary with counts + thresholds) + research.freshness_confidence.enqueued (per claim, with reason + confidence_score + task_id). Metadata-only — no claim text or PHI in either payload.
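
The scan predicate and the three trigger reasons can be expressed as one pure function. This is a sketch using the defaults from the table, assuming the claim's state is stored in `lifecycle_state` as elsewhere on this page; `recheck_reason` and the dict-based claim shape are hypothetical, and the real job runs this selection as a database query with a per-tick sample limit.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def recheck_reason(
    claim: dict,
    now: datetime,
    confidence_threshold: float = 0.5,
    max_age_sec: int = 7_776_000,   # 90 days
    backoff_sec: int = 604_800,     # 7 days
) -> Optional[str]:
    """Return the trigger_reason for a claim, or None when it is not selected."""
    if claim["lifecycle_state"] != "active":
        return None
    last = claim.get("freshness_recheck_requested_at")
    if last is not None and (now - last).total_seconds() < backoff_sec:
        return None  # still inside the per-claim cool-off
    low = claim["confidence_score"] < confidence_threshold  # strictly below
    aged = (now - claim["t_created"]).total_seconds() > max_age_sec
    if low and aged:
        return "below_confidence_and_aged"
    if low:
        return "below_confidence"
    if aged:
        return "aged"
    return None


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
example = {
    "lifecycle_state": "active",
    "confidence_score": 0.3,
    "t_created": now - timedelta(days=120),
}
reason = recheck_reason(example, now)
```

A claim that is both low-confidence and older than 90 days, as in the example, is selected with the combined reason.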

Shipped today (spec 063): both the verdict-driven Replicator path and the threshold-driven FreshnessConfidenceJob. The job is opt-in (freshness_confidence_enabled: false by default) so operators can review settings before turning it on.

4.5 Supersession & compaction

apps/research/memory-store/services/repositories/claim_repository.py:

supersede_transaction() — atomic Motor transaction: inserts new claim + flips old claim's lifecycle_state to superseded + sets t_invalid = now + appends the new claim id to the old claim's superseded_by[].
get_claim_chain(claim_id) — walks supersedes[] backward through the chain so the UI can show the full provenance trail of a corrected claim.
No separate compaction job — t_invalid is the soft-tombstone; claims remain queryable for 6-year audit retention, filtered out of default retrieval by lifecycle_state="active".
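
The backward walk can be sketched over an in-memory store. `claim_chain` and `BrokenChain` are hypothetical names; the sketch assumes each claim carries a `supersedes` list of older claim ids and treats a missing claim as an error, in line with the tamper rule described under "Why no hard delete?".

```python
class BrokenChain(Exception):
    """A supersedes link points at a claim that is no longer stored."""


def claim_chain(claims_by_id: dict, claim_id: str) -> list:
    """Return claim ids from the given claim back to the original, newest first."""
    chain, seen = [], set()
    pending = [claim_id]
    while pending:
        current = pending.pop()
        if current in seen:
            continue  # guard against accidental cycles
        claim = claims_by_id.get(current)
        if claim is None:
            raise BrokenChain(f"missing claim {current}")
        seen.add(current)
        chain.append(current)
        pending.extend(claim.get("supersedes", []))
    return chain


store = {
    "c3": {"supersedes": ["c2"], "lifecycle_state": "active"},
    "c2": {"supersedes": ["c1"], "lifecycle_state": "superseded"},
    "c1": {"supersedes": [], "lifecycle_state": "superseded"},
}
trail = claim_chain(store, "c3")
```

Because superseded claims stay in the store as soft tombstones, the walk always reaches the original claim unless a row has been removed out of band.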

4.6 What "confidence & freshness" means at query time

The retrieval surface (POST /api/memory/retrieve, Spec 060) filters on:

lifecycle_state == "active" — superseded / retracted / archived never surface.
acl.roles + ResolvedScope — consent + role gates.
decay_at > now — the serializer drops any claim whose decay horizon passed but the decay job hasn't swept yet.
Implicit freshness — the caller passes a screen_profile and the retriever uses its freshnessRules to mark stale fields in the response metadata so the UI can show a "data from N days ago" chip.

No numeric confidence threshold is applied at query time. Enforcement happens asynchronously via FreshnessConfidenceJob (§4.4), which re-enqueues marginal claims for Critic re-evaluation so the next retrieval sees either a refreshed confidence_score or a transition to superseded/retracted.
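
The lifecycle and decay gates can be sketched as a simple filter. ACL and consent gates are out of scope here, and `visible_claims` with its dict-based claim shape is a hypothetical illustration rather than the retriever's code. Note that `confidence_score` plays no part in the filter.

```python
from datetime import datetime, timezone


def visible_claims(claims: list, now: datetime) -> list:
    """Keep active claims whose decay horizon has not passed."""
    return [
        c for c in claims
        if c["lifecycle_state"] == "active" and c["decay_at"] > now
    ]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
later = datetime(2030, 1, 1, tzinfo=timezone.utc)
earlier = datetime(2024, 1, 1, tzinfo=timezone.utc)
candidates = [
    {"id": "a", "lifecycle_state": "active", "decay_at": later, "confidence_score": 0.1},
    {"id": "b", "lifecycle_state": "superseded", "decay_at": later, "confidence_score": 0.9},
    {"id": "c", "lifecycle_state": "active", "decay_at": earlier, "confidence_score": 0.9},
]
visible_ids = [c["id"] for c in visible_claims(candidates, now)]
```

Claim `a` surfaces despite its low confidence; it is the asynchronous re-check, not the query, that deals with it.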

Shipped vs. planned — honest status

Signal	Status	File
`confidence_score` persisted on claim	✅ shipped	`memory_claims.confidence_score`
Critic LLM verdict → promote / supersede / reject / quarantine	✅ shipped	`research-engine/agents/critic/experiment.py`
Critic `verdict.score` persists onto the claim post-promotion	✅ shipped (spec 063)	`research-engine/workers/promotion_pipeline.py`
Source authority tiers (NEJM tier-1, PubMed tier-2, patient-note tier-3)	✅ shipped (spec 063)	`patientrx_contracts/enums.py`
Per-field `_is_stale` boolean	✅ shipped	`memory-store/completeness/computer.py`
Numeric 0..1 freshness scorer	📋 planned	— (boolean + FreshnessConfidenceJob cover enforcement; scorer is UI convenience)
TTL decay job (hourly, anonymizes PHI)	✅ shipped	`memory-store/jobs/decay.py`
Replicator re-research on `eroded` / `retracted_source`	✅ shipped	`research-engine/workers/replicator_job.py`
Auto-trigger re-research on low-freshness / low-confidence	✅ shipped (spec 063, opt-in)	`research-engine/workers/freshness_confidence_job.py`
Supersession atomic txn + `t_invalid` tombstone	✅ shipped	`memory-store/repositories/claim_repository.py`

End-to-end example — retract a claim
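
A minimal sketch of the retraction call, built from the claim state endpoint described under "Change lifecycle state". The host, the bearer-token header, and the claim id are placeholder assumptions; only the path, method, port, and body shape come from this page. The request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:6401"  # assumed host; port as listed for the REST API
CLAIM_ID = "00000000-0000-4000-8000-000000000001"  # placeholder id


def build_retract_request(claim_id: str, token: str) -> urllib.request.Request:
    body = json.dumps({"target_state": "retracted"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/memory/claims/{claim_id}/state",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


request = build_retract_request(CLAIM_ID, "<token carrying memory.claim.admin>")
# urllib.request.urlopen(request) would send it. Per the sections above, a caller
# without the admin scope gets HTTP 403 and an invalid transition gets HTTP 422.
```

On success the claim moves to retracted, drops out of default retrieval, and a memory.claim.state_transition audit event records the change. Archiving later is a second transition through the same endpoint.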

Required consents, roles, and scopes — cheat sheet

Operation	HTTP	Scope	Role gate	Audit
Add entity	`POST /entities`	`memory.write`	writer	`memory.entity.create`
Add claim	`POST /claims`	`knowledge.write`	writer	`memory.claim.create` / `supersede`
Add relationship	`POST /relationships`	`memory.relationship.write`	writer	`memory.relationship.created`
Bulk ingest	`POST /ingest`	`memory.ingest.batch`	writer	`memory.ingest.batch`
Change entity state	`PATCH /entities/{id}/state`	`memory.write`	writer	`memory.entity.state_transition` (+ `memory.govern.retire_attempt` on retire)
Change claim state	`PATCH /claims/{id}/state`	`memory.claim.admin`	admin only	`memory.claim.state_transition`
Supersede claim	`POST /claims` w/ `supersedes`	`knowledge.write`	writer	`memory.claim.supersede`
Retract / archive	state transition (no hard delete)	as above	as above	as above
Read / retrieve	`POST /retrieve`, `GET /entities/{id}`	`memory.read` / `memory.graph.read`	consent-scoped	`memory.retrieval.hybrid` (count-only)
MCP tools	Claude Code	read-only	—	`memory.mcp.tool_invoked`

Never do this

Never call MongoDB.deleteOne() on memory_* collections from application code — always go through a state transition. The change-stream pipelines (Critic promotion, Replicator decay) assume every disappearance is preceded by a state_transition audit row.
Never mutate claim_text_encrypted or entity name in place — always supersede via a new claim, so the correction is in the audit chain and callers reading get_claim_chain() can see the correction.
Never skip the ACL / consent check by using the MCP or the repository directly — the HTTP / SDK path is the only one that runs AclEnforcer + ResolvedScope + AuditClient in the correct order.
Never emit a write from the Researcher agent without a supports[] source — the Critic will retract it at the next SLA tick; better to fail fast.
Never bypass the state machine with a direct updateOne({lifecycle_state: ...}) — validate_transition() is the only legal gate.

Where to read more

Schemas + ER diagram: Knowledge graph →
Critic + Librarian + Replicator behaviors: Agents →
Retrieval (read path): Hybrid retrieval →
Audit chain mechanics: Audit chain →
Consent model: Consent model →
