Data Model

One Health persists to a single MongoDB Atlas cluster (patientrx DB). Collections are split into two tiers: primary-tier (the core product: patients, encounters, notes, consent, audit) and research-tier (knowledge graph, agents, decision graph).

Primary-tier collections

| Collection | Key fields | PHI encrypted? |
| --- | --- | --- |
| users | (idpProvider, idpSubject) | ✓ display name, email |
| patients | mrn, dob, addr, phone, insurance | ✓ all |
| encounters | patientId, type, observations | ✓ narrative |
| notes | encounterId, text, attachments | ✓ text, attachments |
| care_teams | patientId, members[], delegation_type | partial |
| referrals | patientId, from_provider, to_specialist, status, reason_text | ✓ reason_text |
| consents | patientId, scope, granted_at, revoked_at | no (metadata only) |
| audit_events | seq, prevHash, eventHash, actor, action, payload | per-action |
| organizations | type, members[], affiliations[] | no |
| family_groups | patientId, members[], minor_accounts[] | partial |
| notifications | recipientId, senderId, type, body | ✓ body |
| ehr_imports | external_id, fhir_resource_type, imported_at | per-field |
| refresh_sessions | user_id, issued_at, last_used | no (opaque ids) |
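
The seq, prevHash, and eventHash fields on audit_events suggest a hash-chained log in which each event commits to the one before it. The sketch below shows one way such a chain could be built and verified. SHA-256, the canonical JSON serialization, the all-zero genesis hash, and the function names are assumptions made for illustration, not confirmed details of the One Health implementation.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed prevHash for the very first event


def event_hash(seq, prev_hash, actor, action, payload):
    """Hash the event fields together with the previous event's hash."""
    body = json.dumps(
        {"seq": seq, "prevHash": prev_hash, "actor": actor,
         "action": action, "payload": payload},
        sort_keys=True, separators=(",", ":"),  # stable serialization
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_event(chain, actor, action, payload):
    """Append a new event whose prevHash links to the current tail."""
    seq = len(chain) + 1
    prev_hash = chain[-1]["eventHash"] if chain else GENESIS_HASH
    event = {"seq": seq, "prevHash": prev_hash, "actor": actor,
             "action": action, "payload": payload}
    event["eventHash"] = event_hash(seq, prev_hash, actor, action, payload)
    chain.append(event)
    return event


def verify_chain(chain):
    """Return True only if every link and every stored hash is intact."""
    prev_hash = GENESIS_HASH
    for expected_seq, ev in enumerate(chain, start=1):
        if ev["seq"] != expected_seq or ev["prevHash"] != prev_hash:
            return False
        recomputed = event_hash(ev["seq"], ev["prevHash"], ev["actor"],
                                ev["action"], ev["payload"])
        if recomputed != ev["eventHash"]:
            return False
        prev_hash = ev["eventHash"]
    return True
```

Under this scheme, editing any stored event changes its recomputed hash, so verify_chain fails from that event onward.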

Research-tier collections

| Collection | Schema | Purpose |
| --- | --- | --- |
| memory_entities | Entity | Patient / provider / hospital / foundation / organization nodes + taxonomy_category |
| memory_relationships | Relationship | Typed edges (generalization_of, mechanism_for, contradiction_of, combination_of, refinement_of, amplification_of) |
| memory_claims | Claim | Facts with lifecycle_state + supports[] + pii_fields[] |
| research_sources | | 12 trusted-source registry |
| research_tasks | ResearchTask | Queue items (3 priority tiers) |
| orchestration_candidates | CandidateHypothesis | 6-phase lifecycle |
| orchestration_runs | OrchestrationRun | Full DETECTED → DISPATCHED → COMPLETED flow |
| correlator_findings | | Pattern-match results (180d TTL) |
| replicator_findings | | Aged-claim re-research |
| librarian_findings | | Provenance validation |
| experiment_traces | Experiment | Chain-of-thought + retrieval context (encrypted) |
| decision_graph_nodes | | 15 node types |
| candidate_outcomes | | HCP decisions → DPO training |
| llm_call_logs | | LLM call history, envelope-encrypted content, 30-day anonymization |
| retrieval_audit | | Per-request classifier + route + latency (count-only entities) |
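
The 180d TTL listed for correlator_findings could be expressed as a MongoDB TTL index, which takes its lifetime in seconds through the expireAfterSeconds option. The sketch below only computes the index arguments. The createdAt field name and the ttl_index_spec helper are hypothetical; the source does not say which date field drives expiry.

```python
from datetime import timedelta

# 180 days, as listed for correlator_findings
CORRELATOR_TTL = timedelta(days=180)


def ttl_index_spec(field, ttl):
    """Build the arguments for a MongoDB TTL index on a date field."""
    return {
        "keys": [(field, 1)],  # single-field ascending index
        "expireAfterSeconds": int(ttl.total_seconds()),
    }


# "createdAt" is an assumed field name, used here for illustration only.
spec = ttl_index_spec("createdAt", CORRELATOR_TTL)

# With pymongo this could be applied roughly as:
# db.correlator_findings.create_index(
#     spec["keys"], expireAfterSeconds=spec["expireAfterSeconds"])
```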

Enum catalog

Canonical definitions in packages/patientrx-contracts/patientrx_contracts/enums.py:

EntityType · LifecycleState · Origin · PriorityTier · ResearchDepth · TaskStatus · CandidateStatus · RunStatus · ConfidenceLevel · ModelTier · CriticVerdict · StakeholderDecision · TransformationType · AmplificationType · QualityTier · RoutingReason · DetectorType · DecisionNodeType · ContributionType · QueryType · RetrievalStrategy
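
As an illustration of how one of these enums might look, here is a minimal sketch of RunStatus using only the three states named in the orchestration_runs row. The string values, the forward-only transition rule, and the can_transition helper are assumptions; the canonical enum in enums.py may define more members or different values.

```python
from enum import Enum


class RunStatus(str, Enum):
    """Sketch only: members taken from the orchestration_runs flow."""
    DETECTED = "DETECTED"
    DISPATCHED = "DISPATCHED"
    COMPLETED = "COMPLETED"


# Assumed ordering: a run only moves forward, one step at a time.
_ORDER = [RunStatus.DETECTED, RunStatus.DISPATCHED, RunStatus.COMPLETED]


def can_transition(current, target):
    """Allow only a move to the immediately following state."""
    return _ORDER.index(target) == _ORDER.index(current) + 1
```

Subclassing str keeps the members directly serializable as plain strings in MongoDB documents and JSON payloads.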

Indexes

Every PHI-serving collection has a composite (patientId, updatedAt) index. audit_events has a unique index on seq. Retrieval routes use the taxonomy_category index (idx_taxonomy_state). Full-text search uses the Atlas Search indexes entities_fts and claims_fts.
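
The composite and unique indexes described here could be declared along the following lines. This is a sketch that only builds the index specifications as plain data. The sort directions, the phi_index_specs helper, and the example collection list are assumptions; the source does not state the index direction or enumerate which collections count as PHI-serving.

```python
ASCENDING, DESCENDING = 1, -1  # the same integer values pymongo uses


def phi_index_specs(collections):
    """One composite (patientId, updatedAt) index spec per collection."""
    return {
        name: {"keys": [("patientId", ASCENDING), ("updatedAt", DESCENDING)]}
        for name in collections
    }


# Unique index on seq for audit_events.
AUDIT_SEQ_INDEX = {"keys": [("seq", ASCENDING)], "unique": True}

# Example collection list, chosen for illustration only.
specs = phi_index_specs(["encounters", "referrals"])

# With pymongo these could be applied roughly as:
# for name, spec in specs.items():
#     db[name].create_index(spec["keys"])
# db.audit_events.create_index(AUDIT_SEQ_INDEX["keys"], unique=True)
```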

Encryption envelope

Every PHI field stored in MongoDB carries its own envelope. Rotating the key-encryption key (KEK) via /hipaa-rotate-key re-wraps all data-encryption keys (DEKs) without touching the ciphertext. No plaintext PHI ever lands in logs, backups (via codec), or eval exports.
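
To make the re-wrap property concrete, the sketch below models an envelope as a ciphertext encrypted under a per-field DEK plus that DEK wrapped under the KEK. It is an illustration of the structure only: the keystream cipher is a toy stand-in built from HMAC-SHA256 so the example runs on the standard library, and it must not be used for real PHI. A production system would use an authenticated cipher and a KMS. The envelope field names and all function names are assumptions.

```python
import hashlib
import hmac
import os


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """TOY cipher: XOR data with an HMAC-SHA256 keystream. Not for production."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = hmac.new(key, nonce + counter.to_bytes(4, "big"),
                         hashlib.sha256).digest()
        stream.extend(block)
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_field(plaintext: bytes, kek: bytes) -> dict:
    """Encrypt one field under a fresh DEK, then wrap the DEK under the KEK."""
    dek = os.urandom(32)
    nonce, wrap_nonce = os.urandom(16), os.urandom(16)
    return {
        "ciphertext": _keystream_xor(dek, nonce, plaintext),
        "nonce": nonce,
        "wrapped_dek": _keystream_xor(kek, wrap_nonce, dek),
        "wrap_nonce": wrap_nonce,
    }


def decrypt_field(envelope: dict, kek: bytes) -> bytes:
    """Unwrap the DEK with the KEK, then decrypt the field with the DEK."""
    dek = _keystream_xor(kek, envelope["wrap_nonce"], envelope["wrapped_dek"])
    return _keystream_xor(dek, envelope["nonce"], envelope["ciphertext"])


def rewrap(envelope: dict, old_kek: bytes, new_kek: bytes) -> dict:
    """Rotate the KEK: unwrap the DEK, wrap it again, leave ciphertext as-is."""
    dek = _keystream_xor(old_kek, envelope["wrap_nonce"], envelope["wrapped_dek"])
    wrap_nonce = os.urandom(16)
    return {**envelope,
            "wrapped_dek": _keystream_xor(new_kek, wrap_nonce, dek),
            "wrap_nonce": wrap_nonce}
```

Because rotation rewrites only the wrapped_dek and wrap_nonce entries, its cost scales with the number of envelopes rather than with the volume of encrypted data.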