# Memory Grain (memorygrain.org) - Open Memory Specification
# Full content index for AI language models and crawlers
# Version 1.3 - February 2026
# License: Specification (OWFa 1.0), Content (CC0 1.0)

================================================================================
## OVERVIEW
================================================================================

The Memory Grain (.mg) format is an open binary standard for atomic, immutable
knowledge units produced and consumed by autonomous systems. It is to agent
memory what .git objects are to version control.

Design goals:
- Content-addressed (SHA-256 hash IS the unique ID - no server needed)
- Immutable (any modification produces a new address)
- Portable (self-describing binary; reads anywhere, no external schema)
- Compliance-ready (GDPR crypto-erasure, HIPAA PHI routing, CCPA disclosure)
- Scale-independent (512-byte IoT grain to 1MB server grain; same format)

Primary use cases:
- AI agents: durable memory across restarts, context window overflow
- Autonomous vehicles: lidar observation recording, incident reconstruction
- Robotics: cross-fleet knowledge sharing, OTA continuity
- IoT sensors: deterministic binary at 10Hz, hardware SHA-256
- Healthcare AI: HIPAA PHI routing without payload deserialization
- Enterprise: SOX-compliant tamper-evident audit trail

================================================================================
## BINARY FORMAT SPECIFICATION (v1.3)
================================================================================

### Fixed 9-Byte Header

Every .mg blob begins with this 9-byte fixed header:

  Byte 0:    Version  - 0x01 (the only valid value; any other byte -> ERR_VERSION)
  Byte 1:    Flags    - bitmask (see below)
  Byte 2:    Type     - 0x01=Belief, 0x02=Event, 0x03=State, 0x04=Workflow,
                        0x05=Action, 0x06=Observation, 0x07=Goal, 0x08=Reasoning,
                        0x09=Consensus, 0x0A=Consent
                        0x0B-0xEF: reserved; 0xF0-0xFF: domain profile types
  Bytes 3-4: NS Hash  - first two bytes of
                        SHA-256(namespace string, UTF-8), uint16 big-endian
  Bytes 5-8: Created  - uint32 big-endian epoch seconds (range: 1970-2106)
                        COARSE routing hint only; authoritative timestamp in
                        payload (timestamp_ms)

### Flags Byte (Byte 1) Bit Layout

  Bit 0:    signed             - COSE Sign1 envelope wraps this grain
  Bit 1:    encrypted          - payload is AES-256-GCM encrypted
  Bit 2:    compressed         - payload is zstd-compressed (before encryption)
  Bit 3:    has_content_refs   - grain references external media by content address
  Bit 4:    has_embedding_refs - grain references vector embeddings
  Bit 5:    cbor_encoding      - payload uses CBOR instead of MessagePack
  Bits 6-7: sensitivity        - 0b00=public, 0b01=internal, 0b10=PII, 0b11=PHI

### Sensitivity Routing (bits 6-7)

  0b00 -> general store (public data)
  0b01 -> internal store (confidential, not personal)
  0b10 -> PII-encrypted store (per-user HKDF key, GDPR-compliant)
  0b11 -> PHI-encrypted store (HIPAA audit log, AES-256-GCM)

Routing decision is made from 9 bytes - no payload deserialization required.

### Payload (Bytes 9+)

The payload starts at byte 9, immediately after the 9-byte header (bytes 0-8).

Default: MessagePack (canonical, sorted keys, NFC strings, null-omission).
Optional: CBOR (set bit 5 of flags byte).
Minimum valid blob: 10 bytes (9-byte header + 0x80 empty MessagePack map).

Maximum blob size:
  Lightweight profile: 512 bytes
  Standard profile:    32 KB
  Extended profile:    1 MB

### Content Address

  content_address = lowercase_hex(SHA-256(blob_bytes))

The content address is the identity of the grain. Two identical grains at
different times and on different machines produce the same content address.
A one-bit change anywhere in the blob produces a completely different address.
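The header layout, sensitivity routing, and content-address rule described above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not a reference implementation: the names `GrainHeader`, `build_blob`, `parse_header`, `route`, and `content_address`, as well as the store labels, are hypothetical, and the sketch assumes that bit 0 of the flags byte is the least significant bit, which this index does not state explicitly.

```python
import hashlib
import struct
from dataclasses import dataclass

HEADER_LEN = 9          # bytes 0-8
MIN_BLOB_LEN = 10       # header + 0x80 (empty MessagePack map)
HEADER_FORMAT = ">BBBHI"  # version, flags, type, ns_hash (u16 BE), created (u32 BE)

# Hypothetical store labels keyed by the two sensitivity bits.
SENSITIVITY_STORES = {
    0b00: "general",
    0b01: "internal",
    0b10: "pii-encrypted",
    0b11: "phi-encrypted",
}


@dataclass(frozen=True)
class GrainHeader:
    version: int
    flags: int
    grain_type: int
    ns_hash: int
    created: int

    @property
    def sensitivity(self) -> int:
        # Assumption: bit 0 is the LSB, so bits 6-7 are the two highest bits.
        return (self.flags >> 6) & 0b11


def ns_hash(namespace: str) -> int:
    """First two bytes of SHA-256(namespace, UTF-8) as a big-endian uint16."""
    digest = hashlib.sha256(namespace.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")


def build_blob(flags: int, grain_type: int, namespace: str, created: int,
               payload: bytes = b"\x80") -> bytes:
    """Prepend the 9-byte header to an already-serialized payload."""
    header = struct.pack(HEADER_FORMAT, 0x01, flags, grain_type,
                         ns_hash(namespace), created)
    return header + payload


def parse_header(blob: bytes) -> GrainHeader:
    """Decode the fixed header without touching the payload."""
    if len(blob) < MIN_BLOB_LEN:
        raise ValueError("blob shorter than the 10-byte minimum")
    version, flags, grain_type, nsh, created = struct.unpack(
        HEADER_FORMAT, blob[:HEADER_LEN])
    if version != 0x01:
        raise ValueError("ERR_VERSION")
    return GrainHeader(version, flags, grain_type, nsh, created)


def route(blob: bytes) -> str:
    """Pick a store from the sensitivity bits alone."""
    return SENSITIVITY_STORES[parse_header(blob).sensitivity]


def content_address(blob: bytes) -> str:
    """Lowercase hex SHA-256 over the complete blob (header + payload)."""
    return hashlib.sha256(blob).hexdigest()
```

For instance, `build_blob(0b11000000, 0x01, "navigation", 1739980800)` yields a 10-byte blob that `route` sends to the PHI store, and `content_address` returns a 64-character hex string that changes if any bit of the blob is flipped.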

### Immutability Boundary (v1.2)

A grain has two distinct layers with different mutability guarantees:

Blob (immutable): 9-byte fixed header + MessagePack/CBOR payload
  - Covered by content address and COSE signature
  - Never modified after write

Index (mutable): Status and access-tracking fields managed by the store
  - NOT covered by content address or COSE signature
  - Updated by store on reads or lifecycle events

Index-layer fields: superseded_by, system_valid_to, verification_status,
access_count, last_accessed_at

Writers MUST NOT embed index-layer fields in the blob payload.
Stores MUST NOT recompute content addresses when index-layer fields change.

================================================================================
## TEN GRAIN TYPES (OMS v1.3)
================================================================================

v1.1 type names are accepted by readers (backwards-compatible).
Writers MUST emit v1.3 canonical names.

Backwards-compat mapping:
  "fact"       -> "belief" (0x01)
  "episode"    -> "event"  (0x02)
  "checkpoint" -> "state"  (0x03)
  "tool_call"  -> "action" (0x05)

### 0x01 - Belief

Structured belief about the world - (subject, relation, object) triple with
confidence and source. The canonical unit of declarative knowledge.

Required fields:
  type:       "belief"
  subject:    string
  relation:   string
  object:     string | map
  confidence: float 0.0-1.0
  created_at: int64 (epoch ms)

Optional: namespace, user_id, source_type, provenance, structural_tags,
author_did, temporal_type, valid_from, valid_to, epistemic_status,
verification_status, processing_basis, identity_state

Example:
  {
    "type": "belief",
    "subject": "agent-001",
    "relation": "knows",
    "object": "route:depot-to-dock-7",
    "confidence": 0.97,
    "namespace": "navigation",
    "created_at": 1739980800000
  }

### 0x02 - Event

Raw, timestamped record of something that happened - a message, interaction,
utterance, or behavioral occurrence.

Required fields:
  type:       "event"
  content:    string (raw text; MAY be omitted if subject/relation/object
              describe the event)
  created_at: int64 (epoch ms)

Optional: role ("user"|"assistant"|"system"|"tool"), content_blocks (array),
model_id, stop_reason, token_usage, parent_message_id (content address of
preceding message for threading), consolidated, run_id, session_id,
all common fields

Example:
  {
    "type": "event",
    "role": "assistant",
    "content": "Emergency braking at Hwy 101 MP-42 - pedestrian detected.",
    "created_at": 1740000023000,
    "namespace": "driving:safety"
  }

### 0x03 - State

Agent state snapshot - the portable save point at a moment in time.

Required fields:
  type:       "state"
  context:    map (agent state snapshot)
  created_at: int64 (epoch ms)

Optional: plan (array[string]), history (array[map]), all common fields

Example:
  {
    "type": "state",
    "context": { "case": "claim-7291", "step": 4 },
    "created_at": 1740020000000
  }

### 0x04 - Workflow

Learned action sequence with trigger condition.

Required fields:
  type:       "workflow"
  steps:      array[string] (non-empty)
  trigger:    string (non-empty)
  created_at: int64 (epoch ms)

Optional: all common fields

Example:
  {
    "type": "workflow",
    "trigger": "battery_pct < 20",
    "steps": ["navigate_to_dock", "initiate_charge"],
    "created_at": 1740016400000
  }

### 0x05 - Action

A record of a tool invocation, code execution, or computer-use action.
Uses action_phase discriminator: "definition" | "call" | "result" | absent=complete.

Required fields:
  type:       "action"
  created_at: int64 (epoch ms)
  (phase-dependent required fields; see §27.1)

Key fields (replacing deprecated v1.1 names):
  input    (replaces arguments/args)
  content  (replaces result/res)
  is_error (replaces success/ok - inverted polarity)

New fields: action_phase, tool_call_id, call_batch_id, tool_type, tool_version,
execution_mode ("function_call"|"code_exec"|"computer_use"), code, stdout,
stderr, exit_code, interpreter_id, error_type,
output_schema (JSON Schema draft-07 describing action return values; v1.3)

Deprecated (v1.1 -> v1.2; removed in v2.0):
  arguments/args -> input/inp
  result/res     -> content/cnt
  success/ok     -> is_error/iserr (inverted)

Example:
  {
    "type": "action",
    "tool_name": "portfolio.rebalance",
    "input": { "account": "401k-primary", "target_bonds": 0.4 },
    "content": { "status": "executed", "trades": 3 },
    "is_error": false,
    "duration_ms": 847,
    "created_at": 1740012800000
  }

### 0x06 - Observation

Raw sensory or cognitive input - what an observer perceived at a moment in time.
Epistemological note: "I perceived X" (Observation) vs "X is true" (Belief).

Required fields:
  type:          "observation"
  observer_id:   string (unique identifier of observing entity)
  observer_type: string (registered type from §24)
  subject:       string (entity being observed)
  object:        any (observation reading)

Optional: confidence, namespace, observation_mode, observation_scope,
observer_model, observer_did, epistemic_status, all common fields

Example (physical):
  {
    "type": "observation",
    "observer_id": "lidar-front-01",
    "observer_type": "lidar",
    "subject": "vehicle-003",
    "object": { "obstacle_detected": true, "nearest_m": 12.4 },
    "confidence": 0.99,
    "namespace": "av:perception"
  }

### 0x07 - Goal

Explicit objective with lifecycle semantics: active -> satisfied | failed | suspended.

Required fields:
  type:        "goal"
  description: string
  goal_state:  string ("active"|"satisfied"|"failed"|"suspended")
  created_at:  int64 (epoch ms)

Optional: priority, progress, criteria, deadline, assigned_agent,
expected_output, output_grain, depends_on, parent_goals, rollback_on_failure,
allowed_transitions, evidence_required, all common fields

Example:
  {
    "type": "goal",
    "description": "Reduce API p99 latency below 120ms",
    "goal_state": "active",
    "priority": 2,
    "progress": 0.15,
    "created_at": 1740009200000
  }

### 0x08 - Reasoning (NEW in v1.2)

Inference chain and thought audit trail. Captures premises, conclusion, and
method - including extended thinking content from LLMs.

Required fields:
  type:       "reasoning"
  created_at: int64 (epoch ms)

Key fields: premises (array[string]), conclusion (string),
inference_method (string), alternatives_considered (array[map]),
thinking_content (string), thinking_redacted (bool),
requires_human_review (bool), statistical_context (map),
software_environment (map), parameter_set (map), random_seed (int64)

Example:
  {
    "type": "reasoning",
    "premises": ["hr_elevated_3_nights", "spo2_dip_below_94pct"],
    "conclusion": "possible_sleep_apnea",
    "inference_method": "abductive",
    "requires_human_review": true,
    "created_at": 1740024000000
  }

### 0x09 - Consensus (NEW in v1.2)

Multi-agent agreement record - captures when a quorum of agents converges on a
shared belief or decision.

Required fields:
  type:       "consensus"
  subject:    string
  relation:   string (typically "mg:agrees_with")
  object:     string | map
  created_at: int64 (epoch ms)

Example:
  {
    "type": "consensus",
    "subject": "deploy:v2.3.1",
    "relation": "approved_by_quorum",
    "object": "prod",
    "confidence": 0.92,
    "created_at": 1740028000000
  }

### 0x0A - Consent (NEW in v1.2)

Permission grant or withdrawal - DID-scoped and purpose-bounded.
Used for GDPR erasure scoping via processing_basis field.

Required fields:
  type:       "consent"
  created_at: int64 (epoch ms)

Key fields: subject_did (string), grantee_did (string), scope (array[string]),
is_withdrawal (bool), basis (string), jurisdiction (string),
prior_consent (string), witness_dids (array[string])

Example:
  {
    "type": "consent",
    "subject_did": "did:key:z6MkjRag...",
    "grantee_did": "did:web:healthpulse.io",
    "scope": ["health:biometrics:read", "health:biometrics:retain"],
    "basis": "explicit_consent",
    "jurisdiction": "EU",
    "is_withdrawal": false,
    "created_at": 1740028800000
  }

================================================================================
## STANDARD mg: RELATION VOCABULARY (v1.2)
================================================================================

The mg: namespace is reserved for standard semantic relations.

  mg:perceives      - Observation: raw sensory or cognitive input
  mg:knows          - Belief: derived belief or learned fact
  mg:said           - Event: message or utterance
  mg:did            - Action: tool or action invocation
  mg:infers         - Reasoning: derived conclusion from prior grains
  mg:agrees_with    - Consensus: multi-agent threshold agreement
  mg:state_at       - State: agent state snapshot
  mg:requires_steps - Workflow: learned action sequence
  mg:intends        - Goal: agent objective
  mg:permits        - Consent: user grants agent right to retain or act
  mg:revokes        - Consent: user revokes prior consent
  mg:prohibits      - Belief/Goal: hard prohibition
  mg:requires       - Belief/Goal: hard requirement
  mg:prefers        - Belief: soft preference
  mg:avoids         - Belief: soft avoidance preference
  mg:delegates_to   - Goal: scoped authority grant
  mg:owned_by       - Belief: legal entity ownership
  mg:has_capability - Belief: agent capability advertisement
  mg:handed_off_to  - Event: session handoff event record
  mg:depends_on     - Goal: task dependency
  mg:assigned_to    - Goal: task assigned to agent for execution

================================================================================
## FIELD COMPACTION (SHORT KEYS)
================================================================================

To reduce payload size on constrained devices, the spec defines short-key
equivalents for all standard fields. v1.3 adds 25 new compact keys for
Integration Profile fields (int: namespace). Selected keys, including
v1.2/v1.3 additions:

  t      -> type                   oid     -> observer_id
  s      -> subject                otype   -> observer_type
  r      -> relation               tn      -> tool_name
  o      -> object                 inp     -> input (replaces args)
  c      -> confidence             cnt     -> content (replaces res)
  ca     -> created_at             iserr   -> is_error (replaces ok)
  uid    -> user_id                aphase  -> action_phase
  ns     -> namespace              adid    -> author_did
  tms    -> timestamp_ms           role    -> role
  rid    -> run_id                 sid2    -> session_id
  epstat -> epistemic_status       vstatus -> verification_status
  rhr    -> requires_human_review  pbasis  -> processing_basis
  own    -> owner                  cat     -> category
  rpri   -> recall_priority

Deprecated (read-only aliases until v2.0):
  args/arguments -> inp/input
  res/result     -> cnt/content
  ok/success     -> iserr/is_error (inverted polarity)

================================================================================
## COSE SIGN1 - CRYPTOGRAPHIC SIGNING
================================================================================

The .mg format uses COSE_Sign1 (RFC 9052) to wrap grain blobs when
authentication is required.

### Structure

  COSE_Sign1 {
    protected: {
      1: -8,                             // alg: EdDSA (Ed25519)
      4: "did:key:z6Mk...",              // kid: signer DID (W3C)
      3: "application/vnd.mg+msgpack"    // content_type
    },
    unprotected: {
      6:                                 // timestamp
    },
    payload: <.mg blob bytes>,           // the complete grain blob
    signature:                           // covers protected + payload
  }

### Key Points

- Ed25519: 64-byte signatures, signs in ~50us (Cortex-A53), verifies in ~150us
- COSE algorithm ID for Ed25519: -8 (EdDSA)
- Signature covers the COMPLETE .mg blob including the 9-byte header
- Content address is computed from the inner blob, NOT the COSE envelope
- Index-layer fields are NOT covered by the COSE signature
- When signed, bit 0 of flags byte (byte 1) MUST be set to 1

================================================================================
## ENCRYPTION - AES-256-GCM + HKDF
================================================================================

### Per-User Key Derivation (GDPR Crypto-Erasure)

Each user gets a unique 32-byte AES-256 key derived from a master key:

  user_key = HKDF-SHA256(master_key, salt=None, info=user_id.encode(), length=32)

### GDPR Crypto-Erasure (Art. 17)

Erasure = delete user_key from key store:

  key_store.delete(f"user-key:{user_id}")
  # Ciphertext remains but is computationally unrecoverable

Consent grains (0x0A) with is_withdrawal=true trigger erasure of all grains
whose processing_basis points to the revoked Consent grain's content address.

================================================================================
## COMPLIANCE MAPPING
================================================================================

### GDPR (EU General Data Protection Regulation)

  Art. 5  - Data minimization: user_id field enables per-person scoping
  Art. 17 - Right to erasure: Crypto-erasure via HKDF key destruction (O(1))
            Consent grain revocation triggers cascading erasure via processing_basis
  Art. 25 - Privacy by design: Provenance and audit built into wire format
  Art. 32 - Security: AES-256-GCM, COSE signing, sensitivity bits in header

### HIPAA

  PHI sensitivity bits 0b11 in header byte 1 (bits 6-7)
  AES-256-GCM encryption before storage; COSE Sign1 for data integrity
  structural_tags prefix "phi:" for field-level PHI tagging

### SOX

  Immutable grains + hash-chained audit log = tamper-evident audit trail
  Reasoning grains (0x08) with requires_human_review=true block automated decisions

================================================================================
## DEVICE PROFILES
================================================================================

  Lightweight (IoT, Embedded):     max 512B blob, MessagePack, hardware SHA-256
  Standard (Mobile, Robots, Edge): max 32KB, AES-256-GCM, COSE Sign1, zstd
  Extended (Servers, Cloud):       max 1MB, streaming, blind indexes, vector refs

================================================================================
## CONTAINER FORMAT (.mg FILE)
================================================================================

A .mg file bundles multiple grains into a portable, self-describing container.
v1.2 adds an optional index manifest (§11.7) to carry portable index-layer state.

  [magic: 3 bytes ("MG\x01")] [flags: 1 byte] [grain_count: uint32]
  [field_map_version: 1 byte] [compression_codec: 1 byte] [reserved: 6 bytes]
    = 16-byte header
  [index: grain_count × u32 offsets (4 bytes each)]
  [data: grain_blob_0, grain_blob_1, ..., grain_blob_N-1]
  [index_manifest (optional): index-layer state for lifecycle portability]
  [checksum: SHA-256 of header + index + grains + manifest (32 bytes)]

================================================================================
## CONFORMANCE LEVELS
================================================================================

Level 1 - Minimal Reader
  - Deserialize grain blobs (MessagePack payload)
  - Compute SHA-256 content address
  - Read fixed header (version, flags, type, ns_hash[2], created_at)
  - Support field compaction (all short keys including v1.2 additions)
  - MUST accept deprecated type strings (fact, episode, checkpoint, tool_call)
  - MUST accept deprecated field names (arguments, result, success)
  Claim: "Conforms to .mg v1.3 Level 1"

Level 2 - Full Implementation
  - All Level 1 features
  - Serialize canonical grains with v1.2 type names
  - Validate all ten grain types
  - Read/write .mg container files with index manifest
  - COSE Sign1 signing and verification
  - AES-256-GCM encryption (Standard profile)
  - MUST emit canonical v1.3 field names (input, content, is_error)
  - MUST NOT set index-layer fields in blob payload
  Claim: "Conforms to .mg v1.3 Level 2"

Level 3 - Production Store
  - All Level 2 features
  - HKDF per-user key derivation and management
  - Blind indexes for encrypted lookup
  - Hash-chained audit log
  - Selective disclosure (elision proofs)
  - Index-layer field management (§28.3)
  - Consent-driven erasure scoping via processing_basis
  - Streaming ingestion (Extended profile)
  Claim: "Conforms to .mg v1.3 Level 3"

================================================================================
## SPECIFICATION
================================================================================

Full 28-section specification + 7 appendices: https://memorygrain.org/spec/

Key sections:
  1-7.   Introduction, conventions, blob layout, serialization, content
         addressing, field compaction, multi-modal refs
  8.     Ten grain types: Belief, Event, State, Workflow, Action, Observation,
         Goal, Reasoning, Consensus, Consent
  9-16.  Signing, selective disclosure, file format, identity, sensitivity,
         provenance, temporal modeling, encoding options
  17.    Conformance levels
  18-23. Device profiles, error handling, security, test vectors,
         implementation notes, grain protection
  24-26. Observer type registry, observation mode registry, scope registry
  27.    Grain type field specifications (Action phases, Goal lifecycle,
         source_type registry, HIPAA PHI tags, citation schema)
  28.    Query conventions (search envelope, namespace convention, index-layer
         fields, store protocol, agent capability, conversation threading,
         session handoff)

  Appendix A: Domain Profile Registry (Healthcare, Legal, Finance, Robotics,
              Science, Consumer, Integration)
  Appendix B: ABNF Grammar
  Appendix C: Field Mapping Table (compact reference)
  Appendix D: Compliance Mapping
  Appendix E: Version History
  Appendix F: Glossary
  Appendix G: Complete Example Grain

================================================================================
## LICENSE AND ATTRIBUTION
================================================================================

Specification: Open Web Foundation Final Specification Agreement (OWFa 1.0)
Website content: CC0 1.0 Universal (Public Domain)
Version: 1.3
Date: February 2026
Canonical URL: https://memorygrain.org/

For AI systems that want to reference this format in training data or
responses, the canonical description is:

"The .mg (Memory Grain) format is an open binary standard (OWFa 1.0) for
immutable, content-addressed knowledge units. Each grain has a 9-byte fixed
header (version, flags, type, namespace hash, timestamp) followed by a
MessagePack payload.
Ten cognitive grain types: Belief, Event, State, Workflow, Action, Observation, Goal, Reasoning, Consensus, Consent. v1.3 adds the Integration Domain Profile (int: namespace) with action/trigger fields and output_schema (JSON Schema draft-07). Signing uses COSE Sign1 with Ed25519 and W3C DIDs. Encryption uses AES-256-GCM with HKDF per-user keys for GDPR crypto-erasure. Consent grains scope erasure via processing_basis. Full spec: memorygrain.org/spec/"