What is CAL? The Context Assembly Language is a non-destructive, deterministic query language for assembling agent context from OMS memory stores. CAL cannot delete data — this is enforced at the grammar level. It answers: “what should be in the agent's context window right now?”

CAL (Context Assembly Language) Specification v1.0

Status: Standards Track | Date: 2026-03-03 | Version: 1.0 | Classification: Experimental | Part of: Open Memory Specification (OMS) v1.3

Abstract

Introduction
The Safety Model
Lexical Structure
Grammar (EBNF)
Type System
OMS Grain Type Integration
mg: Relation Vocabulary
Statement Semantics
Semantic Shortcuts
FORMAT System
Streaming Protocol
Domain Profile Querying
Store Protocol Mapping
Response Model
Dual Wire Format
Internationalization
Execution Model
Capability Token Model
Policy Integration
Threat Model
Audit Trail
Error Model
Compliance Checks
Conformance Levels
Versioning and Evolution
Interface Integration
LLM System Prompt Template

Appendix A: Complete EBNF Grammar
Appendix B: JSON Schema References
Appendix C: Error Code Registry
Appendix D: Reserved Words
Appendix E: Queryable Fields Reference
Appendix F: Version History

Abstract

The Context Assembly Language (CAL) is a companion specification to the Open Memory Specification (OMS). It defines a non-destructive, deterministic, LLM-native language for assembling agent context from persistent memory.

CAL allows AI agents to recall memory, assemble context windows from multiple memory sources with budget constraints, and evolve memory -- but never destroy it. Every write is append-only and fully revertible. The core safety guarantee -- that CAL cannot destroy data -- is enforced at the grammar level and is a structural impossibility, not a policy check.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

1. Introduction

1.1 What is CAL?

CAL is a non-destructive, deterministic, LLM-native context assembly and evolution language for memory databases that implement the Open Memory Specification. It answers one question: "what should be in the agent's context window right now?"

Key capabilities:

Dimension	CAL (Context Assembly Language)
Primary question	"What should be in the context window?"
Core operation	ASSEMBLE (compose context from multiple sources)
Output model	Flat, semantic, LLM-native context (content projection from OMS grains)
Token awareness	Native (BUDGET clause, progressive disclosure)
Multi-source	First-class (FROM clause with priority)
Format control	Built-in (FORMAT clause, AS clause, custom templates)
Progressive disclosure	Native (WITH progressive_disclosure)
Batching	BATCH statement for multiple queries
Schema discovery	DESCRIBE statement for introspection
Streaming	Native (STREAM clause on ASSEMBLE)

1.2 Design Goals

Non-destructive by grammar, not by convention. The parser rejects destructive tokens. There is no "unsafe mode."
Append-only evolution. Writes create new grains -- they never modify or delete existing ones. Every change is traceable and revertible.
Context-window-aware. CAL understands that its output will be consumed by an LLM with finite context. Budget allocation, progressive disclosure, content projection, and format control are first-class concerns. Output is shaped for LLM comprehension, not storage fidelity.
Multi-source composition. ASSEMBLE makes composing context from multiple memory sources a single, declarative operation.
Bounded execution. Every query has a compile-time-determinable upper bound on work.
Policy-transparent. CAL queries execute within the active policy (GDPR, HIPAA, etc.). CAL cannot override policy.
LLM-ergonomic. Keywords read like English. Common patterns have shortcuts (ABOUT, RECENT, SINCE, MY). Errors include suggestions. The grammar fits in approximately 1200 tokens.
Deterministic. Same query + same state = same results + same order. No randomness.
Composable. Queries nest, pipe, combine with set operations, and compose into ASSEMBLE blocks.
Internationally aware. CAL handles multilingual content natively -- Unicode normalization, cross-lingual search, bidi text, and locale-aware sorting are specified behaviors.
Dual-format. Every CAL statement has a bijective mapping between human-readable text (text/cal) and machine-readable JSON (application/json+cal). Neither is canonical -- they are equivalent.
Versionable. A version prefix, as in CAL/1 RECALL ..., explicitly targets a major spec version.

1.3 What CAL Is NOT

Not SQL. No tables, no joins, no DDL, no transactions.
Not Turing-complete. No loops, no recursion, no persistent variables.
Not a destructive language. CAL cannot forget, erase, delete, or destroy grains. CAL cannot touch encryption keys, policies, or consent records. This is the core safety guarantee.
Not a transport protocol. CAL defines a language. Transport (HTTP, gRPC, MCP, etc.) is implementation-specific.
Not a rendering engine. CAL's FORMAT clause specifies semantic structure, not pixel-level presentation. The agent or UI decides how to render.
Not a storage mirror. CAL output is a projection optimized for LLM consumption, not a serialization of the underlying OMS grain structure. Hashes, namespaces, and internal metadata stay in the machine envelope.

1.4 The Git Analogy

CAL's safety model maps directly to git:

Git Operation	CAL Equivalent	Destructive?	In CAL?
`git log`	`RECALL`	No	Yes (Tier 0)
`git show`	`RECALL WHERE hash = ...`	No	Yes (Tier 0)
`git add` + `git commit` (new file)	`ADD`	No (append-only)	Yes (Tier 1)
`git commit` (amend existing)	`SUPERSEDE`	No (append-only)	Yes (Tier 1)
`git revert`	`REVERT`	No (creates new commit)	Yes (Tier 1)
`git reset --hard`	Store-level `delete`	Yes (destroys data)	No
`git push --force`	Crypto-erasure	Yes (destroys keys)	No

The line between Tier 0/1 (in CAL) and Tier 2 (not in CAL) is: can the operation be undone by another append-only operation? If yes, it is safe for CAL. If no, it stays out.

1.5 Relationship to OMS

CAL operates on the 10 grain types defined by OMS v1.3: Belief, Event, State, Workflow, Action, Observation, Goal, Reasoning, Consensus, Consent. CAL treats this as a closed set -- custom types are not queryable via CAL.

CAL extends the Store Protocol Convention defined in OMS §28.4 (SPECIFICATION.md) with a formal query language. Where OMS defines the query, search, and supersede store operations, CAL provides a structured, deterministic syntax for invoking them safely.

2. The Safety Model

2.1 The Core Guarantee

CAL cannot destroy data. This is not a policy check. It is a structural impossibility.

CAL can evolve data -- by creating new grains that supersede old ones. But the old grains survive. Every evolution is traceable and revertible. Nothing is ever deleted.

The guarantee is enforced at three reinforcing levels:

Level	Mechanism	What It Prevents
Grammar	The EBNF grammar has no production rules for destructive operations	Parser cannot produce destructive AST nodes
Type System	`CalStatement` is a closed enum with exactly 12 variants: `Recall`, `Assemble`, `SetOp`, `Exists`, `History`, `Explain`, `Describe`, `Batch`, `Add`, `Supersede`, `Revert`, `Coalesce`.	No code path from AST to any destructive method
API Surface	CAL executor receives a constrained facade, not the full store	Destructive methods (delete, key destruction) are structurally inaccessible

2.2 Three-Tier Capability Model

Tier	Name	What It Can Do	How It Is Enforced
Tier 0	Read (default)	Query, count, explain, assemble, describe, batch. Cannot modify anything.	Grammar + type system. Default for all CAL sessions.
Tier 1	Evolve (opt-in)	Add new grains, supersede existing grains, revert supersessions, view history. Append-only; never deletes.	Separate grammar extension, explicit server opt-in, separate capability token with write quotas.
Tier 2	Lifecycle	Erasure, key rotation, policy changes, consent management.	Does not exist in CAL. No grammar, no AST, no parser extension, no config flag. Only available through implementation-specific APIs (REST, gRPC, CLI, etc.).

2.3 Formal Safety Proofs

"CAL cannot delete data because..." The CalStatement enum has 12 variants. The executor's match is exhaustive (compiler-verified in statically typed languages). None invoke a delete or forget operation. Adding a delete variant requires modifying the specification.

"CAL cannot trigger erasure because..." Erasure requires access to key management or store-level delete operations. The CAL executor facade exposes only: recall(), count(), exists(), add(), supersede(), revert(), get_history(), assemble(), describe(). No key management or delete methods are accessible.

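The facade idea can be illustrated with a minimal Python sketch. It is an allowlist proxy, not the reference implementation: `FullStore`, `CalFacade`, and `ALLOWED_METHODS` are hypothetical names, and the backing store implements only a few of the nine methods for brevity. Python cannot make the wrapped store truly unreachable, so this only approximates the structural guarantee a statically typed facade would give.

```python
# Sketch only: all class and constant names here are illustrative assumptions.
ALLOWED_METHODS = frozenset({
    "recall", "count", "exists", "add", "supersede",
    "revert", "get_history", "assemble", "describe",
})


class FullStore:
    """Hypothetical backing store that also has a destructive method."""

    def __init__(self):
        self.grains = {}

    def add(self, grain_hash, grain):
        self.grains[grain_hash] = grain

    def exists(self, grain_hash):
        return grain_hash in self.grains

    def count(self):
        return len(self.grains)

    def delete(self, grain_hash):  # destructive: must never reach CAL
        self.grains.pop(grain_hash, None)


class CalFacade:
    """Allowlist proxy: forwards only the nine methods named in the text."""

    def __init__(self, store):
        self.__store = store  # name-mangled, not part of the public surface

    def __getattr__(self, name):
        # Invoked only when normal lookup fails, i.e. for every store method.
        if name not in ALLOWED_METHODS:
            raise AttributeError(f"{name!r} is not exposed to the CAL executor")
        return getattr(self.__store, name)
```

With this shape, `CalFacade(store).add(...)` is forwarded, while `CalFacade(store).delete` raises `AttributeError` before the store is ever consulted.
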
"CAL ADD cannot destroy data because..." ADD creates a new grain via the OMS store put operation (OMS §28.4). It does not modify or reference any existing grain. The grain count increases by one.

"CAL SUPERSEDE cannot destroy the original grain because..." The OMS store protocol defines supersede as: write the new grain, then update the old grain's index-layer fields (superseded_by, system_valid_to). The old grain's blob is never touched. It remains readable.

"CAL REVERT cannot destroy data because..." REVERT creates a new grain (copying content from a previous version) and then supersedes the current head. Three grains exist afterward: original, supersession, and revert. Nothing is deleted.

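The SUPERSEDE and REVERT arguments can be traced with a small in-memory model. This is a sketch under stated assumptions: `AppendOnlyStore` is a hypothetical class, a logical sequence counter stands in for timestamps, and the counter is mixed into the hash so that a revert copy gets its own address. None of that is taken from the OMS store protocol itself.

```python
import hashlib
import json


class AppendOnlyStore:
    """Hypothetical store: blobs are write-once, only index fields change."""

    def __init__(self):
        self.blobs = {}   # grain hash -> content (never modified or removed)
        self.index = {}   # grain hash -> index-layer fields
        self._seq = 0     # logical clock; an assumption of this sketch

    def put(self, content):
        self._seq += 1
        payload = json.dumps({"content": content, "seq": self._seq}, sort_keys=True)
        grain_hash = "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()
        self.blobs[grain_hash] = dict(content)
        self.index[grain_hash] = {"superseded_by": None, "system_valid_to": None}
        return grain_hash

    def supersede(self, old_hash, new_content):
        # Write the new grain first, then update only the old grain's index entry.
        new_hash = self.put(new_content)
        self.index[old_hash]["superseded_by"] = new_hash
        self.index[old_hash]["system_valid_to"] = self._seq
        return new_hash

    def revert(self, head_hash):
        # Find the grain that the current head superseded and copy its content.
        previous = next(
            (h for h, meta in self.index.items() if meta["superseded_by"] == head_hash),
            None,
        )
        if previous is None:
            raise ValueError("head has no previous version to revert to")
        return self.supersede(head_hash, self.blobs[previous])
```

After one `put`, one `supersede`, and one `revert`, the store holds three blobs, and the original blob is byte-for-byte what was first written.
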
"CAL cannot cross namespace boundaries because..." Every CAL query carries a CapabilityToken cryptographically bound to a namespace. The executor overwrites any namespace in the parsed query with the token's namespace.

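A minimal sketch of the namespace overwrite follows. `CapabilityToken`, `ParsedQuery`, and `bind_namespace` are hypothetical names, and the cryptographic binding of the token is not modeled; the sketch only shows that whatever namespace the query text carried is discarded.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CapabilityToken:
    namespace: str  # assumed to be verified cryptographically elsewhere


@dataclass(frozen=True)
class ParsedQuery:
    statement: str
    namespace: Optional[str] = None  # whatever the query text asked for


def bind_namespace(query, token):
    """Return a copy of the query whose namespace comes from the token only."""
    return replace(query, namespace=token.namespace)
```

For example, a query parsed with `namespace="tenant-b"` and executed under a token for `tenant-a` runs against `tenant-a`.
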
"CAL cannot exhaust resources because..." Hard limits are specified: MAX_LIMIT=1000, MAX_QUERY_LENGTH=8192 bytes, QUERY_TIMEOUT=5000ms. Tier 1 operations have additional write quotas: MAX_ADD_PER_MINUTE=20, MAX_SUPERSEDE_PER_MINUTE=10, MAX_REVERT_PER_MINUTE=5. These cannot be overridden by query syntax.

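The limits above could be enforced along the following lines. This is a sketch, not specified behavior: the names are hypothetical, clamping an oversized LIMIT (rather than rejecting it) is an assumption, the per-minute quotas are modeled as a sliding 60-second window, and the query timeout is listed but not enforced here.

```python
import time
from collections import deque

MAX_LIMIT = 1000
MAX_QUERY_LENGTH = 8192  # bytes
QUERY_TIMEOUT_MS = 5000  # listed for completeness; not enforced in this sketch
WRITE_QUOTAS_PER_MINUTE = {"ADD": 20, "SUPERSEDE": 10, "REVERT": 5}


class LimitExceeded(Exception):
    pass


def check_query_length(query):
    """Reject queries whose UTF-8 encoding exceeds MAX_QUERY_LENGTH bytes."""
    size = len(query.encode("utf-8"))
    if size > MAX_QUERY_LENGTH:
        raise LimitExceeded(f"query is {size} bytes, limit is {MAX_QUERY_LENGTH}")


def effective_limit(requested):
    """Clamp a requested LIMIT to the hard cap (clamping is an assumption)."""
    return min(requested, MAX_LIMIT)


class WriteQuota:
    """Sliding-window counter for Tier 1 writes (window model is an assumption)."""

    def __init__(self, quotas=WRITE_QUOTAS_PER_MINUTE, window_seconds=60.0):
        self._quotas = dict(quotas)
        self._window = window_seconds
        self._events = {op: deque() for op in self._quotas}

    def allow(self, op, now=None):
        now = time.monotonic() if now is None else now
        events = self._events[op]
        # Drop writes that have aged out of the window.
        while events and now - events[0] >= self._window:
            events.popleft()
        if len(events) >= self._quotas[op]:
            return False
        events.append(now)
        return True
```

The `now` parameter exists only so the window can be exercised deterministically in tests.
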
2.4 Grammar-Level Exclusions

The following tokens do not exist in CAL's lexer or grammar:

DELETE, DROP, FORGET, ERASE, DESTROY, PURGE, TRUNCATE,    -- Destructive
INSERT, CREATE, WRITE, STORE,                               -- Unconstrained creation
KEY, ENCRYPT, DECRYPT, ROTATE, MASTER, DEK, SECRET,        -- Key management
POLICY, SEAL, UNSEAL, GRANT, REVOKE, CONSENT, RESTRICT,    -- Policy/auth
SCHEMA, PARTITION, INDEX, MIGRATION                         -- Schema

If these appear in a query, they are parse errors, not recognized keywords.

Note: ADD, SUPERSEDE, REVERT, SET, and REASON are Tier 1 keywords. The parser always recognizes them (so that EXPLAIN ADD ... works as a dry-run even when Tier 1 execution is disabled). However, the executor rejects non-EXPLAIN Tier 1 statements when Tier 1 is disabled, returning CAL-E044: Tier1NotEnabled. This two-layer approach ensures EXPLAIN can always preview evolve operations without risk.
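
The two-layer behavior described in this note can be sketched as a small authorization step that runs after parsing. The enum mirrors the 12 variants listed in section 2.1; `authorize`, `CalError`, and the `"dry-run"` / `"execute"` return values are hypothetical names chosen for illustration.

```python
from enum import Enum


class StatementKind(Enum):
    RECALL = "Recall"
    ASSEMBLE = "Assemble"
    SET_OP = "SetOp"
    EXISTS = "Exists"
    HISTORY = "History"
    EXPLAIN = "Explain"
    DESCRIBE = "Describe"
    BATCH = "Batch"
    ADD = "Add"
    SUPERSEDE = "Supersede"
    REVERT = "Revert"
    COALESCE = "Coalesce"


TIER1_KINDS = frozenset(
    {StatementKind.ADD, StatementKind.SUPERSEDE, StatementKind.REVERT}
)


class CalError(Exception):
    def __init__(self, code, name):
        super().__init__(f"{code}: {name}")
        self.code = code
        self.name = name


def authorize(kind, tier1_enabled):
    """Decide what the executor may do with an already-parsed statement."""
    if kind is StatementKind.EXPLAIN:
        # EXPLAIN wraps another statement and never executes it.
        return "dry-run"
    if kind in TIER1_KINDS and not tier1_enabled:
        raise CalError("CAL-E044", "Tier1NotEnabled")
    return "execute"
```

Under this sketch, `EXPLAIN ADD ...` arrives as an `EXPLAIN` statement and is always a dry run, while a bare `ADD` with Tier 1 disabled fails with `CAL-E044`.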

3. Lexical Structure

3.1 Character Set

CAL queries are UTF-8 encoded. Keywords are case-insensitive (RECALL = recall = Recall). Implementations MUST reject queries containing invalid UTF-8 sequences (error CAL-E070: InvalidUTF8).
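
A minimal sketch of both rules, assuming the query arrives as raw bytes: strict decoding surfaces invalid sequences as `CAL-E070`, and keyword lookup folds case before comparing. `decode_query`, `keyword_of`, and `CalError` are hypothetical names, and `KEYWORDS` holds only a small subset of section 3.2.

```python
class CalError(Exception):
    def __init__(self, code, name):
        super().__init__(f"{code}: {name}")
        self.code = code


# Illustrative subset of the keyword list, not the full set.
KEYWORDS = frozenset({"RECALL", "ASSEMBLE", "WHERE", "AND", "LIMIT"})


def decode_query(raw):
    """Decode query bytes, rejecting invalid UTF-8 with CAL-E070."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise CalError("CAL-E070", "InvalidUTF8") from exc


def keyword_of(token):
    """Return the canonical keyword for a token, or None if it is not one."""
    upper = token.upper()
    return upper if upper in KEYWORDS else None
```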

3.2 Keywords

The keyword lists below are exhaustive.

Tier 0 (Read) keywords:

RECALL, ASSEMBLE, WHERE, AND, OR, NOT, IN, BETWEEN, LIMIT, OFFSET,
ORDER, BY, ASC, DESC, WITH, EXPLAIN, SCOPE,
UNION, INTERSECT, EXCEPT,
SELECT, COUNT, FIRST, GROUP, SUBJECTS, OBJECTS, HASHES, PROJECT,
INCLUDE, EXCLUDE, IS, NULL, TRUE, FALSE,
EXISTS, HISTORY, DESCRIBE, BATCH, COALESCE,
ABOUT, RECENT, SINCE, LIKE, MY, CONTRADICTIONS, AS,
FOR, FROM, BUDGET, PRIORITY, FORMAT,
LET, THREAD,
STREAM, TEMPLATE, DEFINE, EXTENDS,
HEADER, ELEMENT, ELEMENT_SUMMARY, ELEMENT_OMIT, SOURCE_BREAK, FOOTER,
DIFF,
CAL                                                       -- version prefix

Tier 1 (Evolve) keywords (always parsed; execution requires Tier 1 enabled -- see section 2.4):

ADD, SUPERSEDE, REVERT, SET, REASON

Relation category keywords:

PREFERENCE, KNOWLEDGE, PERMISSION, INTERACTION, AGENCY, LIFECYCLE, OBSERVATION

3.3 Identifiers

Field names are a closed set (not user-definable).

Common fields (available on all grain types):

query, subject, relation, object, user_id, namespace,
confidence, importance, tags, score, type, time, hash,
verification_status, source_type, contradicted,
recall_priority, epistemic_status

Grain-type-specific fields (see section 6 for which types unlock which fields):

role, session_id, parent_message_id, model_id, content,
context, plan,
trigger, steps,
tool_name, action_phase, is_error, tool_call_id,
observer_id, observer_type,
goal_state, assigned_agent, deadline, depends_on,
reasoning_type, premises, conclusion,
threshold, agreement_count, participating_observers,
consent_action, purpose, grantor_did, grantee_did, scope, expires_at

3.4 Literals

Type	Syntax	Example
String	Double-quoted, `\"` escape	`"alice"`, `"last 7 days"`
Number	Optional minus sign, digits, optional decimal	`0.8`, `-1`, `42`
Boolean	`true` / `false`	`true`
Array	Square brackets, comma-separated	`["tag1", "tag2"]`
Hash	`sha256:` + 8-64 hex chars	`sha256:a1b2c3d4...`
Parameter	`$` + identifier	`$user_id`, `$limit`
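
The scalar literal forms in this table translate fairly directly into regular expressions. The sketch below is one possible reading, not the normative lexer: `classify_literal` is a hypothetical helper, arrays are omitted because they nest other values, booleans are matched in lowercase as written in the table, and identifiers are assumed to be ASCII.

```python
import re

# Each pattern must match the whole input (see fullmatch below).
LITERAL_PATTERNS = [
    ("Hash", re.compile(r"sha256:[0-9a-fA-F]{8,64}")),
    ("Boolean", re.compile(r"true|false")),
    ("Number", re.compile(r"-?[0-9]+(?:\.[0-9]+)?")),
    ("Parameter", re.compile(r"\$[A-Za-z][A-Za-z0-9_]*")),
    # Double-quoted string; \" is the only escape recognized here.
    ("String", re.compile(r'"(?:\\"|[^"])*"')),
]


def classify_literal(text):
    """Return the literal type name for text, or None if nothing matches."""
    for name, pattern in LITERAL_PATTERNS:
        if pattern.fullmatch(text):
            return name
    return None
```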

3.5 Comments

Line comments only: -- comment text

3.6 Reserved Words (Future-Proofing)

See Appendix D for the complete list. Reserved words cannot be used as unquoted identifiers even if not yet functional.

4. Grammar (EBNF)

This section provides the unified CAL/1 grammar. See Appendix A for the complete, unabridged grammar.

(* CAL/1 Grammar -- Tier 0 + Tier 1 + All Extensions *)
 
query           = [ version_prefix ] , [ let_block ] , statement ;
version_prefix  = "CAL" , "/" , major_version ;
major_version   = digit+ ;
 
let_block       = { let_binding } ;
let_binding     = "LET" , "$" , identifier , "=" , recall_stmt , [ "|" , extractor ] , ";" ;
extractor       = "SUBJECTS" | "OBJECTS" | "HASHES" ;
 
statement       = explain_stmt | recall_stmt | assemble_stmt | set_stmt
                | exists_stmt | history_stmt | describe_stmt | batch_stmt
                | coalesce_stmt | define_template_stmt
                | add_stmt | supersede_stmt | revert_stmt ;
 
(* --- Tier 0: Read --- *)
 
explain_stmt    = "EXPLAIN" , ( recall_stmt | assemble_stmt | set_stmt
                | add_stmt | supersede_stmt | revert_stmt | batch_stmt
                | coalesce_stmt ) ;
 
set_stmt        = "(" , query , ")" , set_op , "(" , query , ")" ;
set_op          = "UNION" | "INTERSECT" | "EXCEPT" ;
 
recall_stmt     = "RECALL" , [ "MY" ] , [ grain_type_plural ] , [ in_clause ] ,
                  [ about_clause ] , [ like_clause ] , [ since_clause ] ,
                  [ between_clause ] , [ thread_clause ] ,
                  [ where_clause ] , [ with_clause ] , [ pipeline ] ,
                  [ recent_clause ] , [ contradictions_clause ] , [ as_clause ] ;
 
assemble_stmt   = "ASSEMBLE" , [ context_name ] ,
                  [ for_clause ] ,
                  from_clause ,
                  [ budget_clause ] ,
                  [ priority_clause ] ,
                  [ format_clause ] ,
                  [ stream_clause ] ,
                  [ with_clause ] ;
 
exists_stmt     = "EXISTS" , ( hash_literal | parameter ) ;
 
history_stmt    = "HISTORY" , ( hash_literal | parameter ) , [ diff_clause ]
                | "HISTORY" , [ in_clause ] , "WHERE" , subject_clause , "AND" , relation_clause ,
                  [ as_of_clause ] ;
 
describe_stmt   = "DESCRIBE" , describe_target ;
describe_target = "grain_types" | "fields" , [ grain_type_singular ]
                | "capabilities" | "server" | "templates" | "grammar" ;
 
batch_stmt      = "BATCH" , "{" , batch_entry , { "," , batch_entry } , "}" ;
batch_entry     = label , ":" , ( recall_stmt | exists_stmt | history_stmt
                | describe_stmt | coalesce_stmt ) ;
 
coalesce_stmt   = "COALESCE" , "(" , recall_stmt , "," , recall_stmt ,
                  { "," , recall_stmt } , ")" ;
 
define_template_stmt = "DEFINE" , "TEMPLATE" , template_name ,
                       [ extends_clause ] , template_body ;
 
(* --- Clauses --- *)
 
context_name    = identifier ;
label           = identifier ;
 
for_clause      = "FOR" , string_literal ;
from_clause     = "FROM" , source , { "," , source } ;
source          = [ label , ":" ] , "(" , recall_stmt , ")"
                | [ label , ":" ] , let_ref ;
let_ref         = "$" , identifier ;
 
budget_clause   = "BUDGET" , positive_integer , ( "tokens" | "grains" ) ;
priority_clause = "PRIORITY" , label , { ">" , label } ;
format_clause   = "FORMAT" , format_spec ;
format_spec     = format_type
                | preset_name
                | "TEMPLATE" , template_name
                | "TEMPLATE" , "{" , template_body , "}" ;
format_type     = "markdown" | "json" | "yaml" | "text" | "sml" | "triples" | "toon" ;
preset_name     = "structured" | "readable" | "compact" | "data" ;
 
stream_clause   = "STREAM" , [ "{" , stream_option , { "," , stream_option } , "}" ] ;
stream_option   = "progress" | "budget" | "chunks" | "all"
                | "chunk_size" , "=" , positive_integer ;
 
about_clause    = "ABOUT" , string_literal ;
recent_clause   = "RECENT" , positive_integer ;
since_clause    = "SINCE" , string_literal ;
like_clause     = "LIKE" , string_literal ;
between_clause  = "BETWEEN" , value , "AND" , value ;
contradictions_clause = "CONTRADICTIONS" ;
as_clause       = "AS" , format_type ;
 
thread_clause   = "THREAD" , thread_target ;
thread_target   = string_literal | "FROM" , hash_literal ;
 
diff_clause     = "DIFF" , ( hash_literal | parameter ) ;
as_of_clause    = "AS" , "OF" , string_literal ;
 
in_clause       = "IN" , ( string_literal | "SCOPE" , string_literal ) ;
 
where_clause    = "WHERE" , condition , { "AND" , condition } ;
 
condition       = field_condition | grain_field_condition | query_condition
                | time_condition | type_condition | tag_condition
                | in_condition | hash_condition | meta_condition
                | relation_shortcut | domain_field_condition ;
 
field_condition   = field_name , comparator , value ;
grain_field_condition = grain_field_name , comparator , value ;
meta_condition    = meta_field_name , comparator , value ;
query_condition   = "query" , "=" , string_literal ;
time_condition    = "time" , "=" , string_literal
                  | "time" , "BETWEEN" , value , "AND" , value ;
type_condition    = "type" , "=" , string_literal ;
tag_condition     = "tags" , ( "INCLUDE" | "EXCLUDE" ) , array_literal ;
in_condition      = field_name , "IN" , "(" , ( value_list | subquery_extract ) , ")" ;
hash_condition    = "hash" , "=" , ( hash_literal | parameter ) ;
relation_shortcut = "relation" , "IS" , relation_category ;
domain_field_condition = domain_field , comparator , value ;
 
domain_field    = domain_prefix , ":" , identifier ;
domain_prefix   = "hc" | "legal" | "fin" | "rob" | "sci" | "con" | "int" ;
 
relation_category = "PREFERENCE" | "KNOWLEDGE" | "PERMISSION" | "INTERACTION"
                  | "AGENCY" | "LIFECYCLE" | "OBSERVATION" ;
 
subquery_extract = recall_stmt , "|" , extractor ;
 
meta_field_name = "recall_priority" | "epistemic_status" | "verification_status"
                | "source_type" ;
 
grain_field_name = event_field | state_field | workflow_field
                 | action_field | observation_field | goal_field
                 | reasoning_field | consensus_field | consent_field ;
 
event_field     = "role" | "session_id" | "parent_message_id" | "model_id" | "content" ;
state_field     = "context" | "plan" ;
workflow_field  = "trigger" | "steps" ;
action_field    = "tool_name" | "action_phase" | "is_error" | "tool_call_id" ;
observation_field = "observer_id" | "observer_type" ;
goal_field      = "goal_state" | "assigned_agent" | "deadline" | "depends_on" ;
reasoning_field = "reasoning_type" | "premises" | "conclusion" ;
consensus_field = "threshold" | "agreement_count" | "participating_observers" ;
consent_field   = "consent_action" | "purpose" | "grantor_did" | "grantee_did"
                | "scope" | "expires_at" ;
 
comparator      = "=" | "!=" | ">=" | "<=" | ">" | "<" ;
 
field_name      = "subject" | "relation" | "object" | "user_id" | "namespace"
                | "confidence" | "importance" | "score"
                | "verification_status" | "source_type" | "contradicted" ;
 
subject_clause  = "subject" , "=" , value ;
relation_clause = "relation" , "=" , value ;
 
with_clause     = "WITH" , with_option , { "," , with_option } ;
with_option     = "superseded" | "score_breakdown" | "explanation" | "provenance"
                | "contradiction_detection" | "progressive_disclosure"
                | "summarize"
                | "diversity" , "(" , diversity_spec , ")"
                | "consistency" , "(" , consistency_level , ")"
                | "progressive_disclosure" , "(" , disclosure_level , ")"
                | "dedup" , "(" , field_name , ")"
                | "locale" , "(" , string_literal , ")"
                | "cache" , "(" , "ttl" , "=" , positive_integer , ")"
                | extension_option ;
diversity_spec  = "mmr" , [ "," , "lambda" , "=" , number ]
                | "threshold" , "," , number ;
consistency_level = "eventual" | "bounded" , "(" , number , ")" | "linearizable" ;
disclosure_level = "summary" | "headlines" | "full" ;
extension_option = "x_" , identifier , [ "(" , value_list , ")" ] ;
 
pipeline        = { "|" , pipe_stage } ;
pipe_stage      = select_stage | order_stage | limit_stage | offset_stage
                | count_stage | first_stage | subjects_stage | objects_stage
                | hashes_stage | group_stage | project_stage ;
select_stage    = "SELECT" , field_name , { "," , field_name } ;
order_stage     = "ORDER" , "BY" , field_name , [ "ASC" | "DESC" ] ;
limit_stage     = "LIMIT" , positive_integer ;
offset_stage    = "OFFSET" , ( positive_integer | parameter ) ;
count_stage     = "COUNT" ;
first_stage     = "FIRST" ;
subjects_stage  = "SUBJECTS" ;
objects_stage   = "OBJECTS" ;
hashes_stage    = "HASHES" ;
group_stage     = "GROUP" , "BY" , field_name ;
project_stage   = "PROJECT" , project_spec , { "," , project_spec } ;
project_spec    = "content" , "(" , project_field , { "," , project_field } , ")"
                | "attr" , "(" , project_field , { "," , project_field } , ")" ;
project_field   = field_name | grain_field_name | domain_field ;
 
(* --- Tier 1: Evolve --- *)
 
add_stmt        = "ADD" , grain_type_singular , add_clause , { add_clause } , reason_clause ;
supersede_stmt  = "SUPERSEDE" , ( hash_literal | parameter ) , set_clause , { set_clause } , reason_clause ;
revert_stmt     = "REVERT" , ( hash_literal | parameter ) , reason_clause ;
 
add_clause      = "SET" , ( add_field | grain_add_field ) , "=" , value ;
add_field       = "subject" | "relation" | "object"
                | "confidence" | "importance" | "tags" ;
grain_add_field = goal_add_field | observation_add_field ;
goal_add_field  = "goal_state" | "assigned_agent" | "deadline" | "depends_on" ;
observation_add_field = "observer_id" | "observer_type" ;
 
set_clause      = "SET" , evolve_field , "=" , value ;
evolve_field    = "object" | "confidence" | "importance" | "tags" ;
reason_clause   = "REASON" , string_literal ;
 
(* --- Template definitions --- *)
 
template_name   = identifier ;
extends_clause  = "EXTENDS" , ( preset_name | template_name ) ;
template_body   = section+ ;
section         = header_section | element_section | element_summary_section
                | element_omit_section | source_break_section | footer_section ;
header_section           = "HEADER" , "{" , template_text , "}" ;
element_section          = "ELEMENT" , "{" , template_text , "}" ;
element_summary_section  = "ELEMENT_SUMMARY" , "{" , template_text , "}" ;
element_omit_section     = "ELEMENT_OMIT" , "{" , template_text , "}" ;
source_break_section     = "SOURCE_BREAK" , "{" , template_text , "}" ;
footer_section           = "FOOTER" , "{" , template_text , "}" ;
 
(* --- Shared terminals --- *)
 
value           = string_literal | number | boolean | parameter
                | array_literal | hash_literal ;
value_list      = value , { "," , value } ;
string_literal  = '"' , { any_char - '"' | '\\"' } , '"' ;
number          = [ "-" ] , digit+ , [ "." , digit+ ] ;
boolean         = "true" | "false" ;
parameter       = "$" , identifier ;
array_literal   = "[" , [ value_list ] , "]" ;
hash_literal    = "sha256:" , hex_char{8,64} ;
identifier      = letter , { letter | digit | "_" } ;
positive_integer = digit+ ;
 
grain_type_plural   = "beliefs" | "events" | "states" | "workflows" | "actions"
                    | "observations" | "goals" | "reasonings" | "consensuses" | "consents" ;
 
grain_type_singular = "belief" | "event" | "state" | "workflow" | "action"
                    | "observation" | "goal" | "reasoning" | "consensus" | "consent" ;

5. Type System

5.1 Grain Types (Closed Set)

Type	Plural (after RECALL)	Singular (in ADD/WHERE)	OMS Type Code
Belief	`beliefs`	`belief`	0x01
Event	`events`	`event`	0x02
State	`states`	`state`	0x03
Workflow	`workflows`	`workflow`	0x04
Action	`actions`	`action`	0x05
Observation	`observations`	`observation`	0x06
Goal	`goals`	`goal`	0x07
Reasoning	`reasonings`	`reasoning`	0x08
Consensus	`consensuses`	`consensus`	0x09
Consent	`consents`	`consent`	0x0A

5.2 Common Field Types

Field	Type	Operators	Notes
`query`	String	`=` only	Triggers relevance search (BM25 lexical and/or vector semantic)
`subject`	String	`=`, `!=`, `IN`	Triple subject lookup
`relation`	String	`=`, `!=`, `IN`, `IS`	Triple relation lookup. `IS` used with relation category shortcuts.
`object`	String	`=`, `!=`, `IN`	Triple object lookup
`user_id`	String	`=`, `!=`	User isolation
`namespace`	String	`=`	Namespace isolation (overwritten by token)
`confidence`	Number	`=`, `!=`, `>=`, `<=`, `>`, `<`	Range [0.0, 1.0]
`importance`	Number	`=`, `!=`, `>=`, `<=`, `>`, `<`	Range [0.0, 1.0]
`score`	Number	`>=`, `>`	Post-retrieval filter
`tags`	Array	`INCLUDE`, `EXCLUDE`	Tag set operations
`type`	GrainType	`=`	One of 10 types
`time`	Temporal	`=`, `BETWEEN`	Natural language or epoch
`hash`	Hash	`=`	Content-address lookup
`contradicted`	Boolean	`=`	`true` or `false`
`verification_status`	String	`=`	`"unverified"`, `"verified"`, `"contested"`, `"retracted"`
`source_type`	String	`=`	Source type
`recall_priority`	String	`=`	`"hot"`, `"warm"`, `"cold"`
`epistemic_status`	String	`=`	`"certain"`, `"probable"`, `"uncertain"`, `"estimated"`, `"derived"`

5.3 Evolve Fields (Tier 1, Closed Set)

ADD fields -- these can appear in an ADD statement's SET clauses. The first three are required:

Field	Type	Required?	Constraint
`subject`	String	Yes	The entity
`relation`	String	Yes	The predicate
`object`	String	Yes	The value
`confidence`	Number	No	Range [0.0, 1.0]. Default: implementation-defined
`importance`	Number	No	Range [0.0, 1.0]. Default: implementation-defined
`tags`	Array	No	Tag set. Default: empty

Namespace and user_id are taken from the capability token -- they cannot appear in SET clauses.

SUPERSEDE fields -- only these fields can appear in a SUPERSEDE statement's SET clauses:

Field	Type	Constraint
`object`	String	The new value of the fact
`confidence`	Number	Range [0.0, 1.0]
`importance`	Number	Range [0.0, 1.0]
`tags`	Array	Replaces the tag set

5.4 NULL Semantics

A condition on a missing field evaluates to no match; it never raises an error. For example, WHERE confidence >= 0.8 on a grain without a confidence field returns no match, not an error.
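
As a minimal sketch, the rule amounts to a guard in front of every comparison. `matches` and `COMPARATORS` are hypothetical names, grains are modeled as plain dictionaries, and treating `!=` on a missing field as no match is this sketch's reading of the rule rather than something the text states explicitly.

```python
import operator

COMPARATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def matches(grain, field, comparator, value):
    """Evaluate one field condition; a missing field never matches."""
    if grain.get(field) is None:
        return False  # no match, and no error
    return COMPARATORS[comparator](grain[field], value)
```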

6. OMS Grain Type Integration

6.1 Design Principle

When a RECALL statement specifies a grain type (e.g., RECALL actions), the parser unlocks a type-specific field set for use in WHERE clauses. This enables precise querying of OMS-native fields that only exist on specific grain types, without polluting the global field namespace.

The type-specific field set is a compile-time guarantee: the parser MUST reject field references that do not belong to the specified grain type. When no grain type is specified (RECALL WHERE ...), only the common field set is available.

6.2 Field Resolution Rules

Phase 1 -- Common fields. The common field set (section 5.2) is always available.
Phase 2 -- Type-specific fields. When the statement specifies a grain type plural, the grain-type-specific field set is additionally available.

Validation rule: If a grain_field_condition references a field not in the declared grain type's field set, the parser MUST return error CAL-E060: FieldNotOnGrainType with a suggestion listing valid fields for that type.
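
The two-phase resolution can be sketched as a lookup over the field sets from section 3.3. `resolve_field` and `CalError` are hypothetical names, and reusing `CAL-E060` when no grain type is declared is an assumption of this sketch.

```python
COMMON_FIELDS = (
    "query", "subject", "relation", "object", "user_id", "namespace",
    "confidence", "importance", "tags", "score", "type", "time", "hash",
    "verification_status", "source_type", "contradicted",
    "recall_priority", "epistemic_status",
)

TYPE_FIELDS = {
    "beliefs": (),
    "events": ("role", "session_id", "parent_message_id", "model_id", "content"),
    "states": ("context", "plan"),
    "workflows": ("trigger", "steps"),
    "actions": ("tool_name", "action_phase", "is_error", "tool_call_id"),
    "observations": ("observer_id", "observer_type"),
    "goals": ("goal_state", "assigned_agent", "deadline", "depends_on"),
    "reasonings": ("reasoning_type", "premises", "conclusion"),
    "consensuses": ("threshold", "agreement_count", "participating_observers"),
    "consents": ("consent_action", "purpose", "grantor_did", "grantee_did",
                 "scope", "expires_at"),
}


class CalError(Exception):
    def __init__(self, code, name, suggestion):
        super().__init__(f"{code}: {name}. {suggestion}")
        self.code = code
        self.suggestion = suggestion


def resolve_field(field, grain_type=None):
    """Phase 1: common fields. Phase 2: fields of the declared grain type."""
    if field in COMMON_FIELDS:
        return "common"
    allowed = TYPE_FIELDS[grain_type] if grain_type is not None else ()
    if field in allowed:
        return "type-specific"
    # Assumption: the same error code is used when no grain type is declared.
    valid = COMMON_FIELDS + allowed
    raise CalError("CAL-E060", "FieldNotOnGrainType",
                   "Valid fields: " + ", ".join(valid))
```

For instance, `tool_name` resolves under `actions` but raises `CAL-E060` under `goals`, with the goal fields listed in the suggestion.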

6.3 Grain-Type-Specific Queryable Fields

Belief (0x01) -- `RECALL beliefs`

All Belief fields are in the common set (subject, relation, object, confidence). No additional type-specific fields.

Event (0x02) -- `RECALL events`

Field	Type	Operators	Notes
`role`	String	`=`, `!=`	`"user"`, `"assistant"`, `"system"`, `"tool"`
`session_id`	String	`=`	Conversation session identifier
`parent_message_id`	String	`=`	Threading: parent message reference
`model_id`	String	`=`, `!=`	LLM model identifier
`content`	String	`=`	Semantic search on event content

State (0x03) -- `RECALL states`

Field	Type	Operators	Notes
`context`	String	`=`, `!=`	State context identifier
`plan`	String	`=`	Semantic search on plan content

Workflow (0x04) -- `RECALL workflows`

Field	Type	Operators	Notes
`trigger`	String	`=`, `!=`	Trigger condition (e.g., `"on:user_message"`)
`steps`	String	`=`	Semantic search on workflow steps

Action (0x05) -- `RECALL actions`

Field	Type	Operators	Notes
`tool_name`	String	`=`, `!=`, `IN`	Tool identifier
`action_phase`	String	`=`	`"definition"`, `"call"`, `"result"`, `"complete"`
`is_error`	Boolean	`=`	Whether the action resulted in error
`tool_call_id`	String	`=`	Correlation ID across action phases

Observation (0x06) -- `RECALL observations`

Field	Type	Operators	Notes
`observer_id`	String	`=`, `!=`	Identifier of the observing entity
`observer_type`	String	`=`, `!=`	Type classifier (e.g., `"agent:monitor"`)

Goal (0x07) -- `RECALL goals`

Field	Type	Operators	Notes
`goal_state`	String	`=`, `!=`	`"active"`, `"completed"`, `"abandoned"`, `"blocked"`
`assigned_agent`	String	`=`, `!=`	DID of responsible agent
`deadline`	Temporal	`=`, `BETWEEN`	ISO 8601 or epoch
`depends_on`	String	`=`, `IN`	Content address(es) of prerequisite goals

Reasoning (0x08) -- `RECALL reasonings`

Field	Type	Operators	Notes
`reasoning_type`	String	`=`	`"deductive"`, `"inductive"`, `"abductive"`, `"analogical"`
`premises`	String	`=`	Semantic search on premises
`conclusion`	String	`=`, `!=`	Semantic or exact match on conclusion

Consensus (0x09) -- `RECALL consensuses`

Field	Type	Operators	Notes
`threshold`	Number	`=`, `>=`, `<=`, `>`, `<`	Agreement threshold [0.0, 1.0]
`agreement_count`	Number	`=`, `>=`, `<=`, `>`, `<`	Number of agreeing observers
`participating_observers`	Array	`INCLUDE`	Filter by participating observer IDs

Consent (0x0A) -- `RECALL consents`

Field	Type	Operators	Notes
`consent_action`	String	`=`	`"grant"` or `"withdraw"`
`purpose`	String	`=`, `!=`	Purpose-binding (e.g., `"personalization"`)
`grantor_did`	String	`=`	DID of consent grantor
`grantee_did`	String	`=`	DID of consent recipient
`scope`	String	`=`	Consent scope identifier
`expires_at`	Temporal	`=`, `BETWEEN`	Consent expiration

6.4 Type-Specific ADD Extensions

When adding Goal or Observation grains, type-specific fields are available in SET clauses:

ADD goal
  SET subject = "alice"
  SET relation = "mg:intends"
  SET object = "complete quarterly review"
  SET goal_state = "active"
  SET assigned_agent = "did:web:assistant.example.com"
  SET deadline = "2026-03-15T00:00:00Z"
  SET importance = 0.9
  REASON "user created objective during planning session"
 
ADD observation
  SET subject = "system"
  SET relation = "mg:perceives"
  SET object = "alice works late on Fridays"
  SET observer_id = "obs-activity-monitor"
  SET observer_type = "agent:activity-tracker"
  SET confidence = 0.7
  REASON "observed pattern across last 4 weeks"

6.5 Field Count Summary

Grain Type	Common Fields	Type-Specific Fields	Total Queryable
Belief	18	0	18
Event	18	5	23
State	18	2	20
Workflow	18	2	20
Action	18	4	22
Observation	18	2	20
Goal	18	4	22
Reasoning	18	3	21
Consensus	18	3	21
Consent	18	6	24
(no type)	18	0	18

7. mg: Relation Vocabulary

7.1 Standard mg: Relations

OMS defines a standard mg: relation vocabulary. CAL provides first-class support for mg: relations: they are valid string literals, the parser recognizes them for validation, and common patterns have semantic shortcuts.

Relation	Category	Typical Subject	Typical Object	Description
`mg:perceives`	Observation	Agent/Observer	Phenomenon	Sensory/cognitive input
`mg:knows`	Knowledge	Entity	Fact	Knowledge assertion
`mg:said`	Interaction	Entity	Statement	Recorded utterance
`mg:did`	Interaction	Entity	Action description	Recorded action
`mg:infers`	Knowledge	Agent	Conclusion	Inference result
`mg:agrees_with`	Consensus	Agent	Proposition	Agreement record
`mg:state_at`	Observation	Agent	State snapshot	Point-in-time state
`mg:requires_steps`	Workflow	Process	Step sequence	Workflow definition
`mg:intends`	Lifecycle	Entity	Objective	Goal declaration
`mg:permits`	Permission	Grantor DID	Action/scope	Permission grant
`mg:revokes`	Permission	Grantor DID	Action/scope	Permission withdrawal
`mg:prohibits`	Permission	Authority	Action/scope	Prohibition
`mg:requires`	Preference	Entity	Requirement	Requirement assertion
`mg:prefers`	Preference	Entity	Preference	Preference assertion
`mg:avoids`	Preference	Entity	Aversion	Avoidance assertion
`mg:delegates_to`	Agency	Entity	Agent DID	Delegation
`mg:owned_by`	Knowledge	Resource	Entity	Ownership
`mg:has_capability`	Agency	Agent DID	Capability	Agent capability
`mg:handed_off_to`	Interaction	Agent DID	Agent DID	Agent handoff
`mg:depends_on`	Lifecycle	Goal	Goal	Goal dependency
`mg:assigned_to`	Agency	Task	Agent DID	Task assignment

7.2 Relation Category Shortcuts

CAL defines relation category shortcuts as syntactic sugar for common multi-relation queries. These expand to IN conditions at parse time:

Shortcut	Expands To
`relation IS PREFERENCE`	`relation IN ("mg:prefers", "mg:avoids", "mg:requires")`
`relation IS KNOWLEDGE`	`relation IN ("mg:knows", "mg:infers", "mg:owned_by")`
`relation IS PERMISSION`	`relation IN ("mg:permits", "mg:revokes", "mg:prohibits")`
`relation IS INTERACTION`	`relation IN ("mg:said", "mg:did", "mg:handed_off_to")`
`relation IS AGENCY`	`relation IN ("mg:delegates_to", "mg:has_capability", "mg:assigned_to")`
`relation IS LIFECYCLE`	`relation IN ("mg:intends", "mg:depends_on")`
`relation IS OBSERVATION`	`relation IN ("mg:perceives", "mg:state_at")`

Examples:

-- All preference-related beliefs about alice
RECALL beliefs WHERE subject = "alice" AND relation IS PREFERENCE
  | ORDER BY confidence DESC
 
-- All permission records for a DID
RECALL WHERE subject = "did:key:z6Mk..." AND relation IS PERMISSION
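
Non-normative sketch: the parse-time expansion described above can be pictured as a table lookup. The function name `expand_relation_shortcut` is hypothetical, and the table is abbreviated to three of the categories listed in Section 7.2.

```python
# Abbreviated category table; the values are taken from Section 7.2.
RELATION_CATEGORIES = {
    "PREFERENCE": ("mg:prefers", "mg:avoids", "mg:requires"),
    "PERMISSION": ("mg:permits", "mg:revokes", "mg:prohibits"),
    "LIFECYCLE": ("mg:intends", "mg:depends_on"),
}


def expand_relation_shortcut(category: str) -> str:
    """Return the IN condition that `relation IS <category>` expands to."""
    try:
        relations = RELATION_CATEGORIES[category.upper()]
    except KeyError:
        raise ValueError(f"unknown relation category: {category}") from None
    quoted = ", ".join(f'"{r}"' for r in relations)
    return f"relation IN ({quoted})"


print(expand_relation_shortcut("PREFERENCE"))
```

Because the expansion is a pure text rewrite, the store only ever sees an ordinary IN condition.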

7.3 mg: Relation Validation

The parser SHOULD validate mg: prefixed relation values against the known vocabulary. Unknown mg: relations produce warning CAL-W001 (not an error).
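
Non-normative sketch of this check. The vocabulary is copied from Section 7.1. The function name and the message text are assumptions; only the code CAL-W001 comes from this specification.

```python
from __future__ import annotations

# Standard vocabulary from Section 7.1.
KNOWN_MG_RELATIONS = frozenset({
    "mg:perceives", "mg:knows", "mg:said", "mg:did", "mg:infers",
    "mg:agrees_with", "mg:state_at", "mg:requires_steps", "mg:intends",
    "mg:permits", "mg:revokes", "mg:prohibits", "mg:requires",
    "mg:prefers", "mg:avoids", "mg:delegates_to", "mg:owned_by",
    "mg:has_capability", "mg:handed_off_to", "mg:depends_on",
    "mg:assigned_to",
})


def validate_relation(value: str) -> list[dict]:
    """Return warnings for a relation literal; an empty list means none."""
    if value.startswith("mg:") and value not in KNOWN_MG_RELATIONS:
        # A warning, not an error: the query still runs.
        return [{"code": "CAL-W001",
                 "message": f"unknown mg: relation {value!r}"}]
    return []


print(validate_relation("mg:likes"))
```

Relations outside the mg: namespace, such as custom-namespace relations, pass through unchecked.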

8. Statement Semantics

CAL/1 has 12 statement types organized into three tiers:

Statement	Tier	Description
RECALL	0	Retrieve grains matching filters
ASSEMBLE	0	Compose context from multiple sources with budget
EXISTS	0	Check grain existence by content address
HISTORY	0	Version history with AS OF and DIFF
EXPLAIN	0	Execution plan preview
DESCRIBE	0	Schema introspection
BATCH	0	Multiple independent queries in one request
COALESCE	0	Fallback chain of RECALL queries
ADD	1	Create a new grain (append-only)
SUPERSEDE	1	Create a new version of an existing grain
REVERT	1	Restore a previous version
Set operations	0	`UNION`, `INTERSECT`, `EXCEPT`

8.1 RECALL (Tier 0)

Retrieves grains matching the given filters. Returns results using the OMS Standard Search Response Envelope.

RECALL beliefs WHERE subject = "alice" AND relation = "prefers"
  WITH contradiction_detection
  | ORDER BY confidence DESC
  | LIMIT 10

RECALL supports semantic shortcuts (ABOUT, RECENT, SINCE, LIKE, MY, CONTRADICTIONS, BETWEEN -- see section 9), grain-type-specific fields (section 6), thread shorthand (section 8.1.1), and per-query format control via AS.

-- With grain-type-specific fields
RECALL actions WHERE tool_name = "get_weather" AND is_error = false
  | ORDER BY time DESC | LIMIT 20
 
-- With domain profile fields
RECALL beliefs WHERE tags INCLUDE ["profile:healthcare"]
  AND hc:patient_id = "P-12345" AND relation = "mg:knows"

8.1.1 THREAD Shorthand

The THREAD keyword provides concise syntax for conversation retrieval:

-- Full conversation in a session
RECALL events THREAD "sess-123"
-- Expands to: RECALL events WHERE session_id = "sess-123" | ORDER BY time ASC
 
-- Full thread containing a specific message
RECALL events THREAD FROM sha256:a1b2c3d4...

8.2 ASSEMBLE (Tier 0)

ASSEMBLE is CAL's flagship new statement. It composes a context block from multiple RECALL sources with token budgets, priority ordering, format control, and progressive disclosure.

CAL/1 ASSEMBLE user_context
  FOR "conversation about alice's preferences and goals"
  FROM
    beliefs:  (RECALL beliefs ABOUT "alice" WHERE relation = "prefers" LIMIT 20),
    goals:    (RECALL goals ABOUT "alice" RECENT 10),
    events:   (RECALL events WHERE user_id = "alice" RECENT 5),
    history:  (RECALL beliefs ABOUT "alice"
                WHERE relation = "prefers" WITH superseded
                | ORDER BY time DESC | LIMIT 3)
  BUDGET 2000 tokens
  PRIORITY beliefs > goals > events > history
  FORMAT markdown
  WITH progressive_disclosure, dedup(subject)

Execution Phases:

Source Resolution. Each source in the FROM clause is an independent RECALL. They execute in parallel. Results MUST be deterministic.
Deduplication. If WITH dedup(field) is specified, grains in multiple sources are deduplicated. The copy from the highest-priority source is kept.
Budget Allocation. The budget allocator distributes tokens according to the PRIORITY clause. Default weights: 2 sources [0.65, 0.35]; 3 sources [0.50, 0.30, 0.20]; 4 sources [0.40, 0.28, 0.20, 0.12]; 5+ sources: exponential decay. Surplus from under-utilizing sources redistributes to remaining sources.
Progressive Disclosure. When enabled, the response includes three tiers: Summary, Headlines, Full.
Formatting. The FORMAT clause determines output structure.
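
Non-normative sketch of the Deduplication phase. It assumes that two grains count as duplicates when they carry the same value for the dedup field; this section does not define the comparison, so treat that as an assumption. The function name is hypothetical.

```python
def dedup_sources(sources, priority, field):
    """Drop grains whose `field` value already appeared in a
    higher-priority source.

    sources:  dict mapping source label -> list of grain dicts
    priority: source labels, highest priority first
    """
    seen = set()
    kept = {}
    for label in priority:
        # Keep a grain only if no higher-priority source supplied its key.
        kept[label] = [g for g in sources[label] if g.get(field) not in seen]
        seen.update(g[field] for g in kept[label] if g.get(field) is not None)
    return kept


sources = {
    "beliefs": [{"subject": "alice", "object": "dark mode"}],
    "history": [{"subject": "alice", "object": "dark mode"},
                {"subject": "alice", "object": "light mode"}],
}
print(dedup_sources(sources, ["beliefs", "history"], "object"))
```

Grains that lack the dedup field are always kept, and duplicates inside a single source are left alone in this sketch.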

Budget Units:

Unit	Meaning	Default	Max
`tokens`	Approximate token count	4000	16000
`grains`	Maximum total grain count	50	200

Token estimation is approximate by design. The response MUST report actual tokens used.
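
Non-normative sketch of the Budget Allocation phase using the default weights for two to four sources. Several details are assumptions: a single source receives the whole budget, shares are floored to whole units, and surplus is handed out in priority order. The exponential decay for five or more sources is not modelled because its parameters are not given here.

```python
# Default weights as integer percentages, highest priority first.
DEFAULT_WEIGHTS = {
    1: [100],              # assumption: one source takes everything
    2: [65, 35],
    3: [50, 30, 20],
    4: [40, 28, 20, 12],
}


def allocate_budget(total, demands):
    """Split `total` budget units across sources.

    demands: units each source could use, highest priority first.
    """
    n = len(demands)
    if n not in DEFAULT_WEIGHTS:
        raise ValueError("5+ sources use exponential decay; not modelled")
    # First pass: weighted share, capped at what the source can use.
    alloc = [min(d, total * w // 100)
             for d, w in zip(demands, DEFAULT_WEIGHTS[n])]
    surplus = total - sum(alloc)
    # Second pass: give unused budget to sources that still want more.
    for i, demand in enumerate(demands):
        if surplus <= 0:
            break
        extra = min(demand - alloc[i], surplus)
        alloc[i] += extra
        surplus -= extra
    return alloc


print(allocate_budget(2000, [300, 5000]))
```

In the printed example the first source needs only 300 of its 1300-unit share, so the unused 1000 units flow to the second source.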

ASSEMBLE Constraints:

Constraint	Limit
Max sources in FROM	8
Max LET bindings per ASSEMBLE	5
Max total BUDGET (tokens)	16,000
Max total BUDGET (grains)	200
Max context_name length	64 characters
Max FOR string length	256 characters
ASSEMBLE timeout	10,000ms
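
Non-normative sketch of a pre-execution check against the static limits above. The function name and the wording of the problem strings are hypothetical; the timeout is a runtime concern and is not checked.

```python
MAX_BUDGET = {"tokens": 16000, "grains": 200}


def check_assemble(name, intent, source_count, let_count, budget, unit):
    """Return a list of constraint violations; empty means acceptable."""
    problems = []
    if source_count > 8:
        problems.append("more than 8 sources in FROM")
    if let_count > 5:
        problems.append("more than 5 LET bindings")
    if unit not in MAX_BUDGET:
        problems.append(f"unknown budget unit: {unit}")
    elif budget > MAX_BUDGET[unit]:
        problems.append(f"BUDGET exceeds {MAX_BUDGET[unit]} {unit}")
    if len(name) > 64:
        problems.append("context_name longer than 64 characters")
    if len(intent) > 256:
        problems.append("FOR string longer than 256 characters")
    return problems


print(check_assemble("user_context", "conversation about alice",
                     4, 0, 2000, "tokens"))
```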

8.3 EXISTS (Tier 0)

Checks whether a specific grain exists, identified by its content address. Returns a boolean. The check is an O(1) hash lookup.

EXISTS sha256:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2

8.4 HISTORY (Tier 0)

Retrieves the version history for a grain or for a (subject, relation) pair. Returns versions in reverse chronological order, capped at 100 versions.

-- By hash: show version chain
HISTORY sha256:a1b2c3d4...
 
-- By subject and relation: show all versions
HISTORY WHERE subject = "alice" AND relation = "prefers"
 
-- AS OF: temporal snapshot
HISTORY WHERE subject = "alice" AND relation = "prefers" AS OF "2025-06-15"
 
-- DIFF: show changes between two versions
HISTORY sha256:aaa... DIFF sha256:bbb...

8.5 EXPLAIN (Tier 0)

Returns the execution plan without running the query. Works with all statement types including ASSEMBLE.

EXPLAIN RECALL beliefs WHERE query = "alice preferences" LIMIT 10
EXPLAIN ASSEMBLE user_context
  FOR "conversation about alice"
  FROM beliefs: (RECALL beliefs ABOUT "alice"),
       goals: (RECALL goals ABOUT "alice" RECENT 5)
  BUDGET 2000 tokens

8.6 DESCRIBE (Tier 0)

Schema introspection for grain types, fields, capabilities, server metadata, templates, and grammar.

CAL/1 DESCRIBE grain_types        -- list available grain types
CAL/1 DESCRIBE fields             -- list all queryable fields
CAL/1 DESCRIBE fields belief      -- list fields for a specific grain type
CAL/1 DESCRIBE capabilities       -- server capabilities and conformance
CAL/1 DESCRIBE server             -- server metadata
CAL/1 DESCRIBE templates          -- list registered templates
CAL/1 DESCRIBE grammar            -- return EBNF (optional, Extended conformance)

8.7 BATCH (Tier 0)

Multiple independent queries in a single request. Each sub-query gets its own result slot. Only Tier 0 (read) statements are allowed in BATCH.

CAL/1 BATCH {
  preferences: RECALL beliefs ABOUT "alice" WHERE relation = "prefers",
  recent:      RECALL events ABOUT "alice" RECENT 5,
  team:        RECALL beliefs WHERE relation = "member_of" AND object = "team-alpha"
}

Constraints: Max 10 queries per BATCH. LET bindings within a BATCH are scoped to that BATCH block.

8.8 ADD (Tier 1)

Creates a new grain. Pure append-only -- does not modify or reference any existing grain.

ADD belief
  SET subject = "alice"
  SET relation = "prefers"
  SET object = "dark mode"
  SET confidence = 0.9
  SET tags = ["preference", "ui"]
  REASON "user stated preference during onboarding conversation"

Addable grain types: Belief, Observation, Goal. Events, Actions, States, and other types represent system-generated records and are not user-creatable.

Required SET fields: subject, relation, object. REASON is mandatory.
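
Non-normative sketch of the ADD preconditions stated above. The function name and the error strings are hypothetical.

```python
ADDABLE_TYPES = {"belief", "observation", "goal"}
REQUIRED_FIELDS = ("subject", "relation", "object")


def check_add(grain_type, fields, reason):
    """Return a list of problems with an ADD statement; empty means valid.

    fields: dict of SET clauses, field name -> value
    reason: text of the REASON clause, or None when absent
    """
    errors = []
    if grain_type.lower() not in ADDABLE_TYPES:
        errors.append(f"grain type {grain_type!r} is not user-creatable")
    missing = [f for f in REQUIRED_FIELDS if f not in fields]
    if missing:
        errors.append("missing required SET fields: " + ", ".join(missing))
    if not reason or not reason.strip():
        errors.append("REASON is mandatory")
    return errors


print(check_add("event", {"subject": "alice"}, ""))
```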

8.9 SUPERSEDE (Tier 1)

Creates a new grain that supersedes an existing one. The old grain is preserved and remains queryable via WITH superseded.

SUPERSEDE sha256:target_hash
  SET object = "light mode"
  SET confidence = 0.95
  REASON "user explicitly changed preference"

Only Belief grains can be superseded via CAL. At least one SET clause and REASON are required.

8.10 REVERT (Tier 1)

Creates a new grain that restores content from the version before the target. Like git revert, this does not undo history -- it creates a new version.

REVERT sha256:target_hash
  REASON "supersession was based on misunderstood context"
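
Non-normative sketch of why SUPERSEDE and REVERT never destroy data: both only ever append. The in-memory store, the `supersedes` link field, and the way the content address is derived (SHA-256 over sorted-key JSON) are all assumptions made for illustration; OMS defines the real grain encoding.

```python
import hashlib
import json


class AppendOnlyStore:
    def __init__(self):
        self.grains = {}  # content address -> grain; entries are never removed

    def _put(self, grain):
        body = json.dumps(grain, sort_keys=True).encode("utf-8")
        address = "sha256:" + hashlib.sha256(body).hexdigest()
        self.grains[address] = grain
        return address

    def add(self, fields, reason):
        return self._put({**fields, "reason": reason, "supersedes": None})

    def supersede(self, target, changes, reason):
        # New version = old content + changes; the old grain stays in place.
        new = {**self.grains[target], **changes,
               "reason": reason, "supersedes": target}
        return self._put(new)

    def revert(self, target, reason):
        # Restore the content of the version before `target` as a new grain.
        previous = self.grains[target]["supersedes"]
        if previous is None:
            raise ValueError("target has no earlier version to restore")
        restored = {**self.grains[previous],
                    "reason": reason, "supersedes": target}
        return self._put(restored)


store = AppendOnlyStore()
v1 = store.add({"subject": "alice", "relation": "prefers",
                "object": "dark mode"}, "stated during onboarding")
v2 = store.supersede(v1, {"object": "light mode"}, "user changed preference")
v3 = store.revert(v2, "supersession was based on misunderstood context")
print(store.grains[v3]["object"], len(store.grains))
```

After the revert the store holds three grains: the original, the superseding version, and the restoring version that points back at the one it replaces.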

8.11 Set Operations

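Set operations combine the results of two parenthesized RECALL queries with UNION, INTERSECT, or EXCEPT:

(RECALL WHERE user_id = "alice" AND query = "project status")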
EXCEPT
(RECALL WHERE user_id = "bob" AND query = "project status")

Each operand executes independently. Set operations are applied post-retrieval:

UNION: Deduplicate by content_address, merge scores (max)
INTERSECT: Keep only grains present in both, merge scores (min)
EXCEPT: Keep left grains absent from right
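
Non-normative sketch of the three post-retrieval rules. Each operand's result is modelled as a dict from content address to relevance score, which is an assumption about representation only.

```python
def union(left, right):
    """All grains from both sides; a shared grain keeps the higher score."""
    merged = dict(left)
    for address, score in right.items():
        merged[address] = max(score, merged.get(address, score))
    return merged


def intersect(left, right):
    """Only grains present on both sides; keeps the lower score."""
    return {a: min(s, right[a]) for a, s in left.items() if a in right}


def except_(left, right):
    """Grains from the left side that are absent from the right side."""
    return {a: s for a, s in left.items() if a not in right}


alice = {"sha256:aaa": 0.9, "sha256:bbb": 0.4}
bob = {"sha256:bbb": 0.7, "sha256:ccc": 0.5}
print(union(alice, bob))
print(intersect(alice, bob))
print(except_(alice, bob))
```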

8.12 LET Bindings

LET bindings name intermediate RECALL results that can be referenced by $name in subsequent FROM clauses or WHERE IN sub-expressions.

CAL/1
LET $team_members = RECALL beliefs
  WHERE relation = "member_of" AND object = "team-alpha" | SUBJECTS;
 
LET $team_prefs = RECALL beliefs
  WHERE subject IN ($team_members) AND relation = "prefers";
 
ASSEMBLE team_context
  FOR "team alpha's collective preferences"
  FROM prefs: ($team_prefs),
       goals: (RECALL goals WHERE subject IN ($team_members) RECENT 10)
  BUDGET 3000 tokens
  PRIORITY prefs > goals
  FORMAT markdown

LET constraints:

Max 5 LET bindings per request
Evaluated once, in declaration order
Within a standalone query: cannot reference other LET bindings
Within ASSEMBLE: LET bindings can reference prior bindings (linear chaining only, max depth 3)
Scoped to the enclosing BATCH or single-statement context

8.13 COALESCE

Evaluates argument queries left-to-right. Returns the result of the first query that returns at least one grain. Remaining queries are not executed (short-circuit evaluation).

COALESCE(
  RECALL beliefs WHERE subject = "alice" AND relation = "favorite_color",
  RECALL beliefs WHERE subject = "alice" AND relation = "prefers"
    AND tags INCLUDE ["color"],
  RECALL beliefs ABOUT "alice" LIKE "color preference"
)

Constraints: Max 5 branches. All branches MUST be RECALL statements.
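
Non-normative sketch of the short-circuit behaviour. Each branch is modelled as a zero-argument callable that runs one RECALL and returns a list of grains; that modelling and the function name are assumptions.

```python
def coalesce(*branches, max_branches=5):
    """Run branches left to right; return the first non-empty result."""
    if len(branches) > max_branches:
        raise ValueError(f"COALESCE allows at most {max_branches} branches")
    for run_query in branches:
        grains = run_query()
        if grains:
            return grains  # later branches are never executed
    return []


executed = []


def branch(name, result):
    def run():
        executed.append(name)
        return result
    return run


result = coalesce(branch("favorite_color", []),
                  branch("prefers+color", [{"object": "teal"}]),
                  branch("semantic", [{"object": "blue"}]))
print(result, executed)
```

The third branch never runs because the second one already returned a grain.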

9. Semantic Shortcuts

Shortcuts are syntactic sugar -- they desugar to standard WHERE/pipeline clauses. The desugared form is always valid and produces identical results.

9.1 ABOUT

RECALL beliefs ABOUT "alice"
-- Desugars to: RECALL beliefs WHERE subject = "alice"
-- Falls back to: RECALL beliefs WHERE query = "alice" (if no structural match)

9.2 RECENT

RECALL events ABOUT "alice" RECENT 5
-- Desugars to: RECALL events WHERE subject = "alice" | ORDER BY time DESC | LIMIT 5

9.3 SINCE

RECALL events SINCE "last week"
-- Desugars to: RECALL events WHERE time = "last week"

9.4 LIKE

RECALL LIKE "machine learning best practices"
-- Desugars to: RECALL WHERE query = "machine learning best practices"

9.5 MY

RECALL MY beliefs
-- Desugars to: RECALL beliefs WHERE user_id = $current_user_id

9.6 CONTRADICTIONS

RECALL beliefs ABOUT "alice" CONTRADICTIONS
-- Desugars to: RECALL beliefs WHERE subject = "alice" AND contradicted = true
--              WITH contradiction_detection

9.7 BETWEEN

RECALL events BETWEEN 1709251200 AND 1709337600
-- Desugars to: RECALL events WHERE time BETWEEN 1709251200 AND 1709337600

9.8 Shortcut Combination Rules

Combination	Valid?	Notes
ABOUT + WHERE	Yes	ABOUT becomes an additional AND condition
ABOUT + LIKE	No	Ambiguous -- error `CAL-E060`
RECENT + LIMIT	No	Ambiguous -- error `CAL-E060`
RECENT + ORDER BY	No	Ambiguous -- error `CAL-E060`
SINCE + WHERE time	No	Ambiguous -- error `CAL-E060`
SINCE + BETWEEN	No	Ambiguous -- error `CAL-E060`
MY + WHERE user_id	No	Ambiguous -- error `CAL-E060`
CONTRADICTIONS + WITH contradiction_detection	Yes	Redundant but not an error
AS + FORMAT (in ASSEMBLE)	Yes	AS controls per-source; FORMAT controls assembly
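
Non-normative sketch of the ambiguity check implied by the table. A query is reduced to the set of clause names it contains; that reduction, the exception class, and the message text are assumptions. Only the code CAL-E060 and the pairs come from the table.

```python
class CalError(Exception):
    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


# Pairs marked "No" in the combination table.
AMBIGUOUS_PAIRS = [
    ("ABOUT", "LIKE"),
    ("RECENT", "LIMIT"),
    ("RECENT", "ORDER BY"),
    ("SINCE", "WHERE time"),
    ("SINCE", "BETWEEN"),
    ("MY", "WHERE user_id"),
]


def check_shortcuts(clauses):
    """Raise CalError(CAL-E060) if the clause set has an ambiguous pair."""
    present = set(clauses)
    for first, second in AMBIGUOUS_PAIRS:
        if first in present and second in present:
            raise CalError("CAL-E060",
                           f"{first} cannot be combined with {second}")


check_shortcuts({"ABOUT", "WHERE", "RECENT"})  # valid: no exception raised
```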

10. FORMAT System

10.1 Semantic Presets

Preset	Output Format	Description
`structured` / `sml`	SML-based	Semantic tag structure optimised for LLM consumption (see SML spec)
`readable` / `markdown`	Markdown	Human-readable, default for ASSEMBLE
`compact` / `text`	Plain text	Minimal, token-efficient
`data` / `json`	JSON	Machine-readable structured data
`yaml`	YAML	YAML structure
`triples`	Triples	Subject-relation-object triples
`toon`	TOON	Token-Oriented Object Notation — CSV-tabular for uniform grain arrays; ~40% fewer tokens vs JSON. Optimised for large RECALL result sets and budget-constrained ASSEMBLE. See Section 10.9.

10.2 Custom Templates (Mustache-subset)

CAL templates use a strict subset of Mustache:

Variable interpolation: {{variable}}
Sections (conditional blocks): {{#section}}...{{/section}}
Inverted sections (if-not): {{^section}}...{{/section}}
Comments: {{! comment }}
Constrained iteration: {{#each}} block (capped at 200 iterations)

Excluded: Lambdas, partials, set delimiter, unescaped interpolation.

10.3 Content Projection Model

CAL output is consumed by LLMs, not database clients. The Content Projection Model defines how OMS grain fields map to LLM-friendly output -- what becomes readable text content, what becomes lightweight metadata attributes, and what stays in the machine envelope only.

10.3.1 Design Principle

The output format reflects the consumer's mental model, not the storage system's data model.

OMS stores grains as structured triples with rich metadata (hashes, namespaces, short keys, provenance chains). LLMs think in natural language with lightweight structural hints. The projection model bridges these two worlds: it composes grain fields into readable text with just enough structure for the LLM to categorize and weight information.

10.3.2 Per-Grain-Type Content Projection

Each grain type defines a content rule (what becomes the text content of the output element) and an attribute set (what becomes metadata on the element). All other fields remain in the machine envelope (Section 14.1) and never appear in formatted output.

Grain Type	Text Content Rule	Default Attributes
Belief	`humanize(relation) + " " + object`	`subject`, `confidence`?
Event	`content`	`role`, `time`?
Goal	`object` (the objective description)	`subject`, `state`?, `deadline`?
Action	`object` (tool result summary)	`tool`, `phase`?
Observation	`object` (what was observed)	`observer`?
Reasoning	`conclusion`	`type`?
State	`plan` (summary)	`context`?
Workflow	`steps` (joined as readable text)	`trigger`?
Consensus	`object` (the agreed claim)	`threshold`?, `count`?
Consent	`purpose`	`action`, `grantor`, `grantee`

Attributes marked with ? are included at standard and full disclosure levels only, omitted at summary level.

10.3.3 The `humanize()` Function

The humanize() function transforms OMS relation strings into human-readable text:

Strip namespace prefix: "mg:prefers" → "prefers"
Replace underscores with spaces: "works_at" → "works at"
Custom-namespace relations get the same treatment once the prefix is stripped: "acme:similar_to" → "similar to"

Implementations MUST apply humanize() to relation strings in formatted output. The raw relation value remains available in the machine envelope.
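
Non-normative sketch of the rules above in Python. It treats everything before the first colon as the namespace prefix, which is an assumption for relation strings that contain more than one colon.

```python
def humanize(relation: str) -> str:
    """Turn an OMS relation string into readable text."""
    # Strip the namespace prefix, if any ("mg:prefers" -> "prefers").
    local_name = relation.split(":", 1)[-1]
    # Replace underscores with spaces ("works_at" -> "works at").
    return local_name.replace("_", " ")


for raw in ("mg:prefers", "works_at", "acme:similar_to"):
    print(raw, "->", humanize(raw))
```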

10.3.4 Time Humanization

Timestamps in formatted output SHOULD use relative human-readable form by default:

Age	Formatted As
< 1 hour	`"Nm ago"` (e.g., `"23m ago"`)
< 24 hours	`"Nh ago"` (e.g., `"3h ago"`)
< 7 days	`"yesterday"`, `"2d ago"`, etc.
< 30 days	`"2w ago"`, `"3w ago"`
< 1 year	`"Mar 1"`, `"Jan 15"`
>= 1 year	`"Mar 2025"`, `"2024"`

Full ISO 8601 timestamps remain in the machine envelope. Implementations MAY provide a WITH iso_timestamps option to override humanization.
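
Non-normative sketch of the age buckets in the table. Points the table leaves open are filled with assumptions: weeks are whole days divided by seven, ages of a year or more always render as month plus year, and timestamps in the future are not handled. Month abbreviations come from `strftime('%b')`, which depends on the process locale.

```python
from datetime import datetime, timedelta, timezone


def humanize_time(ts: datetime, now: datetime) -> str:
    """Render `ts` relative to `now` following the age buckets above."""
    age = now - ts
    if age < timedelta(hours=1):
        return f"{int(age.total_seconds() // 60)}m ago"
    if age < timedelta(hours=24):
        return f"{int(age.total_seconds() // 3600)}h ago"
    if age < timedelta(days=7):
        return "yesterday" if age.days == 1 else f"{age.days}d ago"
    if age < timedelta(days=30):
        return f"{age.days // 7}w ago"      # assumption: floor to whole weeks
    if age < timedelta(days=365):
        return f"{ts.strftime('%b')} {ts.day}"
    return ts.strftime("%b %Y")             # assumption: never year-only


now = datetime(2026, 3, 3, 12, 0, tzinfo=timezone.utc)
print(humanize_time(now - timedelta(minutes=23), now))
print(humanize_time(now - timedelta(days=14), now))
```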

10.3.5 The PROJECT Clause

The PROJECT clause overrides default content projection, allowing queries to surface custom or domain-specific fields:

CAL/1 RECALL beliefs ABOUT "alice"
  PROJECT content(relation, object), attr(confidence, x_department)
  LIMIT 10 AS sml

Syntax:

PROJECT content(field, ...), attr(field, ...)

content(...) -- fields composed into text content via concatenation with space separator. Relation-type fields are passed through humanize().
attr(...) -- fields rendered as element attributes.
Fields not listed in either content() or attr() are excluded from formatted output.

Without PROJECT, the per-grain-type defaults (Section 10.3.2) apply. This is the common case.

With PROJECT, the query author has explicit control:

-- Surface domain profile fields
RECALL observations WHERE tags INCLUDE ["profile:healthcare"]
  PROJECT content(object), attr(observer, hc:patient_id, hc:encounter_id)
  AS sml
 
-- Produces:
-- <observation observer="dr-smith" hc:patient_id="P-1234" hc:encounter_id="E-567">
--   elevated heart rate detected
-- </observation>
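
Non-normative sketch of how a PROJECT clause could drive element rendering. It assumes XML-style escaping for SML output and uses grain field names directly as attribute names; neither is confirmed by this section, and the function name is hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def project(grain, grain_type, content_fields, attr_fields):
    """Render one grain as an element using explicit PROJECT field lists."""
    parts = []
    for field in content_fields:
        value = str(grain[field])
        if field == "relation":
            # Relation fields pass through humanize() (Section 10.3.3).
            value = value.split(":", 1)[-1].replace("_", " ")
        parts.append(value)
    # Fields named in attr(...) become attributes; absent fields are skipped.
    attrs = "".join(f" {name}={quoteattr(str(grain[name]))}"
                    for name in attr_fields if name in grain)
    text = escape(" ".join(parts))
    return f"<{grain_type}{attrs}>{text}</{grain_type}>"


grain = {"subject": "alice", "relation": "mg:works_at",
         "object": "acme", "confidence": 0.9}
print(project(grain, "belief", ["relation", "object"], ["confidence"]))
```

Fields that appear in neither list, such as `subject` here, are left out of the formatted output.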

10.4 Semantic Markup Language (SML)

SML is now a standalone specification. See SEMANTIC-MARKUP-LANGUAGE-SML-SPECIFICATION.md for the full SML definition, structural rules, comprehensive example, and progressive disclosure model.

SML is the default output format for the structured / sml preset. The Content Projection Model (Section 10.3) and template engine (Section 10.2) apply to SML output as defined in this CAL specification.

10.5 Template Variables

Assembly-Level Variables

Variable	Type	Description
`{{assembly.name}}`	string	Context name
`{{assembly.intent}}`	string	FOR clause text
`{{assembly.source_count}}`	integer	Number of sources
`{{assembly.grain_count}}`	integer	Total grains included
`{{budget.total}}`	integer	Total budget
`{{budget.used}}`	integer	Budget consumed
`{{budget.remaining}}`	integer	Remaining budget
`{{budget.unit}}`	string	`"tokens"` or `"grains"`
`{{budget.utilization}}`	number	0.0-1.0 utilization ratio
`{{disclosure.level}}`	string	Disclosure level
`{{timestamp}}`	string	ISO 8601 assembly timestamp

Source-Level Variables

Variable	Type	Description
`{{source.label}}`	string	Source label
`{{source.index}}`	integer	0-based position
`{{source.priority}}`	integer	1-based priority rank
`{{source.grain_count}}`	integer	Grains in this source
`{{source.tokens_used}}`	integer	Tokens consumed
`{{source.truncated}}`	boolean	Whether grains were cut for budget

Grain-Level Variables

Variable	Type	Description
`{{grain.content}}`	string	Projected text content (per Section 10.3.2 content rules)
`{{grain.type}}`	string	Grain type (used as SML element name)
`{{grain.subject}}`	string	Triple subject
`{{grain.relation}}`	string	Raw triple relation (with namespace)
`{{grain.humanized_relation}}`	string	Humanized relation (namespace stripped, underscores replaced)
`{{grain.object}}`	string	Triple object
`{{grain.confidence}}`	number	Confidence [0.0, 1.0]
`{{grain.importance}}`	number	Importance [0.0, 1.0]
`{{grain.tags}}`	string	Comma-separated tags
`{{grain.created_at}}`	string	ISO 8601 timestamp
`{{grain.relative_time}}`	string	Humanized relative time (e.g., "2h ago")
`{{grain.score}}`	number	Relevance score
`{{grain.hash}}`	string	Content address (for machine envelope use, not LLM output)
`{{#grain.is_full}}`	section	True when disclosure = full
`{{#grain.is_summary}}`	section	True when disclosure = summary

10.6 DEFINE TEMPLATE

Templates use the flat semantic model. The ELEMENT section defines how each grain renders, and elements are emitted directly without group wrappers:

CAL/1 DEFINE TEMPLATE semantic_sml
  EXTENDS structured
  HEADER {
<context intent="{{assembly.intent}}">
  }
  ELEMENT {
  <{{grain.type}} subject="{{grain.subject}}"{{#grain.confidence}} confidence="{{grain.confidence}}"{{/grain.confidence}}>{{grain.content}}</{{grain.type}}>
  }
  ELEMENT_SUMMARY {
  <{{grain.type}} subject="{{grain.subject}}">{{grain.content}}</{{grain.type}}>
  }
  SOURCE_BREAK {
 
  }
  FOOTER {
</context>
  }

Usage:

CAL/1 ASSEMBLE conversation_context
  FOR "helping alice with her project"
  FROM beliefs: (RECALL beliefs ABOUT "alice" LIMIT 20),
       goals: (RECALL goals ABOUT "alice" RECENT 5)
  BUDGET 3000 tokens
  FORMAT TEMPLATE semantic_sml

Inline templates:

FORMAT TEMPLATE {
  ELEMENT {
- [{{grain.type}}] {{grain.content}}{{#grain.confidence}} ({{grain.confidence}}){{/grain.confidence}}
  }
}

All 10 grain types rendered:

<context intent="helping alice prepare her Q1 engineering review">
 
  <belief subject="alice" confidence="0.95">prefers dark mode in all tools</belief>
  <belief subject="alice" confidence="0.88">requires keyboard shortcuts for productivity</belief>
  <belief subject="alice" confidence="0.82">works best in deep-focus blocks of 90 minutes</belief>
 
  <goal subject="alice" state="active" deadline="2026-03-15">complete Q1 engineering review presentation</goal>
  <goal subject="alice" state="active">reduce P0 incident rate by 20% in Q2</goal>
 
  <event role="user" time="10m ago">Can you help me pull together the Q1 metrics?</event>
  <event role="assistant" time="10m ago">Sure — retrieving deployment counts, incident data, and velocity now.</event>
  <event role="user" time="8m ago">Focus on the reliability numbers first.</event>
 
  <action tool="query_metrics" phase="completed">retrieved 47 deployments and 3 P0 incidents for Q1 2026</action>
  <action tool="search_docs" phase="completed">found Q1 review template in confluence/engineering/reviews</action>
 
  <observation observer="system">alice opened incident-dashboard at 09:14 UTC</observation>
  <observation observer="system" source="calendar">Q1 review presentation scheduled for 2026-03-15 14:00 UTC</observation>
 
  <reasoning type="deductive">alice is prioritising reliability given 3 P0 incidents; lead with incident reduction narrative</reasoning>
  <reasoning type="abductive">low velocity in week 8 likely caused by the infra migration; flag as contextual outlier</reasoning>
 
  <state context="q1_review_prep">outlining slides: 1. headline metrics  2. incident retrospective  3. velocity trend  4. Q2 goals</state>
 
  <workflow trigger="review_prep_requested">1. retrieve Q1 metrics  2. identify narrative arc  3. draft slide outline  4. populate data  5. send for review by 2026-03-14</workflow>
 
  <consensus threshold="3" count="4">Q1 deployment frequency improved 18% over Q4 2025</consensus>
 
  <consent action="grant" grantor="alice" grantee="agent">access engineering metrics dashboards for review preparation</consent>
 
</context>

10.7 Template Inheritance

Templates inherit from presets via EXTENDS. Sections not defined in the template use the parent preset's definition. Inheritance depth is limited to 1 (template -> preset only). Default parent is readable.

The data preset cannot be extended (it outputs structural JSON, not template-driven text).

10.8 Template Safety Model

Templates are rendering instructions, not programs:

No file system access, no code execution, no network requests
No access to environment variables or other namespaces
Undefined variables render as empty string
{{#each}} capped at 200 iterations
Output bounded by budget * 2 characters
Validated at definition time for syntax, known variables, section balance, and size

Template Constraints:

Constraint	Limit
Max template body size	4096 bytes
Max templates per namespace	50
Max nesting depth	5 levels
Max `{{#each}}` iterations	200
Template name length	64 characters
Inheritance depth	1
Variable set	Closed
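
Non-normative sketch of part of the definition-time validation: name length, body size, section balance, and nesting depth. Checking against the closed variable set is omitted, and the function name and problem strings are hypothetical.

```python
import re

# Matches {{name}}, {{#name}}, {{^name}}, {{/name}} and {{! comment }}.
TAG = re.compile(r"\{\{\s*([#^/!]?)\s*([^}]*?)\s*\}\}")


def check_template(name, body):
    """Return a list of problems with a template; empty means acceptable."""
    problems = []
    if len(name) > 64:
        problems.append("template name longer than 64 characters")
    if len(body.encode("utf-8")) > 4096:
        problems.append("template body larger than 4096 bytes")
    stack = []
    for kind, ident in TAG.findall(body):
        if kind in ("#", "^"):            # section or inverted section opens
            stack.append(ident)
            if len(stack) == 6:
                problems.append("nesting deeper than 5 levels")
        elif kind == "/":                 # section closes
            if not stack or stack.pop() != ident:
                problems.append(f"unbalanced section close: {ident}")
    if stack:
        problems.append("unclosed section(s): " + ", ".join(stack))
    return problems


print(check_template("demo", "{{#grain.confidence}}({{grain.confidence}})"))
```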

10.9 TOON — Token-Oriented Object Notation

10.9.1 What is TOON?

TOON (Token-Oriented Object Notation) is a compact, LLM-native encoding format defined by the TOON specification (v3.0). It combines:

YAML-like indentation for nested or non-uniform objects
CSV-style tabular layout for uniform arrays of objects

For CAL's primary output shape — arrays of grains of the same type — TOON's tabular mode achieves approximately 40% fewer tokens compared to JSON while preserving full semantic fidelity. The same content projection rules from Section 10.3 apply: humanize(), time humanization, and per-grain-type content rules all carry through.

TOON is complementary to SML, not a replacement:

Property	SML	TOON
Semantic tag names	Yes (`<belief>`, `<goal>`, …)	No — grain type in section header only
Token efficiency	Moderate	High (~40% fewer vs JSON)
Uniform arrays	One element per line	CSV table — optimal
Mixed grain types	Natural (each type has its own tag)	Grouped sections
LLM-native	Yes	Yes
Best for	Rich context with clear epistemic signals	Large result sets, tight budgets

10.9.2 When to Use TOON

Prefer FORMAT toon / AS toon when:

Large RECALL result sets — uniform grain arrays of 20+ grains where token savings matter.
Tight ASSEMBLE budgets — when the BUDGET clause is at or near the limit of available context.
Homogeneous source queries — ASSEMBLE sources that each contain a single grain type.

Prefer SML when:

The LLM must make epistemic decisions per grain (trust calibration based on tag name).
Mixed grain types appear within a single source without logical grouping.
The downstream prompt system is tuned for <tag> signals.

10.9.3 TOON Rendering Rules for RECALL Results

For a RECALL result returning N grains of a single type, the TOON output is a root-level array document. The first line is the TOON array header (detected as root array by §5 of the TOON spec); rows follow at depth 0 (no indentation):

type[N]{col1,col2,...}:
value1,value2,...
value1,value2,...
...

Where:

type is the grain type (plural form, lowercase): beliefs, events, goals, etc.
[N] is the count of rows.
{col1,col2,...} are the projected field names.
The trailing : on the header is required by the TOON grammar (header = [key] bracket-seg [fields-seg] ":").
Rows follow at depth 0 — no indentation — because the array is the root document.

Column set for each grain type (same content projection as Section 10.3.2):

Grain Type	Columns (standard disclosure)
`beliefs`	`subject`, `content`, `confidence`
`events`	`role`, `time`, `content`
`goals`	`subject`, `content`, `state`, `deadline`
`actions`	`tool`, `phase`, `content`
`observations`	`observer`, `content`
`reasonings`	`type`, `content`
`states`	`context`, `content`
`workflows`	`trigger`, `content`
`consensuses`	`threshold`, `count`, `content`
`consents`	`grantor`, `grantee`, `action`, `content`

At summary disclosure, confidence, state, phase, and type columns are omitted. At full disclosure, additional columns source and observed are appended.

Example — RECALL beliefs ABOUT "alice" LIMIT 3 AS toon:

beliefs[3]{subject,content,confidence}:
alice,prefers dark mode,0.95
alice,prefers vim,0.9
alice,works best in deep-focus blocks of 90 minutes,0.82

Example — RECALL events WHERE user_id = "alice" RECENT 3 AS toon:

events[3]{role,time,content}:
user,10m ago,Can you help me pull together the Q1 metrics?
assistant,10m ago,Sure — retrieving deployment counts and incident data.
user,8m ago,Focus on the reliability numbers first.

String quoting. A value in a tabular row MUST be double-quoted if it contains the active delimiter (comma by default), has leading or trailing whitespace, is empty, matches a reserved literal (true, false, null), matches a numeric pattern, starts with a hyphen, or contains any of :, ", \, [, ], {, }, or control characters. Only five escape sequences are valid inside quoted strings: \\, \", \n, \r, \t. \u escapes are not permitted; Unicode characters appear as literal UTF-8. Implementations MUST apply humanize() to relation fields and time humanization to timestamp fields, identical to SML.

Number canonicalization. Numeric values (confidence, importance, scores) MUST be emitted in canonical decimal form: no exponent notation, no leading zeros except 0 itself, no trailing fractional zeros (0.90 → 0.9, 1.5000 → 1.5). NaN and ±Infinity map to null.
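
Non-normative sketch of the RECALL rendering rules: header line, string quoting, and number canonicalization. The numeric pattern used to decide quoting is an approximation, key quoting in the header is not handled, and control characters other than newline, carriage return, and tab are assumed not to occur. The TOON specification is authoritative.

```python
import math
import re
from decimal import Decimal

NUMERIC = re.compile(r"^-?\d+(\.\d+)?([eE][+-]?\d+)?$")   # approximation
SPECIAL = re.compile(r'[:"\\\[\]{}\x00-\x1f]')
ESCAPES = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}


def canonical_number(x):
    """Plain decimal form: no exponent, no trailing fractional zeros."""
    if isinstance(x, float) and (math.isnan(x) or math.isinf(x)):
        return "null"
    text = format(Decimal(repr(x)), "f")
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return "0" if text == "-0" else text


def quote(value, delimiter=","):
    """Double-quote a string cell when the quoting rules require it."""
    needs_quotes = (
        value == ""
        or value != value.strip()
        or delimiter in value
        or value in ("true", "false", "null")
        or NUMERIC.match(value) is not None
        or value.startswith("-")
        or SPECIAL.search(value) is not None
    )
    if not needs_quotes:
        return value
    return '"' + "".join(ESCAPES.get(ch, ch) for ch in value) + '"'


def cell(value):
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return canonical_number(value)
    return quote(str(value))


def render_recall(grain_type, columns, rows):
    """Root-array TOON document: header line, then one row per grain."""
    header = f"{grain_type}[{len(rows)}]{{{','.join(columns)}}}:"
    lines = [",".join(cell(row.get(c)) for c in columns) for row in rows]
    return "\n".join([header, *lines])


print(render_recall("beliefs", ["subject", "content", "confidence"], [
    {"subject": "alice", "content": "prefers dark mode", "confidence": 0.95},
    {"subject": "alice", "content": "prefers vim, not emacs", "confidence": 0.90},
]))
```

The second row's content contains the delimiter, so it is emitted in double quotes, and 0.90 is emitted as 0.9.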

10.9.4 TOON Rendering Rules for ASSEMBLE Results

For an ASSEMBLE result, the TOON output is a root-level object document. The first line is a metadata key-value pair, which causes the TOON parser to detect root form as "object" (per §5 of the TOON spec). Named grain-type arrays are then properties of that object; their tabular rows are indented 2 spaces (depth+1):

context: <context_name>
intent: <for_clause_text>
tokens: <used>/<total>
<grain_type>[N]{col1,col2,...}:
  row1_val1,row1_val2,...
  row2_val1,row2_val2,...
<grain_type>[N]{col1,col2,...}:
  row1_val1,row1_val2,...
  ...

Rules:

The metadata header uses key: value format (colon-space separator).
The trailing : on every array header is required by the TOON grammar.
Tabular rows are indented 2 spaces because they are named properties of the root object.
No blank lines between tabular rows within a group; one blank line between groups.
Source labels are omitted from the output (ASSEMBLE TOON is grain-group-centric). To expose source attribution, use FORMAT sml.
Within a group, all grains MUST be of the same type. If a source returns mixed types, the executor MUST split them into separate same-type groups.
Groups are ordered by priority (highest priority first, matching the PRIORITY clause).

Example — ASSEMBLE FORMAT toon:

context: agent_context
intent: helping alice prepare her Q1 engineering review
tokens: 1847/2000
beliefs[3]{subject,content,confidence}:
  alice,prefers dark mode,0.95
  alice,prefers vim,0.9
  alice,works best in deep-focus blocks of 90 minutes,0.82
goals[2]{subject,content,state,deadline}:
  alice,complete Q1 engineering review,active,2026-03-15
  alice,reduce P0 incident rate by 20% in Q2,active,null
events[3]{role,time,content}:
  user,10m ago,Can you help me pull together the Q1 metrics?
  assistant,10m ago,Sure — retrieving deployment counts and incident data.
  user,8m ago,Focus on the reliability numbers first.

10.9.5 Auto-TOON (Budget Pressure Hint)

When no explicit FORMAT is specified and all of the following conditions hold, implementations MAY automatically select toon as the output format instead of the default format:

A BUDGET clause is present.
Estimated token utilization (from the EXPLAIN plan) exceeds 85% of the budget.
All ASSEMBLE sources return a single grain type each (enabling full tabular mode).

When auto-TOON activates, the response MUST include a warning:

{ "code": "CAL-W005", "message": "FORMAT auto-selected as toon due to budget pressure (>85% utilization estimate). Specify FORMAT explicitly to suppress this warning." }

Auto-TOON is opt-in at the server level. Servers report whether it is active via DESCRIBE capabilities (auto_toon_enabled).
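
Non-normative sketch of the auto-TOON decision. The parameter names are hypothetical; `source_type_counts` stands for the number of distinct grain types each ASSEMBLE source returns.

```python
def should_auto_toon(server_enabled, explicit_format, budget,
                     estimated_tokens, source_type_counts):
    """Decide whether the server may switch the output format to toon."""
    if not server_enabled:             # opt-in at the server level
        return False
    if explicit_format is not None:    # an explicit FORMAT always wins
        return False
    if budget is None:                 # a BUDGET clause must be present
        return False
    # Utilization must exceed 85%; integer maths avoids float rounding.
    if estimated_tokens * 100 <= budget * 85:
        return False
    # Every source must return a single grain type (full tabular mode).
    return all(count == 1 for count in source_type_counts)


print(should_auto_toon(True, None, 2000, 1800, [1, 1, 1]))
```

When the function returns True, the response also has to carry the CAL-W005 warning shown above.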

10.9.6 TOON and PROJECT

The PROJECT clause (Section 10.3.5) works with TOON. The projected fields become the TOON column headers:

CAL/1 RECALL observations WHERE tags INCLUDE ["profile:healthcare"]
  | PROJECT content(object), attr(hc:patient_id, hc:encounter_id)
  | LIMIT 10 AS toon

Output (root array — RECALL result):

observations[2]{content,hc:patient_id,hc:encounter_id}:
elevated heart rate detected,P-1234,E-567
blood pressure within normal range,P-1235,E-568

10.9.7 TOON and Streaming

TOON output is compatible with the STREAM protocol (Section 11). When streaming TOON, each chunk event (Section 11.1) carries one or more complete CSV rows, never a partial row. The metadata header is emitted in the first chunk.
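
A minimal sketch of row-aligned chunking follows. It assumes a hypothetical token estimator (`rough_token_count`, roughly four characters per token); CAL does not define how tokens are counted, so a real server would substitute its own tokenizer.

```python
from typing import Callable, Iterator, List


def rough_token_count(text: str) -> int:
    """Hypothetical estimator (about 4 characters per token); not defined by CAL."""
    return max(1, len(text) // 4)


def chunk_rows(
    header_lines: List[str],
    rows: List[str],
    chunk_size: int = 100,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> Iterator[str]:
    """Yield TOON chunks that always end on a row boundary."""
    current: List[str] = list(header_lines)  # the header travels in the first chunk
    used = sum(count_tokens(line) for line in current)
    rows_in_chunk = 0
    for row in rows:
        cost = count_tokens(row)
        # Flush before a row that would overshoot the target, but only when the
        # chunk already holds at least one row; the row itself is never split.
        if rows_in_chunk > 0 and used + cost > chunk_size:
            yield "\n".join(current)
            current, used, rows_in_chunk = [], 0, 0
        current.append(row)
        used += cost
        rows_in_chunk += 1
    if current:
        yield "\n".join(current)
```

Because `chunk_size` is a target rather than a hard cap, a single oversized row still goes out whole in its own chunk.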

10.9.8 TOON Wire Format in application/json+cal

In application/json+cal, the formatted TOON output is a plain string in formatted_context.text (for ASSEMBLE) or formatted (for RECALL). The media type annotation uses "format": "toon".

11. Streaming Protocol

11.1 Event Types

Streaming ASSEMBLE uses a typed event stream. Events are delivered in causal order.

Event Type	Phase	Description
`assembly_started`	Init	Stream opened, assembly ID assigned
`source_started`	Source Resolution	A RECALL query has begun
`source_completed`	Source Resolution	A RECALL query has finished
`dedup_completed`	Deduplication	Cross-source dedup finished
`budget_allocated`	Budget Allocation	Token budget distributed
`disclosure_decided`	Progressive Disclosure	Disclosure levels assigned
`chunk`	Formatting	A chunk of formatted output
`assembly_completed`	Done	All phases complete
`error`	Any	An error occurred
`cancelled`	Any	Stream cancelled

Ordering invariant:

assembly_started
  -> source_started(s1) -> source_completed(s1)
  -> source_started(s2) -> source_completed(s2)
  -> ...                                          (sources may interleave)
  -> dedup_completed
  -> budget_allocated
  -> chunk(1) -> chunk(2) -> ... -> chunk(n)
  -> assembly_completed
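
A client can check a received stream against this invariant. The sketch below is one possible validator, not a normative algorithm: the `PHASE` ranks are derived from the event table in Section 11.1, `disclosure_decided` is assumed to sit between `budget_allocated` and the first `chunk`, and the event dictionaries with a `source` key are a hypothetical shape.

```python
from typing import Dict, List

# Phase rank per event type; source events share a rank because they may interleave.
PHASE = {
    "assembly_started": 0,
    "source_started": 1,
    "source_completed": 1,
    "dedup_completed": 2,
    "budget_allocated": 3,
    "disclosure_decided": 4,  # assumed position, taken from the table order
    "chunk": 5,
    "assembly_completed": 6,
}


def check_order(events: List[Dict]) -> List[str]:
    """Return ordering violations; an empty list means the stream is well-formed."""
    problems: List[str] = []
    if not events or events[0]["type"] != "assembly_started":
        problems.append("stream must open with assembly_started")
    started = set()
    highest = -1
    for index, event in enumerate(events):
        kind = event["type"]
        if kind in ("error", "cancelled"):
            continue  # these may occur in any phase
        rank = PHASE.get(kind)
        if rank is None:
            problems.append(f"event {index}: unknown type {kind!r}")
            continue
        if rank < highest:
            problems.append(f"event {index}: {kind} arrived after a later phase")
        highest = max(highest, rank)
        if kind == "source_started":
            started.add(event["source"])
        elif kind == "source_completed" and event["source"] not in started:
            problems.append(f"event {index}: source_completed without source_started")
    return problems
```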

11.2 STREAM Clause

ASSEMBLE user_context
  FROM beliefs: (RECALL beliefs ABOUT "alice")
  BUDGET 2000 tokens
  STREAM { all }                              -- all events
  -- or: STREAM { progress, chunks }          -- specific events
  -- or: STREAM { all, chunk_size = 200 }     -- custom chunk size
  -- or: STREAM                               -- bare = all events

Option	Events Emitted
`progress`	assembly_started, source_started, source_completed, assembly_completed
`budget`	dedup_completed, budget_allocated, disclosure_decided
`chunks`	chunk (formatted output)
`all`	All of the above
`chunk_size = N`	Target tokens per chunk (default 100, min 20, max 1000)

error and cancelled events are ALWAYS emitted regardless of options.
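
The option table can be read as a mapping from options to event sets. The following sketch resolves a STREAM clause into the set of event types to emit plus the effective chunk size. The function name is hypothetical, and rejecting an out-of-range `chunk_size` (rather than clamping it) is an assumption; the text above only states the bounds.

```python
from typing import Iterable, Optional, Set, Tuple

OPTION_EVENTS = {
    "progress": {"assembly_started", "source_started", "source_completed", "assembly_completed"},
    "budget": {"dedup_completed", "budget_allocated", "disclosure_decided"},
    "chunks": {"chunk"},
}
ALWAYS_EMITTED = {"error", "cancelled"}
DEFAULT_CHUNK_SIZE, MIN_CHUNK_SIZE, MAX_CHUNK_SIZE = 100, 20, 1000


def resolve_stream(
    options: Iterable[str] = (), chunk_size: Optional[int] = None
) -> Tuple[Set[str], int]:
    """Return (event types to emit, chunk size) for a STREAM clause."""
    selected = set(options) or {"all"}  # a bare STREAM means all events
    events = set(ALWAYS_EMITTED)
    for option in selected:
        if option == "all":
            for group in OPTION_EVENTS.values():
                events |= group
        elif option in OPTION_EVENTS:
            events |= OPTION_EVENTS[option]
        else:
            raise ValueError(f"unknown STREAM option: {option}")
    size = DEFAULT_CHUNK_SIZE if chunk_size is None else chunk_size
    if not MIN_CHUNK_SIZE <= size <= MAX_CHUNK_SIZE:
        # Assumption: out-of-range sizes are rejected rather than clamped.
        raise ValueError(f"chunk_size must be between {MIN_CHUNK_SIZE} and {MAX_CHUNK_SIZE}")
    return events, size
```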

11.3 Transport Bindings

SSE (Server-Sent Events) -- RECOMMENDED

POST /memories/{id}/cal HTTP/1.1
Content-Type: application/json+cal
Accept: text/event-stream
 
event: assembly_started
data: {"type":"assembly_started","assembly_id":"asm_a1b2c3d4",...}
 
event: chunk
data: {"type":"chunk","chunk_index":0,"content":"## Context...","tokens":18,...}
 
event: assembly_completed
data: {"type":"assembly_completed","summary":{...}}
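
On the client side, the SSE body can be decoded with a few lines of standard-library code. The sketch below handles only the `event:` and `data:` fields used in the example above and assumes each `data` payload is a complete JSON object; `iter_sse_events` is a hypothetical helper name, and the payloads in the usage note are abbreviated.

```python
import json
from typing import Dict, Iterable, Iterator, List, Optional


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield {'event': ..., 'data': ...} for each blank-line-terminated SSE event."""
    event_type: Optional[str] = None
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                yield {
                    "event": event_type or "message",
                    "data": json.loads("\n".join(data_lines)),
                }
            event_type, data_lines = None, []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is not data
        if name == "event":
            event_type = value
        elif name == "data":
            data_lines.append(value)
    # An event not followed by a blank line is incomplete and is dropped.
```

Usage: `for ev in iter_sse_events(response_lines): handle(ev["event"], ev["data"])`, where `response_lines` is any iterable of decoded text lines from the HTTP response.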

NDJSON (Fallback)

POST /memories/{id}/cal HTTP/1.1
Accept: application/x-ndjson
 
{"type":"assembly_started","assembly_id":"asm_a1b2c3d4",...}
{"type":"chunk","chunk_index":0,...}
{"type":"assembly_completed",...}

WebSocket

Full-duplex with explicit pause/resume/cancel:

{"action": "assemble", "request_id": "req_001", "payload": {...}}
{"action": "cancel", "assembly_id": "asm_a1b2c3d4"}
{"action": "pause", "assembly_id": "asm_a1b2c3d4"}
{"action": "resume", "assembly_id": "asm_a1b2c3d4"}

11.4 Progressive Budget Updates

Budget information is refined through the streaming phases: assembly_started (total known), source_completed (per-source estimates), budget_allocated (final allocation), chunk (running countdown via budget_remaining), assembly_completed (final utilization).

If a source fails, its allocated budget is redistributed and a revised budget_allocated event is emitted.

11.5 Cancellation

HTTP: Client closes connection. Server also supports DELETE /memories/{id}/cal/stream/{assembly_id}.
WebSocket: Client sends {"action": "cancel", "assembly_id": "..."}.
Cancellation is best-effort. Partial results are valid and usable.
Cancellation MUST be recorded in the audit trail.

11.6 Backpressure

SSE: TCP-level flow control. Server buffers max 64KB unsent events. Stall timeout: 30 seconds.
WebSocket: Explicit pause/resume actions. Chunk emission pauses; progress events continue.

Streaming Constraints:

Constraint	Limit
Max concurrent streams per client	3
Max event buffer per stream	64 KB
Stream reconnection window	10 s
Min chunk_size	20 tokens
Max chunk_size	1000 tokens
Default chunk_size	100 tokens
Backpressure stall timeout	30 s

12. Domain Profile Querying

OMS defines domain profiles (healthcare, legal, finance, robotics, science, consumer, integration). CAL provides structured access to domain-tagged grains.

12.1 Profile Querying via Tags

RECALL WHERE tags INCLUDE ["profile:healthcare"]
RECALL beliefs WHERE tags INCLUDE ["profile:healthcare"]
  AND subject = "patient:P-12345" AND relation IS PREFERENCE

12.2 Domain-Prefixed Fields

Domain-specific fields use OMS domain prefix convention:

Domain	Prefix	Example Fields
Healthcare	`hc:`	`hc:patient_id`, `hc:encounter_id`, `hc:provider_id`, `hc:condition_code`, `hc:phi_category`
Legal	`legal:`	`legal:case_id`, `legal:jurisdiction`, `legal:privilege_status`, `legal:retention_category`
Finance	`fin:`	`fin:account_id`, `fin:transaction_id`, `fin:risk_category`, `fin:compliance_flag`
Robotics	`rob:`	`rob:device_id`, `rob:coordinate_frame`, `rob:safety_zone`
Science	`sci:`	`sci:experiment_id`, `sci:dataset_id`, `sci:methodology`, `sci:reproducibility_status`
Consumer	`con:`	`con:session_context`, `con:interaction_channel`
Integration	`int:`	`int:source_system`, `int:correlation_id`, `int:sync_status`

Example:

RECALL beliefs WHERE tags INCLUDE ["profile:healthcare"]
  AND hc:patient_id = "P-12345"
  AND hc:condition_code IN ("J06.9", "J20.9")
  AND relation = "mg:knows"
  | ORDER BY time DESC | LIMIT 20

The parser SHOULD emit warning CAL-W002 if a domain field is used without the corresponding profile: tag.
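
A parser-side check for these warnings might look like the sketch below. The prefix table comes from Section 12.2; the assumption that each profile tag is spelled `profile:<domain name in lowercase>` is inferred from the `profile:healthcare` examples, and `domain_warnings` is a hypothetical helper.

```python
from typing import Dict, Iterable, List

# Prefix -> profile name. The profile spellings other than "healthcare" are assumed.
PREFIX_TO_PROFILE = {
    "hc": "healthcare",
    "legal": "legal",
    "fin": "finance",
    "rob": "robotics",
    "sci": "science",
    "con": "consumer",
    "int": "integration",
}


def domain_warnings(fields: Iterable[str], tags: Iterable[str]) -> List[Dict[str, str]]:
    """Return CAL-W002 / CAL-W003 warnings for the fields used in a WHERE clause."""
    profiles = {tag.split(":", 1)[1] for tag in tags if tag.startswith("profile:")}
    warnings: List[Dict[str, str]] = []
    for name in fields:
        prefix, sep, _ = name.partition(":")
        if not sep:
            continue  # core field such as subject or confidence
        profile = PREFIX_TO_PROFILE.get(prefix)
        if profile is None:
            warnings.append({"code": "CAL-W003", "message": f"Unknown domain prefix: {prefix}:"})
        elif profile not in profiles:
            warnings.append({
                "code": "CAL-W002",
                "message": f"Field {name} used without tag profile:{profile}",
            })
    return warnings
```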

13. Store Protocol Mapping

Every CAL statement maps to one or more OMS Store Protocol operations (OMS §28.4). This mapping is deterministic.

CAL Statement	Min Store Ops	Max Store Ops	Operations
RECALL	1	1	`query` or `search`
EXISTS	1	1	`exists`
HISTORY (hash)	1	101	`get` + chain walk
HISTORY (triple)	1	1	`query(include_superseded=true)`
EXPLAIN	0	0	Compile-time only
ADD	1	1	`put`
SUPERSEDE	2	3	`get` + `supersede`
REVERT	3	4	`get` + `get` + `supersede`
Set operation	2	2	One query per operand
ASSEMBLE	N	N	N x `query`/`search` (one per source)

14. Response Model

14.1 Machine Envelope

Every CAL response includes a _cal metadata block:

{
  "_cal": {
    "version": "1.0",
    "statement_type": "recall",
    "tier": 0,
    "query_hash": "sha256:...",
    "duration_ms": 42,
    "budget": {
      "tokens_used": 1847,
      "grains_returned": 8,
      "grains_scanned": 156
    }
  },
  "results": [...],
  "total": 42,
  "next_cursor": "cursor:eyJ..."
}

14.2 LLM Content Layer

A formatted representation for direct insertion into LLM context windows. The content layer uses the Content Projection Model (Section 10.3) to transform grain fields into natural language with lightweight structural hints.

SML format (default for structured / sml):

<context intent="helping alice with project">
 
  <belief subject="alice" confidence="0.92">prefers dark mode</belief>
  <belief subject="alice" confidence="0.88">requires keyboard shortcuts</belief>
 
  <goal subject="alice" state="active">complete Q1 review</goal>
 
</context>

Markdown format (default for readable / markdown):

## Context: helping alice with project
 
**Beliefs**
- alice prefers dark mode (confidence: 0.92)
- alice requires keyboard shortcuts (confidence: 0.88)
 
**Goals**
- alice: complete Q1 review (active)

Compact format (for compact / text):

[belief] alice prefers dark mode (0.92)
[belief] alice requires keyboard shortcuts (0.88)
[goal] alice: complete Q1 review (active)

The machine envelope (Section 14.1) carries hashes, namespaces, full timestamps, and other storage metadata. These MUST NOT appear in the LLM content layer.

14.3 Progressive Disclosure

Level	Metadata Density	When Used
`summary`	Tag name + subject + content only	Token budget tight (<1000 tokens)
`standard`	+ confidence, role, state, time	Default
`full`	+ source_type, importance, tags, verification_status	Token budget generous or LIMIT <= 5

Progressive disclosure controls metadata density on a flat structure, not nesting depth. The element shape stays the same across all levels -- only the number of attributes changes.
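
To illustrate that only the attribute set changes between levels, here is a sketch of an sml element renderer. The attribute lists follow the table above; the input grain dictionary (with `type` and `content` keys) and the function name are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr

SUMMARY = ["subject"]
STANDARD = SUMMARY + ["confidence", "role", "state", "time"]
FULL = STANDARD + ["source_type", "importance", "tags", "verification_status"]
LEVEL_ATTRS = {"summary": SUMMARY, "standard": STANDARD, "full": FULL}


def _attr_text(value) -> str:
    """Render list values (such as tags) as a comma-separated string."""
    if isinstance(value, (list, tuple)):
        return ",".join(str(item) for item in value)
    return str(value)


def render_sml(grain: dict, level: str = "standard") -> str:
    """Render one grain as a flat sml element at the given disclosure level."""
    tag = grain["type"]
    attrs = "".join(
        f" {name}={quoteattr(_attr_text(grain[name]))}"
        for name in LEVEL_ATTRS[level]
        if grain.get(name) is not None  # absent attributes are simply omitted
    )
    return f"<{tag}{attrs}>{escape(grain['content'])}</{tag}>"
```

For a belief about alice with confidence 0.92, the `standard` level yields `<belief subject="alice" confidence="0.92">prefers dark mode</belief>`, and `summary` drops the confidence attribute while keeping the same element shape.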

14.4 Per-Grain-Type Content Projection

Each grain type projects its fields into a text content string and attribute set using the rules defined in Section 10.3.2. The following table shows the projected output for each type:

Grain Type	Projected Text Content	Example Output (sml)
Belief	`humanize(relation) + " " + object`	`<belief subject="alice" confidence="0.95">prefers dark mode in all tools</belief>`
Event	`content`	`<event role="user" time="10m ago">Can you help me pull together the Q1 metrics?</event>`
Goal	`object`	`<goal subject="alice" state="active" deadline="2026-03-15">complete Q1 engineering review presentation</goal>`
Action	`object` (tool result)	`<action tool="query_metrics" phase="completed">retrieved 47 deployments and 3 P0 incidents for Q1 2026</action>`
Observation	`object`	`<observation observer="system">alice opened incident-dashboard at 09:14 UTC</observation>`
Reasoning	`conclusion`	`<reasoning type="deductive">alice is prioritising reliability given 3 P0 incidents; lead with incident reduction narrative</reasoning>`
State	`plan` summary	`<state context="q1_review_prep">outlining slides: 1. headline metrics 2. incident retrospective 3. velocity trend 4. Q2 goals</state>`
Workflow	`steps` joined	`<workflow trigger="review_prep_requested">1. retrieve Q1 metrics 2. identify narrative arc 3. draft slide outline 4. populate data 5. send for review by 2026-03-14</workflow>`
Consensus	`object`	`<consensus threshold="3" count="4">Q1 deployment frequency improved 18% over Q4 2025</consensus>`
Consent	`purpose`	`<consent action="granted" grantor="alice" grantee="agent">access engineering metrics dashboards for review preparation</consent>`

The PROJECT clause (Section 10.3.5) overrides these defaults when custom or domain-specific fields must be surfaced.

15. Dual Wire Format

15.1 Media Types

Format	Media Type	Use Case
Text	`text/cal`	LLM generation, human authoring, documentation
JSON	`application/json+cal`	Programmatic construction, structured output

15.2 Bijective Mapping

Every valid CAL statement has exactly one representation in each format, and conversion between them is lossless.

text/cal:

CAL/1 RECALL beliefs ABOUT "alice" WHERE confidence >= 0.8 RECENT 5 AS markdown

application/json+cal:

{
  "cal_version": 1,
  "statement": "recall",
  "grain_type": "beliefs",
  "about": "alice",
  "where": [{ "field": "confidence", "op": ">=", "value": 0.8 }],
  "recent": 5,
  "as": "markdown"
}

15.3 Round-Trip Guarantee

parse(serialize(parse(text))) == parse(text) and serialize(parse(serialize(json))) == serialize(json). Whitespace may differ; semantic content is identical.
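
The round-trip property can be exercised with a toy converter. The sketch below covers only a tiny subset of the grammar (RECALL with optional ABOUT, a single numeric WHERE comparison, RECENT, and AS); it is not the normative grammar from Section 4, and `parse` and `serialize` here are illustrative stand-ins for a real implementation.

```python
import re

_TOY_RECALL = re.compile(
    r'^CAL/1\s+RECALL\s+(?P<type>\w+)'
    r'(?:\s+ABOUT\s+"(?P<about>[^"]*)")?'
    r'(?:\s+WHERE\s+(?P<field>\w+)\s*(?P<op>>=|<=|=|>|<)\s*(?P<value>[\d.]+))?'
    r'(?:\s+RECENT\s+(?P<recent>\d+))?'
    r'(?:\s+AS\s+(?P<fmt>\w+))?\s*$'
)


def parse(text: str) -> dict:
    """text/cal -> application/json+cal, for the toy subset only."""
    m = _TOY_RECALL.match(text.strip())
    if m is None:
        raise ValueError("statement is outside the toy subset")
    stmt = {"cal_version": 1, "statement": "recall", "grain_type": m["type"]}
    if m["about"] is not None:
        stmt["about"] = m["about"]
    if m["field"] is not None:
        stmt["where"] = [{"field": m["field"], "op": m["op"], "value": float(m["value"])}]
    if m["recent"] is not None:
        stmt["recent"] = int(m["recent"])
    if m["fmt"] is not None:
        stmt["as"] = m["fmt"]
    return stmt


def serialize(stmt: dict) -> str:
    """application/json+cal -> canonical text/cal, for the toy subset only."""
    parts = ["CAL/1 RECALL", stmt["grain_type"]]
    if "about" in stmt:
        parts.append(f'ABOUT "{stmt["about"]}"')
    if "where" in stmt:
        cond = stmt["where"][0]  # the toy subset carries a single condition
        parts.append(f'WHERE {cond["field"]} {cond["op"]} {cond["value"]}')
    if "recent" in stmt:
        parts.append(f'RECENT {stmt["recent"]}')
    if "as" in stmt:
        parts.append(f'AS {stmt["as"]}')
    return " ".join(parts)
```

Extra whitespace in the text form disappears after one pass through `serialize(parse(...))`, which is the sense in which whitespace may differ while semantic content stays identical.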

15.4 Content Negotiation

Standard HTTP content negotiation applies. The Accept header controls response format. If absent, response format matches request format.

15.5 JSON Schema

The JSON format has published JSON Schemas (draft 2020-12):

Request: https://cal-spec.org/schema/v1/cal-request.schema.json
Response: https://cal-spec.org/schema/v1/cal-response.schema.json

Implementations MUST validate incoming application/json+cal against the schema before execution.

16. Internationalization

16.1 Character Encoding

CAL queries and responses MUST be UTF-8 encoded. Invalid UTF-8 sequences produce error CAL-E070: InvalidUTF8.

16.2 Unicode Normalization

All string comparisons operate on NFC-normalized text. Implementations MUST NFC-normalize string literals at parse time and stored grain content at write time.

16.3 Bidirectional Text (Bidi)

Grain content is stored in logical order. CAL rejects string literals containing bidi override characters (U+202A-U+202E, U+2066-U+2069) to prevent bidi-based spoofing attacks (error CAL-E071: BidiOverrideRejected).
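
The encoding, normalization, and bidi rules from Sections 16.1 through 16.3 can be combined into a small input-hardening step. This is a sketch using Python's standard `unicodedata` module; the `CalError` class and function names are hypothetical, while the error codes are the ones named above.

```python
import unicodedata

# U+202A..U+202E and U+2066..U+2069, the ranges rejected in Section 16.3.
BIDI_CONTROLS = {chr(cp) for cp in range(0x202A, 0x202F)} | {
    chr(cp) for cp in range(0x2066, 0x206A)
}


class CalError(Exception):
    """Hypothetical exception type carrying a CAL error code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


def decode_query(raw: bytes) -> str:
    """Decode a query as UTF-8, mapping failures to CAL-E070."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise CalError("CAL-E070", "InvalidUTF8") from exc


def normalize_literal(value: str) -> str:
    """Reject bidi control characters, then NFC-normalize a string literal."""
    if any(ch in BIDI_CONTROLS for ch in value):
        raise CalError("CAL-E071", "BidiOverrideRejected")
    return unicodedata.normalize("NFC", value)
```

For example, `normalize_literal("cafe\u0301")` returns the precomposed form `"caf\u00e9"`, so both spellings compare equal after parsing.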

16.4 Cross-Lingual Search

When query = "...", LIKE "...", or ABOUT "..." triggers semantic search, the search SHOULD work across languages when multilingual embeddings are available. Cross-lingual search is REQUIRED at Extended conformance level.

Implementations MUST declare cross-lingual capability in DESCRIBE capabilities:

{
  "cross_lingual_search": true,
  "embedding_model": "multilingual-e5-large",
  "supported_languages": ["en", "es", "fr", "de", "ja", "zh", "ar"]
}

16.5 Locale-Aware Sorting

Default: Unicode code point order (binary sort). Locale-aware sorting requested via WITH locale("xx"):

RECALL beliefs ABOUT "alice" | ORDER BY object ASC WITH locale("de")

Locale-aware sorting is optional. Implementations that do not support it MUST ignore the locale() option with a warning.

16.6 Identifier Safety

Field names and keywords are ASCII-only, never subject to Unicode normalization. This prevents confusion attacks with visually similar Unicode characters.

17. Execution Model

17.1 Query Pipeline (Tier 0)

CAL String
    |
    v
+----------+    +---------+    +----------+    +-----------+    +----------+
|  LEXER   |--->| PARSER  |--->|VALIDATOR |--->| PLANNER   |--->| EXECUTOR |
| Tokens   |    |CalStmt  |    |Type chk  |    |Query plan |    | Results  |
+----------+    +---------+    +----------+    +-----------+    +----------+
                                    |               |                |
                                    |          +----v------+   +----v----+
                                    |          |POLICY GATE|   | AUDIT   |
                                    |          |check_read |   | TRAIL   |
                                    |          +-----------+   +---------+
                               +----v-----+
                               | FIREWALL |
                               |complexity|
                               |deny list |
                               +----------+

17.2 Evolve Pipeline (Tier 1)

CAL String (ADD, SUPERSEDE, or REVERT)
    |
    v
+----------+    +---------+    +----------+    +-----------+
|  LEXER   |--->| PARSER  |--->|VALIDATOR |--->| TIER CHECK|
| Tokens   |    |CalStmt  |    |Type chk  |    |Token ok?  |
+----------+    +---------+    +----------+    +-----+-----+
                                                      |
                   +----------------------------------+
                   |
             +-----v------+    +-----------+    +----------+
             |POLICY GATE |--->| EXECUTOR  |--->|  AUDIT   |
             |check_write |    | add() or  |    |  TRAIL   |
             +------------+    | supersede |    +----------+
                               +-----------+

17.3 Resource Limits

Resource	Limit	Spec-mandated?
Max query string length	8,192 bytes	Yes
Max LIMIT value	1,000	Implementation-configurable
Default LIMIT (if omitted)	20	Yes
Max subquery nesting	3 levels	Yes
Max pipeline stages	5	Yes
Max IN literal set size	100	Implementation-configurable
Max set operands	5	Yes
Max parameters per query	20	Yes
Query timeout	5,000ms	Implementation-configurable
ASSEMBLE timeout	10,000ms	Implementation-configurable
Parse time budget (queries <=4KB)	<1ms	Yes
Max ADD per minute	20	Implementation-configurable
Max SUPERSEDE per minute	10	Implementation-configurable
Max REVERT per minute	5	Implementation-configurable
Max SET clauses per ADD	6	Yes (3 required + 3 optional base)
Max SET clauses per SUPERSEDE	4	Yes
Max REASON length	500 chars	Yes
Max BATCH queries	10	Yes
Max COALESCE branches	5	Yes
Max LET bindings	5	Yes
Max ASSEMBLE sources	8	Yes
Max BUDGET tokens	16,000	Yes
Max BUDGET grains	200	Yes
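
Several of these limits can be checked before planning. The sketch below covers a handful of them and maps violations to the parse error codes in Appendix C; the function signature is hypothetical, and it assumes the caller has already counted pipeline stages, IN-set size, and set operands during parsing.

```python
from typing import List, Optional, Tuple

MAX_QUERY_BYTES = 8192
MAX_LIMIT = 1000          # implementation-configurable in the table above
DEFAULT_LIMIT = 20
MAX_IN_SET = 100          # implementation-configurable in the table above
MAX_PIPELINE_STAGES = 5
MAX_SET_OPERANDS = 5


def check_limits(
    query: str,
    limit: Optional[int] = None,
    in_set_size: int = 0,
    pipeline_stages: int = 0,
    set_operands: int = 0,
) -> Tuple[int, List[str]]:
    """Return (effective LIMIT, list of violated error codes)."""
    errors: List[str] = []
    if len(query.encode("utf-8")) > MAX_QUERY_BYTES:  # the limit is in bytes, not characters
        errors.append("CAL-E001")
    if not query.strip():
        errors.append("CAL-E014")
    effective_limit = DEFAULT_LIMIT if limit is None else limit
    if effective_limit > MAX_LIMIT:
        errors.append("CAL-E010")
    if in_set_size > MAX_IN_SET:
        errors.append("CAL-E011")
    if pipeline_stages > MAX_PIPELINE_STAGES:
        errors.append("CAL-E012")
    if set_operands > MAX_SET_OPERANDS:
        errors.append("CAL-E013")
    return effective_limit, errors
```

Note that the length check uses the UTF-8 byte count, so a query of 5,000 two-byte characters exceeds the limit even though it is under 8,192 characters.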

17.4 Determinism Guarantees

Property	Guarantee
Same query + same state = same results	Yes
Tiebreaking for equal scores	Lexicographic hash order (ascending)
Parser is stateless	Yes
Decidability	Every string terminates in bounded time
ADD is idempotent	No (unique content address per call)
SUPERSEDE is idempotent	No (returns SupersessionConflict)
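
The tiebreaking rule is what makes result order reproducible when scores collide. A minimal sketch, assuming each result is a dictionary with a numeric `score` and a string `hash` (both hypothetical field names for this illustration):

```python
from typing import Dict, List


def rank(results: List[Dict]) -> List[Dict]:
    """Order by score (highest first); break ties by ascending lexicographic hash."""
    return sorted(results, key=lambda grain: (-grain["score"], grain["hash"]))
```

Because the sort key is a total order over (score, hash), the output does not depend on the order in which the store happened to return the grains.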

18. Capability Token Model

18.1 Token Structure

Tier 0 (read-only) token:

{
  "token_id": "uuid-v4",
  "namespace": "authorized-namespace",
  "user_id": "on-whose-behalf",
  "tier": 0,
  "allowed_ops": ["Recall", "Assemble", "Count", "Exists", "Explain",
                   "History", "Describe", "Batch", "Coalesce"],
  "issued_at": 1709337600000,
  "expires_at": 1709337900000,
  "max_uses": 1,
  "allowed_grain_types": [],
  "write_quota_remaining": 0,
  "signature": "hmac-sha256-signature"
}

Tier 1 (evolve) token:

{
  "token_id": "uuid-v4",
  "namespace": "authorized-namespace",
  "user_id": "on-whose-behalf",
  "tier": 1,
  "allowed_ops": ["Recall", "Assemble", "Count", "Exists", "Explain",
                   "History", "Describe", "Batch", "Coalesce",
                   "Add", "Supersede", "Revert"],
  "issued_at": 1709337600000,
  "expires_at": 1709337900000,
  "max_uses": 1,
  "allowed_grain_types": ["belief"],
  "write_quota_remaining": 10,
  "signature": "hmac-sha256-signature"
}

18.2 Two-Phase Execution

1. LLM generates CAL string
2. Agent harness -> prepare endpoint
   -> Server authenticates, parses, validates, creates token
   -> For Tier 1: shows what will be added/superseded/reverted
   -> Returns {token, plan, tier, side_effects}
3. Agent harness reviews plan (REQUIRED for Tier 1, RECOMMENDED for Tier 0)
4. Agent harness -> execute endpoint
   -> Server verifies token, checks expiration/replay, executes, returns results
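
The execute-side token checks can be sketched as follows. The token fields are those from Section 18.1, but the signing scheme is an assumption made for illustration: this sketch signs the compact, key-sorted JSON of every field except `signature` with HMAC-SHA256. The replay counter (`uses_so_far`) is assumed to come from server-side state.

```python
import hashlib
import hmac
import json
from typing import List

EVOLVE_OPS = {"Add", "Supersede", "Revert"}


def sign_token(token: dict, key: bytes) -> str:
    """Assumed canonical form: compact JSON with sorted keys, minus the signature."""
    body = {k: v for k, v in token.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_token(
    token: dict, key: bytes, operation: str, now_ms: int, uses_so_far: int
) -> List[str]:
    """Return the reasons a token must be refused (empty list means accepted)."""
    problems: List[str] = []
    expected = sign_token(token, key)
    if not hmac.compare_digest(token.get("signature", ""), expected):
        problems.append("bad signature")
    if now_ms >= token["expires_at"]:
        problems.append("expired")
    if uses_so_far >= token["max_uses"]:
        problems.append("replayed")
    if operation not in token["allowed_ops"]:
        problems.append("operation not allowed")
    if operation in EVOLVE_OPS and token["tier"] < 1:
        problems.append("tier too low")
    return problems
```

`hmac.compare_digest` is used instead of `==` so the comparison time does not leak how many leading characters of the signature matched.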

18.3 Namespace Enforcement

The namespace is ALWAYS taken from the token, never from the query. Implementations MUST overwrite any namespace specified in the CAL string with the token's namespace.

19. Policy Integration

19.1 CAL Inherits Sealed Policy

CAL queries execute through the same read path as all other interfaces. No CAL syntax can weaken the active policy.

Policy Constraint	CAL Behavior
`encryption_required`	Transparent -- CAL reads decrypted grains via normal path
`consent_level = Explicit`	Grains without consent silently excluded
`processing_restriction`	Restricted users' data invisible in results
`pii_detection` / `phi_detection`	PII/PHI-tagged fields subject to policy redaction
`audit_required`	Every CAL query produces audit entry

19.2 GDPR Implications

Art. 15 (Right of Access): CAL enables DSAR via RECALL WHERE user_id = "alice" | COUNT
Art. 16 (Right to Rectification): CAL SUPERSEDE enables correction of inaccurate personal data.
Art. 17 (Right to Erasure): Excluded at grammar level. Erasure only via implementation-specific APIs.
Art. 20 (Data Portability): CAL can serve as query interface for exports.
Art. 25 (By Design): Grammar-level safety qualifies as "by design" protection.

19.3 HIPAA Implications

Minimum Necessary: Under HIPAA policy, CAL SHOULD enforce stricter default LIMIT and require field projection for PHI-containing results.
Audit: CAL query audit entries MUST use pseudonymized user IDs and query hashes.

19.4 EU AI Act Implications

Transparency: Results include provenance_id linking to immutable provenance chain.
Explanations: WITH explanation provides compliant explanations.
Tier 1 Traceability: Every SUPERSEDE/REVERT includes a mandatory REASON.

20. Threat Model

20.1 Attack Vectors and Defenses

Attack	Severity	Defense
Prompt injection	CRITICAL	Grammar-level exclusion + capability token scoping + query firewall
Query injection	HIGH	Parameterized queries (`$param`) -- no string concat
Memory spam via ADD	HIGH	Write quota (20/min); single-use tokens; mandatory REASON
Hallucinated ADD	HIGH	Two-phase prepare/execute; mandatory REASON; REVERT enables correction
Supersede injection	HIGH	Two-phase prepare/execute; write quotas; REVERT recovery
Supersede storm	HIGH	Write quotas; per-token single-use; rate limiting
Resource exhaustion	HIGH	Hard compiled limits, timeout enforcement
Cross-namespace disclosure	CRITICAL	Token-bound namespace enforcement
Timing side-channel	MEDIUM	Response jitter; identical error responses
Privilege escalation	HIGH	Token tier checked before execution
Template injection	MEDIUM	Closed variable set; no code execution; validated at definition time
Streaming resource exhaustion	MEDIUM	Max concurrent streams (3); backpressure; stall timeout (30s)

20.2 Query Firewall

Implementations SHOULD perform static analysis between parsing and execution:

Maximum query complexity score
Deny patterns
Mandatory namespace filter
Maximum Tier 1 operations per session

20.3 Kill Switch

Implementations MUST support disabling CAL at runtime:

Master switch: Disables all CAL operations (503).
Tier 1 switch: Disables only ADD/SUPERSEDE/REVERT (403 for evolve, reads continue).
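
A request gate that honors both switches can be as small as the sketch below. The status codes are the ones listed above; returning 200 for the pass-through case and the function name are illustrative choices.

```python
EVOLVE_STATEMENTS = {"add", "supersede", "revert"}


def kill_switch_status(
    statement_type: str, master_disabled: bool, tier1_disabled: bool
) -> int:
    """Return the HTTP status the kill switches impose, or 200 to proceed."""
    if master_disabled:
        return 503  # all CAL operations are off
    if tier1_disabled and statement_type.lower() in EVOLVE_STATEMENTS:
        return 403  # evolve is off; reads continue
    return 200
```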

21. Audit Trail

Every CAL execution MUST produce an audit entry.

Tier 0 (Read):

Field	Type	Description
`token_id`	string	Capability token correlation
`query_hash`	string	SHA-256 of normalized CAL string
`namespace`	string	Token's namespace
`actor_id`	string	Pseudonymized (HMAC-SHA256)
`agent_id`	string?	Which LLM generated this query
`result_count`	integer	Number of grains returned
`tier`	integer	0
`duration_ms`	integer	Execution time

Tier 1 (Evolve):

Field	Type	Description
`token_id`	string	Capability token correlation
`query_hash`	string	SHA-256 of normalized CAL string
`namespace`	string	Token's namespace
`actor_id`	string	Pseudonymized
`agent_id`	string?	Which LLM generated this query
`operation`	string	`"add"`, `"supersede"`, or `"revert"`
`target_hash`	string?	Target grain's content address
`new_hash`	string	Newly created grain's content address
`reason`	string	Mandatory reason text
`tier`	integer	1
`duration_ms`	integer	Execution time

Streaming audit fields (additional):

Field	Type	Description
`stream_enabled`	boolean	Whether streaming was requested
`stream_options`	array	Active stream options
`events_emitted`	integer	Total events sent
`cancelled`	boolean	Whether assembly was cancelled
`cancel_reason`	string?	Cancellation reason
`sources_failed`	integer	Number of failed sources
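
As an illustration of the Tier 0 fields, the sketch below builds an audit entry with a hashed query and a pseudonymized actor. Two details are assumptions rather than spec text: query normalization is approximated by collapsing whitespace runs, and the pseudonymization key is supplied by the caller.

```python
import hashlib
import hmac
from typing import Optional


def normalize_query(query: str) -> str:
    """Assumed normalization: collapse runs of whitespace to single spaces."""
    return " ".join(query.split())


def tier0_audit_entry(
    token: dict,
    query: str,
    actor: str,
    result_count: int,
    duration_ms: int,
    pseudonym_key: bytes,
    agent_id: Optional[str] = None,
) -> dict:
    """Build a Tier 0 audit entry with the fields from the table above."""
    digest = hashlib.sha256(normalize_query(query).encode("utf-8")).hexdigest()
    # HMAC-SHA256 keeps actor IDs correlatable across entries without storing them raw.
    actor_id = hmac.new(pseudonym_key, actor.encode("utf-8"), hashlib.sha256).hexdigest()
    return {
        "token_id": token["token_id"],
        "query_hash": f"sha256:{digest}",
        "namespace": token["namespace"],  # always the token's namespace
        "actor_id": actor_id,
        "agent_id": agent_id,
        "result_count": result_count,
        "tier": 0,
        "duration_ms": duration_ms,
    }
```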

22. Error Model

22.1 Error Format

Error codes are stable across spec versions. Every error MUST include: code, message, and suggestion. Errors SHOULD include: position, expected alternatives, and example correction.

{
  "error": {
    "code": "CAL-E003",
    "message": "Unknown grain type \"fact\".",
    "position": {"start": 7, "end": 11, "line": 1, "col": 8},
    "suggestion": "Did you mean \"belief\"? (OMS renamed Fact -> Belief in v1.2)",
    "example": "RECALL beliefs WHERE subject = \"alice\"",
    "valid_values": ["belief","event","state","workflow","action","observation","goal","reasoning","consensus","consent"]
  }
}

22.2 Error Code Summary

See Appendix C for the complete registry. Error codes are organized by category:

Range	Category	Count
CAL-E001 -- CAL-E019	Parse	19
CAL-E020 -- CAL-E022	Type	3
CAL-E030 -- CAL-E031	Execution	2
CAL-E040 -- CAL-E052	Evolve	10
CAL-E060 -- CAL-E066	Grain Type	7
CAL-E070 -- CAL-E071	i18n	2
CAL-E075 -- CAL-E082	Streaming	8
CAL-E085 -- CAL-E096	Template	12
CAL-E100	Version	1

22.3 Warning Codes

Code	Category	Description
CAL-W001	Warning	Unknown `mg:` relation (not in standard vocabulary)
CAL-W002	Warning	Domain field used without matching `profile:` tag
CAL-W003	Warning	Unknown domain prefix
CAL-W004	Warning	Unknown extension option (ignored)
CAL-W005	Warning	FORMAT auto-selected as `toon` due to budget pressure (>85% utilization estimate). Specify FORMAT explicitly to suppress.

23. Compliance Checks

CAL introduces compliance verification checks that implementations MUST validate:

Check	Regulation	Severity
`cal_grammar_safety`	All	Critical
`cal_default_minimization`	GDPR Art.25, HIPAA	Critical
`cal_audit_logging`	All	Critical
`cal_authz_enforcement`	HIPAA, SOX	Critical
`cal_no_policy_override`	All	Critical
`cal_injection_prevention`	All	Critical
`cal_tier1_audit`	All	Critical
`cal_tier1_policy_gate`	All	Critical
`cal_provenance_tracking`	EU AI Act	High
`cal_ai_marking`	EU AI Act	High
`cal_dsar_completeness`	GDPR Art.15	High
`cal_hipaa_minimum_necessary`	HIPAA	High
`cal_phi_in_queries`	HIPAA	High
`cal_rate_limiting`	All	High
`cal_portability_format`	GDPR Art.20	Medium
`cal_consent_on_read`	GDPR Art.6, LGPD	Medium

24. Conformance Levels

Level 1: Core (MUST implement)

RECALL with WHERE, IN, LIMIT, ABOUT, RECENT
EXISTS
Parameter binding ($param)
Hash literals (sha256:...)
Error codes CAL-E001 through CAL-E031
All safety invariants (section 2)
Policy enforcement, audit integration
Determinism guarantees (section 17.4)
text/cal wire format

Level 2: Extended (SHOULD implement)

Everything in Core, plus:

Pipeline operators (| SELECT, | ORDER BY, | LIMIT, | COUNT, | FIRST, | GROUP BY)
Set operators (UNION, INTERSECT, EXCEPT)
Subqueries (WHERE field IN (subquery | EXTRACTOR))
EXPLAIN mode
HISTORY statement (including AS OF and DIFF)
DESCRIBE statement (grain_types, fields, capabilities, server)
BATCH statement
COALESCE statement
LET bindings
All semantic shortcuts (SINCE, LIKE, MY, CONTRADICTIONS, BETWEEN)
ASSEMBLE statement with BUDGET, PRIORITY, FORMAT
Advanced WITH options (diversity, score_breakdown, explanation, provenance, progressive_disclosure)
AS per-query format control
application/json+cal wire format (dual wire format)
Error suggestion system ("did you mean?")
Cross-lingual search
Grain-type-specific fields (section 6)
mg: relation category shortcuts (section 7)
Domain profile querying (section 12)
THREAD shorthand

Level 3: Evolve (MAY implement)

Everything in Extended, plus:

ADD with grain type, SET clauses, and REASON
SUPERSEDE with SET clauses and REASON
REVERT with REASON
Tier 1 capability tokens with write quotas
Error codes CAL-E040 through CAL-E052
Two-phase prepare/execute with side-effect preview

Level 4: Full (MAY implement)

Everything in Evolve, plus:

Streaming ASSEMBLE (STREAM clause, SSE transport, cancellation)
Custom FORMAT templates (DEFINE TEMPLATE, inline templates, named references)
Template inheritance from presets
Template validation (error codes CAL-E085 through CAL-E096)
Content Projection Model and PROJECT clause
DESCRIBE grammar (returns EBNF)
DESCRIBE templates
WebSocket transport for streaming

Implementations MUST declare conformance:

{"cal_conformance": "extended", "cal_version": "1.0"}

25. Versioning and Evolution

25.1 Semver for Specs

Major (CAL/1 -> CAL/2): Breaking changes to grammar or semantics. Extremely rare.
Minor (e.g. 1.0 -> 1.1): Additive only — new keywords, operators, or WITH options.
Patch (e.g. 1.0.0 -> 1.0.1): Clarifications only. No grammar changes.

25.2 Extension Mechanism

Implementation-specific hints via WITH x_prefix_name(...):

RECALL WHERE query = "..." WITH x_hnsw_ef(200)

Rules:

Extensions MUST use x_ prefix
Extensions MUST NOT change core query semantics
Unknown extensions produce warning (not error)
Extensions MUST NOT enable destructive operations

26. Interface Integration

CAL is transport-agnostic. Implementations MAY expose CAL through any combination of interfaces:

Interface	Endpoint Pattern	Input	Output
REST	`POST /memories/{id}/cal`	`{"query": "...", "params": {...}}`	Response Envelope
REST	`POST /memories/{id}/cal/prepare`	`{"query": "...", "params": {...}}`	`{token, plan, tier}`
REST	`POST /memories/{id}/cal/execute`	`{"token": "..."}`	Response
REST (stream)	`POST /memories/{id}/cal`	`Accept: text/event-stream`	SSE stream
gRPC	`CalQuery(CalRequest)`	query string + params	`CalResponse`
MCP	Tool: `cal`	`{"query": "...", "params": {...}}`	JSON
A2A	Skill: `memory_cal`	CAL in task input	Task artifact
CLI	`<impl> cal "..."`	String or file	JSON or table
WebSocket	`/memories/{id}/cal/ws`	JSON frames	JSON frames

All interfaces SHOULD support both Tier 0 and Tier 1 (when enabled). The two-phase prepare/execute flow is REQUIRED for Tier 1 and RECOMMENDED for Tier 0.

27. LLM System Prompt Template

Implementations SHOULD provide this reference to LLMs generating CAL queries (~1200 tokens):

## CAL Quick Reference

CAL is a non-destructive context assembly language for OMS memory databases.
It can read, assemble, and evolve memories, but never delete them.

### Read operations:
  RECALL [MY] [type] [IN "ns"] [ABOUT "entity"] [WHERE conditions] [WITH options]
    [| pipeline] [RECENT n] [SINCE "time"] [AS format]
  ASSEMBLE name FOR "intent" FROM label:(RECALL ...), ...
    BUDGET n tokens PRIORITY l1 > l2 FORMAT markdown [STREAM]
  EXISTS sha256:hash
  HISTORY WHERE subject = "s" AND relation = "r" [AS OF "date"]
  HISTORY sha256:hash [DIFF sha256:other]
  EXPLAIN <any statement>
  DESCRIBE grain_types | fields | capabilities | server
  BATCH { label: RECALL ..., label: RECALL ... }
  COALESCE(RECALL ..., RECALL ...)

### Evolve operations (when enabled):
  ADD <type> SET subject = "s" SET relation = "r" SET object = "o" [SET ...] REASON "why"
  SUPERSEDE sha256:hash SET field = value [SET ...] REASON "why"
  REVERT sha256:hash REASON "why"

### Types: beliefs, events, states, workflows, actions, observations, goals,
           reasonings, consensuses, consents

### WHERE conditions (combine with AND):
  query = "search text"           -- semantic search
  subject = "entity"              -- triple subject
  relation = "predicate"          -- triple relation
  relation IS PREFERENCE          -- category shortcut
  object = "value"                -- triple object
  user_id = "uid"                 -- user filter
  hash = sha256:abcd...           -- exact lookup
  time = "last 7 days"            -- natural language time
  time BETWEEN epoch1 AND epoch2  -- epoch range
  confidence >= 0.8               -- min confidence
  tags INCLUDE ["tag1"]           -- required tags
  type = "belief"                 -- grain type

### Shortcuts: ABOUT, RECENT n, SINCE, LIKE, MY, CONTRADICTIONS, BETWEEN

### Pipeline: | SELECT f1,f2 | ORDER BY field [ASC|DESC] | LIMIT n | COUNT
             | FIRST | SUBJECTS | OBJECTS | HASHES | GROUP BY field
             | PROJECT content(f1,f2), attr(f3,f4)

### Parameters: Use $name for dynamic values

### Streaming:
  ASSEMBLE ... STREAM                          -- stream all events
  ASSEMBLE ... STREAM { progress, chunks }     -- specific events

### Custom templates:
  FORMAT TEMPLATE name                         -- use named template
  FORMAT TEMPLATE { ELEMENT { <{{grain.type}}>{{grain.content}}</{{grain.type}}> } }

### Output formats:
  sml (default structured): flat tag-based — <belief subject="alice" confidence="0.92">prefers dark mode</belief>
  toon: CSV-tabular, ~40% fewer tokens — beliefs[3]{subject,content,confidence}:\nalice,prefers dark mode,0.95
  markdown: human-readable prose
  json: machine-readable structured data
  text: minimal plain text
  triples: subject-relation-object triples
  -- Use AS toon on large RECALL sets; FORMAT toon on budget-constrained ASSEMBLE

### Rules:
- LIMIT is always enforced (default 20, max 1000)
- CAL cannot delete, erase, forget, or destroy data.
- REASON is mandatory for all evolve operations.
- Use HISTORY to check current version before SUPERSEDE.

Appendix A: Complete EBNF Grammar

The complete EBNF grammar is provided in section 4. This appendix restates it as a single unbroken production set for implementer convenience. The grammar in section 4 is the normative reference.

Implementations seeking the EBNF as machine-readable text SHOULD support DESCRIBE grammar which returns the productions in EBNF notation.

Appendix B: JSON Schema References

CAL defines two JSON Schemas for the dual wire format:

Schema	URI	Purpose
Request	`https://cal-spec.org/schema/v1/cal-request.schema.json`	Validates `application/json+cal` requests
Response	`https://cal-spec.org/schema/v1/cal-response.schema.json`	Validates CAL responses

The schemas are published alongside this specification in schemas/v1/ and use JSON Schema draft 2020-12. Implementations MUST validate incoming application/json+cal requests against the request schema.

A collection of 50 example request/response pairs is provided in schemas/v1/cal-examples.json as a conformance test suite.

Appendix C: Error Code Registry

All error codes use the CAL-E prefix.

Parse Errors (CAL-E001 -- CAL-E019)

Code	Description
CAL-E001	Query exceeds maximum length (8192 bytes)
CAL-E002	Unexpected token (includes expected list)
CAL-E003	Unknown grain type (includes "did you mean?")
CAL-E004	Unknown field name (includes suggestions)
CAL-E005	Unterminated string literal
CAL-E006	Invalid number
CAL-E007	Subquery nesting exceeds depth 3
CAL-E008	Unbound parameter
CAL-E009	Duplicate parameter binding
CAL-E010	LIMIT exceeds maximum
CAL-E011	IN set too large
CAL-E012	Too many pipeline stages
CAL-E013	Too many set operands
CAL-E014	Empty query
CAL-E015	Invalid hash literal (must be sha256: + hex)
CAL-E016	REASON text exceeds maximum length
CAL-E017	Unknown evolve field in SET clause
CAL-E018	Missing REASON clause
CAL-E019	Missing SET clause (SUPERSEDE requires at least one)
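
Several of the parse errors above can be detected before any tokenizing takes place. The following Python sketch illustrates three of them (CAL-E001, CAL-E014, CAL-E015). The function names are hypothetical, and two details are assumptions rather than statements of this specification: that a whitespace-only query counts as empty, and that a hash literal carries exactly 64 hex digits after the `sha256:` prefix.

```python
import re
from typing import Optional

MAX_QUERY_BYTES = 8192  # limit stated for CAL-E001

# Assumption: "sha256: + hex" means the prefix followed by exactly 64 hex digits.
HASH_LITERAL = re.compile(r"sha256:[0-9a-fA-F]{64}")


def precheck_query(query: str) -> Optional[str]:
    """Return a parse-error code for a raw query, or None if these checks pass."""
    # Assumption: a query made only of whitespace is treated as empty.
    if not query.strip():
        return "CAL-E014"
    # The limit is expressed in bytes, so measure the UTF-8 encoding.
    if len(query.encode("utf-8")) > MAX_QUERY_BYTES:
        return "CAL-E001"
    return None


def check_hash_literal(literal: str) -> Optional[str]:
    """Return CAL-E015 if the literal is not a well-formed sha256 hash."""
    if HASH_LITERAL.fullmatch(literal) is None:
        return "CAL-E015"
    return None
```

Measuring the encoded length rather than `len(query)` matters for non-ASCII queries, where one character can occupy several bytes.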

Type Errors (CAL-E020 -- CAL-E022)

Code	Description
CAL-E020	Incompatible types in comparison
CAL-E021	Pipeline stage type mismatch
CAL-E022	SUBJECTS/OBJECTS requires belief-type input

Execution Errors (CAL-E030 -- CAL-E031)

Code	Description
CAL-E030	Grain budget exceeded
CAL-E031	Query timeout

Evolve Errors (CAL-E040 -- CAL-E052)

Code	Description
CAL-E040	SupersessionConflict -- target grain already superseded
CAL-E041	NoPreviousVersion -- REVERT target is the original grain
CAL-E042	GrainTypeNotEvolvable -- only Belief grains can be superseded
CAL-E043	WriteQuotaExceeded -- too many evolve operations
CAL-E044	Tier1NotEnabled -- requires Tier 1 capability
CAL-E045	NamespaceMismatch -- target grain in different namespace
CAL-E046	TargetNotFound -- target hash does not exist
CAL-E050	MissingRequiredField -- ADD requires subject, relation, object
CAL-E051	GrainTypeNotAddable -- only Belief, Observation, Goal can be created
CAL-E052	AddQuotaExceeded -- too many ADD operations

Shortcut and Grain Type Errors (CAL-E060 -- CAL-E066)

Code	Description
CAL-E060	AmbiguousShortcut / FieldNotOnGrainType
CAL-E061	Grain-type-specific field used without declaring grain type
CAL-E062	Invalid `action_phase` value
CAL-E063	Invalid `goal_state` value
CAL-E064	Invalid `consent_action` value
CAL-E065	Invalid `recall_priority` value
CAL-E066	Invalid `epistemic_status` value

Internationalization Errors (CAL-E070 -- CAL-E071)

Code	Description
CAL-E070	InvalidUTF8 -- query contains invalid UTF-8 sequences
CAL-E071	BidiOverrideRejected -- bidi override characters not allowed
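
As an illustration of the two checks above, the Python sketch below decodes the raw query strictly and then scans for bidirectional control characters. The function name is hypothetical. The exact set of rejected code points is an assumption: this sketch treats U+202A through U+202E and U+2066 through U+2069 as "bidi override characters"; the normative set is defined by the Internationalization section, not by this example.

```python
from typing import Optional

# Assumption: the rejected set is the explicit embedding/override controls
# (U+202A..U+202E) plus the isolate controls (U+2066..U+2069).
BIDI_CONTROLS = (
    {chr(cp) for cp in range(0x202A, 0x202F)}
    | {chr(cp) for cp in range(0x2066, 0x206A)}
)


def check_i18n(raw: bytes) -> Optional[str]:
    """Return an internationalization error code, or None if the input passes."""
    try:
        text = raw.decode("utf-8")  # strict decoding rejects invalid sequences
    except UnicodeDecodeError:
        return "CAL-E070"
    if any(ch in BIDI_CONTROLS for ch in text):
        return "CAL-E071"
    return None
```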

ASSEMBLE Errors (CAL-E075 -- CAL-E076)

Code	Description
CAL-E075	ASSEMBLE timeout exceeded (10s)
CAL-E076	All ASSEMBLE sources failed

Streaming Errors (CAL-E077 -- CAL-E082)

Code	Description
CAL-E077	InvalidStreamOption -- unknown option in STREAM clause
CAL-E078	ChunkSizeOutOfRange -- chunk_size must be 20-1000
CAL-E079	StreamNotSupported -- server does not support streaming
CAL-E080	GrainFormatError -- individual grain could not be formatted
CAL-E081	StreamReconnectExpired -- assembly completed before reconnect
CAL-E082	AssemblyCancelled -- assembly was cancelled

Template Errors (CAL-E085 -- CAL-E096)

Code	Description
CAL-E085	TemplateNotFound -- named template does not exist
CAL-E086	CannotExtendData -- templates cannot extend 'data' preset
CAL-E087	CannotExtendCustom -- templates can only extend presets
CAL-E088	DuplicateTemplateName -- template already exists
CAL-E089	TooManyTemplates -- namespace at 50-template limit
CAL-E090	UnknownTemplateVariable -- variable not in known set
CAL-E091	UnbalancedTemplateSection -- opening tag without closing
CAL-E092	TemplateTooLarge -- exceeds 4096 bytes
CAL-E093	TemplateNestingTooDeep -- conditional nesting exceeds 5
CAL-E094	InvalidTemplateSyntax -- unrecognized Mustache syntax
CAL-E095	DuplicateSection -- same section defined twice
CAL-E096	ConflictingElementSections -- ELEMENT vs ELEMENT_SUMMARY conflict
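
The size, balance, and nesting limits above lend themselves to a single pass over the template body. The Python sketch below is one possible way to check CAL-E091, CAL-E092, and CAL-E093. It assumes that template sections use the standard Mustache tags `{{#name}}` and `{{^name}}` to open and `{{/name}}` to close, and that a stray or mismatched closing tag is also reported as CAL-E091; neither assumption is confirmed by this appendix, and the function name is hypothetical.

```python
import re
from typing import Optional

MAX_TEMPLATE_BYTES = 4096  # limit stated for CAL-E092
MAX_NESTING = 5            # limit stated for CAL-E093

# Assumption: sections follow standard Mustache syntax.
SECTION_TAG = re.compile(r"\{\{([#^/])\s*([\w.]+)\s*\}\}")


def check_template(body: str) -> Optional[str]:
    """Return a template error code, or None if these three checks pass."""
    if len(body.encode("utf-8")) > MAX_TEMPLATE_BYTES:
        return "CAL-E092"
    stack = []  # names of currently open sections
    for match in SECTION_TAG.finditer(body):
        sigil, name = match.group(1), match.group(2)
        if sigil in "#^":
            stack.append(name)
            if len(stack) > MAX_NESTING:
                return "CAL-E093"
        elif not stack or stack.pop() != name:
            # Assumption: a stray or mismatched close is reported as unbalanced.
            return "CAL-E091"
    return "CAL-E091" if stack else None
```

Plain variables such as `{{grain.type}}` carry no sigil, so the pattern skips them and only section tags affect the stack.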

Version Errors (CAL-E100)

Code	Description
CAL-E100	Unsupported CAL version

Warning Codes (CAL-W001 -- CAL-W005)

Code	Description
CAL-W001	Unknown `mg:` relation (not in standard vocabulary)
CAL-W002	Domain field used without matching profile tag
CAL-W003	Unknown domain prefix
CAL-W004	Unknown extension option (ignored)
CAL-W005	FORMAT auto-selected as `toon` due to budget pressure (>85% utilization estimate). Specify FORMAT explicitly to suppress.
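
Because the registry allocates codes in contiguous numeric blocks, a client can route a code to a category without a full lookup table. The Python sketch below shows one way to do this; the category labels and the `classify` function are illustrative names, not part of CAL. Note that it classifies by range only, so an unassigned code inside a block (for example CAL-E047) still maps to that block's category.

```python
import re
from typing import Optional

# Ranges copied from the group headings of this appendix.
ERROR_RANGES = [
    (1, 19, "parse"),
    (20, 22, "type"),
    (30, 31, "execution"),
    (40, 52, "evolve"),
    (60, 66, "shortcut_and_grain_type"),
    (70, 71, "internationalization"),
    (75, 76, "assemble"),
    (77, 82, "streaming"),
    (85, 96, "template"),
    (100, 100, "version"),
]
CODE_PATTERN = re.compile(r"CAL-([EW])(\d{3})")


def classify(code: str) -> Optional[str]:
    """Map a CAL-Ennn or CAL-Wnnn code to its registry category, else None."""
    match = CODE_PATTERN.fullmatch(code)
    if match is None:
        return None
    kind, number = match.group(1), int(match.group(2))
    if kind == "W":
        return "warning" if 1 <= number <= 5 else None
    for low, high, category in ERROR_RANGES:
        if low <= number <= high:
            return category
    return None
```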

Appendix D: Reserved Words

The following words are reserved in CAL/1. They cannot be used as unquoted identifiers even if not yet functional. This list consolidates reserved words from all sources.

Active Keywords

RECALL, ASSEMBLE, WHERE, AND, OR, NOT, IN, BETWEEN, LIMIT, OFFSET,
ORDER, BY, ASC, DESC, WITH, EXPLAIN, SCOPE,
UNION, INTERSECT, EXCEPT,
SELECT, COUNT, FIRST, GROUP, SUBJECTS, OBJECTS, HASHES, PROJECT,
INCLUDE, EXCLUDE, IS, NULL, TRUE, FALSE,
EXISTS, HISTORY, DESCRIBE, BATCH, COALESCE,
ABOUT, RECENT, SINCE, LIKE, MY, CONTRADICTIONS, AS,
FOR, FROM, BUDGET, PRIORITY, FORMAT,
LET, THREAD, DIFF,
ADD, SUPERSEDE, REVERT, SET, REASON,
STREAM, TEMPLATE, DEFINE, UNDEFINE, EXTENDS,
HEADER, ELEMENT, ELEMENT_SUMMARY, ELEMENT_OMIT, SOURCE_BREAK, FOOTER,
PREFERENCE, KNOWLEDGE, PERMISSION, INTERACTION, AGENCY, LIFECYCLE, OBSERVATION,
CAL, OF

Future-Reserved Words

FIND, RELATE, TIMELINE, TRACE, GRAPH, ANNOTATE,
MATCHING, SIMILAR, NEAR, TAGGED, USER,
VIA, DEPTH, TOP, UNTIL, LAST, HAVING,
DIVERSITY, MMR, THRESHOLD, RERANK, PROVENANCE,
SUPERSEDED, EXPLANATION, SCORE_BREAKDOWN,
CONSISTENCY, EVENTUAL, BOUNDED, LINEARIZABLE,
CACHE, PIN, UNPIN, MERGE, LANG,
CHUNK, PAUSE, RESUME, CANCEL
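
A parser or linter can reject reserved words used as unquoted identifiers with a simple set lookup. The Python sketch below uses abbreviated subsets of the two lists above, and its function name is hypothetical. It also assumes that keyword matching is case-insensitive, which this appendix does not state; the Lexical Structure section governs the actual rule.

```python
from typing import Optional

# Abbreviated subsets of the two lists above; a real check would carry every word.
ACTIVE = {"RECALL", "ASSEMBLE", "WHERE", "LIMIT", "FORMAT", "SUPERSEDE", "REASON"}
FUTURE_RESERVED = {"FIND", "RELATE", "TIMELINE", "MERGE", "CANCEL"}


def reserved_status(word: str) -> Optional[str]:
    """Return 'active', 'future-reserved', or None for an unquoted identifier."""
    upper = word.upper()  # assumption: reserved words match case-insensitively
    if upper in ACTIVE:
        return "active"
    if upper in FUTURE_RESERVED:
        return "future-reserved"
    return None
```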

Appendix E: Queryable Fields Reference

Common Fields (All Grain Types)

Field	Type	Operators	Sortable	Projectable	Groupable
`query`	String	`=`	No	No	No
`subject`	String	`=`, `!=`, `IN`	Yes	Yes	Yes
`relation`	String	`=`, `!=`, `IN`, `IS`	Yes	Yes	Yes
`object`	String	`=`, `!=`, `IN`	Yes	Yes	Yes
`user_id`	String	`=`, `!=`	No	Yes	Yes
`namespace`	String	`=`	No	No	No
`confidence`	Number	`=`, `!=`, `>=`, `<=`, `>`, `<`	Yes	Yes	No
`importance`	Number	`=`, `!=`, `>=`, `<=`, `>`, `<`	Yes	Yes	No
`score`	Number	`>=`, `>`	Yes	Yes	No
`tags`	Array	`INCLUDE`, `EXCLUDE`	No	Yes	No
`type`	GrainType	`=`	No	Yes	Yes
`time`	Temporal	`=`, `BETWEEN`	Yes	Yes	No
`hash`	Hash	`=`	No	Yes	No
`contradicted`	Boolean	`=`	No	Yes	No
`verification_status`	String	`=`	Yes	Yes	Yes
`source_type`	String	`=`	Yes	Yes	Yes
`recall_priority`	String	`=`	No	No	No
`epistemic_status`	String	`=`	No	No	No
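
The table above can be encoded directly as data so that a query validator checks operator and sort support per field. The Python sketch below transcribes only a handful of rows, and the helper names are hypothetical. It returns `False` for fields it does not know; in a full implementation an unknown field corresponds to CAL-E004.

```python
# A few rows transcribed from the Common Fields table; not the full table.
FIELD_OPERATORS = {
    "subject": {"=", "!=", "IN"},
    "confidence": {"=", "!=", ">=", "<=", ">", "<"},
    "score": {">=", ">"},
    "tags": {"INCLUDE", "EXCLUDE"},
    "time": {"=", "BETWEEN"},
}
SORTABLE = {"subject", "confidence", "score", "time"}


def operator_allowed(field: str, op: str) -> bool:
    """True if the table lists `op` as a supported operator for `field`."""
    return op in FIELD_OPERATORS.get(field, set())


def can_order_by(field: str) -> bool:
    """True if the table marks `field` as sortable."""
    return field in SORTABLE
```

For example, `operator_allowed("score", "=")` is `False` because `score` supports only `>=` and `>`, and `can_order_by("tags")` is `False` because arrays are not sortable.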

Grain-Type-Specific Fields

Grain Type	Field	Type	Operators
Event	`role`	String	`=`, `!=`
Event	`session_id`	String	`=`
Event	`parent_message_id`	String	`=`
Event	`model_id`	String	`=`, `!=`
Event	`content`	String	`=`
State	`context`	String	`=`, `!=`
State	`plan`	String	`=`
Workflow	`trigger`	String	`=`, `!=`
Workflow	`steps`	String	`=`
Action	`tool_name`	String	`=`, `!=`, `IN`
Action	`action_phase`	String	`=`
Action	`is_error`	Boolean	`=`
Action	`tool_call_id`	String	`=`
Observation	`observer_id`	String	`=`, `!=`
Observation	`observer_type`	String	`=`, `!=`
Goal	`goal_state`	String	`=`, `!=`
Goal	`assigned_agent`	String	`=`, `!=`
Goal	`deadline`	Temporal	`=`, `BETWEEN`
Goal	`depends_on`	String	`=`, `IN`
Reasoning	`reasoning_type`	String	`=`
Reasoning	`premises`	String	`=`
Reasoning	`conclusion`	String	`=`, `!=`
Consensus	`threshold`	Number	`=`, `>=`, `<=`, `>`, `<`
Consensus	`agreement_count`	Number	`=`, `>=`, `<=`, `>`, `<`
Consensus	`participating_observers`	Array	`INCLUDE`
Consent	`consent_action`	String	`=`
Consent	`purpose`	String	`=`, `!=`
Consent	`grantor_did`	String	`=`
Consent	`grantee_did`	String	`=`
Consent	`scope`	String	`=`
Consent	`expires_at`	Temporal	`=`, `BETWEEN`

Domain-Prefixed Fields

Domain	Fields
`hc:`	`patient_id`, `encounter_id`, `provider_id`, `condition_code`, `phi_category`
`legal:`	`case_id`, `jurisdiction`, `privilege_status`, `retention_category`
`fin:`	`account_id`, `transaction_id`, `risk_category`, `compliance_flag`
`rob:`	`device_id`, `coordinate_frame`, `safety_zone`
`sci:`	`experiment_id`, `dataset_id`, `methodology`, `reproducibility_status`
`con:`	`session_context`, `interaction_channel`
`int:`	`source_system`, `correlation_id`, `sync_status`
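
The Python sketch below shows how a validator might recognize the domain prefixes listed above and emit CAL-W003 for an unknown one. It assumes a domain-prefixed field is written as `prefix:field` (for example `hc:patient_id`), as the table suggests; the function name is hypothetical, and field-level validation against each domain's list is left out.

```python
from typing import Optional

# Domain prefixes taken from the table above.
KNOWN_DOMAINS = {"hc", "legal", "fin", "rob", "sci", "con", "int"}


def domain_warning(field_name: str) -> Optional[str]:
    """Return CAL-W003 if the field carries an unknown domain prefix, else None."""
    prefix, sep, _ = field_name.partition(":")
    if sep and prefix not in KNOWN_DOMAINS:
        return "CAL-W003"
    return None  # unprefixed field, or a known domain
```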

Appendix F: Version History

Version	Date	Change
1.0	2026-03-03	Initial CAL specification. 12-variant statement model. Tier 0 (RECALL, ASSEMBLE, SetOp, EXISTS, HISTORY, EXPLAIN, DESCRIBE, BATCH, COALESCE) + Tier 1 (ADD, SUPERSEDE, REVERT). ASSEMBLE with budget, priority, format, streaming. Semantic shortcuts (ABOUT, RECENT, SINCE, LIKE, MY, CONTRADICTIONS, BETWEEN). LET bindings. Custom FORMAT templates (Mustache-subset). Grain-type-specific queryable fields for all 10 OMS types. mg: relation vocabulary with category shortcuts. Domain profile querying. Dual wire format (text/cal + application/json+cal). Internationalization (Unicode NFC, cross-lingual search, bidi safety). Streaming protocol (SSE, NDJSON, WebSocket). THREAD shorthand. HISTORY AS OF and DIFF. Non-destructive safety model. Content Projection Model with flat semantic output (Section 10.3-10.4). PROJECT clause for custom field surfacing. Per-grain-type content projection rules with humanize() and time humanization. ELEMENT/ELEMENT_SUMMARY/SOURCE_BREAK template sections for flat semantic rendering. TOON (Token-Oriented Object Notation) format support — `toon` as a first-class FORMAT/AS preset (Section 10.9): tabular CSV rendering for uniform RECALL results, grouped-section rendering for ASSEMBLE results, per-grain-type column sets at each disclosure level, PROJECT integration, STREAM compatibility, auto-TOON budget-pressure hint (CAL-W005).

Document Status: This is the CAL (Context Assembly Language) Specification v1.0. It defines a non-destructive, deterministic, LLM-native context assembly and evolution language for OMS-compliant memory databases. CAL is part of the Open Memory Specification (OMS) v1.3 — see SPECIFICATION.md.

Last Updated: 2026-03-03 | License: This specification is offered under the Open Web Foundation Final Specification Agreement (OWFa 1.0) | Copyright: Public Domain (CC0 1.0 Universal)