This directory contains the complete JSON Schema definitions for the Agent Object Model (AOM)™ protocol. To validate surfaces and outputs, use the CLI from the repo root (aom.py / aom.mjs). For a high-level overview and motivation, see the AOM v0.1.0 White Paper.
aom-input-schema.json (core surface schema)
What it defines: The structure of an AOM surface (screen, modal, panel, widget, drawer).
Validates: *.aom.json files in examples/ (for example examples/v0.1.0/)
Key sections:
automation_policy — (Required) Agent automation policy for a surface: forbidden | allowed | open. Conceptually:
forbidden — No automation; agents may read only enough AOM/policy to learn this rule, then MUST stop using this surface.
allowed — Allowed (with guardrails); agents MUST treat the AOM as the only source of truth for actions and MUST NOT act based on DOM/HTML or other page content outside the AOM.
open — Permissive mode; agents MAY go beyond the AOM’s explicit actions, using additional page content to infer reasonable actions (while still obeying global safety rules).
generated_at — (Required) ISO 8601 timestamp when this AOM document was generated.
calling_agent — (Optional) Identifies the agent requesting this surface and/or declares that when the agent submits a request on this site it must include agent_id (and optionally agent_name) in that request; standard field names are agent_id and agent_name. See Agent identity and traceability.
purpose — Primary goal and user roles
tasks — Explicit tasks the surface supports, including default_action_id, success_conditions (min one)
entities — Domain objects with schema (including validation), and current values
actions — Operations agents can perform, including priority ranking and method
state — Session and workflow state, including flow_id tracking
navigation — Breadcrumbs and neighbor surfaces
signals — Errors, warnings, notifications, and automated test_cases
Use when: Generating or consuming AOM from web/mobile screens.
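The three automation_policy values can be read as a small decision table. The sketch below is only an illustration of that reading, not part of the spec: the function name `allowed_sources` and the returned keys are hypothetical, and treating a missing or unknown policy as forbidden is an assumption.

```python
from typing import Any, Dict

POLICIES = ("forbidden", "allowed", "open")


def allowed_sources(surface: Dict[str, Any]) -> Dict[str, bool]:
    """Map a surface's automation_policy to what an agent may rely on."""
    policy = surface.get("automation_policy")
    if policy not in POLICIES:
        # Assumption: automation_policy is required, so a missing or
        # unknown value is handled as the most restrictive case.
        policy = "forbidden"
    return {
        # forbidden: the agent must stop using this surface.
        "may_act": policy != "forbidden",
        # Only open lets the agent infer actions from page content
        # outside the AOM; allowed restricts it to the AOM itself.
        "may_use_page_content": policy == "open",
    }


surface = {"automation_policy": "allowed", "aom_version": "0.1.0"}
print(allowed_sources(surface))
# {'may_act': True, 'may_use_page_content': False}
```

An agent runtime could call a check like this before planning any action, so that the stricter reading always wins when the policy is absent or misspelled.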
aom-output-schema.json (agent output schema)
What it defines: The structure of an agent’s response to an AOM surface.
Validates: *.output.json files in Examples/ or examples/ (typically under each surface’s outputs/ folder).
Key sections:
agent_id — (Optional) Identifier of the agent producing this output. When the surface declares calling_agent.agent_id_required, the agent MUST include this in the output and in the action-invocation request to the site for traceability.
agent_name — (Optional) Human-readable name of the agent; include when the surface requires it or for logging (e.g. aom.tools).
key_issuer — (Optional) Issuer of agent_id (e.g. ‘aom.tools’).
mode — "single" (one-shot) or "flow" (multi-step)
thought — Agent’s reasoning (for debugging/transparency)
action — Requested action with action_id + params
result — Final output (populated when meta.done: true)
meta — Control flags including done, error, confidence, and optional a2h_intent
Use when: Building agents that operate on AOM surfaces.
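As a rough illustration of the agent_id rule above, the sketch below derives an action-invocation request from an output and refuses to build it when the surface requires an identity field that the output lacks. The helper name `build_invocation_request` and the flat request layout are assumptions; the field names (action_id, params, agent_id, agent_name, agent_id_required, agent_name_required) come from this document.

```python
from typing import Any, Dict


def build_invocation_request(surface: Dict[str, Any],
                             output: Dict[str, Any]) -> Dict[str, Any]:
    """Build the request an agent sends to the site when invoking an action."""
    calling_agent = surface.get("calling_agent", {})
    request: Dict[str, Any] = {
        "action_id": output["action"]["action_id"],
        "params": output["action"].get("params", {}),
    }
    # Identity fields are copied from the output so both carry the same values.
    for field, flag in (("agent_id", "agent_id_required"),
                        ("agent_name", "agent_name_required")):
        value = output.get(field)
        if calling_agent.get(flag) and not value:
            raise ValueError(f"surface requires {field} but the output does not set it")
        if value:
            request[field] = value
    return request


surface = {"calling_agent": {"agent_id_required": True}}
output = {
    "agent_id": "agent-123",  # hypothetical identifier
    "mode": "single",
    "action": {"action_id": "submit_login", "params": {"username": "demo"}},
    "meta": {"done": False, "confidence": 0.9},
}
print(build_invocation_request(surface, output))
# {'action_id': 'submit_login', 'params': {'username': 'demo'}, 'agent_id': 'agent-123'}
```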
Secure/signed payloads are out of scope for this spec.
AOM defines two machine-readable contracts. Both are required for AOM-compliant agent systems:
| Contract | Schema | File Extension | Purpose |
|---|---|---|---|
| Surface | aom-input-schema.json | *.aom.json | Describes what the agent sees (UI state, available actions, entities) |
| Output | aom-output-schema.json | *.output.json | Describes what the agent does (thought, action, result, confidence) |
The diagram below shows the runtime flow: the User and the Worker Agent do not communicate directly—all interaction is via the Master Agent. The User triggers the workflow and sends information to the Master; the Master triggers and informs the Worker Agent. The Agent reads surfaces (Input AOM), performs actions, and moves between surfaces (e.g. Surface A → Surface A/B); surfaces serve Input AOM and return Success or Error. The Agent then informs the Master with Output AOM, and the Master informs the User. For the full flow including site and page policy checks, see Sequence diagram (for geeks).
sequenceDiagram
autonumber
actor User as User
participant Master as Master Agent
participant Agent as Worker Agent
participant Site as Site (origin)
participant Surf1 as Surface A
participant Surf2 as Surface A/B
%% 0. User and Agent communicate only via Master
User->>Master: Triggers workflow with Information
Master->>Agent: Triggers workflow with Information
%% 1. Agent fetches site policy
Agent->>Site: GET /.well-known/aom-policy.json
Site-->>Agent: Site policy (forbidden | allowed | open)
alt Site policy = forbidden
rect rgb(255, 230, 230)
Agent->>Master: Surface forbidden (exit)
Master->>User: Surface forbidden
end
else Site policy = allowed or open
rect rgb(230, 240, 255)
%% 2. First surface serves AOM; Agent reads and checks page policy
Surf1-->>Surf1: Serves Input AOM
Agent->>Surf1: Reads Input AOM
Agent-->>Agent: Checks page automation_policy (forbidden | allowed | open)
alt Page automation_policy = forbidden
rect rgb(255, 230, 230)
Agent->>Master: Page forbidden (exit)
Master->>User: Page forbidden
end
else Page = allowed or open
rect rgb(232, 245, 233)
%% 3. Agent checks if it has all data; need more info goes via Master to User
Agent-->>Agent: Checks if it has all data
alt Has all data
rect rgb(232, 245, 233)
Note over Agent: Proceed with current data
end
else Needs more information
rect rgb(255, 248, 225)
Agent->>Master: No. Give more Information
Master->>User: No. Give more Information
User->>Master: Sends more Information
Master->>Agent: Sends more Information
end
end
Agent->>Surf1: Submits action (fills Input AOM)
Agent->>Surf1: Performs Action of Input AOM
Surf1->>Surf2: Navigates to same or next surface
end
%% 5. Next surface serves AOM; Agent reads and checks page policy again
Note over Agent: Waits for next surface
Surf2-->>Surf2: Serves Input AOM
Agent->>Surf2: Reads Input AOM
Agent-->>Agent: Checks page automation_policy (forbidden | allowed | open)
alt Page automation_policy = forbidden
rect rgb(255, 230, 230)
Agent->>Master: Page forbidden (exit)
Master->>User: Page forbidden
end
else Page = allowed or open
rect rgb(240, 248, 255)
%% 6. Success: Agent informs Master, Master informs User. Error: escalate via Master.
Surf2-->>Agent: Returns response
alt Success
rect rgb(232, 245, 233)
Agent-->>Agent: Converts Input AOM to Output AOM
Agent->>Master: Informs Output AOM
Master->>User: Informs Output AOM
end
else Error
rect rgb(255, 245, 238)
Agent-->>Agent: Check Error
alt Can Fix
rect rgb(232, 245, 233)
Agent->>Surf2: Retry (e.g. read again / resubmit) with Input AOM
end
else Can't Fix (Converts to Output AOM)
rect rgb(255, 230, 230)
Agent->>Master: No. Give more Information with Output AOM
Master->>User: No. Give more Information with Output AOM
end
end
end
end
end
end
end
end
end
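The order of checks in the diagram (site policy, then page policy, then data completeness) can be condensed into one decision function. This is a minimal sketch under stated assumptions: the function name `next_step` and the returned step labels are hypothetical, and any policy value other than allowed or open is treated as forbidden.

```python
def next_step(site_policy: str, page_policy: str, has_all_data: bool) -> str:
    """Return the Worker Agent's next move for one surface, in diagram order."""
    permitted = ("allowed", "open")
    if site_policy not in permitted:
        return "report_site_forbidden"     # Agent -> Master -> User, then exit
    if page_policy not in permitted:
        return "report_page_forbidden"     # Agent -> Master -> User, then exit
    if not has_all_data:
        return "request_more_information"  # asked via the Master, never directly
    return "submit_action"


cases = [
    ("forbidden", "open", True),
    ("allowed", "forbidden", True),
    ("open", "allowed", False),
    ("allowed", "allowed", True),
]
for case in cases:
    print(case, "->", next_step(*case))
```

The same function would be called again for each new surface the agent lands on, since the diagram repeats the page policy check after every navigation.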
Input AOM: *.aom.json per page or component; the Agent receives and reads them. Surfaces describe purpose, tasks, entities, actions, state, navigation, and signals.
Output AOM: *.output.json (thought, action, result, meta). The Agent informs the Master with Output AOM, and the Master informs the User; the full output is also for the agent owner (and e.g. logging at aom.tools). When the Agent invokes an action, it sends an action-invocation request (action_id, params, and optionally agent_id, agent_name) to the site.
spec/v0.1.0/ holds the schemas; examples/v0.1.0/ holds sample surfaces and golden outputs; tools/ holds validators and create-outputs. Use the schemas to validate your own *.aom.json and *.output.json files.
Both schemas use semantic versioning:
Current: 0.1.0 (this document)
Major: breaking changes (e.g. 1.0.0)
Minor: backward-compatible additions (e.g. 0.2.0)
Patch: backward-compatible fixes (e.g. 0.1.1)
Each AOM surface must declare its version:
{
"aom_version": "0.1.0",
...
}
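A consumer might check the declared aom_version before reading the rest of the surface. The sketch below assumes a plain MAJOR.MINOR.PATCH string without pre-release suffixes; the helper names and the compatibility rule (same major, and same minor while the major is 0) are assumptions for illustration, not something this spec defines.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    match = re.fullmatch(r"([0-9]+)\.([0-9]+)\.([0-9]+)", text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def is_supported(surface: dict, supported: str = "0.1.0") -> bool:
    """Assumed rule: same major; while major is 0, same minor as well."""
    declared = parse_version(surface["aom_version"])
    target = parse_version(supported)
    if declared[0] != target[0]:
        return False
    if declared[0] == 0:
        return declared[1] == target[1]
    return True


print(is_supported({"aom_version": "0.1.0"}))  # True
print(is_supported({"aom_version": "0.2.0"}))  # False
```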
These mirror the design intent throughout the spec:
forbidden / allowed / open control how agents may use the surface.
Agent identity (agent_id, agent_name) lives in the output and in the request the agent sends to the site when invoking an action. The input (surface) does not contain the agent’s identity; it may declare that when the agent submits a request on this site it must include agent_id (and optionally agent_name) in that request.
Output: agent_id, optional agent_name, optional key_issuer. The agent fills these when producing an output; the same values are sent in the action-invocation request to the site.
Input (surface): calling_agent can identify the agent requesting the surface and/or declare requirements (agent_id_required, agent_name_required). Standard field names in the request are agent_id and agent_name.
Request: agent_id and optionally agent_name (same as in the output). Site owners MAY log them for traceability. Optional today; as the ecosystem matures, surfaces may require them.
AOM describes what users can accomplish (tasks, actions, entities), not the HTML structure.
Why: Agents reason about goals, not CSS selectors.
Domain objects (Product, Order, User) are first-class citizens with schemas, runtime validations, and current values.
Why: Agents operate on structured data, not unstructured text.
Actions declare their inputs, outputs, effects, priorities, and preconditions.
Why: Agents can plan, validate, and execute actions safely.
AOM natively supports automated testing via signals.test_cases and runtime escalation gating via meta.confidence.
Why: Agents require strict validation and human-in-the-loop fallback paths for enterprise reliability.
Supports both single-shot (one action → done) and flow (multi-step workflows).
Why: Different tasks have different execution patterns.
AOM is JSON. Works with any agent framework, LLM, or automation tool.
Why: Interoperability across ecosystems.
Both schemas use JSON Schema Draft 2020-12:
$schema: "https://json-schema.org/draft/2020-12/schema"
Validators: ajv (Node) or jsonschema (Python)
Input/core schema (aom-input-schema.json):
Required: automation_policy, aom_version, surface_id, surface_kind, generated_at, purpose, context, tasks, entities, actions, state, navigation, signals
Optional: generator, calling_agent, a2h
Output schema (aom-output-schema.json):
Required: mode, action (with action_id), meta (with done and confidence)
Optional: agent_id, step_id, thought, result
Both schemas allow additionalProperties: true in specific sections:
context — Add app-specific metadata
state — Add custom flags/values
navigation — Add extra routing info
signals — Add structured error objects
Example:
{
"context": {
"app_name": "MyApp",
"locale": "en-US",
"custom_tenant_id": "acme-corp",
"custom_feature_flags": ["beta_ui", "dark_mode"]
}
}
Entity schemas support arbitrary field types and custom validation rules:
{
"entities": {
"CustomWidget": {
"schema": {
"widget_id": {"type": "string", "required": true},
"config": {"type": "object", "required": false}
},
"current": {
"widget_id": "w123",
"config": {"color": "blue", "size": "large"}
}
}
}
}
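To show how an entity's inline schema relates to its current values, here is a minimal sketch that checks the required and type keys used in the example above. It is not the repo's validator: the function name `check_entity` is hypothetical, only "string" and "object" appear in this document, and the other entries in the type map are assumed to follow JSON type names.

```python
from typing import Any, Dict, List

# "string" and "object" come from the example; the rest are assumptions.
TYPE_MAP = {
    "string": str,
    "object": dict,
    "array": list,
    "boolean": bool,
    "number": (int, float),
}


def check_entity(entity: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in entity['current']."""
    problems: List[str] = []
    schema = entity.get("schema", {})
    current = entity.get("current", {})
    for name, rule in schema.items():
        if name not in current:
            if rule.get("required", False):
                problems.append(f"{name}: required field is missing")
            continue
        expected = TYPE_MAP.get(rule.get("type"))
        if expected is None:
            continue  # unknown type names are skipped in this sketch
        value = current[name]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if is_stray_bool or not isinstance(value, expected):
            problems.append(f"{name}: expected {rule['type']}")
    return problems


widget = {
    "schema": {
        "widget_id": {"type": "string", "required": True},
        "config": {"type": "object", "required": False},
    },
    "current": {"widget_id": "w123", "config": {"color": "blue", "size": "large"}},
}
print(check_entity(widget))  # []
```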
binds_to (agents.json Integration)
Actions can optionally reference external API/tool definitions via the binds_to field:
{
"actions": [
{
"id": "submit_checkout",
"label": "Place order",
"category": "mutation",
"description": "Submit checkout and create order.",
"input_entities": ["CheckoutIntent"],
"output_entities": ["OrderConfirmation"],
"effects": [
"entities.OrderConfirmation.current = shop_api.place(...)",
"state.workflow.step_id = 'order_placed'"
],
"binds_to": {
"type": "agent.workflow_step",
"ref": "place_order_confirm_checkout",
"optional": true
}
}
]
}
When to use:
Your runtime has an external tool registry (e.g., agents.json, MCP tools, OpenAPI specs)
You want agents to call real APIs instead of simulating via effects
When binds_to is present:
Runtime tries to resolve binds_to.ref from external registry
If found → use external tool schema (parameters, authentication, etc.)
If not found AND optional: true → fall back to AOM’s inline effects
If not found AND optional: false → fail with clear error
Schema:
type (string) — Namespace/type of external binding (e.g., “agent.workflow_step”, “mcp.tool”, “openapi.operation”)
ref (string) — External identifier (tool name, operation ID, etc.)
optional (boolean, default false) — Whether binding is required
Default behavior: If binds_to is omitted, runtime executes action using AOM’s effects only.
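The resolution rules above can be sketched as a single function. This is an illustration under stated assumptions, not the runtime's implementation: `resolve_action` and the returned dictionary shape are hypothetical, and the external registry is modelled as a plain dict keyed by binds_to.ref.

```python
from typing import Any, Dict


def resolve_action(action: Dict[str, Any],
                   registry: Dict[str, Any]) -> Dict[str, Any]:
    """Decide whether an action runs via an external tool or inline effects."""
    effects = action.get("effects", [])
    binding = action.get("binds_to")
    if binding is None:
        # Default behavior: no binding, so use the AOM's effects only.
        return {"source": "effects", "effects": effects}
    tool = registry.get(binding["ref"])
    if tool is not None:
        # Found: the external tool schema takes over.
        return {"source": "external", "tool": tool}
    if binding.get("optional", False):
        # Not found but optional: fall back to inline effects.
        return {"source": "effects", "effects": effects}
    raise LookupError(
        f"binds_to.ref {binding['ref']!r} not found and the binding is not optional"
    )


action = {
    "id": "submit_checkout",
    "effects": ["state.workflow.step_id = 'order_placed'"],
    "binds_to": {
        "type": "agent.workflow_step",
        "ref": "place_order_confirm_checkout",
        "optional": True,
    },
}
print(resolve_action(action, registry={})["source"])  # effects
```

A real registry would likely also use binds_to.type to pick the right namespace; that lookup is left out here to keep the fallback logic visible.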
Roadmap: Auto-resolution from common tool registries.
AOM natively supports the industry-standard A2H protocol for safe Human-in-the-Loop (HITL) escalations. This allows the surface to dictate when an agent must pause and ask a human for approval or data.
(aom-core-schema)
The surface defines which actions require human intervention via the a2h_policy object on an action:
{
"actions": [
{
"id": "delete_database",
"label": "Delete Production DB",
"category": "mutation",
"a2h_policy": {
"requires_authorization": true,
"escalation_channel": "in_app"
}
}
]
}
(aom-output-schema)
When the agent realizes it needs to escalate (either due to the surface’s a2h_policy or low internal confidence), it outputs an a2h_intent inside the meta block:
{
"mode": "flow",
"action": { "action_id": "none" },
"meta": {
"done": false,
"confidence": 0.4,
"a2h_intent": {
"type": "AUTHORIZE",
"message": "I am about to delete the production database. Do I have your approval to proceed?"
}
}
}
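The two escalation triggers described above (the surface's a2h_policy and low confidence) could be combined roughly as follows. This is a sketch, not a prescribed algorithm: the function names, the 0.5 threshold, and the choice of ESCALATE for low confidence are all assumptions; the "none" action_id and the meta layout mirror the example above.

```python
from typing import Any, Dict, Optional


def escalation_intent(action: Dict[str, Any], confidence: float,
                      threshold: float = 0.5) -> Optional[Dict[str, str]]:
    """Return an a2h_intent when a human must be involved, else None."""
    policy = action.get("a2h_policy", {})
    label = action.get("label", action["id"])
    if policy.get("requires_authorization"):
        return {"type": "AUTHORIZE",
                "message": f"Approval needed before running {label!r}."}
    if confidence < threshold:
        # Assumption: low confidence hands the step to a human.
        return {"type": "ESCALATE",
                "message": f"Confidence {confidence} is too low to run {label!r}."}
    return None


def build_output(action: Dict[str, Any], confidence: float) -> Dict[str, Any]:
    """Build a flow-mode output, pausing the action when escalation is needed."""
    meta: Dict[str, Any] = {"done": False, "confidence": confidence}
    intent = escalation_intent(action, confidence)
    if intent is not None:
        meta["a2h_intent"] = intent
        return {"mode": "flow", "action": {"action_id": "none"}, "meta": meta}
    return {"mode": "flow", "action": {"action_id": action["id"]}, "meta": meta}


risky = {"id": "delete_database", "label": "Delete Production DB",
         "a2h_policy": {"requires_authorization": True, "escalation_channel": "in_app"}}
print(build_output(risky, confidence=0.9)["meta"]["a2h_intent"]["type"])  # AUTHORIZE
```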
Supported A2H Intents:
INFORM — One-way notification to the user.
COLLECT — Request specific structured data from the user.
AUTHORIZE — Request explicit approval for a high-risk action.
ESCALATE — Hand off the entire workflow to a human support agent.
RESULT — Report final task completion.
Tools are organized by language under tools/ so you can use only Python or only Node. See tools/README.md.
# From repo root (pip install -r tools/python/validate/requirements.txt first)
python tools/python/validate/validate.py spec/v0.1.0/aom-input-schema.json examples/v0.1.0/login-single/login.aom.json
python tools/python/validate/validate_all.py
python tools/python/validate/validate_all.py v0.1.0/ecom-flow
# From repo root (npm install in tools/node/validate first)
node tools/node/validate/validate.js spec/v0.1.0/aom-input-schema.json examples/v0.1.0/login-single/login.aom.json
node tools/node/validate/validate_all.js
node tools/node/validate/validate_all.js v0.1.0/ecom-flow
Golden *.output.json files under each example’s outputs/ folder are generated by the create-outputs tools. From repo root:
python tools/python/create-outputs/create_outputs.py
# or
node tools/node/create-outputs/create_outputs.js
Initial public release (current)
signals.test_cases for automated golden-file testing overrides
tasks[].default_action_id to guide primary agent behavior
actions[].priority and actions[].method for semantic agent planning
entities.*.validation for runtime data quality checks
state.workflow.flow_id for orchestration context tracking
meta.confidence in the output schema for runtime safety gating
error and workflow schemas
Future versions MAY introduce:
Q: Why separate surface and output schemas?
A: Surfaces describe what’s available (input to agent), outputs describe what the agent decided (output from agent). Different lifecycles, different consumers.
Q: Can I use AOM with non-LLM agents?
A: Yes. AOM is JSON. Any system that can parse JSON and make decisions can consume AOM.
Q: Does AOM require specific UI frameworks?
A: No. AOM is framework-agnostic. Generate it from React, Vue, mobile apps, or even server-rendered HTML.
Q: What about authentication/security?
A: AOM surfaces can include state.session.authenticated and state.session.user_id. Authorization logic lives in your runtime, not the schema.
Q: Can AOM represent native mobile screens?
A: Yes. surface_kind supports screens, modals, panels, drawers, widgets. The abstraction works for web and mobile.
(entities.*.schema_org_type)
(context.locale)
Found an issue or have a suggestion?
Schema improvements should maintain backward compatibility when possible.