The Response Envelope

Scope. Beach-specific. The envelope shape and canonical part-type registry are Beach's protocol artefacts. Some concepts (mailbox-style accumulation, part-typed responses) generalise to other agent frameworks, but the wire format and registered part types here are Beach's.

This document describes how a Beach-based agent replies to an external caller, whether that caller is a human chat panel, a peer agent (over A2A or MCP), or a future email correspondent.

The problem

When an agent responds with structured data, the response must serve two kinds of consumer, and only the producer knows what both need:

  1. A peer LLM (e.g. Baxter consulting Sally about flights) needs to reason about the data. It wants natural-language analysis ("the easyJet flight is half the price but departs at 6am") alongside the data, to decide what to tell the user.
  2. A UI needs to display the data visually. It doesn't understand the domain — it needs rendering instructions.
  3. The producing agent knows both — the meaning of the data and how it should be displayed.

Three industry protocols cover most of this:

| Protocol | Purpose | Covers |
| --- | --- | --- |
| MCP | Agent-to-tool | Transport (tool calls) |
| A2A | Agent-to-agent | Transport + structured data exchange |
| A2UI | Agent-to-UI | Declarative visual rendering |

The gap: none of these standards defines how to provide natural-language context to the consuming agent's LLM. A2A treats agents as opaque peers and does not assume the consumer has an LLM that needs the data explained to it. In a Beach-based system, it usually does.

Beach's envelope fills this gap with an llm-context part — a convention, not a standard — sent alongside the A2A data and the A2UI surface.

The envelope is a stream

An envelope is not a fixed three-part structure. It is a stream of typed parts arriving over the course of a turn, with a settlement marker when the actor declares turnState: 'complete'. Streaming transports flush parts as they arrive; buffered transports accumulate until settlement. Most data-bearing envelopes carry the canonical three — domain-data, llm-context, a2ui-surface — but any envelope may include additional parts (artifacts, citations, reasoning traces, approval requests, progress updates).

See ../../packages/protocol/README.md for the full canonical part type list and the per-transport delivery rules.
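
The difference between the two delivery classes can be sketched as follows. This is a minimal illustration, not Beach's implementation: StreamPart, PartSink, StreamingSink and BufferedSink are hypothetical names, and per-part-type drop rules are deliberately ignored so that only the flush-versus-accumulate behaviour is visible.

```typescript
// Hypothetical shapes for illustration only; not Beach's actual API.
type StreamPart = { partType: string; payload: unknown };
type StreamTurnState = "awaiting" | "complete";

interface PartSink {
  // Called once per respond() call with the parts produced by that call.
  accept(parts: StreamPart[], turnState: StreamTurnState): void;
}

// Streaming transport: forwards each part as soon as it arrives.
class StreamingSink implements PartSink {
  readonly sent: string[] = [];
  accept(parts: StreamPart[], turnState: StreamTurnState): void {
    for (const part of parts) this.sent.push(part.partType);
    if (turnState === "complete") this.sent.push("<settled>");
  }
}

// Buffered transport: holds parts and releases one batch at settlement.
class BufferedSink implements PartSink {
  private pending: StreamPart[] = [];
  readonly delivered: string[][] = [];
  accept(parts: StreamPart[], turnState: StreamTurnState): void {
    this.pending.push(...parts);
    if (turnState === "complete") {
      this.delivered.push(this.pending.map((p) => p.partType));
      this.pending = [];
    }
  }
}

const streaming = new StreamingSink();
const buffered = new BufferedSink();
const sinks: PartSink[] = [streaming, buffered];
for (const sink of sinks) {
  sink.accept([{ partType: "progress", payload: "searching" }], "awaiting");
  sink.accept(
    [
      { partType: "domain-data", payload: {} },
      { partType: "a2ui-surface", payload: {} },
    ],
    "complete",
  );
}
```

After the loop, the streaming sink has recorded each part in arrival order followed by the settlement marker, while the buffered sink has recorded a single batch containing all three parts.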

The canonical three

A response carrying structured data typically includes three parts, assembled as A2A Message Parts with Beach-extended partType metadata:

Part 1 — Structured data (partType: domain-data)

The raw domain data, unchanged. Peer agents read it programmatically; UIs bind to it by JSON Pointer path.

{
  "data": {
    "route": { "origin": "London Gatwick", "destination": "Corfu" },
    "flights": [
      {
        "flightNumber": "BA 2043",
        "airline": "British Airways",
        "departure": "2026-08-15T08:45:00",
        "arrival": "2026-08-15T14:20:00",
        "pricePerPerson": 187,
        "currency": "GBP",
        "stops": 0
      }
    ]
  },
  "metadata": { "partType": "domain-data" }
}
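
To make "bind by JSON Pointer path" concrete, here is a small resolver that follows RFC 6901. It is a sketch for illustration: resolvePointer is a hypothetical helper, not part of Beach or A2UI, and a real renderer may treat paths differently (for example, how it addresses the root).

```typescript
// Resolve an RFC 6901 JSON Pointer against a parsed JSON value.
// Returns undefined when any segment is missing.
function resolvePointer(doc: unknown, pointer: string): unknown {
  if (pointer === "") return doc; // the empty pointer addresses the whole document
  if (pointer.charAt(0) !== "/") throw new Error(`Invalid JSON Pointer: ${pointer}`);
  let current: unknown = doc;
  for (const raw of pointer.slice(1).split("/")) {
    // Unescape per RFC 6901: "~1" becomes "/", then "~0" becomes "~".
    const token = raw.replace(/~1/g, "/").replace(/~0/g, "~");
    if (Array.isArray(current)) {
      // Array segments must be non-negative integers without leading zeros.
      if (!/^(0|[1-9]\d*)$/.test(token)) return undefined;
      current = current[Number(token)];
    } else if (typeof current === "object" && current !== null) {
      current = (current as Record<string, unknown>)[token];
    } else {
      return undefined;
    }
  }
  return current;
}

// The domain data from the example above, abbreviated.
const flightData = {
  route: { origin: "London Gatwick", destination: "Corfu" },
  flights: [{ flightNumber: "BA 2043", airline: "British Airways", pricePerPerson: 187 }],
};

const boundPrice = resolvePointer(flightData, "/flights/0/pricePerPerson"); // 187
const boundOrigin = resolvePointer(flightData, "/route/origin"); // "London Gatwick"
const missingBinding = resolvePointer(flightData, "/flights/3/airline"); // undefined
```

A UI component bound to /flights/0/pricePerPerson never needs to know that the value is a fare; it only needs the path and a primitive to render it with.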

Part 2 — LLM context (partType: llm-context)

Natural-language analysis from the producing agent, intended for a peer LLM to consume. Typically analytical and dense — the peer LLM has the data, so it does not need re-statement. It needs understanding.

{
  "text": "Two direct flights found, 15 August 2026 LGW→CFU. Cheapest is easyJet EJ4521 at £94 per person (06:15–12:00). More convenient is British Airways BA2043 at £187 per person (08:45–14:20) — nearly double the price but a civilised departure. For 6 passengers, the cost delta is £558.",
  "metadata": { "partType": "llm-context" }
}

How this part is produced: the primary actor (Sally, for example) writes her natural response once, tagged as a response part. If the envelope's destination includes a peer LLM, @cool-ai/beach-session's collector invokes a fast secondary model (typically Haiku) to translate the conversational response into analytical prose. The translation is cached per turn and only produced when a peer-LLM consumer is present. For local-UI-only turns, the step is skipped.
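
The "translate once per turn, and only when needed" behaviour might look like the sketch below. Everything here is assumed for illustration: LlmContextDeriver and TranslateFn are hypothetical names, and the translator is injected as a plain function in place of a real model call. The collector in @cool-ai/beach-session may be structured quite differently.

```typescript
// Hypothetical translator signature; stands in for a call to a fast secondary model.
type TranslateFn = (conversational: string) => Promise<string>;

class LlmContextDeriver {
  // One cached promise per turn, so repeated requests share a single model call.
  private readonly cache = new Map<string, Promise<string>>();
  private readonly translate: TranslateFn;

  constructor(translate: TranslateFn) {
    this.translate = translate;
  }

  derive(turnId: string, response: string, hasPeerLlmConsumer: boolean): Promise<string> | undefined {
    if (!hasPeerLlmConsumer) return undefined; // local-UI-only turn: skip the step
    let pending = this.cache.get(turnId);
    if (pending === undefined) {
      pending = this.translate(response);
      this.cache.set(turnId, pending);
    }
    return pending;
  }
}

let modelCalls = 0;
const deriver = new LlmContextDeriver(async (text) => {
  modelCalls += 1;
  return `ANALYSIS: ${text}`;
});

const firstRequest = deriver.derive("turn_xyz789", "I found two flights for you.", true);
const secondRequest = deriver.derive("turn_xyz789", "I found two flights for you.", true);
const localOnly = deriver.derive("turn_local", "Hello!", false);
```

Caching the promise rather than the resolved string means two consumers asking for the same turn's context concurrently still trigger only one translation.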

Part 3 — UI surface (partType: a2ui-surface)

A2UI createSurface / updateDataModel / updateComponents messages. Declarative rendering instructions using the A2UI Basic Catalog primitives (Text, Card, Row, Column, List, Button, etc.), with data bindings referencing the domain data by JSON Pointer path.

{
  "data": {
    "version": "v0.9",
    "createSurface": {
      "surfaceId": "flight-results",
      "catalogId": "https://a2ui.org/specification/v0_9/basic_catalog.json",
      "sendDataModel": true
    },
    "updateDataModel": { "surfaceId": "flight-results", "path": "/", "value": { "...": "..." } },
    "updateComponents": { "surfaceId": "flight-results", "components": [ "..." ] }
  },
  "metadata": { "partType": "a2ui-surface" }
}

The UI that renders this surface does not need to understand flights, hotels, or cruises. It renders Text, Card, Row, List, and Button primitives from its catalog. The domain is carried in the data binding paths; the primitives are universal.

Full envelope shape

{
  "role": "agent",
  "parts": [
    { "data": { ... },               "metadata": { "partType": "domain-data" } },
    { "text": "...",                 "metadata": { "partType": "llm-context" } },
    { "data": { "version": "v0.9", "createSurface": {...}, "updateDataModel": {...}, "updateComponents": {...} },
                                     "metadata": { "partType": "a2ui-surface" } }
  ],
  "meta": {
    "sessionId": "sess_abc123",
    "turnId": "turn_xyz789",
    "producedAt": "2026-04-20T09:50:00Z",
    "finalizedBy": "complete"
  }
}

The meta block is Beach-specific and non-standard. Peer agents that only understand A2A v1.0 ignore it. Beach consumers use it for correlation and audit.
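
On the consuming side, the envelope shape above could be typed and queried roughly as follows. The type and function names (WireEnvelope, WirePart, findPart, toPlainA2aMessage) are hypothetical and only mirror the fields shown in this document; they are not taken from Beach's packages or from the A2A specification.

```typescript
type CanonicalPartType = "domain-data" | "llm-context" | "a2ui-surface";

interface WirePart {
  text?: string;
  data?: unknown;
  metadata?: { partType?: string };
}

interface WireEnvelope {
  role: "agent";
  parts: WirePart[];
  // Beach-specific; optional because plain A2A peers neither send nor read it.
  meta?: { sessionId: string; turnId: string; producedAt: string; finalizedBy: string };
}

// Return the first part carrying the requested partType, if any.
function findPart(envelope: WireEnvelope, partType: CanonicalPartType): WirePart | undefined {
  for (const part of envelope.parts) {
    if (part.metadata !== undefined && part.metadata.partType === partType) return part;
  }
  return undefined;
}

// What a consumer that ignores the Beach extension keeps: role and parts, with meta dropped.
function toPlainA2aMessage(envelope: WireEnvelope): { role: "agent"; parts: WirePart[] } {
  return { role: envelope.role, parts: envelope.parts };
}

const received: WireEnvelope = JSON.parse(`{
  "role": "agent",
  "parts": [
    { "data": { "flights": [] }, "metadata": { "partType": "domain-data" } },
    { "text": "Two direct flights found.", "metadata": { "partType": "llm-context" } }
  ],
  "meta": { "sessionId": "sess_abc123", "turnId": "turn_xyz789",
            "producedAt": "2026-04-20T09:50:00Z", "finalizedBy": "complete" }
}`);

const contextPart = findPart(received, "llm-context");
const surfacePart = findPart(received, "a2ui-surface"); // undefined: this envelope has no surface
const plainMessage = toPlainA2aMessage(received);
```

Looking parts up by partType rather than by array position keeps the consumer tolerant of envelopes that omit a canonical part or add extra ones.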

Assembly and delivery

The envelope is progressively assembled across a turn. The session manager maintains a live envelope state and emits part deltas to streaming transports as they arrive, while holding state for buffered transports until settlement.

Sequence for a typical turn:

  1. Actor calls respond({ parts: [ack], turnState: 'awaiting' }). Session emits ack to streaming transports immediately; buffered transports drop it.
  2. Tool result arrives as injection. Session re-invokes actor.
  3. Actor calls respond({ parts: [thinking], turnState: 'awaiting' }). Streaming emits thinking; buffered drops.
  4. Subsequent tool results land. Session tracks accumulating data-bearing events in the mailbox.
  5. Actor eventually calls respond({ parts: [response, domain-data, a2ui-surface], turnState: 'complete' }).
  6. The envelope builder:
    • Collects all data-bearing events from the mailbox into a single domain-data object.
    • Derives llm-context via Haiku translator — but only if at least one originator's Agent Card declares consumes: ["llm-context"]. Otherwise skipped.
    • Builds A2UI surface(s) using consumer-registered surface templates matching the data kind.
    • Streams final parts to streaming transports; sends assembled envelope to buffered transports.

Each part type declares per-transport delivery semantics via the PartTypeRegistry. See ../../packages/protocol/README.md for the delivery rule table.
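
Step 6 of the sequence above can be sketched as a single function. This is an illustration under stated assumptions, not Beach's envelope builder: all names are hypothetical, the translator is synchronous to keep the example short, and merging mailbox events by keying them on their kind is an assumed strategy that the document does not specify.

```typescript
type BuiltPart =
  | { partType: "domain-data"; data: Record<string, unknown> }
  | { partType: "llm-context"; text: string }
  | { partType: "a2ui-surface"; data: unknown };

interface MailboxEvent { kind: string; data: Record<string, unknown> }
interface OriginatorCard { consumes: string[] }

interface BuilderDeps {
  // Stand-in for the secondary-model translation.
  translate: (response: string) => string;
  // Consumer-registered surface templates, keyed by data kind.
  surfaceTemplates: Record<string, (data: Record<string, unknown>) => unknown>;
}

function buildEnvelopeParts(
  response: string,
  mailbox: MailboxEvent[],
  originators: OriginatorCard[],
  deps: BuilderDeps,
): BuiltPart[] {
  const parts: BuiltPart[] = [];

  // Collapse every data-bearing event into one domain-data object (assumed: keyed by kind).
  const domainData: Record<string, unknown> = {};
  for (const event of mailbox) domainData[event.kind] = event.data;
  parts.push({ partType: "domain-data", data: domainData });

  // Derive llm-context only when at least one originator declares that it consumes it.
  const wantsContext = originators.some((card) => card.consumes.indexOf("llm-context") !== -1);
  if (wantsContext) parts.push({ partType: "llm-context", text: deps.translate(response) });

  // Build one surface per data kind that has a registered template.
  for (const event of mailbox) {
    const template = deps.surfaceTemplates[event.kind];
    if (template !== undefined) parts.push({ partType: "a2ui-surface", data: template(event.data) });
  }
  return parts;
}

const sketchDeps: BuilderDeps = {
  translate: (text) => `ANALYSIS: ${text}`,
  surfaceTemplates: { flights: (data) => ({ surfaceId: "flight-results", value: data }) },
};
const sketchMailbox: MailboxEvent[] = [
  { kind: "flights", data: { count: 2 } },
  { kind: "weather", data: { summary: "sunny" } }, // no template registered for this kind
];

const forPeerLlm = buildEnvelopeParts("Found two flights.", sketchMailbox, [{ consumes: ["llm-context"] }], sketchDeps)
  .map((p) => p.partType);
const forLocalUi = buildEnvelopeParts("Found two flights.", sketchMailbox, [{ consumes: [] }], sketchDeps)
  .map((p) => p.partType);
```

With a peer that consumes llm-context the result holds domain-data, llm-context and one a2ui-surface; for the local-UI-only case the llm-context part is absent.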

Transport independence

The envelope is the contract. The transport is an implementation detail.

| Transport | Delivery class | How parts travel |
| --- | --- | --- |
| A2A HTTP (streaming) | Streaming | Parts flushed as they arrive; A2A task-stream model. Extended partTypes in part metadata. |
| A2A HTTP (non-streaming) | Buffered | Single A2A Message at settlement, all parts in the array. |
| MCP | Buffered | Structured tool response at settlement; the assembled envelope is the tool result's content. |
| SSE (local UI) | Streaming | Each part flushed as a successive SSE event. |
| WebSocket | Streaming | Each part as a WebSocket frame. |
| Email / WhatsApp / SMS | Buffered | domain-data, a2ui-surface, reasoning-trace typically dropped or attached; response rendered to plain text; artifact becomes an attachment (email) or URL reference (SMS). |
| Webhook reply | Buffered | POST response body carries final envelope. |
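
As an illustration of the Email / WhatsApp / SMS row, a text-only channel adapter might downgrade an assembled envelope like this. The names (TextChannelPart, renderForEmail) and the part payload fields are hypothetical, and the sketch takes the "dropped" option rather than "attached" for the data-bearing parts. Part types it does not recognise are silently ignored.

```typescript
// Hypothetical, simplified part shape for a text-only channel.
type TextChannelPart = { partType: string; text?: string; name?: string };

interface PlainTextDelivery {
  body: string;          // response parts rendered as plain text
  attachments: string[]; // artifact names to attach
  dropped: string[];     // part types this channel cannot carry
}

const DROPPED_ON_TEXT_CHANNELS = ["domain-data", "a2ui-surface", "reasoning-trace"];

function renderForEmail(parts: TextChannelPart[]): PlainTextDelivery {
  const paragraphs: string[] = [];
  const attachments: string[] = [];
  const dropped: string[] = [];
  for (const part of parts) {
    if (part.partType === "response" && part.text !== undefined) {
      paragraphs.push(part.text);
    } else if (part.partType === "artifact" && part.name !== undefined) {
      attachments.push(part.name);
    } else if (DROPPED_ON_TEXT_CHANNELS.indexOf(part.partType) !== -1) {
      dropped.push(part.partType);
    }
  }
  return { body: paragraphs.join("\n\n"), attachments, dropped };
}

const emailDelivery = renderForEmail([
  { partType: "response", text: "I found two direct flights to Corfu." },
  { partType: "domain-data" },
  { partType: "a2ui-surface" },
  { partType: "artifact", name: "itinerary.pdf" },
]);
```

The actor that produced these parts is unchanged; only the adapter at the edge decides what a given channel can carry.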

Partial envelopes

Not every response carries all three canonical parts.

  • A conversational turn with no data: only a response part, with no data-bearing envelope. The text goes to the chat channel over SSE; peer agents receive an A2A Message containing only the response part.
  • A clarification: a clarify part with turnState: 'clarifying'; no envelope is built.
  • Data without a UI surface: if no surface template exists for the data kind, the a2ui-surface part is omitted. The peer receives domain-data and llm-context only; the local UI falls back to a generic rendering.
  • Local-UI-only turn: llm-context is skipped because no peer LLM is consuming it.
  • Suspended turn: the envelope pauses at an approval-request part; there is no settlement until an approval-response injection arrives.
  • Progress-only turn: long-running actors can emit progress parts during the awaiting state without ever producing domain-data, which is useful for status-only reports.
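
A consumer therefore has to cope with whichever parts are present. The cases above can be sketched as a small decision function for a local UI. The function name, the rendering labels and the order of the checks are assumptions made for illustration.

```typescript
type Rendering = "awaiting-approval" | "clarification" | "surface" | "generic-data" | "text-only";

// Decide how a local UI might present a turn, given the part types it received.
function chooseRendering(partTypes: string[]): Rendering {
  const has = (t: string): boolean => partTypes.indexOf(t) !== -1;
  if (has("approval-request")) return "awaiting-approval"; // suspended turn
  if (has("clarify")) return "clarification";
  if (has("a2ui-surface")) return "surface";
  if (has("domain-data")) return "generic-data"; // no surface template existed: generic fallback
  return "text-only"; // conversational or progress-only turn
}

const fullEnvelope = chooseRendering(["domain-data", "llm-context", "a2ui-surface"]);
const noSurface = chooseRendering(["domain-data", "llm-context"]);
const chatOnly = chooseRendering(["response"]);
const suspended = chooseRendering(["response", "approval-request"]);
```

Here fullEnvelope is "surface", noSurface is "generic-data", chatOnly is "text-only" and suspended is "awaiting-approval".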

Related

  • respond-tool.md — how actors produce the parts in the first place.
  • agent-card.md — how an agent declares which envelope parts it supports.
  • ../../packages/a2ui/ — the A2UI package (builder + renderer).
  • ../../packages/protocol/ — the envelope builder + A2A transport types.
  • ../design-principles.md — principle 2.1 ("the router is the boundary") and principle 2.7 ("channel-agnostic actors").