The respond() Tool
Scope. Beach-specific.
respond() is Beach's enforcement of the "LLMs never emit free text" principle. Other frameworks have analogous patterns (Anthropic's tool-use, OpenAI's structured outputs), but the specific schema, the turn-state vocabulary, and the integration with Beach's SessionTurnManager are Beach's own.
The single way an LLM actor produces output in a Beach-based system.
Why a tool, not prose
A Beach actor's output must be read by multiple consumers:
- A browser chat UI showing an immediate acknowledgement.
- A session turn manager deciding whether the turn is settled.
- A collector assembling the three-part response envelope.
- A peer LLM that will reason about the response.
- An SSE stream delivering progress signals.
- An email or WhatsApp adaptor delivering the final text.
Each consumer needs structured information. If the LLM emits free-form prose with bracket tags like [ACK] ... [RESPONSE] ..., every consumer ends up parsing the same text by regex, and the LLM can accidentally emit unparseable output (a missing tag, an extra newline, a hallucinated variant).
The answer is strict structured output. Anthropic tool use validates JSON input against a schema. We define one tool — respond() — that every actor calls to produce every response. There is no other way to emit output.
Tool schema
{
"name": "respond",
"description": "Emit a structured response. This is the ONLY way you produce output. Always call this tool to produce any user-facing or peer-facing text. Never emit free-form text outside this tool. Pass all your output as parts, and set turnState to indicate whether you are done or still working.",
"input_schema": {
"type": "object",
"required": ["parts", "turnState"],
"properties": {
"parts": {
"type": "array",
"description": "One or more A2A-style message parts. Each part carries text or data plus a partType metadata tag.",
"minItems": 1,
"items": {
"type": "object",
"required": ["metadata"],
"properties": {
"text": { "type": "string" },
"data": { "type": "object" },
"metadata": {
"type": "object",
"required": ["partType"],
"properties": {
"partType": {
"type": "string",
"description": "Registered part type. Validated against @cool-ai/beach-core's PartTypeRegistry (canonical core set plus consumer-registered extensions). See the canonical set below."
}
}
}
}
}
},
"turnState": {
"type": "string",
"description": "Declare the lifecycle state of this turn. Validated against @cool-ai/beach-core's TurnStateRegistry (canonical core set plus consumer-registered extensions). Canonical values: 'awaiting' (more respond() calls or injections expected), 'complete' (turn settled, envelope emitted if data-bearing), 'clarifying' (terminal; question to user, no envelope), 'error' (terminal failure), 'suspended' (HITL approval pending), 'delegated' (peer-agent call in flight), 'passed' (handed off to another actor — see passTo)."
},
"passTo": {
"type": "string",
"description": "Optional. When present and turnState is 'passed', names the actor to which this turn is handed off. The session manager begins the next invocation with the named actor, inheriting the mailbox. The receiving actor's tool scope applies. Only valid with turnState: 'passed'."
},
"note": {
"type": "string",
"description": "Optional free-form note for logs. Not delivered to any consumer."
}
}
}
}
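The structural rules this schema encodes can be sketched as a small validator. This is a hand-rolled illustration, not Beach's implementation — in practice the model provider's tool-use validation plus the part-type and turn-state registries enforce these rules:

```typescript
// Hypothetical sketch: the structural checks implied by the respond() schema.
// The Part and RespondInput shapes mirror the schema above.
interface Part {
  text?: string;
  data?: Record<string, unknown>;
  metadata: { partType: string };
}

interface RespondInput {
  parts: Part[];
  turnState: string;
  passTo?: string;
  note?: string;
}

function validateRespondInput(
  input: RespondInput,
  knownPartTypes: Set<string>,   // canonical ∪ consumer-registered
  knownTurnStates: Set<string>,  // canonical ∪ consumer-registered
): string[] {
  const errors: string[] = [];
  if (!Array.isArray(input.parts) || input.parts.length === 0) {
    errors.push("parts must be a non-empty array");
  } else {
    for (const [i, part] of input.parts.entries()) {
      if (!part.metadata?.partType) {
        errors.push(`parts[${i}] is missing metadata.partType`);
      } else if (!knownPartTypes.has(part.metadata.partType)) {
        errors.push(`parts[${i}] has unregistered partType '${part.metadata.partType}'`);
      }
    }
  }
  if (!knownTurnStates.has(input.turnState)) {
    errors.push(`unregistered turnState '${input.turnState}'`);
  }
  if (input.passTo !== undefined && input.turnState !== "passed") {
    errors.push("passTo is only valid with turnState 'passed'");
  }
  return errors;
}
```

An empty error list means the call is structurally acceptable; anything else is rejected before the parts reach any consumer.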
Part types
The partType on each part determines how the session manager and envelope collector treat it.
Conversational signals
| partType | Purpose | Delivered to |
|---|---|---|
| ack | Immediate acknowledgement. Specific — states what was understood and what the actor is doing. | SSE to the originating chat panel; stripped from async-channel envelopes. |
| thinking | Optional progress update during long tool-use loops. | SSE only. Stripped from async deliveries and envelopes. |
| response | The substantive answer. | Rendered in the chat panel. Becomes the response text in the envelope. |
| clarify | The actor cannot proceed and needs a question answered. Replaces response when used. | Rendered in the chat panel. Terminal for this turn; no envelope is built. |
| error | A failure the user needs to know about. | Rendered in the chat panel; logged; may be re-emitted through retry policies. |
Envelope parts (for peer agents and UIs)
When an actor is responding to a peer agent (via MCP or A2A) or to a UI with structured content, the same respond() call carries the envelope parts:
| partType | Purpose |
|---|---|
| domain-data | Structured domain data (A2A data part). The raw answer the peer can read programmatically. |
| llm-context | Natural-language context for a peer LLM to reason about the data. Typically generated by a Haiku translator post-hoc, but an actor may emit it directly. |
| a2ui-surface | A2UI createSurface / updateDataModel / updateComponents messages for a peer UI to render. |
| artifact | Reference to a binary asset (file, image, audio) in the ArtifactStore. Carries artifactId, mimeType, sizeBytes. |
| reasoning-trace | Captured model-native reasoning (Claude thinking blocks etc.). Persistence configurable. |
| citation | Source reference bound to a path in domain-data — where did this data come from? |
| approval-request | Emitted when a tool call marked requiresApproval is intercepted. Carries tool name, arguments, actor, turn. |
| approval-response | Inbound only — the human's or policy's decision, resuming the suspended turn. |
| progress | Structured progress update for long-running operations (e.g. research sections completing one by one). |
See envelope.md for the full envelope spec and per-transport delivery rules.
Consumer-registered part types
The partType enum is open. Consumers register additional types at startup via @cool-ai/beach-core's PartTypeRegistry:
// Conceptual — consumer code.
partTypes.register({
id: 'audio',
isCanonical: false,
meta: { streamingPreferred: true, buffered: 'dropped', serialisation: 'base64' }
});
The respond() tool schema validates partType against the registered set (canonical ∪ consumer-registered). A missing registration causes validation failure at the respond() call.
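What such an open registry needs to do is small. The sketch below is illustrative only — the real PartTypeRegistry interface in @cool-ai/beach-core may differ in shape and naming:

```typescript
// Hypothetical sketch of an open part-type registry.
// The actual @cool-ai/beach-core PartTypeRegistry API may differ.
interface PartTypeEntry {
  id: string;
  isCanonical: boolean;
  meta?: Record<string, unknown>;
}

class PartTypeRegistry {
  private entries = new Map<string, PartTypeEntry>();

  constructor(canonical: string[]) {
    // Seed with the canonical core set.
    for (const id of canonical) {
      this.entries.set(id, { id, isCanonical: true });
    }
  }

  register(entry: PartTypeEntry): void {
    // Consumer registrations happen at startup; collisions are a config error.
    if (this.entries.has(entry.id)) {
      throw new Error(`partType '${entry.id}' is already registered`);
    }
    this.entries.set(entry.id, entry);
  }

  has(id: string): boolean {
    return this.entries.has(id);
  }

  ids(): string[] {
    return [...this.entries.keys()];
  }
}
```

The `has()` check is what the respond() schema validation consults: a part whose partType is absent from the registry fails the call.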
Turn states
| turnState | Meaning | Envelope? | SSE? |
|---|---|---|---|
| awaiting | The actor expects more signals or injections. Conversational parts stream to the chat channel as ack/thinking/partial-response. | Partial (streaming) | Yes |
| complete | The turn is settled. The envelope is built from the mailbox's data-bearing events plus the current parts. | Yes | Yes |
| clarifying | The actor needs user input before continuing. No envelope. | No | Yes (the clarify part) |
| error | Terminal failure for this turn. | No | Yes (the error part) |
| suspended | HITL approval pending. The tool call is intercepted; an approval-request part is emitted; the turn resumes on an approval-response injection. | No (until resumed) | Yes (the approval-request part) |
| delegated | Peer-agent call in flight. The session's timeout is the peer's turn budget, not this agent's. | Partial (streaming of the peer's envelope as it arrives) | Yes |
| passed | Turn handed off to another local actor via the passTo field. The named actor inherits the mailbox. Each actor's tool scope still applies. Requires passTo to be set. | No (the receiving actor may settle with its own terminal state) | Yes |
Turn states are an open registry (@cool-ai/beach-core's TurnStateRegistry). Consumer-registered states declare isTerminal, emitsEnvelope, and holdsActor metadata.
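Read against the table above, the canonical states plausibly carry metadata like the following. These flag values are inferred from the table, not taken from @cool-ai/beach-core, so treat them as an illustration of the registry's metadata shape:

```typescript
// Hypothetical sketch: canonical turn states annotated with the registry
// metadata named above. The isTerminal/emitsEnvelope/holdsActor values
// are inferred from the table in this document, not from the real registry.
interface TurnStateEntry {
  id: string;
  isTerminal: boolean;     // settles the current actor's turn
  emitsEnvelope: boolean;  // settling triggers envelope assembly
  holdsActor: boolean;     // the same actor stays active (more calls or a resume)
}

const canonicalTurnStates: TurnStateEntry[] = [
  { id: "awaiting",   isTerminal: false, emitsEnvelope: false, holdsActor: true  },
  { id: "complete",   isTerminal: true,  emitsEnvelope: true,  holdsActor: false },
  { id: "clarifying", isTerminal: true,  emitsEnvelope: false, holdsActor: false },
  { id: "error",      isTerminal: true,  emitsEnvelope: false, holdsActor: false },
  { id: "suspended",  isTerminal: false, emitsEnvelope: false, holdsActor: true  },
  { id: "delegated",  isTerminal: false, emitsEnvelope: false, holdsActor: true  },
  { id: "passed",     isTerminal: true,  emitsEnvelope: false, holdsActor: false },
];
```

A consumer-registered state would supply the same three flags, letting the session manager treat it uniformly without special-casing its id.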
passTo — multi-actor handoff
{
"parts": [{ "text": "Handing off to the drafter.", "metadata": { "partType": "thinking" } }],
"turnState": "passed",
"passTo": "drafter"
}
Valid only with turnState: 'passed'. The session manager treats this as terminal for the current actor's turn and begins a new turn with the named actor, carrying the accumulated mailbox forward. The router still mediates — there is no direct LLM-to-LLM call.
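The handoff decision the session manager makes can be sketched as a pure function. This is illustrative only — the actual SessionTurnManager internals are not specified here, and `nextActor` is a hypothetical name:

```typescript
// Hypothetical sketch of how a session manager might act on turnState 'passed'.
// Not the SessionTurnManager API; names are illustrative.
interface RespondCall {
  turnState: string;
  passTo?: string;
}

function nextActor(
  current: string,
  call: RespondCall,
  knownActors: Set<string>,
): string {
  if (call.turnState !== "passed") return current;
  if (!call.passTo) {
    throw new Error("turnState 'passed' requires passTo");
  }
  if (!knownActors.has(call.passTo)) {
    throw new Error(`unknown actor '${call.passTo}'`);
  }
  // The named actor inherits the accumulated mailbox;
  // its own tool scope applies on the next invocation.
  return call.passTo;
}
```

Because the router mediates the handoff, the receiving actor sees exactly what the mailbox holds — never a direct LLM-to-LLM message.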
Multi-call pattern
An actor may call respond() several times during a single tool-use loop:
turn 1:
respond({ parts: [{ text: "Looking up flights.", metadata: { partType: "ack" } }],
turnState: "awaiting" })
→ tool_use: search_flights(...)
turn 2 (after tool result):
respond({ parts: [{ text: "Filtering for direct options.", metadata: { partType: "thinking" } }],
turnState: "awaiting" })
→ tool_use: filter_flights(...)
turn 3 (after tool result):
respond({ parts: [{ text: "Two direct options. EasyJet £94pp at 06:15; BA £187pp at 08:45.",
metadata: { partType: "response" } }],
turnState: "complete" })
Each respond() call is an atomic state update. ack and thinking parts are streamed to SSE as they arrive — consumers see progressive output without needing to parse partial JSON mid-stream. The final turnState: "complete" settles the turn and triggers envelope emission if the mailbox contains data-bearing events.
Parser behaviour
The parser in @cool-ai/beach-llm transforms each respond() tool call into a set of events:
- Each part becomes a partReceived signal event carrying the part itself and its partType.
- The turnState is carried as metadata on every partReceived event and emitted separately as turnStateChanged.
- The part is dispatched based on its registered delivery rules. Streaming-preferred parts (ack, thinking, progress) flush immediately to streaming transports. Envelope parts accumulate for assembly.
- When turnState: 'complete' arrives, the envelope is finalised and fanned out to buffered transports.
- When turnState: 'suspended' arrives with an approval-request part, @cool-ai/beach-llm pauses tool execution until an approval-response injection arrives.
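The dispatch step above reduces to splitting parts by their delivery preference and noting whether the turn settled. A minimal sketch, assuming a fixed streaming-preferred set — event and field names here are illustrative, not the @cool-ai/beach-llm API:

```typescript
// Hypothetical sketch of the per-call dispatch step described above.
type Part = { text?: string; data?: unknown; metadata: { partType: string } };

// Parts flushed immediately to streaming transports (per their delivery rules).
const STREAMING_PREFERRED = new Set(["ack", "thinking", "progress"]);

interface Dispatch {
  streamed: Part[];  // flushed to SSE-style transports as they arrive
  buffered: Part[];  // accumulated for envelope assembly
  finalise: boolean; // true when turnState 'complete' settles the turn
}

function dispatchRespondCall(parts: Part[], turnState: string): Dispatch {
  const streamed: Part[] = [];
  const buffered: Part[] = [];
  for (const part of parts) {
    if (STREAMING_PREFERRED.has(part.metadata.partType)) {
      streamed.push(part);
    } else {
      buffered.push(part);
    }
  }
  return { streamed, buffered, finalise: turnState === "complete" };
}
```

In the real parser each part also raises a partReceived event and the dispatch consults the registry's per-type delivery metadata rather than a hard-coded set; this sketch only shows the split.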
System-prompt snippet
@cool-ai/beach-llm exports a snippet that consumers include in every actor's system prompt:
You produce output only by calling the respond() tool. Never write free-form text. Every response you produce — acknowledgements, thinking updates, the substantive answer, clarifying questions, and errors — is passed as parts to respond(), each tagged with the appropriate partType.

When you are still working and expect to produce more, call respond() with turnState: "awaiting". When your turn is settled, call respond() with turnState: "complete". If you need a clarification from the user before you can proceed, call respond() with turnState: "clarifying" and a clarify part containing a single clear question.

Do not wait until the end of your turn to call respond(). Call it as soon as you have an acknowledgement, a thinking update, or a partial answer — the user sees these in real time.
Common shapes
Simple response, one call
{
"parts": [
{ "text": "Your tasks for today: T12, T15, T18.", "metadata": { "partType": "response" } }
],
"turnState": "complete"
}
Progressive response across three calls
{ "parts": [{ "text": "Searching flights.", "metadata": { "partType": "ack" } }], "turnState": "awaiting" }
{ "parts": [{ "text": "Filtering direct options.", "metadata": { "partType": "thinking" } }], "turnState": "awaiting" }
{ "parts": [{ "text": "Two direct options found.", "metadata": { "partType": "response" } }], "turnState": "complete" }
Clarification
{
"parts": [
{ "text": "Did you mean the flight from Gatwick or Heathrow?", "metadata": { "partType": "clarify" } }
],
"turnState": "clarifying"
}
Peer-agent response with envelope
{
"parts": [
{ "text": "Found 2 direct flights.", "metadata": { "partType": "response" } },
{ "data": { "flights": [ ... ] }, "metadata": { "partType": "domain-data" } },
{ "text": "EasyJet EJ4521 is half the price but departs at 06:15 — the user may prefer BA at 08:45 for comfort.", "metadata": { "partType": "llm-context" } },
{ "data": { "version": "v0.9", "createSurface": { ... } }, "metadata": { "partType": "a2ui-surface" } }
],
"turnState": "complete"
}
Error
{
"parts": [
{ "text": "The flight search service is unreachable. I cannot find options right now.", "metadata": { "partType": "error" } }
],
"turnState": "error"
}
Related
- envelope.md — how envelope parts are assembled into an A2A response
- agent-card.md — how agents declare their capabilities, including supported partTypes
- ../design-principles.md — principle 2.3 ("LLMs never emit free text") and principle 2.8 ("explicit turn completion")