The respond() Tool
Scope. Beach-specific.
respond() is Beach's enforcement of the "LLMs never emit free text" principle. Other frameworks have analogous patterns (Anthropic's tool-use, OpenAI's structured outputs), but the specific schema, the turn-state vocabulary, and the integration with Beach's SessionTurnManager are Beach's own.
The single way an LLM actor produces output in a Beach-based system.
Why a tool, not prose
A Beach actor's output must be read by multiple consumers:
- A browser chat UI showing an immediate acknowledgement.
- A session turn manager deciding whether the turn is settled.
- A collector assembling the three-part response envelope.
- A peer LLM that will reason about the response.
- An SSE stream delivering progress signals.
- An email or WhatsApp adaptor delivering the final text.
Each consumer needs structured information. If the LLM emits free-form prose with bracket tags like [ACK] ... [RESPONSE] ..., every consumer ends up parsing the same text by regex, and the LLM can accidentally emit unparseable output (a missing tag, an extra newline, a hallucinated variant).
The answer is strict structured output. Anthropic tool use validates JSON input against a schema. We define one tool — respond() — that every actor calls to produce every response. There is no other way to emit output.
Tool schema
{
"name": "respond",
"description": "Emit a structured response. This is the ONLY way you produce output. Always call this tool to produce any user-facing or peer-facing text. Never emit free-form text outside this tool. Pass all your output as parts, and set turnState to indicate whether you are done or still working.",
"input_schema": {
"type": "object",
"required": ["parts", "turnState"],
"properties": {
"parts": {
"type": "array",
"description": "One or more A2A-style message parts. Each part carries text or data plus a partType metadata tag.",
"minItems": 1,
"items": {
"type": "object",
"required": ["metadata"],
"properties": {
"text": { "type": "string" },
"data": { "type": "object" },
"metadata": {
"type": "object",
"required": ["partType"],
"properties": {
"partType": {
"type": "string",
"description": "Registered part type. Validated against @cool-ai/beach-core's PartTypeRegistry (canonical core set plus consumer-registered extensions). See the canonical set below."
}
}
}
}
}
},
"turnState": {
"type": "string",
"description": "Declare the lifecycle state of this turn. Validated against @cool-ai/beach-core's TurnStateRegistry (canonical core set plus consumer-registered extensions). Canonical values: 'awaiting' (more respond() calls or injections expected), 'complete' (turn settled, envelope emitted if data-bearing), 'clarifying' (terminal; question to user, no envelope), 'error' (terminal failure), 'suspended' (HITL approval pending), 'delegated' (peer-agent call in flight), 'passed' (handed off to another actor — see passTo)."
},
"passTo": {
"type": "string",
"description": "Optional. When present and turnState is 'passed', names the actor to which this turn is handed off. The session manager begins the next invocation with the named actor, inheriting the mailbox. The receiving actor's tool scope applies. Only valid with turnState: 'passed'."
},
"note": {
"type": "string",
"description": "Optional free-form note for logs. Not delivered to any consumer."
}
}
}
}
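The structural rules this schema encodes can be sketched as a small validator. This is a hand-rolled illustration, not Beach's implementation — in practice the model provider's tool-use validation plus the part-type and turn-state registries enforce these rules:

```typescript
// Hypothetical sketch: the structural checks implied by the respond() schema.
// The Part and RespondInput shapes mirror the schema above.
interface Part {
  text?: string;
  data?: Record<string, unknown>;
  metadata: { partType: string };
}

interface RespondInput {
  parts: Part[];
  turnState: string;
  passTo?: string;
  note?: string;
}

function validateRespondInput(
  input: RespondInput,
  knownPartTypes: Set<string>,   // canonical ∪ consumer-registered
  knownTurnStates: Set<string>,  // canonical ∪ consumer-registered
): string[] {
  const errors: string[] = [];
  if (!Array.isArray(input.parts) || input.parts.length === 0) {
    errors.push("parts must be a non-empty array");
  } else {
    for (const [i, part] of input.parts.entries()) {
      if (!part.metadata?.partType) {
        errors.push(`parts[${i}] is missing metadata.partType`);
      } else if (!knownPartTypes.has(part.metadata.partType)) {
        errors.push(`parts[${i}] has unregistered partType '${part.metadata.partType}'`);
      }
    }
  }
  if (!knownTurnStates.has(input.turnState)) {
    errors.push(`unregistered turnState '${input.turnState}'`);
  }
  if (input.passTo !== undefined && input.turnState !== "passed") {
    errors.push("passTo is only valid with turnState 'passed'");
  }
  return errors;
}
```

An empty error list means the call is structurally acceptable; anything else is rejected before the parts reach any consumer.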
Part types
The partType on each part determines how the session manager and envelope collector treat it.
Conversational signals
| partType | Purpose | Delivered to |
|---|---|---|
| ack | Immediate acknowledgement. Specific — states what was understood and what the actor is doing. | SSE to the originating chat panel; stripped from async-channel envelopes. |
| thinking | Optional progress update during long tool-use loops. | SSE only. Stripped from async deliveries and envelopes. |
| response | The substantive answer. | Rendered in the chat panel. Becomes the response text in the envelope. |
| clarify | The actor cannot proceed and needs a question answered. Replaces response when used. | Rendered in the chat panel. Terminal for this turn; no envelope is built. |
| error | A failure the user needs to know about. | Rendered in the chat panel; logged; may be re-emitted through retry policies. |
Envelope parts (for peer agents and UIs)
When an actor is responding to a peer agent (via MCP or A2A) or to a UI with structured content, the same respond() call carries the envelope parts:
| partType | Purpose |
|---|---|
| domain-data | Structured domain data (A2A data part). The raw answer the peer can read programmatically. |
| llm-context | Natural-language context for a peer LLM to reason about the data. Typically generated by a Haiku translator post-hoc, but an actor may emit it directly. |
| a2ui-surface | A2UI createSurface / updateDataModel / updateComponents messages for a peer UI to render. |
| artifact | Reference to a binary asset (file, image, audio) in the ArtifactStore. Carries artifactId, mimeType, sizeBytes. |
| reasoning-trace | Captured model-native reasoning (Claude thinking blocks etc.). Persistence configurable. |
| citation | Source reference bound to a path in domain-data — where did this data come from? |
| approval-request | Emitted when a tool call marked requiresApproval is intercepted. Carries tool name, arguments, actor, turn. |
| approval-response | Inbound only — the human's or policy's decision, resuming the suspended turn. |
| progress | Structured progress update for long-running operations (e.g. research sections completing one by one). |
See envelope.md for the full envelope spec and per-transport delivery rules.
Consumer-registered part types
The partType enum is open. Consumers register additional types at startup via @cool-ai/beach-core's PartTypeRegistry:
// Conceptual — consumer code.
partTypes.register({
id: 'audio',
isCanonical: false,
meta: { streamingPreferred: true, buffered: 'dropped', serialisation: 'base64' }
});
The respond() tool schema validates partType against the registered set (canonical ∪ consumer-registered). A missing registration causes validation failure at the respond() call.
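What such an open registry needs to do is small. The sketch below is illustrative only — the real PartTypeRegistry interface in @cool-ai/beach-core may differ in shape and naming:

```typescript
// Hypothetical sketch of an open part-type registry.
// The actual @cool-ai/beach-core PartTypeRegistry API may differ.
interface PartTypeEntry {
  id: string;
  isCanonical: boolean;
  meta?: Record<string, unknown>;
}

class PartTypeRegistry {
  private entries = new Map<string, PartTypeEntry>();

  constructor(canonical: string[]) {
    // Seed with the canonical core set.
    for (const id of canonical) {
      this.entries.set(id, { id, isCanonical: true });
    }
  }

  register(entry: PartTypeEntry): void {
    // Consumer registrations happen at startup; collisions are a config error.
    if (this.entries.has(entry.id)) {
      throw new Error(`partType '${entry.id}' is already registered`);
    }
    this.entries.set(entry.id, entry);
  }

  has(id: string): boolean {
    return this.entries.has(id);
  }

  ids(): string[] {
    return [...this.entries.keys()];
  }
}
```

The `has()` check is what the respond() schema validation consults: a part whose partType is absent from the registry fails the call.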
Turn states
| turnState | Meaning | Envelope? | SSE? |
|---|---|---|---|
| awaiting | The actor expects more signals or injections. Conversational parts stream to the chat channel as ack/thinking/partial-response. | Partial (streaming) | Yes |
| complete | The turn is settled. The envelope is built from the mailbox's data-bearing events plus the current parts. | Yes | Yes |
| clarifying | The actor needs user input before continuing. No envelope. | No | Yes (the clarify part) |
| error | Terminal failure for this turn. | No | Yes (the error part) |
| suspended | HITL approval pending. The tool call is intercepted; an approval-request part is emitted; the turn resumes on an approval-response injection. | No (until resumed) | Yes (the approval-request part) |
| delegated | Peer-agent call in flight. The session's timeout is the peer's turn budget, not this agent's. | Partial (streaming of the peer's envelope as it arrives) | Yes |
| passed | Turn handed off to another local actor via the passTo field. The named actor inherits the mailbox. Each actor's tool scope still applies. Requires passTo to be set. | No (the receiving actor may settle with its own terminal state) | Yes |
Turn states are an open registry (@cool-ai/beach-core's TurnStateRegistry). Consumer-registered states declare isTerminal, emitsEnvelope, and holdsActor metadata.
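Read against the table above, the canonical states plausibly carry metadata like the following. These flag values are inferred from the table, not taken from @cool-ai/beach-core, so treat them as an illustration of the registry's metadata shape:

```typescript
// Hypothetical sketch: canonical turn states annotated with the registry
// metadata named above. The isTerminal/emitsEnvelope/holdsActor values
// are inferred from the table in this document, not from the real registry.
interface TurnStateEntry {
  id: string;
  isTerminal: boolean;     // settles the current actor's turn
  emitsEnvelope: boolean;  // settling triggers envelope assembly
  holdsActor: boolean;     // the same actor stays active (more calls or a resume)
}

const canonicalTurnStates: TurnStateEntry[] = [
  { id: "awaiting",   isTerminal: false, emitsEnvelope: false, holdsActor: true  },
  { id: "complete",   isTerminal: true,  emitsEnvelope: true,  holdsActor: false },
  { id: "clarifying", isTerminal: true,  emitsEnvelope: false, holdsActor: false },
  { id: "error",      isTerminal: true,  emitsEnvelope: false, holdsActor: false },
  { id: "suspended",  isTerminal: false, emitsEnvelope: false, holdsActor: true  },
  { id: "delegated",  isTerminal: false, emitsEnvelope: false, holdsActor: true  },
  { id: "passed",     isTerminal: true,  emitsEnvelope: false, holdsActor: false },
];
```

A consumer-registered state would supply the same three flags, letting the session manager treat it uniformly without special-casing its id.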
passTo — multi-actor handoff
{
"parts": [{ "text": "Handing off to the drafter.", "metadata": { "partType": "thinking" } }],
"turnState": "passed",
"passTo": "drafter"
}
Valid only with turnState: 'passed'. The session manager treats this as terminal for the current actor's turn and begins a new turn with the named actor, carrying the accumulated mailbox forward. The router still mediates — there is no direct LLM-to-LLM call.
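The handoff decision the session manager makes can be sketched as a pure function. This is illustrative only — the actual SessionTurnManager internals are not specified here, and `nextActor` is a hypothetical name:

```typescript
// Hypothetical sketch of how a session manager might act on turnState 'passed'.
// Not the SessionTurnManager API; names are illustrative.
interface RespondCall {
  turnState: string;
  passTo?: string;
}

function nextActor(
  current: string,
  call: RespondCall,
  knownActors: Set<string>,
): string {
  if (call.turnState !== "passed") return current;
  if (!call.passTo) {
    throw new Error("turnState 'passed' requires passTo");
  }
  if (!knownActors.has(call.passTo)) {
    throw new Error(`unknown actor '${call.passTo}'`);
  }
  // The named actor inherits the accumulated mailbox;
  // its own tool scope applies on the next invocation.
  return call.passTo;
}
```

Because the router mediates the handoff, the receiving actor sees exactly what the mailbox holds — never a direct LLM-to-LLM message.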
Multi-call pattern
An actor may call respond() several times during a single tool-use loop:
turn 1:
respond({ parts: [{ text: "Looking up flights.", metadata: { partType: "ack" } }],
turnState: "awaiting" })
→ tool_use: search_flights(...)
turn 2 (after tool result):
respond({ parts: [{ text: "Filtering for direct options.", metadata: { partType: "thinking" } }],
turnState: "awaiting" })
→ tool_use: filter_flights(...)
turn 3 (after tool result):
respond({ parts: [{ text: "Two direct options. EasyJet £94pp at 06:15; BA £187pp at 08:45.",
metadata: { partType: "response" } }],
turnState: "complete" })
Each respond() call is an atomic state update. ack and thinking parts are streamed to SSE as they arrive — consumers see progressive output without needing to parse partial JSON mid-stream. The final turnState: "complete" settles the turn and triggers envelope emission if the mailbox contains data-bearing events.
Parser behaviour
The parser in @cool-ai/beach-llm transforms each respond() tool call into a set of events:
- Each part becomes a partReceived signal event carrying the part itself and its partType.
- The turnState is carried as metadata on every partReceived event and emitted separately as turnStateChanged.
- The part is dispatched based on its registered delivery rules. Streaming-preferred parts (ack, thinking, progress) flush immediately to streaming transports. Envelope parts accumulate for assembly.
- When turnState: 'complete' arrives, the envelope is finalised and fanned out to buffered transports.
- When turnState: 'suspended' arrives with an approval-request part, @cool-ai/beach-llm pauses tool execution until an approval-response injection arrives.
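The dispatch step above reduces to splitting parts by their delivery preference and noting whether the turn settled. A minimal sketch, assuming a fixed streaming-preferred set — event and field names here are illustrative, not the @cool-ai/beach-llm API:

```typescript
// Hypothetical sketch of the per-call dispatch step described above.
type Part = { text?: string; data?: unknown; metadata: { partType: string } };

// Parts flushed immediately to streaming transports (per their delivery rules).
const STREAMING_PREFERRED = new Set(["ack", "thinking", "progress"]);

interface Dispatch {
  streamed: Part[];  // flushed to SSE-style transports as they arrive
  buffered: Part[];  // accumulated for envelope assembly
  finalise: boolean; // true when turnState 'complete' settles the turn
}

function dispatchRespondCall(parts: Part[], turnState: string): Dispatch {
  const streamed: Part[] = [];
  const buffered: Part[] = [];
  for (const part of parts) {
    if (STREAMING_PREFERRED.has(part.metadata.partType)) {
      streamed.push(part);
    } else {
      buffered.push(part);
    }
  }
  return { streamed, buffered, finalise: turnState === "complete" };
}
```

In the real parser each part also raises a partReceived event and the dispatch consults the registry's per-type delivery metadata rather than a hard-coded set; this sketch only shows the split.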
System-prompt snippet
@cool-ai/beach-llm exports a snippet that consumers include in every actor's system prompt:
You produce output only by calling the respond() tool. Never write free-form text. Every response you produce — acknowledgements, thinking updates, the substantive answer, clarifying questions, and errors — is passed as parts to respond(), each tagged with the appropriate partType.

When you are still working and expect to produce more, call respond() with turnState: "awaiting". When your turn is settled, call respond() with turnState: "complete". If you need a clarification from the user before you can proceed, call respond() with turnState: "clarifying" and a clarify part containing a single clear question.

Do not wait until the end of your turn to call respond(). Call it as soon as you have an acknowledgement, a thinking update, or a partial answer — the user sees these in real time.
Common shapes
Simple response, one call
{
"parts": [
{ "text": "Your tasks for today: T12, T15, T18.", "metadata": { "partType": "response" } }
],
"turnState": "complete"
}
Progressive response across three calls
{ "parts": [{ "text": "Searching flights.", "metadata": { "partType": "ack" } }], "turnState": "awaiting" }
{ "parts": [{ "text": "Filtering direct options.", "metadata": { "partType": "thinking" } }], "turnState": "awaiting" }
{ "parts": [{ "text": "Two direct options found.", "metadata": { "partType": "response" } }], "turnState": "complete" }
Clarification
{
"parts": [
{ "text": "Did you mean the flight from Gatwick or Heathrow?", "metadata": { "partType": "clarify" } }
],
"turnState": "clarifying"
}
Peer-agent response with envelope
{
"parts": [
{ "text": "Found 2 direct flights.", "metadata": { "partType": "response" } },
{ "data": { "flights": [ ... ] }, "metadata": { "partType": "domain-data" } },
{ "text": "EasyJet EJ4521 is half the price but departs at 06:15 — the user may prefer BA at 08:45 for comfort.", "metadata": { "partType": "llm-context" } },
{ "data": { "version": "v0.9", "createSurface": { ... } }, "metadata": { "partType": "a2ui-surface" } }
],
"turnState": "complete"
}
Error
{
"parts": [
{ "text": "The flight search service is unreachable. I cannot find options right now.", "metadata": { "partType": "error" } }
],
"turnState": "error"
}
Related
- envelope.md — how envelope parts are assembled into an A2A response
- agent-card.md — how agents declare their capabilities, including supported partTypes
- ../design-principles.md — principle 2.3 ("LLMs never emit free text") and principle 2.8 ("explicit turn completion")