The Tool Registry

Scope. Beach-specific. The two-axis tool definition (scope: 'generalist' | 'specialist' × routing: 'routed' | 'bypass') is Beach's enforcement model for the principles "specialists own their internals" and "no unmediated access to shared domain data". The general idea of scoped tools applies broadly; the specific registry API and integration with @cool-ai/beach-llm are Beach's.

Where tools are defined, how they're filtered per actor, and how capability enforcement actually works in a Beach-based system.

The problem

Actor configuration declares which tools an actor may use:

{ "name": "triage", "tools": ["task_get", "task_list", "missive_list", "person_lookup"] }

But where is task_get actually defined? What JSON Schema describes its input? Which handler executes it? Does it require approval? Does its call flow through the event router or stay internal to the actor?

The Tool Registry is the answer to all of those. It is Beach's capability-enforcement layer made explicit: a single registry where every tool is declared, and actor configs filter from it by name.

The registry

Consumers register every tool once at application bootstrap:

// Conceptual — consumer's bootstrap code.
import { tools } from '@cool-ai/beach-llm';

tools.register({
  name: 'missive_list',
  description: 'List missives (messages) for a thread or entity.',
  inputSchema: { /* JSON Schema */ },
  outputSchema: { /* JSON Schema, optional */ },
  handler: async (args, ctx) => { /* executes on invocation */ },
  scope: 'generalist',           // shared domain; routes through the framework
  requiresApproval: false,
  tags: ['read', 'missive']
});

Beach itself does not ship domain tools. @cool-ai/beach-missives may export reference tool definitions (missiveListToolDef, missiveCreateToolDef) that consumers can import and register, but no tool is auto-registered. Tool definition is a consumer concern; Beach provides the registration and filtering mechanism.

Registration fields

Field | Purpose
--- | ---
name | Unique tool identifier. Actor configs reference this.
description | Human-readable purpose. Included in the LLM's tool schema.
inputSchema | JSON Schema for the tool's input. Validated before handler invocation.
outputSchema | JSON Schema for the tool's output (optional). Used for peer-agent contract validation.
handler | Async function that executes the tool. Receives (args, context).
scope | 'generalist' or 'specialist'. Ownership: shared domain vs actor-private. See scope and routing below.
routing | 'routed' or 'bypass'. Dispatch shape. Generalist tools always route. Specialist tools default to routed; opt into bypass with bypassRouting.reason + justification.
bypassRouting | { reason: string }. Required when routing: 'bypass'. The articulated reason why this tool should not pass through the router (latency, coupling, framework-internal).
justification | string. Required for specialist tools. Explains why this work is private to one actor and not shared domain.
requiresApproval | false, true, or ApprovalPolicy. See the HITL section below.
tags | Free-form tags. Used by tag-based policy filtering.
peerExposed | boolean (generalist-only). Whether to expose this tool to A2A peers.

Scope and routing

Beach's tool definition is a two-axis discriminated union:

  • scope answers "who owns this tool's invocation?" — the application as a whole (generalist) or one specific actor (specialist).
  • routing answers "how does the call dispatch?" — through the framework (routed; observable, gateable, audit-friendly) or inline (bypass; lower-latency, locally-scoped).

Three of the four combinations are legal; the fourth is rejected at registration:

scope | routing | Use
--- | --- | ---
'generalist' | 'routed' (default and only legal value) | Tools touching shared domain data. Reads against missives / tasks / people; writes to shared state; anything other actors might need to react to. Generalist tools always route. Returning an AnnotatedRecord<T> triggers fan-out through FilterAndDistribute to the LLM session, the UI stream, the cache, the audit log, and any per-channel formatters. See Filter-and-distribute.
'specialist' | 'routed' (default for specialist) | Tools private to one actor but where the framework still owns dispatch. Typical for substantive work the actor does that may need cancellation, retry, or observability.
'specialist' | 'bypass' (opt-in) | Tools the actor calls inline. Latency-sensitive (a typed extraction, an inline calculation), framework-internal (the respond() tool itself), or coupled to the actor's invocation context (a private IMAP client living inside the actor's lifecycle). Requires bypassRouting: { reason: ... } plus justification.
'generalist' | 'bypass' | Rejected at registration. A generalist tool by definition touches shared domain; bypassing the framework means losing audit, gating, and reaction surfaces, and that is specialist semantics.
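
The four combinations can be sketched as a TypeScript discriminated union plus a registration-time check. This is a minimal sketch under stated assumptions: the type names, the validateDefinition function, and the error messages are hypothetical and are not taken from the @cool-ai/beach-llm API. Only the rules themselves (generalist tools always route, bypass needs a reason, specialists need a justification, peerExposed is generalist-only) come from the text above.

```typescript
// Hypothetical types: a sketch of the two-axis definition, not the library's real API.
type Handler = (args: unknown, ctx: unknown) => Promise<unknown>;

interface BaseTool {
  name: string;
  description: string;
  handler: Handler;
}

interface GeneralistTool extends BaseTool {
  scope: 'generalist';
  routing?: 'routed';            // the only legal value for generalists
  peerExposed?: boolean;
}

interface RoutedSpecialistTool extends BaseTool {
  scope: 'specialist';
  routing?: 'routed';            // default for specialists
  justification: string;
}

interface BypassSpecialistTool extends BaseTool {
  scope: 'specialist';
  routing: 'bypass';
  bypassRouting: { reason: string };
  justification: string;
}

// 'generalist' + 'bypass' has no member in the union, so typed callers cannot express it.
type ToolDefinition = GeneralistTool | RoutedSpecialistTool | BypassSpecialistTool;

// Runtime counterpart for definitions that arrive untyped (for example from plain JavaScript).
function validateDefinition(def: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const routing = def.routing ?? 'routed';
  if (def.scope === 'generalist') {
    if (routing === 'bypass') errors.push('generalist tools always route');
  } else if (def.scope === 'specialist') {
    if (!def.justification) errors.push('specialist tools require justification');
    if (def.peerExposed === true) errors.push('peerExposed is generalist-only');
    const bypass = def.bypassRouting as { reason?: string } | undefined;
    if (routing === 'bypass' && !bypass?.reason) {
      errors.push('bypass requires bypassRouting.reason');
    }
  } else {
    errors.push("scope must be 'generalist' or 'specialist'");
  }
  return errors;
}
```

Encoding the illegal pair as an absent union member catches it at compile time for typed callers, while the runtime check covers untyped input at bootstrap.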

This is the concrete expression of design principle 2.5 ("specialists own their internals") and the trust principle ("no participant has unmediated access to shared domain data; generalist tools always route").

scope: 'generalist' mechanism

The handler registered in the registry is not called directly by @cool-ai/beach-llm. Instead:

  1. @cool-ai/beach-llm emits routeEvent('actor:<actorName>', 'tool_call:<toolName>', args).
  2. The router dispatches to the registered handler.
  3. The handler's result flows through FilterAndDistribute (when configured) to the per-destination view set.
  4. The LLM session destination's filtered result is returned as the actor's tool result.

When the handler returns an AnnotatedRecord<T>, the framework also dispatches in parallel to the cache, UI streaming, audit log, peer-response, and any registered formatter:* destinations. See Filter-and-distribute for the full surface.

scope: 'specialist' mechanism

Routed specialist (routing: 'routed'): the framework dispatches as for a generalist, but other actors cannot react — the routing event is scoped private to the calling actor. Cancellation, retry, and audit instrumentation still apply because the framework is in the loop.

Bypass specialist (routing: 'bypass'): the handler runs inline within the actor's invocation context. No routeEvent dispatch; the call does not appear as a tool_call:* event that other actors can observe or filter.

Bypass requires explicit bypassRouting: { reason: ... }. Without that field, registration fails. The reason is read at audit time and surfaces in inspect / observability tooling. Beach uses bypass internally for the respond() tool itself (parsed inline by the actor loop; framework infrastructure, not a domain tool).
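
The two dispatch paths (routed through the framework, or inline for bypass) can be sketched with a toy in-memory router. Everything here is an assumption made for illustration: SketchRouter, its on method, and dispatchToolCall are hypothetical names, and the real router in Beach's core package is not shown in this document. The sketch also leaves out FilterAndDistribute and the private scoping of routed specialist events. The only details taken from the text are the event naming ('actor:<actorName>', 'tool_call:<toolName>') and the fact that bypass calls produce no routing event.

```typescript
// Hypothetical sketch: a toy router that records the events it sees.
type Handler = (args: unknown) => Promise<unknown>;

interface Tool {
  name: string;
  routing: 'routed' | 'bypass';
  handler: Handler;
}

class SketchRouter {
  readonly events: string[] = [];
  private handlers = new Map<string, Handler>();

  // Assumed bootstrap step: routed tools have their handlers attached to the router.
  on(event: string, handler: Handler): void {
    this.handlers.set(event, handler);
  }

  routeEvent(target: string, event: string, args: unknown): Promise<unknown> {
    this.events.push(`${target} ${event}`);   // observable to the framework
    const handler = this.handlers.get(event);
    if (!handler) return Promise.reject(new Error(`no handler for ${event}`));
    return handler(args);
  }
}

function dispatchToolCall(
  router: SketchRouter,
  actorName: string,
  tool: Tool,
  args: unknown,
): Promise<unknown> {
  if (tool.routing === 'bypass') {
    // Inline call: no routeEvent, so nothing appears as a tool_call:* event.
    return tool.handler(args);
  }
  return router.routeEvent(`actor:${actorName}`, `tool_call:${tool.name}`, args);
}
```

In this sketch the only difference between the paths is whether the router sees the call, which is the property the bypassRouting reason is meant to justify.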

Either way, execution is logged. Specialist calls — routed or bypass — are appended to the event log as specialist_execution records: actor name, tool name, arguments, result (truncated for large payloads), duration, timestamp, scope, routing, bypass reason if any. The log is what @cool-ai/beach-evals uses to mock specialist calls during replay and what audit tooling uses to reconstruct a specialist's behaviour after the fact.

The distinction is between routing observability (which actors can react to an event mid-turn) and logging (which the system always does for audit and replay). Specialists own their internals, so other actors cannot react to their tool calls, but the record is still there.
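
A specialist_execution record with the fields listed above might look like the following. This is a sketch under assumptions: the property names, the 2000-character truncation limit, and the recordSpecialistCall helper are hypothetical, and the actual log format is not specified in this document.

```typescript
// Hypothetical record shape; field list follows the prose, names are assumed.
interface SpecialistExecutionRecord {
  type: 'specialist_execution';
  actorName: string;
  toolName: string;
  args: unknown;
  result: string;                 // serialised, truncated for large payloads
  durationMs: number;
  timestamp: string;              // ISO 8601
  scope: 'specialist';
  routing: 'routed' | 'bypass';
  bypassReason?: string;
}

const MAX_RESULT_CHARS = 2000;    // assumed limit, not from the source

function truncateResult(result: unknown, limit: number = MAX_RESULT_CHARS): string {
  const json = String(JSON.stringify(result));
  return json.length > limit ? json.slice(0, limit) + ' [truncated]' : json;
}

function recordSpecialistCall(
  log: SpecialistExecutionRecord[],
  call: {
    actorName: string;
    toolName: string;
    args: unknown;
    result: unknown;
    durationMs: number;
    routing: 'routed' | 'bypass';
    bypassReason?: string;
  },
): SpecialistExecutionRecord {
  const record: SpecialistExecutionRecord = {
    type: 'specialist_execution',
    actorName: call.actorName,
    toolName: call.toolName,
    args: call.args,
    result: truncateResult(call.result),
    durationMs: call.durationMs,
    timestamp: new Date().toISOString(),
    scope: 'specialist',
    routing: call.routing,
    // Only present for bypass calls that carry a reason.
    ...(call.bypassReason ? { bypassReason: call.bypassReason } : {}),
  };
  log.push(record);               // appended whether the call was routed or bypass
  return record;
}
```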

Constraint: register a tool as specialist only when it performs that actor's internal work, not when it touches shared system resources. The framework rejects peerExposed: true on specialist tools at registration; other constraints are design discipline. The naming convention (e.g. email_search for the Email Researcher's private IMAP access) makes the scope visible in the actor config.

Per-actor filtering

Actor configs filter from the registry by name:

{
  "name": "triage",
  "tools": ["task_get", "task_list", "missive_list", "person_lookup"]
}

respond() is not listed — it is provided automatically by @cool-ai/beach-llm as architectural infrastructure, not a consumer-registered tool.

At callActor time:

  1. @cool-ai/beach-llm resolves each name against the registry.
  2. Unknown names cause the actor invocation to fail with a clear error (not silently ignored).
  3. The resolved inputSchemas are assembled into the LLM's tool schema.
  4. When the LLM calls a tool, @cool-ai/beach-llm validates the arguments against the schema.
  5. Dispatch proceeds per the (scope, routing) pair: generalist and routed-specialist tools go through routeEvent; bypass-specialist tools call the handler directly. Generalist returns of AnnotatedRecord<T> fan out through FilterAndDistribute (see Filter-and-distribute).
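
Steps 1 to 3 above can be sketched as a small resolution function. The Map-based registry, the resolveActorTools name, and the LlmToolSchema shape are assumptions made for illustration; the tool schema format expected by a given LLM provider will differ.

```typescript
// Hypothetical in-memory registry; names are illustrative, not the library's API.
interface RegisteredTool {
  name: string;
  description: string;
  inputSchema: object;
}

interface LlmToolSchema {
  name: string;
  description: string;
  input_schema: object;
}

function resolveActorTools(
  registry: Map<string, RegisteredTool>,
  actor: { name: string; tools: string[] },
): LlmToolSchema[] {
  // Steps 1 and 2: resolve every name and fail loudly on unknown ones.
  const missing = actor.tools.filter((name) => !registry.has(name));
  if (missing.length > 0) {
    throw new Error(
      `Actor "${actor.name}" references unregistered tools: ${missing.join(', ')}`,
    );
  }
  // Step 3: assemble the resolved input schemas into the LLM's tool schema.
  return actor.tools.map((name) => {
    const tool = registry.get(name)!;
    return {
      name: tool.name,
      description: tool.description,
      input_schema: tool.inputSchema,
    };
  });
}
```

Collecting all unknown names before throwing reports every misconfigured entry in one error rather than one per run.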

This is the filter — and it is declarative, readable in the consumer's repo without running code.

HITL approval integration

Tools registered with requiresApproval: true (or a policy) are intercepted before execution. See design principle 2.6 ("no unreviewed action on side-effecting tools").

// Conceptual.
tools.register({
  name: 'book_flight',
  inputSchema: { /* ... */ },
  handler: async (args) => { /* ... */ },
  scope: 'generalist',           // shared domain; framework owns dispatch
  requiresApproval: {
    autoApproveIf: "args.totalCost < 50 && ctx.user.tier == 'trusted'"
  }
});

On a matching tool call:

  1. @cool-ai/beach-llm does not execute the handler.
  2. The session emits an approval-request part in the envelope.
  3. The session enters turnState: 'suspended'.
  4. An approval-response injection arrives (from any channel — chat button, email reply, WhatsApp yes/no).
  5. If approved, the handler executes and its result is injected into the original turn's mailbox.
  6. If denied, a standard ToolDenied result is injected; the LLM reacts to it.

Approval policy expressions are evaluated against a pinned context (tool args, actor identity, session user). Consumers can also provide a policy-handler function for more elaborate logic.
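
The suspend-and-resume flow above can be sketched as a small gate. This is an illustrative sketch, not the library's implementation: the Session shape, the function names, and the return to a 'running' state are assumptions, the policy is modelled as a function rather than an expression string, and the handler is synchronous here for brevity where real handlers are async.

```typescript
// Hypothetical sketch of the approval gate.
interface PolicyContext {
  userTier: string;
}

// Returns true when the call may be auto-approved.
type PolicyHandler = (args: Record<string, unknown>, ctx: PolicyContext) => boolean;
type RequiresApproval = boolean | PolicyHandler;

type ToolResult =
  | { kind: 'ok'; value: unknown }
  | { kind: 'ToolDenied'; toolName: string };

interface Session {
  turnState: 'running' | 'suspended';
  pending?: { toolName: string; run: () => unknown };
}

function needsApproval(
  policy: RequiresApproval,
  args: Record<string, unknown>,
  ctx: PolicyContext,
): boolean {
  if (typeof policy === 'function') return !policy(args, ctx);
  return policy;
}

// Steps 1 to 3: hold the handler back and suspend the turn.
function interceptCall(session: Session, toolName: string, run: () => unknown): void {
  session.pending = { toolName, run };
  session.turnState = 'suspended';
}

// Steps 4 to 6: an approval response arrives from any channel.
function resolveApproval(session: Session, approved: boolean): ToolResult {
  const pending = session.pending;
  if (!pending) throw new Error('no call is awaiting approval');
  delete session.pending;
  session.turnState = 'running';  // assumed resume state
  return approved
    ? { kind: 'ok', value: pending.run() }
    : { kind: 'ToolDenied', toolName: pending.toolName };
}
```

The handler only runs inside resolveApproval, so a denied or never-answered request cannot produce a side effect.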

Tag-based policy filtering (v2)

The tags field reserves room for a future capability — dynamic, context-sensitive filtering beyond static names.

Example shape (v2):

// Conceptual — v2.
configurePolicy({
  rule: "actor.scope != 'admin' && tool.tags.includes('destructive')",
  effect: 'hide'
});

In v1, tags are captured on registration but not consumed by any policy engine. Reserving the field now avoids a schema migration later.

The context argument

Handlers receive (args, context). The context carries:

  • actorName — which actor is calling.
  • sessionId — current session.
  • turnId — current turn.
  • userId, personId, tenantId — identity, where known.
  • approvalDecision? — populated for calls resumed after approval.

This gives handlers the same information the router has. Consumer policies and handlers can branch on identity and tenant.
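
As an illustration of branching on identity and tenant, the context could be typed and used as follows. The ToolContext interface mirrors the fields listed above, but its exact shape is an assumption (the type of approvalDecision in particular is not specified here), and the Task type, the TASKS store, and the "no tenant means no results" policy are hypothetical.

```typescript
// Hypothetical typing of the handler context; field names follow the list above.
interface ToolContext {
  actorName: string;
  sessionId: string;
  turnId: string;
  userId?: string;
  personId?: string;
  tenantId?: string;
  approvalDecision?: unknown;     // shape not specified in this document
}

interface Task {
  id: string;
  tenantId: string;
  title: string;
}

// Stand-in for a real data store.
const TASKS: Task[] = [
  { id: 't1', tenantId: 'acme', title: 'Renew contract' },
  { id: 't2', tenantId: 'globex', title: 'Ship order' },
];

// Tenant scoping kept as a pure function so it is easy to test.
function visibleTasks(tasks: Task[], ctx: ToolContext): Task[] {
  if (!ctx.tenantId) return [];   // assumed policy: unknown tenant sees nothing
  return tasks.filter((task) => task.tenantId === ctx.tenantId);
}

// A handler with the (args, context) signature described above.
const taskListHandler = async (_args: unknown, ctx: ToolContext): Promise<Task[]> =>
  visibleTasks(TASKS, ctx);
```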

Why a dedicated registry, not just inline config

Alternatives considered:

  • Tools inline per actor. Duplicative when multiple actors share tools. Schema-duplication becomes a source of drift.
  • Tools imported as decorated functions. Works for code-first design but not declarative — a reader of the repo can't see what tools exist without running code.
  • A shared module of tool constants. Workable but offers no lifecycle hooks (registration ordering, validation, approval interception, scope enforcement) without more ceremony than a registry.

A registry is the smallest primitive that gives uniform enforcement across all these concerns.

respond() is not a registered tool

The respond() tool is architectural infrastructure, not a consumer-registered tool. It does not live in the ToolRegistry. Every actor is given respond() automatically by @cool-ai/beach-llm; its schema is known to the library; its parsing is the library's job (not a handler the consumer wires up).

Treating respond() as just-another-registered-tool muddles its privileged role. The ToolRegistry is for consumer-declared domain tools — the tools an application provides to its actors. respond() sits alongside the registry, as part of the runtime contract between every actor and the session manager.

Related

  • ../design-principles.md — principle 2.4 (tools constrain; prompts guide); principle 2.5 (specialists own their internals); principle 2.6 (no unreviewed action on side-effecting tools).
  • ../../packages/llm/README.md — the package that owns the registry.
  • respond-tool.md — the architectural respond() tool, which is not in the ToolRegistry.
  • ../../packages/core/README.md — the event router that dispatches routed tool calls.