The Tool Registry

Scope. Beach-specific. The router-scoped vs specialist-scoped distinction is Beach's enforcement model for the principle "specialists own their internals." The general idea of scoped tools applies broadly; the specific registry API and integration with @cool-ai/beach-llm are Beach's.

Where tools are defined, how they're filtered per actor, and how capability enforcement actually works in a Beach-based system.

The problem

Actor configuration declares which tools an actor may use:

{ "name": "triage", "tools": ["task_get", "task_list", "missive_list", "person_lookup"] }

But where is task_get actually defined? What JSON Schema describes its input? Which handler executes it? Does it require approval? Does its call flow through the event router or stay internal to the actor?

The Tool Registry is the answer to all of those. It is Beach's capability-enforcement layer made explicit: a single registry where every tool is declared, and actor configs filter from it by name.

The registry

Consumers register every tool once at application bootstrap:

// Conceptual — consumer's bootstrap code.
import { tools } from '@cool-ai/beach-llm';

tools.register({
  name: 'missive_list',
  description: 'List missives (messages) for a thread or entity.',
  inputSchema: { /* JSON Schema */ },
  outputSchema: { /* JSON Schema, optional */ },
  handler: async (args, ctx) => { /* executes on invocation */ },
  scope: 'router',
  requiresApproval: false,
  tags: ['read', 'missive']
});

Beach itself does not ship domain tools. @cool-ai/beach-missives may export reference tool definitions (missiveListToolDef, missiveCreateToolDef) that consumers can import and register, but no tool is auto-registered. Tool definition is a consumer concern; Beach provides the registration and filtering mechanism.

Registration fields

Field             Purpose
name              Unique tool identifier. Actor configs reference this.
description       Human-readable purpose. Included in the LLM's tool schema.
inputSchema       JSON Schema for the tool's input. Validated before handler invocation.
outputSchema      JSON Schema for the tool's output (optional). Used for peer-agent contract validation.
handler           Async function that executes the tool. Receives (args, context).
scope             'router' or 'specialist'; see scope semantics below.
requiresApproval  false | true | ApprovalPolicy; see the HITL section below.
tags              Free-form tags. Used by tag-based policy filtering (v2).
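
The fields above can be read as a type plus a small in-memory registry. The following is a minimal sketch, not the actual @cool-ai/beach-llm declarations: the names ToolDefinition, ToolContext, ApprovalPolicy and ToolRegistry are hypothetical, and which fields are optional (beyond outputSchema) is an assumption. It also assumes that registering a duplicate name is an error, since names are described as unique.

```typescript
// Hypothetical types mirroring the registration fields; not the real
// @cool-ai/beach-llm declarations.
type JsonSchema = Record<string, unknown>;

interface ToolContext {
  actorName: string;
  sessionId: string;
}

interface ApprovalPolicy {
  autoApproveIf: string; // expression evaluated against the approval context
}

interface ToolDefinition<Args = unknown, Result = unknown> {
  name: string;
  description: string;
  inputSchema: JsonSchema;
  outputSchema?: JsonSchema;
  handler: (args: Args, ctx: ToolContext) => Promise<Result>;
  scope: 'router' | 'specialist';
  requiresApproval: boolean | ApprovalPolicy;
  tags?: string[]; // assumed optional in this sketch
}

// A minimal in-memory registry keyed by tool name.
class ToolRegistry {
  private readonly byName = new Map<string, ToolDefinition>();

  register(def: ToolDefinition): void {
    if (this.byName.has(def.name)) {
      // Assumption: duplicates are rejected because names must be unique.
      throw new Error(`Tool already registered: ${def.name}`);
    }
    this.byName.set(def.name, def);
  }

  get(name: string): ToolDefinition | undefined {
    return this.byName.get(name);
  }
}

const registry = new ToolRegistry();
registry.register({
  name: 'missive_list',
  description: 'List missives (messages) for a thread or entity.',
  inputSchema: { type: 'object' },
  handler: async () => [], // placeholder handler for the sketch
  scope: 'router',
  requiresApproval: false,
  tags: ['read', 'missive'],
});
```

Keying the map by name is what lets actor configs stay purely declarative: a config only needs the string, and everything else is looked up.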

Scope semantics

The registry distinguishes two tool scopes. This is the concrete expression of design principle 2.5 ("specialists own their internals").

scope: 'router'

Tool calls are routed through routeEvent(). The tool invocation becomes an event; the result becomes a follow-on event. Other actors can observe, filter, or react to these events per routing config.

When to use: any tool whose calls touch data or behaviour that other actors might need to read, audit, or contest. This covers reads against shared domain state (tasks, missives, people), writes to shared domain state, and any tool calls that the system as a whole benefits from auditing uniformly.

Mechanism: the handler registered in the registry is not called directly by @cool-ai/beach-llm. Instead, @cool-ai/beach-llm emits routeEvent('actor:<actorName>', 'tool_call:<toolName>', args); the router dispatches to the registered handler; the handler's result becomes a follow-on event; the result is returned to the actor as the tool result.
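
The router-scoped mechanism above can be illustrated with a toy router. This is a sketch under stated assumptions, not the Beach event router: SketchRouter, onToolCall and observe are hypothetical names, and the 'tool_result:<toolName>' type used for the follow-on event is an assumption (the source only says the result becomes a follow-on event).

```typescript
// Hypothetical sketch of router-scoped dispatch; not the actual Beach router API.
interface RoutedEvent {
  source: string;  // e.g. 'actor:triage'
  type: string;    // e.g. 'tool_call:task_get'
  payload: unknown;
}

type Handler = (args: unknown) => Promise<unknown>;
type Observer = (event: RoutedEvent) => void;

class SketchRouter {
  private readonly handlers = new Map<string, Handler>();
  private readonly observers: Observer[] = [];

  // Register the handler that serves 'tool_call:<toolName>' events.
  onToolCall(toolName: string, handler: Handler): void {
    this.handlers.set(`tool_call:${toolName}`, handler);
  }

  // Other actors subscribe here to observe routed events.
  observe(observer: Observer): void {
    this.observers.push(observer);
  }

  async routeEvent(source: string, type: string, payload: unknown): Promise<unknown> {
    // The invocation itself is published as an event.
    this.publish({ source, type, payload });
    const handler = this.handlers.get(type);
    if (!handler) throw new Error(`No handler for event type: ${type}`);
    const result = await handler(payload);
    // The result becomes a follow-on event ('tool_result:' prefix is assumed).
    this.publish({ source, type: type.replace('tool_call:', 'tool_result:'), payload: result });
    return result; // returned to the calling actor as the tool result
  }

  private publish(event: RoutedEvent): void {
    for (const observer of this.observers) observer(event);
  }
}

const router = new SketchRouter();
const seen: string[] = [];
router.observe((event) => seen.push(`${event.source} ${event.type}`));
router.onToolCall('task_get', async (args) => ({ id: (args as { id: string }).id, title: 'Demo' }));

const pending = router.routeEvent('actor:triage', 'tool_call:task_get', { id: 't1' });
```

The observer sees both the call and its result, which is the property that makes router-scoped tools auditable by other actors.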

scope: 'specialist'

Tool calls bypass the router. The handler runs directly within the actor's invocation context.

When to use: tools that are strictly an actor's internal implementation on a private substrate. An email-research specialist with direct IMAP access. A spreadsheet actor with direct Excel MCP tools. A browser-automation actor with direct Playwright tools.

Mechanism: @cool-ai/beach-llm calls the handler directly. No routeEvent dispatch; the call does not appear as a tool_call:* event that other actors can observe or filter.

But execution is still logged. Specialist calls are appended to the event log as specialist_execution records — a distinct event class separate from routed tool events. These records carry: actor name, tool name, arguments, result (or truncated result for large payloads), duration, timestamp. The log is what @cool-ai/beach-evals uses to mock specialist calls during replay (see ../../packages/session/README.md) and what audit tooling uses to reconstruct a specialist's behaviour after the fact.
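
As an illustration of the direct-call-plus-log behaviour, here is a minimal sketch. The record fields follow the list above, but the type name SpecialistExecutionRecord, the function callSpecialistTool, the truncated flag, and the 2000-character limit are all assumptions made for the example. Failed handler calls are not logged in this sketch.

```typescript
// Hypothetical sketch; field names follow the prose, everything else is assumed.
interface SpecialistExecutionRecord {
  kind: 'specialist_execution';
  actorName: string;
  toolName: string;
  args: unknown;
  result: string;      // serialized result, truncated for large payloads
  truncated: boolean;
  durationMs: number;
  timestamp: string;   // ISO 8601
}

const MAX_RESULT_CHARS = 2000; // illustrative limit only
const eventLog: SpecialistExecutionRecord[] = [];

async function callSpecialistTool<A, R>(
  actorName: string,
  toolName: string,
  handler: (args: A) => Promise<R>,
  args: A,
): Promise<R> {
  const startedAt = Date.now();
  const result = await handler(args); // direct call: no routeEvent, no tool_call:* event
  const serialized = JSON.stringify(result ?? null);
  eventLog.push({
    kind: 'specialist_execution',
    actorName,
    toolName,
    args,
    result: serialized.slice(0, MAX_RESULT_CHARS),
    truncated: serialized.length > MAX_RESULT_CHARS,
    durationMs: Date.now() - startedAt,
    timestamp: new Date(startedAt).toISOString(),
  });
  return result;
}

// Stand-in for a private IMAP search owned by the Email Researcher.
const emailSearch = async (args: { query: string }): Promise<string[]> => [`hit for ${args.query}`];

const done = callSpecialistTool('email-researcher', 'email_search', emailSearch, { query: 'invoice' });
```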

The distinction is between routing observability (which actors can react to an event mid-turn) and logging (which the system always does for audit and replay). Specialists own their internals, so other actors cannot react to their tool calls, but the record is still there.

Constraint: a tool registered as specialist should only be exposed to actors for which it is internal work, never as a way to reach shared system resources. Beach does not enforce this at runtime; it is a design discipline. The naming convention for specialist tools (e.g. email_search for the Email Researcher's private IMAP access) makes the scope visible in the actor config.

Per-actor filtering

Actor configs filter from the registry by name:

{
  "name": "triage",
  "tools": ["task_get", "task_list", "missive_list", "person_lookup"]
}

respond() is not listed — it is provided automatically by @cool-ai/beach-llm as architectural infrastructure, not a consumer-registered tool.

At callActor time:

  1. @cool-ai/beach-llm resolves each name against the registry.
  2. Unknown names cause the actor invocation to fail with a clear error (not silently ignored).
  3. The resolved inputSchemas are assembled into the LLM's tool schema.
  4. When the LLM calls a tool, @cool-ai/beach-llm validates the arguments against the schema.
  5. Dispatch proceeds per scope: router goes through routeEvent; specialist calls the handler directly.

This is the filter — and it is declarative, readable in the consumer's repo without running code.
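
The five steps above can be sketched as plain functions. Everything here is hypothetical (resolveTools, toLlmToolSchema, validateArgs, dispatchPath and the ToolDef shape are illustrative names, not Beach's API), and the validation step is a stand-in that only checks required keys; a real implementation would presumably use a full JSON Schema validator.

```typescript
// Hypothetical sketch of the resolution steps; names are illustrative.
type Scope = 'router' | 'specialist';

interface ToolDef {
  name: string;
  inputSchema: { required?: string[] };
  scope: Scope;
}

interface ActorConfig {
  name: string;
  tools: string[];
}

// Steps 1-2: resolve every configured name; unknown names fail loudly.
function resolveTools(actor: ActorConfig, registry: Map<string, ToolDef>): ToolDef[] {
  const unknown = actor.tools.filter((name) => !registry.has(name));
  if (unknown.length > 0) {
    throw new Error(`Actor "${actor.name}" references unknown tools: ${unknown.join(', ')}`);
  }
  return actor.tools.map((name) => registry.get(name)!);
}

// Step 3: assemble the schema list handed to the LLM.
function toLlmToolSchema(tools: ToolDef[]): { name: string; inputSchema: ToolDef['inputSchema'] }[] {
  return tools.map((tool) => ({ name: tool.name, inputSchema: tool.inputSchema }));
}

// Step 4: stand-in validation that only checks required keys.
function validateArgs(tool: ToolDef, args: Record<string, unknown>): void {
  for (const key of tool.inputSchema.required ?? []) {
    if (!(key in args)) throw new Error(`${tool.name}: missing required argument "${key}"`);
  }
}

// Step 5: choose the dispatch path from the tool's scope.
function dispatchPath(tool: ToolDef): 'routeEvent' | 'direct' {
  return tool.scope === 'router' ? 'routeEvent' : 'direct';
}

const registry = new Map<string, ToolDef>([
  ['task_get', { name: 'task_get', inputSchema: { required: ['id'] }, scope: 'router' }],
  ['email_search', { name: 'email_search', inputSchema: { required: ['query'] }, scope: 'specialist' }],
]);

const triage: ActorConfig = { name: 'triage', tools: ['task_get'] };
const resolved = resolveTools(triage, registry);
const llmSchema = toLlmToolSchema(resolved);
```

Note that email_search is registered but never reaches the triage actor's schema, because the actor config does not name it.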

HITL approval integration

Tools registered with requiresApproval: true (or a policy) are intercepted before execution. See design principle 2.6 ("no unreviewed action on side-effecting tools").

// Conceptual.
tools.register({
  name: 'book_flight',
  inputSchema: { /* ... */ },
  handler: async (args) => { /* ... */ },
  scope: 'router',
  requiresApproval: {
    autoApproveIf: "args.totalCost < 50 && ctx.user.tier == 'trusted'"
  }
});

On a matching tool call:

  1. @cool-ai/beach-llm does not execute the handler.
  2. The session emits an approval-request part in the envelope.
  3. The session enters turnState: 'suspended'.
  4. An approval-response injection arrives (from any channel — chat button, email reply, WhatsApp yes/no).
  5. If approved, the handler executes and its result is injected into the original turn's mailbox.
  6. If denied, a standard ToolDenied result is injected; the LLM reacts to it.

Approval policy expressions are evaluated against a pinned context (tool args, actor identity, session user). Consumers can also provide a policy-handler function for more elaborate logic.
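
The flow above, using a policy-handler function rather than an expression string, might look like the following sketch. All names (SketchSession, callTool, injectApprovalResponse, PolicyContext) are hypothetical, handlers are synchronous for brevity, the shape of the ToolDenied result is an assumption, and so is the return to a 'running' state after the response is injected.

```typescript
// Hypothetical sketch of approval interception; not Beach's session API.
type TurnState = 'running' | 'suspended';
type Args = Record<string, unknown>;

interface PolicyContext {
  userTier: string;
}

// Policy-handler function: returns true when the call may run without review.
type PolicyHandler = (args: Args, ctx: PolicyContext) => boolean;

interface ApprovalTool {
  name: string;
  handler: (args: Args) => unknown; // synchronous here for brevity
  requiresApproval: boolean | PolicyHandler;
}

class SketchSession {
  turnState: TurnState = 'running';
  readonly envelope: string[] = []; // parts emitted to the user
  readonly mailbox: unknown[] = []; // results injected back into the turn
  private pending: { tool: ApprovalTool; args: Args } | undefined = undefined;

  callTool(tool: ApprovalTool, args: Args, ctx: PolicyContext): void {
    const policy = tool.requiresApproval;
    const needsApproval = typeof policy === 'function' ? !policy(args, ctx) : policy;
    if (!needsApproval) {
      this.mailbox.push(tool.handler(args));
      return;
    }
    // Steps 1-3: skip the handler, emit an approval request, suspend the turn.
    this.pending = { tool, args };
    this.envelope.push(`approval-request:${tool.name}`);
    this.turnState = 'suspended';
  }

  // Steps 4-6: an approval response arrives from some channel.
  injectApprovalResponse(approved: boolean): void {
    if (!this.pending) throw new Error('No tool call is awaiting approval');
    const { tool, args } = this.pending;
    this.pending = undefined;
    this.mailbox.push(approved ? tool.handler(args) : { error: 'ToolDenied', tool: tool.name });
    this.turnState = 'running'; // assumption: the turn resumes after the response
  }
}

const bookFlight: ApprovalTool = {
  name: 'book_flight',
  handler: (args) => `booked for ${String(args['totalCost'])}`,
  requiresApproval: (args, ctx) => Number(args['totalCost']) < 50 && ctx.userTier === 'trusted',
};

const session = new SketchSession();
const states: TurnState[] = [];
session.callTool(bookFlight, { totalCost: 20 }, { userTier: 'trusted' });  // auto-approved
states.push(session.turnState);
session.callTool(bookFlight, { totalCost: 900 }, { userTier: 'trusted' }); // needs review
states.push(session.turnState);
session.injectApprovalResponse(false);                                     // reviewer denies
states.push(session.turnState);
```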

Tag-based policy filtering (v2)

The tags field reserves room for a future capability — dynamic, context-sensitive filtering beyond static names.

Example shape (v2):

// Conceptual — v2.
configurePolicy({
  rule: "actor.scope != 'admin' && tool.tags.includes('destructive')",
  effect: 'hide'
});

In v1, tags are captured on registration but not consumed by any policy engine. Reserving the field now avoids a schema migration later.
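
To make the intended v2 behaviour concrete, here is one speculative way a 'hide' rule could filter an actor's tools by tag. Nothing here exists in v1: visibleTools, PolicyRule and ActorInfo are hypothetical, the rule is written as a predicate function rather than an expression string, and task_purge is an invented tool name used only for the example.

```typescript
// Speculative sketch of tag-based hiding; not an existing Beach feature.
interface TaggedTool {
  name: string;
  tags: string[];
}

interface ActorInfo {
  name: string;
  scope: string;
}

interface PolicyRule {
  matches: (actor: ActorInfo, tool: TaggedTool) => boolean;
  effect: 'hide';
}

// A tool stays visible unless at least one rule matches it for this actor.
function visibleTools(actor: ActorInfo, tools: TaggedTool[], rules: PolicyRule[]): TaggedTool[] {
  return tools.filter((tool) => !rules.some((rule) => rule.matches(actor, tool)));
}

// Same condition as the conceptual rule: non-admin actors lose destructive tools.
const hideDestructive: PolicyRule = {
  matches: (actor, tool) => actor.scope !== 'admin' && tool.tags.includes('destructive'),
  effect: 'hide',
};

const allTools: TaggedTool[] = [
  { name: 'task_list', tags: ['read'] },
  { name: 'task_purge', tags: ['write', 'destructive'] },
];

const forTriage = visibleTools({ name: 'triage', scope: 'standard' }, allTools, [hideDestructive])
  .map((tool) => tool.name);
const forAdmin = visibleTools({ name: 'ops', scope: 'admin' }, allTools, [hideDestructive])
  .map((tool) => tool.name);
```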

The context argument

Handlers receive (args, context). The context carries:

  • actorName — which actor is calling.
  • sessionId — current session.
  • turnId — current turn.
  • userId, personId, tenantId — identity, where known.
  • approvalDecision? — populated for calls resumed after approval.

This gives handlers the same information the router has. Consumer policies and handlers can branch on identity and tenant.
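
A handler that branches on tenant might look like this minimal sketch. The ToolContext field names come from the list above; their types, the shape of approvalDecision, and the in-memory tasksByTenant store are assumptions for illustration.

```typescript
// Hypothetical typing of the context argument; field names follow the list above.
interface ToolContext {
  actorName: string;
  sessionId: string;
  turnId: string;
  userId?: string;
  personId?: string;
  tenantId?: string;
  approvalDecision?: { approved: boolean }; // shape is an assumption
}

// Stand-in data store keyed by tenant.
const tasksByTenant = new Map<string, string[]>([
  ['acme', ['Renew contract', 'Send invoice']],
  ['globex', ['Book venue']],
]);

// The handler refuses to run without a tenant and never reads across tenants.
async function taskListHandler(_args: unknown, ctx: ToolContext): Promise<string[]> {
  if (!ctx.tenantId) {
    throw new Error(`task_list called by ${ctx.actorName} without a tenant`);
  }
  return tasksByTenant.get(ctx.tenantId) ?? [];
}

const acmeTasks = taskListHandler({}, {
  actorName: 'triage',
  sessionId: 's1',
  turnId: 't1',
  tenantId: 'acme',
});
```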

Why a dedicated registry, not just inline config

Alternatives considered:

  • Tools inline per actor. Duplicative when multiple actors share tools. Schema duplication becomes a source of drift.
  • Tools imported as decorated functions. Works for code-first design but is not declarative: a reader of the repo can't see what tools exist without running code.
  • A shared module of tool constants. Workable but offers no lifecycle hooks (registration ordering, validation, approval interception, scope enforcement) without more ceremony than a registry.

A registry is the smallest primitive that gives uniform enforcement across all these concerns.

respond() is not a registered tool

The respond() tool is architectural infrastructure, not a consumer-registered tool. It does not live in the ToolRegistry. Every actor is given respond() automatically by @cool-ai/beach-llm; its schema is known to the library; its parsing is the library's job (not a handler the consumer wires up).

Treating respond() as just-another-registered-tool muddles its privileged role. The ToolRegistry is for consumer-declared domain tools — the tools an application provides to its actors. respond() sits alongside the registry, as part of the runtime contract between every actor and the session manager.

Related

  • ../design-principles.md — principle 2.4 (tools constrain; prompts guide); principle 2.5 (specialists own their internals); principle 2.6 (no unreviewed action on side-effecting tools).
  • ../../packages/llm/README.md — the package that owns the registry.
  • respond-tool.md — the architectural respond() tool, which is not in the ToolRegistry.
  • ../../packages/core/README.md — the event router that dispatches router-scoped tool calls.