Beach — Design Principles
Scope. Beach-specific. These principles apply to applications built using Beach packages; they describe Beach's architectural invariant and the discipline that follows from it. The underlying ideas (router-as-boundary, opaque interior, participant discipline) are portable concepts, but the concrete commitments here — respond() discipline, the ManifestRegistry, the canonical pipeline — are Beach implementations.
Beach is an application scaffold with a protocol-agnostic interior. Inbound and outbound protocols are replaceable edges. Domain logic never moves when the standards change.
These principles are what that position means in practice. They apply to every package and to every consumer that adopts Beach.
Part 1 — The architectural invariant
The following five statements are not design choices — they are the shape from which everything else is derived. A Beach application that violates any of them is not a Beach application.
1.1 The router and the manifest registry are the architectural centre
Every participant produces events through routeEvent() and consumes results through the ManifestRegistry. The router and the registry together are the architectural centre. Every edge translates to and from this centre; every participant operates inside it.
The packages that own the centre are @cool-ai/beach-core (the event router and manifest registry) and @cool-ai/beach-session (the turn lifecycle built on the registry). Nothing else is the centre.
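Stripped to its shape, the centre's surface might look like the following sketch. Only routeEvent() and the ManifestRegistry name come from Beach; every other type, field, and method here is illustrative, not the real @cool-ai/beach-core API.

```typescript
// Illustrative sketch, not the real @cool-ai/beach-core surface.
// Every participant's output is an event handed to routeEvent();
// every in-flight result is tracked under a named manifest entry.
type BeachEvent = { id: string; type: string; payload: unknown };
type Handler = (event: BeachEvent) => void;

class Router {
  private routes = new Map<string, Handler[]>();
  private log: BeachEvent[] = [];

  on(type: string, handler: Handler): void {
    const list = this.routes.get(type) ?? [];
    list.push(handler);
    this.routes.set(type, list);
  }

  // The single choke point: every event is logged, then dispatched.
  routeEvent(event: BeachEvent): void {
    this.log.push(event);
    for (const handler of this.routes.get(event.type) ?? []) handler(event);
  }
}

class ManifestRegistry {
  private open = new Map<string, { name: string; openedAt: number }>();

  openManifest(id: string, name: string): void {
    this.open.set(id, { name, openedAt: Date.now() });
  }

  settle(id: string): boolean {
    return this.open.delete(id);
  }

  // Every in-flight operation is inspectable by name.
  inFlight(): string[] {
    return [...this.open.values()].map((m) => m.name);
  }
}
```

Edges and participants see only these two surfaces; nothing else in the interior is addressable.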
1.2 The centre is opaque
Edges never see each other. Participants never see protocols. An inbound adapter does not know which actor will receive its event. An actor does not know which outbound channel will deliver its response. A channel does not know what the actor did internally.
This is not an aspiration; it is a structural constraint. The only components permitted to observe what enters and exits are the router, which logs it, and beach-inspect, which reads the log.
1.3 Inbound and outbound protocols are adapters and plugins — and nothing else
Inbound adapters (SSE, A2A, IMAP, AMQP) translate external protocols into routeEvent() calls. Outbound plugins (SMTP, A2A peer, HTTP webhook) translate routed events into external protocol calls. These adapters and plugins are the only place in the codebase where an external specification is implemented.
No interior component reads a protocol-specific field, a channelId, or a transport-specific identifier. If it does, it is in the wrong place.
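The translation discipline can be illustrated with a hypothetical IMAP adapter. The ImapMessage shape and field names are assumptions; the point is that protocol identifiers (uid, folder) stop at the edge and never reach the interior event.

```typescript
// Sketch of the adapter discipline, not a real Beach adapter API.
type InteriorEvent = { type: string; payload: unknown };
type RouteEvent = (event: InteriorEvent) => void;

// What a raw IMAP fetch might hand the adapter (hypothetical shape).
interface ImapMessage { uid: number; folder: string; from: string; body: string }

function imapInboundAdapter(routeEvent: RouteEvent, msg: ImapMessage): void {
  // Translation is the adapter's whole job: protocol identifiers
  // (uid, folder) are NOT copied onto the interior event.
  routeEvent({
    type: "missive.received",
    payload: { sender: msg.from, text: msg.body },
  });
}
```

If an interior component ever needs `uid` or `folder`, that is the signal the logic belongs in the adapter, not the interior.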
1.4 Participants include deterministic handlers, LLM actors, and long-running processes — the centre treats them identically
An event handler is a participant. An LLM actor is a participant. A durable workflow is a participant. All of them receive events, do work, and emit results through routeEvent(). The session manager, the router, and the audit log do not distinguish between them.
The LLM is not the centre. It is one participant type, with its own sub-rules (see Part 2) that apply precisely because it is interior — not because it is special.
1.5 Every asynchronous, suspended, or multi-hop flow is a manifest waiting for a result
When an actor waits for background research, the in-flight work is tracked by a Manifest. When a batched outbound edge waits for a turn to complete, it holds open a Delivery Manifest. When a multi-specialist fanout reassembles, a ResultsCollector wraps a manifest. There is no other primitive for asynchronous coordination.
This is what makes the system auditable: every in-flight operation has a named, inspectable manifest entry. Nothing disappears into a callback heap.
Part 2 — Participant discipline
These rules describe what every participant inside the opaque interior must do. The LLM-specific applications are called out explicitly; the underlying rule is architectural.
2.1 The router is the boundary
Every message crossing a trust boundary — between a human and an AI, or between two AIs — is a routeEvent() call. Nothing bypasses it. This is what makes audit, observability, capability enforcement, and testability possible.
There are no "direct" paths. Not for performance. Not for convenience. Not for "just this one case."
2.2 LLMs never talk directly to each other
An LLM emits an event; the router dispatches it; the result returns as a follow-on event. This rule is absolute.
The cost: a handful of extra event hops per multi-agent exchange. The value: every LLM-to-LLM exchange is a routable, filterable, loggable event — and no LLM can deputise another without system permission.
2.3 LLMs never emit free text
LLMs respond by calling the respond() tool with structured parts and a turn-state marker. They do not write prose that the system then has to parse with regex. They do not declare completion in the body of a message. They do not route themselves.
This is the single strictest rule in the toolkit. It prevents the bracket-tag brittleness that informal formats produce at scale, and it makes turn lifecycle legible to every downstream consumer.
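A sketch of what a respond() call might carry, using the turn states listed in 2.8. The Part and RespondCall shapes are assumptions; the discipline they illustrate (structured parts plus an explicit turnState, never parsed prose) is the rule.

```typescript
// Sketch of the respond() payload shape; field names beyond
// respond()/turnState are illustrative.
type Part =
  | { kind: "text"; text: string }
  | { kind: "data"; mimeType: string; data: unknown };

type TurnState =
  | "awaiting" | "complete" | "clarifying"
  | "error" | "suspended" | "delegated";

interface RespondCall { parts: Part[]; turnState: TurnState }

// A downstream consumer never parses prose: it switches on turnState.
function isSettled(call: RespondCall): boolean {
  return call.turnState === "complete" || call.turnState === "error";
}
```

No regex, no bracket tags: every consumer reads the same typed marker.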
2.4 Tools constrain; prompts guide
Capability boundaries are enforced at the tool layer, not by prompt instructions. A "triage" actor that must not send outbound messages does not have missive_create in its tool list — a prompt rule alone is not sufficient.
Prompts describe intent and behaviour. Tools enforce permission.
The concrete enforcement layer is the Tool Registry in @cool-ai/beach-llm — every tool is declared once, and actor configs filter from it by name. Unknown tool names fail loudly; there is no silent fallback. See protocol/tool-registry.md.
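A minimal sketch of the declare-once, filter-by-name discipline. The class and method names are illustrative, not the real @cool-ai/beach-llm API; what matters is that a misspelled tool name throws at startup instead of silently vanishing.

```typescript
// Sketch only: declare every tool once, filter per actor by name,
// fail loudly on anything unknown.
type Tool = { name: string; execute: (args: unknown) => unknown };

class ToolRegistry {
  private tools = new Map<string, Tool>();

  declare(tool: Tool): void {
    if (this.tools.has(tool.name)) throw new Error(`duplicate tool: ${tool.name}`);
    this.tools.set(tool.name, tool);
  }

  // Actor configs select by name; a typo is a startup error, not a no-op.
  forActor(names: string[]): Tool[] {
    return names.map((n) => {
      const tool = this.tools.get(n);
      if (!tool) throw new Error(`unknown tool: ${n}`);
      return tool;
    });
  }
}
```

An actor without missive_create in its filtered list cannot send outbound messages, whatever its prompt says.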
2.5 Specialists own their internals
Rules 2.1 and 2.2 apply to boundaries between actors, not to a specialist's internal work. An email-research specialist with IMAP access does not route each internal IMAP operation through the event router — that is its private implementation.
The line is: if a tool call touches data or behaviour that other actors might need to read, audit, or contest, it goes through the router. If it is strictly the specialist's internal operation on its own private substrate, it does not.
Tools register with scope: 'router' (observable) or scope: 'specialist' (private). See protocol/tool-registry.md.
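As a sketch, the scope flag might decide which record a tool execution produces. The function and record names here are illustrative; specialist_execution is the record named in 3.7.

```typescript
// Sketch: scope decides whether an execution is routed (observable)
// or recorded privately for replay.
type ToolScope = "router" | "specialist";

interface ScopedTool { name: string; scope: ToolScope }

function executionRecordType(tool: ScopedTool): string {
  // Router-scope calls become routed events; specialist-scope calls
  // emit a specialist_execution record so replay can mock them.
  return tool.scope === "router" ? "tool.executed" : "specialist_execution";
}
```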
2.6 No unreviewed action on side-effecting tools
An LLM does not send outbound messages, book flights, move money, delete records, or trigger deploys without approval. It produces a draft or an approval request; a deterministic handler executes the approved action on behalf of the event system.
Tools are marked requiresApproval: true | ApprovalPolicy in actor config. The LLM layer intercepts matching calls, emits an approval-request envelope part, and enters the suspended turn state. An approval decision from any channel resumes the tool.
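The interception path can be sketched as follows. The ApprovalPolicy, ToolCall, and Outcome shapes are assumptions; only requiresApproval and the suspended state come from Beach.

```typescript
// Sketch of the interception path: a matching call becomes an
// approval request and the turn suspends instead of executing.
type ApprovalPolicy = true | { channels: string[] };

interface ToolCall { tool: string; args: unknown }

interface ActorConfig { requiresApproval: Record<string, ApprovalPolicy> }

type Outcome =
  | { kind: "executed"; result: unknown }
  | { kind: "suspended"; approvalRequest: ToolCall };

function interceptToolCall(
  config: ActorConfig,
  call: ToolCall,
  execute: (call: ToolCall) => unknown,
): Outcome {
  if (call.tool in config.requiresApproval) {
    // Emit an approval-request envelope part; the turn enters "suspended".
    return { kind: "suspended", approvalRequest: call };
  }
  return { kind: "executed", result: execute(call) };
}
```

The approval decision arrives later as an ordinary routed event; the suspended tool call resumes from the manifest, not from a blocked thread.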
2.7 Channel-agnostic actors
An LLM in the interior does not know which channel a message arrived on. It does not know whether its response will be rendered in a browser, delivered by email, played via TTS, or shown on a visual surface to a peer agent. The channel is a concern for the inbound and outbound edges.
The principle is positional, not mechanical. Three actor roles:
- Orchestrating actor — interior; channel-blind. TA Concierge, PO Baxter.
- Interior specialist — interior; channel-blind. TA Researcher, PO email-triage.
- Composer specialist — edge; channel-aware. EmailComposer, future WhatsAppComposer.
The first two are interior and forbidden channel knowledge. The third is part of a channel's outbound plumbing and allowed to know its channel — that is precisely what it is there to know. The Composer emits only connective prose with placeholder tokens; deterministic Content Renderers substitute the structured content. See @cool-ai/beach-llm README § Actor taxonomy and @cool-ai/beach-format README.
Any other component that reads channelId, channelClass, or any channel-shaped property is either an edge component in the wrong place or a bug.
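A sketch of the placeholder substitution a Content Renderer might perform. The {{part:id}} token syntax is an assumption; the discipline is that the Composer writes only connective prose and the renderer splices in structured content deterministically.

```typescript
// Sketch: the Composer's draft carries placeholder tokens; a
// deterministic renderer substitutes the structured content.
function renderComposerDraft(
  draft: string,
  parts: Record<string, string>,
): string {
  return draft.replace(/\{\{part:(\w+)\}\}/g, (_match, id) => {
    const rendered = parts[id];
    // A token with no matching part is a bug, not prose to ship.
    if (rendered === undefined) throw new Error(`missing part: ${id}`);
    return rendered;
  });
}
```

The Composer never sees the rendered markup, and the renderer never generates prose; each side stays inside its competence.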
2.8 Explicit turn completion
A turn is settled when the LLM says so, via turnState in its respond() call. The system does not infer completion from tool state.
Canonical turn states: awaiting, complete, clarifying, error, suspended (HITL approval pending), delegated (peer-agent call in flight). Inference-based turn completion is forbidden.
Part 3 — Operational properties
These rules describe how the interior operates. They apply regardless of participant kind.
3.1 Background results dispatch immediately
When a background operation completes (research, peer agent response, system event), the router dispatches the result to the relevant participant immediately — not on the next user turn. Injections are never queued and never dropped.
3.2 Declarative configuration, open registries
Routing, filtering, agent registries, and capability tables are JSON or equivalent declarative files. A reader of the repo can see without reading code: which events exist and where they go; which agents are registered as peers; which tools each actor has.
Extensibility concerns — part types, turn states, transport protocols — are open registries with canonical core sets. Beach ships the canonical set; consumers register additional values at startup. Agent Cards advertise the concrete set so capability negotiation works.
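An open registry with a canonical core set might be sketched like this, using the turn states from 2.8 as the canonical set. The OpenRegistry class and method names are illustrative, not Beach's real API.

```typescript
// Sketch: Beach ships the canonical values; consumers register
// additional ones at startup, and the Agent Card advertises the union.
class OpenRegistry {
  private values: Set<string>;

  constructor(canonical: string[]) {
    this.values = new Set(canonical);
  }

  register(value: string): void {
    this.values.add(value);
  }

  has(value: string): boolean {
    return this.values.has(value);
  }

  // What an Agent Card would advertise for capability negotiation.
  advertised(): string[] {
    return [...this.values].sort();
  }
}

// Canonical turn states from this document; "escalated" is a
// hypothetical consumer-registered addition.
const turnStates = new OpenRegistry([
  "awaiting", "complete", "clarifying", "error", "suspended", "delegated",
]);
turnStates.register("escalated");
```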
3.3 Multiple models for cost and speed
Not every task warrants the primary conversational model. Triage, relevance filtering, digest summaries, and routine translation use fast, cheap models. Primary conversation uses the capable model. The model choice is per actor, declarative, and swappable.
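Per-actor model selection reduces to a declarative table. The actor names echo this document's examples; the model identifiers and config shape are assumptions, not Beach's real schema.

```typescript
// Sketch of per-actor, declarative, swappable model choice.
const actorModels: Record<string, { model: string; rationale: string }> = {
  concierge: { model: "primary-large", rationale: "main conversation" },
  "email-triage": { model: "fast-small", rationale: "relevance filtering" },
  digest: { model: "fast-small", rationale: "routine summarisation" },
};

function modelFor(actor: string): string {
  const entry = actorModels[actor];
  // An unconfigured actor is a startup error, not a silent default.
  if (!entry) throw new Error(`unknown actor: ${actor}`);
  return entry.model;
}
```

Swapping the triage model is a one-line config change; no interior code mentions a model name.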
3.4 Async throughout
LLM calls do not block the event loop. Results return as events. UI updates arrive via pub/sub. Long-running research does not hold HTTP connections. Heartbeats replace timeouts for long-running turns.
3.5 Retry for flaky work; fail-loud for logic errors
LLM calls retry via a bounded queue with exponential backoff. Deterministic code does not retry — if a handler throws, the error is logged and surfaced. Retrying a logic error hides bugs.
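A sketch of the retry side, assuming bounded attempts with exponential backoff; the function name and defaults are illustrative. Deterministic handlers get no such wrapper: a throw is logged and surfaced, full stop.

```typescript
// Sketch: bounded retry with exponential backoff for flaky work
// (LLM calls). Deterministic handlers are invoked directly.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Back off before the next attempt: 200ms, 400ms, 800ms, ...
      if (attempt < maxAttempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  // The budget is exhausted: surface the last error loudly.
  throw lastError;
}
```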
3.6 Observability is design-time, not after-the-fact
The router, LLM provider, tool executor, and transport adapter all emit OpenTelemetry spans by default. Event IDs become span IDs; routing becomes span parentage. Retrofitting observability later breaks event shapes that downstream code already depends on; doing it from day one is cheap.
3.7 Replay and evaluation are first-class
Session replay (reconstructing mailbox state from the event log) is part of @cool-ai/beach-session's interface. Evaluation primitives live in @cool-ai/beach-evals. Audit without assertions is just logging; evaluation without replay is guesswork.
The event log is Beach's replay substrate. Every routed event is appended. Every specialist-scope tool execution emits a specialist_execution record so replay can mock it. Routing observability and logging share infrastructure but are distinct concerns.
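Replay over the log can be sketched as a fold: routed events are re-dispatched, and specialist_execution records become mocks for the specialist's private substrate. The record shapes here are assumptions.

```typescript
// Sketch: replay walks the append-only log in order. Routed events
// are re-dispatched; specialist_execution records are collected so
// replay can substitute recorded results for private work.
type LogRecord =
  | { kind: "event"; type: string; payload: unknown }
  | { kind: "specialist_execution"; tool: string; result: unknown };

function replay(
  log: LogRecord[],
  dispatch: (type: string, payload: unknown) => void,
): Map<string, unknown> {
  const mocks = new Map<string, unknown>();
  for (const record of log) {
    if (record.kind === "event") {
      dispatch(record.type, record.payload);
    } else {
      // Never re-execute private IMAP/file-system work on replay;
      // substitute the recorded result instead.
      mocks.set(record.tool, record.result);
    }
  }
  return mocks;
}
```

Evaluation then becomes assertions over the replayed state rather than over live, non-deterministic runs.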
Part 4 — Versioning and ecosystem posture
4.1 Agents are peers
No agent is subordinate to another. Any agent can initiate a conversation with any other agent. Discovery is via Agent Card at /.well-known/agent-card.json. Hierarchy is a consumer choice, not an infrastructure requirement.
4.2 Versioning honours consumers
Every package is independently semver'd. Breaking changes require a major bump. Consumers may pin and defer — Beach never forces a migration.
4.3 Don't reinvent where the ecosystem has it better
Model-provider breadth: delegate to Vercel AI SDK. Durable execution: delegate to Temporal / Inngest / Durable Objects. A2A and A2UI spec implementation: wrap Google's first-party references where they exist.
Beach's product is the discipline, not the breadth.
Design posture
Designed for the full range. Implemented in stages. Stub what's not yet needed with a paper trail.
Cool Digital has many projects and the next is unknown. Beach accommodates the full range of AI-agent shapes so the foundations aren't forgotten. Implementation is staged — what TA and Organiser need ships as working code; what no current consumer needs ships as a stub with a docstring pointing at the decision that motivates it.
Every closed enum is a future breaking change. Every interface omitted now is a retrofit tomorrow. Design interfaces for concerns we foresee, even if we only implement the parts we need.
What these principles produce
Consumers who follow these principles get:
- Auditability. Every message is logged, every decision is traceable, every capability is declared.
- Composability. Participants are small, channel-agnostic units that combine by event routing rather than direct wiring.
- Testability. The router is a pure function over declarative config. Handlers are pure functions of events. Both are straightforward to test; replay and evals make regression testing the default.
- Honesty. The system does what it says it does, because it has no way to do otherwise.
- Extensibility. Open registries mean new concerns land without library changes.
What these principles forbid
- Hidden reasoning. Every LLM invocation's inputs and outputs are legible in the event log.
- Autonomous side-effects beyond tool scope.
- LLMs routing themselves or deputising each other.
- Bracket-tag parsing, regex on model output, or any other way of inferring structure from prose.
- Closed enums for concerns that are likely to extend.
- Any interior component reading channel identity.
Beach serves; he doesn't scheme. These principles are the shape that commitment takes in code.