Being Consumed by Other Applications

Your Beach application has a useful capability — answering travel questions, managing tasks, classifying inbound correspondence — and other applications now wish to use it. This guide describes how to expose it so that they can.

Three publishing shapes are common, and a single Beach application may use any combination of them:

  • the A2A peer shape, where your application becomes an agent that other agents converse with;
  • the MCP server shape, where your application exposes a curated set of typed functions that any compatible agent may call; and
  • the webhook receiver shape, where your application accepts fire-and-forget event deliveries (covered in Exposing your application through multiple protocols).

These are not alternatives. Each suits a different kind of integration partner, and most production applications use at least two of them at once.

Step 1 — Decide what you are exposing, and on which surfaces

Two questions need to be answered before any code is written. They are independent of one another.

The first question is about each behaviour's shape. A behaviour is either conversational — it admits clarification, partial results, and asynchronous delivery, of the form "suggest a destination for a quiet long weekend" or "investigate this customer's recent activity and tell me what stands out" — or it is function-shaped: a single named function with a fully typed contract, of the form "look up customer X's order history" or "compute the booking total for this itinerary". The conversational shape is what Beach calls a capability: it triggers the orchestrator (or some other participant) and the peer does not need to know how the answer is produced. The function shape is what Beach calls a tool: stateless from the caller's perspective, fully typed at input and output, no conversation.

The second question is which protocol surfaces should expose each behaviour. The shape of the behaviour does not determine the answer. A2A is the natural home for conversational capabilities; MCP is the natural home for typed tools; but a Beach application is free to expose the same behaviour through both — and most production applications do. suggest-destination can be reached as an A2A capability (the partner agent has a conversation, asks a clarifying question, gets a structured offer) and as an MCP tool (a Claude Desktop user invokes a one-shot call with a free-text brief and receives a single response). The orchestrator runs the same way underneath; the edge adapter shapes the contract presented at the surface.

Decide each behaviour's shape first; then decide each surface separately. The choice on the second is rarely "one or the other" — it is "which combination of A2A, MCP, REST, and webhook earns its keep with the audiences who will actually consume this behaviour".
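
The outcome of both decisions can usefully be written down as data before any adapter is wired. The record below is a hypothetical planning aid, not a Beach API; the behaviour ids mirror this guide's examples:

```typescript
// Illustrative exposure matrix: one entry per behaviour, shape decided
// first, surfaces decided per-behaviour afterwards.
type Shape = 'capability' | 'tool';
type Surface = 'a2a' | 'mcp' | 'rest' | 'webhook';

const exposure: Record<string, { shape: Shape; surfaces: Surface[] }> = {
  'suggest-destination':  { shape: 'capability', surfaces: ['a2a', 'mcp'] },
  'create-quote':         { shape: 'capability', surfaces: ['a2a'] },
  'lookup-pricing-rules': { shape: 'tool',       surfaces: ['mcp'] },
};
```

Keeping the matrix as data makes later audits mechanical: anything reachable on a surface that is not listed here is a mistake.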

Step 2 — Build your Agent Card

The Agent Card is your application's discovery contract. It tells other agents your name, your version, the transports through which you can be reached, the capabilities you offer, and the Beach extensions you support. The full schema is in Reference: Agent Card; the basics are in Your first agent card. The example below is enough for an A2A-publishing application:

import { buildAgentCard } from '@cool-ai/beach-protocol';

const agentCard = buildAgentCard({
  name: 'travel-concierge',
  version: '1.2.0',
  description: 'Suggests destinations and creates booking quotes.',
  transports: [
    { protocol: 'a2a', endpoint: 'https://travel.example.com/a2a' },
  ],
  capabilities: [
    {
      id: 'suggest-destination',
      description: 'Suggest a destination based on a free-text brief.',
      inputs:  [{ partType: 'user-message' }],
      outputs: [{ partType: 'response' }, { partType: 'domain-data' }],
      approval: 'none',
    },
    {
      id: 'create-quote',
      description: 'Create a booking quote.',
      inputs:  [{ partType: 'user-message' }],
      outputs: [{ partType: 'response' }, { partType: 'domain-data' }],
      approval: 'required',
    },
  ],
  extensions: {
    beach: {
      version: '1.0.0',
      envelope: {
        produces: ['response', 'domain-data', 'a2ui-surface'],
        consumes: ['user-message', 'peer-message'],
      },
      turnStates: ['awaiting', 'complete', 'clarifying', 'suspended'],
    },
  },
});

Mount the middleware so the card is served at the canonical well-known URL:

import express from 'express';
import { agentCardMiddleware } from '@cool-ai/beach-protocol';

const app = express();
app.use(agentCardMiddleware(agentCard));   // serves GET /.well-known/agent-card.json

Peers fetch the card, parse it, and decide whether to call you. The endpoints in the card must use your public hostnames: pointing them at internal endpoints is one of the more common deployment errors, and it leaves the card useless to anyone outside your network.
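
From the peer's side, discovery is a fetch and a sanity check. A minimal sketch, assuming only the card fields shown above; the endpoint validation is illustrative, catching on the consuming side the deployment error just described:

```typescript
interface Transport { protocol: string; endpoint: string }
interface AgentCard { name: string; version: string; transports: Transport[] }

// Reject cards that advertise non-public endpoints.
function assertPublicEndpoints(card: AgentCard): void {
  for (const t of card.transports) {
    const host = new URL(t.endpoint).hostname;
    if (host === 'localhost' || host.endsWith('.internal') || host.endsWith('.local')) {
      throw new Error(`card advertises a non-public endpoint: ${t.endpoint}`);
    }
  }
}

// Fetch the card from the canonical well-known URL and validate it.
async function discover(baseUrl: string): Promise<AgentCard> {
  const res = await fetch(`${baseUrl}/.well-known/agent-card.json`);
  if (!res.ok) throw new Error(`card fetch failed: ${res.status}`);
  const card = (await res.json()) as AgentCard;
  assertPublicEndpoints(card);
  return card;
}
```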

Step 3 — Wire the A2A inbound adapter

A2A traffic arrives as JSON-RPC over HTTP. The peer POSTs message/send to your A2A endpoint and waits for the JSON-RPC response carrying your reply. Beach's A2A inbound adapter handles the JSON-RPC layer for you and delivers the inbound Message to the router as an event:

import { A2AInboundAdapter } from '@cool-ai/beach-transport';

const a2a = new A2AInboundAdapter({
  router,
  agentCardPath: '/.well-known/agent-card.json',
  endpoint:      '/a2a',
});

await a2a.mount(app);

The adapter listens for the conventional A2A methods (message/send, message/stream, and so on), translates each incoming Message into Beach's internal event shape, dispatches it through the router under the routing key a2a:peer_message_received, and packs the orchestrator's eventual reply back into an A2A Message for the JSON-RPC response.

The routing rule and orchestrator handler are short:

router.loadRoutingConfig({
  rules: [
    /* … other rules … */
    { source: 'a2a', eventType: 'peer_message_received', handler: 'a2a-orchestrator' },
  ],
});

router.register('a2a-orchestrator', async (event, context) => {
  const { sessionId, turnId, peerMessage } = event.data;

  const respond = await sessionManager.runTurn({
    sessionId,
    turnId,
    slotKey:    'a2a-reply',
    actorId:    'concierge',
    actorConfig,
    provider,
    registry:   tools,
    inboundMessage: { role: 'user', content: peerMessage.parts[0]?.text ?? '' },
  });

  await context.routeEvent({
    source:    'a2a',
    eventType: 'reply_ready',
    data: { sessionId, turnId, parts: respond.parts, turnState: respond.turnState },
  });
});

The orchestrator that handles A2A traffic can be the same orchestrator you already use for chat or for email. It does not — and must not — read the channel identifier. The reply-dispatcher in the canonical pipeline routes the reply back into the A2A adapter on the strength of the channel-id alone, so the orchestrator stays channel-blind.

Step 4 — Wire the MCP inbound adapter

If you also want to expose particular tools as MCP-callable, the MCP inbound adapter sits alongside the A2A one:

import { MCPInboundAdapter } from '@cool-ai/beach-transport';

const mcp = new MCPInboundAdapter({
  router,
  toolRegistry: tools,
  exposedTools: ['lookup-pricing-rules', 'list-available-dates', 'get-customer-tier'],
  endpoint:     '/mcp',
});

await mcp.mount(app);

The adapter listens for MCP requests, looks up each requested tool in the supplied ToolRegistry, invokes the handler, and returns the result.
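
From the caller's perspective such a tool is just a typed function. A hypothetical handler for lookup-pricing-rules, with stand-in data rather than the real ToolRegistry registration API:

```typescript
// Illustrative input/output types: fully typed at both ends, stateless,
// no conversation -- the function shape from Step 1.
interface PricingRulesInput { destination: string }
interface PricingRule { id: string; description: string }

async function lookupPricingRules(input: PricingRulesInput): Promise<PricingRule[]> {
  // Stand-in data; a real handler would query the pricing service.
  const rules: Record<string, PricingRule[]> = {
    lisbon: [{ id: 'shoulder-season', description: '10% off May departures' }],
  };
  return rules[input.destination.toLowerCase()] ?? [];
}
```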

One constraint matters here: only tools whose scope is 'router' belong in exposedTools. Specialist-scope tools are private to a particular actor; making them callable from outside exposes internal implementation that callers should not see and that you have made no commitment to keep stable. Audit the list carefully — every entry is part of your public contract.
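
One way to make that audit mechanical is to check the requested list against the registry at startup, assuming the registry can report each tool's id and scope. The helper name and entry shape here are illustrative:

```typescript
// Assumed minimal view of a registry entry; the real ToolRegistry may
// carry more fields, but id and scope are what the audit needs.
interface ToolEntry { id: string; scope: 'router' | 'specialist' }

// Fail fast at startup if exposedTools names anything that is not a
// router-scope tool -- unknown ids are rejected too.
function routerScopedOnly(entries: ToolEntry[], requested: string[]): string[] {
  const routerIds = new Set(entries.filter(t => t.scope === 'router').map(t => t.id));
  const rejected = requested.filter(id => !routerIds.has(id));
  if (rejected.length > 0) {
    throw new Error(`not router-scoped or unknown: ${rejected.join(', ')}`);
  }
  return requested;
}
```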

Once MCP is wired, declare the additional transport on the Agent Card so that peers know you speak it:

transports: [
  { protocol: 'a2a', endpoint: 'https://travel.example.com/a2a' },
  { protocol: 'mcp', endpoint: 'https://travel.example.com/mcp' },
],

Step 5 — Authentication

Beach does not impose an authentication model. Pick the one that fits your application's existing trust regime. Three patterns recur.

Bearer tokens in the HTTP Authorization header are the most common. Your middleware verifies the token before the request reaches the router:

app.use('/a2a', (req, res, next) => {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token || !verifyJwt(token)) {
    res.status(401).json({ error: 'unauthorized' });
    return;
  }
  next();
});

Declare the requirement on the Agent Card so that peers know to supply a token:

capabilities: [
  {
    id: 'suggest-destination',
    auth: { type: 'bearer', tokenDescription: 'JWT signed by partner-agent-issuer' },
    /* … */
  },
],

Mutual TLS is stronger. The peer presents a client certificate, and verification happens at your TLS-terminating proxy (nginx, Caddy, a managed reverse proxy) before the request reaches Node at all. mTLS suits high-trust integrations between organisations that have agreed on certificate exchange beforehand.

API keys are simplest, and remain reasonable for low-value internal integrations, but rotation and revocation are entirely your problem. We do not recommend them for new external integrations where bearer tokens or mTLS will do.

The Beach extensions registry includes an authTypes registry; register additional types at startup if your organisation's standards require them.

Step 6 — Rate limiting and quotas

A peer can call your application as fast as the network permits, and you should impose a rate limit at the HTTP layer rather than relying on good behaviour:

import rateLimit from 'express-rate-limit';

app.use('/a2a', rateLimit({
  windowMs: 60_000,
  max:      60,
  message:  { error: 'rate-limit-exceeded' },
}));

For per-peer quotas — useful when several partner agents share a network egress and an IP-based limit would be too crude — key the limiter on the bearer token's claims rather than on the source address.
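
express-rate-limit accepts a keyGenerator option for exactly this. A sketch of the key function, reading the token's sub claim without re-verifying it (verification already happened in the auth middleware); the helper name is illustrative:

```typescript
// Derive a rate-limit bucket key from the bearer token's subject claim,
// falling back to the source address for unauthenticated or malformed
// requests. The payload is decoded, not verified, here.
function peerKey(authorization: string | undefined, sourceIp: string): string {
  const token = authorization?.split(' ')[1];
  if (token) {
    try {
      const payload = JSON.parse(
        Buffer.from(token.split('.')[1], 'base64url').toString(),
      );
      if (typeof payload.sub === 'string') return `peer:${payload.sub}`;
    } catch {
      /* malformed token: fall through to the IP key */
    }
  }
  return `ip:${sourceIp}`;
}
```

Plugged in as keyGenerator: (req) => peerKey(req.headers.authorization, req.ip ?? ''), each authenticated partner gets its own budget while unauthenticated traffic still falls under a per-address limit.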

Step 7 — Operational concerns

Three operational matters arise as soon as the application has outside callers.

Idempotency

A peer whose request times out and is retried should not cause your orchestrator to run twice. Capture the request's correlation identifier — the A2A messageId, the MCP request id, or your own header — and short-circuit duplicate invocations from a small cache:

router.register('a2a-orchestrator', async (event, context) => {
  const { sessionId, turnId, peerMessage } = event.data;
  const idempotencyKey = peerMessage.messageId;

  const cached = await idempotencyStore.get(idempotencyKey);
  if (cached !== undefined) {
    return context.routeEvent({
      source:    'a2a',
      eventType: 'reply_ready',
      data:      { sessionId, turnId, ...cached },
    });
  }

  // … run the orchestrator, cache the reply before returning.
});

The cache need not be durable across long horizons; a Redis-backed cache with a fifteen-minute TTL handles typical retry windows.
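
A minimal in-memory stand-in for that cache, offering the get/set interface the handler above expects; swap in Redis with an EX expiry for anything that runs in more than one process:

```typescript
// TTL-bounded idempotency cache. Expired entries are dropped lazily on
// read, which is enough for a cache whose keys arrive at most twice.
class IdempotencyStore<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  constructor(private ttlMs: number) {}

  async get(key: string): Promise<T | undefined> {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (Date.now() >= hit.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return hit.value;
  }

  async set(key: string, value: T): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

const idempotencyStore = new IdempotencyStore<object>(15 * 60_000); // fifteen-minute TTL
```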

Asynchronous capabilities

If a capability takes minutes to settle — research, generation, anything that makes a synchronous JSON-RPC connection unreliable — do not block the inbound. A2A's task model exists for this case: your immediate response acknowledges the request and returns a taskId rather than the eventual result. The work proceeds in the background. When complete, your application either accepts a poll for status or POSTs the result to a callback URL the peer supplied with the original request, with the taskId correlating delivery to call.

The task model becomes the right answer somewhere around fifteen seconds of worst-case latency. Below that, a synchronous JSON-RPC reply is conventional; above it, the peer's connection times out before you finish.
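
Sketched as data flow, with illustrative names (TaskRecord, startTask, pollTask) rather than the A2A SDK's own types:

```typescript
import { randomUUID } from 'node:crypto';

type TaskState = 'working' | 'complete' | 'failed';
interface TaskRecord { state: TaskState; result?: unknown; error?: string }

const tasks = new Map<string, TaskRecord>();

// The immediate JSON-RPC response carries only the taskId; the work
// settles in the background and the record is updated when it does.
function startTask(work: () => Promise<unknown>): string {
  const taskId = randomUUID();
  tasks.set(taskId, { state: 'working' });
  work()
    .then(result => tasks.set(taskId, { state: 'complete', result }))
    .catch(err => tasks.set(taskId, { state: 'failed', error: String(err) }));
  return taskId;
}

// Serves a later status poll; a callback delivery would POST the same
// record to the peer-supplied URL instead, keyed by the same taskId.
function pollTask(taskId: string): TaskRecord | undefined {
  return tasks.get(taskId);
}
```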

Versioning

Bump the version field on the Agent Card every time the application's contract changes. Peers that pin their integration to a semver range will follow your minor bumps quietly and warn on majors; peers that fetch the card afresh on each call will see the version and may invalidate any cached behaviour.

When a breaking change cannot be avoided — a capability whose input shape must change, or an output that has acquired a new required field — introduce it as a new capability with a new identifier (create-quote-v2, say) rather than mutating the existing one. Mark the old capability deprecated: true on the card so that peers see the warning, and keep both running for a generous transition window.
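
On the card, the transition window might look like this; the deprecated flag follows the convention just described, and the v2 description is illustrative:

```typescript
capabilities: [
  {
    id: 'create-quote',
    description: 'Create a booking quote. Deprecated: use create-quote-v2.',
    deprecated: true,
    /* … */
  },
  {
    id: 'create-quote-v2',
    description: 'Create a booking quote from an itinerary object.',
    /* … */
  },
],
```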

Pitfalls particular to being consumed

A few errors recur in applications exposed to outside callers, all the consequence of treating an external surface like an internal one.

Exposing specialist-scope tools through MCP. Specialist-scope tools are private to the actor that calls them; their inputs, outputs, and side effects are part of internal reasoning, not part of any external contract. Once a peer can call such a tool directly, you have lost the ability to refactor it without breaking the peer.

Trusting peer-supplied session identifiers. A peer that constructs a sessionId of "admin" should not, on the strength of that string, end up acting in the admin's session. Resolve sessions from the peer's authenticated identity on your side; treat any peer-supplied session string as a hint at most.
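
A sketch of the safe resolution, with illustrative names: any hint the peer supplies is scoped under the peer's verified identity, so two peers hinting "admin" land in two different sessions, and neither lands in yours:

```typescript
const sessionsByPeer = new Map<string, string>();

// verifiedPeerId comes from the authenticated token, never from the
// request body; the hint only partitions that peer's own sessions.
function resolveSession(verifiedPeerId: string, hintedSessionId?: string): string {
  const key = hintedSessionId
    ? `${verifiedPeerId}:${hintedSessionId}`
    : verifiedPeerId;
  let sessionId = sessionsByPeer.get(key);
  if (!sessionId) {
    sessionId = `sess-${key}`;
    sessionsByPeer.set(key, sessionId);
  }
  return sessionId;
}
```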

Advertising turn states the application cannot produce. If your application has no human-in-the-loop approval flow, do not list 'suspended' in the Agent Card's turnStates. Peers prepare for what you advertise, and an unfulfilled advertisement leaves them waiting for behaviour that will not arrive.

Publishing an Agent Card whose endpoints point at internal hostnames. The card's URLs are how peers reach you. They must be your public hostnames, even when the card is generated from internal-deploy configuration that uses internal names.

Exposing tools that handle sensitive data without considering the audit trail. Every MCP call is recorded; the audit pipeline must be able to handle whatever the tool reads or writes. Configure redaction or retention policies before the tool is exposed, not after.

Blocking your orchestrator on a slow peer. If your A2A inbound handler triggers a long synchronous chain that calls another peer, and that peer is slow, the JSON-RPC connection from the original caller will time out before your orchestrator finishes. Use the task model, an explicit timeout with a fallback, or asynchronous reply via callback.
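
The explicit-timeout option can be a small utility; a sketch, with the fallback value standing in for whatever degraded reply your orchestrator can offer:

```typescript
// Race the slow call against a deadline; whichever settles first wins,
// and the timer is cleared either way so the process can exit cleanly.
async function withDeadline<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<T>(resolve => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  try {
    return await Promise.race([work, deadline]);
  } finally {
    clearTimeout(timer);
  }
}
```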

Related