How does an MCP server authenticate an agent?

Each agent gets its own API key, carried in the Authorization header of every request. The agent's MCP client adds the header automatically once you paste the config; the key is never typed into a chat. Because keys are per-agent, revoking one agent does not disturb any other. AppElixir's engine resolves the key to a tenant and an agent identity before any tool code runs.

What is a structured rate-limit error and why does it matter?

A bare HTTP 429 confuses an agent because it sits below the protocol layer the agent reasons about. A structured rate-limit error is an MCP error object with a code, a human-readable message, and machine-readable data such as the limit, the window, and seconds until reset. The agent reads the data, waits, and retries instead of failing or looping.

What belongs in an MCP audit log?

Every tool call records who called (tenant and agent), which tool, the validated arguments, the result status, and the latency. When an agent does something unexpected, the audit log is the only honest record of what actually happened. AppElixir writes one row per call automatically, before and after tool execution, so even failures are captured.

Is schema validation a security feature?

Yes. The JSON Schema the engine compiled from your form is enforced on the server, not just advertised to the agent. Malformed or out-of-range inputs bounce with a clean validation error before they ever reach the spreadsheet, SQL table, or REST endpoint behind the tool. Validation is the first boundary, and it is the cheapest place to reject a bad call.

Why is read-only the default for generated MCP servers?

A read tool that misbehaves returns wrong data; a write tool that misbehaves changes your data. The engine ships every tool read-only by default and treats a write tool as an explicit, gated opt-in per collection. Write capability is a deliberate decision recorded in the contract, not a side effect of how a form was drawn.

The Security Plane: Auth, Rate Limits, and the Audit Log

This is chapter 6 of the AppElixir engine teardown, a walk through how a no-code schema becomes a live MCP server. Start at chapter one.

By chapter 6, the tool is real. The schema compiler turned a form into a typed contract, the collection abstraction bound it to a data source, the runtime serves it over the protocol, and the description makes it callable. None of that is safe yet. A tool that any caller can hit, as often as they like, with any arguments, against your live data, with no record of what happened, is a liability with a nice JSON Schema.

The security plane is the answer to four questions the runtime cannot avoid: who is calling, how often may they call, what may they touch, and what did they actually do. The thesis of this whole teardown is that the plumbing is identical for every server, which is exactly why a no-code layer can produce a correct one. Nowhere is that truer than here. The data differs per server. The auth, limits, audit, and validation do not. So the engine solves them once and stamps the same plane onto every artifact it compiles.

Who is calling: per-agent keys and revocation

Authentication on the generated server is deliberately boring, because boring is what survives. Each agent that consumes a tool gets its own API key. The key rides in the Authorization header on every request. The agent's MCP client adds it automatically once you paste the config snippet from the engine. It is never typed into a chat, never pasted into a prompt, never visible to the model as text. The header is transport, not conversation.

Per-agent is the load-bearing word. One server can be consumed by a Claude Desktop instance, a Cursor workspace, an n8n flow, and a staging bot at the same time. Each holds a distinct key. When the staging bot is decommissioned, you revoke one key and the other three never notice. Revocation is a state change on a single credential, not a rotation that forces every consumer to re-paste config. The engine resolves the key to a tenant identity and an agent identity before a single line of tool code runs, so every later stage in the plane knows exactly who it is dealing with.

By hand

You reach for a middleware, decide between bearer tokens and signed headers, write a key store, write a revocation table, write the lookup on the hot path, and then write it again the same way for the next server because the first one is welded to its handlers.

With the engine

You click "issue key" per agent and "revoke" per agent. The header handling, the constant-time comparison, the tenant resolution, and the revocation check are compiled into the artifact. Every server gets the same code path, so a fix to the auth layer is a fix everywhere at once.
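The lookup described above can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's code: the `KeyStore` class, the `Identity` type, the `Bearer` scheme, and the choice to store SHA-256 digests are all hypothetical. The source only says the key rides in the Authorization header, is compared in constant time, resolves to a tenant and an agent, and can be revoked per agent.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Identity:
    tenant: str
    agent: str


def _digest(api_key: str) -> str:
    # Assumption: only a hash of the key is stored, never the key itself.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


class KeyStore:
    """Hypothetical in-memory store: key digest -> (identity, revoked flag)."""

    def __init__(self) -> None:
        self._records: Dict[str, Tuple[Identity, bool]] = {}

    def issue(self, api_key: str, tenant: str, agent: str) -> None:
        self._records[_digest(api_key)] = (Identity(tenant, agent), False)

    def revoke(self, tenant: str, agent: str) -> None:
        # Revocation flips one record; every other agent's key is untouched.
        target = Identity(tenant, agent)
        for digest, (identity, _) in list(self._records.items()):
            if identity == target:
                self._records[digest] = (identity, True)

    def resolve(self, authorization: Optional[str]) -> Optional[Identity]:
        # Assumption: the header uses the "Bearer <key>" scheme.
        if not authorization or not authorization.startswith("Bearer "):
            return None
        presented = _digest(authorization[len("Bearer "):])
        match = None
        for stored, (identity, revoked) in self._records.items():
            # compare_digest does not short-circuit on the first differing byte.
            if hmac.compare_digest(stored, presented) and not revoked:
                match = identity
        return match
```

A request whose header fails to resolve would be rejected before validation, rate counting, or tool code sees it, and a resolved `Identity` is what the later stages would carry along.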

Per-tenant isolation: one builder cannot see another's anything

A multi-tenant engine has a non-negotiable rule: one builder's servers cannot read another builder's data, and one builder's traffic cannot consume another builder's limits. The key that authenticates a request also scopes it. The tenant identity resolved at the door is attached to the request and travels with it through validation, data access, rate counting, and the audit write. A tool can only ever see the collection bound to its own server, under its own tenant. There is no global handle to reach across.

Isolation also means limits are per-tenant. A noisy agent on one account cannot exhaust a counter that throttles a calm agent on another account. The rate-limit state is keyed by tenant, then by tool, so blast radius stops at the account boundary. This is the same separation principle that database row-level security enforces inside a table; we cover the data-layer version in "Supabase row-level security for vibe coders". The engine applies the request-layer version above it, so the two reinforce each other rather than relying on either alone.
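To make "keyed by tenant, then by tool" concrete, here is a minimal sketch of a counter with that keying. The fixed-window algorithm, the `FixedWindowLimiter` name, and the injectable clock are assumptions for illustration; the source does not say which counting algorithm the engine uses.

```python
import math
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Hypothetical per-tenant, per-tool counter over fixed time windows."""

    def __init__(self, limit: int = 60, window_seconds: int = 60,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        # (tenant, tool) -> (window index, calls counted in that window)
        self._counters: Dict[Tuple[str, str], Tuple[int, int]] = {}

    def check(self, tenant: str, tool: str) -> Tuple[bool, int, int]:
        """Return (allowed, remaining, retry_after_seconds)."""
        now = self._clock()
        window = int(now // self.window_seconds)
        seen_window, count = self._counters.get((tenant, tool), (window, 0))
        if seen_window != window:
            count = 0  # a new window started, so the old count no longer applies
        if count >= self.limit:
            window_end = (window + 1) * self.window_seconds
            return False, 0, math.ceil(window_end - now)
        self._counters[(tenant, tool)] = (window, count + 1)
        return True, self.limit - (count + 1), 0
```

Because the dictionary key includes the tenant, one account exhausting its window has no effect on the counter another account reads, which is the blast-radius property the paragraph above describes.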

How often: structured rate-limit errors, not a bare 429

Rate limits are per-tenant, per-tool, per-minute. The defaults match what most agents need, and they exist to protect the data source as much as the platform. The interesting design decision is not the counter. It is the shape of the error when the counter trips.

A bare HTTP 429 is the wrong tool for the job. It lives below the protocol layer the agent reasons about, so the agent often sees a transport failure, not a tool-level signal. Some clients retry instantly and make it worse. Some give up and hallucinate an answer. The fix is to speak the agent's language. The Model Context Protocol, which is built on JSON-RPC 2.0, defines a structured error shape with a numeric code, a human-readable message, and an optional data object for machine-readable detail. AppElixir reports a tripped limit as exactly that: a protocol error the agent can parse and act on.

{
  "jsonrpc": "2.0",
  "id": 42,
  "error": {
    "code": -32004,
    "message": "Rate limit exceeded for tool 'lookup_customer'.",
    "data": {
      "type": "rate_limit",
      "tool": "lookup_customer",
      "limit": 60,
      "window": "1m",
      "remaining": 0,
      "retry_after_seconds": 17,
      "tenant_scope": "per_tenant_per_tool"
    }
  }
}

An agent that receives this does not loop and does not guess. It reads retry_after_seconds, waits, and retries. The data object is the difference between a tool that degrades gracefully under load and one that turns a transient limit into a wrong answer. The structured-error shape itself comes straight from the protocol; the engine just fills it in honestly. (For the full anatomy of structured results and errors, see chapter 4 on the generated runtime.)
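Both sides of that exchange are small. The sketch below builds an error shaped like the example above and shows how a client might extract the wait time. The function names are hypothetical, and the code value is simply the one from the example, not a value the protocol assigns to rate limiting.

```python
from typing import Any, Dict, Optional

RATE_LIMIT_CODE = -32004  # taken from the example above; an implementation-chosen code


def rate_limit_error(request_id: int, tool: str, limit: int,
                     retry_after_seconds: int) -> Dict[str, Any]:
    """Server side: build the structured error for a tripped limit."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": RATE_LIMIT_CODE,
            "message": f"Rate limit exceeded for tool '{tool}'.",
            "data": {
                "type": "rate_limit",
                "tool": tool,
                "limit": limit,
                "window": "1m",
                "remaining": 0,
                "retry_after_seconds": retry_after_seconds,
                "tenant_scope": "per_tenant_per_tool",
            },
        },
    }


def retry_delay(response: Dict[str, Any]) -> Optional[int]:
    """Client side: seconds to wait if this is a rate-limit error, else None."""
    error = response.get("error")
    if not isinstance(error, dict):
        return None
    data = error.get("data")
    if not isinstance(data, dict) or data.get("type") != "rate_limit":
        return None
    delay = data.get("retry_after_seconds")
    if isinstance(delay, int) and not isinstance(delay, bool) and delay >= 0:
        return delay
    return None
```

A client that gets a number back from `retry_delay` can sleep for that long and retry once; a client that gets `None` knows the failure is something else and should not treat it as a transient limit.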

What did they do: the audit log

The audit log is the most-overlooked production requirement of an MCP server and the one that pays for itself the first time something goes sideways. Agents are non-deterministic. They will call a tool you did not expect, with arguments you did not anticipate, in an order that makes no sense, and a customer will eventually ask "the agent did something weird, why?" Without a log, the only honest answer is a shrug.

So the engine writes one row per call, automatically, around tool execution. It records who (tenant and agent), which tool, what arguments (the validated ones), what result (status, not necessarily the full payload), and how long it took. The row is written even when the call fails validation or trips a limit, because the failures are often the interesting part.

{
  "ts": "2026-06-04T09:14:22.118Z",
  "tenant": "tnt_amber_4f2",
  "agent": "agt_cursor_ws_19",
  "tool": "lookup_customer",
  "arguments": { "email": "dana@example.com" },
  "result": "ok",
  "rows_returned": 1,
  "latency_ms": 38,
  "rate_limited": false,
  "validation": "passed"
}

That row answers the "why" question without speculation. It also feeds the rate counter, surfaces the latency you actually serve, and gives you a clean trail when a write tool changes something it should not have. An audit log you trust is also the precondition for enabling write tools at all, a point the security checklists in the related reading make repeatedly.
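One way to guarantee a row per call, failures included, is to wrap the tool handler so the write happens in a `finally` block. The sketch below assumes a hypothetical `audited_call` helper, a list standing in for the log sink, and a handler that returns a list of rows; none of these names come from the engine. It also writes a single row at the end of the call, which is a simplification of the engine's writes around execution.

```python
import json
import time
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


def audited_call(sink: List[str], tenant: str, agent: str, tool: str,
                 arguments: Dict[str, Any],
                 handler: Callable[[Dict[str, Any]], List[Any]]) -> List[Any]:
    """Run handler(arguments) and append exactly one JSON row to sink."""
    row: Dict[str, Any] = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "tenant": tenant,
        "agent": agent,
        "tool": tool,
        "arguments": arguments,
        "result": "error",  # overwritten below if the handler succeeds
    }
    started = time.perf_counter()
    try:
        rows = handler(arguments)
        row["result"] = "ok"
        row["rows_returned"] = len(rows)
        return rows
    finally:
        # Runs on success and on exception, so failures are recorded too.
        row["latency_ms"] = round((time.perf_counter() - started) * 1000)
        sink.append(json.dumps(row))
```

The row stores a status and a row count rather than the full payload, matching the "status, not necessarily the full payload" rule above. A real sink would be durable storage rather than a list.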

What may they touch: validation as a boundary, write as an opt-in

Two more boundaries sit between the agent and your data, and both are about saying no early.

The first is schema validation. The JSON Schema the compiler produced from your form is not just advertised to the agent in tools/list. It is enforced on the server before the tool body runs. A missing required field, a string where an integer belongs, a number outside the declared range, a value not in the enum: all of these bounce with a clean validation error and never reach the spreadsheet, SQL table, or REST endpoint behind the tool. Validation is the cheapest place to reject a bad call, and rejecting it here means the data source only ever sees inputs that already conform to the contract. The engine leans on a mature JSON Schema validator, Ajv, so enforcement follows the same draft semantics as the schema the server advertises. The protocol's own error vocabulary, defined in the Model Context Protocol spec, carries the rejection back to the agent.
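The four rejection cases named above can be shown with a deliberately tiny checker. This is not Ajv and not a real JSON Schema implementation: `validate` and `call_tool` are hypothetical helpers that handle only `required`, `type`, `enum`, `minimum`, `maximum`, and `additionalProperties: false` on a flat object, just enough to show validation running before the tool body.

```python
from typing import Any, Callable, Dict, List

# JSON Schema type name -> Python type(s). Only this small subset is handled.
_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate(schema: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return a list of violations; an empty list means the call conforms."""
    errors: List[str] = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in arguments:
            errors.append(f"{name}: required field is missing")
    for name, value in arguments.items():
        rules = properties.get(name)
        if rules is None:
            if schema.get("additionalProperties") is False:
                errors.append(f"{name}: unexpected field")
            continue
        expected = _TYPES.get(rules.get("type"))
        # bool is a subclass of int in Python, so reject it for numeric types.
        wrong_bool = isinstance(value, bool) and expected is not bool
        if expected is not None and (wrong_bool or not isinstance(value, expected)):
            errors.append(f"{name}: expected {rules['type']}")
            continue
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: value not in enum")
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            if "minimum" in rules and value < rules["minimum"]:
                errors.append(f"{name}: below minimum {rules['minimum']}")
            if "maximum" in rules and value > rules["maximum"]:
                errors.append(f"{name}: above maximum {rules['maximum']}")
    return errors


def call_tool(schema: Dict[str, Any], arguments: Dict[str, Any],
              tool_body: Callable[[Dict[str, Any]], Any]) -> Dict[str, Any]:
    errors = validate(schema, arguments)
    if errors:
        return {"ok": False, "errors": errors}  # tool_body is never invoked
    return {"ok": True, "result": tool_body(arguments)}
```

The point of `call_tool` is the ordering: the data source behind `tool_body` only ever receives arguments that have already passed the same schema the agent was shown.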

The second boundary is capability. Every tool the engine emits is read-only by default. A read tool that misbehaves returns wrong data; a write tool that misbehaves changes your data, and the second failure mode is the one that costs you a customer. So a write tool is an explicit, gated opt-in, decided per collection and recorded in the contract. You do not get a write tool because a form happened to look like an editor. You get it because you turned it on, on purpose, knowing the audit log is already capturing every mutation.
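Read-only by default is easiest to see as a field whose default is "no". The sketch below is an assumption about shape, not the engine's contract format: `CollectionContract`, `allow_write`, and `authorize` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionContract:
    name: str
    allow_write: bool = False  # read-only unless someone sets this on purpose


class OperationNotAllowed(Exception):
    pass


def authorize(contract: CollectionContract, operation: str) -> None:
    """Raise unless the contract permits the operation."""
    if operation == "read":
        return
    if operation == "write" and contract.allow_write:
        return
    # Writes without the opt-in, and any unknown operation, are refused.
    raise OperationNotAllowed(
        f"'{operation}' is not enabled for collection '{contract.name}'"
    )
```

Making the contract immutable and the default `False` means a write path exists only where a contract was constructed with `allow_write=True`, which is the kind of recorded, deliberate decision the paragraph above calls for.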

By hand

You validate at the edges if you remember to, you find the one route that let a malformed argument through only after it has reached the database, and "read-only" is a code-review convention that holds until someone wires up a quick UPDATE under deadline.

With the engine

Validation is compiled from the same schema the agent reads, so the advertised contract and the enforced contract cannot drift. Read-only is the default state of the artifact, and write is a flag with a paper trail. The boundary is structural, not a habit you have to maintain.

One plane, every server

The point of the security plane is that none of it is per-server work. Auth, isolation, limits, audit, validation, and read-only-by-default are the same on a customer-lookup tool over a Google Sheet and a usage-reader over Postgres. The data and the tool design are what differ, and they are what you should spend attention on. The plumbing is identical, which is the entire reason a no-code engine can emit a server that is correct on day one and still correct at scale, instead of a demo that quietly skips the parts nobody enjoys building. That is the plane. Chapter 7 takes the same compiled artifact, security and all, and ships it two ways: hosted, or as a Docker image you run yourself.