Architecture · May 2026 · 7 min read

A Cortex Can't Coordinate

An LLM is the creative cortex. Cortices don't coordinate. When three agents touch the same file, the cortex isn't what stops them — the substrate underneath has to. Here's what that substrate looks like.

The collision

Three agents are working on the same repo. One is refactoring the auth middleware. One is fixing a bug in the same middleware. One is regenerating types from a schema the middleware imports.

An hour later, two of them push. The third's branch is now full of merge conflicts it doesn't understand and can't safely resolve. Someone — a human, probably you — reads three transcripts, untangles what each agent intended, and decides which work survives.

The cortex didn't do this. The cortex was working. Three separate cortices, each doing their job. The failure happened in the space between them — the part of cognition that decides who's allowed to touch what, when, and with what context from whoever touched it last.

That space is what most agent infrastructure leaves empty.

What an LLM does well, and what it doesn't

A language model generates text. Given context, it produces plausible next steps. Given tools, it calls them. Given a task description, it writes code that's often correct.

What it does not do is know what the agent two desks over is doing. It does not remember what the previous agent learned about this file. It does not negotiate. It does not yield. It cannot, because it is a cortex — a function for converting context into output. Each agent's context window is closed to the others: there is no shared state to read, no protocol for yielding, no way for one cortex to even know another cortex exists. Asking the LLM to coordinate is asking it to act on information it structurally cannot have. Coordination is not a context-to-output problem. Coordination is a substrate problem.

You can give every agent the smartest possible LLM and they will still overwrite each other's work. Intelligence at the endpoint doesn't compose into coordination across endpoints. That's a structural property of the architecture, not a quality property of the model.

The four things the substrate has to do

Strip away the framing and there are four jobs the layer underneath the agents has to do for fleets to work. We shipped each as a piece of MemBrain Team Mode over the last two weeks.

1. Surface the context the team already paid for — without silently mutating the prompt

When a new chat starts, the person at the keyboard doesn't know what the team knows. The previous person who touched the auth middleware figured out that the JWT secret rotates every 24h and that the test fixtures lie about the issuer. That knowledge exists — in a Slack thread, a half-finished PR description, an aborted transcript. It is not in the new conversation.

The substrate retrieves it. On a new chat-completions request, MemBrain runs a semantic search over the team's shared knowledge entries authored by other actors, and rides the matches back on the response as an X-Membrain-Actor-Context header. The dashboard renders them as a "Related context from your team" card. The user reads them and decides whether to incorporate. We deliberately do not inject these into the LLM prompt — silent prompt mutation is the kind of magic that ages badly. The cortex still does the reasoning. The substrate makes sure the team's prior work is one click away when it matters.
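A client-side sketch of that flow, under loud assumptions: the header name comes from this post, but its encoding is not specified here, so the JSON array of `author`/`summary` objects below is hypothetical, as are `ContextMatch` and `read_team_context`. What it tries to show is the design choice: the matches are parsed for display and never concatenated into the prompt.

```python
import json
from dataclasses import dataclass

HEADER = "X-Membrain-Actor-Context"  # header name as given in the post


@dataclass(frozen=True)
class ContextMatch:
    author: str   # hypothetical field: who wrote the knowledge entry
    summary: str  # hypothetical field: the text shown on the card


def read_team_context(headers: dict[str, str]) -> list[ContextMatch]:
    """Return team context carried on a response, or [] if there is none."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(HEADER.lower())
    if not raw:
        return []
    # ASSUMPTION: the value is a JSON array of objects. The real wire
    # format may differ; treat anything unparseable as "no context".
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    return [
        ContextMatch(item["author"], item["summary"])
        for item in items
        if isinstance(item, dict) and "author" in item and "summary" in item
    ]
```

A dashboard would render the returned list as the "Related context from your team" card. The request body sent to the model stays exactly as the user wrote it.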

2. Queue work that needs a human, instead of stopping

Real agent workflows hit decision points. A migration looks risky. A test failure looks ambiguous. A continuation is available but the agent isn't sure it should take it. The wrong move is to halt and wait for the operator to notice. The right move is to file the question and pick up something else.

The queue does this. Agents enqueue items typed as needs_human, continuation_available, or dedupe_candidate, with the context an operator would need to decide. The operator sees a flat list across the fleet. Agents claim work atomically through a small SDK; the substrate is designed so that two agents never end up holding the same item. A single agent's uncertainty doesn't stall the fleet.
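To make "claim atomically" concrete, here is a minimal in-memory sketch. It is not the MemBrain SDK: `WorkQueue`, `QueueItem`, and their methods are hypothetical names, and a single process-local lock stands in for whatever the real substrate uses. Only the three item types come from this post.

```python
import threading
from dataclasses import dataclass
from typing import Optional

# Item types named in the post.
ITEM_TYPES = {"needs_human", "continuation_available", "dedupe_candidate"}


@dataclass
class QueueItem:
    item_id: int
    item_type: str
    context: str                      # what an operator needs in order to decide
    claimed_by: Optional[str] = None  # None means the item is still open


class WorkQueue:
    """Hypothetical in-memory queue; a real substrate would persist this."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._items: list[QueueItem] = []

    def enqueue(self, item_type: str, context: str) -> QueueItem:
        if item_type not in ITEM_TYPES:
            raise ValueError(f"unknown item type: {item_type}")
        with self._lock:
            item = QueueItem(len(self._items) + 1, item_type, context)
            self._items.append(item)
            return item

    def claim(self, actor: str, item_type: Optional[str] = None) -> Optional[QueueItem]:
        """Hand the oldest matching open item to `actor`, or return None."""
        # The check and the assignment happen under one lock, so two callers
        # cannot both observe the same item as unclaimed.
        with self._lock:
            for item in self._items:
                if item.claimed_by is None and item_type in (None, item.item_type):
                    item.claimed_by = actor
                    return item
        return None

    def open_items(self) -> list[QueueItem]:
        """The flat list an operator would see across the fleet."""
        with self._lock:
            return [item for item in self._items if item.claimed_by is None]
```

An agent that hits a risky migration would call `enqueue("needs_human", ...)` and move on to its next task instead of blocking on the answer.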

3. Pin the thread that the next person needs to pick up

Not all conversations are equal. Some are throwaway debugging. Some contain a load-bearing decision that downstream work depends on. The substrate has to make those threads easy to find later — without making everyone re-read the whole channel.

A pin is an explicit claim: "this conversation matters, come back to it." Pinning a thread enqueues it as a continuation_available item with a one-week default expiry, scoped to the right visibility tier (private to you, your fleet, or your team). Anyone on the right tier can find it in the queue, claim it, and continue the work. A pinned "we decided not to migrate to Postgres 17 because of the pgvector regression" thread sits in the team queue for a week, claimable by whoever picks up the follow-up — instead of disappearing into chat history the moment the conversation ends.
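The pin described above reduces to a small record plus a visibility check. The sketch below assumes field names and a `visible_to` rule that are not MemBrain's; the continuation_available type, the one-week default, and the three tiers are the only parts taken from this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

TIERS = ("private", "fleet", "team")  # visibility tiers named in the post
DEFAULT_EXPIRY = timedelta(weeks=1)   # one-week default from the post


@dataclass(frozen=True)
class Pin:
    thread_id: str
    note: str
    tier: str
    owner: str   # hypothetical: the actor who pinned the thread
    fleet: str   # hypothetical: the owner's fleet
    team: str    # hypothetical: the owner's team
    expires_at: datetime
    item_type: str = "continuation_available"


def pin_thread(thread_id: str, note: str, tier: str, owner: str, fleet: str,
               team: str, now: Optional[datetime] = None,
               ttl: timedelta = DEFAULT_EXPIRY) -> Pin:
    """Build the queue item that a pin would enqueue."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier}")
    now = now or datetime.now(timezone.utc)
    return Pin(thread_id, note, tier, owner, fleet, team, now + ttl)


def visible_to(pin: Pin, actor: str, fleet: str, team: str, now: datetime) -> bool:
    """Whether `actor` can find and claim the pin at time `now`."""
    if now >= pin.expires_at:
        return False  # expired pins drop out of the queue
    if pin.tier == "private":
        return actor == pin.owner
    if pin.tier == "fleet":
        return fleet == pin.fleet
    return team == pin.team
```

Passing `now` explicitly keeps the expiry logic testable. The same `visible_to` check could gate both listing and claiming, so a pin never leaks past its tier.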

4. Let agents check out work, and hand off cleanly when they're done

The collision at the top of this post is the one Team Mode's task coordination prevents. An agent declares which paths it's about to touch. The substrate checks for in-flight conflicts — another agent already holds an overlapping declaration — and refuses the checkout if it would collide. When the agent checks back in, it submits a compacted summary of what it learned. The substrate persists that summary as a high-trust knowledge entry, so the next agent that touches the same area starts from the previous one's conclusions instead of re-deriving them.
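A sketch of that checkout and check-in cycle, with the overlap rule spelled out. Everything here is illustrative: `Coordinator`, `CheckoutConflict`, and the rule that two declarations collide when one path equals or contains the other are assumptions, not the shipped behaviour.

```python
import threading
from pathlib import PurePosixPath


class CheckoutConflict(Exception):
    """Raised when a declaration overlaps one that is already in flight."""


def overlaps(a: str, b: str) -> bool:
    # ASSUMPTION: paths collide when they are equal or one contains the other.
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


class Coordinator:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._held: dict[str, set[str]] = {}  # agent -> declared paths
        self.knowledge: list[dict] = []       # persisted check-in summaries

    def checkout(self, agent: str, paths: list[str]) -> None:
        """Record the declaration, or refuse it if it would collide."""
        with self._lock:
            for other, held in self._held.items():
                if other == agent:
                    continue
                clash = sorted({p for p in paths for h in held if overlaps(p, h)})
                if clash:
                    raise CheckoutConflict(f"{other} holds paths overlapping {clash}")
            self._held.setdefault(agent, set()).update(paths)

    def checkin(self, agent: str, summary: str) -> dict:
        """Release the agent's paths and store what it learned."""
        with self._lock:
            paths = self._held.pop(agent, set())
            entry = {"author": agent, "paths": sorted(paths),
                     "summary": summary, "trust": "high"}
            self.knowledge.append(entry)
            return entry

    def prior_knowledge(self, path: str) -> list[dict]:
        """Summaries left by earlier agents that touched this area."""
        with self._lock:
            return [e for e in self.knowledge
                    if any(overlaps(path, p) for p in e["paths"])]
```

In the opening scenario, the refactoring agent's checkout of the auth middleware would cause the bug-fix agent's overlapping declaration to be refused up front, rather than surfacing as a merge conflict an hour later.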

This is the part that turns a fleet from a swarm of independent agents into a team that remembers what the team did.

What composes

None of these four surfaces is novel on its own. Queues exist. Locks exist. Knowledge stores exist. What agent infrastructure lacks today is having them share state, scoped to the right visibility tier, and apply automatically to every agent on the substrate rather than being opted into agent by agent.

Why the substrate, not the agent

You could try to solve this at the agent level. Give every agent a tool that locks files. Give every agent a tool that posts to a queue. Give every agent a system prompt that says "before you start, check if anyone else is working here."

This breaks for the same reason application-level governance breaks: every new agent in the fleet has to be told. Every fork has to be retold. Every shadow agent — the one a developer spun up locally last week and forgot to register — never gets told. The coordination only holds for agents that opted in, which means it doesn't actually hold.

At the substrate, it just is. The agent's HTTP request goes through MemBrain. The checkout check fires before the agent gets to run. The team's prior context surfaces on the response whether the agent's prompt asked for it or not. The queue exists whether the agent was instructed to use it or not. The substrate enforces; the cortex reasons. That separation is the whole point.

Three layers, same substrate

An individual developer running a single Claude Code session doesn't need a queue. A team of six engineers running an agent fleet against a shared repo absolutely does. An org running a hundred agents across fifteen teams needs that coordination plus a way to audit which agent touched which file under whose authority. The substrate handles all three.

What we shipped

The four surfaces above landed in MemBrain over the last two weeks, each as a discrete piece: an actor substrate that types every API key as human or agent and scopes work to a visibility tier; a queue and agent SDK; pinned threads; task coordination with handoff. They share the team visibility tier and the same audit log, so an operator sees one coherent record of what the fleet did, not four parallel ones.

If you're running more than one agent against the same codebase, the substrate is the part that's missing. The cortex is fine. The cortex was always fine. The cortex just can't do this on its own.


Stop agents from overwriting each other

Team Mode ships in MemBrain today. JIT team context, an agent queue, pinned threads, and task coordination with clean handoff — on the same substrate, scoped to the right visibility tier.