
We Need to Reimagine the Two-Pizza Rule for the AI Era

Headcount was never the real constraint. Agent teams expose that.

6 min read

The two-pizza rule was never really about pizza. It was about communication overhead. Jeff Bezos looked at large teams and saw most of the cost sitting in coordination, alignment, and the slow diffusion of decisions through too many people. Keep the team small enough to feed with two pizzas, and you keep that overhead manageable.

That logic held for twenty years. It does not hold anymore.

When agents are doing real work on your team, the communication model changes completely. Agents do not have hallway conversations. They do not build shared context over time. They do not pick up on tone or infer priorities from body language. Every interaction with an agent is a cold start, and the quality of what comes out is entirely determined by the quality of what goes in.

That shifts the constraint. It is not headcount. It is the clarity of what you ask each agent to do.

What the Two-Pizza Rule Was Actually Solving

Dan Shipper put it well in Every.to recently: Amazon's small team structure was a solution to communication debt. The more people in a team, the more relational links, the more meetings, the more diluted the ownership. Small teams moved faster because they spent less time on coordination.

Agents do not add to that coordination overhead in the same way. A team of two people with six agents is not an eight-person team for communication purposes. The agents do not send Slack messages. They do not need to be aligned on quarterly goals. They execute the task in front of them.

So if you apply the two-pizza rule to agent-augmented teams, you end up optimising for the wrong thing entirely.

The Real New Constraint: Decision Surface

Here is what I have learned running agent systems in production.

With human teams, a capable generalist is a gift. Someone who is strong at communication, good at research, and reliable on detail work: you lean on them for everything, and it works. You take their adaptability for granted.

With agents, you cannot do that. Agents are technically generalists but they are not reliably generalists. A capable agent given a vague brief will produce a plausible-looking result that may or may not be what you needed. The generalist capability is there, but you are the one who has to activate it correctly for each specific task.

I saw this clearly on a consulting engagement. I helped a team automate workflows by adapting templates from similar companies. The templates looked reasonable on paper. In practice, they delivered mediocre results and I spent most of the time editing outputs rather than using them. Eventually I realised the problem: the templates were not built for this team's actual priorities, this team's specific skill gaps, this team's judgement about what mattered. Starting from scratch, built around their context, produced significantly better results for less effort.

The same principle applies to agents on a product team. The question is not how many agents you have. It is how precisely you have defined what each one is responsible for and what good output looks like.

Optimise for Decision Surface, Not Headcount

The new heuristic I use: optimise for decision surface.

For each agent, there should be one clear decision or output it owns. Not a role. Not a broad function. A specific decision or deliverable, with explicit criteria for what success looks like. The tighter that surface, the more reliable the agent.

This is a bigger mental shift than it sounds. With humans, you hire people with broad capabilities and trust them to apply judgment to novel situations. You invest in their context over time and that investment compounds. With agents, you do not get that compounding. You get the capability fresh each time, and your job is to make the task surface clear enough that the capability gets applied correctly.

That does not mean you need a separate agent for every tiny task. It means every agent in your system should have a brief you could hand to a new contractor and have them immediately understand what good looks like. If you cannot write that brief, the agent is going to underperform regardless of the model.
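The shape of such a brief can be sketched in code. This is a minimal illustration of the idea, not any particular framework's API; the `AgentBrief` structure and `is_handoff_ready` check are hypothetical names chosen for this example:

```python
from dataclasses import dataclass

@dataclass
class AgentBrief:
    """One agent, one decision surface: a single owned output with explicit criteria."""
    owned_output: str           # the one decision or deliverable this agent owns
    success_criteria: list[str] # what good output looks like
    out_of_scope: list[str]     # what the agent should not consider
    failure_signals: list[str]  # signs the result is wrong, so a human knows to intervene

def is_handoff_ready(brief: AgentBrief) -> bool:
    """A brief passes the contractor test only if it names a concrete output
    and spells out both success criteria and failure signals explicitly."""
    return bool(
        brief.owned_output.strip()
        and brief.success_criteria
        and brief.failure_signals
    )

# A tight decision surface passes the check:
tight = AgentBrief(
    owned_output="Draft the weekly release notes from merged pull requests",
    success_criteria=["Covers every merged PR", "Under 300 words", "No roadmap speculation"],
    out_of_scope=["Unmerged branches", "Marketing copy"],
    failure_signals=["Mentions work that did not ship", "Exceeds the word limit"],
)

# A role-shaped brief with no criteria fails it:
vague = AgentBrief(owned_output="Help with engineering", success_criteria=[],
                   out_of_scope=[], failure_signals=[])

assert is_handoff_ready(tight)
assert not is_handoff_ready(vague)
```

The check itself is trivial; the value is that filling in the fields forces the upfront thinking the article describes. If you cannot populate `success_criteria` and `failure_signals`, you have a role, not a decision surface.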

No Template Survives Contact With Your Team

The other thing the two-pizza rule gave people was a template. A tidy number. Fewer than ten people, small enough for two pizzas. Simple to apply.

With agent teams, there is no equivalent template that transfers. I have seen people copy agent architectures from other companies wholesale and wonder why the results are inconsistent. The architecture looks right. The agents look capable. But the decision surfaces are mapped to someone else's priorities, someone else's quality standards, someone else's judgment about what matters.

Right now there is no established rule for how to structure an agent team. Anyone who tells you there is a formula is selling you something. What exists is a set of principles: keep decision surfaces tight, define output criteria explicitly, do not template from someone else's context, and build incrementally so you learn what actually needs an agent versus what is simpler without one.

At a 5-person AI-first company running production agent systems, the question I ask before adding any agent is not whether an agent could do this. It is whether I can write a brief clear enough that the agent will do this reliably. If the answer is no, the problem is not the agent. It is that we have not done the thinking yet.

What This Means for How You Structure Teams

The two-pizza rule forced a useful discipline: keep teams small so you are forced to prioritise. The equivalent discipline in an agent-augmented team is different. It is: keep decision surfaces tight so you are forced to be specific about what matters.

That is harder than it sounds. It requires you to do the thinking upfront that human teams often do implicitly. It requires you to write down what good output looks like, what the agent should and should not consider, what a failed result looks like so you know when to intervene.

It is more work at the start. It produces dramatically better results throughout.

The teams that will figure this out are not the ones that add the most agents. They are the ones that are most disciplined about defining what each agent is actually supposed to decide.

Jai kora

9 posts