
Why We Had to Build a Constitution for AI Coders


Prompting Was Never Going to Be Enough

The early phase of AI coding feels intoxicating for a reason.

You can ask for:

  • a feature
  • a refactor
  • a test suite
  • a bug fix
  • a migration
  • a review

and the system produces something plausible very quickly.

That creates a seductive belief:

If the model gets a bit better and the prompt gets a bit sharper, the rest of the problem will mostly solve itself.

I do not think that is true.

Once you move beyond isolated experiments and start trying to build real products with multiple agentic workstreams, prompting alone stops being an adequate operating model.

At that point you do not merely need helpful outputs. You need a governable system.

That is why I increasingly think serious AI coding requires something closer to a constitution than a prompt library.

Why Prompting Stops Being Enough

Prompting is a good interface for asking for work. It is not a sufficient operating model for coordinating scope, verification, isolation, and integration once multiple agentic workstreams are producing meaningful change in parallel.

What I Mean by a Constitution

I do not mean legal theatre or heavyweight enterprise ceremony.

I mean a clear, explicit set of rules that defines:

  • what the system is allowed to do
  • how work must be decomposed
  • how quality is verified
  • what counts as completion
  • how parallel work is isolated
  • who owns integration
  • and which principles override convenience when speed increases

In other words, the system needs governing law, not just stylistic advice.

Why? Because agents do not only need help producing code. They need constraints that keep the whole development system coherent.

The Failure Mode of Loose Prompting

Loose prompting works surprisingly well for:

  • small edits
  • contained tasks
  • local experiments
  • isolated generation

It works much less well as the operating model for a fast-moving product.

Without stronger governance, the common failure modes appear quickly:

  • scope expands invisibly
  • two workstreams edit the same conceptual boundary differently
  • tests become decorative
  • claims of completion outrun verification
  • docs drift behind the new truth
  • agents change files they were never meant to touch
  • integration becomes the place where truth is discovered too late

None of those failures are especially mysterious. They happen because the system has throughput without constitutional order.

Why Speed Makes Governance More Valuable

The faster the system moves, the more expensive ambiguity becomes.

In a slower human-only workflow, some of the ambiguity can be absorbed informally. People talk. Reviewers catch context. A single thread of execution limits how much can diverge at once.

In agent-native delivery, you may have multiple parallel workstreams, rapid iteration, and many more opportunities for apparently reasonable changes to conflict or drift.

That is why governance is not anti-speed. It is what makes speed usable.

A constitution creates the shared rules that let high-output systems remain intelligible.

What a Real AI Coding Constitution Needs

I think there are a few essential parts.

1. Require evidence before claims

The system should never treat plausible output as completion. Fresh verification must exist before it claims that a test passes, a bug is fixed, a route works, or a task is done.

2. Enforce root cause before fixes

Fast systems can generate cosmetic patches instantly. Without cause-finding first, they accumulate drift and recurring defects at exactly the moment they appear most productive.

3. Control scope explicitly

Each workstream needs clear boundaries. If scope is loose, throughput turns into merge friction, review difficulty, and architectural incoherence.

4. Isolate parallel work deliberately

Parallelism only compounds when tracks can proceed without trampling one another. That usually means explicit file scope, environment isolation, and a distinct owner for integration.
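
As a minimal sketch, assume each track declares its file scope up front (the track names and file sets here are invented for illustration). Overlapping scopes can then be surfaced before work starts rather than discovered at merge:

```python
from itertools import combinations

def scope_conflicts(track_scopes):
    """Return pairs of tracks whose declared file scopes intersect."""
    conflicts = []
    for (a, files_a), (b, files_b) in combinations(track_scopes.items(), 2):
        shared = set(files_a) & set(files_b)
        if shared:
            conflicts.append((a, b, sorted(shared)))
    return conflicts
```

An empty result is the precondition for letting tracks run in parallel; a non-empty one is an integration decision for a human, not a merge surprise.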

5. Pair source changes with verification

If code changes are not accompanied by the right kind of proof, the system gains speed while losing truth. That trade becomes expensive very quickly.
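
The pairing can be checked mechanically. In this sketch the `src/` and `tests/` layout is an assumed convention, not something prescribed by the text:

```python
def unverified_change(changed_files):
    """True if source files changed without any accompanying test change."""
    touches_source = any(path.startswith("src/") for path in changed_files)
    touches_tests = any(path.startswith("tests/") for path in changed_files)
    return touches_source and not touches_tests
```

A changeset that trips this check is not necessarily wrong, but it should need an explicit human waiver rather than passing silently.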

6. Define completion operationally

A strong constitution makes “done” concrete rather than emotional. That clarity is one of the biggest advantages a governed agentic system can have.
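
One way to make "done" operational is a declared checklist, where completion is the conjunction of checkable conditions. The criterion names below are illustrative:

```python
# Illustrative criteria; a real project would declare its own.
COMPLETION_CRITERIA = ["tests_pass", "scope_respected", "docs_updated", "reviewed"]

def is_done(status):
    """Done means every declared criterion is explicitly satisfied."""
    return all(status.get(criterion, False) for criterion in COMPLETION_CRITERIA)

def missing(status):
    """List what still blocks completion."""
    return [c for c in COMPLETION_CRITERIA if not status.get(c, False)]
```

The value is less in the code than in the shape: "done" becomes a question with a computable answer, and `missing` is always a concrete work list.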

Why Hooks, Rules, and Sentinels Matter

One of the important lessons here is that human intention is not enough.

Even if the team agrees with the principles, high-throughput systems need some of them encoded into the environment itself.

That is where things like:

  • hook-based checks
  • skill activation gates
  • file ownership warnings
  • verification prompts
  • commit rules
  • and session-boundary audits

become useful.

They are not there to create bureaucracy. They are there because high-speed systems need environmental reinforcement, not just good intentions.

If the rules only exist as prose, they will be broken precisely when speed and pressure rise.
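
As one concrete illustration of encoding a rule into the environment, here is a sketch of a git pre-commit hook that enforces file scope. The `.workstream-scope` file name and one-glob-per-line format are assumptions for the example:

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit hook that blocks commits touching files
outside the workstream's declared scope."""
import subprocess
import sys
from fnmatch import fnmatch

def find_violations(paths, globs):
    """Return the paths that match none of the allowed glob patterns."""
    return [p for p in paths if not any(fnmatch(p, g) for g in globs)]

def staged_files():
    """List the files staged for the current commit."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

def read_scope(scope_file=".workstream-scope"):
    """Read the allowed globs, one per line."""
    with open(scope_file) as f:
        return [line.strip() for line in f if line.strip()]

def main():
    violations = find_violations(staged_files(), read_scope())
    if violations:
        print("commit blocked: staged files outside declared scope:")
        for path in violations:
            print(f"  {path}")
        return 1
    return 0

# Installed as .git/hooks/pre-commit, the script would end with:
# sys.exit(main())
```

The same shape works for any sentinel: read a declared rule, inspect the pending change, and fail loudly before the change lands.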

The Human Role Changes Too

A constitution also clarifies the human role.

The human is not there to type everything manually. The human is there to:

  • set direction
  • define scope
  • choose decomposition
  • decide what counts as evidence
  • arbitrate trade-offs
  • and own integration where multiple tracks converge

That is a more powerful role than “primary typist.” But it is also a more disciplined one.

It requires the operator to think like a system designer, not only like an implementer.

Why This Will Matter Beyond Coding

I suspect this pattern will spread well beyond engineering.

Any environment where agents participate materially in:

  • analysis
  • planning
  • execution
  • review
  • or operational updates

will increasingly need constitutional logic.

The same core questions come up everywhere:

  • what may the system change?
  • what evidence is required?
  • how are parallel tracks bounded?
  • which truths are canonical?
  • who owns integration?
  • what is reversible?
  • what should always remain under stronger human judgement?

That is governance. And governance becomes more important as capability rises.

The Deeper Point

The deeper point is that AI coding is no longer only a model problem. It is an operating model problem.

The teams that win will not simply have access to strong models. They will have systems that turn model capability into reliable engineering behaviour.

That requires law-like constraints. Not because the models are bad, but because the system is now powerful enough that loose behaviour has real cost.

Conclusion

We did not need a constitution for AI coders because the models were uniquely broken. We needed one because the system became useful enough that informal norms were no longer sufficient.

Once agentic throughput becomes real, the important challenge is not “how do we get more output?” It is:

how do we govern output well enough that speed does not destroy coherence?

That is why I think the future of serious AI coding is not just better prompting. It is better governance.

And governance starts to look a lot like a constitution.