Demo-ready vs production-ready, the gap AI didn't close

Two years of working with AI tools, properly and at depth, has changed the shape of the conversation about what AI does to software work. The thing that hasn’t changed is that production-ready software is hard. The thing that has changed is that demo-ready software is now astonishingly cheap. The distance between the two is where the executive misconceptions sit, where most of the new failure modes live, and where the methodology gets to be useful.

Demo-ready: the thing AI made cheap

Demo-ready software is the version that clicks through. It has the screens. It calls the right API. It handles the happy path. It looks like what it says it is. With a good agent and a reasonable brief, demo-ready software now takes an afternoon, not a fortnight. If you’ve been writing software long enough to remember the alternative, the speed is jarring. The first time I built a real-looking prototype between lunch and home time I had to sit with the feeling that I’d cheated.

The thing demo-ready is genuinely good for is the conversation about the product. A working artefact in front of a stakeholder will surface more honest reactions in twenty minutes than a Word document will in twenty hours. The product owner sees how the flow actually feels. The designer sees how the copy actually sits on the screen. The compliance lead sees the field they’ll need to log. The conversation moves from abstract to concrete almost instantly. That’s a genuine lift, and it’s what makes AI tooling worth having upstream of the build, not just inside it.

What demo-ready is not is the product. The mistake, covered further down, is treating it as if it were.

Production-ready: the thing that stayed hard

Production-ready is everything between “the demo works” and “the product can be sold, supported, and not killed in its second week of real use.” The list is unflattering, because most of it has always been unflattering:

The cases the demo didn’t exercise. Empty inputs, enormous inputs, unicode, negative numbers, simultaneous writes, expired sessions, partial failures from third parties. Real users will hit every one of these within a week. The demo will not have hit any of them. The agent didn’t invent the edge cases the demo skipped; it followed the brief, and the brief was the happy path.

The decisions the agent made without telling you. Schema choices, retry policies, error messages, default permissions. The agent had to pick something to produce running code. It picked plausibly, in passing, and didn’t flag the choice as one. Each of those quiet choices is a policy the product is now committed to, and the team has no record of having committed to it.

The supportability layer. Logs you can actually grep. Metrics you can actually alert on. Errors that surface as themselves, not as a stack trace from three layers deep. The hooks operations needs to keep the product up at three in the morning. Demos don’t need any of this. Production does, and the gap between “works” and “works in a way that can be operated” is enormous.

The security posture. Auth, authorisation, input validation, secrets handling, audit logging, the things compliance will look for whether you did them on purpose or not. The agent will produce code that compiles regardless of how the auth is wired; it won’t produce code that’s right unless the requirement said so explicitly. Most demo briefs don’t say so explicitly.

The maintenance horizon. A feature that works today is one thing. A feature that’s still working in six months, after twelve other features have been added on top of it, is a different problem. Traceability and a healthy test suite are the only things that survive this. The demo has neither.
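To make the first item on that list concrete, here is a minimal sketch of a single input, an order quantity, handled the demo way and then with the edge cases decided on purpose. Everything in it is an illustrative assumption: the function names, the `QuantityError` type, and the `MAX_QUANTITY` limit are hypothetical, not taken from any real product.

```python
class QuantityError(ValueError):
    """Raised when an order quantity fails validation (hypothetical type)."""


def parse_quantity_demo(raw: str) -> int:
    # Demo version: right for "3". Everything else is an undecided policy.
    # int("-5") returns -5, and int("１２") (full-width digits) returns 12,
    # so negative and non-ASCII quantities pass through without anyone choosing that.
    return int(raw)


MAX_QUANTITY = 1_000  # assumed business limit, for illustration only


def parse_quantity(raw: str) -> int:
    """Parse an order quantity, with each edge case decided explicitly."""
    text = raw.strip()
    if not text:
        raise QuantityError("quantity is required")
    # Accept ASCII digits only, so signs, decimals and other scripts are rejected.
    if not (text.isascii() and text.isdigit()):
        raise QuantityError("quantity must be a whole number using digits 0-9")
    # Check the length before converting, so enormous inputs never reach int().
    digits = text.lstrip("0") or "0"
    if len(digits) > len(str(MAX_QUANTITY)):
        raise QuantityError(f"quantity must be at most {MAX_QUANTITY}")
    value = int(digits)
    if value < 1:
        raise QuantityError("quantity must be at least 1")
    if value > MAX_QUANTITY:
        raise QuantityError(f"quantity must be at most {MAX_QUANTITY}")
    return value
```

The second function is longer, but the length is not the point. Each branch is a decision somebody made and can be asked about, which is also what the second item on the list, the quiet choices, comes down to.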

None of this is new. It’s what production-ready meant before AI was writing code, and it’s what it still means now. The AI didn’t change any of it. It just made it easier to skip past, because the thing the AI did deliver (a running artefact) looks enough like the goal that the rest can quietly disappear from the schedule.

The gap, plainly

The gap between demo-ready and production-ready isn’t a fixed ratio. It varies enormously by domain. Greenfield SaaS in a forgiving regulatory environment, where users can be retrained on a bad release, has a smaller gap than financial software running through a regulator’s audit cycle. A consumer app where a bad release means a one-star review has a smaller gap than an internal tool that thousands of people depend on hourly. The shape is the same in every case; the size of the gap is project-specific.

What’s constant is the structural difference. Demo-ready proves the thing could exist. Production-ready proves the thing does exist, behaves the way the team committed to, and won’t degrade silently. The mechanism for that second thing is documents, tests, traceability, and a disciplined build cycle, none of which the demo had.

The dangerous version of this

A version of the misconception is already loose in the wild and will be a substantial problem for the industry over the next year or two. Product owners and executives can stand up demo-ready software themselves now, in an afternoon. The careful ones know that’s a prototype. The rest convince themselves the hard work is done, then toss the artefact over the fence to an already-overworked engineering team to “just check it and productionise it.” After all, the AI built it. How hard can the rest be?

The rest, as the production-ready section laid out, is most of the work. Most companies have no real plan for how AI-built demos get shipped, paid for, and supported. McKinsey’s State of AI work consistently finds that despite near-universal adoption, only a small minority of organisations report material business value from it, and most are still figuring out how to scale beyond pilots. The pattern is consistent with the gap I’m describing here: it’s easy to get demo-ready output from AI; it’s genuinely hard to produce a product the organisation can stand behind.

To be fair to the demo: when a non-engineer can encode a flow, an algorithm, or a screen sequence in a working artefact, that artefact is a real input to the requirements process. It’s easier to write an acceptance criterion (AC) against something runnable than against a Word document. The mistake is treating the demo as the build instead of the brief. As an input to the upstream work, the demo is gold. As a shortcut around the engineering, it’s the trap executives keep walking into.

What this means for methodology

For most of the period before AI got good at typing, software methodology arguments came down to how large a tax to levy on the typing. Waterfall versus agile. TDD versus “ship and fix.” Heavy docs versus light docs. The argument was always about how much ceremony you could afford to layer onto the actual code-writing without slowing the team to a halt. Methodologies that imposed too much got rejected, because typing was the bottleneck and ceremony made it worse.

With demo-ready output now nearly free, the argument changes shape. The methodology isn’t a tax on the expensive part of the work; the methodology is the work that closes the gap to production-ready. Writing the requirement properly, breaking it into stories properly, writing acceptance criteria that hold up against the cases the demo didn’t exercise, sequencing the build properly, reviewing the diff honestly, running the cycle. These activities aren’t overhead. They’re the part that converts a working-looking artefact into a product the team can ship.

Teams that figure this out get an enormous lift. The work that produces production-ready software is the work that also compounds. A clean requirement set survives the project. A thoughtful build sequence shapes the next quarter. A test suite that traces to acceptance criteria protects the product for years.

Teams that don’t figure it out keep trying to skip the methodology, because they’re mistaking demo-ready for done. They ship more code with less understanding, and the gap compounds the other way. This is what AI drift looks like at the team level.
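The idea of a test suite that traces to acceptance criteria can be sketched in a few lines. This is a hypothetical illustration, not RCF’s actual format: the AC identifiers, the `covers` decorator, `can_checkout`, and `untested_criteria` are all names invented for the example.

```python
from collections import defaultdict
from typing import Callable

# Hypothetical acceptance criteria, keyed by an ID the tests can cite.
ACCEPTANCE_CRITERIA = {
    "AC-1": "An empty cart cannot be checked out.",
    "AC-2": "A quantity above the stock level is rejected.",
    "AC-3": "An expired session redirects to login without losing the cart.",
}

# Maps each AC ID to the names of the tests that claim to cover it.
_coverage: dict[str, list[str]] = defaultdict(list)


def covers(*ac_ids: str) -> Callable:
    """Record which acceptance criteria a test function exercises."""
    def decorator(test_fn: Callable) -> Callable:
        for ac_id in ac_ids:
            if ac_id not in ACCEPTANCE_CRITERIA:
                raise KeyError(f"{test_fn.__name__} cites unknown criterion {ac_id}")
            _coverage[ac_id].append(test_fn.__name__)
        return test_fn
    return decorator


def can_checkout(cart: dict[str, int], stock: dict[str, int]) -> bool:
    """Toy implementation under test: non-empty cart, every line within stock."""
    return bool(cart) and all(qty <= stock.get(item, 0) for item, qty in cart.items())


@covers("AC-1")
def test_empty_cart_is_rejected() -> None:
    assert can_checkout({}, {"widget": 5}) is False


@covers("AC-2")
def test_quantity_above_stock_is_rejected() -> None:
    assert can_checkout({"widget": 6}, {"widget": 5}) is False


def untested_criteria() -> list[str]:
    """Return the IDs of acceptance criteria that no test claims to cover."""
    return sorted(ac for ac in ACCEPTANCE_CRITERIA if not _coverage.get(ac))


print(untested_criteria())  # prints ['AC-3']
```

The useful output is the gap report. A criterion with no test against it is a commitment nobody is checking, and a test that cites an unknown criterion fails at import time, before it can quietly drift away from the requirement set.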

What RCF is for, in this context

RCF is the operational answer to “what does the work between demo-ready and production-ready actually look like.” The document chain is what the work produces. The build cycle is the discipline that survives contact with agents. The AC-as-contract rule is what keeps the agent’s output honest against the cases the demo didn’t hit. The human signature at the approval gates is what keeps the chain from drifting into ceremony.

The framework runs at the level the gap actually lives at. It’s the part that always mattered, finally affordable now that the typing around it has stopped consuming the schedule.