Theatre risk, the human signature gate

What “theatre” means here

The term comes from the BDD critique, where it’s used to describe Gherkin scenarios written after the code, to look like they shaped the code, when in fact the code shaped them. The form is right (well-structured Given/When/Then sentences) and the function is gone (those sentences didn’t drive what got built; they were tidied up afterwards to make a ceremony look honest). The methodology is theatre because the artefacts are props.

I’m using the word in that BDD sense throughout this page, and generalising it. Theatre is what any methodology becomes when its artefacts get written to satisfy the form rather than the function. A PRD can be theatre. A TAD can be theatre. An AC can be theatre, exactly the way a Gherkin scenario can. A test suite can be theatre when it’s written against the code that already exists rather than the criterion the code is meant to satisfy. None of these failures need malice. They just need a team under pressure, a tooling chain that lets you fill in the form after the fact, and a culture that doesn’t make the difference visible.

The honest position: no methodology survives a team that’s determined to cargo-cult it. Humans will bend, corrupt, and morph things to suit themselves, their shorter-term goals, or their reactive thinking, and they’ll do it without noticing. RCF can’t out-discipline a team that doesn’t want to be disciplined. What it can do is make the theatre version structurally harder to produce, and visibly different from the real version when somebody looks.

The four parts of the defence

The structural answer has four parts. They aren’t a checklist; they’re a shape that recurs at every layer of the chain.

Standards and checklists

Standards say what MUST and SHOULD be present in each artefact. A PRD that doesn’t state its non-functional requirements isn’t a PRD. A TAD that doesn’t state its decisions isn’t a TAD. An AC that doesn’t name an observable expectation isn’t an AC. The methodology provides the shape; each organisation provides the content. Regulated industries will have more MUSTs. Greenfield SaaS will have fewer. The standards aren’t universal, but the rule that there are standards is.

Standards are what make the form non-trivial. Without them, an artefact is any document that names itself a PRD. With them, an artefact is a document that meets the checklist or visibly doesn’t.
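A minimal sketch of what "the methodology provides the shape; each organisation provides the content" could look like as data. Everything named here (`Requirement`, `with_overrides`, the section names) is hypothetical and chosen for illustration; RCF does not prescribe this representation.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    MUST = "MUST"
    SHOULD = "SHOULD"


@dataclass(frozen=True)
class Requirement:
    section: str  # hypothetical section name
    level: Level


# Illustrative baseline PRD checklist; a real one is the organisation's to define.
BASE_PRD_CHECKLIST = [
    Requirement("business_goals", Level.MUST),
    Requirement("non_functional_requirements", Level.MUST),
    Requirement("out_of_scope", Level.SHOULD),
]


def with_overrides(base, overrides):
    """Return a new checklist where overrides add sections or change levels."""
    merged = {r.section: r for r in base}
    for r in overrides:
        merged[r.section] = r
    return list(merged.values())


# A regulated organisation adds one MUST and promotes a SHOULD to a MUST.
REGULATED_PRD_CHECKLIST = with_overrides(
    BASE_PRD_CHECKLIST,
    [
        Requirement("regulatory_constraints", Level.MUST),
        Requirement("out_of_scope", Level.MUST),
    ],
)
```

The point of keeping the checklist as data rather than prose is that "meets the checklist or visibly doesn’t" becomes something a tool can evaluate, not something a reader has to judge.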

AI-assisted extraction

The artefact gets built by drawing the right information out of the stakeholders, by conversation, supplemented by working prototypes, existing documentation, research pieces, and whatever else the project has accumulated. The AI’s job is the extraction: ask the right questions, surface the missing pieces, propose draft text against the standards. The human’s job is to commit to what’s drawn out.

Extraction is what stops the artefact being a copy of last quarter’s template with the names changed. It also stops the artefact being a novel written by the AI in isolation. The human is in the conversation from the start; the AI is the patient interviewer that asks the questions the human would have skipped.

The diff between what’s present and what’s required

The standards declare what must be present. The extraction produces what has been gathered. The difference is a structured gap: visible, named, and on the page. The artefact isn’t green until the gap is addressed, either by gathering the missing piece or by recording a deliberate decision not to.

The gap is where theatre dies, when it dies. A theatre artefact is one that looks complete because the form is filled in. A gap-aware artefact is one where incompleteness is loud. You can still ship an incomplete artefact, but you can’t pretend it’s complete. The gap is the structural truth.
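The gap can be sketched as a three-way classification: each required section is present, deliberately waived with a recorded reason, or missing. This is an illustrative model only; `Artefact`, `compute_gap`, `is_green`, and the section names are assumptions, not part of any RCF tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Artefact:
    # section name -> gathered text (empty or absent means not gathered)
    sections: dict[str, str] = field(default_factory=dict)
    # section name -> recorded reason for deliberately leaving it out
    waivers: dict[str, str] = field(default_factory=dict)


def compute_gap(required: list[str], artefact: Artefact) -> dict[str, str]:
    """Classify each required section as 'present', 'waived', or 'missing'."""
    status = {}
    for name in required:
        if artefact.sections.get(name, "").strip():
            status[name] = "present"
        elif artefact.waivers.get(name, "").strip():
            status[name] = "waived"
        else:
            status[name] = "missing"
    return status


def is_green(gap: dict[str, str]) -> bool:
    """Green only when nothing is silently missing."""
    return "missing" not in gap.values()


prd = Artefact(
    sections={"business_goals": "Cut onboarding time.",
              "non_functional_requirements": ""},  # heading filled in, content empty
    waivers={"regulatory_constraints": "No regulated data in scope."},
)
gap = compute_gap(
    ["business_goals", "non_functional_requirements", "regulatory_constraints"], prd
)
# non_functional_requirements is 'missing', so is_green(gap) is False
```

An empty section under a correct heading counts as missing here, which is the theatre case: the form is filled in and the content is not.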

The approval gate

Approval is a recorded act, not a meeting outcome. The human signs off against the artefact, against the standards, against the recorded gap. The signature is on a commit, in a system that captures who signed, when, against what version of the document, with what gap acknowledged. The signature isn’t a rubber stamp because it’s against a structured artefact with a structured diff.

This is where the human signature lives in RCF. Not at the bottom of a PDF, not on a Confluence page, not in the minutes of a meeting. At the approval commit on the artefact that’s about to drive the next stage of work. The signature is auditable, attributable, and tied to the artefact version it actually signed.
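As a rough sketch of what such a record has to capture (who, when, which version, which gap), assuming a content hash stands in for the commit identifier a version-control system would normally supply. The `Approval` type and the `approve` and `still_valid` functions are hypothetical names for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Approval:
    signer: str
    signed_at: datetime
    artefact_digest: str               # identifies the exact version that was signed
    acknowledged_gap: tuple[str, ...]  # gap items the signer explicitly accepted


def digest(text: str) -> str:
    """Content hash used here as a stand-in for a commit id (an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def approve(signer: str, artefact_text: str,
            open_gap: list[str], acknowledged: list[str]) -> Approval:
    """Record a sign-off; refuse if any open gap item was not acknowledged."""
    unacknowledged = sorted(set(open_gap) - set(acknowledged))
    if unacknowledged:
        raise ValueError(f"gap items not acknowledged: {unacknowledged}")
    return Approval(signer, datetime.now(timezone.utc),
                    digest(artefact_text), tuple(sorted(open_gap)))


def still_valid(approval: Approval, current_text: str) -> bool:
    """A signature covers only the version it was made against."""
    return approval.artefact_digest == digest(current_text)
```

Two properties matter in this sketch: the signer cannot approve past a gap they have not acknowledged, and any later change to the artefact makes the old signature stop matching.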

STANDARDS → EXTRACTION → DIFF → APPROVAL → (loops)

Why the four parts work together

Each part on its own is weak. Standards without extraction produce template-filling. Extraction without standards produces a novel. A diff without an approval gate produces a tracked debt nobody owns. An approval gate without the rest produces the rubber-stamp version of the human signature: someone signed because they were in the room, not because they verified anything. That’s the version we’re trying to replace.

Together, the four parts make the theatre version visibly different from the real version, in a way that survives a senior reviewer looking at the artefact, the diff, and the commit history. Theatre is still possible. It’s just harder to produce by accident, and embarrassingly obvious when it’s produced on purpose.

The same shape at every layer of the chain

Every artefact in the chain runs the same four-part pattern, with the content of each part changing layer by layer.

PRD. Standards: the org’s PRD checklist, covering sections like business goals, non-functional requirements, regulatory constraints, out-of-scope. Extraction: AI-led conversation with product, stakeholders, sometimes customers. Diff: gaps against the checklist, visibly recorded. Approval: human sign-off commits the PRD as the contract upstream of everything else.

TAD. Standards: architecture checklist (components, data, integrations, decisions, NFR posture). Extraction: AI-led conversation with technical architecture, security, devops, infrastructure. Diff: gaps in the architecture treatment, visibly recorded. Approval: tech architect sign-off, ideally co-signed by security and devops where the decisions touch their domain.

AC. Standards: a minimum schema covering happy paths, named error modes, named failure paths, derived edge cases, observable expectations. Extraction: AI drafts the ACs from REQ, TAD, US, against the schema. Diff: schema coverage check (which categories are represented; which are conspicuously missing). Approval: product owner or accountable role signs off the AC set against the schema.

FBS. Standards: the FBS template (storyScope, dependencies, build context, testable outcomes, status). Extraction: AI assembles the FBS from the build sequence and the in-scope stories. Diff: any missing field is visible. Approval: tech lead or accountable engineer commits the FBS as ready-to-build.

Tests. Standards: test cases cover the named happy paths, error modes, failure paths, edge cases declared on the AC. Extraction: the coding agent drafts the tests against the AC, in a separate context from the code (different system prompt, instruction set, commit boundary). Diff: any case category on the AC without a corresponding test case is loud. Approval: review and commit of the suite happens before the code-writing pass starts.

The shape is the same in each row. The content is different. The methodology gives you the shape and the rule that the shape runs everywhere. The organisation gives you the checklists.
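For the AC and Tests rows, the diff step could be sketched as two small checks: which case categories the AC leaves out altogether, and which declared cases have no corresponding test. The category identifiers, function names, and example cases below are illustrative assumptions rather than an RCF-defined schema.

```python
# Case categories named in the AC and Tests rows above (identifiers are assumed).
AC_CATEGORIES = ("happy_path", "error_mode", "failure_path", "edge_case")


def missing_categories(ac_cases: dict[str, list[str]]) -> list[str]:
    """Categories with no cases declared on the AC at all."""
    return [c for c in AC_CATEGORIES if not ac_cases.get(c)]


def uncovered_cases(ac_cases: dict[str, list[str]],
                    tests: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Per category, the AC cases that no test claims to cover.

    tests is a list of (category, case_name) pairs, one per test case.
    """
    covered = set(tests)
    report = {}
    for category in AC_CATEGORIES:
        missing = [case for case in ac_cases.get(category, [])
                   if (category, case) not in covered]
        if missing:
            report[category] = missing
    return report


ac = {
    "happy_path": ["valid login"],
    "error_mode": ["wrong password", "locked account"],
    "edge_case": ["password at max length"],
}
tests = [("happy_path", "valid login"), ("error_mode", "wrong password")]

print(missing_categories(ac))      # ['failure_path']
print(uncovered_cases(ac, tests))  # error_mode and edge_case each have an untested case
```

A check like this only shows that a test claims a case; whether the test actually exercises it is still a matter for review.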

How this disposes of the AI-marks-its-own-homework critique

A reasonable worry: if AI drafts the ACs and the same coding agent then writes the tests, you’ve got a chain of self-reports with no anchor point. The agent agrees with itself; the tests pass against the code the agent wrote; the human signature at the end is a rubber stamp on a closed loop.

The four-part defence is the anchor. Two anchor points in particular:

First, AC review against a structured schema is a human signature on structured content. The human isn’t reviewing a paragraph of prose hoping to catch a vibe error. They’re reviewing an AC against a checklist (happy path covered? error modes named? observable expectation present?) and signing off that the AC meets the schema. The signature is on the schema-compliance, not on vibes. The agent can’t fake schema compliance; either the schema fields are populated with real content or they’re not.

Second, the test-writing context is structurally separate from the code-writing context. Tests are drafted from the AC, against the AC schema, in a session that doesn’t have the implementation in view. The tests get committed before the code session starts. By the time the coding agent sees the test suite, the suite is a fixed target. It can’t adjust the suite to match what it’s about to write, because the suite is already on a commit nobody is about to amend without a deliberate, visible act.

The chain of self-reports is broken by the human signature on the AC (signed against a schema, not against vibes) and by the commit boundary between test-writing and code-writing (which separates the contexts an agent can self-confirm across). Neither is bulletproof. Both are structurally better than “the agent says it worked.”
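The commit boundary can be checked mechanically. The sketch below models history as a plain list rather than calling a real version-control tool, and assumes commits carry a `kind` label and that tests live under a `tests/` prefix; both are hypothetical conventions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str
    kind: str               # hypothetical label, e.g. "tests" or "code"
    paths: tuple[str, ...]  # files the commit touches


def boundary_violations(history: list[Commit],
                        test_prefix: str = "tests/") -> list[str]:
    """Return commits that touch test files once code-writing has begun.

    history is ordered oldest first. A flagged commit is not necessarily
    wrong; it is a change to the fixed target that must be visible.
    """
    flagged = []
    code_started = False
    for commit in history:
        if commit.kind == "code":
            code_started = True
        if code_started and any(p.startswith(test_prefix) for p in commit.paths):
            flagged.append(commit.sha)
    return flagged


history = [
    Commit("a1", "tests", ("tests/test_login.py",)),
    Commit("b2", "code", ("src/login.py",)),
    Commit("c3", "code", ("src/login.py", "tests/test_login.py")),
]
print(boundary_violations(history))  # ['c3']
```

The check does not block the amendment; it makes it a deliberate, visible act, which is what the boundary is for.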

What this page is and isn’t saying

Not saying: RCF eliminates theatre risk. Nothing eliminates theatre risk. A team that’s determined to cargo-cult a methodology will cargo-cult RCF the same as anything else.

Saying: RCF makes the theatre version structurally harder to produce and visibly different from the real version, by routing every artefact through the same four-part pattern (standards, extraction, diff, approval). The human signature, when it lands at the approval gate, is against a structured artefact with a structured diff, not against a feeling that everything looked fine in the meeting.

The upstream tooling that operationalises this fully (the extraction workflows, the diff visualisations, the approval-gate UI) is what’s coming next. The scoping page is the reference for what’s in frame today and what’s coming later. This page describes the mechanism that will sit underneath that tooling.

Where this shows up elsewhere

The pages that cross-link to this one are the places where theatre risk is the obvious worry:

Methodology lineage is where the BDD theatre critique lives. The four-part pattern on this page is what RCF inherits and tightens from that lineage.

Acceptance criteria as the contract is where AC theatre is the obvious worry. The schema-and-approval mechanism described here is how AC theatre is made structurally harder to produce.

Scoping is the sibling to this page. Scoping says what RCF currently covers; theatre risk says how the cover stays anchored to real human commitment rather than ceremony.