Software Engineering is not the art of being right
It is the scientific discipline of discovering where you are wrong before damage compounds
First, a story on Reality, Reasoning, and Production.
Once upon a time, in an Office Space-style cubicle near you…
The developer had always suspected that the universe was, at heart, a codebase written by someone who enjoyed recursion a little too much. But this belief had never caused problems before.
The Guess arrived on a Tuesday afternoon, fully formed, elegant, and—most dangerously—beautiful.
It explained everything. The latency spikes. The intermittent failures. The logs that seemed to contradict one another like quarrelling witnesses.
With a few lines of refactored logic and a gentle rearrangement of responsibilities, the system would, at last, reveal its true and orderly nature. The developer leaned back, smiled, and experienced that rarest of professional pleasures: coherence.
The code compiled without complaint. The tests passed with enthusiasm. The linter, that most vindictive of creatures, fell silent.
And production smiled in anticipation.
The First Review
The senior engineer scrolled through the change carefully.
“This is… neat,” she said, which in engineering culture is the highest praise available without three-eye approval.
She traced the logic with a finger, nodding. “Yes. That’s very clean. Almost… inevitable.”
The developer felt a warm glow. Inevitability is what one hopes for when rewriting reality.
The Second Review
The architect squinted at the diagrams.
“I like how this reduces complexity,” he said. “Fewer moving parts. More symmetry.”
He paused, frowned slightly, then shrugged. “I can’t see anything wrong.”
No one ever does, thought the developer. That’s the whole point of beauty.
The Third Review (Asynchronous)
A comment appeared two days later:
> LGTM. Love the abstraction.
The developer considered printing it out and framing it.
The Fourth Review (AI-Assisted)
The AI agent was asked, politely, to look for edge cases. It responded with alarming speed.
> The logic is internally consistent. The assumptions are reasonable. The approach aligns with best practices.
It then added, almost apologetically:
> Confidence: High.
The developer felt briefly uneasy, but dismissed the feeling. Even oracles must agree sometimes.
Deployment Time: Reality’s Review
Production did not crash; it did something far worse. It behaved correctly, according to the new logic—and catastrophically, according to reality.
Payments were delayed in ways no test had imagined. Caches synchronised with impeccable timing… to the wrong truths. Retries amplified errors with ballet-like precision.
The system was not broken. It was faithful.
The Postmortem
The room was quiet.
“We followed all the steps,” someone said.
“The reviews were thorough,” said another.
“The tests covered the cases,” said a third, who would later be promoted.
The developer stared at the wall, where an invisible diagram seemed to float—perfect, symmetrical, and entirely unrelated to the world.
“What assumption failed?” the facilitator asked gently.
The developer spoke at last.
“The system,” he said, “does not experience time the way we modelled it.”
There was a pause. Someone asked, “How does it experience time?”
The developer smiled sadly.
“Differently.”
A Footnote
Much later, the developer would write in a private notebook:
The guess was not wrong because it was illogical.
It was wrong because reality was under no obligation to be elegant.
He closed the notebook, opened the editor, and began again—this time not with a beautiful guess, but with a question.
And somewhere in the distance, production waited patiently, sharpening its knives.
Do Not Trust the Beautiful Guess
A beautiful guess that survives every review but fails in production was never correct—only untested by reality.
Software engineering increasingly resembles science—but often lacks its most important discipline: letting reality say no.
Design reviews, architecture boards, test suites, linters, simulations, AI assistants, and dashboards are not experiments. They are filters. Useful ones—but filters nonetheless. They operate entirely inside the model we have chosen to believe. Feynman’s explanation of science is brutal in its simplicity:
You guess how the world works.
You derive consequences of that guess.
You compare those consequences with observation.
If reality disagrees, the guess is wrong—without appeal.
Most software failures occur because we stop at step 2½. We mistake internal coherence for external truth. The whole SDLC is the real evidence-gathering playground for learning through experimentation:
You guess how the world works. — Design / architecture / refactor
You derive consequences of that guess. — Tests, reasoning, reviews
You compare those consequences with observation, i.e. experiment — Real users, real load, real time
If reality disagrees, the guess is wrong—without appeal. — Incidents, anomalies, surprises
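The four-step loop maps onto code almost literally. A minimal, purely illustrative sketch (the callables `guess`, `derive`, and `observe` are hypothetical stand-ins for design, testing/review, and production telemetry):

```python
def feynman_loop(guess, derive, observe):
    """Sketch of the conjecture-experiment cycle mapped onto the SDLC.
    All three arguments are hypothetical callables for illustration only."""
    hypothesis = guess()             # design / architecture / refactor
    predicted = derive(hypothesis)   # tests, reasoning, reviews
    actual = observe()               # real users, real load, real time
    if actual != predicted:
        # incidents, anomalies, surprises
        return "guess is wrong, without appeal"
    # note: survival is not proof, only a failure to falsify so far
    return "guess survives, for now"
```

The asymmetry in the return values is the whole point: reality can refute the guess outright, but agreement only earns it a stay of execution.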
The danger is not in guessing. The danger is falling in love with the guess.
Production is the experiment. Everything else is rehearsal.
“It doesn’t matter how beautiful your guess is. If it disagrees with experiment, it’s wrong.”
— Richard Feynman, paraphrased
Some practices to consider
Treat Design as Hypothesis, Not Truth — Every architecture is a claim about reality. Write it as such:
We believe X will reduce Y under conditions Z.
We expect to observe A, B, and C if this is correct.
If you cannot state what would falsify your design, you are not doing engineering—you are doing storytelling.
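One way to make the claim concrete is to record it as data alongside the design doc. A hypothetical sketch (the record shape and example values are invented for illustration):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DesignHypothesis:
    """A design decision written as a falsifiable claim. Illustrative only."""
    belief: str        # "We believe X will reduce Y under conditions Z."
    expected: tuple    # "We expect to observe A, B, and C if this is correct."
    falsified_by: str  # the observation that would prove this design wrong

hypothesis = DesignHypothesis(
    belief="We believe write-through caching will reduce p99 read latency under steady load",
    expected=("p99 read latency below 50ms", "cache hit rate above 90%", "no stale reads"),
    falsified_by="stale reads observed, or p99 latency unchanged one week after rollout",
)

# A hypothesis with no falsifier is storytelling, not engineering.
assert hypothesis.falsified_by, "state what would prove the design wrong"
```

The `falsified_by` field is the one that matters: if it is empty or unwritable, the design was never a hypothesis.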
Force Predictions Before Deployment — Before shipping, ask explicitly:
What will get slower?
What will break first?
What will surprise us at scale?
Where will humans behave “incorrectly”?
Write these down. Revisit them after release. Prediction sharpens understanding. Surprise sharpens humility.
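"Write these down" can be as simple as a structured record committed next to the release. A hypothetical sketch (field names and example answers are invented):

```python
import json
from datetime import date

# Hypothetical pre-release prediction log; revisit and score it after release.
predictions = {
    "release": "2.4.0",
    "recorded": date.today().isoformat(),
    "what_gets_slower": "nightly export, due to the extra audit write",
    "what_breaks_first": "the retry path under partial network failure",
    "surprise_at_scale": "cache churn once the keyspace exceeds memory",
    "human_misbehaviour": "operators re-running the job when it looks stuck",
}

# Persist alongside the release artefacts so the post-release review can
# compare prediction against observation, surprise by surprise.
record = json.dumps(predictions, indent=2)
```

The value is not in the format but in the ritual: a dated prediction cannot be quietly retrofitted after the incident.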
Prefer Small, Reversible Guesses — Feynman’s loop works because wrong guesses are cheap. In software:
Use feature flags
Use canaries
Use gradual rollouts
Use experiments, not launches
A guess that cannot be safely disproven is already dangerous.
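A gradual rollout can be sketched in a few lines. This is a toy illustration of deterministic percentage bucketing, not a real flag system; production setups usually delegate to a dedicated flag service:

```python
import hashlib

def in_rollout(user_id: str, flag: str, percent: int) -> bool:
    """Deterministic percentage rollout: hash (flag, user) into a 0-99 bucket.
    The same user always lands in the same bucket for a given flag, so the
    experiment population is stable as the percentage widens."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100
    return bucket < percent

# Ship the guess to 1% of traffic first; widen only if reality agrees.
enabled = in_rollout("user-42", "new-retry-logic", percent=1)
```

Because bucketing is deterministic, raising `percent` from 1 to 10 keeps the original 1% enrolled—the cheap, reversible guess grows instead of being replaced.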
Production Is Not Validation, It Is Merely Judgment — Passing tests does not mean “correct”. Approval does not mean “true”. Production answers a single question only:
Does the system behave this way in the world that actually exists?
No amount of reasoning can overrule that answer.
Beware Coherence Bias (Especially with AI) — Modern tools are exceptionally good at producing:
Plausible explanations
Internally consistent designs
Confident affirmations
They are not good at:
Knowing which assumptions are false
Experiencing time, load, incentives, or fatigue
Being surprised
AI amplifies the danger of the beautiful guess by making it effortless. Judgment must remain human—and grounded in observation.
Some things to avoid
Confusing elegance with correctness
Mistaking review consensus for evidence
Treating incidents as moral failures instead of epistemic ones
Retrofitting explanations after the fact
Optimising for narrative clarity instead of behavioural truth
The system does not care how convincing your diagram was.
Software Engineering is not the art of being right
Software Engineering is the scientific discipline of discovering where you are wrong before the damage compounds.
Production is not your enemy; it is your clearest source of experimental evidence and feedback.
Listen to reality. Keep the message clear. Keep the feedback loop valuable. Keep it safe.
Further Reading
On Scientific Thinking & Intellectual Honesty
The Character of Physical Law — Richard Feynman
The clearest articulation of Feynman’s view of science as conjecture disciplined by experiment. Short, sharp, and devastating to false certainty.
Surely You’re Joking, Mr. Feynman! — Richard Feynman
Not a methods book, but a lived demonstration of curiosity, skepticism, and refusal to accept authority without understanding.
The Pleasure of Finding Things Out — Richard Feynman
Essays that show how joy, play, and doubt coexist in serious inquiry.
On Software Engineering as an Empirical Discipline
Modern Software Engineering — Dave Farley
Perhaps the closest modern software analogue to Feynman’s philosophy. Treats software development explicitly as learning under uncertainty, emphasising fast feedback, hypothesis testing, and evolutionary design over predictive certainty.
Continuous Delivery — Jez Humble & Dave Farley
A practical manual for shortening the loop between guess and observation—turning production into a controlled experiment rather than a cliff edge.
Accelerate — Nicole Forsgren, Jez Humble & Gene Kim
Empirical evidence that organisations which optimise feedback loops outperform those optimising for certainty and control.
Observability Engineering — Charity Majors, Liz Fong-Jones & George Miranda
This is where Feynman’s philosophy meets running systems. Observability Engineering is fundamentally about asking new questions of reality after deployment, rather than guessing the right dashboards in advance. It treats production not as validation, but as an ongoing experiment. “You don’t predefine the questions. You instrument the system so you can ask them.”
On Systems, Feedback, and Reality Pushback
Thinking in Systems — Donella Meadows
Essential reading on why systems behave differently in reality than in design documents—and why interventions often backfire.
Normal Accidents — Charles Perrow
A sobering reminder that some failures are not bugs, but emergent properties of complex systems.
On Fallibility, Knowledge, and Learning
Conjectures and Refutations — Karl Popper
The philosophical backbone beneath Feynman’s pragmatism: knowledge advances by bold guesses and rigorous attempts to disprove them.
The Design of Experiments — Ronald Fisher
For those who want to go deeper into making production experiments actually informative rather than anecdotal.


