The Resilience Rehearsal
A Tale of High Theatre and Low Learning
This story was inspired by the wonderful new book, “Why We Still Suck at Resilience: Organizational Dynamics” by Adrian Hornsby. If you are curious about how resilience can really be invested in and built, beyond the theatre, this is the book for you.
There are cafés that serve coffee, and there are cafés that serve metaphors.
Le Bon Mot, tucked between a second-hand bookshop and a locksmith who claimed to open only “conceptual locks,” did both. It was a narrow place, taller than it was wide, with shelves that leaned inwards as if listening. The books were arranged not by author, nor by genre, but by the emotional fragility of their protagonists. The coffee was strong enough to revive a pod in CrashLoopBackOff.
Case preferred the table beneath the brass clock that ran three minutes late. “Resilience,” she once said, “is the art of noticing the clock is wrong before you schedule your life around it.”
On this particular afternoon, the café hosted what was advertised, on a chalkboard whose letters rearranged themselves when unobserved, as:
THE GREAT RESILIENCE REHEARSAL
Featuring Chaos, Load, and an Honest Review
The event had been organised by a consortium of local technology leaders who had recently read a book with a title so refreshingly blunt that it felt less like literature and more like an accusation: Why We Still Suck at Resilience: Organizational Dynamics.
Case had read it in one sitting, occasionally pausing to underline phrases such as “Work-as-Imagined versus Work-as-Done,” “normalisation of deviance,” and “the drift toward theatre.” She had smiled the way one smiles when a mirror proves unflattering but accurate.
Now the consortium had decided to demonstrate that they, unlike the unnamed organisations in the book, did not suck at resilience. They would prove it in public. At a café.
The Master of Ceremonies, a man named Percival whose tie bore small embroidered circuit breakers, cleared his throat.
“Ladies and gentlemen,” he began, “we are here to demonstrate our adaptive capacity.”
From behind the counter, Madame Beauregard, proprietor and philosopher-queen of Le Bon Mot, muttered, “One must hope it is not decorative.”
Percival gestured toward a small stage upon which stood three artefacts: a fog machine labeled CHAOS, a treadmill labeled LOAD, and a lectern with a thick binder titled OPERATIONAL READINESS REVIEW (REV. 47).
“Our organisation,” Percival continued, “has invested heavily in resilience. We run GameDays, occasionally. We conduct blameless postmortems. We have a comprehensive readiness checklist.”
Case sipped her coffee. “How delightful,” she murmured to the brass clock. “A museum of assurances.”
Chaos Engineering as Pageantry
The first act was Chaos.
A young engineer in a black turtleneck, whose name tag read “Disruption Facilitator,” stepped forward and flipped a switch. The fog machine hissed theatrically. A speaker emitted the sound of distant thunder, which had been downloaded from a royalty-free website.
“This simulates a regional outage,” the facilitator declared.
Nothing happened. After a pause, a confetti cannon fired, releasing slips of paper printed with the words SIMULATED LATENCY.
The audience applauded.
Percival beamed. “As you can see, our system absorbs disturbances gracefully.”
Case raised a hand.
“Yes?” Percival asked, with the strained politeness of someone who suspects a question will not be congratulatory.
“What exactly have you disturbed?”
The facilitator frowned. “The staging environment.”
“Is it connected to anything meaningful?”
“It mirrors production,” Percival replied quickly.
“Work-as-Imagined production,” Case said. “Or Work-as-Done?”
The room shifted uneasily. A developer in the second row whispered, “We do not speak of Work-as-Done.”
Case stood, walked toward the stage, and examined the fog machine. It was unplugged.
“Ah,” she said gently. “Resilience as rebound without disturbance. Chaos as choreography.”
Percival laughed too loudly. “We prefer controlled experimentation.”
“Of course,” said Case. “One must ensure the experiment does not surprise the experimenters.”
She turned to the audience. “In the book you’ve all recently devoured, at least the parts with diagrams, it is suggested that practices like chaos engineering reveal the gap between how we think systems work and how they actually work. But only if the organisation is willing to be surprised.”
She tapped the unplugged cord with her foot.
“Surprise,” she said, “cannot be simulated by unplugging nothing.”
Madame Beauregard rang a small bell behind the counter. “A point for the lady.”
The facilitator plugged in the fog machine. It whirred with enthusiasm, promptly triggering the café’s smoke detector. The alarm screamed.
In the ensuing confusion, someone spilled coffee onto the readiness binder. Someone else attempted to silence the alarm by following a laminated instruction sheet that referenced a model of detector discontinued in 2014.
Case observed with mild interest.
“This,” she said over the noise, “is more educational.”
Load Testing as Treadmill
After the smoke cleared, both literally and metaphorically, the second act commenced.
A middle manager named Harriet stepped onto the treadmill labeled LOAD.
“We have extensively load-tested our systems,” she announced, beginning a brisk walk. “We know exactly how much traffic we can handle.”
“Under what conditions?” Case asked.
“Peak retail events. Promotional campaigns. Simulated Black Fridays.”
The treadmill accelerated slightly.
“And under drift?” Case inquired.
Harriet blinked. “Drift?”
“The gradual accumulation of small deviations. Configuration tweaks. Unreviewed patches. Informal workarounds. The system that runs in degraded mode because, as someone once observed, everything fails all the time.”
Harriet’s pace quickened as the treadmill, apparently misconfigured, interpreted her rising heart rate as a scaling signal.
“We test to 150% of expected capacity,” Harriet said, breathlessly. “With synthetic traffic.”
“Synthetic,” Case repeated thoughtfully. “Does it behave like customers?”
“It behaves like we think customers behave.”
“Work-as-Imagined customers,” Case said.
The treadmill lurched. Harriet stumbled, grabbing the handrails. The audience gasped. Case approached and pressed a red button labeled EMERGENCY STOP, which did nothing.
“That,” she said, “is interesting.”
Harriet managed to leap off just before the treadmill reached a speed better suited to small cheetahs. A junior engineer hurried forward. “The stop button is decorative,” he whispered. “It looks reassuring.”
“Ah,” Case said. “Robustness as interior design.”
She faced the room again.
“Load testing reveals capacity cliffs,” she said. “But only if you test the system that actually exists. Not the one drawn in the architecture deck. If the architecture deck and reality have diverged, and they always have, then your load tests measure confidence, not capacity.”
Harriet sat on the edge of the stage, contemplating her pulse.
“We have graphs,” she said faintly.
“Graphs are wonderful,” Case replied. “They are the poetry of imagined systems.”
The Honesty Test
The third act was the Operational Readiness Review. Percival, having regained his composure, approached the lectern with solemnity.
“Our readiness checklist contains 437 items,” he said. “Every service must pass before deployment.”
He opened the binder. Pages fluttered, still damp from coffee. Case peered over his shoulder.
“Item 142: ‘Monitoring dashboards updated.’ Are they?”
“Yes.”
“Have they been used during an actual incident?”
Percival hesitated. “We have a GameDay next quarter.”
“Item 289: ‘Runbook validated.’ By whom?”
“By the team that wrote it.”
“And the last time someone followed it under pressure?”
Silence.
From the back of the café, an SRE named Malik raised a tentative hand. “We bypassed it during the payment outage,” he said. “It didn’t account for the new caching layer.”
Percival cleared his throat. “We are updating it.”
Case smiled at Malik. “Thank you. That is the beginning of learning.”
She addressed the room.
“A readiness review is an honesty test. Not of compliance, but of alignment between Work-as-Imagined and Work-as-Done. Checklists fail silently when psychological safety is absent. If the answer ‘No’ carries risk, then every answer will be ‘Yes.’”
Malik looked at the binder. “We pass every review,” he said. “But incidents still surprise us.”
“Of course they do,” Case replied. “Because the review has become theatre. It demonstrates competence without discovering reality.”
Percival’s tie sagged slightly.
Incident Analysis as Ritual
At this point, summoned by narrative necessity, an actual incident occurred. The café’s payment terminal froze. Customers queued awkwardly, holding cups and existential doubt.
“Ah,” Case said softly. “Production.”
Percival seized the moment. “Excellent! We shall conduct a live blameless postmortem.”
“Before we’ve restored service?” Malik asked.
“Parallel workstreams,” Percival said briskly.
Case knelt beside the terminal. “When was it last updated?”
“Yesterday,” Malik said. “We patched it urgently. The vendor advisory was vague.”
“Did the readiness review capture that?”
Percival pretended to examine his tie.
As Malik and Harriet investigated, Percival assembled the audience into a semicircle.
“Let us identify the root cause,” he declared.
“Careful,” Case murmured. “Roots are seductive.”
Malik looked up from the floor, where he had discovered that the terminal’s power cable was loosely connected—possibly nudged during the Chaos Act.
“It’s not the patch,” he said. “It’s physical.”
Percival paused. “So the root cause is inadequate cable management.”
Case tilted her head. “Or perhaps the interaction between theatrical chaos and unexamined infrastructure.”
The payment terminal flickered back to life. Customers applauded. Percival scribbled on a flip chart: ROOT CAUSE: CABLE MANAGEMENT.
Case took the marker gently from his hand and added beneath it:
CONTRIBUTING FACTORS:
Unplugged fog machine plugged in.
Smoke detector alarm.
Spilled coffee.
Decorative emergency stop.
Readiness review untested in reality.
Psychological safety emerging mid-incident.
She turned to the audience.
“Incident analysis as theatre asks, ‘Who erred?’ Incident analysis as learning asks, ‘What does this reveal about our system, including ourselves?’”
Malik nodded slowly. Harriet, still recovering from her treadmill encounter, said, “We always add controls.”
“Yes,” Case replied. “Single-loop learning. Add a cable clip. Add a sign. Add a checklist item. But what assumptions produced the context in which this failure was possible?”
Percival looked faintly unwell.
“Double-loop learning,” Case continued, “would question why we treat resilience as performance. Why we prefer passing reviews to discovering gaps. Why the absence of failure is taken as evidence of safety.”
Madame Beauregard poured another espresso. “And deutero-learning?” she asked.
Case smiled. “Learning about how we learn. Or, in this case, how we rehearse.”
GameDay as Mirror
The consortium, to its credit, did not cancel the remainder of the event. Instead, they declared an impromptu GameDay.
“We will simulate a cascading failure,” Harriet announced, with newfound humility.
“Simulate?” Case asked.
“Perhaps,” Malik said, “we should use what just happened.”
The room fell quiet. They replayed the incident—not as accusation, but as inquiry.
Why was the fog machine unplugged? Because previous simulations had caused nuisance alarms. Why had previous simulations caused alarms? Because the environment was not configured for smoke. Why was the environment not configured? Because facilities were not included in planning. Why were facilities not included? Because resilience was considered a technical concern. Why was resilience considered technical? Because the organisation’s incentives rewarded uptime metrics, not learning.
Why did incentives reward uptime? Because leadership reported stability to the board. Why did leadership report stability? Because visible incidents were rare. Why were visible incidents rare? Because teams adapted informally, bridging gaps silently.
Why were adaptations silent?
Because raising concerns had historically slowed delivery.
The café seemed to lean closer, and Case spoke softly.
“You see the pattern. The gap between Work-as-Imagined and Work-as-Done does not close through rehearsal. It closes, a little and sometimes temporarily, through learning. And learning requires specific conditions.”
“Psychological safety,” Malik said.
“Appropriate incentives,” Harriet added.
“Leadership support,” Madame Beauregard concluded, polishing a cup.
Percival looked around at his colleagues.
“We have invested in practices,” he said slowly. “But not in the bedrock.”
Case nodded.
“Practices without this bedrock become theatre: chaos without curiosity; load without honesty; reviews without risk; incidents without insight.”
She gestured toward the brass clock.
“Resilience is not a destination. It is a practice. A rhythm. An ongoing navigation of drift.”
The clock ticked, three minutes late.
The Prevention Paradox at Closing Time
As evening approached, the consortium tallied their findings. They had intended to demonstrate capability. Instead, they had discovered fragility.
“This is bad optics,” Percival murmured.
“On the contrary,” Case said. “It is excellent practice.”
A junior developer asked, “How do we show the board that nothing happened because we learned?”
Case smiled sympathetically.
“Ah. The prevention paradox. When success looks like absence, and absence looks like idleness.”
Madame Beauregard placed a hand on the developer’s shoulder.
“You must learn to narrate invisible work,” she said. “Otherwise, it will be cut.”
The group sat in contemplative silence. Harriet broke it.
“What do we do next?”
Case considered.
“Design for adaptability. Create forums where Work-as-Done is surfaced regularly. Rotate incident facilitators. Share learnings across teams. Adjust incentives to reward curiosity. And, perhaps most radical of all, treat surprises as information, not embarrassment.”
Percival exhaled.
“And the fog machine?”
“Keep it plugged in,” Case said. “But move it away from the smoke detector.”
Epilogue: The Café That Refused to Pretend
Weeks later, Le Bon Mot had changed subtly.
The chalkboard now read:
RESILIENCE PRACTICE – WEEKLY
Bring Your Drift.
Teams gathered not to perform, but to inquire.
They examined minor anomalies before they escalated. They invited facilities to planning meetings. They updated the readiness binder not just with new items, but by removing obsolete ones. They treated the treadmill’s emergency stop as a feature to be tested, not admired.
Percival began reporting to the board not only uptime percentages, but learning cycles completed.
Malik facilitated a cross-team review of monitoring assumptions. Harriet redesigned the load-testing suite to incorporate real traffic patterns. The fog machine became a symbol. Not of chaos, but of humility.
Case returned often, always to the table beneath the tardy clock.
One afternoon, a newcomer asked her, “Do you think we still suck at resilience?”
Case stirred her coffee thoughtfully.
“We are complex,” she said. “Therefore we drift. We will always have a gap between how we imagine our systems and how they actually behave.”
She glanced at the clock.
“The question is not whether we suck. The question is whether we are willing to notice.”
The clock ticked.


