
ISBN: B0GF1YKNJT

ISBN13: 9798242485730

Run the Incident: A Practical Guide to Leading IT Outages

Incident Management: Leading When Systems Fail

Modern systems don't fail quietly, and neither do the organisations that run them.

When critical technology breaks, the real challenge is rarely the code. It's decision-making under pressure, communication under uncertainty, and leadership when the cost of mistakes is measured in trust, revenue, and real-world impact.

Drawing on years of experience managing incidents in high-stakes financial and banking environments, this book reframes incident management as what it truly is: a leadership discipline, not a debugging exercise.

Using clear analogies from emergency response (fires, floods, earthquakes, and tsunamis), Incident Management provides a practical, human-centred guide to staying effective when everything is on the line.


What You'll Learn

✔ What an incident really is, and why impact matters more than urgency
✔ How to stabilise situations in the first five critical minutes
✔ Why clearly defined roles outperform heroics under pressure
✔ How communication becomes infrastructure during outages
✔ Techniques for decision-making when information is incomplete
✔ How to manage cascading failures, dependency shocks, and systemic events
✔ Practical guidance on observability, triage, and recovery
✔ How to run blameless post-incident reviews with real accountability
✔ How to build resilient systems and resilient people


Inside the Book

This book covers the full lifecycle of incident management, including:

Incident command and team structure

Crisis communication for technical and non-technical stakeholders

Flood control patterns such as rate limiting and backpressure

Dependency failures and staged recovery

Customer trust during outages

Security and compliance incidents

Automation, drills, and simulation

Metrics that matter, and those that mislead

The psychology of stress, fatigue, and group dynamics

Each chapter combines practical frameworks with timeless insight, supported by reflections from philosophy, leadership, and emergency management.


Who This Book Is For

This book is written for:

Technology leaders and engineering managers

Incident commanders and on-call engineers

SRE, platform, and reliability teams

Executives responsible for critical systems

Anyone expected to lead calmly when systems fail

No prior incident management framework is required, just the responsibility to act when things go wrong.


Why This Book Is Different

Most books focus on tools, dashboards, and postmortems.
This one focuses on how people think, communicate, and decide under pressure.

Because when systems fail, leadership, not technology, determines the outcome.

Recommended

Format: Paperback

Condition: New

$20.00
Ships within 2-3 days

Copyright © 2026 Thriftbooks.com
ThriftBooks® and the ThriftBooks® logo are registered trademarks of Thrift Books Global, LLC