HACKER Q&A
📣 veddox

Literature on crisis response?


So I've always been interested in how teams respond to crises. I love reading about teams that handle crises well, and try to figure out what I can learn from their actions for the teams I lead. (Apollo 13 continues to be an all-time favourite movie of mine...)

Recently, there've been two discussions on HN [1,2] that have gotten me thinking about this topic again. And now I'm wondering: are there any good books on the topic that you can recommend? I'm not restricting myself to any domain - business, politics, engineering, natural disasters, could all be interesting.

[1] https://news.ycombinator.com/item?id=26506920

[2] https://news.ycombinator.com/item?id=26539495


  👤 cjbprime Accepted Answer ✓
The obvious choice is aviation -- there are thousands of commercial accident reports, many of which lead to process improvements for everyone else afterwards, and general aviation emergencies every day. YouTube especially is full of ATC audio combined with radar visuals and commentary for emergencies.

For tech, Dan Luu maintains a list of public post-mortems for tech company incidents: https://github.com/danluu/post-mortems


👤 Darkstryder
With regard to the specific topic of communicating through a crisis, I've found Masters of Disaster: The Ten Commandments of Damage Control by Christopher Lehane, Mark Fabiani and Bill Guttentag to be pretty interesting.

The authors worked with multiple organizations and celebrities during times of crisis and helped them with the public relations side of it. For instance they managed Bill Clinton's PR during the Monica Lewinsky events.

The book is not perfect (I think it could be shortened a bit and retain the same information) but it is still very interesting and I think about it every time I witness someone getting themselves into a big, public crisis and making things worse by not managing their PR properly.


👤 qbasic_forever
The climbing and mountaineering community is great about reviewing past accidents (especially deadly ones) to understand the root causes. Every year the American Alpine Club produces a new volume of Accidents in North American Climbing and it's well worth a read: http://publications.americanalpineclub.org/about_the_acciden... Most things are human error like rappelling off the end of a rope, but there are usually a few really big team or trip planning failures like avalanches, etc.

There's a podcast they do related to it called The Sharp End and it's worth a listen too: https://www.thesharpendpodcast.com/


👤 trop
Two more, by authors coincidentally with the same initials:

Thinking in an Emergency (2012) by Elaine Scarry -- She argues for the importance of planning and procedure. Examples include the Swiss shelter system (civil defense), CPR training, and compacts in rural Canada to deal with grain silo fires. She suggests that careful thought before the emergency is vital for "civilization", or at least for democratic governance. As a converse, the country is destabilized when an opportunist leader comes along, cowboy-style, and says, "This is a crisis, and I'm going to shoot from the hip and fix it". Hence it's both a pragmatic study and a lucid work of political philosophy.

Command and Control (2014) by Eric Schlosser -- This is more of an anti-study: not what to do in an emergency, but the inevitable flaws in complex systems, the limited efficacy of administration, and the inherent failings of human effort. Sort of the mirror-book to Scarry's, but also utterly fascinating. It's a history of disasters of the U.S. nuclear weapons program, and in an effort as huge as that, the disasters certainly exist at scale. (Interestingly, Scarry's book centers on the importance of governance in a nuclear state, or perhaps that democratic governance is not compatible with being a nuclear power.)


👤 pjmorris
For more on the Apollo 13 crisis response, check out 'Apollo: Race to the Moon' by Murray and Cox. I haven't met a more detailed popular examination of the engineering and management effort behind the Apollo program, and they spend some time on the 'back room' of engineers depicted in the film.

I'm reminded of 'The Medical Detectives', Roueche, but only by reputation (I own a copy I haven't read.) "In each true story, local health authorities and epidemiologists race against time to find the clue to an unknown and possibly fatal disease."

If you interpret 'The enemy might get the bomb before we do' as a crisis, 'The Making of the Atomic Bomb', Rhodes, is a detailed (and Pulitzer Prize-winning) examination of how we got from discovering the atom's nucleus to the consequences of deploying city-destroying weapons in a generation or so.

You might find general systems theory interesting, maybe 'Thinking In Systems', Meadows, and/or 'An Introduction to General Systems Thinking', Weinberg.


👤 markedathome
The CDC has the Crisis & Emergency Risk Communication (CERC) manual[1], as well as a website covering this topic[2].

[1] https://emergency.cdc.gov/cerc/ppt/cerc_2014edition_Copy.pdf https://emergency.cdc.gov/cerc/manual/index.asp

[2] https://emergency.cdc.gov/cerc/index.asp


👤 yodon
If you're operating at a scale or in a domain where crisis-like issues are expected (which is probably true if you're asking a question like this), The Checklist Manifesto[0] is a great read.

[0] https://www.amazon.com/Checklist-Manifesto-How-Things-Right/...


👤 rockmeamedee
The term "crisis response" will get you info on PR crises and brings to mind the TV show Scandal.

The term you want for our field is "Incident Response", and the practice of 1) preventing them, 2) handling them, and 3) learning from them is Resilience Engineering. It's about investigating airplane crashes, nuclear meltdowns, errors during surgery, etc, and learning how humans keep complex systems running.

I recommend "Behind Human Error" by David Woods as a great starter there. A key insight of this field is that incidents aren't just "some idiot didn't follow the safety checklist", but often the safety checklist itself will cause the issue; at some level the errors happen because of complicated interactions between the system and even the safety mechanisms.

An interesting tech industry related document is the STELLA report [1] from a few tech companies comparing notes on incidents.

[1] https://snafucatchers.github.io/


👤 throwmeaway_66
I don't see it mentioned in the comments so far, but one very enlightening book on crisis and crisis response is "Thinking Through Crisis", by Amy L. Fraher. Recommended.

👤 bwh2
Drift Into Failure. Great book telling stories like airplane crashes and the Challenger disaster. I posted some notes on my website: https://www.briansnotes.io/book/drift-into-failure/

👤 myth_drannon
Volokolamsk Highway by Alexander Bek https://www.goodreads.com/book/show/1769643.Volokolamsk_High... It's an old book, so your best chance is an ebook. http://ciml.250x.com/archive/literature/english/alexander_be...

It's about leadership during crises and it's based on real events. It tells the story of a small battalion stopping the German army en route to Moscow in 1941. At some point it was required reading in some military schools (like in Israel for example, maybe even now).


👤 jkingsbery
This week's EconTalk talked about the general problem of responding to crises, and I found the guest's (and host's) take interesting (https://www.econtalk.org/megan-mcardle-on-catastrophes-and-t...). They talked about how often we respond to infrequent crises by trying to prevent them from happening again, but then that investment goes to waste because that crisis doesn't happen again (or doesn't happen for a long time, after which the investment has depreciated). They both advocated for focusing on being more responsive to crises, since adaptability can help more generally across many types of crises.

👤 kevin_nisbet
Specific to books, "Failure Is Not an Option: Mission Control From Mercury to Apollo 13" was pretty good from Gene Kranz and as I recall covered many of the incidents in the US space program. I don't think I ever found a copy of the Chris Kraft one but heard it's pretty good as well.

And as others have suggested, reading flight accident reports, or watching the videos made from them, tends to be valuable.

Also, I think NASA published a bunch of research on human factors at one point, but it's been a long time since I've looked it up.

And last, specific to our industry, the SRE Books have a couple chapters on incident response: https://sre.google/books/


👤 abee
Never waste a good crisis when it comes to politics → read this book [The Shock Doctrine by Naomi Klein](https://en.wikipedia.org/wiki/The_Shock_Doctrine)

👤 sjg007
Is the best idea to plan for a crisis before it happens? I guess it would be a process that you engage in, because you won't necessarily be able to predict the crisis in advance; otherwise you'd mitigate it.

👤 jabroni_salad
PagerDuty has some very practical guidance on their website: https://response.pagerduty.com/

👤 thaumaturgy
Several countries and lots of agencies use something called the Incident Command System (ICS). They use it to coordinate multi-agency response to things like explosions, toxic spills, massive wildfires, large-scale searches for missing persons, and more, all the way down to small and localized incidents.

ICS courses are free to anyone who wants them. You can get started with ICS-700 here: https://training.fema.gov/is/courseoverview.aspx?code=IS-700...

There are a few basic principles of ICS that should be useful to company incident response:

1. ICS defines specific roles and their responsibilities. In ICS, there is Planning, Logistics, Operations, Management and/or Coordinator, and Finance, among others. Each of these roles is defined ahead of time, and disaster response teams practice these roles regularly. Each role has defined ways of handing the role off to another person during a shift change and often includes specific forms that need to be filled out. This data collection is integral to being able to review the incident while it's happening, as well as after the fact for improving training.

2. ICS is scalable. For a very small incident, one person may be responsible for all roles. For a very large incident, response may be further subdivided into branches and divisions. This flexibility is an extremely important part of ICS, and it only works because everyone understands the different roles involved.

3. Under ICS, everyone has exactly one boss, supervisor, etc. that they report to. Any of you who have had to try to go spelunking through logs while multiple suits keep contacting you for updates already understand how important this is. This structure also helps to minimize miscommunication during an incident.

4. In the planning section specifically, there's a process called the "Planning P" that describes a lifecycle of information gathering, decision-making, and communication. It's pretty straightforward and it resolves a lot of common issues in incident response. This is covered in ICS-201: https://training.fema.gov/is/courseoverview.aspx?code=is-201
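For anyone who thinks better in code, here is a rough Python sketch of principles 1 to 3 above: named roles, one person covering everything on a small incident, logged handoffs, and exactly one supervisor per person. Only the role names come from the list; the `Incident` class and its `assign`, `set_supervisor`, and `unfilled` methods are hypothetical names made up for illustration, not anything defined by ICS or FEMA.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Role names as listed above; ICS defines more, this is only a subset.
ROLES = ("Management", "Operations", "Planning", "Logistics", "Finance")


@dataclass
class Incident:
    """Hypothetical model of an incident's role assignments."""
    name: str
    holders: Dict[str, str] = field(default_factory=dict)     # role -> person
    reports_to: Dict[str, str] = field(default_factory=dict)  # person -> supervisor
    log: List[str] = field(default_factory=list)              # handoff records

    def assign(self, role: str, person: str) -> None:
        """Give a role to a person, recording a handoff if it changes hands."""
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        previous = self.holders.get(role)
        self.holders[role] = person
        if previous is not None and previous != person:
            self.log.append(f"{role}: {previous} -> {person}")

    def set_supervisor(self, person: str, supervisor: str) -> None:
        """Enforce the one-boss rule: a second, different supervisor is an error."""
        existing = self.reports_to.get(person)
        if existing is not None and existing != supervisor:
            raise ValueError(f"{person} already reports to {existing}")
        self.reports_to[person] = supervisor

    def unfilled(self) -> List[str]:
        """Roles nobody is currently responsible for."""
        return [role for role in ROLES if role not in self.holders]


# Small incident: one person holds every role.
incident = Incident("db outage")
for role in ROLES:
    incident.assign(role, "alice")

# The incident grows: Operations is handed off, and bob gets exactly one boss.
incident.assign("Operations", "bob")
incident.set_supervisor("bob", "alice")
```

The point of keeping `holders` and `reports_to` separate is that scaling up only changes who holds which role; the set of roles and the single reporting line stay the same at every size.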

Companies developing their own incident response strategies will want to customize forms, data collection, roles, etc., but the basic principles of ICS are an effective framework that should be adaptable to a wide variety of situations. Most companies on their worst day aren't dealing with an actual or potential loss of life; experienced ICS people can sleep-walk through a company's worst incident.


👤 max_
Catastrophe bonds are, in my opinion, the most practical way to manage effective crisis response.

I wrote about pandemic bonds last year [0].

[0]: https://as1ndu.xyz/2020/02/fighting-of-disease-pandemics-wit...


👤 tjalfi
Total Loss[0] has 45 stories of yachting disasters; the lesson I took away is to carry a knife when you're on a boat.

[0] https://www.amazon.com/Total-Loss-Collection-First-hand-Acco...


👤 villasv
Jared Diamond's Upheaval is a bit off-topic I think but I still recommend it. It starts by talking about personal crises and extrapolates to national crises, which are both extremes compared to company or team crises.

Even if tangentially related, it's an interesting read if only for the historical content.


👤 sinac
A great how-to resource is Dealing With an Angry Public by Field and Susskind. https://www.amazon.com/Dealing-Angry-Public-Approach-Resolvi...

👤 jenkstom
In the US all first responders are required to go through various levels of incident management. You can get free training on NIMS and ICS online. It works, and it works very well.

👤 mronge
Here's a list of some of my favorite books dealing with crisis:

- Into Thin Air - About a mountaineering expedition that turned into disaster on Mount Everest.

- Black Hawk Down - The story of hundreds of US special forces trapped in Mogadishu overnight after a mission went completely sideways

- Leadership in Turbulent Times - About different US presidents leading through crisis and how there is no one singular type of leadership

- The Hard Thing about Hard Things - Leading a startup on the verge of failure to an eventual massive acquisition

- The Sledge Patrol - How a small group, outgunned and outmanned, fought back Nazi invaders in Greenland


👤 WmyEE0UsWAwC2i
Nassim Taleb, Antifragile