Does your team use them? If not, why not?
(By runbook I mean a detailed guide on how to perform a process, normally in response to an incident/alert)
We started when someone who didn't write a thing could be paged for that thing. We also strove to have all alerts related to our team's services routed to our team, and we rotated on-call weekly.
Standard process was for the on-call engineer to update runbooks as needed. Each runbook had key debugging info, a list of alerts with workarounds or suggestions for digging deeper, links to dashboards, stuff like that. Super useful. A runbook missing info would come up in stand-up or retro. We really relied on good runbooks for things we had yet to automate into self-healing systems.
New place, we have some basic runbooks in the wiki but we have room to grow in this area. Coincidentally, the new company is about the same size as the old one was when we started to invest in runbooks.
We use runbooks written in Notion to give step-by-step guides for when things go wrong (which they do, a lot). We're getting better about making sure every alert has a runbook linked, and improving them over time.
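A rough sketch of what an "every alert has a runbook linked" check could look like. This assumes alerts are exported as plain dicts with a hypothetical `runbook_url` field; the field name, the alert names, and the URL are made up for illustration and don't come from any particular alerting tool.

```python
from typing import Dict, Iterable, List


def alerts_missing_runbook(alerts: Iterable[Dict[str, str]]) -> List[str]:
    """Return the names of alerts that have no runbook link.

    Assumes each alert is a dict with a "name" key and an optional,
    hypothetical "runbook_url" key. A missing, empty, or
    whitespace-only URL counts as "no runbook".
    """
    missing = []
    for alert in alerts:
        url = (alert.get("runbook_url") or "").strip()
        if not url:
            missing.append(alert.get("name", "<unnamed>"))
    return missing


# Hypothetical alert definitions, for illustration only.
alerts = [
    {"name": "db-replication-lag",
     "runbook_url": "https://wiki.example.com/runbooks/db-replication-lag"},
    {"name": "kafka-consumer-stalled", "runbook_url": ""},
    {"name": "disk-almost-full"},
]

for name in alerts_missing_runbook(alerts):
    print(f"alert without runbook: {name}")
```

Something like this could run in CI against the alert definitions, so a new alert can't ship without a runbook link.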
What I really want is to use something like DeepNote to have runbooks that automatically gather context from databases and logs.
For any incident, we can open up the runbook to get quick access to Kibana queries, Grafana dashboards, etc. that make our process easier while under pressure.
My team doesn't have any services in use yet, though, so I haven't used them myself. That's changing soon, though.
Most are in a wiki with a few corner cases using GDocs or Dropbox Paper.
(reboot that db, or unstick that Kafka, etc.)