Why do organizations fail at runbooks?
Following the cascade of failure on the HN thread, the two big problems seem to be priorities and toolchains. The two are inter-related: the costlier a toolchain is to use, the harder it is to prioritize documentation.
For example, someone who uses GUIs all day has a problem because they have to edit their 8-hour work session down into a series of screen captures or a mini-film. Similarly, people who actually type into a command-line shell are disadvantaged because they have to edit their history file down into something legible.
Using an editor to drive a shell makes runbooks easier because it supports two habits that help: duck-talking and documenting.
Duck-talking for literate administration means writing down a question, then trying to answer it by writing a series of shell commands that are sent to the shell, with the results pulled back into the document. For example:
    \sec Upgrade Plan
    Size of current prod?
    \shell
    ssh db1-prd.example.com df -h | grep dev.vd
    /dev/vda         10G  8.4G  1.6G  84% /
    /dev/vdb        200G   98G  103G  49% /opt/example/db1
    ssh db1-prd.example.com free -m
                  total        used        free      shared  buff/cache   available
    Mem:          16024        2353         163           9       13507       13330
    Swap:          1023           0        1023
    ^Z
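The same loop (write the question, run the commands, keep the output beside them) can be approximated without editor support. What follows is a minimal sketch, assuming a plain-text notes file; the function names `ask` and `note` and the `RUNBOOK` variable are hypothetical and not part of any tool mentioned here.

```shell
# Hypothetical helpers for keeping questions, commands, and output together.
# RUNBOOK names the notes file; defaulting to ./runbook.txt is an assumption.

# ask: record the question being investigated.
ask() {
    printf '\n%s\n' "$*" >> "${RUNBOOK:-runbook.txt}"
}

# note: record a command line, run it, and append its output
# (stdout and stderr) to the notes file.
note() {
    printf '$ %s\n' "$*" >> "${RUNBOOK:-runbook.txt}"
    "$@" >> "${RUNBOOK:-runbook.txt}" 2>&1
}
```

With these defined, `ask 'Size of current prod?'` followed by `note ssh db1-prd.example.com free -m` would append the question, the command, and its output to the notes file. Driving the shell from the editor remains the lower-friction option, since the notes never leave the buffer being edited.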
Documenting, in turn, helps to organize problem-solving: start with a Polya-esque process where free-form writing reigns initially, then gives way to explorations that are recorded just as the duck-talking was in the above example.
People have to live in their documentation. Once people write natively for posterity, document re-use (i.e. a runbook) becomes part of the Polya process's literature review: once a pattern of use is recognized, it can be extracted, named, and improved. For example:
    \sec Prior art
    :r ! git grep -l -e db1 -e postgresql
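Once a search like this turns up the same commands in several write-ups, the pattern can be extracted and named. A minimal sketch, reusing the disk and memory checks from the upgrade plan example; the function name `capacity` is hypothetical, and it assumes the remote host provides `df` and `free`.

```shell
# Hypothetical named pattern: the disk and memory checks from the
# upgrade plan, extracted so later write-ups can call them by name.
capacity() {
    host=${1:?usage: capacity HOST}
    # Disk usage, limited to the /dev/vd* devices as in the original notes.
    ssh "$host" df -h | grep 'dev.vd'
    # Memory and swap, in MiB.
    ssh "$host" free -m
}
```

A later document can then record `capacity db1-prd.example.com` as a single step, and any improvement to the function benefits every runbook that uses it.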
The pain is that many popular tools are not documentation-friendly. GUIs, IDEs, and shells like fish all prioritize doing over thinking, with the end result that many people end up anti-documentation.
As evident from the HN thread above, many organizations are pre-historic in that only artifacts exist. This is important enough to bear restating pointedly:
If all you have are artifacts, then you are literally pre-historic.