RunBooksRun

Why do organizations fail at runbooks?

Following the cascade of failure on the HN thread, the two big problems seem to be priorities and toolchains. The two are interrelated: the costlier a toolchain is to use, the harder it is to prioritize documentation.

Toolchains

For example, someone who uses GUIs all day has a problem because they have to edit their 8-hour work session down into a series of screen captures or a mini-film. Similarly, people who actually type into a command-line shell are disadvantaged because they have to edit their history file down into something legible.

Using an editor to drive a shell makes runbooks easier because it supports both duck-talking and documenting.

Duck-talking for literate administration means writing down a question, then trying to answer it by writing a series of shell commands that are sent to the shell, with the results pulled back into the document. For example:

\sec Upgrade Plan
Size of current prod?

\shell
ssh db1-prd.example.com df -h  | grep dev.vd
/dev/vda                       10G  8.4G  1.6G  84% /
/dev/vdb                      200G   98G  103G  49% /opt/example/db1

ssh db1-prd.example.com free -m
              total        used        free      shared  buff/cache   available
Mem:          16024        2353         163           9       13507       13330
Swap:          1023           0        1023
^Z
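
The round trip above (send a command to the shell, pull the result back into the document) can be approximated outside an editor. What follows is a minimal sketch, not the editor integration used for the example: the function name `run_into_doc` and the file name `plan.txt` are hypothetical, and the command is assumed to be passed as a single string so that pipes survive.

```shell
# run_into_doc DOC CMD: append CMD and its output to the document DOC.
# Hypothetical helper for illustration; the name and calling convention
# are assumptions, not part of any existing tool.
run_into_doc() {
    doc=$1
    cmd=$2
    {
        # Record the command itself, so the document shows what was asked.
        printf '%s\n' "$cmd"
        # Run it through sh so pipes and redirections in CMD work;
        # stderr is folded in so failures are documented too.
        sh -c "$cmd" 2>&1
        # A blank line separates this exchange from the next one.
        printf '\n'
    } >> "$doc"
}
```

A call such as `run_into_doc plan.txt 'ssh db1-prd.example.com free -m'` would append the command line followed by whatever the host prints, which is roughly the shape of the transcript above.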

Documenting, meanwhile, works by helping to organize problem-solving, i.e. start with a Polya-esque process where free-form writing reigns initially, then gives way to explorations that are recorded just as the duck-talking was in the above example.

Priorities

People have to live in their documentation. Once people write natively for posterity, document re-use (i.e. a runbook) becomes part of the Polya process's literature review, i.e. once a pattern of use is recognized, it can be extracted, named, and improved. For example:

\sec Prior art
:r ! git grep -l -e db1 -e postgresql
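
The same literature review can be sketched as a plain shell function for notes that are not kept in a git repository. This is an assumption-laden illustration: `prior_art` is a hypothetical name, and it swaps `git grep` for `grep -rl` (supported by GNU and BSD grep) so that it works on any directory.

```shell
# prior_art DIR TERM...: list files under DIR that mention any TERM.
# Hypothetical helper; uses grep -rl instead of git grep so it also
# works on directories that are not git repositories.
prior_art() {
    dir=$1
    shift
    # Rewrite the argument list from "a b" into "-e a -e b" so that
    # a file matching any one of the terms is listed.
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" -e "$1"
        shift
        n=$((n - 1))
    done
    # -r: recurse into DIR; -l: print file names only.
    grep -rl "$@" "$dir" | sort
}
```

From inside the editor, something like `:r ! prior_art ~/notes db1 postgresql` would then pull the list of candidate files into the document, assuming the function is available to the shell the editor spawns and that `~/notes` is where the documents live.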

The pain is that many popular tools are not documentation-friendly. GUIs, IDEs, and shells like fish all prioritize doing over thinking, with the end result that many people end up anti-documentation.

Really?

As is evident from the HN thread above, many organizations are pre-historic in that only artifacts exist. This is important enough to bear restating pointedly:

If all you have are artifacts, then you are literally pre-historic.