I like this story:

To: nanog@merit.edu
Subject: Re: Akamai DNS Issue?
From: "Laurence F. Sheldon, Jr." 
Date: Wed, 16 Jun 2004 09:37:42 -0500
In-Reply-To: <052501c453ac$2be69ab0$166df640@amplex.net>
References: <5AFA5A2C102DAB4692ABC1E87E0780CA085EBA55@OCCLUST02EVS1.ugd.att.com> 

Mark Radabaugh wrote:

> But you don't say how to avoid failures caused by massive confusion when
> maintaining an excessively complicated system....

I don't have much to offer for the "excessively complicated" case
(which I think the instant case is an example of), but there are
cases as complex and complicated with some justification in my history.

For those, the best solutions involved concepts like "canned, tested,
documented procedures", "quality control", "change management" (which
included "staging", "testing and verification", and so on).

We were not fond, in the "production" and "system test" environments, of 
people who made ad hoc changes of any kind.

Many years ago, I hand-carried a patch through the approvals process.
The group leader reviewed the purpose, urgency, test methods, and test
results, and signed the sheet.  The district manager looked it over and
asked "what are the chances that this patch could fail?"  I flippantly replied
"One in a million!".

He handed the documents back unsigned with the words "Seven times
in the Metro (Los Angeles, California) office tonight."