I like this story:
To: nanog@merit.edu
Subject: Re: Akamai DNS Issue?
From: "Laurence F. Sheldon, Jr."
Date: Wed, 16 Jun 2004 09:37:42 -0500
In-Reply-To: <052501c453ac$2be69ab0$166df640@amplex.net>
References: <5AFA5A2C102DAB4692ABC1E87E0780CA085EBA55@OCCLUST02EVS1.ugd.att.com> <5BE2E262-BF9D-11D8-A0F8-000A95E7E6B4@isc.org> <052501c453ac$2be69ab0$166df640@amplex.net>

Mark Radabaugh wrote:

> But you don't say how to avoid failures caused by massive confusion when
> maintaining a excessively complicated system....

I don't have much to offer for the "excessively complicated" case (which I think the instant case is an example of), but there are cases as complex and complicated with some justification in my history.

For those, the best solutions involved concepts like "canned, tested, documented procedures", "quality control", "change management" (which included "staging", "testing and verification", and so on). We were not fond, in the "production" and "system test" environments, of people who made ad hoc changes of any kind.

Many years ago, I hand carried a patch through the approvals process, group leader reviewed the purpose, urgency, test methods, test results, and signed the sheet.

District manager looked it over and asked "what are the chances that this patch could fail?"

I flippantly replied "One in a million!".

He handed the documents back unsigned with the words "Seven times in the Metro (Los Angeles, California) office tonight.