Every few weeks, we see a story in the press that highlights the incongruity of saving money by skimping on safety – incongruous because the eventual cost, not just in money but in human misery, is often so high.
Much less frequently, but still too often, there are real horror stories: bodies piled up behind locked emergency exits in burned-out public buildings. “Well, we didn’t want people sneaking in without paying, and we’d never had a fire before.” It’s always “never again”, but then memories fade and history repeats itself.
As providers of services in the area of safety-critical systems, the stories that concern us directly never have such an obvious focal point as a locked emergency exit. What we routinely encounter from our (potential) clients is the question: is it really necessary for these tests to be done independently? Wouldn’t it be cheaper and quicker if we did them ourselves (or not at all)?
Independence is the main buffer against management pressures that conflict with safety considerations. One of the best-known illustrations of the effects of those pressures dates all the way back to the Challenger disaster in 1986, where a weakness that had been known about since 1977 remained hidden, or was at least ignored, until eventually the inevitable happened. (For those who don’t remember: Morton Thiokol engineers realised that the O-rings in the shuttle’s solid rocket boosters could become dangerously brittle and prone to failure at low launch temperatures, but their warnings were suppressed by managers more concerned with maintaining the delivery programme and protecting the cashflow.)
The Challenger case, when it eventually came out, was pretty much as obvious as a locked emergency exit, but often the scenarios are far more subtle. The end results can be at least as tragic, though.
Closer to home, it has taken years for the background to emerge of one of the RAF’s worst-ever peacetime accidents. The crash of Chinook ZD576 on the Mull of Kintyre in June 1994 was officially blamed on the two pilots, after an inquiry that increasingly looks like a whitewash. The AAIB and the RAF Board of Inquiry were left unaware of a 1993 report by EDS concluding that the software of the Full Authority Digital Engine Control (FADEC) was dangerously flawed in both design and implementation. In fact, EDS had given up its evaluation after inspecting 2,897 of the total 16,254 lines of code, because it was finding so many errors and anomalies. The full murky story can be found here. It is obvious that basic principles of safety systems engineering were squeezed out and that, regardless of the cause of the 1994 crash, the Chinook was not airworthy when it went into RAF service.
Coming right up to date, the latest example of safety systems principles being relegated by (im)pure business considerations, at the cost of many lives, is the Toyota debacle. This story has still to play out, but it is already clear that a company that had previously put such emphasis on quality processes, doing things right and doing things better, fell into the trap of preferring all-out growth. This would have been bad enough in the days when throttles were controlled by a cable running between a pedal and a carburettor, but since “fly-by-wire” came to the automobile, software safety engineering has been in the mix. How many others besides Toyota have fallen into the trap of “penny wise, pound foolish” when it comes to taking the necessary steps to ensure that complex software-intensive systems are fit and safe for purpose?
The moral in all this is simple. If a thing can go wrong, it eventually will. Making sure that things can’t go wrong costs money, but it costs a lot less than the consequences when they do. Manufacturers shouldn’t gamble, or be allowed to gamble, with lives at stake. They will continue to do so until the complexities of modern systems, and in particular software-intensive systems, are more widely understood by those in authority, including legislators.