For all the cybersecurity planning and mitigation that takes place, we keep experiencing incidents formed of a collection of coincidental mistakes that collide to cause something unique: a Swiss cheese security incident.

When the Boeing 767 was developed, the designers never intended it to be used as a glider. But in July 1983 that’s precisely what happened. Air Canada Flight 143, a brand new 767 being flown by talented and experienced pilots from Montreal to Edmonton, ran out of fuel well short of its destination and became the subject of one of the world’s most celebrated air accidents.

The Gimli Glider, as the incident was nicknamed, was a shining example of a “Swiss Cheese” incident. That is, it was subject to a bewildering collection of coincidental mistakes which, individually, would have been largely benign, but which aligned collectively to cause an incident that remains unique. The fromage analogy is, of course, that when slices are stacked one on top of the other, the holes will seldom align all the way through the stack – but the chance of them lining up, although tiny, is non-zero.

Getting Ready to Glide

The crew of Flight 143 were super-unlucky. Some of the electronics in the flight deck were acting up, and a combination of two engineers misunderstanding the fault and each other ultimately rendered the fuel gauges inoperable. The refueling team did its sums incorrectly and, by confusing pounds with kilograms, loaded less than half the fuel that was required. Amazingly, the captain then used the wrong figures when working out the fuel on board from a dipstick measurement (which was required due to the fuel gauges being broken) and so didn’t notice that they only had 45% of the fuel they needed. And flying without engines had not been covered in flight crew training because it was such an unlikely occurrence.

The good news is that the crew had one bit of good luck that day: the captain was an experienced glider pilot and was able to pull off an insanely improbable glide to a defunct runway at a military airfield where the first officer had spent time when in the Forces. Nobody died, and they were even able to bolt the aircraft back together (the nose wheel broke on landing) and put it back into use.

Swiss Cheese Cybersecurity
Now, let us take a step back and look at our cybersecurity defenses. We’re used to the concept of “defense in depth” – that is, working on the premise that by having layers of security, an attack that penetrates one or more layers should be blocked by some of the remaining ones. But are we leaving ourselves open to a “Swiss cheese” incident? Are we using common hardware or software in multiple layers, so that a vulnerability in one of them also exists in one or more of the others? Are our change control procedures sufficient to catch errors? When engineers peer-review the technical aspects of a change, are they doing so diligently, or are they falling into the trap of “seeing” what they expect to see and overlooking mistakes?
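The arithmetic behind that worry is worth making concrete. If defensive layers fail independently, their miss probabilities multiply; a component shared between two layers collapses them into one, so a single vulnerability punches through both. A minimal sketch (the probabilities here are invented purely for illustration):

```python
# Sketch: how a shared component undermines defense in depth.
# The failure probability below is made up for demonstration only.

p = 0.05  # assumed chance that any single layer misses an attack

# Four genuinely independent layers: all four must fail for a breach.
independent = p ** 4

# Same four layers, but two of them share the same software stack,
# so one vulnerability defeats both at once. Effectively only three
# independent chances to stop the attack remain.
shared_component = p ** 3

print(f"independent layers:      {independent:.8f}")
print(f"with a shared component: {shared_component:.8f}")
print(f"risk multiplier:         {shared_component / independent:.0f}x")
```

With these illustrative numbers, sharing one component across two layers makes a clean breach twenty times more likely – the holes in those two slices of cheese are guaranteed to line up.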

But, most importantly, are we making it as difficult as possible to be wrong?

If we wind many incidents – both inside and outside the cyber industry – back to the underlying cause, the problem is one of human factors. They used to call it “human error” back in the day, but toward the end of the 20th century it was recognized that this was an unfair term, because when something is confusing or difficult it’s no surprise that some people will get it wrong.

What Creates Swiss Cheese?
The ultimate reason behind the Air Canada issue was that the brand new 767 used metric units for fueling, while the airline’s existing fleet worked in pounds. When the captain checked the fuel figures he used a legacy conversion table, which referenced pounds instead of kilograms, but didn’t notice his mistake. The investigation resulted in improvements to procedures and safety measures, but the key observation was that having some aircraft using imperial measures and others using metric was a recipe for disaster – hence the advice was for Air Canada to switch all its imperial measures to metric.
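The arithmetic of the mistake is easy to reconstruct. Fuel was measured by volume with a dipstick and converted to mass, and using a pounds-per-litre factor where a kilograms-per-litre factor was needed inflates the apparent fuel load by the pound-to-kilogram ratio. A sketch (the density figures are approximate values, as commonly reported for this incident):

```python
# Sketch of the Flight 143 conversion error.
# Density figures are approximate, as commonly reported.
LB_PER_LITRE = 1.77   # legacy conversion factor (pounds per litre)
KG_PER_LITRE = 0.80   # the factor the metric 767 actually needed

litres = 1.0  # work per litre; the scale cancels out of the ratio

believed_kg = litres * LB_PER_LITRE  # figure written down as "kg"
actual_kg = litres * KG_PER_LITRE    # mass really in the tanks

ratio = actual_kg / believed_kg
print(f"actual fuel as a fraction of believed fuel: {ratio:.0%}")
```

The ratio works out to roughly 45% – the same shortfall the crew’s cross-check failed to catch, because the check reused the very conversion table that caused the error.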

So, we should take a leaf from the investigators’ book and ask ourselves: as well as having decent procedures and robust checking, do we have opportunities to reduce security failures by making mistakes harder to make?
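One well-known way to make this particular class of mistake harder is to encode units in types, so that a pounds figure simply cannot be used where kilograms are expected. A minimal sketch of the idea – the type names are hypothetical, not from any real library, and the 22,300 figure is the fuel requirement commonly reported for the flight:

```python
# Sketch: making the unit mix-up impossible to express.
# Hypothetical unit types for illustration; not a real library.
from dataclasses import dataclass

@dataclass(frozen=True)
class Kilograms:
    value: float

@dataclass(frozen=True)
class Pounds:
    value: float

    def to_kilograms(self) -> Kilograms:
        return Kilograms(self.value * 0.45359237)

def fuel_to_load(required: Kilograms, on_board: Kilograms) -> Kilograms:
    # Accepts only Kilograms; a Pounds value must be converted explicitly.
    return Kilograms(required.value - on_board.value)

# 22,300 kg was required; the tanks actually held 22,300 lb.
needed = fuel_to_load(Kilograms(22_300), Pounds(22_300).to_kilograms())
print(f"{needed.value:.0f} kg still to load")
```

Under a type checker, passing `Pounds(22_300)` directly to `fuel_to_load` is flagged before anything is loaded – the mistake is caught at the desk, not at 41,000 feet.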