Human factors can become the uninvited guest
You might be forgiven for assuming that it would be impossible to gain entry to a U.S. State dinner hosted by the POTUS, where the PM of India was the honoured guest. Surely the intense security, the multiple layers of protection, the X-ray screening and the continual scrutiny would make it so; you wouldn’t even try, because it could all go terribly wrong. Yet what you see in the picture above is Mr and Mrs Salahi being introduced in person to both President Obama and the Indian PM Manmohan Singh. Neither had an invitation; they were not on the guest list, nor had they entered by any other official means. The system failed. Fortunately, the Salahis had no dark intent; they were simply trying to raise their “reality TV” profile.
On November 24, 2009, Michaele and Tareq Salahi, a married couple from Virginia, attended a White House state dinner in honour of Indian Prime Minister Manmohan Singh as uninvited guests. They passed through two security checkpoints, one of which required positive photo identification, entered the White House complex and met President Barack Obama. The incident resulted in security investigations and legal inquiries. Robin Givhan of the Washington Post surmised that the Salahis were allowed to enter because they “looked the part” and, in her words, stepped through a “cultural blind spot.” The Washington Post also quoted an anonymous official as saying that “The Salahis were allowed inside in violation of agency policies by an officer outside the front gate who apparently was persuaded by the couple’s manner and insistence as well as the pressure of keeping lines moving on a rainy evening.”
Put simply, the protective measures in place relied upon human interpretation of threat. Because the couple looked the part, and because it was raining and the line had to be moved into the event quickly, those measures failed.
This is a perfect example of a “Swiss cheese” event, where multiple points of weakness align and allow a failure to occur.
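The “Swiss cheese” idea can be put in rough numbers: each independent defensive layer has some small chance of failing, and a loss only occurs when the holes in every layer line up at once. A minimal sketch, using purely illustrative probabilities (not real security figures):

```python
from math import prod

# Hypothetical per-layer failure probabilities for three independent
# defensive layers (illustrative numbers only, not real data).
layer_failure_probs = [0.05, 0.02, 0.10]

# A loss requires every layer's "hole" to line up at once, so,
# assuming the layers fail independently, the probabilities multiply.
p_all_fail = prod(layer_failure_probs)

print(f"Probability all layers fail together: {p_all_fail:.5f}")
```

Each layer is individually weak, yet the stack is far stronger, provided the layers really are independent. Human-factor pressures (rain, queues, “looking the part”) can correlate the holes and defeat that multiplication.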
In many unplanned machinery failures the cause is often not a lack of system capability but variability in the human activity that forms part of the system.
Excluding any malicious intent or personal failings, people simply do not operate like machines. They have good and bad days; they do the same job differently in the morning than in the afternoon, before a break, or before a shift change when their minds are slightly distracted, and often when they are doing jobs for which they are not only highly skilled but have many hours of experience. They take short cuts, often ones they have taken many times before, or unconsciously perform jobs in a sub-optimal way, and when they are the last system element standing between normal operation and failure, they can become the reason it occurs.
Just as with the uninvited guests, all the systems were in place to mitigate the known risks; they were simply not followed at the human level, allowing the failure to occur.
In industry we have regulatory controls, such as national HSE regimes or International Maritime regulations and classification, to force a standard of care, but we still get failures. Thankfully most are not significant, but occasionally, as with Deepwater Horizon, Piper Alpha, the Exxon Valdez or Challenger, people did not do what was necessary to avoid an avoidable failure. In some cases this was due to pressure to meet business objectives; in others it was simply human failing.
My point here is that where we build human behaviour into a system, we should recognise that there are multiple opportunities for it to go wrong. If we take the view that “if it can go wrong, it will go wrong”, we can identify the risks and then offer appropriate mitigations. Also, when we get an unexpected machinery failure, we should perform a meaningful Root Cause Analysis, assess the risks of it happening again, and then build in the assurances needed to eliminate the risk of recurrence or reduce it to acceptable limits.
Finally, as we move toward AI and the use of data processing to highlight anomalies, we must remember that we need to know not only the symptoms that preceded a failure event but also what we did to fix the situation and bring the system back under control. It may be useful to map 1,000 instances where a motor bearing failed so we can predict when the next one will fail, but if there were 1,000 different responses to those failures, we need to work out which solution is right in each instance. It’s not enough to say that a system is exhibiting familiar symptoms that we expect to lead to a machinery failure; what are we going to do about it, and will we do it right?
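The point about capturing the response, not just the failure signature, can be sketched in a few lines. This is a hedged illustration only: the log entries, symptom names and fixes below are all invented for the example, not drawn from any real maintenance system.

```python
from collections import Counter

# Hypothetical maintenance log: each record pairs the symptom pattern
# observed before a bearing failure with the corrective action that
# actually restored the system (all names are illustrative).
records = [
    ("rising vibration", "replace bearing"),
    ("rising vibration", "re-grease bearing"),
    ("rising vibration", "replace bearing"),
    ("overheating", "realign shaft"),
]

def recommended_action(symptom):
    """Return the historically most common fix for a symptom,
    i.e. learn the response as well as the failure signature."""
    fixes = Counter(action for s, action in records if s == symptom)
    if not fixes:
        return None  # no history for this symptom
    return fixes.most_common(1)[0][0]

print(recommended_action("rising vibration"))  # prints: replace bearing
```

Even this toy version makes the gap visible: predicting “rising vibration leads to failure” is only half the job; the record of what was done about it, and which actions actually worked, is what turns an anomaly alert into a usable decision.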