Broken as Designed: "Abort, Retry, Ignore?" -- and Die

All engineering failures -- and disasters -- have a critical human element. This observation applies across the spectrum: the Challenger shuttle explosion, the BP Macondo well blowout, the air crash that wiped out much of Poland's leadership.

That element: someone in authority overrode the safety checks and ignored the advice of those who knew the risks. Launch directors pushed the launch schedule, BP execs told the drilling engineers to proceed, an air force general told the pilot to land, all despite warnings from rocket engineers, the rig lead, and the pilot and ground control.

Why does this happen? Because we train people that it's okay to go ahead anyway, and we offer them an option to continue despite a system interlock or warning.

On a microscale, consider system security measures for privacy and identity theft prevention. How often have users ignored warnings like the one at right:

This warning appears when Outlook 2007 connects to an email server over an encrypted channel. The purpose of the secure connection is to prevent a bad guy from stealing email identity (login/password) and ensure mail privacy. But there's no clue about that in this warning, and the temptation is to blithely click through in order to read mail.

Here's another example. In this case, the browser fails to validate Register.com's security certificate. Multiple reasons could lead to this warning: the browser doesn't know about Register.com's certificate authority or the certificate is self-signed.

In both cases, it doesn't matter. The user is given the option to "go ahead anyhow," and since most people deem reading email urgent but low-risk, they click through to do the task that's uppermost in mind.

Similar messages appear for expired certificates. Certificates must be renewed periodically, and many businesses, particularly smaller ones, forget this administrative chore. After all, the customers can still get in okay, right?
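The renewal chore is easy to automate. Here is a minimal sketch in Python, assuming the certificate's expiry date is available in the text form that Python's `ssl` module reports (for example, "Jun 20 12:00:00 2011 GMT"). The function names and the 30-day warning window are illustrative choices, not anything prescribed by a standard.

```python
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now=None):
    """Return whole days left before a certificate's notAfter date.

    `not_after` is a string such as "Jun 20 12:00:00 2011 GMT".
    A negative result means the certificate has already expired.
    `now` (seconds since the epoch) defaults to the current time.
    """
    if now is None:
        now = time.time()
    expires_at = ssl.cert_time_to_seconds(not_after)
    return int((expires_at - now) // SECONDS_PER_DAY)


def needs_renewal(not_after, warn_days=30, now=None):
    """True when the certificate expires within `warn_days` (hypothetical default)."""
    return days_until_expiry(not_after, now) <= warn_days
```

Run from a daily scheduled job against each certificate a business owns, a check like this turns a forgotten chore into a reminder that arrives weeks before customers see a warning.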

It makes no difference if the URL location bar turns "green" or the little security lock icon appears locked when security is validated. The user has already clicked through regardless. The security interlock has been ignored.

Once habituated to clicking through for email, ignoring warnings for banking, e-commerce, healthcare, government, and other security-required applications becomes second nature. The "go-ahead, make my day" feature has trained users that it's okay and that nothing bad will happen.

Until something does.

The Real Problem

Several things are at work. First, the total system (browser, application, website, certificate authority) offers a way to go ahead instead of enforcing the lockout and requiring the user to go to lengths to verify the security of the situation. Second, the reasons for the warning are obscure because the design assumes the user understands how certificates work. The onus is on the user to evaluate the risk without complete information or an understanding of the underlying causes. Last, the user is likely under time pressure or some other constraint to make a decision quickly and "just get on with it".

In short, the overall design permits a dangerous action by someone who is uninformed about the root problem and who lacks the training and patience needed to understand the matter and judge risks.

The built-in assumption is that value and convenience override all. The lesson drawn and reinforced: it's okay to ignore warnings because likely nothing will happen. The result: a security blowout.

Fixing the Design

This one is simple: make the security interlock positive and hard. Never make it easy for someone to bypass. Will it inconvenience people? Yes. Will they take the time to fix it by calling and complaining? Maybe. Will they leave the application or website, or abandon their task? Likely, but that puts the onus on certificate owners and application and website managers to keep their stuff up to date and working.
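As a sketch of what a "positive and hard" interlock looks like in client code, the Python below opens a connection only when the certificate validates. The function names `strict_context` and `open_verified` are made up for illustration; the point is that the API deliberately has no "ignore" parameter for a caller to flip.

```python
import socket
import ssl


def strict_context():
    """Build a TLS context that always verifies the peer certificate."""
    ctx = ssl.create_default_context()
    # These are already the defaults; setting them makes the intent explicit.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def open_verified(host, port=443, timeout=10.0):
    """Return a TLS socket to `host`, or raise if validation fails.

    There is intentionally no flag to skip verification: a failed
    check ends the connection attempt instead of prompting the user.
    """
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        return strict_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

A failed check surfaces as an exception (in Python, `ssl.SSLCertVerificationError`), which the application can turn into guidance for the user rather than a click-through button.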

Make applications and sites work securely from any URL in the domain. Certificates are inexpensive and serve to advertise and secure the brand. There's no excuse for Amazon.com (for example) to cheap-out on a cert for the domain amazon.com. Even if this domain isn't the final target (www.amazon.com), buy a cert to avoid seeing this screen, then bounce the customer from the former to the latter.
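The "bounce" itself is a few lines of server logic. Here is a minimal sketch, using a hypothetical `example.com` mapping rather than any real site's configuration. Note that the redirect only helps if the bare domain presents a valid certificate of its own: the secure handshake happens before the server gets a chance to send the redirect.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping from non-canonical hosts to the canonical one.
CANONICAL_HOSTS = {"example.com": "www.example.com"}


def canonical_redirect(url):
    """Return the HTTPS URL to redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    target = CANONICAL_HOSTS.get(host)
    if target is None:
        return None
    # Preserve path, query, and fragment so the customer lands on the same page.
    return urlunsplit(("https", target, parts.path or "/", parts.query, parts.fragment))
```

For instance, `canonical_redirect("https://example.com/cart?id=7")` yields `https://www.example.com/cart?id=7`, while a URL already on the canonical host yields `None`.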

Applications must offer an alternate path in the event of a lock-out. The alternate path must inform the user what to do and whom to contact to correct the situation. The information must be meaningful, helpful, and lead to a positive resolution of the problem, e.g., call this toll-free number to talk to our security division of customer service. The goal is to correct the defect, not override it.

Last, inform the user about what's going on, what the risks are, and the "what to do next" alternate path. The generic security warning screen in Firefox 3.6 is much better than Internet Explorer 8 (above). It cannot solve the problem with Amazon's website, but at least it explains why the problem occurred and offers a reasonable workaround, in this case, the URL of the correct site.
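To make "inform the user" concrete, here is a rough sketch that turns a certificate verification failure into a plain-language explanation plus a next step. The numeric keys are OpenSSL's X.509 verification error codes; the wording of each message and the `HELP_LINE` text are invented for illustration and would be replaced by an application's real support contact.

```python
# OpenSSL X.509 verification error codes mapped to plain-language reasons.
EXPLANATIONS = {
    10: "the site's security certificate has expired.",
    18: "the site vouches for itself (self-signed certificate), so its identity cannot be confirmed.",
    19: "the certificate chain contains a self-signed certificate that is not trusted.",
    20: "the certificate was issued by an authority this computer does not recognize.",
    62: "the certificate belongs to a different site name than the one requested.",
}

# Hypothetical alternate path; a real application would give a phone number or URL.
HELP_LINE = "Your login and mail could be exposed. Do not continue; report this message to the site's support desk."


def explain(verify_code, host):
    """Return a user-facing message for a failed certificate check."""
    reason = EXPLANATIONS.get(verify_code, "the site's identity could not be verified.")
    return f"Connection to {host} was blocked because {reason} {HELP_LINE}"
```

In Python 3.7 and later, the code to pass in is available as the `verify_code` attribute of `ssl.SSLCertVerificationError`. The message states the cause, the risk, and what to do next, and it offers no button to proceed.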

References

The first two references discuss the impact of and cite the same Carnegie Mellon University study, "Crying Wolf: An Empirical Study of SSL Warning Effectiveness".

Addendum

28 June 2010 - Sean Kerner at eSecurity Planet reports a Qualys study suggesting that of 92 million active domains, 23 million were running SSL. Of those, 22 million had invalid certificates. See "SSL Certificates in Use Today Aren't All Valid".

Sunday, June 20, 2010