Recent Southwest and Delta Outages Expose Huge Technology Risks

Just in the last month, two of the country’s largest airlines experienced massive technology outages that reverberated throughout their entire operations during the peak summer travel season.

The first outage occurred on July 20 when a computer meltdown led to the grounding of Southwest flights over a 3-day period. Although flights could land, the computer glitch prevented flights from leaving many of the cities Southwest serves, including most flights at its Chicago hub. In total, more than 2,000 flights were affected by the outage.

However, Southwest’s woes paled in comparison to Delta’s outage earlier this month, whose effects were felt worldwide.

At approximately 2:30 AM on Monday, August 8th, a power outage at the company’s Atlanta headquarters caused a computer failure that led to “large-scale” cancellations. According to a statement from Delta’s Chief Operating Officer Gil West, the power outage was caused by a malfunctioning power control module (i.e., a fuse) that triggered a power surge and, in turn, a loss of electricity. Although the electricity wasn’t out very long, “critical systems and network equipment didn’t switch over to backups,” according to the COO.

Regardless of what caused the outages, thousands of passengers were stranded at airports throughout the world. Due to the extent of the computer outage, the company was unable to provide lodging for all of its passengers – many were forced to sleep on the floor at airports from Atlanta to London and beyond.

Why did seemingly minor computer glitches have such dramatic effects on Southwest and Delta’s operations?

Since airlines (and many other organizations, for that matter) are increasingly reliant upon automation and computerized systems for a wide variety of things, their technology risks increase along with their dependencies. Properly identifying and assessing these risks, then developing a strategy to handle them, are critical steps to ensuring operations continue uninterrupted.

This is especially important for large airlines like Southwest and Delta…

In these cases specifically, neither company had a sufficient business continuity program to address potential issues. There were no “failover” data centers, that is, secondary technology sites synchronized with the main data center. If there had been, these backup systems could have kicked in the moment troubles began.
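To make the failover idea concrete, here is a minimal sketch of the decision such a setup has to make: if the primary data center is down, promote a standby site that is both running and closely synchronized. Everything in it (the `Site` fields, the `choose_active_site` function, the 5-second lag threshold) is a hypothetical illustration, not a description of either airline's actual systems.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """A data center as seen by a (hypothetical) failover controller."""
    name: str
    healthy: bool
    replication_lag_s: float  # seconds the site's data trails the primary


def choose_active_site(primary, standbys, max_lag_s=5.0):
    """Return the site that should serve traffic, or None if none qualifies."""
    if primary.healthy:
        return primary
    # Only standbys that are up and closely synchronized are safe to promote.
    candidates = [
        s for s in standbys
        if s.healthy and s.replication_lag_s <= max_lag_s
    ]
    if not candidates:
        return None
    # Prefer the standby whose data is closest to the primary's.
    return min(candidates, key=lambda s: s.replication_lag_s)
```

A real program would also need regular failover drills, since a backup that has never been switched to under realistic conditions may not switch over when it is needed.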

Also, judging from available information, communications from both Southwest and Delta on how the outages were affecting operations were rather poor.

Companies that are prepared for such outages have “response plans” in place to notify media, customers, partners and other interested parties. These plans could consist of message templates that company officials could easily modify for the particular situation. Leveraging technology already used by the company would be a smart use of resources in this type of scenario. For example, passengers could have been notified by phone app, text, or email before leaving for the airport.
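A message template of this kind can be very simple. The sketch below uses Python's standard `string.Template`; the wording of the notice, the field names, and the `render_notice` helper are all made up for illustration, and the airline and flight number in the usage example are fictional.

```python
from string import Template

# Pre-approved wording with placeholders that officials fill in per incident.
DELAY_NOTICE = Template(
    "$airline service alert: a systems outage is affecting flight $flight "
    "on $date. Please check your flight status before leaving for the airport."
)


def render_notice(template, **fields):
    """Fill in a notice; raises KeyError if a placeholder is left unfilled."""
    return template.substitute(**fields)


message = render_notice(
    DELAY_NOTICE, airline="Example Air", flight="EX123", date="August 8"
)
```

Using `substitute` rather than `safe_substitute` is deliberate here: a notice with an unfilled placeholder fails loudly instead of being sent to customers. The same rendered text could then be handed to whatever app, text, or email channel the company already operates.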

How will these events affect Southwest and Delta? Will they take steps to address these risks inherent to technology?

Although the outages did come with a pretty hefty price tag for each airline, those effects will largely be temporary. Southwest’s outage affected flights for 4 days and cost between $5 and $10 million according to the company.

The biggest cost will be in terms of their reputation, both among the affected passengers and the public at large. This is especially damaging for Delta since before this outage they were considered one of the most reliable air carriers in the industry – only a handful of flights have been canceled this year.

What these and other airlines do to address technology risks is a pretty open-ended question though – many of these computer systems have been around for decades. The systems have been built mostly on upgrades and patches, which of course carries enormous risks when dealing with such antiquated technology.

“Airlines need to revisit technologies that they’re using,” explains Ahmed Abdelghany, an associate professor of operations management at the College of Business at Embry-Riddle Aeronautical University.

If airlines do not take steps to address these risks with the technology they’re using, we’ll certainly see more cases like this in the years ahead, especially as airlines “automate more their operations, distribute boarding passes on smartphones and fit their planes with WI-FI” according to this story in Fortune Magazine.
