- Facebook's services crashed for roughly six hours Monday, after a network configuration issue made the world's biggest social networking site vanish from the internet. WhatsApp and Instagram were also impacted in the global outage, as well as the company's internal communication tools.
- Configuration changes on the backbone routers that coordinate network traffic between Facebook's data centers led to the crash, according to a blog post from Santosh Janardhan, VP of engineering at Facebook. The company said it had no evidence of user data compromise during the outage.
- The impact of the outage was amplified by the company's reliance on its own systems for internal communications. Unable to use Facebook Workplace, workers turned to Zoom and Discord chat rooms to communicate, the New York Times reported.
System crashes can bring disruption regardless of company size. For tech experts, Facebook's nightmare of a Monday is a lesson in system redundancy, from the network that supports a company's digital products to the way employees reach one another in a moment of need.
The downside of Facebook's quest to integrate its products and underlying technical infrastructure into a single platform is the concentration risk it creates for the company, Alla Valente, senior analyst at Forrester, said in an email. A single failure in a concentrated infrastructure can cascade, effectively halting the entire organization.
If a key vendor falters, companies can limit the disruption by clearly identifying backup systems before any downtime occurs. Assessing the capabilities of tools already on hand and exploring free options from competing vendors can help companies prepare.
No matter the function, companies "should not place everything, from DNS to all of their apps, on a single network," said Chris Buijs, EMEA field CTO at NS1.
"Yesterday's outage, and other recent incidents, resulted from human error — misconfiguration — and the effect on the entire chain had enormous consequences," said Buijs in an email.
The concern over single-system dependency also extends to collaboration tools, particularly after the shift to hybrid work made them more central to daily operations. Internally, Facebook made its Workplace platform the single hub for employee communication, replacing email and the intranet, said Carrie Marshall, CEO of Talk Social to Me.
"Yesterday's outage showed why relying on one tool is a bad idea: Without redundancies, information becomes choked up and employees are left in the dark," Marshall said. "Companies should always maintain multiple points of communication for situations where the primary hub goes down."
But Facebook's tool is more than an internal communication platform. Facebook joined the collaboration tool market in 2016, and use of such tools in the enterprise has grown 44% since 2019, according to a recent Gartner survey. Facebook now sells Workplace to other companies such as Starbucks, Walmart and Spotify. CEO Mark Zuckerberg said in May that the product had reached 7 million paid subscribers.
"The intranet, email distribution lists, and even a list of team phone numbers can all remediate collaboration hub downtime issues," said Marshall. "Maintaining multiple forms of communication, even if they're not used on a regular basis, will also serve companies when natural disasters or other crisis moments take down local digital resources."