How to get MTTR down to zero
Automated tools to reduce Mean Time to Remediation can make cloud infrastructure "self-healing," said Josh Stella, CEO and co-founder of Fugue.
Somewhere along the way security took a backseat to efficiency, but the automatic deployment of cyber airbags is something every company needs, especially in cloud governance.
To ensure a company is "using the cloud in the right way," it needs to adopt automated remediation tools to cut the Mean Time to Remediation (MTTR), or how long it takes to resolve a cyber incident, down to zero, Josh Stella, CEO and co-founder of Fugue, a cloud security tool, told CIO Dive in an interview. Manually probing and responding to flagged errors slows remediation, leaving a network exposed to risk for too long.
Cloud misconfigurations from human error were mostly to blame for a staggering 424% increase in security breaches last year, according to a 2018 IBM report. If there are hundreds upon thousands of configurations on a cloud, it is sometimes unfair to hold a human accountable for a misconfiguration. "We don't work like that," Stella said.
For example, when someone manually performs maintenance, certain assets are taken offline from the public. However, human error can easily creep in when the developer mistakenly "leaves a hole" that goes unnoticed until a monitoring tool picks it up.
The monitoring tool, traditionally used in datacenters, is already automated, but once it has sent a notification to the human user within incident management, its job is over. The remediation process does not begin until someone receives the ticket, then details a plan for approval, manually corrects the problem and writes a report for future mitigation.
All of this adds up to an MTTR of hours, days or even weeks because human involvement "just takes time," Stella said.
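To make the metric concrete, here is a minimal sketch of how MTTR could be computed from incident records. The timestamps are invented for illustration and do not come from Fugue or the interview; the function name is likewise hypothetical. It simply averages the gap between when an alert fired and when the fix was confirmed.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (time the alert fired, time the fix was confirmed).
incidents = [
    (datetime(2019, 1, 7, 9, 0), datetime(2019, 1, 7, 15, 30)),   # manual fix, 6.5 hours
    (datetime(2019, 1, 9, 22, 0), datetime(2019, 1, 11, 10, 0)),  # manual fix, 36 hours
    (datetime(2019, 1, 14, 8, 0), datetime(2019, 1, 14, 8, 0)),   # automated fix, 0 hours
]

def mean_time_to_remediation(records):
    """Average the gap between detection and remediation across incidents."""
    if not records:
        return timedelta(0)
    total = sum((fixed - detected for detected, fixed in records), timedelta(0))
    return total / len(records)

mttr = mean_time_to_remediation(incidents)
mttr_hours = mttr.total_seconds() / 3600
print(f"MTTR: {mttr_hours:.1f} hours")
```

In this made-up sample, the two manual fixes dominate the average, which is the pattern Stella describes: the metric only approaches zero when no incident waits on a person.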
Wasting valuable MTTR
MTTR is often overseen by the CISO and traditionally the "mindset" of CISOs was to focus on perimeter security. But the cloud introduced a new breed of security surfaces to protect. A lack of automation to help secure this new surface is a CISO's "biggest problem on the cloud," even if they "don't know that's their problem," according to Stella.
In addition to human error, manual remediation costs many hours because developers are inundated with false positives flagged by the monitoring tool.
Remediation actions in a cloud environment were traditionally a responsibility that "[fell] on some poor guy or woman sitting there watching all these changes go by" because it took human judgment to determine a false positive.
But because the cloud has enabled DevOps to build applications and tools quickly, the cloud is constantly in motion with "more moving parts than ever before," making it difficult to accurately keep pace with what is correct and what needs updating, said Stella.
False positives lead to a "bad signal to noise ratio" which results in a "flood of information" flagging things that don't deserve the time it takes to investigate and resolve, he said.
To cut out the risk of human error or delayed remediation time, here are the four steps to get MTTR down to zero:
1. Discover what's on the cloud
The first thing companies need to do is take a product and direct it at the applications built on the cloud. At this point, the tool is used to discover what could be wrong.
But standalone tools used solely to monitor and alert on changes in a cloud infrastructure only get security so far. The tools issue a ticket for every alert, and each one is sent to a human to review. Eventually, the queue of alerts adds up.
2. Adopt 'policy as code'
Having a well-outlined understanding of what exists in an organization's cloud environment makes conforming to required policies easier. For example, healthcare organizations need to be aware of HIPAA requirements, companies processing transactions need to meet PCI requirements, and those handling data in the EU need to comply with GDPR.
The most efficient way to achieve policy compliance is using "policy as code" to alert an organization to the places where it is jeopardizing compliance. It requires a company to "methodically" comb through systems to align with the policies needed, according to Stella.
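As a rough illustration of the idea, a policy can be written as a small function that is run against a description of each cloud resource. The sketch below is an assumption-laden toy, not Fugue's product or any vendor's API: the resource fields, policy names and the evaluate function are all hypothetical.

```python
# Hypothetical resource descriptions, as a discovery tool might report them.
resources = [
    {"id": "bucket-logs", "type": "storage_bucket", "public": False, "encrypted": True},
    {"id": "bucket-exports", "type": "storage_bucket", "public": True, "encrypted": False},
    {"id": "db-patients", "type": "database", "public": False, "encrypted": True},
]

# Each policy is a name plus a function that returns True when a resource complies.
POLICIES = [
    ("no-public-storage", lambda r: r["type"] != "storage_bucket" or not r["public"]),
    ("encryption-at-rest", lambda r: r["encrypted"]),
]

def evaluate(resources, policies):
    """Return (resource id, policy name) for every failed check."""
    violations = []
    for resource in resources:
        for name, complies in policies:
            if not complies(resource):
                violations.append((resource["id"], name))
    return violations

violations = evaluate(resources, POLICIES)
for resource_id, policy_name in violations:
    print(f"{resource_id} violates {policy_name}")
```

Because the rules live in code, they can be versioned, reviewed and run on every change rather than checked by hand.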
3. Begin remediation
Once an organization is alerted to shortcomings within its cloud environment, it should start remediating them. Automated remediation learns the "footprints" of applications.
While remediation is something organizations can accomplish in-house, they can end up building an unnecessary repertoire of tools and a pile of scripts, according to Stella, which often contributes to prolonging MTTR.
4. Maintain the repairs
The final step for achieving an MTTR of zero is "the one you're going to have to live with for the rest of your life," said Stella. Everything that was identified and corrected in the first three steps needs to be maintained.
This is where automation is key. Having a tool constantly identifying potential threats or inaccuracies and applying resolutions without needing permission from a human takes the guessing out of false positives and the time spent manually repairing errors. Automated remediation tools have the ability to make cloud infrastructure "self-healing" without the need for human intervention.
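One common way to picture "self-healing" is a loop that compares the live environment with a known-good baseline and reverts anything that has drifted. The sketch below assumes that approach for illustration only; it is not a description of how Fugue's tool works. The resource names, settings and functions are hypothetical, and a real tool would call a cloud provider's API instead of editing a dictionary.

```python
import copy

# Hypothetical known-good configuration for two resources.
baseline = {
    "sg-web": {"open_ports": [443]},
    "bucket-exports": {"public": False},
}

def detect_drift(baseline, actual):
    """Return resources whose live settings differ from the baseline."""
    drift = {}
    for resource_id, expected in baseline.items():
        found = actual.get(resource_id)
        if found != expected:
            drift[resource_id] = {"expected": expected, "found": found}
    return drift

def remediate(baseline, actual):
    """Revert drifted resources to the baseline without waiting for approval."""
    drift = detect_drift(baseline, actual)
    for resource_id in drift:
        actual[resource_id] = copy.deepcopy(baseline[resource_id])
    return sorted(drift)

# Someone opened port 22 during maintenance and forgot to close it.
actual = {
    "sg-web": {"open_ports": [22, 443]},
    "bucket-exports": {"public": False},
}
fixed = remediate(baseline, actual)
print("Reverted:", fixed)
```

Run on a schedule or on every change event, a loop like this closes the "hole" left after maintenance without a ticket ever reaching a person.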
Follow Samantha Ann Schwartz on Twitter