Like Spartan arrows blocking out the sun, imagine a slew of cyberattacks hitting a company's defenses.
Each attempt to breach the gates represents a negative business outcome. It's a potentially unrecoverable wire transfer landing in an offshore account. It's a trove of sensitive data sent to an impostor and sold in a corner of the dark web.
Ideally, a sharp cybersecurity outfit can stop hacking attempts before they do harm.
But workers inside the security operations center (SOC) will be most effective at spotting patterns with data science tools. The shift toward data science in the SOC is driven by the growing supply of cheap computing power afforded by the cloud, and by the need for more sophisticated defenses against breaches.
Inside the SOC, companies are using data science tools to enhance speed and accuracy, leveraging threat patterns to identify where the barriers might falter, and where the most pressing threats lie. Use of data science can also help ease a company's thirst for talent in the hyper-competitive cybersecurity space — where unemployment nears zero — by maximizing engineers' efficiency and lowering their workload.
Data science helps security operators normalize data sets and extract compromise indicators, according to Nicolas Kseib, lead data scientist at TruSTAR Technology. In the scenario of a mass phishing attack, for example, data science tools can parse those emails and compare them to normal email communication to spot dangerous patterns.
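As a rough illustration of what extracting compromise indicators from email can look like, here is a minimal Python sketch. It is not TruSTAR's method: the allowlist, the sample domains and the function names are all hypothetical, and the patterns are deliberately simple.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of domains seen in the company's normal email traffic.
KNOWN_GOOD_DOMAINS = {"example.com", "intranet.example.com"}

URL_RE = re.compile(r"https?://[^\s<>]+", re.IGNORECASE)
# Loose IPv4 pattern: it does not validate octet ranges and can also match
# things like four-part version numbers.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_indicators(body: str) -> dict:
    """Normalize raw email text and pull out host names and IPv4 addresses."""
    # Indicators are often shared "defanged"; undo two common forms.
    text = body.replace("hxxp", "http").replace("[.]", ".")
    ips = set(IPV4_RE.findall(text))
    domains = set()
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname
        if host:
            domains.add(host.lower().rstrip("."))
    # A URL whose host is a bare IP address is reported once, as an IP.
    return {"domains": domains - ips, "ips": ips}


def unfamiliar_domains(indicators: dict, known_good: set) -> list:
    """Return the extracted domains that never appear in normal email."""
    return sorted(d for d in indicators["domains"] if d not in known_good)


if __name__ == "__main__":
    sample = (
        "Please verify your account at hxxp://login.example-payroll[.]test/reset "
        "or reply to 203.0.113.7. Help: https://intranet.example.com/help"
    )
    found = extract_indicators(sample)
    print(unfamiliar_domains(found, KNOWN_GOOD_DOMAINS))
```

In practice the comparison against normal communication would draw on far more than a domain allowlist, but the two steps are the same: normalize the data, then extract the indicators worth checking.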
"That's where a data scientist can help," Kseib told CIO Dive.
The expansion of agile tools, built on frameworks such as Apache Spark or Amazon Kinesis, paves the way for more collaboration between data scientists and engineers within the SOC. Collaboration between data scientists and cybersecurity teams is also eased by a lower barrier to entry into data lake technology, where unstructured data can be easily accessed and transformed.
"The collaboration is becoming more and more agile," said Kseib. "The engineer is able to re-leverage whatever code the data scientist used and maybe tweak it a little bit to deploy it on production."
Data science reduces the manual workload faced by security specialists, said Hessam Tehrani, principal data scientist at 4iQ, in an interview with CIO Dive. Data science tools flag malicious activities based on how they differ from known, safe activity.
"It's a better tool to predict and be ready for the next event that's a threat," Tehrani said. Data collection will often delay the deployment of data science within cybersecurity, since training algorithms without sufficient quality data will deliver poor results.
The central obstacle to leveraging data science in the SOC lies in data collection. It's the most difficult part, since positive results depend on access to relevant data sources.
Adoption of algorithms in cybersecurity will also require some tuning to fit specific use cases. In the financial industry, if a bank is trying to detect fraudulent activity, it may initially flag every transaction as potentially fraudulent.
"That has the downside of ruining your productivity," said Tehrani. "In the other way, you can have your detection a little relaxed and miss something that could be detrimental to your operation."
The key to making data science tools work in this context is to find the right balance between the two extremes.
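The trade-off between flagging everything and flagging too little can be made concrete with a small Python sketch. It assumes a model has already given each transaction a fraud score between 0 and 1. The scores, labels and cost weights below are invented for illustration and do not come from Tehrani or any real bank.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    threshold: float
    false_alarms: int  # legitimate transactions that were flagged
    missed_fraud: int  # fraudulent transactions that were not flagged


def evaluate(scores, labels, threshold):
    """Count both kinds of error when every score >= threshold is flagged."""
    pairs = list(zip(scores, labels))
    false_alarms = sum(1 for s, fraud in pairs if s >= threshold and not fraud)
    missed_fraud = sum(1 for s, fraud in pairs if s < threshold and fraud)
    return Outcome(threshold, false_alarms, missed_fraud)


def pick_threshold(scores, labels, candidates, miss_cost=10.0, alarm_cost=1.0):
    """Return the candidate with the lowest weighted cost.

    The cost weights are assumptions: here one missed fraud is treated as
    ten times as costly as one false alarm.
    """
    outcomes = [evaluate(scores, labels, t) for t in candidates]
    return min(
        outcomes,
        key=lambda o: miss_cost * o.missed_fraud + alarm_cost * o.false_alarms,
    )


if __name__ == "__main__":
    # Invented fraud scores and ground-truth labels (True means fraud).
    scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.75, 0.90]
    labels = [False, False, False, False, True, False, True, True]
    for t in (0.0, 0.3, 0.5, 0.7, 1.0):
        print(evaluate(scores, labels, t))
    print("chosen:", pick_threshold(scores, labels, (0.0, 0.3, 0.5, 0.7, 1.0)))
```

A threshold of 0.0 flags every transaction, the productivity-ruining extreme, while a threshold of 1.0 flags none in this sample and misses every fraud. How the two kinds of error are weighted is a business decision, not a property of the algorithm.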
Currently, adoption of advanced data science techniques isn't widespread in enterprise-grade cybersecurity. By 2024, eight in 10 modern SOCs will use machine learning tools, according to Gartner, up from less than 10% today.
But simply plugging machine learning tools into the SOC isn't a guarantee of high efficiency or decreased vulnerabilities. SOCs require "trained staff and fine-tuned workflows" to fully leverage machine learning, Gartner said in its report.