Big data: enabling new approaches to IT infrastructure security
Big data technologies and advanced analytics, including AI, promise a way to get ahead of cyber threats.
Consider modern enterprise IT infrastructure. Increasingly, it is a complex combination of on-premises computing and storage and off-premises, cloud-based resources. Tying all of this together is a web of data connections. Applications can run either in the cloud or locally, and all of this is subject to penetration by bad actors. Combine this with the internet of things (IoT), where virtually every device is connected, and the number of points of potential compromise increases exponentially. The wonder is not that so many large enterprise networks are breached, but that so few are.
It is no wonder that the security industry is burgeoning. With five-year growth estimated by Forbes at 9.8 percent annually, security has become big business. As the threat becomes more acute, the security solutions being introduced to the market are also becoming more sophisticated. The current notion is that single solutions are no longer sufficient to protect the enterprise. “Defense in depth” is the new mantra: multiple solutions are layered, beginning at the edge of the network and becoming more complex the closer one gets to the application.
Yet there may be a light at the end of this tunnel, and it may come from the very technology that may itself pose the greatest threat to security: big data.
Big data: threat or savior?
Big data, of course, can be a problem. Because a data lake is often a collection of individual data sources, generally accessed over network connections or located in the cloud, simply adopting a big data approach to computing can introduce a myriad of new points of vulnerability. And, in the event of a breach, big data can ensure that the breach will be substantial.
Big data can also complicate threat detection by distributing the points of vulnerability. Rather than having one database to secure, a company may now use dozens, each of which may also support more direct forms of access. How do you distinguish between an authorized access and one that may be automated through a big data interface? Increasingly, big data can be seen as the straw that breaks security's back. In fact, a recent Frost & Sullivan survey found that IT decision makers believe that big data makes security harder. Eighty-one percent of those surveyed said they were somewhat concerned, concerned, or very concerned about security issues associated with big data.
Nevertheless, big data can also be part of the solution to security issues. Consider that security threat detection and mitigation depend on analyzing telemetry from the IT environment. Data on application and server activity, user access, network traffic, and usage profiles all feed the security analytics that ultimately determine whether the enterprise IT environment is under attack. As the network becomes more complex, the amount of telemetry increases exponentially. All of this data must be collected and analyzed.
This sounds more and more like a big data environment, with security as an analytics overlay.
Of course, it’s not just big data
Big data is, well, big data. Data by itself is not information: it is simply a collection of facts among which relationships exist. In the case of security, the relationships within the data lake can provide indications of threats or exploits. To detect these threats, advanced analytics must be brought to bear.
Security solutions built on advanced analytics adopt an approach that should be increasingly familiar to our readers. Telemetry data is collected and deposited in a big data store, which advanced analytics mine for clues to aberrant behavior or nonconforming demand patterns. Once a threat is detected, either corrective actions can be launched automatically or human intervention can be requested.
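The detect-then-respond loop described above can be illustrated with a minimal Python sketch. It is not taken from any particular product: the function name `classify`, the z-score test, and both thresholds are assumptions chosen for illustration. A baseline of past telemetry readings (say, requests per minute from one host) is compared with a new reading, and the size of the deviation decides whether nothing happens, a human is asked to review, or an automated response is triggered.

```python
from statistics import mean, stdev


def classify(baseline, observed, review_threshold=2.5, auto_threshold=4.0):
    """Classify one telemetry reading against a baseline of past readings.

    Returns "ok", "review" (request human intervention), or
    "auto_mitigate" (launch a corrective action automatically).
    The thresholds are illustrative assumptions, not recommended values.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs >= 2 readings
    if sigma == 0:
        # A perfectly flat baseline: any change at all is worth a look.
        return "ok" if observed == mu else "review"
    z = abs(observed - mu) / sigma  # how many deviations from normal
    if z >= auto_threshold:
        return "auto_mitigate"
    if z >= review_threshold:
        return "review"
    return "ok"


# Hypothetical requests-per-minute readings for a single host.
history = [100, 102, 98, 101, 99]
for reading in (101, 105, 150):
    print(reading, classify(history, reading))
```

A production system would use far richer models and many more signals; the point here is only the split between automatic action and a request for human review.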
In particular, in a virtualized environment that uses VMs to execute applications, threat detection can be problematic and can involve very intrusive approaches to security management that tend to degrade overall performance. Offloading threat detection to a big data environment maintains performance and allows more global threats to be detected before they involve the entire computing fabric of the enterprise.
As described by Win, Tianfield, and Mair in IEEE Transactions on Big Data, application logs can be collected using cluster technology such as the Hadoop Distributed File System (HDFS) and analyzed to determine whether an attack is taking place. Experiments confirm that such an approach is at least as effective as more pervasive approaches.
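To make the idea of analyzing collected logs concrete, here is a small Python sketch. It does not reproduce the method of Win et al.; the log format (timestamp, source address, event, outcome), the function names, and the failure threshold are all hypothetical. It counts failed logins per source and flags sources that exceed the threshold. On a cluster, the same per-source count would typically be expressed as a map step followed by a reduce step.

```python
from collections import Counter


def failed_logins_by_source(log_lines):
    """Count failed logins per source address.

    Assumes a made-up, whitespace-separated log format:
        <timestamp> <source> <event> <outcome>
    Lines that do not match this shape are ignored.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "LOGIN" and fields[3] == "FAIL":
            counts[fields[1]] += 1
    return counts


def suspicious_sources(log_lines, max_failures=3):
    """Return, in sorted order, sources with more than max_failures failures."""
    counts = failed_logins_by_source(log_lines)
    return sorted(src for src, n in counts.items() if n > max_failures)


# Hypothetical sample log, invented for this example.
sample = [
    "t1 10.0.0.5 LOGIN FAIL",
    "t2 10.0.0.5 LOGIN FAIL",
    "t3 10.0.0.7 LOGIN OK",
    "t4 10.0.0.5 LOGIN FAIL",
    "t5 10.0.0.7 LOGIN FAIL",
    "t6 10.0.0.5 LOGIN FAIL",
]
print(suspicious_sources(sample))
```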
Experiments such as those described by Win et al. provide hope that, ultimately, big data will be the solution that enables cost-efficient security services that can keep up with both network complexity and advances in the threat.
It’s a big data world, after all
Ultimately, big data will be an essential component of every IT environment. Along with the cloud and advanced analytics, big data forms the backbone of the evolving computing landscape. Enabling this environment is data networking and, as a result, security.
Just as in previous IT environments, where the infrastructure itself provided the security solutions that sought to prevent breaches, the new architectures will depend on big data and advanced analytics to detect and mitigate threats. Once again, big data is enabling new approaches to computing: ones characterized by the use of large data sets in a near-real-time environment.