Dark data: Peering through the blanket of the dark
“Come, thick night, and pall thee in the dunnest smoke of hell, that my keen knife see not the wound it makes, nor heaven peep through the blanket of the dark.” —William Shakespeare, Macbeth, Act 1, Scene 5
There’s a little Shakespeare for you: that’s Lady Macbeth, waiting for the doomed King Duncan to arrive at her castle. Swap Duncan for data, the castle for your mainframe infrastructure, cast Lady Macbeth as a cyber threat, and you can see what I’m getting at.
What do we mean by dark data?
Dark data, the deep web, the dark web: it’s scary stuff. But when people talk about dark data, what do they actually mean? For many organizations, it’s simply the data they’ve gathered or obtained but do little or nothing with. In the mainframe world, it means something different: in addition to your production data, think about all the copies that may be out there for development and other purposes, including the versions you don’t know about. You can also bet that an awful lot of organizations don’t bother with data obfuscation on those copies, even though tools are available. Why is that? Because of the time and costs involved. When companies are cutting costs, some security measures are quickly abandoned, especially if you believe you already have control.
So, are you sure you really have the level of control over your data that you think you have?
The bad guys don’t even have to go after your production data sets. Why bother, when they can, maybe far more easily, go after a copy or a backup? So what if it’s a couple of weeks old? There will still be more than enough of value in it, and it’s almost certainly unencrypted. So your data ends up being bought and sold on a dark web cryptomarket like Silk Road 3.
As this scenario illustrates, if we focus only on protecting production data, we could miss the data theft that is actually being committed against the copies: at best it’s on the periphery of our vision, and at worst it’s completely outside our sphere of control.
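To make the data obfuscation point a little more concrete, here is a minimal sketch of masking sensitive fields before a copy leaves production. It is an illustration only, not any particular product's method: the field names, the key and the record layout are all hypothetical, and a real mainframe shop would more likely reach for a dedicated masking tool than a script like this.

```python
import hashlib
import hmac

# Hypothetical secret for illustration; in practice it would come from a key store.
SECRET_KEY = b"replace-me"

# Hypothetical field names; adjust to whatever your records actually hold.
SENSITIVE_FIELDS = ("name", "account_number")


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed, deterministic token that stands in for a sensitive value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability; keep it longer if collisions matter


def obfuscate_record(record: dict) -> dict:
    """Return a copy of the record with sensitive fields replaced by tokens."""
    masked = dict(record)  # shallow copy, so the original record is left untouched
    for field in SENSITIVE_FIELDS:
        if field in masked:
            masked[field] = pseudonymize(str(masked[field]))
    return masked


production_row = {"name": "Jane Doe", "account_number": "12345678", "balance": 250}
development_row = obfuscate_record(production_row)
```

Because the token is keyed and deterministic, the same customer maps to the same token across every copy, so joins in a development environment still work, while anyone who steals the copy without the key gets nothing directly usable for identity fraud.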
The surface web and the deep web
Just like in The Matrix movies, the world is not as it seems. The surface web is just the tip of an iceberg: by commonly quoted estimates, it’s the 4 percent or so you can reach using Google. The other 96 percent is the deep web, which starts with perfectly legitimate stuff like academic databases, legal documents and subscription-only resources, and then slides into proper illegality with the dark web: drugs, guns, having someone whacked… and, of course, fake passports, IDs and credit cards for financial and identity fraud, founded on the availability of unencrypted personal and financial data. Peel back the onion layers, use the Tor network, and you can find pretty much anything you might imagine.
The existence of the dark web and cryptomarkets gives serious value to your mainframe and personal data. That value creates demand, and with demand comes motivation.
Protecting data on the mainframe
Some 80 percent of the world’s system-of-record data already resides on mainframes. More commercial transactions are processed on mainframes than on any other platform. And in today’s connected Internet of Things (IoT) world, the mainframe no longer works in splendid isolation. Much of that mainframe data is being copied and copied and copied and copied. As mainframe professionals, we should all be worried about our data ending up in “the dunnest smoke of hell,” particularly as the mainframe is now an increasingly attractive target for bad actors. To them, it’s just another computer. As a community, we need to seriously step up our game in mainframe security and be far more proactive in what we do and how we do it. Our clients, employers, shareholders and regulators are looking to people like us to gaze “through the blanket of the dark” and sort it out.
I’m looking forward to talking more about this at the upcoming #IBMTechU in London this May. I’ll be a keynote speaker on “Dark Data, The Dark Web and Mainframes…What are you talking about?”
I’m always amazed at the quality and breadth of the technical training on offer at our #IBMTechU events. IBM Systems Lab Services consistently provides in-depth training sessions, hands-on labs and demos delivered by IBM engineers, developers or product experts. The content in London covers IBM Z and IBM Storage on Z with over three hundred technical sessions. Check it out.