May 2, 2017 | Written by: Nin Lei
Categorized: Real-time analytics
A conversation with an IT social influencer
During InterConnect 2017, we spoke to industry thought leader and prominent social influencer Craig Mullins, president and principal consultant of Mullins Consulting, Inc., about his perspective on the challenges faced by today’s CTOs and the growing complexity of data.
Mullins has an extensive background in data and database systems. Before he became an independent consultant, Mullins was a DBA and a developer for several organizations, and he worked for a series of DBA tool vendors. He also covered database administration while employed as an analyst at Gartner.
What got your attention at InterConnect 2017?
If I had to distill it down to one thing, it’s change that is happening so rapidly it’s very easy to get distracted—and it’s almost impossible to keep up. When you look at the overarching trends, obviously cloud is number one. People are moving workloads to the cloud, creating new workloads in the cloud and merging on-premises and off-premises workloads in hybrid clouds, and that kind of change significantly impacts a lot of IT roles.
But the speed of change is really incredible too. I think part of it is just our ability to store vast amounts of data while the price to store it has plunged. Couple that with the ability to access it rapidly with in-memory technologies such as Spark; it’s not just that you have the data, but you also have a way of processing it rapidly. We’re not doing things in batches and waiting days for results; we are actually getting answers in memory, in real time, allowing us to move forward and gain additional insights.
As a data person, I think that this point in time is absolutely the most interesting period to be working as a data professional, certainly in my career, which spans more than 30 years. I think we’re kind of on the cusp of a transmogrification, if you will, in the way we manage and use data. You see AI and machine learning and analytics all being applied to large volumes of data, and those volumes of data continue to grow.
What about the growing complexity of data when we consider social content, images, unstructured data?
I think unstructured data is an area IT professionals are really starting to struggle with. First of all, the term unstructured data is a ridiculous term. There’s no such thing as unstructured data. If the data was unstructured, we wouldn’t be able to read it. It would be useless. So, it’s differently structured. I do use the term unstructured data because people know what it means, but it’s a horrible term.
When you look at unstructured data—meaning it’s not characters and numbers and dates and times, the traditional types of things we store—you’re talking about images and even large text documents, which make up a lot of the unstructured data out there. It’s not, “I want to store a person’s picture in the employee database,” it’s “Now I’ve got these contracts, and I’ve got these regulations, and I’ve got these policies that I want to store. I want to have access to these things and be able to read through them and get information out of them.” That’s driving a lot of what we call “unstructured data.”
Some people say you can’t do it in a relational world. Well, that’s hogwash too, because you have the ability with large objects—binary large objects, character large objects—to store these things in relational databases that are pretty adept at being able to handle them. But does that mean that’s always the way to do it? No.
If you’ve got a large document store in something like MongoDB, maybe that’s preferable in some instances, with some use cases, which gets us to the term polyglot persistence. Organizations don’t have just one DBMS; they’ve got multiple DBMSs. The idea is to store the right data in the right place for the right use, and the industry coined a term for it: polyglot persistence. What a mouthful, right? But that’s all it really means.
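Editor's note: Mullins's point about large objects can be sketched in a few lines. The example below is only an illustration under stated assumptions, not something from the interview: it uses Python's built-in sqlite3 module as a stand-in for a relational DBMS, and the table name `documents`, its columns and the helper `store_and_fetch` are hypothetical. SQLite's TEXT and BLOB types play the roles of character and binary large objects here; other relational systems have their own LOB types and size limits.

```python
import sqlite3


def store_and_fetch(doc_name, text_body, binary_body):
    """Store one document as character and binary large objects, then read it back.

    The schema is hypothetical and exists only for this illustration.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE documents ("
        " name TEXT PRIMARY KEY,"
        " body_text TEXT,"    # character large object (for example, a contract)
        " body_binary BLOB"   # binary large object (for example, a scanned image)
        ")"
    )
    conn.execute(
        "INSERT INTO documents (name, body_text, body_binary) VALUES (?, ?, ?)",
        (doc_name, text_body, binary_body),
    )
    conn.commit()
    row = conn.execute(
        "SELECT body_text, body_binary FROM documents WHERE name = ?",
        (doc_name,),
    ).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    # Made-up content: a long text document and some raw bytes.
    contract = "Section 1. The supplier agrees to deliver the goods. " * 1000
    scan = bytes(range(256)) * 100
    text, blob = store_and_fetch("contract-001", contract, scan)
    print(len(text), len(blob))
```

Whether a relational table or a document store is the better home for such content depends on the use case, which is the point of the answer above.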
Given your experience with data and the rising rate of technology change, what do you see as the biggest challenges today for CTOs and technology leaders?
The bottom line is change and the rapidity of change. Just yesterday I was talking to the CTO of a healthcare supply chain company, and the example this CTO used was a particular Java framework. Suppose you committed to version 1 of this framework, and then version 2 comes out, but it’s completely incompatible. This example is not about a technology failing in the marketplace; it is about the next iteration of that technology being completely incompatible with the previous one.
That’s the kind of thing that keeps CTOs up at night. They have to make choices for the overall IT infrastructure and try to come to some rationalization that this is the technology everyone will use.
What about the other side of that coin? How is this rapid change affecting their teams?
When you look at what’s going on in the industry, you see the phenomenal rates of growth in data. But what about the DBAs? What about the people who are charged with making sure that the data is there and accessible? The growth rate in their numbers is under 5 percent. You’ve got a data growth rate of anywhere from 50 to 100 percent, maybe even greater, in your organization, and the number of people who have to manage that data is not growing anywhere near that rate.
What is needed is some sort of automation and intelligence. All kinds of IT data exist out there. Consider the z Systems platform and its instrumentation as one example. These machines automatically generate a massive amount of information. If you could take the domain expertise of a DBA, put it into Watson and run all that voluminous data through Watson, then Watson could make decisions for the DBA, allowing a stripped-down DBA staff to continue operating.
Because if you don’t have this intelligence built on top of it, and the automation built on top of it, there’s no way the DBAs—or anyone—can keep up.
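Editor's note: the staffing gap described in this answer can be made concrete with a little arithmetic. The sketch below takes the growth rates quoted in the interview (50 percent a year for data, the low end of the range, and 5 percent a year for DBAs, the stated upper bound) and compounds them. The starting figures of 100 TB and 10 DBAs are made-up assumptions for illustration, and the function name `data_per_dba` is hypothetical.

```python
def data_per_dba(years, data_tb=100.0, dbas=10.0,
                 data_growth=0.50, dba_growth=0.05):
    """Return (year, terabytes per DBA) pairs under compound annual growth.

    The starting values are illustrative assumptions; only the two growth
    rates come from the interview.
    """
    rows = []
    for year in range(years + 1):
        rows.append((year, data_tb / dbas))
        data_tb *= 1 + data_growth  # data volume compounds each year
        dbas *= 1 + dba_growth      # headcount compounds far more slowly
    return rows


if __name__ == "__main__":
    for year, load in data_per_dba(5):
        print(f"year {year}: {load:.1f} TB per DBA")
```

Under these assumptions the load per DBA grows by a factor of 1.5 / 1.05, or about 1.43, every year, regardless of the starting figures. That compounding is why the answer argues for automation rather than hiring alone.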