Moving toward an open future of data and AI
A faster journey to AI for the enterprise? What’s the secret? In this interview with Dinesh Nirmal, IBM vice president of analytics development, he shares the highlights of his upcoming Think session: “Modernizing Your Data Estates for an AI and Multicloud World.” On Wednesday, 13 February, he and David Bernert from The Boeing Company will discuss advancements in data that will give enterprises the edge.
Big Data and Analytics Hub: Where are most enterprises on their journey to AI – and where does IBM Cloud Pak for Data fit in?
Dinesh Nirmal: The biggest challenge that enterprises are facing now is embarking upon the process of modernizing their whole infrastructure, including data. Unless it was born in the last five to 10 years, a company has to go through that exercise of a complete overhaul, bringing cloud-like characteristics behind the firewall, writing applications once and being able to deploy them behind the firewall or on public cloud.
That’s where ICP for Data comes in, clearly with the upper hand. All of a sudden, you get a containerized Kubernetes infrastructure built on microservices that has cloud-like characteristics and an integrated, well-stitched framework of data and AI applications. That platform of products and services is now available for you to deploy. Any enterprise that goes through that transformation can benefit from ICP for Data.
BDAH: Does the IBM history of adopting open standards and open source come into play?
DN: I believe that the future is going to be built on three things: multicloud, AI and open source.
These three things can drive the growth of any enterprise. Period. There’s no question about it.
You have to play in all three pillars, not just one. If you look at ICP for Data, it’s built on open source Kubernetes. And we’re building the whole thing as an open stack so customers and vendors and partners can extend on it. We built it with an enterprise envelope around it so clients get the benefits of scalability, availability and security. Open source has become so critical that IBM is focusing on how we bring more open source elements to ICP for Data.
If we go back in history, the amount of effort and focus we’ve put into supporting open source is amazing. We have championed so many open source efforts since we embraced the Linux operating system in 2000: Apache, Hadoop, Spark, Docker and now the Open API Initiative. It’s a long history we have. When we decided to go with the data platform, we decided that the base would be Kubernetes and that we would build on open source and make it extensible and open. There was never a debate. We were clear on it.
BDAH: IBM has been named a leader in five AI-related Forrester Wave reports, which include many of the Watson Services that are now coming into our data and AI portfolio. What are the implications?
DN: To build AI and to build a model, you need the data. The data is all sitting across the organization. And you need data virtualization. It used to be collect, organize and analyze. But I don’t need to collect, I can connect. That’s what data virtualization in ICP for Data brings. With data virtualization, you can virtualize the data without moving the data. Self-service analytics can become easier if you can get your trusted data without moving the data.
Why do we bring AI apps onto ICP for Data? Because now we’re in a stage where the data is more readily available for self-service analytics in a way that the analyst and data scientist can get a hold of. This is huge for customers. With AI applications sitting on top of it, you have the full stack. You can ingest your data, analyze your data. You can assist your data, visualize your data and build AI models on your data. Everything is on a single stack, which is just amazing. And that’s what we think differentiates ICP for Data.
BDAH: Can we claim ICP for Data will lower the barrier to entry to AI?
DN: We are confident that ICP for Data can help accelerate customers in the modernization of their data estates. We believe it will help customers build AI on their trusted data much faster. There is governance around it, and it will enable self-service analytics in a very clear-cut fashion.
Today, if I’m a data scientist, we have found that it will possibly take me six weeks to get trusted data to build a model. ICP for Data is designed to cut that down to two or three days to get the trusted data because we have one enterprise data catalog that can tell you where data is coming from. We have a crowdsourcing mechanism for people to rate the quality of data. It will take your data all the way from ingest to visualize without dropping the data. You get self-service analytics faster. You have a governed data lake available, so you can build your AI models faster. This will enable our customers to get into the AI sphere a lot faster. ICP for Data needs to be looked at as something that is designed to put customers on a much faster path to modernizing their data estate and infrastructure.