What happened to Netezza?
There are some people, like me, who like to know how the story ends and thus may occasionally read the last chapter before going back and reading the rest of the book. So, I guess this is a spoiler alert. The answer to the question is, “Netezza is still alive, well, and evolving, and IBM has now come out with the next generation of Netezza as part of IBM Cloud Pak for Data System.”
There were some decisions about two years ago that led me to believe that Netezza was all but dead at IBM. It was quite a blow to the Netezza community, as the solution had an almost religious following. Thankfully, last year, seemingly about the same time IBM acquired Red Hat, the next generation of Netezza was announced.
The Evolution of Netezza
Netezza Performance Server for IBM Cloud Pak for Data takes a large step towards making the platform more flexible and robust, while at the same time holding true to the past. It is a bundled architecture of software including the Netezza core software and broad new analytics and data processing capabilities within IBM Cloud Pak for Data System. The solution also includes infrastructure, adding Netezza processing nodes to the Cloud Pak for Data System base. The Cloud Pak for Data System includes data collection, virtualization, refinement, governance, data warehousing, and an advanced analytics platform. Cloud Pak for Data is designed so that workloads do not affect the Netezza processing / performance and vice versa. The Cloud Pak for Data System uses a containerized / hybrid cloud architecture where expanded processing needs can be delivered within the hyperconverged system, on another set of on-premises infrastructure or on the cloud.
All Cloud Pak for Data System components, as well as the Netezza host, are containerized, running on Red Hat Enterprise Linux (RHEL) and OpenShift. OpenShift contributes a common orchestration layer and allows IBM to build more unified common functions such as administration, security, and logging. Netezza Performance Server has all of the strengths of the legacy Netezza solutions (TwinFin, Striper, and Mako) but has removed many of the weaknesses, the largest of which was the lack of expandability without purchasing an additional appliance. The bundling of the Cloud Pak for Data System software and Netezza Performance Server achieves more than legacy Netezza could deliver. It allows end-to-end data and artificial intelligence (AI) processing capabilities in one platform. No other company or platform brings this breadth of functionality to one converged solution.
Netezza Performance Server and Cloud Pak for Data System: A Modular Solution
The IBM Cloud Pak for Data System and Netezza Performance Server are both modular in nature (IBM calls it hyperconverged) and allow for additional processing nodes to be added within the same architecture and often within the same rack. This modular architecture creates a highly tuned solution specifically aligned to each organization’s needs. The Netezza solution (as of April 2020) is still an on-premises architecture tied to x86 hardware and leveraging Field Programmable Gate Array (FPGA) cores as part of the Netezza secret sauce, but a cloud-enabled version is expected later in 2020.
Speedy Migration and Expanded Functionality
The Netezza Performance Server has all of the key functions we were accustomed to, including NZ functions, INZA functions, geospatial processing, and the core Netezza Platform Software (NPS), so migration is a snap. Mainline has customers who have migrated to the new system in just a couple of hours, leveraging common tools such as nz_migrate. Those leveraging User Defined Functions (UDFs) will only need to recompile them, as the new system is 64-bit vs. the 32-bit legacy Netezza appliances.
The Netezza host has evolved and is now containerized, which is another step toward allowing the core functionality to truly become hybrid, with some components running on-premises while other workloads are shifted up to the cloud or off to other infrastructure. Performance has also received a boost through the replacement of legacy spinning disks with solid-state drives (SSDs). For those of you who lived through the significant disk failure rates on the TwinFin models, this will come as a welcome change for stability, not just processing speed.
In the past, the Netezza platform had a very specific and somewhat narrow role within the information ecosystem. It was a high-performance data repository / data warehouse for analytic workloads. Netezza with Cloud Pak for Data System expands the available functionality to cover the vast majority of functions necessary to bring end-to-end information and AI value to an organization. The elegance of the solution allows for the organization to only use the solution components necessary to meet their specific value proposition or initiative. Additional functions can be seamlessly initiated over time if needs change or the solution expands to reach a broader audience.
The following diagrams show the functional shift and expanded capabilities now available on Netezza Performance Server with Cloud Pak for Data System when compared to the legacy Netezza functionality. We have all seen similar information ecosystem conceptual diagrams, with data sources on the left, a flow of data from those varying sources through data processing, repositories, or virtualization, and then out to the end consumers of the data.
The legacy Netezza was a strong analytics database for structured data. It brought some of the first AI processing to business solutions through the INZA engine, as well as geospatial analysis with its ESRI engine. Under IBM, the Netezza solution expanded to include some script-based virtualization capabilities with Fluid Query and then broader data processing of Hadoop and unstructured data with IBM Db2 Big SQL. Many organizations found significant value in these core Netezza functions alone when building traditional data warehouses and datamarts and when expanding into Big Data use cases. In the end, the solution was closely tied to the data repository layer of an organization’s information ecosystem.
With Netezza Performance Server and Cloud Pak for Data System, organizations still have all of the legacy capabilities of Netezza, but now also have the expanded breadth of Cloud Pak for Data functionality, which gives them end-to-end coverage for most information and analytics use cases. Users are no longer tied to complex scripting or third-party tools to virtualize data across a wide range of sources, structured and unstructured. This brings functionality such as data virtualization closer to the business and allows for more rapid solution deployment and reduced administration of changes within IT. Also, with the AI functionality of Watson built into the solution, many organizations can step up the AI ladder while at the same time tackling complex organizational and data management challenges such as data governance and adherence to regulations. AI is also embedded within the core components of the Cloud Pak for Data System, giving users the ability to rapidly profile, analyze, model, and ingest new sources and align them with other organizational artifacts.
Mainline Customer Success Stories
As with any newer solution in the IT market, organizations want to know if all the marketing is reality. “Does this really work?”
Mainline has been working with numerous organizations across industry segments, including:
- Financial Services
- Higher Education
We have partnered with our customers to implement both targeted and broad-scoped use cases where Netezza Performance Server and Cloud Pak for Data System have given organizations a solution with capabilities far beyond the Netezza of old.
Use case 1: One organization offloaded a significant portion of its staging area to a NoSQL solution so that complex Natural Language Processing (NLP) could be done as a preprocessing step before the results are joined back with the Netezza core repository.
Use case 2: Another customer has focused on leveraging the Watson Studio components to start climbing the AI ladder and bring additional insights to a manufacturing line, mitigating waste that was visible only once multiple processing points were combined.
Now, organizations are building upon their initial success, expanding into more complex and higher business value areas. Before Netezza Performance Server and Cloud Pak for Data System, the challenges posed by these organizations’ legacy data and systems had proven insurmountable. Mainline is in step with our customers to ensure success and the highest return on investment.
Get more insights from your data: Request a Business Analytics Health Check.