This is an exciting time for all of us, both at IBM and across the technology industry. Apache Spark is one of the driving forces behind a revolution in how the industry works with data. Spark represents a new way of thinking about data, infusing insight into more and more applications. Apache Spark is an open-source, in-memory distributed compute engine that can run iterative analysis on large-scale data up to 100 times faster than current technologies in the market today. An analytics operating system that anyone can use, Apache Spark is opening the market to a completely new group of smart applications, transforming industries and professions. It’s a technology that can be used by a variety of professionals, from data scientists to data and application developers. The true value of Spark is that it enables more people to collaborate to access data, apply analytics and deploy deep intelligence into every application, including IoT, web, mobile, social, business process and more.
With Spark, unlocking the potential of data remains fun, fresh and exciting. Today, IBM announced that, with IBM Open Platform with Apache Hadoop, it is bringing Spark along with the Open Data Platform to IBM’s industry-leading systems portfolio, including IBM Power Systems, the only server platform designed from the ground up to handle the demands of big data.
Bringing Spark to IBM Power Systems will enable clients to infuse insight into business applications in a more efficient and agile way. POWER8 is the industry leader in system bandwidth, multithreading, and caching, with capabilities that are four times what x86 platforms typically offer. Those capabilities will dramatically improve the ability of Apache Spark to process data and iterate quickly. Insight is only as valuable as the timeliness of the action it leads to, so the more we can shorten the time between insight and action, the greater the impact of the actions we take. With Spark on Power Systems, we can shrink the latency between data and the point of interaction while reducing infrastructure costs and complexity. Our internal testing shows that a wide range of Spark workloads benefit from POWER8's memory bandwidth, thread density and cache leadership. Compared to traditional x86 servers, we are seeing more than 2X the performance across a suite of Spark machine learning, SQL, graph computation, and streaming applications. For Spark SQL query workloads, we see more than a 3X improvement. These workloads consume common real-world big data sets drawn from search engines, social networks, social media streams, e-commerce and multimedia analytics.
Power Systems offer leading performance for many data sources, from enterprise data warehouses based on traditional RDBMS to emerging open software solutions like NoSQL, MariaDB, and Apache Hadoop. With this large selection of data environments, Power Systems align to a variety of specific business needs and deployment environments, enabling clients to focus more on data science and less on infrastructure. Because POWER is an open platform, deploying Spark workloads on it shortens the path for businesses looking to gain insight from overwhelming amounts of data. Furthermore, building on the advantages of a full portfolio of open source technologies, including the OpenPOWER Foundation, OpenStack and Linux on Power communities, IBM Power Systems is at the center of a community of global technology leaders. Data scientists, app developers and data engineers benefit from that community while working within a robust and unified technology stack.
Machine learning is a major topic in the open ecosystem conversation. POWER8 offers unique opportunities for developers to significantly accelerate the processing of big data for these kinds of applications, using technologies like the Coherent Accelerator Processor Interface (CAPI) or GPU attach to increase throughput for machine learning. Developed in concert with OpenPOWER partners like Altera and NVIDIA, these accelerators can perform many calculations in parallel, and will be ideal for cutting deep learning processes that could take years down to days. The value of an open accelerator approach has already been validated by the US Department of Energy, which intends to build two supercomputers with POWER processors from IBM, GPUs and interconnect technology from NVIDIA, and high-speed networking from Mellanox.
With IBM’s deep investment in Spark and the Open Data Platform, we are quickly optimizing the Power portfolio for a wide range of applications and uses across a variety of industries. The already screaming-fast performance of IBM Power Systems becomes even more differentiated when clients can run an open-source in-memory engine like Apache Spark. There are massive amounts of data within our grasp, and we have it in our power to harness a great explosion of insight. If IBM Power Systems is the powder keg, a Spark will help clients unlock the seemingly infinite potential of structured and unstructured data for valuable business results.
For more information on Big Data and Analytics on IBM Power Systems, please visit http://www-03.ibm.com/systems/power/solutions/bigdata-analytics/index.html
And, be sure to join Power Systems on October 5th for a webcast highlighting new capabilities and product announcements that will help you go faster than ever before! http://bit.ly/1OcrNru