Overview

Apache® Spark™ is an open-source cluster computing framework optimized for fast, large-scale data processing. Originally developed at the AMPLab at UC Berkeley, Apache Spark can help reduce the complexity of working with data, increase processing speed, and add deeper analytics to mission-critical applications.

Benefits

Innovate Faster

Apache Spark can run certain workloads up to 100x faster than Apache Hadoop MapReduce because its in-memory computing engine keeps working data in memory rather than writing it to disk between steps.

Easy to Use and Powerful

Apache Spark's Streaming and SQL programming models, backed by the MLlib and GraphX libraries, make it easier to build applications that use machine learning and graph analytics.

Open Technology

The OpenPOWER Foundation enables GPU, CAPI Flash, RDMA, and FPGA acceleration, as well as machine learning innovation, to optimize performance for Apache Spark workloads.

Security and privacy in the cloud

When using IBM Cloud offerings, your company can scale and adapt quickly to changing business needs without compromising security or privacy, and without taking on additional risk. Learn more about IBM Cloud security.

This offering meets the following industry and global compliance standards, depending on the edition you choose:

  • EU-US and Swiss-US Privacy Shield Frameworks
  • ISO 27001
  • ISO 27017
  • ISO 27018
  • SOC 2 Type 1 (SSAE 16)

To learn about the compliance and certifications for a specific offering edition, consult the Cloud Services data security and privacy data sheets.