Big Data

Introducing a Universal Translator for Big Data and Machine Learning


Anybody who travels to a foreign country or reads a book or newspaper written in a language they don’t speak understands the value of a good translation. Yet, in the realm of Big Data, application developers face huge challenges when combining information from different sources and when deploying data-heavy applications to different types of computers. What they need is a good translator.

That’s why IBM has donated SystemML to the open source community. SystemML is a universal translator for Big Data and the machine learning algorithms that are becoming essential to processing it. It enables developers who don’t have expertise in machine learning to embed it in their applications once and use it in industry-specific scenarios on a wide variety of computing platforms, from mainframes to smartphones.

Today, we’re announcing that the Apache Software Foundation, one of the leading open source organizations in the world, has accepted SystemML as an official Apache Incubator project, giving it the name Apache SystemML.

We open sourced SystemML in June when we threw our weight behind the Apache Spark project—which enables developers and data scientists to more easily integrate Big Data analytics into applications.

We believe that Apache Spark is the most important new open source project in a decade. We’re embedding Spark into our Analytics and Commerce platforms, offering Spark as a service on IBM Cloud, and putting more than 3,500 IBM researchers and developers to work on Spark-related projects.

Apache SystemML is an essential element of the Spark ecosystem of technologies. Think of Spark as the analytics operating system for any application that taps into huge volumes of streaming data. MLlib, the machine learning library for Spark, provides developers with a rich set of machine learning algorithms. And SystemML enables developers to translate those algorithms so they can easily digest different kinds of data and run on different kinds of computers.

SystemML allows a developer to write a single machine learning algorithm and automatically scale it up using Spark or Hadoop, another popular open source data analytics tool, saving highly skilled developers significant time. While other tech companies have open sourced machine learning technologies as well, most of those are specialized tools to train neural networks. They are important, but niche, and the ability to ease the use of machine learning within Spark or Hadoop will be critical for machine learning to really become ubiquitous in the long run.
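To make the "write once, scale anywhere" idea concrete, here is a minimal sketch in plain Python with NumPy. It is not SystemML code and does not use SystemML's language or APIs; the class and function names (`LocalBackend`, `BlockedBackend`, `linear_regression`, `make_data`) are hypothetical and exist only for illustration. The algorithm is written a single time in terms of matrix operations, and the choice of how those operations are executed is left to a separate backend.

```python
import numpy as np


class LocalBackend:
    """Hypothetical backend: works on one in-memory matrix."""

    def gram(self, X):
        return X.T @ X

    def xty(self, X, y):
        return X.T @ y


class BlockedBackend:
    """Hypothetical backend: splits rows into blocks and sums the
    partial results, loosely mimicking how a cluster engine could
    combine per-partition work."""

    def __init__(self, n_blocks=4):
        self.n_blocks = n_blocks

    def gram(self, X):
        blocks = np.array_split(X, self.n_blocks)
        return sum(b.T @ b for b in blocks)

    def xty(self, X, y):
        x_blocks = np.array_split(X, self.n_blocks)
        y_blocks = np.array_split(y, self.n_blocks)
        return sum(xb.T @ yb for xb, yb in zip(x_blocks, y_blocks))


def linear_regression(X, y, backend):
    """The algorithm itself, written once: solve (X^T X) w = X^T y.
    It does not know or care how the backend computes the products."""
    A = backend.gram(X)
    b = backend.xty(X, y)
    return np.linalg.solve(A, b)


def make_data(n_rows=200, seed=0):
    """Noise-free synthetic data with known weights (illustrative only)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_rows, 3))
    true_w = np.array([2.0, -1.0, 0.5])
    return X, X @ true_w, true_w


if __name__ == "__main__":
    X, y, true_w = make_data()
    print(linear_regression(X, y, LocalBackend()))
    print(linear_regression(X, y, BlockedBackend(n_blocks=5)))
```

Both calls run the same `linear_regression` function and should recover the same weights; only the execution strategy differs. A real system would make that choice automatically rather than asking the developer to pass a backend by hand.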

In the coming years, all businesses and, indeed, society in general, will come to rely on computing systems that learn—what we call cognitive systems. This kind of computer learning is critical because the flood of Big Data makes it impossible for organizations to manually train and program computers to handle complex situations and problems—especially as they morph over time. Computing systems must learn from their interactions with data.

The Apache SystemML project has achieved a number of early milestones to date, including:

–Over 320 patches including APIs, Data Ingestion, Optimizations, Language and Runtime Operators, Additional Algorithms, Testing, and Documentation.

–90+ contributions to the Apache Spark project, both to its machine learning capabilities and to various other components, from more than 25 engineers at the IBM Spark Technology Center in San Francisco, to make machine learning accessible to the fastest-growing community of data science professionals.

–More than 15 contributors from a number of organizations working to enhance the capabilities of the core SystemML engine.

One of the Apache SystemML committers, D.B. Tsai, had this to say about it: “SystemML not only scales for big data analytics with high performance optimizer technology, but also empowers users to write customized machine learning algorithms using simple domain specific language without learning complicated distributed programming. It is a great extensible complement framework of Spark MLlib. I’m looking forward to seeing this become part of Apache Spark ecosystem.”

We are excited too. We believe that open source software will be an essential element of big data analytics and cognitive computing, just as it has been critical to the advances that have come in the Internet and cloud computing. The more tech companies and developers share resources and combine efforts, the faster information technology will transform business and society.

General Manager, IBM Analytics

