Machine Learning, Deep Learning 101
Getting started on IBM Power Systems
The relevance of analytics in today’s world
Raw data in its unprocessed state does not offer much value, but with the right analytics techniques it can yield rich insights that aid many aspects of life, such as making business decisions, running political campaigns, and advancing medical science.
As shown in Figure 1, the analytics cycle can be broadly classified into four categories or phases: descriptive, diagnostic, predictive and prescriptive. Machine Learning is an approach to data analysis that automates analytical model building and is used in all four types of analytics.
Four types of analytics
- Descriptive analytics: This type determines what is happening based on existing data.
- Diagnostic analytics: This type goes one step further to determine why a specific situation happened.
- Predictive analytics: This type looks across a broader set of data, perhaps over a longer period of time, to identify trends and patterns, and then uses that historical information to predict future occurrences.
- Prescriptive analytics: This type goes beyond prediction to provide suggestions on how to best change future situations to meet your goals.
Figure 1. The four phases of the analytics cycle
The relevance and growing use of analytics based on Machine Learning can be seen in its widespread use in the 2016 US presidential election campaign. Unprecedented growth in the availability of useful information, coupled with advancements in technology, is making it attractive to use analytics to build and run a better campaign. Campaign teams analyze voter sentiment, population segmentation, and historic voting patterns, and use this information to decide which states and voter profiles to focus their efforts on in order to ensure maximum turnout. Machine Learning is at the core of what makes this possible.
With this new trend, the real asset in any political campaign has rapidly changed from funds to voter data, which is collected from pollsters, fundraisers, field workers, consumer databases, and private companies, as well as through cookies and tracker programs on campaign websites and social media apps. Applying Machine Learning algorithms to this colossal repository of voter data has changed the landscape of election campaigns by delivering action-oriented insights: predictions for each individual voter. These insights are used by the campaigns to strategize ways to raise funds, better target advertisements, and create detailed models of swing-state voters.
Machine Learning has the potential to increase the effectiveness of campaign efforts by calculating the likelihood that a voter will show up at the polls, the likelihood that a supporter who has not voted consistently can be motivated to go to the polls this time, and how persuadable someone is by the various means of campaign contact. As a result, Machine Learning enables campaigns to be more metrics-driven.
Machine Learning overview
Machine Learning algorithms iteratively learn from data, thus allowing computers to find hidden insights without being explicitly programmed where to look. Machine Learning is essentially teaching the computer to solve problems by creating algorithms that learn by looking at hundreds or thousands of examples, and then using that experience to solve the same problem in new situations. Machine Learning tasks are typically classified into the following three broad categories, depending on the nature of the learning signal or feedback available to a learning system:
- Supervised learning: The algorithm trains on labeled historic data and learns general rules that map input to output (the target). For example, based on historic voter data (voter details labeled with the votes cast in previous years), presidential campaigns can predict which kinds of voters are likely to vote for a given candidate, or which kinds of voters are persuadable by campaign efforts, and use this information to better plan resource utilization.
In supervised learning, the discovery of relationships between the input variables (for example, the voter details such as age and income) and the label/target variable (for example, the vote cast by a particular voter in the last election) is done with a training set. The computer/machine learns from the training data.
A test set is used to evaluate whether the discovered relationships hold. The strength and utility of the predictive relationship are assessed by feeding the model the input variables of the test data and comparing the label predicted by the model with the actual label of the data.
Deciding on the proportional split between train data and test data is often considered tricky. A greater proportion of test data gives a more reliable validation of model performance, but it leaves less training data for the model to learn from. Opinions on a good split generally range from a 60:40 to an 80:20 ratio of train data to test data.
- Unsupervised learning: The algorithm trains on unlabeled data. The goal of these algorithms is to explore the data and find some structure within. For example, using these algorithms, the presidential campaigns can identify segments of voters with similar attributes who can then be treated similarly in the campaign by customizing campaign efforts for each group. The most widely used unsupervised learning algorithms are Cluster Analysis and Market Basket Analysis.
- Reinforcement learning: The algorithm learns through a feedback system. The algorithm takes actions and receives feedback about the appropriateness of those actions. Based on the feedback, it modifies its strategy and takes further actions that would maximize the expected reward over a given amount of time. Reinforcement learning is most widely used in self-driving cars, drones, and other robotics applications.
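The supervised learning workflow described above (split labeled data into train and test sets, learn from the train set, then compare predicted and actual labels on the test set) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a recommended production approach: the voter records are synthetic and made up for the example, the labels "A" and "B" are hypothetical, and the nearest-centroid classifier is simply one of the easiest models to write without a library. The helper names (train_test_split, fit_centroids, predict, accuracy) are introduced here for illustration only.

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_centroids(train):
    """Learn one mean feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the actual label."""
    hits = sum(1 for features, label in rows if predict(centroids, features) == label)
    return hits / len(rows)


# Synthetic, made-up records: [age, income in thousands] labeled with a
# hypothetical past vote ("A" or "B"). Not real voter data.
rng = random.Random(42)
data = []
for _ in range(100):
    data.append(([rng.gauss(30, 5), rng.gauss(40, 8)], "A"))
    data.append(([rng.gauss(55, 5), rng.gauss(80, 8)], "B"))

train, test = train_test_split(data, train_fraction=0.8)  # 80:20 split
model = fit_centroids(train)                              # learn from train data only
print(f"train={len(train)} test={len(test)} accuracy={accuracy(model, test):.2f}")
```

Changing train_fraction to 0.6 gives the 60:40 end of the commonly suggested range. The model never sees the test rows during fitting, which is what makes the test accuracy a fair check of whether the learned relationship holds on new data.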
Deep Learning overview
Deep Learning is a special type of Machine Learning that involves a deeper level of automation. One of the great challenges of Machine Learning is feature extraction, where the programmer needs to tell the algorithm what kinds of things it should look for in order to make a decision; just feeding the algorithm raw data is rarely effective. Feature extraction places a huge burden on the programmer, especially in complex problems such as object recognition, and the algorithm's effectiveness relies heavily on the programmer's skill. Deep Learning models address this problem because they are capable of learning to focus on the right features by themselves and require little guidance from the programmer, which in some tasks makes the analysis better than what humans can do. Deep Learning models have been very effective in complex tasks, such as sentiment analysis and computer vision. However, because they learn data abstractions and representations through a deep hierarchy of layers, from lower-level layers to higher-level layers, Deep Learning algorithms learn slowly and are often prohibitively compute-intensive.
Getting started with Machine Learning on IBM Power Systems
Using Machine Learning requires a variety of technical and engineering skills. Making use of Machine Learning at your company will likely require a team of experts possessing the knowledge and skills in different aspects of data and analytics. The skills range from understanding and having access to the data to be used, knowing how to use data cleansing tools, understanding Machine Learning concepts and algorithms, having experience with analytics tools, programming applications, and setting up the necessary hardware and software to implement and deploy the Machine Learning processing environment.
Here is a view of the common steps for using Machine Learning:
- Get a head start at running Machine Learning workloads by setting up and configuring the IBM Power Systems server.
- Determine the business problem to solve.
- Identify and collect the data to be used, preprocess the data to cleanse and transform it into a usable state, and split it into train data and test data if using a Supervised Learning algorithm.
- Determine the Machine Learning algorithm to use. The algorithm is determined based on the business question that needs to be answered. For example, a neural network can be used for predictive analytics, and cluster analysis can be used for customer segmentation. The best algorithm to use also depends on the state of the data available. For example, if the data has a number of missing values, a decision tree may be the preferred algorithm because decision trees can deal with missing values better.
- Select the analytics tools and install them on IBM Power Systems. Each analytics tool supports one or more programming languages, so the programming language used to build the model often depends on the tool selected. Example tools include SPSS, SAS, open source libraries, and Spark MLlib. Language examples are R, Java™, and Python.
- Write the code to build one or more models using your Machine Learning algorithm of choice and train the models with the train data. If you have built a Supervised Learning model, test it with the test data and make any necessary configuration tunings to achieve greater accuracy. If you have multiple models, select the best one based on its performance on the test data.
- Run the model on IBM Power Systems.
- Use Apache Spark to increase performance by running the model in a distributed mode or across a cluster of IBM POWER8® processor-based server nodes.
- Visualize the results in an analytics tool, such as IBM Cognos® Business Intelligence on IBM Power Systems.
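The algorithm-selection step above names cluster analysis as a fit for customer segmentation. As a rough sketch of what that can look like, the following plain-Python example runs k-means (Lloyd's algorithm), one common cluster analysis technique, on unlabeled records. The customer data is synthetic and made up for the example, the feature choice (age and yearly spend) is an assumption, and the function name kmeans is introduced here for illustration; an analytics tool such as those listed above would normally provide its own implementation.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for idx, p in enumerate(points):
            assignments[idx] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments


# Synthetic, made-up customers: [age, yearly spend in thousands]. No labels.
rng = random.Random(7)
customers = [[rng.gauss(25, 3), rng.gauss(20, 4)] for _ in range(30)]
customers += [[rng.gauss(60, 3), rng.gauss(70, 4)] for _ in range(30)]

centers, labels = kmeans(customers, k=2)
for c, center in enumerate(centers):
    rounded = [round(v, 1) for v in center]
    print(f"segment {c}: size={labels.count(c)} center={rounded}")
```

Because no labels are involved, there is no train and test split here; the output is a segment number per customer, which could then feed the visualization step.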
Why should you use IBM Power Systems for Machine Learning applications?
There are many benefits of running Machine Learning and Deep Learning workloads on IBM Power Systems. These workloads can be floating-point compute-intensive and require a lot of memory and I/O bandwidth, and thus can take advantage of graphics processing unit (GPU) acceleration for increased performance. The larger caches of IBM Power servers, along with their ability to push data to the numerical coprocessor or GPU, make them suitable for running these workloads. The Deep Learning frameworks are also available as prebuilt open source packages that install easily on POWER8 processor-based servers with GPUs. Because IBM Power® platforms are able to converge Big Data and Deep Learning on the same platform, these workloads can run directly alongside the Big Data / Hadoop infrastructure on an IBM Power server without using extract, transform, and load (ETL).
Machine Learning related analytics tools available on IBM Power Systems
There are several options for using Machine Learning and Deep Learning on IBM Power Systems.
IBM SPSS on POWER / Linux or AIX
IBM SPSS® supports several Machine Learning algorithms. Use SPSS Modeler to create and test the model along with SPSS Collaboration and Deployment Services to run the model. Refer to the blog that describes how to run and tune SPSS Modeler on IBM POWER8 processor-based servers to achieve superior performance.
SAS on POWER / AIX
SAS supports a broad analytics tooling portfolio on IBM Power Systems. Enterprise Miner supports building and testing models with several Machine Learning algorithms.
For more information, refer to http://www.sas.com
Open source Machine Learning and Deep Learning libraries available on POWER / Linux
Many open source Machine Learning libraries have become popular. Several open source Machine Learning and Deep Learning libraries are available to run on IBM Power Systems including Caffe, Torch, and Theano, and others are coming in the future.
- Read Michael Gschwind's blog to learn more about how to get started with open source Machine Learning and Deep Learning on Power Systems.
- Read Michael Gschwind's blog to learn more about the new OpenPOWER software distribution for Deep Learning.
- Refer to the OpenPOWER Deep Learning data sheet which documents the Machine Learning and Deep Learning Power Systems reference architecture.
Apache Spark MLlib on POWER / Linux
Apache Spark is a distributed processing environment. One of the key components of Spark is MLlib, which is a Machine Learning library. The library can be used from Spark's supported programming languages: Java, Scala, Python, and R (through SparkR). MLlib supports dozens of algorithms and utilities, which can be found in the Spark MLlib guide.
Read the blog by Raj Krishnamurthy and Randy Swanberg about how Apache Spark Runs 2X Faster on IBM POWER8.
The blog summarizes three key Machine Learning workloads (Logistic Regression, Support Vector Machine, and Matrix Factorization) and the recommended configuration to achieve superior performance on IBM POWER8 over x86 processors.
Join IBM Data Science Experience to interact and collaborate with other data scientists as you get started using Machine Learning and Deep Learning on IBM Power Systems.
Last month OpenPOWER announced a hackathon called OpenPOWER Developer's Challenge which is open for submissions through September 1. One of the tracks is on Deep Learning with Apache Spark on OpenPOWER servers (The Accelerated Spark Rally). This is a great chance to try out Deep Learning! To participate, go here for more information and to register.
There is more going on at IBM with Machine Learning. Here are two additional resources: