Displaying top predictors and predictive strength


Once you have a new prediction displayed in Watson Analytics, you can click the View All option near the upper right to display charts that rank the top predictors and show their respective predictive strength values. The predictive strength value is displayed in parentheses after each predictor.

To see the statistical details behind a predictor, click its chart. From the Main Insight blade you can choose to show or hide statistical details.

More Predict Stories

Getting Started

Using Customer Behavior Data to Improve Customer Retention

We’ve uploaded some sample data sets in the IBM Watson Analytics community for you to work with as you learn more about Watson Analytics. This expert blog uses the Telco Customer Churn data set (WA_Fn-UseC_-Telco-Customer-Churn).

What’s in the Telco Customer Churn data set?

This data set provides info to help you predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs. A telecommunications company is concerned about the number of customers leaving their landline business for cable competitors. They need to understand who is leaving. Imagine that you’re an analyst at this company and you have to find out who is leaving and why.

The data set includes information about:
- Customers who left within the last month – the column is called Churn
- Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
- Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
- Demographic info about customers – gender, age range, and if they have partners and dependents

If you don’t have the data set…

Go to Download the Telco Customer Churn sample data file. In Watson Analytics, tap Add and upload Telco Customer Churn. The filename is a bit longer: WA_Fn-UseC_-Telco-Customer-Churn.csv. The data set appears as a tile in the Welcome page and you’re ready to get to work.

Which customers are likely to leave?

To find the answer to this question, tap the WA_Fn-UseC_-Telco-Customer-Churn tile and tap Prediction. You want to learn more about customers who’ve left the company in the past month – this is the target that you want to investigate. The data is in the column called Churn, which is the column we’ve already picked as the target for the prediction. Let’s find out which variables influence customers who leave.

Name the prediction and tap Create Prediction. Watson Analytics analyzes the data and generates visualizations to provide insights into this issue. The spiral shows you the top predictors, or key drivers, of churn in color; other drivers appear in gray. The closer the driver is to the center of the spiral, the stronger the predictive strength of the driver is. The key drivers are tenure, contract, and online security.

The visualizations to the right of the spiral show how one driver at a time drives churn. The blue or green dots in the upper right of the visualizations identify which driver is being shown. Tap tenure drives Churn. This new visualization shows that customers who have been customers for shorter periods are more likely to leave. Close this visualization by tapping the X in its upper right corner. You can look at the visualizations for the other drivers on your own.

Let’s move on and explore churn in more depth. To the left of the spiral are options for creating visualizations that show more than one driver at a time. Let’s go straight to the deeper and more predictive analysis of the data. Tap Combination. You get a new set of visualizations on the right, including a decision tree, that show the combination of variables that influence your target.

Let’s look at the combination of key drivers that influence whether customers leave. Tap the decision tree.

Let’s look at a word cloud about the key factors that influence churn. Tap Predictor Importance. Contract, Internet Service, Tenure, and Total Charges are the most important factors.

Let’s get some more details on who is leaving so we can predict who is likely to leave in the future. Tap Top Decision Rules. The rules are specific and detailed, and are sorted by accuracy. They currently focus on customers who do not leave. We need to change that. Change the No to Yes. A clearer view emerges.

Customers who leave tend to be ones who are on a month-to-month contract, have fiber optic internet service, and have been customers for shorter periods. You can now predict which customers are at risk to churn. Use the decision rules to identify customers who fit the churn profile so you can proactively offer them an incentive to stay.
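
Outside Watson Analytics, the churn profile described above could be applied to a customer list with a few lines of Python. This is only a rough sketch, not how Watson Analytics evaluates its decision rules. The column names (customerID, tenure, Contract, InternetService), the sample rows, and the 12-month cut-off for "shorter periods" are all assumptions made for illustration; adjust them to match your own file.

```python
import csv
import io

# Hypothetical sample rows. The column names are assumptions about the
# CSV layout and should be checked against the real file.
SAMPLE_CSV = """customerID,tenure,Contract,InternetService
A-001,2,Month-to-month,Fiber optic
A-002,48,Two year,DSL
A-003,5,Month-to-month,DSL
A-004,70,Month-to-month,Fiber optic
A-005,1,Month-to-month,Fiber optic
"""

# Assumed cut-off for "shorter periods"; the post does not give a number.
MAX_TENURE_MONTHS = 12


def fits_churn_profile(row, max_tenure=MAX_TENURE_MONTHS):
    """Return True when a customer matches all three parts of the profile."""
    return (
        row["Contract"] == "Month-to-month"
        and row["InternetService"] == "Fiber optic"
        and int(row["tenure"]) <= max_tenure
    )


def at_risk_customers(csv_text, max_tenure=MAX_TENURE_MONTHS):
    """Return the IDs of customers who fit the churn profile."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["customerID"]
        for row in reader
        if fits_churn_profile(row, max_tenure)
    ]


print(at_risk_customers(SAMPLE_CSV))
```

To run this against a real export, read the file contents into a string and pass it to at_risk_customers in place of the sample text.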

Getting Started

Quality In, Quality Out

When you add a data set, IBM Watson Analytics reads the data and assesses it for data quality. The data quality score measures the degree to which the data is suitable for predictive analysis. Data sets with low quality scores may be suitable for data exploration even if they are not suitable for predictive analysis. The overall score is an average of the data quality score for every field in the data set, as determined by missing and constant values, influential categories, outliers, imbalance and skewness.

In this example from SportsData_NFL_2014_REG_PST_players.csv (which is available here), Watson Analytics excludes fields with more than 25% missing values and fields with constant values.

You access the Data Quality Report from a prediction, using the menu in the upper-left corner. The Data Quality Report highlights areas where you could optimize your source data. Adding more rows and columns to the data often improves the quality of the data. The more data that Watson Analytics has available to choose from, the more accurate its results are.

Note that you can choose to include a field that Watson Analytics has excluded; for example, you may want to use a field that has more than 25% missing values because you know this field is important to your analysis. In this case, use the Predict Menu to select Field Properties, change the role of the field to input or target, and regenerate your prediction. This action may affect the quality of your prediction.

How to influence data quality?

Do your best to clean your data before you add it into Watson Analytics. List files work best.

Some of the typical issues with data sets can be resolved by:
- Removing blank rows from your data file
- Removing summary rows and columns from your data file
- Eliminating column headings and row headings that appear in the same cell
- Avoiding look up tables
- Avoiding subtotals and aggregations

More tips for cleaning your data before uploading to Watson Analytics:
- Watson Analytics assumes that the first row of your file contains headers; descriptive column headers are preferred.
- You must have a header for every column. The number of columns in the header row is assumed by Watson Analytics to be the number of columns of data. For example, if the first six columns have headers but there are eight columns of data, the last two columns of data are ignored.
- You cannot have empty columns inserted before the data.
- You can have empty rows above the data. Empty rows preceding the data are ignored.
- You cannot have textual rows above the header row. For example, if you have a title or description of what the data is about above the header row, the file is not read appropriately.
- You cannot have textual rows following the data. For example, a row following the data that says “This information came from…” is considered to be part of the data.

More details are in this helpful document: Introduction to Data Loading and Data Quality, including specific conditions that apply to MS Excel and CSV files.
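
Before uploading, you may want a rough idea of which fields would fall under the exclusion rule mentioned above (more than 25% missing values, or a constant value). The Python sketch below is one way to check this yourself; it is not the Watson Analytics data quality algorithm and does not compute a quality score. The sample data, the field names, and the choice to treat empty cells as missing are assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample; empty cells are treated as missing values.
SAMPLE_CSV = """player,team,rush_td,injury_note,season
Smith,DEN,4,,2014
Jones,DEN,7,,2014
Brown,SEA,,knee,2014
Davis,SEA,2,,2014
"""

# From the post: fields with more than 25% missing values are excluded.
MISSING_LIMIT = 0.25


def excluded_fields(csv_text, missing_limit=MISSING_LIMIT):
    """Map each field that meets an exclusion condition to the reason."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    reasons = {}
    if not rows:
        return reasons
    for field in rows[0].keys():
        values = [(row[field] or "").strip() for row in rows]
        present = [value for value in values if value]
        missing_share = 1 - len(present) / len(values)
        if missing_share > missing_limit:
            reasons[field] = "missing values"
        elif len(set(present)) <= 1:
            reasons[field] = "constant value"
    return reasons


print(excluded_fields(SAMPLE_CSV))
```

In the sample, rush_td has exactly 25% missing values, so it is not flagged; the rule as stated applies only above that share.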


Get ready for the 2015 fantasy football season

Are you missing the excitement of the NFL during the offseason? With Watson Analytics, you don't have to. Thanks to our partners at SportsData LLC, you can use a sample of the National Football League (NFL) regular season offensive statistics to prepare for this season's fantasy football. From the Watson Analytics Welcome page, use the data file SportsDataLLC NFL 2014 Offensive Stats.csv to explore your favorite NFL team or player, assemble compelling dashboards, or discover top predictive drivers of key statistics like rushing touchdowns. Then, when the fantasy football season begins, you'll be able to impress your league with your picks because you'll have new insights into which players, based on their performance in 2014, are likely to make your fantasy team a contender for the championship. If you need additional ideas, this helpful video walks you through the powerful capabilities of Watson Analytics. After playing with the NFL football data, learn how Watson Analytics can help analyze your business data. Whether you want to analyze marketing campaigns, protect your customer base, retain key employees, or simply get help analyzing your data, Watson Analytics can help you get started.