Blog

What's happening? What's new? What can I do? Find answers to these questions in the blog.


Blog

Using Customer Behavior Data to Improve Customer Retention

Telco Customer Dataset

This demo uses the sample data within Watson Analytics. Please use the sample dataset.

What’s in the Protect Your Customer data set?

This data set provides info to help you predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs. A telecommunications company is concerned about revenue and the number of customers leaving their landline business for cable competitors. They need to understand who is leaving. Imagine that we are analysts at this company and we have to find out who is leaving and why.

The data set includes information about:

Customers who left within the last month: this column is called Churn
Services that each customer has signed up for: phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
Customer account information: how long they’ve been a customer, contract type, payment method, paperless billing, monthly charges, and total charges
Demographic info about customers: gender, age range, and if they have partners and dependents

Getting the data

Under the Data tab in Watson Analytics, tap the + New Data button. Tap Import > Sample Data, then select and import the Protect Your Customer dataset. The data set appears as a tile under the Personal folder within the Data tab and you’re ready to get to work. It may take a couple of seconds while Watson Analytics analyzes the data to support your work with this dataset.

Which customers have high value?

To find the answer to this question, tap the Protect Your Customers CSV data set tile. You want to know where the revenue comes from and what you want to protect. To better understand the business, you may want to look at total charges by internet service type by asking “What is the average TotalCharges by InternetService?” Select the first tile, as it best represents the line of inquiry.
Note that the image on the tile indicates you should expect a bar chart for this comparison. Looking at the results, we see that Fiber Optic is clearly the internet service that brings in the bulk of the revenue.

Next, you want to find out about the total charges by contract type. Press the Plus button to add another tab to your discovery set. Enter the question “What are the average TotalCharges by contract?” and select the first suggestion (tile) from Watson Analytics. The result shows that two-year contracts generate the most income, whereas month-to-month contracts generate the least. Typically, I would have thought average charges would be lower with longer contracts, so this is a little surprising. Clearly we want to protect customers with Fiber Optic and longer service contracts.

Let's add to the discovery set again with the plus button and find out how long customers stay with the service for each contract type by asking “What is the average tenure by contract type?” Again the first suggestion from Watson Analytics is exactly the line of inquiry we want to explore, so we will select the first tile. Reviewing the results, we see that customers on month-to-month contracts stay with the service for 18 months on average, whereas customers on one-year and two-year contracts stay for 42 and 56 months on average, respectively. You can hover over the bars of the chart to get the actual numbers. Month-to-month customers are not leaving immediately, but we should be thinking about how we can move these customers into longer-term contracts.

What drives customer tenure and churn?

In thinking this through, we want to nail down the factors that drive customer tenure. Let's add to the discovery set and ask “What drives Tenure?”
The first suggestion from Watson Analytics brings you to a spiral diagram, which highlights TotalCharges and InternetService as the key factors for Tenure with a predictive strength of 91%. Looking at the relationships further down the list, I see that churn also affects tenure. This makes a lot of sense.

Let’s see what drives churn by adding to the discovery set and asking “What drives Churn?” This time we will look at the second tile, as it shows a decision tree for Churn. By scrolling down the decision tree and hovering over each of the tree nodes, we can see that customers with a month-to-month contract, less than six months of tenure, and Fiber Optic service churn 75% of the time. This rate is very high and we need to understand it better. Perhaps the service is weaker than what our competition provides and these new customers see the difference. In any case, we need to speak to customer services and our hardware team about this finding, as it directly impacts Fiber Optic revenue, which is key to our business.

Again, you can watch the narrated video for this use case here.
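The questions asked in this walkthrough ("average TotalCharges by InternetService", "average tenure by contract type") are group averages, and it can help to see the same calculation outside the tool. The following is a minimal sketch in plain Python, not how Watson Analytics computes its charts. The rows are invented for illustration, and the field names "Contract" and "tenure" are assumed spellings of the data set's columns.

```python
from collections import defaultdict
from statistics import mean

# Illustrative rows only: the values are invented, and the field names
# "Contract" and "tenure" are assumed spellings of the data set's columns.
customers = [
    {"Contract": "Month-to-month", "InternetService": "Fiber optic", "tenure": 4, "TotalCharges": 300.0},
    {"Contract": "Month-to-month", "InternetService": "DSL", "tenure": 20, "TotalCharges": 900.0},
    {"Contract": "One year", "InternetService": "DSL", "tenure": 40, "TotalCharges": 2400.0},
    {"Contract": "Two year", "InternetService": "Fiber optic", "tenure": 60, "TotalCharges": 6000.0},
    {"Contract": "Two year", "InternetService": "DSL", "tenure": 50, "TotalCharges": 3000.0},
]


def average_by(rows, group_field, value_field):
    """Return a dict mapping each group to the mean of value_field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row[value_field])
    return {group: mean(values) for group, values in groups.items()}


# "What is the average TotalCharges by InternetService?"
avg_charges_by_service = average_by(customers, "InternetService", "TotalCharges")

# "What is the average tenure by contract type?"
avg_tenure_by_contract = average_by(customers, "Contract", "tenure")

for contract, months in avg_tenure_by_contract.items():
    print(f"{contract}: {months:.1f} months")
```

To run this against the real sample file, the hard-coded list could be replaced with rows read through `csv.DictReader`, converting the numeric fields to numbers first.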

Blog

Exploring Banking Loss Event data with Watson Analytics

Download the Dataset

This IBM Watson Analytics use case shows you how you can analyze loss event data from IBM OpenPages GRC using the updated Watson Analytics user experience. (If you haven’t already signed up for Watson Analytics, you can do so here for free.)

For the purposes of this use case, we are working on the risk team of a financial services enterprise and we need to review and analyze 7 years of loss events recorded in OpenPages, which can be downloaded from here.

After we log in, we see three main tabs and are positioned within the Data tab. The Discover tab is where you explore and discover the data you have in Watson Analytics. The Display tab takes the discoveries and assembles them into rich stories, dashboards and infographics to share. It all begins with your data, so the first thing to do is import the spreadsheet into Watson Analytics.

1. Tap the New Data button.
2. Tap the Local file tab, then tap the Browse button to select the spreadsheet you downloaded.
3. Tap the Import button.

After importing the spreadsheet, we can directly access in Watson Analytics the Excel file that was exported from the Loss Events pages in OpenPages. We can analyze the data in five steps.

Step 1. Discover your Data

When we click the tile created by the import, Watson Analytics immediately takes us into the “Discovery” functionality, which provides a set of suggested questions or starting points that we can use right away. We could also type our own question here. If a question in the tiles makes sense, we can simply click on the tile to get the result. Note that each tile has a graphic showing what to expect in the result.

Let’s start by looking into the trend of the net loss by year. Watson Analytics presents a list of possible interpretations of what we wanted. As it turns out, the first tile identifies exactly what we want, so let’s select this question.
When I look at the figures, it appears that the budget safeguard we put in place in early 2014 worked as expected after that big loss in 2013. Next, we should check the trend of net loss by region by dragging the “Region” field from the data tray directly onto the data visualization. Now we see the safeguard also worked as expected for all the countries: a great result.

Step 2. Be open-minded and take suggestions from Watson Analytics

Watson Analytics provides suggested lines of inquiry on the right, based on interesting data distributions it finds adjacent to our current analysis. These suggestions change as we change our line of inquiry. Net loss by business looks very relevant, so let's evaluate this discovery by clicking on its tile. When we do that, we learn that our Corporate Finance and Retail Banking businesses account for close to half of the company's net loss. Yikes! Let's tweak the visualization to use a treemap by clicking the “Visualization” icon on the left and then selecting the Treemap visualization. We now see an interesting view of net loss by business.

Step 3. Use Predict to review a model with net loss as the target

Watson Analytics discovery capabilities also apply to predictive analytics. Next, we ask the question: “What drives Net Loss?” Watson Analytics creates a spiral diagram where the factors most likely to correlate with the outcome (called “predictors” or “drivers”) appear closest to the center. Here, business unit and risk sub-category are the top predictors with a predictive strength of 75%. By clicking the icon next to an item in the list of drivers, we are able to zoom into the details of this model. We can see that the top issue with Net Loss is the relationship with Vendors or Suppliers in our Corporate Finance business. Mouse over the cells to get details.

Let’s rename the tabs for our Discovery Set and then save it:

1. Click on a tab name and then click the “pencil” icon to edit the tab name.
2. Click on the disk icon on the top right and provide a name for the Discovery Set.
3. Close the Discovery Set using the drop menu.

Step 4. Assemble the data within a Display

We can quickly put these findings into an interesting Display such as a dashboard or infographic to share with others. Click on the Display tab and then click “+ New display”. Select the Dashboard option and then select the four quadrant display template. On the left, we can locate the previously saved Discovery Set in the personal folder, assuming we saved it in the default location. Expand your Discovery Set and select a visualization. Drag or click the four visualizations from the Discovery Set onto each of the quadrants of the template (the blue box will glow to show you when to drop) and save the Display using the disk icon as you did before.

Step 5. Share the new insights!

These findings are significant and we will want to share them with the VP of Risk. We could use shared folders if the VP is also a user in the same Watson Analytics account, or we can share with anyone using PDF, PowerPoint or image files via email or download. Click on the share icon, select Email and then select “PDF”. Using download or email, the person you are sharing with does not even need Watson Analytics to benefit from the analysis.

Great job! Don’t stop there: apply these analytics to your own data!
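The "net loss by year" and "net loss by business" views in this use case come down to totals per group and each group's share of the overall total. Here is a minimal sketch of that arithmetic in plain Python. The loss events are invented, the field names "Year", "Business" and "Net Loss" are assumptions about the exported spreadsheet, and "Asset Management" is a made-up business line.

```python
from collections import defaultdict

# Invented loss events; the field names are assumptions about the export.
loss_events = [
    {"Year": 2013, "Business": "Corporate Finance", "Net Loss": 500.0},
    {"Year": 2013, "Business": "Retail Banking", "Net Loss": 300.0},
    {"Year": 2014, "Business": "Corporate Finance", "Net Loss": 100.0},
    {"Year": 2014, "Business": "Asset Management", "Net Loss": 100.0},
]


def total_by(rows, group_field, value_field="Net Loss"):
    """Sum value_field for each distinct value of group_field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_field]] += row[value_field]
    return dict(totals)


def share_by(rows, group_field, value_field="Net Loss"):
    """Return each group's fraction of the overall total."""
    totals = total_by(rows, group_field, value_field)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {group: total / grand_total for group, total in totals.items()}


loss_by_year = total_by(loss_events, "Year")
business_shares = share_by(loss_events, "Business")

for business, share in business_shares.items():
    print(f"{business}: {share:.0%} of net loss")
```

A treemap is essentially a picture of `business_shares`: each rectangle's area is proportional to its group's share.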

Blog

What will a graduate degree give me? Exploring the American Time Use Survey data set

American Time Use Survey data

The American Time Use Survey data is included within Watson Analytics as a sample data set called American Time Use Survey.csv.

Imagine you’re a university student thinking about going to graduate school and wondering what the impact would be on your income and how this affects your free time over the long term. The American Time Use Survey data set contains data about the amount of time people spend doing various activities, such as paid work, volunteering, childcare, and socializing. This demographic data is about a subset of Americans but can be applied more widely.

It all starts with your Data!

In Watson Analytics, click the New Data button.
Click the Sample Data icon.
Select American Time Use Survey.csv, scroll down and then click the Import button.

The data set appears as a tile in your Personal data folder. Watson Analytics analyzes the data and metadata when uploading the csv file to provide smarter data discovery and analysis. In this process, Watson Analytics identifies field names and concepts, possible measurements and hierarchies in your data and captures metadata including data quality, data distributions, skewness and missing values.

Let’s ask our first question.

Does higher education lead to higher earnings?

Tap the American Time Use Survey data set tile. You are taken into a new Discovery set. This is where you start interacting with the data. That single tap gave you a list of Starting points, which are different ways to launch yourself into data analysis and visualizations.

Let’s enter our question, “Does higher education lead to higher earnings?”, and then press Enter. You now see different Starting points based on your question, ranked by relevance. The most relevant inquiries bubble to the top of the list.

Select the Starting point: What is the breakdown of Weekly Earnings by Education Level? The results are shown in a treemap visualization.
The size of each rectangle indicates the relative size of weekly earnings by education level. The largest rectangles are for those with advanced degrees. This visualization is for all ages. Let’s see how weekly earnings by education level breaks down when ages are added in.

At the very bottom of the window is the Data Tray showing all the column headings in the data set. Add Age Range to the visualization. Just drag it from the data tray (the grey strip on the bottom) and drop it anywhere on the visualization. Note: you can also drop it on the Data Slot beside the drop down for Education Level on the bottom left just below the visualization.

There’s a lot more detail in the visualization now, perhaps too much. Let’s focus in on people with college or university degrees. Below the visualization, you can modify what is displayed. Select Education Level and tap the items listed from 9th grade down to Some College to remove them from the visualization. You may need to scroll down in the box to complete this.

Some of the smaller rectangles are for age groups that aren’t really relevant to the question that we’re exploring. People aged 0-19 have generally not completed university or college, and those aged 70 and older have generally retired from paid work. Let’s filter out these groups:

Tap Age Range at the bottom of the visualization.
Select 0-19, 70-79, and 80+ to remove them.
Then tap Done or outside the Age Range list to close it.

Try a different visualization type

Different visualization types communicate information about data in different ways. Let’s see what else we can learn by using a different visualization type. Tap the icon to the left of your visualization to see what Watson Analytics recommends. You can, of course, pick any type you want. Tap the first recommended visualization: the Bar chart. You see that earnings peak when people are in their 30s and 40s, regardless of education level.

But what about work-life balance?
Earnings is one way to look at it. However, life is about more than how much money you earn. Does someone with more education work longer hours? Do they have time to spend with their families and friends?

Let's add to this discovery set with a simple click on the plus button and then ask the question “How do weekly hours worked compare by education level?” Clicking on the insight tile shows the treemap. We can see that people with more advanced education levels spend more time working.

In the previous inquiry on weekly hours worked by education level, I see that there are other questions we could ask that are more predictive in nature. As before, let's add to the Discovery Set and ask “What drives Weekly Earnings?” Select the insight tile. It may take a few minutes for this insight to process, as it goes through many predictive models to determine what drives weekly earnings. Once it has evaluated thousands of models, it presents us with a short list of predictive relationships.

Not surprisingly, based on what we have already seen, weekly hours worked and education level have a relatively strong relationship with weekly earnings, with a predictive strength of 45%. If you want to see more drivers, you can tap the link for “Show more drivers”. If we tap the button to the right of the driver, we can see more details on the driver. As we mouse over the blue blocks in this heatmap, which show the key elements of the relationship, the cell values for weekly earnings (shown as color intensity) are generally higher as you move your cursor up and to the right.

What did we learn?

These findings show us that the hard work of getting good marks in school to attain a higher education does not stop there. We will need to keep working after we have attained our advanced degree to continue building up our weekly earnings. This of course affects our free time.
Don’t stop there. Try this type of analysis with your own data set!
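The filtering and heatmap steps in this post can be pictured as "drop the rows you do not care about, then average earnings in each cell of a two-way grid". The sketch below shows that idea in plain Python. It is an illustration under assumptions, not the model Watson Analytics builds: the respondents are invented, the field names and the education labels "Bachelor", "Master" and "High School" are hypothetical, and the 10-hour band width is an arbitrary choice.

```python
from collections import defaultdict
from statistics import mean

# Invented respondents; field names and education labels are hypothetical.
respondents = [
    {"Age Range": "30-39", "Education Level": "Bachelor", "Weekly Hours Worked": 38, "Weekly Earnings": 1000},
    {"Age Range": "40-49", "Education Level": "Bachelor", "Weekly Hours Worked": 35, "Weekly Earnings": 1200},
    {"Age Range": "30-39", "Education Level": "Bachelor", "Weekly Hours Worked": 45, "Weekly Earnings": 1500},
    {"Age Range": "40-49", "Education Level": "Master", "Weekly Hours Worked": 45, "Weekly Earnings": 1800},
    {"Age Range": "50-59", "Education Level": "Master", "Weekly Hours Worked": 48, "Weekly Earnings": 2000},
    {"Age Range": "0-19", "Education Level": "High School", "Weekly Hours Worked": 10, "Weekly Earnings": 100},
]

# Age groups the post removes from the visualization.
EXCLUDED_AGE_RANGES = {"0-19", "70-79", "80+"}


def hours_band(hours, width=10):
    """Bucket hours into bands such as '30-39'."""
    low = (hours // width) * width
    return f"{low}-{low + width - 1}"


def heatmap_cells(rows):
    """Average weekly earnings per (education level, hours band) cell."""
    cells = defaultdict(list)
    for row in rows:
        if row["Age Range"] in EXCLUDED_AGE_RANGES:
            continue  # same effect as deselecting the age range in the tool
        key = (row["Education Level"], hours_band(row["Weekly Hours Worked"]))
        cells[key].append(row["Weekly Earnings"])
    return {key: mean(values) for key, values in cells.items()}


cells = heatmap_cells(respondents)
for (education, band), earnings in sorted(cells.items()):
    print(f"{education}, {band} hours: {earnings}")
```

Reading the resulting cells from fewer hours and less education toward more hours and more education mirrors moving the cursor "up and to the right" in the heatmap.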

Getting Started

Using Customer Behavior Data to Improve Customer Retention

We’ve uploaded some sample data sets in the IBM Watson Analytics community for you to work with as you learn more about Watson Analytics. This expert blog uses the Telco Customer Churn data set.

WA_Fn-UseC_-Telco-Customer-Churn

What’s in the Telco Customer Churn data set?

This data set provides info to help you predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs. A telecommunications company is concerned about the number of customers leaving their landline business for cable competitors. They need to understand who is leaving. Imagine that you’re an analyst at this company and you have to find out who is leaving and why.

The data set includes information about:

Customers who left within the last month: the column is called Churn
Services that each customer has signed up for: phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
Customer account information: how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
Demographic info about customers: gender, age range, and if they have partners and dependents

If you don’t have the data set…

Go to https://community.watsonanalytics.com/resources/
Download the Telco Customer Churn sample data file.
In Watson Analytics, tap Add and upload Telco Customer Churn. The filename is a bit longer: WA_Fn-UseC_-Telco-Customer-Churn.csv.

The data set appears as a tile in the Welcome page and you’re ready to get to work.

Which customers are likely to leave?

To find the answer to this question, tap the WA_Fn-UseC_-Telco-Customer-Churn tile and tap Prediction. You want to learn more about customers who’ve left the company in the past month: this is the target that you want to investigate. The data is in the column called Churn, which is the column we’ve already picked as the target for the prediction.
Let’s find out which variables influence customers who leave. Name the prediction and tap Create Prediction. Watson Analytics analyzes the data and generates visualizations to provide insights into this issue. The spiral shows you the top predictors, or key drivers, of churn in color; other drivers appear in gray. The closer the driver is to the center of the spiral, the stronger the predictive strength of the driver is.   The key drivers are tenure, contract, and online security. The visualizations to the right of the spiral show how one driver at a time drives churn. The blue or green dots in the upper right of the visualizations identify which driver is being shown. Tap tenure drives Churn. This new visualization shows that customers who have been customers for shorter periods are more likely to leave. Close this visualization by tapping the X in its upper right corner. You can look at the visualizations for the other drivers on your own. Let’s move on and explore churn in more depth. To the left of the spiral are options for creating visualizations that show more than one driver at a time. Let’s go straight to the deeper and more predictive analysis of the data. Tap Combination. You get a new set of visualizations on the right, including a decision tree, that show the combination of variables that influence your target. Let’s look at the combination of key drivers that influence whether customers leave. Tap the decision tree. Let’s look at a word cloud about the key factors that influence churn. Tap Predictor Importance. Contract, Internet Service, Tenure, and Total Charges are the most important factors. Let’s get some more details on who is leaving so we can predict who is likely to leave in the future. Tap Top Decision Rules. The rules are specific and detailed, and are sorted by accuracy. They currently focus on customers who do not leave. We need to change that. Change the No to Yes. A clearer view emerges. 
Customers who leave tend to be ones who are on a month-to-month contract, have fiber optic internet service, and have been customers for shorter periods. You can now predict which customers are at risk to churn. Use the decision rules to identify customers who fit the churn profile so you can proactively offer them an incentive to stay.
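Once a churn profile like this is known, applying it is a simple filter over the customer list. The following sketch shows one way to do that in plain Python. It is a hand-written rule, not an export of the decision rules Watson Analytics generates. The customers are invented, the field names "customerID", "Contract" and "tenure" are assumed spellings, and the six-month tenure cutoff is an assumed stand-in for "shorter periods".

```python
# Invented customers; field names are assumed spellings of the columns.
customers = [
    {"customerID": "A", "Contract": "Month-to-month", "InternetService": "Fiber optic", "tenure": 3, "Churn": "Yes"},
    {"customerID": "B", "Contract": "Month-to-month", "InternetService": "Fiber optic", "tenure": 5, "Churn": "No"},
    {"customerID": "C", "Contract": "Two year", "InternetService": "Fiber optic", "tenure": 48, "Churn": "No"},
    {"customerID": "D", "Contract": "Month-to-month", "InternetService": "DSL", "tenure": 4, "Churn": "Yes"},
    {"customerID": "E", "Contract": "Month-to-month", "InternetService": "Fiber optic", "tenure": 30, "Churn": "No"},
]


def fits_churn_profile(row, max_tenure_months=6):
    """True for month-to-month fiber customers with short tenure.

    max_tenure_months is an assumed cutoff, not a value from the tool.
    """
    return (
        row["Contract"] == "Month-to-month"
        and row["InternetService"] == "Fiber optic"
        and row["tenure"] < max_tenure_months
    )


def churn_rate(rows):
    """Fraction of rows whose Churn field is 'Yes' (0.0 for no rows)."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row["Churn"] == "Yes") / len(rows)


in_profile = [row for row in customers if fits_churn_profile(row)]

# Customers who match the profile but have not left yet are the ones to contact.
to_contact = [row["customerID"] for row in in_profile if row["Churn"] == "No"]

print(f"Churn rate in profile: {churn_rate(in_profile):.0%}")
print(f"Churn rate overall: {churn_rate(customers):.0%}")
print(f"Offer an incentive to: {to_contact}")
```

Comparing the churn rate inside the profile with the overall rate is a quick sanity check that the rule actually separates leavers from stayers before any incentive budget is spent on it.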

Blog

Realize the Power of Predictive Analytics In Just 4 Steps

Statistics and predictive analytics are powerful techniques for analyzing your data so you can uncover the insights that really matter. However, they are daunting applications for many people. Fortunately, Watson Analytics is able to make predictive analytics easier and more accessible to practically anyone. You can start taking advantage of this power in 4 easy steps. (If you haven’t already signed up for Watson Analytics, you can do so here for free.)

1. Load your data.
2. Select Predict and choose a target variable from the data set created when you uploaded the data. You can select up to 5 targets.
3. Start exploring your results by reviewing key influencing factors that drive your target.
4. Drag a variable into the spiral to see positive and negative associations and correlations in your data.

And, just like that, you’ve used sophisticated analytics the easy way and discovered insights that help guide further exploration of your data. You can learn more about the Watson Analytics predict spiral in this video. If you haven't used Watson Analytics to find key drivers in your data, today's a great day to start. Sign up for your free account today.
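For readers who want a feel for what a "positive" or "negative" association means numerically, the Pearson correlation coefficient is one common measure: it is close to +1 when two fields rise together and close to -1 when one falls as the other rises. The sketch below computes it by hand in plain Python. This is a generic statistic shown for illustration; the post does not say which measures Watson Analytics uses in the spiral, and the numbers are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence has no variation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denominator = sqrt(var_x * var_y)
    if denominator == 0:
        return 0.0
    return covariance / denominator


# Invented values: months as a customer versus two other fields.
tenure = [1, 2, 3, 4]
total_charges = [10, 20, 30, 40]      # rises with tenure
support_calls = [8, 6, 5, 1]          # falls with tenure

rises_together = pearson(tenure, total_charges)
moves_apart = pearson(tenure, support_calls)

print(f"tenure vs total charges: {rises_together:+.2f}")
print(f"tenure vs support calls: {moves_apart:+.2f}")
```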

Getting Started

Quality In, Quality Out

When you add a data set, IBM Watson Analytics reads the data and assesses it for data quality. The data quality score measures the degree to which the data is suitable for predictive analysis. Data sets with low quality scores may be suitable for data exploration even if they are not suitable for predictive analysis. The overall score is an average of the data quality score for every field in the data set, as determined by missing and constant values, influential categories, outliers, imbalance and skewness. In this example from SportsData_NFL_2014_REG_PST_players.csv (which is available here), Watson Analytics excludes fields with more than 25% missing values and fields with constant values. You access the Data Quality Report from a prediction, using the menu in the upper-left corner. The Data Quality Report highlights areas where you could optimize your source data. Adding more rows and columns to the data often improves the quality of the data. The more data that Watson Analytics has available to choose from, the more accurate its results are. Note that you can choose to include a field that Watson Analytics has excluded; for example you may want to use a field that has more than 25% missing values because you know this field is important to your analysis. In this case, use the Predict Menu to select Field Properties, change the role of the field to input or target, and regenerate your prediction. This action may affect the quality of your prediction. How to influence data quality? Do your best to clean your data before you add it into Watson Analytics. List files work best. 
Some of the typical issues with data sets can be resolved by:

Removing blank rows from your data file
Removing summary rows and columns from your data file
Eliminating column headings and row headings that appear in the same cell
Avoiding look up tables
Avoiding subtotals and aggregations

More tips for cleaning your data before uploading to Watson Analytics:

Watson Analytics assumes that the first row of your file contains column headers; descriptive column headers are preferred.
You must have a header for every column. The number of columns in the header row is assumed by Watson Analytics to be the number of columns of data. For example, if the first six columns have headers but there are eight columns of data, the last two columns of data are ignored.
You cannot have empty columns inserted before the data.
You can have empty rows above the data. Empty rows preceding the data are ignored.
You cannot have textual rows above the header row. For example, if you have a title or description of what the data is about above the header row, the file is not read appropriately.
You cannot have textual rows following the data. For example, a row following the data that says “This information came from…” is considered to be part of the data.

More details are in this helpful document: Introduction to Data Loading and Data Quality, including specific conditions that apply to MS Excel and CSV files.
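Two of the rules described in this post are easy to check yourself before uploading: whether a field has more than 25% missing values or only a constant value, and whether any data columns lack a header. The sketch below is a rough pre-flight check in plain Python using the standard `csv` module. It is not the Watson Analytics data quality score, which also weighs outliers, skewness and other factors; the sample CSV and the function names are invented for illustration.

```python
import csv
import io

MISSING_LIMIT = 0.25  # the post says fields with more than 25% missing values are excluded

# Invented sample in the "list file" shape: one header row, then data rows.
SAMPLE_CSV = """player,team,season,rushing_td
Smith,DAL,2014,3
Jones,,2014,1
Lee,,2014,
Kim,NYG,2014,2
"""


def read_list_file(text):
    """Split CSV text into a header row and a list of data rows."""
    rows = list(csv.reader(io.StringIO(text)))
    return rows[0], rows[1:]


def excluded_fields(header, data_rows):
    """Return {field: reason} for fields that fail the two checks."""
    excluded = {}
    for index, name in enumerate(header):
        values = [row[index].strip() if index < len(row) else "" for row in data_rows]
        present = [value for value in values if value]
        missing_share = 1 - len(present) / len(values) if values else 1.0
        if missing_share > MISSING_LIMIT:
            excluded[name] = "more than 25% missing values"
        elif len(set(present)) <= 1:
            excluded[name] = "constant value"
    return excluded


def unnamed_columns(header, data_rows):
    """Count data columns that have no header and would therefore be ignored."""
    widest = max((len(row) for row in data_rows), default=0)
    return max(0, widest - len(header))


header, data_rows = read_list_file(SAMPLE_CSV)
problem_fields = excluded_fields(header, data_rows)
print(problem_fields)
print(unnamed_columns(["a", "b"], [["1", "2", "3"]]))
```

In the sample, `team` is half empty and `season` never changes, so both are flagged, while `rushing_td` sits exactly at 25% missing and passes because the rule is "more than 25%".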

Predict

Displaying top predictors and predictive strength

Once you have a new prediction displayed in Watson Analytics, you can click on the View All option near the upper right to display charts with the ranking of the top predictors and their respective predictive strength value. Each predictive strength value is displayed in parentheses after each predictor. To see the statistical details behind each predictor, click on a predictor chart. From the Main Insight blade you can select to show or hide statistical details.
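To make the idea of a ranked list of predictors with a strength in parentheses more concrete, here is a small sketch that scores each candidate field by how often its most common outcome per value matches the actual target, then sorts the fields by that score. This "one rule" accuracy is a stand-in chosen for illustration; the post does not describe how Watson Analytics calculates predictive strength, and the rows and field names are invented.

```python
from collections import Counter, defaultdict

# Invented rows; "Contract", "Gender" and "Churn" are illustrative field names.
rows = [
    {"Contract": "Month-to-month", "Gender": "F", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Gender": "M", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Gender": "F", "Churn": "No"},
    {"Contract": "Two year", "Gender": "M", "Churn": "No"},
    {"Contract": "Two year", "Gender": "F", "Churn": "No"},
    {"Contract": "Two year", "Gender": "M", "Churn": "No"},
]


def one_rule_accuracy(rows, predictor, target):
    """Share of rows predicted correctly when each predictor value
    predicts the target value it most often appears with."""
    if not rows:
        return 0.0
    by_value = defaultdict(Counter)
    for row in rows:
        by_value[row[predictor]][row[target]] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return correct / len(rows)


def rank_predictors(rows, predictors, target):
    """Return (predictor, score) pairs, strongest first."""
    scores = {name: one_rule_accuracy(rows, name, target) for name in predictors}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_predictors(rows, ["Gender", "Contract"], "Churn")
labels = [f"{name} ({score:.0%})" for name, score in ranking]
print(labels)
```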

Assemble

Get ready for the 2015 fantasy football season

Are you missing the excitement of the NFL during the offseason? With Watson Analytics, you don't have to. Thanks to our partners at SportsData LLC, you can use a sample of the National Football League (NFL) regular season offensive statistics to prepare for this season's fantasy football. From the Watson Analytics Welcome page, use the data file SportsDataLLC NFL 2014 Offensive Stats.csv  to explore your favorite NFL team or player, assemble compelling dashboards, or discover top predictive drivers of key statistics like rushing touchdowns. Then, when the fantasy football season begins, you'll be able to impress your league with your picks because you'll have new insights into which players, based on their performance in 2014, are likely to make your fantasy team a contender for the championship. If you need additional ideas, this helpful video walks you through the powerful capabilities of Watson Analytics. After playing with the NFL football data, learn how Watson Analytics can help analyze your business data. Whether it's analyzing marketing campaigns, protecting your customer base, retaining key employees, or if you simply need help analyzing your data, Watson Analytics can help you get started.

Predict

Predicting with Watson Analytics: A quick start guide

When you know your data well, you can use the Watson Analytics predict feature as a kind of shortcut to analytic insights. You use data that you have already uploaded for your prediction, which is different from an exploration. To learn about exploration, see the "Exploring data with Watson Analytics: A quick start blog," which you can also find here in the Watson Analytics Community.

Here's how you can use your data to create predictions.

1. Log in to Watson Analytics. From the Welcome page, click Predict.
2. From the data selection page that opens, select the set of data you want to use by clicking its name.
3. When the Create a Prediction page appears, you see that Watson Analytics has taken your source data and provided you with suggested targets. In this example, the target is churn. You can have up to 5 targets. You can also edit the target fields (for example, adding labels). After you make any changes or if you are satisfied with the suggested target, enter a name for your workbook at the top and click Create Prediction.
4. Watson Analytics builds a prediction workbook by automatically running thousands of algorithms to find the right model and likely predictors.
5. After the workbook is created, Watson Analytics launches a page with a spiral graph and relevant predictors.
6. This is where it gets fun. Click through the predictor level chooser to see what other fields affect your target and look at the results in the visualization tiles to the right. Or add fields to create a combination model, which enables you to drill into the rules behind a decision, and navigate down into the aspects of the decision.

And that’s it: the fastest way to start using Watson Analytics to predict and explain the meaning behind the data. For more tutorials, browse the Watson Analytics Community.

Assemble

Quick start Watson Analytics Community blogs: Getting started

Get started with Watson Analytics

With Watson Analytics, you can instantly access and use predictive and visual analytics to find answers and insights in your data. No expensive training or time-consuming setup is needed. This blog will give you a few hints to help you get started.

When you log in to Watson Analytics, the welcome screen provides you with these analytics options:

Explore
Predict
Assemble
Social Media
Refine

This getting started guide shows you how to use the Explore, Predict, and Assemble capabilities.

If you want a question about your data answered, Explore can help. After you select your data set, Watson Analytics provides you with recommended questions for exploration, or you can ask a question you would like answered from the data. You start just like you would ask someone a question, and Watson Analytics will respond in clear, concise terms and visualizations everyone can understand. As an example, here we have selected the data set “American Time Use Survey” and asked the question “Does higher education lead to higher salary?” (Screen 1).

Screen 1

Watson Analytics responds with a range of visualization options, such as: “What is the breakdown of Weekly Earnings by Education Level?” (Screen 2).

Screen 2

Predict

If you want to identify the factors most likely to affect a specific target or goal, click Predict. Watson Analytics processes the uploaded data that you indicate and provides you with a spiral that shows you the factors that influence your target. The closer the factor is to your target, the stronger the influence. For example, you might be an insurer who wants to know the top drivers of customer lifetime value for your firm. The Spiral Chart (Screen 3) shows you the strongest individual drivers. “Number of Policies” is closest to the center, so it is clearly the strongest single determinant of Customer Lifetime Value.
Screen 3

You can hone your results by combining drivers and predictors for more accurate results or review additional visualization tiles (Screen 4).

Screen 4

Assemble

If you want to view your historical data in a dashboard or in visualizations so you can better understand your data and communicate your findings, click Assemble. Templates help you choose an appealing layout. When you drag data or an object to a dotted box in the selected template, it is sized and positioned based on the defaults defined in your template. After you click Create, the data from your data set is at the bottom of your dashboard and you can slide the tab up to see more of it. Simply drag data into your template. Watson Analytics converts the data into recommended visualizations. You can also add data to an actual visualization by dragging new data directly onto the existing visualization.

Screen 5

What are you waiting for? Watson Analytics is that easy: Explore, Predict and Assemble with the click of a button. Just add your data, think of your questions, and begin uncovering the business insights that lead to better outcomes.