As data scientists for the customer analytics group in our wireless service provider company, we want to leverage customer data to predict customer churn. Customer retention is a critical challenge for the telecommunications industry, where annual churn rates can be as high as 40 percent. If we can predict which customers are at risk of leaving, our company can take action to retain them before they take their business elsewhere. Even a small reduction in churn can have a significant impact on our bottom line.
We decided to build a quick web application we can enhance over time. Our app uses the code for a classification algorithm we developed in the Java™ language using Weka, an open source machine-learning tool. In Bluemix, we can deploy our Java application and take advantage of the Analytics Warehouse (formerly BLU Acceleration) service to perform analysis on our customer data. This service provides simplicity and performance, as well as enterprise scale if we decide to grow our model or enhance our app to perform additional types of analysis on our data. Finally, we chose Twitter Bootstrap as the web development framework because it offers the flexibility of a mobile-first web interface and can be easily adapted to the myriad devices and browsers our analysts use.
Learn how you can build a similar application in Bluemix. We assume that you have the necessary code for your application, and provide our application code and data as a sample to help you get started.
What you will need to build a similar app
- Familiarity with Java application development
- Familiarity with a modern front-end framework, such as Twitter Bootstrap
- Knowledge of a statistical analysis tool, such as Weka or R
Step 1. Create the application in Bluemix
Log in to Bluemix.
On the dashboard page, click Add an application.
In this example, you will create a Java application. Under Runtimes, select .java liberty (Liberty for Java).
In the pop-up window, click CREATE APP.
In the next pop-up window, fill in the app name and host, then click CREATE.
Bluemix creates the app in your workspace and starts the Java runtime. A confirmation on the dashboard tells you when the app has started successfully.
Step 2. Create the Analytics Warehouse (formerly BLU Acceleration) service
Select the app you created from the dashboard to go to its overview page.
Click Add new service in the Services section of that page.
Select Analytics Warehouse as the service to add.
A pop-up window will display with more information about the service. Click ADD TO APPLICATION and CREATE on the subsequent pop-up window.
Step 3. Explore the Analytics Warehouse service (optional)
The service provides several data analysis tools from its web console, including tools for loading and querying data, analyzing data with R or Excel®, reporting with Cognos, and industry models that help you with common industry-specific use cases. It's worthwhile to explore this set of tools for future projects.
On the app overview page, select the Analytics Warehouse service.
On the following page, click Launch the console.
A new window will open with the web console. You can do many things in here, including uploading data files into your database and analyzing your data with R.
Step 4. Upload your data to Analytics Warehouse (optional)
Our sample data set is already available in the Analytics Warehouse. However, you can use your own data. To upload data:
- In the Analytics Warehouse web console, click the Manage tab, then select Load Data.
- We will load data from a CSV file. Select Local File System as the source and browse for the file that contains your data.
- A new table needs to be created for this data. Click +.
- Again, browse for the CSV file you want to upload. The service will generate a SQL statement to create the table based on the content of the CSV file. For our analysis, we need all the columns to be DOUBLE except for the classification column. Modify the column types as indicated.
- Click Run DDL to run the statement; you will be notified that the query ran successfully. Click OK, then Cancel.
- Select the table you just created.
- Select the default option Append new data into the table, and then click Load Now. The data should be loaded.
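The column-type change described in the list above can be sketched in code. The following minimal Java example is not part of the sample application, and it is not the statement the service generates. It only illustrates the shape of a table definition in which every column is DOUBLE except the classification column. The class name ChurnTableDdl, the table and column names, and the VARCHAR(16) length are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the sample application.
final class ChurnTableDdl {

    /**
     * Builds a CREATE TABLE statement in which every column is DOUBLE
     * except the classification column, which is VARCHAR(16) (an assumed length).
     * Column names are used as given, after trimming surrounding whitespace.
     */
    static String buildCreateTable(String table, String[] columns, String classColumn) {
        List<String> definitions = new ArrayList<>();
        for (String column : columns) {
            String name = column.trim();
            String type = name.equalsIgnoreCase(classColumn) ? "VARCHAR(16)" : "DOUBLE";
            definitions.add("  " + name + " " + type);
        }
        return "CREATE TABLE " + table + " (\n" + String.join(",\n", definitions) + "\n)";
    }

    public static void main(String[] args) {
        // Hypothetical CSV header; substitute the header row of your own file.
        String header = "ACCOUNT_LENGTH,DAY_MINS,CUSTSERV_CALLS,CHURN";
        System.out.println(buildCreateTable("CHURN_TRAINING", header.split(","), "CHURN"));
    }
}
```

Comparing the generated statement in the console against output like this is one way to check that only the classification column keeps a character type before you click Run DDL.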
Step 5. Download the code
If you haven't already done so, get the code.
Select EDIT CODE. After you log in, you will see the code.
Click File > Export > Zip to download the code to your machine.
Step 6. Understand the code
The sample application consists of these components:
- The FileLocationContextListener creates a folder for the file upload on the server.
- If the user specifies a database as the source of the training set for the model, the entered details are used to load the data into an Instances object, TrainingSet. This TrainingSet is then used to build the NaiveBayes model. Otherwise, the default database table is used to build the model.
- The user can upload a CSV file as a Testing set. The file is uploaded into the folder created earlier on the server.
- Weka uses Attribute-Relation File Format (ARFF) as its basic file format. An ARFF file declares the attributes and contains the data set that Weka requires. CSV2ARFF.java is an independent utility that converts the CSV file to an ARFF file, which is stored in the same folder on the server.
- The ARFF file is then loaded into an Instances object as TestingSet.
- For all the instances in the TestingSet, the NaiveBayes model is used to classify the output into Churn or Not Churn classes.
- The corresponding output is then displayed on the user interface.
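To make the ARFF step in the list above more concrete, here is a minimal sketch of a CSV-to-ARFF conversion in plain Java. It is not the CSV2ARFF.java utility from the sample application and it does not use Weka. It assumes a header row, numeric values in every column except the last, a nominal class label in the last column, simple attribute names without spaces, and no quoted or comma-containing fields. The class name CsvToArffSketch and the sample column names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative stand-in only; not the sample application's CSV2ARFF.java.
final class CsvToArffSketch {

    /** Converts CSV lines (header first, class label in the last column) to ARFF text. */
    static String convert(String relation, List<String> csvLines) {
        String[] header = csvLines.get(0).split(",");
        int classIndex = header.length - 1;

        // Collect the rows and the distinct class labels in order of first appearance.
        Set<String> labels = new LinkedHashSet<>();
        List<String[]> rows = new ArrayList<>();
        for (String line : csvLines.subList(1, csvLines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split(",", -1);
            if (cells.length != header.length) {
                throw new IllegalArgumentException("Unexpected column count: " + line);
            }
            for (int i = 0; i < cells.length; i++) {
                cells[i] = cells[i].trim();
            }
            labels.add(cells[classIndex]);
            rows.add(cells);
        }

        StringBuilder arff = new StringBuilder();
        arff.append("@relation ").append(relation).append("\n\n");
        // Every column except the last is declared as a numeric attribute.
        for (int i = 0; i < classIndex; i++) {
            arff.append("@attribute ").append(header[i].trim()).append(" numeric\n");
        }
        // The last column is declared as a nominal attribute listing its labels.
        arff.append("@attribute ").append(header[classIndex].trim())
            .append(" {").append(String.join(",", labels)).append("}\n\n");
        arff.append("@data\n");
        for (String[] row : rows) {
            arff.append(String.join(",", row)).append("\n");
        }
        return arff.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample rows; substitute the lines of your own CSV file.
        List<String> csv = List.of("DAY_MINS,CUSTSERV_CALLS,CHURN", "265.1,1,False", "120.5,4,True");
        System.out.print(convert("churn", csv));
    }
}
```

The output starts with an @relation line, followed by one @attribute declaration per column and then the rows under @data, which is the layout Weka reads into an Instances object.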
Step 7. Generate a WAR file
To push the code to Bluemix, you will need to generate a WAR file. We can easily do this with Eclipse. A WAR file is already included in case you are unable to generate one.
Select File > Import. In the dialog window, select Existing Projects into Workspace, then select Next.
In the next dialog window, browse for the files you downloaded.
Keeping all the defaults selected is fine. Select Finish. The project has now been added to your Eclipse Client.
To export the project as a WAR file, right-click the project in the Project Explorer. Select Export > WAR file. Save the WAR file in a directory by itself.
Step 8. Deploy the application
Open a terminal and move into the directory of the WAR file. It is best to have the WAR file in its own directory.
Run the cf push command. Provide the application name, memory needed, instances, and path to the WAR file. For this application, let's provide 512 MB of memory and one instance:
cf push bludemo -m 512m -p
As the application uploads, the terminal displays details about what is happening. After about a minute and a half, the application should be live.
If you make changes to the application, repeat this process. Run the same command after you have generated a new WAR file to push to Bluemix.
Alternative steps: Deploy the application
Instead of following most of the preceding steps, you can create the service and deploy the application from DevOps Services.
After you have the code in your own workspace (Step 5), modify the file named manifest.yml.
Set name and host to your application name and host name. Use the same value for both. The file should be saved automatically.
Click Deploy, and DevOps Services will attempt to deploy the application based upon the manifest.yml file. DevOps Services will ask for credentials when deploying. Complete Step 4 to upload the training data. After that, the demo application will work.
Now you know how Analytics Warehouse provides data warehousing and analytics as a service on the Bluemix platform, and how you can develop and deploy a heavy-duty analytic application using IBM database technology in the cloud. Here's to faster, easier data mining in the cloud.
Many thanks to Alexandria Burkleaux for her review of this article.