July 27, 2015 | Written by: Venkatesh Gopal
Do you have a scenario where you want to gain insight into some data? Are you held up because there is no easy way to get a data warehousing environment up and running with the capabilities that data scientists would love? And, above all, are you constrained by time?
Don’t worry… The IBM Bluemix platform gives you what you need and more. IBM Bluemix is the cloud platform that helps developers rapidly build, manage, and run web and mobile applications. Go to www.ibm.com/bluemix/ to learn more about the platform and its capabilities.
To get up and running on Bluemix, get your Bluemix ID by signing up for a free trial at console.ng.bluemix.net/registration/. Log in to the Bluemix console using your registration information, and just like that you have a platform with all the services in the world that you care about.
IBM dashDB is the fully managed data warehousing service on IBM Bluemix. It shows up under the Data and Analytics section in the Bluemix console. Clicking the service takes you to a page with information such as where it will be provisioned and which plans you can choose. Just click CREATE, and you have a service provisioned for you. Yes, a warehousing service that is ready to use in less than 20 seconds.
Once provisioning completes, you are taken to the page from which you can launch the dashDB console.
Launch the dashDB console to get started. Create the tables that you want by going into the Tables section and clicking the Add Table button. Once you are in there, you can feed in a CSV or XLS file, which dashDB interprets to create a table definition for you (and which you can then modify), or you can give it an explicit DDL statement. Either option works.
Once the tables are created, the next step is to load the data. Click Load Hub in the main menu, and you are presented with the different load options. The easiest is the Desktop load option: feed it the same CSV file that you used for the table definition above, and load the data, accepting the defaults for the next couple of steps. In fact, you could have skipped the separate table creation step and combined the two into a single step. It’s just that easy!
Now that you have the schema created and the data loaded, you need to help the data scientist do something with the data. Here again, dashDB’s R capabilities come to the rescue!
By now you might have noticed that I have loaded pedometer data. To keep the example simple, I want to find the total number of steps I have taken each month and plot it in a graph. I have written the SQL for that; you can quickly validate it with the Run SQL menu option. Just copy and paste the SQL below, and click the Run button.
select month(date) as mnth, sum(numberofsteps) as nsteps from pedometer where year(date) = 2015 group by month(date) order by month(date)
Note: Before doing the next step, you need the user ID and password. In the dashDB console, click Connect > Connection Information, and note the user ID and password shown there.
Now that you have the necessary information, go to the Analytics menu, and click R scripts. This brings up the R development options. Click the RStudio button, and enter the user ID and password that you noted from the connection information. This launches the RStudio console.
Enter the following simple R script and see the graph that is plotted:
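A minimal sketch of such a script might look like the one below. It is an illustration, not necessarily the exact script used for this post. It assumes that the ibmdbR package is available in the dashDB RStudio environment, that the data source is named BLUDB, and that the pedometer table and query are the ones shown earlier. The variable names are purely illustrative.

```r
# Minimal sketch: plot total steps per month from the pedometer table.
# Assumptions: ibmdbR is installed in the dashDB RStudio environment and
# "BLUDB" is the preconfigured data source name; adjust if yours differs.
library(ibmdbR)

# Empty user ID and password rely on the preconfigured data source
# (an assumption); pass your own credentials here if that does not work.
con <- idaConnect("BLUDB", "", "")
idaInit(con)

# Same query as in the Run SQL step.
sql <- paste(
  "select month(date) as mnth, sum(numberofsteps) as nsteps",
  "from pedometer",
  "where year(date) = 2015",
  "group by month(date)",
  "order by month(date)"
)
steps <- idaQuery(sql)

# Column names typically come back in upper case, and depending on the
# column types some values may come back as character strings, so
# normalize the names and convert the values explicitly.
names(steps) <- tolower(names(steps))
steps$mnth <- as.integer(steps$mnth)
steps$nsteps <- as.numeric(steps$nsteps)

# One bar per month, labeled Jan, Feb, ...
barplot(steps$nsteps,
        names.arg = month.abb[steps$mnth],
        main = "Total steps per month, 2015",
        xlab = "Month",
        ylab = "Steps")

# Release the database connection when done.
idaClose(con)
```

Running the aggregation in SQL keeps the heavy lifting in the warehouse, so only a handful of summary rows (at most twelve here) travel to R for plotting.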
Wouldn’t you agree that the whole thing was easy?
Far more complex analytics can be done, of course. The functionality that comes with the service makes dashDB very easy to use at every stage, from provisioning, to loading, to using R scripts to get quick insights into the data.
There are other Bluemix services that provide BI capabilities (Embeddable Reporting service), data movement (IBM DataWorks), and specialized services like Cloudant that integrate and work very well with IBM dashDB. All of this makes dashDB an attractive, easy-to-use warehousing service on the cloud.
Why wait? Time to try it out for yourself!