March 1, 2012 | Written by: Raul Chong
Early in 2010, when the IBM Information Management Cloud Computing Center of Competence was formed at the IBM Toronto Lab, I started delivering presentations about cloud computing to customers, professors, and the community in general. In those days, the concept of cloud was fairly new, and there was a lot of interest in it. Fast forward two years, and most people I talk to now are very familiar with cloud concepts. They might not have had the chance to try it out, but they are familiar with what it can offer. Moreover, things have evolved rapidly; for example, fears about the security of your data in the cloud are quickly disappearing as people realize cloud providers follow strict security measures and are often certified against industry standards.
By mid-2011, the newest IT hot topic was big data analytics. You have probably already heard about V3: velocity, variety, and volume, the three characteristics that define big data. Big data analytics can be performed in motion using a product such as IBM InfoSphere Streams (Streams), or at rest using a product such as IBM InfoSphere BigInsights, which is based on the Hadoop framework.
As I discussed earlier, Hadoop and cloud are a very good fit. Hadoop can process big data relatively quickly using commodity hardware, and the cloud can provide this hardware in minutes, on demand, and at a low cost. Thus, the cloud enables anyone, from a student to a scientist, to run analytics jobs at any time. For example, setting up a 3-node Hadoop cluster on the IBM SmartCloud Enterprise can take just a few minutes.
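To give a flavor of the kind of job such a cluster runs, here is a minimal word-count sketch in the style of Hadoop Streaming, where the map and reduce phases are plain functions reading lines and emitting key/value pairs. This is an illustrative toy, not the actual BigInsights tooling; the function names and sample data are my own.

```python
# A toy word-count job illustrating the Hadoop map/reduce pattern:
# the mapper emits (word, 1) pairs, and the reducer sums them per key
# after the framework's shuffle/sort groups them together.
from collections import defaultdict

def mapper(lines):
    """Emit (word, 1) pairs, as a Hadoop Streaming mapper would."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    """Sum the counts per word, as the reduce phase would."""
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

if __name__ == "__main__":
    sample = ["Big data on the cloud", "big data analytics"]
    print(reducer(mapper(sample)))
```

On a real cluster the same two pieces of logic would run in parallel across the nodes, with Hadoop handling the distribution of input splits and the shuffling of intermediate pairs.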
I find fascinating the idea that anyone can potentially discover something very interesting, with huge repercussions, using these new technologies enabled by cloud computing. We are on the verge of many breakthroughs thanks to these technologies. For example, back in October 2011, I read an article in a Toronto newspaper about a curious discovery related to the March 2011 Japan earthquake. According to the article, scientists noted that one day before this powerful earthquake took place, there were changes in the atmosphere. So, though this was only a hypothesis, they concluded that if these atmospheric changes were detected again in the future, people could potentially be warned of an earthquake ahead of time. Imagine the consequences of such a discovery if it were confirmed! How many lives could be saved!
From the time the earthquake happened to the time I read the article, approximately seven months had passed. Why did it take so long to discover something like this? My hypothesis is that these scientists were collecting large amounts of sensor data daily, but the volume was so large that they did not have the chance to process it until several months had passed, perhaps then using Hadoop to derive these analytics. What I hope they are doing now is using something like Streams to process the sensor data in motion. Given that they now know what they are looking for, they can code the right collectors in Streams to quickly identify the repeating pattern and predict earthquakes!
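In Streams, this kind of in-motion logic is written as operators in SPL; as a rough analogue of the idea, here is a hedged Python sketch that processes a stream of readings one value at a time and flags any reading that deviates sharply from a sliding-window baseline. The window size and threshold are made-up parameters for illustration, not anything from the article.

```python
# Sketch of analytics in motion: keep a small sliding window of recent
# sensor readings and flag any new value that deviates from the window's
# mean by more than `threshold` standard deviations.
from collections import deque
from statistics import mean, stdev

def detect_anomalies(readings, window=20, threshold=3.0):
    """Return (index, value) pairs for readings that break the recent baseline."""
    baseline = deque(maxlen=window)  # only the last `window` values are kept
    flagged = []
    for i, value in enumerate(readings):
        if len(baseline) == window:
            mu, sigma = mean(baseline), stdev(baseline)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append((i, value))
        baseline.append(value)
    return flagged

if __name__ == "__main__":
    # Twenty ordinary readings followed by a sudden spike.
    stream = [1.0, 1.1] * 10 + [5.0]
    print(detect_anomalies(stream))
```

Because only a fixed-size window is held in memory, this pattern scales to data that arrives continuously, which is exactly the scenario where processing at rest falls months behind.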
Thanks to the cloud, a student or an entrepreneur, a small company or a large one, can find valuable insights that were not possible in the past.
Attend the free webcast “Big data on the Cloud: A better path to deeper business insight” on March 7th to learn more about how to perform big data analytics on the IBM SmartCloud Enterprise!