The promise of predictive analytics
Tinniam V Ganesh
We are headed towards a more connected, more instrumented and more data-driven world. This fact is underscored once again in Cisco’s latest Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2011–2016. The statistics in this report are truly mind-boggling.
According to this report, by 2016 some 130 exabytes (130 × 2^60 bytes) of mobile data will rip through the internet. The number of mobile devices will exceed the human population of 7 billion this year, and by 2016 the number of connected devices will touch almost 10 billion.
The devices connected to the net range from mobiles, laptops and tablets to sensors and the millions of devices that make up the “internet of things”. All these devices will constantly spew data onto the internet, and business and strategic decisions will be made by finding patterns, trends and outliers in these mountains of data.
In this future of swirling data, predictive analytics will be a key discipline, and experts in this domain will be much sought after. Predictive analytics uses statistical methods to mine information and patterns from structured, unstructured and streaming data. The data can be anything from click streams and browsing patterns to tweets and sensor readings, and it can be static or dynamic. Predictive analytics will have to identify trends in data streams such as mobile call records, retail store purchasing patterns and social network status messages.
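As a rough illustration of spotting a trend in a stream of readings, here is a minimal sketch of a moving average in Python. The function name `moving_average`, the window size and the sample readings are all assumptions made up for this example; real streaming analytics would involve far more than this.

```python
from collections import deque


def moving_average(stream, window=3):
    """Yield the mean of the most recent `window` values after each new reading."""
    recent = deque(maxlen=window)  # older readings fall off automatically
    for value in stream:
        recent.append(value)
        yield sum(recent) / len(recent)


# Hypothetical sensor readings, for illustration only.
readings = [3, 5, 4, 8, 9, 10]
smoothed = list(moving_average(readings, window=3))
print(smoothed)
```

Because the function consumes one value at a time and keeps only the last few readings, it works on data that arrives continuously as well as on a stored list. The smoothed values make the upward drift in the readings easier to see than the raw numbers do.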
Analytics and predictive analytics will be applied across many domains, from banking, insurance and retail to telecom and energy. In fact, predictive analytics will be the new language of the future, akin to what C was a couple of decades ago, when it was used in all sorts of applications spanning the whole gamut from finance to telecom.
While analytics can mine data for patterns, trends and outliers, predictive analytics can model the behavior of the system under study and come up with future trends and outcomes.
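To make the idea of modeling a system and projecting a future outcome concrete, here is a minimal sketch in Python: it fits a straight line to past observations by ordinary least squares and extrapolates one step ahead. The helper names `fit_line` and `forecast` and the traffic figures are hypothetical, and a straight line is only one of many possible models.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast(xs, ys, future_x):
    """Predict y at future_x from a line fitted to the past observations."""
    slope, intercept = fit_line(xs, ys)
    return slope * future_x + intercept


# Hypothetical monthly traffic figures in made-up units, for illustration only.
months = [1, 2, 3, 4, 5]
traffic = [10.0, 12.0, 14.0, 16.0, 18.0]
print(forecast(months, traffic, 6))
```

The fitted line is the "model" here, and the value at month 6 is the prediction. Real systems rarely behave this neatly, so in practice the choice of model and the check of how well it fits matter as much as the extrapolation itself.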
In this context it is worthwhile to mention the R language, which is used for statistical programming and graphics. Wikipedia describes R as a language that “provides a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others”.
Predictive analytics is already being used in traffic management to identify and prevent gridlock. Applications have also been identified for energy grids and water management, as well as for determining user sentiment by mining data from social networks.
One very ambitious undertaking is “the Data-Scope Project”, which rests on the belief that the universe is made of information and that a “new eye” is needed to look at this data. The Data-Scope project is described as “a new scientific instrument, capable of ‘observing’ immense volumes of data from various scientific domains such as astronomy, fluid mechanics, and bioinformatics. The system will have over 6PB of storage, about 500GBytes per sec aggregate sequential IO, about 20M IOPS, and about 130TFlops. The Data-Scope is not a traditional multi-user computing cluster, but a new kind of instrument, that enables people to do science with datasets ranging between 100TB and 1000TB.” The project is based on the premise that new discoveries will come from the analysis of large amounts of data. Analytics is all about analyzing large datasets, and predictive analytics takes this one step further by making intelligent predictions from the available data.
Predictive analytics opens up a whole new universe of possibilities, and its applications are endless. It will be the key tool of our data-intensive future.
Disclaimer: "The postings on this site are my own and don't necessarily represent IBM's posi