Big Data, Analytics, and Software and Systems Development. Oh my!
Peter Spung
Big Data, Analytics, and Software and Systems Development (SSD): an interesting intersection and combination of value, which may be new to you, a bit confusing, or both. Here's an attempt to clarify. Does this work for you? First, let's briefly define the purpose and value of each. Then, let's combine them to show how they can be complementary and mutually reinforcing in ways that improve a business's customer experience and effectiveness, operational efficiency, and overall business outcomes.

Conceptually, Big Data is as much a human phenomenon and an observation about volume as it is a reflection of any particular purpose or value in the data itself. Simply put, data is being generated, collected and stored at an ever-increasing rate. What kind of data? Well, data about nearly anything that touches an electronic or computer-based device. Every time you go shopping, your purchases are scanned and stored. Every time you interact with someone on Facebook, Twitter, Foursquare, LinkedIn, or other social web sites, data about that interaction is generated and stored. Every time you make a phone call or text someone, data is generated and stored. And on, and on, and on...

Data is being amassed at exponential rates. It's typically measured in bytes; one byte roughly equates to one letter or number, such as the letter 't' in the word 'letter'. Since the volume of data has become so big, so fast, new words continually have to be defined to describe the units of data volume. First came bytes, then kilobytes, then megabytes, gigabytes, terabytes, petabytes, exabytes... And we're now up to zettabytes: one zettabyte is 1 followed by 21 zeros' worth of the letter 't'. In 2011, human beings generated 1.8 zettabytes of data. That's a lot of data; a Big pile of it. This is Big Data. Oh, and the yottabyte has been defined for the next 3 orders of magnitude of volume.

Analytics is about turning that Big pile of Data into something useful.
When an individual data item is first generated and collected, such as when you purchase an item at the cash register, the data has an immediate purpose and value: to look up the price of the item you're buying, and tally up the charges accordingly. Analytics doesn't refer to that initial use of the data item. Analytics tries to give data a purpose and value after that, once it's collected into a big pile (or a small one, as we'll see). Analytics attempts to find meaning and useful patterns among the data; "actionable insights" in the industry jargon.

The canonical example is in the grocery store: when many transactions were analyzed against time of day, one of the most common purchase combinations in the evening was beer and diapers. The narrative attached was that someone on the way home from work stops to buy the evening's refreshments AND something to keep the baby dry and happy. The pattern was confirmed over time, using new data from recent purchases. Given this insight, fact-based merchandising in the grocery store can begin -- the actionable part of actionable insights.

In a business context, actionable insights involve improving business outcomes: selling more, to more delighted customers, and doing it faster and cheaper. In the grocery business, this means selling more items, so that each grocery cart is fuller. And selling those carts at a higher rate and pace. Merchandisers rearrange the placement of products in the store and on the shelves. They run promotions. They encourage the store to make checkout faster and more seamless. All with the goal of adding products to the beer + diapers purchase. And doing it faster and at higher volume, and at lower cost to the business.

Now that Analytics software is checking prior patterns in the data, and looking for new ones, you can test and experiment to see how your business outcome might change.
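
For the curious, here is a minimal sketch of how that kind of pattern hunting might look in code. It counts how often pairs of items land in the same evening basket, then computes two common market-basket measures: support (the share of baskets containing the pair) and lift (how much more often the pair occurs than chance alone would predict). Everything in it is an assumption for illustration: the transactions are made up, and the names `transactions` and `pair_stats` are hypothetical. It is not a description of how any actual grocer did it.

```python
from collections import Counter
from itertools import combinations

# Hypothetical evening transactions (made-up data, for illustration only).
transactions = [
    {"beer", "diapers", "pretzels"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"beer", "pretzels"},
    {"diapers", "baby food"},
    {"beer", "diapers", "baby food"},
]


def pair_stats(baskets):
    """Return count, support and lift for every item pair seen together."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        # Sort so each pair is always stored in the same (a, b) order.
        pair_counts.update(combinations(sorted(basket), 2))

    stats = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        # Lift > 1 means the pair shows up together more often than
        # it would if the two items were bought independently.
        lift = support / ((item_counts[a] / n) * (item_counts[b] / n))
        stats[(a, b)] = {"count": count, "support": support, "lift": lift}
    return stats


stats = pair_stats(transactions)

# Rank pairs by how often they co-occur (ties broken alphabetically).
ranked = sorted(stats.items(), key=lambda kv: (-kv[1]["count"], kv[0]))
for pair, s in ranked[:3]:
    print(pair, s)
```

With this toy data, beer and diapers come out as the most frequent pair. A real analysis would run over far more transactions and would also slice by time of day, as in the story above.
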
In a chain of grocery stores, a merchandiser might try different placements of beer, diapers and complementary products in the store and on the shelves among the different stores. Complementary products are those that recent Analytics has shown are often purchased with beer or diapers; say, pretzels and baby food. Analytics can be used weekly, daily, or even in real time, to understand the impact of the merchandising and changes in store configurations. And to adjust and steer on an ongoing basis to achieve better business outcomes: selling more, and faster.

You might ask, "What's different now? The Analytics you just described has been 10-20 years in the making. I've heard about beer and diapers. What's new, now?" What's new, now, is that within the big piles of data amassing at exponential rates lie structure, information and insights that are completely buried and locked away. Your business is operating and amassing petabytes of data; however, it's not using that data effectively, or unlocking its potential, until the data improves operational efficiency, or customer effectiveness and experience, or lets the business predict and anticipate instead of just reacting. Until one begins to dig through and unearth these things in a systematic way, looking for patterns and predictors, they remain buried deep in piles of big data.

In fact, a new discipline is emerging: data science. Data scientists are becoming impressive characters, learning the skills and techniques to convert raw data into information, predictions, and actionable insights. They are applying predictive, domain-specific analytic techniques to the data, which provides the opportunity to gain deeper insights that drive business operations to be more efficient; to be smarter. And to delight and inspire customers. Data science is a new discipline set apart from those of the software engineer and computer scientist; throwing bigger von Neumann architectures and memory, and faster Turing machines, at the data won't unearth insights.
Data scientists, trained in how to sift and sort the data and its structures and to distill and frame the insights and predictions, will. And it is happening: a taxi company in Singapore, ComfortDelGro, is offering a dynamic route planning and rerouting system using real-time data collected from its 15,000 taxis on the go, and at a stop in traffic jams, which customers in its taxis are hitting less and less frequently. Not to mention Netflix and Amazon, who are offering more precise recommendations based on observing our collected preferences over months and years. Each is using the emerging discipline of data science to deliver real value for business.

So what in the world do Big Data, Analytics, data science, and beer and diapers have to do with Software and Systems Development (SSD)? They relate in two ways.

First, data can be collected about the SSD process, and about the products and services produced: quality, warranty costs, customer feedback and ratings, etc. Analytics based on data science can be used to derive actionable insights about improving the products and services produced, and to derive insights about the SSD people, process and tools that produced them. Now, this isn't necessarily Big Data in volume. However, as we learned in Michael Lewis's book Moneyball, and the movie based on it, you don't need terabytes or petabytes of data to derive the right actionable insights from the right data science and analytics to improve outcomes. That said, as the feedback loop from the product or service's consumer to SSD people and process is extended and improved, data volume will increase. For a popular software-based product, the feedback data volume may become downright big.

Second, analytics is software -- it is data processing if you will, harkening back to what computing and the IT industry were called in the 1960s. Data processing -- the Analytics of Big Data, or small amounts of data for that matter -- involves writing software.
That is, using SSD people and process, along with their data scientist colleagues, to design software that yields actionable insights from analyzing data. Improved SSD can improve Analytics software, which in turn improves the actionable insights and predictions derived from it, and the related business outcomes.

From my byline, you can see I'm with IBM Rational Software, the SSD people in IBM. Does Rational have SSD solutions for those two ways Big Data and Analytics intersect with SSD? Yes, we do. And that will be the subject of my next blog post. I welcome your feedback on this post in the meantime.

Peter Spung, Director, Strategy, IBM Rational Software
@paspung on Twitter