Bit buckets of data on the floor ... what to do ... what to do ...
chrisoc
When I ask IT and LOB executives where their biggest challenges lie, the answer is often the buckets of bits lying around their data center that, if properly understood, represent the value of their applications, their business, their systems and resources. They also always come back to keeping services up and running with the performance customers expect ... there must be some magic way to derive that from those buckets of bits. How do we avoid problems? If they do occur, how do we reduce the time it takes to isolate and fix them? Their worst nightmare is the Twitter storm of customer complaints; the second worst is the war room conference call with 100 attendees all pointing fingers at each other as they try to fix the problem. Early warning of developing issues and speedy root cause analysis in the face of all this information is today's toughest challenge, and it's getting harder as systems and services become more dynamic and more agile.
Most IT shops are already collecting the data they need. A typical 5,000-server environment generates about 1.5 TB of logs, events, performance metrics and other operational data every day. That's a lot of data, and it can be overwhelming. Sorting through it all is a job for analytics. Analytics can show you what's important and provide deep insights, operating on the data you have already collected. It takes us to the next level of effectiveness in detecting and predicting problems, and in quickly isolating their root causes. Customers tell me their third biggest pain point is doing more with less. Analytics can help there too, reducing the need for labor-intensive activities like managing performance thresholds.
While analytics can be intimidating, it's not just the purview of statisticians and mathematicians. Baked into solutions for IT management, it can be harnessed by the rest of us. Behavioral learning approaches allow software to build analytics models without requiring special skills from the user, and to apply that understanding to predicting and detecting problems and reducing mean time to repair. Many existing products rely on manual thresholds to try to predict when performance will cross the line into becoming a problem. Using behavioral learning, software can analyze hundreds of thousands of performance metrics together, without any manual thresholds, figure out what their normal behavior and interactions have been, and then let you know when something abnormal starts to happen. That can be the difference between scrambling to fix a problem that has already caused service disruption and resolving it before any service is affected. It's the difference between catching a memory leak early, because the memory usage doesn't match up with the number of requests, and catching it later, when the service has already degraded.
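To make the memory leak example concrete, here is a minimal sketch of the idea in Python. It is only an illustration, not how any particular product works: it assumes memory usage grows roughly linearly with request count, and the function names, the sample numbers and the 3-sigma cutoff `k` are all hypothetical choices made for this example.

```python
from statistics import mean, stdev

def fit_baseline(requests, memory_mb):
    """Learn 'normal' behavior: a least-squares line
    memory ~= slope * requests + intercept, plus the typical spread
    of the historical data around that line."""
    mx, my = mean(requests), mean(memory_mb)
    sxx = sum((x - mx) ** 2 for x in requests)
    sxy = sum((x - mx) * (y - my) for x, y in zip(requests, memory_mb))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (slope * x + intercept)
                 for x, y in zip(requests, memory_mb)]
    return slope, intercept, stdev(residuals)

def is_abnormal(baseline, requests, memory_mb, k=3.0):
    """Flag a sample whose memory use is more than k spreads away
    from what the learned relationship predicts for that load."""
    slope, intercept, spread = baseline
    expected = slope * requests + intercept
    return abs(memory_mb - expected) > k * spread

# Hypothetical history: requests per minute vs. memory in MB.
history_requests = [100, 200, 300, 400, 500, 600]
history_memory = [1210, 1395, 1605, 1790, 2010, 2195]
baseline = fit_baseline(history_requests, history_memory)

print(is_abnormal(baseline, 300, 1610))  # False: in line with the load
print(is_abnormal(baseline, 300, 2100))  # True: too much memory for 300 requests
```

Notice that 2100 MB is perfectly normal at 600 requests per minute, so a static threshold set above the usual peak would stay silent. The learned relationship flags it because that much memory is abnormal at only 300 requests per minute, which is the early warning described above.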
Does this resonate? We've recently introduced products around Log Analytics and Predictive Insights that are helping show clients how to make the best use of their buckets of bits. We'll continue to talk about and demonstrate IT operational analytics at the Information on Demand Conference in November. IOD 2013 will focus on how Big Data and Analytics are transforming our world, and how you can take advantage of them. Let me know if you are coming; I'd love to connect on what we are doing there and share experiences.