Thanks again for making 2012 a spectacularly successful year for the IBM Software Blog. We'll be back in 2013 with more news, opinion and insights into the software that's changing the way the world literally works. Happy holidays!
The following is the third in a new six-part series on Advanced Data Visualization. Over the next three months, IBM visualization experts will explore new and emerging visual techniques and the underlying technologies you can deploy to better understand your data to transform insights into better business outcomes.
Frank van Ham is a well-known research scientist and an IBM Master Inventor with over a decade of experience in designing and deploying interactive information visualizations. Some of his past projects include Many Eyes, a site for collaborative visualization, and SequoiaView, a visual disk browser. Frank currently works with the IBM Business Analytics division on integrating visualization into IBM's product portfolio.
Once you realize that information visualizations are subjective carriers of a particular message, there are a lot of lessons from communication theory you can apply to your daily visualization design. In this post I will touch on a couple of these.
Know your message
For every data set, there are many different messages I can choose to highlight. You are unlikely to find a single visualization that conveys all features of your dataset in one image. Designing an effective visualization usually begins with making choices: deciding what message you want to convey to your viewers.
The flip side is that you also have to decide which features of your data are irrelevant to the message and eliminate them from your design. Another factor involves deciding what you want to achieve by showing a particular feature: when someone is doing data analysis, they likely want to obtain an objective perspective on the data and have that perspective be as detailed as possible.
If I want to present a set of findings to some of my stakeholders to convince them to pursue a particular course of action, I might want to make sure that they’re only seeing the high level message I intended to communicate and not all the nitty-gritty details.
Know your audience
Like any other communication, many aspects of visualizations are dependent on the audience that will consume your visualization. For example, if your intended audience has an already established mental map of the domain of your visualization, they will be very unlikely to accept an alternate but equivalent representation. The world map below might be a slightly contrived example, but especially when you are deploying visualizations with highly skilled domain experts, you’ll find that they often have their own set of mental models and are unwilling to change them.
Color is something else that is highly culturally dependent. The typical green/red color scheme for financial data might be valid in the Western world, but in the Far East you’ll find that the scale is reversed. Similarly, if you’re building a visualization for a heart monitor, you’ll find that darker red is actually good, because medical specialists associate it with well-oxygenated blood. Talking to your audience before designing and deploying your visualization will help in identifying these potential stumbling blocks.
It’s all good, stocks are in the green.
Be a skeptical viewer
Visual media can be much more convincing than other communication media, because humans are inclined to believe what they see. And exactly because visualization is a medium, you can distort the truth just as easily as in any other medium, sometimes without recipients realizing it. The most obvious occurrences of this are playing with the data axes (using a non-zero axis origin to exaggerate growth, an inconsistent axis scale throughout the graphic, or mapping a single value to area without clearly indicating this).
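The axis-origin trick is easy to quantify. Here is a minimal sketch, with invented numbers, of how a truncated axis inflates a small difference; the bar heights are what a viewer’s eye actually compares:

```python
# Hypothetical sketch: how a non-zero axis origin exaggerates differences.
# The values are invented for illustration.
values = [96.0, 99.0]  # e.g. two quarterly results, about 3% apart

def bar_height(value, axis_origin):
    """Visual height of a bar when the vertical axis starts at axis_origin."""
    return value - axis_origin

# Honest chart: axis starts at zero.
honest = [bar_height(v, 0.0) for v in values]
honest_ratio = honest[1] / honest[0]           # ~1.03: bars look nearly equal

# Distorted chart: axis starts at 95.
distorted = [bar_height(v, 95.0) for v in values]
distorted_ratio = distorted[1] / distorted[0]  # 4.0: second bar looks 4x taller

print(f"honest ratio: {honest_ratio:.2f}, truncated-axis ratio: {distorted_ratio:.2f}")
```

The same 3 percent difference reads as a fourfold difference once the origin moves, which is exactly why a non-zero origin needs to be clearly marked.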
A Google search for “lying with visualization” will turn up plenty of examples. Other, less obvious cases involve using inappropriate precision for values that have a high amount of uncertainty. Graphs that show forecasts should indicate the uncertainty of those predictions.
Graphs that visually show results from a sample or poll should include visual representations of the uncertainty (instead of listing the sampling uncertainty in tiny font beneath the graphic, I’m looking at you, pollsters). In many cases, it’s likely that you are being fed one particular interpretation of the data to support (or create!) a story and it’s important to be mindful of these cases.
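As an illustration of the kind of uncertainty a poll graphic should carry visually, here is a standard 95 percent margin-of-error calculation for a hypothetical poll result (the numbers are invented):

```python
import math

# Hypothetical sketch: the sampling uncertainty a poll graphic should show.
# A 52% result from 1,000 respondents is not "52%"; it is "52% +/- ~3.1%".
p, n = 0.52, 1000
margin = 1.96 * math.sqrt(p * (1 - p) / n)   # half-width of the 95% interval
low, high = p - margin, p + margin
print(f"{p:.0%} +/- {margin:.1%}  (95% CI: {low:.1%} to {high:.1%})")
```

A three-point margin is often larger than the gap the headline trumpets, which is precisely the information that should be drawn, not buried in a footnote.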
If my unit of communication is a sentence, I can try to cram as much information in a single sentence as possible, for example by adding extra clauses to a sentence or by adding concatenations, which often results in sentences that carry a lot of information but are not easy to digest, because the reader has to parse the structure of the sentence to be able to understand it.
In visualization design it might be tempting to have a single graphic show multiple aspects of a single dataset, but in many cases it’s better to break down a complex message into multiple sections. My colleague Graham Wills has offered a number of approaches to reduce chart complexity in a previous post.
Chart embellishments are another feature that should be used with care: It pays to have a graphic that grabs the audience’s attention, because a message that is not received is not a very effective message by definition.
On the other hand, if the features you are using to engage the audience obscure the message you are trying to convey, you will not communicate effectively either. In this post I’ve argued that visualization should not be seen as a content-neutral information carrier, but rather as a novel communication medium.
Almost all of the caveats that apply to more traditional forms of communication (text, speech, images) apply to visual representations of data as well. Next time you view an infographic or a visualization, try to distill the message the author was trying to convey, and consider whether that message is conveyed effectively and, more importantly, accurately.
Continue exploring visual analytics on IBM Many Eyes
Why stop the insight with this article? Visit IBM’s hub of visual analytics, IBM Many Eyes, and join over 100,000 like-minded visualization enthusiasts, academics and professionals. The Many Eyes web community democratizes data visualization by providing a simple three-step process to create and interact with a visualization using your data set. Then share or embed your visualization across the web or your social network.
The following is the second in a new six-part series on Advanced Data Visualization. Over the next three months, IBM visualization experts will explore new and emerging visual techniques and the underlying technologies you can deploy to better understand your data to transform insights into better business outcomes. You can read the first post here.
Graham Wills is the lead architect for IBM’s visualization engine. He has two decades of experience in research and implementation of visualization systems in areas including statistical models, geo- and temporal visualization, large-scale networks and coordinated views. He has published widely in the field, and his recent book, Visualizing Time, is currently available on Amazon.
Data is not simply becoming larger every day; it is also becoming more complex. It is rare that any serious project concerns just one table, and it is common that the data contains structures and information that are non-tabular in nature. A common visualization requirement is to take a set of complex data sources and produce visualizations that allow domain experts (who are unlikely to be visualization experts, statisticians or computer science majors) to see patterns, make deductions, and take effective action.
Einstein’s famous suggestion is to “make everything as simple as possible, but not simpler.” Another quotation of his in a similar vein is “Any fool can make things bigger, more complex, and more violent. It takes a touch of genius -- and a lot of courage -- to move in the opposite direction.” The goal of visualization is to make the complex intuitive, even simple. It’s not an easy goal and, while it doesn’t require an Einstein, it does require a bit of thought and experimentation. Here are some suggestions:
Use Analytics to Focus on the Important
I was recently cast in an amateur theater’s production of “12 Angry Men” and, looking through my part, I noticed that my character (“Four”) was not particularly angry. So I wondered about the question “How angry are the 12 Jurors?” Being a statistician and computer scientist, I couldn’t resist getting to work. After getting the text and reading it in, I built up a few data sets; the base one was the play broken into words, but there were data sets for act breakdowns, graphs of character interactions and so on. I could search for words, build up patterns, see who spoke the most, but it was complex. Too complex.
So I turned to analytics to take the words I had read in, filter out only the spoken words and then assign an “anger score” to each word used. Neutral words scored zero, strong words like “mad” scored high, and peaceful words like “calm” were given a negative score. I rolled all these up to give a simple resulting table: character x word count x anger.
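The scoring and roll-up described above can be sketched in a few lines. The lexicon and dialogue below are invented stand-ins, not the actual script data or scores:

```python
# Hypothetical sketch of the roll-up: score each spoken word against an anger
# lexicon, then aggregate to a character x word-count x anger table.
ANGER_LEXICON = {"mad": 2, "kill": 3, "calm": -2, "quiet": -1}  # invented scores

lines = [  # invented stand-in dialogue, (speaker, spoken text)
    ("Three", "you are mad mad people"),
    ("Foreman", "let us stay calm and quiet"),
    ("Four", "let us review the facts"),
]

table = {}  # character -> (word count, total anger score)
for character, text in lines:
    words = text.split()
    count, anger = table.get(character, (0, 0))
    table[character] = (
        count + len(words),
        anger + sum(ANGER_LEXICON.get(w, 0) for w in words),  # neutral words: 0
    )

for character, (count, anger) in table.items():
    print(f"{character}: {count} words, anger score {anger}")
```

A real analysis would use a proper sentiment lexicon and the full script, but the shape of the result, one row per character, is the same.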
So now I wanted a simple visualization to show this data. A bar chart of character x word count colored by anger would do, but it was a little dry, and I wanted to appeal to people who have a strong domain knowledge about TEXT, not numbers, so I settled on the following Wordle chart:
Bar charts are fantastic for showing differences in values very clearly; aligned bars are the best choice for quantitative judgments. But I wasn’t interested in that. I wanted a display that showed QUALITATIVE effects, and this chart gives that. In the Wordle, you cannot tell who has more words; “Four” and “Ten”, for example, look very similar, but for analysis of a play that is not a bad thing. The audience will not be able to make an exact count; the overall impression is what is important. The fact that they are roughly the same is a better way of stating the data than saying one has a few more lines than the other. In the figure above, the font heights represent the square root of the number of lines, and the more saturated the red, the angrier that character is; characters who try to calm things down are shown in blue, with “Foreman” the most calming of the roles.
Build on Known Learned Representations
People can and do understand complex systems. If you need to make a complex chart, then one possibility is to build on what people already know. Very often domain experts are familiar with certain types of graphic or visualization, and building on that pre-existing knowledge allows you to show something more complex that can still be immediately understood. Genome browsers are a good example of such a visualization. They show tremendous quantities of data, often at multiple scales, with plenty of ancillary data. Yet researchers in the area are very familiar with them, so visualizations based on the representations used in genome browsers can be understood by those experts more readily.
Another visualization that most people are familiar with is the map, such as the one shown below.
Maps often contain many hundreds of thousands of points, joined together to form polygons that cover an extent. We often have different elements (or layers) for different features. Because we are used to them, their complexity has become second nature to us, so when we look at a hybrid Google map with layers of satellite imagery, roads, traffic, features of interest and so on, we are not overwhelmed and can use it to make important decisions (such as “can I walk along the river and get coffee on the way?”).
The visualization shown here uses just two layers. One is a set of state polygon data, where each state is colored by the state’s population as given by the 2010 census, on a heat scale. This type of chart is known as a “choropleth chart.” We are very familiar with this chart, as it is used to show all sorts of US region-based data, from election voting patterns to population data.
A second element (or “layer,” in map parlance) shows points at the state centers. The points are sized and colored by the population in 1960. By compositing these two layers, we can see that California has grown significantly, whereas New York state has declined a lot over the half-century. Michigan too has shrunk, relatively speaking, and Florida has grown. Using the combined two sets of population data and the geo-locations, we might theorize a national shift to the south.
Two Charts are Better than One
Another solution is to divide and conquer. Rather than creating a single very complex chart, we can break the data into different parts and show multiple charts, one for each section of the data. We use interactivity or a visual cue to link them together. For example, if we had four columns of data on the states, we might show two scatterplots with a pair of fields in each one, and color and label the points consistently between the two charts. We could even add a third field (such as population in 2010) to each chart, used as size. If we used the same mapping, we would both add a fifth field to our analysis and, at the same time, enhance the linking by making it easier to spot the same states in different charts (California is the big red one, and so on).
A single chart that tries to show six different variables is likely to fail unless we are very clever, but by showing two unique and two shared variables in each of two separate charts, we can show the same information with more visual simplicity. If we could combine them into one chart, we would do so, but such complexity is often hard to understand, and the technique of using multiple views and linking them works well.
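The shared-mapping idea can be sketched as follows; the states, field values and palette are invented for illustration:

```python
# Hypothetical sketch: two linked scatterplots sharing one color/size mapping
# so the same state is easy to spot in both. All values are invented.
states = {
    # state: (field_a, field_b, field_c, field_d, pop_2010_millions)
    "CA": (1.2, 3.4, 0.9, 2.1, 37.3),
    "TX": (0.8, 2.9, 1.4, 1.7, 25.1),
    "VT": (0.3, 1.1, 0.2, 0.5, 0.6),
}

def shared_encoding(state):
    """Common channels: color keyed to the state, size keyed to 2010 population."""
    palette = {"CA": "red", "TX": "orange", "VT": "green"}  # invented palette
    size = 10.0 * states[state][4] ** 0.5  # sqrt scale so area tracks population
    return palette[state], size

# Chart 1 plots (field_a, field_b); chart 2 plots (field_c, field_d). Both use
# shared_encoding, so six variables are shown across two simple scatterplots.
chart1 = [(a, b, *shared_encoding(s)) for s, (a, b, c, d, p) in states.items()]
chart2 = [(c, d, *shared_encoding(s)) for s, (a, b, c, d, p) in states.items()]
```

Because color and size are computed once and reused, a point that is big and red in one chart is big and red in the other, which is what makes the linking work.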
As a final thought on this technique, it has the valuable feature of making it easier to combine visualizations from different domains. If we have a lot of correlated statistical data on a network, for example, we can create one set of charts for the statistical part and another view for the network display, and then link them together to create a set of views that breaks the complexity down into smaller, simpler units. Minard’s famous graphic of Napoleon’s march (well worth a web search if you are unfamiliar with it) is an example of this linked-charts technique: a map visualization linked to a time series chart.
It is always easy to “add more stuff” in any powerful tool. We can take a simple chart, throw in more elements, color, size and symbol mappings and soon have a chart that allegedly shows more data, but actually hides it in a sea of complexity. Good designers know that the secret to a compelling presentation is to create simple images that capture complexity. This article presents three ways to achieve that goal and allow people to make good decisions from complex data.
The 2012 edition of Information On Demand certainly left us all with a lot to think about, whether it was our Big Future, its Big Possibilities or simply the new mandate to Think Big.
No doubt you’re wondering how to share with your colleagues all the great things you saw and learned over the course of those three days. Naturally, the best way to determine your next steps would be to contact your IBM rep or Business Partner.
For my part, I’ve compiled a list of assets and sites related to the big themes of this year’s conference, namely: Big data, Smarter Analytics, IBM PureSystems and Watson. These are by no means exhaustive, but they should help you get started.
Finally, you can always review our rich store of event video content on our Livestream channel. Review and revisit the opening general sessions, track keynotes and a long list of Scott and Todd’s interviews with keynote speakers and IBM executives.
A crowdsourced collection of big data challenges and analytics opportunities was on the agenda for the Technical Unconference at Information On Demand. Here’s a sampling of what transpired.
“Businesses, place your bets”
“Business is all about placing bets on what you think will happen next,” said IBM Big Data Evangelist James Kobielus. He laid out some of the criteria that IT must satisfy and the capabilities they must embrace to turn the odds in their favor. For example:
Engage Customers as Individuals: Organizations must use every tool at their disposal to drive personalization capabilities into every touch point and form a bond with each customer. Big data and social analytics can help organizations go beyond a 360-degree view of their customers (knowing who they are, what they’ve bought and what they’re worth in the long term) to a 720-degree view that reveals their undeclared needs and opportunities to satisfy them. “When you develop this capacity, customers will stay with you because you make them happy,” he said. “Listen carefully, then lavishly respond. Intimacy is everything.”
Go faster: Kobielus placed a premium on the speed and agility of your analytics operations, right down to the code. “Python and PHP didn’t get popular because they were better. They got popular because they’re faster.” Speed and agility are key IT attributes in a volatile market marked by an ever-tightening spiral of shorter cycles, faster responses and urgent decisions. Successful organizations are powerful, flexible and fast, he said. They’re production-driven, business-focused and results-oriented. Users will choose sub-par tools if those tools help them get work done more quickly.
Automate your analytics: Predictive models are the backbone of the agile enterprise, Kobielus said. But with so many people needing them and with more and more decisions being automated, how does IT respond? “Automate your analytics,” he said. “Automating your modeling function is the only way to respond at scale.” This automation must also include governance capabilities to refresh the models or replace them with next-best challengers when outcomes start to slide.
The two people you’ll meet in every big data project
Tom Deutsch, IBM Big Data Program Manager, gave an overview of the two personality types you’re bound to run into in every big data or analytics project.
“This Changes Everything” Guy: prone to hyperbole and susceptible to the new and shiny. You’ll know him when you hear things like this:
“Everything older than six months is crap.”
“We’ll figure out the information and metadata lineage later.”
“Breaking down silos takes too long. We’ll just create a new one.”
“This will never work” Guy: obsessed with process and addicted to exactitude. His telltale expressions include:
“All this new stuff is crap.”
“We need to get all the governance and metadata right before we start.”
“I don’t trust anything without a schema.”
These opposing dispositions aren’t going to go away, Deutsch said. Further, each has a valuable part to play on any project. Your job is to find roles for them where they don’t manifest these behaviors. There is a happy, successful and profitable medium between utter chaos and complete control.
On the “irony” of NoSQL
RedMonk analyst Stephen O’Grady assuaged fears in the room that emerging NoSQL technologies would render hard-earned and well-honed SQL skills obsolete. “A lot of people thought NoSQL would replace the relational database. They got scared that they would be replaced.” The truth, said O’Grady, is more nuanced than that. “The SQL skills that you’ve developed are just as relevant in NoSQL as in SQL,” he said.
O’Grady said another misconception is that NoSQL is a blanket, one-size-fits-all term. Rather, he said, “there are lots of NoSQL tools that have nothing to do with each other. It’s a matter of knowing what each tool is good at and picking the right one for the job.” These new tools provide IT with considerably more options, and opportunities, to build the right solution.
“Before NoSQL, the answer to every question was to run an SQL query against a relational database,” said O’Grady. “Now, you can choose the right tool for each function.”
"We all have our horror stories"
The frustration floodgates opened wide when participants explained how user demands for faster analytics run straight into the persistent (and persistently thorny) issues of data governance and SLAs.
Here’s a sampling (business users, take note: they’re trying to drive better outcomes, too):
“They’re distinct factions that don’t like to work together.”
“People always want to go off and do their own stuff.”
“Users don’t care how clean the data is. They just want all of it so they can do their own ETL. Taking time to clean it frustrates the bejeebus out of them.”
“It’s difficult to establish service levels in the cloud when a build takes one hour one day and four days the next. My users won’t even wait an extra 30 seconds.”
“I can teach smart people how to think, but it takes time for them to learn the business.”
“I kept rubbing my eyes and saying ‘I don’t see the value.’”
“The approach is generating revenue for us and I don’t want to stop it, but I can’t build a product out of something that only one person knows.”
“I found a massive Microsoft SQL installation running on someone’s desktop that was running an entire shop floor. We didn’t know it was there until it crashed.”
“The data becomes accepted, then embraced, then we find out it’s wrong.”
“You can’t dictate use models when users are in control.”
Big Ideas conveyed, Big Opportunities revealed, the last full day of Information On Demand concerned itself with a Big Future.
Your future, our future, the entire planet�s future.
As it turns out, host Jason Silva isn’t the only one here thinking about it. A charming “person on the street” video compilation yielded some interesting ideas about what our accelerating technology can do for us. For example: houses that talk to us; tabletops as touch screens; pay-per-view sports offering immersive, in-home experiences; a universal translator app for all the languages we speak.
But if they seemed straight out of Star Trek or The Jetsons, none of them fazed Silva in the slightest.
“We’re in a world where complexity is bootstrapping on its own complexity. The more dense and complex the connections among people and systems, the further we advance as a species. We’re riding a wave of accelerating change that is human history. Culture and technology and the manipulation of matter reaching an infinite velocity.”
Remember, Big Future.
But while not everyone will appreciate a surround-screen Super Bowl, everyone will appreciate improved health, which will most certainly be a hallmark of our collective Big Future. Next up was Craig Rhinehart, IBM Director of Enterprise Content Management Strategy and Market Development. The U.S. spends $750 billion on health care every year, said Rhinehart. Yet in global rankings, the country’s quality of care sits 37th, barely ahead of Slovenia, a country of two million people.
A substantial percentage of that $750 billion is due to unnecessary or inefficient treatments, process errors such as duplicate lab tests and outdated manual processes. This “trial and error healthcare system” sees patients continually readmitted for recurring or chronic problems that in many cases never get better. The solution isn’t more money, said Rhinehart. With 83 percent of healthcare costs deemed avoidable, the solution is a move from a reactive system to a predictive and preventative one.
Rhinehart then walked attendees through a new product, IBM Patient Care and Insights. The product surfaces hidden opportunities to improve patient care by integrating, analyzing and liberating the valuable patient information in doctors’ notes, hospital files, lab reports and other sources that remains trapped in information silos.
With IBM Patient Care and Insights, healthcare professionals can analyze both structured and unstructured data using some of the same foundational natural language processing technology as IBM Watson to understand text-based information and present it for analysis.
The predictive analysis capabilities enable healthcare organizations to identify patients at risk for developing illnesses or needing additional interventions. Providers can use predictive modeling, trending and scoring to anticipate patient outcomes and evaluate the potential effects of interventions.
Drawing on the insights gained from analytics, care teams can then use the care management capabilities in IBM Patient Care and Insights to create personalized, coordinated treatment plans for patients that span multiple physicians, specialists, hospitals, clinics and home care environments. IBM Patient Care and Insights eliminates paper-based processes and automates care delivery mechanics such as managing workflow tasks and providing ongoing patient assessments.
I spent an hour yesterday at one of two Influencer Roundtables. The topic was "A closer look at the CMO: Social and Predictive Analytics, How Sports is Becoming a Metaphor for Business." On-stage to discuss the ways professional sports are becoming a "living laboratory" and the lessons they could teach CMOs were:
Leslie Ament: Vice President and Senior Analyst, Hypatia Research (Moderator)
Rod Smith, IBM Fellow and Vice President, Emerging Technologies
Deepak Advani, Vice President, IBM Business Analytics
Andrew Shelton, Head of Sports and Science, Leicester Tigers
Mark Wyllie, CEO, Flagship
Here's what they said:
Leslie Ament: Social media analytics is the "flavor du jour." Are we getting a tangible return on our investment?
Deepak Advani: Marketing is evolving from an art to a science. The proliferation of channels is making it increasingly complex. We’re not where we need to be, but we’re making tremendous progress. For example, we can now do better targeting. We can run social media data through predictive models to understand which kinds of messages will improve the sentiment toward products. Of course, CEOs will say “positive sentiment doesn’t pay the bills.” That’s true, but we’re also starting to model the correlations between sentiment and purchasing behavior.
Rod Smith: We’re part of the way there. Putting analytics in the hands of business professionals will be the next trick. When Tiger Woods makes a great shot, fan reaction will spike on Twitter and CBS will get a massive influx of viewers for the replays for half an hour. That dynamic can impact advertising. Opportunistic marketers will know how to take advantage of those moments.
Leslie Ament: Do organizations understand the difference between simple social media monitoring and using social analytics for business purposes?
Mark Wyllie: They’re getting there. The Miami Dolphins, for example, look at their media strategy across Paid, Earned and Owned media channels. That used to be all Paid. Now they’re looking at new projects that can take fan content on Facebook and Twitter and present it on the Jumbotron during games. When Chipper Jones tweets about a problem with his bathroom at a Holiday Inn, there’s someone at his door 10 minutes later. But there’s no call to the front desk.
Leslie Ament: What are the biggest challenges to implementing social analytics in sports organizations? How do you overcome them? What results have you seen?
Andrew Shelton: We’re also in the early stages. We started by collecting player data from the past few seasons and doing some basic reporting, but we didn’t take a lot of action. In sports, injuries drive performance outcomes, so we want to make more confident predictions. We’re starting to look at modeling all of our player injury data from the past few seasons: that includes training, treatment and recovery data, even collision data from sensors the players wear. From there, we’re building a model to predict if they’ll be injured again.
Deepak Advani: This is the same approach that oil companies and power plants are applying to their maintenance schedules. Whether it’s player injuries or mechanical breakdowns, it’s more effective to predict and prevent failures before they happen.
Leslie Ament: So where do you start? Which variables matter?
Rod Smith: It’s easy to look at the structured information and start there, but you really need a discovery model. Sometimes structured data doesn’t reveal anything useful. So you need to pull in other data and combine it to see if the new combination has value. Also, factors are going to change over time. Predictive analytics can help you manage uncertainty over time, but it’s also an iterative process. You need to look at different indicators that you think will be useful at different times.
Deepak Advani: Sometimes you don’t know which variables will impact outcomes. But if the data is there, you should try it. A perfect example of this is the Memphis Police Department. They looked to see if there was a correlation between crime rates and phases of the moon, and the changes they made because of this insight helped reduce violent crime by 28 percent.
Andrew Shelton: Our focus is to prevent injuries and protect our players’ physical welfare. So we look at every kind of variable; that adds up to 1,500 data points per player per day. It comes from GPS data collected during training, recovery from various therapies, game data and a lot more.
Leslie Ament: What steps do you take to help clients choose variables and get started?
Mark Wyllie: Every assessment begins with the same question: What are you trying to achieve? The Miami Dolphins were looking to optimize the fan experience. So we walked around the stadium for three games, one of which was a WrestleMania. We didn’t have any predefined ideas when we went in. We toured back-of-house operations, we talked to ushers, we simply paid attention to what was going on and took a lot of notes. Then we came up with recommendations for new signage and improving ingress/egress. The fan response was positive because they got into the stadium more quickly, and team owners were happy because it meant fans bought concessions sooner, which meant they’d have more time to go back and buy more.
The challenge in sports is to keep fans coming back to the stadium. Teams need to preserve the value of their offering and offer fans experiences that they can’t get at home. In Miami, the Dolphins just rolled out a discount and loyalty card for season ticket holders. This gives them things like access to the field before the games, opportunities to meet the players, and a “Rookie Zone” that puts new fans closer to the field. All of these new programs were driven by survey data.
Leslie Ament: Using social analytics and big data together can be very powerful. Can you cite some examples?
Deepak Advani: Remember, unstructured data isn’t just from social media. Consider call center records. Service providers have massive volumes of service records that have never been analyzed. But I know of one provider that cut customer churn from 90 percent to two percent. They did this by putting their unstructured customer data and churn data into a predictive model that connected the two data sets through customer ID numbers.
Rod Smith: Real-time data lets you “freshen up” your customer insights on a regular basis. You can monitor the performance of your messaging and stories. What are the latest things on your customers’ minds? Those will change over time. The cost of doing this has dropped dramatically.
Leslie Ament: Where are social analytics and predictive technologies going? What should best practices be five years from now?
Rod Smith: Privacy is going to be an increasingly large concern. Companies can tap into what people are saying about their products, but the rules on acting on those insights are still unclear. This is a grey area; businesses don't want to cross that line.
Deepak Advani: Consumers will exert ever more control over how they engage with companies. They're going to engage with companies on their own terms, so companies will need to pay more attention to advocates and near-advocates, as well as to detractors and near-detractors. The marketing function will change: we used to spend our time getting people to buy products or handing off leads to sales. Now, though, customers say when they're ready to buy. The new role for marketing will be to ensure customers get the experience they expected. Customer delight can fuel a cycle of advocacy. It's a shift from customer acquisition to customer attraction.
Leslie Ament: How do organizations tackle the cultural challenges in making this big a change?
Deepak Advani: The value of predictive analytics resonates more with business managers, but they can also be skeptical. Will this new tool take my job? Also, they're not terribly interested in learning about linear regressions. You need to speak their language.
Rod Smith: You need to find a believer and do a proof of concept. You need to present it as "informed intuition." IT also needs to rethink its role. There's a rebalancing of the relationship between business and IT going on. There are also new delivery methodologies to consider. Take cloud, for example. Sports teams don't want to spend on IT infrastructure; they want to spend on things that help the team win and keep the fans coming back. Cloud capabilities take control out of IT's hands.
Big opportunities for better outcomes dominated the opening general session as Information On Demand 2012 moved on to day two.
A kickoff video showcasing successful IBM customers Dillard's, Moneygram and Del Monte opened attendees' eyes to the transformative power of big data harnessed and acted upon.
First, attendees heard how retailer Dillard's improved customer relationships through a more effective CRM system and improved productivity across its multiple lines of business.
Next, they heard how money transfer leader Moneygram cut fraudulent transactions by 72 percent to save more than $62 million.
They also learned how Del Monte reduces risk and increases efficiency by applying predictive analytics to weather patterns and global trends.
With the video concluded, host Jason Silva took the stage and once again provided a thoughtful, near-philosophical context to help attendees orient and attune themselves to the discussions to come throughout the day.
"We're living in a new cosmology," he said. "A new computational universe of big data in which our thoughts, genes and behavior are reduced to bits churning through algorithmic computes. It's circuits and information flows, signal and noise."
�When information is everything, we can understand how things really work.�
Hence, Big Opportunities.
Very little of this, though - perhaps none of it - would be possible without the precipitous drop in the overall cost of computing.
Thus began the presentation by Steve Mills, IBM Software & Systems Senior Vice President and Group Executive.
Mills proceeded to highlight some of the latest and biggest numbers from the big data universe. For example:
An expected 1.3 zettabytes of internet traffic by 2016.
500 terabytes processed by Facebook each day; 12 terabytes daily processed by Twitter.
40 terabytes per day generated by the Large Hadron Collider at CERN.
24 terabytes crunched every day by Google.
From here, Mills furthered Robert Leblanc's illustration of the resulting pressures on IT. Consider, for example:
The six billion mobile phones in the world (half of which are used by people with no access to electricity).
The 1 million wireless sensors for each 10 square kilometers of Shell Oil exploration projects.
An expected 420 million wearable or wireless health monitors in use by 2014, up from 12 million in 2012.
To make sense of such a maelstrom of machine and mobile data, Mills said organizations need multiple platform capabilities such as data visualization and discovery. Further, he said, they must be delivered from a flexible, adaptable platform built on open standards.
Mills went on to explain how though the challenges of big data may be new, the tools and techniques organizations can use to resolve them are not. Rather, he said, organizations will capitalize on the big opportunities by rethinking the things they've already done. Opportunities to increase efficiency, reduce waste, eliminate fraud and increase customer loyalty abound, as does the data. Mills even made the bold claim that U.S. national debt could be eliminated within a decade by applying better analytics to its myriad disbursement processes.
"To every challenge the data is there," he said. "All that's required is for organizations to grasp the opportunity. Capitalizing on big data doesn't require us to do anything other than raise our game in the things we've been doing for a long time. We're not constrained by cost or technology."
Drawing on IBM's now 100-plus years in business, Mills left attendees with an encouraging thought: "We've been through these transitions before. This is right in our wheelhouse."
Mills then gave way to Fred Balboni, Global Leader for Business Analytics and Optimization, IBM Global Business Services, who stressed the importance of acting on the new insights organizations gain from big data.
"Insight will be the next commodity," he said. "All competitive enterprises will find new insights in their data."
"The difference will be in acting on those insights. Once you know, what can you do, and how fast?"
Balboni previewed a world remade by big data and analytics in which no profession, industry or business function will be untouched. For example: No longer will CMOs simply manage agency relationships. They'll be called upon to provide hard numbers. CFOs will cease to be scoreboard operators and assume the role of coach, calling in the plays that move their organizations forward. CIOs will continue to obsess over technology, but will also need to change the way they interact with the business to ensure all functions can benefit from big data and analytics.
These transitions are shifting organizations' mindsets from tackling their biggest problems to seeking out their biggest opportunities.
"It's a very optimistic mindset," he said.
To demonstrate what's possible, Balboni welcomed to the stage two IBM customers who had taken these words very seriously indeed.
From JP Morgan Chase, Senior Vice President of Customer Analytics Adam Braff explained how big data and analytics are helping America's largest bank better understand its customer interactions across its multiple channels. From information brokers Thomson Reuters, CIO Jerry Hope explained how a new analytics program was helping the company better integrate and leverage customer data and relationships and develop new products based on a 360-degree analysis of its 100 most important customers.
The session concluded with IBM Business Analytics General Manager Les Rechan taking the stage. The always energetic Rechan began his talk by calling attention to a recent Harvard Business Review article declaring data scientist "the sexiest job of the 21st century."
Rechan proceeded to explain how analytics is simply the starting point for organizations and that a strong IT-business partnership is essential to weaving analytics into every business process. "Data is the new oil," he said. "Our job is to drill into and refine it, to turn insight into action. We need to move from data-rich, information-poor to data-rich, insight-pervasive."
Software capabilities such as information integration and governance may seem far removed from the philosophical nature of Monday's opening keynote, but in fact they - and many others - are essential for delivering on the promise of big data. So it was with much interest that I sat in on a session entitled "Raising the Bar: Findings From a Study of Smarter Analytics Use and Outcomes by Industry."
The joint IBM/Villanova University study was carried out by Dr. Matthew Liberatore of the Villanova University School of Business and Information Agenda Program Director Carolyn Martin. It looked at the respective performance of organizations in the four phases of the IBM Information Agenda: Define and Govern, Information Foundation, Trusted Information, and Analytics and Optimization.
The team looked at results for companies in 10 industries, around the world, from 2009 to 2011.
The industries were: banking, insurance, telecommunications, government, healthcare, industrial, retail, consumer packaged goods, travel and transportation and energy and utilities.
The geographies were: North America, Western Europe, Central and Eastern Europe, Middle East & Africa, Latin America, Japan and Asia Pacific.
The team has completed the first phase of its research. Some interesting findings are below:
Within the Information Foundation phase, organizations rated metadata management as the top priority.
Within the Trusted Information phase, organizations rated master data management as the top priority.
Within the Analytics and Optimization phase, organizations rated performance management and analysis as the top priority.
Findings by industry
Industry leaders have a 24 percent higher maturity rating than their peers in the Define and Govern and Information Foundation phases.
Organizations in industrial, insurance, telecommunications, healthcare, banking and travel & transportation ranked in the "High" performing group. One hypothesis for the results is the heavy regulation of these industries.
Organizations in government, energy and utilities, consumer packaged goods and retail were ranked in the "Low" performing group. Budget pressures, fragmented management and a lack of competition - particularly for government organizations - were offered as a hypothesis for these results.
Findings by geography
Central and Eastern Europe, along with Middle East and Africa displayed a 26 percent lower maturity score across all phases of the Information Agenda. Reasons for this outcome included slower market development and political instability.
Latin America demonstrated the highest aspirations for maturity across all phases of the Information Agenda, with improvement goals up to 30 percent higher than other geographies. Here, a few strong leaders, particularly in the financial sector, are driving these outcomes.
North America demonstrated high aspirations for improvements, but did not achieve as high outcomes. Organizations in this region demonstrate a good understanding of the benefits of an Information Agenda.
Western Europe was 21 percentage points less likely than other geographies to name business analytics and optimization as a priority, due most likely to its ongoing economic crisis.
Findings over time
Analytics maturity dropped across all industries from 2009 to 2010, but rose by 12 percent in 2011. Possible reasons for this? Most organizations either assumed they were performing well, or simply did not know what they did not know, said Martin. Increased media and academic emphasis on analytics beginning in 2011 may account for the rise in awareness.
Finally, most organizations follow a common path toward an Information Agenda: Information Foundation capabilities are their first priority, followed by those in the Define & Govern phase. From there, they move to Trusted Information, with Analytics and Optimization as the last stage.
Big data innovations turned into big data implementations in my next session of the day. The session began with a recap of the findings from the just-released IBM/Oxford study, "Analytics: The Real-World Use of Big Data," followed by a panel discussion featuring four IBM clients with a long track record of implementing big data projects.
Study findings were presented by Michael Schroeck, VP & Global Leader, Information Management Foundation, Business Analytics & Optimization, IBM Global Business Services. Schroeck is also one of the report authors. Among the findings:
Over the past two years, the percentage of organizations reporting competitive advantage from analytics has jumped from 37 percent to 63 percent.
Yet, organizations are struggling to leverage the four "Vs" of big data. Specifically:
volume � from terabytes to petabytes
variety � structured, unstructured, even semi-structured
velocity � data in motion, data in streams
veracity � data uncertainty, separating signal from noise.
Organizations report different levels of maturity with their big data projects: 24 percent are increasing their awareness of big data potential; 47 percent are developing plans and blueprints, teams and roadmaps; and 28 percent are implementing proofs of concept (POCs), pilots and enterprise solutions. This last group is "very committed and moving quickly."
The survey highlighted five key findings on how organizations are moving forward:
Customer analytics are driving the majority of projects.
Big data success depends on a scalable and extensible information foundation.
Organizations are focusing their initial big data efforts on gaining insights from existing and new internal sources.
Big data success needs strong analytics, but people skills are not keeping pace. Organizations are suffering from an analytics skills gap, which is one of the biggest inhibitors of big data progression.
Big data projects will not move forward without a strong business case. This finding cuts across industries.
To moderate the panel discussion that followed, Schroeck handed the microphone to his IBM counterpart Sharon Hodgson, Service Line Leader for North America, Business Analytics & Optimization, IBM Global Business Services. Here are some highlights from the discussion:
Big data can help big organizations restore the personal touch - at scale. An individual insurance agent can have up to three thousand customers. Yet through analytics and natural language processing, a company can discover, optimize and disseminate best practices for customer engagement that improve the personalized dimension of those relationships for all its reps and through all its channels.
Big data may be big, but it's not complete, at least not yet. Some organizations are missing important data sets to complete their customer profiles. Also, some data sets may be too old to be helpful. Organizations need to take a step back and carefully think about how to get the data they need. "We need to treat data with more respect," said one panelist. In addition, not every customer is willing to share every piece of information. Organizations need to determine the data sets their models require to provide new products or improve existing ones so that all customers may benefit, and they need to justify their rationale for asking customers to share.
The skills shortage is real. Organizations are seeking people who possess not only strong analytics and data skills, but product and industry knowledge as well. Old-school problem-solving is also desirable. Such individuals are in extremely short supply.
Business leaders must take the lead. Companies will go further with their big data projects and receive more value from them with a business executive at the helm. Also, starting with a clearly articulated desired business goal is key.
Governments and public leaders must lead the discussion on big data use. The boundaries, roles and responsibilities of organizations pursuing value from big data are fuzzy at best. A discussion on the ethical use of big data needs to happen now.
"To understand is to perceive patterns." ~ Isaiah Berlin
Big data and smarter analytics came wrapped in a healthy dose of awe as the curtain rose this morning on the 2012 edition of Information On Demand in Las Vegas.
"Think Big" is this year's theme, and host Jason Silva conveyed it with conviction, dazzle and flair.
Dwarfed by a mammoth 80 x 40-foot screen (the largest that can be made without a seam), an energetic Silva listed myriad ways that big data and analytics are helping organizations take advantage of Big Opportunities to create a Big Future.
He explained, for example, how the United Nations is using sentiment analysis to help predict civil unrest, job losses, spending reductions and disease outbreaks. He highlighted how real-time grid data help electricity companies detect and fix problems before a major outage. He illustrated how doctors can now tap into the experience of other doctors to determine the best treatment.
"This seismic shift toward data-driven discovery and decision-making is a revolution," he said.
All the while, futuristic images of cities, trees and, oddly enough, harvester ants flew by as the TED Talker and self-confessed "epiphany addict" placed these ideas within the context of metabolic laws and biological design and quoted at length from big thinkers like Kevin Kelly and Steven Johnson.
"The more we look at these patterns, the more they resemble forms in nature," he said. "We're taking human understanding to an unprecedented level."
"Do you know what's going on here? It's accelerated evolution."
Think Big, indeed.
But if technology is "slingshotting us forward faster than ever before," if you consider yourself one of "the truly enlightened ones," a "cosmic revolutionary" in a "new renaissance," how do you make it - even a small part of it - happen in your organization?
At this point Silva ceded the floor to Robert Leblanc, IBM Senior VP for Middleware Software, whose charts provided the answers and whose customer interviews provided the proof points.
Leblanc drew from a long list of IBM C-suite studies to show how, since 2004, technology has steadily risen to the top of executives' list of concerns. The rapid adoption of mobility and cloud, plus advances in big data and analytics, are ushering in a new era of computing, he said. Data volume, variety, velocity and, increasingly, veracity are the key drivers of this new era.
Leblanc shared Silva's enthusiasm for the Big Opportunities, but he also acknowledged the big strain these four V's are placing on organizations and their IT infrastructures. IT workloads are now so onerous that most organizations spend nearly two-thirds of their IT budgets on maintenance and administration, he said. Further, only one organization in five allocates more than half of its IT dollars to new projects.
"I want that 63 percent to be inverted," said Leblanc. "I want you spending 63 percent of your time innovating."
To make that switch, Leblanc said, organizations must embrace a new mindset more in tune and in sync with our dynamic and interconnected global economy.
To prove that such a transition is possible, Leblanc then welcomed the first of two client speakers to the stage.
Phil Anno, Principal Scientist at energy giant ConocoPhillips, explained how IBM InfoSphere Streams computing is helping protect the company's oil rigs and optimize its investments in the Arctic, a region believed to contain 25 percent of the world's remaining gas and oil reserves. One oil rig is a $350 million investment with a time frame measured in decades, said Anno. To optimize a rig's placement and output - not to mention safeguard its employees - the company uses IBM predictive capabilities to track the movement of icebergs and ice floes in the Arctic sea, Anno explained. These insights improve the company's ability to deploy ice breakers. With thousands of icebergs in constant flux, this project can generate a terabyte of data per day, depending on weather conditions and ocean currents.
Following Anno on stage was Keith Figlioli, SVP of Informatics at Premier, a performance improvement alliance of more than 2,700 U.S. hospitals and 90,000 other sites. Data silos are pervasive throughout the U.S. healthcare system, said Figlioli, so much so that 30 cents of every health care dollar is lost to fraud or waste. More worrisome, he explained, 100,000 people die each year from preventable hospital infections.
"Providers can't afford not to take advantage of analytics," he said. "It's literally a matter of life and death."
He then described how Premier's analytics solution is driving a transition from the "stone age" of siloed and manual processes that contribute to these problems to an integrated and powerful ecosystem of insights that can dramatically cut costs and improve patient care. "There's not a lot of end-to-end visibility," he said. Processes can be improved even within a single protocol like a blood transfusion, Figlioli explained. When applied across its providers - some 40 percent of the U.S. healthcare system - these improvements can drive dramatic cost savings and improve patient care. "Our analytics are aimed at the heart of the problem."
Finally, Leblanc also provided considerable stage time to IBM executives Inhi Cho Suh and Deepak Advani, who walked attendees through the IBM technologies helping organizations seize these Big Opportunities today. First, Inhi Cho Suh, VP of IBM Information Management, Product Management and Strategy, provided a detailed rundown of IBM's new family of PureSystems expert integrated systems. You can read more about PureSystems on our official Information On Demand Blog.
Deepak Advani, VP Business Analytics Products and Solutions, then walked attendees through a simulation of IBM's new Smarter Analytics solutions. For more, you can read Katrina Read's recap.
Hollywood magic and hard-nosed analytics made an unlikely couple this morning as the Business Partner summit component of Information on Demand 2012 drew to a close.
Both were embodied in the figure of Jeff Ma, the blackjack whiz kid turned analytics strategist. In a funny, engaging and at times highly personal presentation, Ma explored the parallels between the blackjack table and the boardroom and imparted important lessons for succeeding in a data-driven world.
An MIT engineering graduate and member of the MIT Blackjack Team, Ma parlayed his experiences devising analytically driven blackjack strategies into the success recounted in the book Bringing Down the House. The book served as the inspiration for the film "21," starring Kevin Spacey, Laurence Fishburne and Kate Bosworth.
Ma appears in the film as well - for two minutes, as a blackjack dealer.
"Only true Hollywood magic could transform an average-looking Asian guy into a dashingly handsome white guy," he joked.
According to Ma, there are three main reasons why blackjack is a perfect vehicle to hone your analytics strategy:
First, it's a closed system. Unlike poker, where the rules can change from dealer to dealer, blackjack features fixed, hard-and-fast rules. A face card is always 10. No cards are ever wild. As such, it's a system that can be modeled and analyzed, and your decisions can be optimized.
Second, the past matters. With a fixed number of cards in each shoe, each card can be played only once. As such, the odds of landing a "21" will change with each hand. Savvy card counters keep tabs on the number of high and low cards that have been played and adjust their tactics accordingly.
Third, there already exist proven strategies to win. Counting cards can cut a casino's advantage by a factor of two, said Ma. A basic strategy can reduce a casino's advantage by a factor of six.
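The counting idea behind these strategies can be sketched in a few lines. This is a simplified illustration of the widely documented Hi-Lo system, not the MIT team's actual method:

```python
# Simplified Hi-Lo running count: +1 for low cards (2-6), -1 for high
# cards (10, J, Q, K, A), 0 for 7-9. A positive count means
# proportionally more high cards remain in the shoe, favoring the player.

def hilo_value(card):
    if card in ("2", "3", "4", "5", "6"):
        return 1
    if card in ("10", "J", "Q", "K", "A"):
        return -1
    return 0  # 7, 8 and 9 are neutral

def running_count(cards_seen):
    """Sum of Hi-Lo values over every card dealt so far."""
    return sum(hilo_value(c) for c in cards_seen)

# After a run of mostly low cards, the count turns positive: bet bigger.
print(running_count(["2", "5", "6", "9", "K"]))  # -> 2
```

The "past matters" lesson above is exactly this: the count is a memory of every card already played.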
Despite these advantages, Ma said, most blackjack players lose more than they win. Here are the reasons why, and how to avoid them. Business leaders take note: these lessons apply to you as well.
Eschew loss aversion. The fear of losing what they have prevents people from moving forward to greater reward. Many players will "stay" on 15, despite basic strategy telling them to hit. In a moving personal anecdote, Ma recounted how his mother, having suffered a massive debilitating stroke, convinced doctors to operate to remove the blood clot in her brain despite the risks. For him - and eventually for his family - a 22 percent chance of survival by doing nothing was not good enough. The operation was successful, his mother recovered, and the two danced together at his wedding a few months later.
Focus on the long game. Data-driven decisions don't always pay off. If you hit on 15, for example, you will sometimes go bust. Yet, said Ma, an analytics strategy, consistently executed, will yield better outcomes over the long term.
Decision points are everywhere. You must approach each one from a zero-point frame of reference. Don't ask yourself which decision will prevent a loss; ask yourself which decision will help you win.
Avoid groupthink. In a game that Ma said became his "defining moment," his decision to split a pair of 10s into two different hands - again risking a strong position for a potentially bigger upside - went against the expectations of everyone around him, including the dealer. Yet the move paid off. "You can't make everyone around you happy," he said. "If you try to, you'll never innovate."
"You have to believe in an analytically driven strategy," said Ma. "Analytics can take the risk out of gambling. They can take the risk out of business as well."
The following is the first of a new six-part series on Advanced Data Visualization. Over the next three months, IBM visualization experts will explore new and emerging visual techniques and the underlying technologies you can deploy to better understand your data to transform insights into better business outcomes.
Graham Wills is the lead architect for IBM's visualization engine. He has two decades of experience in the research and implementation of visualization systems in areas including statistical models, geo- and temporal visualization, large-scale networks and coordinated views. He has published widely in the field, and his recent book, Visualizing Time, is currently available on Amazon.
Visualization is an enabling technology: when we create a set of charts to show some data, our goal is not to create a pretty chart for its own sake, but rather to reveal something in the data.
When we look at beautiful hand-drawn pictures of data, carefully composed by talented individuals, we are drawn to the artistic side. In some ways, those charts are discouraging; their artistic elegance implies that the creation of good visualizations is not an option for most of us.
There are books that provide rules and advice on how to draw graphs. Some give general advice, suggesting that such and such is good, but this other is bad. Others give specific advice such as requiring all charts to have a title, or all axes to go to zero, but these are often tied to specific visualizations, and so are not general enough to qualify as scientific principles. So this leads to a question � what makes a good visualization? Is it the quality of the presentation, or is it the degree to which it allows people to explore and understand the data?
Over the years, this split has led people to label charts and put them into categories: tables and pie charts are presentation charts; anyone wanting to explore their data should not use them. Scatterplots are exploratory charts; hide them from anyone who isn't a data geek. Different tools evolved that concentrated on one of these aspects: presentation graphics packages that emphasized a vast amount of customization on a small subset of simple charts, and exploratory graphics packages that allowed very little customization but often had a wide and eclectic set of charts.
Exploratory/presentation split no longer useful
Now, in contrast, there is a much stronger emphasis on chart-building tools, whether based on programming libraries or language descriptions (such as IBM's Grammar of Graphics-based approach). My strong feeling is that the exploratory/presentation split is no longer a useful one; visualization tools can serve both presentation and exploratory goals. In fact, I would argue that they must do so.
Consider the figure below. It shows box office take for movies in 2008. It has presentation aspects: it highlights major effects, looks attractive and is effective as a static image. But it also facilitates exploratory tasks. We can see not only the big movies and when they were released (summer and Thanksgiving effects are very obvious), but we can also browse through the shapes and see more subtle details, such as how "The Dark Knight" hit its peak rapidly, whereas "Juno" had much longer legs. We could switch color from being simply a way to differentiate movies to instead encoding movie type: action, drama, comedy and so on.
It would also be interesting to use a small-multiples or time-animation approach here and show the same chart for several years (I suspect 2012 will look very similar - swap "The Avengers" in for "Iron Man" and "The Dark Knight Rises" for "The Dark Knight"). We could explore different aspects of movie releases with simple enhancements to this chart.
This chart is based on a relatively "tech-y" statistical technique, kernel density estimation, composed with a stacking operation similar to the one that makes stacked bar and area charts. It is then wrapped around into a circle using a polar transformation that was originally developed for the canonical presentation graphic, the much-maligned pie chart.
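Those two building blocks (density estimation and stacking) can be sketched without a plotting library. The movie release data below is invented for illustration, and the final polar wrap is omitted; it would simply reinterpret the stacked curves in polar coordinates:

```python
import math

# Toy sketch of the chart's composition: a Gaussian kernel density
# estimate of daily box-office "events" per movie, then a stacking step
# (running sums) like the one behind stacked area charts.

def kde(samples, x, bandwidth=7.0):
    """Gaussian kernel density estimate evaluated at point x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                      for s in samples)

movies = {  # hypothetical "ticket sold on day N" samples
    "Blockbuster A": [180.0, 182.0, 183.0, 185.0, 190.0],  # sharp peak
    "Indie B": [10.0, 40.0, 70.0, 100.0, 130.0, 160.0, 190.0],  # long legs
}

days = range(0, 365, 5)
stacked = []  # for each day: cumulative heights, one entry per movie
for d in days:
    heights, total = [], 0.0
    for samples in movies.values():
        total += kde(samples, d)
        heights.append(total)  # each movie sits on top of the previous
    stacked.append(heights)

# The top of the stack is the outer envelope that the polar
# transformation would bend into a circle.
envelope = [h[-1] for h in stacked]
```

Composing the pieces this way, rather than hard-coding one chart type, is what makes the same machinery serve both exploratory and presentation goals.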
This concept of composability is central to bridging the divide between presentation and exploratory graphics. A feature that might be thought of as presentation only can be used as an exploratory tool. And limiting your graphics to a small set of presentation charts is a recipe for failure in a world where data is not simply growing in volume, but variety.
Even the most staid and low-tech visualizations can benefit from composing in exploratory aspects. The table shown below gives the percentage of canceled flights by day of year over a period of twenty years (so the top-left cell tells us that 2.1% of flights on January 1 were canceled).
Add shading to tables to aid exploration
A table is a good, simple way to represent the data. It is helpful to see the numbers, so we can see that July 4th is a great day to fly, but the numbers themselves do not help us explore and find patterns. So this table has been enhanced with exploratory features: we shade the cells by the cell data (which makes it easier to see the strong difference between November and December, for example) and, because upper-end outliers are of particular interest, we highlight the statistically significant cells with a border.
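The outlier-flagging step can be sketched with invented numbers (the study's real table is not reproduced here). As a stand-in for a formal significance test, this sketch flags cells more than two standard deviations above the mean, analogous to the bordered cells described above:

```python
import statistics

# Hypothetical cancellation rates (percent) for a handful of cells,
# standing in for the flight table described in the text. Cells far
# above the mean are the ones a reader should notice first.

rates = [2.1, 1.8, 1.9, 2.0, 1.7, 2.2, 1.6, 8.5, 1.9, 2.0]

mean = statistics.mean(rates)
sd = statistics.stdev(rates)
outliers = [r for r in rates if r > mean + 2 * sd]

print(outliers)  # -> [8.5]: the spike stands out from routine days
```

In the real table, the same idea drives both the cell shading (value maps to darkness) and the borders (flag only the cells that clear a statistical threshold).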
I have deliberately kept this a black-and-white chart rather than use color (which is more effective) to show that even in a very constrained situation such as a printed report, visualization can successfully merge exploratory and presentation techniques and improve the ability of people to do what they do best: see important features of their data and take action on them.
That is the heart of what visualization is for. A beautiful chart can be appreciated, but when we have beautiful charts that allow people to see their data and take action based on it, we have merged art and science to provide truly useful visualizations. Allowing exploratory and presentation features to be composed is a key feature that makes this possible; it is the future of visualization.
Continue exploring visual analytics on IBM Many Eyes
Visit IBM's hub of visual analytics, IBM Many Eyes, and join over 100,000 like-minded visualization enthusiasts, academics and professionals. The Many Eyes web community democratizes data visualization by providing a simple three-step process to create and interact with a visualization using your own data set. Then share or embed your visualization across the web or your social network.
We're one week out from the start of Information On Demand 2012, where Jason Silva will challenge you to "Think Big" about the opportunities, innovations and improved outcomes for your own organization that you'll discover at the largest conference in the IBM Software event calendar.
The articles and blog posts below should give you a head start. I've drawn them from a wide range of sources, but they all sit at a happy intersection of big data, analytics, professional development and organizational change. These topics will no doubt dominate the main stage and breakout sessions during our four days together. Happy reading!
How do YOU feel about the potential of big data?
A new Pew Internet/Elon University survey measured current opinions about the potential impact of human and machine analysis of newly emerging large data sets in the years ahead. While 53% of those surveyed predicted that the rise of Big Data is likely to be "a huge positive for society in nearly all respects" by 2020, 39% of survey participants said it is likely to be "a big negative."
"The analysts who expect we will see a mostly positive future say collection and analysis of Big Data will improve our understanding of ourselves and the world," said researcher Lee Rainie, director of the Pew Research Center's Internet & American Life Project.
"They predict that the continuing development of real-time data analysis and enhanced pattern recognition could bring revolutionary change to personal life, to the business world, and to government."
Quantified communities: Coming soon to a neighborhood near you?
Over on Project Syndicate, tech watcher and blogger Esther Dyson makes a case for "Quantified Communities." An evolution of the nascent "Quantified Self" movement, Dyson sees communities taking their data into their own hands to measure, analyze and improve outcomes across a range of essential services:
I predict (and am trying to foster) the emergence of a Quantified Community movement, with communities measuring the state, health, and activities of their people and institutions, thereby improving them.
Just consider: each town has its own schools, library, police, roads and bridges, businesses, and, of course, people. All of them potentially generate a lot of data, most of it uncollected and unanalyzed.
That is about to change. As with the Quantified Self, the tools for collecting and analyzing data about everything from public health to potholes in roads, real-estate prices, school attendance, and more are beginning to emerge. Indeed, many independent data-analysis software tools and Web sites provide data that can be filtered for local information and presented with useful visualizations.
The positive message is that digital progress is, in my view, the best economic news in the world today. And I'll go one step further: it's the most important business story in recent times.
I believe that when the full impact of the computer is assessed, it will turn out to be about as big a deal as the steam engine. And the steam engine was a very, very big deal indeed. It touched off the Industrial Revolution, which changed the world, for the better, as nothing has before or since.
With Big Data we can now begin to actually look at the details of social interaction and how those play out, and are no longer limited to averages like market indices or election results.
This is an astounding change. The ability to see the details of the market, of political revolutions, and to be able to predict and control them is definitely a case of Promethean fire: it could be used for good or for ill, and so Big Data brings us to interesting times. We're going to end up reinventing what it means to have a human society.
CIOs: Boost your people skills to increase your influence
As geeks, we don't like to trespass on other people's interior experiences and subjective reality. That's the realm of emotions, and we don't do emotions. We don't like to talk about them, think about them or attempt to make others feel them. And strategizing about how to make someone feel a certain way seems wrong.
But we can't influence our business partners without understanding their interior experience. Geeks have become reasonably good at understanding business processes, but we rarely consider the human experience of inhabiting those processes. Without stepping into other people's worldview, we have no hope of gaining influence.
There's strong demand for data scientists, people who know how to deal with one of today's most popular technology topics, Big Data. That's the huge trove of raw information that's now available thanks to the explosion of social media, sensors and other sources.
The U.S. Department of Labor forecasts that the number of analytics-based jobs will grow by more than 20 percent between now and 2018, far outstripping most other categories. At a time when so many recent college graduates are underemployed or searching for jobs, analytics offers the promise of a great career at the cutting edge of technology, business and social change.
Programming and development abilities top many employers' most-sought-after-skills lists, as big data and mobile-platform development jack up demand to new levels.
Wall Street firms, for example, are searching hard for programmers with a side of database skills, according to employment recruiter eFinancialCareers, which specializes in financial gigs. When the site posted its top 10 skill searches for the summer of 2012, programming languages and databases were at the top "by a wide margin," a company statement reported.
Several novice programmers who signed up for a free machine-learning class on Coursera have gone on recently to win predictive-modeling competitions. Maybe it's not that hard to mint new data scientists after all.
Instead of asking, "How can we get far more value from far more data?" successful big data overseers seek to answer, "What value matters most, and what marriage of data and algorithms gets us there?"
The most effective big data implementations are engineered from the desired business outcomes in, rather than the humongous data sets out. Amazon's transformational recommendation engines reflect Bezos' focus on superior user experience rather than any innovation emphasis on repurposing customer data. That's real business leadership, not petabytes in search of profit.
Big data: The management revolution
In a similar vein, Andrew McAfee and Erik Brynjolfsson present two examples of how big data, harnessed in the right way, can lead to dramatic business transformation. One uses big data to create new businesses, the other to drive more sales. It's worth noting that both examples feature established companies, not Silicon Valley upstarts:
We expect companies that were born digital to accomplish things that business executives could only dream of a generation ago. But in fact the use of big data has the potential to transform traditional businesses as well. It may offer them even greater opportunities for competitive advantage (online businesses have always known that they were competing on how well they understood their data). As we'll discuss in more detail, the big data of this revolution is far more powerful than the analytics that were used in the past. We can measure and therefore manage more precisely than ever before. We can make better predictions and smarter decisions. We can target more-effective interventions, and can do so in areas that so far have been dominated by gut and intuition rather than by data and rigor.
Predictive analytics quantifies what every winemaker instinctively understands: that change in any one area affects the entire vineyard-to-bottle process. The links between these different areas are complex and diffuse, but the deep data analysis of platforms like IBM's Cognos TM1 can give us strong correlations and models upon which to base our decisions.
We may not be able to entirely predict the future, but we can use data analysis predictions to answer the "what-if" questions that determine Delegat's profitability and international standing, helping us become smarter winemakers and business leaders.
Leadership is about change, but what is a leader to do when faced with ubiquitous resistance? Resistance to change manifests itself in many ways, from foot-dragging and inertia to petty sabotage to outright rebellions. The best tool for leaders of change is to understand the predictable, universal sources of resistance in each situation and then strategize around them.