From dreams to streams: turning the vision of streaming analytics into practical business reality with IBM Streams Designer
Today’s web is a much more open place than ever before—most social networks and other web platforms offer public APIs that allow anyone to request and use data on a scale that would have been unthinkable just a few years ago.
There’s a lot of hype around the possibilities of stream computing. It seems like everywhere you look, more and more organizations are touting the benefits of capturing and analyzing large volumes of data at high velocity—and increasing numbers of streaming analytics solutions, both commercial and open source, are flooding the market.
Many organizations have started to explore the value that machine learning can bring—from illuminating previously “dark data” such as images and videos, to creating models that help to guide or even automate business decision-making. However, very few companies have gone beyond pilots and prototypes, or made the transition from one-off projects to a scalable, repeatable workflow. Too often, machine learning exists in a bubble of its own, instead of being understood in the context of the broader data science workflow.
Change doesn’t stop, so neither should your analytics. You could capture the most crucial, valuable insight of all—but if you don’t identify and act on it while it’s still valid, or before your competitors do, it’s worth nothing. Imagine you’re an electronics company that has sunk thousands of hours and millions of dollars into building a profile of the perfect customer for a new product release. Before you can claw back your investment with a wildly successful launch, a rival comes along and disrupts the entire industry with an innovative device like no one has ever seen before. All that effort and resources expended… all for nothing.
Data science is rapidly being established as the new frontier for analytics, as it moves from niche interest to the mainstream. Combining elements of statistics, computer science, applied mathematics and visualization, it offers a powerful new set of tools and techniques to enable more effective decision-making.
Machine learning is one of the most exciting areas of data science, with enormous potential to transform data into the pure gold of competitive advantage. Data scientists can seem like wizards when their models first accurately predict customer or market behavior, or reveal valuable insight from previously untapped data sources.
The world doesn’t stop, which also means that data never stops pouring in. If you’re in the analytics game, then basing your efforts on a snapshot of historical data always involves a degree of compromise. Did you choose the right data set, one that accurately represents ongoing operations so that it doesn’t skew your analysis? How soon will your insights be out of date? How can you store the data that you’re analyzing cost-effectively?
Last year we made data science a team sport with IBM Data Science Experience, our award-winning IDE for analytics. This summer we brought to market IBM Watson Machine Learning, which allows companies to put models into production with easy model management and full workflow automation. And last week, we announced that we have grown those two products into Watson Data Platform, adding new features along the way.