At the end of the Super Bowl, people were posting 12,233 tweets per second. It turns out that was less than half the rate seen in Japan on December 9th, when 25,088 tweets per second were recorded about the anime movie Castle in the Sky. And that, according to the Chinese, is nothing compared to the 32,312 messages per second sent on their Twitter-like Sina Weibo system at the beginning of the Chinese New Year.
Within the government space, we’re no strangers to our own Big Data. Whether you’re in the DOD or NASA, the IRS or SSA, you’ve got your own Big Data to deal with.
Last week, Forrester Research released a report that should help those in government understand the Big Data market: “The Forrester Wave™: Enterprise Hadoop Solutions, Q1 2012” (February 2, 2012)[1]. The IBM technologies evaluated were IBM InfoSphere BigInsights (IBM’s Hadoop-based offering) and IBM Netezza Analytics. In this evaluation, IBM was placed in the Leaders category of the Wave and achieved the highest possible score in both the Strategy and Market Presence segments. In the third segment, Current Offering, IBM received the second-highest score. You can download the complete report here.
The report by analyst James Kobielus states, “IBM has the deepest Hadoop platform and application portfolio.”
The IBM Analytics Solution Center in Washington, DC also focused on how to handle Big Data at its January 19th seminar. The seminar covered various aspects of Big Data including data-in-motion processing software, Hadoop software, SONAS (scale out network attached storage), and the Netezza data warehouse appliance.
1. Big Data in Motion
Going back to the tweets: if you’re a government agency and you need actionable insights from tens of thousands of tweets per second, which might be about an unfolding crisis, how would you do it? InfoSphere Streams is unlike anything else in the market in its ability to ingest, analyze and act on data “in motion” – that is, data is processed and analyzed at microsecond latencies.
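To make the idea of analyzing data "in motion" concrete, here is a minimal sketch in plain Python. It is not InfoSphere Streams code and does not use any IBM API; the names `SlidingRateCounter` and `detect_spikes` are hypothetical, chosen only for illustration. It keeps a sliding time window over incoming message timestamps and flags the moments when the message rate exceeds a threshold, without ever storing the full stream.

```python
from collections import deque


class SlidingRateCounter:
    """Counts events seen within the last `window_seconds` seconds."""

    def __init__(self, window_seconds=1.0):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp):
        """Record one event and return the number of events in the window."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have fallen out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


def detect_spikes(timestamps, threshold, window_seconds=1.0):
    """Return (timestamp, rate) pairs where the windowed rate exceeds threshold.

    Assumes timestamps arrive in non-decreasing order, as they would from
    a live feed.
    """
    counter = SlidingRateCounter(window_seconds)
    alerts = []
    for ts in timestamps:
        rate = counter.add(ts)
        if rate > threshold:
            alerts.append((ts, rate))
    return alerts
```

For example, `detect_spikes([0.0, 0.1, 0.2, 0.3, 2.0, 2.05], threshold=3)` flags only the fourth message, the point at which more than three messages fall inside a single one-second window. A production streaming system would do this across many machines and at far lower latency; the sketch only shows the shape of the computation.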
2. Hadoop Big Data
Hadoop is an open-source codebase supported by the Apache Software Foundation. It is designed to process large volumes of unstructured data. For example, if a government agency wanted to analyze months of tweets or documents in non-real time, Hadoop and its distributed file system would be a good choice. The enterprise-class IBM Hadoop-based offering, BigInsights, is designed with system management, security, and performance features that go beyond what is available in the open-source version. It provides the ability to analyze and extract information from a wide variety of data sources, and promotes data exploration and discovery.
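Hadoop's batch processing is built around the MapReduce model, and a small sketch may help show what that means for a pile of stored tweets. The code below is plain single-machine Python, not Hadoop or BigInsights code, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical. It only mirrors the three steps a cluster would distribute across many nodes: map each document to key/value pairs, group the pairs by key, and reduce each group to a result.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for token in document.lower().split():
        word = token.strip(".,!?:;\"'")
        if word:
            yield word, 1


def shuffle(pairs):
    """Group values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Collapse all counts for one word into a single total."""
    return word, sum(counts)


def word_count(documents):
    """Run map, shuffle and reduce over a collection of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())
```

Calling `word_count(["Flood warning downtown", "flood waters rising downtown!"])` counts "flood" and "downtown" twice each and every other word once. Because each map call sees only one document and each reduce call sees only one word, the same logic can be spread over as many machines as the data requires.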
3. SONAS
Network Attached Storage, or NAS, has become a very popular way to provide storage within an organization. However, NAS has a number of limitations when dealing with Big Data, including the number of objects (files) it can support, support for very large files, the I/O bandwidth it can deliver to applications, and fragmented data management across multiple systems. The IBM SONAS system is designed to overcome these limitations and to look like a single, very large virtual system to the applications.
4. Data Warehouse Appliance
Traditional data warehouses, when used for large volumes of structured data, can be costly to operate and maintain, and can be very slow when used for sophisticated analysis. The Netezza appliance is a dedicated device that requires no tuning or storage administration and uses special hardware chips to accelerate the performance of advanced analytics.
Want to learn more?
- More details on these topics can be found on the ASC website under past events.
- Get started with our no-charge version of InfoSphere BigInsights that you can download to your own cluster or run in the cloud.
- Download the Forrester full report and see for yourself how IBM measures up.
- On the educational front, we provide free online training through BigDataUniversity.com. To date, more than 13,000 students have registered for courses on Hadoop, cloud computing and more.
We are working with a broad range of clients to help them define their Big Data strategies. We look forward to working with you on your Big Data challenges.
Frank Stein
Director, Analytics Solution Center
ASCdc@us.ibm.com
[1] The Forrester Wave™: Enterprise Hadoop Solutions, Q1 2012, Forrester Research, Inc., February 2, 2012. The Forrester Wave is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave are trademarks of Forrester Research, Inc. The Forrester Wave is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.