13 May 2013
Learn about the text analytic development and runtime environments of BigInsights, including how you can use Eclipse-based tools to create, test, and publish text extractors on your cluster so you can analyze content relevant to your application.
Manage and analyze structured and unstructured data with this free Apache Hadoop-based solution. It enhances the Hadoop open source technology to withstand your enterprise demands, adding administrative, workflow, provisioning, and security features, along with
Download InfoSphere BigInsights Basic Edition (sign-in required)
- An introduction to InfoSphere Streams
IBM InfoSphere Streams, part of the IBM big data platform, is used to process vast amounts of generated streaming data in real time. Find out what the product is designed to do, when it can be useful, how it works, and how it can complement InfoSphere BigInsights to perform highly complex analytics.
- Integrate PureData System for Analytics and InfoSphere BigInsights for email analysis
Use BigInsights to store and analyze text files, and PureData System for Analytics as a warehouse and base platform for running Cognos BI reports. The authors use an email analysis example, storing and analyzing email data in BigInsights and the structured employee data in PureData System for Analytics.
- Developing a big data application for data exploration and discovery
Explore an approach for indexing big data managed by a Hadoop-based platform for use with a data discovery solution. We describe how data stored in InfoSphere BigInsights (a Hadoop-based platform) can be pushed to InfoSphere Data Explorer.
- Calling Python code from InfoSphere Streams
Learn how to bring together the best capabilities of two worlds: SPL and Python. Seamlessly mix analytics code written in Python into your Streams applications to take advantage of the scaling and distributed processing features of InfoSphere Streams.
- Accelerating batch processing with IBM DB2 Analytics Accelerator
Get an overview of the benefits a company can achieve by introducing DB2 Analytics Accelerator in batch processing systems. The example is based on a real implementation at Swiss Re, a major reinsurance company based in Zurich, Switzerland.
- Big SQL Technology Preview
Experience a new technology that extends SQL to Apache Hadoop data repositories. Get access to a Hadoop cluster configured with HDFS and HBase, and loaded with sample data for queries. Download Big SQL JDBC and ODBC drivers and use them with your applications to explore and query data in the Hadoop cluster.
- Using InfoSphere Streams with Informix
Connect and use IBM Informix as a data source or data target with IBM InfoSphere Streams. The article covers the use of the Informix-specific and the general IBM Common Driver protocols used by several IBM database products.
- IBM Accelerator for Machine Data Analytics, Part 4: Speeding up the up-and-running experience
Use the web or Eclipse tooling in IBM InfoSphere BigInsights to quickly get up and running with IBM Accelerator for Machine Data Analytics. This article takes you step by step through the process of preparing and testing data for analysis.
- Big data security and auditing with IBM InfoSphere Guardium
Seamlessly integrate Hadoop data protection into your existing data security strategy using InfoSphere Guardium security policies and reports.
- Developing, publishing, and deploying your first big data application with InfoSphere BigInsights
Use Eclipse-based tools for InfoSphere BigInsights to expedite application development, package your application for publication in a web-based catalog, and deploy your application so staff and others can easily launch it.
- Open Source big data for the impatient, Part 1: Hadoop tutorial: Hello World with Java, Pig, Hive, Flume, Fuse, Oozie, and Sqoop with Informix, DB2, and MySQL
Get a working definition of big data and some of the capabilities of Hadoop, the leading open source technology in the big data domain.
IBM big data platform capabilities
Hadoop-based analytics: Store any data type in the low-cost, scalable Hadoop engine to reduce the cost of processing and analyzing massive volumes of data.
Stream computing: Continuously analyze massive volumes of streaming data with sub-millisecond response times to take action in real time.
Text analytics: Analyze textual content to uncover hidden meaning and insight in unstructured information.
Accelerators: Deploy pre-packaged analytical and industry-specific software modules to extract value from big data.
Application development: Develop text analytics applications with toolkits and tools, including an extensive library of extractors you can customize and extend.