We are excited to announce the beta of the IBM Analytics Engine, providing a single Hadoop and Spark service under the Watson Data Platform. It makes it easier for data engineers, data scientists, and developers to develop and deploy analytics applications. With integration through Jupyter notebooks in Data Science Experience, IBM Analytics Engine provides the foundation for executing data science and machine learning workloads. The IBM Analytics Engine uses the Hortonworks Data Platform as the underlying Hadoop distribution, providing access to a market-leading open source Hadoop distribution.
With the release of BigInsights 4.2, IBM is making self-service, powerful advanced analytics - including Apache Spark - available on an optimal Hadoop distribution. To find out more, we chatted with Priya Krishnan, Program Director and Product Manager for BigInsights at IBM, about the new release and how its comprehensive, open, and flexible architecture makes the delivery of big data analytics and business applications easier.
Hadoop is a powerful technology, but it’s not the easiest to get up and running, particularly for companies that don’t have any experience of big data technologies. Jim Wankowski, Technical Sales Specialist for IBM Cloud Data Services, talks us through the challenges that many businesses face when adopting technologies such as Hadoop.
Streams applications can integrate with HDFS in on-premises BigInsights clusters using the streamsx.hdfs toolkit. However, an extra layer of security in the cloud requires a special toolkit to access the BigInsights service in Bluemix. The HDFS for Bluemix toolkit contains Streams operators that can connect through the Knox Gateway. This article shows how to use these operators to read files from and write files to HDFS on Bluemix.
Many blogs and analyst reports have provocative titles like “Why the days are numbered for Apache Hadoop as we know it,” or “Does Spark Mean the End of Hadoop?” Many of these articles appear to be heavily sensationalized and ignore the reality that Apache Spark integrates deeply with Hadoop. While Spark is an impressively fast and advanced general-purpose processing framework, it is not a data storage system.
This weekly post showcases some of the best new tutorials, videos, and other content published each week on developerWorks. The three featured articles are: building an enterprise-scale database of SEC financial data with Bluemix and Cloudant; using the Node-RED workflow editor in Bluemix to capture a Twitter feed and analyze the data with the IBM Analytics for Hadoop service; and a video showing how to build and deploy a scalable contacts application in the cloud with the API Management service.
To call attention to popular articles on Bluemix available on IBM developerWorks, this weekly post will introduce three "best of" articles. This week's entry focuses on getting started with IBM Mobile Data, Watson's Q&A API, and IBM Analytics for Hadoop.
With the advent of IBM Bluemix, it has never been easier to start playing with Hadoop, specifically IBM’s Analytics for Hadoop. The IBM Analytics for Hadoop Bluemix service leverages IBM’s Hadoop offering, BigInsights, to power the latest mobile, web, and cloud applications. It also gives enterprise IT administrators who are looking to adopt Hadoop an opportunity to explore the rich enterprise features that BigInsights provides. If you are new to Hadoop technology and want to get hands-on quickly, you have come to the right place.