December 11, 2015 | Written by: Hernando Borda
Categorized: What's New
Today we are announcing the availability on Bluemix of IBM DataWorks, a fully managed data preparation service for IBM Cloud Data Services.
DataWorks enables business analysts, developers, data scientists and engineers to put data to work. With DataWorks, Bluemix users can access a variety of data sources both on the cloud and on premises; shape, combine and wrangle the data using a familiar spreadsheet-like user interface; and deliver it to a wide ecosystem of cloud data services for analysis, visualization, modeling and other analytics tasks.
Data is the new natural resource that fuels business decision-making, insights and competitive advantage. Cloud, Big Data and Internet of Things (IoT) technologies are providing vast amounts of new data in a wide variety of formats that businesses need to tap into, along with more traditional systems of record that run today’s business operations.
However, raw data needs to be refined, cleansed, shaped, formatted and, in general, prepared in order to unlock its true value. Today, business analysts, data scientists and other knowledge workers spend the majority of their time (between 40 and 80 percent, according to Forrester) finding, refining and wrangling data to be used for analytics. Traditional data integration tools require technical skills that are scarce, and most of the time unavailable, to fulfill requests for clean and relevant data, leading to weeks, months and sometimes years of delays in analytics projects.
To overcome the current paradigm, where too much time is spent preparing rather than analyzing data, business users must be empowered to self-serve and prepare their data in a fraction of the time and effort required by traditional tools. DataWorks addresses this challenge by providing easy-to-use data preparation and data movement on the cloud, accessible to both technical and non-technical users.
Guarantee the Quality and Shape of Your Data
With DataWorks, today’s knowledge workers can access hybrid data wherever it is, by leveraging connectivity to the most common and widely used data sources and secure gateway technology to reach on-premises data, including data that is stored behind enterprise firewalls. Users can access data from multiple sources and combine it to produce more relevant and complete datasets. They can also shape raw data by using automated scores to assess its quality, filtering out unwanted values, removing columns and sorting the data.
Once the user is satisfied with the shape of the data, they can deliver it to integrated cloud data services like dashDB, Cloudant and Watson Analytics. A shaping recipe is called an activity in DataWorks. Once defined, users can run activities any time to refresh or augment their data. Application developers can also integrate the execution of activities created by analysts and data scientists into application workflows.
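To make the idea of a reusable shaping recipe concrete, here is a minimal sketch in plain Python. It is not the DataWorks API: every name in it (`make_activity`, `join_on`, `drop_missing`, `remove_columns`, `sort_by`) and the sample data are hypothetical and chosen only for illustration. It shows how the steps described above (combining sources, filtering unwanted values, removing columns, sorting) might be composed once and then re-run whenever the data needs refreshing.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def join_on(key: str, other: List[Row]) -> Step:
    """Combine each row with the matching row from a second source (inner join)."""
    lookup = {r[key]: r for r in other}

    def step(rows: List[Row]) -> List[Row]:
        return [{**r, **lookup[r[key]]} for r in rows if r[key] in lookup]

    return step


def drop_missing(column: str) -> Step:
    """Filter out rows whose value in `column` is missing or empty."""
    def step(rows: List[Row]) -> List[Row]:
        return [r for r in rows if r.get(column) not in (None, "")]

    return step


def remove_columns(*columns: str) -> Step:
    """Remove the named columns from every row."""
    def step(rows: List[Row]) -> List[Row]:
        return [{k: v for k, v in r.items() if k not in columns} for r in rows]

    return step


def sort_by(column: str) -> Step:
    """Sort rows in ascending order of `column`."""
    def step(rows: List[Row]) -> List[Row]:
        return sorted(rows, key=lambda r: r[column])

    return step


def make_activity(*steps: Step) -> Step:
    """Bundle shaping steps into one recipe that can be run repeatedly."""
    def run(rows: List[Row]) -> List[Row]:
        for s in steps:
            rows = s(rows)
        return rows

    return run


# Hypothetical sample data standing in for two separate sources.
customers = [
    {"id": 1, "name": "Ada", "internal_note": "x"},
    {"id": 2, "name": "", "internal_note": "y"},
    {"id": 3, "name": "Cy", "internal_note": "z"},
]
orders = [{"id": 3, "total": 40}, {"id": 1, "total": 75}, {"id": 2, "total": 10}]

activity = make_activity(
    join_on("id", orders),
    drop_missing("name"),
    remove_columns("internal_note"),
    sort_by("total"),
)
shaped = activity(customers)
print(shaped)
```

Because the recipe is just a function over rows, an application could call `activity` again on fresh input at any point in its own workflow, which mirrors the way activities are described above.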
DataWorks is now available as a beta service to Bluemix subscribers for a limited time. Enterprise customers can reserve subscription packages and benefit from full access to IBM support and the ability to run larger-scale data preparation projects. Want to learn more? See the resources below:
And for a range of new demos and tutorials to help you get started, check out the DataWorks Learning Center from the IBM Cloud Data Services developer advocacy team.