Two changes related to Watson Studio and Watson Knowledge Catalog

We are announcing two changes to Watson Studio and Watson Knowledge Catalog. The first takes effect on May 17, 2019, and the second on July 12, 2019:

  1. Shaping data and running Data Refinery flows will use enhanced Spark environments, which add flexibility and change the number of Capacity Unit-Hours (CUH) consumed.

  2. Data profiling and classification will be available from Watson Knowledge Catalog only.

Enhanced Spark environments to shape data and run Data Refinery flows

Data Refinery provides data preparation and visualization capabilities in both Watson Studio and Watson Knowledge Catalog. You use Data Refinery to shape your data and run data flows that create new datasets for your analytical projects. The processing consumed by Data Refinery flows is measured in Capacity Unit-Hours (CUH). Each Watson Studio or Watson Knowledge Catalog plan includes a set number of CUH. Once an account has used up its included CUH, overage is charged according to the plan.

Current case

Currently, Data Refinery flows use a dedicated Spark cluster that requires 6 Capacity Units per hour. In addition, there is a minimum charge of 0.96 Capacity Unit-Hours each time you initiate a data flow job.

Proposed changes

Effective May 17, 2019, Data Refinery flows will run in the default Spark environment and require only 1.5 Capacity Units per hour, one quarter of the current rate. We are also removing the minimum charge for running data flows.

However, effective July 12, 2019, shaping data in the interactive UI will start consuming Capacity Units per hour. Note that this is a new charge. Here are some data refinery usage examples and their associated CUH consumption:

If you need additional processing capacity for big data flows, you can select a custom Spark environment instead. The associated Capacity Unit-Hours vary with the performance you specify. Learn more about custom Spark environments here. Effective July 12, 2019, you will also have the option to select a non-distributed R runtime for data flow runs.
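
As a rough illustration of the data flow rate change described above, the following Python sketch compares the CUH consumed by a single data flow run under the current and new rates. It assumes that consumption scales linearly with run time and that the minimum charge acts as a floor on each run. Both are interpretations rather than confirmed billing rules, and the function name `flow_cuh` and the run durations are hypothetical.

```python
# Rates are taken from the announcement; the billing logic itself is an assumption.
CURRENT_RATE = 6.0      # Capacity Units per hour (dedicated Spark cluster)
CURRENT_MINIMUM = 0.96  # minimum CUH charged per data flow job today
NEW_RATE = 1.5          # Capacity Units per hour (default Spark environment)


def flow_cuh(minutes, rate, minimum=0.0):
    """Estimate CUH for one run, assuming linear usage with a per-run floor."""
    return max(rate * minutes / 60.0, minimum)


if __name__ == "__main__":
    for minutes in (5, 30):  # hypothetical run durations
        before = flow_cuh(minutes, CURRENT_RATE, CURRENT_MINIMUM)
        after = flow_cuh(minutes, NEW_RATE)
        print(f"{minutes:>2} min run: {before:.3f} CUH now, {after:.3f} CUH after the change")
```

Under these assumptions, a 30-minute run drops from 3 CUH to 0.75 CUH, and a 5-minute run drops from the 0.96 CUH minimum to 0.125 CUH.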

These changes provide three key benefits:

  • Reduced rates when running Data Refinery flows using the default Spark environment

  • Greater flexibility in selecting the appropriate Spark environment for the job

  • A pricing model that reflects the actual usage of Data Refinery services

Data profiling and classification available only with Watson Knowledge Catalog

Data profiling and classification process samples of data assets to extract statistics and insights about their content, including whether the data contains personal or sensitive information such as names, email addresses, national identification numbers, and credit card numbers. You can access the data profile from the Profile tab when viewing the data asset in a project or catalog.

Currently, you can run profiling and classification if you have either Watson Studio or Watson Knowledge Catalog. Effective July 12, 2019, you must have Watson Knowledge Catalog to run profiling or to access the profile of a data asset from a project or a catalog.

Please note that the profile and classification feature (available on the preview of data in the project or catalog) is different from the Profile tab in the Data Refinery tool. There are no changes to the Profile tab of the Data Refinery tool.

The capacity needed to run profiling jobs is unchanged at 6 Capacity Units per hour. A minimum charge of 0.96 Capacity Unit-Hours applies each time you run a profiling job. The consumed CUH will be deducted from the set amount included with the provisioned Watson Knowledge Catalog plan.
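
The profiling charges above can be sketched in Python as follows. This is only an approximation: it assumes linear consumption with the minimum applied as a per-job floor, which the announcement does not spell out. The function names, the included amount of 50 CUH, and the job durations are hypothetical and do not reflect any actual plan.

```python
PROFILING_RATE = 6.0      # Capacity Units per hour, from the announcement
PROFILING_MINIMUM = 0.96  # minimum CUH per profiling job, from the announcement


def profiling_cuh(minutes):
    """Estimate CUH for one profiling job (assumed: linear usage, per-job floor)."""
    return max(PROFILING_RATE * minutes / 60.0, PROFILING_MINIMUM)


def remaining_cuh(included, job_minutes):
    """Deduct each job's estimated CUH from the plan's included amount."""
    return included - sum(profiling_cuh(m) for m in job_minutes)


if __name__ == "__main__":
    included = 50.0     # hypothetical plan allowance, not an actual plan figure
    jobs = [2, 12, 30]  # hypothetical job durations in minutes
    for m in jobs:
        print(f"{m:>2} min job: {profiling_cuh(m):.2f} CUH")
    print(f"Remaining: {remaining_cuh(included, jobs):.2f} CUH")
```

At 6 Capacity Units per hour, 0.96 CUH corresponds to 9.6 minutes of processing, so under this reading any job shorter than that is charged the minimum. A negative result from `remaining_cuh` would indicate overage.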

How to access profiling and classification

Note that the sign-up process for Watson Studio includes the no-cost Lite plan of Watson Knowledge Catalog by default. Therefore, unless you have explicitly removed the Watson Knowledge Catalog app from the account or during the sign-up process, you can continue to access profiling services from the included Watson Knowledge Catalog Lite plan.

If you only have Watson Studio and want continued access to data profiling, you can add Watson Knowledge Catalog to your account at any time.

Neither update changes the price of CUH, which remains 0.50 USD per Capacity Unit-Hour. For the most up-to-date pricing and plans, please refer to the catalog pages for Watson Studio and Watson Knowledge Catalog.

