Watson Studio on Cloud Pak for Data
Version: 6.5.0 Included IBM
Description
The architecture of Watson Studio is centered around the project. Data scientists and business analysts use projects to organize resources and analyze data.
You can have these types of resources in a project:
- Collaborators are the people on the team who work with the data.
- Data assets point to your data that is either in uploaded files or accessed through connections to data sources.
- Operational assets are the objects you create, such as scripts and models, to run code on data.
- Other types of assets provide components, templates, or other information.
- Tools are the software you use to derive insights from data. These tools are included with the Watson Studio service:
  - Data Refinery: Prepare and visualize data.
  - Jupyter notebook editor: Code Jupyter notebooks.
  - JupyterLab IDE: Code Jupyter notebooks and Python scripts with Git integration. Other project tools require additional services. See the lists of supplemental and related services.
  - Federated learning: Train models on remote parties without sharing data.
  - Pipelines: Automate end-to-end flows of data or models.
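The resource types listed above can be pictured as a small data model. The following is a minimal sketch for illustration only: `Project`, `Asset`, and `AssetKind` are hypothetical names, not classes from any Watson Studio API, and the project name, collaborator addresses, and asset names are made-up sample values.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class AssetKind(Enum):
    """Asset categories named in the description (illustrative only)."""
    DATA = "data"                # uploaded files or connections to data sources
    OPERATIONAL = "operational"  # scripts, models, and other objects that run code on data
    OTHER = "other"              # components, templates, or other information


@dataclass
class Asset:
    name: str
    kind: AssetKind


@dataclass
class Project:
    """Hypothetical stand-in for a project; not a Watson Studio class."""
    name: str
    collaborators: List[str] = field(default_factory=list)
    assets: List[Asset] = field(default_factory=list)
    tools: List[str] = field(default_factory=list)

    def assets_of(self, kind: AssetKind) -> List[Asset]:
        """Return the assets of one kind, in the order they were added."""
        return [asset for asset in self.assets if asset.kind is kind]


# Sample values are invented for this sketch.
project = Project(
    name="churn-analysis",
    collaborators=["data.scientist@example.com", "analyst@example.com"],
    assets=[
        Asset("customers.csv", AssetKind.DATA),
        Asset("train_model.py", AssetKind.OPERATIONAL),
        Asset("churn_model", AssetKind.OPERATIONAL),
    ],
    tools=["Data Refinery", "Jupyter notebook editor", "JupyterLab IDE"],
)

operational = [asset.name for asset in project.assets_of(AssetKind.OPERATIONAL)]
print(operational)
```

The sketch keeps collaborators, assets, and tools as separate collections on the project to mirror the way the description groups them; the service itself may organize these resources differently.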
Watson Studio projects integrate fully with catalogs and deployment spaces:
- Catalogs are provided by the Watson Knowledge Catalog service.
  - You can easily move assets between projects and catalogs.
  - Catalogs and projects support the same types of data assets.
  - Data protection rules are enforced on catalog assets that you add to projects.
  - Without the Watson Knowledge Catalog service, you can create one catalog, without any governance capabilities, to share assets between projects.
- Deployment spaces let you view and manage model deployments and other types of deployments.
  - You can easily move assets between projects and deployment spaces.
If you have the Data Science and ML Ops Express® offering, the Data Refinery feature of the Watson Studio service is not available.
Quick links
- Install: Install the service
- Set up: Set up the service after installation
- Use: Work with the service
- What's new: See a list of new features
- Known issues: View limitations
- Develop: Write code and build applications
Integrated services
Supplemental services
Service | Capability |
---|---|
Analytics Engine Powered by Apache Spark | Run analytical, machine learning, and Spark API jobs on Apache Spark clusters. |
SPSS® Modeler | Create flows to prepare data, develop and manage models, and visualize data. No coding required. |
Watson™ Machine Learning | Build, train, and deploy machine learning models with a full range of tools. |
Decision Optimization | Find the most appropriate prescriptive solutions to your business problems by using CPLEX optimization engines to evaluate millions of possibilities. |
Runtime 22.2 with Python 3.10 for GPU | Access compute environments for Jupyter notebooks that use GPU-accelerated Python 3.10 libraries. |
Runtime 22.2 with R 4.2 | Access compute environments to create Jupyter notebooks that use R 4.2 libraries. |
RStudio® Server Runtimes | Access the RStudio IDE. |
Execution Engine for Apache Hadoop | Integrate the Watson Studio service with your remote Apache Hadoop cluster so you can explore data and build and deploy models on your remote cluster. |
Watson Pipelines | Use Watson Pipelines to create end-to-end machine learning flows that create models and customize various functions. |
Related services
Service | Capability |
---|---|
Watson Knowledge Catalog | Create catalogs of curated assets with this secure enterprise catalog management platform that is supported by a data governance framework. |
Cognos® Dashboards | Identify patterns in your data with sophisticated visualizations. No coding needed. |
Analytics Engine Powered by Apache Spark | Run analytical, machine learning, and Spark API jobs on Apache Spark clusters. |
Watson Query | Integrate data sources across multiple types and locations into one logical data view. |
AI Factsheets | Use AI Factsheets to organize and track lineage events, facts, and details across the lifecycle of each of your machine learning models, and to increase transparency for model governance. |
Data Replication | Integrate and synchronize your data using near-real-time data delivery with low impact to sources. |
DataStage® | Use built-in search, automatic metadata propagation, and simultaneous highlighting of compilation errors to create, edit, load, and run jobs that transform and tailor information for your enterprise. |
Watson OpenScale | Infuse your AI with trust and transparency. Understand how your AI models make decisions to detect and mitigate bias. |
Compatible data sources
See Supported data sources for a list of data source services that are compatible.