Creating a catalog

You create a catalog to organize your assets and collaborators.

Some IBM Knowledge Catalog plans have limits on the number of catalogs that you can create.

Requirements

Before you create a catalog, understand the requirements for permissions, storage, rule enforcement, duplicate handling, and asset removal. You must have the IBM Knowledge Catalog service to create a catalog.

Required permission

You must have one of these roles or permissions to create a catalog:

  • The IAM Administrator role in the IBM Cloud account.

Storage requirement

You must specify the IBM Cloud Object Storage instance configured during IBM Cloud account setup. If you are not an administrator for the IBM Cloud Object Storage instance, it must be configured to allow catalog creation.

The object storage that you associate with the catalog contains the files for assets that are copied into the catalog. For example, if you add local files or a notebook to the catalog, the information about those files is stored as assets in the catalog, and the files themselves are stored in object storage.

Procedure

To create a catalog:

  1. Click Catalogs > View All Catalogs to get to the Your Catalogs page, and then click New Catalog.

  2. Specify these properties:

    • A name and optional description for the catalog.
      The name can be 1–256 printable Unicode characters. ISO control characters and Unicode special characters are not allowed.
      The name must be unique. If, for any reason, you want to allow duplicate names, use the API to create or update catalogs. See the API documentation in the Data and AI Common Core API under Create a catalog.

    • The IBM Cloud Object Storage service. A dedicated storage bucket for the catalog is created automatically.

    • Duplicate asset handling. By default, duplicate handling is set to allow duplicates.

    • Whether to enforce data protection rules on the catalog. If you enable it, you can't disable it. See Data protection rules.

    • How you want to configure asset removal. You can purge removed assets automatically, either immediately after removal or 30 days after removal.

  3. Click Create. You can now add assets and collaborators to your catalog.
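
The naming rule in step 2 can be checked before you submit a name. The following Python function is a minimal sketch, not the validation that the service itself performs. It assumes that "ISO control characters" means Unicode category Cc and that "Unicode special characters" means the Specials block (U+FFF0 to U+FFFF). The function name is hypothetical.

```python
import unicodedata

MAX_NAME_LENGTH = 256  # upper bound stated in the naming rule


def is_valid_catalog_name(name: str) -> bool:
    """Return True if the name has 1-256 characters and no disallowed ones.

    Assumptions (not confirmed by the documentation):
    - "ISO control characters" are code points in Unicode category Cc.
    - "Unicode special characters" are the Specials block U+FFF0..U+FFFF.
    """
    if not 1 <= len(name) <= MAX_NAME_LENGTH:
        return False
    for ch in name:
        if unicodedata.category(ch) == "Cc":
            return False  # control character such as a tab or newline
        if 0xFFF0 <= ord(ch) <= 0xFFFF:
            return False  # Specials block, for example U+FFFD
    return True
```

For example, is_valid_catalog_name("Sales data") passes, while an empty string, a 257-character name, or a name that contains a newline is rejected. The check does not cover uniqueness, which only the service can verify.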

Enforcing data protection rules

Base Premium

Data protection rules apply to all governed catalogs and all assets within these catalogs. Data protection rules are automatically enforced when a catalog member attempts to view or act on a data asset in a governed catalog to prevent unauthorized users from accessing sensitive data.

However, if the user who is trying to access the asset is the owner of the asset (by default, the user who created the asset), access is always granted. See Data protection rules.
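
The order of checks described in this section can be pictured as a small decision function. This Python sketch is illustrative only and does not represent how the service is implemented. The Asset class, the function name, and the rules_allow flag (a stand-in for the outcome of evaluating the data protection rules) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    owner: str  # by default, the user who created the asset


def rules_permit_access(user: str, asset: Asset,
                        catalog_is_governed: bool, rules_allow: bool) -> bool:
    """Hypothetical decision order for data protection rules.

    rules_allow stands in for the result of evaluating the rules
    for this user and asset; the evaluation itself is not modeled.
    """
    if not catalog_is_governed:
        return True  # rules are enforced only in governed catalogs
    if user == asset.owner:
        return True  # the asset owner is always granted access
    return rules_allow
```

The sketch covers data protection rules only. Other access controls, such as catalog membership and roles, are outside its scope.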

Duplicate asset handling

When you add an asset to a catalog, a duplicate of that asset might already exist in the catalog. You can specify which action to take in that case. The available choices are determined by the catalog default setting. To learn how to specify duplicate asset handling, see Handling duplicate assets in catalogs.

Asset removal

When you create a catalog, you decide how removed assets are purged from the trash. You can purge them automatically immediately after removal, in which case they are permanently deleted right away. Alternatively, you can purge them 30 days after removal, in which case the assets remain in the trash for 30 days and you can restore them if needed.

All previously created catalogs have the Never purge assets option selected by default. You can change this setting to either of the other purge options on the catalog Settings page.
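
To make the three purge options concrete, the following Python sketch computes when a removed asset would be permanently deleted under each one. It is an illustration under stated assumptions, not the service's logic: the PurgePolicy enum, its member names, and the purge_time function are hypothetical, and the 30-day period is treated as exactly 30 days from the removal time.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class PurgePolicy(Enum):
    # Hypothetical names for the options described in this section
    IMMEDIATE = "immediate"
    AFTER_30_DAYS = "after_30_days"
    NEVER = "never"


def purge_time(removed_at: datetime, policy: PurgePolicy) -> Optional[datetime]:
    """Return when a removed asset is permanently deleted, or None if never."""
    if policy is PurgePolicy.IMMEDIATE:
        return removed_at  # deleted right away, no restore window
    if policy is PurgePolicy.AFTER_30_DAYS:
        return removed_at + timedelta(days=30)  # restorable until then
    return None  # asset stays in the trash indefinitely


removed = datetime(2024, 1, 1, tzinfo=timezone.utc)
deadline = purge_time(removed, PurgePolicy.AFTER_30_DAYS)
```

In this example, an asset removed on 1 January 2024 under the 30-day option has a deadline of 31 January 2024 and can be restored from the trash until then.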

Watch this short video to see how to create a catalog and find assets fast.

This video provides a visual method to learn the concepts and tasks in this documentation.

  • Video transcript
    Time Transcript
    00:00 This video shows you how to create a catalog and find assets quickly with Watson Knowledge Catalog.
    00:08 From the home page, you can access the catalogs.
    00:13 Now, create a new catalog and provide a name and a description.
    00:21 A catalog contains metadata about the contents of assets and how to access them, and a set of collaborators who need to use the assets for data analysis.
    00:32 The metadata is stored in an encrypted IBM Cloud Object Storage instance.
    00:37 Any data that you want to store in the cloud you can upload to the Cloud Object Storage of your choice and then specify that object storage when you create the catalog.
    00:48 This split between where the data's metadata is stored and the actual location of the data is important.
    00:54 It means that you can keep your data wherever it is.
    00:57 You don't need to move it into the catalog, because the catalog only contains metadata.
    01:03 You can have the data in on-premises data repositories, in other IBM Cloud services like Cloudant or Db2 on Cloud, in non-IBM cloud services like Amazon or Azure, in streaming data services, or even dark data sources like PDFs.
    01:23 Included in the metadata is how to access the data asset; in other words, the location and credentials.
    01:31 That means that anyone who is a member of the catalog and has sufficient permissions can get to the data without knowing the credentials or having to create their own connection to the data.
    01:45 If you don't have an Object Storage instance, you can create one right from here.
    01:51 When you create a catalog, you can enforce data policies.
    01:56 If you enable data policies now, you can't disable them later.
    02:02 Now, you're ready to create the catalog.
    02:07 Since the new catalog is empty, let's take a look at an existing catalog.
    02:12 On the "Browse assets" tab, you can see recommendations, highly rated assets, and recently created assets, as well as a list of all the assets.
    02:26 You can type a search term to find assets.
    02:29 And you can filter by asset type, such as "Data asset" or "Notebook", or filter by tags that were assigned to the asset when it was added to the catalog.
    02:42 When you view an asset, the "Overview" tab provides basic information about the asset, such as the description, a rating, tags, where the asset is located, business terms, and any classifications.
    03:00 The "Asset" tab provides a preview of the data.
    03:05 On the "Access" tab, those with permission can add members to view this particular asset.
    03:12 And the "Review" tab shows reviews and lets you contribute a review.
    03:19 When assets are added to a catalog with data policies enabled, Watson Knowledge Catalog automatically profiles and classifies the content of the asset based on the values in those columns.
    03:32 The "Profile" tab contains more detailed information about the inferred classifications.
    03:38 On the "Activities" tab, you'll see the various events that Watson Knowledge Catalog has captured that occurred in the life cycle of this data asset, allowing you to trace what's happened to the asset since it was created.
    03:54 Go back to the catalog and on the "Access control" tab, you can see the current list of catalog members.
    04:02 You can also add members, which is pretty similar to adding collaborators in a project.
    04:08 Most catalog members will likely have the "Editor" role.
    04:12 The "Viewer" role is intentionally restricted.
    04:15 And only a select few will have the "Admin" role.
    04:19 You can also add an access group that you previously defined in your IBM Cloud account and provide the group with the specified access to the catalog.
    04:30 This is an easy way to give many users "Viewer" access.
    04:35 And on the "Settings" tab, you can edit the catalog name and description, and see other important information about the Cloud Object Storage associated with this catalog.
    04:49 Find more videos in the Cloud Pak for Data as a Service documentation.

You can also create a catalog by using the Watson Data API. See Create a catalog with the Data and AI Common Core API.
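
As a rough illustration of the API route, the following Python sketch builds an HTTP request for creating a catalog without sending it. Everything specific in it is an assumption to check against the API reference: the /v2/catalogs path, the payload field names (name, description, is_governed), and the bearer-token header. Storage settings are omitted because the required fields are not described here. The function name and the base URL in the usage line are hypothetical.

```python
import json
import urllib.request


def build_create_catalog_request(base_url: str, token: str, name: str,
                                 description: str,
                                 is_governed: bool) -> urllib.request.Request:
    """Build, but do not send, a POST request that creates a catalog.

    Assumed, not confirmed: the endpoint path and every payload field name.
    """
    payload = {
        "name": name,
        "description": description,
        # Assumed flag for enforcing data protection rules on the catalog
        "is_governed": is_governed,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/v2/catalogs",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_catalog_request(
    "https://api.example.test", "TOKEN", "Sales", "Sales data assets", True
)
```

Building the request separately from sending it lets you inspect the URL, headers, and body before you pass the request to urllib.request.urlopen. Because rule enforcement cannot be disabled after it is enabled, review that setting before you send the request.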
