Configure data source connections
Creating a data source connection in IBM Data Cataloging identifies a source storage system that is to be indexed.
For some data source types, a network connection can optionally be created to allow automated scanning and indexing of the source system metadata. IBM Data Cataloging does not index data from unknown sources, so creating a data source connection is the first step toward cataloging any source storage system.
You can add data source connections to the source storage systems from either the IBM Data Cataloging graphical user interface or the REST API. For more information on configuring data source connections offline, see Configuring data source connections offline in the IBM Data Cataloging: Concepts, Planning, and Deployment Guide.
Typically, a data source is equivalent to a single file system, object vault, or bucket. A data source connection is an alias for the combination of a cluster name and a data source within that cluster. This allows multiple file systems, buckets, or vaults with the same name to be indexed by IBM Data Cataloging when they are in separate clusters.
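The alias relationship described above can be illustrated with a minimal Python sketch. This is not IBM Data Cataloging code or its API; the class names, the field names, the example cluster and bucket names, and the rule that an alias must be unique are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSourceConnection:
    """Hypothetical model: an alias for a (cluster, data source) pair."""
    alias: str
    cluster: str
    data_source: str


class ConnectionRegistry:
    """Hypothetical in-memory registry, keyed by alias."""

    def __init__(self):
        self._by_alias = {}

    def add(self, conn):
        # Assumption: an alias may be registered only once.
        if conn.alias in self._by_alias:
            raise ValueError(f"alias already in use: {conn.alias}")
        self._by_alias[conn.alias] = conn

    def resolve(self, alias):
        # Return the (cluster, data source) pair behind an alias.
        conn = self._by_alias[alias]
        return (conn.cluster, conn.data_source)


# Two buckets share the name "archive" but live in separate clusters,
# so each gets its own connection alias.
registry = ConnectionRegistry()
registry.add(DataSourceConnection("archive-east", "cluster-east", "archive"))
registry.add(DataSourceConnection("archive-west", "cluster-west", "archive"))
```

In this sketch, resolving "archive-east" and "archive-west" yields the same data source name under different cluster names, which is why the two buckets can be indexed side by side.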
- IBM Data Cataloging does not support file or file path names that use characters that are not part of the UTF-8 character set.
- After you run a manual scan, refresh the Metadata summarization database:
- In , go to .
- Click refresh and wait for the process to complete successfully. The time to completion depends on the number of records to process; for example, it can take about 30 minutes.
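Because names that contain characters outside the UTF-8 character set are not supported, it can be useful to check names before scanning. The following Python sketch is one possible pre-check and is not part of IBM Data Cataloging; the function names are hypothetical, and it assumes that file names are available as raw bytes.

```python
def is_utf8_name(raw_name: bytes) -> bool:
    """Return True if the raw file or path name decodes as strict UTF-8."""
    try:
        raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8_names(raw_names):
    """Return the names from an iterable of bytes that are not valid UTF-8."""
    return [name for name in raw_names if not is_utf8_name(name)]
```

For example, `find_non_utf8_names([b"report.txt", b"caf\xe9.txt"])` flags the second name, because the byte `0xE9` on its own is Latin-1 rather than UTF-8. On a POSIX file system, calling `os.walk` with a bytes root such as `b"."` yields names as bytes that can be passed to this check.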