Manual scanning of an IBM Storage Scale data source

Configure IBM Spectrum® Discover to connect to IBM Storage Scale. After you complete these steps, data can be ingested from an IBM Storage Scale data source into IBM Spectrum Discover for metadata indexing.

Before you begin

Create the data source connection to IBM Storage Scale. For more information, see Configure data source connections.
You can include or exclude snapshot files during the initial IBM Storage Scale scan by setting the following environment variable:
INCLUDE_SCALE_SNAPSHOTS
When INCLUDE_SCALE_SNAPSHOTS is set to 'false' (the default value), the IBM Storage Scale scan excludes all files that are inside .snapshots directories. When it is set to 'true', the scan includes all files, including those inside .snapshots directories.

To set the INCLUDE_SCALE_SNAPSHOTS variable by using a configmap, see Enabling skip snapshot directories feature on Red Hat® OpenShift® in the IBM Storage Scale: Administration Guide.
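
The effect of the variable can be illustrated with a minimal Python sketch. This is not the product's scan code: the helper names snapshots_included and filter_paths are hypothetical, the case-insensitive parsing of the value is an assumption, and the sample paths are invented. It only shows the documented behavior, where 'false' (the default) drops anything under a .snapshots directory and 'true' keeps everything.

```python
import os
from pathlib import PurePosixPath


def snapshots_included(env=None):
    """Return True only when INCLUDE_SCALE_SNAPSHOTS is set to 'true'.

    Assumption: the value is compared case-insensitively; an unset
    variable falls back to the documented default of 'false'.
    """
    env = os.environ if env is None else env
    value = env.get("INCLUDE_SCALE_SNAPSHOTS", "false")
    return value.strip().lower() == "true"


def filter_paths(paths, include_snapshots):
    """Drop paths that have a .snapshots directory component, unless included."""
    if include_snapshots:
        return list(paths)
    return [p for p in paths if ".snapshots" not in PurePosixPath(p).parts]


if __name__ == "__main__":
    # Hypothetical sample paths, for illustration only.
    sample = ["/gpfs/fs1/data/a.txt", "/gpfs/fs1/.snapshots/snap1/data/a.txt"]
    print(filter_paths(sample, snapshots_included()))
```

Matching on whole path components, rather than on a substring, avoids excluding a file whose name merely contains the text ".snapshots".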

The minimum connection parameters required for manual scanning are:
  • Connection Name
  • Connection Type
  • Cluster
  • Filesystem
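
As a quick pre-check before a manual scan, the four parameters listed above can be verified with a short Python sketch. The dictionary layout, the helper name missing_parameters, and the sample values are hypothetical and are not part of any IBM Spectrum Discover API; only the four parameter names come from this topic.

```python
# The four minimum parameters named in this topic.
REQUIRED_PARAMETERS = ("Connection Name", "Connection Type", "Cluster", "Filesystem")


def missing_parameters(connection):
    """Return the required parameter names that are absent or blank."""
    return [
        name
        for name in REQUIRED_PARAMETERS
        if not str(connection.get(name, "")).strip()
    ]


if __name__ == "__main__":
    # Hypothetical connection record with placeholder values.
    connection = {
        "Connection Name": "example-connection",
        "Connection Type": "example-type",
        "Cluster": "example-cluster",
    }
    print(missing_parameters(connection))
```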
Restriction: IBM Spectrum Discover uses the unit separator character (ASCII code 0x1F) as the field delimiter for ingestion into the database. If a path, file, or object name contains this character, the input data is parsed incorrectly and IBM Spectrum Discover rejects the records.
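
To find names that would trigger this restriction before ingestion, a list of path names can be checked for the 0x1F character. The following Python sketch is only an illustration: the helper name find_names_with_unit_separator and the sample names are hypothetical, and it does not reproduce the parsing that IBM Spectrum Discover performs.

```python
UNIT_SEPARATOR = "\x1f"  # ASCII code 0x1F, the field delimiter described above


def find_names_with_unit_separator(names):
    """Return the names that contain the unit separator character."""
    return [name for name in names if UNIT_SEPARATOR in name]


if __name__ == "__main__":
    # Hypothetical names; the second one embeds a unit separator.
    names = ["/gpfs/fs1/report.csv", "/gpfs/fs1/bad\x1fname.csv"]
    for name in find_names_with_unit_separator(names):
        # repr() makes the otherwise invisible control character visible.
        print(repr(name))
```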

Procedure

  1. Perform a file system scan to collect system metadata from IBM Storage Scale to be ingested into IBM Spectrum Discover. For more information, see Performing file system scan to collect metadata from IBM Storage Scale.
  2. Copy the output of the file system scan to the IBM Spectrum Discover master node. For more information, see Copying the output of the IBM Storage Scale file system scan to the IBM Spectrum Discover master node.
  3. Ingest data from the file system scan in IBM Spectrum Discover. For more information, see Ingesting metadata from IBM Storage Scale file system scan in IBM Spectrum Discover.
  4. Ingest quota information from the file system. For more information, see Ingesting quota information from the file system.