Performing file system scan to collect metadata from IBM Storage Scale
You can use the file system scanning tool, IBM Storage Scale Scanner, to collect system metadata from IBM Storage Scale for ingestion into IBM Spectrum® Discover.
About this task
The IBM Storage Scale Scanner tool uses the IBM Storage Scale information lifecycle management (ILM) policy engine to obtain the system metadata about the files stored on the file system. The system metadata is written to a file, which is then transferred to the IBM Spectrum Discover master node. The file is then ingested within the node and analytics is carried out to provide search, duplicate file detection, archive data detection, and capacity show-back report generation.
The following system metadata is collected from the file system scan:
Key name | Description |
---|---|
Site | The site where the file or object resides. |
Platform | The source storage platform that contains the file or object. |
Size | The size of the file. |
Owner | The owner of the file. |
Path | The subdirectory where the data resides. |
Name | The name of the data. |
Permissions | The permissions for the file (mode). |
ctime | The change time of the file metadata (inode). |
mtime | The time when the data was last modified. |
atime | The time when the data was last accessed. |
Filesystem | The name of the IBM Storage Scale file system that is storing the data. |
Cluster | The name of the IBM Storage Scale cluster. |
inode | The IBM Storage Scale inode that is storing the data. |
Group | The Linux® group associated with the file. |
Fileset | The file set that stores the file. |
Pool | The storage pool where the file resides. |
Migstatus | If applicable, indicates whether the data is migrated to tape or object. |
migloc | If applicable, indicates the location of the data if migrated to tape or object. |
ScanGen | The scan generation, which is useful for tracking rescans. |
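
The keys in the table above can be pictured as one record per scanned file. The following is a minimal sketch only: the class name `ScanRecord`, the field types, and every sample value are assumptions made for illustration, and the actual on-disk format written by the scanner is not described here.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ScanRecord:
    """Hypothetical in-memory view of one scanned file (types are assumed)."""
    site: str
    platform: str
    size: int                 # file size; the unit is assumed to be bytes
    owner: str
    path: str                 # subdirectory where the data resides
    name: str
    permissions: str          # file mode
    ctime: str                # inode change time
    mtime: str                # last data modification time
    atime: str                # last access time
    filesystem: str
    cluster: str
    inode: int
    group: str
    fileset: str
    pool: str
    migstatus: Optional[str] = None   # only set when the data is migrated
    migloc: Optional[str] = None      # only set when the data is migrated
    scangen: int = 0                  # scan generation, for tracking rescans


# Sample values below are invented for illustration only.
example = ScanRecord(
    site="site-a", platform="IBM Storage Scale", size=4096, owner="alice",
    path="/gpfs/fs1/projects", name="report.txt", permissions="-rw-r--r--",
    ctime="2024-01-01 10:00:00", mtime="2024-01-01 09:00:00",
    atime="2024-01-02 08:00:00", filesystem="fs1", cluster="cluster1",
    inode=123456, group="staff", fileset="root", pool="system", scangen=1,
)

record_as_dict = asdict(example)
```

Modeling the two migration keys as optional reflects the "if applicable" wording in the table; a file that has not been migrated simply carries no value for them.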
The IBM Storage Scale Scanner tool also collects quota information by calling mmrepquota.
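
As a rough illustration of what calling mmrepquota from a script could look like, the sketch below builds and runs the command with Python's `subprocess` module. The exact options that the scanner passes are not stated in this topic, so the choice of `-u`, `-g`, and `-j` (user, group, and fileset quotas), the helper names, and the device name `fs1` are all assumptions.

```python
import shutil
import subprocess


def build_quota_command(device):
    """Return an argument list for mmrepquota (option choice is assumed)."""
    # -u, -g, -j request user, group, and fileset quota reports.
    return ["mmrepquota", "-u", "-g", "-j", device]


def collect_quota_report(device):
    """Run mmrepquota and return its text output, or None if unavailable."""
    if shutil.which("mmrepquota") is None:
        # The command is not on PATH, so this is not a Storage Scale node
        # or the environment is not set up for the mm commands.
        return None
    result = subprocess.run(
        build_quota_command(device),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# "fs1" is a hypothetical file system device name.
command = build_quota_command("fs1")
```

Passing the command as a list rather than a shell string avoids quoting problems if a device name contains unusual characters.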
The tool comprises the following files:
- scale_scanner.py: The tool that starts the IBM Storage Scale ILM policy.
- scale_scanner.conf: The configuration file used to customize the behavior of the scale_scanner.py tool.
- createScanPolicy: The script that is called internally by the tool.
Procedure
Install the IBM Storage Scale Scanner tool by unpacking the utility from the IBM Spectrum Discover node to the required location on the IBM Storage Scale cluster node.