IBM Cloud Object Storage Scanner output data

The Scanner generates a directory beneath the output data directory for each vault or vault prefix as defined in the configuration file.

The /opt/ibm/metaocean/data/connections/cos/replay/output/data directory is the Scanner output data directory.

The following example shows a configuration file in which all vaults are scanned. In addition, four separate prefixes are defined for mega_vault, which means that four scans of that vault occur.


"include_all_vaults": true,
"vaults": [
  {"vault_name": "mega_vault", "prefix": "main/production/finance"},
  {"vault_name": "mega_vault", "prefix": "main/production/sales"},
  {"vault_name": "mega_vault", "prefix": "main/production/marketing"},
  {"vault_name": "mega_vault", "prefix": "main/production/hr"}
]
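As a rough illustration of how the configuration above maps to scans, the following Python sketch lists the prefix scans that the vaults entries define. It assumes that the fragment is wrapped in braces so that it forms a complete JSON object. The function name list_prefix_scans is hypothetical and is not part of the Scanner.

```python
import json

# The configuration fragment from above, wrapped in braces so that it
# parses as a complete JSON object (an assumption made for this sketch).
CONFIG_TEXT = """
{
  "include_all_vaults": true,
  "vaults": [
    {"vault_name": "mega_vault", "prefix": "main/production/finance"},
    {"vault_name": "mega_vault", "prefix": "main/production/sales"},
    {"vault_name": "mega_vault", "prefix": "main/production/marketing"},
    {"vault_name": "mega_vault", "prefix": "main/production/hr"}
  ]
}
"""


def list_prefix_scans(config):
    """Return one (vault_name, prefix) pair per entry in "vaults"."""
    return [
        (entry["vault_name"], entry.get("prefix", ""))
        for entry in config.get("vaults", [])
    ]


config = json.loads(CONFIG_TEXT)
scans = list_prefix_scans(config)

# Each pair corresponds to one separate scan of the named vault.
for vault_name, prefix in scans:
    print(f"{vault_name}: {prefix}")
print(f"include_all_vaults = {config['include_all_vaults']}")
```

The sketch only enumerates the prefix scans. It does not attempt to reproduce the directory names that the Scanner creates for them.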

Figure 1 shows the directory structure.

Figure 1. Directory structure from the configuration file

The status and progress of each scan must be maintained, so a separate directory structure is created for each scan. Table 1 lists the files in the leaf directories and describes each one.

Table 1. Leaf directory file names
File name: _LISTProcessN.debug
Description: Contains detailed debug information and details of any errors that are encountered when the vault is scanned. The N in the file name is different for each process (0 - 9 if there are 10 processes). Figure 2 shows an example of running in debug mode.

Figure 2. Example of running in debug mode

File name: task.stats
Description: Scanner statistics in JSON format for a single vault. The file is updated after each batch of objects is processed successfully.

Batch processing

File name: *.log
Description: The Scanner creates multiple .log files for each vault. Each .log file contains up to 1000 Kafka messages, ready to be submitted to the Kafka cluster by the Notifier.

The naming convention for the log files is:

<date>-<time>-<milliseconds>-<batch number>-<number of messages in file>.log
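The naming convention above can be read back programmatically, for example to check how many messages are waiting in a leaf directory. The following Python sketch is one possible approach and not part of the Scanner. The names LogFileName, parse_log_file_name, and total_messages are hypothetical, and the sample file names are invented because the exact date and time formats are not specified here. For that reason, the sketch splits from the right and leaves the <date>-<time>-<milliseconds> part unparsed.

```python
from typing import Iterable, NamedTuple


class LogFileName(NamedTuple):
    timestamp: str      # "<date>-<time>-<milliseconds>", kept as one string
    batch_number: int
    message_count: int


def parse_log_file_name(file_name: str) -> LogFileName:
    """Extract the batch number and message count from a .log file name."""
    if not file_name.endswith(".log"):
        raise ValueError(f"not a .log file: {file_name}")
    stem = file_name[: -len(".log")]
    # Split from the right so that any hyphens inside the date or time
    # fields stay in the timestamp part.
    parts = stem.rsplit("-", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected file name: {file_name}")
    timestamp, batch, count = parts
    return LogFileName(timestamp, int(batch), int(count))


def total_messages(file_names: Iterable[str]) -> int:
    """Sum the message counts that are encoded in the .log file names."""
    return sum(
        parse_log_file_name(name).message_count
        for name in file_names
        if name.endswith(".log")
    )


# Invented file names for illustration only.
examples = [
    "20240101-120000-123-1-1000.log",
    "20240101-120005-456-2-250.log",
]
print(total_messages(examples))
```

Because each .log file contains up to 1000 messages, a count above 1000 in a parsed name would suggest that the file name does not follow the convention.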