Importing externally curated tags for COS/S3 using the import tags application
The import tags application is used to import a set of externally curated tags for Cloud Object Storage and S3 services.
Before you begin
The S3/COS (Cloud Object Storage) data source that is associated with the objects in the external tags file must be scanned before you run an import tags policy.
Tag names must be defined in IBM Spectrum® Discover before the import tags policy is run. Do not use the Restricted tag type; use the Open or Characteristics tag type instead.
- You can create only a limited number of Open type tags, and the tags must correspond to values in the header row of the comma-separated values (CSV) file. Tags that are not defined before you trigger the policy are not imported.
- The CSV file must be in the bucket that is defined in the data source.
- Only a single tag file is supported per policy.
The external CSV file must meet the following requirements:
- The tags file must be in CSV format.
- The first row in the file must be a header row or line.
- The first column must be the full object path or name. For example, if the bucket is auto_data, and the object name is car1/image1.png, then the first column entry is auto_data/car1/image1.png.
- The value in the header row for the first column is not restricted by IBM Spectrum Discover.
- The subsequent columns in the CSV file represent the tag values that can be imported into IBM Spectrum Discover for the associated object records.
- The second through Nth entries in the header row must correspond to valid tags in IBM Spectrum Discover that are defined before you run the import tags policy.
- Each entry in the CSV file must represent a unique file in the data source.
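The requirements above can be checked before the file is uploaded to the bucket. The following is a minimal sketch, not part of IBM Spectrum Discover: the function name `validate_tags_csv` and the `defined_tags` set are hypothetical, and the test for a bucket prefix is only a heuristic that assumes the first column has the form bucket/object name.

```python
import csv
import io


def validate_tags_csv(text, defined_tags):
    """Return a list of problems found in a tags CSV (empty list if none).

    `defined_tags` is a hypothetical set of tag names that were already
    defined in IBM Spectrum Discover by some other means.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty: a header row is required"]

    problems = []
    header, data = rows[0], rows[1:]

    # The first header entry is unrestricted; entries 2..N must name defined tags.
    for name in header[1:]:
        if name not in defined_tags:
            problems.append(f"undefined tag in header: {name}")

    seen = set()
    for line_no, row in enumerate(data, start=2):
        if not row:
            continue  # ignore blank lines
        path = row[0]
        # Heuristic: a full object path is expected to look like "<bucket>/<object>".
        if "/" not in path:
            problems.append(f"line {line_no}: first column lacks a bucket prefix")
        # Each entry must refer to a unique object.
        if path in seen:
            problems.append(f"line {line_no}: duplicate object {path}")
        seen.add(path)
    return problems
```

An empty result means that none of these checks failed; it does not guarantee that the import tags policy accepts the file.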
Example contents of a CSV file:
objectname,bus,tree,stop_sign,red_light,yellow_light,green_light,pedestrian
auto_data/car1/image1.png,1,3,0,1,1,0,1
auto_data/car2/image1.png,1,6,0,0,0,0,12
auto_data/car2/image2.png,1,3,0,2,1,0,1
auto_data/car3/image1.png,1,3,0,2,1,0,2
The following tags can be defined in IBM Spectrum Discover from the records available in the CSV file:
- bus
- tree
- stop_sign
- red_light
- yellow_light
- green_light
- pedestrian
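The tag list above is the header row without its first column. As an illustration only, the following sketch derives that list with the Python standard library; the helper name `tag_names_from_header` is hypothetical, and the sample text reuses the header and first record from the example file.

```python
import csv
import io

# Header row and first record from the example CSV file.
sample = """objectname,bus,tree,stop_sign,red_light,yellow_light,green_light,pedestrian
auto_data/car1/image1.png,1,3,0,1,1,0,1
"""


def tag_names_from_header(text):
    """Return the header entries after the first column (the candidate tag names)."""
    header = next(csv.reader(io.StringIO(text)))
    return header[1:]


print(tag_names_from_header(sample))
# ['bus', 'tree', 'stop_sign', 'red_light', 'yellow_light', 'green_light', 'pedestrian']
```

Each of these names must still be defined as a tag in IBM Spectrum Discover before the policy is run.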
About this task
The IBM Spectrum Discover import tags application allows a user with the DATA ADMIN role to apply a pre-curated set of labels (tags) from an external CSV file to the S3/COS object records that are stored in IBM Spectrum Discover.
For example, an external analytics job might generate tag information for a set of S3/COS objects and save this information in a CSV file. The CSV file contains one entry for each object, and each entry consists of the object name and an associated list of labels or tags.
The import tags application can merge these tags into the associated object records in IBM Spectrum Discover, extending the records with new information.
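Conceptually, the merge behaves like the following sketch. This is an in-memory model for illustration, not the application's implementation: the `records` dictionary, the `merge_tags` function, and the `defined_tags` set are all hypothetical. It assumes that objects without an existing record are skipped because the data source must be scanned first, and that columns for undefined tags are ignored.

```python
import csv
import io


def merge_tags(records, csv_text, defined_tags):
    """Extend object records with tag values read from a tags CSV.

    records      -- hypothetical dict: full object path -> dict of metadata
    csv_text     -- contents of the tags CSV file
    defined_tags -- hypothetical set of tag names that are already defined
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if not reader.fieldnames:
        return records  # no header row, so there is nothing to import
    path_column = reader.fieldnames[0]

    for row in reader:
        record = records.get(row[path_column])
        if record is None:
            continue  # object was not scanned, so there is no record to extend
        for tag, value in row.items():
            # Only columns that name a defined tag are merged into the record.
            if tag != path_column and tag in defined_tags:
                record[tag] = value
    return records
```

For example, merging a file whose header is objectname,bus,tree with only bus defined adds a bus value to each matching record and leaves the other metadata in the record unchanged.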