/db2whrest/v1/report: POST
Creates a curation report definition and runs it immediately.
Data admin | Data user | Collection Admin | Admin | Service user |
---|---|---|---|---|
✓ | Χ | Χ | Χ | Χ |
Synopsis of the request URL
curl -k -H 'Authorization: Bearer <token>' https://<spectrum_discover_host>/db2whrest/v1/report -X POST -d@report.json -H "Content-Type: application/json"
This endpoint takes the same parameters as the /db2whrest/v1/search endpoint and also has a name parameter:
- query
- Specifies a search query string.
- filters
- Specifies an array of dictionaries that filter the results of the query. Each dictionary must
contain the following three fields:
- key
- The name of the field or column to be returned.
- operator
- One of the following operators: =, >, <, <>, <=, >=, is, like.
- value
- The value of the field or column to filter on.
- group_by
- Specifies a list of fields to be used to summarize search results. Grouped queries return output columns for count and sum, whereas nongrouped queries return all columns for each row. If the group_by field is not specified, the search returns record-level information.
- sort_by
- Specifies an array of dictionary objects. Each dictionary object must specify a field or column name to be sorted on and a sort direction. Valid sort directions are asc and desc.
- limit
- Specifies the maximum number of rows that can be returned by the response.
- name
- Specifies a name for the report output file. If this parameter is not specified, the endpoint uses the randomly generated UUID as the file name base.
{
"name": "Unassigned Project Report",
"query": "platform='Spectrum Scale'",
"filters": [
{
"key": "project",
"operator": "is",
"value": "null"
}
],
"group_by": ["Filesystem","Owner","Site"],
"sort_by": [{"Filesystem": "asc"},{"Owner": "asc"}],
"limit": 100000
}
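The parameter rules above can be checked before a request body is written to report.json. The following is a minimal sketch, not part of the product: the function name build_report_body and its argument names are hypothetical, and the checks cover only the rules stated in the parameter list (the three required filter fields, the listed operators, and the asc and desc sort directions).

```python
import json

# Operators listed for the "operator" field of a filter.
VALID_OPERATORS = {"=", ">", "<", "<>", "<=", ">=", "is", "like"}
# Sort directions listed for sort_by entries.
VALID_DIRECTIONS = {"asc", "desc"}


def build_report_body(query, filters=(), group_by=(), sort_by=(), limit=None, name=None):
    """Assemble a request body, rejecting values the parameter list does not allow.

    Hypothetical helper: the name and signature are illustrative only.
    """
    for flt in filters:
        missing = {"key", "operator", "value"} - set(flt)
        if missing:
            raise ValueError(f"filter is missing fields: {sorted(missing)}")
        if flt["operator"] not in VALID_OPERATORS:
            raise ValueError(f"unsupported operator: {flt['operator']!r}")
    for entry in sort_by:
        for field, direction in entry.items():
            if direction not in VALID_DIRECTIONS:
                raise ValueError(f"sort direction for {field!r} must be asc or desc")
    body = {
        "query": query,
        "filters": list(filters),
        "group_by": list(group_by),
        "sort_by": list(sort_by),
    }
    # limit and name are only included when the caller supplies them.
    if limit is not None:
        body["limit"] = limit
    if name is not None:
        body["name"] = name
    return body


if __name__ == "__main__":
    body = build_report_body(
        query="platform='Spectrum Scale'",
        filters=[{"key": "project", "operator": "is", "value": "null"}],
        group_by=["Filesystem", "Owner", "Site"],
        sort_by=[{"Filesystem": "asc"}, {"Owner": "asc"}],
        limit=100000,
        name="Unassigned Project Report",
    )
    print(json.dumps(body, indent=2))
```

Running the script prints a body equivalent to the example above, which can be redirected into report.json.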
Supported request types and response formats
- POST
- JSON
Status codes
- 201
- The operation is successful.
- All other status code values
- The operation failed.
Examples
- The following example shows how to create a report:
- Step 1: Define the search parameters in a file named report.json:
{ "name": "Unassigned Project Report", "query": "platform='Spectrum Scale'", "filters": [ { "key": "project", "operator": "is", "value": "null" } ], "group_by": ["Filesystem","Owner","Site"], "sort_by": [{"Filesystem": "asc"},{"Owner": "asc"}], "limit": 100000 }
- Step 2: Submit the following request:
curl -k -H 'Authorization: Bearer <token>' https://<spectrum_discover_host>/db2whrest/v1/report -X POST -d@report.json -H "Content-Type: application/json"
- The response includes the ID of the new report and the status of the operation ("report created"):
{"report": "b5ff3126-353d-4d7d-857a-750cc20b8bab", "status": "report created"}
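The same request can be prepared from Python instead of curl. The sketch below uses only the standard library and assumes nothing beyond what the curl example shows: the URL path, the Bearer token header, the JSON content type, and the 201 status code with a "report" field in the response. The function names build_report_request and parse_report_response are hypothetical.

```python
import json
import urllib.request


def build_report_request(host, token, body):
    """Build (but do not send) the POST request for /db2whrest/v1/report."""
    return urllib.request.Request(
        url=f"https://{host}/db2whrest/v1/report",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def parse_report_response(status, payload):
    """Return the new report ID, or raise if the status code is not 201."""
    if status != 201:
        raise RuntimeError(f"report creation failed with status {status}")
    return json.loads(payload)["report"]


if __name__ == "__main__":
    # Placeholder host and token; substitute real values before sending.
    req = build_report_request("<spectrum_discover_host>", "<token>", {"query": ""})
    print(req.get_method(), req.full_url)
```

To send the request, pass it to urllib.request.urlopen and hand the response status and body to parse_report_response. The -k flag in the curl example skips certificate verification; this sketch leaves the choice of SSL context to the caller.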
Report scripts
- JSON Examples
The following example lists the files or objects that were accessed in the last 0 - 30 days. This report is created in the UI.
age_report_0-30_days_since_access_detail.json
{ "name": "Age Report Detail. 0-30 Days", "query": "", "filters": [ { "key": "atime", "operator": ">", "value": "NOW() - 30 DAYS" } ], "group_by": [], "sort_by": [] }
This example summarizes the files or objects that were accessed in the last 0 - 30 days, grouped by data source. This report is created in the UI.
age_report_0-30_days_since_access_summary.json
{ "name": "Age Report Summary. 0-30 Days", "query": "", "filters": [ { "key": "atime", "operator": ">", "value": "NOW() - 30 DAYS" } ], "group_by": ["datasource"], "sort_by": [{"datasource": "asc"}] }
This example lists the files or objects that were last accessed 30 - 60 days ago. This report is created in the UI.
age_report_30-60_days_since_access_detail.json
{ "name": "Age Report Detail. 30-60 Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 30 DAYS" }, { "key": "atime", "operator": ">", "value": "NOW() - 60 DAYS" } ], "group_by": [], "sort_by": [] }
This example summarizes the files or objects that were last accessed 30 - 60 days ago, grouped by data source. This report is created in the UI.
age_report_30-60_days_since_access_summary.json
{ "name": "Age Report Summary. 30-60 Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 30 DAYS" }, { "key": "atime", "operator": ">", "value": "NOW() - 60 DAYS" } ], "group_by": ["datasource"], "sort_by": [{"datasource": "asc"}] }
This example lists the files or objects that were last accessed 60 - 90 days ago. This report is created in the UI.
age_report_60-90_days_since_access_detail.json
{ "name": "Age Report Detail. 60-90 Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 60 DAYS" }, { "key": "atime", "operator": ">", "value": "NOW() - 90 DAYS" } ], "group_by": [], "sort_by": [] }
The following example summarizes the files or objects that were last accessed 60 - 90 days ago, grouped by data source. This report is created in the UI.
age_report_60-90_days_since_access_summary.json
{ "name": "Age Report Summary. 60-90 Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 60 DAYS" }, { "key": "atime", "operator": ">", "value": "NOW() - 90 DAYS" } ], "group_by": ["datasource"], "sort_by": [{"datasource": "asc"}] }
The following example lists the files or objects that were last accessed 90 - 180 days ago. This report is created in the UI.
age_report_90-180_days_since_access_detail.json
{ "name": "Age Report Detail. 90-180 Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 90 DAYS" }, { "key": "atime", "operator": ">", "value": "NOW() - 180 DAYS" } ], "group_by": [], "sort_by": [] }
The following example summarizes the files or objects that were last accessed 90 - 180 days ago, grouped by data source. This report is created in the UI.
age_report_90-180_days_since_access_summary.json
{ "name": "Age Report Summary. 90-180 Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 90 DAYS" }, { "key": "atime", "operator": ">", "value": "NOW() - 180 DAYS" } ], "group_by": ["datasource"], "sort_by": [{"datasource": "asc"}] }
The following example lists the files or objects that were last accessed 180 - 360 days ago. This report is created in the UI.
age_report_180-360_days_since_access_detail.json
{ "name": "Age Report Detail. 180-360 Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 180 DAYS" }, { "key": "atime", "operator": ">", "value": "NOW() - 360 DAYS" } ], "group_by": [], "sort_by": [] }
The following example summarizes the files or objects that were last accessed 180 - 360 days ago, grouped by data source. This report is created in the UI.
age_report_180-360_days_since_access_summary.json
{ "name": "Age Report Summary. 180-360 Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 180 DAYS" }, { "key": "atime", "operator": ">", "value": "NOW() - 360 DAYS" } ], "group_by": ["datasource"], "sort_by": [{"datasource": "asc"}] }
The following example lists the files or objects that were last accessed 360 - 720 days ago. This report is created in the UI.
age_report_360-720_days_since_access_detail.json
{ "name": "Age Report Detail. 360-720 Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 360 DAYS" }, { "key": "atime", "operator": ">", "value": "NOW() - 720 DAYS" } ], "group_by": [], "sort_by": [] }
The following example summarizes the files or objects that were last accessed 360 - 720 days ago, grouped by data source. This report is created in the UI.
age_report_360-720_days_since_access_summary.json
{ "name": "Age Report Summary. 360-720 Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 360 DAYS" }, { "key": "atime", "operator": ">", "value": "NOW() - 720 DAYS" } ], "group_by": ["datasource"], "sort_by": [{"datasource": "asc"}] }
The following example lists the files or objects that were not accessed in the last 720 days. This report is created in the UI.
age_report_720+_days_since_access_detail.json
{ "name": "Age Report Detail. 720+ Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 720 DAYS" } ], "group_by": [], "sort_by": [] }
The following example summarizes the files or objects that were not accessed in the last 720 days, grouped by data source. This report is created in the UI.
age_report_720+_days_since_access_summary.json
{ "name": "Age Report Summary. 720+ Days", "query": "", "filters": [ { "key": "atime", "operator": "<=", "value": "NOW() - 720 DAYS" } ], "group_by": ["datasource"], "sort_by": [{"datasource": "asc"}] }
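The age report definitions above all follow one pattern: an upper bound on atime for the younger edge of the range, a lower bound for the older edge, and a datasource grouping for the summary variant. The following sketch reproduces that pattern for arbitrary ranges. The function age_report is hypothetical and is not shipped with the product; it only mirrors the structure of the JSON files shown here.

```python
import json


def age_report(min_days, max_days=None, summary=False):
    """Build an age report definition for items last accessed min_days to max_days ago.

    Hypothetical helper. max_days=None means an open-ended range such as "720+".
    """
    label = f"{min_days}+ Days" if max_days is None else f"{min_days}-{max_days} Days"
    filters = []
    if min_days > 0:
        # Accessed at least min_days ago.
        filters.append({"key": "atime", "operator": "<=", "value": f"NOW() - {min_days} DAYS"})
    if max_days is not None:
        # Accessed more recently than max_days ago.
        filters.append({"key": "atime", "operator": ">", "value": f"NOW() - {max_days} DAYS"})
    return {
        "name": f"Age Report {'Summary' if summary else 'Detail'}. {label}",
        "query": "",
        "filters": filters,
        "group_by": ["datasource"] if summary else [],
        "sort_by": [{"datasource": "asc"}] if summary else [],
    }


if __name__ == "__main__":
    print(json.dumps(age_report(30, 60, summary=True), indent=2))
```

For example, age_report(30, 60, summary=True) yields the same definition as age_report_30-60_days_since_access_summary.json.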
- SQL scripts
This script provides the count of potentially duplicate files across the heterogeneous environment. This report is not created in the UI.
duplicate_files_by_count.sql
select filename, size, count(fkey) from metaocean group by filename, size order by count(fkey) desc limit 20;
This script provides the size of potentially duplicate files across the heterogeneous environment. This report is not created in the UI.
duplicate_files_by_total_size.sql
select filename,entrysize,entrycount,totalsize from (select filename, size as entrysize, count(fkey) as entrycount, count(fkey)*size as TotalSize from metaocean group by filename,size) where entrycount>1 order by totalsize desc;
This script provides a snapshot of each data source: the number of records, the total size in GiB, and the most recent modification and access times. This report is not created in the UI.
size_snap.sql
select datasource,count(*),sum(size)/1024/1024/1024,max(mtime),max(atime) from metaocean group by datasource with ur
This script provides a view of the capacity that is consumed per collection. This report is not created in the UI.
space_per_collection.sql
select metaocean.collection,count(*),sum(size)/1024/1024/1024,max(mtime),datasource, tier from metaocean group by metaocean.collection,datasource,tier order by max(mtime) desc with ur
This script provides a view of the capacity that is consumed per file type. This report is not created in the UI.
space_per_filetype.sql
select filetype,datasource,tier,count(*),sum(size)/1024/1024/1024 from metaocean group by filetype,datasource,tier order by filetype,datasource,tier desc with ur
This script provides a view of the capacity that is consumed per user. This report is not created in the UI.
space_per_user.sql
select owner,tier,count(*),sum(size)/1024/1024/1024,max(mtime),max(atime) from metaocean group by owner,tier order by owner,tier with ur
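To see what the duplicate-count query returns, it can be run against a small stand-in table. The sketch below uses an in-memory SQLite database purely for illustration; the product database is not SQLite, the three-column metaocean table here is a simplified assumption, and the with ur clause used by some of the other scripts is not valid in SQLite. The function name duplicates_by_count and the sample rows are hypothetical.

```python
import sqlite3

# Query text taken from duplicate_files_by_count.sql.
DUPLICATES_BY_COUNT = (
    "select filename, size, count(fkey) from metaocean "
    "group by filename, size order by count(fkey) desc limit 20"
)


def duplicates_by_count(records):
    """Load (filename, size, fkey) tuples into a stand-in table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed minimal schema; the real table has more columns.
        conn.execute("create table metaocean (filename text, size integer, fkey text)")
        conn.executemany("insert into metaocean values (?, ?, ?)", records)
        return conn.execute(DUPLICATES_BY_COUNT).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        ("a.dat", 100, "k1"),
        ("a.dat", 100, "k2"),
        ("a.dat", 100, "k3"),
        ("b.dat", 50, "k4"),
        ("b.dat", 50, "k5"),
        ("c.dat", 10, "k6"),
    ]
    for row in duplicates_by_count(sample):
        print(row)
```

Files that share both a name and a size are counted together, so a.dat appears once with a count of 3. A matching name and size only suggests a duplicate; the query does not compare file contents.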
- /opt/ibm/metaocean/reports/generate_report.py
To run the utility, log in to an IBM Spectrum Discover node and run generate_report.py with the following usage:
python generate_report.py [-h] [-o filename] -u username [infile]
The tool requires an IBM Spectrum Discover user name and an input file. You must have the Data Admin role to generate reports. The tool prompts for a password.
Input file examples are stored in the /opt/ibm/metaocean/reports/sql directory. The directory contains a mix of JSON and SQL files. All JSON files create a report in the IBM Spectrum Discover UI. The SQL files create a CSV file.
- Run a report with an SQL input file:
python generate_report.py -u sdadmin -o report.csv sql/space_per_user.sql
Enter password for SD user 'sdadmin':
- Run a report with a JSON input file:
python generate_report.py -u sdadmin sql/age_report_0-30_days_since_access_detail.json
Enter password for SD user 'sdadmin':
- /opt/ibm/metaocean/reports/generate_path_age_report.py
Important: This tool does not create reports in the IBM Spectrum Discover UI.
Here is the usage of the tool:
generate_path_age_report.py [-h] -u username -r report -p pathlevel -a archive
optional arguments:
-h, --help show this help message and exit
-u username, --user username User name with authority to create reports
-r report, --report report The report type to be generated (ARDS, ARDSOW, AROW, ARPL, CPPL, FTMB, CPFT, FTMB50)
-p pathlevel, --pathlevel pathlevel The level of path used in some reports
-a archive, --archive archive The archive threshold in months
The generate_path_age_report.py tool needs a minimum of three parameters:
pathlevel
The level of path used in some reports, for example: 1 = /x/, 2 = /x/y/. If your report does not use this parameter, a value of 1 must be used.
archive
The archive threshold is the number of months since the last time a file is considered relevant for archiving for reporting purposes. If your report does not use this parameter, a value of 1 must be used.
report
The report type is one of the following codes:
ARDS = Summary of archivable capacity grouped by datasource
ARDSOW = Summary of archivable capacity grouped by datasource and owner
AROW = Summary of archivable capacity grouped by owner
ARPL = Summary of archivable capacity grouped by specified path level
CPPL = Summary of capacity grouped by specified path level
FTMB = Summary of filetype usage by month, previous 12 months
CPFT = Summary of capacity grouped by filetype
FTMB50 = Summary of filetype usage by month, previous 12 months' top 50 filetypes
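The parameters above can be illustrated with a short sketch. It shows one plausible reading of the path level (the first N directory components of a path) and assembles a command line that falls back to 1 for pathlevel and archive when a report does not use them. Both functions, path_prefix and build_command, are hypothetical; how the tool itself truncates paths is not documented here.

```python
import posixpath

# Report codes listed in the tool's usage text.
REPORT_CODES = ("ARDS", "ARDSOW", "AROW", "ARPL", "CPPL", "FTMB", "CPFT", "FTMB50")


def path_prefix(path, level):
    """Return the first `level` directory components of an absolute file path.

    Illustrates "1 = /x/, 2 = /x/y/"; an assumption about the tool's behavior.
    """
    parts = [p for p in posixpath.dirname(path).split("/") if p]
    return "/" + "".join(part + "/" for part in parts[:level])


def build_command(user, report, pathlevel=1, archive=1):
    """Build the argument list for generate_path_age_report.py.

    pathlevel and archive default to 1, the value required when unused.
    """
    if report not in REPORT_CODES:
        raise ValueError(f"unknown report code: {report!r}")
    return [
        "python", "generate_path_age_report.py",
        "-u", user,
        "-r", report,
        "-p", str(pathlevel),
        "-a", str(archive),
    ]


if __name__ == "__main__":
    print(path_prefix("/x/y/z/file.txt", 2))
    print(" ".join(build_command("sdadmin", "ARDSOW", pathlevel=2, archive=12)))
```

The resulting list can be passed to subprocess.run on an IBM Spectrum Discover node; the tool still prompts for the password interactively.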
The following example shows how to run the generate_path_age_report.py tool:
python generate_path_age_report.py -u sdadmin -r ARDSOW -p 2 -a 12
Starting to create report type 'ARDSOW' for user: sdadmin
Enter password for SD user 'sdadmin':
Setting up path level summary table
Generating path level summary table
Generating path level summary table complete.
Generating report
Generating report - building temporary table.
Generating report - querying temporary table.
Generating report - writing output file.
Report generated successfully. Results are in 'rpt_ar_ds_ow.csv'.