/db2whrest/v1/search: POST
Searches a database for data.
The following table shows which roles can access this REST API endpoint:
Data admin | Data user | Collection Admin | Admin | Service user |
---|---|---|---|---|
✓ | ✓¹ | ✓¹ | Χ | Χ |
¹ The search is restricted to documents that are tagged with collections to which the user ID has a datauser role assigned.
The search endpoint has the following format:
/db2whrest/v1/search -H 'Authorization: Bearer <token>' -X POST -d@<data.json> -H "Content-Type: application/json"
The endpoint accepts requests in JSON format and returns a response in a self-describing data set. It also returns the query time and the number of elements in the result set. If an endpoint operation takes more than 10 seconds to complete, it is converted to an asynchronous operation. For more information on asynchronous endpoint operations, see Asynchronous endpoints.
You can modify a search request with the following fields. An example follows this list:
- query
  - Specifies a search query string.
- filters
  - Specifies an array of dictionaries that filter the results of the query. Each dictionary must contain the following three fields:
    - key
      - The name of the field or column to filter on.
    - operator
      - One of the following operators: =, >, <, <>, <=, >=, in, like.
    - value
      - The value of the field or column to filter on.
- group_by
  - Specifies a list of fields to be used to summarize search results. If the group_by field is not specified, the search returns record-level information.
  - Note: The combination "group_by": ["filename"] causes the query to be applied to the duplicate file table. All other group_by combinations cause the query to be applied to the summary tables.
- sort_by
  - Specifies an array of dictionary objects. Each dictionary object must specify a field or column name to be sorted on and a sort direction. Valid sort directions are asc and desc.
- limit
  - Specifies the maximum number of rows that can be returned by the response.
The following example illustrates how to specify these fields:
{
  "query": "platform='Spectrum Scale'",
  "filters": [
    {
      "key": "project",
      "operator": "is",
      "value": "null"
    }
  ],
  "group_by": ["Filesystem","Owner","Site"],
  "sort_by": [{"Filesystem": "asc"},{"Owner": "asc"}],
  "limit": 100000
}
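The field rules above can also be checked before a request is sent. The following Python sketch is illustrative only and is not part of the product: the helper name build_search_body is hypothetical, and it checks only the two rules stated in the list (every filter carries key, operator, and value, and every sort direction is asc or desc). It does not validate operator names.

```python
import json

VALID_DIRECTIONS = {"asc", "desc"}
REQUIRED_FILTER_FIELDS = {"key", "operator", "value"}


def build_search_body(query="", filters=None, group_by=None, sort_by=None, limit=None):
    """Assemble a search request body (hypothetical helper, not a product API)."""
    filters = filters or []
    sort_by = sort_by or []

    # Each filter dictionary must contain key, operator, and value.
    for entry in filters:
        missing = REQUIRED_FILTER_FIELDS - set(entry)
        if missing:
            raise ValueError(f"filter {entry!r} is missing {sorted(missing)}")

    # Each sort_by dictionary maps a column name to asc or desc.
    for entry in sort_by:
        for column, direction in entry.items():
            if direction not in VALID_DIRECTIONS:
                raise ValueError(f"invalid sort direction {direction!r} for {column!r}")

    body = {
        "query": query,
        "filters": filters,
        "group_by": group_by or [],
        "sort_by": sort_by,
    }
    if limit is not None:
        body["limit"] = limit
    return body


# Reproduce the example request body shown above.
body = build_search_body(
    query="platform='Spectrum Scale'",
    filters=[{"key": "project", "operator": "is", "value": "null"}],
    group_by=["Filesystem", "Owner", "Site"],
    sort_by=[{"Filesystem": "asc"}, {"Owner": "asc"}],
    limit=100000,
)
search_json = json.dumps(body, indent=2)
print(search_json)
```

The resulting string can be saved to a file such as search.json and passed to curl with -d@search.json.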
Synopsis of the request URL
curl -k -H 'Authorization: Bearer <token>' https://<spectrum_discover_host>/db2whrest/v1/search \
  -X POST -d@search.json -H "Content-Type: application/json"
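For callers that prefer a script to curl, the same request can be sketched with the Python standard library. This is a minimal sketch under stated assumptions: the function names build_search_request and send_search and the host discover.example.com are placeholders, and the verify_tls switch is included only to mirror the effect of curl's -k option.

```python
import json
import ssl
import urllib.request


def build_search_request(host, token, body):
    """Build (but do not send) a POST request for the search endpoint."""
    url = f"https://{host}/db2whrest/v1/search"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_search(request, verify_tls=True):
    """Send the request and return the parsed JSON response."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent in spirit to curl -k: skip certificate verification.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


# Placeholder host and token; nothing is sent here.
request = build_search_request(
    "discover.example.com", "<token>", {"query": "", "limit": 10}
)
print(request.get_method(), request.full_url)
```

Calling send_search(request, verify_tls=False) would submit the request without certificate checks, which is appropriate only for test systems with self-signed certificates.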
Supported request types and response formats
Supported request types:
- POST
Supported response formats:
- JSON
Examples
- The following example shows how to define search parameters and format the data that is returned:
  - Step 1: Define the search parameters in a file named search.json:
    { "query": "platform='Spectrum Scale'", "filters": [ { "key": "project", "operator": "is", "value": "null" } ], "group_by": ["Filesystem","Owner","Site"], "sort_by": [{"Filesystem": "asc"},{"Owner": "asc"}], "limit": 100000 }
  - Step 2: Submit the request:
    curl -k -H 'Authorization: Bearer <token>' https://<spectrum_discover_host>/db2whrest/v1/search -X POST -d@search.json -H "Content-Type: application/json"
The following response is returned:
{ "facet_tree": { "OWNER": "[{\"owner\":\"nobody\",\"count\":8},{\"owner\":\"benjamin\",\"count\":2}, {\"owner\":\"_NULL_\",\"count\":989},{\"owner\":\"borgato\",\"count\":2}, {\"owner\":\"boston\", \"count\":4554},{\"owner\":\"root\",\"count\":366104932}, {\"owner\":\"beale\",\"count\":203545}, {\"owner\":\"baldwin\",\"count\":2},{\"owner\":\"behr\",\"count\":785144}, {\"owner\":\"babcock\", \"count\":9082943},{\"owner\":\"broadwood\",\"count\":375}]", "PLATFORM": "[{\"platform\":\"Spectrum Scale\",\"count\":376182496}]", "FILESYSTEM": "[{\"filesystem\":\"fs11-1m-me1\",\"count\":185493076}, {\"filesystem\":\"filesys1\",\"count\":10077057},{\"filesystem\":\"fs10-1m-me1\", \"count\":180612363}]", "CLASSIFICATION": "[{\"classification\":null,\"count\":376182496}]", "DEPARTMENT": "[{\"department\":null,\"count\":376182496}]", "CLUSTER": "[{\"cluster\":\"host.ibm.com\",\"count\":366105439},{\"cluster\": \"filesys1.university.edu\",\"count\":10077057}]", "TIER": "[{\"tier\":\"system\",\"count\":376182496}]", "ARCHIVE": "[{\"archive\":null,\"count\":376182496}]", "SITE": "[{\"site\":\"host\",\"count\":366105439},{\"site\":\"\",\"count\":10077057}]", "PROJECT": "[{\"project\":null,\"count\":376182496}]"}, "query_time_secs": 1.174846, "rows": "[{\"filesystem\":\"fs10-1m-me1\",\"owner\":\"_NULL_\",\"site\":\"host\", \"count\":468, \"sum\":52933001066},{\"filesystem\":\"fs10-1m-me1\",\"owner\":\"nobody\",\"site\": \"host\", \"count\":4,\"sum\":20976},{\"filesystem\":\"fs10-1m-me1\",\"owner\":\"root\", \"site\":\"host\",\"count\":180611891,\"sum\":2002705351263},{\"filesystem\": \"fs11-1m-me1\",\"owner\":\"_NULL_\", \"site\":\"host\",\"count\":521,\"sum\":58473947666}, {\"filesystem\":\"fs11-1m-me1\",\"owner\":\"nobody\",\"site\":\"host\",\"count\":4,\"sum\":20976},{\"filesystem\":\"fs11-1m-me1\", \"owner\":\"root\",\"site\":\"host\",\"count\":185492551,\"sum\":2174830065250}, {\"filesystem\":\"filesys1\",\"owner\":\"baldwin\",\"site\":\"\",\"count\":2, \"sum\":16388},{\"filesystem\":\"filesys1\",\"owner\":\"behr\",\"site\":\"\", \"count\":785144,\"sum\":5000771899189}, {\"filesystem\":\"filesys1\", \"owner\":\"boston\",\"site\":\"\",\"count\":4554,\"sum\":57670240959030}, {\"filesystem\":\"filesys1\",\"owner\":\"beale\",\"site\":\"\",\"count\":203545, \"sum\":69505800364825},{\"filesystem\":\"filesys1\",\"owner\":\"root\",\"site\": \"\",\"count\":490,\"sum\":265209686947},{\"filesystem\":\"filesys1\",\"owner\": \"broadwood\",\"site\":\"\",\"count\":375,\"sum\":48214210142151},{\"filesystem\": \"filesys1\",\"owner\":\"benjamin\",\"site\":\"\",\"count\":2,\"sum\":3140759161}, {\"filesystem\":\"filesys1\",\"owner\":\"babcock\",\"site\":\"\",\"count\":9082943, \"sum\":48534270142857},{\"filesystem\":\"filesys1\",\"owner\":\"borgato\", \"site\":\"\",\"count\":2,\"sum\":4415704787}]","count": 15,"doc_count": 376182496 }
In the following code block, the information from the rows field of the response is reflowed so that it is easier to read. The sum column is omitted. You can see that the rows are grouped by file system, owner, and site and are sorted by file system and owner:
"rows": "[
{\"filesystem\":\"fs10-1m-me1\", \"owner\":\"_NULL_\", \"site\":\"host\", \"count\":468,
{\"filesystem\":\"fs10-1m-me1\", \"owner\":\"nobody\", \"site\":\"host\", \"count\":4,
{\"filesystem\":\"fs10-1m-me1\", \"owner\":\"root\", \"site\":\"host\", \"count\":180611891,
{\"filesystem\":\"fs11-1m-me1\", \"owner\":\"_NULL_\", \"site\":\"host\", \"count\":521,
{\"filesystem\":\"fs11-1m-me1\", \"owner\":\"nobody\", \"site\":\"host\", \"count\":4,
{\"filesystem\":\"fs11-1m-me1\", \"owner\":\"root\", \"site\":\"host\", \"count\":185492551,
{\"filesystem\":\"filesys1\", \"owner\":\"baldwin\", \"site\":\"\", \"count\":2,
{\"filesystem\":\"filesys1\", \"owner\":\"behr\", \"site\":\"\", \"count\":785144,
{\"filesystem\":\"filesys1\", \"owner\":\"boston\", \"site\":\"\", \"count\":4554,
{\"filesystem\":\"filesys1\", \"owner\":\"beale\", \"site\":\"\", \"count\":203545,
{\"filesystem\":\"filesys1\", \"owner\":\"root\", \"site\":\"\", \"count\":490,
{\"filesystem\":\"filesys1\", \"owner\":\"broadwood\", \"site\":\"\", \"count\":375,
{\"filesystem\":\"filesys1\", \"owner\":\"benjamin\", \"site\":\"\", \"count\":2,
{\"filesystem\":\"filesys1\", \"owner\":\"babcock\", \"site\":\"\", \"count\":9082943,
{\"filesystem\":\"filesys1\", \"owner\":\"borgato\", \"site\":\"\", \"count\":2,
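In the response above, the rows field and each facet_tree entry are themselves JSON documents stored as strings, so a client has to decode them a second time. The following Python sketch shows one way to do that. The function name decode_search_response is hypothetical, and the sample is a heavily abbreviated response that reuses two rows from the output above.

```python
import json


def decode_search_response(text):
    """Decode a search response, including the string-encoded nested fields."""
    response = json.loads(text)
    # rows is a JSON array serialized into a string.
    response["rows"] = json.loads(response["rows"])
    # Every facet_tree value is likewise a string-encoded JSON array.
    response["facet_tree"] = {
        name: json.loads(encoded)
        for name, encoded in response.get("facet_tree", {}).items()
    }
    return response


# Abbreviated sample built from values in the response shown above.
sample = json.dumps({
    "facet_tree": {"OWNER": json.dumps([{"owner": "nobody", "count": 8}])},
    "query_time_secs": 1.174846,
    "rows": json.dumps([
        {"filesystem": "fs10-1m-me1", "owner": "_NULL_", "site": "host",
         "count": 468, "sum": 52933001066},
        {"filesystem": "fs10-1m-me1", "owner": "nobody", "site": "host",
         "count": 4, "sum": 20976},
    ]),
})

decoded = decode_search_response(sample)
for row in decoded["rows"]:
    # Print the grouped columns without the sum column.
    print(row["filesystem"], row["owner"], row["site"], row["count"])
```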
- The following example shows how to search for duplicate files with a size greater than 1 MiB:
  - Step 1: Define the search parameters in a file named search.json:
    { "query": "", "filters": [ { "key": "size", "operator": ">", "value": 1048576 } ], "group_by": ["size", "filename"], "sort_by": [{"size": "asc"}, {"filename": "asc"}], "limit": 100 }
  - Step 2: Submit the request:
    curl -k -H 'Authorization: Bearer <token>' https://<spectrum_discover_host>/db2whrest/v1/search -X POST -d@search.json -H "Content-Type: application/json"
The following response is returned:
{ "facet_tree":{"OWNER": "[{\"owner\":\"behr\",\"count\":160096},{\"owner\":\"babcock\", \"count\":14609},{\"owner\":\"beale\",\"count\":612},{\"owner\":\"root\",\"count\":34}]", "DEPARTMENT":"[{\"department\":null,\"count\":175351}]","FILESYSTEM": "[{\"filesystem\": \"filesys1\",\"count\":175351}]", "PROJECT": "[{\"project\":\"TCGA_kirc\",\"count\":86}, {\"project\":\"TCGA_ucs\",\"count\":14},{\"project\":\"TCGA_stad\",\"count\":21}, {\"project\":\"TCGA_lusc\",\"count\":20},{\"project\":\"acc\",\"count\":2},{\"project\": \"TCGA_meso\",\"count\":54},{\"project\":\"TCGA_skcm\",\"count\":107},{\"project\": \"TCGA_ov\",\"count\":81},{\"project\":\"kich\",\"count\":10},{\"project\":\"TCGA_lgg\", \"count\":137},{\"project\":\"TCGA_prad\",\"count\":662},{\"project\":\"TCGA_laml\", \"count\":6},{\"project\":\"TCGA_thym\",\"count\":54},{\"project\":\"TCGA_thca\",\"count\": 255},{\"project\":\"kirp\",\"count\":256},{\"project\":\"hnsc\",\"count\":22},{\"project\": \"TCGA_pcpg\",\"count\":2},{\"project\":\"Eichler\",\"count\":877},{\"project\": \"TCGA_ucec\",\"count\":238},{\"project\":\"TCGA_luad\",\"count\":1417},{\"project\": \"Level1\",\"count\":6},{\"project\":\"brca\",\"count\":14},{\"project\":null,\"count\": 168855},{\"project\":\"TCGA_paad\",\"count\":196},{\"project\":\"TCGA_read\",\"count\":244} ,{\"project\":\"cesc\",\"count\":613},{\"project\":\"coad\",\"count\":176},{\"project\": \"dlbc\",\"count\":20},{\"project\":\"blca\",\"count\":471},{\"project\":\"TCGA_sarc\", \"count\":90},{\"project\":\"TCGA_lihc\",\"count\":206},{\"project\":\"gbm\",\"count\":120}, {\"project\":\"esca\",\"count\":19}]","CLASSIFICATION": "[{\"classification\":null, \"count\":175351}]","ARCHIVE": "[{\"archive\":null,\"count\":175351}]"},"query_time_secs": 1.377808,"rows": "[{\"size\":1048608,\"filename\":\"NA-C-ms22-cm0.mcdat\",\"count\":4106, \"sum\":4305584448},{\"size\":1048608,\"filename\":\"asm_g100-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},{\"size\":1048608,\"filename\":\"asm_g1083-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},{\"size\":1048608,\"filename\":\"asm_g111-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},{\"size\":1048608,\"filename\":\"asm_g1122-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},{\"size\":1048608,\"filename\":\"asm_g134-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},{\"size\":1048608,\"filename\":\"asm_g145-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},{\"size\":1048608,\"filename\":\"asm_g147-C-ms22-cm0.mcdat\", \"count\":3,\"sum\":3145824},{\"size\":1048608,\"filename\":\"asm_g149-C-ms22-cm0.mcdat\", \"count\":4,\"sum\":4194432},{\"size\":1048608,\"filename\":\"asm_g151-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},{\"size\":1048608,\"filename\":\"asm_g153-C-ms22-cm0.mcdat\", \"count\":2, ... 8583},{\"size\":1053910,\"filename\":\"asm_g2981.asm\",\"count\":2,\"sum\":2107820}, {\"size\":1053914,\"filename\":\"seqDB.v006.dat\",\"count\":2,\"sum\":2107828}, {\"size\":1054721,\"filename\":\"asm_g3384.asm\",\"count\":2,\"sum\":2109442}]","count": 100,"doc_count":4318 }
In the following code block, the information from the rows field of the response is reflowed so that it is easier to read. You can see that the rows are grouped by size and file name and that they are sorted by size and file name in ascending order:
"rows": "[
{\"size\":1048608,\"filename\":\"NA-C-ms22-cm0.mcdat\", \"count\":4106,\"sum\":4305584448},
{\"size\":1048608,\"filename\":\"asm_g100-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},
{\"size\":1048608,\"filename\":\"asm_g1083-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},
{\"size\":1048608,\"filename\":\"asm_g111-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},
{\"size\":1048608,\"filename\":\"asm_g1122-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},
{\"size\":1048608,\"filename\":\"asm_g134-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},
{\"size\":1048608,\"filename\":\"asm_g145-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},
{\"size\":1048608,\"filename\":\"asm_g147-C-ms22-cm0.mcdat\", \"count\":3,\"sum\":3145824},
{\"size\":1048608,\"filename\":\"asm_g149-C-ms22-cm0.mcdat\", \"count\":4,\"sum\":4194432},
{\"size\":1048608,\"filename\":\"asm_g151-C-ms22-cm0.mcdat\", \"count\":2,\"sum\":2097216},
{\"size\":1048608,\"filename\":\"asm_g153-C-ms22-cm0.mcdat\", \"count\":2,
...
{\"size\":1053910,\"filename\":\"asm_g2981.asm\", \"count\":2,\"sum\":2107820},
{\"size\":1053914,\"filename\":\"seqDB.v006.dat\", \"count\":2,\"sum\":2107828},
{\"size\":1054721,\"filename\":\"asm_g3384.asm\", \"count\":2,\"sum\":2109442}
]",
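One use for the grouped duplicate results is to estimate how much space the extra copies occupy. The following Python sketch rests on an assumption that the source does not state outright: count is the number of files in each size and file name group, and sum is their combined size in bytes. The sample rows above are consistent with that reading, because sum equals size multiplied by count in each of them. The function name reclaimable_bytes is hypothetical.

```python
def reclaimable_bytes(rows):
    """Bytes occupied by the extra copies if one file per group is kept.

    Assumes count is the number of files in the group and size is the
    size of each file in bytes.
    """
    return sum(row["size"] * (row["count"] - 1) for row in rows if row["count"] > 1)


# Two rows copied from the decoded rows field of the response above.
rows = [
    {"size": 1048608, "filename": "asm_g147-C-ms22-cm0.mcdat", "count": 3, "sum": 3145824},
    {"size": 1053910, "filename": "asm_g2981.asm", "count": 2, "sum": 2107820},
]

# Check the assumption that sum is size multiplied by count for these rows.
assert all(row["sum"] == row["size"] * row["count"] for row in rows)

total = reclaimable_bytes(rows)
print(f"{total} bytes are held by extra copies")
```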
- The following example shows how to perform a non-grouped search (record-level results) for files owned by benjamin:
  - Step 1: Define the search parameters in a file named search.json:
    { "query": "", "filters": [ { "key": "owner", "operator": "=", "value": "benjamin" } ], "group_by": [], "sort_by": [], "limit": 100 }
  - Step 2: Submit the following request:
    curl -k -H 'Authorization: Bearer <token>' https://<spectrum_discover_host>/db2whrest/v1/search -X POST -d@search.json -H "Content-Type: application/json"
The following response is returned:
* Connection #0 to host localhost left intact
{ "query_time_secs": 1422.651552,"rows": "[{\"filesystem\":\"filesys1\",\"revision\":\"MO1\", \"site\":\"\",\"platform\":\"Spectrum Scale\",\"cluster\":\"filesys1.university.edu\", \"inode\":257173,\"owner\":\"benjamin\",\"group\":\"iacs\",\"permissions\":\"-r--r--r--\", \"fileset\":\"hg19\",\"uid\":null,\"gid\":null,\"path\":\"\\/filesys1\\/hg19\\/\", \"filename\":\"Homo_sapiens_assembly19.fasta.fai\",\"filetype\":\"fasta\",\"migstatus\": \"resdnt\",\"migloc\":\"NA\",\"mtime\":\"2014-08-08T19:37:13.000Z\",\"atime\": \"2017-08-27T22:21:11.000Z\",\"ctime\":\"2016-02-22T19:54:30.000Z\",\"inserttime\": \"2018-08-02T15:47:25.000Z\",\"tier\":\"system\",\"size\":2780,\"qpart\":1,\"fkey\": \"filesys1.university.edufilesys1257173\",\"project\":null,\"department\":null, \"archive\":null,\"classification\":null,\"tag5\":null,\"tag6\":null,\"tag7\":null, \"tag8\":null,\"tag9\":null,\"tag10\":null,\"tag11\":null,\"tag12\":null,\"tag13\":null, \"tag14\":null,\"tag15\":null,\"tag16\":null},{\"filesystem\":\"filesys1\",\"revision\": \"MO1\",\"site\":\"\",\"platform\":\"Spectrum Scale\",\"cluster\": \"filesys1.university.edu\",\"inode\":322176,\"owner\":\"benjamin\",\"group\":\"iacs\", \"permissions\":\"-r--r--r--\",\"fileset\":\"hg19\",\"uid\":null,\"gid\":null,\"path\": \"\\/filesys1\\/hg19\\/\",\"filename\":\"Homo_sapiens_assembly19.fasta\",\"filetype\": \"fasta\",\"migstatus\":\"resdnt\",\"migloc\":\"NA\",\"mtime\": \"2014-08-08T19:37:13.000Z\",\"atime\":\"2017-08-27T22:21:15.000Z\",\"ctime\": \"2016-02-22T19:44:09.000Z\",\"inserttime\":\"2018-08-02T15:47:25.000Z\",\"tier\": \"system\",\"size\":3140756381,\"qpart\":4, \"fkey\":\"filesys1.university.edufilesys1322176\",\"project\":null,\"department\":null, \"archive\":null,\"classification\":null,\"tag5\":null,\"tag6\":null,\"tag7\":null,\"tag8\":null, \"tag9\":null,\"tag10\":null,\"tag11\":null,\"tag12\":null,\"tag13\":null,\"tag14\":null, \"tag15\":null,\"tag16\":null}]","doc_count": 2,"count": 2,"facet_tree": {"OWNER": "[{\"owner\":\"benjamin\",\"count\":2.0}]","FILESYSTEM": "[{\"filesystem\":\"filesys1\", \"count\":2.0}]","ARCHIVE": "[{\"archive\":null,\"count\":2.0}]","CLUSTER": "[{\"cluster\":\"filesys1.university.edu\",\"count\":2.0}]","SITE": "[{\"site\":\"\", \"count\":2.0}]","CLASSIFICATION": "[{\"classification\":null,\"count\":2.0}]", "DEPARTMENT": "[{\"department\":null,\"count\":2.0}]","PLATFORM": "[{\"platform\": \"Spectrum Scale\",\"count\":2.0}]","TIER": "[{\"tier\":\"system\",\"count\":2.0}]", "PROJECT": "[{\"project\":null,\"count\":2.0}]"} }
In the following code block, the information from the rows field of the response is reflowed for better viewing. Many columns are omitted. You can see that the response returns two rows in which the owner field is benjamin:
"rows": "[
{\"filesystem\":\"filesys1\",\"revision\":\"MO1\",\"site\":\"\", ...\"owner\":\"benjamin\", ...
{\"filesystem\":\"filesys1\",\"revision\":\"MO1\",\"site\":\"\", ...\"owner\":\"benjamin\", ...}
]",
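Record-level rows carry many columns, so a client usually extracts only the few it needs. The following Python sketch reduces each record to a full path, a size, and a parsed modification time. The function names are hypothetical, the sample keeps only a handful of the columns from the response above, and the timestamp format string is inferred from the sample values rather than taken from a specification.

```python
import json
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse timestamps shaped like 2014-08-08T19:37:13.000Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def summarize_records(response_text):
    """Return the full path, size, and mtime for each record-level row."""
    response = json.loads(response_text)
    records = json.loads(response["rows"])  # rows is a string-encoded array
    return [
        {
            "file": record["path"] + record["filename"],
            "size": record["size"],
            "modified": parse_timestamp(record["mtime"]),
        }
        for record in records
    ]


# Abbreviated sample using a few columns from the two rows shown above.
sample_rows = [
    {"owner": "benjamin", "path": "/filesys1/hg19/",
     "filename": "Homo_sapiens_assembly19.fasta.fai",
     "size": 2780, "mtime": "2014-08-08T19:37:13.000Z"},
    {"owner": "benjamin", "path": "/filesys1/hg19/",
     "filename": "Homo_sapiens_assembly19.fasta",
     "size": 3140756381, "mtime": "2014-08-08T19:37:13.000Z"},
]
sample_response = json.dumps(
    {"rows": json.dumps(sample_rows), "doc_count": 2, "count": 2}
)

summary = summarize_records(sample_response)
for entry in summary:
    print(entry["file"], entry["size"], entry["modified"].isoformat())
```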