DB2 JSON capabilities, Part 2
Using the command-line processor
How to set up and use the DB2 JSON command-line interface
Before you start
DB2 JSON is available with DB2 for Linux, UNIX, and Windows 10.5 and with the IBM DB2 Accessories Suite for z/OS V3.1 to work with DB2 for z/OS V10. It provides the following features, as illustrated in Figure 1:
- A command-line shell to administer and query JSON data.
- A Java™ API for application development.
- A wire listener to accept and respond to requests sent over the network.
Figure 1. DB2 JSON components

Learn what to expect from this article and how to get the most out of it
This article introduces basic features of DB2 to manage JSON data and execute queries in the command-line shell. It guides you through the following tasks:
- Setup
- Work with collections and documents
- Administration
- Cleanup
For a general overview, as well as details about the other components, see the other articles in this series.
Setup
System prerequisites
To complete the following steps, you must have at least IBM DB2 10.5 for Linux, UNIX, and Windows installed on the system, or use the IBM DB2 Accessories Suite for z/OS V3.1 to enable the use of DB2 for z/OS V10 as a JSON datastore. Apply the following steps as applicable for your database platform.
IBM DB2 on Linux, Unix and Windows
To simplify managing authorization requirements for the examples in this article, make sure you have DBADM authority for the database you want to use. If you create the database as described in this section, the necessary authority is automatically assigned.
Create a database
In this step, a database is created that is suitable for the subsequent examples. Use a DB2 command window or your preferred DB2 administration tool to execute the command, as shown in Listing 1.
Listing 1. Command to create the database
CREATE DATABASE myjsondb automatic storage yes using codeset utf-8 territory US collate using system pagesize 32 k
Also, note the database server host name or IP address, and the port number. They are needed for the following steps in Preparing and starting the JSON command-line processing environment.
IBM DB2 for z/OS
Applications on a distributed system platform can use the DB2 JSON API or the DB2 JSON wire listener together with a DB2 Connect JDBC driver to access DB2 for z/OS as a JSON datastore.
The following components are required for this deployment scenario:
- A DB2 10.5 FP3 (or later) for Linux, Unix, and Windows server or client installation with the DB2 JSON and DB2 Driver components. After installation these components are located in SQLLIB/json.
- The DB2 Accessories Suite for z/OS V3.1.
- DB2 for z/OS Version 10 with required APARs. See APAR II14727 for details about prerequisites and installation of DB2 JSON with DB2 for z/OS.
Catalog the database on the distributed system and note the connection information or the URL of the DB2 database on z/OS for the following step.
Preparing and starting the JSON command-line processing environment
The command-line processor is started with the db2nosql script (located in the <db2home>/sqllib/json/bin directory) and requires database connection information. To ensure that dependencies are resolved, verify the following:
- A Java Runtime Environment (at minimum JRE 1.5) is included in the PATH.
- The CLASSPATH includes the JDBC driver (db2jcc.jar version 3.66 or higher or db2jcc4.jar).
The script assumes the database is on localhost:50000 unless you specify a different location or port with the -hostName and -port options. Use the -help option for additional details, as shown in Listing 2.
Listing 2. Examples for using the db2nosql script
db2nosql -help
db2nosql -db bobdb -user bob -password mypassword
db2nosql -hostName bob.bobhome.com -port 50003 -db bobdb -user bob -password mypwd
Script argument prompting:
- The database name is mandatory; the script prompts for it if it is not provided.
- If a username but no password is provided, the script prompts for the password.
- If neither a username nor a password is provided, the script attempts a JDBC Type 2 connection, which uses the login user of the operating system.
Tip: Copy the script and edit the copy to provide hardcoded values to simplify starting the processing environment for a specific setup.
The JSON command-line processor will show a "nosql" prompt as shown in the sample excerpt in Listing 3.
Listing 3. Command-line processor
nosql>Type your JSON query and end it with <ENTER>
nosql>Type help() or help for usage information
Enable the DB2 database
For DB2 on Linux, Unix and Windows, run the script, and in the command-line processing environment, type enable(true) to prepare the database for DB2 JSON commands. This step will add the necessary system objects and is only required once for a database, as shown in Listing 4.
Listing 4. Enable the database
nosql>enable(true)
Executing SQL...
Database Artifacts created successfully
nosql>
Using the enable(false) option will print the associated DDL, but won't execute the command.
With DB2 for z/OS, enabling a database will be performed as a one-time installation task by a DB2 for z/OS Database or System Administrator. The enablement via the db2nosql command line tool is not supported for z/OS. See APAR PM98357 for details about enablement of DB2 JSON with DB2 for z/OS.
After the database is enabled, start the db2nosql script with the connection information or URL of the DB2 database on z/OS.
Select the JSON namespace
Using DB2 as a JSON document store allows the definition of multiple JSON namespaces by using DB2 SQL schemas as qualifiers. By default, the name TEST is set as the JSON namespace; however, it is good practice to select a custom name with the use <name> command. This JSON namespace (that is, the DB2 SQL schema name) will be used for the active session until a new JSON namespace is selected, as shown in Listing 5.
Listing 5. Show and set JSON namespaces
nosql>dbs
SCHEMAS: [TEST]
nosql>use myown
Switched to schema MYOWN
nosql>dbs
SCHEMAS: [MYOWN,TEST]
Note that the namespace (schema name) is case-insensitive.
To verify your current DB2 database connection, type db in the nosql command-line processor, as shown in Listing 6.
Listing 6. Show current database connection
nosql>db
Database: jdbc:db2://localhost:50000/myjsondb Schema: TEST
Prefix commands with db to work with the selected namespace or collections in the namespace. The db prefix is a convenience shortcut and will be mapped to build fully qualified names when executing a command.
Working with collections and documents
Store and query documents with implicitly created collections
With the DB2 JSON feature, JSON documents are organized in collections. JSON collections do not enforce a collection schema or document structure, however, usually documents in a collection share common characteristics to make it easier to search for (and find) data.
In contrast to relational tables, it is therefore not necessary to define a table structure for collections. To insert a document, it is sufficient to specify the name of the collection; if this collection does not already exist, it will be created automatically. In such a case, if the document contains an identifier, that is, a field tagged with the attribute name _id, that field is used as the unique identifier, and all new documents are expected to contain an _id of the same datatype.
If no such attribute is present in the initial document, the system uses generated object identifiers to uniquely identify each document. If documents that are inserted later contain an _id field, its datatype must fit into a VARCHAR(12) FOR BIT DATA field; otherwise the document is rejected.
To insert a document, use the command db.<collectionName>.insert(<document>), as shown in Listing 7.
Listing 7. Insert a document (formatted for readability)
nosql>db.books.insert({
  isbn: "123-456-789",
  author: "Verne, Jules",
  title: "Journey to the Center of the Earth",
  abstract: "Classic science fiction novel in an unusual setting",
  price: 6.00,
  pages: 276,
  category: "Fantasy",
  sales: 500.50
})
The previous sample document contains three different data types: string (such as "Verne, Jules"), number (such as 6.00), and integer (such as 276). Since it does not contain a field tagged with the attribute name _id, the system uses a generated object identifier to uniquely identify the document.
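The _id handling described above can be sketched as a toy Python model. This is illustrative only (the function name and in-memory structure are not part of DB2): the first document fixes the _id datatype for the collection, and later documents must conform.

```python
import os

def insert(collection, document):
    """Toy model of DB2 JSON insert semantics: if no _id is given, a
    12-byte object identifier is generated; the first document's _id
    type becomes mandatory for all later documents."""
    if "_id" not in document:
        # Mimic an auto-generated binary object identifier.
        document = dict(document, _id=os.urandom(12))
    id_type = collection.setdefault("_id_type", type(document["_id"]))
    if not isinstance(document["_id"], id_type):
        raise TypeError("document _id does not match the collection's _id type")
    collection.setdefault("docs", []).append(document)

books = {}
insert(books, {"isbn": "123-456-789", "author": "Verne, Jules"})
```

Inserting a document whose _id is, say, a string into this collection would now raise an error, which mirrors the rejection shown later in Listing 9.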
Table 1 contains a list of commonly occurring data types. See the DB2 JSON reference documentation for references containing a complete list.
Table 1. Commonly occurring data types
| Datatype | Index allowed | Comment | Example |
|---|---|---|---|
| $string | Y | String | |
| $int, $integer, $long | Y | integer, long | $int: 123 |
| $number | Y | double, float | $number: 123.45 |
| $date | Y | must be in format 'yyyy-mm-ddThh:mm:ssZ' | $date: '2013-05-18T18:56:00Z' |
| $timestamp | Y | timestamp | |
| $binary | Y | byte array | |
| $oid | Y | object identifier (binary) | |
You can query the data with the find() command, as shown in Listing 8. For the example document, a collection named books has been created with an automatically generated binary object identifier. Note that the collection name is case-sensitive.
Listing 8. List all documents
nosql>db.books.find()
nosql>Row 1:
nosql> {
nosql> "_id":{"$oid":"519b8727cd1552ed65b47a20"},
nosql> "isbn":"123-456-789",
nosql> "author":"Verne, Jules",
nosql> "title":"Journey to the Center of the Earth",
nosql> "abstract":"Classic science fiction novel in an unusual setting",
nosql> "price":6,
nosql> "pages":276,
nosql> "category":"Fantasy",
nosql> "sales": 500.5
nosql> }
For new documents, the datatype of the _id must conform to the _id datatype of the collection, in this example the binary datatype. An attempt to insert a new document with an explicit _id that does not match the expected datatype will result in an error, as shown in Listing 9.
Listing 9. Error case: Insert a document with wrong _id data type (formatted)
nosql>db.books.insert({
  _id: "123-456-788",
  author: "Verne, Jules",
  title: "Journey to the Center of the Earth",
  abstract: "Classic science fiction novel in an unusual setting",
  price: 6.00,
  pages: 276,
  category: "Fantasy",
  sales: 500.50
})
nosql> Error:[nosql][1.0.146] DBException; Caused by: [jcc][1083][10403][3.66.33] Illegal conversion: can not convert from "java.lang.String" to "byte[]" ERRORCODE=-4474, SQLSTATE=null
Create collections explicitly
In the previous example, the collection was created automatically with the first document. However, automatically created collections use default settings, and it is therefore often desirable to create collections explicitly with custom settings, as shown in Listing 10.
Listing 10. Create a collection named 'media' with _id's of type long
nosql>db.createCollection("media", {_id: "$long"})
Collection: TEST."media" created. Use db.media.
Note that only the document _id, which is used as unique identifier, is specified in the definition of the collection. JSON documents are stored schema-less and a document schema definition is therefore not necessary.
Custom settings can also be used to enable DB2 features (like compression and time travel), to assign a tablespace and control bufferpool usage.
Note that bi-temporal capabilities and time travel queries are not supported for JSON collections in DB2 for z/OS. Some restrictions also apply to collection storage characteristics, for example, the TABLESPACE option is not currently supported in the createCollection command with DB2 for z/OS.
See the Related topics section for a link to the DB2 Information Center to learn more about tablespace and bufferpool management, compression, and time travel features. To learn more about DB2 JSON applications, see the other articles in this series.
Import documents
Apart from the insert command for single documents, it is also possible to import data from a file. Download the sample file from Downloadable resources to execute the following steps, as shown in Listing 11.
Listing 11. Import documents from a file
nosql>db.books.importFile("books_import.js")
14 objects were imported
Tip: If the command returns a 'file is not found' error, use the absolute file location.
The import command also allows you to set a commit frequency, as shown in Listing 12. Larger batches may improve throughput.
Listing 12. Import documents in batches
nosql>db.books.importFile("books_import.js", 100)
The imported file must contain valid documents in JSON notation. If the file contains invalid documents, an error will be reported for each such malformed document.
JSON collections do not enforce a collection schema or document structure. However, applications typically expect some structural commonality of the documents. That is, JSON collections are not totally schema-less; instead they provide document schema flexibility. To get information about the possible structure of a collection, use the sampleSchema() command. This command analyzes a subset of documents and returns a list of attributes and the count of their occurrences, as shown in Listing 13.
Listing 13. Find information about the document structure
nosql>db.books.sampleSchema()
{
  "._id":"15;type:ObjectId",
  ".abstract":"15;type:String",
  ".author":"15;type:String",
  ".category":"12;type:String",
  ".isbn":"15;type:String",
  ".pages":"14;type:Integer",
  ".price":"14;type:Double",
  ".sales":"13;type:Double",
  ".title":"15;type:String",
  ".year_published":"5;type:Integer"
}
In this example, most attributes are present in all documents; the exceptions are the 'category', 'sales', 'year_published', 'pages' and 'price' attributes.
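The kind of analysis sampleSchema() performs can be sketched in a few lines of Python: count how often each attribute occurs across a sample of documents and note a representative type. The function name and output format here are illustrative, not the DB2 implementation.

```python
from collections import Counter

def sample_schema(docs):
    """Count attribute occurrences and record a representative type,
    loosely mimicking the sampleSchema() output shown in Listing 13."""
    counts, types = Counter(), {}
    for doc in docs:
        for key, value in doc.items():
            counts[key] += 1
            types.setdefault(key, type(value).__name__)
    return {k: f"{counts[k]};type:{types[k]}" for k in counts}

docs = [
    {"isbn": "1", "pages": 100},
    {"isbn": "2", "pages": 200, "category": "Fantasy"},
]
print(sample_schema(docs))
# {'isbn': '2;type:str', 'pages': '2;type:int', 'category': '1;type:str'}
```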
Query JSON documents
The find command db.<collection>.find(<conditions>, <projection list>) allows the selection of interesting documents by specifying conditions (similar to a SQL WHERE clause), as shown in Listing 14, and enables tailoring the resulting output (similar to a SQL SELECT list).
Listing 14. Search for documents with a specific author name
nosql>db.books.find({author: "Tolkien, J.R"})
The example above returns all attributes for books by J.R. Tolkien, including the automatically generated identifier. To select which attributes of the documents should be returned, the query can be submitted with a projection list that indicates which attributes to include or exclude from the results. Use the following:
- 1 – to include the attribute
- 0 – to exclude the attribute
Tip: You can use either inclusion or exclusion, not both, in the same projection - with one exception: with a list of included fields, you can still exclude the _id, as shown in Listing 15.
Listing 15. Select specific document attributes
Search for titles and prices of the documents with this author:
nosql>db.books.find({author: "Tolkien, J.R"}, {title: 1, price:1})
Search for titles and prices of the documents with this author in this category:
nosql>db.books.find({author: "Tolkien, J.R", category: "Fantasy"}, {title: 1, price:1})
Search for titles and prices of the documents with this author in this category, but exclude the _id:
nosql>db.books.find({author: "Tolkien, J.R", category: "Fantasy"}, {_id: 0, title: 1, price:1})
Note that the _id is automatically included unless it is explicitly excluded.
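The projection rules just described (inclusion with 1, exclusion with 0, and the special treatment of _id) can be modeled with a small Python sketch. The function name is illustrative; this is a model of the behavior, not DB2's code.

```python
def project(doc, projection):
    """Apply a find() projection list to one document.
    Inclusion (1) and exclusion (0) must not be mixed, except that
    _id may be excluded alongside an inclusion list."""
    include = {k for k, v in projection.items() if v == 1}
    if include:
        result = {k: doc[k] for k in include if k in doc}
        if projection.get("_id", 1) != 0 and "_id" in doc:
            result["_id"] = doc["_id"]  # _id is included unless excluded
        return result
    # Pure exclusion list: keep everything not marked 0.
    return {k: v for k, v in doc.items() if projection.get(k, 1) != 0}

doc = {"_id": 1, "title": "The Hobbit", "price": 5.0, "pages": 216}
print(project(doc, {"title": 1, "price": 1}))            # keeps _id
print(project(doc, {"_id": 0, "title": 1, "price": 1}))  # drops _id
```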
You can use comparison operators and join operators to select documents, as shown in Listing 16.
Listing 16. Use comparison values to search for documents
Search author, sales and price for books with sales less than 300, exclude the _id:
nosql>db.books.find({sales: {$lt:300}},{_id:0, author:1, sales:1, price:1})
The most common operators are listed in Table 2.
Table 2. Comparison and join operators
| Operator | Usage | Example |
|---|---|---|
| $eq | equals | author: { $eq : "Lindgren, Astrid" } |
| | equals as default notation | author: "Lindgren, Astrid" |
| $lte | less or equal | pages: { $lte : 200 } |
| $lt | less than | price: { $lt : 6.50 } |
| $gte | greater or equal | author.lastname: { $gte : "Doe" } |
| $gt | greater than | rating: { $gt : 3 } |
| $ne | not equal | status: { $ne : 1 } |
| $in | member of a set of values | year_published: { $in : [1990, 1991, 1992] } |
| $nin | not a member of a set of values | year_published: { $nin : [2012, 2013] } |
| $and | both must be true | $and:[{"rating":5},{"category":"Mystery"}] |
| | $and is implied by using a comma as separator | "category":"Mystery","rating":5 |
| $or | at least one must be true | $or:[{"rating":5},{"category":"Mystery"}] |
| $nor | neither must be true | $nor:[{"rating":3},{"category":"Mystery"}] |
| $not | logical not (do not match) | $not:{"rating":1} |
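To make the operator semantics in Table 2 concrete, here is a toy Python evaluator for a subset of them. It assumes the queried field is present in the document and handles only the comparison operators; it is a sketch of the matching behavior, not the DB2 query engine.

```python
def matches(doc, condition):
    """Evaluate a small subset of the query operators in Table 2
    against one document (illustrative only)."""
    ops = {
        "$eq": lambda a, b: a == b, "$ne": lambda a, b: a != b,
        "$lt": lambda a, b: a < b,  "$lte": lambda a, b: a <= b,
        "$gt": lambda a, b: a > b,  "$gte": lambda a, b: a >= b,
        "$in": lambda a, b: a in b, "$nin": lambda a, b: a not in b,
    }
    for field, expected in condition.items():
        value = doc.get(field)
        if isinstance(expected, dict):   # operator form, e.g. {"$lt": 300}
            if not all(ops[op](value, arg) for op, arg in expected.items()):
                return False
        elif value != expected:          # default notation means $eq
            return False
    return True

book = {"author": "Verne, Jules", "sales": 250, "year_published": 1991}
print(matches(book, {"sales": {"$lt": 300}}))                    # True
print(matches(book, {"year_published": {"$in": [1990, 1991]}}))  # True
```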
Control options for result sets
It is often useful to sort the output, for example, to list highest or lowest values first, or to order documents so values can be found more easily. Attributes in the JSON query can be sorted by listing one or more attributes in the sort() function, and specifying for each whether it should be sorted ascending (=1) or descending (=-1), as shown in Listing 17.
Listing 17. Search books and sort the result
Search books with sales less than 300, lowest price first:
nosql>db.books.find({sales: {$lt:300}}, {_id:0, author:1, sales:1, price:1}).sort({price: 1})
Search books with sales less than 300, sort on lowest price and largest number of pages within:
nosql>db.books.find({sales: {$lt:300}}, {_id:0, author:1, pages:1, price:1}).sort({price: 1, pages: -1})
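A multi-key sort like sort({price: 1, pages: -1}) can be sketched in Python using a stable sort: apply the least significant key first, then the more significant ones. The helper name is illustrative.

```python
def multi_sort(docs, spec):
    """Sort documents on several keys with 1 = ascending, -1 = descending,
    mimicking sort({price: 1, pages: -1}). Relies on Python's stable sort:
    later (less significant) keys are applied first."""
    result = list(docs)
    for field, direction in reversed(list(spec.items())):
        result.sort(key=lambda d: d[field], reverse=(direction == -1))
    return result

books = [
    {"title": "A", "price": 6.0, "pages": 100},
    {"title": "B", "price": 5.0, "pages": 300},
    {"title": "C", "price": 5.0, "pages": 500},
]
ordered = multi_sort(books, {"price": 1, "pages": -1})
print([b["title"] for b in ordered])  # ['C', 'B', 'A']
```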
If many documents match the criteria, depending on the application needs it may be sufficient to retrieve a subset of potential results. To specify the maximum number of documents that should be returned, use the limit() function with the query, as shown in Listing 18.
Listing 18. Limit the number of results
nosql>db.books.find({sales: {$lt:300}},{_id:0, author:1, pages:1, sales:1}).limit(3)
Limiting the result set can also be used to page through a larger set of results by adding an offset to the query with the skip() function, as shown in Listing 19.
Listing 19. Page through results
nosql>db.books.find({sales: {$lt:300}}, {_id:0, author:1, pages:1, sales:1}).limit(3).skip(3)
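The skip()/limit() combination corresponds to plain slicing over an ordered result list, as this small Python sketch shows (illustrative only):

```python
def page(results, limit, skip=0):
    """skip() then limit() is plain slicing over an ordered result list."""
    return results[skip:skip + limit]

results = list(range(1, 11))     # ten matching documents
print(page(results, 3))          # first page: [1, 2, 3]
print(page(results, 3, skip=3))  # second page: [4, 5, 6]
```

Note that stable paging assumes a deterministic sort order; combine skip() with sort() when paging through query results.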
If only a single result is needed, the findOne() function can be used for convenience, as shown in Listing 20.
Listing 20. Find first matching document
nosql>db.books.findOne({author: "Tolkien, J.R"}, {title: 1, price:1})
Aggregate data
With large numbers of documents, it is often desirable to aggregate results for a group of documents. For this purpose, a variety of functions can be used to get information about the number of documents, or distinct values or computed aggregate values for groups of documents.
To retrieve the number of documents that match specified query criteria, use the count() function, as shown in Listing 21.
Listing 21. Count documents
Get the total count of documents in this collection:
nosql>db.books.count()
15
Get the number of documents for this author:
nosql>db.books.count({author: "Tolkien, J.R"})
2
To find distinct values for an attribute in a set of data, use the distinct() function for the attribute, as shown in Listing 22.
Listing 22. Find distinct values
nosql>db.books.distinct("author")
To apply a query, or use multiple key values, see the next article in this series: Using the DB2 JSON Java API.
The count() and distinct() functions are convenience functions for a specific purpose and can be used instead of the more general group() function. For example, distinct values can also be found with the group() function on an attribute, as shown in Listing 23.
Listing 23. Find distinct values with the group() command
nosql>db.books.group({"_id": {"author": 1}})
The group() function also allows the specification of a compound grouping key, and the definition of aggregation functions, for example, to calculate sums or averages, as shown in Listing 24.
Listing 24. Calculate the average price of books per author
nosql>db.books.group({"_id": {"author": 1}, "avgprice": {"$avg": "$price"}} )
For more complex queries, the aggregate() function can be used. It allows you to describe a query as a sequence of steps, such that intermediate results can be used again as input for later steps. For example, to calculate the average price per author within a category, the following query first selects all documents with category "Fantasy", then picks the attributes author and price from the documents, and finally groups on authors to calculate the average price, as shown in Listing 25.
Listing 25. Calculate the average price of books per author in a category (formatted)
nosql>db.books.aggregate(
  {$match: { category: "Fantasy" }},
  {$project: {author:1, price:1}},
  {$group: {_id: {author:1}, avgPrice:{$avg:"$price"}}}
)
nosql>Row 1:
nosql> {
nosql> "_id":"Tolkien, J.R",
nosql> "avgPrice":15.5
nosql> }
nosql>Row 2:
nosql> {
nosql> "_id":"Verne, Jules",
nosql> "avgPrice":6.35
nosql> }
nosql>2 rows returned
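The grouping-with-average step at the heart of Listings 24 and 25 can be modeled in Python with a few lines. The helper name and data are illustrative; this is a sketch of the semantics, not the DB2 aggregation engine.

```python
from collections import defaultdict

def group_avg(docs, key, value):
    """Group documents on one attribute and average another,
    mirroring group({_id: {author: 1}, avgprice: {$avg: "$price"}})."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[key]].append(doc[value])
    return {k: sum(v) / len(v) for k, v in buckets.items()}

books = [
    {"author": "Verne, Jules", "price": 6.0},
    {"author": "Verne, Jules", "price": 7.0},
    {"author": "Tolkien, J.R", "price": 15.5},
]
print(group_avg(books, "author", "price"))
# {'Verne, Jules': 6.5, 'Tolkien, J.R': 15.5}
```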
You can apply arithmetic, string, or date-time operations in the $project task. The following example uses the $divide operator to calculate the price per page after calculating sums and averages in a $group task for each author, as shown in Listing 26.
Listing 26. List authors and price per page sorted ascending (formatted)
nosql>db.books.aggregate(
  {$group: {"_id": {author:1}, "sumSales":{"$sum": "$sales"}, avgPages: {"$avg": "$pages"}, avgPrice: {"$avg": "$price"}}},
  {$project: {author: "$_id.author", sumSales: 1, pricePerPage: {$divide: ["$avgPages", "$avgPrice"]}}},
  {$sort: {pricePerPage: 1}}
)
Tip: When referring to actual or calculated attribute names in functions, make sure to prefix such attribute names with a $ and surround them with double quotes, as in "$avgPrice" for the $divide operation.
The aggregation function can contain single or multiple tasks and is not required to include a $group task. For example, you can use aggregate() for simple projection tasks with or without modifying values, as shown in Listing 27.
Listing 27. Select a substring of an attribute
nosql>db.books.aggregate({"$project": {author:1, title:1, shortabs: {"$substr": ["$abstract",0,10]} } })
The aggregation tasks only allow fields to be specified by inclusion. If attributes are explicitly selected via a projection or grouping step, they can be used in the next step under the reference name assigned in the previous step. All other fields are discarded.
Tip: The aggregate() function only returns the specified attributes. Unlike the find() function, it will not automatically include the generated _id in a $project task.
Tip: Make sure that calculated or renamed fields are projected before referencing them in a later task.
Valid tasks are shown in Table 3.
Table 3. Aggregation tasks
| Task | Description |
|---|---|
| $group | apply aggregate operations like sum or avg; requires a group key built from one or more attributes |
| $limit | limit the number of documents that should be returned |
| $match | use query criteria to select a subset of documents |
| $project | select fields from a document and (optionally) assign reference names |
| $skip | use an offset for paging |
| $sort | sort the output (ascending 1, descending -1) for the given attributes |
Table 4 shows a list of supported aggregation operations. See the DB2 JSON reference documentation for detailed usage.
Table 4. Aggregation operations
| Datatype(s) | Operations |
|---|---|
| $string | $concat, $max, $min, $substr, $toLower, $toUpper |
| $date, $timestamp | $year, $month, $week, $hour, $minute, $second, $dayOfYear, $dayOfMonth, $dayOfWeek |
| $int, $number | $add, $divide, $multiply, $subtract, $mod, $min, $max, $avg, $sum, $inc |
Tip: If you receive an error that a specific operator is not supported, review the syntax and ensure that the sequence is {operator: {field: value}}, for example, {"$gt": {rating: 1}}.
Update documents
The update() function updates data in one or more documents. It supports several optional arguments to determine the scope of the update. The usage is update(<condition>, <fields to update>, <upsertFlag>, <multiRowFlag>).
Caution: When specifying new field values, make sure to use the $set operator if you only want to update the specified fields in a document and leave other fields unchanged. Without this operator, the document content is replaced by the specified fields, as shown in Listings 28 and 29.
Listing 28. Update a document with the specified isbn using the $set operator
nosql>db.books.find({isbn: "123-456-234"}, {_id:0, isbn:1, author:1, title:1, pages:1})
Row 1:
{
  "isbn":"123-456-234",
  "author": "Verne, Jules",
  "title":"Journey to the Center of the Earth",
  "pages":276
}
nosql>db.books.update({isbn: "123-456-234"}, {$set: {pages: 299}})
Updated 1 rows.
nosql>db.books.find({isbn: "123-456-234"}, {_id:0, isbn:1, author:1, title:1, pages:1})
Row 1:
{
  "isbn":"123-456-234",
  "author": "Verne, Jules",
  "title":"Journey to the Center of the Earth",
  "pages":299
}
Listing 29. Update a document without the $set operator
db.books.update({isbn: "123-456-234"}, {isbn: "123-456-234", price: 6.50})
Updated 1 rows.
nosql>db.books.find({isbn: "123-456-234"}, {_id:0, isbn:1, author:1, title:1, price:1})
Row 1:
{
  "_id":{"$oid":"519e2bfb37a817b9cba77ab0"},
  "isbn":"123-456-234",
  "author": null,
  "title": null,
  "price":6.50
}
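The difference between the two update styles shown in Listings 28 and 29 can be modeled with plain Python dictionaries. This is a sketch of the semantics under the stated assumption that the _id always survives a replacement; the function name is illustrative.

```python
def update(doc, fields):
    """Model the caution above: with $set only the named fields change;
    without it the whole document body is replaced (the _id survives)."""
    if "$set" in fields:
        updated = dict(doc)
        updated.update(fields["$set"])
        return updated
    replaced = {"_id": doc.get("_id")}
    replaced.update(fields)
    return replaced

book = {"_id": 1, "isbn": "123-456-234", "author": "Verne, Jules", "pages": 276}
print(update(book, {"$set": {"pages": 299}}))                # author kept
print(update(book, {"isbn": "123-456-234", "price": 6.5}))   # author gone
```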
Further options to determine the scope of updates are the upsert and the multiRow flags. The upsert flag allows specifying whether missing documents should be inserted or rejected, that is, the flag indicates what the system should do if there is no match in the collection for a specified identifier:
- If the upsert flag is set to true, the missing document will be inserted.
- If the flag is set to false, the update will be rejected.
With the multi-row flag, you can also determine whether the update should only be applied to the first document that matches the search criteria, or all such matching documents, as shown in Listing 30.
Listing 30. Examples for options to update documents
Do not insert if no matching document exists, update the first document that matches the query:
nosql>db.books.update({author: "Climber, Joe"}, {$set: {price: 9.99}}, false, false)
Do not insert if no matching document exists, update all documents that match the query:
nosql>db.books.update({author: "Climber, Joe"}, {$set: {price: 9.99}}, false, true)
The save() method combines insert and update capabilities, but relies on the document identifier to determine the scope. If the document contains an _id, the document will be upserted. If the document does not contain an _id, a new identifier will be generated and the document inserted.
In the following example, the book document does not contain an _id, so a new identifier will be generated and the new book will be inserted into the books collection, as shown in Listing 31.
Listing 31. Save a document
nosql>db.books.save({isbn: "123-456-239", "author": "Verne, Jules", "title": "Mysterious Island" })
Work with indexes
If an attribute is often used as a selection or sort criteria in queries, it is useful to create an index to speed up retrieval times for larger data workloads. Indexes on JSON attributes can be created on single or multiple attributes. If no index name is specified, a default name will be assigned.
Note: For DB2 on Linux, Unix and Windows, indexes can be sorted in ascending and descending order, and indexing an element in an array is supported. With DB2 for z/OS, indexes can only be in ascending order and indexes on arrays are not supported.
To create an index, use the ensureIndex() command, as shown in Listing 32.
Listing 32. Create indexes
Create an index on field 'author' in ascending order, using the default type string with default length 50:
nosql>db.books.ensureIndex({"author": 1})
Index <books_xauthor> was created successfully.
Create an index on field 'category' with type string and field length 40:
nosql>db.books.ensureIndex({"category": [1, "$string", 40]})
Index <books_xcategory> was created successfully.
Create an index on field price with type number in descending order, name it 'mypriceidx':
nosql>db.books.ensureIndex({"price": [-1, "$number"]}, "mypriceidx")
Index <mypriceidx> was created successfully.
You can view a list of indexes for the collection with the getIndexes() command. Note that a unique index is automatically created on the _id attribute, as shown in Listing 33.
Listing 33. Index information including index on identifier (output formatted)
nosql>db.books.getIndexes()
[{"v":0,"key":{"_id_":1},"ns":"TEST.books","name":"_id_","unique":true},
 {"v":1,"key":{"author":1},"ns":"TEST.books","name":"books_xauthor","unique":false},
 {"v":2,"key":{"category":1},"ns":"TEST.books","name":"books_xcategory","unique":false},
 {"v":3,"key":{"price":1},"ns":"TEST.books","name":"mypriceidx","unique":false}]
Use the unique flag to create a unique index on an attribute. The system will attempt to create the unique index; if the collection already contains documents, you must ensure that there is no duplicate data in the specified fields, as shown in Listing 34.
Listing 34. Error case: Create unique index on duplicate data
nosql>db.books.ensureIndex({"isbn": [1, "$string", 30]}, "myisbnix", true)
nosql> Error:[nosql][1.0.66] Failed to create index. ; Caused by: A unique index cannot be created because the table contains data that would result in duplicate index entries. SQLCODE=-603, SQLSTATE=23515, DRIVER=3.66.33
To find duplicate entries, an aggregation query comes in handy, as shown in Listing 35.
Listing 35. Example to find duplicate data (formatted)
nosql>db.books.aggregate(
  {"$group": {_id: "isbn", countdocs: {"$sum": 1}}},
  {"$project": {"grouped_isbn": "_id", "countdocs": 1}},
  {"$match": {"countdocs": {$gt: 1}}}
)
Nested objects
Documents may also contain nested objects, for example, instead of a single string that contains the author information, the document might separate between first name and last name, as shown in Listing 36. See also books_importNested.js in the Download section.
Listing 36. Document with nested object
{
  isbn: "123-456-234",
  author: { lastname: "Verne", firstname: "Jules" },
  title: "Journey to the Center of the Earth",
  sales: 333.0
}
Each element in the document must be uniquely identifiable. For example, the element "title" can occur only once, as can the element "author.lastname", as shown in Listing 37. However, an attribute name may be the same if it is located in a different path. It is therefore valid to have another attribute named "lastname" as long as it has a separate absolute path in the document, for example, "editor.lastname".
Listing 37. Import and search documents with nested attributes
nosql>db.booksnest.importFile("books_importNested.js")
nosql>db.booksnest.findOne()
{
  "_id":{"$oid":"51b7a63d3503e8eca2e84556"},
  "isbn":"122-456-789",
  "author":{"lastname":"Tolkien","firstname":"J.R"},
  "title":"The Hobbit",
  "abstract":"Spiders and Dragons",
  "category":["Fantasy", "Fiction"],
  "price":5.0,
  "pages":216,
  "sales":800.8
}
To query for a nested object, use a dot-notation to specify the full path for the desired attribute. For example, to search on the author's last name, use "author.lastname", as shown in Listing 38.
Listing 38. Query on a nested object
nosql>db.booksnest.find({"author.lastname": "Verne"})
Reminder: Make sure to enclose field names in double-quotes.
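Dot-notation path resolution as described above can be sketched in Python: walk the document one path step at a time, indexing into arrays when a step is numeric. The function name is illustrative, not part of DB2.

```python
def resolve(doc, path):
    """Resolve a dot-notation path such as "author.lastname"; numeric
    path steps index into arrays, as in "category.0"."""
    current = doc
    for step in path.split("."):
        if isinstance(current, list):
            index = int(step)
            if index >= len(current):
                return None
            current = current[index]
        elif isinstance(current, dict):
            current = current.get(step)
        else:
            return None
        if current is None:
            return None
    return current

book = {"author": {"lastname": "Verne", "firstname": "Jules"},
        "category": ["Humor", "YA"]}
print(resolve(book, "author.lastname"))  # Verne
print(resolve(book, "category.1"))       # YA
```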
Arrays
Documents may also contain multiple occurrences of an object in the form of an array. For example, a user may have multiple phone numbers, or email addresses, or a book may belong to multiple categories or can have more than one author, as shown in Listing 39.
Listing 39. Use of an array to list multiple categories (formatted)
nosql>db.booksnest.insert({
  isbn: "876-543-321",
  author: {lastname: "Doe", firstname: "John"},
  title: "A new book",
  abstract: "The most curious items",
  pages: 111,
  category: ["Humor", "YA"]
})
When you search on the attribute without specifying an array position, the value will be searched for in all positions, as shown in Listing 40.
Listing 40. Search value in an array
nosql>db.booksnest.find({category: "Humor"})
Row 1:
{
  "_id":{"$oid":"51a293e437a81d6b7f8c1971"},
  "isbn":"876-543-321",
  "author":{"lastname":"Doe","firstname":"John"},
  "title":"A new book",
  "abstract":"The most curious items",
  "pages":111,
  "category":["Humor","YA"]
}
However, you can also search on a specific position by adding the array position to the attribute in the dot-notation, as shown in Listing 41.
Listing 41. Search an array value in a specific array position
nosql>db.booksnest.find({"category.0": "Humor"})
Row 1:
{
  "_id":{"$oid":"51a293e437a81d6b7f8c1971"},
  "isbn":"876-543-321",
  "author":{"lastname":"Doe","firstname":"John"},
  "title":"A new book",
  "abstract":"The most curious items",
  "pages":111,
  "category":["Humor","YA"]
}
1 row returned
nosql>db.booksnest.find({"category.1": "Humor"})
0 rows returned
Tip: Array positions are zero-based, so the first element is at position 0.
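The difference between matching any array position and matching one specific position can be sketched in Python. The `matches` helper is an illustrative model of the rule described above, not the DB2 JSON implementation:

```python
def matches(doc, field, value):
    """Sketch of the array matching rule: a plain field name matches
    the value at any array position, while a path with a numeric
    suffix such as "category.0" matches that one position only.
    (Illustrative sketch only.)"""
    parts = field.split(".")
    target = doc.get(parts[0])
    if len(parts) == 2 and parts[1].isdigit():
        # Positional match; array positions are zero-based.
        idx = int(parts[1])
        return (isinstance(target, list)
                and idx < len(target)
                and target[idx] == value)
    if isinstance(target, list):
        # No position given: match the value at any position.
        return value in target
    return target == value

doc = {"isbn": "876-543-321", "category": ["Humor", "YA"]}
print(matches(doc, "category", "Humor"))    # True: found at some position
print(matches(doc, "category.0", "Humor"))  # True: first position is "Humor"
print(matches(doc, "category.1", "Humor"))  # False: second position is "YA"
```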
See the DB2 JSON reference documentation on working with arrays for the most recent details about restrictions for commands and functions on data in arrays.
Administration
Work with collections
To list all collection names in the current JSON namespace, use the
getCollectionNames()
command. In the current
JSON namespace in this example, three collections are present, as shown in
Listing 42.
Listing 42. Show current collections
nosql>db.getCollectionNames()
[media, books, booksnest]
To rename a collection, for example, to continue working with a collection
but archive existing content, use the rename()
command to specify a new name, as shown in Listing 43.
Listing 43. Rename an existing collection
nosql>db.books.rename("oldbooks")
If a collection with the new name already exists, the command will fail. However, you can force the rename by specifying a flag that drops the existing target collection. In the example, if a collection with the name "oldbooks" already exists, the command db.books.rename("oldbooks", true) will first drop the collection "oldbooks" and then rename the collection "books" to "oldbooks".
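The force-rename behavior can be modeled in a few lines of Python. The `rename_collection` function and the dictionary of collections are hypothetical stand-ins used only to illustrate the semantics described above:

```python
def rename_collection(collections, old, new, force=False):
    """Sketch of the rename semantics: fail if the target name
    already exists, unless force=True, in which case the existing
    target collection is dropped first. (Illustrative sketch only.)"""
    if new in collections and not force:
        raise ValueError("collection %r already exists" % new)
    collections.pop(new, None)                   # force: drop existing target
    collections[new] = collections.pop(old)      # rename old -> new

cols = {"books": ["doc1"], "oldbooks": ["archived"]}
rename_collection(cols, "books", "oldbooks", force=True)
print(sorted(cols))  # ['oldbooks']: "books" is gone, old target was dropped
```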
Statistics
The stats() command returns information about a collection, in particular the namespace, the number of documents, and a list of indexes. The statistics also include information about the average size of documents and the total size of the collection, as shown in Listing 44. This information is populated from the DB2 statistics and may be refreshed with a delay. Unavailable values are set to -1.
Listing 44. Get collection statistics (output pruned)
nosql>db.books.stats()
{
  "ns":"TEST.books",
  "count":15,
  "size":4.5263671875,
  "avgObjSize":309.0,
  "numExtents":-1,
  "nindexes":2,
  "totalIndexSizes":-1,
  "indexSizes":[{"books_xauthor": -1},{"_id_":-1}],
  "ok":1
}
Remove data, indexes, and collections
To delete a document or multiple documents from the collection, use the
remove()
function. You can either remove all
documents (index definitions will be kept), or remove a subset of
documents by specifying conditions, as shown in Listing 45.
Listing 45. Remove a document
nosql>db.books.remove({isbn: "123-456-789"})
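The two removal modes, with and without a condition, can be sketched in Python. The `remove` helper and the in-memory list of documents are illustrative assumptions, not the DB2 JSON implementation:

```python
def remove(collection, condition=None):
    """Sketch of remove(): with no condition, all documents are
    deleted (index definitions would be kept); with a condition,
    only matching documents are removed. (Illustrative sketch only.)"""
    if condition is None:
        collection.clear()
        return
    collection[:] = [
        doc for doc in collection
        if not all(doc.get(k) == v for k, v in condition.items())
    ]

books = [{"isbn": "123-456-789"}, {"isbn": "876-543-321"}]
remove(books, {"isbn": "123-456-789"})   # conditional: one document removed
print(len(books))  # 1
```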
If an index is no longer needed, or you want to recreate an index with different characteristics, you can remove the index either by referencing the index name or by specifying the index characteristics that were used to create it (if no index name was explicitly assigned), as shown in Listing 46.
Listing 46. Remove indexes
nosql>db.books.removeIndex("mypriceidx")
Index <mypriceidx> was removed successfully.
nosql>db.books.removeIndex({"author": 1})
Index <books_xauthor> was removed successfully.
To remove all data from the collection but keep the empty collection and
any indexes, use the remove()
command, as shown
in Listing 47.
Listing 47. Remove a collection
nosql>db.books.remove()
OK
nosql>db.books.getIndexes()
[{"v":0,"key":{"_id_":1},"ns":"TEST.books","name":"_id_","unique":true},
 {"v":1,"key":{"pages":1},"ns":"TEST.books","name":"books_xpages","unique":false},
 {"v":2,"key":{"isbn":1},"ns":"TEST.books","name":"myisbnix","unique":true}]
To drop the collection and all indexes, use the
drop()
command on the collection, as shown in
Listing 48.
Listing 48. Drop a collection
nosql>db.books.drop()
OK
nosql>db.books.getIndexes()
[]
Cleanup
To remove all collections for a specific JSON namespace, the command
dropAllCollections()
can be used, as shown in
Listing 49. If this JSON namespace contains collections for which the user
does not have the necessary drop privilege, an error message will be
printed for any such collection.
Listing 49. Dropping all collections in a JSON namespace
nosql>db.dropAllCollections()
To force a drop of all collections and also remove the JSON system catalog, use the disable() command with the value true, as shown in Listing 50. As with the dropAllCollections() command, adequate privileges are required to drop all affected objects.
Listing 50. Disabling the JSON feature
nosql>disable(true)
If the disable()
command is used with the option
false
, the DDL statement will be printed, but
not executed.
Conclusion
This article has guided you through the setup of the DB2 JSON command-line environment and introduced some basic commands for working with collections to store and query JSON documents. To find out more about DB2 JSON, including how to work with the Java API or how to use the wire listener to receive and handle requests, see the other articles in this series. For details about commands and operations, see the DB2 JSON reference documentation.
Downloadable resources
- PDF of this content
- Sample JSON documents (books_import.zip | 2KB)
- Sample JSON documents with nesting (books_importNested.zip | 2KB)
Related topics
- Use an RSS feed to request notification for the upcoming articles in this series. (Find out more about RSS feeds of developerWorks content.)
- Learn more about tablespace and bufferpool management, compression, and time travel features in the DB2 Version 10.5 Information Center.
- Download a trial version of DB2 for Linux, UNIX, and Windows to evaluate the DB2 JSON support for yourself.
- Learn more about JSON at the JSON web site.
- Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment, or spend a few hours in the SOA Sandbox learning how to implement Service Oriented Architecture efficiently.