JSON document representation

JSON documents consist of fields, which are name-value pair objects. The fields can be in any order, and be nested or arranged in arrays. Db2® can work with JSON documents in either their original JSON format or in the binary-encoded format called BSON (Binary JSON).

For more information about JSON documents, see JSON documents.

JSON data must be provided in Unicode and use UTF-8 encoding. Data in the BSON format must use the little-endian format internally.

For your convenience, the SYSIBM.JSON_TO_BSON and SYSIBM.BSON_TO_JSON conversion functions are provided. You can run these functions to convert a file from one format to the other, as needed.

When JSON data is presented to Db2 as input to a JSON SQL function, that data is first interpreted and stored within an appropriate Db2 data type based on the normal Db2 data type expectations. Then, the value is converted to the format that is needed for the equivalent original JSON data type based on the source Db2 data type. This JSON data value is then processed by the JSON SQL function.

To allow JSON data to be passed from the Db2 data type to the original JSON representation without conversion, the default behavior can be overridden. The overriding of the default behavior is done in the JSON SQL function invocation by using the optional FORMAT JSON clause. When this clause is not specified, Db2 interprets the data in the following way:
  • Numerical, Boolean, and NULL values are not affected.
  • Character string values are enclosed with double quotation marks and any special character within the string is escaped.
  • Date, time, and time stamp values are enclosed with double quotation marks.
  • Binary values are interpreted as UTF-8 data.

Specifying the FORMAT JSON clause indicates that the data is already in JSON format and does not need to be changed by Db2 into a valid JSON value. This clause determines only what format processing is or is not done by Db2. The clause does not detect invalid JSON.

The following table provides some examples of how values are processed from their Db2 data type representation to the equivalent JSON data representation. Among these examples, processing is done with and without the specification of the optional FORMAT JSON clause.

Db2 data type Db2 value JSON representation (FORMAT JSON clause not specified) JSON representation (FORMAT JSON specified) Description of input
Character string {"Name":123} "{\"Name\":123}" {"Name":123} Valid JSON object that is contained in character string.
Character string {"Name";123} "{\"Name\";123}" {"Name";123} FORMAT JSON on invalid JSON object
Character string ["Name", "Address"] "[\"Name\", \"Address\"]" ["Name", "Address"] Valid JSON array that is contained in character string.
Character string ["Name": "Address"] "[\"Name\": \"Address\"]" ["Name": "Address"] FORMAT JSON on invalid JSON array

Data in BSON format does not get converted and is passed directly to the JSON SQL function for processing as BSON.

Data that is presented to a JSON SQL function by using a Db2 binary data type is assumed to be in BSON format unless the optional FORMAT JSON clause is specified.

Table 1. Default format assumption by Db2 JSON SQL functions
Db2 data type Format Assumed
BLOB BSON
CHARACTER JSON
CLOB JSON
VARBINARY BSON
VARCHAR JSON

Duplicate keys

Similar to JSON itself, Db2 does not enforce uniqueness for key names in a JSON document. By not enforcing uniqueness, it is possible for the same key name to exist one or more times within the same JSON document and even within the same JSON object. Keys with the same name in the same JSON object are considered to be duplicate keys. If duplicate keys exist in a JSON object, Db2 processing returns only the first key that is encountered. The first key might or might not be the first occurrence of that key in the document.

For example, the key "a" occurs multiple times in this JSON document (which is formatted to make the example more obvious):

{"a":{"b":10,
      "c":11},
 "a":{"d":70,
      "e":71}}

In this case, the key "a" is considered a duplicate key name as it appears twice within the same JSON object. Only one key with the name "a"is returned by Db2 from this JSON object.

In another example, the key "a" occurs multiple times in this JSON document but in different JSON objects:
{"a":{"b":10,
      "c": 11},
 "b":{"a":70,
      "b": 71}}

In this example, the key "a" is not considered a duplicate key name and both values of "a" are returned by Db2 if the appropriate JSON object is queried. Similarly, while the key name "b" occurs multiple times in the example, it also appears once only in each unique JSON object. As a result, none of its occurrences are considered to be duplicate keys.

BSON format

Db2 supports data that is presented in BSON format as input to the JSON SQL functions. The BSON data can be supplied by the customer or produced from the complimentary [SYSIBM.JSON_TO_BSON] conversion function.

All the standard BSON data type limits and restrictions apply to values provided as input to the JSON SQL functions. For example, the maximum long number supported by the BSON format is 9,223,372,036,854,775,807 (264 - 1).

The following BSON data types are not supported for processing by the JSON SQL functions that query the data (for example, JSON_TABLE, JSON_VALUE, or JSON_QUERY). Direct access of one of the following BSON data types by one of these functions returns an error:
  • BSON binary
  • BSON object ID
  • BSON regular expression
  • BSON DB pointer
  • BSON JavaScript code
  • BSON JavaScript code with scope
  • BSON symbol
  • BSON time stamp
  • BSON min key
  • BSON max key