Using JSON_TABLE

The JSON_TABLE table function converts a JSON document into a relational table.

We will work with the following table, EMP, which contains four rows with one JSON object per row. By using the JSON_TABLE function we will extract the data so that it can be treated as relational data.


CREATE TABLE emp(jsondoc VARCHAR(32000) CCSID 1208);

INSERT INTO emp VALUES 
'{"id":901, "name" : { "first":"John", "last":"Doe" }, "office" : "E-334", 
"phones" : [ { "type":"home", "number":"555-3762" }, 
             { "type":"work", "number":"555-7242" } ], 
"accounts" : [ { "number":"36232"}, { "number":"73263"}] }';

INSERT INTO emp VALUES
'{"id":902, "name" : { "first":"Peter", "last":"Pan" }, "office" : "E-216", 
"phones" : [ { "type":"work", "number":"555-8925" } ], 
"accounts" : [ { "number":"76232"}, {"number":"72963"}] }';

INSERT INTO emp VALUES
'{"id":903, "name" : { "first":"Mary", "last":"Jones" }, "office" : "E-739", 
"phones" : [ { "type":"work", "number":"555-4311" }, 
             { "type":"home", "number":"555-6312" } ], }';

INSERT INTO emp VALUES
'{"id":904, "name" : { "first":"Sally", "last":"Smith" } }';

The JSON_TABLE function consists of three parts.

The JSON object to be deconstructed.
A path expression that generates zero or more rows from the JSON object.
The definition of the result columns to be returned. This includes the column name, the result data type, and a path expression to use to locate the column information.

Returning Simple Information

First we will use JSON_TABLE to extract employee names and office numbers from the JSON stored in the EMP table.

SELECT t.first, t.last, t.office 
    FROM emp, 
        JSON_TABLE( 
                   emp.jsondoc,          
                   'lax $'               
                   COLUMNS (             
                            first VARCHAR(10) PATH 'lax $.name.first',
                            last VARCHAR(10) PATH 'lax $.name.last',
                            office VARCHAR(10) PATH 'lax $.office'
                            )
                   ) AS t;

In this example, the first argument indicates the source of the JSON to process, in this case column JSONDOC in table EMP. The second argument is the path to the starting point in the JSON document. $ indicates to start at the beginning. Next are the definitions for the result columns. Each has a name, a data type, and the path to use to find the column data in the JSON object. For each of these columns, the column path is specified to use the current context ($), followed by the key name for the values.

The result of this query is:

FIRST	LAST	OFFICE
John	Doe	E-334
Peter	Pan	E-216
Mary	Jones	E-739
Sally	Smith	(null)

Notice that the structural error due to Sally Smith not having an office is returned as the null value. There are two factors that affect this behavior. Since lax was used for the office column's path, the structural error was ignored and null was returned. If strict had been used for the office path, the path navigation would have returned an error. The default behavior for JSON_TABLE when an error is returned for a column is to return the null value. You can override this for strict mode by adding the ERROR ON ERROR clause to the column definition.

Returning JSON Formatted Data

JSON_TABLE has the ability to return a column that contains data formatted as JSON. This is accomplished by using the keywords FORMAT JSON in the column definition. The result must consist of a single value: a JSON object, a JSON array, or a scalar value.

Here is an example of using JSON_TABLE to extract the employee name information as JSON data.

SELECT t.id, t.name, t.office 
    FROM emp, 
        JSON_TABLE( 
                   emp.jsondoc,          
                   'lax $'               
                   COLUMNS (             
                            id INTEGER PATH 'lax $.id',
                            name VARCHAR(100) FORMAT JSON PATH 'lax $.name',
                            office VARCHAR(10) FORMAT JSON PATH 'lax $.office'
                            )
                   ) AS t;

This query produces the following result:

ID	NAME	OFFICE
901	{"first":"John","last":"Doe"}	"E-334"
902	{"first":"Peter","last":"Pan"}	"E-216"
903	{"first":"Mary","last":"Jones"}	"E-739"
904	{"first":"Sally","last":"Smith"}	(null)

Note that the NAME column returns strings which represent JSON formatted objects.

When the path for a FORMAT JSON column results in a string value, the default behavior is to return the quotes for string values. This is the result shown for the OFFICE column. The OMIT QUOTES ON SCALAR STRING clause can be used to remove the quotes from scalar strings.

There is another option not demonstrated here that applies when returning FORMAT JSON data. If the path locates a sequence of JSON objects, they must be wrapped in an array in order to be successfully returned as a JSON value. This can be done using the WITH ARRAY WRAPPER clause. The default is to not add a wrapper, which would result in an error.

Handling a JSON array

When returning information from a JSON array, each array element is returned as a separate row in the result table. Now we are going to use JSON_TABLE to extract the telephone types and numbers which are in a JSON array.

SELECT t.type, t.number 
    FROM emp, 
        JSON_TABLE( 
                   emp.jsondoc,          
                   'lax $.phones[*]'               
                   COLUMNS (             
                            type VARCHAR(20) PATH 'lax $.type',
                            number VARCHAR(20) PATH 'lax $.number'
                            )
                   ) AS t;

In this example, the path expression is $.phones[*] meaning all the elements of the phones array. The column path expression used to find the column data in the JSON object is the context item, $, followed by the key name for the value to be returned. In this case, the context item is the result of the parent path expression, $.phones[*].

The result of this query is:

TYPE	NUMBER
home	555-3762
work	555-7242
work	555-8925
work	555-4311
home	555-6312

Handling nested information

In the previous example, returning only the phone numbers isn’t very useful because the information about the person associated with the number is not returned. In order to get this information, we need to define a nested column. A nested column allows array values to be associated with data items that exist at a higher level in a multi-level JSON object.

In this example we use nested columns to return the names associated with the phone numbers.

SELECT t.* 
    FROM emp, 
        JSON_TABLE( 
                   emp.jsondoc,          
                   'lax $'               
                   COLUMNS (
                            outer_ordinality FOR ORDINALITY,
                            first VARCHAR(10) PATH 'lax $.name.first',
                            last VARCHAR(10) PATH 'lax $.name.last',
                            NESTED PATH 'lax $.phones[*]'
                                 COLUMNS (             
                                          nested_ordinality FOR ORDINALITY,
                                          type VARCHAR(20) PATH 'lax $.type',
                                          number VARCHAR(20) PATH 'lax $.number'
                                          )
                            )
                   ) AS t;

The row path expression is $, meaning the top level of the JSON object. The first and second columns return the first and last names. This is followed by a nested column definition. The path lax $.phone[*] for the nested path means to process all the elements of the phones array. Within that array, the type and number values of the array elements are returned as the final two columns in the table.

This query also demonstrates the concept of an ordinality column. This is a column that generates a number, starting from 1, for each result row for each invocation of JSON_TABLE.

The result of this query is:

OUTER_ ORDINALITY	FIRST	LAST	NESTED_ ORDINALITY	TYPE	NUMBER
1	John	Doe	1	home	555-3762
1	John	Doe	2	work	555-7242
1	Peter	Pan	1	work	555-8925
1	Mary	Jones	1	work	555-4311
1	Mary	Jones	2	home	555-6312
1	Sally	Smith	1	(null)	(null)

In this example, there is a parent/child relationship between the nested levels. A LEFT OUTER JOIN is used to combine the information in the parent/child relationship. Since Sally Smith has no phone information, the LEFT OUTER JOIN returns NULL values for the phone columns.

Now let's examine the two ordinality columns. At the parent level, every row has the same ordinality value. While you might expect to see each row numbered sequentially, this query performs a separate invocation of JSON_TABLE for each JSONDOC row. For each invocation, the numbering starts at 1, so every row in the result ends up with 1 for OUTER_ORDINALITY. If the JSON used for this example had been one object containing all four employees, OUTER_ORDINALITY would have incremented for each employee object. For NESTED_ORDINALITY, the numbering restarts at 1 every time the parent changes.

Sibling Nesting

Nested columns can exist at the same level. The following example uses sibling nested columns to return both phone and account information. The first nested column clause accesses the phone information. A second nested column clause accesses the account information.

SELECT t.first, t.last, t.type, t.number, t.account 
    FROM emp, 
        JSON_TABLE( 
                   emp.jsondoc,          
                   'lax $'               
                   COLUMNS (
                            first VARCHAR(10) PATH 'lax $.name.first',
                            last VARCHAR(10) PATH 'lax $.name.last',
                            NESTED PATH 'lax $.phones[*]'
                                 COLUMNS (             
                                          type VARCHAR(20) PATH 'lax $.type',
                                          number VARCHAR(20) PATH 'lax $.number'
                                          ),
                            NESTED PATH 'lax $.accounts[*]'
                                 COLUMNS (             
                                          account VARCHAR(20) PATH 'lax $.number'
                                          )
                            )
                   ) AS t;

The result of this query is:

FIRST	LAST	TYPE	NUMBER	ACCOUNT
John	Doe	home	555-3762	(null)
John	Doe	work	555-7242	(null)
John	Doe	(null)	(null)	36232
John	Doe	(null)	(null)	73263
Peter	Pan	work	555-8925	(null)
Peter	Pan	(null)	(null)	76232
Peter	Pan	(null)	(null)	72963
Mary	Jones	work	555-4311	(null)
Mary	Jones	home	555-6312	(null)
Mary	Jones	(null)	(null)	(null)
Sally	Smith	(null)	(null)	(null)
Sally	Smith	(null)	(null)	(null)

In this example, there is a sibling relationship between phone and account information. For sibling related nesting, a UNION is used to combine the information. Since the phone and account information are at the same level in the JSON document, there is no result row that contains both pieces of information.