Accessing HTTP and RESTful services from DB2: Introducing the REST user-defined functions for DB2

Bringing the web into the world of relational database systems

Increasingly, Representational State Transfer (REST) has become the preferred way of communicating in today's service-oriented architecture environments. There are many services and information sources that can be addressed via a URL and accessed using HTTP. On the other hand, there is still a significant amount of information stored in relational database management systems (RDBMS) -- like DB2®. Modern RDMSes provide powerful processing features for relational data (through SQL) and XML data (through SQL/XML or XQuery). [2011 August 23: The authors updated this article to include information about HTTP proxy support. -Ed.]

Michael Schenker (mschenk@us.ibm.com), Advisory Software Engineer, IBM

Author Photo: Michael SchenkerMichael Schenker has more than seven years of work experience with IBM, joining the company as a full-time employee at the end of 2003. During his career, Michael has worked on several database-related products, starting as an intern to work on DB2 Information Integration with focus on non-relational data sources. In 2006, Michael joined the database tools organization and worked on the DB2 web service provider functionality in IBM Data Studio Developer. He gained a lot of expertise in SOA and web technologies during that time. Since 2008, Michael has been part of the DB2 for z/OS optimizer team, working on access plan selection and optimizer testing tools.



Susan Malaika (malaika@us.ibm.com), Senior Technical Staff Member, IBM

Susan Malaika photoSusan Malaika is a senior technical staff member in the IBM Information Management Group (part of IBM Software Group). Her specialties include XML, the Web, and databases. She has developed standards that support data for grid environments at the Global Grid Forum. In addition to working as an IBM product software developer, she has also worked as an Internet specialist, a data analyst, and an application designer and developer. She has also co-authored a book on the Web and published articles on transaction processing and XML. She is a member of the IBM Academy of Technology.



23 August 2011 (First published 26 May 2011)

Also available in Chinese Russian Vietnamese Portuguese Spanish

Overview

This article introduces a set of user-defined functions (UDFs) for DB2 that allow applications to consume REST-based services from SQL. The following topics will be covered:

  • Introduction
  • Basic scalar REST UDFs
    • Sending customized HTTP header values
    • Combining XML functions with the REST UDFs by retrieving a BBC newsfeed
    • Joining the REST UDF results with a relational table
    • Security — authentication and encryption
  • Helper functions
    • URL encoding/decoding
    • Base64 encoding/decoding

Introduction

Which DB2?

The information in this article applies both to DB2 for Linux®, UNIX®, and Windows®, and to DB2 for z/OS®.

HTTP was defined more than 20 years ago when the last major change (HTTP 1.1) was implemented in 1999 (see RFC2616). The longevity of HTTP is an indication that it is in a stable state. HTTP is the most popular protocol on the web. Over time, additional protocols and usage patterns like REST or WebDAV have been built on top of HTTP. Many websites and applications provide HTTP-based interfaces to access information.

Other formats used on the web: JSON

There are many other text-based formats used on the web today. JSON has become very popular in recent years. The article "Build a pureXML and JSON application, Part 1: Store and query JSON with DB2 pureXML" shows one way that JSON can be processed in DB2 using JSONx to convert JSON into XML and vice-versa.

Several (mostly XML-based) message formats have evolved over time and are used to exchange data in conjunction with HTTP as transport layer — SOAP, RSS, or Atom, for example.

Relational database management systems (RDBMSes) have evolved, too, introducing processing and storage functionality beyond just relational data and SQL, for example to support XML natively.

Figure 1. Accessing the web from DB2 via the REST functions
Diagram shows web content connecting throug HTTP consumer functions for DB2, up to relational and XML content in DB2

In this article, we introduce extensions to SQL in the form of user-defined functions (UDFs) that access URL-addressable resources via HTTP directly from SQL statements. The UDFs are called REST UDFs or REST functions. With the new functions, additional capabilities, already available in the RDBMS — such as SQL aggregation or XPath access — can be used to manipulate and store data retrieved from the web. With the REST functions, simple HTTP GET or POST requests can be issued.

More complex (verbose) signatures for the REST functions are also available, allowing the inspection of the HTTP response codes and header values, which is useful for debugging. The verbose REST functions are not further discussed in this article, but you can take a look at the samples in sample_tableUDF_Queries.txt to get an idea how the functions can be used.

The REST functions can treat the information submitted via HTTP as binary or textual data. The binary versions access information like images, sounds, and other binary data formats, whereas the textual versions deal with HTML pages and other text-based information by automatically converting data from/to the used database code page. Textual data on the Internet is typically in Unicode, whereas data in relational databases are often in other encodings.

By following the steps in this article and accompanying download, you can retrieve data from the web and join it with locally stored relational data. Furthermore, you can apply DB2 native XML store capabilities to process, store, and transform XML-based data (e.g., by using the XMLTABLE function to generate a relational dataset from an XML document).

This article starts by introducing the REST UDFs. It will then go on to provide examples that manipulate textual and binary information on the web. Customizing REST requests by setting HTTP header options is described. An example based on accessing RSS feeds illustrates the integration of web and database data. Handling HTTP authentication and encryption is covered. The article ends with examples of helper functions to perform URL encoding and Base64 encoding. The article includes an appendix with the full set of REST function parameters.

This article is accompanied by a download that includes the REST UDFs for DB2 for LUW and DB2 for z/OS and various samples. The download package provides a set of basic scalar as well as some helper functions.


Basic REST scalar functions

The basic scalar functions provide access to content on the web. REST functions are available for the four most common HTTP verbs: GET, POST, PUT, and DELETE. There are two methods available: one to handle binary content and one for text-based content. The DB2 schema for the functions is DB2XML.

All basic REST functions are scalar functions that return the content of the HTTP response message as BLOB (binary) or CLOB (textual). The function signatures follow this pattern:

DB2XML.HTTP<method><type>(url, httpHeader <,
input>)
Table 1. Method name
MethodType
GET, POST, PUT, DELETEBLOB (binary content) or CLOB (textual content) defines the return type and the type of the <input> document
Table 2. Input parameter
urlhttpHeaderinput
(Required) the URL of the resource being accessed(optional) additional header values used with the request(POST, PUT only) data sent with the request, as BLOB or CLOB

There is an additional method available to perform an HTTP HEAD request. It does not send or retrieve data. Retrieve the IBM web site via HTTP GET:

SELECT 
   DB2XML.HTTPGETCLOB('http://www.ibm.com','') 
FROM SYSIBM.SYSDUMMY1

The result is the IBM web page HTML text as CLOB text data.

Sending customized HTTP header values

Above, we provided an empty string for the HTTP header values, which means that defaults are being used when issuing the request message. It might be necessary to provide custom header information with a request, (e.g., to change the content type or setting an additional "SOAPAction" header field when invoking a SOAP web service).

With each method, an optional HTTP header XML document can be specified that allows setting additional HTTP header values. The XML header document must conform to the following format:

Listing 1. XML HTTP header format
<httpHeader connectionTimeout="10" follwRedirects="true">
	<header name="Accept" value="text/plain,application/xml,*/*;q=0.8" />
	<header name="Accept-Language" value="en-us,en;q=0.5" />
</httpHeader>

Each custom header field is represented by a <header> element having a name and value attribute. The <httpHeader> root element allows specifying some additional request options via attributes.

Table 3. Options that can be specified with <httpHeader>
Attribute nameTypeDefaultDescription
connectionTimeoutIntegerSystem defaultTimeout in milliseconds
readTimeoutIntegerSystem defaultTimeout in milliseconds
followRedirectsBooleanTrueDefines, if redirects should be followed
useCachesBooleanTrueDefines, if caches should be used

The XML header document can be provided as CLOB or native XML (DB2 LUW only) value.

Listing 2. Invoking a SOAP 1.1 web service via HTTP POST
SELECT DB2XML.HTTPPOSTCLOB('http://www.webservicex.net//stockquote.asmx',
CAST ('<httpHeader>
       <header name="Content-Type" value="text/xml;charset=utf-8"/>
       <header name="SOAPAction" value="&quot;http://www.webserviceX.NET/GetQuote&quot;"/>
       </httpHeader>' AS CLOB(1K)),
 CAST('<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
          <soap:Body>
             <GetQuote xmlns="http://www.webserviceX.NET/">
                <symbol>IBM</symbol>
             </GetQuote>
          </soap:Body>
       </soap:Envelope>' AS CLOB(10K))
) FROM SYSIBM.SYSDUMMY1;

The result is the web service SOAP response message as CLOB text data.

Combining XML functions with the REST UDFs by retrieving a BBC newsfeed

This example demonstrates how you can easily retrieve a BBC newsfeed and transform it into a relational dataset using the XMLTABLE function of DB2.

Figure 2. Using the REST UDFs and DB2 pureXML to consume a newsfeed
Diagram shows BBC news feed communicating through HTTP consumer functions for DB2, through an XMLTABLE function where RSS format is shredded into a result set, and into DB2

First, let's take a look at the BBC RSS feed using the following statement:

SELECT DB2XML.HTTPGETCLOB('http://feeds.bbci.co.uk/news/world/rss.xml?edition=uk','') 
       FROM SYSIBM.SYSDUMMY1;

Now we can wrap an XMLTABLE function call around to transform the returned XML RSS format into a relational result set.

Listing 3. Shredding the BBC newsfeed into a relational dataset
SELECT *
   FROM XMLTABLE('$result/rss/channel/item' 
      PASSING XMLPARSE(
         DOCUMENT 
           DB2XML.HTTPGETBLOB('http://feeds.bbci.co.uk/news/world/rss.xml?edition=uk','')
      ) as "result"
         COLUMNS 
            title VARCHAR(128) PATH 'title',
            description VARCHAR(1024) PATH 'description',
            link VARCHAR(255) PATH 'link',
			pubDate VARCHAR(20) PATH 'substring(pubDate, 1, 16)'
   ) AS RESULT;

The result set contains four columns: TITLE containing the title of the feed item, DESCRIPTION with the feed description, LINK with a URL to the actual news document, and PUBDATE.

More examples can be found in the download package with this article.

Joining the REST UDF results with a relational table

In the previous section, the XMLTABLE function was used to shred the XML data returned by the REST UDF. In this section, the example is taken a step further. The shredded result set is joined with an existing news table with the four columns matching on news title.

Figure 3. Joining the REST UDF results with a relational table
Like Figure 2, except the XMLTABLE function joins with a local table to produce a result set
Listing 4. Joining the REST UDF results with a relational table
   SELECT DB2XML.NEWS.*
   FROM XMLTABLE('$result/rss/channel/item' 
      PASSING XMLPARSE(
         DOCUMENT 
           DB2XML.HTTPGETBLOB('http://feeds.bbci.co.uk/news/world/rss.xml?edition=uk','')
      ) as "result"
         COLUMNS 
            title VARCHAR(128) PATH 'title'
   ) AS RESULT, DB2XML.NEWS
   WHERE RESULT.TITLE = DB2XML.NEWS.TITLE;

Security — Authentication and encryption

Using certificates with the REST UDFs

The REST UDFs take advantage of the Java Secure Socket Extension (JSSE) for secure socket layer (SSL) communication. SSL communication requires a certificate store on the client on order to store server certificates. The management of that store might depend on the JSSE implementation. The IBM JDK has it's own implementation with special extensions for z/OS (see IBM JSSE implementation on z/OS).

The REST functions support HTTP basic authentication as well as SSL encryption using HTTPS.

The following example shows how unread messages from your Google Mail account can be retrieved and parsed into a relational result set using XMLTABLE.

Figure 4. Accessing Gmail from DB2 with the REST UDFs
Shows Gmail going through an Atom feed throuth secure acces via https to the HTTP consumer functions for DB2, and from there to the XMLTABLE function that shreds Atom format into a result set and feeds into DB2

The access to Gmail is encrypted by SSL and secured using HTTP basic authentication ($quot;https" is used in the URL). Gmail returns the headlines for the unread messages as Atom feed, and we use the XMLTABLE function in this example to return the result as a relational dataset.

Listing 5. Retrieving unread messages from Gmail using SSL and HTTP authentication
SELECT *
  FROM XMLTABLE('$result/*[local-name()=''feed'']/*[local-name()=''entry'']' 
    PASSING  
       XMLPARSE(DOCUMENT 
 DB2XML.HTTPGETBLOB('https://<username>:<password>@mail.google.com/mail/feed/atom/','')) 
AS "result"
  COLUMNS 
    title VARCHAR(128) PATH '*[local-name()=''title'']',
    summary VARCHAR(1024) PATH '*[local-name()=''summary'']',
    author_name VARCHAR(255) PATH '*[local-name()=''author'']/*[local-name()=''name'']',
    author_email VARCHAR(255) PATH '*[local-name()=''author'']/*[local-name()=''email'']'
) AS RESULT;

The result set consists of four columns: TITLE, SUMMARY, AUTHOR_NAME, and AUTHOR_EMAIL.


Helper functions

In addition to the REST functions, you can find helper functions to perform URL encoding/decoding, as well as Base64 encoding and decoding.

URL encoding/decoding

The URL specification (RFC1738) defines a set of special characters that need to be replaced with escape sequences (e.g., if used in the query string of a URL). The download package comes with two functions to perform the URL encoding and decoding. The first parameter of the urlEncode function is the plain string; the first parameter of the urlDecode function is the URL-encoded string. The second optional parameter of both functions is the character-set encoding to be used. If NULL is provided, UTF-8 is used, as recommended in RFC3986.

Listing 6. URL-encoding example
SELECT DB2XML.URLENCODE(
     'Please send and email to report our % earnings to Mark & Lisa at marklisa@xyz.com'
	 , NULL) 
       FROM SYSIBM.SYSDUMMY1

Result:	   
--------------------------------------------------------------------
Please+send+and+email+to+report+our+%25+earnings+to+Mark+%26+Lisa+at+marklisa%40xyz.com
Listing 7. URL-decoding example
SELECT DB2XML.URLDECODE(
'Please+send+and+email+to+report+our+%25+earnings+to+Mark+%26+Lisa+at+marklisa%40xyz.com'
, NULL) 
FROM SYSIBM.SYSDUMMY1

Result:	   
----------------------------------------------------------------------
Please send and email to report our % earnings to Mark & Lisa at marklisa@xyz.com

Base64 encoding/decoding

Base64 encoding is widely used on the web to represent binary data as string (e.g., when sending hash keys). The download package comes with two functions to perform the Base64 encoding and decoding.

Listing 8. Base64-encoding example
SELECT DB2XML.BASE64ENCODE(
 CAST(
 'Please send and email to report our % earnings to Mark & Lisa at marklisa@xyz.com' 
 AS VARCHAR(100) FOR BIT DATA)) 
 FROM SYSIBM.SYSDUMMY1

Result:
-----------------------------------------------------------------------
UGxlYXNlIHNlbmQgYW5kIGVtYWlsIHRvIHJlcG9ydCBvdXIgJSBlYXJuaW5ncyB0byBNYXJrICYgTGlzYSBhdCBtY
XJrbGlzYUB4eXouY29t
Listing 9. Base64-decoding example
SELECT 
CAST(
   DB2XML.BASE64DECODE(
'UGxlYXNlIHNlbmQgYW5kIGVtYWlsIHRvIHJlcG9ydCBvdXIgJSBlYXJuaW5ncyB0byBNYXJrICYgTGlzYSBhdCBtY
XJrbGlzYUB4eXouY29t') AS VARCHAR(100) FOR MIXED DATA) 
FROM SYSIBM.SYSDUMMY1

Result:
-----------------------------------------------------------------------
Please send and email to report our % earnings to Mark & Lisa at marklisa@xyz.com

HTTP proxy support

A set of Java system properties need to be set in order to enable HTTP proxy support since the REST user-defined functions are implemented in Java. The following properties can be used:

  • http.proxyHost
  • http.proxyPort
  • http.proxyUser
  • http.proxyPassword
  • http.nonProxyHosts

DB2 for Linux, UNIX, Windows does not provide access to the JVM system properties. The REST UDFs will check for a restudf.properties file under the $DB2INSTPATH/function directory instead where a user can specify the HTTP proxy settings like this:

Listing 10. HTTP proxy settings in restudf.properties file
http.proxyHost=myProxyHost.com
http.proxyPort=8090
http.proxyUser=user
http.proxyPassword=secret
http.nonProxyHosts=*.ibm.com|wikipedia.org|...

On DB2 for z/OS you can specify those properties in the JVMPROPS file like this (see also installationNotesUDFzOS.txt in the download):

Listing 11. HTTP proxy settings in JVMPROPS file
-Dhttp.proxyHost=myProxyHost.com
-Dhttp.proxyPort=8090
-Dhttp.proxyUser=user
-Dhttp.proxyPassword=secret
-Dhttp.nonProxyHosts=*.ibm.com|wikipedia.org|...

Conclusion

The DB2 REST functions provide a simple and easy way to access resources on the web from within a SQL statement. Despite the simplicity of the functions, they are versatile enough to access and consume data from many sources. The presented REST functions blend in seamlessly with the DB2 pureXML capabilities because of the wide usage of XML as message format on the web. This article explains the basic use of the REST functions and how to integrate web data with database data. It also explains the use of authentication and encryption. Helper functions to assist with URL encoding and Base64 encoding are described.


Acknowledgement

The authors with to acknowledge Ling Ling for reviewing the article and testing its instructions.

Appendix

Table 4. REST UDF signatures
NameReturn typeSignatureHTTP VerbDescription
httpGetBlobBLOBhttpGetBlob(url VARCHAR(2048), httpHeader CLOB (10K)) RETURNS BLOB(5M)GETRetrieves a resource by a given URL vie HTTP GET. The resource is returned as binary LOB (BLOB).
httpGetClobCLOBhttpGetClob(url VARCHAR(2048), httpHeader CLOB (10K)) RETURNS CLOB(5M)GETRetrieves a text-based resource by a given URL via HTTP GET. The resource is returned as character LOB (CLOB). Character set conversion into the database code page is performed if necessary.
httpPutBlobBLOBhttpPutBlob(url VARCHAR(2048), httpHeader CLOB (10K), data BLOB(5M)) RETURNS BLOBPUTCreates (upload) a binary resource under a given URL using HTTP PUT. A response message is returned as binary LOB (BLOB) if issued by the server.
httpPutClobCLOBhttpPutClob(url VARCHAR(2048), httpHeader CLOB (10K), data CLOB(5M)) RETURNS CLOBPUTCreates (upload) a textual resource under a given URL using HTTP PUT. A response message is returned as character LOB (CLOB) if issued by the server. Character set conversion is performed if necessary.
httpPostBlobBLOBhttpPostBlob(url VARCHAR(2048), httpHeader CLOB (10K), data BLOB(5M)) RETURNS BLOB(5M)POSTUpdates a binary resource under a given URL using HTTP POST. The response message is returned as binary LOB (BLOB).
httpPostClobCLOBhttpPostClob(url VARCHAR(2048), httpHeader CLOB (10K), data CLOB(5M)) RETURNS CLOB(5M)POSTUpdates a textual resource under a given URL using HTTP POST. The response message is returned as character LOB (CLOB). Character set conversion is performed if necessary.
httpDeleteBlobBLOBhttpDeleteBlob(url VARCHAR(2048), httpHeader CLOB (10K)) RETURNS BLOBDELETEDeletes a binary resource under the given URL using HTTP DELETE. The response message is returned as binary LOB (BLOB) if there is any.
httpDeleteClobCLOBhttpDelete(url VARCHAR(2048), httpHeader CLOB (10K)) RETURNS CLOBDELETEDeletes a textual resource under the given URL using HTTP DELETE. The response message is returned as character LOB (CLOB) if there is any. Character set conversion is performed if necessary.
httpBlobBLOBhttpBlob(url VARCHAR(2048), httpMethod VARCHAR(128), httpHeader CLOB (10K), data BLOB(5M)) RETURNS BLOBany HTTP verbPerforms an HTTP request with the provided verb. A request message can be provided as binary LOB (BLOB). A response message is returned as binary LOB (BLOB) if issued by the server.
httpClobCLOBhttpClob(url VARCHAR(2048), httpMethod VARCHAR(128), httpHeader CLOB (10K), data CLOB(5M)) RETURNS CLOBany HTTP verbPerforms an HTTP request with the provided verb. A request message can be provided as character LOB (CLOB). A response message is returned as character LOB (CLOB) if issued by the server. Character set conversion is performed if necessary.
httpHeadCLOB/XMLhttpHead(url VARCHAR(2048), httpHeader CLOB (10K)) RETURNS CLOB (10K)HEADVerifies the HTTP header for the requested resource using HTTP HEAD. Just the HTTP header is returned as CLOB/XML.

DB2 for LUW only: The httpHeader XML document can also be provided as native XML value instead of using a CLOB.

Table 5. Signatures of the helper functions
NameReturn typeSignatureDescription
urlEncodeVARCHARurlEncode(text VARCHAR(2048), encoding VARCHAR(20)) RETURNS VARCHAR(4096)Helper function to perform URL-encoding of the provided text. The encoding value defines the character set to be used. It can be set to NULL where UTF-8 is used as the default.
urlDecodeVARCHARurlDecode(text VARCHAR(2048), encoding VARCHAR(20)) RETURNS VARCHAR(4096)Helper function to perform URL-decoding of the provided text. The encoding value defines the character set to be used. It can be set to NULL where UTF-8 is used as the default.
base64EncodeVARCHARbase64Encode(text VARCHAR(2732) FOR BIT DATA) RETURNS VARCHAR(4096)Helper function to perform Base64 encoding of the provided binary data returning the data as Base64-encoded string.
base64DecodeVARCHAR FOR BIT DATAbase64Decode(text VARCHAR(4096)) RETURNS VARCHAR(2732) FOR BIT DATAHelper function to perform Base64 decoding on the provided Base64-encoded data returning the data as binary.

Download

DescriptionNameSize
REST UDF download package1download.zip---

Note

  1. The download.zip file has been updated to fix an incompatibility problem with DB2 for z/OS V8. Prior to the fix, a user may encounter the following error on DB2 for z/OS: java.lang.NoClassDefFoundError:com.ibm.db2.jcc.DB2Xml.

Resources

Learn

Get products and technologies

  • Download a free trial version of DB2 for Linux, UNIX, and Windows.
  • Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edition and provides a solid base to build and deploy applications.
  • Build your next development project with IBM trial software, available for download directly from developerWorks.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, Web development, SOA and web services, XML
ArticleID=660089
ArticleTitle=Accessing HTTP and RESTful services from DB2: Introducing the REST user-defined functions for DB2
publish-date=08232011