 | Level: Introductory Kathryn Zeidenstein (krzeide@us.ibm.com), Senior Software Engineer, IBM
11 Oct 2001 Describes support in DB2 Version 7 for user-defined structured types and other object-oriented features such as inheritance, methods, object identifiers, navigation by reference, and dynamic dispatch of methods (polymorphism). Index extensions allow user-defined indexing specifications to be created on structured and distinct type data for improved performance. JDBC 2.0 and SQLJ Part 2 support provide a tighter integration between the programming environment and the database server.
© Copyright International Business Machines
Corporation 2001. All Rights Reserved.
Introduction
In this two-part article, I will describe the object-relational
technology that is one of DB2's best kept secrets.
Object-relational support provides the necessary "plumbing" that
provides extensibility to relational databases. All members of the
DB2 family support large object types, user-defined functions, and
user-defined distinct types. The focus of this paper is on the
newest technology based on user-defined structured types, which
provides for the persistent storage of objects and includes other
object-oriented features such as inheritance, the specification of
behavior using methods, object identifiers, navigation by
reference, and dynamic dispatch of methods (polymorphism). In
addition, index extensions allow user-defined indexing
specifications to be created on structured and distinct type data
for improved performance. JDBC 2.0 and SQLJ Part 2 support provide
a tighter integration between the programming environment and the
database server. DB2's long term commitment to object-relational
functionality is illustrated by the fact that many IBM technologies
rely on that support, including content management, spatial data,
and XML document search.
In Part 1, I introduce the object-relational features in DB2
with a special focus on user-defined structured types and type
hierarchies, and on using tables to store instances of those types
(such as typed tables). I also describe how you can use "object
views" on legacy relational data.
Part 2 will describe how structured type objects can be stored
in columns, how methods can be invoked on objects, and on the SQLJ
and JDBC extensions that will be useful for integrating these
database objects into your Java applications.
DB2's
object-relational highlights
A little history...
In 1995, DB2 delivered its first object-relational
extensions, ahead of any other major database vendor. These
extensions were provided to address the pressing needs of many
applications to move beyond what a "normal" relational database
could handle. Whereas the majority of transactional data could be
mapped to the usual data types (money to decimal, names to
character strings, dates to date strings), the arrival of imaging
systems and text management systems meant that data values were
very large. Video and audio data was even larger. It was very
difficult to handle this type of data using the existing
technology.
To address this application requirement DB2 Universal
Database" (UDB), in its Version 2 release,
delivered new data types that could store very large amounts of
data in a single data value. This relieved the applications from
the burden of piecing together character strings to form a
document. Instead, the entire document or image could be stored in
one of the new large object types: character large object
(CLOB), binary large object (BLOB), and double-byte character large
object (DBCLOB). To make it easier to handle LOBs in an
application, there are built-in functions for manipulating LOBs and
technology for handling LOB data in client applications as files or
through locators. Operations on LOBs are highly optimized by
deferring LOB materialization as much as possible.
Figure 1: Table definition includes types for multimedia objects.
In addition to the new types, it also became possible to make
the database a lot smarter in understanding the purpose of the data
that was stored in it. Users could define distinct types,
types that are based on existing types but which are distinct by
name and semantics from all other types defined in the system. This
was an important step toward defining data in business terms and
toward encapsulating business semantics into the database.
Because distinct types are based on predefined types, they are
tightly integrated into the database manager for performance. They
share the same efficient code used to implement built-in functions,
comparison operators, indexes, etc. for predefined types. The DB2
optimizer can optimize effectively because it has a deep
understanding of the underlying data type representation.
The behavior for distinct types are defined by user-defined
functions. User-defined functions are just application logic (or a
DB2 built-in scalar function) that operate on data values and
return a result. Thus, with a combination of user-defined functions
and distinct types, simple or large object data types become
meaningful for a particular application domain. The database engine
can enforce business rules because it understands that it cannot
allow comparison of Dollars and Yen without doing the necessary
conversions (via a user-defined function) first. DB2 also supports
overloading of function names, and chooses the appropriate
function based on an efficient "best fit" algorithm.
Another feature introduced in DB2 is the ability to return
tables (record sets) from functions instead of scalar values. This
function is powerful because it can make any data source appear
like a DB2 table, which can be then used in normal SQL queries,
join operations, grouping, and so on.
DB2 also shipped types and functions for audio, image, video,
and text. These prepackaged types and behaviors are called DB2
Extenders". Of course, these types are a
small sample of what is possible to define in DB2, but these are
typical of what is used in a content-based world. Indeed, the
extensibility of DB2 is part of what powers IBM's Content Manager
products.
More recent
additions: Structure, behavior, inheritance
The addition of large object types let applications handle
large, complex types in familiar ways; that is, by using the
relational database to store and retrieve data values. In addition,
being able to name those types and to define behaviors for those
types relieves the application from supporting semantic checking
and business logic.
Nevertheless, all data that is stored in a large object type is
essentially flat -a string of characters or bits-that may very well
bear no relation to the business object that is stored there. This
can put the burden on the application to read the large object into
the application's work area, to "parse" the object to the place
that needs to be changed, and then to store the object back in the
database.
For some types of data, it may make more sense to treat the
object as a named collection of attributes. This particular breed
of user-defined type is called a structured type and was
introduced into DB2 UDB Version 5.2 in 1998. The initial
functionality provides not only the ability to create types with
structure, but allows the definition of type hierarchies and the
equivalent storage mechanism of table hierarchies. The use of
"references" to define relationships between objects and to
navigate among objects was also introduced in that release.
Structured types provide a way to create a schema that is more
natural for those who are comfortable with OO programming
languages-they include the properties of inheritance and the
establishment of "is-a" relationships (value substitutability).
Structured types are a more natural fit for populating Java objects
in the application program.
Figure 2: Structured types reused in various places.
To allow you to define behavior for the structured type, you can
use user-defined functions. However, DB2 Version 7 delivers the
ability to define methods for defining object behavior.
Methods are very much like user-defined functions in that they can
be written in an external language like Java or C, or in SQL.
Methods, however, are always associated with a particular
structured type and are invoked using different syntax. Methods are
especially suitable for implementing behavior that really can be
seen as "owned" or attributed to a specific type. The behaviors for
the type are carried with the type in whatever context it is used.
In contrast, functions support a more symmetric view on its
parameters.
Using
object-relational features with "traditional" data
Any of DB2's object-relational features can be used with
traditional SQL. One application might need just LOB support,
another might want to make full use of structured types, methods,
and inheritance. In addition, it is even possible to use
object-oriented modeling on relational data by using the object
view support described in this paper.
Stuctured type
tables ("typed tables")
In this section, we focus on the basics of using typed tables.
By creating an object model, it is much easier to think about
mapping the model to the available constructs in DB2.
Figure 3: UML-like model of relations among people and departments.
Think about the following:
Will it benefit you to take advantage of type hierarchies and
inheritance? With DB2's structured types, you can define type
hierarchies. Just as with class hierarchies in object-oriented
programming languages, objects that are modelled using DB2
structured types inherit attributes from their supertypes (types
that are above it in the hierarchy). In the model above, we are
using inheritance for the various types of people objects that we
represent.
In Version 7, you have the choice of storing structured type
objects as rows of typed tables or in individual columns of regular
or typed tables. How do you choose? Generally, you would store your
objects as rows in a table when those objects are "primary"
entities in the object model, such as our people and
departments.
Other objects that are good candidates for being stored in typed
tables are those that must be referenced and shared by other
objects (a business address that gets used by employees, shipping,
and so on) or those whose existence is independent of another
object (deleting a department does not mean that the manager goes
away). If the relationship is one of containment and dependency,
then the object can be stored as a column (Note: this is described
in further detail in Part 2).
When objects are stored as rows in a table, each column of the
table contains one attribute of the object and you can access the
attributes of the object much as you would any column in a DB2
table.
How should you map the attributes to DB2 data types? Not only
will you need to think about how much space an attribute will use,
but you need to consider the types of information that you are
trying to get back from any attribute or combination of
attributes.
The general mapping of attributes to data types is generally
straightforward in cases where you have things like names and
salaries, which map easily to character and numeric data types. For
very large objects, such as an image or very long text description,
you might need to choose one of the DB2 data types that are
designed specifically for storing very large objects. Those data
types are CLOB (character large object), BLOB (binary large
object), or DBCLOB (double-byte character large object).
You can even use structured types to define attributes. This is
called nesting. You might want to do this in those cases where the
attribute has structure, like an address, or when your columns can
hold different subtypes, such as geometries (shape, lines, points,
polygons, circles, rectangles, etc). We will discuss nesting more
in Part 2.
Creating "classes":
type hierarchies in DB2
When you create a user-defined structured type, the syntax is
very similar to that of creating a table; you specify the name of
the type, its attribute names and the data types of the attributes.
If you create subtypes under a structured type, those subtypes will
automatically inherit the attributes of the type above it in the
hierarchy (the supertype).
Figure 4: Mapping model to type hierarchies.
SQL definitional statements
|
CREATE TYPE person AS
(name varchar(20),
birthyear INTEGER,
address varchar(40))
MODE DB2SQL;
|
Create the root type of the hierarchy first.
| |
CREATE TYPE employee UNDER person
AS
(salary INTEGER)
MODE DB2SQL;
|
Create the subtype employee UNDER the Person type.
All "person" attributes are inherited by employees. All you need to
specify is additional attributes that are unique to employees (or
to types created UNDER employees).
| |
CREATE TYPE executive UNDER employee
AS
(bonus integer)
MODE DB2SQL;
|
Executives get bonuses.
| |
CREATE TYPE department AS
(ID INTEGER,
manager REF(employee),
Budget INTEGER)
MODE DB2SQL
METHOD BudgetperPerson()
RETURNS INTEGER
...
|
The department type has no subtypes or supertypes.
REF(employee) indicates that the manager is an employee, and
information about employees is in the employee type.
Method specifications are included with the type
definition (or can be added later with ALTER TYPE).
| |
CREATE METHOD BudgetperPerson()
RETURNS INTEGER
FOR department
...
|
As specified in the SQL99 standard, the method body
is specified in a separate CREATE METHOD statement.
| |
ALTER TYPE employee
ADD ATTRIBUTE dept ref(department)
|
Now that department is defined, create a cyclic
reference (departments referencing employees (i.e. as managers) and
employees working in departments)).
| |
CREATE TYPE student UNDER person
AS
(major VARCHAR(10),
wage DECIMAL)
MODE DB2SQL;
|
Students are the other leg of our person hierarchy.
They share attributes of person, but also have an additional
attribute to indicate the major field of study.
|
Storing objects:
Creating the table hierarchies
Table hierarchies are used to store objects and to allow SQL to
change those objects. Conceptually, table hierarchies mirror the
associated type hierarchies. However, types are like class
libraries that can be used in different contexts, so it may be that
not all of your types in a type hierarchy will be in the same table
hierarchy. The only requirement is that the table hierarchy is a
subset of the type hierarchy.
When you create a table, you specify an additional column for
the table that stores the object identifer values for each object
(each row) in the table. This object identifier (OID) is created as
the first column in the table, followed by the attribute columns.
The OID is used in dereference operations; that is, it is used to
"grab" onto a row and turn it into a structured type instance. This
is described in more detail in Part 2, where we describe how to
invoke methods on objects stored in rows.
SQL definitional statements
|
CREATE TABLE person_table OF
person
(REF IS oid USER GENERATED,
name WITH OPTIONS NOT NULL);
|
The CREATE TABLE statement is pretty simple. For the
root table in the hierarchy, we have to give the name of the column
that stores the OID values. In this case, we have given it the
creative name of 'oid'.
| |
CREATE TABLE employee_table OF
employee
UNDER person_table
INHERIT SELECT PRIVILEGES;
CREATE TABLE exec_table of executive
UNDER employee_table
INHERIT SELECT PRIVILEGES;
|
For any subtables, the clause INHERIT SELECT PRIVILEGES
indicates that all users who hold select privileges on the root
table when that table is created are also granted those privileges
on the subtables.
| |
CREATE TABLE dept_table OF
DEPARTMENT
(REF IS oid USER GENERATED,
manager WITH OPTIONS SCOPE
employee_table)
|
Here we create the department table. The SCOPE option on the
column options for manager indicates that employee objects referred
to from the manager column of the department table will reside in
the employee table or any subtable of the employee table.
| |
ALTER TABLE employee_table
ALTER COLUMN dept
ADD SCOPE dept_table
|
Here we create the scope of the reference going in
the other direction. All departments referred to in the employee
table will reside in the department table (or in any subtables
subsequently created).
|
Inserting data
Inserting into a typed table applies to the specified target
table (unlike SELECT, DELETE, and UPDATE, which apply to all tables
below the target table in the hierarchy). The INSERT statement is
unchanged for typed tables, other than you must remember to include
the object identifier and to cast it to the appropriate REF
type.
INSERT
INTO dept_table (oid, ID, Budget)
VALUES (department('1'), 100, 50000);
|
This
department has an object identifier of '1'. The character constant
is cast to the REF(Department) type.
| |
INSERT INTO employee_table (oid, Name,
Birthyear, Address, Salary, Dept)
VALUES (employee('6'), 'Meg Gomez', 1980,
'25 Hereandnow', 28000,
(SELECT oid FROM Dept_table WHERE
ID=100));
|
The dept attribute could be added as a constant
(cast to REF(department)). In this case, a subquery is used to
populate the column value.
|
After the employee table is populated, we can assign
Harry Potts as the manager of department 100 as follows:
UPDATE dept_table
SET Manager=(SELECT oid from employee_table
where name='Harry Potts')
WHERE ID=100;
Rich query
capabilities
The addition of type hierarchies and references bring with them
several enhancements to query, including the ability to easily
traverse references among objects and support for the implicit
consideration of subtables
An executive is an employee is a person....
A table hierarchy has the property of value substitutability;
that is, it establishes an "is-a" relationship. A query on a table
is really a query on the collection of objects in the table and all
of its subtables. SELECT, DELETE, and UPDATE statements all apply
on the selected table and on any of its subtables. However, only
attributes that are defined for the selected table are returned.
When you select all columns from Person_table, all objects in
Person_table and any of its subtables are retrieved, but only those
attributes (columns) that belong to person (including those
inherited from its supertypes) are returned.
SELECT P.* FROM
Person_table P;
Figure 5: Only attributes that are found in the person type are returned.
If the query was instead SELECT * FROM
Employee_table, all of the Employee attributes for only
employees and executives would be returned. This formulation is
certainly a lot simpler than forcing all different types of people
into the same table and trying to apply search predicates for the
different type values.
OUTER union of a table hierarchy: selecting all
attributes
If you want to see all attributes from the selected table and
its subtables, use an OUTER union on the table hierarchy: For
example, SELECT * FROM
OUTER(Employee_table) returns a result with all attributes
of employees and also includes the bonuses that only executives
have. Nulls are returned for attributes that are not applicable to
the returned row.
Figure 6: The OUTER keyword returns all attributes.
SELECT * FROM
OUTER(Employee_table)
Use ONLY to "turn
off" subtable retrieval
Use ONLY in the FROM clause to restrict the retrieval to objects
that are in the specified table, not in any subtables.
SELECT * FROM ONLY (Person_table);
|
Using path
expressions to simplify relationship queries
The dereference operator (->) is used in path
expressions. Path expressions provide a much easier syntax than
the traditional method of using joins to select attributes from
related objects. Instead of formulating the correct join
conditions, the path expression allows you to traverse the
reference-based relationship between one object and another. For
example, employees and departments have a referential relationship.
Departments have managers (who happen to be employees) and
employees have departments. To find the employee name and salary,
as well as the corresponding department ID and manager, for those
departments whose budgets are over 200000 dollars, you could issue
the following query:
SELECT E.name, E.salary, E.dept->ID, E.dept->manager->name AS MANAGER
FROM Employee_table E
WHERE E.dept->budget >200000
|
Even more complex relationships can easily be queried using path
expressions. For example, assume that you want to find the names of
all employees whose manager's manager is Lou Gerstner. You can do
it this way and avoid having to write a 5-way join:
SELECT E.name
FROM Employee_table E
WHERE E.dept->manager->dept->manager->name='Lou Gerstner';
|
You can even use path expressions between typed and regular
(untyped) tables if the untyped table includes REF(data
type) as a column.
A relationship that is represented by a reference is not the
same as one that is represented by referential integrity (although
it is certainly possible to use referential integrity on
references, too). Internally, DB2 treats the path expression like a
left outer join, which means that the path expressions take
advantage of DB2's sophisticated join optimization. DB2 also checks
to make sure that the ID executing the query has read access on all
the referenced tables.
Object
views
In relational systems, views play an important role in keeping
the data independent from the application and in controlling access
to sensitive data. These uses are so critical that it is important
that the same capabilities are available on the object-relational
infrastructure while at the same time preserving an
object-relational nature. In other words, method invocation,
substitutability, path expressions should all work. In addition,
there is the requirement to turn "legacy" data into
object-relational table/view hierarchies.
Object views fulfill these requirements.
Figure 7: Relationship between table hierarchy view hierarchies.

Object views are "virtual typed tables" that can have a structure
and behavior that is different than the underlying base tables. An
object view definition uses structured type definitions as their
basis, just as typed tables do. These view definitions can be used
to restrict access to a subset of rows of a table hierarchy or hide
some of the columns of the table hierarchy by allowing only access
through the view hierarchy.
If only a subset of the rows or tables are selected for the
view, then the already existing structured type definitions may be
used as the basis of the view definition. However, if either the
set of attributes/columns or the behavior provided through the view
needs to change, then a new type hierarchy needs to be defined as
the basis for the view hierarchy. For example, assume that we want
to create a set of views that contains no salary, wage or address
information, which excludes executives and students, and which
includes only those departments that have large budgets. Our type
definitions to create such a type hierarchy would merely be a
simpler (fewer attributes) version of those used in our existing
person type hierarchy. The body of the view definitions tells how
to populate those views from the base tables.
Notice the use of the ONLY keyword to ensure that each view is
populated only from the appropriate subtable of the original
hierarchy. Without ONLY, each view would be populated with all
objects at or below it in the hierarchy, which is not what we want
in this case.
Also note that since the OID column is a reference type and
reference types are strongly typed, the cast from the OID of the
table to the OID of the view cannot be done directly; the cast must
first be to the base type and then from the base type (VARCHAR) to
the type of the view (vperson_t).
|
Type definitions
|
View definitions
| |
CREATE TYPE vperson_t AS
(name varchar(20))
NOT FINAL
MODE DB2SQL;
|
CREATE VIEW vperson OF vperson_t
MODE DB2SQL
(REF IS oid USER GENERATED)
AS SELECT vperson_t(VARCHAR(oid)),
name FROM ONLY (Person_table);
| |
CREATE TYPE vemployee_t
UNDER vperson_t
NOT FINAL
MODE DB2SQL;
|
CREATE VIEW vemp OF vemployee_t
MODE DB2SQL
under vperson
INHERIT SELECT PRIVILEGES
AS SELECT
vemployee_t(VARCHAR(oid)), name,
vdepartment_t(varchar(dept))
FROM ONLY (Employee_table)
| |
CREATE TYPE vdepartment_t AS
(ID INTEGER,
manager REF(vemployee_t))
NOT FINAL
MODE DB2SQL;
|
CREATE VIEW vdept_bigbudget OF
vdepartment_t
MODE DB2SQL
(REF IS oid USER GENERATED,
Manager WITH OPTIONS SCOPE vemp)
AS SELECT vdepartment_t(varchar(oid)),
ID, vemployee_t(varchar(manager))
FROM Dept_table WHERE budget>100000;
| |
ALTER TYPE vemployee_t
ADD ATTRIBUTE dept ref(vdepartment_t);
|
ALTER VIEW vemp ALTER COLUMN dept
ADD SCOPE vdept_bigbudget;
|
Now, because the reference-based relationship in the original
typed tables (between employee_table and dept_table) are carried
over to the corresponding object views queries like the following
are possible:
SELECT name, dept->ID
FROM vemp;
|
Because two of the employees belong to departments with budgets
under 100000, null is returned for the department ID for those
employees.
DB2 has an exceptional ability to render views updatable. This
capability is important for making views truly useful in avoiding
base table access. This capability extends as well to object views,
even object views that are created from multiple base tables.
Using object
views to "objectify" legacy data
Object views provide a lot of functionality: they are updatable,
the view bodies (the semantics of the view) are separate from the
type definition, and objects seen through those views can be
manipulated and methods can be invoked on them via their object
identities and references, just as can objects in a base table.
Because object views provide such rich functionality, they can also
be used to create an "OO look and feel" to business data that has
always resided and been manipulated in a strictly relational
manner. Old applications can continue to work on the base tables
(or on untyped views), while new applications can run against the
object views.
Figure 8: Showing the relational schema of suppliers and parts.

Figure 9: Representing parts and suppliers using object views.

The following steps can be used to create object
views on legacy data based on the schema shown above:
- Create structured types to match the
tables.
CREATE
TYPE Supplier_t
AS
(Name VARCHAR(20))
REF USING INTEGER
MODE DB2SQL;
CREATE TYPE Parts_t AS
(Name VARCHAR(20),
Price DECIMAL (5,2),
Supplier REF(Supplier_t),
Qty INTEGER)
MODE DB2SQL
REF USING VARCHAR (10);
| Create
the supplier type to map to the Supplier relational table. Note
that we will be using the SuppID column as the object identifier in
the object view; therefore, it cannot be specified as part of the
type definition. Because the SuppID column is defined as an
INTEGER, use the REF USING clause to specify INTEGER as the
underlying type of the object identifier.
Similarly, for the parts type, the object identifier
column is mapped to the data type of the primary key of the Parts
table (column Partno of data type VARCHAR(10)).
|
- Now we can create the typed views.
CREATE
VIEW v_suppliers of Supplier_t
MODE DB2SQL
(REF IS Supp_ID USER GENERATED)
AS SELECT supplier_t(Supp_ID), Name
FROM Suppliers;
CREATE VIEW v_parts OF parts_t
MODE DB2SQL
(REF IS Partno USER GENERATED,
Supplier WITH OPTIONS SCOPE
scope v_suppliers)
AS SELECT parts_t(Partno), Name,
Price, Supplier_t(supplier), qty
FROM Parts;
| The
SuppID column of the base table is the object identifier column of
the view but must be cast to the REF type in the view
definition.
Here is the creation of the parts view.
Additional predicates can be added to the SELECT to further
restrict which rows populate the view.
|
Now, path expressions can be used:
SELECT Partno, Supplier->Name AS Supplier_Name
FROM v_parts
WHERE Price > .05;
|
Methods can be invoked, too, as will be shown in Part 2 in the
next edition of DB2DD.
Trademarks:
The following terms are trademarks of the International Business
Machines Corporation in the United States and/or other
countries:
DB2
DB2 Universal Database
IBM
The following terms are trademarks of other
companies:
Microsoft, Windows, Windows NT, and the Windows logo
are trademarks of Microsoft Corporation in the United States and/or
other countries.
Java and all Java-based trademarks and logos are
trademarks or registered trademarks of Sun Microsystems, Inc. in
the United States, other countries, or both.
©Copyright
International Business Machines Corporation 2001. All rights
reserved.
Resources - DB2 Universal Database V7, Application Development Guide. See in particular the chapter entitled "Object-Relational Programming."
- DB2 Universal Database V7, SQL Reference.
- Michael J. Carey, Serge Rielau, Bennet Vance. "Object View Hierarchies in DB2 UDB". EDBT 2000: 478-492,1999
- Michael J. Carey, Donald D. Chamberlin, Srinivasa Narayanan, Bennet Vance, Doug Doole, Serge Rielau, Richard Swagerman, Nelson Mendonca Mattos. "O-O, What's Happening to DB2?" SIGMOD Conference 1999: 511-512
- Michael J. Carey, Donald D. Chamberlin, Srinivasa Narayanan, Bennet Vance, Doug Doole, Serge Rielau, Richard Swagerman, Nelson Mendonca Mattos. "O-O, What Have They Done to DB2?" VLDB 1999: 542-553
- Chen, Weidong et al. "High Level Indexing of User-Defined Types." Proceedings of the 25th VLDB Conference, Edinburgh, Scotland, 1999
- N. M. Mattos, J. Kleewein, M. Tork Roth, K. Zeidenstein, "From Object-Relational to Federated Databases," Datenbanksysteme in Buro, Technik und Wissenschaft. March 1-3, 1999: 185-209.
About the author  | 
|  | Kathy Zeidenstein has been with IBM's Silicon Valley Lab since 1987. She started in the DB2 for OS/390 organization, which is where she got her first exposure to object-relational. After she began
working on the SQL standards team, she became involved with the more advanced object-relational
features in DB2, because much of the language design for that work was taken to the SQL standards
committees. |
Rate this page
|  |