Leverage your PostgreSQL V8.1 skills to learn DB2, Version 8.2


There has never been a better time to consider switching over to DB2. With the recent announcement of DB2 Universal Database™ (UDB) Express-C Edition V8.2, a fully functional, free version of DB2 for Linux®, Unix®, and Windows® platforms (32-bit and 64-bit), more and more developers will be making the switch to a proven data server with state-of-the-art technology and a growing developer community base. DB2 Express-C is based on the same core technology as the DB2 UDB Express Edition V8.2 product. As such, it supports a wide variety of programming languages to build and deploy applications, including C/C++, Java™, .NET, PHP, and more.

Learning a new product or system can sometimes be a complex and time-consuming endeavor. The goal of this article is to eliminate the steep learning curve and get database administrators (DBAs) and developers who are already familiar with PostgreSQL V8.1 up to speed with DB2's architecture and tools in little time. Get a basic understanding of the DB2 architecture and how it differs from PostgreSQL. Learn about the different tools and utilities that are available in DB2 and which ones are analogous to the ones you are already familiar with in PostgreSQL. Then explore all of the additional, easy-to-use features and tools that DB2 has to offer, and see for yourself why it is worth making the switch.

High-level system overview

Figure 1 presents a high-level overview of the PostgreSQL architecture. Figure 2 shows a high-level overview of the DB2 architecture.

Figure 1. PostgreSQL V8.1 Architecture
A diagram of the PostgreSQL V8.1 architecture
A diagram of the PostgreSQL V8.1 architecture
Figure 2. DB2 V8.2 Architecture
A diagram of the DB2 V8.2 architecture
A diagram of the DB2 V8.2 architecture

There are some similarities and differences in architecture. Both PostgreSQL and DB2 are based on a client-server architecture, where the client tier is independent of the server. In both cases, a client application wishing to use a database makes a request over the network and is served by a process on the server (postmaster on PostgreSQL, db2agent on DB2). PostgreSQL uses a simple process-per-user client/server model in which one client process connects to exactly one server process. The main process, called postmaster, listens at a specified TCP/IP port for incoming connections and spawns a new server process called postgres every time a connection is requested. The postgres processes communicate with each other using semaphores and shared memory to ensure data integrity during concurrent data access. Database files are accessed through shared disk buffers.

In DB2, coordinator agents (db2agent) coordinate the work on behalf of an application and communicate to other agents, using inter-process communication (IPC) or remote communication protocols. All connection requests from client applications, whether they are local or remote, are allocated a corresponding coordinator agent. When intra-partition parallelism is enabled, extra worker agents, called subagents (db2agentp) are spawned by the coordinator agent to help perform the work in parallel. In DB2, a database can be partitioned across multiple logical nodes, transparent to the application. These nodes may be spread over one or more server machines. DB2 accesses database files through buffer pools. It has the ability to fetch data into the buffer pools and move data out of the buffer pool asynchronously using prefetcher (db2pfchr) and page cleaner (db2pclnr) processes. The log writer process (db2loggw) flushes log records from the log buffer to the log files on disk, while the log reader process (db2loggr) reads the database log files during transaction processing (in other words, rollback), restart recovery, and roll-forward operations. For more information about the DB2 process model, refer to the article "Everything you wanted to know about DB2 Universal Database processes" (developerWorks, 2003).

Core administration tools

Before continuing to compare the fundamental architecture differences, let's quickly look at the main administration tools used in PostgreSQL and DB2 since this article refers to these tools in subsequent sections.

PostgreSQL's command line interface is the psql tool. psql is a terminal-based front end to PostgreSQL that enables you to type in queries interactively, issue them to PostgreSQL, and see the query results. Alternatively, input can come from a file. The psql interface is shown in Listing 1.

Listing 1. PostgreSQL's psql command line interface
$ psql mydb

Welcome to psql 8.1.3, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

mydb=# SELECT version();

PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.4 20050721 
(Red Hat 3.4.4-2)
(1 row)

mydb=# SELECT current_date;
(1 row)


There are no graphical tools that ship with the PostgreSQL distribution, although two popular open source ones, PGAdmin III and PhpPgAdmin, are recommended on the PostgreSQL Web site. These tools can connect to PostgreSQL databases and help perform common tasks.

DB2 ships with several tools you can use to administer a database. The Command Line Processor (CLP) is the main tool to administer DB2 from the command line. To launch this tool from Windows, navigate to it through the Windows start menu (Programs > IBM DB2 > Command Line Tools > Command Line Processor), or enter db2 from a DB2 Command Window. On UNIX and Linux platforms, log in as the instance owner (described shortly) and enter db2 from a command prompt. The CLP can be run in interactive mode, as shown in Listing 2. It can also be run in non-interactive mode, where commands are prefixed with db2 or come from an input file script. For more information about working with the CLP, refer to the article "Getting to know the DB2 UDB command line processor" (developerWorks, 2005).

Listing 2. The Command Line Processor in interactive mode
# db2
(c) Copyright IBM Corporation 1993,2002
Command Line Processor for DB2 SDK 8.2.3

You can issue database manager commands and SQL statements from the command
prompt. For example:
    db2 => connect to sample
    db2 => bind sample.bnd

For general help, type: ?.
For command help, type: ? command, where command can be
the first few keywords of a database manager command. For example:
 ? CATALOG DATABASE for help on the CATALOG DATABASE command
 ? CATALOG          for help on all of the CATALOG commands.

To exit db2 interactive mode, type QUIT at the command prompt. Outside
interactive mode, all commands must be prefixed with 'db2'.
To list the current command option settings, type LIST COMMAND OPTIONS.

For more detailed help, refer to the Online Reference Manual.

db2 => connect to sample

   Database Connection Information

 Database server        = DB2/LINUX 8.2.3
 SQL authorization ID   = DB2INST1
 Local database alias   = SAMPLE

db2 => values current_date


  1 record(s) selected.

db2 =>

The Control Center, shown in Figure 3, is the central graphical administration tool in DB2. From the Control Center, you can perform tasks such as creating a database, backing up a database, viewing query execution plans, viewing database objects such as tables, and configuring database settings. To launch the Control Center on Windows, navigate to it through the Windows start menu (Programs > IBM DB2 > General Administration Tools > Control Center), or enter db2cc from a command prompt. On UNIX and Linux, log in as the instance owner user and enter db2cc from a command prompt.

Figure 3. The Control Center
A diagram of the Control Center
A diagram of the Control Center

Database clusters and DB2 instances

In PostgreSQL, you must first initialize a database storage area on disk before you can create any databases. This storage area is known as a database cluster. A database cluster is a collection of databases that is managed by a single instance of a running database server (in other words, a postmaster process). In file system terms, a database cluster is a single directory under which all data is stored. Typically, a special user account, called the Postgres user account, is used to own a server and its data. Many operations require that you be logged in as this user, such as when starting the server.

After initialization, a database cluster contains a database called postgres, which is a default database used by utilities, users and third party applications. Another database called template1 is created and is used as a template for creating new databases. Before users can access a database, the server must be started. The server can be stopped by sending different signals to the postmaster process.

DB2 has a somewhat similar concept of a database cluster called an instance. An instance, also known as a database manager (DBM), is a logical server environment needed to create and work with databases. It is really just a collection of processes that provide access to database(s). If you have more than one instance created, each instance acts as an independent server environment, which can be stopped and started independently. Like PostgreSQL, you can have multiple instances active at the same time and have multiple databases active within each instance. During a DB2 installation on UNIX and Linux platforms, you have the option to create a default instance, usually called db2inst1. On these platforms, a special user account is typically created to "own" the instance and is known as the instance owning user. During installation on Windows, an instance called DB2 is always created. By default, no databases are created in an instance during installation -- you must explicitly create them. To view a list of instances defined on your system, you can use the db2ilist command, which can be found in the instance subdirectory of the DB2 installation folder. Listing 3 shows sample output of this command on Linux:

Listing 3. Sample output of the db2ilist command
$ /opt/ibm/db2/V8.1/instance/db2ilist

Alternatively, you can view defined instances in the Control Center. Expand the Instances folder in the left pane object tree, as shown in Figure 4.

Figure 4. Viewing defined instances in the Control Center
A diagram of the instances folder in the Control Center

To start or stop a DB2 instance on UNIX and Linux, log in as or switch to the instance owning user, and issue the db2start command. To stop the instance, use the db2stop command. On Windows, you can use the same commands to start and stop the instance. However, the user account you use to issue these commands must be a member of the SYSADM group defined for that instance.

You can also start and stop instances from the Control Center. Right-click on the target instance in the left pane object tree and choose the Start or Stop option, respectively. In order to stop an instance, all connections to databases inside the instance must be severed.

New instances can be created with the db2icrt command and dropped with the db2idrop command.

For more information about the user accounts created when DB2 is installed, refer to the article "Understand how user and group accounts interact with DB2 UDB" (developerWorks, 2005).


In PostgreSQL, a database is a named collection of database objects. Generally, every database object (table, function, and so on) belongs to one and only one database. However, there are a few system catalogs (for example, pg_database), that belong to an entire cluster and are accessible from each database within the cluster. Databases are physically separated, and access control is managed at the connection level. Databases are created with the CREATE DATABASE command and destroyed with the DROP DATABASE command.

In DB2, a database is a collection of tables, including the data and related objects, such as indexes, views, triggers, functions, and stored procedures. Multiple databases can exist in the same instance; however, each database is fully independent of each other in the sense that each has its own system catalog tables (containing the database metadata), its own configuration file, and its own recovery log.

Databases are isolated from each other to the extent of objects and data. For example, an application cannot, by default, issue a query that would fetch data from two different databases. A special technology called federation would need to be set up to do that. Databases created inside the same instance do share the same set of instance processes.

There are several ways to create a database, but the most common one is to use the CREATE DATABASE command, such as the following:


After issuing this command, a database called MYDB is created in the /database/mydb path using the default options. There are many other options you can specify with this command, so be sure to check the command syntax diagram in the DB2 documentation.

Another easy way to create a database is to use the Create Database wizard from the Control Center. This wizard is shown in Figure 5. To launch the wizard, expand the object tree in the left pane until you see the Databases folder. Right-click on the Databases folder and select the Create Database option, then the Standard or With Automatic Maintenance option. Follow the steps in the wizard to complete the database creation process.

Figure 5. The Create Database wizard
A screenshot of the Create Database wizard
A screenshot of the Create Database wizard

Configuration files

PostgreSQL has many configuration parameters that affect the behavior of the database system. These parameters can be set in several areas. The most common area is the postgresql.conf file, which is normally kept in the data directory. Listing 4 shows sample contents of this file:

Listing 4. Sample contents of the postgresql.conf file
# This is a comment
log_connections = yes
log_destination = 'syslog'
search_path = '$user, public'

A second way to set these parameters is to give them as a command line option to the postmaster server, such as:

postmaster -c log_connections=yes -c log_destination='syslog'

Finally, a command line option can be applied to one particular session only. The environment variable PGOPTIONS can be used for this purpose on the client side:

env PGOPTIONS='-c geqo=off' psql

DB2 must also be configured at various levels, as shown in Figure 6.

Figure 6. The different levels where DB2 can be configured
A diagram showing the different levels where DB2 can be configured
A diagram showing the different levels where DB2 can be configured

First, certain settings are configurable at the instance level, such as communications and security. To view a list of all the instance level configuration parameters, you can issue the GET DB CFG command. Listing 5 shows a snippet of the output from this command.

Listing 5. Sample output from the GET DBM CFG command
db2 => get dbm cfg

 Database Manager Configuration

 Node type = Enterprise Server Edition with local and remote clients

 Database manager configuration release level            = 0x0a00

 CPU speed (millisec/instruction)             (CPUSPEED) = 4.251098e-07
 Communications bandwidth (MB/sec)      (COMM_BANDWIDTH) = 1.000000e+02

 Max number of concurrently active databases     (NUMDB) = 8
 Data Links support                          (DATALINKS) = NO
 Federated Database System Support           (FEDERATED) = NO
 Transaction processor monitor name        (TP_MON_NAME) =

 Default charge-back account           (DFT_ACCOUNT_STR) =

 Java Development Kit installation path       (JDK_PATH) = /opt/IBMJava2-142

 Diagnostic error capture level              (DIAGLEVEL) = 3
 Notify Level                              (NOTIFYLEVEL) = 3
 Diagnostic data directory path     (DIAGPATH) = /home/db2inst1/sqllib/db2dump

To update an instance parameter, use the UPDATE DBM CFG command:

UPDATE DBM CFG USING <parameter_name> <value>

Substitute <parameter_name> with the parameter name and <value> with the parameter value. Some parameters are dynamic, meaning that any changes take effect immediately. Others require you to stop and restart the instance in order for them to take effect.

Other settings are configurable at the database level, such as logging, sort memory, and the SQL statement cache. To view a list of all the database configuration parameters, issue the GET DB CFG FOR <db_name> command where <db_name> is the name of your database. Listing 6 shows a snippet of the output from this command.

Listing 6. Sample output from the GET DB CFG command
db2 => get db cfg for sample

       Database Configuration for Database sample

 Database configuration release level                    = 0x0a00
 Database release level                                  = 0x0a00

 Database territory                                      = US
 Database code page                                      = 1208
 Database code set                                       = UTF-8
 Database country/region code                            = 1
 Database collating sequence                             = BINARY
 Alternate collating sequence              (ALT_COLLATE) =
 Database page size                                      = 4096

 Dynamic SQL Query management           (DYN_QUERY_MGMT) = DISABLE

 Discovery support for this database       (DISCOVER_DB) = ENABLE

 Default query optimization class         (DFT_QUERYOPT) = 5
 Degree of parallelism                      (DFT_DEGREE) = 1
 Continue upon arithmetic exceptions   (DFT_SQLMATHWARN) = NO
 Default refresh age                   (DFT_REFRESH_AGE) = 0
 Default maintained table types for opt (DFT_MTTB_TYPES) = SYSTEM
 Number of frequent values retained     (NUM_FREQVALUES) = 10
 Number of quantiles retained            (NUM_QUANTILES) = 20

 Backup pending                                          = NO
 Database is consistent                                  = YES

To update a parameter, use the UPDATE DB CFG command:

UPDATE DB CFG FOR <db_name> USING <parameter_name> <value>

Substitute <db_name> with your database name, <parameter_name> with the parameter name, and <value> with the parameter value. Some parameters are dynamic. Others require that you deactivate the database (if it was explicitly activated), or wait for all the connections to the database to be disconnected before letting them reconnect.

Instance and database parameters can also be updated using the Control Center. Right-click on the target instance or database, depending on the parameter you want to update, and select the Configure Parameters option. This launches the Instance/Database Configuration dialog window, as shown in Figure 7.

Figure 7. Database Configuration dialog window
A screenshot showing the Database Configuration dialog window
A screenshot showing the Database Configuration dialog window

Finally, certain parameters are configurable at the DB2 registry level. Every DB2 server has a registry where you can set flags that activate new features as well as platform-specific and special optimizations. This registry area should not to be confused with the operating system's registry. The DB2 registry only exists in and applies to DB2.

To view the current registry settings, use the db2set command. There are additional flags you can add to this command to provide additional output. For example, the -all flag shows all the registry variables that are currently set. The -lr flag lists every available registry variable that can be set for the current platform. Listing 7 shows the sample output of the db2set command.

Listing 7. Sample output from the db2set command
 # db2set -all
[g] DB2SYSTEM=localhost.localdomain
[g] DB2ADMINSERVER=dasusr1

To set or change one of the registry parameters, use the db2set command:

db2set <parameter_name>=<value>

Ensure that you do not leave a space on either side of the equals sign (=) or else your change will not take effect. Registry variables can be set and applied at various levels. You can set them globally so that they apply to all instances. They can also be set for a specific instance. Alternatively, they can be set at the environment level, which means that they are set outside of DB2 using the SET command in Windows or the EXPORT command in UNIX and Linux. With very few exceptions, the instance must be restarted before any changes you make take effect.

A utility called the Configuration Advisor can also be used to recommend an initial baseline configuration of your environment. This utility can be run from the command line using the AUTOCONFIGURE command or can be run as a wizard from the Control Center. To launch it from the Control Center, right-click on the target database and select the Configuration Advisor item.

Buffers and buffer pools

PostgreSQL uses system shared memory as a buffer. The shared_buffers configuration parameter in the postgresql.conf file sets the number of shared memory buffers used by the database server. The default is typically 1000, but may be less if the kernel settings do not support it. Each buffer is 8192 bytes, unless a different value of BLCKSZ was chosen when building the server.

DB2 uses database objects called buffer pools as a cache for table and index data. Buffer pools are created at a database level and are not shared by multiple databases. When you create a buffer pool, you must specify the page size you want to use and the number of pages to allocate. DB2 supports page sizes of 4K, 8K, 16K, and 32K. Depending on your workload complexity, you may decide to only use one large buffer pool. Or you may want to use multiple buffer pools, each of different page sizes. To create a buffer pool, use the CREATE BUFFERPOOL statement. For example:


A buffer pool can also be created from the Control Center. Expand the target database, then right-click the Buffer Pools folder, and select the Create option. This launches the Create Buffer Pool dialog window, shown in Figure 8.

Figure 8. The Create Buffer Pool dialog window
A screenshot showing the Create Buffer Pool dialog window
A screenshot showing the Create Buffer Pool dialog window

By default, when a new database is created, a buffer pool with a 4K page size called IBMDEFAULTBP is created. This buffer pool is usually insufficient in size, so you will want to increase its size or create other larger ones.

Notice the Show Command button at the bottom of the window in Figure 8. Most Control Center dialog windows let you view the SQL statement or command that is actually executed in the background. These commands can be saved in scripts for execution at a later time. This is also a great way to learn DB2 syntax.

Table spaces

Table spaces in PostgreSQL allow database administrators to define locations in the file system where the files representing database objects can be stored. Once created, a table space can be referred to by name when creating database objects. By using table spaces, an administrator can control the disk layout of a PostgreSQL installation. Tables, indexes, and entire databases can be assigned to particular table spaces. To do so, a user with the CREATE privilege on a given table space must pass the table space name as a parameter to the relevant command.

In DB2, a table space is a database object that is used as an abstraction layer between tables (and indexes) and the physical disks where their files are stored. When you create a table space, you can associate it with a specific buffer pool as well as to specific containers. A container is a directory, file, or raw device. This architecture gives you a lot of flexibility in managing performance. For example, if you have a "hot" table, you can place it in its own table space and give it its own buffer pool. This helps ensure the data of this table is always cached in memory.

Table spaces can be defined as system-managed spaces (SMS) or database-managed spaces (DMS). SMS table spaces are managed by the operating system and grow as required. They provide decent performance with minimum administration. DMS table spaces are pre-allocated and managed by DB2. This type of table space typically outperforms SMS, but more requires administration overhead. DMS table spaces are also more flexible in the sense that they allow you to split up a table's data, indexes, and large objects into separate table spaces.

To create a table space, use the CREATE TABLESPACE command, such as:

USING ('d:\usp3_cont1', 'e:\usp3_cont2', 'f:\usp3_cont3')

This command creates a table space using a page size of 16K and associates it with the bp16k buffer pool. The table space is spread across three containers that each reside on their own drive.

Table spaces can also be created using the Control Center. Expand the target database, right-click the Table Spaces folder, and select the Create option. This launches the Create Table Space wizard, shown in Figure 9.

Figure 9. The Create Table Space wizard
A screenshot showing the Create Table Space wizard
A screenshot showing the Create Table Space wizard

When a new database is created with the default options, three table spaces are created, as described in Table 1.

Table 1. Table spaces created by default when a database is created
Table space nameDescription
SYSCATSPACECatalog table space containing the system catalog tables.
TEMPSPACE1System temporary table space used as a work area by DB2 to perform operations such as joins and sorts.
USERSPACE1This table space stores user tables when a table space is not explicitly indicated at table creation time.

You can use these default table spaces or create your own. Additionally, if you want to use declared global temporary tables, you need to create a user temporary table space before creating the temporary table. This type of table space is not created by default when a new database is created.


Schemas are similar in PostgreSQL and DB2. Every database contains one or more named schemas, which in turn contain tables and other database objects. Schemas can be created using the CREATE SCHEMA command. For example:


In addition, DB2 allows schemas to be created "on the fly" if the one specified does not exist. Only users who have been granted the IMPLICIT_SCHEMA database authority can do this.

To drop a schema, the DROP SCHEMA command is used. For example:

DROP SCHEMA myschema

The major difference between PostgreSQL and DB2 is what happens when a schema is not specified. In PostgreSQL, when you create an object without specifying a schema name, it is automatically put into a schema named public. Every new database contains this schema. In other words, the following statements are equivalent in PostgreSQL:

CREATE TABLE products ( ... )


CREATE TABLE public.products ( ... )

When a schema is not specified, the system determines which object is intended by following a search path, which is a list of schemas to look in. The first matching object residing in a schema listed in the search path is taken to be the one wanted. If there is no match in the search path, an error is reported, even if matching object names exist in other schemas in the database. The first schema named in the search path is called the current schema. By default, it is the public schema. Aside from being the first schema searched, it is also the schema in which new objects will be created if the CREATE <object> command does not specify a schema name. This is why new objects are created in the public schema, since it is listed first in the search path by default. It is possible to change this path using the SET statement, such as:

SET search_path TO myschema,public;

Once the target schema is in the path, you can access database objects in the schema without qualifying them. The search path works in the same way for other objects, such as data type names, function names, and operator names.

In DB2, the current schema, by default, is the same as the user ID that is used to connect to the database. Thus, if a user called ACCTREP connects to the database, the default schema that is used for unqualified database objects is ACCTREP. All qualified database object references use the schema that was explicitly specified.

To change the default schema, use the SET SCHEMA statement. For example, the statement:


changes the default schema to JIM for the duration of connection, unless it is issued again with a different value. In DB2, there is no concept of schema path, as in PostgreSQL. There can only be one value specified for the current schema. Stored procedures and user-defined functions, on the other hand, do have a function path that works similar to the schema search path in PostgreSQL.

Tables, indexes, and views

Tables, views, and indexes are similar in PostgreSQL and DB2. In DB2, before creating any type of database object, you must first connect to the database you want to create it in. You can create these objects using command syntax or using wizards in the Control Center. Figure 10 shows the Create Table wizard. To launch the Create Table wizard, right-click the Tables folder under the target database and select the Create Table option.

Figure 10. The Create Table wizard
A screenshot showing the Create Table wizard
A screenshot showing the Create Table wizard

Although the syntax for creating objects is very similar, each database implements them differently. For example, in PostgreSQL, you can specify what type of index to create, such as a B-Tree, R-Tree, or Hash index. In DB2, you do not have this choice. However, DB2 automatically creates special types of indexes when needed, such as bitmap indices when executing certain types of queries, or block indexes, when working with multi-dimensional clustering (MDC) tables. DB2 also has special implementation of objects that PostgreSQL does not have, such as MDC tables and Materialized Query Tables.

Database application objects

The main purpose of this article is to describe the DB2 environment in comparison with the PostgreSQL environment. Database server logic, including stored procedures, triggers, and user-defined functions is a vast topic, and thus is only covered briefly.

PostgreSQL is extensible because its catalog tables can be modified by the user. Extension includes components such as types, user-defined operators, and user-defined functions.

PostgreSQL provides four kinds of functions:

  • Query language functions (functions written in pure SQL)
  • Procedural language functions (functions written in, for example, PL/pgSQL or PL/Tcl)
  • Internal functions (functions written in C that have been statically linked into the PostgreSQL server)
  • C-language functions (written in C, or a language that can be made compatible with C, and are compiled into dynamically loadable objects and loaded by the server on demand)

PostgreSQL also supports triggers. The trigger code is actually comprised of a trigger function, which is a special function containing logic to execute in the trigger. The trigger function must be created before the trigger itself. Trigger functions can be written in most of the available procedural languages, including PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python.

DB2 supports stored procedures, user-defined functions, and stored procedures. Stored procedures can be written in C/C++, Java, OLE, COBOL, a Microsoft CLR-supported language, or DB2's SQL Procedural Language (SQL PL). When you create an SQL stored procedure, its procedural statements are converted into a native representation that is stored in the database catalogs, as is done with other SQL statements. When the procedure is called, the native representation is loaded from the catalogs and executed in the DB2 engine.

DB2 offers the ability to create customized user-defined functions (UDFs) using DB2's SQL PL language. They can also be written in an external language such as C or Java. UDFs can encapsulate frequently used logic to reduce the complexity of application development. DB2's procedural language for UDFs is actually a subset of DB2's PSM procedural language. As such, certain SQL PL elements that are supported in SQL stored procedures are not supported in UDFs.

Finally, DB2 supports triggers for INSERT, UPDATE, and DELETE operations. DB2 also supports triggers defined on views, which are known as INSTEAD OF triggers. It is important to note that DB2's procedural language for triggers is actually a subset of the DB2 PSM procedural language. As such, certain SQL PL elements that are supported in SQL stored procedures are not supported in triggers.


Both PostgreSQL and DB2 have authentication and authorization mechanisms. Authentication refers to the process of validating a user's credentials. Authorization refers to the process of ensuring a user has the appropriate database privileges to perform a given action.

PostgreSQL offers a number of different client authentication methods. The method used to authenticate a particular client connection can be selected on the basis of the client's host address, database, and user. Client authentication is controlled by a configuration file, which traditionally is named pg_hba.conf and is stored in the database cluster's data directory. The general format of the pg_hba.conf file is a set of records, one per line. Each record specifies a connection type, a client IP address range (if relevant for the connection type), a database name, a user name, and the authentication method to be used for connections matching these parameters. The first record with a matching connection type, client address, requested database, and user name is used to perform authentication. If no record matches, access is denied. Listing 8 shows a sample snippet of the pg_hba.conf file.

Listing 8. Sample contents of the pg_hba.conf file
# Allow any user on the local system to connect to any database under
# any database user name using Unix-domain sockets (the default for 
# connections).
  local	all		all					trust

# The same using local loopback TCP/IP connections.
  host	all		all		trust

# Allow a user from host to connect to database
# "postgres" if the user's password is correctly supplied.
  host	postgres	all	md5

Other authentication and connection settings are maintained in the postgresql.conf file.

In DB2, application or database user accounts do not exist within the database, but are rather managed by an external security facility, such as the operating system. Within the database, privileges on specific database objects can be assigned to user or group accounts defined in the external facility.

By default, authentication takes place on the DB2 server. There are a number of other authentication methods supported by DB2, including authentication performed on the client (allowing "single sign-on" access), a Kerberos security facility, or a user-defined Generic Security Service (GSS) plug-in. Additional authentication options include the ability to encrypt user names and passwords as well as data, as they travel across the network between client and server. You can update the value of the DB2 instance parameter called AUTHENTICATION to change the authentication settings for the entire instance. All databases inside the instance use that value.

PostgreSQL manages database access permissions using the concept of roles. A role can be thought of as either a database user or a group of database users, depending on how the role is set up. Roles can own database objects (for example, tables) and can assign privileges on those objects to other roles. Roles are conceptually separate from operating system users and are global across a database cluster installation and not per individual database.

In DB2, since there is no concept of a database or application user, it is necessary to create any operating system users that your application uses to connect to the database and then grant those users specific database object privileges. To facilitate administration and management, you can also create groups of users and grant specific database privileges to those groups. Users belonging to those groups inherit the group's assigned privileges. Privileges can be granted and revoked using the standard GRANT and REVOKE statements. They can also be managed from the Control Center. A list of the different privileges that can be granted and revoked can be found in the DB2 documentation. Privileges are granted at the database level and only apply to a specific database, not all of the databases inside the instance. If you want the user to have the same privileges in multiple databases, you must explicitly grant those privileges in each of the databases.

DB2 UDB also provides a hierarchy of system-level administrative authorities (SYSADM, SYSCTRL, SYSMAINT, SYSMON). These authorities allow users to perform a subset of administrative tasks such as creating/dropping databases and forcing users off the system. Authorities are assigned at the instance level by specifying the name of a group in the associated instance level parameter (SYSADM_GROUP, SYSCTRL_GROUP, SYSMAINT_GROUP, SYSMON_GROUP). Users belonging to that group inherit that authority for all databases in the instance.

For additional information about user accounts and privileges/authorities, refer to the articles " Understand how user and group accounts interact with DB2 UDB" (developerWorks, 2005) and " Understand how authorities and privileges are implemented in DB2 UDB" (developerWorks, 2006).


Unlike traditional database systems that use locks for concurrency control, PostgreSQL maintains data consistency by using a multiversion model (Multiversion Concurrency Control, MVCC). This means that while querying a database, each transaction sees a snapshot of data as it was some time ago, regardless of the current state of the underlying data. This protects the transaction from viewing inconsistent data that could be caused by other concurrent transaction updates on the same data rows.

The SQL standard describes four transaction isolation levels and the corresponding concurrency problems that may occur in each one. In PostgreSQL, you can request any of the four standard isolation levels, but internally, there are only two distinct isolation levels, which correspond to the Read Committed and Serializable level. When you select the level Read Uncommitted you really get Read Committed, and when you select Repeatable Read you really get Serializable, so the actual isolation level may be stricter than what you select.

Read Committed is the default isolation level in PostgreSQL. When a transaction runs on this isolation level, a SELECT query sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transactions. However, the SELECT statement does see the effects of previous updates executed within its own transaction, even though they are not yet committed. In effect, a SELECT query sees a snapshot of the database as of the instant that that query begins to run. Two successive SELECT commands can see different data, even though they are within a single transaction, if other transactions commit changes during execution of the first SELECT.

DB2's concurrency model is radically different than PostgreSQL's. At any one time, there is only one copy or instance of a row. DB2 uses row and table locking to ensure data integrity and consistency. Whenever a row is requested for read or update, DB2 obtains a lock, except if the application requesting the row is using the Uncommitted Read isolation level, in which case no row locks are obtained for read operations. Update operations always require an exclusive lock. These locks are released when a transaction is committed or rolled back. The lock settings are configurable, including how much memory to devote to maintaining lock information (LOCKLIST database configuration parameter), how long an application should wait for a lock before timing out (LOCKTIMEOUT database configuration parameter), and the maximum percentage of lock memory one application can consume (MAXLOCKS database configuration parameter).

DB2 supports different levels of concurrency control to suit the needs of most applications. An application should use the level of isolation that provides the appropriate concurrency. The level of isolation is specified by each database connection, meaning different connections to the database can use different levels of isolation. In many cases, DB2's default setting of Cursor Stability is sufficient. DB2 also allows the isolation level to be specified at the SQL statement level, thereby overriding the database default isolation level. For example, suppose that an application needs to get a "rough" count of the number of rows in a table. Performance is of the utmost importance, and the Cursor Stability isolation level is required with the exception of this one SQL statement. You could use the following SQL statement to get this information:


Since it is acceptable to read dirty data in this case, choosing the Uncommitted Read isolation level for this statement will help improve performance because read locks are not obtained.

To increase the concurrency of the system, commit your transactions often, including read-only transactions. If possible, reschedule the applications that compete for access to the same table. Also, use the Uncommitted Read (UR) isolation level for transactions where read consistency is not an issue.

You may also want to enable three relatively new registry variables (DB2_EVALUNCOMMITTED, DB2_SKIPINSERTED, and DB2_SKIPDELETED) to help improve the concurrency of this system. These might especially be beneficial when converting from a MVCC system such as PostgreSQL. For more information about these registry variables, refer to the article "Lock avoidance in DB2 UDB V8" (developerWorks, 2005).

Backup and recovery

There are three approaches typically used to back up PostgreSQL databases: SQL dump, file system backup, and on-line backup. The SQL-dump method generates a text file with SQL commands that, when fed back to the server, recreate the database in the same state as it was at the time of the dump. The utility program called pg_dump is used for this purpose. To backup an entire database cluster, the pg_dumpall program is provided. An alternative backup strategy is to directly copy the files that PostgreSQL uses to store the data in the database using whatever method you prefer for doing file system backups. Finally, you can also combine a file-system-level backup with a copy of the write-ahead log files to restore a database to a particular point-in-time.

In DB2, each database must be backed up separately using the BACKUP command. This takes a binary copy of the database state at the time of the backup that can later be used to restore the database to the same state. DB2 supports three kinds of backups: full, incremental, and delta. Full backups are a complete snapshot of the current state of the database. Incremental backup allows you to back up only index and data pages that have changed since a previous backup. Delta backups are based on the data that has changed since the last incremental or delta backup. In addition, DB2 supports online and offline backups, giving you the flexibility to keep your system running 24x7.

To take a backup, use the BACKUP command. For example:

backup database sample to /dev1/backup

This command takes a full offline backup of the SAMPLE database and places the backup files in the /dev1/backup path.

Backups can also be taken using the Control Center. Right-click on the target database in the left pane object tree and select the Backup option. The Backup Wizard, shown in Figure 11, is displayed where you can customize the backup settings.

Figure 11. The Backup Wizard in Control Center
A screenshot showing the Backup Wizard
A screenshot showing the Backup Wizard

DB2 can also automate all of your backup needs if you choose. DB2 can keep track of when a backup should be taken and will then kick one off in a user-defined maintenance window if required.

For more information about the BACKUP utility, refer to the article "The IBM DB2 Universal Database for Linux, UNIX, and Windows Backup Utility" (developerWorks, 2005).

Backup images can be restored using the RESTORE command. For example:


Alternatively, you can restore a database from the Control Center using the Restore Database wizard. To launch this wizard, right-click on the All Databases folder, then select the Create Database option, and then the From Backup option.

A database restore operation requires an exclusive connection, and the RESTORE utility prevents other applications from accessing the database until the restore operation completes successfully. A table space restore operation, however, can be done online. Refer to the DB2 documentation for more information about the RESTORE command.

In addition to the BACKUP utility, DB2 has another utility called db2look that can be used to regenerate the database object definitions. They can can then be used to recreate the same objects in a different database. Note that this utility only generates object definitions (DDL), not the data. You can also access this utility from the Control Center. Right-click on an object (in other words, a table, database, and so on) and select the Generate DDL option. For more information about the db2look utility, refer to the DB2 documentation.

Database maintenance

PostgreSQL has a couple of utilities to help maintain the health and performance of the database. The VACUUM command must be run on a regular basis to recover disk space occupied by updated or deleted rows and to update data statistics used by the PostgreSQL query planner. Additionally, it is sometimes useful to rebuild indexes periodically with the REINDEX command.

In DB2, there are several utilities that can be used to maintain the health and performance of the database. The first, and most important, is the RUNSTATS utility. The RUNSTATS utility collects statistical information for tables and indexes, which is then used during query optimization to come up with an optimal query access plan. It is critical to update statistics on a regular basis as well as after loading or deleting a lot of rows in a table or adding a new index. The runstats command is used to collect statistics for a table. For example, the following runstats command collects detailed statistics for the EMPLOYEE table and all of its indexes:

runstats on table tedwas.employee with distribution indexes all

This utility is also available from the Control Center. Right-click on a table in the Tables folder under the database of interest and select the Run Statistics option. The Run Statistics dialog window appears, as shown in Figure 12, where you can define the options to use during statistics collection.

Figure 12. The runstats dialog window
A screenshot showing the runstats dialog window
A screenshot showing the runstats dialog window

While you can collect statistics anytime you like, DB2 can also automate this process for you. DB2 can keep track of when statistics should be updated and will update them in a user-defined maintenance window if required. To find out more about this capability and the runstats command in general, refer to the article " RUNSTATS in DB2 UDB, Version 8.2" (developerWorks, 2005).

Another important maintenance utility in DB2 is the REORG utility. The REORG utility helps arrange the data in tables and indexes more efficiently. For tables, it reconstructs the rows to eliminate fragmented data and compacts information. For indexes, it reorganizes all indexes defined on a table by rebuilding the index data into un-fragmented, physically contiguous pages. For example, the following command reorganizes the EMPLOYEE table:

REORG TABLE tedwas.employee

The REORG utility can also be accessed from the Control Center. Right-click on a table in the Tables folder under the target database and select the Reorganize option (or the Reorganize Indexes option to reorganize indexes). The Reorganize Table dialog window appears, as shown in Figure 13, where you can define the options to use for the reorganization.

Figure 13. The Reorganize Table dialog window
A screenshot showing the Reorganize Table dialog window
A screenshot showing the Reorganize Table dialog window

A special utility called REORGCHK can be used to determine if a table or index needs to be re-organized. For example, the output (not shown) from the following command can help determine if the EMPLOYEE table needs to be reorganized:



PostgreSQL's statistics collector is a subsystem that supports collection and reporting of information about server activity. Several predefined views (e.g. pg_stat_activity, pg_stat_database, etc.) are available to query and display the results of statistics collection. Alternatively, you can build custom views using the underlying statistics functions.

In DB2, monitoring can be performed at various levels. You can use the snapshot monitor to capture information about the database and any connected applications at a specific time. Snapshots are useful for determining the status of a database system at a point-in-time. Taken at regular intervals, they are also useful for observing trends and foreseeing potential problems. Before taking snapshots, you might want to set a few switches to capture more detailed information if the basic information that is captured is not enough. Refer to the DB2 documentation for setting these switches. When you are ready to take the snapshot, you can issue the GET SNAPSHOT command. Listing 9 shows partial output of the GET SNAPSHOT FOR ALL DATABASES command.

Listing 9. Output from the GET SNAPSHOT FOR ALL DATABASES command
Database Snapshot

Database name                              = SAMPLE
Database path                              = /home/db2inst1/db2inst1/NODE0000/SQL00001/
Input database alias                       =
Database status                            = Active
Catalog database partition number          = 0
Catalog network node name                  =
Operating system running at database server= LINUX
Location of the database                   = Local
First database connect timestamp           = 03/09/2006 11:03:58.417363
Last reset timestamp                       =
Last backup timestamp                      =
Snapshot timestamp                         = 03/09/2006 11:41:55.957502

Number of automatic storage paths          = 0

High water mark for connections            = 5
Application connects                       = 47
Secondary connects total                   = 1
Applications connected currently           = 2
Appls. executing in db manager currently   = 0
Agents associated with applications        = 2
Maximum agents associated with applications= 5
Maximum coordinating agents                = 5

Locks held currently                       = 0
Lock waits                                 = 0
Time database waited on locks (ms)         = Not Collected
Lock list memory in use (Bytes)            = 1920
Deadlocks detected                         = 0
Lock escalations                           = 0
Exclusive lock escalations                 = 0
Agents currently waiting on locks          = 0
Lock Timeouts                              = 0
Number of indoubt transactions             = 0

Total Private Sort heap allocated          = 0
Total Shared Sort heap allocated           = 0
Shared Sort heap high water mark           = 0
Total sorts                                = 20
Total sort time (ms)                       = Not Collected
Sort overflows                             = 1
Active sorts                               = 0

You can also retrieve snapshot information using SQL, using the corresponding SNAPSHOT_xxxx() table functions. A list of these functions can be found in the DB2 documentation.

Event monitors are used to collect information about the database and any connected applications when specific events occur. These events represent transitions in database activity such as a deadlock situation or when an application connects. This type of information would be hard to capture using the snapshot monitor as you would have to take a snapshot during the exact window the event occurred. You can define an event monitor by the type of event or events you want it to monitor. For example, a deadlock event monitor waits for a deadlock to occur; when one does, it collects information about the applications involved and the locks in contention.

You can create an event monitor using the command line syntax or the Control Center. For example, the following statements create and activate an event monitor for application connections and writes its output to a file in a path called /tmp/conevents:


When the monitoring period has finished, the event monitor buffer can be flushed and the monitor can be stopped using the following statements:


The raw event monitor data can then be formatted using the db2evmon utility:

db2evmon -db sample -evm conmon

Listing 10 shows a snippet of the formatted event monitor output.

Listing 10. A snippet of formatted event monitor output for connection events
                            EVENT LOG HEADER
  Event Monitor name: CONMON
  Server Product ID: SQL08024
  Version of event monitor data: 7
  Byte order: LITTLE ENDIAN
  Number of nodes in db2 instance: 1
  Codepage of database: 1208
  Territory code of database: 1
  Server instance name: db2inst1

  Database Name: SAMPLE  
  Database Path: /home/db2inst1/db2inst1/NODE0000/SQL00001/
  First connection timestamp: 03/09/2006 12:00:22.473810
  Event Monitor Start time:   03/09/2006 12:02:39.125790

3) Connection Header Event ...
  Appl Handle: 9
  Appl Id: *LOCAL.db2inst1.060309170022
  Appl Seq number: 0007
  DRDA AS Correlation Token: *LOCAL.db2inst1.060309170022
  Program Name    : db2bp
  Authorization Id: DB2INST1
  Execution Id    : db2inst1
  Codepage Id: 1208
  Territory code: 1
  Client Process Id: 16627
  Client Database Alias: SAMPLE
  Client Product Id: SQL08024
  Client Platform: Unknown
  Client Communication Protocol: Local
  Client Network Name: localhost.localdomai
  Connect timestamp: 03/09/2006 12:00:22.473810

4) Connection Header Event ...
  Appl Handle: 11
  Appl Id: *LOCAL.db2inst1.060309170318
  Appl Seq number: 0001
  DRDA AS Correlation Token: *LOCAL.db2inst1.060309170318
  Program Name    : DB2HMON
  Authorization Id: DB2INST1
  Execution Id    : db2inst1
  Codepage Id: 1208
  Territory code: 0
  Client Process Id: 16323
  Client Database Alias: SAMPLE
  Client Product Id: SQL08024
  Client Platform: Unknown
  Client Communication Protocol: Local
  Client Network Name: localhost.localdomain
  Connect timestamp: 03/09/2006 12:03:17.225823

To create an event monitor from Control Center, right-click the Event Monitors folder under the database of interest and select the Create option. The Create Event Monitor dialog window, shown in Figure 14, is displayed where you can configure the event monitor's properties.

Figure 14. Creating an event monitor in Control Center
A screenshot showing the Create Event Monitor dialog window
A screenshot showing the Create Event Monitor dialog window

To start or stop the monitor, right-click on the new monitor created in the Event Monitors folder, and select the Start Event Monitoring or Stop Event Monitoring options, respectively. To view the monitor output, right-click again on the new monitor you created and choose the Analyze Event Monitor Records option. Figure 15 shows the Event Analyzer dialog window.

Figure 15. Viewing event monitor analysis using the Event Analyzer
A screenshot showing the Event Analyzer dialog window
A screenshot showing the Event Analyzer dialog window

Event monitors can also be configured to log their output into a pipe or tables.

Command/utility mapping and other tools

PostgreSQL contains several utilities and commands that are used to perform maintenance on databases. Table 2 summarizes them and provides the corresponding utilities or commands in DB2, if they exist.

Table 2. A mapping of PostgreSQL's commands and utilities to DB2
clusterdbA utility for re-clustering tables in a database. Re-clusters them again on the same index that was last used.REORG command. This functionality is also available through the Control Center.
createdbA wrapper around the CREATE DATABASE command.CREATE DATABASE command. This functionality is also available through the Control Center.
pg_configPrints configuration parameters of the currently installed version of PostgreSQLdb2level utilty.
pg_dumpA utility for backing up a PostgreSQL database. Dumps can be output in script or archive file formats.Either the BACKUP or db2look commands. A DB2 backup is a binary dump of the database which can only be used with the RESTORE command. The db2look command is used to dump the structure (DDL) with the option of including data too. This functionality is also available through the Control Center.
pg_dumpallA utility for writing out ("dumping") all PostgreSQL databases of a cluster into one script file. The script file contains SQL commands that can be used as input to psql to restore the databases.No equivalent in DB2. Each database in an instance must be backed up separately using the BACKUP utility, or have its structure extracted using the db2look utility.
pg_restoreA utility for restoring a PostgreSQL database from an archive created by pg_dump in one of the non-plain-text formats.RESTORE command if a backup was taken using the BACKUP command. If the database structure and data was dumped using the db2look command, the file can be given as input to the Command Line Processor to re-create all the database objects and data.
psqlA terminal-based front-end to PostgreSQL. Enables queries to be issued interactively and displays the query results. Alternatively, input can be from a file.db2 command. Entering the db2 command at an operating system prompt loads the DB2 Command Line Processor in interactive mode.
reindexdbA wrapper around the REINDEX command.REORG utility. This functionality is also available through the Control Center.
vacuumdbA wrapper around the VACUUM command.REORG command to re-organize tables and indexes. RUNSTATS command to collect statistics for tables and indexes. This functionality is also available through the Control Center.
initdbCreates a new PostgreSQL database cluster. A database cluster is a collection of databases that are managed by a single server instance.db2icrt command. This command creates a new DB2 instance. Databases can then be created in the instance.
pg_controldataPrints information initialized during initdb, such as the catalog version and server locale.
  • GET DBM CFG command displays information about an instance.
  • GET DB CFG FOR <db_name> command displays information about a database.
This functionality is also available through the Control Center.
pg_ctlA utility for starting, stopping, or restarting the PostgreSQL backend server (postmaster), or displaying the status of a running server.
  • db2start command starts a DB2 instance.
  • db2stop command stops a DB2 instance.
This functionality is also available through the Control Center.
pg_resetxlogClears the write-ahead log (WAL) and optionally resets some other control information stored in the pg_control file. This function is sometimes needed if these files have become corrupted.RECOVER DATABASE command. This command can be used to bring a database back to a consistent state. The RESTORE command can also be used to restore a database or table space from a backup image, or restore the database's history file.
postgresThe actual PostgreSQL server process that processes queries. It is normally not called directly; instead a postmaster multi-user server is started.db2sysc is the main DB2 instance process. It is started when the db2start command is issued and stopped when the db2stop command is issued.
postmasterThe PostgreSQL multi-user database server. In order for a client application to access a database it connects (over a network or locally) to a running postmaster. The postmaster then starts a separate server process (postgres) to handle the connection. The postmaster also manages the communication among server processes.db2sysc is the main DB2 instance process. It is started when the db2start command is issued and stopped when the db2stop command is issued. The db2sysc process spawns other processes as needed.

DB2 also comes with many other tools, which were not discussed in this article. The Command Editor, shown in Figure 16, is a graphical version of the Command Line Processor for those who like to work in graphical environments.

Figure 16. The Command Editor
A screenshot showing the Command Editor
A screenshot showing the Command Editor

The Configuration Assistant, shown in Figure 17, can help you configure client connections to your remote databases by automatically detecting them over the network and configuring them for you. You can also manually configure connections using this tool.

Figure 17. The Configuration Assistant
A screenshot showing the Configuration Assistant
A screenshot showing the Configuration Assistant

The Task Center, shown in Figure 18, is used to schedule tasks, such as taking backups or running maintenance scripts.

Figure 18. The Task Center
A screenshot showing the Task Center
A screenshot showing the Task Center

Finally, the Development Center, shown in Figure 19 is a complete integrated development environment for building SQL and Java stored procedures and user-defined functions.

Figure 19. The Development Center
A screenshot showing the Development Center
A screenshot showing the Development Center

On Windows, all of these tools can be launched by navigating to them through the Windows Start menu. They can also be launched by clicking the corresponding toolbar button in Control Center, or by selecting them from the Control Center's Tools menu. We highly recommend exploring these tools to help you increase your productivity and reduce your learning curve.


In this article, we reviewed the main architectural and environmental differences between PostgreSQL and DB2. This included a review of important DB2 concepts such as instances, databases, buffer pools, and table spaces, as well as important tools including the Control Center, Command Line Processor, and various wizards.

In addition to this article, there are a variety of other resources available to help you. A new interactive forum has been introduced to help answer your questions about DB2 Express-C. It is staffed by DB2 experts from IBM and is becoming a very strong community for developers and administrators. Finally, the DB2 Migration Station also contains a variety of resources to help you make the switch to DB2.

Downloadable resources

Related topics


Sign in or register to add and subscribe to comments.

Zone=Information Management
ArticleTitle=Leverage your PostgreSQL V8.1 skills to learn DB2, Version 8.2