Introduction to DB2 databases and database objects
Before showing you how to create, connect to, and alter objects in a database, you should have a general understanding of the organization and terminology DB2 uses. Similar to file systems that use a hierarchy of folders to organize files, DB2 uses a hierarchy of objects to organize data.
At the top of this hierarchy is the server or computer on which DB2 is loaded. Below this is the database instance level. An instance controls the databases under it and manages system resources. There must be at least one instance on the server, but there can be more. Each instance acts as its own, independent DB2 installation and controls access to the databases in it.
The database is our third level in the progression of database objects, and it is on this level you will start learning basic commands and the creation process. A database is a collection of data. DB2, being a relational database, contains tables that have a relationship to one another and use these relationships to organize data while keeping duplication of that data to a minimum.
All of the objects that follow can be considered part of the fourth level in our database hierarchy, as they are all parts of a DB2 database.
The syntax for
CREATE statements of the database objects is not covered
in this tutorial, but you should become familiar with the syntax for
the exam. More detail will be given about many of these objects
(for example: tables, indices, views) in other tutorials, and the full
statements are fully detailed in the DB2 10.1 Information Center. Direct links
to each object's
CREATE statement are in
the Resources section of this tutorial. Also, by going through
the examples in "Creating, Altering, and Dropping
Objects in Data Studio," you can generate your own
CREATE statements through a hands-on process.
A tablespace is a storage structure used by DB2 to group data and database objects so they may be stored on the system. A tablespace is used as a layer between the database and the container objects that hold the actual data. A tablespace can contain more than one table. When a database is first created, DB2 automatically creates the SYSCATSPACE, USERSPACE1, and TEMPSPACE1 tablespaces. When a table space is created, it must be associated with a buffer pool.
As mentioned, every tablespace in DB2 must be assigned a buffer pool. A buffer pool caches table and index information read from the disk. When a tablespace is created, it must be associated with a buffer pool, and the page size of the buffer pool must match that of the tablespace. When a database is created, DB2 creates the IBMDEFAULTBP buffer pool, and all the default table spaces use it.
A schema is a logical grouping of database objects. It also is used for
name qualifiers, which help naturally organize database objects by
their name. The convention for this is
<SCHEMA-NAME>.<OBJECT-NAME>. This also allows more than on
object to have the same object name as long as each is in a different
For example, if you sold hardware and software, you could have a HARDWARE and SOFTWARE schema. In each of them, you could have an ORDERS table to differentiate between hardware and software orders (HARDWARE.ORDERS and SOFTWARE.ORDERS).
If a schema is not qualified, or explicitly stated when the object is created, the schema name is the user ID of the user who created the object. If I am logged in as db2user and create a table named TEMP1, the full name of that table would be DB2USER.TEMP1.
Tables are where your data is stored, and because of this, they are the heart of DB2. A table consists of rows and columns. Each row is called a record, and the data at intersection of a row and a column is called a value. As the data may need to be described, access, and organized in different ways, there are many different types of tables. Tables will be covered in more detail in other tutorials in this series, but the types of tables are:
- Base tables — These tables hold persistent data, or data you want
to keep throughout multiple connections/disconnections from the
database. This group includes:
- Regular tables — The most frequently used type of table; considered to be general purpose.
- Clustered tables — While not covered in detail, these
tables have their data clustered together to provide
access benefits, primarily in very large database
environments. The types of clustered tables are:
- Insert Time Clustering (ITC) Tables
- Range Clustering Tables (RCT)
- Multidimensional Clustering (MDC) Tables
- Temporal tables — A new feature of DB2 10.1:
- System-period temporal tables — This type of temporal table uses an associated history table to keep historical versions of each row of the main table updated or deleted. The two tables use SYSTEM_TIME columns to determine when the row is current. Then a row is changed, the original row and the updated SYSTEM_TIMEs are inserted into the history table. The database generates both these columns and maintains the history table.
- Application-period temporal tables — These store the stores data based on the time that the data is valid, using two BUSINESS_TIME period columns. The range provided by these columns indicates when the row is valid. Since there isn't a history table, all event periods, past, present, and future, are denoted in separate records of the same table, effective dates and their associated data are maintained in a single table.
- Bitemporal tables — This type of temporal table combines the aspects of both application- and system-period temporal tables: the history table and the valid time period.
- Temporary tables — As the name implies, these are used for
- Declared — These are only good for one database session. When your database connection is terminated, the table and all of the data in it is dropped.
- Created — Unlike Declared Temporary Tables, these do remain after each session, although any data in it does not.
- Materialized query tables — These are created by a
SELECTstatement and are used to improve the performance of queries.
In most cases, there is no guarantee that the data in your tables is in any order because when data is added to a table, it is appended to the end of the table. This means that in most cases, every row in the table would need to be scanned to see if it contained the required information. When tables become very large, this can dramatically affect the amount of time it takes to access your data.
Like the index of a book organizes the topics in the book and points you to a page to assist you in finding the information that you want faster, an index in DB2 improves the database's performance in retrieving data. While a book's index organizes data in a book alphabetically by topic, as the DBA, you get to decide what DB2 uses to organize data. Primarily, you would choose one or multiple columns, but indices can even be used to point to XML data or blocks of data.
You can always see what indices are enabled on your database by finding the metadata is in the system catalog. Indices will be covered in more detail in later tutorials in this series.
A view provides an efficient way of examining of examining data in any combination of one or more tables and/or one or more views. Views can be used like tables and have columns and rows, but they do not require any permanent storage.
A view uses a
SELECT SQL statement that combines these sources (more in
later tutorials). Since users of databases tend to have to draw data
from multiple tables on a constant basis, a view is an efficient and
simple method of data retrieval. Also, some in tables may contain
sensitive data — employee salaries, for example. By creating a view
SELECT statement that excludes that column, the data can remain
secure. Views will be covered in more detail in later tutorials in
A trigger contains an action or set of actions executed when an insert, update, or delete operation occurs on a specified table, also called a triggering event. There are three primary types of triggers:
- Before — The trigger's actions occur just before the triggering event takes place.
- After — The trigger's actions occur immediately after the triggering event takes place.
- Instead of — Used to define insert, update, or delete actions performed against complex views.
Sequences are used to automatically generate numeric values
independently from all other database objects. They are frequently
used to generate unique values. Consecutive values are retrieved using
NEXT VALUE or
PREVIOUS VALUE expressions. If many numbers will
need to be generated in rapid succession, they can be cached to help
expedite I/O operations.
Like the name sounds, an alias is another name the database relates to a sequence, table, view, or another alias. They are normally used to reference an object without having to specify the schema name, but are useful for a shorter, simple, or colloquial term for the object.