Performance metrics have been available for a long time through snapshots. The snapshots deliver a lot of information but are not well suited for collecting historical data. If you want to collect historical data, you need to throw away a lot of useless data because the snapshot captures more than you need. Furthermore, it takes a lot of scripting to extract just the data from the snapshot output and to store it.
Starting with DB2 9.7, a new monitoring framework has been introduced. This new framework works at the database level, and the SQL-based interface lets you retrieve exactly what you need from DB2. All that you need is some basic knowledge of SQL and out-of-the-box DB2 functions to get started.
The following steps show how to successfully build a performance database. The examples in this document have been created using DB2 V10.1 for Linux, UNIX, and Windows. Because the monitoring framework became available with DB2 9.7, you can also use that version of DB2. The monitoring framework is available in all editions of DB2.
Collecting performance data and storing it
To collect and store performance data:
- Determine what data you want to capture and in what time interval you want to have the data in your performance database.
- Set up a database to collect the data in.
- Set up a mechanism to collect and transport the performance data into your database.
Determining what data you want to capture and the time interval
There are a number of ways to collect DB2 performance data using the new monitoring framework. One way is to use the views that are located in the schema SYSIBMADM. These views are ready to use with SQL and are constructed by combining the results of one or more table functions. You can use the following query on the database, and you will be capturing data to determine which views are available.
Listing 1. List tables command
list tables for schema SYSIBMADM
Figure 1 shows the output of the list tables command, the name of the table/view, the schema name, type (T for Table and V for View), and the creation time.
Figure 1. List tables command
Another option that you have is to use the table functions that come with DB2 V9.7. These table functions can also be queried by SQL but require input parameters that the views mentioned above do not require. Depending on what your focus is, you can use different functions. The main categories are:
- Monitoring functions that are oriented on various database objects and components of DB2. Their names all start with mon_get.
- WorkLoad Management(WLM)-oriented table functions. These table functions give you an insight into how DB2 performs from a WLM perspective. The available functions all start with wlm_get.
- Functions that start with admin_get deliver information that can be useful in a more database administration-like setting (for instance, if you want to monitor table sizes or memory consumption).
To get a complete list of all available table functions and administrative views, refer to the DB2 Version 9.7 for Linux, UNIX, and Windows Information Center (see Resources).
To get the table functions most commonly used, use the query that is shown in Listing 2.
Listing 2. Query available tablefunctions
select funcname from syscat.functions where funcname like 'MON_GET%' \n or funcname like 'WLM_GET%' or funcname like 'ADMIN_GET%'
An example showing the available tablefunctions is shown in Figure 2.
Figure 2. Query available tablefunctions
The DB2 Information Center provides a good understanding of which table functions return which data (see Resources).
In this example, you can monitor the number of table scans on all tables in the
database. You can use the
MON_GET_TABLE table function because the
returns a column that shows you the number of scans on each table in the
To get a clear picture of the number of scans over a working day, you can collect the desired data in a 10-minute interval using the query that is shown in Listing 3.
Listing 3. Example query
SELECT current_timestamp, tabschema, tabname, table_scans FROM TABLE(MON_GET_TABLE('','',-2)) AS t
If you look at the table function
MON_GET_TABLE, you see that it has
three input parameters. The first one is for a table schema, the second is for a
table name, and the third parameter lets you specify for which member of a
DPF (database partition feature) database you want to collect data from.
In this example, you can capture all table scans from all tables in all schemas
from all database members. This is why the first two parameters are '', meaning NULL
and the last one is
-2, which means all members.
In a real-world situation, you want to limit the number of tables to monitor because not everything is interesting to you and you will collect more data than you need.
Determining the interval of data collection
The interval of the data collection is important and is worth spending some time on. It is the primary factor that lets you determine a specific item or effect you want to see.
You need to know your goal for the data collection. If you are troubleshooting or trying to determine where the cause of a problem lies, you need data with a small time interval. If you are collecting long-term trend data, you do not need to capture a lot of detailed information for a short time period.
Another factor to keep in mind is the amount of space you want to reserve for your performance database. If you collect data in short intervals, you can expect to have some large tables in a very short time, taking up disk space.
Setting up your performance database
You have several options when setting up your performance database. The most important question to be answered is about the location of your performance data.
You can choose to co-host your performance database on the same system from which you are collecting your data. However, a better option is to have a separate database, preferably on another system, to store your data in. The latter choice has a number of benefits.
- You are not disturbing the workload that you are monitoring by generating load or I/O from inserting your data in your tables.
- Your possible data mining activities are not interfering with the source system.
- You are free to change parameters or settings on your performance database because you are the primary and sole user of it.
The first step in setting up your database is creating the performance database in the required instance and connecting to it. You can use the default settings for the database PERFDB, shown in Listing 4.
Listing 4. Create and connect to your database
create db perfdb connect to perfdb
Figure 3 shows an example of the create database statement and the subsequent connect statement.
Figure 3. Create and connect to your database
The next step is creating the table TABLESCANS that contains the captured data. DB2
creates the table for you based on the query while you are using the same data
types. After the table creation, you can use the
describe table command to verify the
definition of the table.
Listing 5. Query to create a table for storing your data
create table tablescans as describe table tablescans (SELECT current_timestamp as current_timestamp, tabschema, tabname, table_scans FROM TABLE(MON_GET_TABLE('','',-2)) AS t) with no data
Figure 4 (see larger image) shows an example of the create table statement.
Figure 4. Create the table for storing your data
Setting up the collection and transport mechanism to capture the desired data
If your target tables reside in the same database as your source system, you can use
INSERT statements to insert the data into your tables. If your target tables are in
a separate database, you can use the load utility to cross-load the data from your
source system into your target system.
To create historical data, you need a time component in your data that you can use
to track your data over time and to summarize or aggregate the data over the desired
time periods. If you are using the administrative views, the current timestamp is
already included in the view definition. If you use the table functions,
you must manually include the
current_timestamp special register in your query.
After you have created a separate database that is called PERFDB that will contain the performance data, you must construct a load statement that actually issues the query on the source system and sends the data back to the LOAD utility. This activity can be broken down into four steps:
- Connect to the target database.
- Declare the cursor with the select statement.
- Invoke the load utility to actually load the data into the target table.
- Reset the connection.
The query in Listing 6 shows you how to create the load script that declares the cursor, selects the appropriate data, and, subsequently, loads the data into the desired table
Listing 6. Query to create the load script that declares the cursor
CONNECT to PERFDB; DECLARE C1 CURSOR database sample user <userid> using <password> FOR SELECT current_timestamp, tabschema, tabname, table_scans FROM TABLE(MON_GET_TABLE('','',-2)) AS t; LOAD FROM "C1" of cursor MESSAGES "C:\temp\msg.txt" TEMPFILES PATH "C:\temp" INSERT INTO TABLESCANS (current_timestamp, tabschema, tabname, table_scans) NONRECOVERABLE INDEXING MODE AUTOSELECT; CONNECT RESET;
All four steps need to be performed in sequence, so it makes sense to put them together in a file and have DB2 process that file on a regular basis using a scheduling facility.
In this case, a file called loadcursor.sql was created and it contains the statements from the four steps. The DB2 command-line processor has been used to execute the statements from a DOS prompt.
Listing 7. Run the created load script
db2 -tvf loadcursor.sql
The output shows the load utility statement that has been executed, including some statistics about how many rows have been loaded (see larger image).
Figure 5. Run the created loadscript
The previous examples have been developed on the Windows platform, but you can use them as easily on the Linux and UNIX platforms.
There are a few things to keep in mind.
- Most measures in the monitoring framework are cumulative. If you want to know the delta values between two moments of measurement, you need to subtract the measures from each other.
- The "old" monitor switches on the instance level have no influence on the data collection of the "new" framework, which is on the database level.
- In this article, only a simple example has been shown. Nothing stands in your way of combining multiple table functions into one query and extracting that data.
If you are planning to collect a lot of data to do performance management on your DB2 system, it is probably more cost efficient and much easier to use Optim Performance Manager (OPM). OPM is included with DB2 Advanced Enterprise Server Edition (AESE) or sold as a separate product. OPM is a comprehensive and proactive solution for real-time monitoring and performance warehousing for DB2 and InfoSphere Warehouse for Linux, UNIX, and Windows. More information can be found in Resources.
You can use the captured data with other products such as Optim Query Workload Tuner, Optim Query Tuner, or Cognos BI to do tuning on your queries, tune entire workloads, and create reports on your captured data. For instance, you can capture data showing you the top 10 executed queries in your environment for a certain period of time to have them as input for tuning activities in Optim Query Workload Tuner.
In this article, you have constructed a query that retrieves DB2 performance data using table functions. The retrieved data is then loaded into a separate database that can be used for analysis purposes or as a source for other performance management or reporting tools.
- More general information on DB2 can be found in the DB2 Information Center.
- To get a complete list of all available table functions and administrative views, you can have a look at them in the DB2 Version 9.7 for Linux, UNIX, and Windows Information Center.
- More information about the table functions can be found in the table functions section of the Information Center.
- More information on moving and copying data using load can be found in the Information Center section on Moving data using the CURSOR file type.
- There is an excellent "chat with the lab" replay available introducing the monitoring framework. You can find it at ChannelDB2.
- Information regarding Optim Performance Manager can be found at the InfoSphere Optim Performance Manager website.
- Information regarding Optim Query Tuner can be found at the Optim Query Tuner website.
- Information regarding Optim Query Workload Tuner can be found at the Optim Query Workload Tuner for DB2 for Linux, UNIX, and Windows page.
- Information about Cognos BI can be found at the Cognos software site.
- Follow developerWorks on Twitter.
- Watch developerWorks on-demand demos ranging from product installation and setup demos for beginners, to advanced functionality for experienced developers.
- Visit the developerWorks resource page for DB2 for Linux, UNIX, and Windows to read articles and tutorials and connect to other resources to expand your DB2 skills.
Get products and technologies
- Download a free trial version of DB2 for Linux, UNIX, and Windows.
- Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edition and provides a solid base to build and deploy applications.
- Get involved in the My developerWorks community. Connect with other developerWorks users while exploring the developer-driven blogs, forums, groups, and wikis.