DB2 10.5 for Linux, UNIX, and Windows

Exercise 3: Using thresholds to control rogue activities and using the threshold violation monitor

This exercise demonstrates how you can use thresholds to detect resource misuse or the beginning of system overload by establishing limits over the consumption of a specific resource.

Estimated time: 15-20 minutes

If a threshold is violated, a specified action can be triggered. The supported actions are:

Regardless of whether an activity that violates a threshold is stopped or permitted to continue running, a record of the violation is written to an active THRESHOLD VIOLATIONS event monitor (assuming one is defined in advance) each time a threshold is violated. The record contains information such as which threshold was violated, the time of the violation, and the threshold action.

This exercise demonstrates how thresholds can be used to detect or prevent rogue activities from running on your system and using up system resources. A rogue activity is any activity that uses an unexpectedly high amount of resources. For example, a query that runs for an abnormally long time, or returns an unexpectedly large result set.

Step 1: Create a threshold violation event monitor

Create and enable a write-to-table event monitor that will be used to capture the threshold violation information and enable the activity event monitor that was created in Exercise 1.

CREATE EVENT MONITOR threvio FOR THRESHOLD VIOLATIONS WRITE TO TABLE
          THRESHOLDVIOLATIONS(IN userspace1),
          CONTROL(IN userspace1)

SET EVENT MONITOR threvio STATE 1

SET EVENT MONITOR db2activities STATE 1

Step 2: Create a workload

Create a workload such that all activities run from the workth.db2 script will get mapped to the work1_sc service class.

The work1_sc service class already exists since it was created in Exercise 2.

CREATE WORKLOAD workth_wl
          CURRENT CLIENT_APPLNAME('CLP workth.db2')
          SERVICE CLASS work1_sc

Step 3: Create thresholds

Create two thresholds, one of which (th_estcost) is an ESTIMATEDSQLCOST threshold and another (th_sqlrows) is a SQLROWSRETURNED threshold and apply them to the service class you wish to control the activities for (in this case, work1_sc service class).

The th_estcost threshold specifies an upper bound (10000 timerons) for the optimizer-estimated cost (in timerons) for an activity running in the work1_sc service class. If any query with an estimated cost greater than 10000 timerons, tries to execute in the work1_sc service class, this threshold is violated and the query is not permitted to run.

The th_sqlrows threshold specifies that any activity running in the work1_sc service class can return at most 30 rows from the data server. If any query tries to return more than 30 rows, this threshold is violated, only 30 rows will be returned to the client and the query will be stopped. In addition, data about the activity that caused the threshold violation will be collected.

In either case, when an activity violates the threshold, a threshold violation record is written to the THRESHOLD VIOLATIONS event monitor as defined in step 1 and the execution of the activity is stopped (because of the STOP EXECUTION action). The application that submitted the activity will receive an SQL4712N error.

CREATE THRESHOLD th_estcost
    FOR SERVICE CLASS work1_sc ACTIVITIES
    ENFORCEMENT DATABASE
    WHEN ESTIMATEDSQLCOST > 10000
    STOP EXECUTION

CREATE THRESHOLD th_sqlrows
    FOR SERVICE CLASS work1_sc ACTIVITIES
    ENFORCEMENT DATABASE
    WHEN SQLROWSRETURNED > 30
    COLLECT ACTIVITY DATA WITH DETAILS AND VALUES
    STOP EXECUTION  

Additional information: A threshold can be either predictive or reactive:

Step 4: Run some activities

Run some activities, some of which violate the threshold upper bounds defined in the previous step.

db2 -o -tvf workth.db2

Note that the statements which violate the thresholds defined previously, fail with an error of SQL4712N/SQLSTATE 5U026.

Step 5: View the threshold violation event monitor

Information about every threshold violation is collected by the THRESHOLD VIOLATIONS event monitor. You can query the threshold violation information by issuing regular SQL statements against the threshold violation monitor table as shown in the following example.

CONNECT TO SAMPLE
SELECT  APPL_ID,
       UOW_ID,
       ACTIVITY_ID,
       COORD_PARTITION_NUM AS COORDPART,
       THRESHOLD_PREDICATE,
       THRESHOLD_ACTION,
       TIME_OF_VIOLATION
FROM THRESHOLDVIOLATIONS_THREVIO
ORDER BY THRESHOLD_ACTION, THRESHOLD_PREDICATE, TIME_OF_VIOLATION

The output will look something such as the following:

APPL_ID                                                          UOW_ID     
ACTIVITY_ID          COORDPART   THRESHOLD_PREDICATE
                                        THRESHOLD_ACTION TIME_OF_VIOLATION
--------------------------------------------------------------------------- ----
---------------- ----------- -------------------------
------------------------------------------------------- ------------------------
--
*LOCAL.DB2.070821150008                                                  11
1           0 EstimatedSQLCost
                                        Stop            2007-08-21-
11.00.11.000000
*LOCAL.DB2.070821150008                                                  10
1           0 SQLRowsReturned
                                       Stop             2007-08-21-
11.00.10.000000

  2 record(s) selected.

Step 6: View information for the activity that violated the threshold

Activity information is collected for any activity that violates a threshold that is defined with a COLLECT clause. Show the detailed information about the activities that violated a threshold using the following query:

SELECT VARCHAR(A.APPL_NAME, 15) as APPL_NAME,
       VARCHAR(A.TPMON_CLIENT_APP, 20) AS CLIENT_APP_NAME,
       A.ACTIVITY_ID,
       A.ACTIVITY_TYPE,
       A.WORKLOAD_ID,
       T.THRESHOLD_PREDICATE,           
       A.QUERY_CARD_ESTIMATE,
       T.THRESHOLD_MAXVALUE,
       T.TIME_OF_VIOLATION,
       VARCHAR(AS.STMT_TEXT, 100) AS STMT_TEXT
FROM  THRESHOLDVIOLATIONS_THREVIO AS T,
       ACTIVITY_DB2ACTIVITIES AS A,
       ACTIVITYSTMT_DB2ACTIVITIES AS AS       
WHERE T.APPL_ID = A.APPL_ID AND
       T.UOW_ID = A.UOW_ID AND
            T.ACTIVITY_ID = A.ACTIVITY_ID AND
            A.APPL_ID = AS.APPL_ID AND
            A.ACTIVITY_ID = AS.ACTIVITY_ID AND
            A.UOW_ID = AS.UOW_ID

The output will look something such as the following:

APPL_NAME       CLIENT_APP_NAME      ACTIVITY_ID          ACTIVITY_TYPE
                                           WORKLOAD_ID THRESHOLD_PREDICATE
                                        QUERY_CARD_ESTIMATE  THRESHOLD_MAXVALUE
  TIME_OF_VIOLATION          STMT_TEXT

--------------- -------------------- -------------------- ----------------------
------------------------------------------ ----------- -------------------------
--------------------------------------- -------------------- -------------------
- -------------------------- ---------------------------------------------------
-------------------------------------------------
db2bp           CLP workth.db2                          3 READ_DML
                                                     3 SQLRowsReturned
                                                          41                   3
0 2007-08-31-09.01.16.000000 SELECT * FROM SALES

Note that the activity that violated the th_estcost (EstimatedSqlCost) threshold is not shown. The reason is that the threshold did not specify the COLLECT ACTIVITY DATA clause, so that no activity data was collected for that activity.

Step 7: Reset for next exercise

Disable the event monitors that were enabled. Also disable and drop the th_estcost and th_sqlrows thresholds that were created.

SET EVENT MONITOR threvio STATE 0
SET EVENT MONITOR db2activities STATE 0

ALTER THRESHOLD th_estcost DISABLE
DROP THRESHOLD th_estcost

ALTER THRESHOLD th_sqlrows DISABLE
DROP THRESHOLD th_sqlrows

Also clean up the activities event monitor tables and the threshold violation table

DELETE from ACTIVITY_DB2ACTIVITIES
DELETE from ACTIVITYSTMT_DB2ACTIVITIES
DELETE from THRESHOLDVIOLATIONS_THREVIO

CALL WLM_COLLECT_STATS()