How to monitor performance and analyze wait events

When you use Oracle databases on Unix

High performance of web applications depends on very efficient databases that can process requests and fetch query results quickly. This article shows how to analyze Oracle wait events to improve performance. It also shows how to use operating system utilities to effectively gather performance statistics of an Oracle database and analyze them to identify the issues.

Share:

Sachin Joshi (sajoshi7@in.ibm.com), Test Specialist, IBM

Author1 photoSachin Joshi is works at IBM as a Test Specialist. He is part of at the Platform Technology Center. He has more than 9 nine years of software testing experience. His experience includes: Oracle performance testing, OS Unix performance monitoring, automation, middleware testing, batch jobs testing, Interactive Voice Response (IVR) testing, and data warehousing testing. He holds a Mechanical Engineering degree from Visweswariah Technological University ( VTU) in Karnataka, India.



29 April 2014

Also available in Russian

Monitor database performance and analyze wait events

To improve the performance of Oracle databases, you need to analyze Oracle wait events for clues that can help identify bottlenecks and tune the database settings. When you improve the efficiency of Oracle database processing, you improve the performance of all the systems and software that rely on the database. This article explains how to effectively gather the performance statistics of an Oracle database with the use of usual OS utilities and analyze them to identify the issues.

Several factors affect Oracle database performance. The following factors, in particular, significantly affect performance.

Ineffective SQL statements affect performance

Pay attention when you construct SQL statements, because they can significantly affect query performance. The more specific the statement is, the faster the result is. For example, assume you need to see the details of a particular employee, employee1, with employee_id 123 from the employee table. The employee table includes employee_name, employee_sal, and employee_id, where employee_id is the primary key. Use one of the following options, listed in order of least effective to most.

  • Use the query select * from employee to fetch all the records from the table. Look for the specific employee from the results. If the query result includes many records, this is a slow process.
  • Use the query select * from employee where employee_name='employee1'. The where clause narrows down the search and makes the process faster.
  • Use the query select * from employee where employee_id=123. Because employee_id is the primary key, this query is the fastest option.

To construct the SQL statement that most effectively enhances performance, apply the following practices:

  • Avoid queries that do not contain where conditions and primary keys.
  • A high number of delete statements causes an increase in non-continuous data blocks, over time. A delete statement only clears out the data from a data block but it does not make the data block available for use. Because delete statements do not bring down the high water mark, they do not reduce the total size of the data that must be searched in future queries. Use truncate wherever possible, because it resets the high water mark.
  • Use COMMIT statements only when necessary. Avoid frequent use, because they initiate a buffer flush, which can result in excessive I/O processing.

Ineffective database configuration affects performance

A database that is not skillfully configured can affect database performance. Particular database initialization parameters and configuration parameters affect not only startup configuration, but also affect request processing. The following parameters can affect performance if they are not configured for optimal processing.

PCTFREE parameter

The PCTFREE parameter sets the minimum percentage of a data block to reserve as free space for possible updates to rows that already exist in that block. A PCTFREE value that is too low can cause problems during updates. A value that is too high can waste space and cause larger, unnecessary movement of data blocks.

PCTUSED parameter

The PCTUSED parameter sets the minimum percentage of a block to use for row data and overhead before new rows are added to the block. After a data block is filled to the limit determined by PCTFREE, Oracle considers the block unavailable for the insertion of new rows until the percentage of that block falls beneath the value set on the PCTUSED parameter. Until this threshold is reached, Oracle uses the free space of the data block only for updates to rows already contained in the data block.

Inadequate System Global Area (SGA) memory affects performance

A system global area (SGA) is a group of shared memory structures that contain data and control information for one Oracle database instance. If multiple users are concurrently connected to the same instance, the data in the instance's SGA is shared among the users. Consequently, the SGA is sometimes called the shared global area. To ensure efficient database performance, determine the optimum memory required for SGA and configure it appropriately. Too little memory can significantly reduce database performance.

Wait events affect performance

Deploy with confidence

Consistently deliver high-quality software faster using DevOps services on IBM Bluemix. Sign up for a free Bluemix cloud trial, and get started.

Wait events are the events that occur during a database operation when a request has to be processed. When a request is sent to a read/write Oracle database, it invokes multiple processes. These processes can encounter hardware, software, operating system, or configuration issues that halt or delay the processing. These blocking factors are called wait events. For a database to perform effectively, wait clauses must be kept to a minimum. Analyze all of the wait clauses that occur during an operation and fix them if the wait period is too high. It's not possible, however, to completely eliminate all of the wait events.

Oracle Database 11g has over 1000 wait events that can cause delays in processing a request. These wait events are broadly classified as:

  • Cluster
  • Network
  • Administration
  • Configuration
  • Commit
  • Application
  • Concurrency
  • System I/O
  • User I/O
  • CPU

Use Oracle Enterprise Manager to analyze wait events

Use Oracle Enterprise Manager (OEM) to monitor wait events during a data process. OEM graphically shows the database status during a process. It also gives a detailed analysis view from which you can drill down to each event and find the associated SQL statements, as shown in Figure 1. The legend entries on the right show the type of wait events that are occurring.

Figure 1. Sample screen capture from Oracle Enterprise Manager
Data load showing different wait events

Click to see larger image

Figure 1. Sample screen capture from Oracle Enterprise Manager

Data load showing different wait events

(Original image published by docs.oracle.com, 2011: http://docs.oracle.com/cd/E11857_01/em.111/e11982/database_management.htm)

Figure 2 shows the detailed view of the User I/O wait class from the data used in Figure 1. Click User I/O in OEM to see the details page of the wait events that cause a User I/O type of wait.

Figure 2. Active sessions waiting
Detailed view of User I/O wait events

Click to see larger image

Figure 2. Active sessions waiting

Detailed view of User I/O wait events

(Original image published by docs.oracle.com, 2011: http://docs.oracle.com/cd/E11857_01/em.111/e11982/database_management.htm)


Analyze wait events by using a shell script

To capture data without using a tool, use a shell script and native operating system utilities to gather wait event statistics from an Oracle database. Use the following steps to gather statistics, analyze them to understand the source of a bottleneck, and fix the issues with the help of a database administrator.

Prepare the data

Prepare enough data so that the database will run for several hours or overnight. A long time period is often required to detect issues such as cache problems, which might not occur during the initial few hours of database operation. It's important to stress the database to imitate the production load. Ensure that you have a valid mixture of data so that query results include real-time, insert, update, delete, and truncate actions. Resource use for each data manipulation language (DML) syntax element varies; therefore, strive to make the set of data the same as or close to the same as the data that is received in a production environment.

Step 1. Create the script

To get details and SQL statements for the wait events from the views V$ACTIVE_SESSION_HISTORY, V$EVENT_NAME, V$SQLAREA, create a simple join on the three tables.

V$ACTIVE_SESSION_HISTORY
Displays sampled session activity in the database. It contains snapshots of the active database session taken once a second.
 
V$EVENT_NAME
Displays information about wait events.
 
V$SQLAREA
Lists statistics on the shared SQL area and contains one row per SQL string. It provides statistics on SQL statements that are in memory, parsed, and ready to be run.
 

Because data in these views is flushed out periodically, use the script shown in Listing 1 to collect data on a regular interval.

Listing 1. Script gather_event_stats.sh to collect data at intervals

Click to see code listing

Listing 1. Script gather_event_stats.sh to collect data at intervals

#!/bin/sh
echo "Collecting the wait events statistics $1 times at an interval of $2"

# Creating the time stamps in Oracle acceptable format 
curr_date=`date '+%d-%b-%y' | tr '[:lower:]' '[:upper:}'
curr_time=`date +%H.%M.%S`
start_time="$curr_date $curr_time"
outfile_name=Events_output_$curr_date $curr_time

# sleeping for given interval of time
sleep $2

# Running the iterations until given number of times
for((i=1;i<=$1;i++))
do 
curr_date=`date '+%d-%b-%y' | tr '[:lower:]' '[:upper:}'
curr_time=`date +%H.%M.%S`
end_time="$curr_date $curr_time"

# Dynamically building the sql file to get the data in required format.
echo "">sql
echo "SET HEADING OFF;">>sql 2>/dev/null
echo "SET FEEDBACK OFF;">>sql 2>/dev/null
echo "SET PAGESIZE 0;">>sql 2>/dev/null
echo "SET LINESIZE 1500;">>sql 2>/dev/null
echo "SET ECHO OFF;">>sql 2>/dev/null
echo "SET VERIFY OFF;">>sql 2>/dev/null


echo "SET MARKUP HTML OFF;">>sql 2>/dev/null

# Set the column separator for each field to " ~ "
echo "SET CLOSEP '~';">>sql 2>/dev/null

# Build a query joining all 3 tables to get wait event, wait time and SQLID.
echo "SELECT TRIM(A.SAMPLE_TIME),TRIM(C.SQL_TEXT),TRIM(B.NAME),COUNT(*), SUM(TIME_WAITED) FROM v\$ACTIVE_SESSION_HISTORY A,v\$EVENT_NAME B,v\$SQLAREA C WHERE to_char(A.SAMPLE_TIME, 'DD-MON-YY HH24.MI.SS')BETWEEN '"$start_time", AND '"$end_time"' AND A.EVENT=B.NAME AND A.SQL_ID = C.SQL_ID GROUP BY A.SAMPLE_TIME, C.SQL_TEXT, B.NAME ORDER BE 1 ASC:" >>SQL

#Connecting to Oracle using sqlplus and executing the sql file. Appending the results to a file
sqlplus ORACLEID/PASSWORD@ORADBNAME< sql >> $outfile_name
2>/dev/null 

# Sleeping for $2 number of minutes/hours/days
sleep $2
start_time="$end_time"

# Clearing the sql file to build a fresh query with changed time stamps.
echo "">sql

done

The tilde character (~) is used as a column separator so that the data collected in the flat file can be parsed using a spreadsheet application such as Microsoft Excel to analyze it further. This script creates an SQL file at regular intervals, executes the SQL to gather data from the views, and collects the stats in a flat file $outfile_name.

Step 2. Launch the script

Place the script on any of the application servers and run the script to collect the data. In the following code, substitute the variables ORADBNAME ORACLEID PASSWORD before invoking the script.

sqlplus ORACLEID/PASSWORD@ORADBNAME< sql >> $outfile_name 2>/dev/null

Note: You can also parameterize these values and pass them to the script during run time.

Use the following syntax to run the script:

gather_event_stats.sh <No.of.Iterations> <Time Interval in m(in)/h(ours)/d(ays)>
gather_event_stats.sh 5 2h ---> would run for 10 hrs (5times*for every 2hours = 10Hrs)

Start this script before the data load and do not end it until the data load is complete. Carefully choose the No.Of.Iterations ($1) and Interval ($2) when you run the script, so that No.Of.Iterations x Intervals = Test Duration.

To ensure the script is started, run the following command:

ps -eaf | grep "gather_event_stats.sh"

Step 3. Process the data

Inject the data and make sure the application is processing the data. Double-check to make sure the data is getting loaded into the database. Wait until all the data is processed. List the $outfile_name (the name of the file that receives the output from the query, in this case the statistics of the wait events) at regular intervals and ensure the size of the file is increasing.

ls -ltr <$outfile_name>

Step 4. Analyze the data and tune the database

  1. After the data processing is complete and the gather_event_stats.sh script has ended, download the output file $outfile_name from the Unix machine to a Windows machine.
  2. Open the file in a text editor. If there are any tab positions or extra blank spaces in the raw data file, replace them with a single space before you parse them in an Excel spreadsheet. Figure 3 shows a sample of the downloaded data, as viewed in Notepad.
Figure 3. Screen capture of raw data collected in a flat file
Data after downloading and opening in Notepad

Click to see larger image

Figure 3. Screen capture of raw data collected in a flat file

Data after downloading and opening in Notepad
  1. Import the data to an Excel spreadsheet by clicking Data > Convert Text to Columns. Use the tilde character (~) as the delimiter to split the data into different columns, as shown in Figure 4.
Figure 4. Screen capture of wait events data after splitting in Excel using delimiter ~
View of data after splitting, using an Excel sheet

Click to see larger image

Figure 4. Screen capture of wait events data after splitting in Excel using delimiter ~

View of data after splitting, using an Excel sheet
  1. Draw a pivot chart on the data to get the sum of the total time waited for each event. The time unit is 1/100th of a second; therefore, divide it by 100 to convert the value into seconds. After the pivot is finished, sort the pivot table to get the maximum waited event, as shown in Figure 5.
Figure 5. Screen capture of pivot table chart
Pivot chart of wait events

Click to see larger image

Figure 5. Screen capture of pivot table chart

Pivot chart of wait events
  1. Analyze the wait event with the help of an Oracle database administrator to understand the root cause of the event. In this example, the TX - index contention is the event with the highest wait time. This situation indicates that there is a problem with indexing during transactions processing.
  2. Get the SQLID for the wait clauses with the greatest wait time and check the respective SQL statements. The C.SQL_TEXT field indicates the SQL statement for each SQLID. Consult the developers and database administrators to determine whether the SQL statements can be tuned to reduce the wait time.
  3. Repeat Step 6 for each of the top 10 wait activities.
  4. Implement the solutions suggested by the developers and database administrators to eliminate the wait events.
  5. Rerun the same test with same data load for same duration to ensure the wait time is reduced or to completely eliminate the wait event.

Summary

This article explains one way to gather database wait event statistics for an Oracle database. The process is simple and requires no special tools. For IBM products that are integrated with Oracle back-end databases, apply this method to analyze performance issues with the database and address them.

Resources

Learn

Get products and technologies

  • Try building and deploying your next project on the IBM Bluemix cloud platform, where you can take advantage of pre-built services, runtimes, frameworks, application lifecycle management, and continuous integration.
  • Download a free trial version of Rational software.
  • Evaluate IBM software in the way that suits you best: Download it for a trial, try it online, or use it in a cloud environment.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Rational software on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Rational
ArticleID=969946
ArticleTitle=How to monitor performance and analyze wait events
publish-date=04292014