Running Oracle on AIX
Performance tips and tricks
As a systems administrator, you should already know some of the basics of memory, CPU, and Disk I/O (see the Related topics section for articles on these subjects). What you may not fully understand is how the VMM works in AIX and what that means to Oracle. You will also find that because many of the AIX tuning commands and parameters have changed in recent years, Oracle has changed also, and there are changes to utilities such as the Oracle Enterprise Manager, which is an important utility you should definitely take the time to learn and add to your repertoire.
This article discusses in detail the AIX VMM and the tuning commands that you will be using to tune memory. It also introduces some of the monitoring tools that you will be using, which will help put you in a position to tune.
Before we get started, it is important to note that you must have an overall approach to what you are doing. Make sure you use proper change control processes; only make one change at a time and monitor that change very carefully before introducing that change into other environments, particularly production. Performance tuning really is an iterative, ongoing process and you'll oftentimes find that by fixing one bottleneck you will create another, which is okay as long as you continuously look to improve the health of your systems. Make sure that you start monitoring your system at the beginning, well before your users are screaming about slow performance. How can you know what a poorly performing system is like unless you know what a healthy system looks like? A proper baseline is key. The system we'll be looking at is running Oracle 10g -- 10.1.0.2.0 and AIX 5.3 TL7 on a POWER5™ LPAR with one CPU and 4GB of RAM.
In this section, we'll review memory as it relates to AIX and Oracle. We'll discuss how AIX uses virtual memory and how this relates to Oracle. We'll also analyze the data and tune our subsystems.
Let's start with VMM. It's important to understand that the VMM services all memory requests from the system, not just virtual memory. When RAM is accessed, the VMM must allocate space even where there is plenty of physical memory left on the box. This is what confuses both DBAs and systems administrators at times. It does this by using a process called early allocation of paging space, by partitioning segments into pages. These pages can be either RAM or paging space (virtual memory stored on disk). At the same time, VMM maintains a free list of unallocated page frames, which are used to satisfy page faults. The VMM has a page-replacement algorithm, which assigns the page frames and determines exactly which virtual-memory pages currently in RAM will have their page frames brought back to the free list.
Furthermore, the AIX operating system will use all available memory, except that which is configured to be unallocated and known as the free list. Obviously, administrators prefer to use physical memory rather than paging space, where the physical memory is available. VMM classifies memory segments into two categories: persistent segments and working segments. Persistent segments use file memory and working segments use computational memory. What does this mean to us? It's the computational memory that is used while your SQL queries are accessing the database. These are working segments and will terminate when the process is completed. These segments have no real permanent location. On the other hand, file memory uses persistent segments and do have permanent locations on the disks. They will remain in memory usually until the pages are stolen or the database is recycled. Again, you want the file memory paged to disk and not the computational memory.
How do we tune our systems? One critical parameter worth discussing is the Translation Lookaside Buffer (TLB). Applications like Oracle exploit a tremendous amount of virtual memory, so using large pages increases performance substantially. Increasing the size of this buffer allows the system to map more virtual memory, which results in a lower miss rate for applications that use a lot of virtual memory like Oracle. This includes both OLTP and Data Warehouse applications. Oracle uses large pages for its SGA, because it is the SGA that really dominates virtual memory. With AIX 5.3 and older, we will use vmo; prior to that, we used vmtune.
Let's look at the parameters, using vmo, as shown in Listing 1.
Listing 1. Parameters using vmo
root@lpar21ml16ed_pub[/] > vmo -L lgpg_size NAME CUR DEF BOOT MIN MAX UNIT TYPE DEPENDENCIES -------------------------------------------------------------------------------- lgpg_size 0 0 0 0 16M bytes D lgpg_regions root@lpar21ml16ed_pub[/] > vmo -L lgpg_regions NAME CUR DEF BOOT MIN MAX UNIT TYPE DEPENDENCIES -------------------------------------------------------------------------------- lgpg_regions 0 0 0 0 D lgpg_size
Using the following command, we'll allocate 16777216 bytes to provide large pages, with 256 actual large pages:
# vmo -r -o lgpg_size=16777216 lgpg_regions=256
At the same time, with Oracle Database 10g, make sure that the
Oracle initialization parameter is set to TRUE, so that Oracle request large pages when
allocating shared memory. By far, the two most important vmo settings are
maxperm. We use these parameters to determine
whether our system favors computational memory or file memory. The first thing we do here
is make certain that our
lru_file_repage parameter = 0. This parameter was
introduced in ML1 of AIX 5.3 and determines if the VMM repage-counts are considered and
the type of memory it should steal (see Listing 2).
Listing 2. The
root@lpar21ml16ed_pub[/] > vmo -L lru_file_repage NAME CUR DEF BOOT MIN MAX UNIT TYPE DEPENDENCIES -------------------------------------------------------------------------------- lru_file_repage 1 1 1 0 1 boolean D -------------------------------------------------------------------------------- root@lpar21ml16ed_pub[/] >
As shown in Listing 2, the default is 1, so we'll need to change this using vmo (see Listing 3).
Listing 3. Changing the default setting for the
lru_file_repage parameter using
root@lpar21ml16ed_pub[/] > vmo -o lru_file_repage=0 Setting lru_file_repage to 0 root@lpar21ml16ed_pub[/] >
Setting this to 0 tells the VMM that you want to steal only file pages
and not computational pages. As this will change if the
minperm or >
maxperm, we will make
minperm very low. Years ago, before the
parameter was introduced, we used to make
maxperm low. If we did this now,
we would stop the application caching programs that are currently running.
Listing 4 shows how we'll set these parameters:
Listing 4. Setting the
vmo -p -o minperm%=5 vmo -p -o maxperm%=90 vmo -p -o maxclient%=90
We also want to take a look at
maxfree. When the
pages on our free list fall below
minfree, the VMM will start to steal
pages, which we don't want to happen until the free list has beefed up the number in
maxfree. The values should be similar to the ones shown in Listing 5.
Listing 5. Setting the
vmo -p -o minfree=960 vmo -p -o maxfree=1088
In this section, we'll discuss CPU as it relates to AIX and Oracle. We'll discuss how we can tune our CPU subsystems and take advantage of recent System p innovations to increase Oracle performance.
Let's start with SMT. This important POWER5 innovation allows for the ability of one single physical processor to concurrently dispatch instructions from several hardware threads. In AIX 5L Version 5.3, a dedicated partition created with one physical processor is configured as a logical two-way by turning on SMT, which allows two hardware threads to run on one physical processor at the same time. You should always leave SMT on with Oracle (see Listing 6).
Listing 6. Leaving SMT on with Oracle
oot@lpar21ml16ed_pub[/home/u0004773] > smtctl This system is SMT capable. SMT is currently enabled. SMT boot mode is not set. SMT threads are bound to the same virtual processor. proc0 has 2 SMT threads. Bind processor 0 is bound with proc0 Bind processor 1 is bound with proc0 root@lpar21ml16ed_pub[/home/u0004773] >
Let's run a performance-monitoring utility, mpstat (see Listing 7).
Listing 7. Running the mpstat utility
root@lpar21ml16ed_pub[/] > mpstat 1 5 System configuration: lcpu=2 ent=0.2 mode=Uncapped cpu min maj mpc int cs ics rq mig lpa sysc us sy wa id pc %ec lcs 0 0 0 0 557 274 128 1 1 100 682 26 51 0 22 0.02 9.9 769 1 0 0 0 289 2 2 1 1 100 0 0 27 0 73 0.01 4.1 772 U - - - - - - - - - - - - 0 86 0.22 86.1 - ALL 0 0 0 846 276 130 2 2 100 682 3 6 0 91 0.03 13.9 1541
Though our system has only one physical CPU, we can see that both logical CPU's come up when analyzing our systems.
Another important utility worth mentioning is nmon, which has been my favorite monitoring utility for years now (see Figure 1).
Figure 1. nmon output
Although nmon shows activity by CPU, you can use different flags to show the amount of activity that the Oracle processes are using. Furthermore, using the nmon analyzer, you can download information into spreadsheets and compile nice-looking charts that senior management likes to see.
There are some other important things you can do with CPU:
- Processor affinity -- This allows processes to run on specific processors. You can actually correlate specific processes with running processes.
- Nice and Renice -- These change the priority of running processes. It is not recommended to renice Oracle processes.
Another utility that is important with monitoring CPU is vmstat, which will also quickly let you know where a bottleneck resides.
In this section, we'll discuss the disk I/O subsystem as it relates to AIX and Oracle. We'll review how we can monitor and tune our I/O subsystems and also discuss some important subsystems that relate to I/O.
When our system is slow, most inexperienced administrators will usually look at CPU. It is, however, the disk I/O subsystem that can cause the most problems. We'll examine the ever-important asynchronous I/O and concurrent I/O in this section, as well.
Asynchronous I/O (AIO) servers
AIO determines if Oracle is waiting for your I/O to complete prior to starting new processing. If asynchronous I/O is not tuned properly, it can significantly affect the overall performance of writes on the I/O subsystem. What it does is allow the system to continue processing while I/O completes in the background. This improves performance significantly because processes can run at the same time as I/O is going on. We can monitor the AIO subsystem by using either iostat or nmon (see Listing 8).
Listing 8. Monitoring the AIO subsystem using iostat
oot@lpar21ml16ed_pub[/home/u0004773] > iostat -A 1 5 System configuration: lcpu=2 drives=2 ent=0.25 paths=2 vdisks=2 aio: avgc avfc maxg maif maxr avg-cpu: % user % sys % idle % iowait physc % entc 0 0 312 0 4096 3.1 7.1 89.8 0.0 0.0 16.7 Disks: % tm_act Kbps tps Kb_read Kb_wrtn hdisk1 0.0 0.0 0.0 0 0 hdisk0 0.0 0.0 0.0 0 0
The following list is a description of parameters used to monitor the AIO subsystem.
avfc: This reports back the average fastpath request count per second for your interval.
avgc: This reports back the average global asynchronous I/O request per second of the interval you specified.
maxgc: This reports back the max global asynchronous I/O request since the last time this value was fetched.
maxfc: This reports back the maximum fastpath request count since the last time this value was fetched.
maxreqs: This is the maximum asynchronous I/O requests allowed.
In our case, AIO servers are not a system bottleneck.
Concurrent I/O (CIO)
CIO, introduced in AIX Version 5.2, is an extremely important system that you should use
in your Oracle environment. Similar to its predecessor, direct I/O, when turned on, it
allows filesystem I/O to bypass the VMM and transfer data directly to disk from the
user's buffer. CIO allows multiple threads to read and write data concurrently to the
same file, which is due to the way in which JFS2 is implemented, allowing users to read
and write simultaneously. In order to turn this on, you mount your filesystems with the
# mount -o cio /orafilesystem.
These elements are important to consider with CIO:
- Raw devices -- While some Oracle DBAs like to create raw logical volumes for their data, and there is little argument about the performance benefit, in most cases it is too difficult to administer and usually I've found that the UNIX® administrators can talk the Oracle DBAs out of this one. With the advent of CIO, I would not use raw logical volumes unless performance is the driving factor of everything you are doing and you have the staff that can maintain the complexities inherent in this type of environment.
- Spreading the wealth -- The more spindles you have, the more you should spread your wealth around. The more adapters you will have, the more performance will also increase. You should also try to keep indexes and redo logs off the same volumes as your data.
- SAN -- Make sure you spend time looking at your SAN; optimizing the hardware will help you more than anything you can do at the OS level.
In this section, we'll look at Oracle-specific tools that can help you with your AIX administration.
This is an Oracle performance diagnosis tool, and I highly recommend that Unix
administrators learn to use this tool. It's really not that hard, once you have it set up
and configured. This is done from sql once you have the Oracle installed. There are
really two types of collection options:
You need to configure the
level parameter, which controls the type of data
collected from Oracle. The
threshold parameter acts as a filter for the
collection of SQL statements the status summary tables.
Here's how to install it. After logging on to the systems as Oracle, start up sqlplus and then just follow the steps as instructed (see Listing 9).
Listing 9. Starting up sqlplus to install Statspack
SQL*Plus: Release 10.1.0.2.0 - Production on Sun May 18 19:21:21 2008 Copyright (c) 1982, 2004, Oracle. All rights reserved. Enter user-name: system as sysdba Enter password: Connected to: Oracle Database 10g Enterprise Edition Release 10.1.0.2.0 - 64bit Production With the Partitioning, OLAP and Data Mining options SQL> execute SQL> @?/rdbms/admin/spcreate Choose the PERFSTAT user's password ----------------------------------- Not specifying a password will result in the installation FAILING Oracle Enterprise Manager choose the Temporary tablespace for the PERFSTAT user ----------------------------------------------------- Below is the list of online tablespaces in this database which can store temporary data (e.g. for sort workareas). Specifying the SYSTEM tablespace for the user's temporary tablespace will result in the installation FAILING, as using SYSTEM for workareas is not supported. Choose the PERFSTAT user's Temporary tablespace.
Oracle Enterprise Manager
The Oracle Enterprise Manager is a tool that I've used for years. In order to turn it on,
you'll need to make sure you first allow it to run when installing Oracle or creating a
database using the Oracle dbca utility. After the database is created, you'll need to
turn OEM on using:
$ emctl start dbconsole.
This is what you'll put in your browser:
After logging in, you'll see something like Figure 2.
Figure 2. The Oracle Enterprise Manager
There is so much you can monitor and tune within OEM that there are actually books on this utility. If you are working in an Oracle environment, this is a must-use system.
In this article, we introduced the concepts of performance tuning as it relates to Oracle. We looked at the memory, CPU, and I/O subsystems as we analyzed and tuned our systems. We captured data and analyzed the results of our changes. We discussed important systems such as concurrent I/O and why implementing these systems will help our systems perform better. We also discussed some important kernel parameters, what they do, and how to tune them. At the same time, we made note of some important changes through the years and our approach to certain parameters. We also looked at some Oracle-specific utilities and how they could help us as AIX systems administrators.
- Tuning IBM AIX 5L for an Oracle Database: Learn about performance analysis and tuning for different types of Oracle workloads and IBM AIX 5L setup configurations.
- Optimizing AIX 5L performance: Tuning network performance, Part 1 (Ken Milberg, developerWorks, November 2007): Read Part 1 of a three-part series on AIX networking, which focuses on the challenges of optimizing network performance.
- For a three-part series on memory tuning on AIX, see Optimizing AIX 5L performance: Tuning your memory settings, Part 1 (Ken Milberg, developerWorks, June 2007).
- Read the IBM whitepaper Improving Database Performance with AIX concurrent I/O.
- For a comprehensive guide about the performance monitoring and tuning tools that are provided with AIX 5L Version 5.3, see the IBM Redbook AIX 5L Practical Performance Tools and Tuning Guide.
- The AIX 5L Differences Guide Version 5.3 Edition (December 2004) redbook focuses on the differences introduced in AIX 5L Version 5.3 when compared to AIX 5L Version 5.2.