Real-time WebSEAL statistics with Windows Performance Monitor

Graphing real-time WebSEAL junction and authentication statistics with the Windows Performance Monitor

In this article I cover the practical interpretation of the statistics capabilities in Tivoli® Access Manager WebSEAL. I’ll show you how to use sample periods to effectively determine the usage characteristics of your WebSEAL environment, validate that front-end load balancing is working effectively, and graph all this information using Windows® Performance Monitor.

Shane B. Weeden, Senior Software Engineer, IBM

Shane Weeden is a senior software engineer with the IBM Federated Identity Manager development team. He has worked in IT security for 13 years, and has spent the last seven years working with Tivoli Security products. Shane has been with the Federated Identity Manager development team since its conception, and now divides his time between customer-focused engagements and core product development activities. He holds a Bachelor of Information Technology from the University of Queensland in Australia.



02 March 2007

Background information

Readers of this article should be familiar with administering and using Tivoli Access Manager WebSEAL. It is recommended that you also read about WebSEAL statistics in the Tivoli Access Manager product documentation. A link to this information is also contained in the Resources section.


Enabling WebSEAL statistics

WebSEAL statistics can be enabled either dynamically or statically.

Dynamic enablement refers to a WebSEAL "server task", which turns on various statistics when WebSEAL is running. This can be done with pdadmin, or with the administration API.

Static enablement of WebSEAL statistics refers to specific entries in the WebSEAL configuration file that turn on statistics when WebSEAL starts. For the purposes of this article, you can use either approach. Be aware, however, that if you enable statistics dynamically, you must re-enable them every time the WebSEAL process is restarted.

Example 1 demonstrates dynamic enablement of WebSEAL authentication statistics and statistics for several junctions. Example 2 shows the WebSEAL configuration file modifications that enable the same statistics using the static method.

Example 1. Dynamically enabling WebSEAL statistics with pdadmin
pdadmin sec_master> server task idp-webseald-idp.ibm.com stats on pdweb.authn
pdadmin sec_master> server task idp-webseald-idp.ibm.com stats on pdweb.jct.1
pdadmin sec_master> server task idp-webseald-idp.ibm.com stats on pdweb.jct.2
pdadmin sec_master> server task idp-webseald-idp.ibm.com stats on pdweb.jct.3
Example 2. Statically enabling WebSEAL statistics in the WebSEAL configuration file
[aznapi-configuration]
stats = pdweb.authn				
stats = pdweb.jct.1
stats = pdweb.jct.2
stats = pdweb.jct.3

A question that often arises is which junction "number" to enable, and which WebSEAL junction name each number refers to. To find out, create your junction, list the set of available statistics, and then show the (empty) statistics for each junction number. The output of each stats get command shows the junction name for that number. From Example 3, you can see that pdweb.jct.2 corresponds on my system to the junction named /FIM.

Example 3. Determining the set of available statistics and the junction name for a particular junction "number"
pdadmin sec_master> server task idp-webseald-idp.ibm.com stats list
pd.ras.stats.monitor
pd.log.EventPool.queue
pd.log.file.clf
pd.log.file.ref
pd.log.file.agent
pdweb.authn
pdweb.authz
pdweb.http
pdweb.https
pdweb.threads
pdweb.jmt
pdweb.sescache
pdweb.drains
pdweb.certcallbackcache
pdweb.usersessidcache
pdweb.doccache
pdweb.jct.1
pdweb.jct.2
pdweb.jct.3
pdadmin sec_master> server task idp-webseald-idp.ibm.com stats get pdweb.jct.2
[/FIM]
reqs     : 0
max      : 0.000
total    : 0.000
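
If you want to work with this output in a script, the name/value lines are easy to parse. The following is a minimal sketch in Python, not part of the sample code for this article; the function name parse_stats_output is hypothetical, and it assumes only the "name : value" layout shown in Example 3.

```python
def parse_stats_output(text):
    """Parse 'name : value' lines from a stats get listing into a dict of floats."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a 'name : value' shape, such as [/FIM]
        try:
            stats[name.strip()] = float(value)
        except ValueError:
            continue  # skip lines whose value is not numeric
    return stats


# The statistics lines from Example 3.
sample = """[/FIM]
reqs     : 0
max      : 0.000
total    : 0.000"""

print(parse_stats_output(sample))
```

Two such snapshots, taken at the start and end of a sample period, are all that is needed for the calculations described later in this article.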

The remainder of this article and the sample code cover the interpretation and graphical representation of the following set of WebSEAL statistics:

  • pdweb.authn
  • pdweb.http
  • pdweb.https
  • pdweb.jct.X

You can modify the sample code to graph other statistics if you choose, although this is a non-trivial exercise!


Understanding and interpreting WebSEAL statistics data

All of the statistical elements we are covering in this article have two measurable properties of interest. These are:

  • Average turnaround time for a given sample period
  • Transactions processed (per second) for a given sample period

Understanding the concept and application of the sample period is one of the most important aspects of getting value out of the statistics data. If your sample period is too short or too long, the averages become meaningless. Let's look at how graph plot points are calculated for each of our measurable properties of interest (turnaround time and transactions per second) and this should become clear.

For the illustrated examples that follow, consider two sample periods. The first starts at time T1 and finishes at time T2 (this is real time, like on your watch). The second starts at T2 and finishes at T3. During the T1-T2 sample period, the "transactions" shown in Table 1 occur. These can be for any of our statistical components (authentications, HTTP transactions, HTTPS transactions, or accesses to a particular junction). During the T2-T3 sample period, the transactions shown in Table 2 occur.

Table 1. Example T1-T2 sample period transaction data for demonstrating statistics calculations
Transaction ID    Time (msec) to process transaction
1                 10
2                 6
3                 8
4                 4
5                 12
6                 8
Total             48

Table 2. Example T2-T3 sample period transaction data for demonstrating statistics calculations
Transaction ID    Time (msec) to process transaction
1                 4
2                 8
3                 10
4                 14
Total             36

Average turnaround time

It's easy to calculate that during the T1-T2 sample period represented by the transactions in Table 1, the average turnaround time per transaction was 8 milliseconds (48 milliseconds of total processing time divided by 6 transactions). This is calculated generically using the formula shown in Figure 1.

Figure 1. Calculating average turnaround time in a sample period
Average turnaround time

Now let's apply this concept to WebSEAL statistical data. WebSEAL reports cumulative counters, so we take the output of the junction statistics at T2 and T1 and divide the change in total by the change in reqs over the sample period, as shown in Figure 2:

Figure 2. Average turnaround time from WebSEAL statistics
Average turnaround time from WebSEAL statistics

The number (in our case "8") becomes a plot point on a graph where the vertical axis is turnaround time in milliseconds, and the horizontal axis is time. This plot point is at time T2, and represents the average turnaround time for all transactions that have occurred in the sample period ending at T2. This is shown pictorially in Figure 3.

Figure 3. Plotting turnaround times
Plotting Average Turnaround Times

The next plot, at time T3, represents the average turnaround time for only those transactions occurring between T2 and T3, and so on.
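
The calculation can be sketched in a few lines of Python. This is an illustration only, not the code shipped with this article: the function name and the snapshot dictionaries are hypothetical, and the cumulative counters are invented to be consistent with Tables 1 and 2 (in milliseconds). The result is in whatever unit the total counter uses.

```python
def average_turnaround(earlier, later):
    """Average time per transaction between two cumulative snapshots.

    Each snapshot is a dict with 'reqs' (transactions so far) and 'total'
    (processing time so far). The result is in the same unit as 'total'.
    Returns None when no transactions completed in the sample period.
    """
    reqs = later["reqs"] - earlier["reqs"]
    if reqs <= 0:
        return None
    return (later["total"] - earlier["total"]) / reqs


# Hypothetical cumulative counters (msec) consistent with Tables 1 and 2.
t1 = {"reqs": 0, "total": 0}
t2 = {"reqs": 6, "total": 48}
t3 = {"reqs": 10, "total": 84}

print(average_turnaround(t1, t2))  # plot point at T2: 8.0
print(average_turnaround(t2, t3))  # plot point at T3: 9.0
```

Returning None for an empty sample period, rather than zero, keeps "no traffic" distinct from "instant responses" when the values are plotted.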

There are a couple of observations to make at this point. The max statistics parameter in the output of a pdadmin "stats get" command is the maximum time taken by a single transaction since statistics were turned on. This is typically meaningless in statistical terms because we are interested in the trends in average behaviour over a set of sample periods.

The sample period that you choose is critical to providing meaningful data. If the sample period is too short you will see lots of sample periods with zero transactions and the graph will become unreadable and not show trends in customer experience. Similarly, if the sample period is too long, you will start to see the same average printed all the time (the graph will look like a flat line on an ECG monitor) and you will not see how shorter busy periods might be affecting customer experience.

The ideal balance for choosing a sample period depends upon how busy your servers are and is something that comes with practice. It might well be better to use a shorter sample period during very busy periods, for example from 9-10 a.m. and just after lunch when people tend to log in and access your system. During these times you can use a sample period of 5 minutes. For the rest of your business day, when transaction rates are relatively low, you may choose a longer sample period of perhaps 30 minutes. You want a healthy number of transactions for each sample period without missing key indicators such as slower turnaround time during particularly busy intervals.

Now let’s look at the other measurable property of interest, transactions per second, and how this should be used in conjunction with the average turnaround time to detect potential bottlenecks in your deployment.

Transactions per second

Looking back to Table 1, how many transactions per second occurred during the sample period?

Guess what, I have no idea either. Without knowing how many seconds have elapsed between T1 and T2, you cannot calculate the transactions per second.

Figure 4 presents the generic formula used to calculate the transactions per second for a given sample period.

Figure 4. Calculating transactions per second in a sample period
Calculating transactions per second

Let's presume for the rest of this illustration that only 2 seconds have elapsed between T1 and T2, and again between T2 and T3. The calculation for our example then becomes trivial (6 transactions divided by 2 seconds), as is shown in Figure 5:

Figure 5. Transactions per second from WebSEAL statistics
Transactions per second from WebSEAL

The number of transactions per second (in our case "3") becomes a plot point on a graph where the vertical axis is number of transactions per second and the horizontal axis is time. This plot point is at time T2 and represents the average number of transactions per second during the sample period ending at T2. This is shown pictorially in Figure 6.

Figure 6. Plotting transactions per second for a sample period
Plotting Transactions Per Second

The next plot, at time T3, represents the average number of transactions per second processed between T2 and T3, and so on.
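
As a minimal Python sketch of the same arithmetic (again illustrative only, with a hypothetical function name and invented cumulative counters that match Tables 1 and 2), the only extra input needed is the length of the sample period in seconds:

```python
def transactions_per_second(earlier, later, elapsed_seconds):
    """Transactions completed per second between two cumulative snapshots."""
    if elapsed_seconds <= 0:
        raise ValueError("the sample period must be longer than zero seconds")
    return (later["reqs"] - earlier["reqs"]) / elapsed_seconds


# Hypothetical cumulative request counters consistent with Tables 1 and 2.
t1 = {"reqs": 0}
t2 = {"reqs": 6}
t3 = {"reqs": 10}

# Both sample periods are assumed to last 2 seconds, as in the text.
print(transactions_per_second(t1, t2, 2))  # plot point at T2: 3.0
print(transactions_per_second(t2, t3, 2))  # plot point at T3: 2.0
```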

Again there are very important observations to notice. The number that we plot for transactions per second does NOT represent the processing capacity of the system; it is simply an indication of how many transactions are actually being processed. To determine your system capacity, you should use controlled, incremental load testing and continue to add load until you see little or no increase in transactions per second and an unacceptable increase in client response time. During this type of load testing, you can set your sample period to a relatively small number (for example, 1 minute).

When you have done this baseline analysis of your capacity, you can use the real-time transactions-per-second statistics during peak usage periods (again, selecting an appropriate sample period is crucial) to determine when you might need to expand your system resources.

You should observe both the transactions-per-second and turnaround time trends for the same statistical element before jumping to conclusions.

For example, suppose you notice a sudden decrease in the transactions per second on a particular junction and a spike in transaction turnaround time for that junction. This is a good indicator that the application running on that junction has degraded performance.

Consider another example where everything was working fine one day, and then the next day customers start reporting that everything is working, but very slowly. Statistically you notice a massive increase (compared to the day before) in the number of transactions per second; however the average turnaround time per transaction has actually decreased. One possible explanation is a change to the back-end server's cache policy for images. Suddenly a large number of small images that used to be read from browser cache after being requested one time are now being requested from the server every time a page is refreshed.
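
The two examples above can be expressed as a simple rule of thumb. The following Python sketch is purely illustrative: the function name, the 50 percent threshold, and the wording of the labels are my assumptions, and a real deployment would need thresholds derived from its own baseline.

```python
def describe_trend(prev_tps, curr_tps, prev_ms, curr_ms, change=0.5):
    """Label the change between two sample periods for one statistical element.

    'change' is a hypothetical relative threshold (0.5 means 50 percent).
    """
    if prev_tps <= 0 or prev_ms <= 0:
        return "not enough data"
    tps_delta = (curr_tps - prev_tps) / prev_tps
    ms_delta = (curr_ms - prev_ms) / prev_ms
    if tps_delta <= -change and ms_delta >= change:
        return "fewer transactions, slower turnaround: check the junctioned application"
    if tps_delta >= change and ms_delta < 0:
        return "more transactions, faster turnaround: check for a surge of small requests"
    return "no pattern from the two examples"


# Invented numbers for illustration.
print(describe_trend(3.0, 1.0, 8.0, 20.0))
print(describe_trend(3.0, 12.0, 8.0, 5.0))
```

The point of the sketch is that neither number is interpreted alone; each label depends on both trends for the same element.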

Now that we know how statistics are calculated, let's look at a utility for graphing them using Windows Performance Monitor.


Graphing statistics in Windows Performance Monitor

Example Performance Monitor graph

Figure 7 shows an example of real-time output from a test system showing authentication turnaround times and transactions per second under a constant load test:

Figure 7. Example Windows Performance Monitor output
Performance Monitor Example

You can see that this test server is processing authentications at a constant rate of just under 5 per second, with an average authentication turnaround time of approximately 3 milliseconds. The CPU on the test server was at 100 percent during this test (all processes were running on the same server), indicating that the test machine was operating at capacity. I was actually using a VMware image for the TAM/LDAP environment for this test.

Now let's look at the architecture of this application, and installation and configuration instructions.

Windows Performance Monitor integration architecture

Figure 8. Windows Performance Monitor integration architecture
Performance Monitor Architecture

Figure 8 presents the architecture of the integration of WebSEAL statistics into Windows Performance Monitor. The Performance Monitor application loads a shared library (pdstats.dll), which is responsible for providing information about the statistics counters and their values at the various sample periods. The sample period is configured in Windows Performance Monitor as a property of the graph. Right-click on the graph, choose Properties, select the General tab, and update the "Sample Automatically every: XXX seconds" entry field, as shown in Figure 9.

Figure 9. Updating the Sample Period in Windows Performance Monitor
Performance Monitor sample period

The pdstats.dll reads statistics counter information from a shared memory segment which is periodically updated by an external application called pdstats.exe. The pdstats.exe program can run either as a Windows service or as a command line application (which I frequently use for testing) by running it as pdstats -foreground from a command prompt. The installation script installs it as a service with manual startup by default. This application polls WebSEAL for updated statistics data (by issuing the TAM Administration API equivalent of a stats get command) at regular polling intervals. The polling interval is defined during installation and stored in the Windows Registry, and is set to 3 seconds by default. It is very important that the sample period configured in Windows Performance Monitor (see Figure 9) is significantly larger than the poll period; otherwise you will not see graphs that are representative of what's really happening.

Effect of a small sample period

If you look closely at the txns/sec line in Figure 7 (this is the top line on the graph) you will see small dips in the line at regular intervals. These are an example of the effect we are talking about. What is happening is that the poll period fires sometime before the end of the sample period, leaving a gap in the actual real-time number of transactions that are captured at the end of that sample period. The larger the sample period is relative to the poll period, the smaller these dips would be.

For example, I would not use a sample period of less than 30 seconds in Windows Performance Monitor when using a poll period for pdstats.exe of 3 seconds. This is because the polling and updating of the shared memory is done independently of the reading of the shared memory by Windows Performance Monitor. You can actually miss as much as one poll period's worth of data during the sample period. By making the sample period a larger multiple of the poll period, you lessen the impact this has on graph data.
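
To put rough numbers on this, the worst-case share of a sample period that can go uncaptured is simply the poll period divided by the sample period. The Python sketch below is a back-of-the-envelope illustration with a hypothetical function name; it is not a measurement of the actual behaviour of pdstats.exe.

```python
def worst_case_missed_fraction(poll_seconds, sample_seconds):
    """Upper bound on the fraction of a sample period that may be missed.

    Assumes at most one poll period's worth of data is lost per sample period.
    """
    if poll_seconds <= 0 or sample_seconds <= 0:
        raise ValueError("periods must be positive")
    return min(1.0, poll_seconds / sample_seconds)


# A 3-second poll period against several candidate sample periods.
for sample_seconds in (5, 30, 300):
    fraction = worst_case_missed_fraction(3, sample_seconds)
    print(f"sample period {sample_seconds}s: up to {fraction:.0%} missed")
```

With a 3-second poll period, a 30-second sample period bounds the gap at about 10 percent, while a 5-second sample period could miss more than half of the data.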

You do not, however, want to make the poll period too small, or you will flood the TAM Policy Server (and WebSEAL) with requests for statistics. At some point this will affect the performance and availability of the Policy Server and WebSEAL. Three seconds is a good minimum value.

Configuration and installation

Before attempting to install or configure the Performance Monitor integration code, you must have the Tivoli Access Manager Runtime for Windows installed and running on your Windows system. You should be able to verify this by using pdadmin from the command line to enable and display the statistics you want to monitor. See Example 3 for details.

In the Downloads section, there is a zip of example code. Within that zip is a directory called pub. If you have no interest in actually building the code yourself and just want to install and run the Performance Monitor integration, this is the only directory that you will need.

The only configuration file that you should need to update is called pdstats.regbat. This file is used by a simple command line utility called regbat.exe, which can perform various registry and Windows configuration operations from a scripted file. The file has self-describing comments for all the parameters contained within it; however, you will likely need to update the following settings:

  • UserId - an administrative user ID that is able to retrieve WebSEAL statistics. Test the user ID that you plan to use with pdadmin first.
  • Password - the TAM password for the user stored in UserId.
  • Servers - the set of WebSEAL server names (as reported by a pdadmin> server list command) for which you wish to monitor statistics.

Provided you install into the default directory of c:\pdstats, this is all you should have to modify.

To install the solution, open a command prompt and change the directory to the pub directory. From this directory, run the setup.bat script. The installation copies files to the appropriate directories, creating c:\pdstats if needed, and prints several status messages which should indicate success. You can safely run the setup.bat script multiple times without having to perform any kind of uninstall if you change your configuration and want to "re-install".

Once you have completed the install, you should be able to start the pdstats service in the Services control panel, or run pdstats -foreground from the c:\pdstats directory in a command window. After pdstats.exe is running using one of these techniques, start Windows Performance Monitor (Control Panel -> Administrative Tools -> Performance), or (Start -> Run -> perfmon.exe). All status and error messages from pdstats.exe are written to the Windows Event Log, and are viewable in the Application Log of the Event Viewer (Control Panel -> Administrative Tools -> Event Viewer).

Delete the default counters from the display, and modify the display settings (as shown in Figure 9) to set a suitable sample period and vertical axis scale. Press the "+" button to add a new counter, and from the Performance Object dropdown list, select PD Statistics, as shown in Figure 10.

Figure 10. Selecting the PD statistics performance object
PD Statistics performance object

If the PD Statistics object is not shown, chances are that you do not have pdstats.exe running (either as a service, or in the foreground as a console application).

After you successfully select the PD Statistics object, the various statistics objects offered by the WebSEAL servers you configured during installation should be shown, as demonstrated in Figure 11.

Figure 11. Selecting the PD Statistics counters
PD Statistics Performance Counters

You can select one or both counters (transaction types) on the left panel, and then as many of the various counter instances as you like on the right. After choosing Close, wait for a couple of sample periods (and make sure your WebSEAL has traffic) to see your statistics displayed. If your graph remains at zero, you have probably not enabled the statistics since last starting WebSEAL. You can also manually run the pdadmin stats get server task (see Example 3) to validate that WebSEAL is producing statistics data.

That concludes the instructions for practical use of Windows Performance Monitor for monitoring WebSEAL statistics. The rest of the article contains information for developers who want to compile the code themselves.

Developer information

If you want to build the example code, use Microsoft Visual Studio (I used the 2005 version), and also install the Tivoli Access Manager ADK for Windows on your development system. You should then be able to load the solution file pdstats600.sln into Visual Studio and see and build the various sub-projects. When building the projects, build them in this order:

  • perfdll - this builds pdstats.dll
  • pdstatsa - this builds pdstatsa.dll from the appdll directory, which is used by the pdstats.exe program
  • pdstats - this builds the pdstats.exe service/application from the app directory

The pub directory is populated manually with the built binaries from the various projects, plus the installation and configuration pieces.


Download

Description                            Name                      Size
Demonstration Code and Binaries (1)    webseal_statistics.zip    3.74MB

Note

  1. The download contains both source code and pre-built binaries for integrating WebSEAL statistics monitoring with Windows Performance Monitor.

Resources
