Tivoli Directory Server monitoring using IBM Tivoli Monitoring

Level: Intermediate

Dave Bachmann (bachmann@us.ibm.com), Senior Software Engineer, IBM US
Ramakrishna Gorthi (rjgorthi@in.ibm.com), Software Engineer, IBM Global Services, India
Amit Bhate (abhate@in.ibm.com), Staff Software Engineer, IBM India

14 Sep 2007

IBM® Tivoli® Monitoring monitors and manages system and network applications on a variety of platforms and keeps track of the availability and performance of all parts of your enterprise. This article details how IBM Tivoli Monitoring can be used to monitor the performance of the IBM Tivoli Directory Server.

Introduction

Any successful deployment of a directory solution requires ongoing monitoring to ensure that system performance and availability meet an organization's goals. This article documents a procedure to monitor the Tivoli Directory Server using the Tivoli Monitoring Universal Agent. The article is based upon a set of best practices for monitoring Tivoli Directory Server and its component parts. The parameters monitored here come from a list formulated over years of research into successful deployments of the directory server. The article also provides insights into setting situations to catch problems that administrators should be concerned about.

This article focuses on monitoring Tivoli Directory Server version 5.2 and 6.0 servers, but it can be extended to other versions of Tivoli Directory Server by following the documented examples.

Some of the key questions that this article addresses, which are important to any Tivoli Directory Server deployment, are:

  • Is the server available?
  • How busy is the server?
  • Is the server about to run out of memory?
  • Is the server backlogged?
  • How responsive is the server?
  • Is the database about to run out of space?
  • Are there any errors being logged?

There are numerous ways of presenting data in the Tivoli Enterprise Portal Client (TEPC). Figure 1 shows one of them: a TEPC workspace displaying TDS monitoring statistics.
Figure 1: TDS monitoring statistics in Tivoli Enterprise Portal Client

The table below lists the acronyms used throughout this article. These acronyms might or might not be the official names associated with the respective products.

LDAP: Lightweight Directory Access Protocol
TDS: IBM Tivoli Directory Server
ITM: IBM Tivoli Monitoring
UA: Universal Agent
TEPC: Tivoli Enterprise Portal Client
TEMS: Tivoli Enterprise Monitoring Server




IBM Tivoli Directory Server

The IBM Tivoli Directory Server (TDS) implements the Internet Engineering Task Force (IETF) LDAP V3 specifications. It also includes functional and performance enhancements added by IBM. TDS uses IBM DB2® as the backing store to provide transaction integrity for each LDAP operation, high performance operations, and on-line backup and restore capability. The IBM Tivoli Directory Server interoperates with IETF LDAP V3-based clients.

Figure 2 provides a high level overview of the various components of the directory server and how the clients interact with it.
Figure 2: Tivoli Directory Server


IBM Tivoli Monitoring

IBM Tivoli Monitoring monitors and manages system and network applications on a variety of platforms and keeps track of the availability and performance of all parts of your enterprise. IBM Tivoli Monitoring provides reports you can use to track trends and troubleshoot problems. You can use IBM Tivoli Monitoring to perform the following tasks:
  • Visualize real-time monitoring data from your environment.
  • Monitor resources in your environment for certain conditions, such as high CPU or an unavailable application.
  • Establish performance thresholds and raise alerts when thresholds are exceeded or values are matched.
  • Trace the causes leading to an alert.
  • Create and send commands to systems in your managed enterprise by means of the Take Action feature.
  • Use integrated reporting to create comprehensive reports about system conditions.
  • Monitor conditions of particular interest by defining custom queries using the attributes from an installed agent or from an ODBC-compliant data source.




Software pre-requisites

The article is written with the following software requirements:

  • IBM Tivoli Monitoring version 6.1.0
  • IBM Tivoli Universal Agent version 6.1.0
  • IBM Tivoli Directory Server version 5.2 or 6.0

The installation information for these products is available through the links in the Resources section below.

This article covers the information on setting up the Universal Agent. For information about configuring and setting up the rest of the products, see the documents in the Resources section below.





Universal Agent essentials

Tivoli Enterprise Monitoring Agents are installed on the systems or subsystems whose applications and resources you want to monitor. An agent collects monitoring data from the managed system and passes it to the monitoring server to which it is connected. The portal client gathers the current values of the attributes and produces reports formatted into tables and charts. It can also test the values against a threshold and display an alert icon when that threshold is exceeded or a value is matched. These tests are called "situations".

The IBM Tivoli Universal Agent is a generic agent of IBM Tivoli Monitoring. You can configure the IBM Tivoli Universal Agent to monitor any data you collect. You can view the data in real-time and historical workspaces on the Tivoli Enterprise Portal and manage it with Tivoli Enterprise Portal monitoring situations and automation policies, the same as data from other Tivoli Enterprise Monitoring Agents.

The Universal Agent is thoroughly documented in the IBM Tivoli Universal Agent User's Guide, but a few details are worth mentioning here.

Starting and stopping the Universal Agent

You can start and stop the Universal Agent with the itmcmd command. You can start the Universal Agent with:

$CANDLEHOME/bin/itmcmd agent start um

You can stop the Universal Agent with:

$CANDLEHOME/bin/itmcmd agent stop um

On Solaris, CANDLEHOME is typically set to /opt/IBM/ITM, if the default installation path is followed.

Another environment variable that needs to be set for the Universal Agent to work is the UAHOME variable. On Solaris, the UAHOME variable is set to /opt/IBM/ITM/sol286/um. These settings might change with the operating system and with the path of installation.

If you prefer the GUI for the settings, you can right-click on the Universal Agent in the Navigator view and choose Start, Stop or Restart.

Defining data in the Universal Agent

An IBM Tivoli Universal Agent application consists of one or more attribute groups, each group consisting of one or more attributes. The application to be monitored is defined in a data definition file, called a “metafile”, which is imported by the IBM Tivoli Universal Agent. The command to import the metafile is um_console.

The metafile is a plain text file. It contains the following control statements in the order shown (if present):
SNMP:
For SNMP Data Providers only, introduces the data definition for IBM Tivoli Monitoring provided SNMP MIB applications. SNMP TEXT introduces the data definition for user-defined SNMP applications.
APPL:
Specifies the name that IBM Tivoli Monitoring uses for the application.
NAME:
Defines the name of an attribute group, the type of data being collected, and the period for which the data is valid.
INTERNAL:
Provides for data redirection between attribute groups as a way to perform additional processing.
SOURCE:
Defines the location of the data you are collecting.
RECORDSET:
For File Data Providers only, defines the set of records from which the data provider extracts data.
CONFIRM:
For Socket Data Providers only, specifies the requirements for data acknowledgment.
SQL:
For ODBC Data Providers only, defines the Select statement or stored procedure to use for collecting relational data.
SUMMARY:
Defines the requirements for gathering the frequency of data input during monitoring.
ATTRIBUTES:
Introduces the attribute definitions and specifies the attribute delimiters in the data string. Below the ATTRIBUTES control statement, list the individual attribute definition statements.

Note: Ensure that the $UAHOME/work/KUMPCNFG file mentions the mdl file you have written. um_console normally takes care of this for you, but it is worth cross-checking.
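
One way to do that cross-check from the shell is a plain text search for the metafile name. This is only a minimal sketch: the helper name metafile_registered is hypothetical, and it assumes nothing about KUMPCNFG beyond it being a text file that names each imported metafile somewhere on a line.

```shell
# metafile_registered CONFIG_FILE METAFILE_NAME
# Returns 0 if the metafile name appears anywhere in the config file.
# Hypothetical helper; assumes only that the config file is plain text.
metafile_registered() {
    grep -F -q -- "$2" "$1"
}

# Example use (UAHOME must already be set for your installation):
#   metafile_registered "$UAHOME/work/KUMPCNFG" tds.mdl || echo "tds.mdl is not registered"
```

A fixed-string search (-F) is used so that the dot in the file name is not treated as a regular expression wildcard.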

Choosing a data collection approach

The Universal Agent supports several different ways of providing data to the monitoring server, including log file parsing, script execution, and receiving data over a socket, all of which will be used in this article. The Universal Agent can also receive events via SNMP and API calls, as well as monitor Web servers via URL polling, and databases via ODBC; however, this article does not make use of those capabilities.

The best choice of data provider is going to be influenced by the source of the data and how you want to monitor it, that is: polled, continuously, or event-driven. Because data manipulation (for example, calculating response times, hit ratios, or percentages) must be done by the data provider before the TEMS receives it and passes it on to the portal, the choice of data provider becomes important.

When the data of interest is already in a log file and just needs to be parsed and provided to the monitoring server as it is written to the log file, the “log file adapter” is the most useful tool to use. If only simple calculations (add, subtract, divide or multiply integers) need to be performed on the data before sending it to the monitoring server, the calculations can be embedded in the metafile data descriptions and performed by the log file adapter.

If more advanced processing needs to be done (for example, calculating response times from timestamps), the best approach is the “socket provider”, which can receive data from a program or script that reads the log file, performs the processing, and writes the data to a socket connected to the Universal Agent.

When the data needs to be retrieved periodically from some utility (for example, using ldapsearch to fetch the cn=monitor statistics), then a shell script invoked by the script data provider is the best approach.

Describing the data to be monitored

All of the definitions developed in the rest of this article are contained in the metafile for the TDS application. The file is named tds.mdl, and it starts by declaring the name of the application as shown below.

//APPL TDS @ Monitoring for Tivoli Directory Server

Note: Everything after the "@" is a comment

This declares the name of the application as TDS, which will show up in the Portal underneath the node name as an application named TDSnn, where the nn is the version number of the metafile, starting initially at 00 and incrementing each time you make significant changes to it. The IBM Tivoli Universal Agent User's Guide lists the types of metafile updates that result in major or minor version changes.

After the name of the application, the data description is provided to the Universal Agent in the format shown below.

//NAME LDAP_Monitor P 90 AddTimeStamp @ LDAP stats from cn=monitor
//SOURCE SCRIPT TDSMon.sh envfile=TDS.env Interval=60
//INTERNAL OUTPUT LdapMonitor
//ATTRIBUTES ';'

In the above example, the data description starts with the //NAME line, which names the attribute group LDAP_Monitor, gives it a type of P (polled) with a 90 second lifetime, and has the Universal Agent add a timestamp each time the script is run. The //SOURCE line declares that the source of the data is a script named TDSMon.sh, with environment variables set from the TDS.env file, to be run every 60 seconds. The //INTERNAL line saves the data to an internal buffer named LdapMonitor for further processing by another attribute group. The //ATTRIBUTES line declares that the data is separated by semicolons.

After the data description shown above, the type of each attribute is specified. Here are some examples of the different attribute types:

Attribute types
APP_Version D 50
TotalConnections C 99999999
ElapsedTimeSeconds C 99999999 Scale{9} Precision{11}

Display strings, such as the APP_Version string, are declared with a “D” followed by the maximum length of the string. Integer values, such as the TotalConnections counter, are declared with a “C” followed by the maximum value of the integer. Floating point values, such as ElapsedTimeSeconds, can also have the Scale and Precision specified.





Important Tivoli Directory Server metrics

Some of the questions that a monitoring deployment should be able to answer about Tivoli Directory Server are as follows:

  • Is the server available?
  • How busy is the server?
  • Is the server about to run out of memory?
  • Is the server backlogged?
  • How responsive is the server?
  • Is the database about to run out of space?
  • Are there any errors being logged?

The monitors described in this article answer those questions by collecting the following data from the Tivoli Directory Server:

  • Server availability
  • Server process activity
  • Size of the server process
  • Workflow queue sizes
  • LDAP request response times
  • Tablespace sizes and utilization
  • Errors logged to the Tivoli Directory Server message log

Each of these metrics is discussed separately below and a monitoring solution is developed to report that metric to the monitoring server.





Tivoli Directory Server monitoring

This section provides details as to how the monitor search results can be analyzed to derive the directory server metrics mentioned in the previous section.

Tivoli Directory Server availability and workload

The easiest way of checking the availability of the Tivoli Directory Server is by sending it a search request. If a monitor search request is used to check the server availability, it can also help us get workload statistics. A base-level search against the base cn=monitor for "objectclass=*" will return the following information:

CN=MONITOR
version=IBM Tivoli Directory (SSL), Version 5.2
totalconnections=18366
total_ssl_connections=0
total_tls_connections=0
currentconnections=51
maxconnections=65516
writewaiters=0
readwaiters=0
opsinitiated=55365
livethreads=1
opscompleted=55365
entriessent=21840
searchesrequested=18681
searchescompleted=18680
bindsrequested=18366
bindscompleted=18366
unbindsrequested=18315
unbindscompleted=18315
addsrequested=0
addscompleted=0
deletesrequested=0
deletescompleted=0
modrdnsrequested=0
modrdnscompleted=0
modifiesrequested=4
modifiescompleted=4
comparesrequested=0
comparescompleted=0
abandonsrequested=0
abandonscompleted=0
extopsrequested=0
extopscompleted=0
unknownopsrequested=0
unknownopscompleted=0
slapderrorlog_messages=21
slapdclierrors_messages=0
auditlog_messages=55365
auditlog_failedop_messages=4
filter_cache_size=25000
filter_cache_current=21
filter_cache_hit=275
filter_cache_miss=103
filter_cache_bypass_limit=100
entry_cache_size=25000
entry_cache_current=1061
entry_cache_hit=2504
entry_cache_miss=1061
acl_cache=TRUE
acl_cache_size=25000
cached_attribute_total_size=0
cached_attribute_configured_size=0
currenttime=2006-12-03 23:21:12 GMT
starttime=2006-11-27 14:41:05 GMT
trace_enabled=FALSE

The above output needs to be formatted into a single line of data separated by semicolons for the Universal Agent's consumption. This can be done with a simple awk script:


NR>2{printf "; "}NR>1{printf "%s", substr($0, index($0,"=")+1)}END{print ""}

The output of the monitor search above can be piped to the awk script to get the data in the desired format. The script skips the first line (the entry DN) and prints everything after the first "=" on each remaining line. TDSMon.sh is the script in which the ldapsearch command and the awk construct are put together. The contents of the TDSMon.sh script would be:

ldapsearch -h "$LDAPSERVER" -s base -b cn=monitor "objectclass=*" | awk 'NR>2{printf "; "}NR>1{printf "%s", substr($0, index($0,"=")+1)}END{print ""}'
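
To see what the Universal Agent receives, the formatting step can be run on its own against a few lines of the sample cn=monitor output shown earlier. The sketch below wraps a lightly hardened variant of the awk program (it prints each value with an explicit "%s" format and keeps everything after the first "=") in a function; the function name format_monitor is hypothetical and is not part of the article's scripts.

```shell
# format_monitor: read "attribute=value" lines on stdin, skip the first line
# (the entry DN), and print the values on one line separated by "; ".
# Hypothetical wrapper name used only for this illustration.
format_monitor() {
    awk 'NR>2 { printf "; " }
         NR>1 { printf "%s", substr($0, index($0, "=") + 1) }
         END  { print "" }'
}

# Example with the first few lines of the sample output:
format_monitor <<'EOF'
CN=MONITOR
version=IBM Tivoli Directory (SSL), Version 5.2
totalconnections=18366
total_ssl_connections=0
EOF
# prints: IBM Tivoli Directory (SSL), Version 5.2; 18366; 0
```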

Now the mdl file needs to be updated to display the attributes gathered out of the monitor search. The associated mdl file (tds.mdl) will look like the following:

LDAP_Monitor Attribute group
//NAME LDAP_Monitor P 90 AddTimeStamp @ LDAP stats from cn=monitor
//SOURCE SCRIPT TDSMon.sh envfile=TDS.env Interval=60
//INTERNAL OUTPUT LdapMonitor
//ATTRIBUTES ';'
APP_Version D 50
TotalConnections C 99999999
TotalSSLConnections C 99999999
TotalTLSConnections C 99999999
CurrentConnections C 99999999
MaxConnections C 99999999
WriteWaiters C 99999999
ReadWaiters C 99999999
OpsInitiated C 999999999
LiveThreads C 999999
OpsCompleted C 999999999
EntriesSent C 999999999
SearchesRequested C 999999999
SearchesCompleted C 999999999
BindsRequested C 999999999
BindsCompleted C 999999999
UnbindsRequested C 999999999
UnbindsCompleted C 999999999
AddsRequested C 999999999
AddsCompleted C 999999999
DeletesRequested C 999999999
DeletesCompleted C 999999999
ModRdnsRequested C 999999999
ModRdnsCompleted C 999999999
ModifiesRequested C 999999999
ModifiesCompleted C 999999999
ComparesRequested C 999999999
ComparesCompleted C 999999999
AbandonsRequested C 999999999
AbandonsCompleted C 999999999
ExtOpsRequested C 999999999
ExtOpsCompleted C 999999999
UnknownOpsRequested C 999999999
UnknownOpsCompleted C 999999999
SlapdErrorLogMessages C 999999
SlapdCliErrorsMessages C 999999
AuditLogMessages C 999999
AuditLogFailedOpMessages C 999999
FilterCacheSize C 999999
FilterCacheCurrent C 999999
FilterCacheHit C 999999
FilterCacheMiss C 999999
FilterCacheBypassLimit C 999999
EntryCacheSize C 999999
EntryCacheCurrent C 999999
EntryCacheHit C 999999
EntryCacheMiss C 999999
AclCache C 999999
AclCacheSize C 999999
CachedAttributeTotalSize C 999999
CachedAttributeConfiguredSize C 999999
CurrentTime D 100
StartTime D 100
TraceEnabled D 100
TraceMessageLevel D 100
TraceMessageLog D 100
EnCurrentRegs C 999999
EnNotificationsSent C 999999
BypassDerefAliases D 10
AvailableWorkers C 999999
CurrentWorkqueueSize C 999999
LargestWorkqueueSize C 999999
IdleConnectionsClosed C 999999
AutoConnectionCleanerRun C 999999
EmergencyThreadRunning C 99
TotaltimesEmergencyThreadRun C 99999
LasttimeEmergencyThreadRun C 999999
ElapsedTimeSeconds C 99999999 Scale{9} Precision{11}

The contents of TDS.env file, which is used as an environment to the above script, are:

LDAPSERVER=1.2.3.4
LDAPDN=cn=root
LDAPPW=root

where:
LDAPSERVER is the hostname or IP address of the directory server to be monitored.
LDAPDN is the bind DN.
LDAPPW is the bind password.
The directory server is assumed to be listening on port 389.

Tivoli Directory Server workload rates

Much of the data returned by cn=monitor is in the form of counters, for example, OpsInitiated and OpsCompleted. With some simple arithmetic, useful results can be derived, such as the number of operations currently in progress, which is OpsInitiated - OpsCompleted. The Universal Agent can do this arithmetic for us when derived variables are declared in the metafile. For the example of the operations in progress, the OpsOutstanding counter is declared like this:

OpsOutstanding (OpsInitiated - OpsCompleted)

One can also get rates, such as OpsPerSecond, using the "?" type declaration, like this:

OpsPerSecond ? (OpsInitiated)
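
The arithmetic behind these two derived attributes can be checked by hand. The sketch below assumes that a rate is the change in a counter between two successive samples divided by the seconds between them; that definition is an assumption made for illustration, not a statement of how the Universal Agent computes rates internally. The function names and the sample values are hypothetical.

```shell
# ops_outstanding INITIATED COMPLETED -> operations still in progress
ops_outstanding() {
    echo $(( $1 - $2 ))
}

# ops_per_second PREVIOUS CURRENT INTERVAL_SECONDS -> average rate over the interval
# (assumed definition: counter delta divided by elapsed seconds; the
# interval must be non-zero)
ops_per_second() {
    awk -v prev="$1" -v cur="$2" -v secs="$3" \
        'BEGIN { printf "%.2f\n", (cur - prev) / secs }'
}

# Hypothetical samples: opsinitiated read twice, 60 seconds apart,
# and an opscompleted value that lags the second reading by 5.
ops_outstanding 55965 55960          # prints 5
ops_per_second 55365 55965 60        # prints 10.00
```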

Earlier the data from the LDAP_Monitor attribute group was saved to an internal buffer named LdapMonitor. Here an attribute group is declared that uses that buffer as a source of data.

//NAME LDAPRates P 90 AddTimeStamp @ LDAP stats with additional calculations
//INTERNAL INPUT LdapMonitor
//ATTRIBUTES ';'

In the LDAPRates attribute group, the attributes that aren't useful for calculations are skipped, for example, APP_Version. The attributes to be skipped are prefixed with "-". Here is the entire attribute group:

LDAPRates attribute group
//NAME LDAPRates P 90 AddTimeStamp @ LDAP stats with additional calculations
//INTERNAL INPUT LdapMonitor
//ATTRIBUTES ';'
-APP_Version D 50
TotalConnections ? 999999
-TotalSSLConnections C 999999
-TotalTLSConnections C 999999
CurrentConnections ? 999999
-MaxConnections C 999999
-WriteWaiters C 999999
-ReadWaiters C 999999
OpsInitiated C 9999999
LiveThreads C 999999
OpsCompleted C 9999999
OpsPerSecond ? (OpsInitiated)
OpsOutstanding (OpsInitiated - OpsCompleted)
EntriesSent ? 9999999
SearchesRequested C 9999999
SearchesCompleted C 9999999
SearchesPerSecond ? (SearchesRequested)
SearchesOutstanding (SearchesRequested - SearchesCompleted)
BindsRequested C 9999999
BindsCompleted C 9999999
BindsPerSecond ? (BindsRequested)
BindsOutstanding (BindsRequested - BindsCompleted)
UnbindsRequested C 9999999
UnbindsCompleted C 9999999
UnbindsPerSecond ? (UnbindsRequested)
UnbindsOutstanding (UnbindsRequested - UnbindsCompleted)
AddsRequested C 9999999
AddsCompleted C 9999999
AddsPerSecond ? (AddsRequested)
AddsOutstanding (AddsRequested - AddsCompleted)
DeletesRequested C 9999999
DeletesCompleted C 9999999
DeletesPerSecond ? (DeletesRequested)
DeletesOutstanding (DeletesRequested - DeletesCompleted)
ModRdnsRequested C 9999999
ModRdnsCompleted C 9999999
ModRdnsPerSecond ? (ModRdnsRequested)
ModRdnsOutstanding (ModRdnsRequested - ModRdnsCompleted)
ModifiesRequested C 9999999
ModifiesCompleted C 9999999
ModifiesPerSecond ? (ModifiesRequested)
ModifiesOutstanding (ModifiesRequested - ModifiesCompleted)
ComparesRequested C 9999999
ComparesCompleted C 9999999
ComparesPerSecond ? (ComparesRequested)
ComparesOutstanding (ComparesRequested - ComparesCompleted)
AbandonsRequested C 9999999
AbandonsCompleted C 9999999
AbandonsPerSecond ? (AbandonsRequested)
AbandonsOutstanding (AbandonsRequested - AbandonsCompleted)
ExtOpsRequested C 9999999
ExtOpsCompleted C 9999999
ExtOpsPerSecond ? (ExtOpsRequested)
ExtOpsOutstanding (ExtOpsRequested - ExtOpsCompleted)
UnknownOpsRequested C 9999999
UnknownOpsCompleted C 9999999
UnknownOpsPerSecond ? (UnknownOpsRequested)
UnknownOpsOutstanding (UnknownOpsRequested - UnknownOpsCompleted)
-SlapdErrorLogMessages C 999999
-SlapdCliErrorsMessages C 999999
-AuditLogMessages C 999999
-AuditLogFailedOpMessages C 999999
-FilterCacheSize C 999999
-FilterCacheCurrent C 999999
FilterCacheUsage (FilterCacheCurrent % FilterCacheSize)
-FilterCacheHit C 999999
-FilterCacheMiss C 999999
-FilterCacheAttempts (FilterCacheHit + FilterCacheMiss)
FilterCacheHitRatio (FilterCacheHit % FilterCacheAttempts)
-FilterCacheBypassLimit C 999999
-EntryCacheSize C 999999
-EntryCacheCurrent C 999999
EntryCacheUsage (EntryCacheCurrent % EntryCacheSize)
-EntryCacheHit C 999999
-EntryCacheMiss C 999999
-EntryCacheAttempts (EntryCacheHit + EntryCacheMiss)
EntryCacheHitRatio (EntryCacheHit % EntryCacheAttempts)
-AclCache C 999999
-AclCacheSize C 999999
AclCacheUsage (AclCache % AclCacheSize)
-CachedAttributeTotalSize C 999999
-CachedAttributeConfiguredSize C 999999
CachedAttributeCacheUsage (CachedAttributeTotalSize % CachedAttributeConfiguredSize)
-CurrentTime D 100
-StartTime D 100
-TraceEnabled D 100
-TraceMessageLevel D 100
-TraceMessageLog D 100
-EnCurrentRegs C 999999
-EnNotificationsSent C 999999
-BypassDerefAliases D 10
-AvailableWorkers C 999999
-CurrentWorkqueueSize C 999999
-LargestWorkqueueSize C 999999
Workload (CurrentWorkqueueSize % AvailableWorkers)
-IdleConnectionsClosed C 999999
-AutoConnectionCleanerRun C 999999
-EmergencyThreadRunning C 99
-TotaltimesEmergencyThreadRun C 99999
-LasttimeEmergencyThreadRun C 999999
-ElapsedTimeSeconds C 99999999 Scale{9} Precision{11}

Note that "%" is used here to divide and multiply by 100 (very useful for calculating percentages). Also note that attributes like FilterCacheAttempts are intermediate derived values: they provide input to the calculation of attributes like FilterCacheHitRatio and are not actually displayed.
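
As a worked example of the "%" operator, the cache counters from the sample cn=monitor output shown earlier give the ratios below. The helper name pct is hypothetical, and the result is printed with two decimals purely for readability; how the Universal Agent rounds or scales the result is not assumed here.

```shell
# pct NUMERATOR DENOMINATOR -> numerator / denominator * 100
# Prints 0.00 when the denominator is zero (a choice made for this sketch).
pct() {
    awk -v n="$1" -v d="$2" \
        'BEGIN { if (d == 0) printf "0.00\n"; else printf "%.2f\n", n * 100 / d }'
}

# FilterCacheHitRatio: filter_cache_hit=275, filter_cache_miss=103
pct 275 $(( 275 + 103 ))      # prints 72.75
# EntryCacheHitRatio: entry_cache_hit=2504, entry_cache_miss=1061
pct 2504 $(( 2504 + 1061 ))   # prints 70.24
# EntryCacheUsage: entry_cache_current=1061, entry_cache_size=25000
pct 1061 25000                # prints 4.24
```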

Viewing Tivoli monitoring data

This section provides screenshots of the graphs and reports that the portal provides, in response to the metafile and scripts shown in the earlier section. There are a couple of ways for viewing the graphs and reports:

  • Launch the GUI for managing the Tivoli Enterprise Monitoring Services using the command:

    $CANDLEHOME/bin/itmcmd manage

    In this GUI, right-click on the Tivoli Enterprise Portal Desktop Client and click Start.
  • Launch the Tivoli Enterprise Portal in a browser using the link:

    http://host:1920///cnp/kdb/lib/cnp.html

    where host is the name of the system where the Tivoli Enterprise Portal Server is running.

When the Tivoli Enterprise Portal is launched, the navigator on the left shows the hostname of the monitoring server under the node of the corresponding operating system. Under the Universal Agent, select the application host:TDSxx to list the attribute groups defined in your mdl file.

Here is a screenshot of the navigator showing a Universal Agent on a Linux system.
Figure 3: Universal Agent navigator

The attribute groups describing the monitor statistics are LDAP_Monitor and LDAPRates. As explained earlier, the LDAPRates attribute group contains a derived field for each type of operation showing the number of operations currently being worked on by the server. A bar chart can be plotted with all the "...Outstanding" counters, replacing the default Table view at the top of the screen.

  • Open the report for the LDAPRates attribute group by clicking it in the navigator on the left.
  • Click the Bar Chart tool icon and then click on the Table view that contains the data to be plotted.
  • A dialog box pops up in which you can select the attributes to be plotted in the bar chart.
  • Select all of the "...Outstanding" items (use the Ctrl key to make multiple selections) and click OK.

The bar chart will now display in the top view, using default settings:
Figure 4: Outstanding operations

To change the label for the chart above:

  • Right-click on the new chart and choose Properties....
  • Click on the Style tab and change the text field from Bar Chart to Current Workload.
  • Click on the picture of the legend on the right, click on the Legend Label tab, and remove Outstanding from all the labels.
  • Click on OK and you will see the updated chart.

Here's the chart with custom labels:
Figure 5: Directory Server workload

To save all this work, click on any other attribute group, and answer Yes when asked if you want to save your changes to the LDAPRates attribute group.

The LDAPRates attribute group also has several attributes related to the cache utilizations and hit ratios. Figure 1 provides the report on cache utilizations and hit ratios.





Tivoli Directory Server audit log

This section details the procedure for analyzing the audit log to calculate the response times of operations.

Enabling the Tivoli Directory Server audit log

In order to get Tivoli Directory Server response times, the audit log needs to be enabled. This can be done either via the Tivoli Directory Server Web Administration GUI or via an ldapmodify command. See the Tivoli Directory Server Administration Guide for information on enabling the audit log via the GUI. You can easily turn on and off auditing for different operations via an ldapmodify operation against the cn=audit,cn=localhost object. You can also query the current state of auditing via an ldapsearch against the cn=audit,cn=localhost object. For example, on Tivoli Directory Server v5.2:

# ldapsearch -Dcn=root -wtivoli -s base -b cn=audit,cn=localhost "objectclass=*"
CN=AUDIT,CN=LOCALHOST
objectclass=ibm-auditConfig
objectclass=ibm-slapdConfigEntry
objectclass=top
cn=audit
ibm-auditLog=/var/ldap/audit.log
ibm-auditVersion=2
ibm-auditBind=true
ibm-auditUnbind=true
ibm-audit=false
ibm-auditfailedoponly=false
ibm-auditsearch=false
ibm-auditadd=false
ibm-auditmodify=false
ibm-auditdelete=false
ibm-auditmodifydn=false
ibm-auditextopevent=false
ibm-auditextop=false

For TDS v6.0, the audit log configuration is stored under cn=Audit,cn=Log Management,cn=Configuration. Hence, if you want to enable or disable the audit log for TDS v6.0, use that as the base. For the sake of this example, TDS v5.2 is assumed.

By default auditing is off (ibm-audit=false). You'll want to set ibm-audit=true as well as ibm-auditFailedOPonly=false and ibm-auditXXX=true for each operation XXX you are interested in monitoring. Create the following LDIF file (ldapaudit.52.on) to be used in enabling the audit log:

dn: cn=audit,cn=localhost
changetype: modify
replace: ibm-audit
ibm-audit: true

dn: cn=audit,cn=localhost
changetype: modify
replace: ibm-auditFailedOPonly
ibm-auditFailedOPonly: false

dn: cn=audit,cn=localhost
changetype: modify
replace: ibm-auditSearch
ibm-auditSearch: true

dn: cn=audit,cn=localhost
changetype: modify
replace: ibm-auditAdd
ibm-auditAdd: true

dn: cn=audit,cn=localhost
changetype: modify
replace: ibm-auditModify
ibm-auditModify: true

dn: cn=audit,cn=localhost
changetype: modify
replace: ibm-auditDelete
ibm-auditDelete: true

dn: cn=audit,cn=localhost
changetype: modify
replace: ibm-auditModifyDN
ibm-auditModifyDN: true

dn: cn=audit,cn=localhost
changetype: modify
replace: ibm-auditExtOPEvent
ibm-auditExtOPEvent: true

dn: cn=audit,cn=localhost
changetype: modify
replace: ibm-auditExtOp
ibm-auditExtOp: true

This turns on auditing for every operation. The modification is passed to the Tivoli Directory Server using ldapmodify:

# ldapmodify -Dcn=root -wpassword -f ldapaudit.52.on
modifying entry cn=audit,cn=localhost

modifying entry cn=audit,cn=localhost

modifying entry cn=audit,cn=localhost

modifying entry cn=audit,cn=localhost

modifying entry cn=audit,cn=localhost

modifying entry cn=audit,cn=localhost

modifying entry cn=audit,cn=localhost

modifying entry cn=audit,cn=localhost

modifying entry cn=audit,cn=localhost
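
Because every stanza in the LDIF file has the same shape, the file can also be generated rather than typed by hand. The following is only a sketch: the function name audit_ldif is hypothetical, and the example call assumes the TDS v5.2 base DN used in this section. It emits one replace stanza per attribute=value argument.

```shell
# audit_ldif BASE_DN ATTR=VALUE [ATTR=VALUE ...]
# Writes one LDIF "replace" stanza per attribute to stdout.
audit_ldif() {
    dn=$1
    shift
    for pair in "$@"; do
        attr=${pair%%=*}     # text before the first "="
        value=${pair#*=}     # text after the first "="
        printf 'dn: %s\nchangetype: modify\nreplace: %s\n%s: %s\n\n' \
            "$dn" "$attr" "$attr" "$value"
    done
}

# Example: regenerate the first two stanzas of the LDIF file.
audit_ldif "cn=audit,cn=localhost" ibm-audit=true ibm-auditFailedOPonly=false
```

The output can be redirected to a file and passed to ldapmodify with the -f option in the same way as the hand-written file.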

Audit log with calculated response times

When the Tivoli Directory Server audit log is enabled, the server writes information to a file (/var/ldap/audit.log in this example) every time it sends the response to a request. Here is a snippet of an audit log for a bind followed by a search.

AuditV2--2006-08-08-12:02:25.797-06:00DST--V3 Bind--bindDN: cn=root--client: 9.48.171.95:62516--connectionID: 1642--received: 2006-08-08-12:02:25.797-06:00DST--Success
name: cn=root
authenticationChoice: simple
AuditV2--2006-08-08-12:02:32.881-06:00DST--V3 Search--bindDN: cn=root--client: 9.48.171.95:60905--connectionID: 155--received: 2006-08-08-12:02:32.875-06:00DST--Success
base: eruid=ITIM Manager,ou=systemUser,ou=itim,ou=acme,dc=com
scope: baseObject
derefAliases: neverDerefAliases
typesOnly: false
filter: (objectclass=*)

The first line, starting with "AuditV2", has the same layout for every operation. After "AuditV2--" is the timestamp of the reply, followed by "--", then the operation, then "--bindDN:" and the DN that the client was bound as, followed by "--" again. The client IP address and port are delimited by "client:" and "--", the connection ID is delimited by "connectionID:" and "--", then after "received: " is the timestamp at which the request was received, then "--" and finally the status.
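
The field layout just described can be checked with a small awk program that splits a header line on the "--" separator. This is an illustrative sketch, not the Universal Agent's parser; the function name parse_audit_header is hypothetical, and it assumes that no field value (for example the bind DN) itself contains "--".

```shell
# parse_audit_header: read audit log lines on stdin and, for each header line
# (one that starts with "AuditV2--"), print its fields separated by "|".
parse_audit_header() {
    awk '/^AuditV2--/ {
        n = split($0, f, "--")
        if (n < 8) next                      # not a complete header line
        sub(/^bindDN: /, "", f[4])
        sub(/^client: /, "", f[5])
        sub(/^connectionID: /, "", f[6])
        sub(/^received: /, "", f[7])
        # reply time | operation | bind DN | client | connection | received | status
        print f[2] "|" f[3] "|" f[4] "|" f[5] "|" f[6] "|" f[7] "|" f[8]
    }'
}

parse_audit_header <<'EOF'
AuditV2--2006-08-08-12:02:25.797-06:00DST--V3 Bind--bindDN: cn=root--client: 9.48.171.95:62516--connectionID: 1642--received: 2006-08-08-12:02:25.797-06:00DST--Success
name: cn=root
authenticationChoice: simple
EOF
# prints: 2006-08-08-12:02:25.797-06:00DST|V3 Bind|cn=root|9.48.171.95:62516|1642|2006-08-08-12:02:25.797-06:00DST|Success
```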

The built-in capability of the Universal Agent to read formatted log files can be used to parse the audit log. The attribute group to parse the audit log can be defined as follows:

Attribute group to parse audit log
//NAME AuditLog E @ LDAP audit log
//SOURCE FILE '/var/ldap/audit.log' tail
//RECORDSET 10 NEW(0,==,AuditV2)
//ATTRIBUTES
Timestamp D 33 DLMSTRBGN='AuditV2--' DLMSTREND='--' +FILTER={MATCH(0,200)}
Operation D 20 DLMSTREND='--'
BindDN D 20 DLMSTRBGN='bindDN: ' DLMSTREND='--'
Client D 24 DLMSTRBGN='client: ' DLMSTREND='--'
ConnectionID D 12 DLMSTRBGN='connectionID: ' DLMSTREND='--'
ReceivedTime D 33 DLMSTRBGN='received: ' DLMSTREND='--'
Status D 10

After the file name, the tail keyword tells the Universal Agent to watch for new records being appended to the end of the log file. Each record consists of up to 10 lines, the first of which starts with "AuditV2"; this is conveyed to the Universal Agent using the "//RECORDSET" keyword. The DLMSTRBGN keyword gives the string immediately preceding the desired data and the DLMSTREND keyword gives the string that follows it. The lines of interest are the ones that begin with "AuditV2--" followed by a timestamp (which begins with "200"), so "+FILTER={MATCH(0,200)}" is used to say that the timestamp must begin with "200". If it does not, the line is ignored.

All of the fields are strings, so the type "D" is used.

The objective of parsing the audit log is to know how long the server is taking to respond to client requests. This can be calculated by taking the difference between the two timestamps in each audit record. This is a little more complicated arithmetic than the Universal Agent is capable of, so a custom script needs to be written to do the calculations.

Suppose that ldapaudit.awk is an awk script that does this. Running it against the bind and search examples produces the following results:

2006-08-08-12:02:25.797 2006-08-08-12:02:25.797 0 ms Bind cn=root 9.48.171.95:62516 1642 Success
2006-08-08-12:02:32.875 2006-08-08-12:02:32.881 6 ms Search cn=root 9.48.171.95:60905 155 Success base: eruid=ITIM Manager,ou=systemUser,ou=itim,ou=acme,dc=com scope: baseObject filter: (objectclass=*)
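The timestamp arithmetic itself is small. The following Python sketch shows one way to compute the response time in milliseconds from the two timestamps of a record. It is not the ldapaudit.awk script; the function names are hypothetical, and the sketch assumes that both timestamps in a record carry the same UTC offset, so the offset suffix can simply be dropped before subtracting.

```python
from datetime import datetime, timedelta

# Length of the date-and-time part, without the trailing UTC offset
# (for example "-06:00DST"), which this sketch ignores.
STAMP_LEN = len("2006-08-08-12:02:25.797")

def parse_stamp(stamp):
    """Parse the date and time portion of an audit log timestamp."""
    return datetime.strptime(stamp[:STAMP_LEN], "%Y-%m-%d-%H:%M:%S.%f")

def response_ms(received, replied):
    """Whole milliseconds between receiving a request and replying to it."""
    delta = parse_stamp(replied) - parse_stamp(received)
    return delta // timedelta(milliseconds=1)

# The search example: received at 12:02:32.875, replied at 12:02:32.881.
print(response_ms("2006-08-08-12:02:32.875-06:00DST",
                  "2006-08-08-12:02:32.881-06:00DST"))  # prints 6
```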

The data needs to be passed to the socket data provider, which is listening on port 7500. There are several different ways of associating the passed data with the correct attribute definition, but the most flexible way is to pass the name of the metafile when the connection is first made, then include the name of the application and attribute group on each line of the passed data, in the following form:

<ApplName=TDS><AttrGroup=Audit_Times>

The following Perl script (feedsock.pl) takes care of this when invoked as "perl feedsock.pl -m tds.mdl -n TDS -g Audit_Times":

#!/usr/bin/perl -w
# feedsock.pl
# a simple UA client using IO::Socket
# takes stdin and feeds data to the UA
#----------------
use strict;
use IO::Socket;
use Getopt::Std;

our $opt_h; # h = host
our $opt_p; # p = port
our $opt_m; # m = metafile
our $opt_n; # n = application name
our $opt_g; # g = attribute group
getopts('h:p:m:n:g:');
my $host = $opt_h ? $opt_h : 'localhost';
my $port = $opt_p ? $opt_p : 7500;
my $metafile = $opt_m ? $opt_m : 'Sock';
my $applname = $opt_n ? $opt_n : 'SockEvent';
my $attrgroup = $opt_g ? $opt_g : 'Events';
my $prefix = "<ApplName=$applname><AttrGroup=$attrgroup>";
print "Prefix is $prefix\n";
print "Connecting to $port on $host\n";

# Initialize socket connection to UA
#----------------
my $line;
my $sock = IO::Socket::INET->new(PeerAddr => $host, PeerPort => $port, Proto => 'tcp');
$sock or die "no socket: $!";

# Send explicit specification of metafile
#----------------
syswrite $sock, "//$metafile\r";

while ($line=<>) {
syswrite $sock, "$prefix$line\r";
}

# Finalization Processing
#---------------
syswrite $sock, "//END-DP-INPUT\n";
close $sock;

Because the script tags every line with the application name and the attribute group, the Universal Agent accepts the data into the Audit_Times attribute group, which is defined as follows:

Audit_Times attribute group
//NAME Audit_Times E @ LDAP audit log with response times
//SOURCE SOCK localhost
//ATTRIBUTES ' '
StartTime D 33
StopTime D 33
ResponseTime C 99999
-msTag D 2
Operation D 12
BindDN D 40
ClientIP D 24
ConnID D 12
Status D 12
Parameters Z 1024

This attribute group tells the socket data provider to expect values separated by spaces. The "ms" label after the response time is ignored. The "Z" type for the "Parameters" attribute assigns the remainder of the line following the status value to it. In other words, the search parameters and the name of the entry being modified or deleted are collected in the attribute named Parameters.
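The effect of this definition on one line of input can be sketched in Python as follows. This mimics the splitting behavior described above for illustration only; it is not how the Universal Agent is implemented, and the function name `split_record` is an assumption.

```python
# Attribute names in the order they appear in the Audit_Times definition.
FIELDS = ["StartTime", "StopTime", "ResponseTime", "msTag", "Operation",
          "BindDN", "ClientIP", "ConnID", "Status", "Parameters"]

def split_record(line):
    """Split a space-separated line; the last field takes the rest of the line."""
    parts = line.rstrip("\r\n").split(" ", len(FIELDS) - 1)
    parts += [""] * (len(FIELDS) - len(parts))   # a bind has no parameters
    record = dict(zip(FIELDS, parts))
    del record["msTag"]                          # the "ms" label is ignored
    record["ResponseTime"] = int(record["ResponseTime"])
    return record

line = ("2006-08-08-12:02:32.875 2006-08-08-12:02:32.881 6 ms Search cn=root "
        "9.48.171.95:60905 155 Success base: eruid=ITIM Manager,"
        "ou=systemUser,ou=itim,ou=acme,dc=com scope: baseObject "
        "filter: (objectclass=*)")
record = split_record(line)
print(record["Operation"], record["ResponseTime"], record["Status"])
print(record["Parameters"])
```

One consequence of a space delimiter, at least in this sketch, is that a bind DN containing a space would shift the later fields, so a feeder script may need to quote or rewrite such DNs before sending them.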

Plotting a bar chart of response times in the AUDIT_TIMES attribute group's workspace makes any unusually high response times easy to spot.
Figure 6: AUDIT_TIMES attribute group

Setting up situations

A “situation” is a mechanism whereby Tivoli Monitoring can flag conditions that need to be brought to the attention of administrators, for example, response times that are starting to get too high. When a situation fires, a warning icon reflecting the severity of the situation is displayed next to the attribute group's name in the navigator. Situations are rolled up in the hierarchy of the navigator: higher levels display the icon of the most severe situation, so you can immediately drill down to the worst problems first.

The section on “situations for event-based monitoring” in the IBM Tivoli Monitoring 6.1 User's Guide describes how to set up situations in general. This article walks through setting up a situation for the Tivoli Directory Server response times.

Start by right-clicking the AUDIT_TIMES attribute group in the navigator and selecting Situations... from the popup menu to open the Situation editor. Right-click AUDIT_TIMES and select Create New.... Fill in the fields for the name and description of the situation.
Figure 7: Situations for AUDIT_TIMES

Click OK and then select the Response Time field from the Attribute Item list in the "Select condition" dialog.
Figure 8: Condition for situation

Click OK and the Situation editor will now let you specify a formula. Click the cell immediately below the attribute name heading, then click the == operator and select > Greater than from the menu.
Figure 9: Formula for the situation

Now enter the value at which you want the situation to be triggered, for example 10000 (10 seconds). Click on the next cell down and click on the formula box (with the "v") so you can choose the Average value formula.
Figure 10: Enter value for triggering situation

Enter 1000 (1 second) as the threshold for the average value. Then click the Expert Advice tab, where you can add helpful suggestions.
Figure 11: Expert advice
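
As a rough illustration of what the two formula rows check, the following Python sketch evaluates the same thresholds against a list of response times in milliseconds. It is not how Tivoli Monitoring evaluates situations internally. The sketch assumes that the two rows are combined with OR and that the average is taken over the rows collected in one sampling interval; the function and constant names are hypothetical.

```python
from statistics import mean

SINGLE_LIMIT_MS = 10000   # any single response slower than 10 seconds
AVERAGE_LIMIT_MS = 1000   # average response slower than 1 second

def situation_is_true(response_times_ms):
    """Return True if either threshold from the formula is exceeded."""
    if not response_times_ms:
        return False      # nothing collected in this interval
    return (max(response_times_ms) > SINGLE_LIMIT_MS
            or mean(response_times_ms) > AVERAGE_LIMIT_MS)

print(situation_is_true([6, 0, 12]))          # fast responses only
print(situation_is_true([6, 12000]))          # one very slow response
print(situation_is_true([1500, 900, 1200]))   # slow on average
```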

Now click OK. The default sampling interval is 15 minutes, so you'll need to wait a little while before the situation fires. Then the navigator will show the critical icon at each level from AUDIT_TIMES up to the Enterprise.
Figure 12: Critical icon

The Enterprise-level workspace will also display current situations when you first log in to the Portal.
Figure 13: Enterprise-level workspace

If you right-click on the entry in the Situation Event Console, you can select Situation Event Results...
Figure 14: Situation event console

Here you can see the values that caused the situation along with the help page that you created.
Figure 15: Help page

You can create a similar situation for the cache hit ratios, as well as process memory utilization. If you see that your hit ratio is low while the cache is full, then it would potentially be useful to increase the size of the cache. If, however, the process size is getting close to any limit you've set, then you might not want to increase the cache size, and might even want to consider reducing the cache size to reduce the process size.

More tuning information is available in the IBM Tivoli Directory Server Performance Tuning Guide, listed in the Resources section.

For information on more active notification of situations, such as sending a text message, see the IBM Tivoli Monitoring 6.1 User's Guide (search for "reflex automation").





Prototypes

A prototype directory server monitoring solution has been developed that illustrates the scripts and metafiles discussed above, along with a white paper that describes it. The complete package, containing the latest version of the white paper and the available scripts and metafiles, is available from the Open Process Automation Library (OPAL). Search for "directory server using universal agent".





Conclusion

This document provides the steps to set up a solution to monitor a Tivoli Directory Server environment, based upon best practices observed from successful customer deployments. Additionally, the instructions provide enough background to develop new views or to customize the offering to suit individual needs.



Resources



About the authors

Dave Bachmann

Dave Bachmann is the performance lead for Tivoli Security Products. Based in Austin, Texas in the United States, he has been working on the performance of various distributed systems since coming to IBM in 1992. He has helped with the performance of IBM's Distributed Computing Environment (DCE), IBM Tivoli Directory Server (LDAP), IBM Tivoli Access Manager, IBM Public Key Infrastructure, IBM Risk Manager, IBM Tivoli Identity Manager, IBM Tivoli Federated Identity Manager, IBM Tivoli Directory Integrator and IBM Tivoli Security Operations Manager products. He received a B.S. at Iowa State, an M.S. from The University of Michigan, and a Ph.D. from The University of Michigan, where he worked on performance modeling of distributed systems.


Ramakrishna Gorthi

Ramakrishna J Gorthi is a developer for the IBM Tivoli Directory Server, Pune center in India. He has six years of experience in the IT industry, all at IBM, with one year of experience in Level 2 Customer Support for the various versions of the IBM Tivoli Directory Server and the rest in IBM Tivoli Directory Server development and testing. He has authored the TDS IBM Redbook® titled Understanding LDAP. He has written developerWorks articles pertaining to migration and distributed directory scenarios. He holds a degree in Computer Engineering from Pune Institute of Computer Technology, Pune (India). His areas of expertise include IBM Tivoli Directory Server from the Tivoli Security Products and DB2®.


Amit Bhate

Amit Bhate is the development lead for IBM Tivoli Directory Server, Pune center in India. He has a Computer Engineering degree from University of Pune (India). He has over seven years of experience at IBM and has worked on various IBM products. He spent three years in the Level 3 Support team for the IBM Distributed File System product. He was involved in the development activities of IBM DFS 3.0 Client on Windows, Lotus Notes 7.0 and IBM Tivoli Directory Server 6.1.



