Troubleshooting application performance: Part 2: New tools in Lotus Notes/Domino 7

We conclude our two-part series on troubleshooting Domino application performance with a look at new tools introduced in Lotus Notes/Domino 7 that can help you identify potential performance issues in your applications.

Julie Kadashevich, Senior Software Engineer, EMC

Julie Kadashevich has been working as a developer on the programmability team of Domino server since 1997. Her specific area of expertise has to do with anything related to agents.



Raphael Savir, Senior Support Analyst, EMC

Raphael Savir is a senior analyst in support. Since the mid-1990s, he has been developing Domino applications and troubleshooting customer problems, particularly in the areas of performance and security.



05 April 2005


In this article, we continue our discussion about identifying and solving performance problems in Domino applications. In the first article of this series, we looked at tried-and-true methods for identifying problems as well as tips to help you optimize view indexing and agents. In part two, we look ahead to new Lotus Notes/Domino 7 tools that will help you identify trouble spots in your Domino applications. We use these tools to identify performance issues in agents that we created to demonstrate the code tips from the first article.

Troubleshooting application performance: New tools for data collection

Through years of listening to customers discuss their experiences with agents, we have heard that it is very difficult to identify troublesome agents. This is especially true in an HTTP environment where many agents run concurrently, making it difficult to determine which one of them is responsible for hogging CPU or memory resources. In addition, when the problematic agent is identified, it takes a significant amount of work to figure out which portion of the agent code would benefit most from optimization. The new diagnostic tools, which will appear in Lotus Notes/Domino 7, aim to solve these problems.


Application probes in Domino Domain Monitoring

Domino Domain Monitoring (DDM) is a new feature of Lotus Notes/Domino 7. It is a framework, added to all server tasks, that lets each task monitor itself for conditions of interest, based on simple configuration settings specified by the administrator. DDM can monitor only Domino 7 servers; it cannot monitor previous releases of Lotus Domino because those releases do not have the self-monitoring code. In this article, we highlight the application probes that are related to performance issues.

Application probes cover agents and Web services--specifically, scheduled and event-based agents that run in the Agent Manager task and Web agents and Web services that run in the HTTP task. The following application probes are available:

  • Agents and Web services ranked by CPU usage
  • Agents and Web services ranked by memory usage
  • Agents and Web services that run longer than expected
  • Agents whose start times have fallen behind schedule (applicable to Agent Manager only)
  • Agent and Web services error probes:
    • Agent Security
    • Full-text operations on non-full-text indexed databases
    • Agent timeouts
    • Design Update disabling agents during design update

In this article, we discuss the CPU, memory, and full-text probes and how they can assist in troubleshooting performance issues.

Events4.nsf (Monitoring Configuration database), which contains a lot of server configuration information, has been enhanced to contain several new views that hold the configuration for all DDM probes.

Figure 1. Monitoring Configuration database

Events4.nsf that ships with Lotus Notes/Domino 7 includes default probes that you can use as-is or that you can modify to fit the needs of your organization.

Application CPU Usage probes

Let's take a closer look at the CPU Usage probe, which measures the CPU time used by each agent. Measurement is done per agent and runs from the start of the agent to its finish, no matter how long the agent runs. The probe measures the CPU used by just that one agent, even if the agent runs on one of many active threads in a given process. In the case of Java agents, where an agent runs on several different threads, the probe measures every thread that belongs to the agent and reports the sum of the CPU used on all of those threads. This level of detail is not available in any way other than through DDM probe instrumentation.
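To make the idea of per-agent measurement across several threads concrete, here is a minimal Java sketch. It is not how DDM is implemented; it only shows the general technique of reading CPU time per thread with the standard java.lang.management API and summing it over the threads that belong to one unit of work. The class name AgentCpuMeter and the notion of "agent tasks" are our own assumptions for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: sums CPU time over the threads that belong to one unit of work. */
final class AgentCpuMeter {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    /** Runs each task on its own thread and returns the CPU nanoseconds used by those threads. */
    static long totalCpuNanos(List<Runnable> agentTasks) {
        final long[] cpu = new long[agentTasks.size()];
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < agentTasks.size(); i++) {
            final int slot = i;
            final Runnable task = agentTasks.get(i);
            Thread worker = new Thread(() -> {
                task.run();
                if (THREADS.isCurrentThreadCpuTimeSupported()) {
                    // CPU time of this thread only; other threads in the process are not counted.
                    cpu[slot] = THREADS.getCurrentThreadCpuTime();
                }
            });
            workers.add(worker);
            worker.start();
        }
        long total = 0;
        for (int i = 0; i < workers.size(); i++) {
            try {
                workers.get(i).join(); // join makes the worker's write to cpu[i] visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a worker", e);
            }
            total += Math.max(cpu[i], 0); // the JVM reports -1 when measurement is disabled
        }
        return total;
    }
}
```

Because each worker thread is created for exactly one task, its CPU time at completion approximates the CPU used by that task, regardless of what other threads in the process are doing.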

Configuring application probes is very simple. Figure 2 shows the configuration for the CPU Usage probe. All other application probes look very similar to it.

Figure 2. CPU Usage probe configuration

After you choose the probe type (Application probe/ranked by CPU usage), select the server that you want to monitor and whether you want to monitor scheduled agents running in the Agent Manager or Web agents/Web services. Then select the specifics of the probe.

You can select up to four levels of warnings (from Warning Low to Fatal) to be generated based on the amount of CPU used by any agent. If the agent, during its execution, uses more than the configured amount of CPU, an event with appropriate warning level is generated. The event task processes the event and displays the output in the Domino Domain Monitoring database (ddm.nsf).

Figure 2 shows the configuration we used to run the example agents for this article. Setting the lowest level (Warning Low) to one second allows us to generate events for all the agents in our examples, even the ones that used very little CPU. In a production system, you may not want to use such a low threshold because it will likely generate too much information.
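The threshold logic described above can be sketched in a few lines of Java. This is a hedged illustration of the idea, not the DDM implementation: the class CpuThresholds is hypothetical, the names of the two middle levels are our assumption (the article names only Warning Low and Fatal), and every threshold other than the one-second Warning Low value is invented for the example. The sketch also assumes that thresholds increase with severity.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative only: maps an agent's CPU seconds to the most severe threshold it exceeds. */
final class CpuThresholds {
    // Declared from least to most severe; the middle two names are assumptions.
    enum Severity { WARNING_LOW, WARNING_HIGH, FAILURE, FATAL }

    private final Map<Severity, Long> thresholdSeconds = new EnumMap<>(Severity.class);

    /** Configures one level; levels that are never set simply generate no events. */
    CpuThresholds set(Severity level, long seconds) {
        thresholdSeconds.put(level, seconds);
        return this;
    }

    /** Returns the most severe configured level whose threshold the agent exceeded, if any. */
    Optional<Severity> classify(long cpuSeconds) {
        Severity hit = null;
        // An EnumMap iterates in declaration order, so later matches are more severe.
        for (Map.Entry<Severity, Long> entry : thresholdSeconds.entrySet()) {
            if (cpuSeconds > entry.getValue()) {
                hit = entry.getKey();
            }
        }
        return Optional.ofNullable(hit);
    }
}
```

With Warning Low set to 1 second and hypothetical higher levels at 60, 300, and 900 seconds, an agent that uses 56 CPU seconds would be classified as WARNING_LOW and one that uses 1034 seconds as FATAL.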

For illustration purposes, we created a set of agents using the techniques described in part 1 of this article series. Each agent name incorporates the technique used to implement it; for example, Slow Mail Agent (dbSearch) uses the db.Search method. Here is the output as it appears in ddm.nsf.

Figure 3. Domino Domain Monitoring

As you can see, the CPU Usage probe confirms the relative efficiency of these methods:

  • Slow Mail Agent (AllDocs) CPU usage is 1034 seconds.
  • Slow Mail Agent (dbSearch) CPU usage is 624 seconds.
  • Slow Mail Agent (GetMail_ftSearch) CPU usage is 281 seconds.
  • Slow Mail Agent (GetMail_ByView) CPU usage is 56 seconds.
  • Slow Mail Agent (ByViewEntryCollection) CPU usage is 50 seconds.

Application memory probes

Next, we turn our attention to the Memory Usage probe. The goal of DDM is to identify problems rather than provide a status report for each agent. The Memory Usage probe measures and evaluates the memory used by each agent. To provide meaningful information without creating a great drag on server performance, this probe measures only the back-end memory pool rather than every byte used by the agent. Domino memory management is quite complex and operates with many different memory pools, which are used depending on what needs to be accomplished. The back-end memory pool is used to allocate memory for Domino objects (such as NotesDocument and NotesSession). Usage of the back-end memory pool correlates strongly with overall memory usage, which helps us identify memory hogs without measuring every single byte of memory used by each agent. Because we do not measure every byte, we report a memory usage ranking instead of a more specific number.

Any agents evaluated as running with Very High memory utilization have the potential to threaten server stability and should receive high priority attention.

Memory used by the agent is freed when the agent is finished, so the ranking is based on the peak memory usage during the agent run. Like the CPU Usage probe, the Memory Usage probe measures memory only on the threads that belong to that one agent, whether the agent runs on one thread or on several, and regardless of how many other threads are active in the process.

The ranking of the agent depends on two factors: the percentage of available back-end memory used by the agent, and whether or not the agent has to share memory with other agents running in the same process. When you configure Web agents to run concurrently in HTTP, the available memory pool has to be shared by all agents running at the same time. For the server to perform well at peak load, each agent has to be lean and mean.

NOTE: Web agents are configured to run serially or concurrently in the Server document in the Domino Directory, on the Internet protocols - Web server engine tab, in the Web Agents section. The default is to run agents serially.

On the other hand, agents that can use the entire memory pool can afford to use more resources. This is the case when HTTP is configured to run agents serially, and for agents run by the Agent Manager, where concurrent agents run in separate processes. The Memory Application probe therefore ranks the same agent differently depending on whether it runs in HTTP (in concurrent agent mode) or in the Agent Manager. For the following example, we ran the same agent in HTTP and in the Agent Manager; the ranking in HTTP was Very High and the ranking in the Agent Manager was High.
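The following Java sketch shows one way a ranking could depend on both factors. It is purely hypothetical: the article does not describe the actual formula, so the fair-share calculation, the percentage bands, and the LOW and MEDIUM rank names are all assumptions chosen only to reproduce the qualitative behavior described here (the same agent ranks higher when it must share the pool).

```java
/** Illustrative only: ranks peak back-end memory against the share of the pool an agent can use. */
final class MemoryRanking {
    // HIGH and VERY_HIGH appear in the article; LOW and MEDIUM are assumed names.
    enum Rank { LOW, MEDIUM, HIGH, VERY_HIGH }

    /**
     * @param peakBytes         peak back-end memory used by the agent during its run
     * @param poolBytes         size of the back-end memory pool available to the process
     * @param agentsSharingPool 1 for serial HTTP or Agent Manager; more for concurrent HTTP agents
     */
    static Rank rank(long peakBytes, long poolBytes, int agentsSharingPool) {
        if (poolBytes <= 0 || agentsSharingPool < 1) {
            throw new IllegalArgumentException("pool size and agent count must be positive");
        }
        double fairShare = (double) poolBytes / agentsSharingPool;
        double fractionUsed = peakBytes / fairShare;
        // Hypothetical bands, not taken from the product.
        if (fractionUsed >= 0.75) return Rank.VERY_HIGH;
        if (fractionUsed >= 0.40) return Rank.HIGH;
        if (fractionUsed >= 0.15) return Rank.MEDIUM;
        return Rank.LOW;
    }
}
```

Under these assumed bands, an agent that peaks at half of the pool ranks HIGH when it has the pool to itself and VERY_HIGH when two agents share the pool.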

Figure 4. Memory Application probe

The Memory Application probe measures the largest footprint of the agent during its execution. Two Java agents illustrate what the probe shows and how it can be used to optimize your code. The agents are almost identical; the only difference is that one uses the recycle method and the other does not. When the agent without recycle completes, Lotus Domino recycles the objects on the agent's behalf during the clean-up portion of the agent completion logic, but the footprint of the agent during execution is much larger because the back-end memory is not freed until after the agent is finished.

Here is a code snippet from the agent jMem_5000 without the recycle method:

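// Excerpt from the document-creation loop; newDoc, myArray, i, and depth are declared elsewhere.
newDoc.appendItemValue("Form", "JournalEntry");
newDoc.appendItemValue("Subject", "removeme");
RichTextItem rti = newDoc.createRichTextItem("Body");
rti.appendText("lotsoftextgoeshere");
myArray[i] = newDoc; // the document is never recycled, so its back-end memory stays allocated
// ... //
depth = myArray.length;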

Here is a code snippet from the agent jrMem_5000 with the recycle method:

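// Excerpt from the document-creation loop; newDoc, myArray, i, and depth are declared elsewhere.
newDoc.appendItemValue("Form", "JournalEntry");
newDoc.appendItemValue("Subject", "removeme");
RichTextItem rti = newDoc.createRichTextItem("Body");
rti.appendText("lotsoftextgoeshere");
myArray[i] = newDoc;
// ... //
myArray[i].recycle(); // free the back-end memory for this document now
myArray[i] = null;    // drop the Java reference as well
depth = myArray.length;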
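
To see why recycling changes the peak footprint rather than the final state, consider the toy model below. It does not use the Domino Java classes, so it runs anywhere: FakePool and FakeDoc are hypothetical stand-ins for the back-end memory pool and a back-end document, and the document count is arbitrary. The model assumes, as described above, that anything not recycled by the agent is cleaned up only when the agent finishes.

```java
/** Illustrative only: a toy model of peak back-end usage with and without recycle. */
final class RecycleFootprintDemo {
    /** Hypothetical stand-in for the back-end pool: counts live objects and remembers the peak. */
    static final class FakePool {
        private int live;
        private int peak;

        FakeDoc createDocument() {
            live++;
            peak = Math.max(peak, live);
            return new FakeDoc(this);
        }

        int peak() { return peak; }
    }

    /** Hypothetical stand-in for a back-end document. */
    static final class FakeDoc {
        private final FakePool pool;
        private boolean recycled;

        FakeDoc(FakePool pool) { this.pool = pool; }

        void recycle() {
            if (!recycled) {
                recycled = true;
                pool.live--; // give the back-end memory back immediately
            }
        }
    }

    /** Creates docCount documents and returns the peak number alive at any one time. */
    static int peakFootprint(int docCount, boolean recycleEachDoc) {
        FakePool pool = new FakePool();
        FakeDoc[] myArray = new FakeDoc[docCount];
        for (int i = 0; i < docCount; i++) {
            myArray[i] = pool.createDocument();
            if (recycleEachDoc) {
                myArray[i].recycle();
                myArray[i] = null;
            }
        }
        // End-of-agent clean-up: whatever the agent did not recycle is released here.
        for (FakeDoc doc : myArray) {
            if (doc != null) doc.recycle();
        }
        return pool.peak();
    }
}
```

In this model, creating 5000 documents without recycling peaks at 5000 live objects, while recycling each document inside the loop peaks at 1, even though both runs end with nothing allocated.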

And here are the memory reports for each of the agents in the DDM output:

Figure 5. Memory reports

Full-text index operations

Another common source of performance problems is full-text operations on databases that are not full-text indexed. These operations cause a problem because, to perform a full-text operation, a temporary index is created and then deleted after the operation is performed. If an agent sorts incoming mail into folders based on a keyword match, this temporary index is created and deleted each time the agent processes a mail message. For companies that have a policy of not creating full-text indexes for mail files, this happens each time a mail message arrives for any user who has an agent that sorts mail.

There are two ways to invoke full-text search in an agent, as follows:

  • Using a full-text search method, for example: Set dc = db.FTSearch(query$, 0, FT_SCORES, FT_STEMS).
  • Specifying words to match in the Document Selection area of your agent.

Words to match can be specified in the selection criteria of any agent, including simple action agents, which are quite commonly used even by non-programmers. We found that many people do not realize that such selection criteria invoke a full-text search, so we wanted to highlight it in this article.

In Lotus Notes/Domino 6, we generated a warning to the server console that this operation has a serious performance side effect:

01/18/2005 05:10:58 PM Agent Manager: Full text operations on database 'c:\notes_server\data\my\LS_Memory.nsf' which is not full text indexed. This is extremely inefficient.

In Lotus Notes/Domino 7, through DDM probes, we can provide additional useful information by collating the full-text operations on non-full-text indexed databases by database, so you can see which databases get the most queries. This additional information helps you determine which databases would benefit most from the creation of a full-text index.
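To illustrate what collating by database means, the sketch below tallies console warnings of the form shown above and orders the databases by how often they appear. It is not part of DDM; the class FullTextWarningTally is hypothetical, and the regular expression assumes the exact message wording quoted in this article.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: counts "not full text indexed" console warnings per database. */
final class FullTextWarningTally {
    // Assumes the message wording quoted in the article; the database path is captured in group 1.
    private static final Pattern WARNING = Pattern.compile(
            "Full text operations on database '([^']+)' which is not full text indexed");

    /** Returns database path to warning count, ordered from most to least frequent. */
    static Map<String, Integer> countByDatabase(List<String> consoleLines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : consoleLines) {
            Matcher m = WARNING.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        Map<String, Integer> ranked = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : entries) {
            ranked.put(entry.getKey(), entry.getValue());
        }
        return ranked;
    }
}
```

The databases at the top of the returned map are the ones that would benefit most from a full-text index.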

Figure 6. Domino event

Profiler for agents and Web services

After you have identified an agent causing problems, use the Profiler to help you optimize the agent code. Profiler measures the time it takes to perform each method in the agent logic, thus helping you identify bottlenecks. This tool allows you to concentrate on the portions of the code that take the longest amount of time and bring the highest payoff.

Profiler works with both Java and LotusScript agents. It profiles Domino back-end methods, that is, operations on back-end objects (for example, dir.OpenDatabase("db.nsf")), not standard language constructs (for example, Open fileName$).

To profile an agent, the agent profiling has to be turned on for that particular agent. This setting is on the second tab of the Agent Properties box.

Figure 7. Agent Properties box

After the profiling toggle is turned on, the next time the agent runs it will be profiled. Agents can be profiled regardless of how they run (for example, as a scheduled agent, as a Web agent, or manually from the Action menu). The profiling information is stored in a Profile document in the database associated with the agent.

To view profiling information, select the agent you are profiling in Domino Designer, and then choose Agent - View Profile Results. Figures 8, 9, and 10 display Profiler output for several of the agents used as examples in this article. At the top of each Profiler output, you see the name of the agent and the time stamp of when profiling was done. Elapsed time is the total amount of time the agent ran. It is followed by the total measured time, which is typically somewhat smaller because time values are rounded down for display purposes; for example, values under one millisecond are displayed as zeros in the profiling table. The table lists the class, the method, the operation, the total number of calls to that method, and the total time spent on all calls to that method. Rows are sorted in descending order of time, so the methods where the most time was spent appear at the top.
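The structure of the profiling table, and the reason the total measured time can be smaller than the elapsed time, can be sketched as follows. This is a simplified model rather than the Profiler's actual code: the class ProfileTable is hypothetical, the operation column is omitted, and we assume that rounding down happens once per row when the time is displayed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: aggregates timed calls per method, like rows of a profiling table. */
final class ProfileTable {
    static final class Row {
        final String className;
        final String method;
        long calls;
        long totalNanos;

        Row(String className, String method) {
            this.className = className;
            this.method = method;
        }

        /** Time as displayed: whole milliseconds, rounded down (an assumption of this sketch). */
        long displayedMillis() {
            return totalNanos / 1_000_000L;
        }
    }

    private final Map<String, Row> rows = new LinkedHashMap<>();

    /** Adds one timed call to the row for className.method. */
    void record(String className, String method, long elapsedNanos) {
        Row row = rows.computeIfAbsent(className + "." + method, k -> new Row(className, method));
        row.calls++;
        row.totalNanos += elapsedNanos;
    }

    /** Rows ordered so that the method with the most total time comes first. */
    List<Row> sortedByTimeDescending() {
        List<Row> sorted = new ArrayList<>(rows.values());
        sorted.sort(Comparator.comparingLong((Row r) -> r.totalNanos).reversed());
        return sorted;
    }

    /** Sum of the displayed (rounded-down) row times. */
    long totalMeasuredMillis() {
        long sum = 0;
        for (Row row : rows.values()) {
            sum += row.displayedMillis();
        }
        return sum;
    }
}
```

For example, 1000 calls of 900 nanoseconds each add up to 0.9 milliseconds and display as 0, so they contribute nothing to the total measured time even though they did consume time.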

Figure 8. Slow Mail Agent (ByViewEntryCollection) Profile
Figure 9. Slow Mail Agent (GetMail_ftSearch) Profile
Figure 10. Slow Mail Agent (AllDocs) Profile

As you look through these examples of the Profiler, keep the following points in mind:

  • In each case, the code being profiled is designed to walk through mail files on a server and count up document size for every document in every database. This is a boring task which can be accomplished many ways, which makes it ideal for this study. We coded this task a few different ways to demonstrate the capabilities of the Profiler.
  • In the profile output for each agent, the top one or two methods can take up the vast majority of the total execution time. Remember to divide the total time by the number of calls to get the time per call. In some cases, the result may discourage you from trying to optimize that piece of code, but in other cases, such as these, it only strengthens the case for finding the best way to get the collection of documents.
  • As the findings earlier in this article show, Set nvc = view.GetAllEntriesByKey is an outstanding performer in comparison to other methods of getting access to data.
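
The per-call arithmetic mentioned in the list above is simple, but it changes how a profile row should be read. The helper below is a hypothetical illustration; the figures in the usage note are invented, not taken from the profiles shown in the figures.

```java
/** Illustrative only: converts a profile row's total time into time per call. */
final class PerCallCost {
    /** Returns milliseconds per call for a row with the given total time and call count. */
    static double millisPerCall(long totalMillis, long calls) {
        if (calls <= 0) {
            throw new IllegalArgumentException("calls must be positive");
        }
        return (double) totalMillis / calls;
    }
}
```

Two rows with the same 9000 millisecond total tell different stories: at 3 calls, each call costs 3000 milliseconds and the way the collection is obtained is worth rethinking, whereas at 90000 calls each call costs only 0.1 milliseconds and the better fix is usually to make fewer calls.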

Conclusion

Performance tuning is a constant battle of technology and expertise fighting against increased requirements and expectations by your customers. We hope that the information, tips, and tools presented in this article series give you the upper hand.
