With the release of Notes/Domino 7 and the good benchmarking numbers it is producing, the question a lot of customers are asking is, "How much will this help me in my production environment?" In this article, we discuss performance results we observed when upgrading two of IBM's production Domino servers on z/OS for zSeries from Domino 6.5.x to Domino 7. We show the performance of the various Domino tasks that are running in these DPARs (Domino Partition Servers), to help you understand the benefits Domino 7 could have in your environment.
We know that not all customers will have the same results from moving to Domino 7. Some may see better numbers than the published benchmarks, while others may see smaller improvements. Also, even within a single customer site, an individual DPAR might see better numbers than another DPAR in the same location. By understanding which processes in Domino are providing performance enhancements, you will be better able to predict the improvements you're likely to see on your own DPARs.
In this article, we:
- Review the environment for the upgraded Domino 7 DPARs.
- Outline the methodology used for measuring a production DPAR.
- Document the results obtained (up to 28 percent CPU usage improvement) from our upgrade of the production DPARs.
We assume you're an experienced Domino administrator. For a summary of all the new features included in Domino 7, see the developerWorks Lotus article, "New features in Lotus Domino 7.0."
Hardware and software configuration
Both z/OS DPARs (named D01MLC83 and D01MLC96) discussed in this article reside on the same physical hardware in the same LPAR (Logical Partition), called ML96. This box has another LPAR with several other DPARs. The hardware and software configuration for the ML96 LPAR is as follows:
| Machine type | 2086 |
| Number of physical CPUs | 8 |
| Number of shared logical CPUs in LPAR | 5 |
| Number of LPARs | 2 (both z/OS) |
| Storage in LPAR | 8 GB |
| Operating system | z/OS 1.6 |
| File systems | zFS |
The D01MLC83 DPAR was upgraded from Domino 6.5.3 to Domino 7 (Milestone 3) on December 9, 2004, and had approximately 335 registered users. We subsequently upgraded this DPAR to beta 3 (January 27, 2005) and beta 4 (April 29, 2005), and upgraded to the Domino 7 Gold candidate build on August 26, 2005. The number of registered users ranged from 335 to 900 during the beta programs. D01MLC83 currently has 870 registered users, with a "15-minute active" rate of 348 users (in other words, during an average 15-minute interval, 348 users are active), resulting in a 40 percent active rate. (We talk more about the "15-minute active" rate later in this article.) Mail template usage is as follows: 85 percent run the Notes 6 mail template, 10 percent run Notes 5, 3 percent Notes 4.5, and 2 percent run "other."
The D01MLC96 DPAR was upgraded from Domino 6.5.3 to Domino 7 (Milestone 4) on February 26, 2005, and had 3,483 registered users. Since then, we have upgraded this DPAR to beta 4 (April 29, 2005), and upgraded to the Domino 7 Gold candidate build on September 9, 2005. The number of registered users ranged from 3,483 to 3,280 during the beta programs. D01MLC96 currently has 3,280 registered users, with a 15-minute active rate of 1,150 users (36 percent). Mail template usage is 89 percent on the Notes 6 mail template, 7 percent run Notes 5, 3 percent Notes 4.5, and 1 percent run "other."
We ran the following server tasks on both servers:
- ServerTasks=Update, Adminp, CalConn, Collect, Router, Sched, Amgr, http, dbscan mail1.box, dbscan mail2.box, dbscan mail3.box, tmmscan, tmmscan
- ServerTasksAt1=Catalog
- ServerTasksAt2=UpdAll
We ran HTTP with SSL enabled on D01MLC83 only. On both servers, we had SMTP configured for inbound and outbound mail. We used an Extended Directory Catalog containing 495,000 entries, and used Directory Assistance for edircat, as well as another secondary directory (used for public groups). We also enabled LDAP for Domino Web Access authentication on D01MLC83. We had no agent restrictions, so users could run personal agents. We set up Admin4.nsf so it replicated only during off hours. And we had three Mail.box files on each DPAR.
We tweaked the Notes.ini files of the servers to include the following settings:
AMgr_DisableMailLookup=1
DEBUG_THREADID=1
Disable_BCC_Group_Expansion=1
FT_FLY_INDEX_OFF=1
IOCP_DISABLE_ASYNC_NOTIFICATION=1
MAILBOXDISABLETXNLOGGING=1
NSF_BUFFER_POOL_SIZE_MB=256 (on D01MLC96), 172 (on D01MLC83)
NSF_DBCACHE_MAXENTRIES=2300 (on D01MLC96), 516 (on D01MLC83)
RouterEnableMailByDest = 1
SERVER_ENABLE_THREADPOOL=1
SERVER_MAX_CONCURRENT_TRANS=-1 (The default for z/OS is set to unlimited.)
SERVER_POOL_TASKS=125 (We changed this from the default of 100 during a debug/problem analysis. We recommend leaving this value at the default.)
SERVER_SESSION_TIMEOUT=31
TRANSLOG_HARDWARECOMPRESS=1
To provide backups, we ran the Tivoli Storage Manager server on the same LPAR as our Domino servers. Transaction logging was enabled, and used archival logging. Full backups and incremental backups were only run "off shift," with archive logs backed up during the day. And we ran zSeries hardware compression for transaction logging (a feature of Domino 7).
Additional software included Trend Micro Scanmail running with two tmmscans per DPAR, and IBM Tivoli Monitoring for Messaging and Collaboration.
Before we discuss our results, we first need to explain how to measure a production DPAR. This is very different from a benchmark, where the environment is tightly controlled. Measurements used in a benchmarking environment do not necessarily apply to a production environment, where the workload can change, not only day-to-day and hour-to-hour, but minute-to-minute.
To compare Domino 7 to Domino 6.5, we needed to compare the data from our two DPARs when they were running Domino 6.5.x code. For D01MLC83, this was before December 12, 2004, and for D01MLC96, this was before February 26, 2005. For this article, we will look at the performance of these DPARs during "prime shift." (Prime shift is defined as Monday through Friday, from 8 AM to 4 PM in the local time of the DPARs.)
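The prime-shift window defined above can be expressed as a simple filter over timestamped samples. The following Python sketch is illustrative only: the function name and sample timestamps are hypothetical, and it assumes the timestamps are already in the DPAR's local time and that 4 PM is an exclusive upper bound.

```python
from datetime import datetime

def in_prime_shift(ts: datetime) -> bool:
    """Return True if ts falls Monday-Friday, 8 AM (inclusive) to 4 PM (exclusive)."""
    is_weekday = ts.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and 8 <= ts.hour < 16

# Hypothetical sample timestamps (local time of the DPAR)
samples = [
    datetime(2005, 2, 25, 9, 15),  # Friday morning
    datetime(2005, 2, 25, 17, 0),  # Friday, after prime shift
    datetime(2005, 2, 26, 10, 0),  # Saturday
]
prime = [ts for ts in samples if in_prime_shift(ts)]
```

Only the first sample survives the filter. Holidays would need a separate exclusion list.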
Measuring a benchmark workload can be fairly straightforward compared to measuring a production workload. In a benchmark environment, you have control over the setup, and you establish exactly what is running on your server, at what specific rates, from your benchmark clients. You can change a parameter, a line of code, or a workload item, and rerun your benchmark to get a good measurement of the impact of the change.
Things are a little more complex in a production workload. You do not know at any given time what workload is being driven against your server by your users. In addition to measuring the resources being consumed by your servers, you must also measure the workload being driven against them. Domino has many different components that can impact your server's resources.
Here are just a few examples of situations that can have a profound impact on your servers:
- Users using a new feature.
- A mass mailing (or mail virus) that greatly increases your message workload.
- New releases of the client or server code that change how existing workloads are processed.
- Varying numbers of registered/active users on any given server, due to consolidations or attrition.
- User activity that varies based on seasonal trends (vacations/holidays).
- A new user agent begins running.
- Change to the replication/clustering schedule.
- Different views/indexes being used.
- Administration processes are started (AdminP, Fixup, Compact, backups/restore, and so on).
Another big difference between production and benchmark workloads is what you choose to measure. In a benchmark, you typically have some ramp-up period that can take many hours. After your users have established the connections on your server, you achieve a "steady state" that you measure. The workload spikes of this ramp-up period are not included in the benchmark's steady state measurements. In production, you must plan to have enough capacity for your peak loads, not the average steady state. Users are constantly coming and going on your DPARs. A worst-case scenario is the first day after a holiday or a DPAR restart in the middle of prime shift, where a large number of users are all logging into the DPAR in a very short period of time. The changes in these peak loads from one release to another are what you need to plan for, not the average workload.
In figure 1, you can see the yearly trend of active 15-minute users on a set of production DPARs at IBM. This trend reflects the typical usage of our active users on these DPARs over the year. You can see that there are dramatic drops around Thanksgiving and Christmas/New Years. Also, there is a yearly trend that starts around February. Here, the number of active users starts to tail off, with a large drop over the summer months. It has not been uncommon to see up to a double-digit increase in active users (and the CPU that goes along with it) from the middle of August to the middle of September on our large production DPARs.
Figure 1. Yearly trend of active 15-minute users

Part of the active user drop can also be attributed to the attrition rate on any given DPAR over time. If no new users are added over the course of a year, a DPAR starts to lose users (they leave the company, retire, and so on). This is reflected in the changes in registered user counts on D01MLC83 and D01MLC96 over the past year. There are also times during the year when the Notes/Domino administrators occasionally add users to a DPAR to replace lost users. You must account for such changes in the registered and active user population over time in your analysis.
When reviewing user activity on a DPAR, you should look at Domino's active user counts instead of the server.users count. The active user rate shows how many different users were active in the indicated interval. The server.users count shows how many users are connected to your DPAR. These users are counted until their sessions are timed out by the DPAR, or the client closes the session. By changing the SERVER_SESSION_TIMEOUT Notes.ini parameter, you can have a significant impact on the server.users number, even though the throughput rates on your DPARs remain the same. This statistic is not an absolute indication of the number of users, but one relative to the SERVER_SESSION_TIMEOUT value. Therefore, it is not a valid indication of workload in your production environment. This is especially true if you have different values on the DPARs that you are comparing.
For example, suppose you remove a two-hour timeout value, letting it default to the four-hour setting, and continue running the same user workload. Your server.users value could go up by 20 percent. This doesn't mean that you have more users actively doing work on your DPAR; it means that you have more connected users being managed by your DPAR for the same workload. You did not just add 20 percent more users by changing this value (or reduce the number of active users by lowering this value). It should be noted that typically in a benchmark environment, the server.users counts will not change, because (as defined by NotesBench) all users are active every 15 minutes. We find Domino's active 15-minute user statistic (server.users.active15min) to be useful in measuring user activity in a production environment.
Understanding these workload patterns allowed us to plan and anticipate the workload increases to the DPARs (in this example, around Labor Day in September). If we did not understand these workload cycles, we would have been concerned when we saw the increase in resources used during this timeframe, possibly believing there was a problem. We took this workload change into account to normalize our DPARs' usage over time, so that we would not misinterpret a CPU change caused by a workload change as a performance enhancement or degradation.
You may also be tempted to use the Domino transaction count for measuring workloads. This count is very different from the transaction count from a NotesBench run. The transaction counts from a NotesBench run are derived from the clients -- not from a Domino server statistic. The Domino transaction count statistics on the DPARs are a measurement of the number of NRPC (Notes Remote Procedure Calls) that have occurred on that DPAR.
For both internal and external users upgrading from Domino 5 to Domino 6, we saw measurable changes overnight in the number of transactions being driven on a DPAR. While the active user counts were the same, the number of transactions per active user changed. This ratio is the total number of transactions for some interval divided by the total number of active users for that interval. The difference in the transaction counts was from the new version of the code on the DPARs. The Notes clients were able to exploit different calls to do the same work. We saw both increases and decreases, with the largest difference being almost 20 percent per user. Also, as you upgrade Notes clients, you may see a trend up or down in the transaction counts, as more users are now able to use the new calls and functions for their existing workloads.
The other issue with using the transaction count is that not all transactions are equal. It is possible for two DPARs to show roughly the same transaction counts, but have very different CPU usage because there are different transactions being driven on each DPAR. One DPAR could have almost all clients running local replica copies and be doing nothing but replication, while the other DPAR could have almost all Notes clients accessing mail databases directly on the DPAR. A further issue with using transaction counts is that they apply only to NRPC workloads. If you are using other clients (Web, IMAP, POP3, and so on), the transaction count does not reflect a standard way of reporting on these activities. The key thing to remember is that the Domino transaction count statistic on the server is a measurement of the number of calls between the Notes client and the Domino server, not an atomic measurement of the amount of work the DPAR is being asked to complete.
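As a minimal sketch of the transactions-per-active-user ratio described above, the following Python snippet divides an interval's transaction total by its active user count and reports the percentage change between two intervals. The function name and all numbers are hypothetical, not values measured on our DPARs.

```python
def transactions_per_active_user(total_transactions: int, active_users: int) -> float:
    """Total transactions in an interval divided by active users in that interval."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return total_transactions / active_users

# Hypothetical interval totals: same active user count, different transaction totals
before = transactions_per_active_user(1_200_000, 400)
after = transactions_per_active_user(1_000_000, 400)

# Relative change in transactions per active user, in percent
pct_change = (after - before) / before * 100
```

Because the active user count is unchanged in this example, the whole difference comes from how many calls each user drives, which is the kind of shift a code-level change can produce.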
Figure 2 shows the total number of transactions for each of the two DPARs (D01MLC83 and D01MLC96) over the past year. Point A in the chart represents when D01MLC83 was upgraded, while point B represents when D01MLC96 was upgraded.
Figure 2. Total number of transactions for each DPAR

In figure 2, note the large drop, then increase, in the transaction counts for D01MLC83 (red line) in the January to March timeframe. If we plot the number of transactions per user, we see a different trend (see figure 3).
Figure 3. Transactions per user

At point A in figure 3, there was an increase in the transactions per active user (around the time of the first Domino 7 beta) for D01MLC83. At point B, we can see a large drop in the number of transactions per user on D01MLC83 that matches the drop in the total transactions. At point C, we can see a further, smaller drop per active user when the number of registered (and active 15-minute) users doubled.
We know that more users were added to D01MLC83 in March and that the transactions per active 15-minute user line remained flat at this time, so we can say that the increases in the total number of transactions were due to more users (more users doing the same workload). However, the January drop in total number of transactions for D01MLC83 reflects some change in performance or workload characteristics (same users, but fewer transactions per user), or a change in the environment (code level, network) that caused the transaction differences. If we just looked at the total transaction counts from December and March, we would see roughly the same counts, even though there were almost twice as many active users in March as in December. While D01MLC96's total transaction counts show some fluctuations over the full year, the number of transactions per active user on this DPAR seems to be relatively stable over this same period of time.
However, D01MLC96 did not show the same behavior as D01MLC83. Point D pinpoints the first upgrade of D01MLC96 to a Domino 7 beta release. We can see that there was not a major change in the number of transactions per user. We can also see that the ratio for D01MLC96 trends down over the next couple of months, then back up again. This upward trend is matched in the D01MLC83 line for the last several months.
To accurately measure the performance in a production server with a dynamic workload, we need to measure the cost per active 15-minute user. To do this, we take the total CPU used in an interval and divide it by the total active 15-minute users in that interval. If this cost per active 15-minute user stays flat, but the total CPU line goes up (or down), we can conclude that this is a workload or capacity related issue. If the cost per active 15-minute user goes up or down, we are more likely looking at a performance related issue (or in this case, an improvement because of a new release).
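The decision rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tooling we used: the function names, the 5 percent tolerance for treating a value as "flat," and the sample numbers are all hypothetical.

```python
def cpu_per_active_user(cpu_seconds: float, active15min_users: int) -> float:
    """Total CPU used in an interval divided by active 15-minute users in that interval."""
    return cpu_seconds / active15min_users

def classify_change(base, current, tolerance=0.05):
    """Compare two (cpu_seconds, active_users) samples.

    tolerance is an assumed threshold for calling a relative change 'flat'.
    """
    base_cpu, base_users = base
    cur_cpu, cur_users = current
    base_cost = cpu_per_active_user(base_cpu, base_users)
    cur_cost = cpu_per_active_user(cur_cpu, cur_users)
    cost_change = (cur_cost - base_cost) / base_cost
    total_change = (cur_cpu - base_cpu) / base_cpu
    if abs(cost_change) > tolerance:
        return "performance"           # cost per user moved
    if abs(total_change) > tolerance:
        return "workload or capacity"  # total CPU moved, cost per user flat
    return "no significant change"

# Hypothetical samples: (total CPU seconds, active 15-minute users)
more_users = classify_change((10_000, 1_000), (12_000, 1_200))
cheaper_users = classify_change((10_000, 1_000), (8_000, 1_000))
```

In the first comparison the total CPU rises with the user count, so the cost per user is flat; in the second the user count is unchanged and the cost per user falls.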
Another area that we monitor on zSeries is the semaphore usage in a DPAR. While this does not translate directly into a CPU number, we can see potential bottlenecks or savings by understanding the changes in the semaphore usage.
Figure 4 is a sample of the busiest semaphores (total count) for D01MLC96 over our study period. You can clearly see that on February 26, there was a change in the semaphore usage, due to the first Domino 7 beta. You can also see the reduction in the semaphores in this production workload with Domino 7. The two semaphores labeled "__unknown" in figure 4 are new Domino 7 semaphores that our current charting process did not have a mapping for at the time of this writing.
We can further break this down into the various types of locks (read, write) and by level (down to operating system level locking) for each semaphore type. We can see the benefits of Domino 7 in the way the semaphores are managed, and the reduction in the number of semaphores needed for a similar workload.
Figure 4. Sample of the busiest semaphores for D01MLC96

Notes/Domino 6 to 7 comparisons
Figure 5 shows the CPU cost per user for D01MLC83 and D01MLC96 over the past year.
Figure 5. CPU cost per user

The first release of Domino 7 on D01MLC83 in December 2004 is identified as point A, and is accompanied by a large increase in the cost per user. This first beta included all the debugging code that was needed at that time. You can see that the cost per user then decreases over time, with two large drops at points B and C. Point B reflects a new beta release, while point C reflects a new beta and an increase in users on D01MLC83.
While these numbers look good, we need to keep in mind that because this is a production environment over an extended period of time, various processes will be running at one level in one sample (Domino 6) and not at the same level in the other sample (Domino 7). Processes such as Compact, Fixup, AdminP, Updall, and other maintenance tasks (such as backup and recovery) can be very CPU-intensive, and are not directly related to a user's workload. Rather than look for individual days where these tasks were not running during prime shift, we filtered them out of our multi-week samples for both Domino 6 and 7. We also wanted to filter out the CPU for TMSCAN, as this is not part of the Domino 7 code, but is an anti-virus addin that was running.
To do this, we needed to determine the amount of CPU each Domino task consumes. By plotting this data for D01MLC83, we see the results shown in figure 6.
Figure 6. Amount of CPU consumed by each Domino task (D01MLC83)

D01MLC96 shows the results displayed in figure 7.
Figure 7. Amount of CPU consumed by each Domino task (D01MLC96)

The CPU numbers in these two DPARs for these maintenance tasks were not very large. However, they do change the percentages of the cost per user by several percentage points. D01MLC83 gained a couple of percentage points, while D01MLC96 lost a couple of points in the cost per active 15-minute user. As we have seen with other customers, these tasks can have a large impact on your numbers if they are present in only one of your samples and not the other. By looking at the individual process utilization within a DPAR, we can see what impact these tasks have on our data.
While the total number of CPU seconds used on D01MLC96 looks a lot better (CPU drops) than D01MLC83 (CPU actually went up), the decrease in the number of users on D01MLC96 for the summer months keeps the cost per active 15-minute user up, while the increase in the active users for D01MLC83 offsets the CPU increase, and actually produces a much better cost per user result on this DPAR.
The following charts show the reduction in CPU seconds on a per active 15-minute user for each of the major Domino tasks that were running. In figure 8, we can see that the largest drops in CPU used were in the Server task, followed by the Logasio task.
Figure 8. Reduction in CPU seconds per active 15-minute user by process

Figure 9 shows the percent CPU usage savings (positive number) or loss (negative) for each of the major tasks running in both DPARs. Keep in mind that the majority of the CPU cycles are used by the Server task. While some of the other tasks show large changes, this percentage against actual CPU usage is much smaller when compared to the total CPU used in figure 9.
Figure 9. Percent CPU savings by active 15-minute user

We can see the savings in the Server task from the Domino 7 enhancements. We can also see the 80 to 90 percent savings in the Logasio task from the zSeries implementation of hardware compression for the transaction logger. You need to keep in mind that the majority of cycles for transaction logging are not in the Logasio task, but in the various other tasks (Server, Agent Manager, HTTP, IMAP, and so on) that are driving database changes. These tasks will also see some savings from using zSeries hardware compression for transaction logging.
Notice the negative savings (representing more CPU usage per active 15-minute user) in the Router. If we look at the cost per message processed by the Router task, we can see that there is less CPU cost per message with Domino 7 (see figure 10).
Figure 10. Cost per message processed by the Router task

This indicates that it is cheaper per message with Domino 7, but our data shows that the Router is more costly in our Domino 7 measurements. This indicates an increase in the number of messages per user between our Domino 6 and Domino 7 samples. Figure 11 shows the growth in messages per user over the last year.
Figure 11. Message growth
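The Router result discussed above follows from a simple decomposition: Router CPU per user equals CPU per message multiplied by messages per user. The Python sketch below uses hypothetical numbers (not our measured values) to show how a lower cost per message can still yield a higher cost per user when message volume grows.

```python
def router_cpu_per_user(cpu_per_message: float, messages_per_user: float) -> float:
    """Router CPU per user = CPU per message x messages per user."""
    return cpu_per_message * messages_per_user

# Hypothetical figures for illustration only
d6_cpu_per_msg, d6_msgs_per_user = 0.010, 50
d7_cpu_per_msg, d7_msgs_per_user = 0.009, 60

d6_cost = router_cpu_per_user(d6_cpu_per_msg, d6_msgs_per_user)
d7_cost = router_cpu_per_user(d7_cpu_per_msg, d7_msgs_per_user)

# The per-message cost falls while the per-user cost rises
per_message_cheaper = d7_cpu_per_msg < d6_cpu_per_msg
per_user_costlier = d7_cost > d6_cost
```

Separating the two factors this way keeps a workload change (more messages) from being read as a performance regression in the Router.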

Another area that produced some interesting numbers was NSF database cache usage. Figure 12 shows the percentage of entries that were in use in the NSF database cache. Domino can dynamically extend the number of entries in this cache by 50 percent, so we can expect to see values that are over 100 percent if the DPAR needs to extend the entries. Note the large drop in the number of entries in the buffer pool for both D01MLC83 and D01MLC96.
Figure 12. Entries in use in the NSF database cache
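To make the "over 100 percent" reading concrete, the following Python sketch computes cache usage as a percentage of the configured maximum. It assumes the percentage is measured against NSF_DBCACHE_MAXENTRIES (the article does not state the denominator explicitly); the function name and the in-use count are hypothetical.

```python
def dbcache_percent_in_use(entries_in_use: int, max_entries: int) -> float:
    """Entries in use as a percentage of the configured maximum."""
    return entries_in_use * 100 / max_entries

max_entries = 2300  # NSF_DBCACHE_MAXENTRIES on D01MLC96

# Domino can extend the cache by 50 percent beyond the configured maximum
extended_ceiling = max_entries * 3 // 2

at_ceiling = dbcache_percent_in_use(extended_ceiling, max_entries)
sample = dbcache_percent_in_use(2760, max_entries)  # hypothetical in-use count
```

Under this assumption, any reading between 100 and 150 percent means the DPAR has extended the cache past its configured size.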

While a difference in how the Notes/Domino 7 code manages the database cache entries could be a factor, another indication of a workload change is in the number of replications initiated and processed. With databases being replicated less frequently (or fewer databases being replicated), we need fewer active entries in the DBCache. Figure 13 shows the number of database replications initiated by each DPAR.
Figure 13. Replications initiated by each DPAR

A large portion of CPU usage for replication and clustering is in the Server task, not the Replica or Clustering task.
We can see the drop in the number of replications from the Domino 6 to the Domino 7 timeframe for both D01MLC83 and D01MLC96. Since our comparison for D01MLC83 was last November/December, its replication difference is much larger than that of D01MLC96, whose comparison date is January/February of this year. This may also show why there is a much larger difference in the Server task for D01MLC83, because the percentage change in the replication counts from its Domino 6 base numbers is more dramatic for that DPAR than for D01MLC96.
In figure 14, you can see the results of upgrading our DPARs to Domino 7 and the savings it produced. We have presented both the total CPU used by each DPAR and the CPU per active 15-minute user. The green bars represent the percentage difference in the amount of CPU each DPAR used with Domino 7, compared to its last several weeks (excluding holidays) running Domino 6. The blue bars illustrate the same comparison, but we have divided the total CPU by the active 15-minute users counts.
Figure 14. Notes/Domino 6 to 7 results

What stands out is that D01MLC96 had the largest drop in CPU used between its Domino 6 and 7 CPU numbers, as measured by its total CPU utilization. However, when this CPU usage is adjusted for the workload, we find that D01MLC83 actually produced much better savings in the cost per active 15-minute user.
Knowing that there was a change in the way replications were processed, we can say that this impacted D01MLC83's CPU per active 15-minute user savings more dramatically than D01MLC96's. Also, there was an increase in the number of mail messages per active 15-minute user during our beta, and this impacted our results as well. We conclude that D01MLC96's CPU savings per active 15-minute user should be higher than shown in figure 14, while D01MLC83's should be lower. What we cannot say with any certainty is what the savings would be if these changes had not occurred during the beta. We can say that there is a savings with Domino 7 that will vary, depending on your DPAR's specific workloads.
Based on our observations, Domino 7 provides savings in a production environment. As expected, we saw that different DPARs derived different benefits from the Domino 7 upgrade, due to the different workloads they are running. In the published NotesBench NRPC benchmarks for Domino 7, we have seen that the 25 percent reduction is mainly in the Server task. For these benchmarks, only the Server and Router tasks were running.
In our measurements, we tried to compare the various components that were running between the two releases, to see what savings these other tasks produced with Domino 7. However, due to the extended nature of the beta (over 6 months), we found that there were substantial changes in the user workloads and active/registered user counts, which brings into question the numbers from our two DPARs.
It should be noted that the findings in this article refer to a mail DPAR and not an application DPAR. An application DPAR would be running a different workload, using calls different from a mail DPAR. You should not try to extrapolate the findings from this article to your application DPARs.
When measuring your own production server using just CPU and transactions, be sure to exercise caution, because these numbers can be misleading over an extended time period. You must be able to measure and account for the varying workloads that occur on a production server over time, as these changes will impact your numbers either positively or negatively.
In a production environment, you will be running many more tasks than Server and Router. For example, suppose 50 percent of your CPU usage is from these two tasks, and the remaining 50 percent is from other Domino tasks. If none of these other tasks shows any improvements, the 25 percent benchmark savings would become a 12.5 percent savings on your server. As more cycles are consumed by tasks that do not show a performance benefit, your potential savings will be diluted. Also, third-party addins such as TMSCAN (anti-virus) are part of your total CPU usage, and will increase that portion of your cycles that will not have a CPU benefit from Domino 7.
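The dilution arithmetic above generalizes to a weighted sum: each task's share of total CPU multiplied by that task's savings. The Python sketch below reproduces the 50/50 example from the text; the function name is hypothetical.

```python
def diluted_savings(tasks):
    """Overall savings = sum of (share of total CPU x savings for that task).

    tasks: iterable of (cpu_share, savings) pairs; shares should sum to 1.0.
    """
    return sum(share * savings for share, savings in tasks)

# Example from the text: Server and Router use 50 percent of CPU and save
# 25 percent; the other tasks use the remaining 50 percent and save nothing.
overall = diluted_savings([(0.50, 0.25), (0.50, 0.00)])
overall_percent = overall * 100
```

Adding a third-party add-in as another zero-savings entry lowers the overall figure further, which is the dilution effect described above.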
It is our intention to publish follow-up documentation after several large production DPARs have been migrated from Domino 6.5.x to Domino 7. Each DPAR will be upgraded over a weekend, so we will be able to build a clear before/after view of the DPARs and their usage at the process level, and compare this utilization to their individual workloads. This will also allow us to compare not only the average loads described in this article, but the peak loads (15-minute snapshots) mentioned earlier. This will allow us to build a more detailed understanding of potential savings from Domino 7 for any given production DPAR.
Learn
- For a summary of all the new features included in Domino 7, see the developerWorks Lotus article, "New features in Lotus Domino 7.0."
- The article, "Lotus Domino 7 performance in production at IBM on pSeries servers," reviews the performance improvements we achieved by deploying Domino 7 on pSeries servers in a live IBM production environment.
- You can also consult our three-part article series on Notes/Domino 7 performance.
  - The first part of this article series, "Lotus Domino 7 server performance, Part 1: Lotus Notes client workloads," discusses Domino 7 performance results we obtained by simulating Notes client users.
  - The second article in this series, "Lotus Domino 7 server performance, Part 2: Domino 7 performance for Domino Web Access users," discusses Domino 7 performance results we obtained by simulating Domino Web Access users.
  - And the third part of the series, "Lotus Domino 7 server performance, Part 3: Enterprise mail performance," is focused on a new benchmark called Enterprise Mail.
Barbara Filippi is a Consulting IT Specialist with the Domino for zSeries Team in the Washington Systems Center. She has worked at IBM for 25 years and has been involved with Domino on zSeries since it initially became available. Her focus areas are Domino installation and administration, capacity planning, performance analysis, and migration to zSeries from other Domino platforms.




