
Lotus Domino 7 server performance, Part 1

Lotus Notes client workloads

Lotus Performance Team, Software Engineers, IBM, Software Group
Members of the Notes/Domino Performance Team who contributed to this article include Rich Buck, Wu W Huang, Angelo Lynn, Dave Johnson, Joseph H Peterson, James Powers, and Andrew Nolet.

Summary:  Lotus Notes and Domino 7 have shipped. Now everyone wants to know: how does it perform compared to its previous release? In this first of a three-part series, we discuss testing we performed to determine how the various Domino 7 platforms stack up against the previous release of Domino. (Hint: There's good news for Domino administrators everywhere!)

Date:  27 Sep 2005
Level:  Intermediate

Improving performance, and consequently reducing TCO, is a major theme of Lotus Notes/Domino 7. For NRPC users (Notes Remote Procedure Call, the native Notes/Domino mail protocol), we increased scalability by removing some internal constraints, and streamlined the code to allow more users to be serviced at a given level of processor utilization. As a result, most Notes/Domino 7 platforms show a reduction in CPU utilization with the same number of R6Mail users. The CPU savings represent the maximum level of performance improvement we would expect to see in a customer environment.

This article is the first in a three-part series in which we review performance improvements we have measured by comparing Lotus Notes/Domino 7 with release 6.5. This article focuses on benchmark results that simulate Notes users by using the R6Mail workload. We will show the benchmark results we obtained on a variety of platforms. These results are from a single Domino partition, which is not using transaction logging except where noted. We will show Domino 7 results with user mail files based on the Notes 6.5 and Notes 7 versions of the mail template. Each of these is compared to the Domino 6.5 server where the user mail files were based on the Notes 6.5 template. This shows what to expect in migration scenarios where the conversion of the user mail files to the Domino 7 template may occur later than the server upgrade.

All of these results represent sub-second response times from Domino. And for benchmarking purposes, we were only running the router task (except where noted) to avoid spikes in the data from other activity. We hope you will find the information useful, and gain an understanding of the improvements that have gone into Notes/Domino 7.

Note: The results in this article were from benchmarks executed in a controlled environment. While some effort was made during the creation of the benchmark to include typical user operations, it is likely that real users will make different use of Notes and Domino from the narrow range of function that we tested with the benchmark. These numbers should therefore be used primarily to understand the relative performance of the Domino releases, and do not represent recommendations for real-world deployment. For assistance with capacity planning, we recommend you consult your hardware vendor.

Also, while we show results on a variety of hardware platforms, these configurations are not of uniform capacity. It is our intent here to focus on the performance of Domino itself, and this data should not be used to compare one platform against another.

The R6Mail workload

The workload used in the benchmarks in this article is the R6Mail workload from the server.load performance tool included in the Domino product. Information about server.load and this workload can be found in the Domino Administration Guide.

At a high level, however, the benchmark environment consists of a series of load generating workstations, each simulating the Notes client-to-server actions of up to 1500 virtual users. We added additional load generators over time until we saw response times exceed one second on average. On the Domino server, we configured the Domino Directory to handle the appropriate number of virtual users in the test, and gave each of these users their own mail file in the data directory.

Each of the virtual users performed the following tasks in a 90 minute period:

Actions every 90 minutes               R6Mail workload
Open inbox                             6
Read message                           30
Delete message                         12
Add message to inbox                   12 (50 KB)
Send message to three recipients       1 (100 KB average)
Send invitation to three recipients    1
Send RSVP                              1
Close inbox                            6
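As a rough sketch of what this table implies for aggregate load, the per-user counts can be summed and scaled by the number of simulated users. The dictionary keys and the helper name `actions_per_second` are illustrative and are not part of server.load; only the counts come from the table above. These are high-level client actions, not the low-level Domino transactions discussed next.

```python
# Per-user action counts for one 90-minute R6Mail cycle (from the table above).
R6MAIL_ACTIONS = {
    "open_inbox": 6,
    "read_message": 30,
    "delete_message": 12,
    "add_message_to_inbox": 12,
    "send_message": 1,
    "send_invitation": 1,
    "send_rsvp": 1,
    "close_inbox": 6,
}

CYCLE_SECONDS = 90 * 60  # one workload cycle lasts 90 minutes


def actions_per_second(users: int) -> float:
    """Average client actions per second across all simulated users."""
    per_user = sum(R6MAIL_ACTIONS.values())
    return users * per_user / CYCLE_SECONDS


if __name__ == "__main__":
    for users in (3000, 10000, 15000):
        print(f"{users:>6} users: {actions_per_second(users):.1f} actions/sec")
```

Because the rate is linear in the number of users, it is mainly useful for comparing user levels rather than as a sizing figure.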

Breaking down this workload into the low-level Domino transactions shows the following distribution:

Transaction type                 Percent
Read inbox documents             25
Add documents                    25
Open inbox                       10
Read inbox                       10
Delete documents                 10
Prepare/send RSVP                5.8
Modify documents                 5
Close inbox                      5
Directory lookups/validation     1.7
Prepare/send message             0.8
Prepare/send invitation          0.8
Create appointment               0.8

The following sections in this article examine our test results platform-by-platform.


AIX

For our AIX testing, we used the following hardware setup:

Model: p670
CPUs: 32 physical Power4 CPUs with a clock speed of 1.4 GHz, divided into three logical partitions (LPARs). The LPAR that we used for these tests was configured with eight CPUs assigned to it.
Installed memory: The test LPAR has 32 GB of RAM assigned to it.
Active physical drives: 64 SSA drives configured in four trays for Domino Binaries and Domino Data (each tray is also a logical volume), with 15 9-GB drives per tray and 1 9-GB drive for the JFS log.
Active logical volumes: Five:
  • four logical volumes for Domino Binaries and Domino Data (JFS 2)
  • one logical volume for the operating system
Operating system: AIX 5.2

To help optimize performance, we entered the following settings into the test servers' Notes.ini files:

Domino 6.5:
NSF_Buffer_Pool_Size_MB=210
Server_Pool_Tasks=64
Server_Max_Concurrent_Trans=64
NSF_DbCache_MaxEntries=2000
ServerTasks=Router,LDAP,HTTP,SMTP
Server_Transinfo_range=12

Domino 7:
NSF_Buffer_Pool_Size_MB=210
Server_Pool_Tasks=100
Server_Max_Concurrent_Trans=100
NSF_DbCache_MaxEntries=2000
Server_Transinfo_range=12
ServerTasks=Router,LDAP,HTTP,SMTP

AIX uses a memory-segmented architecture that limits the total number of segments used for shared memory and heap. We therefore used a very small NSF_Buffer_Pool_Size value. This allowed the test runs to achieve high simulated user levels. In an actual production configuration, we would expect that the NSF Buffer Pool Size value would be set to a higher value. The Server_Pool_Tasks and Server_Max_Concurrent_Trans values were set to support the high-end user numbers achieved on each Domino release. Before you change the defaults for these settings, we recommend that you analyze your environment to optimize the values used.
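To see at a glance which of these settings actually differ between the two releases, the two Notes.ini fragments can be compared programmatically. The following is a minimal sketch, not a Domino tool: `parse_settings` and `diff_settings` are hypothetical helper names, and setting names are compared case-insensitively as an assumption, since this article itself mixes capitalization for the same settings.

```python
# AIX test settings as listed in this article.
DOMINO_65 = """\
NSF_Buffer_Pool_Size_MB=210
Server_Pool_Tasks=64
Server_Max_Concurrent_Trans=64
NSF_DbCache_MaxEntries=2000
ServerTasks=Router,LDAP,HTTP,SMTP
Server_Transinfo_range=12
"""

DOMINO_7 = """\
NSF_Buffer_Pool_Size_MB=210
Server_Pool_Tasks=100
Server_Max_Concurrent_Trans=100
NSF_DbCache_MaxEntries=2000
Server_Transinfo_range=12
ServerTasks=Router,LDAP,HTTP,SMTP
"""


def parse_settings(text):
    """Parse NAME=VALUE lines into a dict keyed by lower-cased name."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        settings[name.strip().lower()] = value.strip()
    return settings


def diff_settings(old, new):
    """Return {name: (old_value, new_value)} for settings that differ."""
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


if __name__ == "__main__":
    changes = diff_settings(parse_settings(DOMINO_65), parse_settings(DOMINO_7))
    for name, (before, after) in changes.items():
        print(f"{name}: {before} -> {after}")
```

For the AIX settings, only Server_Pool_Tasks and Server_Max_Concurrent_Trans differ (64 versus 100), which matches the explanation above.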

Our tests ran with a minimum of server tasks, which is not typical of a production environment. We recommend setting the Notes.ini parameter Server_Transinfo_range on all Domino production machines. Determining the value to set this to should be an iterative process, based on monitoring the Server Expansion Factor and the Server Availability Index. For a complete understanding of these values and settings, refer to the Domino Administrator Help section on configuring the Server Availability Index.

Our lab testing showed that on our p670 LPAR, we were able to achieve approximately 15,000 simulated Notes users running Domino 7, compared to 10,000 simulated users on Domino 6.5.

The following two tables compare system resources used by Domino 6.5 and Domino 7 when running the same number of simulated users (10,000). In the first table, our simulated users are using the Mail6 mail template:

Resource                    Domino 6.5    Domino 7     Change (percent)
CPU percent busy            23            17.6         -23
Total disk read KB/sec      1,150,318     905,709      -21
Total disk write KB/sec     3,316,083     3,175,272    -4
Shared memory used (MB)     1160          1152         1
Process memory used (MB)    23            63           173
Network bytes/sec           2,059,541     2,233,823    8

In the second table, the Domino 7 users are using the Mail7 mail template:

Resource                    Domino 6.5    Domino 7     Change (percent)
CPU percent busy            23            18.51        -20
Total disk read KB/sec      1,150,318     1,123,650    -2
Total disk write KB/sec     3,316,083     3,461,609    4
Shared memory used (MB)     1160          1190         3
Process memory used (MB)    23            64           178
Network bytes/sec           2,059,541     2,240,567    9
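The Change (percent) column in these tables appears to be the relative difference between the Domino 7 and Domino 6.5 values, rounded to a whole number. The sketch below assumes that definition and checks it against a few rows of the first (Mail6) AIX table; the function name `percent_change` is our own.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means a reduction)."""
    return (new - old) / old * 100.0


# (Domino 6.5, Domino 7) pairs taken from the Mail6 table for AIX.
AIX_MAIL6 = {
    "CPU percent busy": (23, 17.6),
    "Total disk read KB/sec": (1_150_318, 905_709),
    "Total disk write KB/sec": (3_316_083, 3_175_272),
    "Network bytes/sec": (2_059_541, 2_233_823),
}

if __name__ == "__main__":
    for resource, (domino_65, domino_7) in AIX_MAIL6.items():
        change = round(percent_change(domino_65, domino_7))
        print(f"{resource}: {change:+d} percent")
```

The same helper applies to the tables for the other platforms in this article.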

We see from the preceding tables that Domino 7 used 20 to 23 percent less CPU, slightly more memory, and 2 to 21 percent fewer disk reads, depending on which mail template was used on the Domino 7 server. We also see in figure 1 that the Domino 7 server supported between 12,000 and 14,000 R6Mail users (depending on which mail template was used) before CPU utilization exceeded 25 percent. At this same level of CPU utilization, Domino 6.5 supported only 10,000 R6Mail users.


Figure 1. Domino 7 vs Domino 6.5 percent CPU utilization on AIX

Our p670 machine was divided into three LPARs; these results are from a single LPAR. The other two LPARs were also being heavily used during these test runs, for development troubleshooting and testing. We point this out to demonstrate how well LPARs work on the p670, and that we were able to run multiple, diverse activities on this machine and still achieve these dramatic test results.


SuSE Linux (Intel)

The following table shows the configuration of the server we used in our NRPC benchmark testing:

CPUs: Four 1.4 GHz Xeon MP
Installed memory: 4 GB RAM
Active physical drives: SCSI controller with three XP300 RAID arrays, and one FAStT 600 with two attached EXP700s, all set in RAID 0 configuration.
Active logical volumes: The drives were configured into 14 logical volumes: one for /opt, one for /tmp, one for the transaction log files when needed, and 11 for Domino data. This allowed the mail databases to be distributed across a large number of disks, eliminating an I/O bottleneck.
Operating system: Linux SuSE SLES 9 SP2

The system is a "conservative" Intel platform composed of four 1.4 GHz Xeon MP CPUs (hyperthreaded) with 4 GB of RAM. The disk configuration is a mix of IBM EXP arrays attached to a SCSI controller, and a FAStT 600 system connected to the server via two QLogic fiber cards. Our goal was to eliminate any disk bottlenecks, allowing the system to attain 15,000 simulated Notes users. The operating system used was SuSE SLES 9, to allow Domino 7 to take advantage of the features in the 2.6 kernel, as well as the NPTL Posix library.

The next table shows the changes we made to each server's Notes.ini file:

Domino 6.5:
ConstrainedSHMSizeMB=1024
NSF_buffer_pool_size_MB=256
NSF_DBUcache_max_entries=5000
NSF_DBcache_maxentries=5000
Server_Max_Concurrent_trans=200
server_pool_tasks=100
ServerTasks=Router

Domino 7:
ConstrainedSHMSizeMB=2560
NSF_DBUcache_max_entries=6100
NSF_DBcache_maxentries=6100
NSF_buffer_pool_size_MB=512
Server_Max_Concurrent_trans=200
server_pool_tasks=100
ServerTasks=Router

Of special note here are the settings for the ConstrainedSHMSizeMB variable. In Domino 6.x, this value needs to be set to around 1 GB because Domino only has 2 GB of memory to use (SuSE SLES 8 and SLES 9 constrain the memory given to Domino to 2 GB), and some of this is required for stack space. In Domino 7, this Notes.ini parameter value can be increased because we have found a way in SuSE SLES 8 and SLES 9 to allocate almost 4 GB of memory to Domino by default. This is done through a special program, tunekrnl, which Domino 7 runs automatically to adjust system parameters so that Domino runs more efficiently. The tables also show that the server tasks running are limited to only those required for this NotesBench test, which allows the server to attain its maximum performance for the test.

When Domino 6 was designed, the Linux (x86) kernels it had to support did not have sys-epoll capability. Therefore, Domino could not support a threadpool model like the one used on other platforms. As a result, each NRPC user spawned its own server thread. Each server thread then required a stack, which takes up 256 KB of memory. These stacks are allocated from the 2 GB of memory that the operating system gives Domino, thereby limiting the number of NRPC users Domino 6 can support to 3000. (Linux on zSeries implemented sys-epoll in Domino 6.5.)
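The arithmetic behind this limit can be sketched as follows. This is a simplified model, not a description of Domino internals: it counts only thread stacks and the shared memory size configured for Domino 6.5 in the table above (ConstrainedSHMSizeMB=1024), and it ignores every other consumer of process memory. The function names are our own.

```python
STACK_KB_PER_THREAD = 256      # per-thread stack size cited in the text
PROCESS_MEMORY_MB = 2 * 1024   # 2 GB available to Domino 6.x on Linux (x86)
SHARED_MEMORY_MB = 1024        # ConstrainedSHMSizeMB used for Domino 6.5


def stack_memory_mb(users: int) -> float:
    """Stack memory needed when every NRPC user gets its own server thread."""
    return users * STACK_KB_PER_THREAD / 1024


def fits_in_process(users: int) -> bool:
    """True if stacks plus shared memory fit within the 2 GB limit."""
    return stack_memory_mb(users) + SHARED_MEMORY_MB <= PROCESS_MEMORY_MB


if __name__ == "__main__":
    for users in (3000, 15000):
        print(users, stack_memory_mb(users), fits_in_process(users))
```

At 3000 users the stacks alone take 750 MB, which together with 1 GB of shared memory already uses most of the 2 GB. At 15,000 users the stacks alone would need 3750 MB, so a thread-per-user model cannot reach that level. Because the model ignores other memory use, it should not be read as predicting the exact 3000-user limit.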

With Domino 7, we take advantage of the sys-epoll feature in the new kernels (for example, SuSE SLES 8 and SLES 9), which allows Domino to use the same threadpool model as other platforms. This, along with the improvements in the NPTL Posix library contained in SLES 9, has greatly improved Domino’s scalability, allowing us to attain 15,000 R6Mail users, a 400 percent improvement over the 3000 users possible with Domino 6! (See figure 2.)


Figure 2. Domino 7 vs Domino 6.5 percent CPU utilization on SuSE Linux (Intel)

Although Domino 7 can attain 15,000 users, the following tables compare results obtained with 3000 users, because that’s the highest number Domino 6.x could attain. Unfortunately, at this low end of the scale, slight variation in values can show large percentage changes. This should be taken into consideration when interpreting the data. Also note that because we now have more memory available for Domino 7 to use, the shared memory value is larger, allowing it to handle a larger user load. In the first table, our simulated users are using the Mail6 mail template:

Resource                    Domino 6.5    Domino 7    Change (percent)
CPU percent busy            6             7.5         25
Total disk read KB/sec      4082          2280        -44
Total disk write KB/sec     3549          3460        -3
Shared memory used (MB)     615           977         59
Process memory used (MB)    947           1,020       8
Network bytes/sec           632,128       752,881     19

The second table shows results with simulated Domino 7 users running the Mail7 template:

Resource                    Domino 6.5    Domino 7    Change (percent)
CPU percent busy            6             8.3         38
Total disk read KB/sec      4082          3770        -8
Total disk write KB/sec     3549          3741        5
Shared memory used (MB)     615           998         62
Process memory used (MB)    947           1,060       12
Network bytes/sec           632,128       725,063     15

In summary, Domino 7 on Linux has made great strides in scalability performance on NRPC, increasing the number of users per partition by 400 percent. This has put it on an equal playing field with other high-performance server platforms.


iSeries

Domino 7 provides substantial performance benefits for iSeries environments. In this section, we discuss results for two different configurations, one using the iSeries model 570 and the other using an iSeries model 810, to show a range of improvements that we observed with Domino 7 lab testing.

iSeries model 570

Our first test environment used an iSeries model 570 with 14 processors, with abundant memory and disk resources. This configuration was selected to show Domino 7 results in an unconstrained environment, and was also used to test the new capabilities of Domino 7 to support more users in a single Domino partition.

Model: iSeries model 570
CPUs: 14 1.65 GHz
Installed memory: 128 GB
Disk drives: 93
Operating system: i5/OS V5R3

Default settings were used for all tests, with the exception of increasing the Domino 7 Notes.ini setting server_pool_tasks to 100 for the tests on the 570 server. We did this to support the large number of users now possible in a single Domino partition.

Domino 6.5 was limited to running a maximum of 10,000 users. In Domino 7, this limitation has been removed, and on iSeries we were able to run 18,000 users in our test configuration. Comparing CPU utilization between Domino 6.5 and Domino 7, with the new Mail7.ntf template being used with Domino 7, we see up to a 25 percent improvement. If we compare Domino 6.5 and Domino 7 using the same Mail6.ntf template for both tests, we observe an even larger improvement of 33 percent at 10,000 users (see figure 3).


Figure 3. Domino 7 vs Domino 6.5 percent CPU utilization on iSeries model 570

These numbers represent the maximum level of performance improvement that we would expect to see in a customer environment. The reason that the CPU utilization is so low is that this system configuration was also used for Domino Web Access testing, which requires more CPU resource than NRPC.

The following two tables compare results obtained with 10,000 simulated Notes users running on the iSeries model 570. In the first table, our simulated users are using the Mail6 mail template:

Resource                                         Domino 6.5    Domino 7    Change (percent)
CPU percent busy                                 5.5           3.7         -33
Disk read requests/second                        87            90          3
Disk write requests/second                       669           594         -11
Base pool pages/second                           195           250         28
Total network KB/second                          2033          2022        -1
Average response time (msec), 1 Gbps Ethernet    8.0           6.7         -16

And in this table, Domino 7 users are running the Mail7 mail template:

Resource                                         Domino 6.5    Domino 7    Change (percent)
CPU percent busy                                 5.5           4.1         -25
Disk read requests/second                        87            184         110
Disk write requests/second                       669           897         34
Base pool pages/second                           195           383         96
Total network KB/second                          2033          2841        40
Average response time (msec), 1 Gbps Ethernet    8.0           7.5         -6

While both templates show significant CPU reductions, you can see that the Domino server uses somewhat higher disk and memory resources to support the new features that are included with the Domino 7 template. The results shown in these tables demonstrate that Domino 7 supports a significantly larger number of users in a single partition and also lowers CPU requirements per user. Despite some additional memory and disk processing with Domino 7, this system environment showed improved response time with Domino 7 over Domino 6.5 for both mail templates tested.

iSeries model 810

Our second test environment used an iSeries model 810 with two processors. This server was equipped with 16 GB of memory and 63 disk drives, and was configured with four Domino partitions.

Model: iSeries model 810
CPUs: Two 750 MHz
Installed memory: 16 GB
Disk drives: 63
Operating system: i5/OS V5R3

We used the default Notes.ini settings for both the Domino 6.5 and the Domino 7 tests.

Four Domino partitions were configured for this environment, and data points of 6000 and 9000 users were tested. At the 6000 user point, 1500 users were active in each partition. At 9000 users, two of the partitions were increased to 3000 users. These numbers of users reflect a more typical customer configuration for number of users per Domino partition compared with the model 570 configuration, and also include more processing in the router task.

Comparing CPU utilization between Domino 6.5 and Domino 7, with the new Mail7.ntf template being used with Domino 7, we see about a 4 percent improvement at 9000 users. If we compare Domino 6.5 and Domino 7 using the same Mail6.ntf template for both tests, we observe a larger improvement of 18 percent at 9000 users. These numbers represent perhaps a more typical range of performance improvements that we would expect to see in a customer environment configured with multiple Domino partitions and lower numbers of users per partition. These results are shown in figure 4:


Figure 4. Domino 7 vs Domino 6.5 percent CPU utilization on iSeries model 810

The following tables show how resource utilization compares for each mail template. The first table shows 9000 simulated users running the Mail6 mail template:

Resource                                          Domino 6.5    Domino 7    Change (percent)
CPU percent busy                                  81.4          67          -18
Disk read requests/second                         976           772         -21
Disk write requests/second                        772           762         -1
Base pool pages/second                            3761          2804        -25
Total network KB/second                           1814          1832        +1
Average response time (msec), 100 Mbps Ethernet   105.3         51.4        -51

In the second table, our Domino 7 users ran the Mail7 mail template:

Resource                                          Domino 6.5    Domino 7    Change (percent)
CPU percent busy                                  81.4          77.9        -4
Disk read requests/second                         976           1180        +21
Disk write requests/second                        772           753         -2
Base pool pages/second                            3761          4938        +31
Total network KB/second                           1814          1850        +2
Average response time (msec), 100 Mbps Ethernet   105.3         97.5        -7

While both templates show reduced CPU, the Domino 7 server uses somewhat more disk and memory resources to support the new features that are provided by the new Mail7.ntf template.

The benefits of Domino 7 shown in the two environments described in this section demonstrate a range of performance improvements that may be realized in a customer environment. Performance improvements will vary depending on the amount of CPU, memory, disk, and network resources available for Domino processing. As shown in the preceding tables and illustrations, higher levels of performance improvements can be realized when ample system resources are available, and when using Domino 7 with the Mail6.ntf template. With the increased ability of Domino 7 to scale to higher numbers of users in a single Domino partition, consolidating to use fewer Domino partitions can also provide substantial performance improvements.


Solaris 9

The Sun 6800 used for performance testing consists of an 8 CPU domain carved out of a 12 CPU system. We used six T3 arrays, with nine drives each, in this test:

Model: Sun 6800
CPUs: Eight 1050 MHz
Installed memory: 32 GB
Active physical drives: 54
Active logical volumes: 6 (RAID 0 arrays)
Operating system: Solaris 9

We made the following Notes.ini file modifications on the servers:

Domino 6.5:
nsf_buffer_pool_size_mb=1536
ServerTasks=Router

Domino 7:
server_pool_tasks=100
server_max_concurrent_trans=100
NSF_dbcache_maxentries=18000
MEM_EnablePreAlloc=1
DEBUG_ENABLE_SYS_V_SHM=1
ConstrainedSHMSizeMB=3300
ServerTasks=Router

For Domino 6.5 testing, we used 1.5 GB for the NSF buffer pool, but for Domino 7 we needed to shrink this back to the default of 1.2 GB due to the increased numbers of users we needed to support. We also increased server_pool_tasks, server_max_concurrent_trans, and NSF_dbcache_maxentries to better deal with the increased numbers of active users. The remaining changes are for enabling the large page support for Solaris that Domino 7 is able to take advantage of.

Domino 6.5 is limited to a maximum of 10,000 Notes users before it runs out of handles. In Domino 7, this limitation has been removed, and on Solaris, we can run up to 18,000 users in our test configuration. In addition, CPU utilization is reduced by up to 45 percent with 10,000 out of 18,000 defined users active. Also, as you can see in figure 5, Domino 7 now supports between 14,000 and 15,000 users with the same CPU utilization that was used by Domino 6.5 when 10,000 users were active:


Figure 5. Domino 7 vs Domino 6.5 percent CPU utilization on Solaris 9

The following tables show resource utilization numbers when testing with both templates at the maximum (10,000 user) level that we measured on Domino 6.5. The first table shows 10,000 simulated users running the Mail6 mail template:

Resource                    Domino 6.5    Domino 7     Change (percent)
CPU percent busy            34.9          18.8         -46
Total disk read KB/sec      29,011        29,382       1
Total disk write KB/sec     13,631        12,248       -10
Shared memory used (MB)     2706          2114         -22
Process memory used (MB)    18            56           211
Network bytes/sec           1,934,801     1,912,166    -1

This table shows our Domino 7 users using the Mail7 template:

Resource                    Domino 6.5    Domino 7     Change (percent)
CPU percent busy            34.9          22.8         -35
Total disk read KB/sec      29,011        30,583       5
Total disk write KB/sec     13,631        13,990       3
Shared memory used (MB)     2706          2173         -20
Process memory used (MB)    18            57           217
Network bytes/sec           1,934,801     1,908,093    -1

Process memory is larger with the Domino 7 configurations, due to configuration changes needed to support the additional users. The Domino 6.5 numbers are with the Mail6 template.

Single partition Domino 7 scalability on Solaris has grown to 18,000 users, up from the 10,000 users in Domino 6.5. In addition, particularly at higher user concurrencies, we see significant reductions in CPU utilization with Domino 7 when handling equal user loads.


Windows 2003 Enterprise Server

Domino 7 was set up as a single-partition server on an eServer xSeries 365 running Windows 2003 Enterprise Server with two processors, no hyperthreading, and 3.5 GB of memory recognized by Windows. The Domino executable files were installed on one IBM FAStT 600 (200 GB, RAID 0). The mail databases were spread across five IBM FAStT 600 arrays, also RAID 0. Network access was through a single 1 Gbps Ethernet adapter running in full duplex mode. The following table shows our xSeries server configuration.

Model: eServer xSeries 365
CPUs: Two 3.0 GHz
Installed memory: 3583 MB
Active physical drives: 62
Active logical volumes: Five RAID 0 arrays
Operating system: Windows 2003 Enterprise Server

As in most of our other tests, we tweaked the servers' Notes.ini files:

Domino 6.5:
NSF_buffer_pool_size_MB=300
Server_Pool_Tasks=60
Server_Max_Concurrent_trans=100
NSF_DBcache_maxentries=10000
platform_statistics_enabled=1

Domino 7:
platform_statistics_enabled=1
DEBUG_SHOW_SEM=1
NSF_Buffer_Pool_Size_MB=250
server_pool_tasks=60
server_max_concurrent_trans=100
NSF_DBcache_maxentries=15000

As was also shown on other platforms, Domino 7 running on Windows 2003 Enterprise Server offers improvement in CPU utilization and scalability. The maximum number of users supported on Domino 7 is 15,000 on a Windows 2003-based platform, compared to 10,000 in Domino 6.5.

Domino 7 shows a significant reduction in CPU utilization compared with Domino 6.5 at the 10,000-user load level. At 10,000 users, Domino 6.5 used 58 percent of CPU, while Domino 7 used only 47 percent (a 19 percent reduction). The CPU savings grew as the number of Domino 7 users increased. Also, as you can see in figure 6, Domino 7 now handles between 12,000 and 14,000 users with the same CPU utilization that was used by Domino 6.5 when 10,000 users were active.


Figure 6. Domino 7 vs Domino 6.5 percent CPU utilization on Windows 2003 Enterprise Server

The next two tables display resource utilization numbers when testing 10,000 users with both the Mail6 and Mail7 templates. The first table shows results obtained with the Mail6 template:

Resource                    Domino 6.5    Domino 7     Change (percent)
CPU percent busy            58            41.6         -28
Total disk read KB/sec      18,651        14,375       -23
Total disk write KB/sec     11,127        10,434       -6
Shared memory used (MB)     1294          1199         -7
Process memory used (MB)    23            40           74
Network bytes/sec           2,068,184     2,049,162    -1

The second table shows our numbers with Domino 7 users running the Mail7 template:

Resource                    Domino 6.5    Domino 7     Change (percent)
CPU percent busy            58            46.9         -19
Total disk read KB/sec      18,651        18,138       -3
Total disk write KB/sec     11,127        11,505       3
Shared memory used (MB)     1294          1184         -8
Process memory used (MB)    23            46           100
Network bytes/sec           2,068,184     2,051,615    1

While both templates show significant CPU reductions, you can see that the Domino server uses a bit more resource when dealing with some of the new features that are integrated into the Domino 7 template.

As you can observe from our benchmark data, Domino 7 running on Windows 2003 Enterprise Server has several significant performance advantages over Domino 6.5. These include lower CPU usage, improved memory savings, and 50 percent more users supported on Domino 7.


Linux on zSeries

When we compared Domino 7 to Domino 6.5 on the zSeries, we found CPU reductions of 25 to 30 percent for the traditional Notes mail R6Mail workload, and 10 to 14 percent (when used with the zSeries hardware-assisted data compression feature) for transaction logging. In addition, various cross-platform improvements in Domino 7 contributed another 10 to 20 percent of processor savings (such as quick pool implementation and memory fragmentation). We used SLES 8 with SP3 for our Domino 6.5 and Domino 7 comparison. However, we recommend you use Domino 7 on SLES 9 because our lab measurements show an approximately 10 percent CPU usage saving with SLES 9 over SLES 8. Plus, SLES 9 is a 64-bit operating system, and multiple Domino partitions (DPARs) can be installed in a single LPAR.

These CPU utilization improvements were measured using the R6Mail workload. Additional functionality included with the Domino 7 mail template has shown a 5 to 15 percent CPU utilization cost when running R6Mail workload tests. This could offset some of the CPU improvements we observed. This section describes the CPU improvements for both the Domino 6 mail template and the Domino 7 mail template. There is continued focus on improving the CPU consumption in future releases of Domino, including Domino 7 maintenance releases.

All zSeries performance tests were done on one logical partition (LPAR) on a zSeries z990 model 2084-C24. The z990 has 24 CPUs available, six of which are dedicated to the performance test LPAR. The remaining 18 CPUs, as well as some other machine resources, were shared among 13 other LPARs used for Domino development and test activities. This machine is configured to boot any of four operating systems: z/OS, SLES 8, SLES 9, or RHEL 4. For the NRPC mail tests, we used three of the six CPUs for Domino 6.5 and Domino 7, to drive the load with higher CPU utilization. Domino 6.5 does not run on SLES 9 or RHEL 4. The performance test LPAR was configured with 12 GB of memory. Our LAN is isolated to avoid network traffic interference from unrelated activities.

All disks are allocated from an Enterprise Storage Server (2105 Model 800) array, with each disk configured as a 3390 model 3. Separate file systems were allocated on single volumes (disks) for the Domino executables, Domino data (except client mail databases), and the Domino address book (Names.nsf); two volumes in a logical volume manager (LVM) file system were used for transaction logging. Client mail databases were distributed evenly over 52 LVM file systems, each allocated across five volumes in a single LVM, providing 11.5 GB of usable space per file system. The ext3 file system was used on Linux for zSeries. The operating systems installed were SLES 8 with SP3 or SLES 9 with SP1.

On SLES 8, only 2 GB was used for central memory because of the 31-bit operating system, and 2 GB of expanded memory was used for swap space. On SLES 9, we used 12 GB total. The following table shows our hardware configuration:

Model: z990 2084-C24
CPUs: Three dedicated CPUs
Installed memory: 12 GB
DASD type: 2105 model 800, 3390 model 3 type volumes
File system: 52 x 5 LVM mail databases, 7 other volumes for Notes data, notesbin, Domino Directory, mailbox, utility, and translog
Operating system: SLES 8 SP3 / SLES 9 SP1

Prior to testing, we configured both the Domino 6.5 and Domino 7 servers' Notes.ini files to include the following:

TRANSLOG_Status=1
TRANSLOG_MaxSize=3000
TRANSLOG_Performance=1
NSF_Buffer_Pool_Size_MB=256
Server_Pool_Tasks=100
ServerTasks=Router
NSF_DBCache_MaxEntries=10000

Figure 7 shows the CPU improvement of Domino 7 over Domino 6.5 when running an NRPC mail workload with either the Mail6 template from Domino 6.5 or the Mail7 template.


Figure 7. Domino 7 vs Domino 6.5 percent CPU utilization on Linux on zSeries

In our test procedure, we waited one hour following the start/addition of 1000 clients, to allow for a “steady state” period following each change. The CPU percentage shown is the average of the one hour steady state. The maximum number of Notes client users in Domino 6.5 is 10,000. Domino 7, however, is able to scale to 12,000 client users.
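The averaging step in this procedure can be sketched as below. This is an illustration only: it assumes that the hour following each addition of clients is the steady-state window being averaged, the function name `steady_state_averages` is our own, and the sample values in the usage section are hypothetical rather than measured data.

```python
from statistics import mean


def steady_state_averages(samples, additions, window_minutes=60):
    """Average CPU percent over the window that follows each client addition.

    samples:   list of (minute, cpu_percent) measurements
    additions: list of (start_minute, total_users) marking each load increase
    Returns {total_users: average_cpu_percent}.
    """
    averages = {}
    for start_minute, users in additions:
        window = [cpu for minute, cpu in samples
                  if start_minute <= minute < start_minute + window_minutes]
        if window:  # skip steps that have no samples
            averages[users] = mean(window)
    return averages


if __name__ == "__main__":
    # Hypothetical samples taken every 15 minutes; NOT measured data.
    samples = [(0, 4.0), (15, 5.0), (30, 5.0), (45, 6.0),
               (60, 9.0), (75, 10.0), (90, 10.0), (105, 11.0)]
    additions = [(0, 1000), (60, 2000)]
    print(steady_state_averages(samples, additions))
```

Averaging over a full window after each change smooths out the transient spike that occurs while newly added clients open their mail files.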

Figure 7 shows a range of CPU improvement from 15 to 28 percent running Domino 7 with the 6.5 mail template, and 9 to 21 percent running Domino 7 with the Mail7 mail template. Clearly, Domino 7 improved CPU with both Mail6.ntf and Mail7.ntf over Domino 6.5. We will continue to focus on optimizing the Mail7 template in future Domino releases.

Figure 8 shows the percent CPU improvement from SLES 9 over SLES 8, which ranges from 1 to 18 percent when running Domino 7 with the Mail6 template. Beyond 6000 users, SLES 8 ran out of real memory and began paging to and from the swap disk. SLES 9 did not, because it had plenty of memory available. We therefore saw a larger CPU improvement with SLES 9 over SLES 8 beyond 6000 users on Linux on zSeries.


Figure 8. Percent CPU improvement with SLES9

The workload generated the same amount of work in both the Domino 6.5 and Domino 7 servers. Each caused the same number of network bytes sent and received, each sent the same number of messages, and each completed the same number of transactions.

The CPU reduction on Domino 7 translates into improved stability at high workload levels, allowing for more clients to be supported by a single Domino 7 server on Linux on zSeries. More importantly, the lighter CPU requirements of Domino 7 can produce substantially lower total cost of ownership compared to Domino 6.5 on Linux on zSeries.


z/OS

Domino 7 running on the zSeries z/OS platform takes advantage of hardware features exclusive to zSeries, as well as improvements to the Domino server code. The first area of interest was the benefit versus the performance cost of transaction logging. While transaction logging provides substantial benefits, it has a definite cost in terms of additional CPU consumption. As much as a third of that CPU consumption has been attributed to data compression activity. Domino 7 uses the zSeries hardware-assisted data compression feature. Our benchmark tests have shown a 10 to 11 percent overall CPU reduction when running Domino 7 on z/OS with transaction logging enabled using hardware-assisted compression, compared to the software compression algorithms used in Domino 6.5.

One of the techniques used for identifying potential performance opportunities in Domino 7 was to analyze the internal operations that occur during stress testing. By analyzing this data across platforms, we discovered that z/OS was performing more network buffer memory allocations per read or write (to each Notes client) than were being done on some of the other Domino platforms. Improvements to the memory allocation algorithms for z/OS resulted in a 7 percent CPU utilization improvement in our testing.

With these changes alone, Domino 7 on z/OS shows a 17 to 20 percent CPU utilization improvement compared to Domino 6.5. In addition, various cross-platform improvements in Domino 7 contributed another 10 percent of processor savings. The total improvement when running the mail benchmark test is 25 to 30 percent for Domino 7 on z/OS, compared to Domino 6.5 (details follow).

These CPU utilization improvements were measured using R6Mail workload tests with the Domino 6 mail template and running on a Domino 6.5 server versus a Domino 7 server. As mentioned previously, functionality included with the Domino 7 mail template could offset some of these CPU improvements.

All performance test results described in this section come from one logical partition (LPAR) on a zSeries z990 model 2084-C24. The z990 has 24 CPUs available, 6 of which are dedicated to the performance test LPAR. For these NRPC mail tests, we used only three of the six CPUs, so that the load would drive higher CPU utilization. The performance test LPAR was configured with 12 GB of central storage memory. We used a single Gigabit Ethernet Open Systems Adapter (OSA) card. Our LAN is isolated to avoid network traffic interference. Disks are allocated from an Enterprise Storage Server (2105 Model 800) array, with each disk configured as a 3390 model 3. There is a separate z/FS file system, each allocated on a single volume (disk), for the Domino executables, data (excepting client mail databases), and the Domino Directory (Names.nsf). A file system that spans two volumes is allocated for transaction log data. Client mail databases were distributed evenly over 53 z/FS file systems, each spanning 5 volumes and providing 11.5 GB of usable space per file system. The operating system installed was z/OS version 1 release 5.

The following table lists the hardware configuration used in this test:

Model: z990 2084-C24
CPUs: Three dedicated CPUs
Installed memory: 12 GB
DASD type: 2105 model 800, 3390 model 3 type volumes
File system: 53 x 5 z/FS mail databases, 7 other volumes for Notes data, notesbin, Domino Directory, mailbox, utility, and translog
Operating system: z/OS 1.5
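
The mail storage figures in this configuration imply a total capacity that is easy to work out. The short sketch below does the arithmetic using only the numbers stated above (53 file systems, 5 volumes each, 11.5 GB usable per file system); the function and key names are hypothetical.

```python
MAIL_FILE_SYSTEMS = 53            # z/FS file systems holding client mail databases
VOLUMES_PER_FILE_SYSTEM = 5       # volumes spanned by each mail file system
USABLE_GB_PER_FILE_SYSTEM = 11.5  # usable space per mail file system


def mail_storage_summary(
    file_systems=MAIL_FILE_SYSTEMS,
    volumes_per_fs=VOLUMES_PER_FILE_SYSTEM,
    usable_gb_per_fs=USABLE_GB_PER_FILE_SYSTEM,
):
    """Derive totals for the mail database storage from the stated figures."""
    return {
        "total_volumes": file_systems * volumes_per_fs,
        "total_usable_gb": file_systems * usable_gb_per_fs,
        "usable_gb_per_volume": usable_gb_per_fs / volumes_per_fs,
    }


if __name__ == "__main__":
    for name, value in mail_storage_summary().items():
        print(f"{name}: {value}")
```

With the stated figures this works out to 265 volumes and 609.5 GB of usable mail storage, or 2.3 GB of usable space per volume.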

We made the following configuration modifications to our Domino 6.5 and Domino 7 servers' Notes.ini files:

TRANSLOG_Status=1
TRANSLOG_MaxSize=3000
TRANSLOG_Performance=1
NSF_Buffer_Pool_Size_MB=128
Server_Pool_Tasks=100
ServerTasks=Router
NSF_DBCache_MaxEntries=10000

Figure 9 shows the CPU improvement of Domino 7 over Domino 6.5 when running an NRPC mail workload with either the Mail6 template from Domino 6.5 or the Mail7 template:


Figure 9. Domino 7 vs Domino 6.5 percent CPU utilization on z/OS on zSeries

As with our Linux on zSeries test, we waited one hour after starting or adding each group of 1000 clients to allow for a steady state period. The CPU percentage we measured is the average over that one-hour steady state. With Domino 6.5, the server ran out of shared memory after 9000 clients. Domino 7, however, was able to scale beyond 10,000 clients.

Figure 9 shows a range of CPU improvement from 21 to 29 percent running Domino 7 with the Domino 6.5 mail template, and 10 to 19 percent running Domino 7 with the Mail7 template. Clearly, Domino 7 reduced CPU utilization relative to Domino 6.5 with both Mail6.ntf and Mail7.ntf.

Figure 10 shows that the improvement on Domino 7 comes from the server task. The router task CPU utilization remains unchanged between Domino 6.5 and Domino 7.


Figure 10. Server task improvement

The workload generated the same amount of work in both the Domino 6.5 and Domino 7 servers.


Conclusion

This concludes our review of Domino 7 server performance running Notes client workloads. As you can see, Notes/Domino 7 provides significant improvements in scalability on all the platforms we measured, though the magnitude of the improvements may vary due to architectural differences. We currently see results well beyond the 10,000-user limitation we had with Domino 6.5 on configurations where we have adequate resources available. On Solaris and iSeries, where we have access to somewhat larger address spaces than on the other platforms, we see as many as 18,000 benchmark users. And with SuSE Linux, we are able to take advantage of the sys-epoll kernel enhancements to implement threadpools similar to what Domino has had on the other platforms. This resulted in a dramatic boost of scalability for Linux from 3000 users up to 15,000 users. These scalability improvements can drive server consolidations, provided the server has the resources for the additional users.

As we mentioned earlier, this article is the first of a three-part series. In part 2, we will discuss Domino HTTP performance results measured against simulated Domino Web Access users (using the R6iNotes workload). And in the concluding part 3, we will show results you can expect in a typical enterprise environment, which we obtained by using a workload that includes cluster replication, local replication, and full-text indexing, as well as Notes client traffic.


About the author

Members of the Notes/Domino Performance Team who contributed to this article include Rich Buck, Wu W Huang, Angelo Lynn, Dave Johnson, Joseph H Peterson, James Powers, and Andrew Nolet.
