
Optimizing server performance: CPU scalability

James Grigsby, Project Leader, Iris Associates
James is the project leader for the Domino Performance team. He came to Iris in 1997 from Lotus, where he worked in Product Management, covering areas such as competitive analysis, performance, and the Notes server. Previously, he developed IT outsourcing proposals with Computer Sciences Corp. and had a career as an Air Force Officer working with computer systems at bases worldwide.
Nirmala Venkatraman, Contractor, Iris Associates
Nirmala Venkatraman works for Iris as a contractor. She started in April 1998 and primarily works on UNIX performance. She previously worked at Sun Microsystems.
Susan Barber, Content Developer, Iris Associates
Susan contributed articles for the past year in the award-winning Notes.net webzine, "Iris Today." She also wrote and designed the award-winning "History of Notes/Domino." Susan left Iris in July 1999 to pursue a writing opportunity at another Boston-based start-up company.

Summary:  Analysis of tests showing how server response time is affected by the number of CPUs running in your Domino server, and by changing the disk RAID (redundant array of independent disks) level for the Domino data directory from RAID0 to RAID5.

Date:  02 Aug 1999
Level:  Advanced

Do you want to get the maximum level of performance from your servers? If you are like most administrators, your answer is "Yes!" You need to know what you can do to get this level of performance, and you probably want concrete test data that backs up the recommendations. If you've already read "Optimizing server performance: Port encryption & Buffer Pool settings", you might think that the only test data that exists applies to servers running Windows NT. That's not true. Here at Iris/Lotus, we routinely run performance test scenarios with Domino servers running on UNIX. In this second article, we share some of these test results with you.

This article gives you an in-depth look at a performance analysis of CPU scalability. The test shows how changing the number of CPUs running on your Domino server can affect server response time. In a second test, we analyze how changing the disk RAID (redundant array of independent disks) level for the Domino data directory from RAID0 to RAID5 affects response time. We start by defining CPU scalability, then we describe the test methodology and test data, and finally we summarize what the results mean to you. This can help you decide how you want to set up your environment in the future.

For more background information about how we conduct performance analyses here at Lotus/Iris, or an introduction to the tools we use, see "Optimizing server performance: Port encryption & Buffer Pool settings". To read more recommendations for improving server performance, see "The top ten ways you can improve server performance".

What is CPU scalability?

CPU scalability refers to the ability to add CPUs to a server machine without causing excessive increases in complexity or loss of performance. Ideally, response time should improve with additional CPUs. Most organizations want to know how many CPUs they should use to maximize the performance of their Domino servers. To answer this question, we set up a test scenario to observe how other system metrics increased or decreased when the only system change throughout the test was the number of CPUs running on the system. Most "CPU scalability" tests include several changes. (For example, the tester may change the number of CPUs, the amount of memory, and in some cases, the size of the Level Two cache.) Changing multiple components simultaneously makes it difficult to determine whether improvements come from the additional processors or from some other component.

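In the second test, we changed the disk RAID level from RAID0 to RAID5. RAID is a data storage method in which data is distributed among two or more hard disk drives in order to improve performance, reliability, or both. RAID0 stripes data across the drives without storing any error-correction information. RAID5 also stripes data, but adds parity information, distributed across the drives, that lets the array survive the failure of a single drive.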
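To make the difference between striping and parity concrete, here is a minimal Python sketch. It is an illustration only, not how the volume manager used in these tests is implemented: the helper names (`stripe`, `xor_blocks`) and the sample block contents are hypothetical, and a real RAID5 array also rotates the parity block from drive to drive, which this sketch omits.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe(data, n_disks, block_size):
    """RAID0-style striping: deal fixed-size blocks round-robin across disks."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    disks = [[] for _ in range(n_disks)]
    for index, block in enumerate(blocks):
        disks[index % n_disks].append(block)
    return disks


# RAID0-style layout: no redundancy, so losing one disk loses its blocks.
striped = stripe(b"abcdefgh", n_disks=2, block_size=2)

# RAID5-style stripe: three data blocks plus one XOR parity block.
data_blocks = [b"DOMI", b"NO_R", b"4.6x"]  # hypothetical block contents
parity = xor_blocks(data_blocks)

# Simulate losing the second block and rebuilding it from the survivors.
rebuilt = xor_blocks([data_blocks[0], data_blocks[2], parity])
print(rebuilt == data_blocks[1])
```

The extra work of reading and updating parity on every write is the usual reason a RAID5 volume responds more slowly than a RAID0 volume under a write-heavy load such as mail delivery.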


Test methodology and test data

In each test scenario, we used Domino R4.6x to establish a baseline for testing performance improvements in Domino R5. To run the test scenarios, we set up four to five Notes client simulators running our new R5 messaging workload with the following configuration:

  • CPUs: Dell Dimension XPS D200, Pentium II processor
  • Memory: 256MB RAM
  • OS: Windows NT 4.0 Workstation
  • Notes: 4.61
  • NotesBench: 4.61

We set up a Domino server with the following configuration:

  • CPUs: Sun Ultra Enterprise 4000, with 12 Ultrasparc 167MHz processors
  • Memory: 1GB RAM
  • Hard Drives: four RAID0 drives (total 8GB storage) for the 2GB OS swap file; six RAID0 drives (total 12GB storage) or six RAID5 drives (total 8GB storage) for the Domino data directory. (For more information about the distinction between the RAID0 and RAID5 configurations, visit the AC&NC Web site.)
  • OS: Solaris/Sparc 2.6, with Sun Enterprise Volume Manager 2.5
  • Domino: 4.61 server for Sparc/Solaris

In particular, we wanted to test the relative impact (on the number of users, the response time, and the resource utilization) of moving from four to eight to 12 CPUs. This test scenario compared response times and system CPU resource utilization at the same user load while varying the number of CPUs in the machine. We also wanted to test the relative impact, on the same three measures, of moving our Solaris configuration from RAID0 to RAID5. That test compared response times and system CPU resource utilization at the same user load and with the same number of CPUs in the machine, but changed the disk RAID level for the Domino data directory from RAID0 to RAID5.

The workload we used for all the tests is a new R5 workload. It is based on the R4.0 MailDb workload and keeps the same server message delivery, but it also adds and deletes mail and exercises the server's directory for message recipient addressing. In addition, the message size increased by a factor of 10 (to 10,000 bytes).

We ran each test for approximately 90 minutes in a steady state, with a ramp-up period of around one hour (for 1800 users). For all the tests, we set the following shell environment variable:

Notes_SHARED_DPOOLSIZE=4000000

As documented in the Release Notes, this variable controls the size of a shared memory segment or mmap files for shared data. We increased this value from the default value of 1MB, so that we didn't reach any limitations on the number of segments or files that the kernel would allow a user application to create.

We ran two of the tests using NotesBench on four to five Notes clients, each launching 300 to 400 threads (for a total of 1800 users). Setting ThreadStagger to 2 seconds on the client (which starts user logons two seconds apart) helped the server ramp up smoothly, without connection timeouts during the ramp-up phase. We also configured the Domino Directory so that the Router would deliver all mail messages locally.
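The ramp-up figures above can be sanity-checked with a little arithmetic. The sketch below is an approximation under stated assumptions, not a description of how NotesBench schedules logons: it assumes logons are spaced exactly two seconds apart, and it compares two hypothetical cases, one where the stagger applies across all 1800 users in sequence and one where five clients (360 threads each, an assumed even split within the 300 to 400 range) stagger their own threads in parallel.

```python
TOTAL_USERS = 1800
THREAD_STAGGER_SECONDS = 2   # from the ThreadStagger setting in the text
CLIENTS = 5                  # assumption: the upper end of "four to five"


def ramp_up_minutes(logons, stagger_seconds):
    """Approximate time, in minutes, to start `logons` staggered logons."""
    return logons * stagger_seconds / 60


# Case 1: the stagger applies across the whole user population in sequence.
serial_minutes = ramp_up_minutes(TOTAL_USERS, THREAD_STAGGER_SECONDS)

# Case 2: each client staggers only its own threads, and clients run in parallel.
threads_per_client = TOTAL_USERS // CLIENTS
parallel_minutes = ramp_up_minutes(threads_per_client, THREAD_STAGGER_SECONDS)

print(f"sequential ramp-up: {serial_minutes:.0f} minutes")
print(f"parallel ramp-up:   {parallel_minutes:.0f} minutes")
```

The sequential case works out to 60 minutes, which lines up with the ramp-up period of around one hour reported for 1800 users.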

You can see the results of this test in the sidebar, "CPU scalability test results".


What did we find out?

When we tested four, eight, and 12 CPUs with RAID0, even at 1800 users, 50 to 80 percent of the total CPU capacity was still unused. Overall, Domino had a good response time at a particular user load when we increased the number of CPUs from four to eight. Due to current code limitations, we did not see appreciable scalability in terms of response time beyond eight CPUs. We also did not see appreciable scalability in terms of concurrent users or capacity beyond four CPUs using RAID0. These RAID0 results (RAID0 provides the best response time but no redundancy) laid the foundation for assessing the impact RAID5 would have on a system. We measured the impact by monitoring Domino transactions (NotesMarks), response time, CPU utilization, memory utilization, and disk response time.

When we moved from a RAID0 to a RAID5 disk subsystem for the Domino data directory, we observed a degradation in the user response time (the values increased) at the same user load. For example, when we ran this test with eight CPUs and 1800 users, the response time increased 150 percent (but the response time was still in the acceptable sub-second range), and there was a three percent increase in the amount of the CPU used. Also, the virtual memory page scan rate went up from 75 pages per second to 100 pages per second, and the average disk service times increased from 14 milliseconds to 34 milliseconds. These values show the effect of the disk subsystem on the percentage of the CPU used. You should take these values into account when you decide on the size of your Domino server.
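To compare the RAID5 figures on a common scale, the short Python sketch below converts the before and after values quoted above into percentage changes, and converts the 150 percent response-time increase into a multiplier. The function names are illustrative; the input numbers are the ones reported in the text.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def multiplier_from_percent_increase(percent):
    """A 150 percent increase means the new value is 2.5 times the old one."""
    return 1.0 + percent / 100.0


# Figures for eight CPUs and 1800 users, RAID0 -> RAID5.
page_scan_change = percent_change(75, 100)     # pages per second
disk_service_change = percent_change(14, 34)   # milliseconds
response_multiplier = multiplier_from_percent_increase(150)

print(f"page scan rate:    +{page_scan_change:.1f}%")
print(f"disk service time: +{disk_service_change:.1f}%")
print(f"response time:     x{response_multiplier:.1f}")
```

Seen this way, the disk service time rose by roughly 143 percent, close to the 150 percent rise in response time, while CPU use rose by only three percent.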

One additional thing we noticed was that using the server's Public Address Book for all address lookups loaded the server heavily, and caused heavy network timeout errors on clients trying to connect to the server. Also, the clients failed to build the Message Recipient List properly because of timeouts and retries. The clients then sent messages to the server without any recipients, and the Router started producing error messages saying, "Unable to deliver message xxxxxx containing no recipients". With the efficient name lookup cache mechanism in R5, we can circumvent this problem and support more users.

In testing CPU scalability, in general, we found that Domino scaled well. Overall scalability for this system would have been higher with Domino partitioning, but we were focusing on a single Domino instance. Based on this information from our tests and with restrictions on the user response time, you can choose a Domino server containing the number of CPUs that best satisfies the load you need to support on your server.

These tests gave us a baseline measure for CPU scalability on R4.6x servers. We used this information to identify potential performance bottlenecks, and we passed it back to the developers here at Iris to help them identify further performance improvements in Domino R5. We expect that with major database, NameLookup, and Router improvements, R5 will scale significantly better than R4.6x. In addition, when we start using R5 for our tests, we can enable I/O Completion Ports (IOCP), where persistent worker threads service end-user requests on the Domino server. This will allow us to assess CPU scalability on UNIX servers in terms of the user load that they can support.


