Lotus Domino 7 server performance, Part 2

Domino 7 performance for Domino Web Access users


Level: Intermediate

Rich Buck, Software Engineer, IBM
Wu W Huang, Software Engineer, IBM
Dave Johnson, Software Engineer, IBM
Angelo Lynn, Software Engineer, IBM
Andy Nolet, Software Engineer, IBM
Joseph H. Peterson, Software Engineer, IBM
Jim Powers, Software Engineer, IBM

08 Nov 2005

In this second of a three-part article series, we review the results of tests we performed to determine how Domino 7 compares with Domino 6.5 for supporting Domino Web Access users.

This article is the second in a three-part series on the performance improvements we have measured with Domino 7. It focuses on benchmark results that simulate Domino Web Access users by using our R6iNotes workload. The first article, "Lotus Domino 7 server performance, Part 1: Lotus Notes client workloads," discussed Domino 7 performance results we obtained by simulating Notes client users. The final article will review results that more closely mimic typical enterprise environments, by using a workload that includes cluster replication, local replication, and full-text indexing, as well as Notes client traffic.

Improving performance, and consequently reducing total cost of ownership (TCO), was a major theme in Domino 7. For Domino Web Access users, we focused on increasing scalability by reducing server CPU usage per user, and optimizing code paths to minimize bottlenecks, to allow more users to be serviced at a given level of processor utilization. All platforms that were tested show a reduction in CPU utilization with the same number of R6iNotes users. The CPU savings represent the maximum level of performance improvement we would expect to see in a customer environment.

There have been improvements in Domino performance on all of the platforms we have measured in Domino 7, though the magnitude of the improvements will vary from one environment to another due to architectural differences. We have been able to achieve improved scalability in terms of users supported on all the platforms, primarily by reducing CPU and memory usage on the Domino 7 server for Domino Web Access users. These scalability improvements can drive server consolidations, provided the server has the resources for the additional users.

The rest of this article will show the benchmark results we obtained on a variety of platforms. These results are from a single Domino partition, which is not using transaction logging, except where noted. We will show Domino 7 results with user mail files based on the release 7 versions of the mail template (dwa7.ntf). This is compared to the Domino 6.5 server, where user mail files were based on the 6.5 template (iNotes6.ntf).

All results represent sub-second response times from Domino. For benchmarking purposes, we are only running the Router and HTTP task (except where noted) to avoid spikes in the data from other activity. We hope you will find the information useful, and gain an understanding of the improvements that have gone into Domino 7.

Note: The results in this article were from benchmarks executed in a controlled environment. While some effort was made during the creation of the benchmark to include typical user operations, it is likely that real users will make different use of Domino than the narrow range of function that is tested by the benchmark. These numbers should therefore be used primarily to understand the relative performance of the Domino releases, and do not represent recommendations for real-world deployment. For assistance with capacity planning, we recommend you consult your hardware vendor. Also, while we show results on a variety of hardware platforms, these configurations are not of uniform capacity. It is our intent here to focus on the performance of Domino itself, and this data should not be used to compare platforms against each other.

The following sections in this article examine our test results platform-by-platform.

AIX

For our AIX testing, we used the following hardware setup:

Model: p670
CPUs: 32 physical Power4 CPUs with a clock speed of 1.4 GHz, divided into three logical partitions (LPARs). The LPAR that we used for these tests was configured with eight CPUs assigned to it.
Installed memory: The test LPAR has 32 GB of RAM assigned to it.
Active physical drives: 64 SSA drives configured with four trays for Domino Binaries and Domino Data (each tray is also a logical volume), with 15 9-GB drives per tray, and 1 9-GB drive for the JFS Log.
Active logical volumes: Five:
  • four logical volumes for Domino Binaries and Domino Data (JFS 2), with one for Domino transaction logging (when used)
  • one logical volume for the operating system
Operating system: AIX 5.2

To help optimize performance, we entered the following settings into the test servers' Notes.ini files:

Domino 6.5:
NSF_Buffer_Pool_Size_MB=450
Server_Pool_Tasks=64
Server_Max_Concurrent_Trans=64
NSF_DbCache_MaxEntries=2000
ServerTasks=Router,LDAP,HTTP,SMTP
Server_Transinfo_range=12

Domino 7:
NSF_Buffer_Pool_Size_MB=450
Server_Pool_Tasks=100
Server_Max_Concurrent_Trans=100
NSF_DbCache_MaxEntries=2000
NSF_DbUCache_Max_Entries=12000
Server_Transinfo_range=12
ServerTasks=Router,LDAP,HTTP,SMTP

AIX uses a memory segmented architecture that limits the number of segments used for shared memory and heap. Therefore, we used an NSF_Buffer_Pool_Size value that is below the default, to enable the test runs to achieve high simulated user levels. In a production machine configuration, we would expect that NSF_Buffer_Pool_Size might be set to a slightly higher value. The Server_Pool_Tasks and Server_Max_Concurrent_Trans values were set to support the high end-user scalability numbers achieved on each Domino release. Before you change from the default value of these settings, we recommend that analysis be done to optimize the values used.
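
To see at a glance which AIX settings differ between the two releases, the two Notes.ini fragments can be compared programmatically. The following Python sketch is illustrative only: the helper names parse_ini_fragment and diff_settings are ours and are not part of any Domino tooling. It assumes simple KEY=value lines and that setting names can be compared without regard to case.

```python
def parse_ini_fragment(text):
    """Parse KEY=value lines into a dict keyed by lower-cased setting name.

    ASSUMPTION: setting names are treated as case-insensitive.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def diff_settings(old, new):
    """Return {key: (old_value, new_value)} for settings that differ.

    A value of None means the setting is absent on that side.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes


# Settings copied from the AIX test configuration in this section.
DOMINO_65 = """
NSF_Buffer_Pool_Size_MB=450
Server_Pool_Tasks=64
Server_Max_Concurrent_Trans=64
NSF_DbCache_MaxEntries=2000
ServerTasks=Router,LDAP,HTTP,SMTP
Server_Transinfo_range=12
"""

DOMINO_7 = """
NSF_Buffer_Pool_Size_MB=450
Server_Pool_Tasks=100
Server_Max_Concurrent_Trans=100
NSF_DbCache_MaxEntries=2000
NSF_DbUCache_Max_Entries=12000
Server_Transinfo_range=12
ServerTasks=Router,LDAP,HTTP,SMTP
"""

if __name__ == "__main__":
    changes = diff_settings(parse_ini_fragment(DOMINO_65),
                            parse_ini_fragment(DOMINO_7))
    for key, (old, new) in changes.items():
        print(f"{key}: {old} -> {new}")
```

For these fragments, the only differences are the two thread-related settings and the additional cache setting in the Domino 7 configuration.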

We recommend setting the Notes.ini parameter Server_Transinfo_range on all Domino production machines. Finding the right value is an iterative process, based on monitoring the Server Expansion Factor and the Server Availability Index. For a complete understanding of these values and settings, refer to the Domino Administrator's Help section on configuring the Server Availability Index.

In figure 1, you will see a substantial reduction in the CPU resources required for an R6iNotes virtual user, compared to what we had with Domino 6.5.


Figure 1. CPU usage for AIX

The amount of reduction varies, depending on how busy the server is, but at 6500 users, Domino 7 is approximately 53 percent busy, compared to 92 percent busy we saw on Domino 6.5 with an equal number of users active. That is a 42 percent reduction in CPU! We also see that for this benchmark, Domino 7 will support 10,500 users with the same CPU usage that Domino 6.5 consumed with 6500 users.

The following table shows the CPU, disk, memory, and network resources consumed by Domino 6.5 and 7 when 6500 benchmark users were active on each build. In addition to the CPU savings mentioned before, we see less network bandwidth consumption and higher total disk activity. Process memory was higher due to the increased number of server pool tasks and server max concurrent transactions used in the Domino 7 test.

Resource                   Domino 6.5   Domino 7     Change (percent)
Number of users            6500         6500         n/a
CPU percent busy           92           53           -42
Total disk read KB/sec     112,680      186,457      65
Total disk write KB/sec    1,597,255    1,608,150    0.7
Shared memory used (MB)    1128         1163         3
Process memory used (MB)   68           128          88
Network bytes/sec          1,823,930    679,017      -63

We see from this table that, at the same simulated user load and running the exact same test workload for the same amount of time, Domino 7 used approximately 42 percent less CPU and 63 percent less network bandwidth than Domino 6.5, while using approximately 3 percent more shared memory and performing 65 percent more disk reads. This is a clear indication of how a Domino 7 deployment can drive server consolidation on AIX/pSeries.
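
The Change (percent) column is the relative difference between the Domino 7 and Domino 6.5 measurements. As a minimal sketch, the following Python snippet recomputes it from the AIX values; the function name percent_change and the rounding to whole percents are our assumptions about how the column was derived.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent (negative means a reduction)."""
    return (new - old) / old * 100.0


# (resource, Domino 6.5, Domino 7) at 6500 simulated users on AIX,
# copied from the table in this section.
AIX_RESULTS = [
    ("CPU percent busy", 92, 53),
    ("Total disk read KB/sec", 112_680, 186_457),
    ("Shared memory used (MB)", 1128, 1163),
    ("Process memory used (MB)", 68, 128),
    ("Network bytes/sec", 1_823_930, 679_017),
]

if __name__ == "__main__":
    for name, d65, d7 in AIX_RESULTS:
        print(f"{name}: {percent_change(d65, d7):+.0f} percent")
```

Note that this is a relative reduction: the drop from 92 to 53 percent busy is 39 percentage points, which is 42 percent of the Domino 6.5 figure.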

As we noted earlier in this section, our p670 machine was “carved up” into three logical partitions (LPARs), and the results we recorded were from a single LPAR. The other two LPARs were also heavily used during these test runs, for development troubleshooting and testing. In other words, we were able to run multiple, diverse activities on the p670 and still achieve the dramatic test results reported in this section.





Linux

The configuration of the server used in this Domino Web Access benchmark testing is shown in the following table. This system is a conservative Intel platform composed of four 1.4 GHz Xeon MP CPUs (hyperthreaded), with 4 GB of RAM. The disk configuration is a mix of IBM EXP arrays attached to a SCSI controller and a FAStT 600 system, connected to the server via two QLogic fiber cards. The goal here was to eliminate any disk bottlenecks, allowing the system to attain 15,000 users. The operating system used was SuSE SLES 9, to allow Domino 7 to take advantage of the features in the 2.6 kernel, as well as the NPTL POSIX threads library.

CPUs: Four Xeon MP CPUs with a clock speed of 1.4 GHz.
Installed memory: Four GB of RAM.
Active physical drives: SCSI controller with three EXP300 RAID arrays, and one FAStT 600 with two attached EXP700s, all set in RAID 0 configuration.
Active logical volumes: 14:
  • one for /opt
  • one for /tmp
  • one for the transaction log files when needed
  • 11 for Domino data
Operating system: Linux SuSE SLES 9 SP2

The next table shows the Notes.ini changes that were used in the test that differ from the default settings. Of special interest here is the setting of the ConstrainedSHMSizeMB variable. In Domino 6.x, this value needs to be set to around 1 GB because Domino only has 2 GB of memory to use (SuSE SLES 8 and SLES 9 constrain the memory given to Domino to 2 GB), and some of this is required for stack space. In Domino 7, this Notes.ini parameter value can be increased, because we have found a way in SuSE SLES 8 and SLES 9 to allocate almost 4 GB of memory to Domino by default. This is done through a special program, tunekrnl, which automatically adjusts system parameters to make Domino run more efficiently. The table also shows that the server tasks running are limited to only those required for this NotesBench test. This allows the server to attain its maximum performance for the test.

Domino 6.5:
ConstrainedSHMSizeMB=1024
NSF_buffer_pool_size_MB=256
NSF_DBUcache_max_entries=5000
NSF_DBcache_maxentries=5000
Server_Max_Concurrent_trans=200
server_pool_tasks=100
ServerTasks=Router

Domino 7:
ConstrainedSHMSizeMB=2560
NSF_DBUcache_max_entries=6100
NSF_DBcache_maxentries=6100
NSF_buffer_pool_size_MB=512
Server_Max_Concurrent_trans=200
server_pool_tasks=100
ServerTasks=Router

Unlike NRPC on Domino 6.x, Domino Web Access on Linux does not require an HTTP thread and stack per user. The HTTP task itself is able to do the pooling, so the default of 40 HTTP threads can support several thousand users. This is true in both Domino 6.x and Domino 7, making percent CPU the limiting factor in the number of users that can be supported.
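
The general idea of pooling, where a small, fixed set of worker threads serves a much larger number of users, can be illustrated with a short Python sketch. This is not how the Domino HTTP task is implemented; it is a generic thread-pool example using the standard library, and the names handle_request and serve, as well as the user count, are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

HTTP_THREADS = 40        # default HTTP thread count mentioned in the text
SIMULATED_USERS = 4000   # illustrative user count, one short request each


def handle_request(user_id):
    """Stand-in for servicing one short web request.

    Returns the user ID and the name of the worker thread that ran it.
    """
    return user_id, threading.current_thread().name


def serve(users=SIMULATED_USERS, workers=HTTP_THREADS):
    """Run one request per user through a fixed-size pool.

    Returns (requests served, number of distinct worker threads used).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(handle_request, range(users)))
    threads_used = {thread_name for _, thread_name in results}
    return len(results), len(threads_used)


if __name__ == "__main__":
    served, threads_used = serve()
    print(f"served {served} requests using {threads_used} worker threads")
```

However many users are simulated, the number of threads (and thread stacks) never exceeds the pool size, which is why CPU rather than thread count becomes the limiting resource.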

In Domino 7, we substantially reduced the amount of CPU required to support a user. Figure 2 shows this reduction in percent CPU and how it allows Domino 7 to support around 50 percent more Domino Web Access users than Domino 6.x.


Figure 2. CPU usage for Linux

The following table shows several measurements for comparison, taken at 4000 Domino Web Access simulated users on both Domino 6.x and Domino 7. The number of 4000 simulated users was used because that is the maximum number of users that this test hardware configuration can support on Domino 6.x.

Resource                   Domino 6.5   Domino 7     Change (percent)
Number of users            4000         4000         n/a
CPU percent busy           95           59.4         -38
Total disk read KB/sec     1483.83      1548.51      4
Total disk write KB/sec    337.78       352.12       4
Shared memory used (MB)    664          949          43
Process memory used (MB)   70           92           31
Network bytes/sec          427,407      483,989      13

In this table, the percent CPU savings in Domino 7 can be clearly seen. The slight increases in disk reads and writes, as well as in network bytes, result from the more robust Domino 7 mail template and from variances in the NotesBench test script used. The shared and process memory increases in Domino 7 result from the larger memory space we are now able to give Domino.
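
As a rough cross-check of the scalability gain, the CPU cost per user can be derived from the Linux measurements and extrapolated. This Python sketch assumes that CPU usage grows linearly with the number of users, which the benchmark does not guarantee, so the result is an estimate rather than a measured value. The function names are ours.

```python
def cpu_per_user(cpu_percent, users):
    """Average CPU percent consumed per simulated user."""
    return cpu_percent / users


def users_at_cpu(target_cpu_percent, measured_cpu_percent, measured_users):
    """Extrapolate the user count at a target CPU level.

    ASSUMPTION: CPU usage grows linearly with user count, which real
    servers only approximate, especially near saturation.
    """
    per_user = cpu_per_user(measured_cpu_percent, measured_users)
    return int(target_cpu_percent / per_user)


if __name__ == "__main__":
    # Measurements at 4000 simulated users, from the Linux table.
    d65_cpu, d7_cpu, users = 95.0, 59.4, 4000
    estimate = users_at_cpu(d65_cpu, d7_cpu, users)
    print(f"Domino 7 users at {d65_cpu:.0f} percent CPU (linear estimate): {estimate}")
```

For these values, the linear estimate comes out near 6400 users, somewhat above the roughly 50 percent gain described in the text, so the linear assumption appears to be optimistic as the server approaches saturation.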

Domino 7 on Linux shows a substantial improvement in Domino Web Access scalability or a large percent CPU savings, whichever way you want to look at it. This allows the user greater flexibility in planning for server consolidation or expansion by having the option of adding more users or additional applications.





iSeries

Domino 7 provides substantial performance benefits for iSeries environments. In this section, we discuss results for two different configurations, one using the iSeries model 570, and the other using an iSeries model 810, to show a range of improvements that we observed with Domino 7 lab testing.

iSeries model 570

In our first test, we used an iSeries model 570 with 14 processors, with abundant memory and disk resources. This configuration was selected to show Domino 7 results in an unconstrained environment, and was also used to test the new capabilities of Domino 7 to support more users in a single Domino partition.

Model: iSeries model 570
CPUs: 14 1.65 GHz
Installed memory: 128 GB
Disk drives: 93
Operating system: i5/OS V5R3

The following chart shows the Notes.ini changes that were used in the test that differ from the default settings.

Domino 6.5:
Server_Max_Concurrent_Trans=1000

Domino 7:
Server_Max_Concurrent_Trans=1000
NSF_Buffer_Pool_Size_MB=1500

Domino 6.5 was limited to running a maximum of 16,000 users. In Domino 7, that limitation has been relaxed, and on iSeries we were able to run 18,000 and 20,000 users in our test configuration with the dwa7.ntf and inotes6.ntf templates, respectively. Comparing CPU utilization between Domino 6.5 and Domino 7 at 16,000 users, with the new dwa7.ntf template being used with Domino 7, we observed a 13 percent improvement. If we compare Domino 6.5 and Domino 7 using the same inotes6.ntf template for both tests, we observe an even larger improvement of 32 percent at the same 16,000 user level. These numbers represent the maximum level of CPU performance improvement that we would expect to see in a customer environment. The results are included in figure 3.


Figure 3. CPU usage for iSeries model 570

It is likely that, when moving to Domino 7, customers will first upgrade the server, and then at some later date migrate users to the Domino 7 template.

The following tables contain resource utilization numbers for tests we performed with both templates. The first table shows 16,000 simulated users running the inotes6.ntf mail template:

Resource                                         Domino 6.5   Domino 7   Change (percent)
Number of users                                  16,000       16,000     n/a
CPU percent busy                                 67.0         45.7       -32
Disk read requests/second                        115.3        81.6       -29
Disk write requests/second                       1046.2       1206.3     15
Base pool pages/second                           413.0        220.1      -47
Total network KB/second                          4308.7       4250.3     -1
Average response time (msec), Gigabit Ethernet   124.9        137.9      10

In the second table, our simulated Domino 7 users ran the dwa7.ntf template:

Resource                                         Domino 6.5   Domino 7   Change (percent)
Number of users                                  16,000       16,000     n/a
CPU percent busy                                 67.0         58.3       -13
Disk read requests/second                        115.3        279.6      142
Disk write requests/second                       1046.2       1902.2     82
Base pool pages/second                           413.0        615.4      49
Total network KB/second                          4308.7       6999.7     62
Average response time (msec), Gigabit Ethernet   124.9        166.4      33

While both templates show significant CPU reductions, you can see that in some cases the Domino 7 server requires more resources to provide some of the new features that are integrated into the Domino 7 template.
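
Because customers are likely to upgrade the server first and migrate templates later, it can help to split the overall CPU change into those two steps. The following Python sketch does this with the CPU figures from the two iSeries model 570 tables; the function and constant names are ours, and the two-step breakdown is our reading of the data rather than a separately measured result.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# CPU percent busy at 16,000 users on the iSeries model 570 (from the tables).
CPU_D65 = 67.0              # Domino 6.5 with the 6.5 template
CPU_D7_OLD_TEMPLATE = 45.7  # Domino 7, users still on the 6.5 template
CPU_D7_NEW_TEMPLATE = 58.3  # Domino 7, users migrated to the Domino 7 template


def upgrade_steps(d65=CPU_D65, d7_old=CPU_D7_OLD_TEMPLATE, d7_new=CPU_D7_NEW_TEMPLATE):
    """Break the CPU change into server-upgrade and template-migration steps."""
    return {
        "server upgrade": percent_change(d65, d7_old),
        "template migration": percent_change(d7_old, d7_new),
        "overall": percent_change(d65, d7_new),
    }


if __name__ == "__main__":
    for step, change in upgrade_steps().items():
        print(f"{step}: {change:+.1f} percent CPU")
```

With these values, the server upgrade alone accounts for roughly a 32 percent CPU reduction, and the later template migration gives back part of that saving (about 28 percent more CPU than the intermediate state), leaving the overall 13 percent improvement.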

These results demonstrate that Domino 7 supports a larger number of users in a single partition, and also lowers CPU requirements per user. Increased use of memory, disk, and network resources resulted in slightly higher average response times. Results with the inotes6.ntf template were substantially better than with the dwa7.ntf template. While these tables show comparable response times between Domino 6.5 and Domino 7, the configuration described in the following section shows larger response time improvements, because the baseline data for the Domino 6.5 environment was tested at relatively high CPU utilization.

iSeries model 810

Our second iSeries test environment used an iSeries model 810 with two processors. This server was equipped with 16 GB of memory and 63 disk drives, and was configured with a single Domino partition.

Model: iSeries model 810
CPUs: Two 750 MHz
Installed memory: 16 GB
Disk drives: 63
Operating system: i5/OS V5R3

We used the default Notes.ini settings for both the Domino 6.5 and the Domino 7 tests.

A single Domino partition was configured for this environment, and data points of 1200 and 1800 users were tested. These numbers of users reflect a more typical customer configuration for number of users per Domino partition, compared with the iSeries model 570 configuration described in the preceding section.

Comparing CPU utilization between Domino 6.5 and Domino 7, with the new dwa7.ntf template being used with Domino 7, we see about an 8 percent improvement at 1800 users. If we compare Domino 6.5 and Domino 7 using the same inotes6.ntf template for both tests, we observe a larger improvement of 24 percent at 1800 users. These numbers represent perhaps a more typical range of performance improvements that we would expect to see in a customer environment configured with lower numbers of users per partition. These results are included in figure 4.


Figure 4. CPU usage for iSeries model 810

Again, it is likely that when moving to Domino 7, customers will first upgrade the server, and then at some later date migrate users to the Domino 7 template.

The following two tables display resource utilization numbers for tests performed with both templates. While both templates show CPU reductions, you can see that the Domino server uses a bit more resources when dealing with some of the new features that are integrated into the dwa7.ntf template. Average response time was lower in both cases, primarily due to the reduced CPU requirements of Domino 7. This effect was most pronounced when Domino 7 was tested with the inotes6.ntf template, due to the 24 percent reduction in CPU processing. Some of the percent change values shown in the tables may be exaggerated by the fact that these particular metrics had low baseline results with Domino 6.5, such that even marginal growth from Domino 7 can show up as large percentage increases.

In the first table, our simulated users are using the inotes6.ntf mail template:

Resource                                         Domino 6.5   Domino 7   Change (percent)
CPU percent busy                                 94.6         71.6       -24
Disk read requests/second                        4.7          6.5        38
Disk write requests/second                       112.5        119.6      6
Base pool pages/second                           12.3         18.9       53
Total network KB/second                          437.3        491.2      12
Average response time (msec), Gigabit Ethernet   554.2        250.9      -55
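
The low-baseline effect mentioned before these tables can be checked against two rows of the table above. As a minimal sketch (the helper name describe_change is ours, and the figures are copied from the table), the following Python snippet prints the absolute and relative change side by side:

```python
def describe_change(old, new):
    """Return (absolute change, relative change in percent) from old to new."""
    absolute = new - old
    relative = absolute / old * 100.0
    return absolute, relative


# (metric, Domino 6.5, Domino 7) from the iSeries model 810 table above.
ROWS = [
    ("Disk read requests/second", 4.7, 6.5),
    ("Disk write requests/second", 112.5, 119.6),
]

if __name__ == "__main__":
    for metric, d65, d7 in ROWS:
        absolute, relative = describe_change(d65, d7)
        print(f"{metric}: {absolute:+.1f} per second, {relative:+.0f} percent")
```

Disk reads grow by fewer than two requests per second yet show a much larger percentage change than disk writes, which grow by about seven requests per second from a far higher baseline.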

And in this table, Domino 7 users are running the dwa7.ntf mail template:

Resource                                         Domino 6.5   Domino 7   Change (percent)
CPU percent busy                                 94.6         87.4       -8
Disk read requests/second                        4.7          12.1       157
Disk write requests/second                       112.5        186.6      66
Base pool pages/second                           12.3         34.5       180
Total network KB/second                          437.3        780.9      79
Average response time (msec), Gigabit Ethernet   554.2        426.9      -23

The benefits of Domino 7 shown in the two iSeries environments described in this section demonstrate a range of performance improvements that may be realized in a customer environment. Performance improvements will vary depending on the amount of CPU, memory, disk, and network resources available for Domino processing. As shown in the preceding tables, higher levels of performance improvements can be realized when ample system resources are available, and when using Domino 7 with the inotes6.ntf template. With the increased ability of Domino 7 to scale to higher numbers of users in a single Domino partition, consolidating to fewer Domino partitions can provide additional performance improvements.





Solaris 9

The Sun 6800 used for performance testing consists of an 8 CPU domain from a 12 CPU system. We have six T3 arrays, with nine drives each used in this test. We installed the Domino executables on the first array, and spread the user databases evenly across all six arrays.

Model: Sun 6800
CPUs: Eight 1050 MHz
Installed memory: 32 GB
Active physical drives: 54
Active logical volumes: 6 (RAID 0 arrays)
Operating system: Solaris 9

We made the following Notes.ini file modifications on the servers:

Domino 6.5:
NSF_Buffer_Pool_Size_MB=1536
ServerTasks=Router

Domino 7:
NSF_Buffer_Pool_Size_MB=1024
server_max_concurrent_trans=100
nsf_dbucache_max_entries=10000
inotes_wa_profilecachesize=10000
MEM_EnablePreAlloc=1
ConstrainedSHMSizeMB=3300
ServerTasks=Router,HTTP

In the Domino 6.5 testing, we used 1.5 GB for the NSF buffer pool, but for Domino 7 we decreased this to 1 GB to allow for the increased number of users we needed to support with Domino 7. We also increased server_max_concurrent_trans, nsf_dbucache_max_entries, and inotes_wa_profilecachesize to better handle the additional user load.

We made additional changes to the Domino Directory configuration, for enabling the large page support for Solaris that Domino 7 is able to take advantage of.

Setting                                         Domino 6.5   Domino 7
Number of users                                 10,000       10,000
HTTP threads                                    250          250
Listen queue size                               6000         2000
Maximum number of concurrent network sessions   3000         2000
Domino Web Engine - maximum cached users        10000        4000

In figure 5, you will see a substantial reduction in the CPU resources required for an R6iNotes virtual user, compared to what we had with Domino 6.5. The amount of reduction varies depending on how busy the server is, but at 5500 users, Domino 7 is 51 percent busy, compared to the 86 percent busy we saw on Domino 6.5 with an equal number of users active. That is a 41 percent (relative) reduction in CPU! We also see that for this benchmark, Domino 7 will support almost 9000 users with the same CPU that Domino 6.5 consumed with 5500 users. We were able to run a maximum of 9750 users on Domino 7 at 97 percent CPU busy.


Figure 5. CPU usage for Solaris 9

The following table shows the CPU, disk, memory, and network resources consumed by Domino 6.5 and Domino 7 when 5500 benchmark users were active on each. In addition to the CPU savings mentioned before, we see roughly equal network bandwidth consumption, and a slight increase in total disk activity. The differences in memory utilization are primarily related to the configuration changes we made for the Domino 7 server. For instance, the half-GB reduction in NSF buffer pool shows up here, as roughly the amount of decrease in shared memory.

Resource                     Domino 6.5   Domino 7   Change (percent)
Number of users              5500         5500       n/a
CPU percent busy             86           51         -41
Disk read requests/second    1940         1659       -14
Disk write requests/second   6697         7351       10
Shared memory used (MB)      2192         1603       -27
Process memory used (MB)     118          153        30
Network bytes/sec            533,486      525,612    -1

On Solaris, Domino 7 shows excellent improvements in scalability, allowing 9750 benchmark users to run on the same hardware where we previously could only achieve 5500 users with Domino 6.5. In addition, there have been significant savings in CPU, which should be very valuable when server consolidation is considered.





Windows 2003 Enterprise Server

Domino 7 was set up as a single partition server on an eServer xSeries 365 running Windows 2003 Enterprise Server with eight processors, hyperthreading enabled, and with 3.5 GB of available memory. The Domino executable files were installed on an IBM FAStT 200 GB array configured in RAID 0. The mail databases were spread across five IBM FAStT 200 GB arrays, also configured in RAID 0. Network access was through a single Gigabit Ethernet adapter, running in full duplex mode. The following tables show the xSeries server configuration and the Domino server configuration.

Model: eServer xSeries 365
CPUs: Eight 3.0 GHz HT
Installed memory: 3583 MB
Active physical drives: 62
Active logical volumes: Five arrays RAID 0
Operating system: Windows 2003 Enterprise Server

As in most of our other tests, we tweaked the servers' Notes.ini files:

Domino 6.5:
Show_Server_Performance=1
platform_statistics_enabled=1
iNotes_WA_EnableProfileStats=1
NSF_DBUCACHE_MAX_ENTRIES=12000
iNotes_WA_ProfileCacheSize=11050
NSF_DBcache_maxentries=11050
NSF_Buffer_Pool_Size_MB=512

Domino 7:
Show_Server_Performance=1
platform_statistics_enabled=1
server_max_concurrent_trans=100

We defined 7000 users for Domino 6.5, and 10,000 for Domino 7.0.

Figure 6 displays our results.


Figure 6. CPU usage for Windows 2003 Enterprise Server

As figure 6 shows, we found a substantial reduction in the CPU resources required for an R6iNotes virtual user, compared to what we had with Domino 6.5. As with other platforms, Domino 7 running on Windows 2003 Enterprise Server offers improvements in CPU utilization and scalability. The maximum number of virtual users supported on Domino 7 is 10,500 on a Windows 2003-based platform. Domino 7 shows a significant reduction in CPU utilization vs. Domino 6.5 at the 7000 user load level: Domino 6.5 at 7000 virtual users uses 92 percent CPU, while Domino 7 at 7000 virtual users uses only 40 percent CPU. That is a 57 percent relative reduction in CPU utilization. The CPU savings grow as the number of virtual users increases.

The following tables contain resource utilization numbers we obtained when testing with both templates. While both templates show significant CPU reductions, you can see that the Domino server uses slightly more resources when dealing with some of the new features that are integrated into the Domino 7 template. In the first table, our simulated users are using the inotes6.ntf mail template:

Resource                     Domino 6.5   Domino 7   Change (percent)
Number of users              7000         7000       n/a
CPU percent busy             92           35         -62
Disk read requests/second    21255        13008      -39
Disk write requests/second   7345         5869       -20
Shared memory used (MB)      1369         1140       -17
Process memory used (MB)     63           120        90
Network bytes/sec            881255       649861     -26

And in this table, Domino 7 users are running the dwa7.ntf mail template:

Resource                     Domino 6.5   Domino 7   Change (percent)
Number of users              7000         7000       n/a
CPU percent busy             92           40         -57
Disk read requests/second    21255        19824      -7
Disk write requests/second   7345         5975       -19
Shared memory used (MB)      1369         1250       -9
Process memory used (MB)     63           120        90
Network bytes/sec            881255       655926     -26

As you can observe from our data, Domino 7 running Domino Web Access on Windows 2003 has several significant performance advantages over Domino 6.5. With lower CPU usage, shared memory savings, and 50 percent more users supported, Domino 7 continues to improve on scalability, performance, and TCO when compared to Domino 6.5.





Linux on zSeries

For the tests we performed on the Linux on zSeries platform, we used one logical partition (LPAR) on a zSeries z990 model 2084-C24. The z990 has 24 CPUs available, 6 of which are dedicated to the performance test LPAR. The remaining 18 CPUs, as well as some other machine resources, were shared among 13 other LPARs used for Domino development and test activities. The performance test LPAR was configured with 12 GB of memory. On SLES 8, only 2 GB were used for central memory because of the 31-bit operating system, with 2 GB of expanded memory used for swap space. On SLES 9, we used 12 GB total. We used a single Gigabit Ethernet Open Systems Adapter (OSA) card on an isolated LAN.

All disks are allocated from an Enterprise Storage Server (2105 Model 800) array, with each disk configured as a 3390 model 3. Separate file systems were allocated on single volumes (disks) for the Domino executables, data (excepting client mail databases), and the Domino address book (Names.nsf), plus two volumes in a logical volume manager (LVM) file system for transaction logging. Client mail databases were distributed evenly over 52 LVM file systems, each allocated across 5 volumes in a single LVM, providing 11.5 GB of usable space per file system. The ext3 file system was used on Linux for zSeries.

The operating systems installed were SLES 8 with SP3 or SLES 9 with SP1. We ran with transaction logging enabled, using hardware data compression instead of LZ1 software compression. This feature is only available in Domino 7 on zSeries.

Model: z990 2084-C24
CPUs: Six dedicated CPUs
Installed memory: 12 GB
DASD type: 2105 model 800, 3390 model 3 type volumes
File system: 52 x 5 LVM mail databases, 7 other volumes for Notes data, notesbin, Domino Directory, mailbox, utility, and translog
Operating system: SLES 8 SP3 / SLES 9 SP1

Prior to testing, we configured both the Domino 6.5 and Domino 7 servers' Notes.ini files to include the following:

TRANSLOG_Status=1
NSF_Buffer_Pool_Size_MB=256
ServerTasks=Router, HTTP
NSF_DBCache_MaxEntries=9000
iNotes_WA_ProfileCacheSize=9000

Figure 7 shows the CPU improvement from Domino 6.5 to Domino 7 when running an R6iNotes mail workload with either the iNotes6 template from Domino 6.5 or the dwa7 template from Domino 7.


Figure 7. CPU usage for Linux on zSeries

Figure 7 shows a range of CPU improvement, from 25 to 32 percent running Domino 7 with the iNotes6 template, and 11 to 22 percent improvement running Domino 7 with the dwa7 template. Clearly, Domino 7 improved CPU with both iNotes6.ntf and dwa7.ntf over Domino 6.5. The variations in templates (custom or out of box) are to be expected, based on differences in features included as part of the templates. IBM continues to evaluate these areas to maximize CPU efficiency while maintaining increased functionality.

Figure 8 shows a range of CPU improvement from 5 percent to 12 percent running Domino 7 with the iNotes6 template on SLES 9. SLES 9 does not have the memory limitation that is inherent in the 31-bit SLES 8 architecture, so SLES 9 is able to benefit from the Domino 6.5 to Domino 7 CPU improvements across a higher utilization.


Figure 8. CPU usage improvement on SLES 9 compared with SLES 8

The workload generated the same amount of work in both the Domino 6.5 and Domino 7 servers. Each caused the same number of network bytes sent and received, each sent the same number of messages, and each completed the same number of transactions. As a result, CPU reduction on Domino 7 translates into improved stability at high workload levels, allowing for more clients to be supported by a single Domino 7 on Linux on zSeries server. More importantly, the lighter CPU requirements of Domino 7 can produce substantially lower total cost of ownership compared to Domino 6.5 on Linux on zSeries.





z/OS

All performance tests described in this section ran on one dedicated LPAR of a zSeries z990 model 2084-C24. It has 24 CPUs available, 6 of which are dedicated to the performance test LPAR. The remaining 18 CPUs, as well as some other machine resources, were shared among 13 other LPARs used for Domino development and test activities. The performance test LPAR was configured with 12 GB of central storage memory. We used a single Gigabit Ethernet OSA card on an isolated LAN.

All disks are allocated from an Enterprise Storage Server (2105 Model 800) array; each disk is configured as a 3390 model 3. There is a separate z/FS file system allocated on a single volume (disk) for each of the Domino executables, data (excepting client mail databases), and the Domino Directory (Names.nsf). A file system that spans two volumes is allocated for transaction log data. Client mail databases were distributed evenly over 53 z/FS file systems, each spanning 5 volumes, providing 11.5 GB of usable space per file system.

The operating system installed was z/OS version 1 release 5. We ran with transaction logging enabled, using hardware data compression instead of LZ1 software compression.

Model: z990 2084-C24
CPUs: Six dedicated CPUs
Installed memory: 12 GB
DASD type: 2105 model 800, 3390 model 3 type volumes
File system: 53 x 5 z/FS mail databases, 7 other volumes for Notes data, notesbin, Domino Directory, mailbox, utility, and translog
Operating system: z/OS 1.5

We made the following configuration modifications to the Notes.ini files of our Domino 6.5 and Domino 7 servers:

TRANSLOG_Status=1
NSF_Buffer_Pool_Size_MB=256
ServerTasks=Router,HTTP
NSF_DBCache_MaxEntries=10000
iNotes_WA_ProfileCacheSize=10000
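
When reproducing a configuration like this, it can help to check a server's Notes.ini against the intended values before starting a run. The following is a minimal sketch in Python, not a Domino tool. The helper names (parse_ini_lines, find_mismatches) are hypothetical, the setting values are those quoted above, and the code assumes the file consists of simple name=value lines with comma-separated list values.

```python
# Settings quoted in the article. The checking helpers are hypothetical
# illustrations and are not part of Domino.
EXPECTED = {
    "TRANSLOG_Status": "1",
    "NSF_Buffer_Pool_Size_MB": "256",
    "ServerTasks": "Router,HTTP",
    "NSF_DBCache_MaxEntries": "10000",
    "iNotes_WA_ProfileCacheSize": "10000",
}


def parse_ini_lines(text):
    """Parse simple name=value lines into a dict; other lines are ignored."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def normalize(value):
    """Split a value on commas so list spacing and case do not matter."""
    if value is None:
        return None
    return [part.strip().lower() for part in value.split(",")]


def find_mismatches(text, expected=EXPECTED):
    """Return {name: (expected, actual)} for missing or differing settings."""
    actual = parse_ini_lines(text)
    mismatches = {}
    for name, want in expected.items():
        got = actual.get(name)
        if normalize(got) != normalize(want):
            mismatches[name] = (want, got)
    return mismatches


if __name__ == "__main__":
    sample = "TRANSLOG_Status=1\nNSF_Buffer_Pool_Size_MB=128\n"
    for name, (want, got) in find_mismatches(sample).items():
        print(f"{name}: expected {want}, found {got}")
```

The comparison splits values on commas so that a task list written with or without spaces is treated the same way. Whether that leniency is appropriate depends on how strictly you want to mirror the test configuration.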

We defined 10,000 users for this test.
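
To put the storage layout in perspective, the figures above (53 mail file systems, 11.5 GB usable each, 10,000 defined users) can be combined with simple arithmetic. The sketch below is only an illustration: the function name is hypothetical, and the per-user figure is an upper bound implied by the available space, not the actual mail file size used in the test.

```python
MAIL_FILE_SYSTEMS = 53     # z/FS file systems holding mail databases (from the article)
USABLE_GB_PER_FS = 11.5    # usable space per file system in GB (from the article)
DEFINED_USERS = 10_000     # users defined for the test (from the article)


def mail_layout(file_systems, usable_gb_per_fs, users):
    """Derive totals for an even distribution of users over file systems."""
    total_gb = file_systems * usable_gb_per_fs
    return {
        "total_usable_gb": total_gb,
        "users_per_file_system": users / file_systems,
        # Upper bound only: space available per user, not space actually used.
        "max_gb_per_user": total_gb / users,
    }


layout = mail_layout(MAIL_FILE_SYSTEMS, USABLE_GB_PER_FS, DEFINED_USERS)
for name, value in layout.items():
    print(f"{name}: {value:.3f}")
```

With these inputs the layout works out to 609.5 GB of usable mail space in total, roughly 189 users per file system, and at most about 0.06 GB of space per user.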

Figure 9 shows the CPU improvement from Domino 6.5 to Domino 7 running an R6iNotes workload, using either the iNotes6 template from Domino 6.5 or the dwa7 template from Domino 7. The chart shows a CPU improvement of 27 to 34 percent when running Domino 7 with the iNotes6 template, and of 10 to 24 percent when running Domino 7 with the dwa7 template. As you can see, Domino 7 reduced CPU usage relative to Domino 6.5 with both the iNotes6 and dwa7 templates.


Figure 9. CPU usage for z/OS

Figure 10 shows that the majority of the improvement comes from the HTTP task on Domino 7. The Router task also shows some CPU improvement on Domino 7.


Figure 10. CPU usage for Server, Router, HTTP, and event tasks

The workload generated the same amount of work in both the Domino 6.5 and Domino 7 servers. Each caused the same number of network bytes sent and received; each sent the same number of messages; and each completed the same number of transactions. Therefore, CPU reduction on Domino 7 translates into improved stability at high workload levels, allowing for more clients to be supported by a single Domino 7 on z/OS server.
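
The link between a CPU reduction and additional supported clients can be made concrete with a small calculation. The sketch below is an illustration only and not the method used to produce the benchmark figures. It assumes that CPU cost scales linearly with the number of users, which real servers only approximate, and the function name is hypothetical. Under that assumption, a fractional reduction r in CPU per user allows 1 / (1 - r) times as many users at the same CPU budget.

```python
def capacity_gain(cpu_reduction):
    """Fractional increase in users supportable at an unchanged CPU budget.

    cpu_reduction is the fractional drop in CPU per user (0.27 for 27 percent).
    Assumes CPU cost scales linearly with the number of users.
    """
    if not 0 <= cpu_reduction < 1:
        raise ValueError("cpu_reduction must be in the range [0, 1)")
    return 1 / (1 - cpu_reduction) - 1


# CPU improvement ranges reported for z/OS in this section.
reported = [
    ("iNotes6 template, low end", 0.27),
    ("iNotes6 template, high end", 0.34),
    ("dwa7 template, low end", 0.10),
    ("dwa7 template, high end", 0.24),
]

for label, reduction in reported:
    print(f"{label}: {reduction:.0%} less CPU -> "
          f"{capacity_gain(reduction):.1%} more users at the same CPU")
```

For the z/OS ranges reported above, the linear model gives roughly 37 to 52 percent more users with the iNotes6 template and roughly 11 to 32 percent more with the dwa7 template. Treat these as rough estimates, because memory, I/O, and contention effects are not modeled.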





Summary

For Domino 7, we worked on two main aspects of performance for Domino Web Access users. The first was to increase scalability of the Domino server beyond the code bottlenecks, and the second was to minimize CPU usage and resource contention that existed in previous versions. With the current increases in server hardware performance, many customers are looking to consolidate environments, and we need to be ready to help make this possible. To this end, we have shown increased benchmark users running on all platforms, compared to the number of users with Domino 6.5. While you probably would not want to push a production server as hard as we do during a benchmark, it is good to know that Domino can now handle 50 percent more users than it could in release 6.5.

Along with this increase in scalability, Domino 7 has reduced the amount of CPU needed to handle an equal number of users, compared to Domino 6.5. The amount of improvement will vary by platform and server utilization, but in general we see the greatest improvements on the busiest servers.






About the authors

Rich Buck has had a long career working on performance within IBM. Rich works on the full spectrum of operating systems supported by Domino with a current concentration on Solaris.


Wu W Huang is a member of the Lotus Domino Performance team, with primary focus on the zSeries platform.


Dave Johnson is currently a member of the iSeries System Performance area with a focus on Domino performance. Dave's team is also responsible for NotesBench audits for IBM eServer iSeries.


Angelo Lynn is a Performance Engineer on the Domino Performance Team. His current focus is Domino performance on Windows-based platforms. He is a recent graduate of Northeastern University.


Andy Nolet has been working with customers on Notes performance-related issues since the late 1990s. Before joining the Domino Performance team, Andy worked for Lotus Support.


Joseph H. Peterson is on the iSeries development team in Rochester, Minnesota. His focus is on performance of Lotus products.


Jim Powers is a member of the Domino Performance team. Previously, Jim led the performance team for the Lotus Domino Support organization. His experience with computer systems goes back over 30 years, during which he has held various hardware and software roles.



