Level: Intermediate Alexandre Polozoff (polozoff@us.ibm.com), Software Services for WebSphere consultant, IBM
20 Nov 2002
This article provides a protocol for conducting performance testing to determine the optimal environmental settings for an application in a variety of load scenarios. Topics include planning the performance environment, performing the actual testing, and measuring the application's performance characteristics.
Introduction
The term protocol is defined as "a detailed plan of a scientific or medical experiment,
treatment, or procedure". This article provides a protocol for conducting
performance testing on WebSphere® Application Server-based applications,
including information on planning, setting up the performance environment,
performing the actual testing, and measuring the application's characteristics.
Performance testing is the only way to determine
the optimal settings (for JVM, connection pooling, etc.) for an
application in a variety of load scenarios. Every application is
different and exhibits different behavior under a variety of conditions,
so every application should undergo performance testing before being deployed
in a production environment.
The Performance Testing Environment
Ideally, the performance test environment will identically mimic the production environment in every detail, from the number of server firewalls and backend resources to the gauge of the network cabling. However, due to the size and scale of high volume production environments, this is rarely practical. A smaller environment with a minimum of two or three physically separate WebSphere Application Server machines is a more typical performance testing base configuration.
Figure 1. The base performance environment
As shown in Figure 1, the base performance environment in the distributed
space has two physically separate WebSphere Application Servers connecting
to a remote database and driven by a single, remote HTTP Server.
If the HTTP Server is remote in the application's production environment, then it is preferable to have the HTTP Server remote in the performance environment as well. Each WebSphere Application Server runs independently on its own node. Running other applications that are unrelated to the testing on these nodes introduces competition for local CPU, memory and disk resources. This not only affects the test results; the interaction of those applications with the environment's resources is also difficult to measure.
At the very least, the two WebSphere Application Servers should have the
same machine and OS level configurations. Common mistakes include putting
one or the other application server on a different OS patch or fixpack
level, or on different memory configurations, resulting in inconsistent
results and/or behavior. As an aside, make it a point to double-check that
the TCP/IP stack settings are identical on both machines as well, particularly
the duplex settings on the NICs.
Ideally, all application data and the database for the admin repository
will reside on a remote machine, although the application data does not
necessarily need to reside on the same database server as the admin repository.
If HTTP Session persistence is enabled, make sure that the Sessions table
is isolated from other databases and marked as VOLATILE.
This is not to say that a completely different configuration for the performance
environment is entirely unacceptable. In today's business climate, it is
common for the HTTP Server to reside locally on each of the WebSphere Application
Server nodes. It is less desirable, however, for one of the WebSphere Application
Servers to also play double duty as the database for the admin repository.
These and other configuration characteristics violate the constraint that
applications should not compete for local resources. Such imbalances in
the performance environment can, and often do, skew the final results.
However, although a compromised performance testing environment is certainly
not preferable, it is better to have almost any type of
performance test environment than none at all.
Dedicated server environment
Obviously, the testing environment should mirror the production environment
as closely as possible, since any differences (any at all) introduce uncertainty.
If you scale down your test environment, you will have to scale up your
results to approximate the numbers for the production environment. Similarly,
if your HTTP Server is independent in production, but is included on an
app server node in test, then your performance results will not be quite
indicative of production either. Configuring your test environment is a
process of making choices and concessions to get the most accurate data
possible.
The ideal performance test environment is made up of dedicated server machines with a dedicated network connecting them. It is difficult to conduct performance testing if the servers for the WebSphere Application Server-based applications are also operating as servers for other unrelated applications or configurations. It is crucial that the competition for local CPU, memory and disk resources be kept to an absolute minimum.
Backend resources can be the hardest to dedicate to performance testing.
Because of operating policies and replication costs at some installations,
a dedicated server environment is not always possible. This explains why
performance tests are often run "overnight" when backend resources
are subject to light (or lighter) load conditions. The world of the Internet,
however, is continuing to strain backend resources, as multinational corporations
are trending toward 24/7 operations.
The lack of a dedicated server environment can throw performance testing for a loop. Some problem scenarios in an undedicated server environment include:
- Shared network resources where the performance environment is also shared by the organization's intranet. This can cause sporadic performance results depending directly upon the network utilization at the time of the tests. For example, if someone kicks off a network backup that heavily utilizes the network, this could increase response times, or even deny network connectivity to the components in the performance environment altogether. A network sniffer can be used in cases like this to help identify network bandwidth utilization and determine when poor response times are not due to the application. Still, this is an extremely frustrating and inefficient environment in which to conduct performance testing.
- Shared backend resources where multiple applications are attempting to access the same or related data. This can also result in slower than normal response times, and is a very difficult situation to identify without the help of tools directly monitoring the backend resources during testing to understand the utilization picture. Additionally, the other applications may be changing data on the backend resources, making repetitive tests difficult or impossible.
A WebSphere Application Server machine with several applications being tested implies that both the applications and the HTTP Servers are under the additional strain of multiple tests. This is a completely reasonable scenario to run, since it conducts performance system integration testing of multiple applications.
Base configuration of the WebSphere Application Server environment
There are base configurations of the WebSphere Application Server environment
that are commonly misunderstood or incorrectly configured.
The layered gate
One configuration is the gating down of requests from the Internet into
WebSphere Application Server and consequently to the backend resources.
The reason for gating the requests is to configure the application for
its best performance characteristics. There is a point in the load curve
where response time degradation is directly due to the number of requests
the application is processing. Once that point has been determined, by
using the protocol described, you "gate" the number of requests
processed by the application. As a result, incoming requests should be
queued up at the HTTP Server until the application is able to process another
request. Gating is accomplished by configuring three separate points in
the infrastructure that control the flow and volume of requests into the
next layer.
Figure 2. Gating requests

Figure 2 shows the primary components in the infrastructure starting from
the left side, where the requests come into the HTTP Server from the Internet.
The requests are then sent to the WebSphere Application Server, where the
application connects to backend resources, such as a database. At each
one of the three arrows in the figure is a configuration point gating the
incoming requests into the application and subsequently to the backend
resource. Gating requests into the environment tunes the maximum workload,
and provides a positive user experience.
Gating protocol
The gating protocol described here is based upon recommended WebSphere
Application Server best practices, information compiled from Redbooks,
and experiences at high volume customer locations.
There are three steps in the protocol, based upon three configuration points
in the infrastructure:
1. HTTP Server Maximum Concurrent Requests
Web servers supported by WebSphere Application Server all provide the capability to define the maximum number of concurrent requests (as opposed to concurrent users) to accept. The differentiation between user and request is pointed out here because a single HTML page containing several images results in multiple requests by a single user.
In the Apache world, the maximum concurrent request gate is controlled
solely by the MaxClients setting. (The corresponding setting on iPlanet
is ThrottleRequests.) Requests for local static content are processed by
the Web server, and requests for the application are forwarded to the WebSphere
plug-in that sends the request out to the application.
The Web server processes requests for both local static content and for
the application. Determining the value for MaxClients requires examining
the content sent to the browser and the maximum number of servlet engine
threads in the application server. A general rule of thumb to use for setting
a starting point value for any application with one Web server and one
application server, is:
maxClient = imageContentPerPage*maxServletEngineThreads
where imageContentPerPage is the average (or maximum) number of images in the HTML responses, and maxServletEngineThreads is the maximum number of servlet engine threads defined for the application server.
An extrapolation of this formula is based upon the WebSphere Application
Server's environment. Take, for instance, the scenario where there are
three Web servers feeding two application servers (these can be clones
within the same ServerGroup). The formula, then, becomes:
maxClient = imageContentPerPage*maxServletEngineThreads*numberApplicationClones/numWebServers
If the resulting value for MaxClients is too low, the load test client experiences connection errors due to too few available listeners on the Web server. The value for MaxClients should be increased in this case.
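The rule of thumb above can be sketched as a small calculation. This is only an illustration of the arithmetic: the function and parameter names are hypothetical, and rounding fractional results up is an assumption made here so that the starting point never under-provisions listeners.

```python
import math

def max_clients(image_content_per_page, max_servlet_engine_threads,
                num_app_clones=1, num_web_servers=1):
    """Starting-point MaxClients value per Web server (rule of thumb only)."""
    total = image_content_per_page * max_servlet_engine_threads * num_app_clones
    # Assumption: round up so a fractional result never under-provisions.
    return math.ceil(total / num_web_servers)

# One Web server, one application server, 6 images per page, 25 threads.
print(max_clients(6, 25))        # 150
# Three Web servers feeding two application server clones.
print(max_clients(6, 25, 2, 3))  # 100
```

The result is a starting value for the first test run, not a final setting; the load tests determine whether it needs to be raised.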
2. Maximum Servlet Engine Threads
The maximum number of servlet engine threads is determined by analyzing the performance test results. There is no valid "general" value to set this to, and the default of 25 is normally too low for high volume applications.
Something to keep in mind when setting the maximum servlet engine thread count is the maximum JVM heap size setting. Each servlet engine thread is allocated its own stack, which takes up memory within the JVM. As each servlet thread processes requests under load, you also have to account for the objects that are created during this activity. Make sure that the JVM maximum heap size setting is high enough to support the increased number of servlet engine threads and to avoid Out of Memory conditions. Also, keep in mind that you will probably want to have generational garbage collection enabled, as described in the WebSphere Application Server InfoCenter Performance Tuning Guide.
The actual maximum value that you use will be derived solely from application
monitoring during the load and stress testing. Using the application monitor,
determine whether or not all the servlet engine threads are established,
then adjust the maximum accordingly and rerun the tests. Bear in mind that
changes in the maximum number of servlet engine threads will require that
the corresponding change be made to the MaxClients parameter on the Web
servers.
3. Maximum Connection Pool Size
The final gate in the chain is the size of the connection pool to the data sources accessed by the application. Documentation on connection pooling (see the Resources section later in this article) states that applications should briefly hold database connections while executing their transactions. This allows for efficient management and sharing of a few database connections. The generally accepted maximum value for data source connections, even in high volume installations, is 40, with the typical application somewhere between 10 and 20.
The actual maximum value that you will use here will also be derived solely
from application monitoring during the load and stress testing. Application
monitoring should identify how many data source connections are utilized
in the pool. If the setting is at 20 and all 20 data source connections
are fully utilized under load, then increase the setting to 30. Rerun the
tests and reexamine the utilization of the increased connection pool and
determine again if the pool is fully utilized or not. Readjust the maximum
as needed.
Experimentation is necessary to find the true maximum number of connections
needed. When open connections exist, there is an associated expense in
the form of memory usage and network utilization that factors into the
application's performance. While determining the number of connections
needed is not an exact science, through trial and error and application
monitoring the best possible value can be found.
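The adjust-and-rerun cycle described above might be sketched as follows. The names are hypothetical, `measure_peak` stands in for "run the load test and read the peak pool utilization from the application monitor", and the step of 10 simply mirrors the 20-to-30 example in the text.

```python
def next_pool_size(current_max, peak_in_use, step=10):
    """Suggest the maximum pool size for the next test run."""
    # A fully utilized pool may be a bottleneck, so grow it and retest.
    if peak_in_use >= current_max:
        return current_max + step
    return current_max

def tune_pool(initial_max, measure_peak, step=10, max_runs=10):
    """Repeat test runs until the pool is no longer fully utilized."""
    size = initial_max
    for _ in range(max_runs):
        proposed = next_pool_size(size, measure_peak(size), step)
        if proposed == size:
            return size
        size = proposed
    return size

# Stand-in for real test runs: the application never needs more than 34.
print(tune_pool(20, lambda size: min(size, 34)))  # 40
```

A real tuning pass would also weigh the memory and network cost of each extra connection before accepting the larger value.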
Correlate expected test results
The test results should show some correlation between the numbers seen
on the load test client side and those on the application server. For example,
if three clones of an application server are configured with 25 servlet
engine threads per server, and the observation is that the servlet response
time is sub-second, then you would expect to see at least 75 requests per
second, or more, through the application server. You would also not expect
to see 8-second response times on the client side. The results must take
into account the number of HTTP connections for static data at the same
time. Make sure that the test results match what you are seeing in the
application server's configuration. If the results do not match, make sure
that the load test client is not running hot (100% CPU) or low on memory,
and that no network bottleneck has been encountered.
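The arithmetic behind the 75 requests per second figure can be written out as a quick sanity check. This sketch assumes every servlet engine thread stays busy for the whole measurement interval; the function name is hypothetical.

```python
def min_expected_throughput(clones, threads_per_clone, servlet_response_time_s):
    """Requests per second expected if every servlet thread stays busy."""
    return clones * threads_per_clone / servlet_response_time_s

# Three clones, 25 threads each, one-second servlet response time.
print(min_expected_throughput(3, 25, 1.0))  # 75.0
# Sub-second responses push the expected figure higher.
print(min_expected_throughput(3, 25, 0.5))  # 150.0
```

If the load test client reports far fewer requests per second than this while the threads are busy, the discrepancy points at the client or the network rather than the application server.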
Load test clients
The client side of the performance testing environment has a major impact on performance test results. Conventional load testing tools run as agents on several client machines. More than one client machine is needed to generate a representative volume load, since the client is limited (by CPU and memory) to the number of users it can realistically represent while collecting accurate results.
Figure 3. Load testing clients direct volume into the server environment
In Figure 3 we have three load test clients that are applying load against the servers in the performance test environment. Notice that the load test clients are not residing on the application servers themselves. The clients should always be located on machines separate from the application servers.
Dedicated clients
The load testing client machines should be dedicated to the sole task of load testing. These machines are not to be shared with other applications competing for local CPU, memory and disk resources. Such competition for local resources does affect the reliability of the measured responses. The client machines should remain on the same routers and network configuration, and should be as close as possible (in networking terms) to the dedicated server environment.
Also, the load test client must be capable of generating the appropriate
type of request needed for loading the application in question. For example,
testing an application based on RMI access to EJBs is different from testing
HTTP-based requests to a JSP or servlet framework.
Base line and consistency of tests
Dedicated load testing machines provide consistency between the separate performance test runs. One of the first steps in the performance testing protocol for an application running on WebSphere Application Server is recording a base line set of results. The base line is only relevant if the entire test scenario can be consistently reproduced (i.e. you can always rerun the base line and get the same results back). Haphazard swapping of the load test machines or their configurations makes this "base lining" task difficult, if not impossible, to achieve and, further, makes analysis of the performance data ambiguous at best.
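One way to check that a base line is reproducible is to rerun it several times and look at the spread of the results. The sketch below uses the relative standard deviation of the average response time; the 5% tolerance and the function name are assumptions for illustration, not values from this article.

```python
from statistics import mean, pstdev

def baseline_is_reproducible(response_times_ms, tolerance=0.05):
    """True if repeated base line runs agree within the relative tolerance."""
    average = mean(response_times_ms)
    return pstdev(response_times_ms) / average <= tolerance

# Average response time (ms) from three reruns of the same base line test.
print(baseline_is_reproducible([412, 405, 418]))  # True
print(baseline_is_reproducible([412, 405, 690]))  # False
```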
Time outs
Set the client page timeouts to 2 minutes or less. This will force errors on the client side if the application is not responding within a reasonable amount of time. Few users tolerate lengthy response times.
Collecting and measuring results
Part of the overall exercise in collecting performance data involves capturing
metrics that represent the characteristics of the Java Virtual Machine
(JVM) or the application server. The ability to view the separate resources
of the application server itself, and how they are performing in relation
to the rest of the application, is invaluable. WebSphere Resource Analyzer
is one such tool that provides timings and resource allocation values on
a per-server basis. Other sophisticated tools, such as Wily's Introscope™
and the IBM Tivoli® Monitoring v5.1 WebSphere PAC, offer more in-depth data
collection with supplementary capabilities, such as compiling data from
multiple servers in the cluster into one view, and saving collected metrics
into a database for a historical archive. These tools also typically play
double duty in the production monitoring environment by watching many of
the same critical application performance points. In fact, the performance
test activity identifies the application performance points that should
be monitored in production.
Figure 4. Application monitoring
In Figure 4, application monitoring captures metrics identifying application bottlenecks and/or issues with backend resources.
Application monitoring, regardless of the tool you decide upon, is the single most important element in performance testing and problem determination. Monitoring provides the ability to measure the response time of the application servlets, JSPs and EJBs against the backend resources, and against those measured by the load test clients. These metrics assist in identifying problem areas requiring attention either within or outside the scope of the application.
Historical archive
As the testing proceeds, the measurements obtained should be compared against
past runs in order to determine if performance is improving or degrading.
Application monitoring tools can typically save the collected data in one
format or another. Higher end tools can even integrate results directly
into databases or other data store repositories. If the tool you use does
not provide a method for saving data in a usable format, you can keep relevant
data in a simple spreadsheet.
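Where the monitoring tool has no usable export, a plain CSV file can serve as the spreadsheet-style archive. The sketch below is a minimal illustration: the column names and run values are hypothetical, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io

FIELDS = ["run_id", "avg_response_ms", "requests_per_sec"]

def compare_to_previous(rows):
    """Percentage change of the latest run against the run before it."""
    prev, last = rows[-2], rows[-1]
    return {
        name: 100.0 * (float(last[name]) - float(prev[name])) / float(prev[name])
        for name in FIELDS[1:]
    }

archive = io.StringIO()  # stands in for a file such as a runs archive on disk
writer = csv.DictWriter(archive, fieldnames=FIELDS)
writer.writeheader()
writer.writerow({"run_id": "run-01", "avg_response_ms": 400, "requests_per_sec": 80})
writer.writerow({"run_id": "run-02", "avg_response_ms": 500, "requests_per_sec": 72})

archive.seek(0)
rows = list(csv.DictReader(archive))
change = compare_to_previous(rows)
print(change)  # {'avg_response_ms': 25.0, 'requests_per_sec': -10.0}
```

Here the second run is slower and moves less work, so the change made between the two runs degraded performance.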
Programmatic timings
Application development teams tend to build performance timing measurements
into their applications via some sort of logging mechanism. The logging
of performance measurements from within the application should be strongly
discouraged, since it introduces additional code into the application that,
in addition to consuming processing cycles, must also be maintained and
tested. External application monitoring tools that bolt onto the application
and are controlled from a single point of command are ultimately more efficient.
Monitoring the JVM
The JVM has several key points to monitor during the performance testing activity to better understand how it is performing. Some base parameters to observe include:
- Number of active servlet engine threads. This allows for understanding the maximum amount of work that the applications in the JVM can provide at any one point in time. It also assists in determining the required maximum settings for the production environment.
- Number of active ORB threads for applications with EJBs.
- Free and used memory to help in understanding how the applications are utilizing memory and
how often garbage collection cycles are executed.
- Servlet response time to help compare and contrast the response times observed on the application server against those measured on the load test clients. Bottlenecks through firewalls and other network components could introduce delays that are confirmed by recording servlet response times.
Monitoring the application
Application monitoring is specific to each application. Key methods that access backend resources should be considered for monitoring. Methods that serialize/deserialize objects must be watched in order to understand the performance impact they have.
Frequency of calls and duration of method execution are two performance numbers that should be captured. Frequency is monitored to determine the number of times the methods are called during various performance runs. Method execution timings provide insight into what percentage of the overall response time is spent on specific tasks, and help identify application bottlenecks. Developers can use this information for further code refactoring to improve the application's performance.
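Given call counts and average durations exported from a monitoring tool, the share of response time spent in each method is a simple calculation. The method names and numbers below are hypothetical, and the sketch assumes the figures are per request.

```python
def time_share(method_stats, total_response_ms):
    """Percentage of the total response time spent in each monitored method.

    method_stats maps a method name to (calls per request, average ms per call).
    """
    return {
        name: 100.0 * calls * avg_ms / total_response_ms
        for name, (calls, avg_ms) in method_stats.items()
    }

stats = {
    "OrderDao.load": (3, 40.0),    # three backend reads per request
    "Cart.serialize": (1, 30.0),   # one serialization per request
}
print(time_share(stats, total_response_ms=200.0))
# {'OrderDao.load': 60.0, 'Cart.serialize': 15.0}
```

In this made-up case the backend reads account for most of the response time, which would make them the first candidate for refactoring.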
Monitoring backend resources
Monitoring of backend resources during performance testing is a crucial
part of the performance test activity. The performance of any application
is only as fast as the slowest link in the environment. If the backend
resource is not providing adequate levels of performance, then the
application accessing it cannot perform at optimum
levels either. There can be many reasons why backend resources suffer degraded
performance. The administrator for each specific resource with the appropriate
toolset and knowledge is needed to help resolve any issues of this sort
related to the backend.
Monitoring network resources
The common link of networked application environments is the network itself.
Network resources participating in the test activity must also be monitored
and their data analyzed to ensure peak throughput and efficiency. This
covers the entire network, including any firewalls, routers, CSS switches,
load balancers, reverse proxies, etc. that participate in the test. Any
performance issues due to improperly configured firewalls, reverse proxies
or other network devices must be resolved first, or the data collected
will always be skewed and/or inconsistent. Common network misconfigurations to check for include:
- Throughput set to half duplex instead of full duplex.
- Routing taking different hops to/from the same set of devices.
- Firewall set up for proxy instead of passthrough.
Network resources should be monitored for:
- CPU (where applicable).
- Ports in the "established" state.
- Throughput timings of connections.
- Bandwidth.
Setting Performance Expectations
Setting expectations from a few different perspectives prior to the start
of performance testing is very important in making sure that the test results
are valuable, and that the testing activity itself is successful. These
expectations include what various inputs will be required by the test team,
and what outputs will define reasonable application performance. If not
set ahead of time, it is difficult to define whether the performance of
the application being tested is adequate or not.
Likewise, expectations must be set for the duration of the performance testing activity. Due to the number of tests and parameters that are modified, performance testing can take a considerable amount of time to complete. Setting up front an appropriate amount of test time is to everyone's benefit.
Application expectations
In order to determine if the application is performing adequately, performance
expectations must be defined. It should be the first part of any test plan
to outline the following expectations:
- Acceptable servlet response time.
- Acceptable load client response time. This will be different from servlet response time due to network overhead, additional hops through Web servers and firewalls, etc.
- Acceptable requests per second throughput.
- Acceptable backend resource response time.
- Acceptable backend requests per second throughput.
- Acceptable network overhead, including Web servers, firewalls, reverse
proxies, load balancers, CSS, etc.
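Writing the expectations down as explicit thresholds makes it straightforward to check each test run against them. The sketch below is illustrative only: every metric name and limit is a hypothetical placeholder for values that the test plan would define.

```python
# Hypothetical targets; real values come from the test plan.
EXPECTATIONS = {
    "servlet_response_ms": ("max", 500),
    "client_response_ms": ("max", 2000),
    "requests_per_sec": ("min", 75),
    "backend_response_ms": ("max", 200),
    "backend_requests_per_sec": ("min", 150),
    "network_overhead_ms": ("max", 300),
}

def unmet_expectations(measured, expectations=EXPECTATIONS):
    """Return the names of the metrics that miss their target."""
    failures = []
    for name, (kind, limit) in expectations.items():
        value = measured[name]
        met = value <= limit if kind == "max" else value >= limit
        if not met:
            failures.append(name)
    return failures

measured = {
    "servlet_response_ms": 420,
    "client_response_ms": 2600,
    "requests_per_sec": 81,
    "backend_response_ms": 150,
    "backend_requests_per_sec": 140,
    "network_overhead_ms": 250,
}
print(unmet_expectations(measured))
# ['client_response_ms', 'backend_requests_per_sec']
```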
Likewise, a plan to address any shortcomings, should the test results not
meet the acceptable results criteria, must be developed. There are two
basic strategies for alleviating performance bottlenecks within an application:
- Throwing more hardware at the problem, which can be expensive.
- Fixing the application bottlenecks, which can take a long time.
Depending on the problem, both strategies may be sound and feasible, though
budgetary constraints normally play a factor in the decision. Neither solution
is inexpensive, but fixing application bottlenecks is generally a better
strategy to follow, whenever possible. Fixing performance problems within
the application should also result in some type of post-mortem process
to document and distribute the knowledge gained from these tasks.
Total duration of the performance testing activity
It is, unfortunately, common for a couple of weeks of performance testing to be scheduled at the end of the development lifecycle, just prior to moving the application into the production environment. The problem with this philosophy is that many application issues do not surface until they are placed under load, and it is only during the performance testing phase that applications are subjected to the expected load volumes of the production environment.
Depending on the application, the environment, and many variable factors, the performance test phase of the development lifecycle can easily take several months to complete, even if the application has few problem issues. Applications with more problems and bugs can take considerably longer. This is one reason why testing early and often within the performance test environment is strongly recommended. The earlier testing is started in the development lifecycle, the sooner application issues can be detected and properly dealt with. Waiting until the very end of the development lifecycle to begin load testing is probably the worst performance testing scenario.
Application acceptance criteria
After any serious software development activity, the application must meet
performance test acceptance criteria before performance test activity
starts. If an application does not meet these criteria, it should
not be accepted into the performance testing environment:
- Unit test capability. All application development efforts must provide a comprehensive unit test strategy and accompanying unit test code that can be executed to determine that the build is complete and functional. If the unit test cases cannot be successfully completed, either due to bugs in the application or the lack of unit test code, then the application should not be accepted for performance testing.
- Low load level capability. The application should perform reasonably and within predefined expectations at the single-user and 10-user load levels with normal think time in place. If the application cannot function at these low load levels, then it will certainly be unable to function at higher load levels. Beginning performance testing would be a waste of time.
- Test data available. Data for the application to execute during the performance testing must
be provided, or detailed in such a manner that the performance test team
can set up as close a replica to the production environment as possible.
The test data must be realistic, consistent and complete.
An application must never be placed into the production environment until
it has passed performance tests with the expected behavior.
The Testing Protocol
When the performance test environment is set up and running optimally,
the next step is to produce a detailed test plan that will measure the
performance and characteristics of the application being tested. The following
information is provided as a checklist of tasks to perform and measurements
to take. Many of these protocol recommendations also figure in the capacity planning steps for identifying how many JVMs must be defined to adequately handle anticipated user load in production.
Individual test durations
Typically, each of the individual tests and configuration points described
below should be run at its peak load level for no more than 10 to 30
minutes. The first set of test runs is strictly
for gathering data at a variety of configuration points and load levels.
Once the results are examined and optimal configuration points for the
application-defined expectations have been determined, then longer tests
of 12 hours or more in duration can be executed to measure the application's
characteristics over time. The longer tests also provide functionality
and stability testing under the prescribed load.
Single and multiple JVM measurements
Performance testing has two basic setups: single and multiple JVMs. The
single JVM test illustrates base application performance when running alone.
Measurements such as responses per second, servlet response time and number
of backend accesses provide the maximum throughput to be expected. These
are maximums because the clustered environment may have difficulty outperforming
the single application, due to backend resource limitations. The clustered
environments provide scalability and fail over. On rare occasions, application
issues can arise in the multiple JVM configuration that are not possible
in the single JVM configuration.
Recommendation: All testing must be done for both the single JVM and for the clustered, multiple JVM configurations.
JVM heap size settings
The JVM heap size settings are adjusted from a minimum to a maximum set
of values until the optimal settings are found. Optimal settings are in
some part due to the servlet engine thread count settings, as each stack
takes up memory within the JVM.
Recommendation: Adjust the JVM heap size settings in reasonable increments to determine the optimal memory settings for the application. Make sure that the JVM heap size settings are within the physical memory limits of the machine, along with any other applications on the same server. Remember that the base operating system also has memory requirements that must be accounted for, and that swapping negatively impacts application performance in all operating system environments.
Generational garbage collection
Introduced with JVM v1.3 for all operating system platforms is the concept of generational garbage collection. High volume applications that actively create many temporary objects and run with a high number of users generally benefit from having generational garbage collection turned on. However, this may or may not be the case depending on how the application uses objects.
Observe the number of major garbage collection cycles that occur during the performance testing and understand the performance implications to the application during these cycles. Smaller JVM heap sizes forced by physical memory restrictions typically lead to higher rates of collection under load. Understanding the performance impact here is the key to avoiding surprises in the production environment. Contrary to popular belief, garbage collection does not cause the application server to lose requests.
Recommendation: For each set of JVM heap size settings, run one test with generational garbage collection turned on, and another with it turned off. Run each test long enough to ensure that at least a couple of garbage collection cycles are executed, so that the application's behavior can be measured. Monitor the garbage collection cycles by watching how the free and used memory is utilized by the JVM.
Servlet engine thread pool
The size of the servlet engine thread pool determines the amount of work that the JVM containing the web application can execute at any one time. With WebSphere Application Server v4.0 and later, the thread pool is defined by minimum and maximum sizes, which set the limits of the pool size. High volume applications generally have higher thread pool sizes, but factors such as application bottlenecks, usage of the synchronized keyword, and other limitations within the code can prevent efficient utilization of large thread pools.
Applications that utilize the servlet engine thread pool are typically applications with servlets and JSPs. This also includes SOAP-based applications. Application clients that talk directly to EJBs do not use servlet threads, but do use ORB threads (see next section).
Recommendation: Test the application with a variety of minimum and maximum thread pool sizes to determine which settings move as much work as possible through the application. JVM heap size settings generally are adjusted upwards for larger thread pool sizes, but start the testing with minimal memory settings. Monitor the CPU utilization of the application. Once the CPU utilization of the application approaches 80%, you are reaching the limits of the CPU, since cycles must be reserved for the base operating system itself. Remember that the servlet engine's thread pool size directly affects the MaxClients setting on the Web servers. Ensure that the Web server has enough listeners defined in order to avoid "failed to connect" errors from the load test clients. If "failed to connect" errors do occur, then the Web server's setting for the maximum number of listeners is too low and must be increased.
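Choosing among the thread pool runs can be sketched as a small selection step: keep the runs whose peak CPU stayed at or below a ceiling, then take the one with the highest throughput. This is a minimal illustration, not a prescribed method. The `PoolRun` record, the `best_pool_run` function, the treatment of 80% as a hard ceiling, and the measurements are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PoolRun:
    """One test run at a given thread pool setting (hypothetical record)."""
    min_threads: int
    max_threads: int
    requests_per_sec: float
    peak_cpu_pct: float


def best_pool_run(runs, cpu_ceiling_pct=80.0):
    """Return the highest-throughput run whose peak CPU is within the ceiling.

    Returns None when every run exceeded the ceiling.
    """
    eligible = [r for r in runs if r.peak_cpu_pct <= cpu_ceiling_pct]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.requests_per_sec)


# Hypothetical measurements from four runs.
runs = [
    PoolRun(10, 50, 31.0, 42.0),
    PoolRun(10, 100, 44.9, 61.0),
    PoolRun(10, 200, 47.2, 78.0),
    PoolRun(25, 400, 47.5, 93.0),
]
best = best_pool_run(runs)
print(best.min_threads, best.max_threads)  # 10 200
```

In this made-up data the largest pool gains almost no throughput while pushing CPU past the ceiling, which is the kind of diminishing return the tests are meant to expose.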
ORB thread pool size
EJBs that run within the container execute inside threads allocated from the ORB thread pool. Remote Method Invocation (RMI)-based and servlet-based applications that communicate with EJBs must have the ORB thread pool size configured.
Recommendation: Adjust the size of the ORB thread pool. Monitor the thread pool and EJB activity/response time in order to determine the optimal thread pool size.
Connection pool
Applications that make use of connection pooled JDBC resources need to set the size of the connection pool. Generally accepted practice limits the maximum number of connections in the pool to about 30-40. Little benefit is achieved from having more than 40 connections, and one should bear in mind that each established connection consumes memory and network resources.
Session persistence data sources generally need only 10 connections defined
for a maximum. Never have a maximum connection pool size that is greater
than the servlet engine thread count.
Recommendation: Adjust the maximum and minimum sizes of the connection pool. Monitor the servlet response time and throughput (requests/second), JDBC activity and memory utilization for best performance.
The user
It is important to identify the type and number of users for the performance
testing activity that will accurately represent the application's behavior
in production. Correlation of users to expected loads is measured differently
by organizations. Some measure utilization by number of CICS transactions
per second, as opposed to the number of users. Correlating the user load
to the appropriate metric requires understanding how the application is
utilizing backend resources, and becomes an exercise in mathematics. However
the load is measured, there should be a common understanding of the final
results and what they represent.
Number of users
The number of users defines a specific load onto an application. Since every application probably has different types of users (e.g. a user that browses a catalog vs. a shopper that purchases items), applications must be tested with a variety of scenarios that, together, resemble the average, expected work load.
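As an illustration of building such a scenario mix, the sketch below splits a total number of virtual users across user types according to assumed proportions. The function name and the 75/25 browser-to-shopper split are hypothetical; real proportions would come from knowledge of the site's actual users.

```python
def allocate_users(total_users, mix):
    """Split total_users across user types by their share of the load.

    mix: dict mapping a user type to its fraction of the load; the
    fractions are expected to sum to 1. Users left over after rounding
    down go to the types with the largest fractional remainders.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix fractions must sum to 1")
    counts = {name: int(total_users * share) for name, share in mix.items()}
    leftover = total_users - sum(counts.values())
    by_remainder = sorted(
        mix,
        key=lambda name: total_users * mix[name] - counts[name],
        reverse=True,
    )
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical mix: three browsers for every shopper.
print(allocate_users(100, {"browser": 0.75, "shopper": 0.25}))
# {'browser': 75, 'shopper': 25}
```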
Some applications perform better at high user loads than others. An application suffering from bottlenecks or excessive synchronization typically exhibits poor response times and low CPU utilization.
The single user load test typically establishes the application's baseline
performance. As mentioned previously, if the application performs
poorly or breaks at the single user load level, it is not useful to continue
performance testing the application at higher load levels. Likewise, it
is recommended that testing be halted if the application performs poorly
at the 10-user level.
Once the CPU utilization on the application server reaches the 100% mark, any increase in the number of users only results in poorer response times. This should be measured and recorded, as it clearly defines a limitation for the application. This also assists in capacity planning, and in determining the number of application clones that will be required within the clustered environment to support load expectations.
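A first-pass estimate of the number of clones can be sketched as a simple division, rounded up. This assumes that capacity scales linearly with each added clone, which backend resource limits may prevent in practice, so treat the result as a starting point to be confirmed by testing the clustered configuration. The function name, the optional spare clones for failover, and the numbers are hypothetical.

```python
import math


def clones_needed(expected_users, users_at_saturation, spare_clones=0):
    """Estimate clones required to carry expected_users.

    users_at_saturation: measured user load at which one clone
    saturates its CPU. spare_clones: extra clones held for failover
    (an assumption of this sketch, not a measured value).
    """
    if users_at_saturation <= 0:
        raise ValueError("users_at_saturation must be positive")
    return math.ceil(expected_users / users_at_saturation) + spare_clones


# Hypothetical: one clone saturates at 400 users, 1500 users expected.
print(clones_needed(1500, 400))      # 4
print(clones_needed(1500, 400, 1))   # 5
```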
Type of user
Typically an application can have several types of users. For instance,
a Web site selling items has at least two types of users: a browser and
a shopper. One type of user interacts with the site but does not purchase
anything. The other user is executing additional functionality, such as
shopping and credit card verification. Not all users are shoppers, but
all users are at least browsers. Defining the type of user participating
in the test is key to understanding the performance of a site. The better
the prior knowledge of the cross-section of users using the site, the more
realistic the performance test will be. Likewise, understanding
the flow of pages typically used helps as well. This requires that the
person developing the load test scripts understands the application, types
of users, and what those users normally do.
Recommendation: Test each application through all of the previous protocol recommendations
at the following user load levels: 1, 10, 100, 500, 800, 1500, etc. (as
appropriate based on realistic load expectations). Monitor the application's
response time, throughput (requests/second), CPU utilization, JVM memory
utilization, thread utilization, and JDBC backend utilization, throughput
and response times.
CPU utilization
Recording the CPU utilization of the application server machines is necessary in order to understand the impact of the applied load on the applications under test. The goal is to drive a server to the highest CPU utilization possible; it is not cost effective for expensive servers to run at only 20% CPU utilization. In the distributed environment, CPU utilization must be balanced against expected peak load, hardware failure and defined Quality of Service requirements.
Figure 5. CPU and Response Time

The chart in Figure 5 shows CPU utilization in blue and response time in red. Once CPU utilization has reached saturation, increased load only increases response time. Measurements must be compared to the predefined application expectations to determine at which load levels acceptable response times are possible. This may be at significantly lower CPU utilization if the application has bottlenecks or other issues.
Measure the user load level at which an application saturates the server and document it. These numbers are useful in the capacity planning exercise.
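Both figures of interest, the load level at which the server saturates and the highest load level that still meets the response time expectation, can be read off the recorded measurements. The sketch below assumes one (users, peak CPU %, response time) tuple per load level; the function names, the 95% saturation threshold, the 1-second response time target, and the data are all hypothetical.

```python
def saturation_level(measurements, cpu_saturated_pct=95.0):
    """Return the lowest user load at which peak CPU reached saturation.

    measurements: iterable of (users, peak_cpu_pct, response_time_sec).
    Returns None if no tested level saturated the CPU.
    """
    for users, cpu, _response in sorted(measurements):
        if cpu >= cpu_saturated_pct:
            return users
    return None


def highest_acceptable_load(measurements, max_response_sec):
    """Return the highest user load before response time first exceeds the limit."""
    best = None
    for users, _cpu, response in sorted(measurements):
        if response > max_response_sec:
            break
        best = users
    return best


# Hypothetical results at the load levels suggested in this article.
results = [
    (1, 3.0, 0.20), (10, 12.0, 0.22), (100, 45.0, 0.45),
    (500, 88.0, 0.90), (800, 99.0, 2.40), (1500, 100.0, 6.10),
]
print(saturation_level(results))              # 800
print(highest_acceptable_load(results, 1.0))  # 500
```

In this made-up data the acceptable limit (500 users) sits below the saturation point (800 users), so the response time expectation, not the CPU, sets the usable capacity.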
Recording and Evaluating Results
The execution of the various tests based on the above protocol recommendations involves a corresponding record keeping activity. Once the data is recorded, the analysis can then be conducted. Record keeping is the final important step in the performance test activity.
Table 1. Sample spreadsheet collecting results from a specific run based on the protocol recommendations
| Setting or measurement | Value |
| --- | --- |
| Date / Time | August 6, 2002 - 12:03:54 |
| Number of Users | 100 |
| Test Duration (minutes) | 10 |
| Web Server Settings | |
| MaxClients | 600 |
| TTL (requests total per process) | 10,000 |
| WebSphere Application Server Settings | |
| JVM Heap Size Min | 256 |
| JVM Heap Size Max | 512 |
| Generational Garbage Collection | On -XX settings |
| Servlet Engine Thread Pool Size Minimum | 10 |
| Servlet Engine Thread Pool Size Maximum | 200 |
| ORB Thread Pool Size Minimum | 10 |
| ORB Thread Pool Size Maximum | 200 |
| Web Server Measurements | |
| CPU Utilization (measured peak) | 40-45 % |
| Average Throughput | 0.45 sec |
| Requests per second | 44.85 |
| Load Client Measurements | |
| Requests per second | 44.85 |
| Requests Total | 7,348 |
| Requests Completed Successfully | 7,342 |
| Requests Timed out | 4 |
| Requests Failed to Connect | 2 |
| Page #1 response time | 0.35 sec |
| Page #2 response time | 0.54 sec |
| Page #3 response time | 1.33 sec |
| WebSphere Application Server Measurements | |
| Servlet Response Time | 0.28 sec |
| Number of Garbage Collection Cycles | 2 |
| CPU Utilization (measured peak) | 33% |
| JDBC Number Requests per second - Queries | 125 |
| JDBC Response Time - Queries | 37 ms |
| JDBC Number Requests per second - Insert | 35 |
| JDBC Response Time - Insert | 125 ms |
| JDBC Number Requests per second - Update | 98 |
| JDBC Response Time - Update | 222 ms |
A set of tests based on the protocol recommendations presented in this
article results in data such as that depicted in Table 1, where the various
elements of the configuration and the measured results are recorded. This
is done for each individual series of tests. The set of collected data
is dependent upon the backend resources utilized by the application. Applications
that do not directly access JDBC resources would obviously not collect
JDBC timings. Basic timings and frequency data should be collected for
backend resources.
Once the results are collected and compiled, they can be turned into
a variety of charts to illustrate the performance of the application. It
is possible that collecting and compiling the results may involve some
manual manipulation of the data, due to the fact that no single tool should
be expected to collect all the data required.
Figure 6. Requests completed by number of users
Figure 7. Response Time by page and number of users

Figures 6 and 7 show two possible charts that could be created after combining the test results into a single spreadsheet. The results can be juxtaposed against each other in a graphical format allowing others to analyze the results. Text explaining any measured or observed anomalies should be provided, especially when performance results either dramatically improve or degrade.
(Keep in mind that the manual maintenance of spreadsheets or databases is prone to human error. A review of the data should be made after it is recorded in order to identify possible mistakes or misrepresentations.)
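Part of that review can be automated with simple consistency checks on each recorded run. The sketch below, with hypothetical function names, verifies that the request outcomes add up to the recorded total and computes the success rate; the input values are the Load Client Measurements from Table 1.

```python
def check_request_counts(total, completed, timed_out, failed_to_connect):
    """Return a list of problems found in one run's request counts."""
    problems = []
    if min(total, completed, timed_out, failed_to_connect) < 0:
        problems.append("negative count recorded")
    accounted = completed + timed_out + failed_to_connect
    if accounted != total:
        problems.append(f"outcomes sum to {accounted}, but total is {total}")
    return problems


def success_rate(total, completed):
    """Fraction of requests completed successfully (0.0 for an empty run)."""
    return completed / total if total else 0.0


# Values from the Load Client Measurements rows of Table 1.
print(check_request_counts(7348, 7342, 4, 2))     # []
print(round(success_rate(7348, 7342) * 100, 2))   # 99.92
```

An empty list means the counts agree; any entry in the list points to a row that should be rechecked against the raw tool output.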
Analysis of the data involves identifying the following application characteristics to determine various optimal values:
- Servlet response time and client response time
- Network bandwidth
- CPU utilization of the application and Web servers
- Backend resource utilization
- Networked components utilization.
Once the optimal settings are determined, a consideration must be made as to whether the application should run on its own application server, or on an application server along with other applications. The number of application servers to be run on the same physical machine must also be considered, since there will likely be physical resource limitations. Finally, the clustered environment must be evaluated for not only physical limitations on the application server machines, but also on the backend resources utilized by the applications. All of these factors play into the performance and capacity planning stages of the production environment in order to accomplish optimal performance.
Frequency of performance testing
One common misconception is that performance testing is a one time effort. It can be a one time effort, if you never change the application code or the machine configurations. More often than not, though, application code changes or more powerful machine configurations are introduced into the production environment, or both. Therefore, any time a change occurs in the application code or in the machine configuration, performance testing must be re-executed in order to determine the new parameters for optimal performance.
Conclusion
Performance testing protocol is a comprehensive mix of definitions, understanding, and execution. Testing early in the development cycle and as often as possible is the best way to determine performance issues prior to the release of an application into the production environment. Teams dedicated to performance testing provide consistent testing and alleviate the burden of performance testing on individual development teams. Tools that provide insight to the JVM and application characteristics are usually well worth their cost in providing quick problem determination, and in keeping the performance test activities as quick and efficient as possible.
About the author
Alexandre Polozoff is a Software Services for WebSphere consultant engaged in the development of performance practices and techniques for high-volume and large-scale installations. His expertise includes third-party tool evaluations and best practices for performing post-mortem analysis. Alexandre also continues to be involved in open technology standards, such as SNMP, TMN, and CMIP. He can be reached at polozoff@us.ibm.com.