Skip to main content

skip to main content

developerWorks  >  WebSphere  >

Determining How Many Copies of a Message Flow Should Run Within WebSphere Business Integration Message Broker V5

developerWorks
Document options

Document options requiring JavaScript are not displayed


Learn and share!

Exchange know-how with your peers -- try our new Pass It Along beta app


Rate this page

Help us improve this content


Level: Intermediate

Tim Dunn (dunnt@uk.ibm.com), WebSphere MQ Integrator Performance Team, IBM

19 Nov 2003

Understanding the processing profile of each message flow is necessary to determine how many copies of a message flow you need to run to achieve a particular message rate. This article explains how a message flow runs in an execution group and helps you understand the processing characteristics of your own message flows, so that you run sufficient copies.

Introduction

Once message flow coding and test are complete, you are in a position to configure the production environment. A key question at this point is how many copies of each of the message flows should be run. Run too few and the required message throughput rate will not be met. Run too many and the system could become flooded causing an unnecessary overhead as the operating system attempts to manage many concurrent units of work. If the execution groups have very large memory requirements, this can lead to excessive paging that in the most extreme cases can render the machine useless as the operating system attempts to manage the competing needs of the different message flows running in their respective execution groups. This is clearly not a good position to be in and can easily be avoided by understanding the processing characteristics of the message flow.

This article helps you avoid this situation by providing an understanding of the principle of how a message flow runs in an execution group and helps you understand the processing characteristics of your own message flows.

In a complex environment, there will be many message flows running within a single WebSphere® Business Integration Message Broker. In such a case, it is ideal to understand the characteristics of all message flows. Where this is not feasible or practical it should be possible to divide the message flows into a limited number of categories where each message flow in the category has broadly similar processing characteristics. The number of copies of the message flow should be allocated on the characteristics of the category. To do this, you will need to study in detail at least one flow per category.

This article assumes that you are familiar with the major components and configuration of WebSphere Business Integration Message Broker V5. It also assumes that the ESQL code and node usage within the message flows are optimized for performance.



Back to top


Use of operating system threads by a message flow

A WebSphere Business Integration Message Broker message flow can only process messages when running within the WebSphere Business Integration Message Broker run time environment.  Each message flow must be assigned and successfully deployed to one or more execution groups.  The message flow is then able to process any messages appearing on the input queue(s) associated with the MQInput nodes. Each copy of the message flow runs on an operating system thread in the Windows or UNIX environment.  On z/OS, a UNIX Systems Services (USS) thread is a Task Control Block (TCB).  The execution group is implemented as an operating system process in the Windows and UNIX environment.  On z/OS, a USS process is an Address Space.

Operating system limits mean that it is not possible to run more than around 500 threads per process (execution group) in the Windows or UNIX environments.  With z/OS the limit is 255 TCBs per address space (execution group).  This places an upper limit on the number of message flows that can be assigned to one execution group.  Once we allow for a number of WebSphere Business Integration Message Broker management threads, the practical working limits of the number of threads per process or TCBs per Address Space is further reduced. Recommended working limits are 256 threads per process for Windows and UNIX and 230 TCBs per Address Space on z/OS.  It is unusual for these limits to be a concern in practice since it would be unwise to assign all copies of message flow or all message flows to a single execution group.  You are advised to use more than one execution group to provide a level of resilience in the event of execution group failure.

A single copy of a message flow may use more than one thread depending on how it is written.  A message flow requires one operating system thread per MQInput node (or any other input node).  A  message flow with three MQInput nodes will require three operating system threads to execute even if only a single instance of the message is specified.  In this case, it would mean that it is not possible to run more than 85 copies (256 thread working limit divided by 3 per message flow) of such a flow in in one execution group in a Windows environment.

Any Additional Instances that are specified for a given message flow within an execution group are shared among all of the MQInput nodes in the message flow.  It is not possible to guarantee that a thread is allocated to a particular MQInput node.  The only way to do this is to ensure that the message flow contains a single MQInput node.  Where multiple instances of a message flow are in use the distribution of threads across the multiple MQInput nodes will be dependent on the number of messages to be processed on the WebSphere MQ queue from which the MQInput nodes read.  MQInput nodes reading from those input queues with the most messages on them will obtain most threads.  This distribution is not fixed for the life of the WebSphere Business Integration Message Broker.  After a short period with no messages on an input queue the additional threads will be returned to the pool.  Each MQInput node will still have one thread allocated.



Back to top


How to run multiple copies of a message flow

There are two recommended mechanisms that let you run multiple copies of a message flow within a WebSphere Business Integration Message Broker.  These are the use of the Additional Instances attribute and the assignment of a message flow to more than one execution group.  Both of these actions must be performed using the Message brokers Toolkit for WebSphere Studio in WebSphere Business Integration Message Broker V5 or the Control Center in WebSphere MQ Integration V2.

Additional Instances

Once a message flow has been successfully assigned to an execution, there is by default a single copy of the message flow running.  It is possible to increase the number of copies of that flow by increasing the value of the Additional Instances parameter.  This specifies how many more copies of the message flow should be run in that particular execution group.  The technique for doing this varies between WebSphere MQSeries Integration V2.1 and WebSphere Business Integration Message Broker V5.

WebSphere MQSeries Integration V2.1

Once you've successfully assigned a message flow to an execution group:

  1. Expand the Domain Hierarchy in the Operations pane of the Control Center.
  2. Right-click on the required message flow within the execution group and select the properties option. A window displays the properties of the message flow in that execution group. See Figure 1 below.
  3. Specify the number of additional copies of the message that you would like to run in the Additional Instances box.  The value you enter applies to that message flow in that execution group only.  It is possible to run the same or a different number of instances for that message flow in another execution group.


Figure 1. Changing the number of instances of a message flow in the control center.

Displays the changing the number of instances of a                message flow in the control center.

WebSphere Business Integration Message Broker V5

The number of instances for a message flow has to be specified in at the message flow level in the relevant Broker Archive (BAR). To do this:

  1. Select the BAR in the Broker Administration Navigator Perspective in the Toolkit to change the number of instances for a particular message flow.
  2. Select the Configure tab, required message flow, and specify the number of instances required in the Additional Instances box
  3. Make changes to any other message flows in the same BAR file as required (See Figure 2 below).
  4. Deploy the BAR to the broker to make the changes effective.

Figure 2. Changing the number of instances of a message flow in the Message Broker Toolkit.

Displays the changing the number of instances of a message flow in the Message Broker Toolkit.

With the use of Additional Instances, separation between different copies of the message flow is provided by the operating system thread level protection on which the message flow is running.  Should an execution group be stopped or fail, all message flow running in the execution group will stop processing messages. For this reason, it is wise to consider assigning copies of the message flow to more than one execution group.

Multiple execution groups

In this approach, one copy of a message flow is assigned to one execution group within a given WebSphere Business Integration Message Broker.  When many copies of a message flow are required, many execution groups are also required.  The memory and processing cost of an additional execution group is larger than that for an within an existing execution, and for this reason you may decide that this approach is not suitable because of the additional resources required.

When a message flow is allocated to more than one execution group, it is still possible to use the Additional Instances attribute to increase the number of copies of the message flow running in a particular execution group.  Where multiple copies of a message flow are required, it is most sensible to use a combination of the Additional Instances and Multiple Execution Group approaches.

Message sequencing

If message sequence must be maintained within a message flow, then all instances of the message flow must be assigned to the same execution group.  Message sequence can only be coordinated across instances of a message flow if all the instances are in the same execution group.  If instances of a message flow are spread across more than one execution group, message sequence cannot be guaranteed.  Message sequencing is controlled by the Order Mode property on the Advanced Properties tab of the MQInput node.  There are three possible values:

Default
Messages are retrieved in the order defined by the queue attributes, but this order is not guaranteed as the messages are processed by the message flow.
User Id
Messages that have the same UserIdentifier in the MQMD are retrieved and processed in the order defined by the queue attributes, and this order is guaranteed to be retained when the messages are processed. Therefore, a message associated with a particular UserIdentifier that is being processed by one thread will be completely processed before the same thread, or another thread, can start to process another message with the same UserIdentifier. No other ordering is guaranteed to be preserved.
Queue order
Messages are retrieved and processed by this node in the order defined by the queue attributes, and this order is guaranteed to be retained when the messages are processed. This behavior is identical to the behavior exhibited if the Additional Instances property of the message flow is set to 0.

Dependening on the strictness of the sequencing, it may be possible to run multiple instances of a message.  For example, if it is only necessary to maintain message order with a particular userid then messages for more than one one userid may be processed at a time.

Where processing does have to serialized through a single instance of a message flow it is recommended to minimize the amount of processing in such a flow and to pass messages onto a subsequent message flow which can be run with multiple copies at the earliest point possible.



Back to top


Determining the message throughput rate for a message flow

Before you can decide how many copies of a message flow are needed in total, you have to know the message throughput rate that is achievable with one copy of the message flow, otherwise setting the number of copies is pure guess work.  It is not possible to determine the throughput by inspecting the message flow or counting the number of nodes; for example, you have to run some tests.

This testing does not have to be complex or take a long period of time.  The key requirement is to conduct the measurements in a realistic environment, that is one as close to the production environment as possible, using the same type and mix of data, the same hardware and the same operating system.  Measuring in an environment that is noticeably different from that which is to be used for production may generate misleading results and lead to a poorly configured production environment.

The environment used for testing should ideally be isolated so that the results that you obtain reflect only the processing of the message flow and not other work being processed. Obtaining reliable and repeatable measurements on a system which is busy with other work is extremely difficult and should be avoided.

It is suggested that you measure message throughput over a period of 5 minutes or more.  Measuring for periods of 30 seconds or less does not yield realistic results as data for the first few messages may be buffered.  This is probably unrealistic in the normal course of processing where there will typically be a continual throughput of messages.

The sequence of events in conducting such testing is typically as follows:

  1. Initialize the environment, including starting WebSphere Business Integration Message Broker with a single copy of the message flow being profiled.
  2. Use a program to load messages on to the WebSphere MQ input queue.  The best approach is to continually add messages onto the input queue as messages are consumed by the message flow.  Sometimes messages are preloaded onto the input queue.  This is not as effective and may skew the results.
  3. Determine the rate at which messages are written to the output queue(s).  This can be done by writing a simple program to repeatedly issue an MQGET for messages on the output queue and record the rate at which messages are read.  The program should report the rate every 30 seconds, for example.  An alternative approach is to keep all of the messages on the output queue and determine the time interval between the first and last message.  The message rate is then given by the result of (Put Time of Last Message – Put Time of First Message) / Number of Messages on the queue.  You need to process a reasonable number of messages. This approach is not effective for a small number of messages and remember that the smaller the number of messages the less accurate the calculation.  To  get valid Put timestamps on the output messages, you need to change the Message Context attribute in the advanced properties folder of the MQOutput node to a value of Default.

When running such tests, it is important to ensure that there are always messages on the input queue, otherwise the rate reported will not be realistic.

The tools provided in SupportPac IH03, WebSphere MQSeries Integration Message display, Test and Performance Utilities available at http://www.ibm.com/software/integration/support/supportpacs/individual/ih03.html are particularly useful for such testing.

The output of this testing should be a message rate that is achievable with each of the message flows. The next stage is to compare the rate at which messages can be processed with the message flow against the required message rate.

The rate at which message flows need to process messages often varies considerably.  Some message flow may only be used at month end for example and then only need to process hundreds or low thousands of messages over a day or two.  Other message flows may have to process messages every day and must be capable of processing tens, hundreds, or even thousands of messages per second.

Those message flows with low throughput requirements probably need no further study or planning if  a small number of instances of the message flow will be able to process the low volume of messages in sufficient time.

For those message flows with a message throughput requirement that is significantly greater than the rate which was achieved with one copy of the message flow, more involved analysis is required to make optimal use of the resources that you have. Without this you could easily over allocate resources, such as additional brokers and machines.

The analysis required is not complex and can be extremely useful in understanding the characteristics of your message flow and system as a whole.  What you need to determine is whether the message flow is CPU or I/O bound.  You also need to determine whether by tuning the I/O configuration greater message throughput is possible with a single copy of the message flow.  Once you know this, it will then be possible to finally determine how many copies of the message flow and how much processing power is required to achieve the required message rate.  In this next section, we examine how to determine whether your message flow is CPU or I/O bound.



Back to top


Determining whether a message flow is CPU bound or I/O bound

Processors now typically operate at Giga Hertz clock speeds.  Disks operate at millisecond speed.  Given this difference in magnitude, it is clearly much better to be limited by the speed of a processor than by a disk.  Processing which is CPU bound is generally able to run faster through the use of processors that run at a greater speed.

A message flow is CPU bound if one copy of it it is able to fully utilize one processor of a machine.  When this happens, message throughput can be increased by the use of a faster processor.  Message flows that exhibit this characteristic tend to process non persistent WebSphere MQ messages and have little or no database I/O.  Alternatively, they have may process persistent messages but have significant parsing and manipulation of the message.

A message flow is I/O bound if it is not able to fully utilize a processor despite an adequate availability of CPU and messages to be processed.  This manifests itself as a low CPU utilization.  Processing is restricted because the message flow is waiting for one or more I/O operations to complete.  This could involve WebSphere MQ logging or database processing.  For example, a database log write (which is synchronous) to commit a database Insert/Delete/Update operation by the message flow will result in a delay while the log I/O completes.  Dependening on the amount of other activity to the log and the speed of the disk on which the log is located, the log request may take up to 10 milliseconds to complete.  During this time the thread on which the message flow is running will be blocked and unable to process any more messages. To reduce such delays to a minimum, it is necessary to reduce the latency of such requests by locating logs on disks with low latency, such as those with a nonvolatile fast write cache.  Such disks are capable of processing a write request in 1 millisecond or less. Solid state disks are an alternative low latency device. Where it is not possible to reduce the latency of the disks you need to increase the level of concurrency within the system, that is to have more copies of the message flow running together so that the CPU processing for one  instance of the message flow can be overlapped with the I/O processing of another. This will significantly reduce the amount of time the CPU is idle waiting for an I/O to complete.

A danger of running with I/O bound message flows is to assume that no increase in message throughput is easily possible and that another WebSphere Business Integration Message Broker must be configured (possibly on the same or an additional machine) and used to increase message throughput.  This addition of resources may not be required with suitable tuning.  At this point, it is important to understand in more detail what is happening in the machine on which the WebSphere Business Integration Message Broker is running.  To do this you need to use a tool that is able to report CPU and I/O utilization.  The tools vary by platform but all achieve a similar goal.

Suitable tools are:

  • The Windows Task Manager or Windows Performance Monitor on Windows 2000 and Windows NT (see the article, Custom Performance Analysis using the Microsoft Performance Data Helper for details on how the Windows Performance Monitor may be used to measure system performance).
  • iostat and vmstat on UNIX systems.
  • Resource Measurement Facility (RMF) or Spool Display and Search Facility (SDSF) on z/OS.

Use the tool appropriate to the platform on which the WebSphere® Business Integration Message Broker is running.  The aim of the testing is to determine how much of one CPU a single copy of the message flow is able to use while processing messages. When the message flow is the only active work in the system, the system CPU utilization will tell you how much CPU is being consumed by the message flow. When there is other work active in the system, you will need to subtract the sum of that work to determine the amount of CPU consumed by the message flow.  Alternatively, there may be CPU consumption reporting at the process or address space level.

To complete the analysis, you need to understand how many CPU processors there are in the system on which the measurements are taking place. This is so that you know what CPU utilization represents one processors worth (and so one message flows worth) of processing.   On a four processor machine, each processor provides 25 percent of the available CPU.  On an eight processor machine, each processor provides 12.5 percent of the available CPU.

If the execution group is responsible for a system CPU utilization of 25 percent on a four-processor system, the message flow is fully utilizing one processor and so is CPU bound.  This is the maximum rate that the single message flow will be able to process messages on that machine.  A single copy of a message flow will not be able to use more than one processor (assuming messages are only placed on one input queue). Additional copies of the message flow would be able to use additional processors.

If the message flow was only responsible for a system, CPU utilization of 15 percent on a four-processor system,  for example, then it is very likely that message processing is I/O bound (assuming a plentiful supply of messages on the input queue).  The 15 percent utilization is significantly below the 25 percent that could be expected.  At this point, you need to examine I/O activity since this is the most likely reason for low CPU utilization. Look for disks with poor response times or high utilizations.

The most likely sources of I/O delays are:

  • The queue manager log when processing persistent messages.
    The solution to this is to increase the buffer sizes and log extent size of the queue manager log.  Also, consider using low latency disks on which the log is located such as one with a nonvolatile write cache.  This can significantly improve the rate at which persistent messages can be processed.
  • The database log when there is database Insert/Delete/Update activity within the message flow.
    The same considerations apply as for the queue manager log.
  • Poor response time from an application that has been called synchronously.
    This may be on the same or a different machine. The solution to this problem is to investigate the behavior of the other application and make whatever performance improvements are possible.  This may include running multiple copies of the application or increasing the hardware resources of the machine on which the application. runs.                                                
  • Poor database buffer tuning.
    If the message flow reads data from a very large table in a database a large amount of I/O may have to be performed by the database to retrieve the required row.  The solution to this is to tune the database.

At this point, you should know whether a particular message flow is CPU bound or I/O bound.  If it is I/O bound, you should have taken the necessary steps to reduce the extent of the I/O contention.  It will not always be possible to change an I/O bound message flow into a CPU bound one but in many cases it is possible to reduce the extent to which it is I/O bound.  Reducing the I/O delays will lead to an increase in message throughput so there is every incentive to go through this exercise.

Once you have determined the rate that is achievable with a single copy of the message flow, start testing with an increasing number of copies.  An initial estimate of the number of copies required for a CPU bound message flow is given by the result of (required message rate / the rate possible with one copy of the message flow). This is also likely to be the number of processors required as each copy of the message flow is capable of fully utilizing one processor. Where the number of processors required exceeds the capacity of one machine, you will need to plan for a larger machine or use multiple machines to achieve the target message rate.

If the message flow is I/O bound, it is likely that you will need to run more copies than in the CPU bound case. Start with the same number of copies of the message flow as for the CPU bound case but expect to have to increase the number of copies required.  How many more copies are required will depend on the extent to which the message flow is I/O bound.  Continue to increase the number of copies and monitor both CPU utilization and message rate until the target message rate is achieved or the processors are fully utilized.  If the target message rate has not been achieved, you will need to plan for a larger machine or use multiple machines to achieve the target message rate.  If during this testing CPU consumption rises without an increase in message rate, further investigation for bottlenecks and tuning may be necessary to maximize the message rate.  Bear in mind that it may not be possible to totally eliminate I/O delays.

At this point, you should have determined how many copies of the message flow are required to achieve the target message rate.  This may exceed the capacity of one machine dependent on the complexity of the message flow and the speed of the processors. The number of copies required for CPU bound and I/O bound message flows is likely to be noticeably different.



Back to top


Configuration recommendations

Here are some configuration recommendations:

  • More than one instance of a message flow is required.
    Consider allocating it over more than one execution group to increase availability in the event of execution group failure.  WebSphere Business Integration Message Broker provides automatic restart of failed execution groups anyway but spreading instances of a message flow across more than one execution group further extends the availability and will minimize any outage.
  • Message flow has large virtual memory requirements.
    Consider restricting all instances of such message flows to a small number of execution groups to minimize the demand for virtual storage.  Virtual memory sizes for an execution group of 1GB are not unknown with very large messages and complex message flows. You do not want 10 execution groups each running a copy of such a message flow or there will be a requirement for at least 10GB of virtual memory.  By restricting the allocation of the instances to two execution groups, virtual memory requirements can be reduced to approximately 2GB.

    Use the Additional Instances mechanism to increase the number of copies of a message flow that are running in preference to allocating one copy of the message flow to each of many execution groups. It is easier to manage a small number of execution groups in the WebSphere MQSeries Integration V2.1 Control Center or WebSphere Business Integration Message Broker V5 Toolkit.
  • Message flow is run as part of a sequence.
    Ensure that the sequence as a whole is balanced and that all message flows in the sequence are capable of processing messages at the same rate.  As an illustration, suppose we have two message flows, one which formats a request message called Request_Flow and a second message flow called Reply_Flow which processes the reply message.  If the processing in Request_Flow is twice as complex as that in Reply_Flow it is likely that twice as many instances of Request_Flow compared with Reply_flow will be needed to achieve a system which is balanced.  You should evaluate separately the number of instances of each message flow that are required to achieve the target message rate.


Back to top


Conclusion

This article has hopefully shown you the importance of understanding the processing profile of each message flow.  You need to understand this to determine how many copies of a message flow to run in order to achieve a particular message rate.  The behavior of a particular type of message flow should be consistent given the same type of data is processed within it but different message flows will have different processing profiles.  Some will be CPU bound while others are I/O bound.  It is important to optimize message flows before starting any tuning or final configuration work.  Fewer copies of a message flow that is CPU bound will be required to achieve a particular message rate than one which is I/O bound.



About the author

Author: Tim Dunn

Tim Dunn is a Senior Performance Specialist with the WebSphere MQ Integrator performance team in IBM Hursley. Tim works with development in evaluating new releases of WebSphere MQ Integrator and with leading customers to provide consultancy on design, configuration, and tuning issues relating to WebSphere MQ Integrator. Tim has presented on WebSphere MQ Integrator performance in the United States and Europe. He has also authored a number of articles on improving the efficiency of a WebSphere MQ Integrator implementation.




Rate this page


Please take a moment to complete this form to help us better serve you.



 


 


Not
useful
Extremely
useful
 


Share this....

digg Digg this story del.icio.us del.icio.us Slashdot Slashdot it!



Back to top