
Comment lines: Jason McGee: Dynamic middleware and the six attributes of virtualized application serving environments

Level: Introductory

Jason McGee (jrmcgee@us.ibm.com), Distinguished Engineer, WebSphere XD Chief Architect, IBM WebSphere Extended Deployment

20 Sep 2006

Applying virtualization and automation is one way to significantly ease the burden of managing a modern, complex application server environment made up of many applications spread across a large number of machines. When evaluating virtualization for such environments, it is important to look for technologies and products that address the realities of your IT environment head on. Here are six key attributes you should look for in a solution to make sure it really addresses the complexities of your environment.

From the IBM WebSphere Developer Technical Journal.

Introduction

Let's face it: Managing production application server environments is hard. Modern environments are often composed of many applications spread across a large number of machines. These applications have to handle an unpredictable load, they have to be available all the time, and they are constantly changing. Plus, there is constant pressure to do it for less money and with fewer people. So how do you manage this complexity? There are many solutions, but one promising approach is through the use of virtualization and automation. If the middleware could be smarter and could understand your goals, then the systems could be both cheaper and easier to manage.

There are a number of products and technologies that claim to provide these benefits. How do you know which are the good ones? Below I have outlined six attributes that I feel a solution must have to really address the complexities of modern application server environments. All of these attributes are present in IBM's WebSphere® Extended Deployment product, which I will use as the example throughout.





1. Awareness of the environment

The first thing that a virtualized environment must have is situational awareness. In order to manage the application servers, the virtualization system must understand three critical elements:

  • Topology includes information about both the servers and the applications. For servers, the system needs to know all of the application servers, their current status (started or stopped), and their communication ports (host and port number). For applications, the system needs to know which applications are installed, on which server or servers they reside, whether the application is currently running or not, and which endpoints they expose. An endpoint could be a URL, a Remote Object over IIOP, or a message driven bean. It is also critical that the topology information is live, meaning that it is updated as the system changes. Changes could be administrative actions, such as installing a new application, or failures, such as a machine dying. A proper virtualization system will learn as much of this topology as possible on its own without requiring the administrator to define and maintain this information separately. In WebSphere Extended Deployment, a subsystem called On-Demand Configuration (ODC) is responsible for discovering, maintaining, and distributing this topology information around the system. ODC is a live state model, showing the topology of the current running system.

  • Load and capacity involves the virtualization system understanding how much computing power is available in a cluster of machines that it is managing, and how much of that power is in use at any moment. This information is critical to decisions that the virtualization system will make later. A proper virtualization system must be able to compare the computing power of a heterogeneous collection of machines (Intel mixed with Power5 mixed with UltraSparc, and so on). The system must also be able to determine the current percentage of that computing power that is in use at any moment and must be able to continuously distribute that information. In WebSphere Extended Deployment, the NodeDetect subsystem is responsible for determining and distributing load and capacity information. NodeDetect computes a node speed rating for each machine by looking at the number of CPUs, CPU speed in MHz, and CPU architecture, and applying a scaling factor to normalize the different CPU architectures. NodeDetect also monitors in-use CPU load, memory utilization, and other metrics to help determine how much of the machine is in use at a snapshot in time.

  • Demand is the final element of which the virtualization system must have awareness. Demand is a measure of the requested load on the system, that is, how much traffic is being sent into the system. At a minimum, a virtualization system must know how many requests are being sent into the system. Assuming all requests cause equal load on the servers, this incoming rate enables the system to compute the computing power needed to handle the demand. Of course, in real life, not all requests are equal. Some requests are computationally expensive and others are cheap. So a good virtualization system has an ability to measure the cost of each different type of request in terms of the computing power required to process the request. In WebSphere Extended Deployment, the On Demand Router (ODR), which I will discuss shortly, tracks the incoming demand in terms of arrival rate. WebSphere Extended Deployment also has a subsystem called the Work Profiler that is responsible for automatically profiling applications to determine the average per-request computing cost of different types of requests in the system. This gives WebSphere Extended Deployment a very accurate view of the demand on the system.

The combination of topology, load/capacity, and demand gives a virtualization system, such as WebSphere Extended Deployment, a clear and complete picture of what is happening within a collection of machines. However, before a virtualization system can use this information effectively, the system needs to understand how the administrator wants the systems to be managed.
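
To make the load/capacity and demand calculations concrete, here is a minimal sketch in Python. It is not NodeDetect or Work Profiler code: the class, the function names, and especially the architecture scaling factors are assumptions invented for illustration, since the real values are not given here.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical scaling factors for normalizing CPU architectures.
# These numbers are placeholders, not values used by any real product.
ARCH_SCALE: Dict[str, float] = {"intel": 1.0, "power5": 1.5, "ultrasparc": 0.8}


@dataclass
class Node:
    name: str
    cpus: int
    mhz: float
    arch: str
    cpu_utilization: float  # fraction of the node in use, 0.0 to 1.0

    def speed_rating(self) -> float:
        """Normalized computing power: CPUs x clock speed x architecture factor."""
        return self.cpus * self.mhz * ARCH_SCALE[self.arch]

    def free_capacity(self) -> float:
        """Computing power not in use at this snapshot in time."""
        return self.speed_rating() * (1.0 - self.cpu_utilization)


def required_power(arrival_rates: Dict[str, float],
                   cost_per_request: Dict[str, float]) -> float:
    """Demand: sum over request types of arrival rate x average per-request cost."""
    return sum(rate * cost_per_request[kind] for kind, rate in arrival_rates.items())
```

Comparing the total of free_capacity() across a pool with required_power() for the incoming traffic is the kind of check a virtualization system can run continuously once it has all three elements of awareness.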





2. Goal orientation

The second attribute that must be present in a virtualized application server environment is some notion of goal orientation. With virtualization, one of the main benefits is the ability of the virtualization system to make changes automatically for the administrator in response to current conditions. But to do that, the system must have some criteria by which to make decisions. In a proper virtualization system, those criteria are expressed as policy. Policy capabilities vary widely, depending on the functionality of the virtualization system. In WebSphere Extended Deployment, there are two primary policy types:

  • The first is a performance management policy called service policy, which is a type of service level agreement (SLA). Service policies are composed of a set of classification rules, a performance goal, and an importance level. The classification rules are predicate statements based on elements of the incoming request, such as the URL, headers, cookies, client IP, and so on. The performance goal is usually something like an average response time goal for all requests that match the classification rules. The importance level is a relative ranking amongst different workloads. If all workloads are able to meet their defined performance goal, then importance level has no effect. However, when demand exceeds capacity, the importance level is used to make trade-offs.

  • The second policy type is a health policy. Health policies define common problems that the administrator wants to deal with. I will discuss this topic in more detail later.
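
As an illustration only, a service policy with its three parts (classification rules, a performance goal, and an importance level) could be modeled as in the following Python sketch. The class, the request dictionary shape, and the convention that 1 is the most important level are all assumptions made for this example, not the product's actual policy format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical request shape, for example {"url": "/checkout", "client_ip": "10.1.4.7"}
Request = Dict[str, str]


@dataclass
class ServicePolicy:
    name: str
    rules: List[Callable[[Request], bool]]  # classification predicates
    avg_response_goal_ms: float             # performance goal
    importance: int                         # assumed convention: 1 = most important

    def matches(self, request: Request) -> bool:
        """A request belongs to this policy if any classification rule accepts it."""
        return any(rule(request) for rule in self.rules)


def classify(request: Request, policies: List[ServicePolicy]) -> Optional[ServicePolicy]:
    """Return the first policy whose rules match the request, or None."""
    for policy in policies:
        if policy.matches(request):
            return policy
    return None
```

With policies listed from most specific to least specific, a catch-all policy at the end of the list acts as the default workload class.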

A final critical comment on goal orientation is that a proper virtualization system will enable goals to be defined in terms of end user characteristics. For example, it is better to define a Web application's policy in terms of response time to the end user instead of maximum CPU utilization threshold. Systems like WebSphere Extended Deployment provide end user goals.





3. Managing demand vs. capacity

Once we have situational awareness and a set of goals to manage against, the real work (and value) of virtualization can begin. The purpose of a virtualization system is to enable the goals to be met in the face of current, real world conditions. To do this, the system must first be able to exert control. There are many different points of control in a virtualization system, from controlling traffic flow, to controlling cluster sizes and machine allocation, to controlling CPUs and memory allocated to processes. All have their place and often differ in how quickly they can be applied. In WebSphere Extended Deployment, the two primary points of control are traffic control and cluster size control:

  • Traffic control is provided by the On Demand Router (ODR). The ODR is a Java™-based proxy server. It is placed in front of the application servers and is responsible for routing all traffic into a set of application servers. In this important spot in the network, the ODR is capable of controlling how resources are used on the back-end servers. To meet the defined goals, the ODR provides two key capabilities:

    • Dynamic workload management (DWLM) is able to route and balance traffic across a cluster of servers using load awareness information. This awareness allows DWLM to balance the load in the most efficient way possible, leading to better throughput and response times.

    • Autonomic request flow manager (ARFM) provides flow control for incoming workload. This enables the ODR to differentiate workloads in order to meet the defined goals. So if a low importance workload is consuming too many resources, ARFM can slow down (queue) the low importance work and speed up higher importance work to change the mix of workload on the back-end application servers.

    These traffic control capabilities provide a way to meet goals with fast responsiveness, adapting quickly to changes in the situational environment.

  • Traffic control is, however, not sufficient. If there is not enough CPU assigned to a given application, managing the incoming traffic will not let you meet the assigned goal. To remedy this, WebSphere Extended Deployment has a capability called the Application Placement Controller (APC). The APC's job is to control the size of the cluster hosting a given application based on the demand on the system. So as demand increases, the amount of CPU allocated to a given application can increase, and vice versa. However, APC does not deal with applications in isolation. It must decide for all applications running on the same collection of machines how best to divide up the resources of the pool amongst all of the applications so that they all meet their goals. This is critical. Because APC looks at the holistic problem, it can enable a pool of hardware to host a larger than typical collection of applications, since each application can claim a large portion of the pool as needed based on real demand.
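
The two traffic control ideas can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the ODR's actual algorithms: pick_server stands in for load-aware routing by choosing the server with the most free capacity, and ImportanceQueue stands in for flow control by releasing queued requests in importance order. All names are hypothetical, and lower numbers are assumed to mean higher importance.

```python
import heapq
import itertools
from typing import Dict, List


def pick_server(free_capacity: Dict[str, float]) -> str:
    """Load-aware routing: send the request to the server with the most free capacity."""
    return max(free_capacity, key=free_capacity.get)


class ImportanceQueue:
    """Flow control: hold requests and release the most important ones first."""

    def __init__(self) -> None:
        self._heap: List[tuple] = []
        self._order = itertools.count()  # keeps arrival order within one importance level

    def enqueue(self, request: str, importance: int) -> None:
        heapq.heappush(self._heap, (importance, next(self._order), request))

    def dispatch(self, slots: int) -> List[str]:
        """Release up to `slots` requests; lower-importance work stays queued."""
        released = []
        while self._heap and len(released) < slots:
            released.append(heapq.heappop(self._heap)[2])
        return released

    def __len__(self) -> int:
        return len(self._heap)
```

When the back-end servers have room for only two more requests, dispatch(2) lets the two most important queued requests through and leaves the rest waiting, which is the change in workload mix described above.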

The ability to manage demand enables a virtualization system to eliminate a number of complex tasks from the administrator. Capacity planning becomes less important. The virtualization system can decide where applications should run, how big the clusters should be, and how to route to those applications. As things change, the virtualization system can adapt automatically. These are all issues the administrator no longer has to deal with manually.
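
To show why dividing a shared pool is a whole-system decision, here is a rough Python sketch of one possible strategy: grant each application the computing power it currently demands, in importance order, until the pool runs out. This greedy approach is an assumption chosen for brevity. It is not the Application Placement Controller's algorithm, which has to solve a richer placement problem.

```python
from typing import Dict, List, Tuple


def allocate(pool_capacity: float,
             demands: List[Tuple[str, int, float]]) -> Dict[str, float]:
    """Divide a shared pool among applications.

    Each demand is (application, importance, required_power), where a lower
    importance number means a more important application (assumed convention).
    """
    remaining = pool_capacity
    allocation: Dict[str, float] = {}
    for app, _importance, needed in sorted(demands, key=lambda d: d[1]):
        granted = min(needed, remaining)  # never hand out more than is left
        allocation[app] = granted
        remaining -= granted
    return allocation
```

When total demand fits in the pool, every application gets what it needs and importance has no effect. Only when demand exceeds capacity does the less important application receive a reduced share.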





4. Handling failure

Of course, no system and no application is perfect. Problems happen. Applications have memory leaks. Servers hang. A proper virtualization system will expect these problems and deal with them appropriately. A proper virtualization system will also mitigate the impact of these problems in the production system. Mitigation means that the end user is not aware that problems are occurring, giving the administrators more time to resolve the problems. In WebSphere Extended Deployment, there is a health management system called HMM. HMM enables the administrator to define health policies to monitor for common problems and take some mitigating action when they occur. The HMM system can monitor for things such as memory leaks in the Java Virtual Machine (JVM), hung or non-responsive servers, requests that are timing out, storm drains, excessive service policy violations, and other conditions. When one of these conditions is detected, HMM can notify the administrator, capture some diagnostic information for later debugging (such as a JVM heap dump or thread dump), and restart the server to remove the bad server from the production environment. That server restart is intelligent, ensuring that the application is never taken off-line because of the restart and predicting failure in advance, so the restart can be triggered before requests start producing errors.
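
A health policy can be thought of as a condition paired with a list of mitigating actions. The Python sketch below illustrates that idea for a suspected memory leak. The detection rule (heap usage that rises across several consecutive samples and passes a threshold) and all of the names are assumptions for illustration, and are much simpler than what a real health management system would use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HealthPolicy:
    name: str
    heap_limit_fraction: float  # act once heap usage reaches this share of the maximum
    actions: List[str]          # for example notify, capture diagnostics, restart


def suspect_leak(heap_samples: List[float], window: int = 5) -> bool:
    """True if the last `window` heap samples rise without ever dropping."""
    recent = heap_samples[-window:]
    return len(recent) == window and all(a < b for a, b in zip(recent, recent[1:]))


def evaluate(policy: HealthPolicy, heap_samples: List[float], heap_max: float) -> List[str]:
    """Return the actions to take, or an empty list if the server looks healthy."""
    if suspect_leak(heap_samples) and heap_samples[-1] / heap_max >= policy.heap_limit_fraction:
        return policy.actions
    return []
```

Setting the threshold below 100 percent of the heap is what lets the restart happen before requests start to fail.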

When combined with performance management policies, this health management system can yield a very robust environment that continues to meet the administrator's goals in the face of both real demand and failures, all without manual intervention. This is a win-win situation: better resiliency with less work.





5. Planning for change

Once a system is in production and running smoothly, handling the changing loads and recovering from failure, it would be nice to just let things be. But that is not reality. Change is constant. Applications are upgraded, and software maintenance must be applied. A proper virtualization system helps manage these situations as well.

Software maintenance includes upgrades to both the operating system and the application server software on a given machine. As an administrator, you need the ability to take a given machine cleanly out of production so you can apply those changes. Then you need to be able to put the machine back in service. The virtualization system should orchestrate this for you. WebSphere Extended Deployment accomplishes this through a capability called Node Maintenance Mode. When you place a node (or machine) in maintenance mode, WebSphere Extended Deployment will cleanly drain all work from the machine, allowing in-flight requests to complete while blocking all new requests from using that machine. WebSphere Extended Deployment will also ensure that all servers assigned to that machine are moved to other machines, ensuring that the goals for a given application continue to be met on the now smaller set of hardware resources. This may involve a complex rebalancing of workload, but it will be handled automatically. Once the machine is removed from production, maintenance can be applied by the administrator. When completed, the machine can be removed from maintenance mode and be made available again in the pool of resources. WebSphere Extended Deployment can then move applications and workload back to the machine as needed. The beauty of this system is that the administrator does not have to move applications, change routing tables, stop workload, or do anything manual. The administrator simply declares that a given machine can no longer be used and the WebSphere Extended Deployment virtualization system adapts automatically.
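
The draining behavior described above can be sketched as follows. This Python example is a toy model under stated assumptions, not the product's implementation: the Machine class and the helper functions are hypothetical, and routing simply picks the eligible machine with the fewest in-flight requests.

```python
from typing import List


class Machine:
    def __init__(self, name: str) -> None:
        self.name = name
        self.in_maintenance = False
        self.in_flight = 0  # requests currently being processed


def route(machines: List[Machine]) -> Machine:
    """Send a new request only to machines that are not in maintenance mode."""
    targets = [m for m in machines if not m.in_maintenance]
    if not targets:
        raise RuntimeError("no machines available outside maintenance mode")
    chosen = min(targets, key=lambda m: m.in_flight)
    chosen.in_flight += 1
    return chosen


def complete(machine: Machine) -> None:
    """Record that one in-flight request on this machine has finished."""
    machine.in_flight -= 1


def is_drained(machine: Machine) -> bool:
    """A machine is safe to service once it accepts no new work and has none left."""
    return machine.in_maintenance and machine.in_flight == 0
```

Setting in_maintenance to True blocks new requests immediately, while requests already in flight are allowed to complete before is_drained reports that maintenance can begin.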

Application code upgrades are a little more complex. The challenge is to make these upgrades transparent to the end user. Some of that challenge must be met through careful programming and making compatible upgrades to the design of an application. A proper virtualization system will help you meet the other part of the challenge, deploying the new application code into production. In WebSphere Extended Deployment, two major types of application version changes are supported:

  • Quick replace. The goal here is to replace one version of an application with another in as short a time period as possible while keeping the application continuously available. In WebSphere Extended Deployment, this is accomplished through an automatic coordination of the roll out of the application code with the flow of traffic from the ODR tier. By coordinating these two activities, WebSphere Extended Deployment can ensure that code is replaced one server at a time while traffic is routed around the server being changed.

  • Coexistence. Here, the goal is to have two versions of an application running side by side for an extended period of time. This is useful for things like running pilots. For this to work, the incoming workload must be split somehow to the two destinations. In WebSphere Extended Deployment, this is accomplished through a routing policy on the ODR. This routing policy enables the administrator to define how to route the traffic and which version of the application to send the traffic to. For example, the administrator might route the traffic based on the client's IP address, letting users from one location go to version 1.0 of an application and all other users go to version 2.0 of the application. This routing control enables great flexibility for the administrator to make changes in the infrastructure without that change showing through to the end user.
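
The client IP example above could be expressed as a small routing rule table. The following Python sketch uses the standard ipaddress module. The network range, the rule format, and the function name are made up for illustration and do not reflect the ODR's actual routing policy syntax.

```python
import ipaddress

# Hypothetical rule table: clients in this network are sent to version 1.0.
ROUTING_RULES = [
    (ipaddress.ip_network("10.1.0.0/16"), "1.0"),
]
DEFAULT_VERSION = "2.0"  # all other clients


def version_for(client_ip: str) -> str:
    """Pick the application version that should serve a given client address."""
    addr = ipaddress.ip_address(client_ip)
    for network, version in ROUTING_RULES:
        if addr in network:
            return version
    return DEFAULT_VERSION
```

Because the decision is made at the routing tier, moving a location from one version to the other is a change to the rule table rather than a change to the application or its clients.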





6. People: trust, proof, and money

The final attribute of a proper virtualization system is the recognition that people play a pivotal role in the success of these systems. The virtualization system must address this role directly. There are three major issues at stake related to the people using these systems: trust, proof, and money:

  1. Let's face it: few people trust that software is going to do what it says. That is why people test everything. So it stands to reason that a new, fancy virtualization system that automatically makes decisions and makes changes for an administrator -- while powerful -- would initially not be considered trustworthy. It is the responsibility of that virtualization system to earn trust. In WebSphere Extended Deployment, this is addressed through a mode of operation called supervised mode. In supervised mode, before any changes are made to the system in reaction to changing input conditions, WebSphere Extended Deployment notifies an administrator of the pending change, describes the reasons behind the change, describes in detail what changes it desires to make, and then asks for approval. If approved, the changes will be made. Otherwise, they are canceled. This system is in place to build trust. The administrator can see all changes being made, can control whether they are made or not, and can see the reasons behind the changes. Over time, the administrator will come to realize that WebSphere Extended Deployment is consistently making sound recommendations and the administrator can then move the system into the more desirable automatic mode, after the trust has been earned.

  2. Closely related to trust is proof. A proper virtualization system must provide proof that it is making good decisions, even if those decisions are made and executed automatically. It must provide an audit trail of the changes it has made, and it must be able to prove that it is meeting the goals that are defined. In WebSphere Extended Deployment, this is accomplished through an audit and performance data logging infrastructure that lets the system record all of the changes it is making and enables the state of the system over time to be recorded. This information is invaluable for the administrator to understand how the system is behaving over time and why it is making those decisions.

  3. In the end, it all comes down to money. Is the system less expensive to run? Is it easier to manage, requiring fewer people? Is it more resilient, resulting in lower opportunity cost? And how do I pass that cost on to the application owners? These are important questions. Some are proved in the way the system operates on a daily basis. But one big change is the last question: how do I pass on the cost? Many IT departments do internal charge-back to the application owners for the cost of hosting their applications. Today, that is often done in fixed units. In a proper virtualization system, that billing can be done in a more fine-grained manner, by actual usage of resources. This is especially critical since most virtualization systems, including WebSphere Extended Deployment, are built on the concept of shared resources, making fixed unit charge-back inaccurate. In WebSphere Extended Deployment, we solve this requirement through the ability to capture fine-grained, usage-based metrics on system utilization by application or service policy. We also enable consumption of those metrics into a formal billing and metering system, such as IBM Tivoli® Usage and Accounting Manager. This base capability enables a complete transformation of how IT recovers cost from its application owners.
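
As a simple illustration of usage-based charge-back, the following Python sketch splits the cost of a shared pool in proportion to each application's measured consumption. The function name and the choice of a purely proportional split are assumptions. A real billing system would apply its own rates and rules to the captured metrics.

```python
from typing import Dict


def charge_back(usage: Dict[str, float], total_cost: float) -> Dict[str, float]:
    """Split the cost of a shared pool in proportion to measured usage.

    `usage` maps each application to the resources it consumed in the
    billing period (for example, CPU-seconds).
    """
    total = sum(usage.values())
    if total == 0:
        # Nothing was consumed, so nothing is charged.
        return {app: 0.0 for app in usage}
    return {app: total_cost * used / total for app, used in usage.items()}
```

Under fixed-unit billing, two applications sharing a pool would each pay for half of it. With the split above, an application that used three quarters of the capacity pays three quarters of the cost.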





Conclusion

As you can see, providing a virtualized infrastructure is complex. There are many facets to be considered. It is possible to address only part of the requirements in a given product, but by doing so, an administrator is left with the task of filling in the gaps. When evaluating virtualization for application server environments, it is important to look for technologies and products that address the realities of your IT environment head on. IBM WebSphere Extended Deployment is such a system.






About the author


Jason McGee is a Distinguished Engineer and Chief Architect for WebSphere Extended Deployment (XD). Previously, Jason was the Chief Architect of the Base and Network Deployment versions of WebSphere Application Server. He is a senior architect on the WebSphere Foundation Architecture Board and an associate member of the IBM Software Group Architecture Board. Jason serves as the Director of WebSphere Advanced Technologies, responsible for the productization of new technologies into the WebSphere platform. Jason joined IBM in 1997 and has been a member of the WebSphere Application Server product since its inception. He helped to define the concepts of Servlets and JavaServer Pages (JSP) for processing Web presentation logic on WebSphere. Jason was responsible for the design and implementation of the Web Container in WebSphere Application Server. Mr. McGee has been heavily involved in leading the architecture for key parts of the WebSphere Application Server, including the server runtime and the XML-based systems management architecture. Jason graduated with a B.S. degree in computer engineering from Virginia Tech in 1995.



