Advantages of openMosix on IBM xSeries, Part 1

Use networked Linux systems to solve your computing challenges

Level: Introductory

Daniel Robbins (drobbins@gentoo.org), President/CEO, Gentoo Technologies, Inc. (Courtesy Intel Corporation)

10 Feb 2005

By the end of this three-part series, you'll have your own openMosix mini-cluster up and running and will be ready to use it effectively to accelerate your computing tasks. In Part 1, you get a clear and understandable introduction to the current clustering technologies available for Linux and an introduction to openMosix. In Part 2, you will get a fully functional openMosix cluster configured and running. Finally, in Part 3, you'll see some ways to use openMosix to tackle computing challenges.

Introduction

In this three-part series, I will introduce you to a clustering technology for Linux called openMosix. As you may know, clustering technologies allow two or more networked Linux systems (called "nodes") to combine their computing resources to solve computing challenges faster than would be possible if they were tackling the problem on their own. Choice of hardware is flexible of course, and openMosix is not limited to any single hardware platform, but clusters built on IBM xSeries servers running Intel® Xeon™ processors will have some unique advantages. Intel's Hyper-Threading Technology, now supported under Linux, improves the performance of multi-threaded applications by allowing a single Xeon processor to appear to the operating system as two virtual processors. By taking advantage of Hyper-Threading, you can benefit from having multiple physical and/or virtual processors and also enjoy the benefits of openMosix itself. As this series progresses, I'll guide you through the process of setting up your own openMosix cluster. By the end of the series, you'll have your own openMosix mini-cluster up and running and will be ready to use it effectively to accelerate your computing tasks.

In this first article, I'll provide you with a very clear and understandable introduction to the current clustering technologies available for Linux and then introduce you to openMosix. I'll also explain how openMosix compares to other more traditional Linux clustering technologies, as well as point out differences and similarities between openMosix clusters and multi-processor computers. By the end of this article, you'll have all the necessary background to prepare you for setting up your own openMosix cluster. As you might guess, I'll show you how to do that in part 2 of this series.





What is clustering?

In general, when people speak of "clustering," they are referring to technologies that allow multiple computers to work together to solve common computing problems. The computing problems in question can be anything from complex CPU-intensive scientific computations to a horde of miscellaneous processes with no underlying commonality.

Probably the best-known type of Linux-based cluster is the Beowulf cluster. A Beowulf cluster consists of multiple machines connected to one another on a high speed LAN. In order for these systems to be able to pool their computing resources, special cluster-enabled applications must be written using clustering libraries. The most popular clustering libraries are PVM and MPI; both are very mature and work very well. By using PVM or MPI, programmers can design applications that can span across an entire cluster's computing resources rather than being confined to the resources of a single machine. For many applications, PVM and MPI allow computing problems to be solved at a rate that scales almost linearly in relation to the number of machines in the cluster.






PVM and MPI are not for everyone…

While Beowulf clusters are extremely powerful, they aren't for everyone. The primary drawback of Beowulf clusters is that they require specially designed software (written with explicit PVM or MPI support) in order to take advantage of cluster resources. This is generally not a problem for those in the scientific and research communities who are used to writing their own special purpose applications from scratch; since they write their code in-house, they can use the PVM or MPI libraries to create cluster-aware applications.

However, those who "roll their own code" are actually a very small percentage of the computing public, all things considered. Everyone else -- all those who simply want to set up a cluster and see some kind of performance benefit using standard Linux applications -- has a very real problem. Since the applications that they use on a regular basis haven't been written to be PVM or MPI aware, they simply can't take advantage of a cluster. This is unfortunate, since it limits the use of clustering to a very small group. Wouldn't it be wonderful if there were some technology that would allow standard Linux applications to take advantage of a cluster without any need for them to be rewritten or even recompiled? Thankfully, there is -- and this technology is called openMosix.





Enter openMosix

OpenMosix adds clustering abilities to the Linux kernel that allow any standard Linux process to take advantage of a cluster's resources. By using adaptive load-balancing techniques, processes running on one node in the cluster can transparently "migrate" to another node where they can execute faster. Because openMosix is completely transparent to all running programs, the process that has been migrated doesn't even know (or need to know) that it's running on a remote system. As far as that remote process and other processes running on the original node (called the "home node") are concerned, the process is running locally.

This transparency of openMosix means that no special programming is required to take advantage of openMosix's load-balancing technology. In fact, a default openMosix installation will migrate processes to the "best" node automatically. This makes openMosix a clustering solution that can provide an immediate benefit for a wide variety of applications.





What openMosix does

The really great thing about openMosix is that it can turn a bunch of Linux machines into something like a large virtual SMP (symmetric multiprocessor) system. However, there are a few differences. First, on a "real" SMP system, two or more CPUs can exchange data very quickly; but with openMosix, the speed at which nodes can communicate with one another is determined by the speed of your LAN. Using Gigabit Ethernet or some other kind of high-bandwidth networking technology will allow you to increase the effectiveness of your openMosix cluster.

Of course, openMosix provides a number of benefits over traditional multiprocessor systems. With openMosix, you can create clusters consisting of tens or even hundreds of nodes using inexpensive PC hardware. In contrast, SMP systems that contain large numbers of processors can be prohibitively expensive, depending on your budget. For many applications, openMosix will give you more "bang for the buck" than a traditional supercomputer or mainframe. And of course, there's no reason why you can't run openMosix on a bunch of high-end multi-processor systems. As noted earlier, it can also be beneficial to combine openMosix with performance-enhancing technologies such as Intel's Hyper-Threading Technology, which is available on xSeries servers with Intel Xeon processors and is now supported under Linux. Hyper-Threading Technology improves the performance of threaded applications by allowing a single Xeon processor to appear to the operating system as two virtual processors. By taking advantage of Hyper-Threading together with SMP, you could benefit from having multiple physical and/or virtual processors and also enjoy the benefits of openMosix itself. It's even possible (and easy!) to use openMosix together with existing MPI or PVM programs in order to further optimize the performance of your cluster-aware applications.

OpenMosix, like an SMP system, cannot execute a single process on multiple physical CPUs at the same time. This means that openMosix won't be able to speed up a single process such as Mozilla, except to migrate it to a node where it can execute most efficiently. In addition, openMosix doesn't currently offer support for allowing multiple cooperating threads to be separated from one another.

In contrast, Hyper-Threading Technology allows multiple threads to execute on different logical processors at the same time. For this reason, performance-oriented users may want to consider combining Hyper-Threading Technology and openMosix in creative ways, since the technologies can be complementary. In a cluster assembled from nodes built on Xeon processors, Hyper-Threading Technology could be used to increase each node's ability to handle multiple cooperating threads that cannot be separated and distributed among openMosix nodes. Taking advantage of both technologies could result in a significant performance improvement, depending on how you plan to use your cluster.

And of course, openMosix will allow for extremely scalable parallel execution at the process level. OpenMosix can migrate most standard Linux processes between nodes with no problem. If an application forks many child processes, each of which performs work, then openMosix will be able to migrate each one of these processes to an appropriate node in the cluster. You can take advantage of this ability even if a particular application isn't designed to use multiple sub-processes that can be migrated independently. For example, if you wanted to compress 12 digital audio tracks using your cluster, you could simply start all 12 audio encoding processes simultaneously. After a few seconds, openMosix would migrate each process to an appropriate node in your cluster. If you happened to have a 12-node cluster, your audio encoding job would complete nearly 12 times faster than it would have otherwise. If the number of processes that you plan to run simultaneously is greater than the number of nodes in your cluster, multi-processing and Hyper-Threading provide options to experience additional performance gains.
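The audio encoding scenario above can be sketched in a few lines of Python. This is only an illustration, not an openMosix API: the script simply starts ordinary processes, which is all openMosix needs in order to consider them for migration. The track names and the stand-in command are placeholders I've made up, and ideal_speedup() is a rough upper bound that assumes equal-length, independent jobs and one job per node at a time.

```python
import subprocess
import sys


def run_all(commands):
    """Start every command at once, then wait for all of them.

    Each command becomes an ordinary, independent process.
    Returns the exit codes in the same order as `commands`.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


def ideal_speedup(num_jobs, num_nodes):
    """Best-case speedup for equal-length jobs, one job per node at a time.

    This is an assumption-laden upper bound: it ignores migration cost,
    network speed, and nodes with more than one (physical or virtual) CPU.
    """
    if num_jobs < 1 or num_nodes < 1:
        raise ValueError("need at least one job and one node")
    rounds = -(-num_jobs // num_nodes)  # ceiling division
    return num_jobs / rounds


if __name__ == "__main__":
    # Hypothetical file names; replace the stand-in command below with
    # the encoder invocation you actually use.
    tracks = [f"track{n:02d}.wav" for n in range(1, 13)]
    commands = [[sys.executable, "-c", "pass", track] for track in tracks]

    exit_codes = run_all(commands)
    print("jobs finished:", len(exit_codes))
    print("12 jobs on 12 nodes:", ideal_speedup(12, 12))
    print("12 jobs on 5 nodes: ", ideal_speedup(12, 5))
```

With 12 jobs on 5 nodes, the estimate drops to 4, because the jobs need three "rounds" to finish. That gap is where multi-processor nodes and Hyper-Threading can help.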





Mosix vs. openMosix

OpenMosix is a fairly recent branch of the original Mosix project, which also provides a transparent clustering solution for Linux. So, why use openMosix? Here are some good reasons. In late 2001, the Mosix project lead decided to release all new versions of Mosix (which were previously GPL code) under a non-GPL license. Actually, the new Mosix code now contains no license at all except for a copyright and an "All rights reserved" clause. Because of this sudden and unexpected change, new releases of Mosix can no longer be considered to be free software, and it is unclear what rights users have in relation to the new Mosix code. In fact, there's nothing to prevent the author of Mosix from requiring Mosix users to pay him royalty fees at a later date.

Obviously, this abrupt licensing change created a lot of concern among current Mosix users, and this concern was exacerbated by the fact that code and the Mosix mailing list archives themselves were removed from the Mosix Web site without explanation. Thankfully, these users were not the only people concerned about this sudden change. Moshe Bar, the Mosix project co-manager, didn't agree with the switch away from GPL licensing. In response to this change, Moshe started the openMosix project in order to ensure that a free version of Mosix would continue to be available to the public.

Since the inception of the openMosix project, a large number of Mosix installations have switched to openMosix -- over 350 at last count. In addition, Moshe's new, more open development style has resulted in the rapid acceleration of openMosix development. There are now 14 people actively working on openMosix, whereas only 4 are working on Mosix. In turn, a number of bug fixes and performance enhancements have been made to the openMosix code, and a good number of other new convenience, functionality, and performance-related features are in the works. Now that the dust has settled, it appears that the openMosix/Mosix split has allowed for the creation of a better, more robust, and more rapidly developed Linux clustering solution than would have existed otherwise.





What you need for openMosix

In order to set up an openMosix cluster, you'll need two or more Linux systems that are connected on a LAN. In order to run openMosix, they should be capable of compiling and running a 2.4 series kernel.

That's what's required. For maximum cluster performance, you may want to consider using the following recommended components.

At least 100Mbit (Fast) Ethernet is strongly recommended. Standard (10Mbit) Ethernet won't give you very good cluster performance, but should be fine if you just want to play around with openMosix. Gigabit Ethernet is optional, but beneficial. Gigabit Ethernet cards are also dropping in price; reliable ones can be found for as low as $130 USD. However, don't feel that you absolutely need Gigabit Ethernet; openMosix can do lots of good things with only Fast Ethernet.
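To get a feel for why the network matters, here is a back-of-the-envelope calculation in Python. It assumes, purely for illustration, that moving a process means pushing 64 MB of data across the wire and that the link delivers its full nominal rate; real migrations depend on much more than raw bandwidth, so treat the numbers as a rough lower bound rather than a measurement of openMosix.

```python
def transfer_seconds(size_mb, link_mbit_per_s, efficiency=1.0):
    """Seconds needed to move `size_mb` megabytes over a link.

    `efficiency` (0 < efficiency <= 1) is an assumed fraction of the
    nominal rate that is actually usable; 1.0 means a perfect link.
    """
    if size_mb < 0 or link_mbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, link rate, or efficiency")
    megabits = size_mb * 8  # 1 byte = 8 bits
    return megabits / (link_mbit_per_s * efficiency)


# Nominal rates of the three Ethernet flavors discussed above.
LINKS = {
    "Standard Ethernet": 10,
    "Fast Ethernet": 100,
    "Gigabit Ethernet": 1000,
}

if __name__ == "__main__":
    image_mb = 64  # illustrative assumption, not an openMosix figure
    for name, rate in LINKS.items():
        print(f"{name:18s} {transfer_seconds(image_mb, rate):7.2f} s")
```

Under these assumptions, each step up in Ethernet speed cuts the transfer time by a factor of ten, which is why a slow network quickly eats into the benefit of migrating a process at all.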

Hooking your machines' Ethernet cards up to a dedicated high-speed switch is also beneficial. By doing so, your systems will be able to communicate over Ethernet in "full duplex" mode, effectively doubling bandwidth. If you have a limited number of machines, you may want to consider using a specially-wired Ethernet cable to connect the systems to one another directly. By doing so, you'll get switch-like full-duplex performance at a potentially lower price. This trick is very helpful for 2- or 3-node clusters, since these configurations only require one or two NICs per machine, respectively.
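The NIC counts for directly cabled clusters follow from simple arithmetic: if every machine is wired to every other machine, each one needs a card per neighbor. The small Python sketch below works that out; the function name is my own, and it assumes a full mesh with one dedicated cable per pair of machines.

```python
def direct_mesh(nodes):
    """Return (NICs per machine, total cables) for a fully meshed cluster.

    Assumes every node is cabled directly to every other node, with one
    dedicated cable per pair of machines.
    """
    if nodes < 2:
        raise ValueError("a cluster needs at least two nodes")
    nics_per_machine = nodes - 1
    cables = nodes * (nodes - 1) // 2
    return nics_per_machine, cables


if __name__ == "__main__":
    for n in (2, 3, 4, 8):
        nics, cables = direct_mesh(n)
        print(f"{n} nodes: {nics} NIC(s) per machine, {cables} cable(s)")
```

Past three nodes the card and cable counts climb quickly, which is the point at which a switch becomes the more practical choice.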

A good amount of swap space is recommended. This will allow nodes to be dynamically removed from your cluster without causing the remaining nodes to run out of virtual memory. Again, this is only a recommendation, and it will only make a difference in extreme situations where you are pushing your cluster very hard.

Again, these suggestions are completely optional, and it is entirely possible to set up a cluster using two machines running Pentium family processors over a standard Ethernet network. The faster your network, the more efficiently openMosix will be able to migrate processes between nodes in your cluster. A fast network also makes openMosix a lot more fun to play with. Speaking of playing with openMosix, join me in my next article when I show you how to set up your very own openMosix cluster. We'll then be ready to have lots of fun in part three of this series, when we get to see openMosix in action.






About the author

Daniel Robbins is the President/CEO of Gentoo Technologies, Inc. Residing in Albuquerque, New Mexico, he is the creator of Gentoo Linux, an advanced Linux for the PC, and the Portage system, a next-generation ports system for Linux. He writes articles, tutorials, and tips for the developerWorks Linux zone and has also served as a Contributing Author for the Macmillan books Caldera OpenLinux Unleashed, SuSE Linux Unleashed, and Samba Unleashed. Daniel has been involved with computers in some fashion since the second grade, when he was first exposed to the Logo programming language as well as a potentially dangerous dose of Pac Man. This probably explains why he has since served as a Lead Graphic Artist at SONY Electronic Publishing/Psygnosis. Daniel enjoys spending time with his wife, Mary, and daughter, Hadassah. You can contact him at drobbins@gentoo.org.



