Level: Intermediate
Sean A. Walberg (sean@ertw.com), Senior Network Engineer
31 Mar 2007
Applications using the LAMP (Linux®, Apache, MySQL, PHP/Perl)
architecture are constantly being developed and deployed.
But often the server administrator has little control over the application itself
because it's written by someone else. This series of three articles discusses many
of the server configuration items that can make or break an application's
performance. This first article covers the LAMP architecture, some measurement
techniques, and some basic Linux kernel, disk, and file system tweaks. Successive articles investigate tuning the
Apache, MySQL, and PHP components.
Linux, Apache, MySQL, and PHP (or Perl) are the foundation of many Web
applications, from to-do lists to blogs to e-commerce sites. WordPress and Pligg
are but two common software packages powering high-volume Web sites. This
architecture has come to be known simply as LAMP. Almost every distribution of
Linux includes Apache, MySQL, PHP, and Perl, so installing the LAMP software is
almost as easy as saying it.
This ease of installation gives the impression that the software runs itself,
which is simply not true. Eventually the load on the application outgrows the
settings that come bundled with the back-end servers and application performance
suffers. LAMP installations require constant monitoring, tuning, and evaluation.
Tuning a system has different meanings to different people. This series of
articles focuses on tuning the LAMP components -- Linux, Apache, MySQL, and PHP.
Tuning the application itself is yet another complex matter. There is a symbiotic
relationship between the application and the back-end servers: a poorly tuned
server causes even the best application to fail under load, and there's only so
much tuning one can do to a server before a badly written application slows to a
crawl. Fortunately, proper system tuning and monitoring can point to problems in
the application.
The LAMP architecture
The first step in tuning any system is understanding how it works. At the
simplest level, a LAMP-based application is written in a scripting language such
as PHP that runs as part of the Apache Web server that is running on a Linux host.
The PHP application takes information from the client through the requested URL,
any form data, and whatever session information has been captured to determine
what it is supposed to do. If needed, the server pulls information from a MySQL
database (also running on Linux), combines the information with some Hypertext
Markup Language (HTML) templates, and returns it to the client. This process
repeats itself as the user navigates the application and also occurs in parallel
as multiple people access the system. The flow of data is not one way, however,
because the database may be updated with information from the user in the form of
session data, statistics collection (including voting), and user-submitted content
such as comments or site updates. In addition to the dynamic elements, there are
also static elements such as images, JavaScript code, and Cascading Style Sheets
(CSS).
Variations on LAMP
LAMP started out as strictly Linux, Apache, MySQL, and PHP or Perl. It is not
uncommon, however, to run Apache, MySQL, and PHP on Microsoft®
Windows® if Linux isn't your strength. Then again, you can always swap out
Apache for something like lighttpd, and you still have a LAMP-style system,
albeit one with an unpronounceable acronym. Or you may prefer a different open
source database such as PostgreSQL or SQLite, a commercial database such as
IBM® DB2®, or even a commercial but free engine like IBM DB2
Express-C.
This article focuses on the traditional LAMP architecture because it's the one
I see most often in my travels, and its components are all open source.
After looking at the flow of requests through the LAMP system, you can begin to
see the points where slowdowns might occur. The database provides much of the
dynamic information, so the client notices any delay in responding to queries. The
Web server must be able to execute the scripts quickly and also handle multiple
concurrent requests. Finally, the underlying operating system must be in good
health to support the applications. In setups that share files between
servers over the network, the file sharing itself can also become a bottleneck.
Measuring performance
Constant measurement of performance helps in two ways. The first is that
measurement helps you spot trends, both good and bad. As a simple example, by
watching central processing unit (CPU) usage on a Web server, you can see when it
is overloaded. Similarly, watching the total bandwidth used in the past and
extrapolating to the future helps you determine when network upgrades are needed.
These measurements are best correlated with other measurements and observations.
For example, you might determine that when users complain of application slowness,
the disks happen to be operating at maximum capacity.
The second use of performance measurements is to determine if tuning has helped
the situation or made it worse. You do this by comparing measurements before and
after the change is made. For this to be effective, though, only one item should
be changed at a time, and the proper metric should be compared to determine the
effect of the change. The reason for changing only one thing at a time should be
obvious. After all, it is quite possible that two simultaneous changes could
counteract each other. The reason for the metrics statement is more subtle.
It is crucial that the metrics you choose to watch reflect the experience of the users of the
application. If the goal of a change is to reduce the memory footprint of the
database, eliminating various buffers will certainly help, at the expense of query
speed and application performance. Instead, one of the metrics should be
application response time, which opens up tuning possibilities other than just the
database's memory usage.
You can measure application response time in many ways. Perhaps the easiest is
with the curl command shown in Listing 1.
Listing 1. Using cURL to measure the response time of a Web site
$ curl -o /dev/null -s -w '%{time_connect}:%{time_starttransfer}:%{time_total}\n' \
    http://www.canada.com
0.081:0.272:0.779
Listing 1 shows the curl command being used to look up
a popular news site. The output, which would normally be the HTML code, is sent to
/dev/null with the -o
parameter, and -s turns off any status information. The
-w parameter tells curl to
write out some status information such as the timers described in Table 1:
Table 1. Timers used by curl
| Timer | Description |
|---|---|
| time_connect | The time it takes to establish the TCP connection to the server |
| time_starttransfer | The time it takes for the Web server to return the first byte of data after the request is issued |
| time_total | The time it takes to complete the request |
Each of these timers is relative to the start of the transaction, even before the
Domain Name Service (DNS) lookup. Thus, after the request was issued, it took
0.272 - 0.081 = 0.191 seconds for the Web server to process the request and start
sending back data. The client spent 0.779 - 0.272 = 0.507 seconds downloading the
data from the server.
By watching curl data and trending it over time, you
get a good idea of how responsive the site is to users.
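The subtraction above is easy to automate when you collect many samples. The following is a minimal Python sketch, assuming the colon-separated output format from Listing 1; the function name curl_phases and the phase labels are illustrative, not part of curl. Because the timers are relative to the start of the transaction, the connect figure here also includes the DNS lookup.

```python
def curl_phases(timing_line):
    """Split 'time_connect:time_starttransfer:time_total' into per-phase durations.

    The phase names are hypothetical labels chosen for this sketch.
    """
    connect, start_transfer, total = (
        float(field) for field in timing_line.strip().split(":")
    )
    return {
        # Includes DNS lookup, since curl's timers start at the beginning
        "connect": round(connect, 3),
        # Time the server spent before sending the first byte
        "server_wait": round(start_transfer - connect, 3),
        # Time spent receiving the rest of the response
        "download": round(total - start_transfer, 3),
    }


print(curl_phases("0.081:0.272:0.779"))
```

Feeding each sample through a function like this and storing the three values separately makes it easier to tell whether a slowdown comes from the network, the server, or the size of the response.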
Of course, a Web site is more than just a single page. It has images, JavaScript
code, CSS, and cookies to deal with. curl is good at
getting the response time for a single element, but sometimes you need to see how
fast the whole page loads.
The Tamper Data extension for Firefox (see the Resources
section for a link) logs all the requests made by the Web browser and displays the
time each took to download. To use the extension, select Tools > Tamper
Data to open the Ongoing requests window. Load the page in question, and
you'll see the status of each request made by the browser along with the time the
element took to load. Figure 1 shows the results of loading the developerWorks
home page.
Figure 1. Breakdown of requests used to load the developerWorks home page
Each line describes the loading of one element. Various data are displayed, such
as the time the request started, how long it took to load, the size, and the
results. The Duration column lists the time the element itself took to load, while
the Total Duration column shows how long all the sub elements took. In Figure 1,
the main page took 516 milliseconds (ms) to load, but it was 5101 ms before
everything was loaded and the entire page could be displayed.
Another helpful mode of the Tamper Data extension is to graph the output of the
page load data. Right-click anywhere in the top half of the Ongoing requests
window and select Graph all. Figure 2 shows a graphical view of the data
from Figure 1.
Figure 2. A graphical view of requests used to load the developerWorks home page
In Figure 2, the duration of each request is displayed in dark blue and is shown
relative to the start of the page load. Thus, you can see which requests are
slowing down the whole page load.
Despite the focus on page loading times and user experience, it is important not
to lose sight of the core system metrics such as disk, memory, CPU, and network. A
wealth of utilities are available to capture this information; perhaps the most
helpful are sar, vmstat, and
iostat. See the Resources
section for more information about these tools.
Basic system tweaks
Before you tune the Apache, PHP, and MySQL components of your system, you should
take some time to make sure that the underlying Linux components are operating
properly. It goes without saying that you've already stripped down your list of
running services to only those that you need. In addition to being a good security
practice, doing so saves you both memory and CPU cycles.
Some quick kernel tuning
Most Linux distributions ship with buffers and other Transmission Control
Protocol (TCP) parameters conservatively defined. You should change these
parameters to allocate more memory to enhancing network performance. Kernel
parameters are set through the proc interface by
reading and writing to values in /proc. Fortunately,
the sysctl program manages these in a somewhat easier
fashion by reading values from /etc/sysctl.conf and
populating /proc as necessary. Listing 2 shows some
more aggressive network settings that should be used on Internet servers.
Listing 2. /etc/sysctl.conf showing more aggressive network settings
# Use TCP syncookies when needed
net.ipv4.tcp_syncookies = 1
# Enable TCP window scaling
net.ipv4.tcp_window_scaling = 1
# Increase TCP max buffer size
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Increase Linux autotuning TCP buffer limits
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Increase number of ports available
net.ipv4.ip_local_port_range = 1024 65000
Add these lines to whatever is already in
/etc/sysctl.conf. The first setting enables TCP SYN
cookies. When a new TCP connection comes in from a client by means of a packet
with the SYN bit set, the server creates an entry for the half-open connection and
responds with a SYN-ACK packet. In normal operation, the remote client responds
with an ACK packet that moves the half-open connection to fully open. An attack
called the SYN flood ensures that the ACK packet never returns so that the
server runs out of room to process incoming connections. The SYN cookie feature
recognizes this condition and starts using an elegant method that preserves space
in the queue (see the Resources section for full
details). Most systems have this enabled by default, but it's worth making sure
this one is configured.
Enabling TCP window scaling allows clients to download data at a higher rate. TCP
allows for multiple packets to be sent without an acknowledgment from the remote
side, up to 64 kilobytes (KB) by default, which can be filled when talking to
higher latency peers. Window scaling enables some extra bits to be used in the
header to increase this window size.
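To see why latency matters here, note that a sender can have at most one window of data in flight per round trip. The Python sketch below works through that arithmetic for an unscaled window; the function name and the round-trip times are assumptions chosen for illustration, not measurements.

```python
def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on one TCP connection's throughput in bits per second.

    At most one full window can be unacknowledged per round trip.
    """
    return window_bytes * 8 / rtt_seconds


# Largest window the 16-bit TCP header field can express without scaling
UNSCALED_WINDOW = 65535

# Illustrative round-trip times: LAN, regional, and long-haul peers
for rtt_ms in (1, 20, 100):
    mbps = max_throughput_bps(UNSCALED_WINDOW, rtt_ms / 1000) / 1e6
    print(f"RTT {rtt_ms:>3} ms: at most {mbps:.1f} Mbit/s per connection")
```

The ceiling falls in proportion to the round-trip time, which is why distant clients benefit most from window scaling and the larger buffers in Listing 2.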
The next four configuration items increase the TCP send and receive buffers. This
allows the application to get rid of its data faster so it can serve another
request, and it also improves the remote client's ability to send data when the
server gets busier.
The final configuration item widens the range of local ports available for
outgoing connections. This raises the number of connections the server can open at
a time, for example from the Web server to the database.
These settings become effective at next boot or the next time
sysctl -p /etc/sysctl.conf is run.
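The relationship between /etc/sysctl.conf and /proc can be sketched in a few lines of Python. This is only an illustration of the naming convention, not a replacement for sysctl: the helper names are hypothetical, and the sketch assumes simple keys in which every dot separates a path component.

```python
def parse_sysctl_conf(text):
    """Return {key: value} for each 'key = value' line, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # sysctl.conf treats lines starting with '#' or ';' as comments
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def proc_path(key):
    """Map a sysctl key to its file under /proc/sys (dots become slashes)."""
    return "/proc/sys/" + key.replace(".", "/")


sample = """\
# Increase TCP max buffer size
net.core.rmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
"""

for key, value in parse_sysctl_conf(sample).items():
    print(proc_path(key), "<-", value)
```

Reading the file that proc_path returns on a live system is a quick way to confirm that a setting from Listing 2 actually took effect.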
Configure disks for maximum performance
Disks play a vital role in the LAMP architecture. Static files, templates, and
code are served from disk, as are the data tables and indexes that make up the
database. Much of the tuning to follow, especially that pertaining to the
database, focuses on avoiding disk access because of the relatively high latency
it incurs. Therefore, it makes sense to spend some time optimizing the disk
hardware.
The first order of business is to ensure that atime
logging is disabled on file systems. The atime is the
last access time of a file, and each time a file is accessed, the underlying file
system must record this timestamp. Because atime is
rarely used by systems administrators, disabling it frees up some disk time. This
is accomplished by adding the noatime option in the
fourth column of /etc/fstab. Listing 3 shows an example
configuration.
Listing 3. A sample fstab showing how to enable noatime
/dev/VolGroup00/LogVol00 / ext3 defaults,noatime 1 1
LABEL=/boot /boot ext3 defaults,noatime 1 2
devpts /dev/pts devpts gid=5,mode=620 0 0
tmpfs /dev/shm tmpfs defaults 0 0
proc /proc proc defaults 0 0
sysfs /sys sysfs defaults 0 0
LABEL=SWAP-hdb2 swap swap defaults 0 0
LABEL=SWAP-hda3 swap swap defaults 0 0
Only the ext3 file systems have been modified in Listing 3 because
noatime is helpful only for file systems that reside on
a disk. A reboot is not necessary to effect this change; you only need to remount
each file system. For example, to remount the root file system, run
mount / -o remount.
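On a fleet of servers it can help to check for missing noatime options automatically. The following Python sketch scans fstab-style text and reports disk-backed mount points that lack the option. The set of file system types and the function name are assumptions for this example; adjust the set to the types you actually use.

```python
# Assumption: these are the on-disk file system types in use on the host
DISK_FS_TYPES = {"ext2", "ext3"}


def missing_noatime(fstab_text):
    """Return mount points of disk-backed file systems that lack noatime."""
    missing = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and malformed entries
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        mount_point, fs_type = fields[1], fields[2]
        options = fields[3].split(",")
        if fs_type in DISK_FS_TYPES and "noatime" not in options:
            missing.append(mount_point)
    return missing


sample = """\
/dev/VolGroup00/LogVol00 / ext3 defaults,noatime 1 1
LABEL=/boot /boot ext3 defaults 1 2
tmpfs /dev/shm tmpfs defaults 0 0
"""

print(missing_noatime(sample))
```

In practice you would pass in the contents of /etc/fstab rather than the sample string.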
A variety of disk hardware combinations are possible, and Linux doesn't always
reliably detect the optimal way to access the disks. The
hdparm command is used to get and set the methods used
to access IDE disks.
hdparm -t /path/to/device performs a speed test that
you can use as a benchmark. For the most reliable results, the system should be
idle when you run this command. Listing 4 shows a speed test being performed on
hda.
Listing 4. A speed test being performed on /dev/hda
# hdparm -t /dev/hda
/dev/hda:
Timing buffered disk reads: 182 MB in 3.02 seconds = 60.31 MB/sec
As the test shows, the disks are reading data at around 60 megabytes (MB) per
second.
Before delving into some of the disk tuning options, a warning is in order. The
wrong setting can corrupt the file system. Sometimes you get a warning that the
option isn't compatible with your hardware; sometimes you don't. For this reason,
test settings thoroughly before putting a system into production. Having standard
hardware across all your servers helps here too.
Table 2 lists some of the more common options.
Table 2. Common options for hdparm
| Option | Description |
|---|---|
| -vi | Query the drive to determine which settings it supports and which settings it is using. |
| -c | Query/enable (E)IDE 32-bit I/O support. hdparm -c 1 /dev/hda enables this. |
| -m | Query/set multiple sectors per interrupt mode. If the setting is greater than zero, up to that number of sectors can be transferred per interrupt. |
| -d 1 -X | Enable direct memory access (DMA) transfers and set the IDE transfer mode. The hdparm man page details the numbers that may go after the -X. You should need to do this only if -vi shows you're not using the fastest mode. |
Unfortunately for Fibre Channel and Small Computer Systems Interface (SCSI)
systems, tuning is dependent on the particular driver.
You must add whichever settings you find useful to your startup scripts, such as
rc.local.
Network file system tuning
The network file system (NFS) is a way to share disk volumes across the network.
NFS is helpful to ensure that every host has a copy of the same data and that
changes are reflected across all nodes. By default, though, NFS is not configured
for high-volume use.
Each client should mount the remote file system with
rsize=32768,wsize=32768,intr,noatime to ensure the
following:
- Large read/write block sizes are used (up to the specified figure, in this
case 32KB).
- NFS operations can be interrupted in case of a hang.
- The atime won't be constantly updated.
You can put these settings in the options (fourth) column of /etc/fstab, the same
column that holds noatime in Listing 3. If you use the automounter, these go in the
appropriate /etc/auto.* file.
On the server side, it is important to make sure there are enough NFS kernel
threads available to handle all your clients. By default, only one thread is
started, though Red Hat and Fedora systems start at 8. For a busy NFS server, you
should push this number higher, such as 32 or 64, to start. You can evaluate your
clients to see if there was blockage with the
nfsstat -rc command, which shows client Remote
Procedure Call (RPC) statistics. Listing 5 shows the client statistics for a Web
server.
Listing 5. Showing an NFS client's RPC statistics
# nfsstat -rc
Client rpc stats:
calls retrans authrefrsh
1465903813 0 0
The second column, retrans, is zero, showing that no
retransmissions were necessary since the last reboot. If this number is climbing,
then you should consider adding more NFS kernel threads. This is done by passing
the number of threads desired to rpc.nfsd, such as
rpc.nfsd 128 to start 128 threads. You can do this at
any time. Threads are started or destroyed as necessary. Again, this should go in
your startup scripts, preferably in the script that starts NFS on your system.
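Because the retrans counter is cumulative since boot, what matters is whether it grows between samples. The Python sketch below parses output in the format of Listing 5 and compares two samples. The function names and the second sample's figures are hypothetical, and the parser assumes the header line begins with the word calls, as it does in Listing 5.

```python
def parse_client_rpc(nfsstat_output):
    """Return the client RPC counters from 'nfsstat -rc' style output."""
    lines = [ln.split() for ln in nfsstat_output.splitlines() if ln.strip()]
    # The counters sit on the line directly below the 'calls ...' header
    for header, values in zip(lines, lines[1:]):
        if header[0] == "calls":
            return dict(zip(header, (int(v) for v in values)))
    raise ValueError("no client rpc stats found")


def retrans_delta(before, after):
    """Number of retransmissions that occurred between two samples."""
    return after["retrans"] - before["retrans"]


first = parse_client_rpc("""\
Client rpc stats:
calls retrans authrefrsh
1465903813 0 0
""")

# Hypothetical later sample, invented for illustration
second = parse_client_rpc("""\
Client rpc stats:
calls retrans authrefrsh
1465990000 12 0
""")

print(retrans_delta(first, second))
```

A delta that keeps rising from one sample to the next is the signal to consider adding NFS kernel threads on the server.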
A final note on NFS: Avoid NFSv2 if you can because performance is much worse than
in v3 and v4. This is not an issue in modern Linux distributions, but check the
output of nfsstat on the server to see if any NFSv2
calls are being made.
Looking ahead
This article covered some of the basics of LAMP and looked at some simple Linux
tuning for LAMP installations. With the exception of NFS kernel threads, you can
set and then ignore the parameters discussed in this article. The next two
articles in this series focus on Apache, MySQL, and PHP tuning. Tuning them is
quite different from tuning Linux because you need to constantly revisit the
parameters as the traffic volumes increase, the read/write distributions change,
and the application evolves.
Resources
Learn
- "Easy system monitoring with SAR" (developerWorks, February 2006) is a guide
to keeping track of key system metrics using sar.
- "Expose Web performance problems with the RRDtool" (developerWorks, March 2006) is a
tutorial by Sean that expands on the cURL technique and graphs the data for long-term
analysis.
- In "Monitoring Virtual Memory with vmstat" (Linux Journal, October 2005), you learn
how to observe paging activity on your Linux system.
- If NFS is new to you, "To Protect and Serve: Providing Data to Your Cluster" (Prentice Hall
Professional Technical Reference, February 2005) is a good introduction to both
NFS and the automounter, which helps in large-scale NFS deployments.
- "TCP and Linux' Pluggable Congestion Control Algorithms" (Linux Gazette, February
2007) discusses how to try out the many algorithms to implement TCP
congestion control that Linux supports, and, more importantly, the importance and
impact of loss and delay on your network sessions.
- TCP SYN cookies were mentioned
earlier as a defense against denial of service attacks involving SYN floods.
Wikipedia describes their implementation. It's a brilliant idea.
- In the developerWorks Linux zone,
find more resources for Linux developers.
- Stay current with developerWorks technical events and Webcasts.
Get products and technologies
- The Tamper Data extension for
Firefox allows you to view and modify HTTP headers on the fly and to graph the
loading of page elements.
- Order the SEK for Linux, a two-DVD set containing the latest IBM trial
software for Linux from DB2®, Lotus®, Rational®, Tivoli®,
and WebSphere®.
- With IBM trial software, available for download directly from developerWorks,
build your next development project on Linux.
About the author
Sean Walberg has been working with Linux and UNIX since 1994 in academic, corporate, and Internet service provider environments. He has written extensively about systems administration over the past several years.