Level: Introductory
Daniel Robbins (drobbins@gentoo.org), President/CEO, Gentoo Technologies, Inc.
01 Sep 2001
With the 2.4 release of Linux comes a host of new filesystem possibilities, including Reiserfs, XFS, GFS, and others. These filesystems sound cool, but what exactly can they do, what are they good at, and exactly how do you go about safely using them in a production Linux environment? Daniel Robbins answers these questions by showing you how to set up these new advanced filesystems under Linux 2.4. In this installment, Daniel takes a look at tmpfs, a VM-based filesystem, and introduces you to the new possibilities available with 2.4's "bind"-mounting abilities.
In my previous articles in this series, I introduced the benefits of journalling and ReiserFS and showed how to set up a rock-solid Linux 2.4-based ReiserFS system. In this article, we're going to tackle a couple of semi-offbeat topics. First,
we'll take a look at tmpfs, also known as the virtual memory (VM) filesystem. Tmpfs
is probably the best RAM disk-like system available for Linux right now,
and is a new feature of kernel 2.4. Then, we'll take a look at another new 2.4
feature called "bind mounts", which allow a great deal of flexibility when it
comes to mounting (and remounting) filesystems. Then, in the next article, we'll
focus on devfs, and after that we'll spend some time getting intimately
familiar with the new ext3 filesystem.
Introducing tmpfs
If I had to explain tmpfs in one breath, I'd say that tmpfs is like a
ramdisk, but different. Like a ramdisk, tmpfs can use your RAM, but it
can also use your swap devices for storage. And while a traditional ramdisk is
a block device and requires a mkfs command of some kind before you can
actually use it, tmpfs is a filesystem, not a block device; you just mount it,
and it's there. All in all, this makes tmpfs the niftiest RAM-based filesystem
I've had the opportunity to meet.
tmpfs and VM
Let's take a look at some of tmpfs's more interesting properties. As I
mentioned above, tmpfs can use both RAM and swap. This might seem a bit
arbitrary at first, but remember that tmpfs is also known as the "virtual
memory filesystem". And, as you probably know, the Linux kernel's virtual
memory resources come from both your RAM and swap devices. The VM subsystem in
the kernel allocates these resources to other parts of the system and takes
care of managing these resources behind-the-scenes, often transparently moving
RAM pages to swap and vice-versa. The tmpfs filesystem requests pages from the VM subsystem to store files.
tmpfs itself doesn't know whether these pages are on swap or in RAM; it's the
VM subsystem's job to make those kinds of decisions. All the tmpfs filesystem
knows is that it is using some form of virtual memory.
Not a block device
Here's another interesting property of the tmpfs filesystem. Unlike most
"normal" filesystems, like ext3, ext2, XFS, JFS, ReiserFS and friends, tmpfs
does not exist on top of an underlying block device. Because tmpfs sits on top
of VM directly, you can create a tmpfs filesystem with a simple mount command:
# mount tmpfs /mnt/tmpfs -t tmpfs
After executing this command, you'll have a new tmpfs filesystem mounted at
/mnt/tmpfs, ready for use. Note that there's no need to run
mkfs.tmpfs; in fact, it's impossible, as no such command exists.
Immediately after the mount command, the filesystem is mounted and
available for use, and is of type tmpfs. This is very different from
how Linux ramdisks are used; standard Linux ramdisks are block devices,
so they must be formatted with a filesystem of your choice before you can use
them. In contrast, tmpfs is a filesystem. So, you can just mount it
and go.
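If you'd like to confirm that the mount took effect, one option is to look the mountpoint up in /proc/mounts. The following is only a minimal sketch: the function name is_tmpfs and its optional second argument (an alternate mounts file, handy for testing) are my own inventions and not part of any standard tool.

```shell
#!/bin/sh
# is_tmpfs MOUNTPOINT [MOUNTS_FILE]
# Succeeds if the last filesystem listed for MOUNTPOINT is of type tmpfs.
# MOUNTS_FILE defaults to /proc/mounts, whose lines have the fields
# "device mountpoint fstype options dump pass".
is_tmpfs() {
    mounts_file=${2:-/proc/mounts}
    # Keep the type of the last matching entry (assumed to be the one
    # that is currently visible at the mountpoint).
    fstype=$(awk -v mp="$1" '$2 == mp { t = $3 } END { print t }' "$mounts_file")
    [ "$fstype" = "tmpfs" ]
}
```

A typical call would be `is_tmpfs /mnt/tmpfs && echo "tmpfs is mounted"`.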
Tmpfs advantages
Dynamic filesystem size
You're probably wondering about how big that tmpfs filesystem was that
we mounted at /mnt/tmpfs, above. The answer to that question is a bit
unexpected, especially when compared to disk-based filesystems.
/mnt/tmpfs will initially have a very small capacity, but as files
are copied and created, the tmpfs filesystem driver will allocate more VM and will
dynamically increase the filesystem capacity as needed. And, as files are
removed from /mnt/tmpfs, the tmpfs filesystem driver will
dynamically shrink the size of the filesystem and free VM resources, and by
doing so return VM into circulation so that it can be used by other parts of
the system as needed. Since VM is a precious resource, you don't want anything
hogging more VM than it actually needs, and the great thing about tmpfs is that
this all happens automatically.
Speed
The other major benefit of tmpfs is its blazing speed. Because a typical tmpfs
filesystem will reside completely in RAM, reads and writes can be almost
instantaneous. Even if some swap is used, performance is still excellent and
those parts of the tmpfs filesystem will be moved to RAM as more free VM
resources become available. Having the VM subsystem automatically move parts
of the tmpfs filesystem to swap can actually be good for performance,
since by doing so, the VM subsystem can free up RAM for processes that need it.
This, along with its dynamic resizing abilities, allows for much better overall
OS performance and flexibility than the alternative of using a traditional RAM
disk.
No persistence
While this may not seem like a positive, tmpfs data is not preserved between
reboots, because virtual memory is volatile in nature. I guess you probably
figured that tmpfs was called "tmpfs" for a reason, didn't you? However, this
can actually be a good thing. It makes tmpfs an excellent filesystem for
holding data that you don't need to keep, such as temporary files (those found
in /tmp) and parts of the /var filesystem tree.
Using tmpfs
To use tmpfs, all you need is a 2.4 series kernel with "Virtual memory file
system support (former shm fs)" enabled; this option lives under the "File
systems" section of the kernel configuration options. Once you have a
tmpfs-enabled kernel, you can go ahead and mount tmpfs filesystems. In fact,
it's a good idea to enable tmpfs in all your 2.4 kernels, whether you plan
to use tmpfs or not. This is because you need to have kernel tmpfs support in
order to use POSIX shared memory. System V shared memory will
work without tmpfs in your kernel, however. Note that you do not need a
tmpfs filesystem to be mounted for POSIX shared memory to work; you simply need
the support in your kernel. POSIX shared memory isn't used too much right now,
but this situation will likely change as time goes on.
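To check whether a running kernel already has tmpfs support, you can look for it in /proc/filesystems. This is a rough sketch under a few assumptions: the helper name kernel_supports_fs is hypothetical, the optional second argument exists only so that the function can be tried against a sample file, and a filesystem built as a module that is not currently loaded will not show up in the list.

```shell
#!/bin/sh
# kernel_supports_fs FSTYPE [FILESYSTEMS_FILE]
# Succeeds if FSTYPE is listed in /proc/filesystems (or in the given file).
# Lines there look like "nodev<TAB>tmpfs" or "<TAB>ext2", so the type is
# always the last field on the line.
kernel_supports_fs() {
    fs_file=${2:-/proc/filesystems}
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' "$fs_file"
}
```

For example, `kernel_supports_fs tmpfs || echo "no tmpfs support in this kernel"`.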
Avoiding low VM conditions
The fact that tmpfs dynamically grows and shrinks as needed makes one wonder: what happens when your tmpfs filesystem grows to the
point where it exhausts all of your virtual memory, and you have no RAM
or swap left? Well, generally, this kind of situation is a bit ugly. With
kernel 2.4.4, the kernel would immediately lock up. With kernel 2.4.6, the VM
subsystem has in many ways been fixed, and while exhausting VM isn't exactly a
wonderful experience, things don't blow up completely, either. When 2.4.6 gets
to the point where it can't allocate any more VM, you obviously won't be able
to write any new data to your tmpfs filesystem. In addition, it's likely that
some other things will happen. First, the other processes on the system will
be unable to allocate much more memory; generally, this means that the system
will most likely become extremely sluggish and almost unresponsive.
Thus, it may be tricky or unusually time-consuming for the superuser to take
the necessary steps to alleviate this low-VM condition.
In addition, the kernel has a built-in last-ditch system for freeing memory
when no more is available; it'll find a process that's hogging VM resources and
kill it. Unfortunately, this "kill a process" solution generally backfires
when tmpfs growth is to blame for VM exhaustion. Here's the reason. Tmpfs
itself can't (and shouldn't) be killed, since it is part of the kernel and not
a user process, and there's no easy way for the kernel to find out which
process is filling up the tmpfs filesystem. So, the kernel mistakenly attacks
the biggest VM-hog of a process it can find, which is generally your X server
if you happen to be running one. So, your X server dies, and the root cause of
the low-VM condition (tmpfs) isn't addressed. Ick.
Low VM: the solution
Fortunately, tmpfs allows you to specify an upper bound for
the filesystem size when a filesystem is mounted or remounted. Actually, as of
kernel 2.4.6 and util-linux-2.11g, these parameters can only be set on
mount, not on remount, but we can expect them to be settable on remount
sometime in the near future. The optimal maximum tmpfs size setting depends
on the resources and usage pattern of your particular Linux box; the idea is to
prevent a completely full tmpfs filesystem from exhausting all virtual memory
and thus causing the ugly low-VM conditions that we talked about earlier. One
way to find a good tmpfs upper bound is to use top to monitor your
system's swap usage during peak usage periods. Then, make sure that you
specify a tmpfs upper-bound that's slightly less than the sum of all free swap
and free RAM during these peak usage times.
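The same arithmetic can be sketched in a few lines of shell that read /proc/meminfo instead of eyeballing top. Treat this as an illustration only: the function name suggest_tmpfs_size_mb is made up, the 10% default safety margin is an arbitrary assumption standing in for "slightly less", and the number is only meaningful if you run it during a peak usage period.

```shell
#!/bin/sh
# suggest_tmpfs_size_mb [MEMINFO_FILE] [MARGIN_PERCENT]
# Prints a candidate tmpfs size in MB: free RAM plus free swap, reduced by
# a safety margin. The 10% default margin is an arbitrary assumption.
suggest_tmpfs_size_mb() {
    meminfo=${1:-/proc/meminfo}
    margin=${2:-10}
    # The MemFree: and SwapFree: lines report their values in kB.
    awk -v margin="$margin" '
        /^MemFree:/  { free_kb += $2 }
        /^SwapFree:/ { free_kb += $2 }
        END { printf "%d\n", free_kb * (100 - margin) / 100 / 1024 }
    ' "$meminfo"
}
```

With 100 MB of free RAM and 400 MB of free swap, the default margin yields 450, which you could then pass to mount as size=450m.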
Creating a tmpfs filesystem with a maximum size is easy. To create a new tmpfs
filesystem with a maximum filesystem size of 32 MB, type:
# mount tmpfs /dev/shm -t tmpfs -o size=32m
This time, instead of mounting our new tmpfs filesystem at
/mnt/tmpfs, we created it at /dev/shm, which is a
directory that happens to be the "official" mountpoint for a tmpfs filesystem.
If you happen to be using devfs, you'll find that this directory has already
been created for you.
Also, if we want to limit the filesystem size to 512 KB or 1 GB,
we can specify size=512k and size=1g, respectively. In addition
to limiting size, we can also limit the number of inodes (filesystem
objects) by specifying the nr_inodes=x parameter. When using
nr_inodes, x can be a simple integer, and can also be followed
with a k, m, or g to specify thousands, millions, or
billions (!) of inodes.
Also, if you'd like to add the equivalent of the above mount tmpfs
command to your /etc/fstab, it'd look like this:
tmpfs /dev/shm tmpfs size=32m 0 0
Mounting on top of existing mountpoints
Back in the 2.2 days, any attempt to mount something to a
mountpoint where something had already been mounted resulted in an
error. However, thanks to a rewrite of the kernel mounting code, using mountpoints multiple times is not
a problem. Here's an example scenario: let's say that we have an existing
filesystem mounted at /tmp. However, we decide that we'd like to
start using tmpfs for /tmp storage. In the old days, your only
option would be to unmount /tmp and remount your new tmpfs
/tmp filesystem in its place, as follows:
# umount /tmp
# mount tmpfs /tmp -t tmpfs -o size=64m
However, this solution may not work for you. Maybe there are a number of
running processes that have open files in /tmp; if so, when trying
to unmount /tmp, you'd get the following error:
umount: /tmp: device is busy
However, with recent 2.4 kernels, you can mount your new /tmp
filesystem without getting the "device is busy" error:
# mount tmpfs /tmp -t tmpfs -o size=64m
With a single command, your new tmpfs /tmp filesystem is
mounted at /tmp, on top of the already-mounted partition,
which can no longer be directly accessed. However, while you can't get to the
original /tmp, any processes that still have open files on this
original filesystem can continue to access them. And, if you umount
your tmpfs-based /tmp, your original mounted /tmp
filesystem will reappear. In fact, you can mount any number of
filesystems to the same mountpoint, and the mountpoint will act like a stack;
unmount the current filesystem, and the previously mounted filesystem
will reappear from underneath.
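To see what is stacked at a given mountpoint, you can pull the matching entries out of /proc/mounts. This is a minimal sketch: mount_stack is a hypothetical helper name, and it assumes that the mounts file lists entries oldest first, so that the last line printed is the filesystem you currently see.

```shell
#!/bin/sh
# mount_stack MOUNTPOINT [MOUNTS_FILE]
# Prints the type of every filesystem mounted at MOUNTPOINT, one per line,
# in the order listed in the mounts file (assumed oldest first, so the
# last line is the one currently visible).
mount_stack() {
    awk -v mp="$1" '$2 == mp { print $3 }' "${2:-/proc/mounts}"
}
```

After mounting tmpfs over an existing /tmp partition, `mount_stack /tmp` would be expected to print the original filesystem type first and tmpfs last.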
Bind mounts
Using bind mounts, we can mount all, or even
part of an already-mounted filesystem to another location, and have the
filesystem accessible from both mountpoints at the same time! For example, you
can use bind mounts to mount your existing root filesystem to
/home/drobbins/nifty, as follows:
# mount --bind / /home/drobbins/nifty
Now, if you look inside /home/drobbins/nifty, you'll see your root
filesystem (/home/drobbins/nifty/etc,
/home/drobbins/nifty/opt, etc.). And if you modify a file on your
root filesystem, you'll see the modifications in
/home/drobbins/nifty as well. This is because they are one and the
same filesystem; the kernel is simply mapping the filesystem to two different
mountpoints for us. Note that when you mount a filesystem somewhere else, any
filesystems that were mounted to mountpoints inside the bind-mounted
filesystem will not be moved along. In other words, if you have
/usr on a separate filesystem, the bind mount we performed above
will leave /home/drobbins/nifty/usr empty. You'll need an
additional bind mount command to allow you to browse the contents of
/usr at /home/drobbins/nifty/usr:
# mount --bind /usr /home/drobbins/nifty/usr
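If you aren't sure which filesystems are mounted inside the tree you're about to bind mount, a short helper can list them for you. This is a sketch rather than a polished tool: the name submounts is my own, and it simply scans /proc/mounts (or a file in the same format) for mountpoints that lie below the given directory.

```shell
#!/bin/sh
# submounts DIR [MOUNTS_FILE]
# Prints every mountpoint that lies below DIR. A bind mount of DIR will not
# carry these along, so each one would need its own "mount --bind".
submounts() {
    awk -v dir="$1" '
        BEGIN { prefix = (dir == "/") ? "/" : dir "/" }
        $2 != dir && index($2, prefix) == 1 { print $2 }
    ' "${2:-/proc/mounts}"
}
```

Running `submounts /` before bind mounting the root filesystem shows each mountpoint, such as /usr, that would otherwise appear empty under the new location.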
Bind mounting parts of filesystems
Bind mounting makes even more neat things possible. Let's say that you have a
tmpfs filesystem mounted at /dev/shm, its traditional location, and
you decide that you'd like to start using tmpfs for /tmp, which
currently lives on your root filesystem. Rather than mounting a new tmpfs
filesystem to /tmp (which is possible), you may decide that you'd
like the new /tmp to share the currently mounted
/dev/shm filesystem. However, while you could bind mount
/dev/shm to /tmp and be done with it, your
/dev/shm contains some directories that you don't want to appear
in /tmp. So, what do you do? How about this:
# mkdir /dev/shm/tmp
# chmod 1777 /dev/shm/tmp
# mount --bind /dev/shm/tmp /tmp
In this example, we first create a /dev/shm/tmp directory and then
give it 1777 perms, the proper permissions for /tmp. Now
that our directory is ready, we can mount /dev/shm/tmp, and only
/dev/shm/tmp to /tmp. So, while
/tmp/foo would map to /dev/shm/tmp/foo, there's no
way for you to access the /dev/shm/bar file from
/tmp.
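If you plan to script these steps, it can be worth verifying the directory's permissions before bind mounting it. Here is one possible sketch: prepare_tmp_dir is a hypothetical helper, and the stat -c %a form assumes GNU stat. The mount itself is left as a comment, since it needs root privileges.

```shell
#!/bin/sh
# prepare_tmp_dir DIR
# Creates DIR if needed, gives it mode 1777 (world-writable plus the sticky
# bit), and succeeds only if the resulting mode really is 1777.
prepare_tmp_dir() {
    mkdir -p "$1" || return 1
    chmod 1777 "$1" || return 1
    [ "$(stat -c %a "$1")" = "1777" ]
}

# Intended use, as root (not run here):
#   prepare_tmp_dir /dev/shm/tmp && mount --bind /dev/shm/tmp /tmp
```

Chaining the two commands with && means the bind mount is only attempted once the directory exists with the right permissions.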
As you can see, bind mounts are extremely powerful and make it easy to make
modifications to your filesystem layout without any fuss. Next article, we'll
check out devfs; for now, you may want to check out the following resources.
Resources
About the author
Residing in Albuquerque, New Mexico, Daniel Robbins is the President/CEO of Gentoo Technologies,
Inc., the creator of Gentoo Linux, an advanced Linux for the
PC, and the Portage system, a next-generation ports system for Linux.
He has also served as a contributing author for the Macmillan books
Caldera OpenLinux Unleashed, SuSE Linux Unleashed, and Samba Unleashed.
Daniel has been involved with computers in some fashion since the
second grade, when he was first exposed to the Logo programming
language as well as a potentially dangerous dose of Pac Man. This
probably explains why he has since served as a Lead Graphic Artist at
SONY Electronic Publishing/Psygnosis. Daniel enjoys spending
time with his wife, Mary, and his new baby daughter, Hadassah. You can contact Daniel at drobbins@gentoo.org.