Level: Intermediate
Peter Seebach (developerworks@seebs.plethora.net), Writer, Freelance
24 Mar 2004
What is C99? Who needs it? Is it available yet? Peter Seebach discusses the 1999 revision of the ISO C standard, with a focus on the availability of new features on Linux and BSD systems.
Not all of the new C99 features are supported in the versions of gcc
distributed with open source operating systems. However, a sufficient number are now widely available, so you can start looking seriously
at adopting C99 features in new development, especially where they make a
substantial difference in efficiency or clarity.
This article reviews the availability of C99 language and library features
on recent releases of Linux and BSD. Because many of these features are
standard features of gcc, a recent version of gcc will do the same thing
on most other platforms. Library support, of course, varies from one
distribution to another, or from one operating system to another.
Invoking gcc with a language standard
The GNU C compiler supports a number of different versions of the C
programming language. You can select the version of the C standard to use
on the command line, using the -std
option. The default is not any version of the standard, but rather, the "GNU C"
language, which has its own set of extensions. You can select common versions of the C
standard with the following options:
- -std=c89 or -std=iso9899:1990: The original C89 standard
- -std=iso9899:199409: C89, plus the changes in Normative Addendum 1
- -std=c99 or -std=iso9899:1999: The C99 revised standard
C-ninety-what?
The C99 standard is the most recent revision of the ISO standard for C.
A bit of historical background may be helpful. The C language
was developed without committees and went through a lot of changes early
on. Eventually, most vendors stabilized somewhere near the language
described in the first edition (1978) of Kernighan & Ritchie's The C
Programming Language, although extensions were commonplace. ANSI began
work on a standard based on this book and on existing practice, and a
standard became widely available in 1989-1990. This standard is widely
referred to as "C89"; some wags refer to the language described in the
1978 edition of K&R as "C78."
Over the next ten years, compiler vendors
continued developing new extensions and new features, and in 1999, a
revised standard was released, representing a number of years of work on
standardizing many of the most useful and widely supported new features.
This standard is often referred to as the "C99" standard.
To enforce full compliance with a version of the standard, use the
-pedantic option. This option is primarily
useful for making sure your code will survive the transition to other compilers. For
instance, if you're sharing a codebase with people who aren't using gcc,
you probably want it on all the time. Note that the -pedantic
flag will occasionally get some of the details of a given standard wrong. For instance,
it might try to enforce a C89 rule on a C99 program, or
might fail to enforce an obscure rule. It's still worth having it for
testing. If you're trying to write portable code, there's a lot to be
said for -std=c99 -pedantic -Wall.
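One way to check which dialect a given -std option actually selected is to look at the __STDC_VERSION__ macro. The following is a minimal sketch; standard_name and current_standard are names made up for this illustration, not part of gcc or the standard library. It assumes only the values the standards themselves define: 199409L for the 1994 amendment and 199901L for C99, with the macro left undefined in plain C89 mode.

```c
/* Map a __STDC_VERSION__ value to a readable name.
   standard_name is a hypothetical helper used only in this sketch. */
const char *standard_name(long version)
{
    if (version >= 199901L)
        return "C99 or later";
    if (version >= 199409L)
        return "C89 with Amendment 1";
    return "C89";
}

/* Report the dialect this translation unit was compiled as. */
const char *current_standard(void)
{
#ifdef __STDC_VERSION__
    return standard_name(__STDC_VERSION__);
#else
    return standard_name(0L);   /* plain C89 leaves the macro undefined */
#endif
}
```

Printing the result of current_standard() from a file compiled with, for example, gcc -std=c99 -pedantic -Wall is a quick sanity check that the build system is passing the option you think it is.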
The C89 standard introduced a new concept: the distinction between
freestanding and hosted environments. A hosted environment is what
most people are used to; it provides the full standard library, and
execution always starts at main().
If you want the slightly
different set of warnings and behaviors that are implied for a
freestanding environment, use the -ffreestanding option.
The default is to assume a hosted environment. To address a common FAQ,
yes, it is intentional that gcc gives a warning for declarations of
main() with arguments or a return type other than those listed
in the standard. While the C99 standard allows implementations to provide
alternative declarations, they're never portable. In particular, the
common practice of declaring main() with a return type of
void is simply incorrect in a hosted environment. (This is why
NetBSD's kernels are compiled with the -ffreestanding flag: a kernel is a
freestanding program, so the hosted rules for main() don't apply to it.)
Language features
There are two parts of the C programming language. These are,
confusingly, called the "language" and the "library." Historically, there was
a bundle of commonly used utility code that everyone tended to reuse;
this was eventually standardized into what's called the Standard C
Library. The distinction was pretty easy to understand at first: If the
compiler did it, it was the language; if it was in the add-on code, it was
the library.
With time, however, the distinction has been blurred. For instance,
some compilers will generate calls to an external library for 64-bit
arithmetic, and some library functions might be handled magically by the
compiler. For the purposes of this article, the division follows the
terminology of the standard: features from the "Library" section of the
standard are library features and are discussed in the next section of
the article. This section looks at everything else.
The C99 language introduces a number of new features that are of
potential interest to software developers. Many of these features are
similar to features of the GNU C set of extensions to C; unfortunately, in
some cases, they are not quite compatible.
A few features popularized by C++ have made it in. In particular, //
comments and mixed declarations and code have become standard features of
C99. These have been in GNU C forever and should work on every platform.
In general, though, C and C++ remain separate languages; indeed, C99 is a
little less compatible with C++ than C89 was. As always, trying to write
hybrid code is a bad idea. Good C code will be bad C++ code.
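As a small illustration of the C++-style features mentioned above, here is a sketch that uses // comments, a loop variable declared inside the for statement, and a declaration that follows ordinary statements. The function name sum_of_squares is invented for this example.

```c
#include <stddef.h>

// Sum the squares of the first n values.
long sum_of_squares(const int *values, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {   // loop variable declared in the for statement
        long v = values[i];
        total += v * v;
    }
    // A declaration after statements: fine in C99, rejected by strict C89.
    long result = total;
    return result;
}
```

Compiling this with -std=c89 -pedantic should produce diagnostics for each of the three features, which makes it a handy smoke test for a compiler's C99 mode.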
C99 added some support for Unicode characters, both within string literals
and in identifiers. In practice, the system support for this probably
isn't where it needs to be for most users; don't expect source that uses
this to be accessible to other people just yet. In general, the wide
character and Unicode support is mostly there in the compiler, but the
text processing tools aren't quite up to par yet.
The new variable-length array (VLA) feature is partially available.
Simple VLAs will work. However, this is a pure coincidence; in fact, GNU
C has its own variable-length array support. As a result, while simple
code using variable-length arrays will work, a lot of code will run into
the differences between the older GNU C support for VLAs and the C99
definition. Declare arrays whose length is a local variable, but don't
try to go much further.
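The "simple" case described above looks something like the following sketch: a local array whose length comes from a parameter. The name reverse_copy is hypothetical, and the code deliberately avoids anything fancier, such as VLA types in prototypes or sizeof tricks.

```c
#include <stddef.h>

/* Reverse src into dst through a scratch VLA; n is known only at run time. */
void reverse_copy(size_t n, const int *src, int *dst)
{
    if (n == 0)
        return;              /* a VLA must have at least one element */
    int scratch[n];          /* length taken from a parameter */
    for (size_t i = 0; i < n; i++)
        scratch[i] = src[n - 1 - i];
    for (size_t i = 0; i < n; i++)
        dst[i] = scratch[i];
}
```

Because the VLA lives on the stack, this style is best kept to small, bounded values of n.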
Compound literals and designated initializers are a wonderful code
maintainability feature. Compare these two code fragments:
Listing 1. Delaying for n microseconds in C89
/* C89 */
{
    struct timeval tv = { 0, n };
    select(0, 0, 0, 0, &tv);
}
Listing 2. Delaying for n microseconds in C99
// C99
select(0, 0, 0, 0, & (struct timeval) { .tv_usec = n });
The syntax for a compound literal allows a brace-enclosed series of values
to be used to initialize an automatic object of the appropriate type.
The object is reinitialized each time its declaration is reached, so it's
safe with functions (such as some versions of select) that
may modify the corresponding object. The designated initializer syntax
allows you to initialize members by name, without regard to the order in
which they appear in an object. This is especially useful for large and
complicated objects with only a few members initialized. As with a normal
aggregate initializer, missing values are treated as though they'd been
given 0 as an initializer. Other initialization rules have changed a bit.
For instance, you're now allowed to have a trailing comma after the last
member of an enum declaration, to make it just
a bit easier to write code generators.
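To make the initialization rules concrete, here is a short sketch with invented names (struct options, default_options, third_of_five). It shows a compound literal with a designated initializer, an array designator, and the trailing comma in an enum.

```c
/* A trailing comma after the last enumerator is allowed in C99. */
enum color { RED, GREEN, BLUE, };

struct options {
    int verbose;
    int retries;
    enum color highlight;
};

/* Only retries is named; verbose and highlight are implicitly zero. */
struct options default_options(void)
{
    return (struct options) { .retries = 3 };
}

/* Designators work for arrays too: element 2 is set, the rest are 0. */
int third_of_five(void)
{
    int slots[5] = { [2] = 7 };
    return slots[0] + slots[2] + slots[4];
}
```

If a member is later added to struct options, default_options() keeps compiling and the new member starts out as zero, which is most of the maintainability argument.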
For years, people have been debating extensions to the C type system, such
as long long. C99 introduces a handful of new
integer types. The most widely used is long long. Another type introduced by the standards process is intmax_t. Both of these types
are available in gcc. However, the integer promotion rules are not always
correct for types larger than long. It's probably best to use explicit
casts.
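The advice about explicit casts might look like this in practice. This is only a sketch with made-up function names; it assumes a C library that understands the C99 %jd conversion for intmax_t.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a signed value by widening it to intmax_t first.
   The explicit cast makes the conversion visible instead of
   relying on promotion. */
int format_signed(char *buf, size_t size, long long value)
{
    return snprintf(buf, size, "%jd", (intmax_t) value);
}

/* Cast before multiplying, so the arithmetic is done in long long
   rather than in int. */
long long wide_product(int a, int b)
{
    return (long long) a * (long long) b;
}
```

Without the casts in wide_product, the multiplication would be carried out in int and could overflow before the result was ever widened.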
There are also a lot of types allowing more specific descriptions of
desired qualities. For instance, there are types with names like
int_least8_t, which has at least 8 bits, and
int32_t, which has exactly 32 bits. The
standard guarantees
access to types of at least 8, 16, 32, and 64 bits. There is no promise
that any exact-width types will be provided. Don't use such types unless
you are really, totally sure that you can't accept a larger type. Another
optional type is the new intptr_t type, which
is an integer
large enough to hold a pointer. Not all systems provide such a type
(although all current Linux and BSD implementations do).
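Since some of these types are optional, code that wants to stay portable can test for them. The sketch below relies on the C99 rule that <stdint.h> defines INTPTR_MAX only when intptr_t exists; the function names are hypothetical.

```c
#include <limits.h>
#include <stdint.h>

/* The least-width types are always available in C99. */
int least_types_are_wide_enough(void)
{
    return sizeof(int_least8_t) * CHAR_BIT >= 8
        && sizeof(int_least32_t) * CHAR_BIT >= 32;
}

/* intptr_t is optional; INTPTR_MAX is defined only when it exists. */
#ifdef INTPTR_MAX
int pointer_round_trips(void *p)
{
    intptr_t as_int = (intptr_t) p;
    return (void *) as_int == p;
}
#else
int pointer_round_trips(void *p)
{
    (void) p;
    return 0;   /* no integer type is promised to hold a pointer here */
}
#endif
```

The same pattern works for the exact-width types: INT32_MAX, for instance, is defined only on systems that provide int32_t.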
The C preprocessor has a number of new features. It allows empty
arguments, and it supports macros with varying numbers of arguments. There is
a _Pragma operator for macro-generating
pragmas. Alongside these, there's
__func__, a predefined identifier (not strictly a macro) that always contains the name
of the current function. These features are available in current versions of
gcc.
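A variadic macro and __func__ combine naturally into a tracing helper. The following is a sketch under stated assumptions: TRACE, trace_impl, and compute are names invented here, and the message is written into a caller-supplied buffer so the example stays self-contained.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: writes "func: message" into buf. */
int trace_impl(char *buf, size_t size, const char *func, const char *fmt, ...)
{
    int used = snprintf(buf, size, "%s: ", func);
    if (used < 0 || (size_t) used >= size)
        return used;             /* error, or no room left for the message */
    va_list ap;
    va_start(ap, fmt);
    int rest = vsnprintf(buf + used, size - (size_t) used, fmt, ap);
    va_end(ap);
    return rest < 0 ? rest : used + rest;
}

/* Variadic macro: __VA_ARGS__ carries the format and its arguments. */
#define TRACE(buf, size, ...) trace_impl((buf), (size), __func__, __VA_ARGS__)

int compute(char *buf, size_t size)
{
    int x = 5;
    return TRACE(buf, size, "x=%d", x);   /* buf becomes "compute: x=5" */
}
```

Putting the format string inside __VA_ARGS__ means the macro is never invoked with an empty variable argument list, which strict C99 does not allow.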
C99 added the inline keyword to suggest
function inlining.
GNU C also supports this keyword, but with slightly different semantics.
If you're using gcc, you should always use the static keyword
on inline functions if you want the same behavior as C99 would give for
the code. This may be addressed in future revisions; in the meantime, you
can use inline as a compiler hint, but don't
depend on the
exact semantics.
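In code, the safe pattern is simply this (max_int and clamp_to_zero are illustrative names):

```c
/* static inline means the same thing under GNU C and C99 rules. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

int clamp_to_zero(int v)
{
    return max_int(v, 0);
}
```

Because the function is static, each translation unit gets its own private copy, so the question of which file provides the external definition never comes up.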
C99 introduced a qualifier, restrict, which can give a
compiler optimization hints about pointers. Because there is no requirement
that a compiler do anything with this, gcc's support is complete in the sense that it accepts the keyword.
The degree of optimization done varies. It's safe to use, but don't count
on it making a huge difference yet. On a related note, the new type-aliasing rules are fully supported in gcc. This mostly means that you
must be more careful about type punning, which is almost always going to
invoke undefined behavior, unless the type you're using to access data of
the wrong sort is a character type such as unsigned char.
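Here is a brief sketch of both ideas, using invented function names. The first function uses restrict; the other two show ways of looking at an object's bytes that stay within the aliasing rules.

```c
#include <stddef.h>
#include <string.h>

/* restrict promises the compiler that out, a, and b do not overlap. */
void add_arrays(size_t n, int *restrict out,
                const int *restrict a, const int *restrict b)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Inspecting an object's bytes through unsigned char is allowed. */
unsigned char first_byte(const void *object)
{
    return *(const unsigned char *) object;
}

/* Copying bytes with memcpy avoids reading a float through a
   pointer to some unrelated type. */
int same_representation(float x, float y)
{
    unsigned char bx[sizeof x], by[sizeof y];
    memcpy(bx, &x, sizeof x);
    memcpy(by, &y, sizeof y);
    return memcmp(bx, by, sizeof x) == 0;
}
```

Calling add_arrays with overlapping arrays breaks the promise made by restrict and is undefined behavior, so the qualifier is a contract with the caller as much as a hint to the compiler.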
Array declarators as function arguments now have a meaningful difference
from pointer declarators: you can put in type qualifiers. Of particular
interest is the very odd optimizer hint of giving an array declarator the
static type modifier. Given this declaration:
int foo(int a[static 10]);
It is undefined behavior to call foo() with a
pointer that
doesn't point to at least 10 objects of type int. This is an
optimizer hint. All you're doing is promising the compiler that the
argument passed to the function will be at least that large; some machines
might use this for loop unrolling. As old hands will be well aware, it's
not a new C standard without an entirely new meaning for the
static keyword.
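A complete function using this form might look like the following sketch (sum_first_ten is a made-up name):

```c
/* The caller promises at least 10 readable ints behind a. */
int sum_first_ten(const int a[static 10])
{
    int total = 0;
    for (int i = 0; i < 10; i++)
        total += a[i];
    return total;
}
```

Inside the function, a is still just a pointer; the static 10 changes what callers are allowed to pass, not the type of the parameter.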
One last feature to mention is flexible array members. There is a common
problem of wanting to declare a structure that is essentially a header
followed by some data bytes. Unfortunately, C89 provided no good way to
do this without giving the structure a pointer to a separately allocated
region. Two common solutions included declaring a member with exactly one
byte of storage, then allocating extra and overrunning the bounds of the
array, and declaring a member with more storage than you could possibly
need, underallocating, and being careful to use only the storage
available. Both of these were problematic for some compilers, so C99
introduced a new syntax for this:
Listing 3. A structure with a flexible array
struct header {
    size_t len;
    unsigned char data[];
};
This structure has the useful property that if you allocate space for
(sizeof(struct header) + 10) bytes, you can treat data as being an array
of 10 bytes. This new syntax is supported in gcc.
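Allocating such a structure could be sketched as follows. The make_header function is hypothetical, and the structure is repeated so that the example stands alone.

```c
#include <stdlib.h>
#include <string.h>

struct header {
    size_t len;
    unsigned char data[];
};

/* Allocate a header plus len data bytes in one block; the caller frees it. */
struct header *make_header(const void *bytes, size_t len)
{
    struct header *h = malloc(sizeof *h + len);
    if (h == NULL)
        return NULL;
    h->len = len;
    memcpy(h->data, bytes, len);
    return h;
}
```

A single free(h) releases both the header and its data, which is the main practical advantage over a separately allocated region.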
Library features
That's fine for the compiler. What about the standard library? A lot of
the library features added in C99 were based on existing practice,
especially practices found in the BSD and Linux communities. So, many of
these features are preexisting ones already found in the Linux and BSD
standard libraries. Many of these features are simple utility functions;
almost all of them could in principle be done in portable code, but many
of them would be exceedingly difficult.
Some of the most convenient features added in C99 are in the printf family
of functions. First, the v*scanf functions have become standardized; for
every member of the scanf family, there is a corresponding v*scanf
function that takes a va_list parameter instead of a variable argument
list. These functions serve the same role as the v*printf functions,
allowing user-defined functions that take variable argument lists and end
up calling a function from the printf or scanf family to do the hard work.
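For example, a user-defined scanning function can now forward its arguments to vsscanf. This is a minimal sketch; scan_line is an invented wrapper, and its only added behavior is rejecting a null input line.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: returns the number of converted fields,
   or -1 if line is a null pointer. */
int scan_line(const char *line, const char *fmt, ...)
{
    if (line == NULL)
        return -1;
    va_list ap;
    va_start(ap, fmt);
    int converted = vsscanf(line, fmt, ap);
    va_end(ap);
    return converted;
}
```

A call such as scan_line("10 20", "%d %d", &a, &b) behaves like sscanf, but the wrapper is free to add logging or validation around the call.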
Secondly, the 4.4BSD snprintf function family
has been
imported. The snprintf function allows you to
print safely
into a buffer of fixed size. When told to print no more than
n bytes, snprintf
guarantees that
it creates a
string of length no more than n-1, with a null
terminator at
the end of the string. However, its return value is the number of
characters it would have written (not counting the null terminator) if n had been large enough. Thus, you
can reliably find out how much buffer space you would need to format
something completely. This function is available everywhere, and you
should use it just about all the time; a lot of security holes have been
based on buffer overruns in sprintf, and this
can protect
against them.
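The return value makes a measure-then-allocate pattern possible. The sketch below assumes a C99-conforming snprintf, which accepts a null buffer when the size is 0; format_pair is a name invented for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build "name=value" in a freshly allocated buffer of exactly the
   right size. The caller frees the result. */
char *format_pair(const char *name, int value)
{
    int needed = snprintf(NULL, 0, "%s=%d", name, value);  /* measure only */
    if (needed < 0)
        return NULL;
    char *out = malloc((size_t) needed + 1);
    if (out != NULL)
        snprintf(out, (size_t) needed + 1, "%s=%d", name, value);
    return out;
}
```

With a fixed buffer, the same return value tells you whether the output was truncated: printing "hello" into a 4-byte buffer stores "hel" and returns 5.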
A number of new math features, including complex math features and special
functions designed to help optimizing compilers for specific floating
point chips, are in the new standard, but not reliably implemented
everywhere. If you need these functions, it is best to check on the exact
platform you're targeting. The floating point environment functions are
not always supported, and some platforms will not have support for IEEE
arithmetic. Don't count on these new features yet.
The strftime() function has been extended in
C99 to provide a
few more commonly desired formatting characters. These new characters
appear to be available on recent Linux and BSD systems; they aren't always
widely available on somewhat older systems, though. Check the
documentation before using new formats.
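Two of the conversions added in C99 are %F, which is equivalent to %Y-%m-%d, and %T, which is equivalent to %H:%M:%S. Here is a sketch that uses them; iso_timestamp is an illustrative name, and the code assumes a C library that implements the C99 conversions.

```c
#include <time.h>

/* Format a broken-down time as "YYYY-MM-DD HH:MM:SS" using %F and %T. */
size_t iso_timestamp(char *buf, size_t size, const struct tm *when)
{
    return strftime(buf, size, "%F %T", when);
}
```

On a library without these conversions, the longer spelling "%Y-%m-%d %H:%M:%S" is intended to give the same result and works with C89 as well.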
As noted, most of the internationalization code is not reliably
implemented yet.
Other new library features are typically not universally available; the
math functions are likely to be available in supercomputer compilers, and
the internationalization functions are likely to be available in compilers
developed outside the United States. Compiler vendors implement the
features they have a call for.
Looking forward
It is generally best to be conservative in adopting new features.
However, many of the C99 features are now sufficiently widespread that new
development projects can reasonably take advantage of them. The gcc
compiler suite is sufficiently widely available that most projects can
reasonably assume that it will be an option on a broad variety of target
platforms. If you're primarily targeting Linux or BSD systems, or both,
you can count on at least partial support for a great number of the new
C99 features. These features were adopted based on perceived need and
real-world implementation experience, and they should serve you well.
When deciding which features you're willing to depend on, don't just look
at what's available on the computer you're typing on; think about the
target system or systems. Do you want to require people to upgrade to a
more recent distribution of an operating system? Will your target market mind having to
get a new compiler? Test a feature on likely target systems before
you commit to using it.
About the author
Peter Seebach has been a member of the ISO C standards committee since late 1996. He is only a little bitter that strsep() didn't make it into C99, and tends to use it anyway out of spite. He can be reached at developerworks@seebs.plethora.net.