Level: Intermediate
Kenneth Stephen (kstephe@us.ibm.com), Software Engineer, IBM
15 Jul 2008
Hungarian notation is a naming convention that can be used in
code and design artifacts. In this article, learn a simple technique of
Hungarian notation that you can apply during data modeling and implementation
to improve the quality of your applications.
Introduction
Hungarian notation is a naming convention in programming where the name
of a variable indicates its usage. Hungarian notation was designed to be
language-independent. There are two types: Systems Hungarian notation,
which encodes the data type of the variable, and Apps Hungarian notation,
which encodes its intended use.
Hungarian notation, which was invented by Charles Simonyi, got its
name because the prefixes make the variable names look like they're
written in a non-English language, and because Simonyi is originally from
Hungary. In Simonyi’s version of Hungarian notation, every variable was
prefixed with a lowercase tag that indicated the kind of thing that the
variable contained.
In practice, Hungarian notation has been used mainly to indicate the type
of the variable, and this has led to misconceptions about the usefulness
of the technique. As it turns out, the Hungarian naming convention is
quite useful. It is one technique among many that help programmers
produce better code faster.
In this article, learn a simple technique, using the notation, that you can
apply during your data modeling and implementation phases.
Example
For example:
lSerialNumber <-- indicates that the variable is of type long
sSerialNumber <-- indicates that the variable is of type string
In most type-safe programming languages, the compiler already knows the
type of each variable and will not allow incorrect type usage. In this
case, encoding the type in the name has little value.
The notation provides benefits in situations where the usage is indicated
in the naming convention. For example:
custSerialNumber <-- indicates that the serial number is that of a customer
suppSerialNumber <-- indicates that the serial number is that of a supplier
In this case, the compiler will not be able to do any semantic checking;
the notation helps the programmer do that. In data modeling and database
programming, such capabilities are quite useful. Keep reading to learn how
to apply Hungarian notation to improve the quality of your model and
programs.
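The difference between the two kinds of prefix can be sketched in a few lines. Python is used here only as an assumption of this example (the notation itself is language-independent), and the lookup function and sample data are hypothetical.

```python
# Hypothetical sample data: customers and suppliers both use integer serial numbers.
customers_by_serial = {1001: "Acme Corp"}
suppliers_by_serial = {1001: "Bolt Supply", 2002: "Gear Works"}


def find_customer(cust_serial_number: int) -> str:
    """Look up a customer name by the customer's serial number."""
    return customers_by_serial.get(cust_serial_number, "<no such customer>")


cust_serial_number = 1001
supp_serial_number = 2002

# Both variables are ints, so a type checker accepts either call.
# Only the "supp" prefix reveals that the second call passes the wrong
# kind of serial number.
print(find_customer(cust_serial_number))  # Acme Corp
print(find_customer(supp_serial_number))  # <no such customer>
```

A type prefix such as `l` or `s` would say nothing about this mistake, because both values have the same type. The usage prefix is what lets a reviewer spot it.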
Improving the data modeling experience
Consider a simple data model connected to a development process at a
software company. The "crow's feet" notation is used here (see
Resources for more about crow's feet). The
company has several software projects. As shown in Figure 1, if a project
is actively being worked on, then it has one or more features. If a
feature is being worked on, then it has one or more developers associated
with it.
Figure 1. Relationships among project, features, and developers
There are three different id fields and three different name fields. The
three id fields are all integers, and the three name fields are all
strings. A modeling tool or compiler can check whether you have incorrectly
associated an id with a name, but it cannot tell you that something is
wrong when the name from the project entity is related to the name from
the feature entity. This is something that you, as the modeler, have to
pay attention to.
Hungarian notation can help you by making things clearer, as shown in
Figure 2.
Figure 2. Removing ambiguity by identifying source of data element
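As a rough sketch of what such a model could look like in a physical schema, the following uses SQLite through Python. The column prefixes (`prj_`, `f_`, `d_`) are taken from the SQL example later in this article; the column types and constraints are assumptions made for illustration.

```python
import sqlite3

# Assumed schema: prefixes follow the article's later SQL example;
# the types and the foreign keys are guesses made for this sketch.
SCHEMA = """
CREATE TABLE project   (prj_id INTEGER PRIMARY KEY, prj_name TEXT);
CREATE TABLE feature   (f_id INTEGER PRIMARY KEY, f_name TEXT,
                        f_prj_id INTEGER REFERENCES project(prj_id));
CREATE TABLE developer (d_id INTEGER PRIMARY KEY, d_name TEXT,
                        d_f_id INTEGER REFERENCES feature(f_id));
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Collect every column name across the three tables.
columns = []
for table in ("project", "feature", "developer"):
    for row in conn.execute(f"PRAGMA table_info({table})"):
        columns.append(row[1])  # index 1 of each row is the column name

# With the prefixes in place, no column name is shared between tables.
print(columns)
print(len(columns) == len(set(columns)))  # True
```

Because every column name is unique across the model, a name such as `f_prj_id` identifies both the entity that owns it and the entity it refers to.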
The company also has projects planned for the future, which are still in
the requirements stage, and there are no features defined for them.
Analysts are assigned to flesh out the requirement for these projects, as
shown in Figure 3.
Figure 3. Semantic conflict for project id
If you compare Figure 1 and Figure 3, you can see that the project_id
field is being used in a different sense: in one it refers to an active
project, and in the other to a planned project. This is a true semantic
conflict. In the previous case, where the name fields conflicted, the
fields could at least have been given different data types, and a modeling
tool could then have caught the problem if someone equated a feature name
to a developer name. In this case, however, the field name and the data
type are the same. The difference between the two fields is purely
semantic, which is something no tool could detect. Here again, Hungarian
notation can make things clear.
Figure 4. Resolving the semantic conflict
Improving the implementation
A physical implementation of a model that uses Hungarian notation can also
have benefits at development time. When joins are done, similarly named
fields have to be unambiguously identified. This is usually accomplished using
correlation names. For example, you want to find out if you have
developers working on an active project who are also working as analysts
on future projects. The following sample code shows how you would code that
SQL statement if you were not using Hungarian notation.
select d.name
from developer d, feature f, project prj, planned_project pprj, analyst a
where d.feature_id = f.id
and f.project_id = prj.id
and a.id = d.id
and a.project_id = pprj.id
With Hungarian notation, the same SQL statement would look as follows:
select d_name
from developer, feature, project, planned_project, analyst
where d_f_id = f_id
and f_prj_id = prj_id
and a_id = d_id
and a_pprj_id = pprj_id
Eliminating a few correlation names and replacing the ‘.’ with a ‘_’ might
seem like a cosmetic change, but it's more than that. In the first form,
it's easier to make a mistake, such as saying “f.project_id = pprj.id”
because there isn’t any identification in “f.project_id” that the
“project_id” comes from the project table and not from the
“planned_project” table. Hungarian notation makes the source of the data
element obvious, which reduces programmer errors.
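The mistake described above can be reproduced in a small experiment. The sketch below uses SQLite through Python; the column types and the sample rows are hypothetical, and only the table names, column names, and query text come from the first SQL statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Unprefixed schema matching the first query. Types and rows are assumptions:
# person 7 develops a feature of active project 1 and is also an analyst
# on planned project 100.
conn.executescript("""
CREATE TABLE project         (id INTEGER, name TEXT);
CREATE TABLE planned_project (id INTEGER, name TEXT);
CREATE TABLE feature         (id INTEGER, name TEXT, project_id INTEGER);
CREATE TABLE developer       (id INTEGER, name TEXT, feature_id INTEGER);
CREATE TABLE analyst         (id INTEGER, name TEXT, project_id INTEGER);

INSERT INTO project         VALUES (1, 'billing');
INSERT INTO planned_project VALUES (100, 'reporting');
INSERT INTO feature         VALUES (10, 'invoices', 1);
INSERT INTO developer       VALUES (7, 'Ana', 10);
INSERT INTO analyst         VALUES (7, 'Ana', 100);
""")

# The only difference between the two runs is which correlation name
# the feature's project_id is joined to.
QUERY = """
select d.name
from developer d, feature f, project prj, planned_project pprj, analyst a
where d.feature_id = f.id
and f.project_id = {project_alias}.id
and a.id = d.id
and a.project_id = pprj.id
"""

correct = conn.execute(QUERY.format(project_alias="prj")).fetchall()
mistaken = conn.execute(QUERY.format(project_alias="pprj")).fetchall()

print(correct)   # [('Ana',)]
print(mistaken)  # []
```

Both statements are valid SQL, so the database raises no error for the mistaken join; it simply returns a different answer. In the prefixed form, the same slip would read `f_prj_id = pprj_id`, where the mismatch between `prj` and `pprj` is visible in the names themselves.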
Applying the technique correctly
Like most best practices, Hungarian notation can cause problems in
database design if it is not applied correctly. For example, a recent
application was a Web service that exposed a data model to command line
queries, and the data model contained data elements that the business user
was interested in. Hungarian notation was applied in the data model
design, resulting in the entities shown in Figure 5.
Figure 5. Logical and physical data entities
The end users of the command line queries did not know anything about the
internal naming conventions in the database. When they used the command
line queries, they specified the name of the data elements known to them.
For example, if they wanted the end of service date, they would specify
“end_of_service_date” rather than “prv_end_of_service_date.” This meant,
of course, that the application would have to specify and maintain a
mapping between the external names and the internal names.
The situation was complicated because one of the main objectives of the
command line client was to implement a simple SQL-like query interface.
The command line query program would support simple inner joins and the
use of SQL functions on the data elements. To enable this, one would need
to differentiate between a name used as a string, a name used as a column
name, and a name used as a correlation name or table name. This required
a full-fledged parser that could make the distinctions.
For example, consider the following user command:
Query -select "max(gen_avail_date), max(gold_master_date)" -from "product_release" -where
"end_of_service_date < '2010-01-01'"
The application would now need to map this to:
select max(prv_gen_avail_date), max(prv_gold_master_date)
from product_release_view
where prv_end_of_service_date < '2010-01-01'
This mapping implies that the application has a parser that understands
SQL function syntax as well as complexities such as correlation names and
joins.
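To see why a parser is needed, consider what happens with a simpler approach. The sketch below is a deliberately naive textual substitution, not the application's actual code; the mapping table is built from the names in the example above, and the `name` column in the last call is hypothetical.

```python
import re

# Hypothetical mapping from external (business) names to internal names,
# built from the example command and its translated SQL.
EXTERNAL_TO_INTERNAL = {
    "gen_avail_date": "prv_gen_avail_date",
    "gold_master_date": "prv_gold_master_date",
    "end_of_service_date": "prv_end_of_service_date",
    "product_release": "product_release_view",
}

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in EXTERNAL_TO_INTERNAL) + r")\b"
)


def naive_map(fragment: str) -> str:
    """Replace every whole-word occurrence of an external name."""
    return _PATTERN.sub(lambda m: EXTERNAL_TO_INTERNAL[m.group(1)], fragment)


# Column names inside function calls and comparisons are mapped as intended.
print(naive_map("max(gen_avail_date), max(gold_master_date)"))
print(naive_map("end_of_service_date < '2010-01-01'"))

# But a string literal that happens to contain an external name is
# rewritten as well, which changes the meaning of the query.
print(naive_map("name = 'product_release'"))
# name = 'product_release_view'
```

Plain substitution cannot tell a column name from the same text inside a string literal, which is the kind of distinction that forces the application to parse the query properly.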
All of this complexity was unnecessary. If the database view had used the
exact same names as the external business elements, no mapping would be
required. The application would simply be able to take the different
portions of the command line query, as specified by the user, and use that
to build an SQL statement dynamically.
Figure 6 shows what is required.
Figure 6. Hiding the notation
The external view is a view defined on top of the internal view. This way,
the data modeling can have all the advantages associated with Hungarian
notation. The SQL statements are programmatically built, based on end-user
input, and so don’t suffer from the problem of programmers getting the data semantics wrong.
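The layering can be sketched with SQLite through Python. Everything here is an assumption made for illustration: the internal object is created as a plain table standing in for the internal view, the column types and sample rows are invented, and `build_query` is a hypothetical helper rather than the application's real code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Internal object with Hungarian-prefixed columns (a table stands in
-- for the internal view described in the article).
CREATE TABLE product_release_view (
    prv_gen_avail_date      TEXT,
    prv_gold_master_date    TEXT,
    prv_end_of_service_date TEXT
);
INSERT INTO product_release_view VALUES ('2007-03-01', '2007-02-01', '2009-06-30');
INSERT INTO product_release_view VALUES ('2008-05-15', '2008-04-10', '2011-12-31');

-- External view: the same data, exposed under the business names.
CREATE VIEW product_release AS
SELECT prv_gen_avail_date      AS gen_avail_date,
       prv_gold_master_date    AS gold_master_date,
       prv_end_of_service_date AS end_of_service_date
FROM product_release_view;
""")


def build_query(select: str, from_: str, where: str) -> str:
    """Assemble the SQL text directly from the user's command-line pieces."""
    return f"select {select} from {from_} where {where}"


# The three pieces are taken verbatim from the example user command.
sql = build_query(
    select="max(gen_avail_date), max(gold_master_date)",
    from_="product_release",
    where="end_of_service_date < '2010-01-01'",
)
rows = conn.execute(sql).fetchall()

print(sql)
print(rows)  # [('2007-03-01', '2007-02-01')]
```

No name mapping or parsing is involved: the user's fragments run unchanged against the external view, while the prefixed names stay available to modelers and developers underneath. Passing user text straight into SQL, as this sketch does, would need input validation in a real service.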
Resources
Learn
Get products and technologies
- Download IBM product evaluation versions and get your hands on
application development tools and middleware products from DB2®, Lotus®,
Rational®, Tivoli®, and WebSphere®.
Discuss
About the author
Kenneth Stephen is an application architect with more than 17 years of
experience in IT, primarily involving developing and maintaining
applications that use relational databases. He has worked as a database
designer and modeler for over six years. He currently works for IBM in AIX
development. He holds a bachelor of technology degree in electronics and
communication engineering from Kerala University, India.