    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Software Engineers Toolbox</title>
        <link>http://accu.org/index.php/journals/720</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 8, #1 - Feb 1996 + Programming Topics</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Software Engineers Toolbox</h1>
<p><strong>Author:</strong>&nbsp;Administrator</p>
<p>
<strong>Date:</strong> Sat, 03 February 1996 13:15:26 +00:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e18" id="d0e18"></a>What's In a
Name?</h2>
</div>
<div class="sidebar">
<p>When I wrote my last column (Test Yourself), I did my best to
ensure that the answers were totally accurate. If only I had been a
bit more careful with the questions! As Francis correctly pointed
out in his sidebars, I was a bit careless with my prototypes and
returns. The trouble is, I was concentrating so carefully on the
bits that were intended to be wrong, I took insufficient care with
the rest. Fortunately, none of this invalidated the answers, but it
does show that you can't be too careful. If it had been 'proper'
code, of course, either Lint or the compiler would have picked me
up on all of these problems. I think I shall get into the habit of
linting even such 'trivial' examples before publishing them.</p>
</div>
<p>There are a couple of areas of C programming which tend to be
overlooked in the text books. Perhaps they just aren't glamorous
enough. One such subject is the way we name things in our code. I
don't just mean the general rule for what is a valid identifier, I
mean guidelines for how to construct meaningful, consistent and
helpful identifiers - a naming convention.</p>
<p>Some people don't worry too much about naming conventions. They
have certain, vague preferences for how they choose and format
their identifiers, but they make no effort to create a formal, or
at least conscious, set of rules. I don't want to overstate the
importance of having a good formal naming convention, but I do
believe that it is important to have one. Good identifier names do
help programmers to write better programs, but the real payoff
comes in the maintenance phase.</p>
<p>If you do any amount of programming, you will inevitably get
involved in code maintenance, which for most programs is 60%-90% of
the total programming effort. The biggest problem with code
maintenance is that you often have to spend considerable time and
effort trying to work out how the code works before you can start
to modify it. Even code you wrote yourself can be impenetrable
after a twelve month hiatus, so it is well worth spending a bit of
effort to avoid problems such as poor formatting and misleading
identifiers. (Poor formatting can at least be fixed fairly easily
with tools such as indent, but bad or meaningless identifiers are a
curse forever, as those who have to maintain Unix kernels will
probably affirm.) It is well worth the effort to define and use a
sensible and consistent naming convention.</p>
<p>What constitutes a valid identifier in C is defined by the
standard as any combination of upper and lower-case letters, digits
and the underscore, provided the first character is a non-digit. Of
course, defining what is a <span class=
"emphasis"><em>valid</em></span> name is easy; defining what is a
meaningful name is somewhat more difficult. The standard obviously
allows any combination of characters, such as <tt class=
"literal">_X_35Wxc_7</tt>, whether it is meaningful or not. We put
meaning into names by using combinations of words with accepted and
unambiguous meanings.</p>
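<p>The validity rule is mechanical enough to check in code. What follows
is a minimal sketch, not anything taken from the standard: the function
name <tt class="literal">is_valid_identifier</tt> is hypothetical, it
assumes the default 'C' locale so that <tt class="literal">isalpha()</tt>
accepts only the basic letters, and it makes no attempt to reject keywords
or reserved names.</p>

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if name follows the character rule for C identifiers
   (letters, digits and underscores, not starting with a digit),
   0 otherwise. Keywords and reserved names are NOT rejected. */
int is_valid_identifier(const char *name)
{
    size_t pos;

    if (name == NULL || name[0] == '\0')
        return 0;                   /* an empty name is not an identifier */

    /* First character: a letter or an underscore, never a digit. */
    if (!(isalpha((unsigned char)name[0]) || name[0] == '_'))
        return 0;

    /* Remaining characters: letters, digits or underscores. */
    for (pos = 1; name[pos] != '\0'; pos++) {
        if (!(isalnum((unsigned char)name[pos]) || name[pos] == '_'))
            return 0;
    }
    return 1;
}
```

<p>Under these assumptions <tt class="literal">_X_35Wxc_7</tt> passes the
check just as readily as <tt class="literal">hat_size</tt>, which is the
point: validity says nothing about meaning.</p>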
<p>The minimum length of a name (obviously) is one, but the maximum
length is implementation-dependent. ISO C guarantees at least
thirty-one significant characters for macro names or identifiers
with internal linkage, but only six (case insignificant) characters
for identifiers with external linkage. This latter restriction was
provided for compatibility with some old, but important (at the
time) linkers. In practice, the six character restriction is
obsolete and should be ignored unless you know there really is a
problem. (It is likely to be dropped from the next revision of the
standard.) Many implementations allow names much longer than
thirty-one characters, but still only treat the first thirty-one as
significant. Some allow more significant characters, but frankly, I
think thirty-one characters is already too long. I start to worry
if I see too many identifiers longer than twenty characters or so.
Expressions with long identifiers rapidly disappear off the right
side of the screen or have to be split across multiple lines, both
of which tend to make code less readable. Of course, if you think
an identifier genuinely needs more characters, by all means use
them, but be conservative.</p>
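<p>To see what a limit on significant characters means in practice, the
sketch below compares two names the way a restrictive linker might: only
the first few characters count and case is ignored. The function name
<tt class="literal">names_collide</tt> and its parameters are assumptions
made for illustration; real linkers differ in the details.</p>

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if the two names are indistinguishable when only the first
   `significant` characters are compared and case is ignored,
   0 if they can still be told apart. */
int names_collide(const char *first, const char *second, size_t significant)
{
    size_t pos;

    for (pos = 0; pos < significant; pos++) {
        int ch_first  = tolower((unsigned char)first[pos]);
        int ch_second = tolower((unsigned char)second[pos]);

        if (ch_first != ch_second)
            return 0;       /* they differ within the significant part */
        if (ch_first == '\0')
            return 1;       /* both names ended here: identical */
    }
    return 1;               /* identical over every significant character */
}
```

<p>With a limit of six, two names such as <tt class=
"literal">dbGetName</tt> and <tt class="literal">dbGetNumber</tt> collide,
because both begin <tt class="literal">dbGetN</tt>; with a limit of
thirty-one they are distinct.</p>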
<p>Every identifier has a punctuation style: that is, the way it
appears as opposed to what it says. A punctuation style is defined
by its use of underscores and capitals. There are a huge number of
possibilities, but 95% of C code is probably covered by the
following seven styles.</p>
<div class="orderedlist">
<ol type="1">
<li>
<p><tt class="literal">HAT_SIZE</tt></p>
</li>
<li>
<p><tt class="literal">HATSIZE</tt></p>
</li>
<li>
<p><tt class="literal">hat_size</tt></p>
</li>
<li>
<p><tt class="literal">Hat_Size</tt></p>
</li>
<li>
<p><tt class="literal">HatSize</tt></p>
</li>
<li>
<p><tt class="literal">hatSize</tt></p>
</li>
<li>
<p><tt class="literal">hatsize</tt></p>
</li>
</ol>
</div>
<p>The main purpose of punctuation should be to make the name
easier to read and for that reason, I don't like styles 2 and 7.
<tt class="literal">totalannualinterest</tt> is far too difficult
to decipher. That still leaves us, however, with five good
alternatives. You could choose just one style and use that for all
names, but as we have several possibilities, we can use them to
help differentiate between the various uses of identifiers
(variable, function, tags, typedefs, etc.). When reading code, it
is often useful to know the sort of language element a name
represents (i.e. is this a function or a macro?). By using different
styles consistently, we can provide some of this information. (We
will see other ways shortly.) There are very few de facto standards
for C naming conventions, but one which is almost universal is to
use all uppercase for macro names. I would recommend that you keep
to that and use style one or two for macros. The only other advice
I am going to give is, don't go overboard. It isn't necessary, or
desirable, to have different styles for everything. The most
important point is to decide what styles you will use for each type
of identifier and to use those styles consistently.</p>
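<p>As one illustration of assigning styles by kind of identifier, the
fragment below uses <tt class="literal">HAT_SIZE</tt> style for macros,
<tt class="literal">HatSize</tt> style for struct tags and typedefs,
<tt class="literal">hat_size</tt> style for variables and
<tt class="literal">hatSize</tt> style for functions. The allocation and
every name in it are assumptions made for the example; any consistent
allocation would serve equally well.</p>

```c
/* Hypothetical convention: one punctuation style per kind of identifier. */

#define MAX_HAT_SIZE 64             /* macro: HAT_SIZE style */

typedef struct HatOrder {           /* tag and typedef: HatSize style */
    int hat_size;                   /* variables and members: hat_size style */
    int hat_count;
} HatOrder;

/* function: hatSize style.
   Limits a requested size to the range 0..MAX_HAT_SIZE. */
int clampHatSize(int hat_size)
{
    if (hat_size > MAX_HAT_SIZE)
        return MAX_HAT_SIZE;
    if (hat_size < 0)
        return 0;
    return hat_size;
}
```

<p>A reader who knows the convention can tell at a glance that
<tt class="literal">MAX_HAT_SIZE</tt> is a macro and
<tt class="literal">clampHatSize</tt> is a function, without looking up
either declaration.</p>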
<p>Some time ago, I did a quick survey (on the accu.general mailing
list - thanks folks) of the styles that people prefer (I forgot to
include style 7) for five common classes of identifier. The results
are listed in Table 1 (rating out of 5):</p>
<div class="table"><a name="d0e78" id="d0e78"></a>
<p class="title c2">Table 1. Preferred punctuation styles by class of identifier (rating out of 5)</p>
<table summary="Table 1" border="1">
<tr>
<td> </td>
<td>Vars</td>
<td>Funcs</td>
<td>Object Macros</td>
<td>Function Macros</td>
<td>typedefs</td>
</tr>
<tr>
<td><tt class="literal">HATSIZE</tt></td>
<td>1.0</td>
<td>1.1</td>
<td>2.8</td>
<td>2.6</td>
<td>1.6</td>
</tr>
<tr>
<td><tt class="literal">HAT_SIZE</tt></td>
<td>1.1</td>
<td>1.1</td>
<td>4.2</td>
<td>3.5</td>
<td>2.2</td>
</tr>
<tr>
<td><tt class="literal">hat_size</tt></td>
<td>4.4</td>
<td>4.4</td>
<td>1.5</td>
<td>2.2</td>
<td>2.9</td>
</tr>
<tr>
<td><tt class="literal">Hat_Size</tt></td>
<td>2.0</td>
<td>2.2</td>
<td>1.1</td>
<td>1.6</td>
<td>2.0</td>
</tr>
<tr>
<td><tt class="literal">HatSize</tt></td>
<td>2.6</td>
<td>2.7</td>
<td>1.5</td>
<td>1.7</td>
<td>3.4</td>
</tr>
<tr>
<td><tt class="literal">hatSize</tt></td>
<td>3.7</td>
<td>3.7</td>
<td>2.6</td>
<td>2.7</td>
<td>2.3</td>
</tr>
</table>
</div>
<p>This is probably a pretty good reflection of the general use of
these styles in C programming. Some people derided the <tt class=
"literal">HatSize</tt> style as 'Pascal influence'. Personally, I
quite like it, partially because it is more compact than using
underscore separators, but there isn't a lot in it.</p>
<p>Now comes the task of putting meaning into the names by
carefully choosing words to put into them. C identifiers range
from single letters through single words to longer compounds of
two, three, or more words. Single letter names can be acceptable in
some cases, but not as often as many programmers seem to think. X
and Y may be appropriate for positions in a Cartesian co-ordinate
system, but not as temporary variables. It is widespread practice
to use i, j, k, etc. as simple loop counters. Unfortunately, many
programmers use these when more descriptive names would be helpful,
such as when they are also being used as array indices. My general
advice is to use single letter identifiers very sparingly.</p>
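<p>The difference is easiest to see where loop counters double as array
indices. The sketch below sums a small two-dimensional array using
descriptive index names; <tt class="literal">sumGrid</tt>,
<tt class="literal">ROW_COUNT</tt> and <tt class="literal">COL_COUNT</tt>
are hypothetical names chosen for the example.</p>

```c
#define ROW_COUNT 3
#define COL_COUNT 4

/* Sums every element of the grid. The index names say which dimension
   each subscript addresses, so grid[col][row] would look wrong at once;
   with i and j a transposed subscript is much easier to miss. */
int sumGrid(int grid[ROW_COUNT][COL_COUNT])
{
    int row;
    int col;
    int total = 0;

    for (row = 0; row < ROW_COUNT; row++) {
        for (col = 0; col < COL_COUNT; col++)
            total += grid[row][col];
    }
    return total;
}
```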
<p>There is always a balance between making an identifier as
meaningful as possible and keeping to a reasonable length. To get
maximum meaning in minimum length every part of the name must earn
its keep, so the first rule is, no weasel words! Weasel-words are
those whose meaning is so vague or general that they add nothing to
the meaning of the identifier. The common offender is Process... for
function names. Every function is doing some sort of processing. We
want to know exactly what that process is, so call it <tt class=
"literal">CalculateVolume()</tt>, not <tt class=
"literal">ProcessMeasurements()</tt>. Other weasel-words are
variable names like num or flag. Very few variables are so bereft
of meaning that you couldn't give them a more meaningful name than
these.</p>
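<p>A small sketch of the contrast follows. Only the two function names
come from the text above; the parameter list, the
<tt class="literal">_mm</tt> unit suffixes and the assumption that the
measurements describe a rectangular box are invented for the example.</p>

```c
/* Vague: the name says nothing about what is done to the measurements.

   double ProcessMeasurements(double length_mm, double width_mm,
                              double height_mm);
*/

/* Specific: the name states what the caller gets back.
   Returns the volume of a rectangular box in cubic millimetres. */
double CalculateVolume(double length_mm, double width_mm, double height_mm)
{
    return length_mm * width_mm * height_mm;
}
```

<p>At the call site, <tt class="literal">CalculateVolume(length_mm,
width_mm, height_mm)</tt> documents itself, whereas the weasel-worded
version forces the reader to open the function body.</p>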
<p>The other way to keep the length manageable is to use
abbreviations. Care needs to be taken, however, if meaning is not
to be lost. Abbreviations must be consistent. There should be a
list of acceptable, common abbreviations and some control over
project-specific ones. If there is a standard abbreviation, then
use it unless there is a very good reason not to. Don't use
different abbreviations for the same word, or use abbreviations
sporadically. For example, if your standard abbreviation for
'number' is 'num', don't mix variable names such as <tt class=
"literal">NumHats</tt>, <tt class="literal">NmBlocks</tt> and
<tt class="literal">NumberOfShirts</tt>. Be very careful that your
abbreviations aren't ambiguous, either. Does wt stand for weight,
white or something else? Defining standard abbreviations helps to
resolve such problems.</p>
<p>Where there is no commonly agreed abbreviation for a word, it is
often possible to create one by dropping some or all of the vowels,
or by choosing those letters which contribute most to the
pronunciation of the word (i.e. drop any 'silent', or almost
silent, letters). As with all abbreviations, care is needed to
avoid ambiguity. If you are including measurement units in a name,
use the scientifically accepted abbreviations, including the
correct case (e.g. <tt class="literal">power_kW</tt>, <tt class=
"literal">height_mm</tt>).</p>
<p>Spellings should be consistent across all identifiers. Avoid
having <tt class="literal">EyeColor</tt>, but <tt class=
"literal">HairColour</tt>. Also avoid names which only differ by a
single character. It is far too easy for a typing error to
introduce a subtle bug otherwise. Nor should you use names which
have identical spelling and rely on case-sensitivity to separate
them. Don't have a variable called <tt class="literal">max_temp</tt> and a macro called
<tt class="literal">MAX_TEMP</tt>. Beware, too, of spellings which
may confuse 0 (zero) with O (uppercase-O), 1 (one) with l
(lowercase-L) or 2 (two) with Z (uppercase-Z). (The similarity is
much worse in certain fonts.) One example from my experience is a
variable for serial line zero, called sl0 (es-el-zero), which many
people thought was called 'es-ten'.</p>
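<p>A convention checker could flag such pairs automatically. The sketch
below is one possible approach, not an established tool: it treats two
names as confusable when they have different spellings but become
identical once case is folded and the look-alike pairs 0/O, 1/l and 2/Z
are merged. The names <tt class="literal">canonical_glyph</tt> and
<tt class="literal">names_confusable</tt> are hypothetical.</p>

```c
#include <ctype.h>

/* Maps a character to a canonical form so that look-alikes compare
   equal: case is folded, and 0/O, 1/l and 2/Z count as the same glyph. */
static int canonical_glyph(int ch)
{
    ch = tolower((unsigned char)ch);
    switch (ch) {
    case '0': return 'o';
    case '1': return 'l';
    case '2': return 'z';
    default:  return ch;
    }
}

/* Returns 1 if the two names are spelt differently but could easily be
   mistaken for each other, 0 otherwise (including when they are the
   same name). */
int names_confusable(const char *first, const char *second)
{
    int spelt_differently = 0;

    while (*first != '\0' && *second != '\0') {
        if (*first != *second)
            spelt_differently = 1;
        if (canonical_glyph(*first) != canonical_glyph(*second))
            return 0;       /* a genuinely different character */
        first++;
        second++;
    }
    /* Confusable only if both names ended together. */
    return spelt_differently && *first == *second;
}
```

<p>Such a check would report <tt class="literal">max_temp</tt> against
<tt class="literal">MAX_TEMP</tt>, and <tt class="literal">sl0</tt>
against <tt class="literal">s10</tt>, while leaving clearly distinct
names alone.</p>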
<p>Also, don't forget to consider the way a name will sound. Although
code is a written medium, we often have to discuss it in meetings
such as code reviews. Try not to create names which look different,
but sound similar.</p>
<p>As well as deciding which words to put in the name, you should
also think about the order in which you place them. Some writers
say it is better to put the most important word first. Although I
would generally agree with this, I balance this against the need
for names to be comfortable to read. Putting the most important
word first can sometimes result in an awkward-sounding name. As
with so many other factors, the most important thing is to be
consistent. If you decide to write <tt class="literal">Max...</tt>
rather than <tt class="literal">...Max</tt>, then use that style
consistently. Don't use <tt class="literal">MaxTemp</tt> in one
place and <tt class="literal">HeightMax</tt> in another.</p>
<p>Although I said above that every part of a name should earn its
keep, I do sometimes allow a little redundancy to creep in. I do
this mainly to make searching for names easier. Let me give you an
example. Code I worked on recently had a function called
<tt class="literal">Intercom()</tt>. There was also a large family of related functions
called <tt class="literal">IntercomHold()</tt>, <tt class=
"literal">IntercomCancel()</tt>, etc. The problem was that a search
for <tt class="literal">Intercom</tt> matched dozens of instances
that I wasn't really looking for. I could get around this to a
certain extent by clever grep patterns, but not entirely.
(Maintenance programmers tend to grep a lot!) If the function had
been called <tt class="literal">IntercomRqst()</tt>, say, it would
have made life a bit easier. Since it is usually shorter names that
suffer from this problem, adding a few redundant characters isn't
too big a problem.</p>
<p>One aspect of naming that I haven't talked about yet is the use
of affixes. Affixes can be used to provide additional information
about the purpose or nature of an identifier in addition to the
pure meaning. (If it adds to the meaning, I consider it to be part
of the base name, not an affix.) We have already looked at doing
this using different punctuation styles. Affixes can be used as an
alternative or a supplement to that method. The three primary uses
for affixes are to include type information, to associate an
identifier with a particular functional unit and to indicate what
language element an identifier represents.</p>
<p>The most common example of including type information is the
notorious Hungarian notation, which is loved and hated in roughly
equal measure. (I am only mentioning the possibilities, not
discussing the merits!) Hungarian notation prefixes names with
letters which indicate the type of a variable, or the return type
of a function. Thus an integer with a base name of <tt class=
"literal">WallHeight</tt> becomes <tt class=
"literal">iWallHeight</tt>. A similar thing may be done with
suffixes, though this is traditionally much more limited. The
canonical example is using a suffix <tt class="literal">_ptr</tt>
to indicate that the name refers to a pointer, e.g. <tt class=
"literal">output_file_ptr</tt> is a pointer to the output file.</p>
<p>Affixes are also used to indicate logical groupings. Well
designed code usually has some sort of modular structure. For
example, all the database access functions will be in a single
module. By prefixing all database functions with, say, db, we can
see that <tt class="literal">dbGetName()</tt>, <tt class=
"literal">dbGetNumber()</tt> and <tt class=
"literal">dbPutDetails()</tt> are all related functions. This is
traditionally done with prefixes rather than suffixes.</p>
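<p>A minimal sketch of such a module is shown below. Only the three
function names come from the text; their signatures, the single-record
storage behind them and the <tt class="literal">DB_NAME_MAX</tt> limit are
assumptions made to keep the example self-contained.</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical one-record 'database' module. Every public function
   carries the db prefix, so its home module is obvious at the call site. */

#define DB_NAME_MAX 32

static char db_name[DB_NAME_MAX] = "";
static int  db_number = 0;

/* Stores a name and number. Returns 0 on success, or -1 if the name is
   missing or too long to fit. */
int dbPutDetails(const char *name, int number)
{
    if (name == NULL || strlen(name) >= DB_NAME_MAX)
        return -1;
    strcpy(db_name, name);          /* safe: length checked above */
    db_number = number;
    return 0;
}

/* Returns the stored name (an empty string if nothing has been stored). */
const char *dbGetName(void)
{
    return db_name;
}

/* Returns the stored number (0 if nothing has been stored). */
int dbGetNumber(void)
{
    return db_number;
}
```

<p>A search for <tt class="literal">db</tt> followed by a capital letter
then finds every use of the module's interface, which is a useful side
effect for maintenance work.</p>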
<p>The third common use of affixes is to indicate what sort of
language element the identifier is. The most common application is
probably using a suffix of <tt class="literal">_t</tt> to indicate
user-defined types. This is the method used in the standard for
types such as <tt class="literal">size_t</tt> and <tt class=
"literal">wchar_t</tt>. This is not a bad practice to follow,
except that Posix reserves names with this suffix. That is not a problem if you
aren't programming for Posix, of course. If you are, or you want to
be maximally portable, there is nothing to stop you defining your
own suffix, such as <tt class="literal">_ty</tt>. Other uses in
this category include suffixes to identify struct or union
tags.</p>
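<p>The fragment below sketches how a project-specific suffix might look.
The <tt class="literal">_ty</tt> suffix is the one suggested above; the
<tt class="literal">_tag</tt> suffix for struct tags and all of the type,
member and function names are assumptions made for the example.</p>

```c
/* Hypothetical project convention: _ty marks a user-defined type and
   _tag marks a struct tag, leaving the _t suffix to the standard
   library and Posix. */

typedef unsigned long hat_count_ty;

typedef struct hat_order_tag {
    hat_count_ty quantity;
    int          size_mm;
} hat_order_ty;

/* Adds up the quantities of the first order_count orders. */
hat_count_ty total_quantity(const hat_order_ty *orders,
                            unsigned int order_count)
{
    hat_count_ty total = 0;
    unsigned int order_index;

    for (order_index = 0; order_index < order_count; order_index++)
        total += orders[order_index].quantity;
    return total;
}
```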
<div class="sidebar">
<p>A classic example of the problems that this kind of affix usage
creates is shown by <tt class="literal">wchar_t</tt>. In C it is
usually provided by a <tt class="literal">typedef</tt> - hence the
<tt class="literal">_t</tt>. This will not work in C++ where wide
characters must be a true type. To maintain compatibility we
sacrifice meaning. Quite apart from any other reason, this kind of
maintenance is why so many now deprecate Hungarian notation.</p>
</div>
<p>The last point I will mention is reserved identifiers. The C
standard (and Draft C++ Standard) reserve quite a lot of possible
names. I am not going to give a full list here, but if you want a
rough and ready rule, don't use any identifier appearing in a
standard library header and never start an identifier with an
underscore. (For C, it is best to steer clear of C++ keywords,
too.) That still leaves a few holes, but it is a jolly sight better
than nothing. If you are really keen, you might also avoid
identifiers reserved by related standards such as Posix.</p>
<p>This article is necessarily only a brief summary of the issues
involved. If you would like more guidance on how to set up a good
naming convention, a couple of good books are &quot;<span class=
"emphasis"><em>C Elements of Style</em></span>&quot;, by Steve Oualline
and &quot;<span class="emphasis"><em>C-Style: Standards and
Guidelines</em></span>&quot; by David Straker. The most important thing
is not so much what you put in your naming convention, but that you
have one and use it consistently.</p>
<p>I don't pretend that using a good naming convention will
eliminate all your bugs or make your coffee taste better, but it
can make your job considerably more comfortable.</p>
</div>
</p>
</div>
</channel>
</rss>
