    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Life Stories</title>
        <link>http://accu.org/index.php/journals/739</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 10, #6 - Sep 1998 + Programming Topics</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Life Stories</h1>
<p><strong>Author:</strong>&nbsp;Administrator</p>
<p>
<strong>Date:</strong> 03 September 1998 13:15:27 +01:00 or Thu, 03 September 1998 13:15:27 +01:00</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e20" id="d0e20"></a></h2>
</div>
<p>This article is aimed at everyone except OO and C++ experts. If
you're a novice C programmer and the following does nothing for
you, then I've failed. If you're an expert, then go ahead and tear
it to pieces.</p>
<p>A program is essentially code acting on data. What we're
concerned with here is the lifetime of any part of the data in
relation to the flow of the code. Some data has a lifetime that is
independent of that of a program instance. A database stored on a
disk drive is a good example. An e-mail message bouncing around the
Internet is another. This kind of data isn't the subject of this
article. I am only concerned with what goes on 'inside' a program,
i.e. that which is in memory and accessible to your code!</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e26" id="d0e26"></a>Global
data</h2>
</div>
<p>In the bad old days both in C programming and elsewhere,
'global' data was commonplace. In C, this is data that is declared
and defined outside of a function. The lifetime of global data is
essentially that of the program. Global data is generally
considered to be a 'bad thing'. In particular, using global data as
a means of communication between different parts of the program is
bad. Occasionally, it is useful to have just one storage location
that is accessible from anywhere in the program. For example,
number or currency formats may change if the regional settings are
changed. Even so, providing functions for this purpose rather than
making the data visible globally is better style.</p>
<p>Strictly speaking we should distinguish between data with
external and internal linkage. The latter is really global only to
the translation unit and is still basically a bad thing.</p>
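<p>As a rough sketch of the 'functions rather than visible data' style,
the fragment below hides a currency format behind two accessor
functions. The names <tt class="function">GetCurrencyFormat</tt> and
<tt class="function">SetCurrencyFormat</tt>, and the format strings,
are invented for illustration. The data still has internal linkage,
but no other code needs to name it.</p>

```cpp
#include <string>

namespace {
    // Hypothetical setting: visible only inside this translation unit.
    std::string currencyFormat = "$#,##0.00";
}

// Read access for the rest of the program.
const std::string& GetCurrencyFormat() {
    return currencyFormat;
}

// The single place where a change can be checked before it is accepted.
void SetCurrencyFormat(const std::string& format) {
    if (!format.empty())
        currencyFormat = format;
}
```

<p>Because every change goes through one function, a bad value can be
rejected in one place instead of being hunted down throughout the
program.</p>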
<p>Eliminate global data.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e35" id="d0e35"></a>Local data</h2>
</div>
<p>Variables declared inside function bodies have a scope that is
limited to that function. In fact, the scope is the enclosing block
in C, so that a variable may be restricted to only part of a
function. The scope limits your use of a variable but doesn't in
itself define its lifetime. If a local variable is marked as
<tt class="literal">static</tt>, it keeps its value even after
execution has left the enclosing block. A variable that isn't
marked as <tt class="literal">static</tt> lives from its
declaration to the end of the block. In C (before C99) declarations
must come at the start of a block, so this is effectively the whole
block; in C++ a declaration may appear anywhere in the block.</p>
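<p>A small sketch of the difference, using an invented function
<tt class="function">NextTicket</tt>: both variables have the same
scope, but only the <tt class="literal">static</tt> one survives from
one call to the next.</p>

```cpp
// Hypothetical helper: hands out 1, 2, 3, ... on successive calls.
int NextTicket() {
    static int lastTicket = 0;    // initialised once, lives until the program ends
    int issued = lastTicket + 1;  // ordinary local: created afresh on every call
    lastTicket = issued;
    return issued;
}
```

<p>Neither variable can be named outside the function, yet
<tt class="varname">lastTicket</tt> remembers its value between
calls.</p>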
<p>As well as the above-mentioned declarations, parameters passed
by value have similar scope and lifetimes.</p>
<p>In the simplest cases, when working with simple values, the
lifetime of a variable is of academic interest only; its scope is
the dominant consideration when coding:</p>
<pre class="programlisting">
int GreatestCommonDivisor(int m, int n) {
  int remainder;
  while(n) {
    remainder = m % n;
    m = n;
    n = remainder;
  }
  return m;
}
</pre>
<p>The fact that the loop is working with two parameters and a
local variable that exist only during the call is no great concern.
The fact that these variables may be modified without fear of side
effects on the rest of the program is reassuring however.</p>
<p>Now for some recursion:</p>
<pre class="programlisting">
int Factorial( int m ) {
  int result;
  if( m &gt; 1 )
    result = m * Factorial(m - 1);
  else
    result = 1;
  return result;
}
</pre>
<p>Here the scope and the lifetime of the parameter '<i class=
"parameter"><tt>m</tt></i>' and the local '<span class=
"returnvalue">result</span>' are within the function call. So
calling <tt class="literal">Factorial(GreatestCommonDivisor(30,
12))</tt> leads to a sequence of calls, with storage being set
aside at each call.</p>
<p>So first <tt class="literal">GreatestCommonDivisor(30, 12)</tt>
is called with storage being used for 'm' and 'n' initially
containing copies of the literal values 30 and 12. This storage
along with that used for the remainder variable is modified by the
loop until the first parameter has been reduced to 6. This value is
returned from the function and the storage used by the function is
discarded.</p>
<p>This value (6) is then passed to the <tt class=
"function">Factorial</tt> function, inside of which, storage is set
aside for the result variable. The value in '<tt class=
"varname">result</tt>' is eventually passed back to the calling
code to be dealt with as it may.</p>
<p>With a parameter of 6, the function passes a value of 5 to
<tt class="function">Factorial</tt>. Inside this inner invocation,
the outer invocation's variables remain untouched; fresh storage is
used for its own copy of the local data. Successive calls are made
with values of 4, 3, 2 and 1, at which point the conditional stops
the recursion. The calls then unwind, building the result at each
level, until the first invocation finally returns 720.</p>
<p>Although it is more efficient to write a non-recursive
<tt class="function">Factorial</tt> with a simple loop, the
allocation of storage for these variables is still very efficient.
Each function invocation has a 'stack frame', which is simply an
area on the call stack, accessed by offsets from a frame pointer.
Space for local variables is allocated merely by adjusting the stack
pointer to leave room for them. Nothing is written to that space,
which is why local variables have arbitrary values until you
initialise them.</p>
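<p>For comparison, the non-recursive version mentioned above might
look like this. The name <tt class="function">FactorialLoop</tt> is
chosen here only to keep it apart from the recursive function.</p>

```cpp
// Loop-based factorial: one stack frame, however large m is.
int FactorialLoop(int m) {
    int result = 1;
    while (m > 1) {
        result *= m;   // m is a by-value parameter, so modifying it is safe
        --m;
    }
    return result;
}
```

<p>Like the recursive version, it returns 1 for any argument below
2.</p>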
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e92" id="d0e92"></a>The Heap</h2>
</div>
<p>C provides heap allocation library functions (<tt class=
"function">malloc</tt> etc), C++ provides <tt class=
"function">new</tt> and <tt class="function">delete</tt> operators.
Both do much the same job, in that you can allocate space away from
the call stack and access the allocated space by using pointers.
Although local pointers have a lifetime limited by their scope, the
data pointed at has a lifetime entirely under your control. Without
even thinking about classes and the like, life has suddenly become
much more complicated. The main thing that makes life difficult is
that simple pointers are merely values and may be copied freely.
Consider a simple use of the heap:</p>
<pre class="programlisting">
  char * pBuffer = new char[buff_size];
  //... use pBuffer
  delete [] pBuffer;
</pre>
<p>For C just substitute calls to <tt class="function">malloc</tt>
and <tt class="function">free</tt>. This is pretty innocuous stuff.
Using C++ as it should be, if the <tt class="function">new</tt>
operator fails, an exception will be raised, so the subsequent use
of the buffer would be skipped.</p>
<p>For every successful call of <tt class=
"function">new</tt>/<tt class="function">malloc</tt>, there should
be a corresponding call to <tt class=
"function">delete</tt>/<tt class="function">free</tt>. If memory is
allocated repeatedly and not freed, the result is a memory leak.
That is to say that the heap will slowly fill up until there is no
memory left for your or anyone else's program. This is a bad thing.
Even more catastrophic (at least for your program) is what happens
if you access data after it's been freed. Actually, <span class=
"emphasis"><em>not</em></span> crashing is probably the most
dangerous behaviour, because the damage goes unnoticed! Freeing the
same heap object twice is just as wrong and almost always results in
a nasty crash.</p>
<p>There are two ways that the above fragment could give you grief.
Firstly, if an exception is thrown and it is not caught between the
allocation and freeing of the buffer, a memory leak will occur. The
other case is if a copy of the pointer is taken and then used after
the last line above. That's aside from direct use of <tt class=
"varname">pBuffer</tt> after the <tt class="function">delete</tt>
while it's still in scope or overflowing the buffer of course!</p>
<p>For now, note that the heap must be used with care.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e146" id="d0e146"></a>Objects</h2>
</div>
<p>C++ brings the concept of constructors and destructors into the
fray. Objects in C++ (<tt class="literal">struct</tt>s and
<tt class="literal">class</tt>es) have lifetimes that start with a
constructor and end with a destructor. Lifetimes of such objects
are like those of the plain data types above. If a class is
instantiated as a simple variable in a block, it will live to the
end of the block. If it is instantiated using the <tt class=
"literal">new</tt> operator, it will live till it is <tt class=
"literal">delete</tt>d. The former case has two benefits.</p>
<p>Firstly, you don't have to destroy the instance explicitly, that
will happen automatically. This eliminates one source of
errors.</p>
<p>Secondly, its destructor will be called even if an exception is
thrown that isn't caught in that function. It has to be mentioned
that this is only true for 'modern' compilers, i.e. those that
support exceptions properly rather than tacky macros.</p>
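<p>The second benefit can be sketched with a small guard class. The
class <tt class="classname">CharBuffer</tt> and its
<tt class="varname">liveBuffers</tt> counter are invented for this
illustration; the counter exists only so that the effect can be
observed.</p>

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical guard: owns a heap buffer for as long as it is in scope.
class CharBuffer {
    char* pBuffer;
    CharBuffer(const CharBuffer&);             // declared but not defined:
    CharBuffer& operator=(const CharBuffer&);  // copying is not allowed
public:
    static int liveBuffers;                    // bookkeeping for the demonstration
    explicit CharBuffer(std::size_t size) : pBuffer(new char[size]) { ++liveBuffers; }
    ~CharBuffer() { delete [] pBuffer; --liveBuffers; }
    char* Get() { return pBuffer; }
};
int CharBuffer::liveBuffers = 0;

// Throws part-way through and never reaches an explicit clean-up.
void UseBufferAndFail() {
    CharBuffer buffer(64);
    buffer.Get()[0] = 'x';
    throw std::runtime_error("something went wrong");
}

// Returns true if the exception arrived and the buffer had already been released.
bool BufferReleasedAfterThrow() {
    try {
        UseBufferAndFail();
    } catch (const std::runtime_error&) {
        return CharBuffer::liveBuffers == 0;
    }
    return false;
}
```

<p>The destructor runs while the exception unwinds the stack, so the
buffer is freed although <tt class="function">UseBufferAndFail</tt>
contains no <tt class="literal">delete</tt> of its own.</p>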
<p>So what am I trying to say? Obviously, you will want to create
objects on the heap and yet we know that doing so can lead to
memory leaks if you fail to delete them at the proper time. So how
can this be done safely?</p>
<p>Objects can contain other objects. It is simple to compose a
class with instances of other classes. It is nearly as simple to
contain pointers to classes and instantiate them on the heap. The
latter case is likely to be forced on you when you need to create a
number of objects that is determined at run time. A typical example
would be to create a linked list of objects populated from a file,
or a varying number of open documents in an editor.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e171" id=
"d0e171"></a>Ownership</h2>
</div>
<p>It is a common style to compose a class of many different
objects. Thus you might declare:</p>
<pre class="programlisting">
class MyObject {
  AnObject oneObject;
  AnObject anotherObject;
public:
  MyObject(int a, int b);
};
</pre>
<p>The AnObject instances must be initialised on construction.
Supposing their constructor requires an integer parameter, the
<tt class="classname">MyObject</tt> constructor will look something
like this:</p>
<pre class="programlisting">
MyObject::MyObject(int a, int b)
        : oneObject(a),
          anotherObject(b)
{}
</pre>
<p>Alternatively, you might want to create the objects
independently on the heap. Suppose that we were writing an
old-fashioned linked list class.</p>
<pre class="programlisting">
class ListOfAnObjects {
  struct AnObjectNode {
    AnObject *pItem;
    AnObjectNode *pNext;
  };
  AnObjectNode *pHeadNode;
public:
  ListOfAnObjects() : pHeadNode(0) {}
  ~ListOfAnObjects();
  // insertion and deletion functions...
  // iterator class...
};
</pre>
<p>In this case the constructor simply initialises to an empty list
- a null-terminated singly linked list.</p>
<p>The insertion functions will create new <tt class=
"classname">AnObjectNode</tt> instances with the <tt class=
"literal">new</tt> operator. The destructor will have the
responsibility of deleting the nodes at the very least:</p>
<pre class="programlisting">
ListOfAnObjects::~ListOfAnObjects() {
  while( pHeadNode ) {
    AnObjectNode *pThisNode = pHeadNode;
    pHeadNode = pHeadNode-&gt;pNext;
    delete pThisNode;
  }
}
</pre>
<p>Please note that the above loop maintains a valid list while it
is gobbling it up. This can be quite important in more complex
situations. But anyway&hellip;</p>
<p>Assuming that the other methods of this class are also correct,
this class isn't going to cause memory leaks by leaving orphaned
list nodes littering the heap. On the other hand, what about the
<tt class="classname">AnObject</tt> instances?</p>
<p>As presented, this class can maintain a list of pointers to
these objects but their lifetimes are independent of the list. That
is, users of the list class must manage the lifetimes of these
objects.</p>
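<p>To make this concrete, here is one possible filling-in of the
class. The member functions <tt class="function">PushFront</tt> and
<tt class="function">Count</tt>, the <tt class="varname">liveNodes</tt>
counter and the minimal <tt class="classname">AnObject</tt> are all
assumptions added for illustration; the destructor is the one shown
above.</p>

```cpp
// Hypothetical stand-in for the article's AnObject.
struct AnObject {
    int value;
    explicit AnObject(int v) : value(v) {}
};

class ListOfAnObjects {
    struct AnObjectNode {
        AnObject *pItem;
        AnObjectNode *pNext;
    };
    AnObjectNode *pHeadNode;
    ListOfAnObjects(const ListOfAnObjects&);             // copying is not
    ListOfAnObjects& operator=(const ListOfAnObjects&);  // supported in this sketch
public:
    static int liveNodes;            // demonstration only: nodes currently on the heap
    ListOfAnObjects() : pHeadNode(0) {}
    ~ListOfAnObjects();
    void PushFront(AnObject *pItem); // the list stores the pointer, not a copy
    int Count() const;
};
int ListOfAnObjects::liveNodes = 0;

void ListOfAnObjects::PushFront(AnObject *pItem) {
    AnObjectNode *pNode = new AnObjectNode;
    pNode->pItem = pItem;
    pNode->pNext = pHeadNode;
    pHeadNode = pNode;
    ++liveNodes;
}

int ListOfAnObjects::Count() const {
    int count = 0;
    for (const AnObjectNode *p = pHeadNode; p; p = p->pNext)
        ++count;
    return count;
}

ListOfAnObjects::~ListOfAnObjects() {
    while (pHeadNode) {
        AnObjectNode *pThisNode = pHeadNode;
        pHeadNode = pHeadNode->pNext;
        delete pThisNode;            // the node goes; the AnObject is left alone
        --liveNodes;
    }
}
```

<p>When a list built this way goes out of scope, every node is freed
but the <tt class="classname">AnObject</tt> instances it pointed at
are exactly as their real owner left them.</p>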
<p>Before rushing ahead, let's look at some of the benefits of what
we've got so far.</p>
<p>Firstly, if a list class like the above is instantiated as part
of another class its destructor is sure to be called, so long as
the enclosing class' destructor is called.</p>
<p>Secondly, if it is declared locally in a block of code, its
destructor is sure to be called, even if an exception is raised in
the intervening code.</p>
<div class="note c2">
<h3>Note</h3>
<p>For a useful example of this benefit, look at using <tt class=
"classname">auto_ptr</tt> as a means of looking after local
pointers to heap objects.</p>
</div>
<p>Assuming that the other member functions are correctly written,
the list class is able to take full charge of the lifetimes of the
node instances. This power of life and death over objects is
usually termed 'ownership'. The list class owns all its list nodes.
This is what you would expect, as the sole purpose of these nodes
is to support the collection of objects in a list.</p>
<p>Can we take the concept of ownership further and have the list
class own the <tt class="classname">AnObject</tt> instances that it
is listing?</p>
<p>Extending the class to do this is simple: wherever we delete a
node, we could delete the <tt class="classname">AnObject</tt> at
the same time, e.g.:</p>
<pre class="programlisting">
delete pThisNode;
delete pThisNode-&gt;pItem; // *@~%?
</pre>
<p>Just kidding, obviously the order of these lines should be
reversed!</p>
<p>The good thing about this approach is that you don't have to
worry about deleting your <tt class="classname">AnObject</tt>
instances, the list class will do it for you. The bad thing is that
only one instance of the list class can reference any one AnObject
instance.</p>
<p>One refinement that I have found useful is to give the list
class an ownership attribute. That is, it may or may not be an
owner. This allows you to create a master list of objects that owns
them and outlives all other lists and have other lists that simply
reference the objects without owning them. This approach does avoid
unnecessary copying of objects<sup>[<a name="d0e245" href=
"#ftn.d0e245" id="d0e245">1</a>]</sup>.</p>
<p>Don't use this idea unless you are confident that it reflects
your requirements accurately. If you need finer grained control of
such objects' lifetimes, this becomes unmanageable.</p>
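<p>A minimal sketch of the ownership attribute, assuming it is fixed
when the list is constructed. The <tt class="varname">isOwner</tt>
flag, <tt class="function">PushFront</tt> and the
<tt class="varname">liveObjects</tt> counter are invented names, and
the counter is there only to make the effect visible.</p>

```cpp
// Hypothetical AnObject that counts its live instances.
struct AnObject {
    static int liveObjects;
    int value;
    explicit AnObject(int v) : value(v) { ++liveObjects; }
    ~AnObject() { --liveObjects; }
};
int AnObject::liveObjects = 0;

class ListOfAnObjects {
    struct AnObjectNode {
        AnObject *pItem;
        AnObjectNode *pNext;
    };
    AnObjectNode *pHeadNode;
    bool isOwner;                    // true: this list deletes what it lists
    ListOfAnObjects(const ListOfAnObjects&);             // copying is not
    ListOfAnObjects& operator=(const ListOfAnObjects&);  // supported in this sketch
public:
    explicit ListOfAnObjects(bool owner) : pHeadNode(0), isOwner(owner) {}
    ~ListOfAnObjects() {
        while (pHeadNode) {
            AnObjectNode *pThisNode = pHeadNode;
            pHeadNode = pHeadNode->pNext;
            if (isOwner)
                delete pThisNode->pItem;   // the item first, then its node
            delete pThisNode;
        }
    }
    void PushFront(AnObject *pItem) {
        AnObjectNode *pNode = new AnObjectNode;
        pNode->pItem = pItem;
        pNode->pNext = pHeadNode;
        pHeadNode = pNode;
    }
};
```

<p>A master list would be created with
<tt class="literal">ListOfAnObjects(true)</tt> and must outlive every
list created with <tt class="literal">ListOfAnObjects(false)</tt> that
refers to the same objects.</p>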
<p>Another extension to the ownership concept is to allow ownership
to be passed from one owner to another (see <tt class=
"classname">auto_ptr</tt> again). In the case of a list class, this
implies that a method must be provided that can remove an object
from the list without destroying it, even when the destructor would
do so. Such a method might be called '<tt class=
"methodname">relinquish</tt>'. I think that passing ownership
around is something to consider only in unusual cases<sup>[<a name=
"d0e266" href="#ftn.d0e266" id="d0e266">2</a>]</sup>. For example,
it is a useful concept when an object wants to 'commit suicide' and
pass itself on to something that will destroy it later at its own
convenience.</p>
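<p>One way such a method could look is sketched below for an
always-owning list. The class name <tt class="classname">OwningList</tt>
and the member <tt class="function">RelinquishFront</tt> are
assumptions made for illustration, not part of the class described
above.</p>

```cpp
// Hypothetical AnObject that counts its live instances.
struct AnObject {
    static int liveObjects;
    int value;
    explicit AnObject(int v) : value(v) { ++liveObjects; }
    ~AnObject() { --liveObjects; }
};
int AnObject::liveObjects = 0;

class OwningList {
    struct AnObjectNode {
        AnObject *pItem;
        AnObjectNode *pNext;
    };
    AnObjectNode *pHeadNode;
    OwningList(const OwningList&);             // copying is not supported
    OwningList& operator=(const OwningList&);  // in this sketch
public:
    OwningList() : pHeadNode(0) {}
    ~OwningList() {
        while (pHeadNode)
            delete RelinquishFront();          // destroy whatever is still owned
    }
    void PushFront(AnObject *pItem) {          // the list takes ownership of pItem
        AnObjectNode *pNode = new AnObjectNode;
        pNode->pItem = pItem;
        pNode->pNext = pHeadNode;
        pHeadNode = pNode;
    }
    // Removes the first item without destroying it; the caller becomes its owner.
    // Returns a null pointer if the list is empty.
    AnObject *RelinquishFront() {
        if (!pHeadNode)
            return 0;
        AnObjectNode *pThisNode = pHeadNode;
        pHeadNode = pHeadNode->pNext;
        AnObject *pItem = pThisNode->pItem;
        delete pThisNode;                      // only the node is freed here
        return pItem;
    }
};
```

<p>Whoever calls <tt class="function">RelinquishFront</tt> takes on
the duty to delete the returned object, which is precisely the kind
of hand-over that makes transferred ownership easy to get wrong.</p>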
<p>The list class in non-owner mode is an example of what is
generally known as a 'collection class'. Ownership is a property of
container classes, although they also are usually responsible for
the creation of the objects as well as their destruction. Container
semantics are basically value based rather than reference based and
are exemplified by the STL. So go and read a good book on STL if
you want to know more.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e276" id="d0e276"></a>Reference
counting</h2>
</div>
<p>A more sophisticated approach to managing object lifetimes is to
track references to each object, so that destruction occurs only
when there are no references to an object. Imagine something like
the following:</p>
<pre class="programlisting">
void SomeFunction( AWordList &amp; outerList ) {
  AWordList list;                         // ref. count:
  PtrAWord pWord = new AWord(&quot;Hello&quot;);    // hello: 1
  list.Add(pWord);                        // hello: 2
  pWord = new AWord(&quot;World&quot;);             // world: 1, hello: 1
  list.Add(pWord);                        // world: 2
  outerList.Add(pWord);                   // world: 3
}
</pre>
<p>Where <tt class="classname">AWord</tt> is a class that holds a
single word and is reference counted by some means.</p>
<p>The rules are that when you assign to a pointer to <tt class=
"classname">AWord</tt>, the reference count is incremented and when
you destroy the pointer or assign something else to it, the count
is decremented.</p>
<p>The two <tt class="classname">AWord</tt> instances that are
created in the above fragment have different lives.</p>
<p>First, the 'hello' instance is created and assigned to
'<tt class="varname">pWord</tt>', this gives it an initial
reference count of one. Then it is also added to a local list for
reasons that will never become clear. This operation increases its
reference count to two, i.e. both '<tt class="varname">pWord</tt>'
and 'list' refer to it.</p>
<p>Now a new <tt class="classname">AWord</tt> instance is created
initialised to 'World' and its reference count is set to one as the
assignment is made to '<tt class="varname">pWord</tt>', then
because '<tt class="varname">pWord</tt>' no longer references the
'hello' instance, its reference count is decreased to one (just the
list owns it). Note the order of operations - it matters. Why?</p>
<p>Similarly, the 'World' instance is added both to the local list
and to <tt class="varname">outerList</tt>, a list passed in by
reference that exists outside the scope of this function. These
additions bring its reference count up to three.</p>
<p>The function now finishes, causing the destruction of
'<tt class="varname">list</tt>' and '<tt class=
"varname">pWord</tt>'. The destruction of <tt class=
"varname">list</tt> decrements the reference counts of both
<tt class="classname">AWord</tt> instances, so the count of the
'hello' instance drops to zero. The destruction of <tt class=
"varname">pWord</tt> decrements the count of the 'World' object as
well, leaving it with a reference count of one, corresponding to the
<tt class="varname">outerList</tt> referencing it.</p>
<p>When a reference count drops to zero, it causes the destruction
of the object.</p>
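<p>The walk-through can be checked with a present-day standard
library, on the assumption that <tt class="classname">std::shared_ptr</tt>
(C++11, so not available when this article was written) stands in for
<tt class="classname">PtrAWord</tt> and
<tt class="classname">std::vector</tt> for
<tt class="classname">AWordList</tt>. The
<tt class="classname">Counts</tt> structure is invented purely to
report what was observed.</p>

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical AWord: holds a single word.
struct AWord {
    std::string text;
    explicit AWord(const std::string& t) : text(t) {}
};

typedef std::shared_ptr<AWord> PtrAWord;   // assumption: stands in for the smart pointer
typedef std::vector<PtrAWord> AWordList;   // assumption: stands in for the list class

// Reference counts sampled just before the function returns.
struct Counts {
    long hello;
    long world;
};

Counts SomeFunction(AWordList& outerList) {
    AWordList list;
    PtrAWord pWord(new AWord("Hello"));    // hello: 1
    list.push_back(pWord);                 // hello: 2
    pWord = PtrAWord(new AWord("World"));  // world: 1, hello: 1
    list.push_back(pWord);                 // world: 2
    outerList.push_back(pWord);            // world: 3
    Counts counts;
    counts.hello = list[0].use_count();    // only 'list' still refers to it
    counts.world = pWord.use_count();      // pWord, list and outerList
    return counts;
}
```

<p>After the call, the only remaining reference to the 'World' object
is the one held by <tt class="varname">outerList</tt>, and the 'hello'
object has already been destroyed.</p>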
<p>'Oh yeah?' you might say, wondering what language that is in. As
is well known, this model of object lifetimes is directly supported
as an intrinsic part of many modern languages. It is also supported
in C++ using what are commonly referred to as smart pointers, i.e.
what looks like a simple C style pointer is really a class. The
implementation of such classes is beyond the scope of this article.
There are several ways that this can be achieved, depending to a
certain extent on the classes being collected. To take one common
example, COM objects must support this reference counting as part
of their most fundamental interface, hence the use of smart
pointers when using COM from C++ is more or less obligatory. It is
informative to watch as your program fails to release one little
part of, say, Excel and thus fails to leave the machine in the
state in which it found it, i.e. with Excel still loaded!</p>
<p>Well, that's about it. Please don't treat this article as
definitive. Most C++ publications are awash with much more detailed
articles on these subjects.</p>
<p>May your objects lead long and fulfilling lives (and not live
happily <span class="emphasis"><em>ever</em></span> after)!</p>
</div>
<div class="footnotes"><br />
<hr class="c3" width="100" />
<div class="footnote">
<p><sup>[<a name="ftn.d0e245" href="#d0e245" id=
"ftn.d0e245">1</a>]</sup> <i><span class="remark">Even better is to
provide this facility as a property of a node object so that the
constructor and destructor for <tt class=
"classname">AnObjectNode</tt> track ownership and delete any owned
<tt class="classname">AnObject</tt>. Francis</span></i></p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e266" href="#d0e266" id=
"ftn.d0e266">2</a>]</sup> <i><span class="remark">Anyone who was
involved in the saga of designing <tt class=
"classname">auto_ptr</tt> can attest to just how difficult it is to
get anywhere near viable ownership semantics when ownership can be
transferred. You finish up with such horrors as assignments that
change their rhs and copy constructors that change the object being
copied. Francis</span></i></p>
</div>
</div>
</p>
</div>
</channel>
</rss>
