    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Space Optimisation under Palm OS</title>
        <link>http://accu.org/index.php/journals/1138</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>


        <h2>Journal Articles</h2>


<div class="xar-mod-head"><span class="xar-mod-title">CVu Journal Vol 13, #5 - Oct 2001 + ACCU Topics</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="http://accu.org/index.php/journals/">All</a>

                     &gt;                         <a href="http://accu.org/index.php/journals/c76/">Journals</a>

                     &gt;                         <a href="http://accu.org/index.php/journals/c77/">CVu</a>

                     &gt;                         <a href="http://accu.org/index.php/journals/c118/">135</a>
                    (7)
<br />

                                            <a href="http://accu.org/index.php/journals/">All</a>

                     &gt;                         <a href="http://accu.org/index.php/journals/c13/">Topics</a>
                    (1)
<br />

                                            <a href="http://accu.org/index.php/journals/c118-13/">Any of these categories</a>

                    -                        <a href="http://accu.org/index.php/journals/c118+13/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Space Optimisation under Palm OS</h1>
<p><strong>Author:</strong>&nbsp;Administrator</p>
<p>
<strong>Date:</strong> 03 October 2001 13:15:47 +01:00 or Wed, 03 October 2001 13:15:47 +01:00</p>
<p><strong>Summary:</strong>&nbsp;</p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e20" id=
"d0e20"></a>Introduction</h2>
</div>
<p>Over the last year I have started working with the Palm
Operating System. For anyone not familiar with them, Palm devices
are small handhelds without a keyboard, but with a little touch
sensitive screen.</p>
<p>Developing on this platform is an interesting challenge as
contemporary Palm devices ship with between 2MB and 8MB of memory
in total - and no hard disk! Because of this, users expect most
Palm applications to be less than 100K in size and they are usually
not disappointed. Along the path of my Palm apprenticeship, trying
to pack lots of functionality into as small a memory footprint as
possible, I have picked up a few tips and tricks. I would like to
share these with you.</p>
<p>I use GCC 2.91 for Palm OS [<a href="#GCC">GCC</a>] and offer no
guarantee that my experience of this compiler suite on this
platform will generalise elsewhere. However, an article that was
entirely Palm-specific would only be of interest to a small
proportion of ACCU members, so I have endeavoured to at least
provide food for thought for anyone thinking about space
optimisation.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e32" id="d0e32"></a>The rules of
optimisation</h2>
</div>
<p>The rules of optimisation are usually stated as [<a href=
"#Jackson">Jackson</a>]:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p><span class="emphasis"><em>Don't do it.</em></span></p>
</li>
<li>
<p><span class="emphasis"><em>(For experts only) don't do it
yet.</em></span></p>
</li>
</ol>
</div>
<p>However, there are two caveats to the rule:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>You should always avoid pessimisations [<a href=
"#Sutter">Sutter</a>], coding techniques that make your code slower
but have little other benefit. A lot of what I discuss here come
under this category - just common sense coding techniques that
should &quot;flow from your fingertips&quot; whenever you are coding for
size.</p>
</li>
<li>
<p>Part of the reasoning for the &quot;(For experts)&quot; clause is that
measuring the effects of optimisation can be quite involved. This
is not so for size optimisation - one build and your executable
size tells you all you need to know.</p>
</li>
</ol>
</div>
<p>You don't have to be an expert to do some space
optimisation.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e63" id="d0e63"></a>First try: use
C</h2>
</div>
<p>I was aware of the special space considerations almost from the
start of my attempts to program the Palm, because the official
documentation [<a href="#PalmOS">PalmOS</a>] is rather insistent
upon this point. I started off using C, my reasoning being that, as
a simpler language than C++, it was bound to create smaller code,
right? (As I will show later on, this was both right and
wrong.)</p>
<p>I had some success with C, but I found some of its limitations
annoying because essentially I am an object-oriented programmer at
heart.</p>
<p>I found myself writing code like this:</p>
<pre class="programlisting">
/* This struct is private! Do not access the
 * fields directly - use the DocX functions 
 * instead. */
struct Doc
{
  This* thisP;
  That* thatP;
};
</pre>
<p>Clearly, I am addicted to data hiding, but I also missed other
C++ features like interface inheritance and inline functions. So, I
started tentatively experimenting with using GCC's C++ compiler,
G++.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e79" id="d0e79"></a>First steps
with G++</h2>
</div>
<p>Once I had fixed all the problems with my C code that prevented
it being valid C++, chiefly putting in casts to replace all
implicit conversions from <tt class="literal">void*</tt>, I had a
bit of a shock: compiling exactly the same code, G++ produced an
executable that was just short of 20K bigger than GCC. At that
time, that meant the program I was working on increased in size by
50%, which was unacceptable.</p>
<p>At this point I contemplated sticking with C, but my structure
comments were starting to really grate, so I persevered. I started
to think about why the executable size had grown.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e89" id="d0e89"></a>Learning to
live without</h2>
</div>
<p>The general philosophy of C++ is &quot;what you don't use, you don't
pay for&quot; [<a href="#Stroustrup">Stroustrup</a>]. However, there are
at least two features of the language that, even if they don't
break that rule in theory, certainly make it hard for compiler and
linker writers to stick to it. These are:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>Run-time type identification (RTTI).</p>
</li>
<li>
<p>Exception handling.</p>
</li>
</ol>
</div>
<p>The reason that both of these features are problematic in this
respect is the way they flow across translation units<sup>[<a name=
"d0e106" href="#ftn.d0e106" id="d0e106">1</a>]</sup>. The compiler
must generate RTTI information for any defined type with external
linkage, (well actually, no, only for polymorphic types - ones with
virtual member functions, however you might, with a lazy compiler,
get RTTI infra structure - Francis) for even if the translation
unit that contains the type never uses this information, another
translation unit may require it.</p>
<p>The reasoning is similar for exception handling: even a
translation unit that never catches or throws an exception may,
provided it makes calls outside the translation unit, be required
to propagate an exception thrown lower down the call stack. So, the
compiler must generate code for this eventuality.</p>
<p>So, we end up with a situation where even C-compatible code
contains RTTI and exception handling logic (well only the latter -
Francis). Moreover, any part of the run-time library required to
support these features will also be handed to the linker.</p>
<p>Ideally, we'd have a system where the linker would realise that,
even though RTTI and exception handling code was present, it was
never used so could be safely removed. Unfortunately, this is
asking quite a lot of your average linker. It is certainly not what
the GCC linker does.</p>
<p>Once I'd worked all this out, I discovered the G++ options
<tt class="literal">-fno-rtti</tt> and <tt class=
"literal">-fno-exceptions</tt>, which turn off both of these
features. Using these two switches meant that G++ compiled the code
to exactly the same size executable as the C compiler.</p>
<p>Of course you might be wondering, why bother removing the code
to support these features if you're going to use them anyway? Well,
to take the easy case first, I hardly ever use RTTI, so I can
easily do without it. Living without exception handling is more
problematic - say goodbye to the most of the STL, conforming new
behaviour, and more, but if you can do it in C, you can do it in
C++. In my case, the 20K saving was a sufficient incentive to do
without exception handling, but your mileage may vary. The deficit
would often be less than 20K in practice because as you start to
actually use exception handling, most of your C-style error
handling code can be removed.</p>
<p>As I started to move my code base away from pure C, and added
some non-trivial classes, I needed a non-exception-throwing way for
a constructor to signal failure. If there is an elegant way of
doing this, I have not found it. Two options I considered are:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>Two phase construction. A constructor that is always guaranteed
to succeed and another member function, called something like
Init(), that does any more error-prone construction work,
indicating in its return value whether it succeeded or not.</p>
</li>
<li>
<p>Pass an output parameter to the constructor, which it can use to
signal a failure.</p>
</li>
</ol>
</div>
<p>I plumped for option 2 in the end, so I had something like
this:</p>
<pre class="programlisting">
Doc::Doc(bool&amp; success) :
  m_resourceP(AcquireResource())
{
  if(!m_resourceP)success = false;
}

Doc::~Doc()
{
  if(m_resourceP) FreeResource(m_resourceP);
}
</pre>
<p>which was called by code a bit like this:</p>
<pre class="programlisting">
bool success = true;
Doc doc(dbP, success);
if(success)
{
// Doc constructed OK - we can use it.
}
</pre>
<p>Note that the caller sets <tt class="literal">success</tt> to
<tt class="literal">true</tt> initially, rather than the
constructor. The benefit of this is that the top-level constructor
can call other constructors (e.g. for data members or base classes)
and any failure is reported back to the caller. If I'd taken the
other approach and had the constructor set success to true
initially, cascading construction would have been more complex: a
constructor could end up setting success to true after another
constructor had already failed and set it to false.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e151" id="d0e151"></a>Avoiding
library calls</h2>
</div>
<p>Part of the reason that RTTI and exception handling increased
the executable size was that they caused parts of the run-time
library to be linked in. On a device like the Palm, the C++
run-time library is an expensive luxury and I avoid using it
entirely. This is not such a hardship as you might first imagine,
as, for the most basic calls for such things as string and memory
handling, there are OS supplied equivalents. Calling an OS function
costs very little in terms of code size, because the function is
implemented in ROM. Even on non-ROM based OSs such as most Windows,
OS function calls don't usually cost much, since the shared
libraries for the most basic calls - <tt class=
"literal">GDI32.DLL</tt>, <tt class="literal">USER32.DLL</tt> and
<tt class="literal">KERNEL32.DLL</tt>, in the case of Windows -
will already be loaded anyway<sup>[<a name="d0e165" href=
"#ftn.d0e165" id="d0e165">2</a>]</sup>.</p>
<p>Where there is an exact mapping between OS and standard library
function, it is possible to create a function with the same name as
the standard library function and use it to call the OS function.
This &quot;interposition&quot; gives you the best of both worlds. However, be
careful to ensure you always include the &quot;translation&quot; header
before you call any of the functions within - at best failure to do
so will cause a compile error, at worst it will just silently bring
in part of the standard library and bloat your code - because
certain standard library function calls don't require their
standard library header to be included in order for the compiler to
accept them. If you make several changes at once, you may have
trouble tracking the culprit down, although the <tt class=
"literal">-nostdlib</tt> option available in later GCC versions can
help.</p>
<p>One example where interposition is commonly used, and even
explicitly sanctioned by the C++ language standard, involves the
free store functions - <tt class="literal">operator new</tt>,
<tt class="literal">operator delete</tt> and their array-handling
chums. Their run-time library implementation brings in substantial
parts of the run-time library to the link. This, plus their key
role in most C++ systems, makes them ideal candidates for
replacement.</p>
<p>Initially I tried implementing the free store operators as
static member functions of only the classes I knew might be
allocated on the free store. However, this ended up leading to a
lot of duplicate code, so in the end I used &quot;interposition&quot; as
described above and implemented the global versions of the
operators.</p>
<p>First, create a header file called something like <tt class=
"literal">FreeStore.h</tt>, containing prototypes for the free
store functions you will be replacing. Make sure you include this
in every source file that makes use of these functions. Don't
forget one!</p>
<p>Then create a source file that implements the functions using
only OS-supplied memory management functions<sup>[<a name="d0e191"
href="#ftn.d0e191" id="d0e191">3</a>]</sup>. If, like me, you
turned off exception handling, your new functions should return
<tt class="literal">NULL</tt> in the case of failure. Luckily, this
is how most OS memory allocation functions, including Palm's
<tt class="literal">MemPtrNew</tt>, behave.</p>
<p>One thing to watch out for is that <tt class="literal">operator
delete</tt> is guaranteed to take no action when passed a
<tt class="literal">NULL</tt> pointer, whereas the closest OS
equivalent may not make this promise. The documentation for
<tt class="literal">MemPtrFree</tt>, the Palm OS deallocation
function, doesn't, so it is safest to write <tt class=
"literal">operator delete</tt> to include the <tt class=
"literal">NULL</tt> test, like so:</p>
<pre class="programlisting">
void operator delete(void* ptr){
  if(ptr) MemPtrFree(ptr);
}
</pre></div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e223" id="d0e223"></a>Inline
functions</h2>
</div>
<p>A simplistic view of inline functions is that they are a
straight trade-off to achieve faster code at the expense of an
increase in code size. For very simple functions, this is not so -
making them inline both increases speed and reduces code size,
because it removes the function call overhead. Unless you wish to
measure the impact of every single inlining decision (certainly
ideal, but also rather tedious), I recommend the following rule of
thumb when optimising for space:</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>&quot;If a function is getting or setting a single variable OR
calling a single function, inline it. Otherwise, don't.&quot;</p>
</blockquote>
</div>
<p>Watch for &quot;hidden&quot; function calls such as copy constructors and
assignment operators. And be aware that, like all rules of thumb,
there will be exceptions.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e233" id="d0e233"></a>Polymorphic
hierarchies the small way</h2>
</div>
<p>C++'s support for polymorphism is often used to provide multiple
concrete implementation classes of a single interface class (also
known as an abstract base class or ABC). Interface classes contain
no code, they are used purely to provide a set of member function
signatures, called pure virtual functions.</p>
<p>Contrary to common expectation, calls to pure virtual functions
can sometimes occur - in the twilight zone when an object is
partially constructed. The compiler must emit code that somehow
copes with this possibility. With the version of GCC I was using
this led to a 6K increase in code size, just for a single pure
virtual function. Why is this?</p>
<p>The C++ standard just says that calling a pure virtual function
leads to undefined behaviour, leaving it completely up to compiler
vendors as to what should happen. As is suggested by the &quot;= 0&quot;
syntax, pure virtual functions could perhaps be implemented as a
<tt class="literal">NULL</tt> function pointer. Indeed, at least
one compiler uses just this strategy [<a href=
"#Vollmann">Vollmann</a>]. The downside here is that if the call is
not spotted by the compiler, it will lead to a general exception or
crash. This is perfectly valid &quot;undefined behaviour&quot;, but it isn't
terribly helpful for someone trying to work out why some code isn't
working.</p>
<p>Another common strategy - adopted by both GCC and Microsoft's
Visual C++ compiler, is to implement pure virtual functions as
calls to a run-time library function. GCC's is called <tt class=
"literal">__pure_virtual</tt>. Testing <tt class=
"literal">__pure_virtual()</tt> showed it getting stuck in an
infinite recursion, which is probably a bug. Visual C++'s similar
system displays an error message and terminates the application,
which I guess is what GCC's authors also intended. The take home
point is that this means that having any pure virtual functions in
your code causes parts of the run-time library to be brought in -
<tt class="literal">__pure_virtual()</tt> plus all the gubbins it
requires to report the error. This is the cause of the 6K overhead
that was observed.</p>
<p>To minimise code size, this overhead must be avoided. Therefore
it is necessary to eschew true pure virtual functions.</p>
<p>I adopted a minimalist technique to achieve this aim. First, I
provided an empty macro, PURE, solely to make the code to some
degree self-documenting:</p>
<pre class="programlisting">
#define PURE
</pre>
<p>So, when I wanted a pure function what I wrote it like so:</p>
<pre class="programlisting">
virtual int foo() PURE;
</pre>
<p>And created a stub implementation of the function. This was the
smallest possible stub that the compiler would accept plus a call
to my own <tt class="literal">PureCall()</tt> function e.g.</p>
<pre class="programlisting">
/*virtual*/ int SomeClass::foo()
{ PureCall(); return 0; }
</pre>
<p>I wrote <tt class="literal">PureCall()</tt> so it just uses OS
functionality to generate a fatal run-time error
diagnostic<sup>[<a name="d0e281" href="#ftn.d0e281" id=
"d0e281">4</a>]</sup>.</p>
<p>Of course, you can do without <tt class=
"literal">PureCall()</tt>, but then you run the risk of silent
run-time errors in your code. For the best of both worlds, it would
be possible to keep <tt class="literal">PureCall()</tt> as it is
for debug versions of the code, and to make it an empty inline
function in release versions.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e296" id="d0e296"></a>Finally</h2>
</div>
<p>That concludes our amble through Palm OS space optimisation. I
hope you found it useful, even if you do not program a Palm device.
I would welcome your feedback on the article, and can be contacted
at <tt class="email">&lt;<a href=
"mailto:glancaster@codemill.net">glancaster@codemill.net</a>&gt;</tt>.
<span class="bold"><b>Happy optimising!</b></span></p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e306" id=
"d0e306"></a>Acknowledgements</h2>
</div>
<p>Many thanks to Detlef Vollman, for thoroughly reviewing this
article and providing much additional insight, and also to Lois
Goldthwaite for her comments on an earlier draft.</p>
</div>
<div class="bibliography">
<div class="titlepage">
<h2><a name="d0e311" id="d0e311"></a>References</h2>
</div>
<div class="bibliomixed"><a name="GCC" id="GCC"></a>
<p class="bibliomixed">[GCC] GCC, the GNU Compiler Collection. Free
Software Foundation. http://www.gnu.org/software/gcc/gcc.html</p>
</div>
<div class="bibliomixed"><a name="Jackson" id="Jackson"></a>
<p class="bibliomixed">[Jackson] Jackson, Michael A., Principles of
Program Design, Academic Press, London and New York, 1975.</p>
</div>
<div class="bibliomixed"><a name="Sutter" id="Sutter"></a>
<p class="bibliomixed">[Sutter] Herb Sutter. Talk at ACCU Spring
Conference 2001.</p>
</div>
<div class="bibliomixed"><a name="PalmOS" id="PalmOS"></a>
<p class="bibliomixed">[PalmOS] Palm OS&reg; Programmer's API
Reference and Palm OS&reg; Programmer's Companion, Palm Inc.
www.palmos.com/dev/tech/docs/.</p>
</div>
<div class="bibliomixed"><a name="Stroustrup" id="Stroustrup"></a>
<p class="bibliomixed">[Stroustrup] Bjarne Stroustrup, The Design
and Evolution of C++, p. 121, Addison Wesley, 1994.</p>
</div>
<div class="bibliomixed"><a name="Vollmann" id="Vollmann"></a>
<p class="bibliomixed">[Vollmann] Detlef Vollmann, e-mail
communication. The compiler is/was Borland C++ for OS/2.</p>
</div>
</div>
<div class="footnotes"><br>
<hr class="c2" width="100">
<div class="footnote">
<p><sup>[<a name="ftn.d0e106" href="#d0e106" id=
"ftn.d0e106">1</a>]</sup> A translation unit is a source file, plus
all the headers #included in that source file.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e165" href="#d0e165" id=
"ftn.d0e165">2</a>]</sup> Occasionally, using a shared C++ run-time
library has no extra overhead - if you can guarantee that when your
program is loaded, another program is already using it.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e191" href="#d0e191" id=
"ftn.d0e191">3</a>]</sup> Bear in mind that this straightforward
implementation doesn't call the current, or indeed, any, <tt class=
"literal">new_handler</tt>.</p>
</div>
<div class="footnote">
<p><sup>[<a name="ftn.d0e281" href="#d0e281" id=
"ftn.d0e281">4</a>]</sup> Another more transparent approach to
avoiding the pure virtual function overhead and other run-time
library overhead issues is to write a run-time library replacement,
containing low overhead implementations of only those functions you
need, including <tt class="literal">__pure_virtual()</tt>.</p>
</div>
</div>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
