Exceptions Make for Elegant Code

Overload Journal #86 - August 2008 + Programming Topics + Design of applications and programs   Author: Anthony Williams
Anything that can go wrong, will go wrong. Anthony Williams compares ways of dealing with errors.

On episode 8 of the Stack Overflow podcast that he does with Jeff Atwood [Spolsky], Joel Spolsky comes out quite strongly against exceptions, on the basis that they are hidden flow paths. Whilst I can sympathise with the idea of making every possible control path in a routine explicitly visible, coming back to writing C code for a recent project after many years of coding in C++ has driven home to me that this actually makes the code a lot harder to follow, as the actual code for what it's really doing is hidden amongst a load of error checking.

Whether or not you use exceptions, you have the same number of possible flow paths. With exceptions, the code can be a lot cleaner than with error codes, as you don't have to write a check after every function call to verify that it did indeed succeed before proceeding with the rest of the function. Instead, the code tells you when it's gone wrong by throwing an exception.

Exceptions also simplify the function signature: rather than having to add an additional parameter to hold the potential error code, or to hold the function result (because the return value is used for the error code), exceptions allow the function signature to specify exactly what is appropriate for the task at hand, with errors being reported 'out-of-band'. Yes, some functions use errno, which helps by providing a similar out-of-band error channel, but it's not a panacea: you have to check and clear it between every call, otherwise you might be passing invalid data into subsequent functions. Also, it requires that you have a value you can use for the return type in the case that an error occurs. With exceptions you don't have to worry about either of these, as they interrupt the code at the point of the error, and you don't have to supply a return value.
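As a concrete illustration of the errno bookkeeping described above, here is a minimal sketch built on the standard std::strtol, which reports overflow through errno while still returning a value. The wrapper name parse_long_errno and its bool out-parameter are assumptions made for this example only.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical wrapper around std::strtol, which reports overflow via errno.
long parse_long_errno(char const* text,bool* ok)
{
  assert(text && ok);
  errno=0;                 // strtol never resets errno, so clear stale errors
  char* end=0;
  long const value=std::strtol(text,&end,10);
  // On overflow strtol still returns a value (LONG_MAX or LONG_MIN); only
  // errno tells the caller that the result must not be used.
  *ok=(errno==0) && (end!=text);
  return value;
}
```

Note that the function still has to return some long even when *ok is false, which is exactly the 'dummy return value' problem mentioned above.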

Listing 1 shows three implementations of the same function using error code returns, errno and exceptions.

    int foo_with_error_codes(some_type param1,
       other_type param2,result_type* result)
    {
      int error=0;
      intermediate_type temp;
      if((error=do_blah(param1,23,&temp)) ||
         (error=do_flibble(param2,temp,result)))
      {
        return error;
      }
      return 0;
    }


    result_type foo_with_errno(some_type param1,
       other_type param2)
    {
      errno=0;
      intermediate_type temp=do_blah(param1,23);
      if(errno)
      {
        return dummy_result_type_value;
      }
      return do_flibble(param2,temp);
    }

    result_type foo_with_exceptions(some_type param1,
       other_type param2)
    {
      return do_flibble(param2,do_blah(param1,23));
    }
  
Listing 1

Error recovery

In all three cases, I've assumed that no recovery is required if do_blah succeeds but do_flibble fails. If recovery were required, additional code would be needed. It could be argued that this is where the problems with exceptions begin, as the code paths for exceptions are hidden, and it is therefore unclear where the cleanup must be done. However, if you design your code with exceptions in mind I find you still get elegant code (see my blog entry [Williams07] for some considerations on elegance in software). try/catch blocks are ugly: this is where deterministic destruction comes into its own. By encapsulating resources, and performing changes in an exception-safe manner, you end up with elegant code that behaves gracefully in the face of exceptions, without cluttering the 'happy path'. See Listing 2.

    int foo_with_error_codes(some_type param1,
       other_type param2,result_type* result)
    {
      int error=0;
      intermediate_type temp;
      if(error=do_blah(param1,23,&temp))
      {
        return error;
      }
      if(error=do_flibble(param2,temp,result))
      {
        cleanup_blah(temp);
        return error;
      }
      return 0;
    }


    result_type foo_with_errno(some_type param1,
       other_type param2)
    {
      errno=0;
      intermediate_type temp=do_blah(param1,23);
      if(errno)
      {
        return dummy_result_type_value;
      }
      result_type res=do_flibble(param2,temp);
      if(errno)
      {
        cleanup_blah(temp);
        return dummy_result_type_value;
      }
      return res;
    }


    result_type foo_with_exceptions(some_type param1,
       other_type param2)
    {
      return do_flibble(param2,do_blah(param1,23));
    }

    result_type foo_with_exceptions2(some_type param1,
       other_type param2)
    {
      blah_cleanup_guard temp(do_blah(param1,23));
      result_type res=do_flibble(param2,temp);
      temp.dismiss();
      return res;
    }
  
Listing 2

In the error code cases, we need to clean up explicitly on error, by calling cleanup_blah. In the exception case we've got two possibilities, depending on how your code is structured. In foo_with_exceptions, everything is just handled directly: if do_flibble doesn't take ownership of the intermediate data, it cleans itself up. This might well be the case if do_blah returns a type that handles its own resources, such as std::string or boost::shared_ptr. If explicit cleanup might be required, we can write a resource management class such as blah_cleanup_guard used by foo_with_exceptions2, which takes ownership of the effects of do_blah, and calls cleanup_blah in the destructor unless we call dismiss to indicate that everything is going OK.
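The listing uses blah_cleanup_guard without showing it, so here is one way such a guard might be written. This is a sketch under stated assumptions: intermediate_type is taken to be a plain int, cleanup_blah merely counts its calls so the effect can be observed, and guard_demo is a hypothetical usage function.

```cpp
#include <cassert>

typedef int intermediate_type;   // stand-in type, assumed for this sketch

int cleanup_calls=0;             // lets the demo observe the cleanup

void cleanup_blah(intermediate_type)
{
  ++cleanup_calls;
}

class blah_cleanup_guard
{
  intermediate_type value;
  bool dismissed;
  // non-copyable: only one guard may own the cleanup
  blah_cleanup_guard(blah_cleanup_guard const&);
  blah_cleanup_guard& operator=(blah_cleanup_guard const&);
public:
  explicit blah_cleanup_guard(intermediate_type value_):
    value(value_),dismissed(false)
  {}
  // lets the guard be passed where an intermediate_type is expected
  operator intermediate_type const&() const
  {
    return value;
  }
  void dismiss()
  {
    dismissed=true;
  }
  ~blah_cleanup_guard()
  {
    if(!dismissed)
    {
      cleanup_blah(value);
    }
  }
};

// Usage: the cleanup runs on the failure path only.
void guard_demo()
{
  int const before=cleanup_calls;
  try
  {
    blah_cleanup_guard temp(23);
    throw 1;                     // stands in for do_flibble failing
  }
  catch(int)
  {}
  assert(cleanup_calls==before+1);
  {
    blah_cleanup_guard temp(23);
    temp.dismiss();              // success: keep the effects of do_blah
  }
  assert(cleanup_calls==before+1);
}
```

The conversion operator is a design choice that allows the guard to be passed straight to do_flibble, as foo_with_exceptions2 does; an explicit get() member would work equally well.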

Real examples

That's enough waffling about made-up examples; let's look at some real(ish) code. Here's something simple: adding a new value to a dynamic array of DataType objects held in a simple dynamic_array class. Let's assume that objects of DataType can somehow fail to be copied: maybe they allocate memory internally, which may therefore fail. We'll also use a really dumb algorithm that reallocates every time a new element is added. This is not for any reason other than it simplifies the code: we don't need to check whether or not reallocation is needed.

If we're using exceptions, that failure will manifest as an exception, and our code looks like Listing 3. On the other hand, if we can't use exceptions, the code looks like Listing 4.

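    #include <algorithm> // std::swap (in <utility> since C++11)
    #include <cstdlib>   // malloc, free
    #include <new>       // placement new, std::bad_alloc

    class DataType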
    {
    public:
      DataType(const DataType& other);
    };
    class dynamic_array
    {
    private:
      class heap_data_holder
      {
        DataType* data;
        unsigned initialized_count;
      public:
        heap_data_holder():
          data(0),initialized_count(0)
        {}
        explicit heap_data_holder(unsigned max_count):
           data((DataType*)malloc(
           max_count*sizeof(DataType))),
           initialized_count(0)
        {
          if(!data)
          {
            throw std::bad_alloc();
          }
        }
        void append_copy(DataType const& value)
        {
          new (
             data+initialized_count) DataType(value);
          ++initialized_count;
        }
        void swap(heap_data_holder& other)
        {
          std::swap(data,other.data);
          std::swap(initialized_count,
             other.initialized_count);
        }
        unsigned get_count() const
        {
          return initialized_count;
        }
        ~heap_data_holder()
        {
          for(unsigned i=0;i<initialized_count;++i)
          {
            data[i].~DataType();
          }
          free(data);
        }
        DataType& operator[](unsigned index)
        {
          return data[index];
        }
      };
      heap_data_holder data;
      // no copying for now
      dynamic_array& operator=(
         dynamic_array& other);
      dynamic_array(dynamic_array& other);
    public:
      dynamic_array()
      {}
      void add_element(DataType const& new_value)
      {
        heap_data_holder new_data(data.get_count()+1);
        for(unsigned i=0;i<data.get_count();++i)
        {
          new_data.append_copy(data[i]);
        }
        new_data.append_copy(new_value);
        new_data.swap(data);
      }
    };
  
Listing 3


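    #include <algorithm> // std::swap (in <utility> since C++11)
    #include <cstdlib>   // malloc, free
    #include <new>       // placement new

    class DataType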
    {
    public:
      DataType(const DataType& other);
      int get_error();
    };
    class dynamic_array
    {
    private:
      class heap_data_holder
      {
        DataType* data;
        unsigned initialized_count;
        int error_code;
      public:
        heap_data_holder():
           data(0),initialized_count(0),
           error_code(0)
        {}
        explicit heap_data_holder(unsigned max_count):
           data((DataType*)malloc(
           max_count*sizeof(DataType))),
           initialized_count(0), error_code(0)
        {
          if(!data)
          {
            error_code=out_of_memory;
          }
        }
        int get_error() const
        {
          return error_code;
        }
        int append_copy(DataType const& value)
        {
          new (
             data+initialized_count) DataType(value);
          if(data[initialized_count].get_error())
          {
            int const error=
               data[initialized_count].get_error();
            data[initialized_count].~DataType();
            return error;
          }
          ++initialized_count;
          return 0;
        }
        void swap(heap_data_holder& other)
        {
          std::swap(data,other.data);
          std::swap(initialized_count,
             other.initialized_count);
        }
        unsigned get_count() const
        {
          return initialized_count;
        }
        ~heap_data_holder()
        {
          for(unsigned i=0;i<initialized_count;++i)
          {
            data[i].~DataType();
          }
          free(data);
        }
        DataType& operator[](unsigned index)
        {
          return data[index];
        }
      };
      heap_data_holder data;
      // no copying for now
      dynamic_array& operator=(dynamic_array& other);
      dynamic_array(dynamic_array& other);

    public:
      dynamic_array()
      {}
      int add_element(DataType const& new_value)
      {
        heap_data_holder new_data(data.get_count()+1);
        if(new_data.get_error())
           return new_data.get_error();
        for(unsigned i=0;i<data.get_count();++i)
        {
          int const error=
             new_data.append_copy(data[i]);
          if(error)
            return error;
        }
        int const error=
           new_data.append_copy(new_value);
        if(error)
          return error;
        new_data.swap(data);
        return 0;
      }
    };
  
Listing 4

It's not too dissimilar, but there are a lot of checks for error codes: add_element has gone from 10 lines to 17, which is almost double, and there are also additional checks in the heap_data_holder class. In my experience, this is typical: if you have to explicitly write error checks at every failure point rather than use exceptions, your code can get quite a lot larger for no gain. Also, the constructor of heap_data_holder can no longer report failure directly: it must store the error code for later retrieval. To my eyes, the exception-based version is a whole lot clearer and more elegant, as well as being shorter: a net gain over the error-code version.

Error safety

I expect most of you are familiar with the Abrahams Exception Safety Guarantees [Abrahams], but these could realistically be termed Error Safety Guarantees: we want code that is robust in the face of errors in general, not just exceptions. The only reason that exceptions are 'special' is that people are less familiar with how to write code in the presence of exceptions. It is providing sensible guarantees for the code that leads us to the structure of the example code above; something that remains essentially the same even when using error codes. Creating a copy of a structure 'off to the side' and then swapping it with the original is a useful technique whichever error handling mechanism you use, but it really comes into its own with exceptions.

The Abrahams Exception Safety Guarantees

These guarantees were first documented by Dave Abrahams when the C++ Standards committee were working on the 1998 C++ Standard. The idea is that code should provide one of the three guarantees - if it doesn't, then an exception occurring in your code will result in leaked resources or corrupt data structures or both. The guarantees are:

The no-fail (or no-throw) guarantee

This is the strongest of all guarantees. A function that provides this guarantee will not throw any exceptions, and will not fail. All destructors should provide this guarantee, as should important operations like swap which provide the building blocks for the code that uses them to provide suitable exception safety guarantees.

The strong guarantee

A function that provides this guarantee is all or nothing: if it fails, then any effects are rolled back so the state of the data structure is the same as it was on entry. This requires that the function doesn't do anything irreversible (like perform I/O), and that there are suitable operations that provide the no-fail guarantee which can be used to commit or roll back the changes.

The basic guarantee

This is the basic level you should strive for in all code: if a function fails, then it must leave the data structures in a valid state, even if that state differs from the original. For example, failure to insert a new item into a container must leave the container in a valid state, even if all the existing items have been deleted.

Any code that doesn't provide even the basic guarantee is not exception safe.
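To make the difference between the basic and strong guarantees concrete, here is a small sketch using std::vector. The record type, whose constructor rejects negative ids, and the two append functions are hypothetical names invented for this illustration; the strong version uses the copy-then-swap technique from add_element.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical value type whose construction can fail.
struct record
{
  int id;
  explicit record(int id_): id(id_)
  {
    if(id<0)
    {
      throw std::invalid_argument("negative id");
    }
  }
};

// Basic guarantee: on failure 'target' is valid but may hold some new items.
void append_all_basic(std::vector<record>& target,
   std::vector<int> const& ids)
{
  for(std::size_t i=0;i<ids.size();++i)
  {
    target.push_back(record(ids[i]));
  }
}

// Strong guarantee: work on a copy, then commit with a no-fail swap.
void append_all_strong(std::vector<record>& target,
   std::vector<int> const& ids)
{
  std::vector<record> copy(target);
  append_all_basic(copy,ids);
  copy.swap(target);
}

void guarantees_demo()
{
  std::vector<int> ids;
  ids.push_back(1);
  ids.push_back(-1);             // this one fails part-way through
  std::vector<record> a(1,record(10)),b(1,record(10));
  try { append_all_basic(a,ids); } catch(std::invalid_argument const&) {}
  assert(a.size()==2);           // valid, but partly modified
  try { append_all_strong(b,ids); } catch(std::invalid_argument const&) {}
  assert(b.size()==1);           // exactly as it was on entry
}
```

The strong version pays for an extra copy of the container, which is why the basic guarantee is often an acceptable default.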

Structural changes

The lack of the ability for the constructor of heap_data_holder to abort on failure means that the way objects are written must change: the 'invariants' of the class must be extended to allow for it to be in an 'invalid' state due to the constructor failing. Similarly, the signatures of some functions must change to allow for failures: if your function returns a reference then this can pose a problem if the function failed: you don't necessarily have an object to return a reference to, and must instead return a pointer, which can therefore be NULL. This subtle shift in the design of the code now means that any code that calls this function has to be prepared for a NULL pointer to be returned where before it could rely on there being an object, since there is no such thing as a NULL reference.
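The shift from reference to pointer can be sketched with a simple lookup. Both function names below are hypothetical, chosen only to contrast the two signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// With exceptions: failure never returns, so the reference is always valid.
int& element_at(std::vector<int>& values,std::size_t index)
{
  if(index>=values.size())
  {
    throw std::out_of_range("index out of range");
  }
  return values[index];
}

// Without exceptions: failure must be representable in the return value,
// so the reference becomes a pointer that may be NULL.
int* element_at_or_null(std::vector<int>& values,std::size_t index)
{
  if(index>=values.size())
  {
    return 0;
  }
  return &values[index];
}

void caller_demo()
{
  std::vector<int> values(2,7);
  element_at(values,1)=9;                 // no check needed at the call site
  int* p=element_at_or_null(values,5);
  if(p)                                   // every caller must remember this
  {
    *p=9;
  }
  assert(values[1]==9 && p==0);
}
```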

The lack of exceptions actually makes it hard to pass the result of one function as a parameter to another altogether: because the function will return normally even if it failed, the second function has to handle whatever the first returns on error without causing serious problems. We saw this back in the first example where the call to do_blah was separated from the call to do_flibble in order to check the error code, whereas the exception version had do_blah called directly in the call for do_flibble. If you apply this to operators it gets even worse: operators don't have any means of returning an error code directly, so they have to resort to techniques such as the use of errno, and you essentially lose any benefits of writing an operator in the first place. With exceptions, we can write:


      std::string foo(std::string const& s)
      {
        return "hello " + s + " goodbye";
      }

where the second call to operator+ will only happen if the first succeeded. If we don't have exceptions then the first call to operator+ has to return something, and the second call to operator+ has to handle the case that one or more of its arguments is an 'invalid' object.
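For comparison, here is roughly what an operator+ has to look like when it cannot throw. This is only a sketch: checked_text is a hypothetical wrapper, and max_length is an artificial limit standing in for an allocation failure, since the real std::string reports such failures by throwing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical string wrapper for a build without exceptions: every object
// carries a flag saying whether it holds a usable value.
class checked_text
{
  std::string text;
  bool valid;
public:
  checked_text(char const* s): text(s),valid(true)
  {}
  static checked_text invalid()
  {
    checked_text result("");
    result.valid=false;
    return result;
  }
  bool is_valid() const
  {
    return valid;
  }
  std::string const& str() const
  {
    return text;
  }
};

// Assumed limit standing in for "allocation failed".
std::size_t const max_length=16;

checked_text operator+(checked_text const& lhs,checked_text const& rhs)
{
  if(!lhs.is_valid() || !rhs.is_valid())   // must cope with failed operands
  {
    return checked_text::invalid();
  }
  if(lhs.str().size()+rhs.str().size()>max_length)
  {
    return checked_text::invalid();
  }
  return checked_text((lhs.str()+rhs.str()).c_str());
}

void concat_demo()
{
  checked_text ok=checked_text("hello ")+"bob"+" bye";
  assert(ok.is_valid() && ok.str()=="hello bob bye");
  checked_text bad=checked_text("hello ")+"a very long name"+" bye";
  assert(!bad.is_valid());      // the first + failed; the second passed it on
}
```

Every operation on the class now has to begin with the validity check, and the caller still has to remember to test the final result.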

Conclusion

I guess it's a matter of taste, but I find code that uses exceptions is shorter, clearer, and actually has fewer bugs than code that uses error codes. Yes, you have to think about the consequences of an exception, and at which points in the code an exception can be thrown, but you have to do that anyway with error codes, and it's easy to write simple resource management classes to ensure everything is taken care of. Without exceptions you often have to contort the design to handle the error checking.

References

[Abrahams] Abrahams, D., 'Exception Safety in Generic Components', http://www.boost.org/community/exception_safety.html

[Spolsky] 'Stack Overflow' (podcast) - available from: http://blog.stackoverflow.com/index.php/2008/06/podcast-8/

[Williams07] 'Elegance in Software', http://www.justsoftwaresolutions.co.uk/design/elegance-in-software.html

This article is based on a blog entry with the same title at http://www.justsoftwaresolutions.co.uk/design/exceptions-make-for-elegant-code.html
