Taming Complexity: A Class Hierarchy Tale

Introduction

Some years ago, I was involved in the design of the software for a security system1. The software had to process events from various sensor devices detecting motion in the building. However, consider the operation of such a system installed in a typical office building: obviously alarms shouldn't ring when people are in the building during normal working hours. Actually that's not strictly correct: there may be areas of the building that are normally off limits, and so should be monitored in normal working hours. Further, there may be cases when people are working during the night in certain parts of the building. Clearly this security system needed to be flexible - i.e. it needed a lot of configurability designing into it. To this end I had the task of designing a logic engine in support of configurability far beyond the simple description given above.

From this design comes the class hierarchy shown in Listing 1.

Listing 1: evaluator Class Hierarchy

class evaluator
{
public:
  virtual ~evaluator();
  virtual bool evaluate() const = 0;
...
};

class common_evaluator : public evaluator
{
protected:
  common_evaluator(const trigger* t1, const trigger* t2)
  : one(t1), two(t2)
  {}

  const trigger& trigger_1() const { return *one; }
  const trigger& trigger_2() const { return *two; }

private:
  const trigger* const one;
  const trigger* const two;
};

class and_evaluator : public common_evaluator
{
public:
  and_evaluator(const trigger* t1, const trigger* t2)
  : common_evaluator(t1, t2)
  {}
private:
  virtual bool evaluate() const
  {
    return trigger_1().active() && trigger_2().active();
  }
};

class or_evaluator : public common_evaluator
{
public:
  or_evaluator(const trigger* t1, const trigger* t2)
  : common_evaluator(t1, t2)
  {}

private:
  virtual bool evaluate() const
  {
    return trigger_1().active() || trigger_2().active();
  }
};

The trigger class represents a trigger event, i.e. an event that potentially participates in the triggering of an alert. Motion being detected in a certain area and a door that should be closed being open, are both examples of triggers going active. Further, a trigger can be a simple time interval, 1800-0800 for example - i.e. not during daytime office hours. In the domain terminology, such events are said to cause a trigger to "go active". It is possible that a single trigger event could trigger an alarm, motion being detected in an off-limits part of the building, for example. However, motion detected in certain (most) parts of the building between 1800 and 0800 - i.e. a combination of two trigger events - is the kind of combination of events requiring an alert to be raised. This is where the evaluator class hierarchy comes in. I don't propose to go any further into the problem domain analysis, but it turns out that when two trigger events are of interest, there are two cases where it may be necessary to raise an alert:

The case where both triggers need to be active (logical AND), and
The case where either or both triggers need to be active (logical OR)

The evaluator hierarchy shown in Listing 1 facilitates this, and also leaves open the possibility of extending the system to handle the logical XOR case. Note in passing, that I'm ignoring cases where an alert is raised if only a single trigger goes active - these do not form part of this illustration.

The C++ Three Level Hierarchy

I've used the evaluator class hierarchy and spent some time explaining it, because I like the idea of a realistic example to work with. The above comes from a real project, although what I've shown is a simplified version for illustration purposes.

I'm using this hierarchy as an example of the three level structures that are idiomatic in C++ class hierarchy design: interface class (see [ Radford ] for an explanation of the term) at the top, common (but abstract) implementation in the middle, and the most derived classes forming the (concrete) implementations.

However the tale of this hierarchy has a slight twist in it. Let me just digress for the moment and look at another three level hierarchy - one I trust will need no explanation.

The circular_shape hierarchy, shown in Listing 2 on page 26, uses the three level approach already discussed. However in this case, the common_circular_shape class (the middle class) captures state common across the concrete (most derived) classes, while the derived classes themselves have their own specific state. Also, note the constructors: there is a lot of repetition of code here, in fact the constructors of circle and ellipse differ in name only.

Listing 2: circular_shape Class Hierarchy

class circular_shape
{
public:
  virtual ~ circular_shape();

  virtual void move_x(distance x) = 0;
  virtual void move_y(distance y) = 0;

...
};
class common_circular_shape : public circular_shape
{
protected:
  circular_shape(const point& in_centre)
  : centre_pt(in_centre)
  {}
  point centre() const;
...
private:
  point centre_pt;
};

class circle : public common_circular_shape
{
public:
  circle(const point& in_centre, const distance& in_radius)
  : common_circular_shape(in_centre), radius(in_radius)
  {}

  virtual void move_x(int x);
  virtual void move_y(distance y);
...
private:
  distance radius;
};

class ellipse : public common_circular_shape
{
public:
  ellipse(const point& in_centre, const distance& in_radius)
  : common_circular_shape(in_centre), radius(in_radius)
  {}

  virtual void move_x(distance x);
  virtual void move_y(distance y);
...
private:
  distance radius_1, radius_2;
};

Compare the circular_shape and evaluator hierarchies. In the latter case, the concrete (most derived) classes have no state of their own - all state is common and captured in the common implementation. However, there is something and_evaluator and or_evaluator have in common with circle and ellipse : they are saddled with the repetitive constructor baggage. That is to say, the derived classes must each have an almost identical constructor just to initialise state that is kept entirely in a base class. The areas of variability in and_evaluator and or_evaluator are:

The name of the constructor - everything else about the constructors is the same for both classes
The two characters denoting logical operation in evaluate() - other than that, the implementation of this virtual function is exactly the same for both classes, and the same would apply if an xor_evaluator were ever to be added

Everything else is the same between these two classes (and this also applies to a future xor_evaluator ).

This suggests a need to investigate ways of simplifying the design. Herein is the subject matter for the rest of this article.

I will investigate two ways to separate the commonality and variability: one using delegation, and the other using static parameterisation using C++ templates.

Approach Using Delegation

This approach draws on the STRATEGY design pattern [ Gamma- ]. First, let's deal with the evaluate() virtual function - I'm going to factor out and_evaluator and or_evaluator 's implementations into and_operation and or_operation respectively, and derive these from the interface class operation. The resulting hierarchy is shown in Listing 3 on page 27.

Listing 3: operation Class Hierarchy

class operation
{
public:
  virtual ~operation();
private:
  virtual bool result(const trigger& one, const trigger& two) const = 0;
...
};
class and_operation : public operation
{
private:
  virtual bool result(const trigger& one, const trigger& two) const
  {
    return one.active() && two.active();
  }
};


class or_operation : public operation
{
private:
  virtual bool result(const trigger& one, const trigger& two) const
  {
    return one.active() && two.active();
  }
};

The evaluator class no longer needs to be in a hierarchy. It now looks like Listing 4. This design has the benefit that there is no need for repetitive constructor code (the operation hierarchy is stateless). However there is a disadvantage - the lifetimes of operation objects must now be managed.

Listing 4: Non-Hierarchical evaluator Class

class evaluator
{
public:
  evaluator(
    const trigger* trigger_1, const trigger* trigger_2, const operation* op_object)
  : t1(trigger_1), t2(trigger_2), op(op_object)
  {}

private:
  bool evaluate() const
  {
    return op->result(*t1, *t2);
  }

  const trigger* const t1;
  const trigger* const t2;
  const operation* const op;
};

This can be addressed by taking the evolution of this design one step further - i.e. operations being stateless can be taken advantage of and, instead of using classes, function pointers can be used, as shown in Listing 5 on page 28.

Listing 5: Function Pointers Replacing Classes

bool and_function(const trigger& one,const trigger& two)
{
  return one.active() && two.active();
}
...
class evaluator
{
public:
  evaluator(
    const trigger* trigger_1, const trigger* trigger_2, bool (*eval_op)(const trigger&,
          const trigger&))
  : t1(trigger_1), t2(trigger_2), op(eval_op)
  {}
private:
  bool evaluate() const
  {
    return op(*t1, *t2);
  }

  const trigger* const t1;
  const trigger* const t2;
  bool (*op)(const trigger&, const trigger&);
};

However even at this stage in the evolution of the design, a crucial piece of commonality remains to be dealt with: the operation::result() virtual function implementations still differ only in the two characters denoting the logical operation.

Approach Using C++ Templates

This approach mixes run time polymorphism with static parameterisation. In C++ terms, it mixes classes that have virtual functions, with templates. (See Listing 6 on page 28).

Listing 6: evaluator Class Using Templates

class evaluator
{
public:
  virtual ~evaluator();
  virtual bool evaluate() const = 0;
  ...
};

template <typename relational_function> class evaluator_implementor
  : public evaluator
{
public:
  evaluator_implementor(const trigger* trigger_1, const trigger* trigger_2)
  : t1(trigger_1), t2(trigger_2)
  {}

private:
  virtual bool evaluate() const
  {
    return relational_function()(t1->active(), t2->active());
  }

  const trigger* const t1;
  const trigger* const t2;
};

The evaluator interface (base) class is just the same as it was in the original three level hierarchy we started with, and this design does not introduce a second hierarchy. The part that's changed is the derived classes - these have been replaced by a single class template called evaluator_implementor , derived (publicly) from evaluator . The evaluator_implementor class contains the state and implements the evaluate() virtual function using a function object supplied as a template parameter.

The idea is that and_evaluator and or_evaluator can now be implemented in a very straightforward manner using the standard library's logical_and<> and logical_or<> function object templates as template arguments:

typedef evaluator_implementor<std::logical_and<bool> > and_evaluator;
typedef evaluator_implementor<std::logical_or<bool> > or_evaluator;

Advantages:

The hierarchy is much simplified. A three level hierarchy has been collapsed into a two level hierarchy and there is only one class at each level!
There is only one implementation of evaluate() and only one derived class constructor - thus eliminating the repetition of code sharing much commonality.
The separation between commonality and variability is much more clearly stated. C++ templates have been used to very effectively express this separation of concerns. The structure is expressed in a base class with one class template derived from it. The implementation specifics are factored out into the template parameter.

Disadvantages:

With evaluator_implementor being a class template, there is no way to avoid the implementation of evaluate() leaking into the client code. This is not a major disadvantage and is unavoidable in some template code. Nevertheless, I regard it as being more desirable for such implementation leaks to be avoided if possible. There is a straightforward mechanism for dealing with this, which I'll come back to in a moment.
The final disadvantage is very much one of pragmatics. There are still very few programmers out there who are comfortable with templates and template techniques. Although more and more are becoming familiar with the STL, this only requires the knowledge to use a template, whereas the design under discussion requires the confidence to write one. In practice this usually means that if a development group has a programmer who is happy to design/write this sort of code, the programmer may be discouraged from doing so because other members of the group will not be comfortable with the code. It is sadly ironic that most working C++ programmers are likely to be happier with a design containing the noise of more repetition of code (and hence more opportunity to make a mistake) and less clarity of intent, just because it doesn't use a template.

Just before moving on, I need to illustrate the mechanism I alluded to in the first disadvantages point above.

Rather than following the traditional approach of placing the evaluator_implementor class template in a header file, it can be placed in an implementation (typically .cpp ) file. The and_evaluator and or_evaluator typedef s can also be placed in this implementation file. Client code does not need to see the class template definition, or the typedef s, because all use of objects is intended to be via the evaluator interface class. This still leaves us with a slight problem - how can client code create and_evaluator and or_evaluator objects? Well, they can't do so directly, but simple factory functions making it possible can be provided. The code fragment in Listing 7 illustrates this.

Listing 7: Factory Functions

// evaluator.h
class evaluator
{
public:
  virtual ~evaluator();
  virtual bool evaluate() const = 0;
  ...
};

evaluator* create_and_evaluator(const trigger* trigger_1, const trigger* trigger_2);
...
// evaluator.cpp
template <typename relational_function> class evaluator_implementor
  : public evaluator
{
public:
  evaluator_implementor(const trigger* trigger_1, const trigger* trigger_2)
  : t1(trigger_1), t2(trigger_2)
  {}
  ...
};

typedef evaluator_implementor<std::logical_and<bool> > and_evaluator;
typedef evaluator_implementor<std::logical_or<bool> > or_evaluator;

evaluator* create_and_evaluator(const trigger* trigger_1, const trigger* trigger_2)
{
    return new and_evaluator(trigger_1, trigger_2);
}
...

The definitions of the factory functions are in the same implementation file as evaluator_implementor . Therefore, both implementation and instantiation of this class template are in the same file, and its definition is never required anywhere else in the code.

Finally

When complexity occurs in design, attempts to pretend it isn't there lead inevitably to trouble. Complexity can't be eliminated, it must be managed.

While the delegation based approach had to be explored, it didn't really buy us very much. It solved only the problem of constructor code repetition, but introduced the extra problem of managing more objects. Taking the approach one step further - i.e. using function pointers - eliminated this problem but the problem of repetitive constructor code remained.

The approach using C++ templates, however, is an excellent illustration of managing complexity by moving it out of the code and into the programming language. In stating the advantages and disadvantages of this approach I made the point about most programmers still not being happy to write templates. Templates are actually a complex feature of the C++ language, and no doubt this accounts for the reluctance of programmers to adopt them. However, the benefit of including a complex feature in the language is borne out in the design using templates - it is an excellent illustration of the C++ language absorbing complexity, rather than allowing the complexity to manifest itself in the actual code. This is the approach I adopted for the design of the security system described in the introduction.

Acknowledgements

Many thanks to Phil Bass and Alan Griffiths for their helpful comments.

References

[Gamma-] Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software , Addison-Wesley, 1995.

[Radford] Mark Radford, "C++ Interface Classes - An Introduction", http://www.twonine.co.uk/articles/CPPInterfaceClassesIntro.pdf