Better Encapsulation for the Curiously Recurring Template Pattern

Overload Journal #70 - Dec 2005 + Programming Topics   Author: Alexander Nasonov

C++ has a long, outstanding history of tricks and idioms. One of the oldest is the curiously recurring template pattern (CRTP), identified by James Coplien in 1995 [Coplien]. Since then, CRTP has been popularized and is used in many libraries, particularly in Boost [Boost]. For example, you can find it in the Boost.Iterator, Boost.Python and Boost.Serialization libraries.

In this article I assume that the reader is already familiar with CRTP. If you would like to refresh your memory, I recommend reading chapter 17 of [Vandevoorde-], which is available for free on www.informit.com.

If you look at the curiously recurring template pattern from an OO perspective you'll notice that it shares common properties with OO frameworks (e.g. Microsoft Foundation Classes) where base class member functions call virtual functions implemented in derived classes. The following code snippet demonstrates OO framework style in its simplest form:

// Library code
class Base
{
  public:
    virtual ~Base();
    int foo() { return this->do_foo(); }

  protected:
    virtual int do_foo() = 0;
};

Here, Base::foo calls virtual function do_foo, which is declared as a pure virtual function in Base and, therefore, it must be implemented in derived classes. Indeed, a body of do_foo appears in class Derived:

// User code
class Derived : public Base
{
  private:
    virtual int do_foo() { return 0; }
};

What is interesting here is that the access specifier of do_foo has been changed from protected to private. This is perfectly legal in C++ and it takes a second to type one simple word. What is more, it's done intentionally to emphasize that do_foo isn't for public use. (A user may go further and hide the whole Derived class if she thinks it's worth it.)
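The two snippets above can be combined into a small compilable sketch. The helper call_through_base is a hypothetical name introduced only for illustration, and the destructor is given an empty body so that the example links:

```cpp
#include <cassert>

// Library code: the public function forwards to a virtual hook.
class Base
{
  public:
    virtual ~Base() {}
    int foo() { return this->do_foo(); }

  protected:
    virtual int do_foo() = 0;
};

// User code: the override is private, yet Base::foo still reaches it,
// because the call is checked against the declaration in Base.
class Derived : public Base
{
  private:
    virtual int do_foo() { return 0; }
};

// Hypothetical helper, not part of the article's code.
int call_through_base()
{
    Derived d;
    Base& b = d;
    // d.do_foo();        // would not compile: do_foo is private in Derived
    int result = b.foo(); // dispatches to Derived::do_foo
    assert(result == 0);
    return result;
}
```

The commented-out line is the point of the exercise: outside code cannot call do_foo on a Derived object, while the framework still can.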

The moral of the story is that a user should be able to hide implementation details of the class easily.

Now let us assume that restrictions imposed by virtual functions are not affordable and the framework author decided to apply CRTP:

// Library code
template<class DerivedT>
class Base
{
  public:
    DerivedT& derived() {
       return static_cast<DerivedT&>(*this); }
    int foo() {
       return this->derived().do_foo(); }
};
// User code
class Derived : public Base<Derived>
{
  public:
    int do_foo() { return 0; }
};

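Although do_foo is an implementation detail, it's accessible from everywhere. Why not make it private or protected? You'll find the answer inside function foo. As you see, the function calls Derived::do_foo. In other words, the base class calls a function defined in a derived class directly, and this is an ordinary member access that is subject to the usual access checks: Base<Derived> has no special rights to the non-public members of Derived.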
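As a minimal sketch of the problem, the CRTP code above can be exercised as follows. The helper call_foo is a hypothetical name added for illustration only:

```cpp
#include <cassert>

// Library code
template<class DerivedT>
class Base
{
  public:
    DerivedT& derived() { return static_cast<DerivedT&>(*this); }
    // Ordinary member access: do_foo must be accessible from Base.
    int foo() { return this->derived().do_foo(); }
};

// User code
class Derived : public Base<Derived>
{
  public: // changing this to private or protected breaks Base<Derived>::foo
    int do_foo() { return 0; }
};

// Hypothetical helper, not part of the article's code.
int call_foo()
{
    Derived d;
    int result = d.foo();
    assert(result == 0);
    // The implementation detail is exposed: this call compiles too.
    assert(d.do_foo() == 0);
    return result;
}
```

With do_foo made private, the call inside foo would be rejected as soon as Base<Derived>::foo is instantiated.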

Now, let's find the easiest way for a user to hide implementation details of Derived. It should be very easy; otherwise, users won't use it. It can be a bit trickier for the author of Base but it still should be easy to follow.

The most obvious way of achieving this is to establish a friendship between Base and Derived:

// User code
class Derived : public Base<Derived>
{
  private:
    friend class Base<Derived>;
    int do_foo() { return 0; }
};

This solution is not perfect for one simple reason: the length of the friend declaration is proportional to the number of template parameters of the Base class template, because every template argument has to be repeated. It might get quite long if you add more parameters.
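To illustrate how the friend declaration grows, here is a sketch with a Base that takes two extra template parameters. The parameters ValueT and PolicyT, the type DefaultPolicy and the helper call_foo are all assumptions made for this example; they do not appear in the article's code:

```cpp
#include <cassert>

// Hypothetical policy type, for illustration only.
struct DefaultPolicy {};

// Library code: a CRTP base with two additional (hypothetical) parameters.
template<class DerivedT, class ValueT, class PolicyT>
class Base
{
  public:
    DerivedT& derived() { return static_cast<DerivedT&>(*this); }
    ValueT foo() { return this->derived().do_foo(); }
};

// User code
class Derived : public Base<Derived, int, DefaultPolicy>
{
  private:
    // Every template argument has to be repeated here.
    friend class Base<Derived, int, DefaultPolicy>;
    int do_foo() { return 0; }
};

// Hypothetical helper, not part of the article's code.
int call_foo()
{
    Derived d;
    int result = d.foo();
    assert(result == 0);
    return result;
}
```

The friend declaration and the base-class specifier have to be kept in sync by hand, which is easy to get wrong once the argument list changes.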

To get rid of this problem one can fix the length of the friend declaration by introducing a non-template Accessor that forwards calls:

// Library code
class Accessor
{
  private:
    template<class> friend class Base;
    template<class DerivedT>
    static int foo(DerivedT& derived)
    {
        return derived.do_foo();
    }
};

The function Base::foo should call Accessor::foo, which in turn calls Derived::do_foo. The first step of this call chain always succeeds because Base is a friend of Accessor:

// Library code
template<class DerivedT>
class Base
{
  public:
    DerivedT& derived() {
       return static_cast<DerivedT&>(*this); }
    int foo() {
       return Accessor::foo(this->derived()); }
};

The second step succeeds only if do_foo is public, or if Accessor is a friend of Derived, in which case do_foo may be private or protected. We are interested only in the second alternative:

// User code
class Derived : public Base<Derived>
{
  private:
    friend class Accessor;
    int do_foo() { return 0; }
};
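Put together, the three pieces can be sketched as one translation unit. The forward declaration of Base and the helper call_foo are additions made for this example; the rest follows the snippets above:

```cpp
#include <cassert>

// Forward declaration so that the friend declaration below
// refers to an already known template.
template<class DerivedT> class Base;

// Library code: a non-template class that forwards the call.
class Accessor
{
  private:
    template<class> friend class Base;

    template<class DerivedT>
    static int foo(DerivedT& derived)
    {
        return derived.do_foo();
    }
};

// Library code
template<class DerivedT>
class Base
{
  public:
    DerivedT& derived() { return static_cast<DerivedT&>(*this); }
    int foo() { return Accessor::foo(this->derived()); }
};

// User code: the friend declaration no longer depends on
// the template parameters of Base.
class Derived : public Base<Derived>
{
  private:
    friend class Accessor;
    int do_foo() { return 0; }
};

// Hypothetical helper, not part of the article's code.
int call_foo()
{
    Derived d;
    int result = d.foo(); // Base::foo -> Accessor::foo -> Derived::do_foo
    assert(result == 0);
    return result;
}
```

Because Accessor::foo is itself private, only Base specializations can use the access that Derived grants to Accessor.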

This approach is taken by several Boost libraries. For example, def_visitor_access in Boost.Python and iterator_core_access in Boost.Iterator should be declared as friends in order to access user-defined private functions from def_visitor and iterator_facade respectively.

Even though this solution is simple, there is a way to omit the friend declaration. This is not possible if do_foo is private - you will have to change that to protected. The difference between these two access specifiers is not so important for most CRTP uses. To understand why, take a look at how you derive from a CRTP base class:

class Derived : public Base<Derived> { /* ... */ };

Here, you pass the final class in Base's template argument list.

An attempt to derive from Derived doesn't give you any advantage because the Base<Derived> class knows only about Derived.
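A short sketch may make this concrete. The class MoreDerived and the helper call_foo_on_more_derived are hypothetical, and do_foo is left public here only to keep the example short:

```cpp
#include <cassert>

// Library code
template<class DerivedT>
class Base
{
  public:
    DerivedT& derived() { return static_cast<DerivedT&>(*this); }
    int foo() { return this->derived().do_foo(); }
};

// User code
class Derived : public Base<Derived>
{
  public:
    int do_foo() { return 0; }
};

// Hypothetical class derived from Derived; its do_foo merely hides
// Derived::do_foo and is unknown to Base<Derived>.
class MoreDerived : public Derived
{
  public:
    int do_foo() { return 1; }
};

// Hypothetical helper, not part of the article's code.
int call_foo_on_more_derived()
{
    MoreDerived m;
    assert(m.do_foo() == 1); // direct call sees MoreDerived::do_foo
    return m.foo();          // Base<Derived>::foo calls Derived::do_foo
}
```

Since Base<Derived> casts to Derived& and calls a non-virtual function, the function in MoreDerived is never reached through foo.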

Our goal is to access protected function Derived::do_foo from the Base:

// User code
class Derived : public Base<Derived>
{
  protected:
    // No friend declaration here!
    int do_foo() { return 0; }
};

Normally, you access a protected function declared in a base class from its child. The challenge is to access it the other way around.

The first step is obvious. A protected function can be accessed only from a descendant of Derived, so that is the only place for our interception point:

struct BreakProtection : Derived
{
    static int foo(Derived& derived) {
       /* call do_foo here */ }
};

An attempt to write

   return derived.do_foo();

inside BreakProtection::foo fails because it's forbidden by paragraph 11.5 of the standard [standard]:

When a friend or a member function of a derived class references a protected nonstatic member of a base class, an access check applies in addition to those described earlier in clause 11. Except when forming a pointer to member (5.3.1), the access must be through a pointer to, reference to, or object of the derived class itself (or any class derived from that class) (5.2.5).

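In other words, inside BreakProtection the function can be called only through an object of type BreakProtection (or of a class derived from it), not through a reference to a plain Derived.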

Well, if the function can't be called directly, let's call it indirectly. Taking the address of do_foo is legal inside the BreakProtection class:

    &BreakProtection::do_foo;

There is no do_foo inside BreakProtection, therefore this expression is resolved as &Derived::do_foo and has type int (Derived::*)(). Public access to a pointer to protected member function has been granted! It's time to call it:

struct BreakProtection : Derived
{
  static int foo(Derived& derived)
  {
    int (Derived::*fn)() =
       &BreakProtection::do_foo;
    return (derived.*fn)();
  }
};
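The trick can be tried in isolation with the following sketch. The CRTP base is left out to keep the focus on the access rule, and the helper call_protected is a hypothetical name used only here:

```cpp
#include <cassert>

// User code, with the CRTP base omitted for brevity.
class Derived
{
  protected:
    int do_foo() { return 0; }
};

struct BreakProtection : Derived
{
    static int foo(Derived& derived)
    {
        // return derived.do_foo(); // ill-formed: access through Derived&

        // Forming the pointer through BreakProtection passes the
        // protected-access check; the result is int (Derived::*)().
        int (Derived::*fn)() = &BreakProtection::do_foo;
        return (derived.*fn)();
    }
};

// Hypothetical helper, not part of the article's code.
int call_protected()
{
    Derived d;
    int result = BreakProtection::foo(d);
    assert(result == 0);
    return result;
}
```

Note that no BreakProtection object is ever created; the class serves only as a scope in which the pointer to member can be formed.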

For better encapsulation, the BreakProtection can be moved to the private section of Base class template. The final solution is:

// Library code
template<class DerivedT>
class Base
{
  private:
    struct accessor : DerivedT
    {
        static int foo(DerivedT& derived)
        {
            int (DerivedT::*fn)() 
               = &accessor::do_foo;
            return (derived.*fn)();
        }
    };
  public:
    DerivedT& derived() {
       return static_cast<DerivedT&>(*this); }
    int foo() { return accessor::foo(
       this->derived()); }
};
// User code
struct Derived : Base<Derived>
{
  protected:
    int do_foo() { return 1; }
};

Note that the user code is slightly shorter and cleaner than in the first solution. The library code has similar complexity.

There is a downside to this approach, though. Many compilers don't optimize away function pointer indirection even if it's called in-place:

return (derived.*(&accessor::do_foo))();

The main strength of CRTP over virtual functions is better optimization.

CRTP is faster because there is no virtual function call overhead and it compiles to smaller code because no type information is generated. The former is doubtful for the second (member function pointer) solution while the latter still holds. Hopefully, future versions of popular compilers will implement this kind of optimization. Also, it's less convenient to use member function pointers, especially for overloaded functions.
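The inconvenience with overloaded functions can be sketched as follows. This is an assumed extension of the final solution, not code from the article: the second do_foo overload, the matching foo(int) and the helper call_overloads are hypothetical. Each overload needs its own pointer with a fully spelled-out type, and the in-place form (derived.*(&accessor::do_foo))() would no longer compile because nothing selects an overload:

```cpp
#include <cassert>

// Library code
template<class DerivedT>
class Base
{
  private:
    struct accessor : DerivedT
    {
        static int foo(DerivedT& derived)
        {
            // The declared pointer type selects do_foo().
            int (DerivedT::*fn)() = &accessor::do_foo;
            return (derived.*fn)();
        }
        static int foo(DerivedT& derived, int x)
        {
            // A different pointer type selects do_foo(int).
            int (DerivedT::*fn)(int) = &accessor::do_foo;
            return (derived.*fn)(x);
        }
    };
  public:
    DerivedT& derived() { return static_cast<DerivedT&>(*this); }
    int foo() { return accessor::foo(this->derived()); }
    int foo(int x) { return accessor::foo(this->derived(), x); }
};

// User code: two protected overloads (the second one is hypothetical).
struct Derived : Base<Derived>
{
  protected:
    int do_foo() { return 1; }
    int do_foo(int x) { return x + 1; }
};

// Hypothetical helper, not part of the article's code.
int call_overloads()
{
    Derived d;
    assert(d.foo() == 1);
    assert(d.foo(41) == 42);
    return d.foo() + d.foo(41);
}
```

The library author has to repeat each signature in the accessor, which is the price paid for keeping the user code free of friend declarations.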

References

[Coplien] James O. Coplien. "Curiously Recurring Template Patterns", C++ Report, February 1995.

[Vandevoorde-] David Vandevoorde, Nicolai M. Josuttis. "C++ Templates: The Complete Guide". http://www.informit.com/articles/article.asp?p=31473

[Boost] Boost libraries. http://www.boost.org.

[standard] ISO/IEC 14882:1998(E), Programming languages - C++.