ACCU Home page ACCU Conference Page
Search Contact us ACCU at Flickr ACCU at GitHib ACCU at Google+ ACCU at Facebook ACCU at Linked-in ACCU at Twitter Skip Navigation

pinCompile-time Data Structures in C++17: Part 1, Set of Types

Overload Journal #146 - August 2018 + Programming Topics   Author: Bronek Kozicki
Compile time data structures can speed things up at runtime. Bronek Kozicki details an implementation of a compile time set.

The standard collection std::set is known to every C++ programmer. The runtime complexity of its find and insert operations O(N ln N) is one of the reasons why it is often replaced with std::unordered_set, which has amortised complexity of O(1). But what if we had access to a similar data structure with a guaranteed complexity of ‘O(0)’, that is one where no computation is performed during runtime at all, and calls to member functions (such as insert or find) are directly replaced, during the compilation, by the results of the call? That also means that such data structure could be used inside compile-time expressions, such as std::conditional or std::enable_if, whose sample implementations are shown in Listing 1.

template <bool S, typename T, typename F> struct
  conditional;
template <typename T, typename F> struct
  conditional<true, T, F> { using type = T; };
template <typename T, typename F> struct
  conditional<false, T, F> { using type = F; };

template <bool S, typename T> struct enable_if {};
template <typename T> struct enable_if<true, T> {
  using type = T; };
            
Listing 1
About O(0)
There is no such thing as ‘O(0)’, hence quotes. That is because in big-O notation, we disregard the constant component, and the difference between O(1) and ‘O(0)’ is only the constant component (equal to C, for some unspecified fixed value C). In practice, the notion of big-O applies only to runtime complexity, and, as explained on the left, we are seeking to remove the runtime cost entirely. Hence our choice is to either disregard the big-O notation as not applicable or use a dummy ‘O(0)’.

Of course, there must be a drawback to such a data structure: since we expect the lookup operation to work during compilation, it follows that such a set must also be fully populated during compilation. That might not seem very useful in a small program; however, design concerns in large programs often drive to more decoupling, where such ability might be useful. For a trivial example, see the if constexpr expression in the body of the Foo constructor in Listing 2.

struct Bar {};
struct Foo {
  int i = 0;
  template <typename Set> constexpr explicit
  Foo(Set) {
    if constexpr ((bool)Set::template test<Bar>)
      i += 1;
  }
};
            
Listing 2

The sample code in Listing 2 demonstrates another important point about compile-time data structures: the ‘value’ looked for is a tag type (named Bar in this example). Such types provide us with symbolic, unique names without the need for a central repository (e.g. an enum type), hence avoiding accidental coupling inside such a repository, while at the same time minimising the risk of clashes if there is no such central location (e.g. from extern const int or C-style #define). Examples of tag types in the C++ standard include std::nothrow_t or types defined for various overloads of std::unique_lock constructor.

What would be the interface of such a ‘set of types’ utility? One member has already been presented above; it is the test template variable (NB, (bool) cast is a workaround for a bug in MSVC 15.7, to be fixed in version 15.8 and superfluous for other modern compilers). We also want the ability to populate the set and compare it to other sets. But how can we populate a compile-time construct? It is immutable, after all. The solution is to apply the principles of functional programming – in the compile time. That is, we can create a new container instance (or, in our case, a new container type) as a copy of the existing container with the addition of a new element. That sounds expensive, but let’s not forget that the set must be filled during the compilation, which means that the runtime complexity of such an operation is still going to be ‘O(0)’, at the cost of compilation time. In other words, to populate our set we are going to use a pure function, to be entirely evaluated during the compilation, which returns either a type or an immutable value: a meta-function. As we will see, quite often meta-functions are not functions at all – they can be immutable template values or template types. A simplistic example is presented in Listing 3.

namespace set_impl {
  template <typename ... L> struct contains;
  template <> struct contains<> {
    template <typename>
    constexpr static bool value = false;
  };
  template <typename T, typename ... L> struct
    contains<T, L...> {
    template <typename U>
    constexpr static bool value =
    std::is_same_v<T, U> ||
      contains<L...>::template value<U>;
  };
}

template <typename ... L>
struct set {
  constexpr explicit set() = default;
  template <typename T>
  constexpr static bool test =
    set_impl::contains<L...>::template value<T>;
  template <typename T>
  using insert = set<T, L...>;
};

int main() {
  struct Fuz; struct Baz;
  constexpr static auto s1 =
    set<>::insert<Baz>::insert<Fuz>{};
  constexpr static auto s2 =
    decltype(s1)::insert<Bar>{};
  constexpr static Foo foo2{ s2 };
  std::cout << foo2.i << std::endl;
}
            
Listing 3

The implementation of test presented in Listing 3 is indeed simplistic (we are going to improve it later). It is encapsulated as an implementation detail struct contains (in the set_impl namespace), which is a variadic template just like set, but dedicated to the calculation of the value contains, with the help of standard type trait std::is_same_v and two template specialisations, acting the role of a recursive meta-function. Specialisation contains<T, L...> performs equality check between U and T, while contains<> returns the terminating condition ‘not found’. Note how we used immutable template value template <typename U> constexpr static bool value = ... to pass the result of the nested meta-function to its caller. Such recursive calls are a common pattern in functional programming because they work well with immutable data. For the same reason, they work well with meta-functions, and we will employ recursion often. However, in meta-programming, we have always to remember to provide a terminating specialisation – which is an equivalent of a recursion terminating execution branch in functional programming.

The implementation of the insert meta-function is trivial, but sadly also incorrect, as we will see below. That is because of two outstanding, and so far ignored, properties of most sets, which are:

  • Uniqueness of the elements.
  • Constraints on the type of elements.

It is the presence of both which gives as a ‘set of something’. In std::set, the first guarantee is provided by quietly ignoring duplicate inserts, while the template parameter captures the later one. For our ‘set of types’, we are going to follow the lead of std::set and ignore duplicates. For the later guarantee we can, for the time being, apply some hardcoded set of constraints. For example:

  • Disallow qualifiers (const or volatile)
  • Disallow reference types (lvalue or rvalue)
  • Disallow pointer types

The proposed constraints do not impact the uses of our set where tag types are passed directly, for example, hardcoded. If they are deduced, or coded in a remote location, they only require additional use of std::decay_t to remove qualifiers or reference. The constraint to disallow pointers is meant to protect against user errors – either a typo or deduced type which unexpectedly turns out to be a pointer. Additionally, since our elements are types, we need to consider how to handle void. One useful solution is to handle it as an element of an empty set – that is, a non-element. In other words, we are going to use void as a placeholder for an element, where no element is available (the similarity to the role of void in C and C++ becomes obvious when ‘element’ is replaced by ‘type’).

Storing actual values
What if we wanted to store actual values, known at compilation time? There are a few options:
  • Implement a set of values separately, following the ideas presented here
  • Store values wrapped into instances of std::integral_constant
  • Assign unique, immutable values to tag types

Back to our code example, one can easily spot that the insert meta-function is not doing anything to enforce uniqueness of added types. Similarly, nothing is preventing the user from instantiating, e.g. set<int, int>. These two problems aside, yet another limitation of our set is that set<int, long> and set<long, int> are both distinct types but they are one set (because the order of elements does not matter to mathematical sets). There is probably no robust solution for the last problem, but a workable compromise would be a unique member type inside set to hold unique types, and creation of an is_same member meta-function to test set equality with no regard to order. That might look like Listing 4.

template <typename ... L> class set;
namespace set_impl {
  template <typename T> struct check {
    static_assert(std::is_same_v<T,
      std::decay_t<T>>);
    static_assert(not std::is_pointer_v<T>);
    using type = T;
   };

  template <typename ...L> struct contains_detail;
  template <typename U> struct
    contains_detail<U> {
    using type = typename check<U>::type;
    constexpr static bool value =
      std::is_void_v<type>;
    };
    template <typename U, typename T,
      typename ... L> struct
      contains_detail<U, T, L...> {
      using type = typename check<U>::type;
      using head = typename check<T>::type;
      constexpr static bool value =
        std::is_void_v<type>
        || std::is_same_v<type, head>
        || contains_detail<U, L...>::value;
  };

  template <typename ... L> struct insert_detail;
  template <typename T, typename ... L> struct
    insert_detail<T, set<L...>> {
    using head = typename check<T>::type;
    using type = set<head, L...>;
  };

  template <typename ... L> struct
    unchecked_list {};
  template <typename T> struct cracker_list;
  template <typename ... L> struct
    cracker_list<set<L...>> {
    using type = unchecked_list<L...>;
  };
  template <typename T> struct cracker;
  template <typename ... L> struct
    cracker<set<L...>> {
    using list = typename cracker_list<
      typename set<L...>::unique>::type;
    constexpr static size_t size =
      set<L...>::unique::size;
};

  template <typename ... L> struct
    intersect_detail;
  template <typename ... T> struct
    intersect_detail<unchecked_list<>,
      unchecked_list<T...>> {
    constexpr static size_t size = 0;
  };
  template <typename V, typename ... L,
    typename ... T> struct
    intersect_detail<unchecked_list<V,L...>,
      unchecked_list<T...>> {
    constexpr static size_t size =
      intersect_detail<unchecked_list<L...>,
        unchecked_list<T...>>::size
        + (not std::is_void_v<V>
          && contains_detail<V, T...>::value);
  };
  template <typename ... L> struct unique;
  template <> struct unique<> {
    using type = set<>;
    constexpr static size_t size = 0;
    template <typename , size_t Size>
    constexpr static bool is_same = Size == 0;
    template <typename U>
    constexpr static bool test =
      contains_detail<U>::value;
  };
  template <typename T, typename ... L> struct
    unique<T, L...> {
    using type = typename std::conditional<
      contains_detail<T, L...>::value
      , typename unique<L...>::type
      , typename insert_detail<T,
          typename unique<L...>::type
        >::type
      >::type;
    constexpr static size_t size =
      contains_detail<T, L...>::value
      ? unique<L...>::size
      : unique<L...>::size + 1;
    template <typename U, size_t Size>
    constexpr static bool is_same =
      size == Size
      && size == intersect_detail<
        unchecked_list<T,L...>, U>::size;
      template <typename U>
      constexpr static bool test =
        contains_detail<U, T, L...>::value;
  };
}

template <typename ... L>
class set {
  using impl = set_impl::unique<L...>;
public:
  constexpr explicit set() = default;
  using unique = typename impl::type;
  constexpr static size_t size = impl::size;
  constexpr static bool empty = size == 0;
  template <typename T>
  constexpr static bool is_same =
    impl::template is_same<typename set_impl
        ::cracker<T>::list, set_impl::cracker<T>
        ::size>;
  template <typename T>
  constexpr static bool test =
    impl::template test<T>;
  template <typename T>
  using insert =
    typename set_impl::unique<T, L...>::type;
};
            
Listing 4

The code may appear complex, but that is only because of the amount of typing necessary in C++ to code each simple meta-function. In fact, it is quite simple, although there are few things to note. Again we are delegating the work required to individual meta-functions, and we also have a unique type to build the unique set of types (this ensures that e.g. set<int, int> will be recognised as having only one element since repeated elements do not count). Inside this unique type we delegate tasks to individual meta-functions insert_details, contains_details and intersect_details. The last one calculates the size of the intersection of sets, which is required to test set equality using a simple formula: sets are equal if they have equal size and their intersection is equal to that size as well. We could reuse instersect_details to implement two more useful checks is_cross to check whether two sets share a non-empty intersection and is_super to check whether one set is a subset of another (in reverse, since our ‘superset’ will be on the left side and ‘subset’ on the right). In the first case, we only need to know that the size of the set intersection is greater than 0. In the latter, the size of the intersection will be equal to the size of a subset. The opportunity for reuse of intersect_details is obvious, and to do so we are going to introduce a higher order meta-function compare, to replace unique::is_same. In functional programming, a higher order function is a function which consumes (or produces, or both) a function. In our case, a meta-function is a template (whereas types take the role of values) which means that higher-order meta-function compare is going to consume a template template parameter (yes, that is the word ‘template’ twice in a row) which encapsulates the meta-function to perform the size comparison required. See Listing 5.

namespace set_impl {
  template <size_t Mine, size_t Cross,
    size_t Theirs> struct is_cross {
    constexpr static bool value = Cross > 0;
  };
  template <size_t Mine, size_t Cross,
    size_t Theirs> struct is_super {
    constexpr static bool value =
      Cross == Theirs;
  };
  template <size_t Mine, size_t Cross,
    size_t Theirs> struct is_same {
    constexpr static bool value =
      Mine == Cross && Cross == Theirs;
  };
  template <typename ... L> struct unique;
  template <> struct unique<> {
    // ...
    template <template <size_t, size_t, size_t>
      typename How, typename U, size_t Size>
    constexpr static bool compare =
      How<0, 0, Size>::value;
  };
  template <typename T, typename ... L>
    struct unique<T, L...> {
    // ...
    template <template <size_t, size_t, size_t>
      typename How, typename U, size_t Size>
    constexpr static bool compare =
      How<size,
      intersect_detail<unchecked_list<T, L...>,
      U>::size, Size>::value;
  };
}

template <typename ... L>
class set {
  using impl = set_impl::unique<L...>;
public:
  // ...
  template <typename T>
  constexpr static bool is_same =
    impl::template compare<set_impl::is_same,
    typename set_impl::cracker<T>::list,
    set_impl::cracker<T>::size>;
  template <typename T>
  constexpr static bool is_cross =
    impl::template compare<set_impl::is_cross,
    typename set_impl::cracker<T>::list,
    set_impl::cracker<T>::size>;
  template <typename T>
  constexpr static bool is_super =
    impl::template compare<set_impl::is_super,
    typename set_impl::cracker<T>::list,
    set_impl::cracker<T>::size>;
};
            
Listing 5

Right at the end, it is time to revisit the set of constraints imposed on types stored in our set of types. As we have just seen, a template template parameter can be used to implement a higher order (meta-) function, which is an ideal way for the user to select their own set of constraints, and perhaps even type transformations. The final program (Listing 6) makes use of this ability.

#include <cstdio>
#include <utility>
#include <type_traits>

template <template <typename> typename Check, typename ... L> class set;

namespace set_impl {
  template <template <typename> typename Check,
    typename ... L> struct contains_detail;
  template <template <typename> typename Check,
    typename U> struct contains_detail<Check, U> {
    using type = typename Check<U>::type;
    constexpr static bool value =
      std::is_void_v<type>;
  };
  template <template <typename> typename Check,
    typename U, typename T, typename ... L>
    struct contains_detail<Check, U, T, L...> {
    using type = typename Check<U>::type;
    using head = typename Check<T>::type;
    constexpr static bool value =
      std::is_void_v<type>
      || std::is_same_v<type, head>
      || contains_detail<Check, U, L...>::value;
    };
  template <template <typename> typename Check,
    typename ... L> struct insert_detail;
  template <template <typename> typename Check,
    typename T, typename ... L>
    struct insert_detail<Check, T, set<Check,
    L...>> {
    using head = typename Check<T>::type;
    using type = set<Check, head, L...>;
  };
  template <typename ... L> struct unchecked_list
  {};
  template <typename T> struct cracker_list;
  template <template <typename> typename Check,
    typename ... L> struct cracker_list<set<Check,
    L...>> {
    using type = unchecked_list<L...>;
  };
  template <typename T> struct cracker;
  template <template <typename> typename Check,
    typename ... L> struct cracker<set<Check,
    L...>> {
    using list = typename
      cracker_list<typename set<Check,
      L...>::unique>::type;
    constexpr static size_t size = set<Check,
      L...>::unique::size;
  };
  template <typename T> struct dummy_check {
    using type = T;
  };
  template <typename ... L> struct
    intersect_detail;
  template <typename ... T> struct
    intersect_detail<unchecked_list<>,
    unchecked_list<T...>> {
    constexpr static size_t size = 0;
  };
  template <typename V, typename ... L,
    typename ... T> struct
    intersect_detail<unchecked_list<V, L...>,
    unchecked_list<T...>> {
    constexpr static size_t size =
      intersect_detail<unchecked_list<L...>,
      unchecked_list<T...>>::size +
      (not std::is_void_v<V>
        && contains_detail<dummy_check, V,
        T...>::value);
  };
  template <size_t Mine, size_t Cross,
    size_t Theirs> struct is_cross {
    constexpr static bool value = Cross > 0;
  };
  template <size_t Mine, size_t Cross,
    size_t Theirs> struct is_super {
    constexpr static bool value =
      Cross == Theirs;
  };
  template <size_t Mine, size_t Cross,
    size_t Theirs> struct is_same {
    constexpr static bool value = Mine == Cross
      && Cross == Theirs;
  };
  template <template <typename> typename Check,
    typename ... L> struct unique;

  template <template <typename> typename Check>
    struct unique<Check> {
    using type = set<Check>;
    constexpr static size_t size = 0;

    template <template <size_t, size_t, size_t>
      typename How, typename U, size_t Size>
    constexpr static bool compare =
      How<0, 0, Size>::value;

    template <typename U>
    constexpr static bool test =
      contains_detail<Check, U>::value;
  };
  template <template <typename> typename Check,
    typename T, typename ... L>
    struct unique<Check, T, L...> {
    using type = typename std::conditional<
      contains_detail<Check, T, L...>::value
      , typename unique<Check, L...>::type
      , typename insert_detail<Check, T,
        typename unique<Check, L...>::type>::type
    >::type;
    constexpr static size_t size =
      contains_detail<Check, T, L...>::value
      ? unique<Check, L...>::size
      : unique<Check, L...>::size + 1;
    template <template <size_t, size_t, size_t>
      typename How, typename U, size_t Size>
    constexpr static bool compare =
      How<size, intersect_detail
      <unchecked_list<T, L...>, U>::size,
      Size>::value;
    template <typename U>
    constexpr static bool test =
      contains_detail<Check, U, T, L...>::value;
  };
}

template <template <typename> typename Check,
   typename ... L>
class set {
  using impl = set_impl::unique<Check, L...>;
public:
  constexpr explicit set() = default;
  using unique = typename impl::type;

  constexpr static size_t size = impl::size;
  constexpr static bool empty = size == 0;

  template <typename T>
  constexpr static bool is_same =
    impl::template compare<set_impl::is_same,
    typename set_impl::cracker<T>::list,
    set_impl::cracker<T>::size>;

  template <typename T>
  constexpr static bool is_cross =
    impl::template compare<set_impl::is_cross,
    typename set_impl::cracker<T>::list,
    set_impl::cracker<T>::size>;

  template <typename T>
  constexpr static bool is_super =
    impl::template compare<set_impl::is_super,
    typename set_impl::cracker<T>::list,
    set_impl::cracker<T>::size>;

  template <typename T>
  constexpr static bool test =
    impl::template test<T>;

  template <typename T>
  using type =
    typename std::enable_if<(not std::is_void_v<T>
    && test<T>), typename Check<T>::type>::type;

  template <typename T>
  using insert = typename
    set_impl::unique<Check, T, L...>::type;
};

struct Baz { Baz() = delete; };
struct Bar { Bar() = delete; };
struct Fuz { Fuz() = delete; };

struct Foo {
  int i = 0;

  template <typename Set> constexpr explicit
  Foo(Set) {
    if constexpr ((bool)Set::template test<Bar>)
      i += 1;
  }
};

template <typename T> struct PlainTypes {
  static_assert(std::is_same_v<T, std::decay_t<T>>);
  static_assert(not std::is_pointer_v<T>);
  using type = T;
};

int main() {
  constexpr static set<PlainTypes> s0{};
  constexpr static auto s1 =
    decltype(s0)::insert<Baz>::insert<Fuz>{};
  constexpr static Foo foo1{s1};
  std::printf("foo1.i=%d\n", foo1.i);

  constexpr static auto s2 =
    decltype(s1)::insert<Bar>{};
  constexpr static Foo foo2{s2};
  std::printf("foo2.i=%d\n", foo2.i);
}
            
Listing 6
Building the program
Presented programs should build without warnings under C++17 compliant compilers. The following have been tested:
  • GCC 8.1
  • Clang 6.0
  • Visual Studio 2017 (compiler version 15.7)

A sample CMake file which works for all these compilers is presented below:

cmake_minimum_required(VERSION 3.8)
project(metaset)
set(CMAKE_CXX_STANDARD 17)
if(MSVC)
  set(CMAKE_CXX_FLAGS
   "${CMAKE_CXX_FLAGS} /permissive-")
  set(CMAKE_CXX_FLAGS_DEBUG
   "${CMAKE_CXX_FLAGS_DEBUG} /D_DEBUG")
else()
  set(CMAKE_CXX_FLAGS
   "${CMAKE_CXX_FLAGS} -Wall -Wextra -Wpedantic")
  set(CMAKE_CXX_FLAGS_RELEASE "-O3 -DNDEBUG")
  set(CMAKE_CXX_FLAGS_DEBUG "-O0 -g")
endif()
set(SOURCE_FILES main.cpp)
add_executable(${PROJECT_NAME} ${SOURCE_FILES})

Overload Journal #146 - August 2018 + Programming Topics