Static Reflection in C++

Static reflection is under consideration for C++26. Wu Yongwei demonstrates how to achieve reflection now and shows some examples of what C++26 might make possible.

Static reflection will be an important part of C++ compile-time programming, as I discussed in the October issue of Overload [Wu24]. This time I will discuss static reflection in detail, including how to emulate it right now, before it’s been added to the standard.

Background

Many programming languages support reflection (Python and Java, for example). C++ is lagging behind.

While this is the case, things are probably going to change in C++26. Also, what will be available in C++ will be very different from what is available in languages like Java or Python. The keyword is ‘static’.

Andrew Sutton defined ‘static reflection’ as follows [Sutton21]:

Static reflection is the integral ability for a metaprogram to observe its own code and, to a limited extent, generate new code at compile time.

‘Compile-time’ is the special sauce in C++, and it allows us to do things impossible in other languages:

Zero-overhead abstraction. As Bjarne Stroustrup famously put it, ‘What you don’t use, you don’t pay for. What you do use, you couldn’t hand-code any better.’ If you do not need static reflection, it will not make your program fatter or slower. But it will be at hand when you do need it.
High performance. Due to the nature of compile-time reflection, it is possible to achieve unparalleled performance, when compared with languages like Java or Python.
Versatility at both compile time and run time. The information available at compile time can be used at run time, but not vice versa. C++ static reflection can do things that are possible in languages like Java, but there are also things C++ can do that are simply impossible in other languages.

What we want from reflection

When we talk about static reflection, what do we really want? We really want to see what a compiler can see, and we want to be able to use the relevant information in the code. The most prominent cases are enum and struct. We want to be able to iterate over all the enumerators, and know their names and values. We want to be able to iterate over all the data members of a struct, and know their names and types. Obviously, when a data member is an aggregate, we also want to be able to recurse into it during reflection. And so on.

Regretfully, we cannot do all these things today with ‘standard’ definitions. Yes, in some implementations it is possible to hack out some of the information with various tricks. I would prefer to use macros and template techniques to achieve the same purpose, as the code is somewhat neater, more portable, and more maintainable – at the cost of using non-standard definition syntaxes. Of course, nothing beats direct support from the future C++ standard.

A few words on macro techniques

I have accumulated some macro code over the years, starting from the work of Netcan [Netcan]. The key facilities are:

GET_ARG_COUNT: Get the count of variadic arguments, so that GET_ARG_COUNT(a, b, c) becomes 3.
REPEAT_ON: Apply the variadic arguments to the main function macro (with a count), so that REPEAT_ON(func, a, b, c) becomes func(0, a) func(1, b) func(2, c).
PAIR: Remove the first pair of parentheses from the argument, so that PAIR((long)v1) becomes long v1.
STRIP: Remove the first part in parentheses, so that STRIP((long)v1) becomes v1.
…
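To make these facilities concrete, here is a minimal sketch of how such macros can be written. It assumes a standard-conforming preprocessor and handles only a small, fixed number of arguments. The SK_ names are hypothetical and chosen to avoid any confusion with the Mozi definitions, which are more general and also cope with old MSVC.

```cpp
#include <cassert>

// Hypothetical SK_* names for illustration; not the Mozi definitions.

// Count 1 to 8 arguments: the arguments shift a descending list so
// that the matching count lands in the N slot.
#define SK_ARG_N(a1, a2, a3, a4, a5, a6, a7, a8, N, ...) N
#define SK_GET_ARG_COUNT(...) \
  SK_ARG_N(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

// Paste two tokens after macro-expanding them.
#define SK_CAT_(a, b) a##b
#define SK_CAT(a, b) SK_CAT_(a, b)

// Apply f(index, arg) to each argument; this sketch stops at three.
#define SK_REPEAT_1(f, a) f(0, a)
#define SK_REPEAT_2(f, a, b) f(0, a) f(1, b)
#define SK_REPEAT_3(f, a, b, c) f(0, a) f(1, b) f(2, c)
#define SK_REPEAT_ON(f, ...) \
  SK_CAT(SK_REPEAT_, SK_GET_ARG_COUNT(__VA_ARGS__))(f, __VA_ARGS__)

// SK_PAIR((long)v1) -> SK_REM (long)v1 -> long v1
#define SK_REM(...) __VA_ARGS__
#define SK_PAIR(x) SK_REM x

// SK_STRIP((long)v1) -> SK_EAT (long)v1 -> v1
#define SK_EAT(...)
#define SK_STRIP(x) SK_EAT x

static_assert(SK_GET_ARG_COUNT(a, b, c) == 3, "three arguments");

// Each term contributes index * 100 + value.
#define SK_TERM(i, x) +((i) * 100 + (x))

// Expands to 0 +((0)*100+(1)) +((1)*100+(2)) +((2)*100+(3)).
constexpr int weighted_sum = 0 SK_REPEAT_ON(SK_TERM, 1, 2, 3);

inline long pair_strip_demo()
{
  SK_PAIR((long)v1) = 41;         // long v1 = 41;
  return SK_STRIP((long)v1) + 1;  // v1 + 1
}
```

The parenthesized type in (long)v1 is what makes PAIR and STRIP possible: a function-like macro name placed in front of it consumes exactly the parenthesized part.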

Some of the ideas were around at least as early as 2012 [Fultz12a], but Paul Fultz’s code was not suitable for real software projects. My current code should be considered production-ready, and a variant of it has already been used in some large applications. It has also been tested under all mainstream compilers, including the pre-standard MSVC (supporting old MSVC did take some effort). You can find my definitions in the Mozi open-source project [mozi].

Some consider macros evil, and macros should really be avoided where we can find better alternatives, but I personally find macros easier to understand and maintain than some template hacks.

A taste of enum reflection

Oftentimes we want to know how many enumerators are defined in an enumeration, what their underlying values are, and what their string forms are. The last need is especially important for debugging/logging purposes.

Existing implementations

There are existing libraries that provide such capabilities, like Magic Enum C++ [magic_enum] and Better Enums [better-enums].

Magic Enum C++ requires a recent C++17-conformant compiler, and it works with the standard form of enumeration definition. However, since it uses compile-time counting techniques to find out the values of enumerators, the range of enumerators is limited. Also, it does not cope well with enumeration values that are not declared in the enumeration definition (say, something like Color{100}) – invoking magic_enum::enum_name on such a value will get an empty string_view. This said, I recommend using it, if it satisfies your needs.

Better Enums works with basically any compiler, even old C++98 ones. However, it requires you to use a special form for enumeration definition. That alone is ugly but acceptable. What is uglier is that the result is not an enum, and it cannot get along with values not declared in the enumeration definition at all – stringifying such a value will cause a segmentation fault…

My handmade implementation

Mainly to understand the problem better, I tried enum reflection myself. Basically, I did the following things:

Make sure the result of code generation was still an enum
Provide the mapping from enumerators to their string forms via inline constexpr variables
Support necessary operations using function overloads such as to_string

An example of an enum class definition:

  DEFINE_ENUM_CLASS(Color, int,
                    red = 1, green, blue);

Then I can use it as follows:

  cout << to_string(Color::red) << '\n';
  cout << to_string(Color{9}) << '\n';

And I will get the following output:

  red
  (Color)9
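For readers who want to try this behaviour without the macro, the following is a minimal hand-written sketch. The names color_names and sketch_to_string are hypothetical, and the table is typed in by hand here rather than generated; it only illustrates the lookup and the fallback for undeclared values.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

enum class Color : int { red = 1, green, blue };

// Hand-written stand-in for the table a macro would generate.
inline constexpr std::array<std::pair<int, std::string_view>, 3>
  color_names{{{1, "red"}, {2, "green"}, {3, "blue"}}};

// Linear search; values not in the table fall back to "(Color)<number>".
inline std::string sketch_to_string(Color value)
{
  const int raw = static_cast<int>(value);
  for (const auto& [number, name] : color_names) {
    if (number == raw) {
      return std::string(name);
    }
  }
  return "(Color)" + std::to_string(raw);
}
```

Because Color stays a genuine enum class, Color{9} remains a valid value, and the fallback branch gives it a readable form instead of failing.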

Some implementation details

While you can check the implementation details in the Mozi project, I would like to give an overview of what DEFINE_ENUM_CLASS does. Its definition is in Listing 1.

#define DEFINE_ENUM_CLASS(e, u, ...)       \
  enum class e : u { __VA_ARGS__ };        \
  inline constexpr std::array<             \
    std::pair<u, std::string_view>,        \
    GET_ARG_COUNT(__VA_ARGS__)>            \
    e##_enum_map_{REPEAT_FIRST_ON(         \
      ENUM_ITEM, e, __VA_ARGS__)};         \
  ENUM_FUNCTIONS(e, u)

Listing 1

You can see clearly that it does three things:

Define a standard enum class
Define an inline constexpr array that contains pairs of underlying integer values and the string forms of enumerators, which are generated by applying the ENUM_ITEM macro on the enumerators
Declare utility functions for the new enum type

With the definition of Color above, it will expand to Listing 2 (at first level). The full expansion results in something like Listing 3.

enum class Color : int { red = 1, green, blue };
inline constexpr std::array<
  std::pair<int, std::string_view>, 3>
  Color_enum_map_{
    ENUM_ITEM(0, Color, red = 1),
    ENUM_ITEM(1, Color, green),
    ENUM_ITEM(2, Color, blue),
  };
ENUM_FUNCTIONS(Color, int)

Listing 2

enum class Color : int { red = 1, green, blue };
inline constexpr std::array<
  std::pair<int, std::string_view>, 3>
  Color_enum_map_{
    std::pair{
      to_underlying(Color(
        (eat_assign<Color>)Color::red = 1)),
      remove_equals("red = 1")},
    std::pair{
      to_underlying(
        Color((eat_assign<Color>)Color::green)),
      remove_equals("green")},
    std::pair{to_underlying(Color((
                eat_assign<Color>)Color::blue)),
              remove_equals("blue")},
  };
inline std::string to_string(Color value)
{
  return enum_to_string(to_underlying(value),
                        "Color",
                        Color_enum_map_.begin(),
                        Color_enum_map_.end());
}

Listing 3

This should be enough for you to see the basic ideas. And you can check out the implementation details in the Mozi project, if interested.
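The two helpers that look most puzzling in Listing 3 are eat_assign and remove_equals. The sketch below shows one way such helpers could work; it is an assumption for illustration, with the _sketch suffix marking names that are not the Mozi ones. The cast binds more tightly than the assignment, so the wrapper object receives the trailing "= 1" and simply ignores it.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical sketch; the real helpers live in the Mozi project.

// Wraps an enumerator and swallows a trailing "= value" assignment.
template <typename E>
struct eat_assign_sketch {
  E value;
  constexpr explicit eat_assign_sketch(E v) : value(v) {}
  template <typename Any>
  constexpr eat_assign_sketch operator=(Any) const { return *this; }
  constexpr operator E() const { return value; }
};

// Keeps the identifier part of "name = value" (or of a plain "name").
constexpr std::string_view remove_equals_sketch(std::string_view s)
{
  const auto pos = s.find('=');
  if (pos != std::string_view::npos) {
    s = s.substr(0, pos);
  }
  while (!s.empty() && s.back() == ' ') {
    s.remove_suffix(1);
  }
  return s;
}

enum class Color : int { red = 1, green, blue };

// Parsed as ((eat_assign_sketch<Color>)Color::red) = 1, so the
// assignment goes to the wrapper, not to the enumerator.
constexpr Color first_color =
  Color((eat_assign_sketch<Color>)Color::red = 1);

static_assert(first_color == Color::red, "assignment was swallowed");
static_assert(remove_equals_sketch("red = 1") == "red", "name only");
static_assert(remove_equals_sketch("green") == "green", "unchanged");
```

This is why the macro can paste each enumerator argument verbatim, with or without an initializer, and still obtain both its value and its name.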

Example of enum reflection in C++26

The code in Listing 4 should supposedly work as per P2996 [P2996r7], the current static reflection proposal for C++26.

template <typename E>
  requires std::is_enum_v<E>
std::string to_string(E value)
{
  template for (constexpr auto e :
                std::meta::enumerators_of(^E)) {
    if (value == [:e:]) {
      return std::string(
        std::meta::identifier_of(e));
    }
  }
  return std::string("(") +
         std::meta::identifier_of(^E) + ")" +
         std::to_string(to_underlying(value));
}

Listing 4

It uses the following reflection features:

^E generates the reflection information for the enum type E.
[:e:] ‘splices’ the reflection object back into a source entity, which is an enumerator here.
The template for loop (expansion statement) allows iteration over heterogeneous objects at compile time.
std::meta::enumerators_of gets all enumerators of the enumeration.
std::meta::identifier_of gets the identifier/name of a reflected object. Here we use it once for the name of the enumerator, and once for the name of the enumeration.

It does the same thing as my handmade to_string without the manual scaffolding: no macros are needed any more.

The online implementation of an early proposal, P2320 [P2320r0], available in Compiler Explorer, is convenient for demonstration purposes. The obvious differences between P2996r7 and P2320 are function names: enumerators_of was members_of, and identifier_of was name_of. There are some other reflection-supporting Godbolt compilers, which are not yet capable enough, mainly due to the lack of support for expansion statements. I have written two different versions of the enum reflection code that work under P2320:

https://cppx.godbolt.org/z/8rWTcf1KP: A simple version that does linear search as shown above
https://cppx.godbolt.org/z/P5Ycdv3xj: A more complex version that collects the string forms of enumerators and sorts them, so that we can use binary search later on (similar to what I did in Mozi)

As you can see, while it is still not trivial to implement the full logic, the major advantage is that we can use the standard enum definition form, without the current limitations of Magic Enum C++. The reflection information can be accessed at compile time, but we can save it so that we can access it later at run time.
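The idea of saving compile-time information for run-time use can be sketched in C++17 without any reflection support. In this sketch the table is written by hand where reflection would supply it; NameValue, sorted_by_name and find_value are hypothetical names. The sort happens at compile time, and the binary search happens at run time.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical record: an enumerator name and its underlying value.
struct NameValue {
  std::string_view name;
  int value;
};

// Insertion sort that is usable in constant expressions in C++17.
template <std::size_t N>
constexpr std::array<NameValue, N>
sorted_by_name(std::array<NameValue, N> a)
{
  for (std::size_t i = 1; i < N; ++i) {
    NameValue key = a[i];
    std::size_t j = i;
    while (j > 0 && key.name < a[j - 1].name) {
      a[j] = a[j - 1];
      --j;
    }
    a[j] = key;
  }
  return a;
}

// Sorted once by the compiler; reflection would provide the input.
inline constexpr auto color_by_name = sorted_by_name(
  std::array<NameValue, 3>{{{"red", 1}, {"green", 2}, {"blue", 3}}});

static_assert(color_by_name[0].name == "blue", "sorted at compile time");

// Run-time binary search over the compile-time table.
inline std::optional<int> find_value(std::string_view name)
{
  auto it = std::lower_bound(
    color_by_name.begin(), color_by_name.end(), name,
    [](const NameValue& nv, std::string_view n) { return nv.name < n; });
  if (it != color_by_name.end() && it->name == name) {
    return it->value;
  }
  return std::nullopt;
}
```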

Reflection on structs

The need for reflection on structs is even stronger than that for enums. Reflection is very helpful in debugging/logging, and serialization and deserialization become easy when reflection is available.

Existing implementations

I know of two existing implementations for reflection purposes.

Boost.PFR [pfr] is:

…a C++14 library for very basic reflection that gives you access to structure elements by index and provides other std::tuple like methods for user defined types without any macro or boilerplate code.

It is easy to use. It supports common operations like iteration, comparison, and output. However, due to the lack of static reflection, it has no way to access the names of fields.

Struct_pack [struct_pack] is a “very easy to use, high performance serialization library”. It requires C++17 and focuses on serialization/deserialization. It is not designed for generic reflection purposes, and you cannot really use it for your own serialization scenarios (without some serious hacking).

While not a real implementation, the earliest code I am aware of about struct reflection is from Paul Fultz [Fultz12b]. Modern compile-time techniques were not ready in 2012, so while the basic ideas were similar, Netcan and I did not borrow much code from him.

My handmade implementation

I have my own struct reflection method, which does not have the limitations of Boost.PFR but under the hood requires macro use. However, once static reflection is standardized, much of the code and techniques can be adapted to standard C++.

The basic approach is:

Use macros to generate code so that the resulting type is really a struct of the supposed size (no fatter!)
Generate nested types and static constexpr data members which provide the needed information
Provide stand-alone function templates for the common operations

Here is an example. Suppose we have the following definitions:

  DEFINE_STRUCT(
    Point,
    (double)x,
    (double)y
  );
  DEFINE_STRUCT(
    Rect,
    (Point)p1,
    (Point)p2,
    (uint32_t)color
  );

Then we can initialize such structs as usual:

  Rect rect{
    {1.2, 3.4},
    {5.6, 7.8},
    12345678
  };

We can print it easily:

  print(rect);

And we will get:

  {
      p1: {
          x: 1.2,
          y: 3.4
      },
      p2: {
          x: 5.6,
          y: 7.8
      },
      color: 12345678
  }

Usage scenario: copy same-name fields

The implementation details may not be very interesting, but we do have more interesting usage scenarios. One thing I implemented was copying fields of interest.

Suppose the following definitions (please notice that v2 and v4 have different types in S1 and S2):

  DEFINE_STRUCT(S1,
    (uint16_t)v1,
    (uint16_t)v2,
    (uint32_t)v3,
    (uint32_t)v4,
    (string)msg
  );
  
  DEFINE_STRUCT(S2,
    (int)v2,
    (long)v4
  );
  
  S1 s1{…};
  …
  S2 s2;

Then the following statement will do the right thing:

  copy_same_name_fields(s1, s2);

And it is done with the highest possible efficiency, equivalent to s2.v2 = s1.v2; s2.v4 = s1.v4;. I have checked its compiler-generated x86-64 assembly code, which is:

  movzx   eax, WORD PTR s1[rip+2]
  mov     DWORD PTR s2[rip], eax
  mov     eax, DWORD PTR s1[rip+8]
  mov     QWORD PTR s2[rip+8], rax

I do not think Java or Python can ever do anything similar!
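The matching of field names can happen entirely at compile time, which is why nothing but the assignments remains in the generated code. The following is a simplified sketch of that idea, not the Mozi implementation: it assumes each struct exposes a hand-written fields() function instead of using DEFINE_STRUCT, it leaves out the msg member, and Field, field, field_count and copy_same_name_fields_sketch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <tuple>
#include <utility>

// Hypothetical descriptor: a field name plus a pointer to member.
template <typename M>
struct Field {
  std::string_view name;
  M ptr;
};
template <typename M>
constexpr Field<M> field(std::string_view name, M ptr)
{ return {name, ptr}; }

struct S1 {
  std::uint16_t v1, v2;
  std::uint32_t v3, v4;
  static constexpr auto fields()
  {
    return std::make_tuple(field("v1", &S1::v1), field("v2", &S1::v2),
                           field("v3", &S1::v3), field("v4", &S1::v4));
  }
};
struct S2 {
  int v2;
  long v4;
  static constexpr auto fields()
  { return std::make_tuple(field("v2", &S2::v2), field("v4", &S2::v4)); }
};

template <typename T>
inline constexpr std::size_t field_count =
  std::tuple_size_v<decltype(T::fields())>;

// Emits an assignment only when the two names compare equal; the
// comparison is resolved by the compiler, not at run time.
template <std::size_t I, std::size_t J, typename Src, typename Dst>
constexpr void copy_if_same_name(const Src& src, Dst& dst)
{
  constexpr auto from = std::get<I>(Src::fields());
  constexpr auto to = std::get<J>(Dst::fields());
  if constexpr (from.name == to.name) {
    dst.*(to.ptr) = src.*(from.ptr);
  }
}

// Try source field I against every destination field.
template <std::size_t I, typename Src, typename Dst, std::size_t... Js>
constexpr void copy_field(const Src& src, Dst& dst,
                          std::index_sequence<Js...>)
{ (copy_if_same_name<I, Js>(src, dst), ...); }

template <typename Src, typename Dst, std::size_t... Is>
constexpr void copy_impl(const Src& src, Dst& dst,
                         std::index_sequence<Is...>)
{
  (copy_field<Is>(src, dst,
                  std::make_index_sequence<field_count<Dst>>{}),
   ...);
}

template <typename Src, typename Dst>
constexpr void copy_same_name_fields_sketch(const Src& src, Dst& dst)
{ copy_impl(src, dst, std::make_index_sequence<field_count<Src>>{}); }
```

Every pair of fields whose names differ compiles to nothing, so an optimizing compiler is left with just the two assignments.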

If this does not look useful, just think about big database records. Imagine we have a container of big BookInfo objects, and we want to do something like the SQL SELECT name, publish_year WHERE author_id = …. The code would look like Listing 5.

DEFINE_STRUCT(
  BookInfoNameYear,
  (string)name,
  (int)publish_year
);
BookInfoNameYear record{};
vector<BookInfoNameYear> result;
Container<BookInfo> container;
while (…) {
  auto it = container.find(…);
  …
  copy_same_name_fields(*it, record);
  result.push_back(record);
}

Listing 5

Isn’t this code much simpler than manually copying the needed fields, while being just as efficient? The advantage is especially obvious when there are many such fields.

I have seen copying tens of fields in real code, often followed by serialization (to send the information over the network), which is a topic I will discuss separately.

Under the hood

DEFINE_STRUCT is defined as follows:

  #define DEFINE_STRUCT(st, ...)                 \
    struct st {                                  \
      using is_reflected = void;                 \
      template <typename, size_t>                \
      struct _field;                             \
      static constexpr size_t _size =            \
        GET_ARG_COUNT(__VA_ARGS__);              \
      REPEAT_ON(FIELD, __VA_ARGS__)              \
    }

The S2 above will first expand to something like:

  struct S2 {
    using is_reflected = void;
    template <typename, size_t>
    struct _field;
    static constexpr size_t _size = 2;
    FIELD(0, (int)v2)
    FIELD(1, (long)v4)
  };

And FIELD(0, (int)v2) will expand to:

  int v2;
  template <typename T>
  struct _field<T, 0> {
    using type = decltype(decay_t<T>::v2);
    static constexpr auto name = CTS_STRING(v2);
    constexpr explicit _field(T&& obj)
      : obj_(std::forward<T>(obj)) {}
    constexpr decltype(auto) value()
    { return (std::forward<T>(obj_).v2); }
    T&& obj_;
  };

I leave CTS_STRING(v2) unexpanded, as it has two possible definitions, depending on the environment [Wu22]. For now, you can think of it as just "v2", with some additional magic (which copy_same_name_fields requires).

When you have an obj of type S2, you can access its members using their field numbers: S2::_field<S2&, 0>(obj).value() is exactly obj.v2 (with the correct value category), and S2::_field<S2&, 0>::type is the type of obj.v2 (which is int). With the help of fold expressions, more complex things like compile-time field iteration are now possible, as shown in Listing 6.

template <size_t I, typename T>
constexpr decltype(auto) get(T&& obj)
{
  using DT = decay_t<T>;
  static_assert(I < DT::_size,
                "Index to get is out of range");
  return typename DT::template _field<T, I>(
           std::forward<T>(obj))
    .value();
}
template <typename T, typename F, size_t... Is>
constexpr void
for_each_impl(T&& obj, F&& f,
              std::index_sequence<Is...>)
{
  using DT = decay_t<T>;
  (void(std::forward<F>(f)(
     index_t<Is>{},
     DT::template _field<T, Is>::name,
     get<Is>(std::forward<T>(obj)))),
   ...);
}
template <typename T, typename F>
constexpr void for_each(T&& obj, F&& f)
{
  using DT = decay_t<T>;
  for_each_impl(
    std::forward<T>(obj), std::forward<F>(f),
    std::make_index_sequence<DT::_size>{});
}

Listing 6

Now, a function call like for_each(obj, f) will be equivalent to:

  f(0, S2::_field<S2&, 0>::name, get<0>(obj));
  f(1, S2::_field<S2&, 1>::name, get<1>(obj));

Facilities like for_each are essential in implementing user-visible tools like print and serialization.
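As a rough illustration of how a tool can be built on field iteration, here is a self-contained sketch. It assumes a hand-expanded and simplified stand-in for what DEFINE_STRUCT generates: the name is a plain string_view, value() is a static function, and the index argument is omitted. The names for_each_field and describe are hypothetical, and the printer is flat rather than recursive.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hand-expanded, simplified stand-in for
// DEFINE_STRUCT(Point, (double)x, (double)y).
struct Point {
  template <typename, std::size_t>
  struct _field;
  static constexpr std::size_t _size = 2;

  double x;
  template <typename T>
  struct _field<T, 0> {
    static constexpr std::string_view name = "x";
    static constexpr decltype(auto) value(T&& obj)
    { return (std::forward<T>(obj).x); }
  };

  double y;
  template <typename T>
  struct _field<T, 1> {
    static constexpr std::string_view name = "y";
    static constexpr decltype(auto) value(T&& obj)
    { return (std::forward<T>(obj).y); }
  };
};

// Call f(name, value) once per field, using a fold expression.
template <typename T, typename F, std::size_t... Is>
void for_each_field_impl(T&& obj, F&& f, std::index_sequence<Is...>)
{
  using DT = std::decay_t<T>;
  (static_cast<void>(
     f(DT::template _field<T, Is>::name,
       DT::template _field<T, Is>::value(std::forward<T>(obj)))),
   ...);
}

template <typename T, typename F>
void for_each_field(T&& obj, F&& f)
{
  for_each_field_impl(
    std::forward<T>(obj), std::forward<F>(f),
    std::make_index_sequence<std::decay_t<T>::_size>{});
}

// Flat, single-line description built on the iteration helper.
template <typename T>
std::string describe(const T& obj)
{
  std::ostringstream os;
  os << "{ ";
  bool first = true;
  for_each_field(obj, [&](std::string_view name, const auto& value) {
    os << (first ? "" : ", ") << name << ": " << value;
    first = false;
  });
  os << " }";
  return os.str();
}
```

The nested _field templates and static members add no non-static data, so Point remains an aggregate of two doubles and can still be initialized as Point{1.2, 3.4}.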

Example of struct reflection in C++26

As in the case of enum reflection, we will be able to dispense with the macro use when C++26 static reflection arrives. Listing 7 is a demo implementation of print (slightly changed from [Wu24] in order to conform to the updated version of P2996).

template <typename T>
void print(const T& obj, ostream& os = cout,
           std::string_view name = "",
           int depth = 0)
{
  if constexpr (is_class_v<T>) {
    os << indent(depth) << name
       << (name != "" ? ": {\n" : "{\n");
    template for (constexpr meta::info member :
        meta::nonstatic_data_members_of(^T)) {
      print(obj.[:member:], os,
            meta::identifier_of(member),
            depth + 1);
    }
    os << indent(depth) << "}"
       << (depth == 0 ? "\n" : ",\n");
  } else {
    os << indent(depth) << name << ": " << obj
       << ",\n";
  }
}

Listing 7

Given what we know about ^ and [:…:], the code is pretty straightforward.

We can verify it actually works under P2320 (https://cppx.godbolt.org/z/G3EcvhKxK) and P2996, with an expansion statement workaround (https://godbolt.org/z/77PYjzcW8).

A few more words on Mozi

Mozi is an open-source project I started in late 2023, mostly for the purpose of experimenting with macro-based static reflection. I have implemented generic comparison, copying, printing, and serialization/deserialization. A serialization scenario called net_pack is implemented, which includes fully automatic byte-order swap and is suitable for coping with network datagrams. A special bit_field type adds bit-field support over the network.

I regard it as a demonstration of some interesting things that are possible with static reflection. What is currently possible with macro techniques will be possible with the C++26 static reflection, only it will be simpler, for both the implementer and the user.

References

[better-enums] https://github.com/aantron/better-enums

[Fultz12a] Paul Fultz II, ‘Is the C preprocessor Turing complete?’, May 2012, https://pfultz2.com/blog/2012/05/10/turing

[Fultz12b] Paul Fultz II, ‘C++ Reflection in under 100 lines of code’, July 2012, https://pfultz2.com/blog/2012/07/31/reflection-in-under-100-lines

[magic_enum] https://github.com/Neargye/magic_enum

[mozi] https://github.com/adah1972/mozi

[Netcan] https://github.com/netcan/recipes/tree/master/cpp/metaproggramming

[P2320r0] Andrew Sutton et al., ‘The Syntax of Static Reflection’, 2021, http://wg21.link/p2320r0

[P2996r7] Wyatt Childers et al., ‘Reflection for C++26’ (revision 7), October 2024, http://wg21.link/p2996r7

[pfr] https://github.com/boostorg/pfr

[struct_pack] https://github.com/alibaba/yalantinglibs

[Sutton21] Andrew Sutton, ‘Reflection: Compile-Time Introspection of C++’, ACCU 2021, https://www.youtube.com/watch?v=60ECEc-URP8

[Wu22] Yongwei Wu, ‘Compile-Time Strings’, Overload, 30(172):4-7, December 2022, https://accu.org/journals/overload/30/172/wu/

[Wu24] Yongwei Wu, ‘C++ Compile-Time Programming’, Overload, 32(183):7-13, October 2024, https://accu.org/journals/overload/32/183/wu/

Wu Yongwei Having been a programmer and software architect, Yongwei is currently a consultant and trainer on modern C++. He has nearly 30 years’ experience in systems programming and architecture in C and C++. His focus is on the C++ language, software architecture, performance tuning, design patterns, and code reuse. He has a programming page at http://wyw.dcweb.cn/.
