
Two Daemons

Overload Journal #128 - August 2015 + Programming Topics   Author: Dietmar Kühl
Most people have come across C++11’s forwarding references. Dietmar Kühl explains what && really means.

When playing Nethack [Nethack] using the traditional terminal-based interface you may be confronted by a monster taking the form of a & character: a daemon of some sort. Depending on your level of experience, facing one daemon is often OK but facing two tends to get you into way more trouble. Since the release of C++11 we are frequently faced with the two daemons &&!

Now, if you think that the two characters next to each other represent the two daemons, you’d be right in the Nethack sense. In C++ any instance of these two characters applied to a type actually represents one of two daemons. When you have got a T&& it is either a reference to an object which can be moved from or it is an entity whose type is deduced to match what the entity is initialized with. It uses exactly the same notation for two entirely different things! To relieve you from being haunted by these daemons and rather be prepared to fight them, I’ll describe below how these entities differ. In addition, I’ll show how you may use these insights to support movable entities with pre-C++11 compilers to some extent.

How to know when to move

First let’s make a quick detour. In C++ a lot of temporary objects are floating around. When functions or expressions return their results as values you get a ‘temporary’. If the temporary isn’t of a simple data type, it may be quite involved and expensive to copy. On the other hand, since they are temporaries, there isn’t much use in them as they are about to be destroyed! Instead of copying a temporary to get the value elsewhere and destroying the temporary, it makes much more sense to move the temporary’s content if it is expensive.

It would be great if we could indicate whether an object shall be moved or copied! The C++ approach to indicate different processing is to use different types. Since lvalue references are already used to indicate copying and we don’t want to move lvalues by accident, we could reasonably use a different type, let’s call it movable_ref<X>, to indicate that an object can be moved (bear with me – I know about X&& and I’ll get to this).

If a type is used to indicate that an object can be moved, the compiler can choose an appropriate overload of a function which may move from the object. For example, a class supporting a move construction could be declared like Listing 1.

class foo {
  // ...
public:
  foo(foo const& other);
      // well-known copy
  foo(movable_ref<foo> other);
      // move the referenced object
  // ...
};
			
Listing 1

It is even possible to implement movable_ref<X> using a pre-C++11 compiler! All it takes is a class which internally holds a reference to an object indicating that the object won’t be used again and to provide suitable access to that reference. For example, a corresponding class template and a suitable factory function could look something like Listing 2 (see the ‘Further reading’ for a pointer to a complete implementation).

template <typename T>
class movable_ref {
  // suitable friend declaration goes here
  T* pointer;
  explicit movable_ref(T& object):
    pointer(&object) {}
public:
  operator T&() const { return *this->pointer; }
};
template <typename T>
movable_ref<T> move(T& object) {
  return movable_ref<T>(object);
}
			
Listing 2

With roughly the declaration in Listing 2, it is possible to use a simple expression like move(x) to create an object which indicates that the content of x can be transferred! This works like a charm for lvalues. For temporaries things become a little bit harder: a temporary cannot be bound to a non-const reference but it would still be desirable to move them. Carrying on just a little bit more with the class template above, it would be possible to create a move() member function for types which need to be moved. For example, the move() operation of a std::string could look like this:

  movable_ref<std::string> std::string::move() {
    return ::move(*this); }

Note that temporaries are non-const. They just can’t be bound to a non-const lvalue reference. It is entirely possible to call a non-const member function like std::string::move() on them (assuming such a function exists)! So creating the above function would just work and allow temporaries to be moved using a notation like this:

  std::string("temporary on the").move()

Note that so far there is no use of C++11 at all! The class template movable_ref<X> can be used with pre-C++11 compilers to move objects. There is the small caveat that it doesn’t allow moving objects in return statements but with reasonable compilers there isn’t much need for that as they’ll use copy-elision with carefully written functions anyway.
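To make the pre-C++11 idea concrete, here is a minimal sketch which fills in the friend declaration left open in Listing 2 and uses the result with a small class. The namespace sketch, the class buffer and its members are hypothetical names chosen for illustration; they are not taken from the complete implementation mentioned under ‘Further reading’. The code deliberately avoids C++11 features.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {

template <typename T> class movable_ref;
template <typename T> movable_ref<T> move(T& object);

template <typename T>
class movable_ref {
    // Only the factory function move() may create a movable_ref.
    friend movable_ref<T> move<T>(T& object);
    T* pointer;
    explicit movable_ref(T& object) : pointer(&object) {}
public:
    operator T&() const { return *this->pointer; }
};

template <typename T>
movable_ref<T> move(T& object) {
    return movable_ref<T>(object);
}

// Hypothetical resource-owning class, used only for illustration.
class buffer {
    char*       data_;
    std::size_t size_;
    buffer& operator=(buffer const&);  // assignment is omitted in this sketch
public:
    explicit buffer(char const* text)
        : data_(new char[std::strlen(text) + 1]), size_(std::strlen(text)) {
        std::memcpy(data_, text, size_ + 1);
    }
    buffer(buffer const& other)        // copy: duplicate the storage
        : data_(new char[other.size_ + 1]), size_(other.size_) {
        std::memcpy(data_, other.data_, size_ + 1);
    }
    buffer(movable_ref<buffer> other)  // move: take over the storage
        : data_(0), size_(0) {
        buffer& source = other;        // uses the conversion operator
        data_ = source.data_;
        size_ = source.size_;
        source.data_ = 0;
        source.size_ = 0;
    }
    ~buffer() { delete[] data_; }
    // Member move() so that temporaries can be moved, too.
    movable_ref<buffer> move() { return sketch::move(*this); }
    std::size_t size() const { return size_; }
    bool owns_storage() const { return data_ != 0; }
};

} // namespace sketch
```

With these declarations, sketch::buffer c(sketch::move(a)) selects the moving constructor and leaves a without storage, while sketch::buffer b(a) still copies. A temporary can be moved using sketch::buffer("tmp").move().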

Of course, we wouldn’t like this approach as it is a bit clumsy. Especially for temporaries it requires too much work! Also, the compiler can actually automatically tell whether an object can be moved or needs to be copied in many cases: if the object wasn’t given a name, it can be moved as there is no way for the object to be accessed otherwise (well, the object could register itself somewhere but a class supporting moving is well-advised not to do so). Apart from unnamed objects there are also a few other cases when the compiler knows that the object can be safely moved. For example, when a named variable is returned from a function it can be moved:

  T f() {
    T value(...);
    // ...
    return value;
  }

Note that the C++ standard considers moving value for the above function but will use copying if the return statement is written like this:

  return (value);
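The behaviour for a plainly returned named variable can be observed with a small counting type. This is only a sketch: the names counted and make() are hypothetical, and whether the move is elided entirely depends on the compiler. What can be relied upon is that no copy is made.

```cpp
#include <cassert>

// Hypothetical type which counts how often it is copied or moved.
struct counted {
    static int copies;
    static int moves;
    counted() {}
    counted(counted const&) { ++copies; }
    counted(counted&&) noexcept { ++moves; }
};
int counted::copies = 0;
int counted::moves = 0;

counted make() {
    counted value;
    // ...
    return value;  // elided or moved, but not copied
}
```

After counted result = make(); the copy counter is still 0 and the move counter is 0 or 1, depending on whether the compiler elided the move.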

In cases where it is known that an object won’t be used the compiler could generate the call to move() implicitly and yield a movable_ref<X>. It turns out this is nearly what happens! Instead of using the type movable_ref<X> the compiler uses a type which matches the declaration X&& where X is not a deduced type (I’m using X instead of T to indicate that X is not a deduced type; T tends to be used for template parameters and is often deduced), i.e., it uses rvalue references. Before diving into more details of rvalue references note that the notation movable_ref<X> can be used interchangeably with X&& assuming the following using alias is visible:

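  template <typename T>
  using movable_ref =
    typename std::add_rvalue_reference<T>::type;
    // from <type_traits>; with C++14 this can be
    // spelled std::add_rvalue_reference_t<T>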

The interesting aspect of this using alias is that it allows code written in terms of movable_ref<X> to be used both with a C++03 and a C++11 (or later) compiler. With a few support functions the identical code can be used to move objects! This yields a neat migration path for libraries which need to compile with both new and old compilers. The main flaw of this idea is that it is presented in 2015 rather than at least five years earlier. However, better late than never: it may still help some C++ users who haven’t migrated off C++03 compilers.

Defining movable_ref in terms of std::add_rvalue_reference instead of X&& has the advantage of preventing deduction of X. This nearly makes the using alias useful in general in C++11 rather than just for libraries which need to compile with both C++03 and later versions. The main danger is that using movable_ref<X>& x is legal and looks as if x would be a reference to an object which can be moved from but it isn’t. Instead the references are collapsed and the notation simply yields an X&.
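Both properties of the alias can be checked at compile time. The following sketch spells the alias in a way which also works with a strict C++11 compiler; the function template take() is a hypothetical example which shows that the template argument has to be given explicitly because it is not deduced.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

template <typename T>
using movable_ref = typename std::add_rvalue_reference<T>::type;

static_assert(std::is_same<movable_ref<std::string>, std::string&&>::value,
              "movable_ref<X> is X&& for a non-reference type X");
static_assert(std::is_same<movable_ref<std::string>&, std::string&>::value,
              "movable_ref<X>& collapses to X&");

// T appears in a non-deduced context: callers have to write take<X>(...).
template <typename T>
T take(movable_ref<T> source) {
    return T(std::move(source));
}
```

A call like take<std::string>(std::move(s)) compiles, while take(std::move(s)) does not because T cannot be deduced.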

rvalue references

Now let’s have a look at rvalue references. When you have got an X&& for some non-reference type X, e.g. std::string&&, you have got a reference which can only be bound to something which can be moved from. Either there are rules allowing the compiler to implicitly consider the object movable or the user has used a cast to have an lvalue look like an rvalue reference, e.g.:

  std::string     s("lvalue");
  std::string&& r = static_cast<std::string&&>(s);

Using std::move(s) is just another way to write the cast above: std::move() is specified to deduce the argument type and do the cast.
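A function doing this deduction and cast could look roughly like the following sketch. The name my_move is a hypothetical stand-in; standard library implementations differ in detail but follow the same idea.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::move(): deduce T, strip any reference
// from it, and cast the argument to an rvalue reference.
template <typename T>
constexpr typename std::remove_reference<T>::type&& my_move(T&& object) noexcept {
    return static_cast<typename std::remove_reference<T>::type&&>(object);
}

static_assert(std::is_same<decltype(my_move(std::declval<std::string&>())),
                           std::string&&>::value,
              "an lvalue argument is cast to an rvalue reference");
```

Note that the function itself doesn’t move anything: std::string&& r = my_move(s); merely binds r to s. The content is only transferred once a move constructor or move assignment is given the result.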

Even a declaration using an rvalue reference, e.g., r in the above code, introduces an entity with a name. Since it has a name, using r yields an lvalue! That is, the compiler won’t implicitly move from the referenced object: moving from r requires another cast, e.g., std::move(r). The name rvalue reference derives from the kind of entities which can be bound to this sort of reference: rvalue references only allow binding objects which are about to go away or which are made to look as if that is the case with a cast. The compiler won’t allow binding an lvalue to an rvalue reference without a cast.

The net effect of these rules is that using X&& x as part of a function signature indicates that the function was called with an object which can be moved from and it is OK to change the state of x. The typical change is to move the content, i.e., to transfer resources from x to another object.
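That a named rvalue reference is itself an lvalue can be seen from overload resolution. In this sketch all function names are hypothetical; the two overloads of binds_as_rvalue() merely report which one was chosen.

```cpp
#include <cassert>
#include <string>
#include <utility>

bool binds_as_rvalue(std::string const&) { return false; }
bool binds_as_rvalue(std::string&&)      { return true; }

// x can only be bound to something movable, but x has a name,
// so the expression x is an lvalue.
bool use_by_name(std::string&& x) { return binds_as_rvalue(x); }

// std::move() is needed to pass x on as something movable.
bool use_with_move(std::string&& x) { return binds_as_rvalue(std::move(x)); }
```

Calling use_by_name(std::string("a")) yields false and use_with_move(std::string("a")) yields true.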

According to the core language this transfer can leave the object in some arbitrary state as long as it is safe to call the destructor on the object. However, the standard library mandates that the respective class invariants are retained for the moved-from object when using move construction or move assignment.
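For standard library types this means a moved-from object can still be used as long as no assumption is made about its value. A small sketch (the function name is hypothetical):

```cpp
#include <cassert>
#include <string>
#include <utility>

std::string reuse_after_move() {
    std::string source("payload");
    std::string target(std::move(source));
    // source is valid but its value is unspecified:
    // give it a new value before reading it.
    source = "fresh";
    return source + "/" + target;
}
```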

Forwarding references

Sadly, when you see a name declared as T&& t the object referenced by t cannot necessarily be moved! More specifically, if T is used to deduce the type the meaning of T&& is entirely different! For example, assume you have the class template in Listing 3.

template <typename X>
struct example {
  template <typename T>
  static void f(X&& x, T&& t) {
    // ...
  }
};
			
Listing 3

The type for x is not deduced. That is, x is a reference to a movable object of type X, the template argument specified to instantiate the class template example. On the other hand, the type T for t can be deduced based on the arguments given to f(). Listing 4 contains a few examples of how f() could be called and what the type T becomes.

std::string       s("mutable");
std::string const c("immutable");
example<int>::f(int(1), s);
  // T == std::string&
example<int>::f(int(2), c);
  // T == std::string const&
example<int>::f(int(2), std::string("tmp"));
  // T == std::string
			
Listing 4
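The deduced types can be verified with a small helper. The enumeration deduced and the function template classify() in this sketch are hypothetical names; they just report what T became for a given argument.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

enum class deduced { lvalue_reference, const_lvalue_reference, non_reference };

// Reports what T was deduced to for the argument passed.
template <typename T>
deduced classify(T&&) {
    typedef typename std::remove_reference<T>::type referred;
    if (!std::is_lvalue_reference<T>::value) {
        return deduced::non_reference;
    }
    return std::is_const<referred>::value
        ? deduced::const_lvalue_reference
        : deduced::lvalue_reference;
}
```

Using the objects s and c from Listing 4, classify(s) reports an lvalue reference, classify(c) a const lvalue reference, and classify(std::string("tmp")) a non-reference type.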

When replacing T in the function template f() with std::string&, i.e., when instantiating f() for an lvalue argument of type std::string, the type of the second parameter becomes std::string& &&. This odd-looking type doesn’t exist as the references are collapsed: if there are multiple reference qualifiers on a type, they get collapsed so there is only one reference qualifier:

  • & & becomes &
  • & && becomes &
  • && & becomes &
  • && && becomes &&

That is, depending on how f() is called, t may be an lvalue reference or it may be an rvalue reference. Just because the argument is spelled as T&& and looks like an rvalue reference, it isn’t necessarily one. Unconditionally trying to move from t would, thus, likely yield incorrect behavior.
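Reference collapsing can’t be written directly as std::string& && but it can be observed through type aliases. The alias names and the function template in this sketch are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

using lref = std::string&;
using rref = std::string&&;

static_assert(std::is_same<lref&,  std::string&>::value,  "& & becomes &");
static_assert(std::is_same<lref&&, std::string&>::value,  "& && becomes &");
static_assert(std::is_same<rref&,  std::string&>::value,  "&& & becomes &");
static_assert(std::is_same<rref&&, std::string&&>::value, "&& && becomes &&");

// Reports whether the parameter type T&& ended up as an lvalue reference.
template <typename T>
bool parameter_is_lvalue_reference(T&&) {
    return std::is_lvalue_reference<T&&>::value;
}
```

For an lvalue argument the function returns true, for a temporary it returns false, although the parameter is spelled T&& in both cases.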

Independent of how f() is called, t will be considered to be an lvalue. It always has a name and the compiler will not move from a named object implicitly unless the name is about to go away. If you want to potentially move an argument whose type was deduced, you’d use std::forward<T>(t): depending on the explicit template argument type T the function argument t will be cast to a suitable reference type: if T is not an lvalue reference type, the function yields a T&&. Otherwise the function yields the type T.
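The effect of std::forward<T>(t) can be made visible with a type which records whether it was moved from. In this sketch probe and relay() are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type which remembers whether it has been moved from.
struct probe {
    bool moved_from;
    probe() : moved_from(false) {}
    probe(probe const&) : moved_from(false) {}
    probe(probe&& other) noexcept : moved_from(false) {
        other.moved_from = true;
    }
};

// t is always an lvalue inside relay(); std::forward<T>(t) turns it
// back into an rvalue only if the argument was one.
template <typename T>
probe relay(T&& t) {
    return probe(std::forward<T>(t));
}
```

Passing an lvalue to relay() copies it and leaves it untouched. Passing std::move(a) or a temporary moves from the argument.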

Since it is a bit awkward to talk about a T&& where the T is deduced, there needs to be a name. Scott Meyers refers to these references as universal references. The C++ standard defines the term forwarding reference for the same entities. I think this term describes what these entities are doing better.

auto and &&

The rules for the type deduced when using auto are essentially identical to the rules used with deduced arguments for function templates (the exception being braced initializer lists). Thus, you’ll get the following:

  • auto v = expr; declares v to have the value type of the expression expr and the result of expr will be copied into v (of course, the copy can possibly be elided).
  • auto& r = expr; declares r as an lvalue reference to the result of expr which assumes that expr indeed yields an lvalue.
  • auto const& c = expr; declares c to be a const lvalue reference to the result of expr. If expr doesn’t yield an lvalue but a temporary, the lifetime of the temporary gets extended until c goes out of scope.
  • auto&& s = expr; declares s to be some reference to the result of expr. If expr yields a value, s will be an rvalue reference to the temporary whose lifetime gets extended until s goes out of scope. Otherwise s will be a reference of the same kind as the reference yielded by expr.
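The four forms can be checked with decltype. The function names in this sketch are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

std::string make_value() { return "value"; }

bool check_auto_forms() {
    std::string object("object");

    auto        v  = object;        // copy of object
    auto&       r  = object;        // lvalue reference to object
    auto const& c  = make_value();  // lifetime of the temporary is extended
    auto&&      s1 = object;        // lvalue reference to object
    auto&&      s2 = make_value();  // rvalue reference, lifetime extended

    static_assert(std::is_same<decltype(v), std::string>::value, "value");
    static_assert(std::is_same<decltype(r), std::string&>::value, "lvalue ref");
    static_assert(std::is_same<decltype(c), std::string const&>::value,
                  "const lvalue ref");
    static_assert(std::is_same<decltype(s1), std::string&>::value,
                  "auto&& bound to an lvalue");
    static_assert(std::is_same<decltype(s2), std::string&&>::value,
                  "auto&& bound to a temporary");

    r += "!";  // changes object (and thus what s1 refers to) but not v
    return v == "object" && object == "object!" && s1 == "object!"
        && c == "value" && s2 == "value";
}
```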

Similar to the use in function templates, whether s can be moved from depends on the type deduced for s, despite the use of &&. That is, with names declared using auto&& you wouldn’t use std::move(). Instead you would use std::forward() with a somewhat ugly-looking type:

  std::forward<decltype(s)>(s)

Of course, spelling out the actual type may be a lot more ugly than decltype(s).
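A sketch of forwarding names declared with auto&& follows; the function names are hypothetical. Forwarding the reference bound to an lvalue copies, forwarding the one bound to a temporary moves.

```cpp
#include <cassert>
#include <string>
#include <utility>

std::string make_name() { return "temporary"; }

bool forward_auto_references() {
    std::string named("named");
    auto&& from_lvalue = named;        // std::string&
    auto&& from_value  = make_name();  // std::string&&

    // decltype(from_lvalue) is an lvalue reference: this copies.
    std::string copy(std::forward<decltype(from_lvalue)>(from_lvalue));
    // decltype(from_value) is an rvalue reference: this moves.
    std::string moved(std::forward<decltype(from_value)>(from_value));

    return named == "named" && copy == "named" && moved == "temporary";
}
```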

Since we are on the topic of auto declarations: do not use range-based for loops using a variable declared as a plain auto unless you know that the value type of the iterator is a type which can be efficiently copied and you don’t need to mutate the elements. Otherwise you are much better off using auto with a reference qualifier. As a default it is reasonable to use auto&&:

  for (auto&& v: container) { ... }

When the elements are only read you may want to explicitly indicate that the elements are not mutated in which case auto const& is the way to go. Similarly, if you know you are going to mutate the elements you may want to explicitly indicate that using auto&.
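The three loop forms are sketched below with hypothetical function names. The first function also shows a case where auto&& is needed: iterating over a std::vector<bool> yields proxy objects by value, so a loop variable declared as auto& wouldn’t compile there.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// auto&& binds to whatever the iterator yields, including the proxy
// objects of std::vector<bool>.
int flip_all(std::vector<bool>& flags) {
    int set = 0;
    for (auto&& flag : flags) {
        flag = !flag;
        if (flag) { ++set; }
    }
    return set;
}

// auto const& documents that the elements are only read.
std::size_t total_length(std::vector<std::string> const& words) {
    std::size_t total = 0;
    for (auto const& word : words) { total += word.size(); }
    return total;
}

// auto& documents that the elements are mutated.
void shout(std::vector<std::string>& words) {
    for (auto& word : words) { word += "!"; }
}
```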

Summary

C++ uses the notation T&& for two entirely unrelated things:

  • If T is deduced, T&& just means that an object is referenced and the type T indicates whether the referenced object can be moved from.
  • If T is not deduced T&& indicates that the referenced objects can be moved from unconditionally.

To support move semantics using pre-C++11 compilers a class template can be used to indicate that an entity is movable. With a corresponding using alias and a few access functions the same notation can be used for rvalue references in C++11, providing a migration path.

Reference and further reading

[Nethack] http://www.nethack.org/

For further reading on this topic see one of these:

Thomas Becker’s articles: http://thbecker.net/articles/rvalue_references/section_01.html

Effective Modern C++, Scott Meyers, Item 24

Github: https://github.com/bloomberg/bde/blob/master/groups/bsl/bslmf/bslmf_movableref.h

Scott Meyers. ‘Universal references in C++11’ Overload, 20(111):8-12, October 2012.
