Should you declare (almost) everything auto? Roger Orr considers when auto is appropriate.
To have a right to do a thing
is not at all the same as
to be right in doing it.
~ G.K.Chesterton
In the first article we covered the rules governing the
auto
keyword that was added to the language in C++11 (or added back, if your memory of C++ goes back far enough!)
It is important with a feature like
auto
not only to know the rules about what is permitted by the language – and the meaning of the consequent code – but also to be able to decide
when
the use of the feature is appropriate and what design forces need to be considered when taking such decisions.
In this article we look in more detail at some uses of
auto
with the intent of identifying some of these issues.
A ‘complex type’ example
One of the main motivations for
auto
was to simplify the declaration of variables with ‘complicated’ types. One such example is in the use of iterators over standard library containers in cases such as:
std::vector<std::set<int>> setcoll; std::vector<std::set<int>>::const_iterator it = setcoll.cbegin();1
Many programmers were put off using the STL because of the verbosity of the variable declarations. With C++03 one recommendation was to use a
typedef
– and this approach remains valid in C++11:
typedef std::vector<std::set<int> > collType; // C++03 code still works fine collType setcoll;6"> collType::const_iterator it = setcoll.begin();
With the addition of
auto
to the language the code can be shortened considerably:
std::vector<std::set<int>> setcoll; auto it = setcoll.cbegin();
But is it better ?
To help answer that question let us consider the alternatives in more detail.
The original code is often seen as hard to read because the length of the variable declaration dwarfs the name itself. Many programmers dislike the way that the meaning of the code is masked by the scaffolding required to get the variable type correct.
Additionally, the code is fragile in the face of change. The type of the iterator is heavily dependent on the type of the underlying container so the two declarations (for
setcoll
and
it
) must remain in step if the type of one changes.
The second code, using a
typedef
, improves both the readability of the code and also the maintainability as, should the type of the container change, the nested type
const_iterator
governed by the
typedef
will change too. However, having to pick a type name adds to the cognitive overhead; additionally good names are notoriously hard to pin down.
In the final code the use of
auto
further helps readability by focussing the attention on the expression used to initialise
it
as this defines the type that
auto
will resolve to. Given this, code maintainability is improved as the type of
it
will track the type required by the initialising expression.
We retain the type safety of the language – the variable is still strongly typed – but implicitly not explicitly. The main downside of the final version of the code is that if you do need to know the precise type of the variable then you have to deduce it from the expression, to do which also means knowing the type of the container. On the other hand, it can be argued that to understand the semantics of the line of code you already have to know this information, so the new style has not in practice made understanding the code any more difficult.
In this case I am inclined to agree with this view and I can see little downside to the use of
auto
to declare variables for iterators and other such entities. So:
- the code is quicker and easier to write and, arguably, to read
- the purpose is not lost in the syntax
- code generated is identical to the explicit type
- the variable automatically changes type if the collection type changes
However, the last point can be reworded as the variable
automatically
silently
changes type if the collection type changes. In particular this can be an issue with the difference between a
const
and non-
const
container. Note that the C++11 code uses
cbegin()
:
auto it = setcoll.cbegin();
If we’d retained the used of
begin()
we would have got a
modifiable
iterator from a non-
const
collection. The C++03 code makes it explicit by using the actual type name:
std::vector<std::set<int>>::const_iterator it;
The stress is slightly different and may mean making some small changes to some class interfaces, as with the addition of
cbegin()
.
DRY example
auto
allows you to specify the type name once. Consider this code:
std::shared_ptr<std::string> str = std::make_shared<std::string>("Test");
-
We’ve repeated
the std::string
-
make_shared
exists solely to createstd::shared_ptr
objects
We can write it more simply as:
auto str = std::make_shared<std::string>("Test");
The resulting code is just over half as long to write (and read) and I don’t think we’ve lost any information. Additionally the code is easier to change.
Using
auto
rather than repeating the type is indicated most strongly when:
- the type names are long or complex
- the types are identical or closely related
auto
is less useful when:
- the type name is simple – or important
- the cognitive overhead on the reader of the code is higher
So I think
auto
may be less useful in an example like that in Listing 1.
// in some header struct X { int *mem_var; void aMethod(); }; // in a cpp file void X::aMethod() { auto val = *mem_var; // what type is val? ... |
Listing 1 |
YMMV (Your mileage may vary) – opinions differ here. The ease of answering the question about the type of
val
may also depend on whether you are using an IDE with type info.
For example, with Microsoft Visual Studio you get the type for the example in Listing 1 displayed in the mouse-over as shown in Figure 1.
Figure 1 |
Dependent return type example
auto
can simplify member function definitions. Consider the class and member function definition in Listing 2.
class Example { public: typedef int Result; Result getResult(); }; Example::Result Example::getResult() { return ...; } |
Listing 2 |
We have to use the prefix of
Example::
for the return type
Result
as at this point in the definition the scope does not include
Example
.
auto
allows the removal of the class name from the return type.
The syntax is to place the
auto
where the return type would otherwise go, then follow the function prototype with
->
and the actual return type:
auto Example::getResult() -> Result { return ...; }664">
Whether or not this makes the code clearer depends on factors including:
- familiarity
- consistent use of this style.
Personally, I still can’t decide on this one. I think the new style is an improvement over the old one, but until use of C++11 is sufficiently widespread trying to use the style may simply result in a mix of the old and new styles being used. I do not think this would be a great step forward for existing code bases, but might be worth trying out for new ones.
Polymorphism?
One problem with
auto
is the temptation to code to the
implementation
rather than to the
interface
. If we imagine a class hierarchy with an abstract base class
Shape
and various concrete implementations such as
Circle
and
Ellipse
. We might write code like this:
auto shape = make_shared<Ellipse>(2, 5); ... shape->minor_axis(3);
The use of
auto
has made the generic variable
shape
to be of the explicit type shared pointer to
Ellipse
. This makes it too easy to call methods – such as
minor_axis
above – that are not part of the interface but of the implementation.
When the type of shape is ‘shared pointer to the abstract base class’, you can’t make this mistake. (Aside: I think this is a bigger problem with
var
in C# than with
auto
in C++ but your experience may be different.) The trouble is that
auto
is too ‘plastic’ – it fits the exact type that matches whereas
without
auto
the author needs to make a decision about the most appropriate type to use. This doesn’t only affect polymorphism:
const
, signed/unsigned integer types and sizes are other possible pinch points where the deduction of the type done by
auto
is not the best choice.
What type is it?
It is possible to go to the extreme of making everything in the program use
auto
, but I’m not convinced this is a good idea. For example, what does the program in Listing 3 do?
auto main() -> int { auto i = '1'; auto j = i * 'd'; auto k = j * 100l; auto l = k * 100.; return l; } |
Listing 3 |
It is all too easy to assume the
auto
types are all the same – miss the promotion, the
'l'
or the
'.'
. Opinions also vary on whether writing
main
using
auto
aids readability – I am not at all sure it does, especially given the large amount of existing code predating this use of
auto
.
You can use the
auto
rules (on some compilers) to tell you the type. For example, if we want to find out the actual type of
j
we could write this code:
auto main() -> int { auto i = '1'; auto j = i * 'd', x = "x"; ...
When compiled this will error as the type deduction for
auto
for the variables
j
and
x
produces inconsistent types. A possible error message is:
error: inconsistent deduction for 'auto': 'int' and then 'const char*'
You may also be able to get the compiler to tell you the type by using template argument deduction, for example:
template <typename T> void test() { T::dummy(); } auto val = '1'; test<decltype(val)>();
This generates an error and the error text (depending on the compiler) is likely to include text such as:
see reference to function template instantiation 'void test<char>(void)' being compiled
What are the actual rules?
The meaning of an
auto
variable declaration follows the rules for template argument deduction.
We can consider the invented function template
template <typename T> void f(T t) {}
and then in the expression
auto val = '1';
the type of
val
is the same as that deduced for
T
in the call
f('1')
.
This meaning was picked for good reason – type deduction can be rather hard to understand and it was felt that having a subtly different set of rules for
auto
from existing places where types are deduced would be a bad mistake. However, this does mean that the type deduced when using
auto
differs from a (naïve) use of
decltype
:
const int ci; auto val1 = ci; decltype(ci) val2 = ci;
val1
is of type
int
as the rules for template argument deduction will drop the top-level
const
; but the type of
val2
will be
const int
as that is the declared type of
ci
.
Adding modifiers to auto
Variables declared using
auto
can be declared with various combinations of
const
and various orts of references. So what’s the difference?
auto i = <expr>; auto const ci = <expr>; auto & ri = <expr>; auto const & cri = <expr>; auto && rri = <expr>;
As above,
auto
uses the same rules as template argument deduction so we can ask the equivalent question about what type is deduced for the following uses of a function template:
template <typename T>; void f(T i); void f(T const ci); void f(T & ri); void f(T const & cri); void f(T && rri);
The answer to the question is, of course, ‘it depends’ ... especially for the
&&
case (which is an example of what Scott Meyers has named the ‘Universal Reference’).
const inference (values)
Let us start by looking at a few examples of using
auto
together with
const
for simple value declarations.
int i(0); int const ci(0); auto v0 = 0; auto const v1 = 0; auto v2 = i; auto const v3 = i; auto v4 = ci; auto const v5 = ci;
This is the easiest case and, as in the earlier discussion of the difference between
auto
and
decltype
,
v0
is of type
int
and
v1
is of type
int const
(you may be more used to calling it
const int
). Similarly
v2
and
v4
are of type
int
and
v3
and
v5
are of type
int const
.
In general, with simple variable declarations, I prefer using
auto const
by default as the reader knows the value will remain fixed. This means if they see a use of the variable later in the block they do not have to scan the intervening code to check whether or not the value has been modified.
const inference (references)
Let’s take the previous example but make each variable an l-value reference:
int i(0); int const ci(0); auto & v0 = 0; // Error auto const & v1 = 0; auto & v2 = i; auto const & v3 = i; auto & v4 = ci; auto const & v5 = ci;
The first one
fails
as you may not form an l-value reference to a temporary value. However, you
are
allowed to form a
const
reference to a temporary and so
v1
is valid (and of type
int const &
).
v2
is valid and is of type
int &
and the three remaining variables are all of type
int const &
. Notice that the
const
for
v4
is not removed, unlike in the previous example, as it is not a
top-level
use of
const
.
Reference collapsing and auto
Things get slightly more complicated again when we use the (new) r-value reference in conjunction with
auto
.
int i(0); int const ci(0); auto && v0 = 0; auto const && v1 = 0; auto && v2 = i; auto const && v3 = i; // Error auto && v4 = ci; auto const && v5 = ci; // Error
The first variable,
v0
, becomes an r-value reference to the temporary 0 (type
int &&
) and the second,
v1
, is the
const
equivalent (
int const &&
). When it comes to
v2
, however, the reference type ‘collapses’ to an l-value reference and so the type of
v2
is simply
int &
.
v3
is invalid as the presence of the
const
suppresses the reference collapsing and you are not allowed to bind an r-value reference to an l-value.
v4
reference-collapses to
int const &
and the declaration of
v5
is an error for the same reason as for
v3
.
So this is the complicated one:
auto && var = <expr>;
as, depending on the expression,
var
could be
T & T && T const & T const &&
Deducing the last case is a little more obscure – you need to bind to a
const
temporary that is of class type. Here’s an example of deducing
const &&
:
class T{}; const T x() { return T(); } auto && var = x(); // var is of type T const &&
Note that non-class types, like
int
, decay to
&&
. This changed during the development of C++11 and at one point Microsoft’s compiler and the Intellisense disagreed over the right answer (see Figure 2)!
Figure 2 |
(The compiler in the Visual Studio 2013 preview edition does now get this right.)
More dubious cases
auto
does not work well with initializer lists as the somewhat complicated rules for parsing these results in behaviour, when used with
auto
, that may not be what you expect:
int main() { int var1{1}; auto var2{1};
You might expect
var1
and
var2
to have the same type. Sadly the C++ rules have introduced a new ‘vexing parse’ into the language. The type of
var2
is
std::initializer_list<int>
. There is a proposal to make this invalid as almost everyone who stumbles over this behaviour finds it unexpected.
A mix of signed and unsigned integers – or integers of different sizes – can cause problems with
auto
. In many cases the compiler generates a warning, if you set the appropriate flag(s), and if you heed the warning you can resolve possible problems. But not in all cases ....
std::vector<int> v; ... for (int i = v.size() - 1; i > 0; i -= 2) { process(v[i], v[i-1]); }
If you change
int
to
auto
then the code breaks. The trouble here is that
v.size()
returns
std::vector::size_type
which is an unsigned integer value. The rules for integer promotions means that
i
is also an unsigned integer value. If it starts out odd it will decrease by 2 round the loop as far as 1, then the next subtraction will wrap around – to a large
positive
value. Of course, care must be taken to ensure that an
int
will be large enough for all possible values of
size()
that the program might encounter.
I’m less convinced by the use of
auto
for variables defined by the results of arithmetic expressions as the correct choice of variable type may be necessary to ensure the desired behaviour of the program.
Conclusion
auto
is a very useful tool in the programmer’s armoury as it allows you to retain type safety without needing to write out the explicit types of the variables. I expect that use of
auto
will become fairly widespread once use of pre-C++11 compilers becomes less common.
However, I do have a concern that thoughtless use of
auto
may result in code that does not behave as expected, especially when the data type chosen implicitly is not the one the reader of the code anticipates.
Please don’t use
auto
without thought simply to save typing, but make sure you use it by conscious choice and being aware of the potential issues and possible alternatives.
Acknowledgements
Many thanks to Rai Sarich and the Overload reviewers for their suggestions and corrections which have helped to improve this article.
This article is based on the presentation of that title at ACCU 2013.