
Single Exit

Overload Journal #57 - Oct 2003 + Programming Topics   Author: Jon Jagger

In CVu 15.4 Francis makes a case that some functions are less complex if they use multiple return statements. In Overload 55 I stated my preference for single exit via a single return. I'd like to explore the examples Francis presented to try and explain my preference more explicitly.

Example 1 - Multiple Returns

The first code fragment Francis presented was as follows:

bool contains(vector<vector<int> >
              const & array, int value) {
  int const rows(array.size());
  int const columns(array[0].size());
  for (int row(0); row != rows; ++row) {
    for (int col(0);
         col != columns;
         ++col) {
      if (array[row][col] == value) {
        return true;
      }
    }
  }
  return false;
}

And he wrote:

"if you are a believer that functions should never have more than a single return you have a problem because however you reorganise your code the requirement is for two distinct exit conditions".

I'm a firm believer, but even if I weren't I'd have to agree that some multiple returns somehow feel more acceptable than others. And as Francis says "perceived complexity is a function of many things". However, I don't think it's quite accurate to say the requirement is for two distinct exit conditions. To try and explain what I mean, consider the following implementation of contains[1]:

bool contains(const vector<vector<int> >
              & array, int value) {
  vector<int> all;
  for (int at = 0; at != array.size();
       ++at) {
    copy(array[at].begin(),
         array[at].end(),
         back_inserter(all));
  }
  return find(all.begin(), all.end(),
              value) != all.end();
}

This is an unusual implementation but it does show that the requirement is always to return a single value to the function caller (in this case either true or false). Exactly how you do so depends on your choice of implementation which is a different matter. Another approach would be to design an iterator adapter class that "flattens" the iteration through a container of containers.
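The "flattening" idea can be sketched without the full machinery of an STL-conforming iterator. What follows is only a minimal illustration under stated assumptions: flat_cursor and contains_flat are hypothetical names that do not come from Francis's article or from any library, and the cursor is a much-simplified stand-in for a real iterator adapter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical cursor (not a standard iterator) that walks a
// vector<vector<int> > as though it were one flat sequence.
class flat_cursor {
public:
    explicit flat_cursor(const std::vector<std::vector<int> >& rows)
        : rows_(rows), row_(0), col_(0) {
        skip_exhausted_rows();
    }

    // True once every element of every row has been visited.
    bool at_end() const { return row_ == rows_.size(); }

    // Element under the cursor; only valid while !at_end().
    int current() const { return rows_[row_][col_]; }

    void advance() {
        ++col_;
        skip_exhausted_rows();
    }

private:
    // Move past rows that are empty or fully consumed.
    void skip_exhausted_rows() {
        while (row_ != rows_.size() && col_ == rows_[row_].size()) {
            ++row_;
            col_ = 0;
        }
    }

    const std::vector<std::vector<int> >& rows_;
    std::size_t row_;
    std::size_t col_;
};

// Single loop, single return: all the nesting lives inside the cursor.
bool contains_flat(const std::vector<std::vector<int> >& values, int value) {
    flat_cursor at(values);
    while (!at.at_end() && at.current() != value) {
        at.advance();
    }
    return !at.at_end();
}
```

Unlike the copying version, this sketch stops as soon as the value is found and allocates nothing. It also copes with empty and ragged rows.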

Francis continues "The only ways these can be combined in a single return statement require either continuing processing after you know the answer or increasing the perceived complexity of the code." Here is the heart of the issue - the complexity of the code. Is a single-return version of contains necessarily more complex?

Example 1 - Single Return

Here's a more realistic single-return version of contains:

bool contains(const vector<vector<int> >
              & values, int value) {
  int at = 0;
  while (at != values.size()
         && !exists(values[at], value)) {
    ++at;
  }
  return at != values.size();
}

This makes use of the following non-standard helper function:

template<typename range, typename value>
bool exists(const range & all,
            const value & to_find) {
  return find(all.begin(), all.end(),
              to_find) != all.end();
}
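For readers who want to compile the single-return version, the two fragments can be combined into one self-contained unit roughly as follows. This is a sketch rather than the article's exact listing: the std:: qualifications and the std::size_t index are assumptions added here so that it builds without a using-directive or a signed/unsigned comparison warning.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Non-standard helper: true if to_find occurs anywhere in the range.
template<typename range, typename value>
bool exists(const range& all, const value& to_find) {
    return std::find(all.begin(), all.end(), to_find) != all.end();
}

// Single-return search over a vector of vectors.
bool contains(const std::vector<std::vector<int> >& values, int value) {
    std::size_t at = 0;
    // Stop at the first row that holds the value, or after the last row.
    while (at != values.size() && !exists(values[at], value)) {
        ++at;
    }
    // If the loop stopped early, the value was found.
    return at != values.size();
}
```

Because each row is searched through its own begin() and end(), this form makes no assumption that every row has the same length, and it is well defined for an empty outer vector.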

Example 1 - Comparison

What are the differences between these single/multiple return versions?

  • Line count. No difference. (I haven't counted lines containing a single left/right brace).

  • Maximum indent. The deepest control in the multiple-return version is 3 - the return in an if in a for in a for whereas the deepest control in the single-return version is 1 - the increment in the while. This is the reason the single-return version needs fewer lines containing a single left/right brace.

  • Function count. The multiple-return version is a single function whereas the single-return version uses a helper function. The helper function is useful in its own right and could quite conceivably have already existed. Small helper functions are significant because they can help to make other functions smaller and clearer.

  • Loop scaffolding complexity. By using the helper function the single-return version has lost a whole level of iteration scaffolding.

  • Return expression complexity. The multiple-return version uses two literals, true and false. One of these returns occurs at indentation level 3. In contrast the single-return version uses a single boolean expression at indentation level 1.

  • Loop condition complexity. The multiple-return version has two very simple (and very similar) boolean expressions as its two for statement continuation conditions. The single-return version has one (more complex) boolean expression in its while statement continuation condition. How comfortable you are with this more complex boolean expression (using the && short-circuiting operator) is largely a matter of how familiar you are with this style.

  • Style. If you are looking for an element in a C++ vector you could argue that it's reasonable to expect to use the C++ find algorithm, as the single-return version does (via the exists helper). In contrast the multiple-return version uses a more C-like explicit subscript iteration. The difference is quite subtle in this case but it does serve to highlight an important point Francis made - "perceived complexity is a function of many things (one of them being the individual reader)". I think it's fair to say the more experience you have of "mature" C++/STL style the more readable you'd find the single-return version.

Example 2 - Multiple Returns

The second code fragment Francis presented is as follows (some code elided, I assume the int <-> bool conversions are deliberate):

bool will_be_alive(life_universe
                       const & data,
                   int i,
                   int j) {
  int const diagonal_neighbours = ...;
  int const orthogonal_neighbours = ...;
  int const live_neighbours =
              diagonal_neighbours +
              orthogonal_neighbours;
  if (live_neighbours == 3)
    return true;
  if (live_neighbours == 2)
    return data[i][j];
  return false;
}

I would start by rewriting this as follows:

bool will_be_alive(const life_universe & data,
                   int i,
                   int j) {
  const int diagonal_neighbours = ...;
  const int orthogonal_neighbours = ...;

  const int live_neighbours =
                diagonal_neighbours +
                orthogonal_neighbours;
  if (live_neighbours == 3)
    return true;
  else if (live_neighbours == 2)
    return data[i][j];
  else
    return false;
}

The difference is the explicit coding of the control-flow surrounding the return statements. Do you think making the control-flow explicit is a good thing? If you're not that bothered I invite you to consider the following:

if (live_neighbours == 3)
return true;

I hope you're more concerned by this lack of indentation. These days indenting your code to reflect logical grouping is so much an article of faith that people forget to question it, or to recall exactly why it is used. Indentation visibly groups similar actions and decisions occurring at the same level. If you believe that indentation is a Good Thing there is a strong case for clearly and explicitly emphasising that all three return statements exist at the same level. In contrast, and significantly, the multiple returns in the first example are not at the same level.

Example 2 - Single Return

Francis also presented example 2 using a single return involving nested ternary operators:

return (live_neighbours == 3)
  ? true
  : (live_neighbours == 2)
    ? data[i][j]
    : false;

I agree with Francis that this adds nothing in terms of clarity. In fact I think it's a big minus. This is the kind of code that gives the ternary operator a bad name. But inside this long and inelegant statement there is a shorter and more elegant one trying to get out. To help it escape consider a small progression. We start with this (not uncommon) pattern:

bool result;
if (expression)
  result = true;
else
  result = false;
return result;

This is exactly the kind of code that gives single-exit a bad name. It is overly verbose; it isn't a simple, clear, and direct expression of its hidden logic. It is better as:

if (expression)
  return true;
else
  return false;

But this is still overly verbose. So we take a short step to:

return (expression) ? true : false;

And removing the last bit of duplication we finally arrive at:

return expression;

This is not better merely because it is shorter. It is better because it is a more direct expression of the problem. It has been stripped of its solution-focused temporary variable, its if-else, and its assignments; all that remains is the problem-focused expression of the answer. It has less code and more software. Applying the same process to the chained if-else containing three return statements we arrive not at a nested ternary operator but at this:

return live_neighbours == 3 ||
       live_neighbours == 2 &&
       data[i][j];

This focuses on the problem, and expresses it directly, in exactly the same way.
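The claim that the boolean expression says the same thing as the chained if-else can be checked exhaustively, since a cell has at most eight neighbours. The sketch below is illustrative only: alive_by_chain, alive_by_expression and forms_agree are hypothetical names, and the cell's current state is passed as a plain bool rather than read from data[i][j]. The parentheses are an addition here; && already binds more tightly than ||, so they change nothing except how obvious the grouping is.

```cpp
// Chained if-else form with three returns at the same level.
bool alive_by_chain(int live_neighbours, bool alive_now) {
    if (live_neighbours == 3)
        return true;
    else if (live_neighbours == 2)
        return alive_now;
    else
        return false;
}

// Single-expression form; the parentheses only spell out the
// grouping that operator precedence already gives.
bool alive_by_expression(int live_neighbours, bool alive_now) {
    return live_neighbours == 3 ||
           (live_neighbours == 2 && alive_now);
}

// True when both forms give the same answer for every neighbour
// count from 0 to 8 and for both current states of the cell.
bool forms_agree() {
    bool agree = true;
    for (int n = 0; n <= 8 && agree; ++n) {
        agree = alive_by_chain(n, false) == alive_by_expression(n, false)
             && alive_by_chain(n, true) == alive_by_expression(n, true);
    }
    return agree;
}
```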

Conclusion

My rules of thumb are as follows:

  • Almost all functions are better with a single return. The issue is separation of concerns. Do you separate out the statements that determine the answer from the statement/s that return the answer? Multiple-return versions don't whereas single-return versions do.

  • Multiple return statements become less acceptable the further apart they become (both in terms of logical indentation and physical line number). Large functions have greater scope for abuse simply because they allow multiple returns to live farther apart.

  • Multiple return statements are more acceptable when they are all at the same level of a mutually-exclusive selection. In most cases these multiple returns can be refactored into a more expressive single return.

But remember, dogmatically following rules is not a recipe for good software. The best software flows from programmers who think about what they do and who follow principles and practices that naturally generate quality.

Many thanks to Kevlin for an insightful review of a first draft of this article.



[1] The array[at] duplication can be avoided like this:

copy(array[at], back_inserter(all));

which uses a handy (but sadly non-standard) version of copy which, coincidentally, I also used in my other article (Software As Read).

template<typename range, typename output>
output copy(const range & source,
            output sink) {
  return copy(source.begin(), source.end(),
              sink);
}
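As a rough sketch of how this range-based helper slots into the flatten-and-find implementation, the version below is self-contained. The helper is renamed copy_all (a hypothetical name) so that it cannot be confused with std::copy, and contains_by_flattening is likewise a name invented for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Range-based convenience wrapper around std::copy (non-standard).
template<typename range, typename output>
output copy_all(const range& source, output sink) {
    return std::copy(source.begin(), source.end(), sink);
}

// Flatten every row into one vector, then search it once.
bool contains_by_flattening(const std::vector<std::vector<int> >& array,
                            int value) {
    std::vector<int> all;
    for (std::size_t at = 0; at != array.size(); ++at) {
        copy_all(array[at], std::back_inserter(all));
    }
    return std::find(all.begin(), all.end(), value) != all.end();
}
```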
