The Casting Vote

San Jose - November 1993 - Sean A. Corfield

The announcement of Borland C++ 4.0 managed to permeate the November meeting of the joint ISO and ANSI committees. Borland's representative disappeared for one day - everyone joked that this was in order to ensure that the forthcoming release matched the decisions of the committee as closely as possible. There is no doubt that Borland now produce the leading-edge commercial compiler - they've even managed to include some features that were only voted into the draft standard at the San Jose meeting, namely the new style cast syntax, more of which later.

What were the decisions made at San Jose and how will they affect you?

1. Library

The standard library is still going through some turmoil. Minor changes were made to the standard string class and the dynamic array class (dynarray).

The string class now has two index operators:

char  operator[](size_t) const;
char& operator[](size_t);

"About time", many people will say. Complex number classes have been added which match the proposed NCEG (Numerical C Extensions Group) builtin complex data types. A stringstream class has been added to allow streams to be used with the builtin string class for all the code where we currently like to use "sprintf". Streams themselves are still being worked on, however, in the face of internationalisation (often referred to as "i18n" because 18 letters are omitted from the abbreviation!).

"i18n" might not seem like a very interesting topic - wide characters and multi-byte characters - but it generates a lot of discussion. At the moment, two somewhat different approaches to streams have been proposed. The committee hopes to settle this fairly soon, but there do appear to be some personal / political issues which could become a sticking point - I will say no more!

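As a rough illustration of the stringstream idea mentioned above, here is a minimal sketch of formatting into a string without a fixed-size "sprintf" buffer. It assumes the class is spelled std::ostringstream and lives in a header called <sstream>; those names, and the helper describe, are assumptions for illustration and not part of the San Jose resolution.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: builds "name = value" without a char buffer.
std::string describe(const std::string& name, int value) {
    std::ostringstream out;        // a stream that writes into a string
    out << name << " = " << value; // the usual stream insertion operators
    return out.str();              // extract the accumulated text
}
```

The caller never has to guess a buffer size, which is the main attraction over "sprintf".
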
Other changes in the standard library will have a wider impact: namespaces and exceptions.

A proposal has been accepted to "wrap" each standard header in a namespace to reduce pollution of the global scope. To provide a transition, old-style headers will be provided that include the new-style header and then open up the namespace, something along the lines of:

// math -- new header 
namespace iso::c::math {
  double acos(double);
  // etc
}


// math.h
// -- include namespace version of
//    math header
#include <math>
// -- open up the namespace:
using namespace iso::c::math;

// prog.c
#include <math.h> 
// doesn't need to change 
... acos(0.5) ...

Why is this useful? Each header currently declares all its names in the global namespace which would prevent you from using, say, a faster maths library implementation, e.g.,

// fastmath 
namespace fastmath {
  double acos(double);
  // etc 
}

// prog.c
#include <math.h>
#include <fastmath>
// -- override some declarations:
using fastmath::acos;
using fastmath::asin;
// etc
... acos(0.5) ...// uses fastmath::acos

Because of the way namespaces are defined to work, this allows you to mix'n'match between libraries in a well-defined manner. The general approach has been accepted but there may be name changes before the standard is produced. By the way, namespace was voted into the draft standard in July 1993 - the last "big" extension.
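
To see the mix'n'match idea in miniature, here is a minimal sketch. The namespaces refmath, fastmath and app, and the functions inside them, are all hypothetical and invented for illustration. The using lines sit inside app rather than at file scope; that placement is an assumption made so that the using-declaration is found before the names opened up by the using-directive.

```cpp
#include <string>

namespace refmath {   // hypothetical stand-in for a standard maths namespace
    inline std::string acos_source() { return "refmath"; }
    inline std::string asin_source() { return "refmath"; }
}

namespace fastmath {  // hypothetical replacement library
    inline std::string acos_source() { return "fastmath"; }
}

namespace app {
    using namespace refmath;      // open up everything, as an old-style header would
    using fastmath::acos_source;  // override just one declaration

    // acos_source resolves to fastmath, asin_source still to refmath.
    inline std::string report() {
        return acos_source() + "/" + asin_source();
    }
}
```

Only the names that are explicitly overridden change; everything else keeps coming from the original library.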

The exceptions proposal for the library will provide a single class hierarchy of exceptions, currently with xmsg at the root, which partitions exceptions that can be thrown by the library routines into subcategories such as domain and range. Again, the principle has been accepted but the exact details are still to be finalised. In the light of a PC compiler that supports exception handling, this is an important issue - you need to know the exception inheritance hierarchy in order to be able to write code that robustly handles library exceptions.
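
Why the hierarchy matters can be shown with a minimal sketch. Since the details are not finalised, every name here (xmsg's member why, the classes xdomain and xrange, and the routine table_lookup) is hypothetical and only loosely follows the description above.

```cpp
#include <string>

// Hypothetical hierarchy: a root class with two subcategories.
struct xmsg {
    virtual ~xmsg() {}
    virtual std::string why() const { return "library error"; }
};
struct xdomain : xmsg {
    std::string why() const { return "domain error"; }
};
struct xrange : xmsg {
    std::string why() const { return "range error"; }
};

// Hypothetical library routine that reports failures by throwing.
int table_lookup(int index) {
    static const int table[3] = {10, 20, 30};
    if (index < 0) throw xdomain();
    if (index >= 3) throw xrange();
    return table[index];
}

// Handles one subcategory specifically and everything else via the root.
std::string describe_lookup(int index) {
    try {
        table_lookup(index);
        return "ok";
    } catch (const xdomain&) {
        return "caller bug: negative index";
    } catch (const xmsg& e) {
        return "recovered from " + e.why();
    }
}
```

The order of the handlers only makes sense if you know which class derives from which, which is exactly the information the library proposal has to pin down.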

2. Templates

The Extensions Working Group resolved the majority of the issues that were outstanding on templates. This will result in Chapter 14 Templates being completely rewritten between now and March 1994 in order to incorporate the resolutions. The new chapter will include some extensions that are required to make templates useful for real work.

In case you think I'm being flippant, I should probably explain that remark: almost every implementation already provides some extensions but they are incompatible with each other - the committee is trying to standardise these extensions to make useful programs portable.

One major area which has been clarified is "name binding". This determines whether an identifier in a template definition is looked up at the definition point or at the instantiation point. Examples to explain this are complicated, but the bottom line is that with a few minor details still to be resolved, the compiler will be able to syntax check a template definition for you, even if you do not instantiate it.

What extensions?

You can now explicitly ask for a template to be instantiated rather than relying on the compiler to figure it out:

template<class T> class Thing {...}; 
// request instantiation 
template class Thing<int>;

This should make the compile-and-link cycle faster because the compiler can perform all the instantiations "on-the-fly" rather than having the pre-link phase run the compiler for each template that is required to be instantiated.
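
A minimal sketch of an instantiation request in a complete translation unit follows. The members of Thing are hypothetical, filled in only so that there is something to instantiate.

```cpp
// Hypothetical class template with one member function.
template<class T>
class Thing {
public:
    explicit Thing(T v) : value_(v) {}
    T doubled() const { return value_ + value_; }
private:
    T value_;
};

// Request instantiation: all members of Thing<int> are generated here,
// whether or not this file happens to use them.
template class Thing<int>;
```

Placing such requests in one source file gives you a single, predictable place where the template code is generated.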

You can now declare an intent to specialise a template:

// template Thing as above:
// I will provide my own definition
class Thing<char*>;

This helps the compiler decide what assumptions it can make when it later sees:

Thing<char*>  x;

In fact, the resolution will break a lot of code - you are now REQUIRED to declare your intent to specialise a template. This used to work:

// thing.h
template<class T> class Thing {...};

// thing.c 
#include <thing.h> 
class Thing<char*> {
  // specialised definition 
};

// main.c
#include <thing.h> 
Thing<char*>  a;

The specialisation would be linked in, although the class interface used would have come from <thing.h> so strange things could happen! Under the new resolution, the compiler can assume that what it sees in <thing.h> will be what is used to instantiate the template and will NOT link in the specialisation!
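
The declare-before-use rule can be sketched as below. Note the assumption: the sketch uses a template<> prefix on the specialisation, which is the spelling later compilers require and differs from the bare form shown above. The member kind is hypothetical.

```cpp
#include <string>

// Primary template, as it might appear in a shared header.
template<class T>
struct Thing {
    static std::string kind() { return "generic"; }
};

// Declare the intent to specialise before any use of Thing<char*>.
template<> struct Thing<char*>;

// The specialised definition, visible to every file that uses its members.
template<>
struct Thing<char*> {
    static std::string kind() { return "specialised for char*"; }
};
```

Keeping the declaration in the same header as the primary template means no translation unit can silently pick up the generic version for char*.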

Both of the above are supported in different forms by many compilers, but the following extension is entirely new: explicit qualification of function templates. Currently, the following is illegal:

template<class T> T* factory();

This is because template parameters are required to appear in the function parameter list so that the types can be deduced when a function call is seen. The extension allows the following to be written:

int* pi = factory<int>(); 
Shape* ps = factory<Shape>(); 
Car* pc = factory<Car>();

The "factory" will produce whatever type you want (subject, of course, to successful instantiation of the function template). The syntax of explicit qualification has some bearing on the extensions adopted for instantiation requests and specialisation described above, and for the next item I shall talk about: new style casts.
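
A minimal sketch of such a factory, assuming a compiler that accepts explicit qualification, might look like this. The Shape type and the function sides_plus_default_int are hypothetical and exist only to exercise the template.

```cpp
// Hypothetical type, standing in for a real Shape class.
struct Shape {
    int sides;
    Shape() : sides(4) {}
};

// T appears only in the return type, so it cannot be deduced from a call.
template<class T>
T* factory() {
    return new T();   // value-initialised: an int starts at zero
}

int sides_plus_default_int() {
    int* pi = factory<int>();       // caller names the type explicitly
    Shape* ps = factory<Shape>();
    int result = *pi + ps->sides;   // 0 + 4
    delete pi;
    delete ps;
    return result;
}
```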

3. New Style Casts

This was the most controversial item voted on. The proposal had generated a lot of discussion during the week, but when it was presented for formal vote it split the ANSI committee almost exactly down the middle (the ISO committee were in favour: six for, one against and one abstention). Some further discussion ensued when it was discovered that one ANSI member did not have voting rights and that the ANSI vote would have to be counted again! Fortunately, Bjarne Stroustrup was able to sway several committee members and the final vote went 3 : 2 in favour at ANSI (the ISO vote was unchanged).

What are the new style casts?

In March 1993, run-time type identification was voted into the draft which introduced the unusual looking notation:

if (X* px = dynamic_cast<X*>(p)) { ... }

The new style casts also look like that, with three new operators as follows:

static_cast<type>(expression)
const_cast<type>(expression)
reinterpret_cast<type>(expression)

The idea behind these is to enhance the safety of casting. Currently, if you see:

Y* py; 
// . . . 
X* px = (X*)py;

You cannot tell what the cast will do. Depending on whether X and Y are complete types (i.e., their definition is visible) and whether they are related, the cast will change its behaviour. Indeed, simply adding an extra include file could cause the cast to change behaviour!

The new casts come with a set of restrictions on what they can actually do which are, in simple terms, as follows:

const_cast is the simplest to explain - it allows you to adjust the const and volatile qualifiers and nothing else, e.g.,

class A {...};
class B : public A {...};
A*    pa;
B*    pb;
int const* volatile* pvpci;
// ...
const_cast<A const*>(pa);                 // OK
const_cast<int**>(pvpci);                 // OK
const_cast<int const**>(pvpci);           // OK
const_cast<A const*>(pb);                 // error!
const_cast<char const* volatile*>(pvpci); // error!

The last two are errors because they change more than just the qualifiers.
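
A typical legitimate use is calling an older routine that forgot to say const. The sketch below is only illustrative: legacy_length and length_of are hypothetical names, and the cast is safe here solely because the callee never writes through the pointer.

```cpp
// Hypothetical legacy routine: takes a non-const pointer but only reads.
int legacy_length(char* s) {
    int n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

int length_of(const char* s) {
    // const_cast changes only the qualifier; the pointee type stays char.
    return legacy_length(const_cast<char*>(s));
}
```

Anyone searching the source for const_cast will find every place where constness is being removed, which is not true of the old cast notation.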

static_cast is related to implicit casts - if you can do an implicit cast between two types (in one direction), static_cast allows you to go in both directions so long as you do not cast away const, e.g.,

B const b2("constant thing");
// implicit B const* -> A const*
A const* pa2 = &b2;
// ...
static_cast<A*>(pb)    // OK
static_cast<B*>(pa)    // downcast - also OK
static_cast<B*>(pa2)   // error!

The latter is an error because it casts away constness. You can say:

static_cast<B*>(const_cast<A*>(pa2))

or

const_cast<B*>(static_cast<B const*>(pa2))

This is much more explicit - see how easy it was to write an unsafe program before:

(A*)pb  // OK - redundant upcast
(B*)pa  // OK - we know it's a B really
(B*)pa2 // OK? oops! now we can modify b2!
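
Put together as something that compiles, the static_cast rules might be sketched as follows. The classes A and B here are hypothetical cut-down versions with one data member each, and read_b_through_base is an invented name.

```cpp
struct A { int a; A() : a(1) {} };
struct B : A { int b; B() : b(2) {} };

int read_b_through_base() {
    B b;
    A* pa = &b;                        // implicit upcast
    B* pb = static_cast<B*>(pa);       // downcast: fine, pa really points at a B

    const A* pa2 = &b;
    // static_cast<B*>(pa2);           // would not compile: casts away const
    const B* pb2 = static_cast<const B*>(pa2);  // downcast that keeps const

    return pb->b + pb2->a;             // 2 + 1
}
```

The commented-out line is the one the old notation would have accepted without complaint.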

The final cast is reinterpret_cast - it can do all the casts that you cannot do with static_cast, but it also cannot cast away constness.
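
One conversion that falls to reinterpret_cast is between pointers and integers. The following is a minimal sketch, assuming an integer type wide enough to hold a pointer is available as std::uintptr_t from a <cstdint> header (an assumption, not something decided at San Jose); pointer_round_trips is a hypothetical name.

```cpp
#include <cstdint>

// Returns true if a pointer survives a trip through an integer and back.
bool pointer_round_trips() {
    int value = 7;
    int* p = &value;
    // Pointer <-> integer is outside what static_cast permits.
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    int* back = reinterpret_cast<int*>(bits);
    return back == p && *back == 7;
}
```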

The committee were unable to agree that deprecating the C-style cast would be a good move and that part of the proposal was dropped, but even so, I would recommend using the new style casts as soon as you get Borland 4.0 because they will catch a lot of bugs that would otherwise be very hard to track down.

Note that Bjarne Stroustrup's "hard" problem (from the Interview in Overload #2) with adding const in a pointer to function cast was decided to be a non-problem. In other words, reinterpret_cast is allowed to do that type of cast (and const_cast is not) because when you change the constness of function parameters you simply get a different type of function and the standard already says that it is undefined if you attempt to call a function through such a pointer to function - you have to cast it back to the original type first. Whilst this leaves a very small hole in the const-safety of the new casts, it is considered to be obscure enough not to matter, and the benefits outweigh the problems.

4. Boolean

A proposal to introduce a builtin boolean type was considered and accepted (3 : 2 at ANSI; 7 for, 1 against, no abstain at ISO). This provides a builtin integral type called "bool" and two reserved words "true" and "false". Those of you with a background in languages like Pascal will cheer, others will be stunned / shocked / delete where applicable. The UK panel discussed the possibility of a builtin boolean type over a year ago but could not see how to integrate it into the language. Andrew Koenig of AT&T and Dag Bruck of Dynasim AB, Sweden, solved the problem beautifully by introducing implicit conversions from integral and pointer types to bool in certain circumstances.

The following operators now return a bool result:

<    <=    >    >=    ==    !=
&&    ||    !

The latter three also require bool operands (but will implicitly convert integral and pointer types if necessary), as does the condition of ?:

All the conditional constructs (if, while, for, do while) require a bool expression (with conversion if necessary). bool converts implicitly to int, with true becoming 1 and false becoming 0 - this preserves existing code that uses int (or char) instead of some boolean typedef. There are a few quirks that will arise from the adoption of bool - floating point types do not convert:

double d;
if (d) // sloppy! tests d != 0.0
{      // becomes an error with bool type
}
int i;
i = d; // converts d to int - not good if i
       // is used as boolean
if (i) // huh?
{
}
bool b;
b = d; // error - cannot convert!

This is good news - you cannot reliably test floating point numbers for equality and converted floating point numbers do not make good true/false values!
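
The integral and pointer conversions described earlier can be sketched in a few lines. This is a minimal illustration, assuming a compiler that already supports bool; the function count_true is hypothetical.

```cpp
// Counts how many of three conditions hold, relying on bool -> int conversion.
int count_true(int i, const char* p) {
    bool nonzero = i;         // integral -> bool: any non-zero value becomes true
    bool nonnull = p;         // pointer -> bool: any non-null pointer becomes true
    bool ordered = (i < 10);  // relational operators yield a bool
    return nonzero + nonnull + ordered;  // each bool converts to 0 or 1
}
```

Existing code that stores the result of a comparison in an int keeps working because of that final conversion.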

Another quirk is that ++ will be defined to mean "become true" to allow the following current (sloppy) practice:

bool matches = false;
for (list* p = head; p; p = p->next) {
  if (p->item == key) {
    ++matches;
  } 
}
if (matches) { // found any matches for key? 
}

Note that ++ on a bool is deprecated. You will eventually have to rewrite this:

int count = 0; // count the entries, or...
bool matches = false; // ...use a flag
for (list* p = head; p; p = p->next) { 
  if (p->item == key) {
    // increment the count, or...
    ++count;
    matches = true; // set the flag 
  } 
}
// valid use of bool matches, or... 
if (matches) {
}
// ...alternate use of a count 
if (count > 0) { 
}

The choice is probably stylistic. Again, the justification for introducing bool is type safety - if you really want a boolean valued variable, now you can declare one and have the compiler check that you are using it consistently.

5. Implicit int

The joint committee decided (by almost unanimous vote) to deprecate implicit int everywhere and ban it altogether in two places. In fact, the actual deprecation covers omitted type-specifiers so it will be acceptable to say:

short s = 0;

but you will get a compiler warning (probably) if you say:

const c = 42;

static x = 1;

The two places where omitting the type-specifier has been banned altogether are:

// file scope:
f();   // previously meant int f();
a;     // previously meant int a;

and

// previously meant typedef int Int;
typedef Int;

The first two are not allowed by ISO C and are not believed to be common in C++; the last certainly shouldn't be common. Part of the reason for banning the typedef form was that typedef will be used to resolve some name binding issues in templates.

6. Miscellaneous

The Core Working Group resolved several (generally minor) issues covering a wide variety of topics: qualifiers, references, lvalue-ness and so on. Most of the resolutions are intended to codify existing practice, but it is difficult to know the impact on the language until the changes are integrated. Personally, I suspect they are not consistent and we will see some backtracking.

7. The Working Paper, Editing and The Schedule

The committee is still intending to vote the Working Paper up to a Working Draft in July 1994. This is the first step in getting the draft standard approved as an International Standard. Almost everyone on the committee believes this is a very aggressive schedule. The ISO committee members are deeply unhappy with the current state of the draft and look likely to vote against approving a Working Draft in July unless some substantial improvements are made before that time. In order to achieve this, Andrew Koenig, the project editor, has convened an editorial working group who will edit parts of the draft in parallel to speed up the process. Volunteers for the work include ISO members from the UK (myself), the USA and Australia, with both France and Germany keen to review the changes prior to incorporation into the draft. This process may well become the critical task in providing a timely international C++ standard. Wish us luck!

The next international meeting is in March 1994.

Now I will make a general plea: support your national standards body!

If you are interested in attending the UK C++ panel meetings (one day each, four times a year in London) please contact me for more information.

I can be contacted by e-mail: Sean.Corfield@prl0.co.uk (yes, I know, P-R-L-zero is an odd e-mail address!)

(Ed - write to me if you are interested and don't have access to e-mail)