An E-mail Conversation with Bjarne Stroustrup

Overload Journal #2 - Jun 1993   Author: Mike Toms

Q: Many C++ users are aware that you are currently engaged with the ANSI/ISO standardisation of C++ and in particular, the enhancements. What do you feel are the most major enhancements that should be part of the standard? (regardless of whether or not they are an option)

A: We, that is the ANSI/ISO committee, have accepted templates, exception handling, run-time type information, and a couple of extensions I consider minor. That in itself is a lot for the community to absorb and for the implementors and educators to catch up with. However, I hope to see namespaces accepted at the Munich meeting in July. Namespaces will provide significant help in organizing programs and in particular in composing programs out of separately developed libraries without name clashes. Namespaces are also easy to implement and easy to learn. I had them implemented in five days and I have taught people to use them in 10 minutes.

For example, two suppliers may each use their own namespace so that names will not clash:

namespace X { 
class String { ... };
typedef int Bool;
int f(const char*);
void g();
}
namespace Y { 
class String { ... };
enum Bool { false, true };
int f(int);
void h();
}

A user can then pick and choose among the names:

void my_funct()
{
X::String s = "asdf";
Y::f(2);
X::g();
}

or state a preference for a particular namespace:

void your_func()
{
using namespace X;
String s = "asdf"; // X::String
Y::f(2);
g(); // X::g
//...
}

There are more details, but these are the basics (and this is an interview, not a tutorial).
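
For readers who want to compile the fragments above, the following is a minimal sketch in present-day C++. The function bodies and return values are placeholders invented for illustration (the interview shows declarations only), `g` and `h` return `int` here purely so that the calls are observable, `String` is left out because its definition is elided above, and `Y::Bool` is omitted because `true` and `false` have since become keywords.

```cpp
#include <cassert>

// Two hypothetical suppliers, each with its own namespace.
// All bodies are placeholders; only the names come from the interview.
namespace X {
    typedef int Bool;
    int f(const char*) { return 1; }    // placeholder result
    int g() { return 10; }              // int (not void) so the call can be checked
}

namespace Y {
    int f(int i) { return i + 100; }    // placeholder result
    int h() { return 20; }
}

// Explicit qualification picks names one at a time.
int pick_and_choose()
{
    return Y::f(2) + X::g();            // Y::f(int), then X::g()
}

// A using-directive states a preference for X;
// names from Y remain reachable through qualification.
int prefer_x()
{
    using namespace X;
    Bool ok = 1;                        // X::Bool
    return ok + f("asdf") + Y::f(2) + g();   // X::f, Y::f, X::g
}
```

With these placeholder bodies, `pick_and_choose()` yields 112 and `prefer_x()` yields 114; the two `f` functions never clash because each lives in its own namespace.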

I suspect that namespaces will be the last major extension in this round of work. We can of course have a nice little discussion about what 'major' means in this context, but we do need to get a draft standard ready for public review in September '94 and we have a lot of work to do before then. I suspect that the extensions' group will be busy with cleaning up the description of templates and exceptions and dealing with proposals for little extensions - most of which will also have to be rejected or postponed. Then we'll see what the response to the public review period is and proceed based on that.

I hope this doesn't sound too negative, but stability is essential for C++ users and we can't just add every feature that people like. Even if we added just the GOOD ideas the language would become unmanageable. We have to apply taste and judgement. A language has to be a practical tool and not just a grab bag of neat features and bright ideas.

The minor extension I think would be most important is a set of cast operators to provide an alternative to the old "do everything" cast. Unfortunately, there still are a few loose ends in that proposal and if I can't resolve those I can't recommend its adoption and it won't make it into the standard. The basic idea is that a cast like

(T)v

can do too many things. It may be producing a new arithmetic value from v, it may be producing a pointer to a different subobject, it may produce a pointer of a type unrelated to the object v pointed to, it may be producing a pointer from a non-pointer type, it may be producing an integer from a pointer. It may be constructing a new object of type T by calling a constructor. It all depends on the type of v and on what kind of type T is. A reader of

(T)v

cannot know what it does without looking at the context, and if the declarations in that context change, the meaning of the cast quietly changes. Because of these quiet changes, and because programmers frequently misunderstand what the casts they and their colleagues write actually do, I consider the old-style cast "slippery". It'll do something, but far too often it's not obvious what that is. We at Bell Labs and many others have found this a significant source of bugs and maintenance problems (we do measure such things). If the writer of the cast could say what was really meant to be done many of these problems would go away.

The basic idea is to have three operators doing three basic kinds of conversions currently done by (T)v:

static_cast<T>(v)      // for reasonably well-behaved casts
reinterpret_cast<T>(v) // for horrible casts
const_cast<T>(v) // for casting away const and volatile

In addition we now have

dynamic_cast<T>(v)       // for run-time checked casts

Naturally, we can't just ban old-style casts, but with these new operators in place people could write better new code and eventually phase out old-style casts.
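
As an illustration of the three proposed operators, here is a small sketch written against C++ as it was later standardized. The function names `whole_part`, `bump`, and `same_address` are invented for this example, and the choice of conversions is only one plausible reading of "well-behaved", "horrible", and "casting away const".

```cpp
#include <cassert>
#include <cstdint>

// static_cast: a reasonably well-behaved conversion between arithmetic types.
int whole_part(double d) { return static_cast<int>(d); }

// const_cast: removes const. Writing through the result is valid only
// when the object itself was not declared const.
void bump(const int* p) { ++*const_cast<int*>(p); }

// reinterpret_cast: the "horrible" kind, here a pointer viewed as an
// integer and converted back again.
bool same_address(int* p)
{
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(bits) == p;
}
```

Each spelling states the intent that `(T)v` leaves implicit, so a reader can tell which kind of conversion was meant without consulting the surrounding declarations.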

Currently, I'm stuck on problems related to const. Many people are VERY keen that constness doesn't quietly disappear. Therefore, my intent was that only const_cast should be able to remove constness. Unfortunately, people can ADD a const specifier in one operation and then go on to try to modify a const later where it is not obvious what is going on. The kind of problem that's holding me up is:

void f(X* p); // f modifies *p

void (*fp)(const X* p); // *fp doesn't modify *p

fp = (void (*)(const X*))&f; // forcing fp to point to f
// note 'const' ADDED

void g(const X* p) // g doesn't modify *p
{
fp(p); // OOPS, thanks to the cast
// above *p gets modified
}

This is a highly obscure effect. Fortunately, it doesn't actually bite people very often, but when it does it can be extraordinarily hard to track down. Please note that this is also a problem in ANSI C. I have suggested:

fp = static_cast<void (*)(const X*)>(&f); // error

This wouldn't work because the compiler would know that the cast was suspect with respect to const and people would have to write

fp = const_cast<void (*)(const X*)>(&f); // ok

My worry is that many const problems are so obscure and subtle that people would decide that the compiler was wrong and prefer to use the old style cast that would be seen as simpler. This is the kind of problem where I have a hard time deciding whether the cure might be worse than the illness. I have a logically sound solution, but can it be successfully introduced into common C++ use?
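
The function-pointer case can still be reproduced today, although the language as later standardized spells it differently from the proposal described above: `static_cast` and `const_cast` both reject the conversion, and `reinterpret_cast` (or the old-style cast) is required. The sketch below assumes a trivial `X` with a member `n`; the typedef names `ConstFn` and `MutFn` and the helpers `disguise` and `call_restored` are invented for illustration.

```cpp
#include <cassert>

struct X { int n; };                      // hypothetical stand-in for X

void f(X* p) { ++p->n; }                  // f modifies *p

typedef void (*ConstFn)(const X*);        // type that promises not to modify *p
typedef void (*MutFn)(X*);                // f's real type

// Forcing a ConstFn to point to f. In standard C++ this needs
// reinterpret_cast; static_cast<ConstFn>(&f) and const_cast<ConstFn>(&f)
// do not compile.
ConstFn disguise() { return reinterpret_cast<ConstFn>(&f); }

// Calling f through the disguised type is undefined behavior, so this
// sketch converts the pointer back to its true type before the call.
int call_restored(X* p)
{
    MutFn original = reinterpret_cast<MutFn>(disguise());
    original(p);                          // well-defined: original type restored
    return p->n;
}
```

The round trip through `ConstFn` is harmless only because the pointer is restored before use; a caller holding the `ConstFn` and a `const X*` would have no hint that the function behind it writes to its argument.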

Q: If you had the opportunity to turn the clock back to 1980 and start the development process again, what would you have done differently and why?

A: You can never bathe in the same river twice. There are things that I could do better now, but some of those things would have killed C++ if I had done them then. For example, I couldn't work without virtual functions, yet the introduction of virtual functions was postponed from 1980 to 1983 because I - with reason - doubted my ability to get people to accept them in 1980/81. More to a current point, many can now afford garbage collection, but in 1980 essentially all of my users could not. Had C with Classes or early C++ relied on GC then the language would have been stillborn. A language has to fit with its time and grow with the changing demands. The idea of a perfect language in the abstract is fundamentally wrong. A good language serves its users as they are, for the problems they have, on the platforms they work on.

I'm not really sure Bjarne (vintage 1993) knows more about the vintage 1980 users than Bjarne (vintage 1980) did. Therefore, I don't really want to conjecture. I now know things about the vintage 1993 users that Bjarne (vintage 1980) didn't, and that knowledge I'm trying to put to good use in the standards group and elsewhere.

Q: When C++ is standardised, do you have plans to extend it further? (ANSI 2010 C++ Std perhaps)

A: My immediate reaction is "No way!" I have had enough of language work to last a lifetime. As I'm disengaging from language work I'm getting back into the use of language that started it all. I didn't really want to design a language, I just happened to have programming problems for which there was no sufficiently good language available at the time. If - and only if - my future projects get me into that situation again will I consider new language features. At the HOPL-2 conference Dennis Ritchie observed that there seemed to be two kinds of languages: the ones designed to solve a problem and the ones designed to prove a point. Like C, C++ is of the former kind.

Q: What was the programming problem that started you on the C++ development track?

A: I was looking for a way to separate the UNIX kernel into parts that could run as a distributed system using a local area network. I needed to express the kernel as a set of interacting modules and I needed to model the network for analysis. In both cases, I needed the class concept to express my ideas. I never actually got back to those problems myself, though I have over the years helped on several projects using C++ to simulate networks and network traffic to help design networks and protocols.

Q: I often hear (mainly from Smalltalk programmers) the criticism that C++ is not a 'pure' OO language. Do you think that being a 'hybrid' language strengthens or weakens C++ as a commercial programming language?

A: Arm-chair philosophers also tend to make that criticism. I think that C++'s real strength comes from being a 'hybrid.' As I said above, C++ was designed to solve problems rather than (merely) to prove a point. C++ is a general-purpose programming language, a multi-paradigm programming language rather than (merely) an object-oriented language. Not all problems map well into any particular view of programming. In particular, not all problems map into a view of object-oriented programming as the design and use of deeply nested class hierarchies demonstrate. The programming problems we face and people's ways of thinking are much more varied than people would prefer to believe. Consequently, it is easy to design a smaller, simpler, and cleaner language than C++. I knew that all along. What is needed, though, and what I built was a language that was flexible enough, fast enough, and robust enough to cope with the unbelievable range of real challenges.

Q: Do you think that subjects such as Garbage Collection and Persistence should be dealt with as part of the language, or be implementation/third-party add-ons?

A: Persistence is many different things to different people. Some just want an object-I/O package as provided by many libraries, others want a seamless migration of objects from file to main memory and back, others want versioning and transaction logging, others will settle for nothing less than a distributed system with proper concurrency control and full support for schema migration. For that reason, I think that persistence must be provided by special libraries, non-standard extensions, and/or "third-party" products. I see no hope of standardizing persistence.

The ANSI/ISO standards committee's extensions' group is looking into whether we can help with some of the simpler levels of this problem either through language features or through standard library classes.

The support for run-time type identification that we accepted in Portland in March contains a few "hooks" deemed useful by people dealing with persistence.

Optional garbage collection is, I think, the right approach for C++. Exactly how that can best be done is not yet known, but we are going to get the option in several forms over the next couple of years (whether we want to or not).

Why GC?

It is the easiest for the user. In particular, it simplifies library building and use.

It is more reliable than user-supplied memory management schemes for some applications.

Why not GC?

GC carries a run-time overhead that is not affordable to many current C++ applications running on current hardware.

Many GC techniques imply service interruptions that are not acceptable for important classes of applications (e.g. real-time, control, human interface on slow hardware, OS kernel).

Many GC techniques carry a large fixed overhead compared to non-GC techniques. Remember, not every program needs to run forever; memory leaks are quite acceptable in many applications; and many applications can manage their memory without GC and without relatively high-overhead GC-like techniques such as reference counting. Some such applications are high performance applications where overhead from unneeded GC is unacceptable.

Some applications do not have the hardware resources of a traditional general-purpose computer.

Some GC schemes require banning of several basic C operations (e.g. p+1, a[i], printf()).

I know that you can find more reasons for and against, but no further reasons are needed. I do not think you can find sufficient arguments that EVERY application would be better done with GC without restricting the set of applications you consider. Similarly, I don't think you can find sufficient arguments that NO application would be better done with GC without restricting the set of applications you consider.

My conclusion (as you can find in "The C++ Programming Language" (even the first edition) and also in the ARM) is that GC is desirable in principle and feasible, but for current users, current uses, and current hardware we can't afford to make the semantics of C++ and of its most basic standard libraries dependent on GC.

But mustn't GC be guaranteed in "The Standard" to be useful?

We don't have a scheme that is anywhere near ready for standardization. If the experimental schemes are demonstrated to be good enough for a wide enough range of real applications (hard to do, but necessary) and don't have unavoidable drawbacks that would make C++ an unacceptable choice for significant applications, implementors will scramble to provide the best implementations.

I expect that some of my programs will be using GC within a couple of years and that some of my programs will still not be using GC at the turn of the century.

I am under no illusion that building an acceptable GC mechanism for C++ will be easy - I just don't think it is impossible. Consequently, given the number of people looking at the problem, several solutions will emerge and hopefully we'll settle on a common scheme at the end.

Q: What methodology do you use when designing C++ programs?

A: That depends what kind of problem I'm trying to solve. For small programs I simply doodle a bit on the back of an envelope or something, for larger issues I get more formal, but my primary "tool" is a blackboard and a couple of friends to talk the problems and the possible solutions over with. Have a look at chapters 11 and 12 in "The C++ Programming Language (2nd Edition)" for a much more detailed explanation of my ideas (which naturally are based on experience from the various projects I have been involved in).

Q: Tools nearly always lag behind the development of a 'new' language, in what areas do you feel that the C++ development world is being deprived of suitable software tools?

A: Actually, I think that in the case of C++ tools are lacking less than education is. C++ isn't just a new syntax for expressing the same old ideas - at least not for most programmers. This implies a need for education, rather than mere training. New concepts have to be learned and mastered through practice. Old and well-tried habits of work have to be re-evaluated, and rather than dashing off and doing things "the good old way" new ways have to be considered - and often doing things a new way will be harder and more time-consuming than the old way - when tried for the first time.

The overwhelming experience is that taking the time and making the effort to learn the key data abstraction and object-oriented techniques is worth while for almost all programmers and yields benefits not just in the very long run but also on a three to twelve month time scale.

There are benefits in using C++ without making this effort, but most benefits require the extra effort to learn new concepts - I would wonder why anyone not willing to make that effort would switch to C++.

When approaching C++ for the first time, or for the first time after some time, take the time to read a good textbook or a few well-chosen articles (the C++ Report and the C++ Journal contain many). Maybe also have a look at the definition or the source code of some major library and consider the techniques and concepts used. This is also a good idea for people who have used C++ for some time. Many could do with a review of the concepts and techniques. Much has happened to C++ and its associated programming and design techniques since C++ first appeared. A quick comparison of the 1st and the 2nd edition of "The C++ Programming Language" should convince anyone of that.

As for tools (says he, getting off his hobby horse :-), I think that what we are seeing today will look rather primitive in a few years' time. In particular, we need many more tools that actually understand C++ (both the syntax and the type system) and can use that knowledge. Currently, most tools know only a little bit about syntax or about the stream of executable instructions.

Eventually, we'll have editors that can navigate through a program based on the logical structure of a program rather than the lexical layout, be able to click on a + and instantly be told what it resolves to under overload resolution, and have re-compilation be incremental with a small grain. Such an environment would make what you can currently get for languages such as Lisp and Smalltalk look relatively primitive by taking advantage of the wealth of information available in the structure of a C++ program.

Let's not get greedy, though. C++ was designed to be useful in tool poor environments and even in a traditional Unix or DOS environment it is more than a match for many alternatives for many applications. Environments and tools are nice, and we'll eventually get great ones, but for much C++ at least they are not essential.

Q: With all the advantages of C++, do you think the use of ANSI C will decline?

A: In a sense, yes. You can't buy an ANSI C compiler for the PC market any more except as an option on a C++ compiler. I expect that over the years we'll see a gradual adoption of C++ features even by the most hard-core C fanatics. The C++ features are now available in the environments C programmers use, they work, and they are efficient. C programmers would be silly not to take advantage of the C++ features that are helpful to them.

Note that I'm not preaching some OO religion. C++ is a pragmatic language and is best approached in a pragmatic manner: Use the parts of it that are useful to you and leave the rest for later when it might come in handy. I strongly prefer sceptics to "true believers." Naturally, I recommend "The C++ Programming Language (2nd edition)" as the main help in understanding C++ and its associated techniques. It contains a lot of practical information and advice - on programming, on the language, and on design - and very little hype and preaching. I think too many C++ texts push a particular limited view of what C++ is or aim at delivering only a shallow understanding of C++.

To gain really major benefits from C++ you have to invest a certain amount of effort in learning the new concepts. Writing C or Pascal with C++ syntax gives some benefits, but the greatest gains come from understanding the abstraction techniques and the language features that support them. Just being able to parrot the OO buzzwords doesn't do the trick either. The nice thing about C++ in this context is that you can learn it incrementally and can get benefits proportionally to your effort in learning it. You don't have to first learn everything and only then start reaping benefits.

Q: I feel that C++ should be (like C) "lean and mean" and some of the additions (such as RTTI) will be adding layers of "fat" to the language. Do these extensions impose a penalty on the C++ community even if no use is made of them?

A: C++ is lean and mean. The underlying principle is that you don't pay for what you don't use. RTTI and even exception handling can be implemented to follow this dictum - strictly. In the case of RTTI the simple and obvious implementation is to add two pointers to each virtual table (that is a fixed storage overhead of 8 bytes per class with virtual functions) and no further overhead unless you explicitly use some aspect of RTTI. In my UNIX implementation, those two words have actually been allocated in the vtbl "for future enhancements" since 1987!

When you start using dynamic casts the implementation needs to allocate objects representing the classes. In my experimental implementation those were about 40 bytes per class with virtual functions and you can do better yet in a production system. That doesn't strike me as much when you take into account that you only get the overhead if you explicitly use the facilities. Presumably, you'd have to fake the features if you wanted them and the language didn't support them and that's more expensive in my experience. One reason for accepting RTTI was the observation that most of the major libraries did fake RTTI in incompatible and unnecessarily expensive ways. The run-time overhead of an unoptimized dynamic cast is one function call per level of derivation between the base class known and the derived class you are looking for.

One thing people really should remember is that a design that relies on static type checking is usually better (easier to understand, less error-prone, and more efficient) than one relying on dynamic type checking. RTTI is for the relatively few (but often important) cases where C++'s static type system isn't sufficient. If you start using RTTI to simulate Smalltalk or CLOS in C++ you probably haven't quite understood the problem or C++.
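
The contrast between the two styles can be sketched as follows. The `Shape` hierarchy, the member `edge`, and the functions `total_sides` and `edge_or_zero` are all hypothetical names chosen for this illustration; the example only assumes a class with virtual functions, which is what `dynamic_cast` requires.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() {}
    virtual int sides() const = 0;        // statically checked interface
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

struct Square : Shape {
    int sides() const override { return 4; }
    int edge = 2;                         // something only a Square has
};

// Preferred: the virtual call needs no knowledge of the concrete type.
int total_sides(const Shape& a, const Shape& b)
{
    return a.sides() + b.sides();
}

// RTTI for the rarer case where the base interface cannot express the
// question. The cast is checked at run time and yields a null pointer
// when s is not a Square.
int edge_or_zero(const Shape& s)
{
    const Square* sq = dynamic_cast<const Square*>(&s);
    return sq ? sq->edge : 0;
}
```

`total_sides` keeps working unchanged when new shapes are added, whereas every use in the style of `edge_or_zero` has to be revisited, which is one reason to reserve the dynamic cast for the few cases that need it.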

Q: A lot of programmers (and members of the press) envisage C++ as a language developed solely for the development of GUI products, and that it has no place in the "normal" (whatever that may be) programming arena due to its complexity. I, on the other hand, think that C++ is the best all-round programming language ever invented and should be used for every programming task. A middle ground obviously exists, but what tasks do you see C++ as being best suited for?

A: They are plain wrong. C++ was designed for applications that had to work under the most stringent constraints of run-time and space efficiency. Those were the kinds of applications where C++ first thrived: operating system kernels, simulations, compilers, graphics, real-time control, etc. This was done in direct competition with C. Current C++ implementations are a bit faster yet.

Also, C++ appears much more complex to a language lawyer trying to understand every little detail than to a programmer looking for a tool to solve a problem. There are no prizes (except maybe booby prizes) for using the largest number of C++ features.

The way to approach a problem with C++ is to decide which classes you need to represent the concepts of your application and then express them as simply and as straight-forwardly as possible. Most often you need only relatively simple features used in simple ways. Often, much of the complexity is hidden in the bowels of libraries.

Q: C++'s popularity seems to be accelerating at present. Do you think that other OO languages (such as Smalltalk, Actor and Eiffel) and other hybrids like OO-COBOL will make an impact on the growth of C++?

A: I don't think so. Compared to C++, they are niche languages. They all have their nice aspects but none have C++'s breadth of applicability or C++'s efficiency over a wide range of application areas and programming styles. Smalltalk seems to have a safe ecological niche in prototyping and highly dynamic individual projects. It ought to thrive. If OO-COBOL takes off it also ought to have a ready-made user base.

Q: I see parallel processors becoming more widely available to programmers in the near future. How easy will it be to use C++ in the parallel programming environment?

A: Parallel processors are becoming more common, but so are amazingly fast single-processors. This implies the need for at least two forms of concurrency: multi-threading within a single processor, and multi-processing with several processors. In addition, networking (both WAN and LAN) imposes its own demands. Because of this diversity I recommend parallelism be represented by libraries within C++ rather than as a general language feature. Such a feature, say something like Ada's tasks, would be inconvenient for almost all users. It is possible to design concurrency support libraries in C++ that approach built-in concurrency support in both convenience of use and efficiency. By relying on libraries, you can support a variety of concurrency models, though, and thus serve the users that need those different models better than can be done by a single built-in concurrency model. I expect this will be the direction that will be taken by most people and that the portability problems that will arise when several concurrency-support libraries are used within the C++ community can be dealt with by a thin layer of interface classes.

Many thanks for your time Bjarne, I'm sure my readers will enjoy reading your comments.
