Digging a Ditch

Overload Journal #66 - Apr 2005 + Programming Topics   Author: Paul Grenyer

Writing a custom stream is easy! Most people are now entirely comfortable using std::vector and std::list, and know the difference between a std::map and a std::set. However, the use and extension of the C++ standard library's streams is still considered difficult.

In this article I am going to look at writing a logging stream. A logging stream inserts the current date and time at the beginning of a buffer full of characters when it is flushed. The buffer is flushed to another stream which can modify the characters further or write them, for example, to the console (std::cout) or to a file (std::ofstream).

In section 13.13.3 of The C++ Standard Library [Josuttis] Nico Josuttis discusses how to write a custom stream in a fair amount of detail. Even though the book is widespread among developers, the section on streams does not appear to be widely read. Therefore in this article I am going to follow reasonably closely the line that Josuttis takes, but will cut out a lot of the unnecessary background which may scare the people who, wrongly, feel it must be read and understood before embarking on a custom stream. I will also discuss and resolve a potential initialisation problem not explored by Josuttis.

Stream Buffer

The heart of a stream is its buffer. Buffer is a misnomer as it does not have to buffer at all and can, if it so chooses, process the characters immediately.

Along with buffering, if required, the stream buffer does all the reading and writing of characters for the stream. The standard library provides std::basic_streambuf as a base class for stream buffers. Listing 1 shows a stream buffer that converts all the characters streamed to it to upper case and writes them with putchar:

#include <streambuf>
#include <locale>
#include <cstdio>

template<class charT,
         class traits = std::char_traits<charT> >
class outbuf
     : public std::basic_streambuf<charT, traits> {
private:
  typedef typename std::basic_streambuf<charT,
                        traits>::int_type int_type;

  // Central output function.
  // - print characters in uppercase.
  virtual int_type overflow(int_type c) {

    // Check character is not EOF
    if(!traits::eq_int_type(c, traits::eof())) {

      // Convert character to uppercase.
      c = std::toupper<charT>(c,
                        std::basic_streambuf<charT,
                                traits>::getloc());
      // Write character to standard output
      if(putchar(c) == EOF) {
        return traits::eof();
      }
    }

    return traits::not_eof(c);
  }
};

Listing 1: Example stream buffer

The overflow member function of std::basic_streambuf is called for each character that is sent to the stream buffer. Overriding it allows the behaviour to be modified. The example in Listing 1 above performs the following for each character sent to overflow:

  1. The character is tested to make sure it is not an indication of the end of a file or an error.

  2. The character is converted to uppercase.

  3. The character is written to standard out. If an error occurs while writing the character this is indicated by returning traits::eof().

  4. An indication of whether or not the character represents the end of a file or an error is returned.

Traits are used throughout Listing 1 to ensure that EOF is detected and handled correctly. Streams can be used with any character type that has a corresponding set of character traits. A detailed knowledge of character traits is not required when using the built in character types char and wchar_t as their traits are already part of the standard library. Character traits are discussed in 14.1.2 of Josuttis.
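The traits calls used in Listing 1 can be tried out in isolation. The following is a minimal sketch for char only; the helper names is_eof and success_value are illustrative and do not appear in the listings.

```cpp
#include <string>   // std::char_traits

typedef std::char_traits<char> traits_type;
typedef traits_type::int_type  int_type;

// True when c is the end-of-file marker rather than a character.
bool is_eof(int_type c) {
  return traits_type::eq_int_type(c, traits_type::eof());
}

// The value overflow returns on success: c itself, unless c is
// eof(), in which case not_eof yields some value other than eof().
int_type success_value(int_type c) {
  return traits_type::not_eof(c);
}
```

An ordinary character passes through success_value unchanged, while eof() is mapped to a value that is_eof rejects. This is why overflow can return traits::not_eof(c) to signal success even when it was handed eof().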

Output Stream

The easiest way to use a stream buffer is to pass it to an output stream as shown in Listing 2 below:

#include <streambuf>
#include <ostream>
#include <locale>
#include <cstdio>

template<class charT,
         class traits = std::char_traits<charT> >
class outbuf : public std::basic_streambuf<charT,
                                          traits> {
private:
  typedef typename std::basic_streambuf<charT,
                        traits>::int_type int_type;

  // Central output function.
  // - print characters in uppercase.
  virtual int_type overflow(int_type c) {
    // Check character is not EOF
    if(!traits::eq_int_type(c, traits::eof())) {
      // Convert character to uppercase.
      c = std::toupper<charT>(c,
                        std::basic_streambuf<charT,
                                traits>::getloc());
      // Write character to standard output
      if(putchar(c) == EOF) {
        return traits::eof();
      }
    }

    return traits::not_eof(c);
  }
};

int main() {
  outbuf<char> ob;
  std::basic_ostream<char> out(&ob);
  out << "31 hexadecimal: "
      << std::hex
      << 31 << std::endl;
  return 0;
}

Listing 2: Passing a stream buffer to an output stream

The output from the example in Listing 2 is:

31 HEXADECIMAL: 1F

The example in Listing 2 demonstrates a working stream, but is not an ideal solution as the stream buffer must be declared separately from the stream itself. A common solution is to create a subclass of std::basic_ostream with the stream buffer as a member which can be passed to the std::basic_ostream constructor as shown in Listing 3:

template<class charT,
         class traits = std::char_traits<charT> >
class ostream
       : public std::basic_ostream<charT, traits> {
private:
  outbuf<charT, traits> buf_;

public:
  ostream() : std::basic_ostream<charT,
                          traits>(&buf_), buf_() {}
};

Listing 3: Subclass of std::basic_ostream

Having the stream buffer as a member introduces a potential initialisation problem. The solution to the problem introduces a further problem hidden deep within the C++ standard [Standard]. However, this second problem is also easily fixed.

Problem 1

If the stream buffer is dereferenced in std::basic_ostream's constructor or in its destructor, undefined behaviour can occur as the stream buffer will not have been initialised. At least one well known and widely used standard library implementation does nothing to avoid this and does not need to. Library implementers know their stream implementations and whether or not protection is needed. We, as stream extenders writing for potentially any number of different stream implementations, do not. There is no guarantee in the C++ standard to fall back on either.

Josuttis places the buffer before std::basic_ostream's constructor in the initialisation list, which makes no difference at all as stated in 12.6.2/5 of the C++ standard:

Initialization shall proceed in the following order:

  • First, and only for the constructor of the most derived class as described below, virtual base classes shall be initialized in the order they appear on a depth-first left-to-right traversal of the directed acyclic graph of base classes, where "left-to-right" is the order of appearance of the base class names in the derived class base-specifier-list.

  • Then, direct base classes shall be initialized in declaration order as they appear in the base-specifier-list (regardless of the order of the mem-initializers).

  • Then, nonstatic data members shall be initialized in the order they were declared in the class definition (again regardless of the order of the mem-initializers).

  • Finally, the body of the constructor is executed.

Note: the declaration order is mandated to ensure that base and member subobjects are destroyed in the reverse order of initialization.
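The order quoted above can be observed with a small experiment. This is only a sketch: Base, Member, Derived, init_log and construction_order are hypothetical names standing in for std::basic_ostream, the stream buffer and the ostream subclass of Listing 3.

```cpp
#include <string>

// Records the order in which the subobjects are constructed.
std::string& init_log() {
  static std::string log;
  return log;
}

struct Member {
  Member() { init_log() += "member "; }
};

struct Base {
  // Receives the address of a member that is not yet constructed.
  // The pointer is only stored or ignored here, never dereferenced.
  explicit Base(Member*) { init_log() += "base "; }
};

struct Derived : Base {
  Member m_;
  Derived() : Base(&m_), m_() { init_log() += "derived"; }
};

// Builds a Derived and reports what was constructed, in order.
std::string construction_order() {
  init_log().clear();
  Derived d;
  (void)d;
  return init_log();
}
```

construction_order returns "base member derived": the base class constructor runs, holding a pointer to the member, before the member itself has been constructed. That is exactly the situation in Listing 3.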

The fact that the stream buffer is not initialised before it is passed to std::basic_ostream's constructor may not cause a problem with your compiler and library, but why risk it when there is a simple and straightforward solution? On the other hand, it may fail in a screaming fit immediately. Moving the stream buffer to a private base class which is initialised before std::basic_ostream solves the problem nicely. The initialisation order of base classes is specified as stated in 12.6.2/5 above. Listing 4 shows the base class which is used to initialise the stream buffer and how to use it with the output stream.

template<class charT,
         class traits = std::char_traits<charT> >
struct outbuf_init {

private:
  outbuf<charT, traits> buf_;

public:
  outbuf<charT, traits>* buf() {
    return &buf_;
  }
};

template<class charT,
         class traits = std::char_traits<charT> >
class ostream : private outbuf_init<charT, traits>, 
         public std::basic_ostream<charT, traits> {

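private:
  // Alias for the base class. It is deliberately not called
  // outbuf_init, so that name keeps one meaning in this class
  // (some compilers reject a typedef that reuses the name).
  typedef outbuf_init<charT, traits> buf_init;

public:
    ostream() : buf_init(),
      std::basic_ostream<charT,
                     traits>(buf_init::buf()) {}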
};

Listing 4: Initialising the stream buffer

Problem 2

basic_ios is a virtual base class of basic_ostream. The C++ standard (27.4.4/2) describes its constructor as follows:

Effects: Constructs an object of class basic_ios (27.4.2.7) leaving its member objects uninitialized. The object must be initialized by calling its init member function. If it is destroyed before it has been initialized the behavior is undefined.

basic_ios::init is called from within basic_ostream's constructor. This is where things get complicated. As basic_ios is a virtual base class of basic_ostream, the objects which make up an ostream object are initialised in the following order (see 12.6.2/5):

...
basic_ios
outbuf
outbuf_init
basic_ostream
ostream

Therefore the constructors of basic_ios and outbuf are both called before the constructor of basic_ostream and therefore before basic_ios::init is called. This means that if the outbuf constructor throws an exception, basic_ios's destructor will be called before basic_ios::init, resulting in the undefined behaviour described in 27.4.4/2.

The answer to this problem is contained within 12.6.2/5 and is very simple. Making ostream inherit virtually, as well as privately, from outbuf_init causes it to be constructed before anything else:

template<class charT,
         class traits = std::char_traits<charT> >
class ostream
      : private virtual outbuf_init<charT, traits>, 
        public std::basic_ostream<charT, traits> {

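private:
  // Alias for the base class. It is deliberately not called
  // outbuf_init, so that name keeps one meaning in this class
  // (some compilers reject a typedef that reuses the name).
  typedef outbuf_init<charT, traits> buf_init;

public:
  ostream()
      : buf_init(),
       std::basic_ostream<charT,
                    traits>(buf_init::buf()) {}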
};

The initialisation order then becomes:

outbuf
outbuf_init
...
basic_ios
basic_ostream
ostream

Now, if outbuf does throw an exception there is no undefined behaviour, as the basic_ios has not yet been created.
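The effect of virtual inheritance on the construction order can be checked with stand-in classes. In this sketch Ios, BufInit and Stream are hypothetical substitutes for basic_ios, outbuf_init and basic_ostream; PlainStream, SafeStream, order_log and order_of are likewise illustrative names.

```cpp
#include <string>

// Records the order in which the subobjects are constructed.
std::string& order_log() {
  static std::string log;
  return log;
}

struct Ios     { Ios()     { order_log() += "ios "; } };
struct BufInit { BufInit() { order_log() += "bufinit "; } };

// Like basic_ostream, Stream has a virtual base.
struct Stream : virtual Ios {
  Stream() { order_log() += "stream "; }
};

// Non-virtual private base: the virtual base Ios still comes first.
struct PlainStream : private BufInit, public Stream {
  PlainStream() { order_log() += "derived"; }
};

// Virtual private base listed first: it now precedes Ios.
struct SafeStream : private virtual BufInit, public Stream {
  SafeStream() { order_log() += "derived"; }
};

// Builds a T and reports what was constructed, in order.
template<class T>
std::string order_of() {
  order_log().clear();
  T object;
  (void)object;
  return order_log();
}
```

order_of<PlainStream>() gives "ios bufinit stream derived", whereas order_of<SafeStream>() gives "bufinit ios stream derived". With the virtual base, a constructor that throws in BufInit does so before any Ios object exists.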

ostream can be made easier to use by introducing a couple of simple typedefs for common character types:

typedef ostream<char> costream;
typedef ostream<wchar_t> wostream;

int main() {
  costream out;
  out << "31 HEXADECIMAL: " << std::hex
      << 31 << std::endl;
  return 0;
}

Listing 5: Typedefs for using ostream

That completes the implementation for the simplest possible custom stream.

Logging Stream Buffer

The previous example of a stream buffer was very basic, potentially inefficient and didn't actually buffer the characters streamed to it. The logging stream mentioned at the start of this article requires the characters to be buffered. When the buffer is flushed the time and date are prepended before it is passed on to the next stream.

Josuttis also has an example of a buffered stream buffer. However, his example uses a fixed array for a buffer that gets flushed when it is full. The logging stream should only flush the buffer when instructed to do so, with a std::endl or a call to flush. To accomplish this, the fixed array can be replaced with a std::vector.

As already mentioned the logging stream simply buffers the characters streamed to it and passes them on to another stream, preceded by a time and date, when flushed. Therefore the stream buffer must contain some form of reference to the other stream.

Listing 6 shows a basic implementation for the logging stream buffer. A std::vector based buffer has been introduced and overflow modified to check for EOF before inserting its character into the buffer.

#include <streambuf>
#include <vector>

template<class charT,
         class traits = std::char_traits<charT> >
class logoutbuf
     : public std::basic_streambuf<charT, traits> {
private:
  typedef typename std::basic_streambuf<charT,
                        traits>::int_type int_type;
  typedef std::vector<charT> buffer_type;
  buffer_type buffer_;

  virtual int_type overflow(int_type c) {
    if(!traits::eq_int_type(c, traits::eof())) {
      buffer_.push_back(c);
    }
    return traits::not_eof(c);
  }
};

Listing 6: Basic implementation of logging stream buffer

As it stands the stream buffer in Listing 6 only buffers characters; it never flushes them. A pointer to an output stream buffer that the characters can be flushed to is required.

The initialisation and undefined behaviour fixes described in the previous section have the side effect that logoutbuf will be a member of a virtual base class and therefore should have a default constructor. A virtual base class constructor must be called, explicitly or implicitly, from the constructor of the most derived class (12.6.2/6), and a default constructor eliminates the need for explicit constructor calls. This in turn means that the output stream cannot be passed in through the constructor; a pointer to the output stream buffer must be stored instead and initialised by way of an initialisation function. This is not ideal, but it is a trade-off to guarantee safety elsewhere. The initialisation function is also in keeping with the buffer initialisation in basic_ios.

template<class charT,
         class traits = std::char_traits<charT> >
class logoutbuf
     : public std::basic_streambuf<charT, traits> {
private:
  typedef typename std::basic_streambuf<charT,
                        traits>::int_type int_type;

  typedef std::vector<charT> buffer_type;

  std::basic_streambuf<charT, traits>* out_;
  buffer_type buffer_;

public:
  logoutbuf() : out_(0), buffer_() {}
  // Takes the stream buffer to flush to, matching out_.
  void init(std::basic_streambuf<charT,
                                 traits>* out) {
    out_ = out;
  }
  ...
}; 

Listing 7: Initialising the output stream buffer

Listing 7 shows the logoutbuf stream buffer with the output stream buffer pointer and initialisation function. A constructor has also been added to make sure that the output stream buffer pointer is initialised to 0, so that it can be reliably checked before characters are sent to it.

When basic_ostream::flush is called, either directly or via std::endl, it starts a chain of function calls that finally results in basic_streambuf::sync being called. This is where the buffer should be flushed. The buffer should also be flushed when a logoutbuf object is destroyed, so sync should also be called from the logoutbuf destructor.

template<class charT,
         class traits = std::char_traits<charT> >
class logoutbuf
     : public std::basic_streambuf<charT, traits> {
  ...
public:
  ...
  ~logoutbuf() {
    sync();
  }
  ...

private:
  ...
  virtual int sync() {
    if(!buffer_.empty() && out_) {
      out_->sputn(&buffer_[0],
                  static_cast<std::streamsize>
                                 (buffer_.size()));
      buffer_.clear();
    }
    return 0;
  }
}; 

Listing 8: Synchronising the buffer

Listing 8 shows the implementation of the sync function. It checks the buffer to make sure there is something in it to flush and then checks the output stream buffer pointer to make sure the pointer is valid. The contents of the buffer are then sent to the output stream buffer, via its sputn function, and then cleared.
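That flush and std::endl both end up in sync can be confirmed with a stream buffer that does nothing except count. This is a sketch for char only; counting_buf and syncs_after_endl_and_flush are hypothetical names introduced for the demonstration.

```cpp
#include <ostream>
#include <streambuf>

// Stream buffer that discards its characters and counts sync calls.
class counting_buf : public std::streambuf {
public:
  counting_buf() : syncs_(0) {}
  int syncs() const { return syncs_; }

private:
  virtual int_type overflow(int_type c) {
    return traits_type::not_eof(c);   // accept and discard
  }

  virtual int sync() {
    ++syncs_;
    return 0;
  }

  int syncs_;
};

int syncs_after_endl_and_flush() {
  counting_buf buf;
  std::ostream out(&buf);
  out << "no flush yet";   // plain insertion: sync is not called
  out << std::endl;        // newline followed by flush: first sync
  out.flush();             // direct flush: second sync
  return buf.syncs();
}
```

The function returns 2: one sync for std::endl and one for the explicit flush, and none for the plain insertion.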

basic_streambuf's sputn function takes an array of characters as its first parameter and the number of characters in the array as its second parameter. std::vector stores its elements contiguously in memory, like an array, so the address of the first element in the buffer can be passed as sputn's first parameter. std::vector's size function is used to determine the number of elements in the buffer and can therefore be used as sputn's second parameter. The type of sputn's second argument is the implementation defined typedef std::streamsize. As the return type of std::vector::size is also implementation defined (and not necessarily the same type), sputn's second parameter must be cast to avoid warnings from compilers such as Microsoft Visual C++. There is a possibility that the number of characters stored in the buffer will be greater than std::streamsize can hold, but this is highly unlikely.
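The sputn call can be exercised on its own by flushing a std::vector into a std::stringbuf, whose contents are easy to inspect. The function name flush_vector_to_stringbuf is an assumption made for this sketch, and std::stringbuf merely stands in for whatever buffer out_ points to.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Copies text into a vector, then hands the vector's storage to
// sputn in a single call, as sync does in Listing 8.
std::string flush_vector_to_stringbuf(const std::string& text) {
  std::vector<char> buffer(text.begin(), text.end());
  std::stringbuf sink;

  // &buffer[0] is only valid when the vector is not empty.
  if(!buffer.empty()) {
    sink.sputn(&buffer[0],
               static_cast<std::streamsize>(buffer.size()));
  }
  return sink.str();
}
```

The emptiness check matters as much here as it does in sync: taking the address of the first element of an empty vector is not allowed.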

logoutbuf is now a fully functioning, buffered output stream buffer and can be plugged into a basic_ostream object and tested.

...
int main() {
  logoutbuf<char> ob;
  ob.init(std::cout.rdbuf());

  // Flush to std::cout
  std::basic_ostream<char> out(&ob);
  out << "31 hexadecimal: " << std::hex
      << 31 << std::endl;
  return 0;
}

Listing 9: Using logoutbuf

Listing 9 creates a logoutbuf object, sets std::cout's stream buffer as its output stream buffer and then passes it to a basic_ostream object, which then has characters streamed to it. The output from the example in Listing 9 is:

31 hexadecimal: 1f

The next step is to generate the time and date that will be flushed to the output stream buffer prior to the contents of the logoutbuf buffer. The different ways of generating a date and time string are beyond the scope of this article so I am providing the following implementation, which will handle both char and wchar_t character types, without any explanation beyond the comments in the code:

#include <streambuf>
#include <vector>
#include <ctime>
#include <string>
#include <sstream>

...

template<class charT,
         class traits = std::char_traits<charT> >
class logoutbuf
     : public std::basic_streambuf<charT, traits> {
  ...
private:
  std::basic_string<charT, traits> format_time() {
    // Get current time and date
    time_t ltime;
    time(&ltime);

    // Convert time and date to string
    std::basic_stringstream<charT, traits> time;
    time << asctime(gmtime(&ltime));

    // Remove LF from time date string and
    // add separator
    std::basic_stringstream<charT, traits> result;
    result << time.str().erase(
                 time.str().length() - 1) << " - ";

    return result.str();
  }
  ...

  virtual int sync() {
    if(!buffer_.empty() && out_) {
      const std::basic_string<charT, traits> time
                                   = format_time();
      out_->sputn(time.c_str(),
                  static_cast<std::streamsize>
                                  (time.length()));
      out_->sputn(&buffer_[0],
                   static_cast<std::streamsize>
                                 (buffer_.size()));
      buffer_.clear();            
    }
    return 0;
  }
  ...   
};

Listing 10: Adding date and time

The sync function in Listing 10 now sends a date and time string (plus the separator) to the output stream buffer before flushing the logoutbuf buffer. The result of running the example from Listing 9 is now:

Wed Apr 20 16:00:00 2005 - 31 hexadecimal: 1f

logoutbuf is now fully functional, but there is a further modification that can be made for the sake of efficiency. Currently overflow is called for every single character streamed to the stream buffer. This means that streaming the "31 hexadecimal: " string literal to the stream buffer involves 16 separate function calls. This can be reduced to a single function call by overriding xsputn.

...
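#include <algorithm>
#include <iterator>

template<class charT,
         class traits = std::char_traits<charT> >
class logoutbuf
     : public std::basic_streambuf<charT, traits> {
  ...
private:
  ...
  // charT is used rather than char_type, which is a dependent
  // name and is not found unqualified inside the template.
  virtual std::streamsize xsputn(const charT* s,
                             std::streamsize num) {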
    std::copy(s, s + num,
         std::back_inserter<buffer_type>(buffer_));
    return num;
  }
  ...
};

Listing 11: Overriding xsputn

xsputn takes the same parameters as basic_streambuf::sputn and uses the std::copy algorithm together with std::back_inserter to insert the characters from the array into the buffer. logoutbuf is now complete.
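The saving can be measured by counting virtual calls. The sketch below is char only and calls sputn directly on the buffer, which is an assumption about how a stream inserts a string literal (common library implementations do so, but the article does not say). per_char_buf, block_buf and calls_for are hypothetical names.

```cpp
#include <streambuf>
#include <string>
#include <vector>

// Overrides overflow only, so every character costs one call.
class per_char_buf : public std::streambuf {
public:
  per_char_buf() : calls_(0) {}
  int calls() const { return calls_; }

protected:
  virtual int_type overflow(int_type c) {
    ++calls_;
    if(!traits_type::eq_int_type(c, traits_type::eof())) {
      buffer_.push_back(traits_type::to_char_type(c));
    }
    return traits_type::not_eof(c);
  }

  int calls_;
  std::vector<char> buffer_;
};

// Also overrides xsputn, so a whole array costs one call.
class block_buf : public per_char_buf {
private:
  virtual std::streamsize xsputn(const char* s,
                                 std::streamsize num) {
    ++calls_;
    buffer_.insert(buffer_.end(), s, s + num);
    return num;
  }
};

// Sends text to a fresh Buf and reports the virtual calls made.
template<class Buf>
int calls_for(const std::string& text) {
  Buf buf;
  buf.sputn(text.data(),
            static_cast<std::streamsize>(text.size()));
  return buf.calls();
}
```

For the 16 character string "31 hexadecimal: ", calls_for<per_char_buf> reports 16 calls and calls_for<block_buf> reports 1.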

logoutbuf does of course require its own logoutbuf_init class and basic_ostream subclass, with a few modifications:

template<class charT,
         class traits = std::char_traits<charT> >
class logoutbuf_init {
private:
  logoutbuf<charT, traits> buf_;

public:
  logoutbuf<charT, traits>* buf() {
    return &buf_;
  }
};

template<class charT,
         class traits = std::char_traits<charT> >
class logostream
       : private virtual logoutbuf_init<charT,
                                        traits>,
         public std::basic_ostream<charT, traits> {
private:
  // Alias for the base class. It is deliberately not called
  // logoutbuf_init, so that name keeps one meaning here.
  typedef logoutbuf_init<charT, traits> buf_init;
public:
  logostream(std::basic_ostream<charT,
                                traits>& out)
      : buf_init(),
        std::basic_ostream<charT,
                   traits>(buf_init::buf()) {
    buf_init::buf()->init(out.rdbuf());
  }
};

typedef logostream<char> clogostream;
typedef logostream<wchar_t> wlogostream; 

Listing 12: logoutbuf_init class and basic_ostream subclass

The logoutbuf_init class is actually the same as the one from the previous section; it's the logostream that is slightly different. The constructor takes a single parameter, which is the output stream, and its body passes that stream's buffer to logoutbuf via init (suddenly the trade-off doesn't seem so bad).

The final test example is shown in Listing 13:

...
int main() {
  clogostream out(std::cout);
  out << "31 hexadecimal: " << std::hex
      << 31 << std::endl;
  return 0;
}

Listing 13: Using the stream
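For readers who want everything in one place, the fragments can be condensed into a single file. What follows is a sketch rather than the listings themselves: it is restricted to char, it uses the hypothetical names logbuf, logbuf_init, logstream and log_line_to_string, and it writes to a std::ostringstream instead of std::cout so that the result can be inspected.

```cpp
#include <ctime>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// char-only condensation of logoutbuf (Listings 6 to 11).
class logbuf : public std::streambuf {
public:
  logbuf() : out_(0), buffer_() {}
  ~logbuf() { sync(); }
  void init(std::streambuf* out) { out_ = out; }

private:
  // Current UTC date and time followed by a separator.
  static std::string format_time() {
    std::time_t now = std::time(0);
    std::string stamp(std::asctime(std::gmtime(&now)));
    if(!stamp.empty() && stamp[stamp.size() - 1] == '\n') {
      stamp.erase(stamp.size() - 1);   // drop asctime's newline
    }
    return stamp + " - ";
  }

  virtual int_type overflow(int_type c) {
    if(!traits_type::eq_int_type(c, traits_type::eof())) {
      buffer_.push_back(traits_type::to_char_type(c));
    }
    return traits_type::not_eof(c);
  }

  virtual std::streamsize xsputn(const char* s,
                                 std::streamsize num) {
    buffer_.insert(buffer_.end(), s, s + num);
    return num;
  }

  // Sends the time stamp, then the buffered characters.
  virtual int sync() {
    if(!buffer_.empty() && out_) {
      const std::string stamp = format_time();
      out_->sputn(stamp.data(),
                  static_cast<std::streamsize>(stamp.size()));
      out_->sputn(&buffer_[0],
                  static_cast<std::streamsize>(buffer_.size()));
      buffer_.clear();
    }
    return 0;
  }

  std::streambuf* out_;
  std::vector<char> buffer_;
};

// Holds the buffer so it is constructed before the stream.
struct logbuf_init {
  logbuf buf_;
};

class logstream : private virtual logbuf_init,
                  public std::ostream {
public:
  explicit logstream(std::ostream& out)
      : logbuf_init(), std::ostream(&buf_) {
    buf_.init(out.rdbuf());
  }
};

// Logs one line into a string so the result can be examined.
std::string log_line_to_string() {
  std::ostringstream sink;
  logstream out(sink);
  out << "31 hexadecimal: " << std::hex << 31 << std::endl;
  return sink.str();
}
```

The string returned by log_line_to_string starts with the current date and time and ends with " - 31 hexadecimal: 1f" followed by a newline. Note that the target stream must outlive the logstream, because the buffer's destructor may still flush to it.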

Conclusion

The stream buffer is clearly the heart of an output stream. The potential for a stream buffer being accessed before it is initialised is easily avoided, as is the possibility of undefined behaviour, with minimal trade-offs.

The buffering of characters streamed to a stream buffer is easily handled by a std::vector with no need for extra memory handling. Multiple characters can be added to a std::vector just as easily as single characters and the contiguous memory elements make it easy to flush to an output stream.

Writing a custom stream is easy! I believe this article shows just how easy it is, even with a minimum of background knowledge.

Acknowledgments

Alisdair Meredith, Alan Stokes, Jez Higgins, Alan Griffiths, Thaddaeus Frogley.

References

[Josuttis] Nicolai M. Josuttis, The C++ Standard Library, Addison-Wesley, ISBN: 0-201-37926-0.

[Standard] The C++ Standard, John Wiley and Sons Ltd, ISBN: 0-470-84674-7
