Writing Your Own Stream Manipulators

Still glides the Stream, and shall for ever glide;
The Form remains, the Function never dies.
William Wordsworth

Streams are typically more than just dumping or loading grounds for basic data. Their strength lies in their accommodation of new data types and customisation. As objects, their state may also be controlled via either a method or a manipulator interface; I am going to concentrate on manipulators, and I’ll wrap up the article with a worked example.

Simple manipulators

Data flow and control are the two basic operations on a stream. Returning references and the syntactic sugar of operator overloading offer a convenient way to chain together successive items for input or output in a single expression:

clog << reason << " (error " << errno << "): " << strerror(errno) << endl;

The last item in the above expression provides more control than data: a newline is inserted and the buffered output is flushed. In C++, using the endl stream manipulator is a more common idiom than using a naked '\n'. The equivalent expression without manipulators is slightly less elegant:

(clog << reason << " (error " << errno << "): " << strerror(errno) << '\n').flush();

Other simple manipulators include: ws, to act as a sink and eat whitespace from an istream; ends, to NUL terminate a string; flush, to flush an ostream; and hex, dec and oct, to convert input from and output to the appropriate base.

These are all simply function names. In addition to the insertion and extraction member operators for basic types, the istream and ostream classes both have operators that take a function pointer to execute on the current stream.
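The mechanism can be sketched with a toy class. This is only an illustration of the idea, not how any particular library implements it; the names toy_stream and toy_endl are invented for the example:

```cpp
#include <string>

// A toy output stream, purely illustrative: it collects characters in a string.
class toy_stream
{
public:
    toy_stream &operator<<(char c)        { text += c; return *this; }
    toy_stream &operator<<(const char *s) { text += s; return *this; }

    // The manipulator hook: accept a function pointer and
    // execute the function on the current stream.
    toy_stream &operator<<(toy_stream &(*manipulator)(toy_stream &))
    {
        return manipulator(*this);
    }

    std::string text;
};

// A manipulator is then simply a function with the matching signature.
toy_stream &toy_endl(toy_stream &out)
{
    return out << '\n';
}
```

Writing `s << "abc" << toy_endl` passes the address of toy_endl to the third operator, which calls it with the stream as its argument.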

Given this, it is possible to define your own manipulators to control layout and stream state: (See listing 1)

ostream &tab(ostream &out)
{
    return out << '\t';
}

ostream &title(ostream &out)
{
    return out << "\n"
                  "********************\n"
                  "* Objects R Us Ltd *\n"
                  "********************\n"
               << endl;
}

istream &eatline(istream &in)
{
    while(in && in.get() != '\n')
    {
    }
    return in;
}

Listing 1 - layout and state manipulators

These manipulators may be used both indirectly and directly:

cin >> eatline;
eatline(cin);

The indirect form offers the syntactic convenience of mixing control and data into a sequence of insertions or extractions from a stream.
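As a rough check that the two forms behave alike, the sketch below applies eatline both ways to a string stream. It restates tab and eatline with std:: qualification so that it stands alone; the helper functions are hypothetical and exist only for the demonstration:

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

std::ostream &tab(std::ostream &out)
{
    return out << '\t';
}

std::istream &eatline(std::istream &in)
{
    while(in && in.get() != '\n')
    {
    }
    return in;
}

// Hypothetical helper: skip the first line indirectly, then read a word
std::string word_after_line_indirect(const std::string &text)
{
    std::istringstream in(text);
    std::string word;
    in >> eatline >> word;
    return word;
}

// Hypothetical helper: the same, calling the manipulator directly
std::string word_after_line_direct(const std::string &text)
{
    std::istringstream in(text);
    std::string word;
    eatline(in) >> word;
    return word;
}

// Hypothetical helper: join two strings with the tab manipulator
std::string tabbed(const std::string &left, const std::string &right)
{
    std::ostringstream out;
    out << left << tab << right;
    return out.str();
}
```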

Parameterised manipulators

All the manipulators shown above are simple functions. They perform a single task well but without any flexibility. A standard IOStream implementation also provides a number of manipulators that take the form of function calls to control the stream. For instance, to set the output precision for floating point numbers:

cout << "result = "
     << setprecision(6)
     << result
     << endl;
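To see the effect in isolation, the following sketch formats a value into a string stream. With the default floating point format the precision counts significant digits. The helper name with_precision is an invention for this example:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a value using the given precision
std::string with_precision(double value, int digits)
{
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}
```

For example, with_precision(3.14159, 3) yields "3.14" and with_precision(1234.5678, 6) yields "1234.57".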

Using this approach you can implement more general control over the stream: (See listing 2)

cout << newline(5); //five newlines and a flush
cout << newpage(standard_title);
cin >> eat('*');

Listing 2 - more control
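One way newline might be realised is as a small class with its own insertion operator, one of the techniques discussed in this section. This is a sketch only; the article does not give an implementation, so the details here are assumptions:

```cpp
#include <ostream>
#include <sstream>
#include <string>

class newline
{
public:
    explicit newline(int how_many) : count(how_many) {}

    // Write the requested number of newlines, then flush
    std::ostream &operator()(std::ostream &out) const
    {
        for(int written = 0; written < count; ++written)
        {
            out << '\n';
        }
        return out.flush();
    }

private:
    int count;
};

std::ostream &operator<<(std::ostream &out, const newline &lines)
{
    return lines(out);
}
```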

There are a number of different methods for implementing these manipulators. The most obvious is to implement them as functions that return objects for which insertion or extraction operators are defined, e.g. (See listing 3)

class eatable
{
public:
    eatable(char);
} ;

istream &operator>>(istream &, eatable);

eatable eat(char to_eat)
{
    return eatable(to_eat);
}

Listing 3 - class eatable
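Listing 3 leaves the bodies out. A minimal completed sketch follows, assuming that eat(c) should discard input up to and including the first occurrence of c, and using friend-only access to keep eatable out of general use. Both choices are assumptions rather than part of the original listing:

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

class eatable
{
    // Only the factory function and the extraction operator may use the class
    friend eatable eat(char);
    friend std::istream &operator>>(std::istream &, eatable);

    explicit eatable(char to_eat) : stop(to_eat) {}
    char stop;
};

std::istream &operator>>(std::istream &in, eatable sink)
{
    // Assumed behaviour: discard everything up to and including the stop character
    in.ignore(std::numeric_limits<std::streamsize>::max(),
              std::char_traits<char>::to_int_type(sink.stop));
    return in;
}

eatable eat(char to_eat)
{
    return eatable(to_eat);
}
```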

Such classes are often not intended for general use and have mangled names or friend-only access. A more direct and, in my opinion, more general solution is to cut out the function and use objects directly. A constructor call to create a temporary object looks like a function call and has much the same effect:

class eat
{
public:
   eat(char);
   ...
};
istream &operator>>(istream &, eat);

The class based approach offers a more obvious route to creating custom manipulator objects:

eat line_sink('\n');
eat separator_sink(separator);
cin >> line_sink;
cin >> separator_sink;

Objects can act as pseudo-functions or functors. This is a powerful idiom deriving from use of the function call operator, so that direct application of manipulators on streams becomes possible:

line_sink(cin);
separator_sink(cin);

This is an idiom I will return to and explore in greater detail in future.
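A sketch of how the eat class might support both forms is given below. As before, the behaviour assumed for eat is to discard input up to and including the given character; the member name stop is illustrative:

```cpp
#include <istream>
#include <sstream>
#include <string>

class eat
{
public:
    explicit eat(char to_eat) : stop(to_eat) {}

    // Function call operator: allows direct application to a stream
    std::istream &operator()(std::istream &in) const
    {
        char next;
        while(in.get(next) && next != stop)
        {
        }
        return in;
    }

private:
    char stop;
};

// Extraction operator: allows use within a chain of extractions
std::istream &operator>>(std::istream &in, const eat &sink)
{
    return sink(in);
}
```

With these definitions a temporary, as in `in >> eat('*')`, and a named object applied directly, as in `comma_sink(in)`, share the same code.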

An example

Consider a manipulator that takes an integer and expands it out into word form, e.g.

cout << expand(-123) << endl;

will print out

minus one hundred and twenty three

The class holds a single value and is not designed as a base class, so the default copy constructor, assignment operator and destructor are usable — they may be inserted for completeness, but for the sake of brevity I will omit them from this article: (See listing 4)

class expand
{
public:
    expand(short to_expand) : value (to_expand) {}
    ostream &operator()(ostream &) const;
private:
    short value;
    static const char *const tens[];
    static const char *const units_and_teens[];
};

Listing 4 - class expand

The operator() member function does the hard work, using the literal constant tables tens (“ten” to “ninety”, starting from the first element rather than the zeroth) and units_and_teens (“zero” to “nineteen”). This makes the stream insertion operator quite trivial:

ostream &operator <<(ostream &out, expand number)
{
   return number(out);
}

Notice also that friendship of the class is not required for the insertion operator; the ability of an object to write itself to an ostream is a feature of the class and may be used separately. The algorithm for writing out the number is mutually recursive: operator() inserts further expand objects, and the insertion operator in turn calls operator().

You are welcome to try a non-recursive version if you wish, but you will gain little except a fuller understanding of why the simplest solution is recursive! For sensible output, this implementation assumes that shorts are no more than 16 bits wide: (See listing 5)

ostream &expand::operator()(ostream &out) const
{
    if(value < 0)
    {
        out << "minus " << expand(-value);
    }
    else if(value >= 1000)
    {
        out << expand(value / 1000) << " thousand";

        if(value % 1000)
        {
            if(value % 1000 < 100)
            {
                out << " and";
            }
            out << " " << expand(value % 1000);
        }
    }
    else if(value >= 100)
    {
        out << expand(value / 100) << " hundred";

        if( value % 100 )
        {
            out << " and " << expand (value % 100);
        }
    }
    else if(value >= 20)
    {
        out << tens[value / 10];

        if(value % 10)
        {
            out << " " << expand(value % 10);
        }
    }
    else
    {
        out << units_and_teens[value];
    }

    return out; // MM: not in original code
}

Listing 5
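For anyone wishing to try the code, the pieces can be assembled as follows. The table contents are not given in the listings, so they are filled in here from the description in the text, with an empty placeholder in the zeroth element of tens. The helper in_words is hypothetical and only captures the output as a string:

```cpp
#include <ostream>
#include <sstream>
#include <string>

class expand
{
public:
    expand(short to_expand) : value(to_expand) {}
    std::ostream &operator()(std::ostream &) const;
private:
    short value;
    static const char *const tens[];
    static const char *const units_and_teens[];
};

// Assumed table contents, following the description in the text
const char *const expand::tens[] =
{
    "", // unused: the table starts from the first element
    "ten", "twenty", "thirty", "forty", "fifty",
    "sixty", "seventy", "eighty", "ninety"
};

const char *const expand::units_and_teens[] =
{
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
    "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"
};

std::ostream &operator<<(std::ostream &out, expand number)
{
    return number(out);
}

std::ostream &expand::operator()(std::ostream &out) const
{
    if(value < 0)
    {
        out << "minus " << expand(-value);
    }
    else if(value >= 1000)
    {
        out << expand(value / 1000) << " thousand";
        if(value % 1000)
        {
            if(value % 1000 < 100)
            {
                out << " and";
            }
            out << " " << expand(value % 1000);
        }
    }
    else if(value >= 100)
    {
        out << expand(value / 100) << " hundred";
        if(value % 100)
        {
            out << " and " << expand(value % 100);
        }
    }
    else if(value >= 20)
    {
        out << tens[value / 10];
        if(value % 10)
        {
            out << " " << expand(value % 10);
        }
    }
    else
    {
        out << units_and_teens[value];
    }
    return out;
}

// Hypothetical helper: capture the expansion as a string
std::string in_words(short number)
{
    std::ostringstream out;
    out << expand(number);
    return out.str();
}
```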

Improvements

There are a number of obvious improvements and extensions that could be made but, for equally obvious reasons, have not been included:

increase the domain from short to long, keeping in mind that a 64 bit implementation will probably benefit from a more general table driven approach;
offer expansion for floating point numbers, using either an internal pointer to a separate object (emulating a ‘virtual constructor’) or a discriminated union to determine the correct output method for integers and floating point numbers;
locale dependency, remembering that not all languages follow the same approach to naming numbers as English: in French, for example, seventy-one is literally ‘sixty-eleven’;
create an input stream manipulator, contract, that reads in text to yield a number.

Kevlin Henney

kevlin@wslint.demon.co.uk
