DynamicAny, Part I

Overload Journal #86 - August 2008 + Programming Topics + Design of applications and programs   Author: Aleksandar Fabijanic
Alex Fabijanic presents a class hierarchy providing dynamic typing in standard C++.

Dynamic and static typing are competing forces acting upon the programming languages domain. A strong typing system, such as the one in C++, can be a bulletproof vest or a straitjacket, depending on the context. While C++ strong static typing is well-justified and useful, sometimes it is convenient or even necessary to circumvent it. Over time, various ways around it have been devised, on both the high and low ends of the abstraction spectrum (see note 1). Additionally, standard C++ offers dynamic and static polymorphism.

Dynamic languages have no notion of variable type. Values have type, while variables are type-agnostic. Hence, the type of a variable can change during its lifetime, depending on the value assigned to it. Clearly, in addition to the loss of static type safety, there is also a performance penalty associated with this convenience. There are, however, scenarios where a relaxed, dynamic type system is a desirable feature, even in a statically typed language like C++. The example that comes to mind first is the retrieval of structured data from an external source. Typically, the data will arrive in a variety of types. In a statically typed world, this implies the requirement of knowing the exact data types at compilation time. Additionally, every time the data types or layout change, the code must change as well. A way around this obstacle is through dynamic typing support.

This article describes the approach taken by the C++ Portable Components [POCOa] framework ('POCO' in further text) to implement safe and efficient dynamic typing capabilities within the confines of standard C++.

Any

Boost Libraries [Boost] contain multiple classes meant to alleviate the pains associated with static typing. Our focus here is on boost::any and a solution building on its design. Through clever construction and type erasure, boost::any is capable of storing any type. Both built-in and user-defined types are supported. A code example of boost::any usage is shown in Listing 1.

    std::list<boost::any> al;
    int i = 0;
    std::string s = "1";

    al.push_back(i);
    al.push_back(s);
  
Listing 1

However, boost::any is implemented in the type-safe spirit of C++. Run-time efficiency and strong typing are the constraints behind its design. Although it provides a mechanism for dynamic type discovery (Listing 2), boost::any does not itself get involved in such activity, nor is it willing to cooperate in implicit conversions between values of different types. Moreover, an attempt to extract a type other than the one held results in either an exception being thrown or a null pointer being returned (in the manner of standard C++ dynamic_cast).

    bool isInt(const boost::any& a) {
        return a.type() == typeid(int);
    }
  
Listing 2

A boost::any object is really convenient when one wants to pass around a variable of arbitrary type without having to worry about what type it actually is. The most common use is storing diverse types in an STL container. However, the true nature of boost::any is static and, at the point where the value is actually extracted, it is necessary to know precisely what type is held inside the any. While very 'soft' on the assignment side, on the extraction side boost::any is even more rigid than built-in C++ types - it is only possible to cast it back to its original type. The boost::any class has been ported to POCO (with some additions, see note 2), where it is known as Poco::Any. The design of this class has served as a foundation for the development of its dynamic cousin, Poco::DynamicAny, which is the main theme of this article.
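The rigidity on the extraction side can be illustrated with a minimal sketch. It uses std::any from C++17 as a stand-in, on the assumption that its any_cast behaves like that of boost::any for this purpose; the helper names intCastThrows and asString are hypothetical and exist only for illustration.

```cpp
#include <any>
#include <cassert>
#include <string>

// Returns true if extracting an int throws, i.e. the held type is not int.
bool intCastThrows(const std::any& a)
{
    try
    {
        (void)std::any_cast<int>(a);   // value form: throws on a type mismatch
        return false;
    }
    catch (const std::bad_any_cast&)
    {
        return true;
    }
}

// Pointer form: yields a null pointer on a mismatch instead of throwing.
const std::string* asString(const std::any& a)
{
    return std::any_cast<std::string>(&a);
}

void anyExtractionExample()
{
    std::any a = 42;                   // holds an int
    assert(!intCastThrows(a));
    assert(asString(a) == nullptr);    // an int is not a std::string

    a = std::string("42");             // assignment side is 'soft'
    assert(intCastThrows(a));          // no implicit string -> int conversion
    assert(*asString(a) == "42");
}
```

Note that even a closely related type, such as long instead of int, does not satisfy the cast; only the exact original type does.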

DynamicAny

As mentioned above, Poco::Any is a handy class for storing a variety of types behind a common interface, offering a strongly typed cast mechanism and support for querying the held data type. Poco::DynamicAny extends Any by providing full-blown runtime dynamic typing within an ANSI/ISO C++ compliant framework. DynamicAny builds on the heritage of Any by adding the following features:

  • runtime checked value conversion and retrieval
  • non-initialized state support
  • implicit conversion to target type when possible and safe
  • seamless cooperation with POD types
  • seamless cooperation with std::string
  • std::map wrapper (a.k.a. DynamicStruct)
  • std::vector specialization (array-like semantics support)
  • date/time specializations
  • binary large object specialization
  • basic JSON support.

The class goes a long way to provide intuitive and reasonable conversion semantics and to prevent unexpected data loss, particularly when performing narrowing or signedness conversions of numeric data types. One of the challenges during the design process was to come up with a set of intuitive conversion and behaviour rules. Of course, many conversion attempts will throw an exception because they make no sense (e.g. converting "abc" to a numeric type). Additionally, deciding what is true or false seems like an easy task until it is actually attempted. The final verdict was that anything resembling either explicit falsehood (string "false", bool false) or 'nothingness' (empty string, integer zero, min. float value ...) shall be false, and everything else true. This decision is compatible with C and C++, where integer zero is false and everything else is true. Also, the strings "false" and "true" behave as expected in a case-insensitive manner.

The rules governing the behavior of DynamicAny are (see note 3):

  • An attempt to convert or extract from a non-initialized ('empty') DynamicAny variable shall result in an exception being thrown.
  • Loss of signedness is not permitted for numeric values. An attempt to convert a negative signed integer value to an unsigned integer type results in an exception being thrown.
  • Overflow is prohibited; an attempt to assign a value larger than the target numeric type can accommodate results in an exception being thrown.
  • Precision loss, such as in conversion from floating-point types to integers or from double to float on platforms where they differ in size (provided double value fits in float min/max range), is permitted.
  • String truncation is allowed - it is possible to convert between string and character when string length is greater than 1. An empty string gets converted to the char '\0', a non-empty string is truncated to the first character.
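The signedness and overflow rules above can be sketched as a checked conversion helper. This is a minimal illustration for integral types only, not the POCO implementation: the names checkedConvert and convertible are hypothetical, and std::range_error stands in for POCO's own exception types.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>
#include <type_traits>

// Converts between integral types, throwing instead of silently
// losing the sign or overflowing. Hypothetical helper, not POCO code.
template <typename To, typename From>
To checkedConvert(From from)
{
    static_assert(std::is_integral<To>::value &&
                  std::is_integral<From>::value,
                  "this sketch handles integral types only");

    if (std::is_signed<From>::value && from < 0)
    {
        // A negative value cannot be stored in an unsigned target.
        if (!std::is_signed<To>::value)
            throw std::range_error("loss of signedness");
        // Both sides are negative or zero here, so long long is safe.
        if (static_cast<long long>(from) <
            static_cast<long long>(std::numeric_limits<To>::min()))
            throw std::range_error("value too small for target type");
    }
    else
    {
        // Both sides are non-negative, so unsigned long long is safe.
        if (static_cast<unsigned long long>(from) >
            static_cast<unsigned long long>(std::numeric_limits<To>::max()))
            throw std::range_error("value too large for target type");
    }
    return static_cast<To>(from);
}

// Convenience wrapper: true if the conversion is permitted.
template <typename To, typename From>
bool convertible(From from)
{
    try
    {
        checkedConvert<To>(from);
        return true;
    }
    catch (const std::range_error&)
    {
        return false;
    }
}

void checkedConvertExample()
{
    assert(checkedConvert<short>(42) == 42);      // fits, converted
    assert(!convertible<unsigned int>(-1));       // signedness loss
    assert(!convertible<signed char>(65536));     // overflow
}
```

Precision loss (for example, from floating point to integer) is permitted by the rules above and is deliberately left out of this sketch.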

Boolean conversions are performed as follows:

  • A string value "false" (not case sensitive), "0" or "" (empty string) evaluates to false; any string not evaluating to false evaluates to true (e.g. "hi" evaluates to true).
  • All integer zero values are false, everything else is true.
  • Floating point values equal to the minimal floating point representation on a given platform are false, everything else is true.
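The string and integer rules in the list above can be written down directly. The following is a minimal sketch under the rules as stated here, not the POCO source; the function names stringToBool and intToBool are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// String rule: "false" (any letter case), "0" and the empty string
// are false; every other string is true.
bool stringToBool(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
        [](unsigned char ch)
        { return static_cast<char>(std::tolower(ch)); });
    return !(s.empty() || s == "0" || s == "false");
}

// Integer rule: zero is false, everything else is true.
bool intToBool(long long n)
{
    return n != 0;
}

void boolRulesExample()
{
    assert(!stringToBool("False"));
    assert(stringToBool("hi"));
    assert(!intToBool(0));
    assert(intToBool(-1));
}
```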

The added value and benefit of DynamicAny is in relieving the programmer from type-related worries for all the fundamental C++ types and some POCO framework objects. DynamicAny allows storage of different data types and transparent conversion between them in the fashion of dynamic languages (see note 4).

Some DynamicAny usage examples are shown in listings 3-6.

    // Values are interchangeable between
    // different types in a safe way
    DynamicAny any("42");
    int i = any; // i == 42
    any = 65536;
    std::string s = any; // s == "65536"
    char c = any; // too big, throws RangeException
  
Listing 3

    // Conversion operators for
    // basic types are available
    DynamicAny any = 10;
    int i = any - 5;    // i == 5
    i += any;           // i == 15
    i = 30 / any;       // i == 3
    bool b = 10 == any; // b == true
  
Listing 4

    // DynamicAny can be incremented or
    // decremented when holding integral value
    DynamicAny any = 10;
    any++; // any == 11
    --any; // any == 10
    any = 1.2f; // make it float
    ++any; // throws InvalidArgumentException
  
Listing 5

    // Workaround for std::string
    DynamicAny any("42");
    std::string s1 = any; //OK
    // std::string s2(any); //g++ compile error
    std::string s3(any.convert<std::string>()); //OK
  
Listing 6

There are some conversions that require 'workarounds' with some compilers, as illustrated in the code snippet in Listing 6 (see note 5).

DynamicAny implementation

In the manner of boost::any, storage and extraction of an arbitrary user-defined type are supported out-of-the-box. In addition to that, DynamicAny's conversions are fully extensible. To support conversion to other types, DynamicAnyHolderImpl<Type> must be specialized for the Type, with appropriate convert() function overloads defined.

The structure outline of the DynamicAny and DynamicAnyHolder class hierarchy is shown in Listing 7. DynamicAny owns a pointer to DynamicAnyHolder. The default zero pointer indicates that the variable has not been initialized yet. In the non-initialized state, an attempt at extraction or conversion triggers an exception. At assignment time, the DynamicAnyHolder storage is allocated on the heap and its address is stored in the pointer. The storage is automatically released at destruction time by virtue of the C++ RAII mechanism.

    class DynamicAny
    {
    public:
    DynamicAny();
      /// Creates an empty DynamicAny.

    template <typename T> DynamicAny(const T &val):
      _pHolder(new DynamicAnyHolderImpl<T>(val))
      /// Creates the DynamicAny from the given value.
    { }

    // ...

    DynamicAnyHolder* _pHolder;
    };

    class DynamicAnyHolder
    {
    public:
    virtual ~DynamicAnyHolder();
    // ...
    virtual void convert(Int8& val) const
    { throw BadCastException(
       "Can not convert to Int8"); }

    virtual void convert(Int16& val) const
    { throw BadCastException(
       "Can not convert to Int16"); }

    // ...

    virtual void convert(std::string& val) const
    { throw BadCastException(
       "Can not convert to string"); }

    // ...
    protected:
    DynamicAnyHolder();
    // ...
    };

    template <typename T>
    class DynamicAnyHolderImpl: public DynamicAnyHolder
    /// template for arbitrary user-defined types
    {
    public:
    DynamicAnyHolderImpl(const T& val): _val(val) { }
    ~DynamicAnyHolderImpl() { }

    const std::type_info& type() const
    { return typeid(T); }

    DynamicAnyHolder* clone() const
    { return new DynamicAnyHolderImpl(_val); }

    const T& value() const
    { return _val; }

    private:
    DynamicAnyHolderImpl();
    // ...
    T _val;
    };

    template <>
    class DynamicAnyHolderImpl<Int8>:
    public DynamicAnyHolder
    /// Int8 specialization
    {
    public:
    DynamicAnyHolderImpl(Int8 val): _val(val) { }

    // ...

    void convert(Int8& val) const
    { val = _val; }

    void convert(Int16& val) const
    { val = _val; }

    // ...

    void convert(std::string& val) const
    { val = NumberFormatter::format(_val); }

    // ...

    private:
    DynamicAnyHolderImpl();
    Int8 _val;
    };
  
Listing 7

Support for various data types is achieved through polymorphism - DynamicAnyHolderImpl is a template class inheriting from DynamicAnyHolder, and only specializations of this class do the conversion work. The direct extraction of the original data type depends on the template and its specializations having a value() member function returning the held value. Although it may be viewed as a questionable design decision, for efficiency's sake value() has intentionally not been made virtual. This decision has provided value extraction performance comparable to that of boost::any.
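The holder mechanism outlined in Listing 7 can be condensed into a self-contained sketch that compiles with the standard library alone. It is a simplified illustration, not the POCO code: the names MiniAny, Holder and HolderImpl are hypothetical, only int and std::string conversions are modelled, std::runtime_error replaces POCO's exceptions, and std::unique_ptr is assumed for ownership.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Base holder: every conversion fails unless a specialization overrides it.
class Holder
{
public:
    virtual ~Holder() = default;
    virtual void convert(int&) const
    { throw std::runtime_error("Can not convert to int"); }
    virtual void convert(std::string&) const
    { throw std::runtime_error("Can not convert to string"); }
};

// Generic holder: stores the value and inherits the throwing conversions.
template <typename T>
class HolderImpl: public Holder
{
public:
    explicit HolderImpl(const T& val): _val(val) { }
    const T& value() const { return _val; }
private:
    T _val;
};

// int specialization: knows how to become an int or a string.
template <>
class HolderImpl<int>: public Holder
{
public:
    explicit HolderImpl(int val): _val(val) { }
    void convert(int& val) const override { val = _val; }
    void convert(std::string& val) const override
    { val = std::to_string(_val); }
private:
    int _val;
};

// std::string specialization: parses the text when an int is requested
// (std::stoi throws std::invalid_argument on non-numeric text).
template <>
class HolderImpl<std::string>: public Holder
{
public:
    explicit HolderImpl(const std::string& val): _val(val) { }
    void convert(int& val) const override { val = std::stoi(_val); }
    void convert(std::string& val) const override { val = _val; }
private:
    std::string _val;
};

class MiniAny
{
public:
    MiniAny() = default;   // empty: the holder pointer is null

    template <typename T>
    MiniAny(const T& val): _pHolder(std::make_unique<HolderImpl<T>>(val)) { }

    template <typename T>
    T convert() const
    {
        if (!_pHolder) throw std::runtime_error("MiniAny is empty");
        T result{};
        _pHolder->convert(result);   // virtual dispatch picks the overload
        return result;
    }

private:
    std::unique_ptr<Holder> _pHolder;
};

void miniAnyExample()
{
    MiniAny a(std::string("42"));
    assert(a.convert<int>() == 42);               // string -> int
    MiniAny b(65536);
    assert(b.convert<std::string>() == "65536");  // int -> string
}
```

A value held by the generic HolderImpl<T> (a double, in this sketch) can be stored, but any conversion request falls through to the throwing base class functions, which mirrors the behaviour described for non-specialized types.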

The most commonly used data types (all fundamental data types, std::string, std::vector<DynamicAny>, DateTime, Timestamp, BLOB) are specialized within the POCO framework and ready for immediate use. This set of data types covers the majority of cases where automatic conversion is frequently needed. All other data types are covered by the generic DynamicAnyHolderImpl<T> template and, like boost::any, allow only extraction of the held type, while an attempt to convert the value results in an exception. When needed, a specialization for user-defined types is possible. A definition of a sample UDT (a social security number formatter), with specialization and usage example code, is shown in Listing 8.

    class SSN
      /// a user-defined type
    {
    public:
    SSN(const SSN& ssn): _ssn(ssn._ssn) { }

    SSN(const DynamicAny& da): _ssn(da) { }

    operator SSN ()
    { return *this; }

    std::string sSSN() const
    { return format(); }

    int nSSN() const
    { return _ssn; }

    private:
    std::string format() const
      /// format integer as SSN
    {
      std::string tmp;
      std::string str =
         NumberFormatter::format(_ssn);
      tmp.clear();
      tmp.append(str, 0, 3);
      tmp += '-';
      tmp.append(str, 3, 2);
      tmp += '-';
      tmp.append(str, 5, 4);
      return tmp;
    }

    int _ssn;
    };


    // Sample usage:

    SSN udt1(123456789);
    DynamicAny da = udt1;
    std::string ssn = da;
    std::cout << ssn << std::endl;
    SSN udt2 = da;
    std::cout << udt2.nSSN() << std::endl;


    // Output:

    123-45-6789
    123456789
  
Listing 8

As seen in the example, DynamicAny readily holds SSN and smoothly converts it to supported values. Assignment from DynamicAny to SSN is also possible. DynamicAnyHolder provides the dynamic behaviour through polymorphism by virtue of its descendant specializations - the actual value resides in the DynamicAnyHolderImpl specialization. This value is converted through the overloaded convert() virtual function call for the appropriate data type. Were the specialization not present, only extraction of the original type (in the fashion of boost::any's any_cast functionality) would have been possible.

The main challenges encountered during the design were making DynamicAny coexist in harmony with built-in types and compilers, as well as implementing specializations and safe conversions for the most commonly used types. The first attempt at operator overloading was template-based, but that proved to be painting with too broad a brush, resulting in obscure compile errors on some platforms. To fix the problem, operators on both sides (member and non-member ones) have been re-implemented as overloaded functions. Also, it took several iterations of the safe conversion checks to reconcile them with all supported compilers and platforms. The POCO community's contribution to the process was instrumental.

DynamicAny in the real world

Surely, all this is not without meaning [Melville51]. The code sample shown may be a clever data formatter, but was it worth going through such effort only to provide conversion and formatting between numbers and strings? A code snippet using DynamicAny in a realistic scenario is shown in Listing 9. The added value that DynamicAny brings in this case is:

  • shield against the compile-time data type and layout knowledge requirement
  • shield against conversion data loss

    using namespace Poco::Data::Keywords;
    using Poco::Data::Session;
    using Poco::Data::Statement;
    using Poco::Data::RecordSet;

    // create a session
    Session session("SQLite", "sample.db");

    // a simple query
    Statement stmt(session);
    stmt << "INSERT INTO Person VALUES ('Bart', 12)",
       now;

    // create a RecordSet
    RecordSet rs(session,
       "SELECT Name, Age FROM Person");

    int i = rs[1]; // OK
    std::string s = rs[1]; // OK, too
    i = rs[0]; // throws, can't convert 'Bart' to int
  
Listing 9

To achieve the desired RecordSet capabilities, the class Row was introduced. By utilizing DynamicAny's dynamic typing facilities, Row conveniently wraps a row of data and, through RowIterator, works seamlessly in conjunction with STL algorithms to provide the functionality for a flexible RecordSet class, as shown in Listing 10. The details are outside the scope of this article, but suffice it to say that the code shown works for any given SQL statement (i.e. any given column count/datatype combination) thanks to the dynamic typing provided by DynamicAny.

    Session session("SQLite", "sample.db");
    std::cout << RecordSet(session, "SELECT * FROM Person");

    // This is how streaming is achieved under
    // the hood:
    // copy(begin(), end(),
    //      ostream_iterator<Row>(cout));
  
Listing 10
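The streaming idiom from the comment in Listing 10 can be tried out without a database. The sketch below is an assumption-laden stand-in: its Row is a hypothetical struct holding values already converted to strings (which a DynamicAny-based row can do for any column type), and the tab-separated output format and the render() helper are invented for illustration; none of this is the Poco::Data implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a row of query results.
struct Row
{
    std::vector<std::string> fields;
};

// Streams one row as tab-separated fields followed by a newline.
std::ostream& operator<<(std::ostream& os, const Row& row)
{
    for (std::size_t i = 0; i < row.fields.size(); ++i)
    {
        if (i != 0) os << '\t';
        os << row.fields[i];
    }
    return os << '\n';
}

// Streams all rows with a standard algorithm, whatever the column count.
std::string render(const std::vector<Row>& rows)
{
    std::ostringstream out;
    std::copy(rows.begin(), rows.end(),
              std::ostream_iterator<Row>(out));
    return out.str();
}

void renderExample()
{
    std::vector<Row> rows;
    rows.push_back(Row{{"Bart", "12"}});
    assert(render(rows) == "Bart\t12\n");
}
```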

As demonstrated in Listings 9 and 10, DynamicAny comes in handy as a 'mediator' between type-aware data storage (e.g. a database) and a type-relaxed representation (e.g. a web page). Displaying data from a database is easy: convert all the values to strings and embed them into HTML, for example. The data coming back from the web page, however, will all be strings. In a scenario proposed by a POCO contributor, DynamicAny was extended with two callback functions, called before and after a value assignment or change. The callbacks allow changes in the value to be instantly reflected in an XML structure and then transformed to any representation on demand. A string value coming from a response XML could be put in a DynamicAny and then bound to a database query without explicit type conversion. Full details are beyond the scope of this article, but the basic outline is laid out in Listing 11. Currently, this is not part of the mainstream code base, and a discussion is going on about whether and how to integrate this functionality into the framework.

    // retrieve from a database
    RecordSet rs(session,
       "SELECT Name, Age FROM Simpsons");
    // type is known here
    DynamicAny age = rs[1];
    // output as string (e.g. in some html)
    // ...
    // assign from a string (e.g. from some html form)
    age = "14";
    // bind, implicitly casting age to int
    session <<
      "INSERT INTO Simpsons VALUES('Bart', ?)",
      use(age), now;
  
Listing 11

Conclusion

Static and dynamic data typing are contrasting solutions, each with its own advantages and drawbacks. While dynamic typing affects runtime performance, in certain scenarios (e.g. fetching data from a remote database) the performance hit is dwarfed by the time spent on other operations.

As illustrated in the examples, DynamicAny is useful whenever performance requirements are loose and/or the data types involved are unknown at compile time. However, as will be shown in part II of the article, performance was not a design afterthought. The DynamicAny class is part of the Foundation library of the C++ Portable Components framework, with extensive use in the Data library. Additionally, some experimenting is underway with DynamicAny used as a 'bridge' between C++ and scripting languages [POCOc].

In the next installment of this article, more details about internal implementation of DynamicAny will be given, as well as some comparison tests between different C++ data type conversion mechanisms and classes.

Acknowledgements

Kevlin Henney is the originator of the idea and author of the boost::any class. Kevlin has provided valuable comments on the article.

Peter Schojer has ported boost::any to POCO, implemented major portions of DynamicAny and provided valuable comments on the article.

Günter Obiltschnig has written the majority of the POCO framework and provided valuable comments on the article.

Laszlo Keresztfalvi has provided valuable development and testing feedback, sample usage code as well as valuable comments on the article.

References

[Boost] Boost any library: http://www.boost.org/doc/html/any.html

[Henney00] Henney, Kevlin (2000) 'Valued Conversions', C++ Report, July-August 2000.

[Melville51] Melville, Herman (1851) Moby Dick, Harper & Brothers Publishers

[POCOa] C++ Portable Components: http://poco.sourceforge.net

[POCOb] C++ Portable Components development repository: http://poco.svn.sourceforge.net/viewvc/poco/

[POCOc] Poco::Script: http://poco.svn.sourceforge.net/viewvc/poco/sandbox/Script/

[Stroustrup97] Stroustrup, Bjarne (1997) The C++ Programming Language, Addison-Wesley.

[Sutter07] Sutter, Herb (2007) 'Modern C++ Libraries', Proceedings, SD West.

1 unions, void pointers, Microsoft COM Variant, boost::variant, boost::any, boost::lexical_cast

2 Added RefAnyCast operators returning reference and const reference to stored value.

3 Some of the features are scheduled for the next release and are currently available from the development code repository [POCOb]

4 For conversion from type T1 to type T2 to be possible, a DynamicAnyHolderImpl<T1>::convert(T2&) must be defined.

5 The commented line does not compile with g++ (MSVC++ and Sun Studio compile it successfully).
