BroadVision, Part 4

Recap

In previous articles I have covered the architecture and mechanics of BroadVision - an application framework for building electronic commerce web sites - as well as database access issues. In this, the final part of the series, I shall look at some specific custom objects I have designed and implemented to cover what I consider shortfalls in BroadVision's product.

Maintaining a third party product

Isn't maintenance a dull word? It's something we all hate doing, usually because it's someone else's code and it doesn't come up to our standards. But at least we've got access to the source and some of the engineers that either developed the code or have worked on it before. We buy third party products to try to avoid maintenance - that's the vendor's problem. What seems to have happened with C++ is that various things have conspired to ensure that many third party libraries either come in source form or accompanied by the source code. In cases where they are intended to be "user-extensible" that means that you and I will, over time, become rather familiar with that source code.

As we all should know by now, there are three types of maintenance: corrective, adaptive, perfective. Roughly translated, that's bug fixing, extension to support external requirements and extension to support new functionality. Funnily enough, customising a third-party OO library usually looks like this too - we are simply maintaining the vendor's library! We shouldn't be fixing their bugs, we shouldn't even need to do any adaptive work as the library should be flexible enough to accommodate all sorts of changes in the external environment. Since we are building on the library, we should expect to undertake perfective maintenance only. Unfortunately, if a library isn't designed to be extensible then we end up doing adaptive maintenance just to support our own perfective maintenance.

"Maintaining" BroadVision

I'm going to look at the customisation we undertook with reference to the three types of maintenance above. My definition of "corrective" is probably broader than most people's because I expect a lot from third-party libraries - I include design flaws in this category, such as the problem I mentioned with Dyn_SmartLink in part 2 (it isn't a Dyn_Container so you cannot surround a Dyn_IOField with a Dyn_SmartLink).

BroadVision supply the source to their integrators because even they themselves admit that you can't really build on the framework without access to the internals - to see how their existing objects work. Our approach to maintenance was to put the whole source code under version control and use SNIFF+ as a source browser and editor since it integrates well with almost any version control system (we use RCS) and make system (we use UNIX makefiles and Sun's C++ SPARCcompiler). Then, whenever we needed to modify a BroadVision object, we copied the source, renamed the class from Dyn_Xxx to ISS_Xxx and fixed the code in that version instead. This allowed us to integrate changes made by BroadVision in subsequent versions of their framework into our own code as and when we needed - some bugs got fixed by BroadVision and some didn't as they went from version 2.5 to version 2.6 to version 3.0.

A silly typo - corrective

The first bug I encountered in the BroadVision source was nothing more than a careless typo. They use a utility function to compare strings:

int equal_string(const char* s1, const char* s2, int ignoreCase = 0);

When an object processes the attributes in an HTML tag, it has code like this:

if ( equal_string( attr.name, "HEIGHT", 1 ) )
// store attr.value to height member variable

Unfortunately, in version 2.5 the code wasn't very consistent, sometimes comparing lowercase attribute names (without the ignoreCase parameter), sometimes comparing uppercase attribute names (with the third parameter supplied as 1) and in one unfortunate case the code had:

if ( equal_string( attr.name, "WIDTH, 1" ) )
// store attr.value to width member variable

This meant that the object in question didn't recognise the WIDTH=nn attribute on images. By version 3.0, BroadVision had updated all their code to use the three argument version of equal_string. I suspect they had started in an early version of the framework with a two argument version of equal_string and added the ignoreCase argument in a later version as a default to provide backward compatibility. Moral: default arguments can be a Bad Thing(tm)!
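To make the hazard concrete, here is a minimal sketch of how the default argument lets the typo through the compiler. The body of equal_string below is my own stand-in, not BroadVision's code, and I am assuming the default for ignoreCase is 0 (case-sensitive); the two wrapper functions are purely illustrative.

```cpp
#include <cctype>

// Hypothetical stand-in for BroadVision's utility: only the shape of the
// signature (a defaulted third argument) comes from the article.
// Returns non-zero when the two strings match.
int equal_string(const char* s1, const char* s2, int ignoreCase = 0)
{
    for (; *s1 != '\0' && *s2 != '\0'; ++s1, ++s2) {
        unsigned char c1 = static_cast<unsigned char>(*s1);
        unsigned char c2 = static_cast<unsigned char>(*s2);
        if (ignoreCase) {
            c1 = static_cast<unsigned char>(std::toupper(c1));
            c2 = static_cast<unsigned char>(std::toupper(c2));
        }
        if (c1 != c2) return 0;
    }
    return *s1 == *s2;   // match only if both strings ended together
}

// The intended call: three arguments, case-insensitive comparison.
bool is_width_intended(const char* name)
{
    return equal_string(name, "WIDTH", 1) != 0;
}

// The typo: the flag has slipped inside the string literal, so this is a
// two-argument, case-sensitive comparison against the text "WIDTH, 1".
// Because the third parameter is defaulted, it still compiles cleanly.
bool is_width_typo(const char* name)
{
    return equal_string(name, "WIDTH, 1") != 0;
}
```

With a mandatory third parameter, the second wrapper would have been rejected at compile time instead of silently never matching a real attribute name.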

It should be obvious why I consider this corrective maintenance: the code was simply incorrect.

A small design flaw - adaptive

The framework provided a general field display object, Dyn_IOField, but it used default formatting for most fields which wasn't sufficient when dealing with numbers and dates. We produced a numeric field display object, ISS_NumericField, that supported the DECIMAL_PLACES=nn attribute to specify the format. Later, we decided we needed a date field display object and decided to use a more generic attribute FORMAT="xxx" to allow the most general reformatting of dates. In retrospect, this was a design flaw on our part: I'm always harping on about generalising objects and parameterising them yet I didn't spot the "obvious" design here - an ISS_FormattedField object that took a TYPE="xxx" attribute to determine what type of data to reformat the field as ("NUMERIC", "DATE" etc) and the above mentioned FORMAT="xxx" attribute to use as an argument to sprintf() internally. Perhaps I'll rectify that in a future version of our objects.
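As a rough illustration of that "obvious" design, the formatting core of such an object might look something like the sketch below. None of these names exist in our code or in BroadVision's: FieldValue and format_field are hypothetical, and I have used strftime() rather than sprintf() for dates since it is the more natural fit for a date FORMAT string.

```cpp
#include <cstddef>
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical carrier for the raw value behind a field.
struct FieldValue {
    double  number;  // used when TYPE="NUMERIC"
    std::tm date;    // used when TYPE="DATE"
};

// Hypothetical core of an ISS_FormattedField: TYPE selects how the value
// is interpreted, FORMAT is handed to the underlying formatting routine.
// Returns an empty string for an unknown TYPE or an over-long result.
std::string format_field(const std::string& type,
                         const std::string& format,
                         const FieldValue& value)
{
    char buffer[128];
    if (type == "NUMERIC") {
        // FORMAT must contain exactly one floating-point conversion,
        // e.g. "%.2f". It comes from the page template, which is trusted.
        int n = std::snprintf(buffer, sizeof buffer, format.c_str(), value.number);
        if (n < 0 || static_cast<std::size_t>(n) >= sizeof buffer)
            return std::string();
        return std::string(buffer);
    }
    if (type == "DATE") {
        // FORMAT is a strftime pattern, e.g. "%d/%m/%Y".
        std::size_t n = std::strftime(buffer, sizeof buffer, format.c_str(), &value.date);
        return std::string(buffer, n);
    }
    return std::string();
}
```

A tag such as TYPE="NUMERIC" FORMAT="%.4f" would then cover the DECIMAL_PLACES=nn case without a separate class.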

Why do I consider this adaptive maintenance? BroadVision didn't anticipate that users would need very fine control over the formatting of output. In particular, for the travel site, there was a legal requirement covering the display format of currency rates - an external requirement not anticipated by the designers.

A bigger design flaw - perfective

BroadVision's approach to adapting objects to new roles tends toward creating new derived classes that perform specific actions rather than parameterising existing classes. This has led to the situation where they have multiple Dyn_XxxField objects that provide a variety of different HTML formats depending on where the data originates. This means that, for instance, you can have a <SELECT> form of a field that is associated with the visitor profile but not for a field that is associated with product data. Our approach was to create an ISS_InputField object that supported the full range of HTML field formats, including multiple select drop-downs - a recent addition for the car manufacturer's site. None of BroadVision's field objects supported <TEXTAREA>, none supported the multiple select field, only one supported <SELECT>, and the support for checkboxes and radio buttons was similarly haphazard.

However, there was another issue at play here: database field display objects had to convert their data from a different internal format, a "run-time value" format, before converting the actual value to the required display format. BroadVision's approach was to have two sub-hierarchies of classes, one with its root at Dyn_IOField for formatting 'normal' values and the other with its root at Dyn_ContentField (itself inherited from Dyn_IOField) for formatting database values. No wonder their code contained so much duplication and the formatting was rather haphazard!

Our approach was, as usual, to parameterise the display objects so that they knew whether to convert the internal void* value as a "run-time value" first, before applying the formatting. We simply added a new attribute CONTENT=YES to indicate the conversion should be applied first.

A better approach for BroadVision might well have been to separate the value-fetch functionality from the formatting functionality and allow the value-fetch objects to invoke another named object to perform the formatting after any internal conversion. C++ makes this sort of run-time dynamic behaviour somewhat difficult however!
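To show the kind of separation I have in mind, here is a minimal sketch in which value-fetch objects and formatting objects are independent, and a formatter is looked up by name at run time. Every class here is hypothetical (RuntimeValue merely stands in for BroadVision's "run-time value" format, whose real layout I am not reproducing), so treat it as an outline of the idea rather than a design for the framework.

```cpp
#include <iomanip>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Formatting side: knows nothing about where values come from.
class Formatter {
public:
    virtual ~Formatter() {}
    virtual std::string format(double value) const = 0;
};

class FixedDecimals : public Formatter {
public:
    explicit FixedDecimals(int places) : places_(places) {}
    std::string format(double value) const override {
        std::ostringstream out;
        out << std::fixed << std::setprecision(places_) << value;
        return out.str();
    }
private:
    int places_;
};

// Named lookup, so a tag attribute can select the formatter at run time.
class FormatterRegistry {
public:
    void add(const std::string& name, std::shared_ptr<const Formatter> f) {
        formatters_[name] = f;
    }
    const Formatter* find(const std::string& name) const {
        auto it = formatters_.find(name);
        return it == formatters_.end() ? nullptr : it->second.get();
    }
private:
    std::map<std::string, std::shared_ptr<const Formatter>> formatters_;
};

// Fetch side: knows nothing about display formats.
class ValueSource {
public:
    virtual ~ValueSource() {}
    virtual double fetch() const = 0;
};

class PlainValue : public ValueSource {
public:
    explicit PlainValue(double v) : v_(v) {}
    double fetch() const override { return v_; }
private:
    double v_;
};

// Hypothetical stand-in for a database "run-time value".
struct RuntimeValue { double payload; };

class ContentValue : public ValueSource {
public:
    explicit ContentValue(const RuntimeValue& rv) : rv_(rv) {}
    double fetch() const override { return rv_.payload; }  // internal conversion happens here
private:
    RuntimeValue rv_;
};

// Any source can be combined with any named formatter.
std::string render(const ValueSource& source,
                   const FormatterRegistry& registry,
                   const std::string& formatterName)
{
    const Formatter* f = registry.find(formatterName);
    return f ? f->format(source.fetch()) : std::string();
}
```

The point of the sketch is that adding a new format or a new data origin means adding one class, not one class per combination.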

This is perfective maintenance because we could have compromised the user interface design in order to fit in with the supplied functionality. Multiple select drop-downs were a "nice to have" that could have been implemented as a series of checkboxes. The text area field could have been implemented as a long, scrolling text field. Checkboxes and simple text fields are part of BroadVision's basic functionality. We wanted to extend the core functionality to support "prettier" web pages.

Commonality of approach

I said that I'd compare and contrast the approaches we took on the two different sites, and shortly I will, but first some discussion of one part that was effectively common to both. On both sites the visitor can make a series of choices to drill down through the options available. Subsequent choices offered to the visitor depend on previous choices they've made. So both sites had a set of classes that managed the "context" of a visitor's choices. We created a way of storing arbitrary C++ objects on the server that persisted throughout a visitor's session - the SessionStore<> template mentioned in part 2. Then we could design and build a suitable "context engine" for managing the drill down choices made and store its 'state' on the server. Naturally, the actual code was very different between the two sites, since different business rules applied to the context, but the principle remained the same: make a note of each choice made by the visitor and provide a way of selecting the next set of available choices from the database.

Here's where the two sites differed greatly. For the travel site, we retained the logical data model within the database, using a large number of tables outside BroadVision's product table. The context engine for that site incorporated database accessors that searched for relevant content within specific tables, determined by the current context. In fighting against the framework, we ended up writing a lot more code. For the car manufacturer's site, we mapped the logical data model down to BroadVision's flat product table model - working within the framework - and then our context engine needed only to generate the SQL query which searched that table. We used BroadVision's standard database accessor to perform the search using the generated SQL. This allowed us to take advantage of BroadVision's sophisticated database caching techniques as well as the observation machinery it provides to track which products a visitor is shown and which they select for more information.
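The principle behind the second approach can be sketched very simply: record each choice, then turn the accumulated choices into a query over a single flat table. The class below is a hypothetical illustration only; the real engine applied business rules that are not shown, and the table and column names in the usage note are invented.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, much-simplified context engine: it only remembers the
// visitor's choices and generates the SQL for the next search.
class VisitorContext {
public:
    // Note a drill-down choice, e.g. column "fuel" narrowed to "diesel".
    void choose(const std::string& column, const std::string& value) {
        choices_.push_back(std::make_pair(column, value));
    }

    // Build the query that a standard accessor could then execute.
    std::string toSql(const std::string& table) const {
        std::string sql = "SELECT * FROM " + table;
        for (std::size_t i = 0; i < choices_.size(); ++i) {
            sql += (i == 0 ? " WHERE " : " AND ");
            sql += choices_[i].first + " = '" + escape(choices_[i].second) + "'";
        }
        return sql;
    }

private:
    // Double any single quotes so a value cannot terminate the literal.
    static std::string escape(const std::string& text) {
        std::string out;
        for (std::size_t i = 0; i < text.size(); ++i) {
            out += text[i];
            if (text[i] == '\'') out += '\'';
        }
        return out;
    }

    std::vector<std::pair<std::string, std::string> > choices_;
};
```

For example, after choose("body_style", "estate") and choose("fuel", "diesel"), toSql("products") yields a SELECT with both conditions joined by AND. Because the engine's only output is a string, it has very little coupling to the framework around it.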

As a side-effect, this allowed our context engine to be more 'generic' - it depended much less on BroadVision's architecture of accessors, loops, field display objects and so on. As a consequence of this, the context engine was more Bean-like, to use a term borrowed from Java: the external interface - to BroadVision - was smaller and better-structured. In fact, we used a Java-based CASE tool, Together/J from Object International, to design and build a prototype context engine in Java. Before we even wrote a single line of C++ code, we had a fully-featured working Java model of how the context engine would behave. Recoding this into C++ was a straightforward, if somewhat tedious, process. The result was a set of classes that worked first time. The design even proved to be very resilient to some fairly dramatic changes we made to the generated SQL queries in order to tune the database access.

The engine underlying the car manufacturer's site is a better design and takes better advantage of the base functionality provided by BroadVision. We were able to write fewer objects that extended the framework and concentrate on solving the real problem at hand: maintaining a visitor context. I'll probably write a future article about my design principles to show why I think one design was better than another.

Different approaches

By making a greater attempt to work within the framework we tended to tackle deficiencies in BroadVision in a different way. For the travel site, whenever we hit what we considered a deficiency in BroadVision, we wrote whole new BroadVision objects - we ended up with about 100 objects in addition to the basic context engine. For the car manufacturer's site, we tried to find ways to adapt BroadVision's existing objects without having to recode them. In particular, we used the idea of generating SQL queries, which the standard BroadVision accessor could process, to provide some very sophisticated ways of linking product table entries to editorial table entries. This allowed us to link whole sets of dynamic pages of text and images to a single product, which in turn allowed us to work around BroadVision's limitations on product data without creating custom tables - we reused something BroadVision already provided instead. On both sites, we maintain persistent data about visitors outside the standard profile table but whereas we had written custom accessors on the travel site, we used BroadVision's standard accessors and simply generated SQL queries where necessary on the car manufacturer's site.

BroadVision have recently released version 4.0 of their framework which, whilst it supports version 3.0 objects, takes a radically different approach to customisation. In version 4.0 they have opened up the object model in such a way that it can be manipulated by server-side JavaScript. Instead of writing custom objects to perform conditional generation of HTML it should be possible in version 4.0 to achieve this by using JavaScript on the pages. This is a much more cost effective approach: C++ skills are much more expensive than JavaScript skills in the web programming world. It will make BroadVision an easier product for web developers to use. We're fairly confident that the approach we've taken with the car manufacturer's site will also be a better match to this new direction since it relies less heavily on customising version 3.0 objects.

Summary

A well-designed application framework provides a good way to achieve a base level of functionality quickly and with little development effort. Such frameworks ought to be a cost effective way to build custom applications. As we've seen from this series of articles, BroadVision, in common with many other commercial application frameworks, has some way to go but it does provide a large helping hand for electronic commerce web sites. The main areas where framework designers need to improve are architecture and generic design: the architecture needs to be such that developers can easily extend the system without having to write large slabs of code; the design needs to be such that the basic components can be heavily customised without writing code. Both design and architecture need to harmonise so that user extensions can be "plugged in". OO design has come a long way in the last few years and OO developers are now beginning to expect pluggable, parameterised frameworks. We're trying to build our own systems that way to increase reuse and decrease costs, and we expect application frameworks to keep up with our progress.

On the other hand, selecting an application framework will shape the way you work. Most frameworks have a steep learning curve and are best utilised when you go with the grain rather than trying to force them to your will. This means that you need to ensure the framework fits your application area very well or that you know how to shape your application to your framework of choice. Evaluating an application framework is often very difficult because you can't tell how much it will help you until you've actually built something with it. Conversely, you can't tell how to build something with it until you've learnt a lot about how it works itself.

I still think application frameworks are a "necessary evil" but they are getting better. We couldn't have built either the travel site or the car site as quickly without BroadVision. Or we could have chosen to reduce the functionality but that makes us less competitive as companies expect more and more from a web site.

And finally...

At the time of writing, the car manufacturer's site is still not live. Being a very dynamic site means that the car manufacturer has a lot of data to enter in order to fully populate the site. Such data entry - content management in web terms - is a labour intensive process and I think that many companies underestimate this task when undertaking what we call a "third generation" web site (first generation is "brochureware", a flat site with information about a company and sometimes its products; second generation is more comprehensive often providing some forms for visitors to give feedback and providing more in-depth information about a company's products; third generation sites are much more dynamic and adapt to each visitor, providing a personalised experience often differing from visit to visit).

Some of you may well be wondering why this final instalment is so late. After completing development of the car manufacturer's site at the end of last year, I immediately became involved in another BroadVision project that had very tight deadlines and I simply didn't have time to work on this article. The new project involves no C++ at all, just look'n'feel customisation of a product BroadVision calls "Knowledge Web Application". It's ideal for Intranet and Extranet projects which deliver masses of content - documents - within a standard navigation framework. KWA provides the navigation and the content management system. You just set up the database structure to reflect how the content is classified and how it should be delivered in "channels" and "programmes" within those channels, then you can start publishing content. That site is now installed at the client offices and only client training and "go-live" support remains.

But don't think I have time on my hands! I've just started yet another BroadVision project which will start by auditing an existing BroadVision-powered site which takes little advantage of the framework's power. My job is to determine a roadmap towards a fully dynamic, personalised version of the site.

However, I do intend to write another article that follows on from this series if the editors are interested. It will look at BroadVision's main competitor, ATG's Dynamo, which is written entirely in Java and is used to power sites such as Sony Entertainment (the PlayStation site) and My Sun (a personalised portal site for Sun's other web sites). It takes a radically different approach in terms of design and architecture - an architecture made possible by Java. Essentially I want to highlight how Java and C++ lend themselves to very different design strategies - something that I've been dealing with quite a lot since I started using Java as a design language even for systems that would be built in C++.