From Occam's Razor to No Bugs' Axe

Overload Journal #100 - December 2010 + Programming Topics + Process Topics   Author: Sergey Ignatchenko
Designing good APIs that stand the test of time is notoriously hard. Sergey Ignatchenko suggests a radical guideline.

As usual, the opinions expressed within this article are those of 'No Bugs' Bunny, and do not necessarily coincide with the opinions of the translator and the Overload editor; please also keep in mind the difficulties in translating accurately from Lapine (like those described in [Loganberry2004]). In addition, both the translator and Overload expressly disclaim all responsibility for any action or inaction resulting from reading this article.

"Fight Features. ...the only way to make software secure, reliable, and fast is to make it small."
Andrew S. Tanenbaum

Every time I start to develop a new API for fellow rabbits, I (and probably every other library developer) face the same question: which functions might my users possibly want? Over the years, I have come to a seemingly paradoxical yet extremely practical approach to this problem, which I want to share here. It is not something entirely new, but I don't think it has ever been emphasized enough.

First, I'll try to analyze how it usually happens.

The usual approach: what else MIGHT our users want?

When the need for a new API (class or library) arises, the natural temptation of the developer is to complete the development of the API once and for all, and to provide everything any potential user could possibly want. Examples of such APIs are abundant, and such libraries are often successful, but there are several problems with this approach which lead to a reduction in productivity in the long run. These problems are:

  • the inability to remove an unnecessary API: as soon as the API is released, it is extremely difficult to get rid of it. While there are attempts to introduce 'deprecation' into APIs (for example in Java [Java]), this process is usually extremely slow and only marginally helps with the problem.
  • unexpected abuse of an API: as soon as an API is used by more than three developers you can count on it being used in all the ways you have thought of, and in all the ways you didn't think about. This often becomes a problem, especially if the API accidentally reveals certain details of the underlying implementation (which you hoped that nobody of sane mind would ever use, but this always turns out to be wishful thinking: there will be at least one person who disagrees with your definition of sane).
  • these lead to the third, more important point: as soon as an API is released, you're essentially bound to maintain it virtually forever, including undocumented and unintentional quirks. Are you scared of maintaining it forever? I certainly am. Tharn!1
  • this in turn greatly increases code rigidity: as you're bound to maintain the API forever, including all the unintended features, you're very often effectively prevented from changing the underlying implementation as someone may well be relying on some unintentionally exposed aspect.
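To make the 'unintentionally exposed aspect' more tangible, here is a minimal sketch in Python. Everything in it (the class names, the methods, and the rabbit) is hypothetical and invented for illustration only. The first class hands out its internal list, so callers can both mutate it and start relying on its ordering. The second returns only a snapshot, which leaves the implementer free to change the container later.

```python
class UserRegistry:
    """Hypothetical registry that accidentally exposes its internal list."""

    def __init__(self):
        self._names = []  # implementation detail: an insertion-ordered list

    def add(self, name):
        self._names.append(name)

    def names(self):
        # Leak: returns the internal list itself, so callers can mutate it
        # and can come to depend on insertion order.
        return self._names


class SaferUserRegistry:
    """Same feature, but callers never see the internal container."""

    def __init__(self):
        self._names = set()  # can be swapped for another container later

    def add(self, name):
        self._names.add(name)

    def names(self):
        # Returns a fresh sorted tuple; internal state stays out of reach.
        return tuple(sorted(self._names))


leaky = UserRegistry()
leaky.add("hazel")
leaky.names().append("intruder")  # a caller bypassing add() entirely

safe = SaferUserRegistry()
safe.add("hazel")
snapshot = safe.names()  # an immutable copy, not the registry's own data
```

Once somebody has written code like the `append` call above, changing `UserRegistry` to store its names in anything other than a list becomes a breaking change, even though nobody ever promised a list.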

In addition, this kind of creeping featuritis doesn't come free for developers who're using the library:

  • it provides many features that most users of the library don't want: while few people will complain about the extra features, after exceeding a certain threshold it often causes the problem of 'I can't see the forest for the trees' when a developer just doesn't know where to start.
  • it increases the risks of abuse, the effects of which can be significant. For instance, it is this risk which is one reason behind the decision of Linus Torvalds not to allow C++ into the Linux kernel [Torvalds2007].
  • it is more likely for legitimate feature requests to be denied because implementing them would be a disaster: it often would be easier to implement if you weren't also required to support lots of existing features, many of which are unnecessary but required to be supported forever.

Waterfall vs Agile

One way to think about this 'include everything in sight' development approach is to compare it to the Waterfall development model. If the API, once designed, cannot be changed or extended, it pushes us to include everything which we think might be needed. Unfortunately, Waterfall development doesn't really work in practice because it is hard to predict what will actually be necessary. Fortunately, this was recognized in 2001 when the term Agile development was coined [Agile].

Within the Agile development model, the whole development process is iterative by design, and changes are a welcome part of the process. Applying this principle to a library, we can see that APIs can also change easily. At least in theory, it means much less pressure on the API developer to put in 'everything a user might need' right away. In fact, many Agile methodologies explicitly state that developing features 'just in case' is not a good thing: for example, XP states 'Never Add Functionality Early' [YAGNI].

No Bugs' axe

Unfortunately, up until now it has not been explicitly stated what constitutes 'too early' for a feature to be included and what does not. In my projects with fellow rabbits, I have used the following principle for a while with considerable success:

If you do not have a concrete case of how a feature will be used - do not provide it.

I (without false humility) hereby propose to name it after yours truly, namely the 'No Bugs' Axe' principle. In some very wide sense you can consider it a parallel to the classic 'Occam's razor' principle [Occam]: in the same way that Occam's razor cuts off unnecessary entities when explaining a phenomenon, No Bugs' Axe slashes away unnecessary features.

The rationale behind such a harsh approach is not only related to the problems of creeping featuritis mentioned above, but is also related to one obvious (though often ignored) observation: if you do not know what your users really want, you're not able to provide a reasonable API. It is very common that users request one thing, while in reality they need something very different which might be easier to implement for you and (much more importantly) easier for them to use. Let's see this in practice with a concrete example.

Suppose that you're developing your own String class for your project. Originally, following the Axe principle, you've made your String a bare-bones implementation which can only store a string and compare it to another one. Even such a simple implementation is useful for many practical applications. As time goes by, one of the developers using the library comes and asks you to introduce a function find(), similar to strstr() in C. It is certainly easier to just go ahead and implement such a simple function than to argue about it, but according to the Axe principle you should ask why the developer needs it. You ask, and s/he replies: 'Oh, I need to find out if the file extension is .abc, so I want to use find() to detect it.' After giving it a bit of thought, you say, 'Using find() in such a manner is cumbersome and error-prone: would you be happy to use a Java-like endsWith() instead?'; and the answer is 'Sure thing!'
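A minimal sketch of this exchange, written in Python purely for illustration: the `String` class, its `ends_with` method (named after Java's `endsWith()`), and the helper function are all hypothetical, and Python's built-in `str.find` stands in for the requested find(). It shows one way in which the find()-based check goes wrong.

```python
class String:
    """Hypothetical bare-bones string: it can only be stored and compared."""

    def __init__(self, value: str):
        self._value = value

    def __eq__(self, other):
        if not isinstance(other, String):
            return NotImplemented
        return self._value == other._value

    def __hash__(self):
        return hash(self._value)

    def ends_with(self, suffix: str) -> bool:
        """Added only once a concrete use case (file extensions) appeared."""
        return self._value.endswith(suffix)


def has_abc_extension_naive(filename: str) -> bool:
    """What the requester might have written with a general find()."""
    # Bug: true for ".abc" anywhere in the name, not only at the end.
    return filename.find(".abc") != -1


# The find()-based check wrongly accepts a backup file...
naive_result = has_abc_extension_naive("notes.abc.bak")
# ...while the purpose-built method answers the real question.
axe_result = String("notes.abc.bak").ends_with(".abc")
```

Getting the find()-based version right requires comparing the found position against the lengths of both strings, which is exactly the kind of arithmetic that a dedicated method hides from every caller.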

Next time, another developer comes and once again tells you that s/he needs find(). Again, you ask why s/he needs it. This time, the real need is to provide a substring search within a URL for a custom web server extension. After some thought and research, you realize that what this user really needs is not a substring search, but a pattern match, which the developer (knowing too well that its implementation is not trivial) was too humble to ask for. What was really necessary in this case was not a simple find(), but some form of regexp.
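Again as a hedged sketch only: the URL layout below (`/archive/<year>/<name>.abc`) and the function name are assumptions made up for this example, since the original request is not specified in that much detail. The code uses Python's standard `re` module to show how a pattern match can express a rule that a substring search can only approximate.

```python
import re

# Hypothetical rule: the extension handles only /archive/<year>/<name>.abc
ARCHIVE_URL = re.compile(r"/archive/(\d{4})/([\w-]+)\.abc")


def match_archive_url(path: str):
    """Return (year, name) if the whole path fits the pattern, else None."""
    m = ARCHIVE_URL.fullmatch(path)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)


def looks_like_archive_naive(path: str) -> bool:
    """The substring-search approximation the developer first asked for."""
    return path.find("/archive/") != -1


good = match_archive_url("/archive/2010/overload.abc")
# A path that merely contains "/archive/" fools the substring search,
# but is rejected by the pattern match.
fooled = looks_like_archive_naive("/static/archive/readme.txt")
rejected = match_archive_url("/static/archive/readme.txt")
```

The pattern also extracts the year and the name in one step, which the substring version would have to do with further hand-written parsing.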

What about code reuse?

One popular argument in support of including 'everything in sight' is 'if we provide an incomplete API, how can it be reused in the future?' There is only one problem with this argument: it is fundamentally flawed. As has been observed for quite a long time, and recently articulated in [Kelly2010], it is not code reuse which really matters for delivering quality software: it is a set of other properties, like modularity and low coupling, which are of importance, but which are often considered too abstract. On the other hand, development aiming for code reuse tends to be as much as three times more expensive than developing single-use code. To address this conflict between code quality and development costs, [Kelly2010] proposes the approach of 'emergent reuse' - 'don't plan for reuse but look for opportunities to reuse something that has gone before' - which is perfectly consistent with No Bugs' Axe.

On hammer and nails

"I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail."
Abraham Maslow, psychologist

As we have shown above, concrete use cases help to shape APIs in ways which can be hard to envision well in advance. But let's take a closer look: if a general find() function had been provided from the very beginning, would developers have ever come asking for better APIs? All my experience convinces me that most developers will not ask for a new API when a workaround is available (unless the workaround is really horrible to use, and even then it may still be misused, though that is less likely).

Therefore, restricting APIs unless concrete use cases are provided serves one more important purpose: it stimulates the creation of APIs which are the right tool for the job (instead of using a hammer on screws). As shown above, this often helps to keep the code as a whole more readable, less error-prone, and (paradoxically!) more functional.

Subtle points

There are a few important, though subtle, points to understand about the No Bugs' Axe principle:

  • while it prohibits, or at least strongly discourages, implementing features until they're requested, it doesn't mean that you, as an API developer, should not think how you're going to implement a feature if it is requested in the future. The idea behind No Bugs' Axe is not to corner ourselves by relying on a feature never being requested, but exactly the opposite: to keep open all possible options, and the restriction on existing public APIs is one of the means to achieve this.
  • it is important to understand that any refusal to implement a certain feature on the basis of the No Bugs' Axe is essentially temporary: if a satisfactory use case is demonstrated at some point later, the prior objections based on No Bugs' Axe are automatically revoked.
  • nothing is carved in stone. Any attempt to follow a set of rules to the letter is doomed from the very beginning, therefore you should rely on your judgment and common sense. For example, despite being adamant on cutting off unnecessary APIs in the String class above, when the need for endsWith was demonstrated even I myself wouldn't argue against adding the complementary startsWith function. A rationale to provide startsWith would be not to provide a complete API, but to avoid confusion for developers who will reasonably expect to have startsWith when they see endsWith.

Pros and cons

Developing APIs under the No Bugs' Axe principle has some important implications:

  • the API developer must be available to analyze the needs of API users, with round-trip times (from the moment of a request to a reply, even a negative one) in the order of hours, or days at most, rather than months (which is unfortunately often the case with API developers).
  • developers who're using the API should be encouraged to submit requests when they feel new features are necessary. This includes treating even the most silly requests respectfully: in the worst case you can always write a FAQ about requests which you're not going to honour, with a list of workarounds.

In exchange for these (I'm sure minor) inconveniences, the following benefits are obtained from adhering to No Bugs' Axe:

  • it encourages a coding style which uses exactly the right tool for the job
  • it reduces the number of APIs which can be abused
  • it reduces the number of APIs which need to be supported (most likely eternally)
  • it reduces inter-dependencies, making code less rigid and less fragile
  • consistent with Agile development principles, APIs evolve with the project (which is inevitable anyway in the long run), but with much better backward compatibility as it is usually much easier to add a new API rather than to drop or change an existing one.

Other usages

As we have seen, the No Bugs' Axe principle works very well for APIs, but it can be easily extended into other areas, where it is also useful for similar reasons. In particular, the very same principle can be easily applied to user features. In practice, it is often not useful to accept uncritically claims coming from BAs such as 'our user needs such and such checkbox here': ask why. Usually the user (almost) never needs 'a checkbox', but instead the ability to specify something; and a checkbox might not always be the best way to do it. While asking questions like this might not be welcome within the current development culture of many companies, my experience shows that it often helps to improve the quality of the end product, and therefore is beneficial for the company in general.

References

[Agile] Manifesto for Agile Software Development, http://agilemanifesto.org/

[Java] 'How and When To Deprecate APIs', http://download.oracle.com/javase/1.4.2/docs/guide/misc/deprecation/deprecation.html

[Kelly2010] Allan Kelly, 'Reuse Myth - can you afford reusable code' http://allankelly.blogspot.com/2010/10/reuse-myth-can-you-afford-reusable-code.html

[Loganberry2004] David 'Loganberry', Frithaes! - an Introduction to Colloquial Lapine!, Unit 14: Feelings and Emotions; Parts of the Body (2), http://www.loganberry.furtopia.org/bnb/lapine/unit14.html

[Loganberry2006] David 'Loganberry', Frithaes! - an Introduction to Colloquial Lapine!, Dictionary - Lapine to English, http://www.loganberry.furtopia.org/bnb/lapine/dictlaptoeng.html

[Occam] http://en.wikipedia.org/wiki/Occam%27s_razor

[Torvalds2007] Linus Torvalds, 'Why C++ is a horrible language', http://article.gmane.org/gmane.comp.version-control.git/57918

[YAGNI] http://en.wikipedia.org/wiki/You_ain%27t_gonna_need_it and http://www.xprogramming.com/Practices/PracNotNeed.html

1 The word tharn is difficult to translate into human language, but the closest meaning is 'stupefied by terror' [Loganberry2006]
