Adept Software Development Blog for Architects & Developers

Ever since we started developing software there have been calls for code reusability. First there was (and still is) the library. In the 1980s we talked about the 'black box', meaning component objects where only the interfaces were published. Later COM extended this principle. Then came object-oriented design and we talked about objects. Now we have beans, ActiveX components, EJBs, applets, scriptlets and a myriad of other ways to provide code for reuse.

Even when technologies work together, their views of generalisation differ. For example, an EJB uses objects. Conventionally, objects can have instance and common (static) data. Objects used by EJBs, however, can have separate 'common' data - uncommon data. I digress. This article is about when to write specific code and when to generalise.

Why generalise code? There are two valid reasons:

Code reuse.
Clarity.

Generalisation for Code Clarity

Let's take clarity first because it is the easiest. Clarity is paramount. Self-documenting code is far easier to maintain than a long stream of unrelated groups of statements.

public Account getAccount()
  {
    User user = getUser();
    Account account = readAccount( user);
    updateTransactions( account);
    return account;
  }

private Account readAccount( User user)
  {
    // connect to account system, retrieve and translate account details
    ... lots of technical code ...
  }

private void updateTransactions( Account account)
  {
    // Retrieve recent transactions and update the account details accordingly.
    ... lots of technical code ...
  }

The public getAccount() method tells us clearly, in functional terms, what is involved in retrieving account details, and the method names match the business requirement. The private methods readAccount() and updateTransactions() are never used elsewhere, but they remove implementation details from the functional code. It can also make sense to separate functional code from implementation code into different objects, quite possibly in different application tiers.

In short, code separation for clarity is the main use for generalisation techniques and should be practiced constantly.

Generalisation for Code Reuse

Everyone leaves university with the belief that every line of code they write is sacred and will be used over and over again (or was I just that pig-headed?). Unfortunately, no-one is taught how the rest of the world will come to know about, and use, these new pearls of the developer's art. In fact, there is a substantial set of benefits in writing code specific to the task:

It's clearer because the internals are not generalised (accountKey instead of key).
It's more concise, because the best generalised code must take into account conditions that in a specific instance would not occur. Why check for a null parameter in a specific method when the one caller cannot - under any conditions - pass a null? For a general method, one must cover for outcomes not obvious for any one caller.
For the same reason, it's faster to write - since we can design the internals to match the known user, we don't have to wrap our heads around all the possible uses that our new code could be put to.
It's easier to maintain because there is no fear that a change will cause other callers to behave differently. How often have we seen code that relies on quirks of a known interface rather than just its published uses? How often does this happen by accident?
It's easier on system testing, since changes to more generalised code are more likely to require broad regression testing.
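
To make the first two points concrete, here is a minimal sketch contrasting a specific method with a general one. Every name in it (AccountStore, ownerOfAccount, lookup) is hypothetical and invented for illustration; none of it comes from a real project.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example; all names are illustrative only.
class AccountStore {
    private final Map<String, String> ownersByAccountKey = new HashMap<>();

    void put(String accountKey, String owner) {
        ownersByAccountKey.put(accountKey, owner);
    }

    // Specific: written for one known caller that always passes a valid,
    // non-null account key, so the name is precise and no defensive
    // checks are needed.
    String ownerOfAccount(String accountKey) {
        return ownersByAccountKey.get(accountKey);
    }

    // General: has to defend against callers we know nothing about,
    // so it carries extra parameters, checks and branches.
    static <K, V> V lookup(Map<K, V> map, K key, V fallback) {
        if (map == null || key == null) {
            return fallback;
        }
        V value = map.get(key);
        return value != null ? value : fallback;
    }
}
```

The specific method is shorter and reads in the language of the requirement. The general one is only worth its extra branches once a second caller really exists.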

For the sake of impartiality, here's the argument for code reuse:

Changes are made in one place - and affect all callers.
Smaller code base.
Behaviour is consistent across callers.

Hmm, do we see a trend here? Personally I follow this checklist:

If I do not know of another use for the code I will write it in a way totally specific to the requirement.
If I suspect that other parts of the application are likely to use the same or similar code I will take care that the code involved is fairly separate. I will also take care that this does not take extra time. There will be no general interface or other non-specialised code.
When a second caller requires nearly or completely identical code I will review the common code and refactor it as required. It should go no higher up the object tree than the common need.
If I identify the need for a low-level common object I will be tempted to take the time to create it. I do not, however, add a more general interface than I need. Why account for float and double parameters when you only ever use the int ones? Only when the additional functionality is needed will I update the library class.

Pitfalls of Early Generalisation

You'll spend excessive time adding tests and interfaces that will not be used in case they are needed later.
You'll end up with code that has an excessive number of if() statements or similar branches to cater for different clients.
You'll have obscure object inheritances making it difficult to find who is doing what.

Do you want to see a beauty?

public static boolean isSet(Object o) {
    if (o == null) {
        return false;
    } else if (o instanceof Boolean) {
        return isBooleanSet((Boolean) o);
    } else if (o instanceof String) {
        return isStringSet((String) o);
    } else if (o instanceof Long) {
        return isLongSet((Long) o);
 ...

This one might be useful if the calling code did not know the type of the object, but in every case in the project that uses this method, it does!
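
When the caller already knows the type, a type-specific check does the job and the compiler picks the right one. The sketch below is only an illustration: the class name SetChecks is hypothetical, and what 'set' means for each type is my assumption, since the original isBooleanSet(), isStringSet() and isLongSet() helpers are not shown.

```java
// Hypothetical sketch; the meaning of "set" for each type is assumed.
class SetChecks {
    // Assumed: a String is "set" when it is non-null and not blank.
    static boolean isSet(String s) {
        return s != null && !s.trim().isEmpty();
    }

    // Assumed: a Boolean is "set" when it is non-null and true.
    static boolean isSet(Boolean b) {
        return b != null && b.booleanValue();
    }

    // Assumed: a Long is "set" when it is non-null and non-zero.
    static boolean isSet(Long l) {
        return l != null && l.longValue() != 0L;
    }
}
```

Each overload is resolved at compile time from the declared type of the argument, so there is no instanceof chain to maintain and no silent fall-through for a type nobody planned for.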

Code Generalisation Methods

The simplest and most common is at the method level, internal to an object. As we are creating the class we see that some code is needed in more than one place and refactor it into a private method so that both callers can use it. This usually also makes the calling methods easier to read.
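
As a minimal sketch of that method-level refactoring, consider a hypothetical Invoice class (the class and its methods are invented for illustration) where two public methods share one private helper:

```java
import java.util.List;

// Hypothetical example; names are illustrative only.
class Invoice {
    private final List<Long> lineAmountsInCents;

    Invoice(List<Long> lineAmountsInCents) {
        this.lineAmountsInCents = lineAmountsInCents;
    }

    long netTotal() {
        return sumLines();
    }

    long grossTotal(int taxPercent) {
        long net = sumLines();
        return net + net * taxPercent / 100;
    }

    // Extracted once a second method needed the same loop. It stays
    // private, so it goes no higher than the common need.
    private long sumLines() {
        long total = 0;
        for (long amount : lineAmountsInCents) {
            total += amount;
        }
        return total;
    }
}
```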

Subclassing can be used to place generalised code in the parent class to be used by children when needed. While the code is not as visible as when it is in the working class, it is clearly associated with the object hierarchy. The same technique can be used to separate functional from implementation code, with the restriction that Java only allows single inheritance.
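
A rough sketch of the subclassing approach follows. The parent holds the shared code and each child supplies only what differs. All class and method names here are hypothetical.

```java
import java.util.Locale;

// Hypothetical example; names are illustrative only.
abstract class AccountReader {
    // Generalised code lives in the parent and is inherited by every child.
    final String describe(String accountKey) {
        return label() + ":" + normalise(accountKey);
    }

    // Shared helper; children may override it if they ever need to.
    protected String normalise(String accountKey) {
        return accountKey.trim().toUpperCase(Locale.ROOT);
    }

    // The only part each child must supply.
    protected abstract String label();
}

class SavingsAccountReader extends AccountReader {
    @Override
    protected String label() {
        return "SAVINGS";
    }
}

class LoanAccountReader extends AccountReader {
    @Override
    protected String label() {
        return "LOAN";
    }
}
```

Someone reading SavingsAccountReader alone will not see normalise(); that is the visibility cost mentioned above.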

Helpers are separate objects, or static methods in a separate class, that provide common code. A modern code library is a collection of helpers. Care must be taken with code helpers to ensure that all developers know of their existence. Because they are not physically connected to a class (as in inheritance) they can often be lost, leading to inconsistencies and code duplication.
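
For illustration, a helper in this sense might look like the following sketch. MoneyHelper and formatCents() are hypothetical names, and the javadoc comment is there because it is the only thing that advertises the helper to other developers.

```java
import java.util.Locale;

// Hypothetical example; names are illustrative only.
final class MoneyHelper {
    private MoneyHelper() {
        // Static methods only; never instantiated.
    }

    /**
     * Formats an amount held in cents as a decimal string, for example
     * 1234 becomes "12.34". Long.MIN_VALUE is not handled by this sketch.
     */
    static String formatCents(long cents) {
        String sign = cents < 0 ? "-" : "";
        long abs = Math.abs(cents);
        return String.format(Locale.ROOT, "%s%d.%02d", sign, abs / 100, abs % 100);
    }
}
```

Nothing ties MoneyHelper to the classes that use it, which is exactly why a helper like this gets lost and rewritten elsewhere.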

A bean is an independent item with a clear interface that can be used to ask it questions or have it perform actions. A bean is in truth the implementation of the software black box.
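
A minimal sketch of such a bean is shown below. It assumes the plain JavaBeans conventions (a no-argument constructor, get/set properties, Serializable). AccountBean and its members are hypothetical, and a real bean would normally be a public class in its own file.

```java
import java.io.Serializable;

// Hypothetical example; names are illustrative only.
class AccountBean implements Serializable {
    private static final long serialVersionUID = 1L;

    private String accountKey;
    private long balanceInCents;

    public AccountBean() {
        // No-argument constructor, as the JavaBeans conventions expect.
    }

    public String getAccountKey() {
        return accountKey;
    }

    public void setAccountKey(String accountKey) {
        this.accountKey = accountKey;
    }

    public long getBalanceInCents() {
        return balanceInCents;
    }

    public void setBalanceInCents(long balanceInCents) {
        this.balanceInCents = balanceInCents;
    }

    // A question the bean can answer.
    public boolean isOverdrawn() {
        return balanceInCents < 0;
    }

    // An action the bean can perform.
    public void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceInCents += cents;
    }
}
```

Callers see only the published methods; how the balance is stored stays inside the box.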

How to Find General Code - The Unanswered Question

Code generalisation is a wonderful thing. It attracts designers and developers like moths to a flame. But, to carry on with the metaphors - there is a fly in the ointment. No-one has found an even marginally successful method of documenting common code in a way that lets potential users know that it exists. Sure, we all familiarise ourselves with the core libraries of the packages we use (do we?). We'll also look for libraries that fill our needs. The problem arises internal to a project. Most developers will develop a component for a complex system by looking for and finding a similar component and duplicating its functionality. Common code may be pushed up the inheritance tree or refactored into helpers, but unless the team is small and tightly knit or the communications are very good, only a small percentage of the developers will make use of the new tools provided. Enforcing clear javadoc helps - if it is read. What other techniques are useful?