Adept Software Development

Adept: (A)pplication (D)evelopment (E)nterprise to (P)ersonal (T)ransition. It is a system I am developing to leverage Enterprise developer skills to produce stand-alone software for other market segments. This is a general software development blog discussing issues about project, architecture, design and development. The emphasis will be in Java, but many of the issues will be more general. Almost all will be technical.

http://marringtons.com

Wednesday, April 27, 2005

The Semaphore

Java makes multi-tasking easy with the synchronized keyword. This can be applied to a method as part of the signature or used to wrap a code block, by providing an instance to synchronise against. If a second thread attempts to enter a synchronised section, it's blocked until the first one completes.

This is all well and good, but Java prior to 1.5 did not provide semaphores. A semaphore is a flag that will cause a thread to wait until another thread tells it to continue. I use semaphores as much as, if not more often than, synchronised blocks.

I use semaphores in CGI so that the HTTP server thread can wait until the external program completes. The HTTP server itself has a waitUntilClosed method that can be used to block the main thread until the server itself signals completion. Semaphores are also invaluable in the database code so that index generation, physical writes and housekeeping can be done in the background, yet concerned threads can be put on hold when necessary.

Before semaphores it was common, although always considered bad practice, to write a polling loop. This is a loop that just keeps checking for a desired result until it comes out positive. Polling loops are always a bad idea because they eat CPU time like it's going out of style, and do very little by comparison.

All Java objects have wait and notify methods that provide all the functionality. In the Adept Library I have chosen to wrap these in a Semaphore object for convenience - and, as always, to make the code easier to read. This object handles interrupts and allows for cases where resume() can be called before pause().
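A minimal sketch of such a wrapper follows. It is not the Adept Library source: the class name SimpleSemaphore and the choice to keep waiting through an interrupt are assumptions made for illustration. Only pause() and resume() are names taken from the text above.

```java
/** Illustrative binary semaphore built on wait/notifyAll; not the Adept Library source. */
final class SimpleSemaphore {
    // Remembers a resume() that arrives before pause().
    private boolean signalled = false;

    /** Blocks the caller until resume() is called; returns at once if it already was. */
    public synchronized void pause() {
        boolean interrupted = false;
        while (!signalled) {            // loop guards against spurious wake-ups
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true;     // assumption: keep waiting, restore the flag later
            }
        }
        signalled = false;              // reset so the semaphore can be reused
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    /** Releases a thread blocked in pause(), or lets the next pause() fall straight through. */
    public synchronized void resume() {
        signalled = true;
        notifyAll();
    }
}

class SemaphoreDemo {
    public static void main(String[] args) {
        SimpleSemaphore finished = new SimpleSemaphore();
        Thread worker = new Thread(() -> {
            System.out.println("worker: external program running");
            finished.resume();          // tell the waiting thread to continue
        });
        worker.start();
        finished.pause();               // no polling: sleeps until the worker signals
        System.out.println("main: worker has completed");
    }
}
```

The boolean flag is what makes an early resume() safe: without it, a notify sent before the waiter arrives would simply be lost.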

If a thread is core to the interactive components of an application, it's a good idea for it to not block forever. Perhaps the thread that was to resume the process has died or become lost - in multi-threading, anything can happen. Of course this generates code similar to a polling loop, but if the wait period is long enough there will not be a serious performance cost.
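One way to sketch a bounded wait is shown here. The class name TimedSemaphore, the pause(long) signature and the 100 ms interval are all hypothetical, chosen only to illustrate the idea of waking up periodically instead of blocking forever.

```java
/** Illustrative semaphore with a bounded wait; names and signatures are assumptions. */
final class TimedSemaphore {
    private boolean signalled = false;

    /**
     * Waits for resume() for at most timeoutMillis.
     * Returns true if resumed, false on timeout or interrupt.
     */
    public synchronized boolean pause(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!signalled) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false;                       // timed out
            }
            try {
                // remaining > 0, so this is never the wait-forever call wait(0, 0)
                wait(remaining / 1_000_000L, (int) (remaining % 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        signalled = false;
        return true;
    }

    public synchronized void resume() {
        signalled = true;
        notifyAll();
    }
}

class TimedWaitDemo {
    public static void main(String[] args) {
        TimedSemaphore reply = new TimedSemaphore();
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(250);                  // stand-in for real background work
            } catch (InterruptedException e) {
                return;
            }
            reply.resume();
        });
        worker.start();

        int attempts = 0;
        while (!reply.pause(100)) {                 // wake up periodically rather than block forever
            if (++attempts == 50) {
                System.out.println("giving up: no reply");
                return;
            }
            System.out.println("still waiting...");
        }
        System.out.println("worker replied");
    }
}
```

The outer loop does look like polling, but the thread sleeps inside wait() between checks, so the cost is a handful of wake-ups rather than a spinning CPU.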

In summary, if you need to wait for something to happen, use semaphores rather than polling. For a pack of good examples, head over and download the Adept Library.

Monday, April 18, 2005

Exceptions and Tiers

I recently worked on a project where each tier had its own exception class - messy. To make things worse, each call between tiers was wrapped in a try-catch to translate it into the exception class of the calling tier. This made the code bloated and harder to read.

The only valid reason to catch and translate an exception is when you have additional important information to add. For example: converting a string to an integer can throw a library exception saying what happened, but only the calling function has enough information to tell which field failed. Of course if it's a programming error, the first exception is adequate as the stack trace will tell where it comes from. If, however, the data is from an untrusted source (user, XML stream, etc) - the user will require the field name and source. In this case the calling method will trap the conversion exception and add the additional information.
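The string-to-integer case might be sketched as follows. FieldFormatException and FormParser are hypothetical names, and making the exception unchecked is an assumption; the point is only that the caller adds the field name and source while keeping the original exception as the cause.

```java
/** Hypothetical exception carrying the context only the caller knows. */
final class FieldFormatException extends RuntimeException {
    FieldFormatException(String fieldName, String source, String rawValue, Throwable cause) {
        super("Field '" + fieldName + "' from " + source
                + " is not a whole number: '" + rawValue + "'", cause);
    }
}

final class FormParser {
    /** Converts untrusted text to an int, naming the field and source on failure. */
    static int parseIntField(String fieldName, String source, String rawValue) {
        try {
            return Integer.parseInt(rawValue);
        } catch (NumberFormatException e) {
            // The library knows what went wrong; only this method knows which field it was.
            throw new FieldFormatException(fieldName, source, rawValue, e);
        }
    }
}

class FormParserDemo {
    public static void main(String[] args) {
        System.out.println(FormParser.parseIntField("age", "signup form", "42"));
        try {
            FormParser.parseIntField("age", "signup form", "4x");
        } catch (FieldFormatException e) {
            System.out.println(e.getMessage());
        }
    }
}
```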

Exceptions can be divided into a limited set of groups.

  1. Development Errors (aka bugs): Occur when the code fails because of faulty coding (or possibly design). It is hoped that all of these will be eliminated before a system goes to production. Realistically, a few will always hang around. In Java they are most commonly of type NullPointerException. Where they are explicitly created they should be unchecked exceptions since they need not be dealt with in the normal course of events. It's not uncommon for developers to throw these all the way back to the GUI as unsightly error pages. This isn't a good solution as the user loses their place and gets a very negative impression about the stability of the application. Equally unsuitably, some commercial applications just silently swallow these sorts of errors - at best just logging them. While this is less scary for the user, it can often mean that they will not get the results they expect - and that the problem will never get dealt with. I've found that the best way of reporting them is to treat them as special validation messages. The user will see the form with a "sorry, we seem to have a problem" message in one of the validation message fields. At least this way they can try for a work-around.
  2. Validation Errors: At the other end of the exception spectrum are errors caused by the interface user. These can be as simple as a number out of range or as complex as not being able to kill the dragon because you forgot to pick up the fire wand 5 levels higher in the dungeon. From a development viewpoint the application should not break and the user should be informed of their fault in a clear, accurate and polite way. (Alas! We told you not to sell the 'Wand of Dragon Slaying'. But did you listen to us? No! The name of this game is Dragon Slayer, for Thor's sake.)
  3. Internal Exceptions: Attempting to turn a string into a number with the Java Integer class can cause an exception if the string contains non-digits. You have two choices: either pre-parse the string or catch the exception and translate it into a validation. Most libraries will throw exceptions inappropriate for your application. Fortunately they are normally checked, so the compiler will nag you. Catch them as early as possible and translate them into either development error or validation exceptions - or deal with them if by some miracle that is possible. Do this in implementation layers and don't sully business logic with such confusing code.
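The three groups above could be sketched like this. DevelopmentError, ValidationException and InputTranslator are hypothetical names, and making validation exceptions checked is an assumption rather than something the text prescribes. The two library exceptions being translated (ParseException and UnsupportedEncodingException) are real checked exceptions from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

/** Bugs: unchecked, since normal code need not deal with them. */
class DevelopmentError extends RuntimeException {
    DevelopmentError(String message, Throwable cause) { super(message, cause); }
}

/** User mistakes: checked here (an assumption) so callers must report them. */
class ValidationException extends Exception {
    ValidationException(String message, Throwable cause) { super(message, cause); }
}

/** Implementation-layer helper that translates library exceptions early. */
final class InputTranslator {
    /** A bad date typed by the user becomes a validation error. */
    static Date parseDate(String fieldName, String rawValue) throws ValidationException {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setLenient(false);
        try {
            return format.parse(rawValue);
        } catch (ParseException e) {
            throw new ValidationException(
                    fieldName + " must be a date in the form yyyy-MM-dd", e);
        }
    }

    /** A missing UTF-8 charset cannot be the user's fault, so it is a development error. */
    static byte[] encode(String text) {
        try {
            return text.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new DevelopmentError("UTF-8 should be available on every JVM", e);
        }
    }
}
```

Business logic that calls InputTranslator sees only the two application exception types, never the library ones.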

Monday, April 11, 2005

Transferring Data Between Tiers

Tiers and slices are simply mechanisms for visualising the age-old principle of divide and conquer. For the ultimate in maintainability and reuse, it is best to keep the divisions as clear as possible.

A software object emulates objects in the real world by being a combination of information and functionality. To keep code regions separate and independent, however, it's best if they don't know anything about the internal functionality of their neighbours. You may ask an object to do a job or provide a result, but separation means you should not know about or rely on the methods the object uses to achieve that. You should know that you can use your calculator to add up, but you shouldn't be weighing up the variable resistance methods within the microchip that are used to represent binary data. The key to tiered design is to be creatively stupid.

So, when calling a service in another tier we should provide and receive pure information without attached functionality. When the said information is larger than a single primitive it's often called a data transfer object or DTO.

There's another good reason for a DTO. If your application is J2EE EJB or otherwise designed so that it can be distributed, information will be passed by value rather than reference.

For the separation to occur it is important that the DTO be clean. It should contain mostly primitives, other DTOs or well-known classes that do not cause too much interdependence. Where possible, the latter should be immutable. There's nothing worse than receiving a DTO by RMI from a remote server that includes the whole session or security structure as a field. Not only is it massive to transfer over the wire, but you cannot rebuild it locally without almost all the remote code existing locally in an up-to-date format. On the other hand a Map of String is quite acceptable since both sides will be using a common library.
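As an illustration, a clean DTO might look something like this. CustomerSummaryDto and its fields are invented for the example; the only real APIs used are java.io.Serializable and the java.util collections.

```java
import java.io.Serializable;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical DTO: a primitive, a String and a Map of String, nothing tier-specific. */
final class CustomerSummaryDto implements Serializable {
    private static final long serialVersionUID = 1L;

    private final long customerId;
    private final String name;
    private final Map<String, String> attributes;

    CustomerSummaryDto(long customerId, String name, Map<String, String> attributes) {
        this.customerId = customerId;
        this.name = name;
        // Defensive, read-only copy so later changes by the caller cannot leak in.
        this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
    }

    long getCustomerId() { return customerId; }

    String getName() { return name; }

    Map<String, String> getAttributes() { return attributes; }
}
```

Everything in it comes from the standard library, so a remote client can deserialise it without any of the server's own classes.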

I prefer DTOs to be very specific. I intensely dislike DTOs floating around with partially filled fields depending on what was asked for. It's also bad to have DTOs with information extraneous to requirements. Not only is it confusing when maintenance is required, but it also means you are retrieving information that is not required - often at considerable expense in resources. This can happen if you attempt to pass a DTO through more than one tier.

The persistence tier, for example, will probably have DTOs that match the database tables. There's a temptation when providing a service at the service layer to create a DTO that has persistence tier DTOs as fields. Resist. Firstly you are exposing too much of your database structure, secondly your service is presenting a complex graph to its client in a form not logical for that view and thirdly it's not uncommon for data to require different formatting from different viewpoints. In an extreme case, the persistence layer may be using java.sql.Date while the service layer uses Calendar and the GUI tiers deal with a string including a formatted date.
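The date case can be sketched as two small conversions, one at each tier boundary. DateTierConversions and the dd/MM/yyyy display pattern are assumptions for illustration; java.sql.Date, Calendar and SimpleDateFormat are the standard JDK classes.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;

/** Hypothetical helper showing one date in three tier-specific forms. */
final class DateTierConversions {
    /** Persistence tier to service tier: java.sql.Date becomes a Calendar. */
    static Calendar toCalendar(java.sql.Date sqlDate) {
        Calendar calendar = Calendar.getInstance();
        calendar.setTimeInMillis(sqlDate.getTime());
        return calendar;
    }

    /** Service tier to GUI tier: Calendar becomes a display string. */
    static String toDisplayString(Calendar calendar) {
        SimpleDateFormat format = new SimpleDateFormat("dd/MM/yyyy", Locale.ENGLISH);
        format.setTimeZone(calendar.getTimeZone());
        return format.format(calendar.getTime());
    }
}

class DateTierDemo {
    public static void main(String[] args) {
        java.sql.Date stored = java.sql.Date.valueOf("2005-04-11");
        Calendar inService = DateTierConversions.toCalendar(stored);
        System.out.println(DateTierConversions.toDisplayString(inService)); // prints 11/04/2005
    }
}
```

Each tier's DTO then holds the date in the form that tier works with, and no tier needs to import its neighbour's date type.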

The biggest valid complaint against using DTOs in a clean compartmentalised manner is the need to be continually copying the contents at each tier interface - in both directions. I have seen tier interface services that are just masses of copy statements. Updating a DTO without changing all the copy code is a common source of subtle bugs.

The Adept library object package has a lot of support for DTOs. Specifically there is a DTO helper class with static methods for transferring data between DTOs and POJOs, both in bulk and given a list of required fields. It is a deep copy operation. There are also classes to convert DTOs to/from XML streams for data transfer and to/from name-value pairs for screen or form population and retrieval.

Using deep copy methods such as this allows you to remove the interface layers on each side of each tier since the copy can become a single line part of the logic layer of the tier.
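To give a feel for the single-line copy, here is a rough reflection-based sketch. It is not the Adept library helper: DtoCopier, copyFields, CustomerRow and CustomerView are hypothetical names, and this version copies references to the named fields rather than performing the deep copy the library is described as doing.

```java
import java.lang.reflect.Field;

/** Hypothetical copier: moves the named fields between two objects by reflection. */
final class DtoCopier {
    static void copyFields(Object source, Object target, String... fieldNames) {
        for (String name : fieldNames) {
            try {
                Field from = source.getClass().getDeclaredField(name);
                Field to = target.getClass().getDeclaredField(name);
                from.setAccessible(true);
                to.setAccessible(true);
                to.set(target, from.get(source));   // shallow: fine for primitives and Strings
            } catch (ReflectiveOperationException | IllegalArgumentException e) {
                throw new IllegalStateException("Cannot copy field '" + name + "'", e);
            }
        }
    }
}

/** Persistence-tier shape: mirrors an imaginary table. */
class CustomerRow {
    long id;
    String name;
    String passwordHash;
}

/** Service-tier shape: only what the client asked for. */
class CustomerView {
    long id;
    String name;
}

class DtoCopierDemo {
    public static void main(String[] args) {
        CustomerRow row = new CustomerRow();
        row.id = 42L;
        row.name = "Ann";
        row.passwordHash = "not for export";

        CustomerView view = new CustomerView();
        DtoCopier.copyFields(row, view, "id", "name");  // the one-line tier crossing
        System.out.println(view.id + " " + view.name);
    }
}
```

A misspelt or missing field name fails loudly at the copy, which is easier to track down than a hand-written copy statement that was simply forgotten.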

Tuesday, April 05, 2005

Tiers for Logical Application Separation

This is the first of three articles on tiers in software development. The next two will focus on data transfer and exception processing.

Current thinking in software architecture is that applications should be designed and implemented in clear tiers. Think of a chocolate layer cake. The icing is the GUI tier - the one you see. Each layer below is a tier providing unique functionality.

Examples

A standard PHP web application is a single tier design. The same code that accesses the database also displays the pages. From another perspective it is a 3-tier design with the browser providing the GUI tier and the database engine the persistence tier. Still, the developer can only change the single central tier.

A client/server system is a clear 2-tier design. There is a clear division between the 2 parts of the application - to the extent that they are usually running on different systems.

N-tier systems are less easy to see from the outside, as N-tier is a development model rather than a physical separation.

An N-Tier Pattern

I divide an application into tiers and tiers into layers. Really there is no difference except for logical grouping.
  1. GUI Tier
    1. GUI display layer - responsible for display and retrieval of data only. It expects information ready to display and passes information back as it comes from the user. In a web application it is code and information passed to the browser for rendering. Because it is the only part of the application that will be operating locally to the user, it may include some code for validation of input and manipulation of display for output.
    2. GUI support layer - Manipulates information as the program sees it to produce information that the browser renders and the user sees. In traditional web server applications the JSP or ASP is the prime GUI support layer.
    3. GUI transfer layer - is the primary interface with the next tier. Information to be displayed is consolidated here and formatted into the form to be placed on the screen. A date object, for example, will be turned into a string to be displayed. The same goes for the reverse direction. When the user enters information, this is the second level of validation as strings are translated to internal form.
  2. Business Logic Tier
    1. Service Layer - The latest buzz-word is SOA (Service Oriented Architecture). This is the layer that supplies the exported services, and can talk with the GUI tier above, a fat client or a Tuxedo interface for a remote request from another server. The service layer should not be involved in the infrastructure required to deliver the service. Its prime responsibility is to respond to requests, receiving and sending messages as simple data structures without code or other ties to the underlying system (DTO). It will need to validate parameters and apply security as required.
    2. Definition Layer - Here the definitions for business logic - as defined in the design documents - have been translated to code. Code here should be simple, clean and clear - able to be compared one-to-one with the design documents. Don't confuse the definitions with implementation, and don't validate parameters, catch and process exceptions or any other implementation code that can 'muddy the waters' when reviewing business process. Each method should be a clear list of actions with branches and loops.
    3. Implementation Layer - This is where the business work is done. Each of the actions used in the matching Definitions layer will be implemented with all the nasties of exception processing, data retrieval and consolidation. While the code in this layer will be dirty with detail, if the business instructions in the definition layer are finely grained, then methods should not be too large or hard to follow. Any refactoring to make use of common code should be done at this layer rather than the Definitions layer, for the sake of clarity on the higher tiers.
  3. Persistence Tier (Database, mail, IPC and such)
    1. Interface Layer - This is effectively an internal service layer. For a database package it would encompass the domain model representation, providing enough information in a single method to satisfy service layer requests without loading too much additional information from the tables. The service layer should not know about persistence internals, so this layer provides a level of separation.
    2. Implementation Layer - The implementation layer is more finely grained than the interface layer. It's typically one-to-one with the database tables or other interface services. It's often provided by external packages (e.g. Hibernate or javax.mail), although it can also include system-local interfaces to external packages.
    3. Helper Layer - In the persistence tier above all others, the various tables and interfaces referred to in the implementation layer will require common code for processing. In a full OO design these would be part of the super class. Because tables and interfaces often use external packages that cannot always be subclassed, common support code will need to be in separate objects. Put these in a separate Helper layer for clarity.
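The split between the service, definition and implementation layers of the business logic tier might be sketched as below. Every name here (AccountStore, TransferService and so on) and the funds-transfer scenario are invented for illustration, with an in-memory map standing in for the persistence tier.

```java
import java.util.HashMap;
import java.util.Map;

/** Persistence tier, interface layer: all the business tier gets to see. */
interface AccountStore {
    long balanceOf(String accountId);
    void setBalance(String accountId, long cents);
}

/** In-memory stand-in for a real persistence implementation. */
final class InMemoryAccountStore implements AccountStore {
    private final Map<String, Long> balances = new HashMap<>();
    public long balanceOf(String accountId) { return balances.getOrDefault(accountId, 0L); }
    public void setBalance(String accountId, long cents) { balances.put(accountId, cents); }
}

/** Business tier, implementation layer: the detail and the nasties live here. */
final class TransferImplementation {
    private final AccountStore store;
    TransferImplementation(AccountStore store) { this.store = store; }

    void requireFunds(String accountId, long cents) {
        if (store.balanceOf(accountId) < cents) {
            throw new IllegalStateException("Insufficient funds in " + accountId);
        }
    }
    void withdraw(String accountId, long cents) {
        store.setBalance(accountId, store.balanceOf(accountId) - cents);
    }
    void deposit(String accountId, long cents) {
        store.setBalance(accountId, store.balanceOf(accountId) + cents);
    }
}

/** Business tier, definition layer: reads like the design document. */
final class TransferDefinition {
    private final TransferImplementation steps;
    TransferDefinition(TransferImplementation steps) { this.steps = steps; }

    void transfer(String from, String to, long cents) {
        steps.requireFunds(from, cents);
        steps.withdraw(from, cents);
        steps.deposit(to, cents);
    }
}

/** Business tier, service layer: validates parameters, then delegates. */
final class TransferService {
    private final TransferDefinition definition;
    TransferService(TransferDefinition definition) { this.definition = definition; }

    void transfer(String from, String to, long cents) {
        if (from == null || to == null || cents <= 0) {
            throw new IllegalArgumentException("from, to and a positive amount are required");
        }
        definition.transfer(from, to, cents);
    }
}
```

The definition layer contains no validation or exception handling at all, so a reviewer can compare its three lines directly with the business rule.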