Adept Software Development

Adept: (A)pplication (D)evelopment (E)nterprise to (P)ersonal (T)ransition. It is a system I am developing to leverage Enterprise developer skills to produce stand-alone software for other market segments. This is a general software development blog discussing issues about project, architecture, design and development. The emphasis will be in Java, but many of the issues will be more general. Almost all will be technical.

http://marringtons.com

Monday, May 16, 2005

Threads and Multitasking

Integrating thread support in the base Object class and in the language syntax wasn't what I'd call a great technological achievement. We've been doing the same for years with very similar procedures in C++. No, the real magic was in making multi-tasking available to everyone. Without it Java would not have been suitable for application servers and probably would never have survived.

While we are discussing historical imperatives, threading wouldn't have been as useful or popular without garbage collection. While creating a threaded system is still in the domain of the specialist, all programmers who develop for application servers are developing code to be run in a thread. Without garbage collection, a memory leak that is minor in a stand-alone application becomes much worse in an application with many threads (all quite possibly running the same code).

Did I mention that multi-tasking is still a rare thing only used by a few specialists in the field? This isn't only because it's harder than normal development, but also because there aren't many occasions where the benefits of appearing to do two things at once are worth the additional expense of task switching. The operative word here is 'appearing'. On a single-processor desktop, only one thread can be doing anything at any one time. Only on a multi-processor server can you hope that multiple threads will execute in parallel. Even then this isn't guaranteed, as the CPU time must be shared between multiple programs, not just threads.

So, when is multi-tasking of value?

  • Application Servers: where one server services multiple clients or browsers.
  • Communications: where messages need to be received when they arrive even if an earlier one is still processing.
  • Spawning: Running external programs and not having to wait until they complete.
  • Housekeeping: Just like at home, collections of data get messier over time. Faster response times can be had by quickly changing and throwing your discarded clothes on the floor. You can then come back and clean up when you have spare time (and I certainly hope my wife isn't reading this). This is why garbage collection, with all its complex overheads, can provide more responsive systems than C/C++ with malloc/free: don't clean up memory as you finish with it, leave it and wait until you have free time. Application software can be the same. I have a cache with an ageing feature. It holds data like a HashMap, but data more than X minutes old is discarded. Age is, of course, checked on retrieval, but what of the case where the data is never retrieved again? I have a separate housekeeping thread that walks through the list when the system is not doing anything else, looking for out-of-date data and removing it. Such data is closed if it implements the Closeable interface. This is very useful for caching connections, since you don't want them held open indefinitely if they are not being used.
  • Prioritised Processing: is a close relative to housekeeping. My object database system has threads for deleted records, indexing and reorganisation. When you ask to delete a record it is flagged and the call returns. A background thread actually deletes the record and frees the space. The same goes for indexing. This can be expensive with multiple indexes, but why delay the calling program?
  • Scheduled Tasks: You don't really want everything to freeze waiting for a specific time or time interval, do you? A classic example is displaying a clock in a GUI application. Start a thread that waits one second, then redisplays the clock before waiting again, and you have a timepiece with a minimum of system impact.
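The ageing cache described under Housekeeping could look roughly like the sketch below. This is not the cache from my own system, only a minimal illustration written in current Java. The names AgeingCache, sweep and startHousekeeper are invented for the example, and the injected clock is an assumption added so the expiry logic can be exercised without waiting. Running the housekeeper at minimum priority is only a hint to the scheduler, not a guarantee that it runs when nothing else is happening.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch of a cache whose entries expire after a maximum age.
class AgeingCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long storedAt;
        Entry(V value, long storedAt) { this.value = value; this.storedAt = storedAt; }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long maxAgeMillis;
    private final LongSupplier clock; // assumption: injectable source of "now" in milliseconds

    AgeingCache(long maxAgeMillis, LongSupplier clock) {
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong()));
    }

    // Age is checked on retrieval: an expired entry is discarded and null returned.
    V get(K key) {
        Entry<V> e = entries.get(key);
        if (e == null) return null;
        if (isExpired(e)) {
            if (entries.remove(key, e)) closeQuietly(e.value);
            return null;
        }
        return e.value;
    }

    // One housekeeping pass: removes every expired entry and returns how many went.
    int sweep() {
        int removed = 0;
        for (Map.Entry<K, Entry<V>> me : entries.entrySet()) {
            Entry<V> e = me.getValue();
            if (isExpired(e) && entries.remove(me.getKey(), e)) {
                closeQuietly(e.value);
                removed++;
            }
        }
        return removed;
    }

    // Starts a low-priority daemon thread that sweeps at a fixed interval.
    Thread startHousekeeper(long intervalMillis) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(intervalMillis);
                    sweep();
                }
            } catch (InterruptedException stop) {
                // Interrupting the thread is how the housekeeper is shut down.
            }
        }, "cache-housekeeper");
        t.setDaemon(true);
        t.setPriority(Thread.MIN_PRIORITY);
        t.start();
        return t;
    }

    private boolean isExpired(Entry<V> e) {
        return clock.getAsLong() - e.storedAt > maxAgeMillis;
    }

    // Discarded values are closed if they implement Closeable (e.g. connections).
    private void closeQuietly(V value) {
        if (value instanceof Closeable) {
            try {
                ((Closeable) value).close();
            } catch (IOException ignored) {
                // Nothing useful can be done with a failure while discarding.
            }
        }
    }
}
```

In real use the clock would simply be System::currentTimeMillis, for example new AgeingCache<String, Object>(5 * 60 * 1000L, System::currentTimeMillis) followed by startHousekeeper(60 * 1000L).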

The use of threads can be divided into two diametrically opposite groups. Application servers and communications are examples of making a single service appear dedicated to each client while actually serving many masters. Spawning, housekeeping and prioritised processing are all examples of providing more responsiveness by deferring tasks to a less busy time.
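Deferring work to a less busy time, as in the delete-a-record case, can be sketched as follows. This is an illustration only and not the code of my object database: DeferredDeleteStore and its methods are made-up names, and an in-memory map stands in for real storage. The caller's delete() only flags the record and queues it, while a background thread does the actual removal.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: deletes are flagged immediately and carried out later.
class DeferredDeleteStore {
    private final Map<Integer, String> records = new ConcurrentHashMap<>();
    private final Set<Integer> flagged = ConcurrentHashMap.newKeySet();
    private final BlockingQueue<Integer> pending = new LinkedBlockingQueue<>();

    void put(int id, String data) {
        records.put(id, data);
    }

    // A flagged record is treated as deleted even if it is still stored.
    String get(int id) {
        return flagged.contains(id) ? null : records.get(id);
    }

    // Returns at once: the record is only flagged and queued for the worker.
    void delete(int id) {
        if (records.containsKey(id) && flagged.add(id)) {
            pending.add(id);
        }
    }

    // Number of records still physically held, flagged or not.
    int storedCount() {
        return records.size();
    }

    // Starts the background thread that physically removes flagged records.
    Thread startWorker() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    Integer id = pending.take(); // blocks while there is nothing to do
                    records.remove(id);
                    flagged.remove(id);
                }
            } catch (InterruptedException stop) {
                // Interrupting the thread is how the worker is shut down.
            }
        }, "deferred-delete");
        worker.setDaemon(true);
        worker.setPriority(Thread.MIN_PRIORITY);
        worker.start();
        return worker;
    }
}
```

The sketch ignores awkward cases such as re-adding a record while its delete is still pending. A real system would need to decide what that should mean.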

The reason these two use groups both work well with Java lightweight threads is that they both involve more waiting than working. While one thread is waiting, others can be busy without appearing to slow anything down.
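The "more waiting than working" point can be seen with a small experiment. The sketch below is illustrative only, with WaitOverlap and elapsedMillis being invented names: it starts several threads that do nothing but wait, and measures how long it takes for all of them to finish.

```java
// Hypothetical sketch: threads that mostly wait overlap their waiting time.
class WaitOverlap {
    // Starts threadCount threads that each sleep for waitMillis,
    // waits for all of them, and returns the total elapsed milliseconds.
    static long elapsedMillis(int threadCount, long waitMillis) {
        Thread[] threads = new Thread[threadCount];
        long start = System.nanoTime();
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    Thread.sleep(waitMillis); // stands in for waiting on I/O or a timer
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for threads", e);
            }
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }
}
```

Calling WaitOverlap.elapsedMillis(4, 200) would be expected to take a little over 200 milliseconds rather than 800, even on a single processor, because the four waits overlap. Replace the sleep with real computation and the picture changes: on one processor the work has to be done one piece at a time.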

There's that word again: On desktop systems, it appears that appearance is everything.

Wednesday, May 04, 2005

Mainframe to Stand-Alone to Client/Server to Web - The Full Circle

This is an observation that may seem so obvious that it does not need writing. Then again, I have not read it elsewhere, so here goes...

In the sixties and seventies the corporate computing environment was dominated by mainframes. These big monsters required regular feeding of funds and staff. As business began to rely on the information processing they provided, the emerging IT departments gained more and more power. Is this reading like a bad fantasy novel yet?

Then came the 1980s (it was hard not to notice, the hair was so bad). IBM had looked at the geeky micro-computer market and seen a cheaper and more flexible terminal to connect to their mainframes. Those of us using micro-computers at the time were not impressed. Their hybrid 8/16-bit 8088 was much slower than the second-generation Z80s we were using - and way more expensive.

But we were stupid. The thing was, it wasn't about the technology - that can always be improved - it was about the culture. The guys with suits bought them by the dozen: they saw freedom from the control the IT department had been exerting. By the time Lotus had taken the VisiCalc idea and made it work with big sheets of data, the market was set. Rather than the intelligent mainframe workstation with a bit of word processing thrown in that IBM had envisaged, office workers were running their own programs to get the work done locally.

The two armies faced off, and the battle raged. Desktops sprouted databases (dBASE) and 4GL solutions. The mainframers fought back with client/server applications, a sort of compromise that would leave IT departments back in charge. Client/server was a failure. It was a lot more expensive to develop than mainframe-only and had severe reliability problems. While it had better looks, it was as slow as we were used to for mainframe applications.

And then the popular front did themselves in. They got all excited about an information presentation technology called the Internet. This is not a criticism of the Internet, but where we saw information at our fingertips, the IT Department saw servers under their control sending information to terminals under their control (called browsers this time).

Almost every enterprise application in the last seven years has been web-based - meaning application servers with nothing but a browser on your super-computer of a desktop.

Is it just me, or have we come full circle in the last 25 years? Sure, the browser has replaced the dumb terminal and the Sun Solaris server has replaced the IBM mainframe, but what else has really changed? It looks a lot prettier, but we are waiting just as long for a page to load now as we did then. We work on enterprise systems where 4 seconds between pressing a button and getting a display is acceptable. How would you like it if Word or Excel behaved this way?

Whatever happened to distributed processing?