Synchronized - Dos and Don'ts
- Do look closely at each and every piece of static data in the system. Threads can only clash when sharing static data, or data with a static element somewhere in the reference chain.
- Don't put static data in library routines, as you can't be sure when they'll be used by threaded code. A classic problem is a cache, which is usually a synchronised static map. Because any data retrieved from such a cache has a static element in the reference chain, there is a risk of it being accessed by more than one thread. In my code this happened with the session cache. A single browser inside a single session can post two almost simultaneous requests serviced by two separate server threads - both accessing the same session data.
- Do use synchronized to wrap any code that can be run by more than one thread and accesses common data. Any other thread that reaches a block synchronized on the same object will wait until the first thread has left its block.
- Don't synchronize everything. It's a common beginner's mistake to synchronize everything in a class to forestall any problems. First, it won't stop problems, as deadlocks are still possible; second, it can make a program impossibly slow by creating massive bottlenecks.
- Do keep synchronised code sections to a minimum. I realise that technically this is the same point as the one above - but it's so important that it bears repeating. I can't believe how often I have seen synchronised methods that call logging functions. Logging, like disk or network I/O, can cause delays - not to mention the CPU time required to turn data into human readable form. If it's a popular piece of code, system response drops off while many threads wait for one to complete. Telltale symptoms of this mistake, then, are low CPU usage with long response times.
- Don't use synchronised maps and lists unless they are absolutely necessary. There is no need to synchronise a map that is only ever accessed inside synchronised code in your application. Re-entering a lock your thread already holds is cheap, but a synchronised map locks on its own object, so every call acquires a second lock and incurs a considerable overhead to no advantage. The older Java containers (Vector, Hashtable) were synchronised by default, whereas the newer ones (ArrayList, HashMap) aren't.
- Do think carefully on any synchronised code, because it's virtually impossible to unit test. Problems usually only manifest themselves in production under heavy, varied load, and they aren't easy to reproduce on request.
- Don't assume that code which always works on a single-CPU desktop or test machine will also work on a multi-CPU server. Even desktops will be vulnerable once the new multi-core CPUs hit the market in quantity. A single-CPU system is still linear; even when it looks like it's multi-tasking it's really just swapping between tasks very quickly. A multi-CPU system can have completely separate processors running the same code and accessing the same data. Certainly synchronized should work the same way in both cases, but the completely different mechanisms employed mean that your code will be exercised differently. The end result is that code that works perfectly well on a single-CPU system may fail randomly on a 2 or 4 CPU server.
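To make a few of these points concrete, here is a toy sketch (nothing like real production code, and not taken from my own library). It assumes a hypothetical HitCounter class: the map is a plain HashMap because it is only ever touched inside the synchronized block, the lock is held just long enough to update the count, and the logging call, passed in here as a stand-in Consumer, runs after the lock has been released.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical example class; none of these names come from the original code.
final class HitCounter {
    private final Object lock = new Object();
    // A plain HashMap is enough: it is only ever touched inside synchronized (lock).
    private final Map<String, Integer> hits = new HashMap<>();
    private final Consumer<String> logger; // stand-in for a real logging call

    HitCounter(Consumer<String> logger) {
        this.logger = logger;
    }

    int record(String page) {
        int count;
        synchronized (lock) {
            // Only the shared-state update is locked.
            count = hits.merge(page, 1, Integer::sum);
        }
        // Formatting and logging run after the lock is released,
        // so a slow logger cannot stall other threads.
        logger.accept("page " + page + " has " + count + " hits");
        return count;
    }

    int hitsFor(String page) {
        synchronized (lock) {
            return hits.getOrDefault(page, 0);
        }
    }

    // Usage sketch: several threads hit the same page, then we read the total.
    static int demo(int threads, int perThread) {
        HitCounter counter = new HitCounter(message -> { /* discard the message */ });
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.record("/home");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.hitsFor("/home");
    }
}
```

Calling HitCounter.demo(4, 1000) should report every hit, because no update happens outside the lock. A passing run like this proves very little, though: as noted above, the real trouble tends to show up only under production load.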
I always like to give an example. Unfortunately, thread awareness is such a complex issue that it's impossible to give a simple valid example. The best I could find in my code was in a double-hash database index. Don't try and understand the code out of context - download it at The Adept Library if you want to do that. This is an extreme case for minimising synchronised sections. If the whole method were synchronised it would lock other threads out for a long time. Instead, only the bucket change is synchronised.
Note that the test is made twice - once to see if we need to do it and once inside the synchronised block to check that it has not changed. The odds of two threads deleting the same index at exactly the same time are probably billions to one, but it is not hard to protect against if you think on it. Because only the change is synchronised, care is taken in the rest of the code that reads are consistent even if the data under them changes. This takes thought, but is a lot more efficient than synchronising everything.
boolean delete( int hash, int record) throws IOException
{
    int bucket = getBucket( hash);
    IntegerStack possibles = findInBucket( bucket, hash);
    int deleted = 0;
    int possible;
    while (! possibles.isEmpty())
    {
        possible = possibles.pop();
        if (index.file.getInt( possible) == record)
        {
            index.file.putInt( possible, -1);
            int secondaryBucketLocation
                = (hash >> primaryShift) & primaryMask;
            // first test, made without holding the lock
            if (secondaryBucketCounts[secondaryBucketLocation] >= 0)
            {
                synchronized(this)
                { // only if sharing secondary bucket hash
                    // second test, in case another thread changed the count
                    if (secondaryBucketCounts[secondaryBucketLocation] >= 0)
                    {
                        secondaryBucketCounts[secondaryBucketLocation]--;
                        secondaryBucketLocation
                            = primaryBuckets[secondaryBucketLocation];
                        if (secondaryBucketLocation == -1)
                            // in the middle of a split
                            secondaryBucketLocation = afterSplitLocation;
                        index.file.putInt( secondaryBucketLocation,
                            index.file.getInt( secondaryBucketLocation) - 1);
                    }
                }
            }
            deleted++;
        }
    }
    return deleted > 0;
}
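
For readers who don't want to download the library, the "test, lock, test again" shape of the code above can be sketched in isolation. This is a simplified illustration, not the index code: TicketPool and its methods are made-up names, and I have assumed a volatile counter so that the unlocked first test sees recent writes from other threads.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class illustrating "test, lock, test again"; not from The Adept Library.
final class TicketPool {
    // volatile is an assumption added for this sketch, so the unlocked read is up to date.
    private volatile int remaining;

    TicketPool(int tickets) {
        remaining = tickets;
    }

    boolean claim() {
        if (remaining <= 0) {
            return false;         // cheap first test: skip the lock entirely
        }
        synchronized (this) {
            if (remaining <= 0) {
                return false;     // second test: another thread got there first
            }
            remaining--;          // only ever modified while holding the lock
            return true;
        }
    }

    // Usage sketch: many threads compete for a limited number of tickets.
    static int demo(int tickets, int threads, int attemptsPerThread) {
        TicketPool pool = new TicketPool(tickets);
        AtomicInteger claimed = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < attemptsPerThread; j++) {
                    if (pool.claim()) {
                        claimed.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return claimed.get();
    }
}
```

Once the pool is empty, every further call returns at the first test without touching the lock, which is the whole point of testing twice. The second test is what stops two threads from both taking the last ticket.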








