Archief voor categorie social
ScheduledThreadPoolExecutor horribly broken
Geplaatst door Peter Doornbosch in java, mobility, social, technical op 29 maart 2007
A while ago, I considered using the Java 5 ThreadPoolExecutor class for executing
remote calls asynchronuously. The application I was working on needs to perform remote
calls on large numbers of devices, and as remote calls can take quite a while, you
don’t want these remote calls to wait on each other. Moreover, as remote calls might fail,
e.g. due to network problems, a retry mechanism was also needed. The
ScheduledThreadPoolExecutor, a subclass of the ThreadPoolExecutor, additionally allows you to
schedule a task at a certain delay, which offers a simple and elegant solution for the
retry mechanism: when a remote call fails, I only had to re-schedule it with a
proper (increasing) delay and the scheduler would take care of it.
Thanks to the fact the ThreadPoolExecutors provides an abstract method afterExecute()
that is called after execution of the task, I didn’t have to pollute my task
implementation with retry logic, but could clearly separate these concerns. In the
afterExecute() method of the (subclassed) ScheduledThreadPoolExecutor, I could ask the
task whether it had succeeded, and if not I could simply reschedule it. And all this
just in a few lines of code:
void afterExecute(Runnable task, Throwable exception) {
if ((RemoteCallerTask) task).failed()) {
super.schedule(task, 10, TimeUnit.SECONDS);
}
}
When I first tested it, I got a ClassCastExeception. My first guess was that it might
have something to do with different class loaders, but when I ran it in a debugger it
turned out to be something that realy surprised me: the Runnable task that was passed
to this method was not my do-a-remote-call task that I had passed to the executor, but
something of a completely different type
(ScheduledThreadPoolExecutor$ScheduledFutureTask
Maybe I misinterpreted the documentation? I went back to the ThreadPoolExecutor
javadoc. It talked about “methods that are called before and after execution of each
task”, and the parameter description claimed the Runnable parameter to be “the runnable
that has completed”. This seemed to match my expectations: you execute a Runnable task
and that is what is passed to afterExecute. That would be the only sensible definition
of a after-execution hook, wouldn’t it? As the source code is the best documentation, I
checked the ThreadPoolExecutor source, which confirmed what I was expecting: the task
that is run is passed to the beforeExecute() and afterExecute() hooks.
A little bit of studying on the ScheduledThreadPoolExecutor source revealed why I got a
ClassCastExeception: it wraps the (user supplied) task in a ScheduledFutureTask object
before passing it to the base class (that puts it in the task queue). One of the
reasons why this wrapper is needed, is because the ScheduledThreadPoolExecutor uses a
DelayQueue to store the tasks and elements of this queue must implemented Delayed
(i.e. have a method that returns the delay). This type of queue sorts tasks based on
the delay: shorter delay comes first. When taking an element from the queue, it blocks
until the delay of the first task has passed. Using this type of queue makes the
implementation of the ScheduledThreadPoolExecutor quite simple: it wraps the task in a
wrapper that implements getDelay() and puts these wrappers in the queue.
Although I can appreciate the beauty of using a DelayQueue in combination with a normal
ThreadPoolExecutor, I don’t think it is the right solution. The point is that it breaks
one of the fundamental principles of OO programming and that is that derived classes
should respect the contract defined by the base class(es) (and or interfaces). The contract
that the base class ThreadPoolExecutor defines, is that it will call the hook methods
with your task as a parameter. ScheduledThreadPoolExecutor breaks this contract, as it
does not adhere to what its base class has promised.
This break of contract shouldn’t be taken lightly. It makes code that uses these
executors fragile: if for some reason someone decides to use the other class as
implementation of the general executor service, existing code might break. Put
differently: in order to make your code robust, you would to have to take into account
which executor implementation was chosen, at different points in your code. This
violates principles of encapsulation and abstraction: code should never depend on
implementation types, only on interfaces.
I was pretty disappointed that even in the concurrency API, such fundamental
mistakes can be found. Moreover, it appeared this is not the only break of contract, and that fixing this properly doesn’t seem to have any priority, but more on that later.
