June 15, 2009
Break Up and Interleave Work to Keep Threads ResponsiveOption 2:Use Reentrant Message Pumping
Option 2 could be subtitled: "Cooperative multitasking to the rescue!" It will be familiar to anyone who's worked on systems based on cooperative multitasking, such as early versions of MacOS and Windows.
Example 2 illustrates the simplest implementation of Option 2, where instead of "saving stack frames on the heap" in the form of continuations, we instead just keep our in-progress state right on the stack where it already is and use nesting (aka reentrancy) to pump additional messages.
In a moment we'll consider several options for implementing Yield. For now, the key is just that the Yield call contains a message pump loop that will process waiting messages, which is what gives them the chance to get unblocked and interleave with this loop.
Option 2 avoids the performance and memory overhead of separate allocation and queueing we saw in Option 1, but it leads to bigger stacks. Stack overflow shouldn't be a problem, however, unless stack frames are large and nesting is pathologically frequent (in which case, there are bigger problems; see next paragraph).
Probably the biggest advantage of this approach is that the code is simpler. We just have to call Yield every so often to allow other messages to be processed, and we're golden ... or so we think, because unfortunately it's not actually quite that easy. The code is simpler to write, but requires a little more care to write correctly. Why?
Remember: Beware State Changes When Interleaving, Really
Just as we saw with continuations, so with any other interleaving including Yield: Whenever the code Yields it must be aware that the thread's state can be changed by the code that executed on this thread during Yield operation. It cannot in general assume that any data that is private to the thread, including thread-local variables, remains unchanged across the Yield call. With continuations, the issue was a bit easier to remember because that style already requires the programmer to explicitly save the local state off to the side and then return, so that by the time we get to the code where the continuation resumes it's easy to remember that time has passed and the world might have changed.
When using Yield-reentrancy instead of continuations, it's easier to forget about this effect, in part because it really is (too) easy to just throw a Yield in anywhere. To see how this can cause problems, assume for a moment that semantically we don't care if nested messages change the contents of items (which was the case in the discussion around Example 1(b)). That is, assume we don't necessarily need to process a snapshot of the state, but only get through items until we're done. Even with those relaxed requirements, do you notice a subtle bug in Example 2?
Consider: What happens if a nested message changes the size of the items collection? If that's possible, then the collection contains at least i objects before the Yield and the expression items[i] is valid, but after the Yield the collection may no longer contain i objects and so items[i] could fail.
In this simple example, we can apply a simple fix. The issue is that we have a window between observing that i < items.size() and using items[i], so the simplest fix is to eliminate the problematic window by not yielding in between those points:
Now the code is resilient to having the items collection change during the call.
Of course, as in the discussion around Example 1(b), it's possible that we may not want the collection to change at all, for example to ensure we're processing a consistent snapshot of the collection's state. If so, we may need to save even more off to the side just like we did in Example 1(b), except here it's easier to do because we don't have to mess with a continuation object:
Notice that we can now correctly use myItems[i] both before and after Yield because we've insulated ourselves against the problematic state change.
|
|
||||||||||||||||||||||||||||||
|
|
|
|