January 08, 2009
volatile vs. volatileHerb Sutter
A tale of two similar but different tools
Herb is a bestselling author and consultant on software development topics, and a software architect at Microsoft. He can be contacted at www.gotw.ca.
What does the volatile keyword mean? How should you use it? Confusingly, there are two common answers, because depending on the language you use volatile supports one or the other of two different programming techniques: lock-free programming, and dealing with 'unusual' memory. (See Figure 1.)
[Click image to view at full size]
Figure 1: A tale of two requirements.
Adding to the confusion, these two different uses have overlapping requirements and impose overlapping restrictions, which makes them appear more similar than they are. Let's define and understand them clearly, and see how to spell them correctly in C, C++, Java and C# -- and not always as volatile.
[Click image to view at full size]
Table 1: Comparing the overlapping but different requirements.
Case 1: Ordered Atomic Variables For Lock-Free Programming
Lock-free programming is all about doing communication and synchronization between threads using lower-level tools than mutex locks. In the past and even today, however, these tools are all over the map. In rough historical order, they include explicit fences/barriers (e.g., Linux's mb()), order-inducing special API calls (e.g., Windows' InterlockedExchange), and various flavors of special atomic types. Many of these tools are tedious and/or difficult, and their wide variety means that lock-free code ends up being written differently in different environments.
The last few years, however, have seen a major convergence across hardware and software vendors: The computing industry is coalescing around sequentially consistent ordered atomic variables as the default or only way to write lock-free code using the major languages and OS platforms. In a nutshell, ordered atomic variables are safe to read and write on multiple threads at the same time without doing any explicit locking because they provide two guarantees: their reads and writes are guaranteed to be executed in the order they appear in your program's source code; and each read or write is guaranteed to be atomic, all-or-nothing. They also have special operations such as compareAndSet that are guaranteed to be executed atomically. See [1] for further details about ordered atomic variables and how to use them correctly.
Ordered atomic variables are available in Java, C# and other .NET languages, and the forthcoming ISO C++ Standard, but under different names:
A Word About Optimization
We're going to look at how ordered atomics restrict the optimizations that compilers, CPUs, cache effects, and other parts of your execution environment might perform. So let's first briefly review some basic rules of optimization.
The most fundamental rule of optimization in pretty much any language is this: Optimizations that rearrange ('transform') your code's execution are always legal if they don't change the meaning of the program, so that the program can't tell the difference between executing the original code and the transformed code. In some languages, this is also known as the 'as if' rule -- which gets its name from the fact that the transformed code has the same observable effects 'as if' the original source code had been executed as written.
This rule cuts two ways: First, an optimization must never make it possible to get a result that wasn't possible before, or break any guarantees that the original code was allowed to rely on, including language semantics. If we produce an impossible result, after all, the program and the user certainly can tell the difference, and it's not just 'as if' we'd executed the original untransformed code.
Second, optimizations are permitted to reduce the set of possible executions. For example, an optimization might make some potential (but not guaranteed) interleavings never actually happen. This is okay, because the program couldn't rely on them happening anyway.
Ordered Atomics and Optimization
Using ordered atomic variables restricts the kinds of optimizations your compiler and processor and cache system can do. [3] There are two kinds of optimizations to consider:
First, all of the ordered atomic reads and writes on a given thread must execute exactly in source code order, because that's one of the fundamental guarantees of ordered atomic variables. However, we can still perform some optimizations, in particular, optimizations that have the same effect as if this thread just always executed so quickly that another thread didn't happen to ever interleave at certain points.
For instance, consider this code, where a is an ordered atomic variable:
Is it legal for a compiler, processor, cache, or other part of the execution environment to transform the above code into the following, eliminating the redundant write in line A?
The answer is, 'Yes.' This is legal because the program can't tell the difference; it's as if this thread always ran so fast that no other thread accessing a concurrently ever got to interleave between lines A and B to see the intermediate value. [4]
Similarly, if a is an ordered atomic variable and local is an unshared local variable, it is legal to transform
to
which eliminates the read from a. Even if another thread is concurrently trying to write to a, its as if this thread always ran so fast that the other thread never got to interleave between lines C and D to change the value before we can read our own back into local.
Second, nearby ordinary reads and writes can still be reordered around ordered atomic reads and writes, but subject to some restrictions. In particular, as described in [3], ordinary reads and writes can't move upward across (from after to before) an ordered atomic read, and can't move downward across (from before to after) an ordered atomic write. In brief, that could move them out of a critical section of code, and you can write programs that can tell the difference. For more details, see [3].
That's it for lock-free programming and ordered atomics. What about the other case that some "volatiles" address?
|
|
||||||||||||||||||||||||||||||
|
|
![]() |
||||
![]() |
|
|||
![]() |
||||
![]() |
||||
![]() |
|
|||
![]() |
|
|||