January 05, 2009
Parallel LINQ
Paul Kimmel
PLINQ lets you tap into extra power but with little extra work
Paul is an applications architect for EDS and author of LINQ Unleashed for C#. He can be contacted at pkimmel@softconcepts.com.
Multicore processors are a standard part of computing these days. For instance, the laptop I'm writing this article on has a 64-bit Intel Core 2 Duo processor and 3 GB of RAM. That's a lot of computing power for single-threaded code. Luckily, Parallel LINQ (PLINQ), which is part of the Parallel FX extensions for .NET, lets me use the basic LINQ keywords to tap into that extra power with little extra work on my part. The Parallel FX Library is a managed concurrency library that includes PLINQ and the Task Parallel Library (TPL).
The basic use of PLINQ is to add a reference to the downloaded System.Threading.dll, installed by default at C:\Program Files\Microsoft Parallel Extensions Jun08 CTP, and call the IParallelEnumerable.AsParallel extension method on your collection. IParallelEnumerable<T> inherits from IEnumerable<T>, and the AsParallel call generally appears in a LINQ query at the end of the "from range in collection" clause. For instance, if the collection were an array of integers named numbers, then you would replace collection with numbers.AsParallel().
Listing One uses the Sieve of Eratosthenes to determine if a number is a prime. A list of primes is built using the sieve technique, and a LINQ query runs through a bunch of numbers testing for primality. Listing One(a) is the sequential query and One(b) the parallel version.
Listing One
If this is the first time you've used LINQ queries, then the queries are on the lines starting with var and followed by = from p in candidates...where...select.... The LINQ queries look somewhat like inverted SQL queries.
Again, the first LINQ query in Listing One is sequential and the second is parallel. The AsParallel extension method kicks off the background threads. At minimum, calling AsParallel is all you need to do to use PLINQ and multiple threads for your LINQ queries.
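As a rough illustration of the sequential and parallel queries described above, here is a minimal sketch. It is not the author's Listing One. It assumes the PLINQ API as it later shipped in System.Linq (.NET 4 and later) rather than the June 2008 CTP, and the names PrimeDemo, BuildPrimes, and IsPrime are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrimeDemo
{
    // Sieve of Eratosthenes: returns all primes <= limit.
    public static List<int> BuildPrimes(int limit)
    {
        var composite = new bool[limit + 1];
        var primes = new List<int>();
        for (int i = 2; i <= limit; i++)
        {
            if (composite[i]) continue;
            primes.Add(i);
            // Mark multiples of i, starting at i * i, as composite.
            for (long j = (long)i * i; j <= limit; j += i)
                composite[j] = true;
        }
        return primes;
    }

    // Trial division by the sieved primes.
    // Only valid while n <= limit * limit for the sieve that built 'primes'.
    public static bool IsPrime(int n, List<int> primes)
    {
        if (n < 2) return false;
        foreach (int p in primes)
        {
            if ((long)p * p > n) break;
            if (n % p == 0) return false;
        }
        return true;
    }

    public static void Main()
    {
        List<int> primes = BuildPrimes(1000);
        int[] candidates = Enumerable.Range(1, 100000).ToArray();

        // Sequential query.
        var sequential = from p in candidates
                         where IsPrime(p, primes)
                         select p;

        // Parallel query: the only change is AsParallel() on the source.
        var parallel = from p in candidates.AsParallel()
                       where IsPrime(p, primes)
                       select p;

        // Both queries should report the same number of primes.
        Console.WriteLine(sequential.Count());
        Console.WriteLine(parallel.Count());
    }
}
```

The primes list is only read inside the query, never modified, which is what makes it safe to share across the worker threads in this sketch.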
By default, the original sequence order is not maintained when you invoke a PLINQ query. If you want the order of the sequence maintained, then call AsOrdered. However, maintaining order incurs some performance penalties. On my Dell Precision 470 workstation, the parallel version of this ran consistently slower than the sequential version. Let's explore some reasons performance may actually degrade for parallel operations.
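The ordering behavior can be sketched as follows. This is an illustration only, again assuming the shipped System.Linq PLINQ API rather than the CTP; OrderingDemo, EvensUnordered, and EvensOrdered are hypothetical names.

```csharp
using System;
using System.Linq;

public static class OrderingDemo
{
    // No ordering requested: PLINQ may return results in any order.
    public static int[] EvensUnordered(int[] numbers)
    {
        return (from n in numbers.AsParallel()
                where n % 2 == 0
                select n).ToArray();
    }

    // AsOrdered asks PLINQ to preserve the source order, at some cost.
    public static int[] EvensOrdered(int[] numbers)
    {
        return (from n in numbers.AsParallel().AsOrdered()
                where n % 2 == 0
                select n).ToArray();
    }

    public static void Main()
    {
        int[] numbers = Enumerable.Range(1, 20).ToArray();
        Console.WriteLine(string.Join(", ", EvensUnordered(numbers)));
        Console.WriteLine(string.Join(", ", EvensOrdered(numbers)));
    }
}
```

With an input this small the unordered results may well happen to come back in source order, so a difference is more likely to show up on larger sequences.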
Understanding Exclusion Scenarios for PLINQ
PLINQ works with LINQ to Objects and LINQ to XML. PLINQ isn't intended for use with LINQ to SQL or LINQ to Entities. Why? Because for those technologies the IQueryProvider implementation basically translates LINQ queries into SQL queries, which are processed by the SQL engine instead of in memory.
CPU-bound queries that process large amounts of data, perform intensive computations, or both will benefit more from PLINQ than queries that are I/O bound, such as queries against the filesystem or SQL Server.
You might reasonably wonder why a parallel capability couldn't just automatically be added to LINQ behind the scenes. The answer is that programmer involvement is needed to handle data impurity, concurrency exceptions, thread affinity, ordering expectations, and poorer-than-expected performance in some scenarios. Adverse conditions for parallelism are referred to as 'parallelism blockers' and include:
Finally, optimal parallelism is affected by Amdahl's law, which says roughly that performance speedup is limited by the amount of code that must be processed sequentially. That is, even after the code and data are partitioned, some of the work must still run sequentially. That sequential work, together with all of the synchronizing, partitioning, cross-thread communicating, and results-merging, detracts from the speedup. The net effect is that having two or four processors does not necessarily mean that your code will run two times or four times faster, respectively.
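Amdahl's law is usually written as follows, where P is the fraction of the work that can be parallelized and N is the number of processors. The value P = 0.9 in the worked example is an illustrative assumption, not a measurement from this article.

```latex
S(N) = \frac{1}{(1 - P) + \dfrac{P}{N}}

% Worked example with an assumed P = 0.9:
S(2) = \frac{1}{0.1 + 0.45} = \frac{1}{0.55} \approx 1.82
\qquad
S(4) = \frac{1}{0.1 + 0.225} = \frac{1}{0.325} \approx 3.08
```

Under that assumption, two processors give roughly a 1.8x speedup and four give roughly 3.1x. The formula does not account for partitioning, synchronization, and merging overhead, so real-world speedups would tend to be lower still.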