Dr. Dobb's is part of the Informa Tech Division of Informa PLC




Q&A with a TBB Junkie


Developing "lock-free, wait-free, obstruction-free, atomic-free synchronization algorithms and data structures" is his hobby. Based on his frequent postings, he's a "brown belt" ninja contributor on the Intel Software Network Forum, and one of the site's newest bloggers. Meet Dmitriy V'jukov, a Moscow-based high-performance computer systems developer who is an assiduous observer of Intel Threading Building Blocks (TBB) and the adoption of parallelism by developers around the world. Go Parallel invited V'jukov to share his opinions about TBB, the Microsoft Task Parallel Library, other tools to support concurrency and the proposed Intel Parallel Studio.

Q: What is your software development background?

A: I hold a master's degree in computer science from Moscow State Technical University. I have five years of experience as a C/C++ software development engineer, focused mainly on client/server systems and network servers. In my spare time, I work on synchronization algorithms, programming models for multi-core, and multi-threading verification tools.

Q: How long have you been using TBB and for what purpose?

A: I am quite aware of what is happening around and inside TBB, but frankly I have not used TBB "in production." I have studied the user interfaces and implementation of TBB in detail. I've also developed a library for unit-testing and formal verification of synchronization algorithms (or small pieces of multi-threaded code). It's called Relacy Race Detector.

I have had some preliminary conversations with TBB developers about using Relacy Race Detector in the development of TBB. I am going to provide a free license for TBB developers. I had a similar conversation with IBM's Paul McKenney (he works on high-end Intel platforms and Linux technology) about using it in the development of the Linux kernel.

But I'm not sure whether Relacy Race Detector itself will be interesting to the general public, because it's targeted mostly at experts who develop very low-level and complicated algorithms.

Q: What difficulties do you see developers having with TBB?

A: In forums and discussion groups I see that developers face three kinds of problems with TBB algorithms:

1. Task granularity size. In order to achieve good performance, task granularity must be carefully chosen. Tasks that are too fine-grained will lead to high overheads. And tasks that are too coarse-grained will lead to bad scalability due to lack of "parallel slack."

2. Excessive sharing. In order to achieve good scalability, each thread must work mainly with private data. Having each thread, on each iteration, update some global variable (or variables) will turn scalability from a linear positive to a super-linear negative. Task-based programming is especially prone to the problem. Higher-level abstractions (tbb::parallel_reduce, tbb::parallel_scan) incorporate more intelligence to overcome the problem. This strongly suggests that developers should use the highest-level abstractions possible.

3. Locality. Though the modern computer memory sub-system is still called RAM (random-access memory), it is now a complicated, distributed, heterogeneous, hierarchical system. Fortunately, there are very simple tips on how to use it efficiently: First, prefer stride access; second, use all data loaded into the cache; and third, reuse the data in cache while it's still there.

While this advice is applicable to a single-threaded environment too, in a task-based model it's harder to realize whether, for example, access will be in stride or not. Once again, higher-level abstractions are less prone to the problem.
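The locality tips can be illustrated with a minimal single-threaded sketch. It assumes that "stride access" means walking memory sequentially (unit stride), and the function names are hypothetical, not part of TBB. Both functions compute the same sum over a matrix stored in row-major order; only the traversal order differs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a rows x cols matrix stored in row-major order.
// The inner loop touches consecutive elements, so every value
// loaded into a cache line is used before the line is evicted.
long long sum_row_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Same result, but the inner loop jumps `cols` elements at a time.
// For large matrices each access may land on a different cache line,
// so most of the data brought into the cache goes unused.
long long sum_column_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

The two functions always agree on the result. How much slower the second one runs depends on matrix size and hardware, so it is worth measuring rather than assuming.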

Q: How much are these problems with parallel programming vs. problems with TBB in particular?

A: These problems are related to parallel programming in general, and they apply equally to other parallel programming libraries: OpenMP, Task Parallel Library, Cilk, etc.
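Because the problems are not specific to any one library, the excessive-sharing point can be sketched with plain std::thread instead of TBB. The function names below are hypothetical. The first version has every thread update one shared variable on every iteration; the second lets each thread accumulate privately and publish a single partial result, which is roughly the idea a reduction abstraction applies on the developer's behalf.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Excessive sharing: all threads hit the same atomic on every iteration.
long long sum_shared(const std::vector<int>& data, unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;
    std::atomic<long long> total{0};
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&data, &total, chunk, t] {
            const std::size_t begin = std::min(data.size(), t * chunk);
            const std::size_t end = std::min(data.size(), begin + chunk);
            for (std::size_t i = begin; i < end; ++i)
                total.fetch_add(data[i], std::memory_order_relaxed);
        });
    }
    for (auto& w : workers) w.join();
    return total.load();
}

// Private data: each thread sums into a local variable and writes
// its partial result once; the caller combines the partials.
long long sum_private(const std::vector<int>& data, unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;
    std::vector<long long> partial(num_threads, 0);
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&data, &partial, chunk, t] {
            const std::size_t begin = std::min(data.size(), t * chunk);
            const std::size_t end = std::min(data.size(), begin + chunk);
            long long local = 0;
            for (std::size_t i = begin; i < end; ++i)
                local += data[i];
            partial[t] = local;  // one shared write per thread
        });
    }
    for (auto& w : workers) w.join();
    long long total = 0;
    for (long long p : partial) total += p;
    return total;
}
```

Both functions return the same sum. The difference is in how many writes go to shared memory: one per element in the first version, one per thread in the second.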

Q: When you discuss granularity size, are you talking about the general parallel programming issue of task size, or referring to the problematic TBB 1.0 requirement to pick an explicit grain size (which was fixed in TBB 2.0 with the auto_partitioner)?

A: I am talking about the general parallel programming issue of task size.
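The general task-size trade-off can be sketched without TBB. The example below is an illustration under stated assumptions: make_chunks and chunked_sum are hypothetical names, and std::async stands in for a task scheduler (it may start a thread per task, which a real scheduler would avoid). A small grain produces many tiny tasks and high overhead; a grain larger than the input produces a single task and no parallelism at all.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Splits [0, n) into consecutive chunks of at most `grain` elements.
// The number of chunks is the number of tasks that will be created.
std::vector<std::pair<std::size_t, std::size_t>> make_chunks(std::size_t n, std::size_t grain) {
    if (grain == 0) grain = 1;
    std::vector<std::pair<std::size_t, std::size_t>> chunks;
    for (std::size_t begin = 0; begin < n; begin += grain)
        chunks.emplace_back(begin, std::min(n, begin + grain));
    return chunks;
}

// Sums `data` using one asynchronous task per chunk.
long long chunked_sum(const std::vector<int>& data, std::size_t grain) {
    std::vector<std::future<long long>> tasks;
    for (const auto& chunk : make_chunks(data.size(), grain)) {
        tasks.push_back(std::async(std::launch::async, [&data, chunk] {
            return std::accumulate(data.data() + chunk.first,
                                   data.data() + chunk.second, 0LL);
        }));
    }
    long long total = 0;
    for (auto& t : tasks) total += t.get();
    return total;
}
```

For 1,000 elements, a grain of 1 yields 1,000 tasks, a grain of 250 yields 4, and a grain of 5,000 yields 1. A reasonable grain leaves noticeably more tasks than hardware threads, so there is slack for load balancing, while keeping each task large enough to amortize its scheduling cost.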

Q: What's your biggest challenge in concurrent programming?

A: My biggest challenge in concurrent programming is debugging. Things like non-determinism, asynchronism, the absence of a total order of events, and the distribution of state sometimes put the debugging of concurrent systems beyond the human brain's strength. Every "little" error in source code can take several days or weeks to fix. And that's the best-case scenario. In the worst case, you don't know that there is an error until you get the call from an enraged customer. And the customer can't say under what circumstances it happens.

This is a field where I am looking forward to strong tool support of all kinds: static analysis, dynamic analysis, post-mortem analysis, advanced IDE support. I have developed some in-house tools for my own purposes. But not every developer is able to build a comprehensive toolset on their own.

