Multicore Testing Requires Real Parallelism to Happen
Testing an application built to run concurrent code can become a nightmare on old-fashioned testing platforms. Multicore testing requires new techniques, new expertise, and new hardware. For example, you cannot guarantee a parallelized application's accuracy by testing it only on computers with single-core microprocessors.
I'm going to borrow a sentence from Bram Stoker's "Dracula":
"We learn from failure, not from success!"
One of the most frustrating experiences in multicore programming is a parallelized application that generates unexpected, seemingly random problems. It is even more annoying when that application had already passed the testing process. Why could this happen? Because testing techniques also have to Go Parallel.
Usually, the best computers (workstations or servers) are dedicated to running the final version of an application. Nowadays, a server is very likely to have at least four logical processing cores (four hardware threads).
You can parallelize an existing algorithm and debug it on a dual-core CPU (two logical processing cores, two hardware threads). Then you can run an extensive testing process on many different dual-core computers (again, two logical processing cores, two hardware threads). The application could produce accurate results and work as expected. However, when the application runs on the server, something could go wrong. A hidden bug could appear, a bug triggered by an interleaving of threads that the tests never explored.
Two hardware threads do not guarantee real concurrency for the whole time the algorithm is scheduled to run in parallel. The big problem is that the operating system's scheduler, the kernel, and all the other processes and software threads are competing for processing time. They can prevent real concurrency from happening, because the application's two threads are not always running in parallel. This situation can hide some concurrency bugs. It's a question of timing: some instructions never run at the same time because other threads are stealing processing time.
However, when you move to the server, its additional hardware threads (logical cores) allow the software threads to run truly in parallel. Hence, real concurrency will happen. Pure concurrency bugs will appear because the instructions that produce the problem can now run at exactly the same time.
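To make the idea concrete, here is a minimal C++ sketch of a lost-update bug. It is not taken from any real application, and the helper name run_counter and its parameters are hypothetical. Each worker adds 1 to a shared counter. In the unsafe mode the update is split into a separate read and write, so when two threads really overlap, one can overwrite the other's result. A start gate holds the workers back until all of them exist, which gives them a better chance to overlap.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: starts `workers` threads that each add 1 to a shared
// counter `iterations` times, then returns the final counter value.
// With atomic_increment == false the update is a separate load and store,
// so overlapping threads can lose updates.
long run_counter(unsigned workers, long iterations, bool atomic_increment) {
    std::atomic<long> counter{0};
    std::atomic<bool> go{false};  // start gate: workers wait until it opens

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            while (!go.load(std::memory_order_acquire)) {
                std::this_thread::yield();  // gate still closed
            }
            for (long i = 0; i < iterations; ++i) {
                if (atomic_increment) {
                    counter.fetch_add(1);        // one indivisible read-modify-write
                } else {
                    long seen = counter.load();  // read the current value
                    counter.store(seen + 1);     // write back: may discard another thread's update
                }
            }
        });
    }

    go.store(true, std::memory_order_release);  // open the gate for all workers
    for (auto& t : pool) {
        t.join();
    }
    return counter.load();
}
```

Calling run_counter(4, 100000, true) always returns 400000. Calling run_counter(4, 100000, false) can return anything up to 400000, and how far it falls short depends on how much the threads actually overlapped. On a machine where they rarely run at the same time, the unsafe version can look almost correct, which is exactly how such a bug slips through testing.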
How can you detect these pure concurrency bugs? You have to use appropriate hardware to let real parallelism happen. You cannot test a parallelized algorithm on single-core microprocessors. You need more logical cores, more hardware threads. You have to use hardware that matches the kind of parallelization you intend to create. This doesn't mean that you need 256 logical cores to develop an application capable of scaling to that number of cores. It does mean that, sometimes, two logical cores aren't enough.
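One simple way to act on this is to make a test run check the machine before it trusts its own results. The following C++ sketch is only an illustration. The function names and the idea of a required thread count are my assumptions, not part of any testing tool. It relies on std::thread::hardware_concurrency(), which the standard describes as a hint that may return 0 when the value cannot be determined.

```cpp
#include <thread>

// Hypothetical policy helper: true when the reported number of hardware
// threads is at least `required`. A report of 0 means "unknown", so it is
// treated as not enough.
bool enough_hardware_threads(unsigned reported, unsigned required) {
    return reported != 0 && reported >= required;
}

// Applies the policy to the machine running the tests.
// hardware_concurrency() is only a hint, so treat the answer as advisory.
bool can_run_concurrency_tests(unsigned required) {
    return enough_hardware_threads(std::thread::hardware_concurrency(), required);
}
```

A test suite could call can_run_concurrency_tests(4) at startup and report its concurrency tests as skipped, rather than passed, when the answer is false. The required count of 4 is just an example and should match the hardware the application is deployed on.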
Once you face these horrible, difficult-to-detect bugs, you'll learn to create better parallelized algorithms. You'll learn many things from failure. The recently launched Intel® Parallel Studio offers an excellent toolbox for detecting these bugs. It is available for the C and C++ programming languages.
Most modern IDEs (Integrated Development Environments) are adding features to help developers detect and solve these bugs. However, I do believe Intel® Parallel Studio is the most complete toolbox. I'd love to see versions for .NET and the JVM (Java Virtual Machine) in the future.
Don't forget to check your testing platforms and environments before deploying the final version of a parallelized application. By doing so, you'll avoid terrifying concurrency nightmares.
This Week's Multicore Reading List
MATLAB and Google App Engine
Logging In C++: Part 2
Improving log granularity
A Conversation with BitMagic's Developer
Prefer Structured Lifetimes: Local, Nested, Bounded, Deterministic
November 17, 2009
Visual Effects for Animation - presented by DreamWorks Animation
Speaker: Ron Henderson
Ron Henderson manages the FX Tools group at DreamWorks Animation, where he is responsible for developing physical simulation and procedural modeling tools. These systems have been used for key visual effects in recent films such as Kung Fu Panda and Monsters vs. Aliens (March 2009).
Prior to joining DreamWorks in 2002 he was a senior scientist at Caltech with a joint appointment to the Applied Math and Aeronautics departments, where he worked on efficient techniques for the direct numerical simulation of fluid turbulence.
Abstract:
In this webinar, Ron Henderson will show examples of visual effects, from hair and feathers to smoke and fire, from a variety of DreamWorks Animation feature films. He will discuss in general terms the kinds of techniques used to achieve particular visual effects. Finally, Henderson will show a detailed breakdown of the dam-breaking scene from Madagascar: Escape 2 Africa, demonstrating how different elements of key frame animation, simulation, and rendering are combined in a real production shot.
December 1, 2009
A Quick and Easy Way to Parallelize a Legacy Codebase with Intel® Threading Building Blocks (TBBs)
Speaker: Bernard Laberge, Avid, Senior Principal Engineer
Bernard Laberge is a senior principal engineer in the video editors division at Avid. During his seven years with the company he has been actively involved in the replacement of the legacy video processing engines used by Avid editors with a common hardware-abstracted, component-based video processing engine currently running on the CPU with SIMD optimized code, GPU, and dedicated hardware.
Abstract:
Learn how to overcome the limitations of a thread-based scheduler, including dealing with the absence of recursive parallelism support and the inefficient handling of unbalanced processing load. Bernard Laberge addresses how Avid resolved the expensive refactoring of their thread-based scheduler into a task-based solution by choosing Intel® Threading Building Blocks (TBBs). He explores how Avid was able to easily integrate the Intel TBBs into their video editor applications and more than 5 million lines of code.
December 15, 2009
How to Use Intel® Parallel Studio to Streamline Code Development in a Multicore Environment
Speaker: Matt Dunbar, Director for Performance Technology, SIMULIA
Matt Dunbar is the director for performance technology at SIMULIA. Since joining the company in 1993, he has worked on parallelization of the Abaqus suite of products, initially for shared memory architectures and more recently for distributed memory architectures. Dunbar has also been intimately involved in selecting both the hardware and software tools used in the development of the Abaqus product line.
Abstract:
Resolve elusive, costly multithreading errors quickly and efficiently with Intel® Parallel Studio. While many coding problems that lead to bugs in software applications are typically straightforward logic errors, errors in managing memory and in multithreading code can sometimes take weeks to months to diagnose and fix. Matt Dunbar explores how and why taking advantage of multicore processors through multithreaded code is critical for compute-intensive applications. While spotlighting his work on SIMULIA's Abaqus finite element solver, Dunbar addresses the need for multicore execution and shares his experiences using Intel Parallel Studio to streamline code development in a multicore environment.



