EDITOR'S EYE

The World of Software Development.

by Jon Erickson
June 07, 2007

Python, Parallelism, and Multicore

If you're not careful, it's easy to start thinking that the push for concurrency and parallelism, being driven (for the most part) by the adoption of multicore processors, only involves the usual suspects: programming languages like C/C++ and FORTRAN, and vendors like Intel, Sun, and Google. But that's not the case, as is evident from the waves that Python is starting to make when it comes to parallel computing.

For instance, in the upcoming August 2007 issue of Dr. Dobb's Journal, Robert Bjornson, Nicholas Carriero, and Stephen Weston will present their article "Python NetWorkSpaces," which examines a new way of writing parallel programs. NetWorkSpaces is an open source framework that's easy to learn, accessible from almost all development environments (including Java, MATLAB, Octave, Python, Perl, and Ruby), and makes it easy to use clusters from within scripting languages like Python, MATLAB, and R.

But NetWorkSpaces isn't the only game in town when it comes to Python and parallelism. Interactive Supercomputing (ISC) has unveiled Star-P 2.5 for Life Sciences, an interactive parallel computing platform that lets you code algorithms and models using tools like Python or MATLAB, then run them instantly and interactively on parallel high-performance computers (HPCs). Star-P 2.5 for Life Sciences addresses a number of performance issues key to life-science research, including the ability to work with data sets of up to 4 terabytes across 512 processors. Star-P 2.5 for Life Sciences also features a 200-300 percent performance improvement for Fast Fourier Transform (FFT) functions on distributed data, which are commonly used in life-sciences signal and image-processing applications. It also offers a tool that shows real-time graphical analysis of how well programs execute, so you can restructure code for better performance.

It's worth noting that ISC also supports Python as a standalone package with its Star-P 2.5 for Python. Star-P 2.5 for Python includes a Python client interface that lets you take advantage of Python-specific numerical libraries such as NumPy and SciPy. These are Python extensions that add support for large, multi-dimensional arrays and matrices, as well as high-level mathematical functions that operate on those arrays. Additionally, many modules from the Python open source community can be run as parallel tasks, which speeds up work that can be executed independently.

Star-P for Python lets you use any of Python's hundreds of functions in a task parallel computation, such as Monte Carlo simulations or "unrolling" serial FOR loops. Additionally, for data parallel computing -- that is, compute-intensive operations on large distributed data sets -- over 50 of the most popular "blockbuster" functions commonly used in technical computing are included.
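To make the task-parallel idea concrete, here is a minimal sketch of "unrolling" a serial loop for a Monte Carlo estimate of pi. It does not use Star-P or its API. It assumes only Python's standard-library concurrent.futures module on a single multicore machine, and the function names (count_hits, estimate_pi_serial, estimate_pi_parallel) are made up for illustration.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def count_hits(seed, samples):
    """One independent task: count random points that land inside
    the unit quarter circle."""
    rng = random.Random(seed)  # per-task generator, so tasks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi_serial(seeds, samples):
    """The serial FOR loop: one iteration per seed, run back to back."""
    total = 0
    for seed in seeds:
        total += count_hits(seed, samples)
    return 4.0 * total / (len(seeds) * samples)


def estimate_pi_parallel(seeds, samples, max_workers=None):
    """The same loop "unrolled": each iteration becomes a task that a
    pool of worker processes can run on separate cores."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        hits = pool.map(count_hits, seeds, [samples] * len(seeds))
        total = sum(hits)
    return 4.0 * total / (len(seeds) * samples)


if __name__ == "__main__":
    seeds = list(range(8))
    print(estimate_pi_serial(seeds, 50_000))
    print(estimate_pi_parallel(seeds, 50_000))
```

Because every iteration depends only on its own seed, the iterations can run in any order on any core, and the two functions should return the same estimate. That independence is what makes a loop a candidate for task parallelism in the first place.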

"Star-P support for Python will enable NumPy users to run their programs on a parallel server or cluster with a handful of trivial syntax changes," said ISC's Ilya Mirman. "Our goal is to enable scientists, engineers and analysts to write number-crunching programs in a comfortable high-level language, and then immediately run their code on parallel systems with the least amount of complexity."

According to a recent study of 600 Python users sponsored by ISC and conducted by Fletcher Spaght, more than a third of respondents (35 percent) said that running their Python applications on high-performance computers would yield significant improvements to their research capabilities. This isn't surprising when you consider that nearly a third (31 percent) of Python programs take more than an hour to run, with 20 percent taking days to complete. At the same time, Python users' data sets are only getting larger -- with 20 percent exceeding 10 GB.

So now you have an idea of why the scientific community in general, and Python users in particular, are interested in the benefits of concurrency, and the power of multicore.

Posted by Jon Erickson at 10:21 AM




