Sunday, January 25, 2009

Parallel Programming in .Net 4.0 and VS2010: Part I – The Parallel Task Library (Parallel.For(), Parallel.ForEach() and Parallel.Invoke())

Copyright 2008-2009, Paul Jackson, all rights reserved

Update 2/9/2010 – This original article was written using the Visual Studio 2010 CTP.  I’ve since updated the information in a new article which takes into account changes in the API as of the release of the VS2010 RC.

The Parallel Extensions, which include the Parallel Task Library, Parallel LINQ and the Coordination Data Structures, have been available as a CTP for some time now and run well in Visual Studio 2008, but will be officially released as part of .Net 4.0.

To use the library now, in VS2008, you must download and install the Parallel Extensions library.  Once installed, add a reference to the library’s assembly to your project:

[Screenshot: Add Reference dialog showing the System.Threading extension assembly]

Yes, that’s System.Threading there with a 1.x version number.  The library install registers the assembly with Visual Studio, and the classes are in the System.Threading namespace.  It appears they’ll be built in for .Net 4.0, but for now we have an extension assembly with the same name as a core .Net namespace.  So when you type "System.Threading." and don’t get any of the extension classes you were expecting in IntelliSense, remember you’ll have to add this reference.
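
Once the reference is in place, a quick smoke test confirms that the extension types resolve; this is just a trivial sketch using the same Parallel.For overload the rest of this article relies on:

using System;
using System.Threading;   // the CTP's Parallel class lives here

class SmokeTest
{
    static void Main()
    {
        // If this compiles and runs, the extension assembly is referenced correctly.
        Parallel.For(0, 4, delegate(int i)
        {
            Console.WriteLine("iteration {0}", i);
        });
    }
}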

Parallel.For()

In this article, we’re going to look at the Parallel Task Library and what it makes available for parallelizing for and foreach loops.

Starting with a simple, single-threaded application that does some work a few times:

            for (int i = 0; i < 100; i++)
            {
                doWork(i);
            }

And where doWork() simply does a lot of math on random numbers:

        private static void doWork(int instance)
        {
            // Busy-work: burn CPU with floating-point math on random numbers.
            // The result is never used; the point is just to keep a core busy.
            double result = 
                Math.Acos(new Random().NextDouble()) * 
                Math.Atan2(new Random().NextDouble(), new Random().NextDouble());
            
            for (int i = 0; i < 20000; i++)
            {
                result += (
                    Math.Cos(new Random().NextDouble()) * 
                    Math.Acos(new Random().NextDouble()));
            }
        }

Running this code on a system with four cores results in some activity on all four cores and a runtime of a little over 12 seconds:

[Screenshot: CPU usage during the sequential run, with some activity on all four cores]

But by changing one line of code and switching from a traditional, sequential For loop to the Parallel.For loop provided by the library, we can have a significant impact on the performance of the application:

            //for (int i = 0; i < 100; i++)
            System.Threading.Parallel.For(0, 100, delegate(int i)
                {
                    doWork(i);
                }
            );

The change is impressive – all four cores spike and the process finishes in 3.7 seconds instead of 12.

[Screenshot: CPU usage during the parallel run, with all four cores spiking]

A look at what’s happening behind the scenes with the Visual Studio Threads Debug Window gives some insight into where the improvement comes from:

[Screenshot: Visual Studio Threads debug window during the parallel run]

The Parallel Library has started seven threads to do the work.  Take note of the number of threads, because it changes based on the number of cores available in the system: (cores * 2) - 1, or really (cores * 2), with one of them busy doing other work.  As a developer, you won’t have to determine how many threads are appropriate to start on a given system; the library does that for you automatically (though there is a way to gain more control over the threads).  The downside, of course, is that you’ll now have multiple threads to debug if there’s a problem.  Visual Studio 2010 includes some new debugging tools that will make this easier, which I’ll be examining in a future post.
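
As a quick aside on that extra control: I won’t cover the CTP’s mechanism here, but in the released .Net 4.0 API (where the Parallel class lives in the System.Threading.Tasks namespace) you can cap the number of concurrent tasks with ParallelOptions.MaxDegreeOfParallelism.  A minimal sketch, assuming the released API rather than the CTP:

            // Released .Net 4.0 API (System.Threading.Tasks), not the CTP:
            var options = new System.Threading.Tasks.ParallelOptions
            {
                // Never use more than two worker threads for this loop.
                MaxDegreeOfParallelism = 2
            };

            System.Threading.Tasks.Parallel.For(0, 100, options, delegate(int i)
            {
                doWork(i);
            });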

There is some overhead associated with threading, so the individual tasks may take a little longer.  In this case, they went from about 120 milliseconds in the single-threaded example to a wide range, some as high as 400 milliseconds, in the parallel version:

[Screenshots: individual task timings, single-threaded (left) and parallel (right)]

This overhead should be part of your decision about when and what to parallelize in your application.  What’s more important: that the application finishes faster, or that it uses less total CPU time?  Only you can answer that with regard to your application and its goals.
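
If you want to put numbers behind that decision for your own code, a simple before-and-after timing is often enough.  A minimal sketch using System.Diagnostics.Stopwatch around the same loop from the examples above:

            // Time the sequential version.
            var watch = System.Diagnostics.Stopwatch.StartNew();
            for (int i = 0; i < 100; i++)
            {
                doWork(i);
            }
            watch.Stop();
            Console.WriteLine("Sequential: {0} ms", watch.ElapsedMilliseconds);

            // Time the parallel version (CTP API, as used above).
            watch = System.Diagnostics.Stopwatch.StartNew();
            System.Threading.Parallel.For(0, 100, delegate(int i)
            {
                doWork(i);
            });
            watch.Stop();
            Console.WriteLine("Parallel:   {0} ms", watch.ElapsedMilliseconds);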

Another consideration is ordering of the work.  As you can see from the screenshots above, the single-threaded example performs the work sequentially.  Each time through the for loop does its work and completes before the next starts.  But when this is parallelized, the order of completion can’t be guaranteed.  As the work is off-loaded to seven worker threads (in this example), each of those threads could complete its work and get the next task at any time.  So parallelizing isn’t an option (or is a more complex option) when the order of execution matters to your application.
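
When all you need is the results in a predictable order (rather than the side effects happening in order), one common pattern is to have each iteration write into its own slot of a pre-sized array; the iterations can then finish in whatever order they like.  A minimal sketch, where computeResult() is a hypothetical method standing in for doWork() that returns a value:

            double[] results = new double[100];

            System.Threading.Parallel.For(0, 100, delegate(int i)
            {
                // Each iteration writes only to its own index, so no locking
                // is needed and the completion order doesn't matter.
                results[i] = computeResult(i);   // computeResult() is hypothetical
            });

            // results[0] .. results[99] are now in index order, regardless of
            // which thread finished first.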

In parallel (pun intended) with this series on how to parallelize the code, I’ll also be writing a series of articles on the process of parallelizing – how to decide when, where and what the candidates for parallelizing are in an application.

Parallel.ForEach()

The Parallel.ForEach() method follows the same pattern:

            List<int> list = new List<int>();
            // populate list
            Parallel.ForEach(list, item =>
                {
                    // parallel work here
                }
            );

In this example I used a slightly different syntax for the body of the method, a lambda expression ("=>") instead of an anonymous delegate, but the end result is the same.
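
As a slightly more concrete usage example, here is the same busy-work method run over a populated list (the list contents are just made-up sample data):

            List<int> list = new List<int>();
            for (int i = 0; i < 100; i++)
            {
                list.Add(i);
            }

            Parallel.ForEach(list, item =>
                {
                    doWork(item);   // same busy-work method as before
                }
            );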

Parallel.Invoke()

Parallel.For() and Parallel.ForEach() are useful if you have collections you need to act on or a known number of times you have to execute the same code, but what about when you have several different things (methods) you need to do?  For that, we have Parallel.Invoke().

Parallel.Invoke() simply executes, in parallel, a list of methods passed to it:

        static void Main(string[] args)
        {
            Parallel.Invoke(A, B, C, D, E);
            Console.ReadLine();
        }
        public static void A(){}
        public static void B(){}
        public static void C(){}
        public static void D(){}
        public static void E(){}

Methods A(), B(), C(), D() and E() will be assigned to worker threads as they become available and execute in parallel.  As with the other methods, there are no guarantees about the order of execution.
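
The actions don’t have to be named methods; anonymous delegates and lambdas work too, which is convenient when the pieces of work are small.  A minimal sketch (and note that, at least in the released API, Invoke doesn’t return until all of the actions have completed):

            Parallel.Invoke(
                () => doWork(1),
                () => doWork(2),
                delegate { Console.WriteLine("Some other work"); }
            );

            // Execution continues here only after all three actions complete.
            Console.WriteLine("All done");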

Conclusion

The Parallel Extensions in .Net 4.0 are going to give developers a much easier and more consistent way to parallelize their applications.  The library greatly insulates us from the complexity and inner plumbing of the parallel world, but we still have a responsibility to use it properly.  Hard questions and issues will remain around the architecture, design and implementation of parallelization in our applications, but Microsoft is aware of this and is busy providing us with more resources than just the tools in this library.  The Parallel Computing Developer Center has a wealth of whitepapers and presentations on these topics.

4 comments:

Sunil Rathore said...

Great article Paul, thanks for saving my time. Yesterday I installed VS 2010 and I found only one problem in your article. You said we have to install the Parallel Extensions for older versions, and yes, indeed we do. But these Parallel Extensions features come automatically in VS 2010 Beta 2, so the method for calling the "For" loop is slightly changed. I don't know whether you missed it or whether System.Threading.Parallel.For(0, 100, delegate(int i) is still working for you, but for me it's not working; I have to add Tasks before the Parallel keyword, i.e. System.Threading.Tasks.Parallel.For(0, 100, delegate(int i)


Million thanks for your great article.

Paul Jackson said...

Sunil -- Thanks for your comment. Yes, the Parallel class was moved to the System.Threading.Tasks namespace. I'm in the process of writing updated articles for the current API now that the VS2010 RC is available.

Anonymous said...

Hi,
Great post. The link to download the Parallel Extensions library for VS2008 is broken. Can you provide an alternative link?

Troy Schuetrumpf said...

Thank you for the post on the parallel lib, great stuff!