Thursday, June 3, 2010

Parallel Programming in .Net 4.0 and VS2010: Part II – WinForms, Tasks and Service Level Agreements

Copyright 2008-2009, Paul Jackson, all rights reserved

Note: This article was originally published with code samples from the Visual Studio 2010 Beta.  This version has been updated for the release version.  This article, especially, has changed due to the changes in the Task cancellation mechanism.

The Console application in my previous post was actually the prototype for a more robust WinForms implementation – I have a customer who likes prime numbers.  We’ll call him Bob.

Sure, Bob’s a little weird, but he typically pays his invoices Net-10, so I like to keep him happy.

I first deployed the application to Bob without using the Task Parallel Library, so it was single-threaded:

private void goButton_Click(object sender, EventArgs e)
{
    // Time all 100 work items, run one after another on the UI thread.
    Stopwatch watch = Stopwatch.StartNew();
    for (int i = 0; i < 100; i++)
    {
        doWork(i);
    }
    watch.Stop();
    listBox1.Items.Add(String.Format("Entire process took {0} milliseconds", watch.ElapsedMilliseconds));
}

private void doWork(int instance)
{
    Stopwatch watch = Stopwatch.StartNew();
    for (int i = 3; i < 30000; i++)
    {
        isPrime(i);  // result is discarded; the call is only here to burn CPU time
    }
    watch.Stop();
    listBox1.Items.Add(String.Format("{0} took {1} milliseconds", instance, watch.ElapsedMilliseconds));
}

// Trial division: i is prime if nothing in 2..i/2 divides it.
private static bool isPrime(int i)
{
    for (int j = 2; j <= (i / 2); j++)
    {
        if ((i % j) == 0)
            return false;
    }
    return true;
}

Bob was happy with the results, but not with the performance:

“Sixteen seconds is too long!  I can’t wait that long!  Time is money in my business!  It needs to be instantaneous!”

I tried to explain to Bob that “instantaneous” is not a Service-Level Agreement, but he was adamant: “Faster!”

And, no, I have no idea what Bob’s business is or why he needs this application.  He pays on time – I don’t ask a lot of questions.

So my first change is to make the for-loop parallel:

Parallel.For(0, 100, i =>
{
    doWork(i);
});

Unfortunately, changing an application from single- to multi-threaded can have unintended consequences.  When writing a single-threaded WinForms application you don’t have to worry about things like WinForms Controls only being accessible from the UI thread.

The doWork() method accesses the ListBox directly to add items to it, but since doWork() is now being run on a different thread, it violates a fundamental Windows requirement that UI controls only be accessed from the thread they were created on.
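
To see where the loop body actually runs, here is a small console sketch.  It isn't part of Bob's application: the class name is mine, and Thread.SpinWait is only a stand-in for the prime-number work.  It records the managed thread id of every thread that executes at least one iteration of a Parallel.For:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class ThreadIdSketch
{
    static void Main()
    {
        int callerId = Thread.CurrentThread.ManagedThreadId;

        // Thread-safe set of the thread ids that ran at least one iteration.
        var seen = new ConcurrentDictionary<int, bool>();

        Parallel.For(0, 100, i =>
        {
            seen.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
            Thread.SpinWait(200000); // stand-in for the prime-number work
        });

        Console.WriteLine("Calling thread id: {0}", callerId);
        Console.WriteLine("Threads that ran iterations: {0}", String.Join(", ", seen.Keys));
    }
}
```

On a multi-core machine I'd expect several ids in that list.  The calling thread can be one of them, since Parallel.For puts the caller to work too, but every other id is a thread that did not create the ListBox.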

This problem isn’t new with .Net 4.0; it’s always been there, and it remains even in WPF.  UI Controls can only be accessed from the thread that created them, and that should be the main application thread.  So there are some hoops we have to jump through to get back to the UI thread and update the control.  There are a number of articles available that describe patterns for dealing with this.  The one I typically use (now) is:

private void updateList(string item)
{
    if (listBox1.InvokeRequired)
        listBox1.BeginInvoke((Action)delegate() { listBox1.Items.Add(item); });
    else
        listBox1.Items.Add(item);
}

Then I change the doWork() method to add items to the list through the updateList() method instead of directly:

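private void doWork(int instance)
{
    Stopwatch watch = Stopwatch.StartNew();
    for (int i = 3; i < 30000; i++)
    {
        isPrime(i);  // result is discarded; the call is only here to burn CPU time
    }
    watch.Stop();
    // Runs on a worker thread now, so go through updateList() to reach the ListBox.
    updateList(String.Format("{0} took {1} milliseconds", instance, watch.ElapsedMilliseconds));
}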

And, for consistency, I do the same in goButton_Click, where the total elapsed time is recorded.

Running the application now results in the same performance improvement seen in the console application.

I take this new version to Bob, pretty confident that dropping the processing time from almost 16 seconds to 4 will make him happy.  The only problem is that I’m not sure how to bill him for it.  After all, using the Parallel Extensions I was able to get this speed improvement with only a few minutes of work.  .Net 4.0 might improve my productivity, but it could have a negative effect on my Accounts Receivable.

Unfortunately, Bob’s not as impressed as I thought he’d be:

“4 seconds?  I still have to wait 4 seconds?  Faster! Faster! Faster!  I need instantaneous results!  I need to start working with the list as soon as I click the Go button!”

Bob’s a little high-strung.

But something he said sparks an idea.  Bob needs to work with the list immediately, but not with the entire list.  His process is to do something with each item in the list (no, I still don’t know what he does with them), so he doesn’t need everything, he just needs enough to start working.  While he’s working on the early results, later results can be completed and returned.

The way we’d do that today is to start our own background thread using something like ThreadPool.QueueUserWorkItem().  I can move the current code from the button click event handler to another method, then use QueueUserWorkItem to run that method on a background thread:

private void goButton_Click(object sender, EventArgs e)
{
    ThreadPool.QueueUserWorkItem(new WaitCallback(freeUi));
}
 
private void freeUi(object sync)
{
    Stopwatch watch = Stopwatch.StartNew();
    //for (int i = 0; i < 100; i++)
    Parallel.For(0, 100, i =>
    {
        doWork(i);
    });
    watch.Stop();
    updateList(String.Format("Entire process took {0} milliseconds", watch.ElapsedMilliseconds));
}

This works great.  The list starts populating immediately and Bob will be able to get started working on the first items in the list while the remaining work finishes.  But QueueUserWorkItem is a little passé – it’s very … .Net 2.0 – what does 4.0 offer us to replace it with?

Enter the System.Threading.Tasks namespace and the Task class.  Using the StartNew() method on Task.Factory, we can start our background work using the new, .Net 4.0 Task model, rather than calling the old ThreadPool directly (by default, the Task is still scheduled onto a thread-pool thread behind the scenes).  New is better.  Mostly.  Except New Coke – New Coke was bad.  Very, very bad.

private void goButton_Click(object sender, EventArgs e)
{
    Task.Factory.StartNew(() => { freeUi(null); });
}
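
Outside of WinForms, the same call can be tried in a console program.  This is only a sketch under my own assumptions: the class name is made up, and Thread.Sleep stands in for freeUi(null).  It shows that StartNew() hands back a Task right away while the work carries on in the background:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class StartNewSketch
{
    static void Main()
    {
        // StartNew queues the delegate and returns right away with a Task handle.
        Task background = Task.Factory.StartNew(() =>
        {
            Thread.Sleep(500); // stand-in for freeUi(null)
            Console.WriteLine("Background work finished");
        });

        // The calling thread is free here; in the WinForms app this is the UI thread.
        Console.WriteLine("StartNew returned; IsCompleted = {0}", background.IsCompleted);

        // A console program has to wait, or the process could exit before the task is done.
        background.Wait();
        Console.WriteLine("Final status: {0}", background.Status);
    }
}
```

The WinForms version doesn't need the Wait() call, because the message loop keeps the process alive.  Calling Wait() from the button click would block the UI thread again, which is exactly what Bob is paying me to avoid.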

Bob is thrilled with this new version.

“I’m thrilled!  This is great!  It’s instantaneous! By the way … how do I make it stop?”

So Bob wants a way to stop the population of the list once he’s started it.  We can give him that with the CancellationTokenSource and CancellationToken.  First we need a token to use for cancellation:

private CancellationTokenSource tokenSource;
private CancellationToken token;

public Form1()
{
    InitializeComponent();

    tokenSource = new CancellationTokenSource();
    token = tokenSource.Token;
}

Now this token must be passed to the Task:

private void goButton_Click(object sender, EventArgs e)
{
    var task = Task.Factory.StartNew(() => { freeUi(null); }, token);
}

And, via a ParallelOptions object, to the Parallel.For loop:

var options = new ParallelOptions();
options.CancellationToken = token;

Parallel.For(0, 100, options, i =>
{
    doWork(i);
});

And, finally, having given Bob a Cancel button, we would call Cancel() on the CancellationTokenSource when he clicks it:

private void cancelButton_Click(object sender, EventArgs e)
{
    tokenSource.Cancel();
}
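
One thing worth checking here: a CancellationTokenSource can't be reset once Cancel() has been called, and the source above is created only once, in the form's constructor.  If Bob clicks Cancel and then Go again, the Task would be handed a token that is already cancelled.  The console sketch below (class and variable names are my own, not from the project) shows that behaviour, along with one possible way around it, which is to build a fresh source for each run:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class TokenReuseSketch
{
    static void Main()
    {
        var tokenSource = new CancellationTokenSource();
        CancellationToken token = tokenSource.Token;

        // Bob clicks Cancel: the source, and every token it handed out, stays cancelled.
        tokenSource.Cancel();
        Console.WriteLine("Old token cancelled: {0}", token.IsCancellationRequested);

        // Bob clicks Go again with the same token: the Task is cancelled before it runs.
        bool ran = false;
        Task stale = Task.Factory.StartNew(() => { ran = true; }, token);
        try
        {
            stale.Wait();
        }
        catch (AggregateException)
        {
            // Wait() reports the cancellation wrapped in an AggregateException.
        }
        Console.WriteLine("Stale task status: {0}, delegate ran: {1}", stale.Status, ran);

        // Possible fix: build a fresh source and token for each run.
        tokenSource = new CancellationTokenSource();
        token = tokenSource.Token;
        Task fresh = Task.Factory.StartNew(() => { ran = true; }, token);
        fresh.Wait();
        Console.WriteLine("Fresh task status: {0}, delegate ran: {1}", fresh.Status, ran);
    }
}
```

The stale task ends up Canceled without its delegate ever running, while the fresh one reaches RanToCompletion.  In the WinForms application, the equivalent change would be to create the source and token at the top of goButton_Click rather than in the constructor, though that is my suggestion and not something in the downloadable project.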

Now that we have a token in all of our Tasks and threads that will tell us when Bob clicks Cancel, we need to gracefully shut down all of those background threads and what’s spawning them.

For the Parallel.For loop, we need to get the ParallelOptions down into the doWork method, because we want to be able to interrupt the prime number loop itself:

Parallel.For(0, 100, options, (i) =>
{
    doWork(i, options);
});

Once there, we can use the CancellationToken property to break out of the process when it’s cancelled:

private void doWork(int instance, ParallelOptions options)
{
    Stopwatch watch = Stopwatch.StartNew();
    for (int i = 3; i < 30000; i++)
    {
        isPrime(i);  // result is discarded; the call is only here to burn CPU time
        // Bail out of the loop as soon as Bob has clicked Cancel.
        options.CancellationToken.ThrowIfCancellationRequested();
    }
    watch.Stop();
    updateList(String.Format("{0} took {1} milliseconds", instance, watch.ElapsedMilliseconds));
}

ThrowIfCancellationRequested() throws an OperationCanceledException, so you should catch that and use it to clean up after cancelled tasks.

try
{
    Parallel.For(0, 100, options, (i) =>
    {
        try
        {
            doWork(i, options);
        }
        catch (OperationCanceledException)
        {
            // clean up iteration
        }
    });
}
catch (OperationCanceledException)
{
    // clean up loop
}
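
To try the loop-level catch without a form, here is a console sketch of the same idea.  The assumptions are mine: a second Task stands in for Bob clicking Cancel, Thread.Sleep stands in for the prime-number work, and the class name is made up.  It leaves out the per-iteration catch and relies on the documented behaviour that a loop cancelled through the token in its own ParallelOptions throws a single OperationCanceledException:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class CancelLoopSketch
{
    static void Main()
    {
        var tokenSource = new CancellationTokenSource();
        var options = new ParallelOptions();
        options.CancellationToken = tokenSource.Token;

        // Stand-in for Bob clicking Cancel shortly after the loop starts.
        Task.Factory.StartNew(() =>
        {
            Thread.Sleep(100);
            tokenSource.Cancel();
        });

        int completed = 0;
        try
        {
            Parallel.For(0, 1000, options, i =>
            {
                // Stand-in for doWork(): a long inner loop that polls the token.
                for (int n = 0; n < 100; n++)
                {
                    options.CancellationToken.ThrowIfCancellationRequested();
                    Thread.Sleep(1);
                }
                Interlocked.Increment(ref completed);
            });
            Console.WriteLine("Loop ran to completion");
        }
        catch (OperationCanceledException)
        {
            // Reached once the in-flight iterations have stopped.
            Console.WriteLine("Loop cancelled after {0} complete iterations", completed);
        }
    }
}
```

With these numbers the loop has far more work queued than it can finish before the cancel arrives, so I'd expect the catch block to run.  How many iterations complete first will vary from machine to machine.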

Bob’s happy with this latest version.  It satisfies both reasons to parallelize or multi-thread an application: Performance and User Experience.  Bob’s experience is improved because he’s able to continue work immediately after clicking on the Go button and doesn’t have to wait at all, and the entire process’s execution time has improved from over fifteen seconds to under four (on a quad-core PC).  Even I’m happy, because I can now send Bob an invoice and get paid.

 

Download Project Source Code