Friday, 4 May 2012

What's wrong with using async for parallel programming?

If you have a small number of completely independent non-async tasks and plenty of cores, there is nothing wrong with using async to achieve parallelism. However, if your tasks are dependent in any way, if you have more tasks than cores, or if you push the use of async too far down into the code, you will leave a lot of performance on the table and could do much better by choosing a more appropriate foundation for parallel programming.
For example, the following F# code maps a function f over an array xs in parallel using the Task Parallel Library in .NET 4:

  Array.Parallel.map f xs
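By contrast, the async-based approach that this post argues against would wrap each element in its own async and run them all with Async.Parallel. A minimal sketch (the name parallelMap is mine, not from the post; f and xs stand for any function and input array):

```fsharp
// Sketch of an async-based parallel map: one async per element, joined with
// Async.Parallel. Simple, but it gives the scheduler no locality information.
let parallelMap f (xs: 'a []) =
    xs
    |> Array.map (fun x -> async { return f x })
    |> Async.Parallel
    |> Async.RunSynchronously
```

Note that every element becomes a separate work item with no notion of parent/child locality, which is precisely the problem described below.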
Async cannot be used to write cache-oblivious code and, consequently, async-based parallel programs are likely to suffer many cache misses, leaving all cores stalled waiting on shared memory, which means poor scalability on a multicore machine.
The TPL is built upon the idea that a child task should, with high probability, execute on the same core as its parent and, therefore, benefit from reusing data that is still hot in that core's cache. Async offers no such assurance.
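The cache-friendly style the TPL encourages can be sketched as recursive divide and conquer over subranges of an array. This is an illustrative example of the technique, not code from the post; the function name and cutoff are my own choices:

```fsharp
open System.Threading.Tasks

// Recursive pairwise sum over xs.[lo .. hi-1]. Each child task works on half
// of its parent's range, so a work-stealing scheduler tends to run it on the
// same core as its parent while that half of the data is still in cache.
let rec pairwiseSum (xs: float []) lo hi =
    if hi - lo <= 1024 then
        // Sequential base case below the cutoff: no task overhead.
        let mutable acc = 0.0
        for i in lo .. hi - 1 do acc <- acc + xs.[i]
        acc
    else
        let mid = (lo + hi) / 2
        // Spawn one half as a TPL task and recurse on the other half here.
        let left = Task.Run(fun () -> pairwiseSum xs lo mid)
        let right = pairwiseSum xs mid hi
        left.Result + right
```

The recursion subdivides the data without reference to cache sizes, so it exploits whatever cache hierarchy is present; an async version of the same shape would give the scheduler no comparable locality hint.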


David said...

Would you have any opinions to share about the latest version of Rx, which seems to have enjoyed more aggressive optimization?

I seriously doubt that F# async would have a comparable performance profile anymore. Benchmarks I ran a few months ago showed Rx to be faster than both the CCR and F# async, yet with worse scaling.

Flying Frog Consultancy Ltd. said...

@David: I think Rx will be squeezed out along several axes, of which performance is one. Rx might be faster now for what it provides, but its functionality is limited and threads+locks are probably substantially faster still. In the other direction, async is slightly slower but much easier to reason about and debug. So I'm not sure Rx has anywhere to thrive.

Also, something like the Disruptor might give you the best of both worlds for certain (very common!) applications.