Monday, 14 May 2012

Parallel programming in functional languages

Functional programming has immutable data structures and no side effect which are inherently suitable for parallel programming.
That is a common misconception. Parallelism is solely about performance and purity usually degrades performance. So purely functional programming is not a good starting point if your objective is performance. In general, purity means more allocation and worse locality. In particular, purely functional data structures replace arrays with trees and that incurs many more allocations and places far more burden on the garbage collector.
For example, measure the performance of the elegant purely functional "quicksort" in Haskell. Last time I checked, it was thousands of times slower than Sedgewick's conventional imperative solution on my machine.
Also, nobody has ever managed to implement an efficient dictionary data structure (pure or impure) or an efficient purely-functional sort in Haskell and nobody has figured out how to write an asymptotically efficient persistent disjoint set data structure and there is no known way to implement other basic data structures like purely functional weak dictionaries!
Moreover, the garbage collector in the defacto-standard Haskell implementation (GHC) is heavily optimized for pure code at the expense of mutation. For example, GHC's hash table is still 26× slower than .NET's. Historically, the performance of mutation was considered so unimportant in Haskell that writing a single pointer to an array was an O(n) operation in GHC for five years.
I investigate how to exploit multicore computation in a functional language, and target production code for some numerical applications.
The best way I have found is to learn how to write decent parallel programs for multicores in the imperative style (study Cilk, in particular) and then factor the code using first-class functions and tail call elimination into the impure functional style.
This means cache oblivious data structures and algorithms. Nobody has ever done this in Haskell. Indeed, none of the research published on parallel Haskell to-date has even mentioned the essential concept of cache complexity. Furthermore, although it is widely known that non-strict (aka lazy) evaluation renders space consumption unpredictable, it is not yet widely appreciated that this same problem renders scalability wildly unpredictable on multicores.
F# has Microsoft behind its back, and its parallel constructs such as PLINQ, TPL, Async Workflow have been well-documented and shown some potentials.
They are well beyond showing potential. Thousands of commercial applications have been built upon those industrial-strength foundations.
However, research about parallelism in Haskell is very active at the moment, and it posseses many nice features which haven't supported by F# yet:
You are assuming they are "nice features"? Read Simon Marlow's latest paper about this:
"...a combination of practical experience and investigation has lead us to conclude that this approach is not without drawbacks. In a nutshell, the problem is this: achieving parallelism with par requires that the programmer understand operational properties of the language that are at best implementation-defined (and at worst undefined). This makes par difficult to use, and pitfalls abound — new users have a high failure rate..."
My question is which language should I choose for functional parallelism?
I'd advise against parallel purely functional code for production because it is a complete wild card. Assuming you're happy to sacrifice some purity in order to attain competitive performance, I use and recommend F#.


coco said...

interesting..I was using clojure and now I'm learning f#, It's very common listen about the clojure-jvm power for concurrence and parallel programming, but now than I'm using the clr and mono is really impressing how prehistoric feels the jvm, actually when .Net was moving fast creatng a better environment for parallel and concurrent programming, the jvm discuss if lambdas was a good idea for the java language....

Jon Harrop said...

@coco: Yes, I believe the time is ripe for a competitor to the JVM that leap frogs it and brings the benefits of .NET to the non-Microsoft world. In particular, the software development tools available for mobile targets (e.g. iOS) are dire compared to desktop tools.