Thursday, 30 December 2010

Performance-related features in F# and C#

Many people expect the performance of F# to be comparable to the performance of C# "because they both compile to the same intermediate language (CIL)". Although the VM obviously places the same ceiling on the performance of both F# and C# there are some features that affect performance differently between the two language implementations.

Firstly, C# and F# don't actually compile down to the same CIL because only F# generates tail calls. This affects performance when mutually recursive functions that represent the different states of a state machine tail call each other in order to move between states. This design pattern often appears in the context of asynchronous workflows that can tail call each other in F# but this will not even be possible with the new async support in C# 5.

Secondly, the implementation of delegates on .NET is currently quite inefficient and, consequently, F# uses its own FastFunc type for high-performance first-class functions.

Thirdly, F# uses .NET metadata to convey inline functions so that they can be exported across APIs and, of course, that can dramatically improve performance in certain circumstances. Moreover, F#'s inlining allows functions passed as arguments to higher-order functions to be completely inlined and type specialized. Our F# for Numerics library makes extensive use of this feature to ensure that per-type functions such as comparison are specialized, giving performance in F# up to 2,350× faster than the equivalent C# (!). In fact, some of our numerical routines are consistently several times faster than vendor-tuned Fortran in libraries like the Intel MKL.

Finally, pattern matching can be extremely laborious to express in C# because the language lacks pattern matching but it is almost impossible to maintain optimized C# code equivalent to many non-trivial pattern matches. In contrast, the F# compiler aggressively optimizes pattern matches during compilation.

Conversely, the C# compiler can be better at optimizing computations over value types (e.g. complex arithmetic).


Dave Thomas said...

What about the cost of immutability? Does the extra overhead of creation bite into these boosts...

A code example of the design pattern for mutual recursion on async work flows would be nice :-)

Jon Harrop said...

@Dave: Immutable value types are almost free. In fact, modern compilers for imperative languages often translate code into Single Static Assignment (SSA) form, where each value is assigned only once and, therefore, everything is essentially an immutable value. However, larger immutable data structures like dictionaries (Map in F#) come at considerable cost in terms of throughput. But they are optional...

The article Concurrent programming: TCP relay server from the F#.NET Journal gave some excellent examples of mutually recursive asynchronous workflows. The relay server can be in one of several states and can receive commands that move it from one state to another. Each state is represented by a separate asynchronous workflow in a mutually recursive group of them.