This post presents one of our recent pieces of consultancy work as a case study. This case is interesting because it concerns a significant commercial code base ported from OCaml to F#, the introduction of parallelism to the code base and the performance of F# compared to OCaml in the context of a compiler (where OCaml is regarded as being very fast).
We recently ported a compiler written in OCaml to the F# programming language for a client and performance turned out to be an issue. The precise details cannot be disclosed under our contract.
The original compiler was 15,000 lines of OCaml code. The compiler was regularly applied to large code bases and the C code that it produces can be considerable so compilation can take minutes. Consequently, the compiler's performance is valued by our client's customers and, therefore, the original code had been optimized for OCaml's performance characteristics.
A direct translation of the OCaml code to F# proved to be over 10× slower. This was so slow that it impeded testing of the initial translation so some optimization was done before the first working version. Specifically, profiling indicated that the biggest problem was the high rate of exceptions being raised and caught. Exceptions are around 600× slower on .NET than in OCaml and are often used for ordinary control flow in OCaml so direct translations to F# often suffer from poor performance as a consequence. In this case, the hot paths were changed to use union types (usually option types) instead of exceptions. Although this incurs a lot of unnecessary boxing in F# the performance improvements were substantial and the F# version became 5x slower than the OCaml.
Surprisingly, thorough testing showed that our almost-blind initial translation of 15,000 lines of code was completely error free. I think this is a real testament to the power of ML's static type system. In fact, the only error found to date was introduced during subsequent optimization.
After demonstrating the correctness of the translation, effort turned to trying to improve performance in an attempt to compete with the original OCaml code. We expected this to be extremely difficult because OCaml is very heavily optimized for exactly this kind of workload. However, we were actually able to exceed OCaml's performance. This was accomplished by taking a regimented approach where the program was profiled to identify the main hot spots and actions were taken to improve overall performance but only the most effective optimizations were retained before repeating the process and all results (including failures) were carefully recorded. This allowed us to produce a fast code base that was still as familiar as possible to its developers. In particular, the support for parallel programming in .NET 4 and F# 2.0 proved invaluable and was the final boost we needed to exceed OCaml's performance.
This case study proves that it is possible to migrate significant commercial code bases from OCaml to F# without suffering performance degradation. Beyond metaprogramming, we have found that many practical applications see substantial speed improvements when moved from OCaml to F# often due to better support for multicore but also better support for unboxing.