UPDATE: A new F# solution that is 960× faster than the original Mathematica is available here.
Another example from Sal Mangano's Mathematica Cookbook, this time taken from Andreas Lauschke's example, is a "fast" pricer for American options. The relevant section of the book stresses the importance of performance in this context and the Mathematica code presented was heavily optimized by experts. Here is their code:
Specifically, this function has been optimized by pushing computational burden from the general-purpose term rewriting language of Mathematica onto the optimized C routines in its standard library and then compiling the high-level code into lower-level bytecode that is interpreted more efficiently, giving another 5× speedup.
However, this general approach to the optimization of Mathematica code has the unfortunate side effect of "foresting": introducing unnecessary intermediate data structures because built-in operations over them are faster than writing a direct solution in such a slow language. Consequently, a direct solution in a compiled language will usually be orders of magnitude faster. In this case, we found that the following translation to F# is 64× faster than Sal's original Mathematica benchmark:
let americanPut kk r sigma tt = let sqr x = x * x let a, nn, mm, tt0 = 5.0, 100.0, 20, sqr sigma * tt / 2.0 let k, h, s = 2.0 * r / sqr sigma, 2.0 * a / nn, tt0 / float mm let alpha, xn = s / sqr h, int(2.0 * a/h + 0.5) + 1 let x i = float i * h - a let ss = Array.init xn (fun i -> kk * exp(x i)) let k0, k1 = 0.5 * (k - 1.0), 0.25 * sqr (k + 1.0) let k1s = k1 * s let pp0 i = max 0.0 (kk - ss.[i]) let rec run (u: _ ) (u': _ ) j = if j=mm then u else let f i = let xi = -a + float i * h if xi >= 0.0 then 0.0 else exp(k0 * xi + k1s * float j) * (1.0 - exp xi) u'. <- alpha * u. - (2.0 * alpha - 1.0) * u. + alpha / kk * exp(k0 * a + k1s * float j) |> max (f 0) for i=1 to u.Length-2 do u'.[i] <- alpha * (u.[i+1] + u.[i-1]) - (2.0 * alpha - 1.0) * u.[i] |> max (f i) run u' u (j+1) let u i = exp(k0 * x i) * pp0 i / kk let u = run (Array.init xn u) (Array.create xn 0.0) 0 ss, Seq.mapi (fun i u -> kk * exp(-k1 * tt0 - k0 * x i) * u) u Array.Parallel.init 10000 (fun strike -> let ss, ps = americanPut (float(strike+1)) 0.05 0.4 1.0 Seq.zip (Seq.take 60 ss) (Seq.take 60 ps))
The superior performance of the F# solution comes from two main benefits:
The F# program is compiled all the way down to machine code before being executed whereas the Mathematica code is interpreted. The intermediate data structures are then replaced by simple function calls. This alone makes the F# solution over 10× faster than the original Mathematica.
F# inherits the benefits of a highly-optimized multicore-capable run-time from the .NET platform. This allows us to use fine-grained parallelism to improve performance even further for another 6.1× speedup on this 8-core machine.