The parallel matrix-matrix multiply makes heavy use of async for parallel programming, where it is an anti-pattern. So it may be productively rewritten using Tasks as follows:
let randomMatrix = let rnd = new System.Random() Array2D.init 500 500 (fun _ _ -> rnd.NextDouble()) let matrixMultiply (a:float[,]) (b:float[,]) = let rowsA, colsA = Array2D.length1 a, Array2D.length2 a let rowsB, colsB = Array2D.length1 b, Array2D.length2 b let result = Array2D.create rowsA colsB 0.0 System.Threading.Tasks.Parallel.For(0, rowsA, fun i -> for j in 0 .. colsB - 1 do let mutable x = 0.0 for k in 0 .. colsA - 1 do x <- x + a.[i,k] * b.[k,j] result.[i,j] <- x ) |> ignore result
matrixMultiply randomMatrix randomMatrix
This is 60x faster than the original async-based solution. However, this is a poor choice of algorithm because it lacks locality. Better algorithms are tiled or cache oblivious.
Their prime number generator makes heavy use of enumerable sequences which can be productively rewritten using recursion and arrays instead, as follows:
(* A very naive prime number detector *) let isPrime (n:int) = let rec loop i = i*i > n || (n % i <> 0 && loop (i+1)) loop 2 (* Return primes between m and n using multiple threads *) let primes m n = Array.Parallel.init (n-m+1) (fun i -> m+i, isPrime (m+i)) |> Array.Parallel.choose (function n, true -> Some n | _ -> None) (* Run a test *) while true do let timer = System.Diagnostics.Stopwatch.StartNew() primes 1000000000 1000100000 |> ignore printfn "%fs" timer.Elapsed.TotalSeconds
This is 13x faster than their original seq-based solution.
2 comments:
Just surprised to see that the F# compiler properly optimizes the recursive loop as it doesn't look like a tail call at first glance.
@David: That recursive call is actually in tail position because the boolean operators are short-circuit evaluated.
So the code "i*i > n || (n % i <> 0 && loop (i+1))" is equivalent to "if i*i>n then true else if n%1 <> 0 then loop (i+1) else false".
Post a Comment