Dr Cliff Click of Azul Systems, specialists in manycore JVM systems, recently published a blog post about the performance of Java compared primarily to C and C++ but also discussing C# and .NET. Three of Cliff's comments are of particular interest:

Under the heading "Places where C/C++ beats Java for obvious reasons":

*"Value Types, such as a 'Complex' type require a full object in Java."* - Dr Cliff Click

What Cliff forgot to mention is that .NET also provides value types and a far more compelling example than complex numbers is the humble hash table.

Consider the task of filling a hash table with 10,000,000 associations from integers to single-precision floating point numbers. This may be accomplished in Java as follows:

package hashtablebenchmark;
import java.util.HashMap;
public class Main {
public static void main(String[] args) {
int n = 10000000;
for (int j=0; j<10; ++j) {
long startTime = System.currentTimeMillis();
HashMap hashtable = new HashMap(n);
for(int i=1; i<=n; ++i) {
hashtable.put(i, 1.0f / i);
}
System.out.println("m[100] = " + hashtable.get(100));
long time = System.currentTimeMillis() - startTime;
System.out.println("Took: " + time / 1e3 + "s");
}
}
}

The equivalent program in F# is not only shorter but also 17× faster:

let n = 10000000
let m = System.Collections.Generic.Dictionary(n)
for i=1 to n do
m.[i] <- 1.0f / float32 i
printf "m[100] = %f\n" m.[100]

Specifically, Java takes 6.967s initially and 5.733s steady state whereas F# takes only 0.414s.

In fact, F# completes this benchmark so quickly that we would like to give it more work but Java cannot handle any more work without running out of memory on this 4Gb machine.

Elsewhere, Cliff also writes of Java:

*"Very Good Multi-Threading Support. Parallel programming is just easier in Java."* - Dr Cliff Click

and later:

*"Not that I track C# all that closely but... I believe the JIT produces substantially slower code than Java"* - Dr Cliff Click

Allow us to demonstrate the otherwise. The Computer Language Shootout contains a well-formed benchmark called spectral-norm for which the fastest Java solution is a 173-line parallel program. This implementation may be rewritten in only 24 lines of F#:

let A i j = 1.0 / float((i + j) * (i + j + 1) / 2 + i + 1)
let inline mul A (u: _ []) (v: _ []) =
System.Threading.Tasks.Parallel.For(0, v.Length, fun i ->
let mutable vi = 0.0
for j = 0 to v.Length - 1 do
vi <- vi + A i j * u.[j]
v.[i] <- vi) |> ignore
let AtAu u v =
let w = Array.create (Array.length u) 0.0
mul (fun i j -> A i j) u w
mul (fun i j -> A j i) w v
do
let n = 5500
let u, v = Array.create n 1.0, Array.create n 0.0
for i = 0 to 9 do
AtAu u v
AtAu v u
let u, v = vector u, vector v
printf "%0.9f\n" (sqrt(Vector.dot u v / Vector.dot v v))

In the Java, dozens of lines of code were devoted to parallelism. In contrast, only two lines of code were altered to parallelize the F#. So it would seem that parallel programming is not "just easier in Java".

The the serial Java takes 12.722s initially and 12.299s steady state whereas F# takes only 12.18s from a cold start. On this 8-core 2xE5405 2.0GHz Xeon, the parallel Java takes 1.839s initially and 1.820s steady state whereas the parallel F# takes only 1.60s from a cold start. The fact that Java is slower in every single case indicates that the CLR's JIT is not "producing substantially slower code than Java".

Finally, Cliff made no mention of two other design deficiencies that cripple Java's performance. Firstly, their type erasure approach to generics incurs massive performance degradation for a lot of generic code because it leads to unnecessary boxing. Secondly, the JVM's lack of tail call elimination is not only a growing hindrance in the era of functional programming but the only general-purpose workaround, trampolines, is typically 10× slower that necessary.

If you want to learn how to write fast parallel programs in F#, read our book Visual F# 2010 for Technical Computing.