Sunday, 21 February 2010

Sieve of Eratosthenes

The infinite sequence of prime numbers can be defined easily but inefficiently in F# using Haskell-style infinite lazy lists:

> #r "FSharp.PowerPack.dll";; --> Referenced 'C:\Program Files\FSharpPowerPack-\bin\FSharp.PowerPack.dll' > open LazyList;; > let rec primes = unfold (fun n -> Some(n, n+1)) 3 |> filter isPrime |> cons 2 and isPrime n = Seq.takeWhile (fun d -> d*d <= n) primes |> Seq.forall (fun d -> n%d <> 0);; val isPrime : int -> bool val primes : LazyList<int> > #time;; --> Timing now on > Seq.nth 100000 primes;; Real: 00:00:05.251, CPU: 00:00:05.241, GC gen0: 283, gen1: 47, gen2: 1 val it : int = 1299721

Find the 100,000th prime using this algorithm took over 5s. The Sieve of Eratosthenes is a much more efficient algorithm and may be implemented in F# in only 18 lines of code as follows:

> let primes = let a = ResizeArray[2] let grow() = let p0 = a.[a.Count-1]+1 let b = Array.create p0 true for di in a do let rec loop i = if i<b.Length then b.[i] <- false loop(i+di) let i0 = p0/di*di loop(if i0<p0 then i0+di-p0 else i0-p0) for i=0 to b.Length-1 do if b.[i] then a.Add(p0+i) fun n -> while n >= a.Count do grow() a.[n];; val primes : (int -> int) > primes 100000;; Real: 00:00:00.025, CPU: 00:00:00.031, GC gen0: 0, gen1: 0, gen2: 0 val it : int = 1299721 > primes 10000000;; Real: 00:00:15.855, CPU: 00:00:15.958, GC gen0: 2, gen1: 2, gen2: 2 val it : int = 179424691

Note that our implementation produces an arbitrarily-extensible sequence and, therefore, can be used to compute primes of any size (up to machine precision, of course). In contrast, most imperative implementations of the Sieve of Eratosthenes only compute prime numbers up to a preset limit.

Computing the 100,000th prime now takes only 0.025s and, in fact, we can compute the 10 millionth prime in only 16 seconds.

For comparison, Melissa O'Neill published a research paper about the Sieve of Eratosthenes in Haskell. Melissa's simplest solution is orders of magnitude slower than ours and her fastest solution is an order of magnitude longer.


Arjen said...

That's an efficient sieve! I think I'll steal it.

I made it 33% faster by using a BitArray to store the sieve:

let b = new System.Collections.BitArray(p0,true)

Flying Frog Consultancy Ltd. said...

@Arjen: If you use bytes to represent the booleans then you can fill the sieve from multiple threads simultaneously, which gives more than 33% performance improvement on this 8 core!

Anonymous said...

Many thanks to you for support. I should.

Anonymous said...

Very Interesting!
Thank You!

Johan Kullbom said...

You might be interested in looking at the even faster Sieve of Atkin. I've published a version for F# here: The sieve of Atkin in F#