The following class implements a purely functional Bloom filter in F#:
The constructor creates a Bloom filter from the given sequence of elements. The ProbablyContains function tests for probable membership.> type BloomSet<'a when 'a: equality>(xs: seq<'a>) = let bits = System.Collections.BitArray(65536) let idx x n = (x >>> n) &&& 0xffff do for x in xs do let x = x.GetHashCode() bits.[idx x 0] <- true bits.[idx x 16] <- true member __.ProbablyContains(x: 'a) = let x = x.GetHashCode() bits.[idx x 0] && bits.[idx x 16];; type BloomSet<'a when 'a : equality> = class new : xs:seq<'a> -> BloomSet<'a> member ProbablyContains : x:'a -> bool end
For example, the following demonstrates how the proportion of false positives rises with the number of elements in the Bloom filter for our fixed size (64kb) internal array:
> for n in [1000; 10000; 100000] do
Just like sets, Bloom filters have many practical applications.let rand = System.Random() let xs = set [for _ in 1..n -> rand.NextDouble()] let s = BloomSet xs let ys = set [0.0 .. 0.001 .. 1.0] - xs let count x = if s.ProbablyContains x then 1 else 0 let pFail = float (Seq.sumBy count ys) / float ys.Count printfn "%0.2f%% false positives when n=%d" (100.0 * pFail) n;; 0.10% false positives 7.89% false positives when n=10000 82.02% false positives when n=100000
1 comments:
And by coincidence, some hours ago I was working on a Reactive Extensions version.
:):)
Excerpt at https://gist.github.com/1931336
Post a Comment