Saturday, 12 October 2013

A thread-safe object cache that doesn't leak

Consider the problem of lazily initializing objects from handles. Requesting the object corresponding to any given handle more than once should return the same physical object that the first such call returned. The simplest solution is to use a dictionary to memoize a create function:

let get create =
  let cache = System.Collections.Generic.Dictionary()
  fun id ->
    let mutable obj = Unchecked.defaultof<_>
    if cache.TryGetValue(id, &obj) then obj else
      obj <- create id
      cache.[id] <- obj
      obj

But this is not thread safe. No problem, thanks to .NET the code is actually simpler using a concurrent dictionary:

let get create =
  let cache = System.Collections.Concurrent.ConcurrentDictionary<int, 'a>()
  fun id -> cache.GetOrAdd(id, System.Func<_, _>(create))

The problem with this approach is that objects never have their entries removed from the dictionary and, therefore, this solution leaks memory. For example, the following test:

do
  let get = Leaks.get (fun n -> [n])
  for i in 1..1000000000 do
    ignore(get i)
    if i &&& (i-1) = 0 then
      printfn "%d created, %dMB in use"
        i (System.GC.GetTotalMemory false / 1048576L)

Produces this output, showing that the program has leaked away over a gigabyte of memory in just a few seconds:

1 created, 0MB in use
2 created, 0MB in use
4 created, 0MB in use
4194304 created, 331MB in use
8388608 created, 649MB in use
16777216 created, 1297MB in use

Fortunately, this mistake is easy to fix on .NET by using a concurrent weak dictionary that has strong keys and weak values. One such implementation is available here. This challenge may then be solved correctly in just 10 lines of F# code as follows:

let get create =
  let cache = TvdP.Collections.ConcurrentWeakDictionaryStrongKeys()
  fun id ->
    lock cache (fun () ->
      let mutable value = Unchecked.defaultof<_>
      if cache.TryGetValue(id, &value) then
        value
      else
        cache.[id] <- create id
        cache.[id])

With this solution we find that the memory usage never rises beyond 20MB:

1 created, 0MB in use
2 created, 0MB in use
4 created, 0MB in use
4194304 created, 17MB in use
8388608 created, 13MB in use
16777216 created, 16MB in use

To learn more, subscribe to the F# Journal today!

Friday, 11 October 2013

Scribble app in 12 lines of F#

Here’s a little Scribble app for F# Interactive in just 12 lines of code:

#r "PresentationCore"
#r "PresentationFramework"

open System.Windows

let canvas = Controls.Canvas()
let window = Window(Content=canvas)
let mutable lines = Shapes.Polyline()
window.MouseDown.Add(fun _ ->
  lines <- Shapes.Polyline(Stroke=Media.Brushes.Black, StrokeThickness=1.0)
  ignore(canvas.Children.Add lines))
window.MouseMove.Add(fun e -> lines.Points.Add(e.GetPosition window))
window.MouseUp.Add(fun _ -> lines <- Shapes.Polyline())
Application().Run window

This program begins by referencing the required WPF assemblies and opening a namespace. The canvas and a window containing it are created followed by a mutable global polyline. The mouse down event causes the polyline to be replaced and the new one added to the canvas. The mouse move event causes the current mouse position to be appended to the end of the points on the current polyline. The mouse up event replaces the polyline but does not add the new one to the canvas.

The result looks like this:


If you want to see more F# programs, subscribe to the F# Journal today!

Tuesday, 8 October 2013

Removing image borders

Shrinking images by removing a constant-color border is a common problem. The process turns this:

into this:

In fact, we faced this problem with all of the images from our previous blog post. One solution is to fire up a general-purpose image manipulation program like Paint.NET and invoke a series of mystical commands on each image in turn. This blog post looks at an F# program with a minimal WPF UI that allows image files to be dragged from explorer and dropped onto our UI. All such images are automatically shrunk by removing outside edges containing pixels that are all the same color.

We begin by referencing the usual WPF assemblies (PresentationCore, PresentationFramework, System.Xaml and WindowsBase) as well as System.Drawing because GDI+ exposes a more elegant interface for manipulating images. Then we open the following namespace:

open System.Windows

The following function loads an image as a 2D array of colors:

let loadImage (file: string) =
  use image = new System.Drawing.Bitmap(file)
  Array2D.init image.Width image.Height (fun x y ->
    image.GetPixel(x, y))

Note how the use binding allows the image to be automatically disposed when it falls out of scope.

Similarly, the following function saves as image of the given dimensions with pixel colors given by the getPixel function to the given file:

let saveImage (width: int) (height: int) getPixel file =
  use shrunk = new System.Drawing.Bitmap(width, height)
  for x in 0..width-1 do
    for y in 0..height-1 do
      shrunk.SetPixel(x, y, getPixel x y)
  shrunk.Save file

Next we write a function that returns the first index into a sequence of sequences where the subsequence contains more than one value or the length of the sequence if no such subsequence exists:

let find xs =
  let f xs = Seq.length(Seq.distinct xs) <> 1
  match Seq.tryFindIndex f xs with
  | None -> Seq.length xs
  | Some idx -> idx

The core of our program is the following shrink function that loads an image, identifies the rectangle within the border and saves that part of the image back to the same file:

let shrink file =
  let original = loadImage file
  let width, height = original.GetLength 0, original.GetLength 1
  let xs = [ 0..width-1 ]
  let ys = [ 0..height-1 ]
  let row y = seq { for x in xs -> original.[x, y] }
  let column x = seq { for y in ys -> original.[x, y] }
  let x0 = Seq.map column xs |> find
  let x1 = width - find(Seq.map column (List.rev xs))
  let y0 = Seq.map row ys |> find
  let y1 = height - find(Seq.map row (List.rev ys))
  let width, height = x1 - x0, y1 - y0
  if width > 0 && height > 0 then
    saveImage width height (fun x y -> original.[x+x0, y+y0]) file

The main program then creates a window, enabled drag-and-drop and adds a callback :

[<System.STAThread>]
do
  let window = Window(Title="Shrink images")
  window.AllowDrop <- true
  window.Drop.Add(fun e ->
    if e.Data.GetDataPresent DataFormats.FileDrop then
      (e.Data.GetData DataFormats.FileDrop :?> string [])
      |> Seq.iter shrink)
  Application().Run window
  |> ignore

This program can now be used to remove borders from many images in batch.

For more articles on F#, subscribe to the F# Journal!

Tuesday, 1 October 2013

Downloading stock prices reloaded

We recently published a blog post describing a 14-line solution for downloading stock prices using F# 2.0. Type providers are a major new feature in F# 3.0 (Visual Studio 2012 and later) that make this task even easier!

We begin by installing the FSharp.Data package from NuGet:

Install-Package FSharp.Data

This provides us with type providers for the JSON, CSV and XML formats as well as Freebase and the World Bank.

In order to write this as a script we must begin by referencing the FSharp.Data DLL which is currently located in the following relative location:

#I @"../packages/FSharp.Data.1.1.10/lib/net40"
#r "FSharp.Data.dll"

Note that this has hardcoded the NuGet package’s version number which is likely to change in the future.

Next, we create a CSV type provider that uses sample CSV data to infer the types in the columns:

type Stocks = FSharp.Data.CsvProvider<"http://ichart.finance.yahoo.com/table.csv?s=MSFT">

Note that we are able to give a URL as the location of the sample CSV data. This was not obvious from any of the available tutorials at the time of writing, all of which assume the data has already been downloaded to disk. The ability to refer to sample data on the web is clearly useful in practice.

Now we are ready to define the type we wish to use to represent the relevant data:

type Prices = { Open: decimal; High: decimal; Low: decimal; Close: decimal }

We shall use the same base URL as before:

let url = "http://ichart.finance.yahoo.com/table.csv?s="

Our getStockPrices function may now be written more simply, without requiring code to parse the CSV, as follows:

let getStockPrices stock =
  let msft = Stocks.Load(url + stock)
  [ for row in msft.Data ->
      { Open=row.Open; High=row.High; Low=row.Low; Close=row.Close } ]

Finally, the following example application downloads historical stock price information for Microsoft:

getStockPrices "MSFT"

Using type providers we have reduced the size of even the simplest example from 14 lines of code to just 9.

Writing a contour plotter

The F# Journal just published an article about computer graphics:
"Visualizations come in many different forms. One of the most common ways to visualize a function of two variables is a contour plot. The function is sampled to identify perimeters where it crosses thresholds and these are rendered as vector paths. This article describes the design and implementation of a simple program that generates contour plots using the meandering triangles algorithm and, when parallelized, runs over 11x faster than Mathematica..."
To read this article and more, subscribe to The F# Journal today!

Human-readable string joining in F#

Eric Lippert (formerly of Microsoft) posted an interesting blog post challenging readers to write a program that converts a sequence of strings into a single string by comma separating all but the last string, which is separated by the word “and” instead. Let’s consider this challenge in F#...

Perhaps the simplest and most maintainable solution is to convert the sequence into an array and act upon that instead:

let concat (xs: seq<string>) =
  let xs = Array.ofSeq xs
  match xs.Length with
  | 0 -> "{}"
  | 1 -> sprintf "{%s}" xs.[0]
  | n -> sprintf "{%s and %s}" (String.concat ", " xs.[..n-2]) xs.[n-1]

Pattern matching over the length of the array allows us to handle the three separate cases elegantly. Furthermore, in the final case it allows us to bind the length of the array to the variable n. In the last two cases we use the C-like sprintf function to compose several strings in one go. The final case also uses an array slice to obtain all elements except the last.

This function is easily tested in F# Interactive as follows:

> List.map concat [[]; ["A"]; ["A"; "B"]; ["A"; "B"; "C"]];;
val it : string list = ["{}"; "{A}"; "{A and B}"; "{A, B and C}"]

Let’s look at some alternative solutions. First up, an elegant (almost) purely functional solution based upon a state machine with three states:

type State =
  | Empty
  | Singleton of string
  | And of System.Text.StringBuilder * string

The following function transitions the state machine from one state to another as new characters are encountered:

let transition a x =
  match a with
  | Empty -> Singleton x
  | Singleton y -> And(System.Text.StringBuilder("{" + y), x)
  | And(sb, y) -> And(sb.Append(", " + y), x)

The main concat function then folds the transition function over the sequence of strings and acts upon the final accumulator:

let concat xs =
  match Seq.fold transition Empty xs with
  | Empty -> "{}"
  | Singleton x -> "{" + x + "}"
  | And(sb, y) -> sb.Append(" and " + y + "}").ToString()

The three states are equivalent to the three pattern matches in our simple solution. Technically, the .NET string builder is not a purely functional data structure but that actually has no effect on our code here.

Another, more traditional, solution is to encode the state machine as an imperative program using conditions and loops. This may be written as follows in F#:

let concat (xs: seq<string>) =
  use e = xs.GetEnumerator()
  if e.MoveNext() then
    let s = e.Current
    if e.MoveNext() then
      let sb = System.Text.StringBuilder("{" + s)
      let mutable s = e.Current
      while e.MoveNext() do
        sb.Append(", " + s) |> ignore
        s <- e.Current
      sb.Append(" and " + s + "}").ToString()
    else "{" + s + "}"
  else "{}"

This produces the same result as the two previous solutions but is the most difficult to understand because the state is implicitly represented by the “program counter”.

Finally, a rather contrived solution that converts the input sequence into an array but creates the output string using the F# String.init function:

let concat (xs: seq<string>) =
  let xs = Array.ofSeq xs
  let n = xs.Length
  let len = if n<2 then 2+n else 1 + 2*n
  String.init len (fun i ->
    match i, len-1-i with
    | 0, _ -> "{"
    | _, 0 -> "}"
    | i, _ when i%2=1 -> xs.[i/2]
    | _, 2 -> " and "
    | _ -> ", ")

The last point of interest is the performance of these different functions when applied to a very long input string. As none of these functions uses repeated concatenation they are all linear time. When given ten million strings on this machine the simplest solution takes 0.79s, the functional state machine takes 0.81s, the imperative state machine takes 54s and the solution using String.init takes just 0.38s. Given such a small performance difference, the winning solution in the vast majority of cases is the simplest solution.

To see more F# samples and learn more tips and tricks, subscribe to the F# Journal today!