## Thursday, 12 December 2013

### Graph algorithms: Kruskal's minimum spanning tree

The F# Journal just published an article about graph algorithms:
"Kruskal's algorithm for computing the minimum spanning tree of a weighted undirected graph uses the union-find data structure to combine fragments of tree connected by the lightest edges. This article describes both pure and impure implementations of the algorithm using the persistent union-find data structure by Conchon and FilliĆ¢tre to create a high performance purely functional solution..."

## Tuesday, 10 December 2013

### Graph algorithms: Prim's minimum spanning tree

The F# Journal just published an article about graph algorithms:
"Prim's algorithm for computing the minimum spanning tree associates a predecessor and distance with each vertex and uses a searchable priority queue to add the lightest-weight vertices to the tree until the minimum spanning tree has been found. This article examines both pure and impure implementations of Prim's algorithm..."

## Saturday, 12 October 2013

### A thread-safe object cache that doesn't leak

Consider the problem of lazily initializing objects from handles. Requesting the object corresponding to any given handle more than once should return the same physical object that the first such call returned. The simplest solution is to use a dictionary to memoize a create function:

let get create =
let cache = System.Collections.Generic.Dictionary()
fun id ->
let mutable obj = Unchecked.defaultof<_>
if cache.TryGetValue(id, &obj) then obj else
obj <- create id
cache.[id] <- obj
obj

But this is not thread safe. No problem, thanks to .NET the code is actually simpler using a concurrent dictionary:

let get create =
let cache = System.Collections.Concurrent.ConcurrentDictionary<int, 'a>()
fun id -> cache.GetOrAdd(id, System.Func<_, _>(create))

The problem with this approach is that objects never have their entries removed from the dictionary and, therefore, this solution leaks memory. For example, the following test:

do
let get = Leaks.get (fun n -> [n])
for i in 1..1000000000 do
ignore(get i)
if i &&& (i-1) = 0 then
printfn "%d created, %dMB in use"
i (System.GC.GetTotalMemory false / 1048576L)

Produces this output, showing that the program has leaked away over a gigabyte of memory in just a few seconds:

1 created, 0MB in use
2 created, 0MB in use
4 created, 0MB in use
4194304 created, 331MB in use
8388608 created, 649MB in use
16777216 created, 1297MB in use

Fortunately, this mistake is easy to fix on .NET by using a concurrent weak dictionary that has strong keys and weak values. One such implementation is available here. This challenge may then be solved correctly in just 10 lines of F# code as follows:

let get create =
let cache = TvdP.Collections.ConcurrentWeakDictionaryStrongKeys()
fun id ->
lock cache (fun () ->
let mutable value = Unchecked.defaultof<_>
if cache.TryGetValue(id, &value) then
value
else
cache.[id] <- create id
cache.[id])

With this solution we find that the memory usage never rises beyond 20MB:

1 created, 0MB in use
2 created, 0MB in use
4 created, 0MB in use
4194304 created, 17MB in use
8388608 created, 13MB in use
16777216 created, 16MB in use

## Friday, 11 October 2013

### Scribble app in 12 lines of F#

Here’s a little Scribble app for F# Interactive in just 12 lines of code:

#r "PresentationCore"
#r "PresentationFramework"

open System.Windows

let canvas = Controls.Canvas()
let window = Window(Content=canvas)
let mutable lines = Shapes.Polyline()
lines <- Shapes.Polyline(Stroke=Media.Brushes.Black, StrokeThickness=1.0)
window.MouseUp.Add(fun _ -> lines <- Shapes.Polyline())
Application().Run window

This program begins by referencing the required WPF assemblies and opening a namespace. The canvas and a window containing it are created followed by a mutable global polyline. The mouse down event causes the polyline to be replaced and the new one added to the canvas. The mouse move event causes the current mouse position to be appended to the end of the points on the current polyline. The mouse up event replaces the polyline but does not add the new one to the canvas.

The result looks like this:

If you want to see more F# programs, subscribe to the F# Journal today!

## Tuesday, 8 October 2013

### Removing image borders

Shrinking images by removing a constant-color border is a common problem. The process turns this:

into this:

In fact, we faced this problem with all of the images from our previous blog post. One solution is to fire up a general-purpose image manipulation program like Paint.NET and invoke a series of mystical commands on each image in turn. This blog post looks at an F# program with a minimal WPF UI that allows image files to be dragged from explorer and dropped onto our UI. All such images are automatically shrunk by removing outside edges containing pixels that are all the same color.

We begin by referencing the usual WPF assemblies (PresentationCore, PresentationFramework, System.Xaml and WindowsBase) as well as System.Drawing because GDI+ exposes a more elegant interface for manipulating images. Then we open the following namespace:

open System.Windows

The following function loads an image as a 2D array of colors:

use image = new System.Drawing.Bitmap(file)
Array2D.init image.Width image.Height (fun x y ->
image.GetPixel(x, y))

Note how the use binding allows the image to be automatically disposed when it falls out of scope.

Similarly, the following function saves as image of the given dimensions with pixel colors given by the getPixel function to the given file:

let saveImage (width: int) (height: int) getPixel file =
use shrunk = new System.Drawing.Bitmap(width, height)
for x in 0..width-1 do
for y in 0..height-1 do
shrunk.SetPixel(x, y, getPixel x y)
shrunk.Save file

Next we write a function that returns the first index into a sequence of sequences where the subsequence contains more than one value or the length of the sequence if no such subsequence exists:

let find xs =
let f xs = Seq.length(Seq.distinct xs) <> 1
match Seq.tryFindIndex f xs with
| None -> Seq.length xs
| Some idx -> idx

The core of our program is the following shrink function that loads an image, identifies the rectangle within the border and saves that part of the image back to the same file:

let shrink file =
let width, height = original.GetLength 0, original.GetLength 1
let xs = [ 0..width-1 ]
let ys = [ 0..height-1 ]
let row y = seq { for x in xs -> original.[x, y] }
let column x = seq { for y in ys -> original.[x, y] }
let x0 = Seq.map column xs |> find
let x1 = width - find(Seq.map column (List.rev xs))
let y0 = Seq.map row ys |> find
let y1 = height - find(Seq.map row (List.rev ys))
let width, height = x1 - x0, y1 - y0
if width > 0 && height > 0 then
saveImage width height (fun x y -> original.[x+x0, y+y0]) file

The main program then creates a window, enabled drag-and-drop and adds a callback :

do
let window = Window(Title="Shrink images")
window.AllowDrop <- true
if e.Data.GetDataPresent DataFormats.FileDrop then
(e.Data.GetData DataFormats.FileDrop :?> string [])
|> Seq.iter shrink)
Application().Run window
|> ignore

This program can now be used to remove borders from many images in batch.

For more articles on F#, subscribe to the F# Journal!

### Fun with infinite power series

This blog post was inspired by Doug McIlroy's excellent Power Serious article about processing infinite power series using infinite lazy lists in Haskell. Although Doug's Haskell code is significantly more elegant than our F# in parts, we add the ability to visualize power series as typeset mathematics including some example approximations to well-known constants using our F# for Visualization library.

Lazy lists are not as elegant in F# as they are in Haskell (where lazy evaluation is the norm) and, in fact, do not even appear in the F# standard library. However, rather than using lazy lists of coefficients to represent infinite series we can use functions that map from the integer index to the rational factor. The first factor in an infinite series is the constant and appear at index zero and so on.

Arbitrary-precision rational numbers are provided by the F# PowerPack:

#r "FSharp.PowerPack"

We shall make extensive use of this type so let us define an alias:

type Q = BigRational

We can begin by defining some fundamental infinite series. The first represents the number one by returning one for the factor at index zero and zero for all other factors:

let one = function 0 -> 1N | _ -> 0N

The function f(x)=x is represented by a series with a one at index one and zeroes at every other index:

let x = function 1 -> 1N | _ -> 0N

A constant is represented as follows:

let constant k = function 0 -> k | _ -> 0N

The following function skips the first factor and moves the rest along so rest(x+2x²+3x³)=1+2x+3x² which is equivalent to dividing by x when the first factor was zero:

let rest f n = f(n+1)

The xTimes function moves the factors down one place and adds a zero at the beginning, which is equivalent to multiplying by x:

let xTimes f n = if n=0 then 0N else f(n-1)

Addition and subtraction of sequences are simply element-wise:

let (+.) f g n : Q = f n + g n

let (-.) f g n : Q = f n - g n

Prepending an element onto the beginning of a lazy list is now replaced by a function that returns a given constant at index zero followed by a sequence:

let cons(x, xs) n = if n=0 then x else xs(n-1)

The derivative of a series is remarkably simple:

let differentiate f n = Q.FromInt(n+1) * f(n+1)

Similarly for integration:

let integrate f n =
if n=0 then 0N else f(n-1) / Q.FromInt n

As we shall see, the ability to integrate and differentiate power series allows us to derive a wide variety of interesting power series.

The following function scales each factor by another factor:

let scale x xs n = x * xs n

The product of two power series can be expressed as follows:

let rec ( *. ) f g n = f 0 * g n + xTimes(rest f *. g) n

The quotient of one power series divided by another also has a remarkably simple representation:

let rec (/.) f g n =
let rec qs n =
cons(f 0 / g 0, scale (1N / g 0) (rest f -. qs *. rest g)) n
qs n

Composition allows us the calculate f(g(x)) where both f and g are power series:

let rec ( >>. ) f g n = cons(f 0, rest g *. (rest f >>. g)) n

Reversion gives the inverse function of the given power series provided it has a zero constant at the beginning and also has a remarkably simple representation:

let revert f =
let rec rs n = cons(0N, one /. (rest f >>. rs)) n
rs

Now we're ready to compute the power series of some familiar functions. The power series of ln(1+x) is given by the integral of 1/(1+x):

let logOnePlusX = integrate(one /. (x +. one))

The power series for exponential may now be obtained by reverting the power series for logarithm because they are inverse functions of each other:

let exp = revert logOnePlusX +. one

A similar identity holds for arctan:

let atan = integrate(one /. (one +. x *. x))

Reverting this function recovers the power series for tan:

let tan = revert atan

Amazingly, the power series of sine and cosine can be defined in terms of each other using integration:

let rec sin n = integrate cos n
and cos n = (one -. integrate sin) n

Finally, arc sine can be obtained by reverting the series for sine:

let asin = revert sin

Note, however, that cosine cannot be reverted because it has a non-zero constant.

In order to visualize our power series we begin by loading the F# for Visualization library and opening the relevant namespaces:

#r "FSharpForVisualization"

open FlyingFrog.FSharpForVisualization
open FlyingFrog.Graphics.Typeset
open FlyingFrog.Graphics.Typeset.Math

The following function finds the first six non-zero factors out of the first twelve:

let initialTerms series =
seq { for i in 0..12 -> i, series i }
|> Seq.filter (fun (_, q) -> q <> 0N)
|> Seq.truncate 6
|> List.ofSeq

Although F# allows arbitrary-precision rational literals in expressions it does not allow them in patterns. Therefore, we define the following active patterns to recognise important rational numbers:

let (|One|_|) q = if q = 1N then Some() else None
let (|MinusOne|_|) q = if q = -1N then Some() else None

We shall classify the sign of the factor in order to determine whether it should be preceded by a minus sign:

type Sign = Negative | NonNegative

The sign of a rational is then given by:

let signOf q =
if q<0N then Negative else NonNegative

The following function typesets a single term in a power series where the factor is q and the exponent is i:

let typesetTerm = function
| 0, q -> math q
| 1, (MinusOne | One) -> math "x"
| 1, q -> Row[math(abs q); math "x"]
| n, (MinusOne | One) -> Super(math "x", math n)
| n, q -> Row[math(abs q); Super(math "x", math n)]

Finally, a power series may be typeset by extracting the first few non-zero terms, finding their signs and typesetting each term before combining them with plus or minus signs between them:

let rec typesetSeries series =
let rec loop = function
| [] -> math "+ …"
| (Negative, x)::xs -> Row[math "-"; x; loop xs]
| (NonNegative, x)::xs -> Row[math "+"; x; loop xs]
let terms =
[ for i, q in initialTerms series ->
signOf q, typesetTerm(i, q) ]
match terms with
| [] -> math "…"
| (Negative, x)::xs -> Row[math "-"; x; loop xs]
| (NonNegative, x)::xs -> Row[x; loop xs]

As each visualization will be spawned in its own window it is useful to augment each power series with the mathematical notation for the expression that it expresses. The following draw function does this:

let draw series lbl =
Row[lbl; math "="; typesetSeries series]
|> view

For example, our power series for logarithm may be visualized as follows:

Row[Text "ln"; math "(1+x)"]
|> draw logOnePlusX

Here are the other power series we calculated:

Super(math "e", math "x")
|> draw exp
Row[Super(Text "tan", math "-1"); math "(x)"]
|> draw atan
Row[Text "tan"; math "(x)"]
|> draw tan
Row[Text "sin"; math "(x)"]
|> draw sin
Row[Text "cos"; math "(x)"]
|> draw cos
Row[Super(Text "sin", math "-1"); math "(x)"]
|> draw asin

We can also evaluate some approximations by summing the first few terms of some of our power series:

let eval n x (a : int -> BigRational) =
seq { for i in 0..n-1 -> a i * pown x i }
|> Seq.sum

Again, it is useful to encapsulate the conventional mathematical notation for a value along with its known decimal representation, a calculated rational approximation and its decimal representation:

let displayApprox lbl (exact: float) (q: Q) =
Row[lbl; math "="; math exact; Symbols.EqualTilde;
math q; math "="; math(float q)]
|> view

We can easily calculate a good approximation to ln(1.1):

eval 10 (1N / 10N) logOnePlusX
|> displayApprox (Row[Text "ln"; math "(1.1)"]) (log 1.1)

Machin's formula for Ļ gives us quite good rational approximations:

16N * eval 8 (1N / 5N) atan - 4N * eval 2 (1N / 239N) atan
|> displayApprox Greek.pi System.Math.PI

Evaluating e¹ gives an approximation to e:

displayApprox (math "e") System.Math.E (eval 10 1N exp)

To create visualizations like these within the comfort of Visual Studio, order a copy of our F# for Visualization library.

## Tuesday, 1 October 2013

We recently published a blog post describing a 14-line solution for downloading stock prices using F# 2.0. Type providers are a major new feature in F# 3.0 (Visual Studio 2012 and later) that make this task even easier!

We begin by installing the FSharp.Data package from NuGet:

Install-Package FSharp.Data

This provides us with type providers for the JSON, CSV and XML formats as well as Freebase and the World Bank.

In order to write this as a script we must begin by referencing the FSharp.Data DLL which is currently located in the following relative location:

#I @"../packages/FSharp.Data.1.1.10/lib/net40"
#r "FSharp.Data.dll"

Note that this has hardcoded the NuGet package’s version number which is likely to change in the future.

Next, we create a CSV type provider that uses sample CSV data to infer the types in the columns:

type Stocks = FSharp.Data.CsvProvider<"http://ichart.finance.yahoo.com/table.csv?s=MSFT">

Note that we are able to give a URL as the location of the sample CSV data. This was not obvious from any of the available tutorials at the time of writing, all of which assume the data has already been downloaded to disk. The ability to refer to sample data on the web is clearly useful in practice.

Now we are ready to define the type we wish to use to represent the relevant data:

type Prices = { Open: decimal; High: decimal; Low: decimal; Close: decimal }

We shall use the same base URL as before:

let url = "http://ichart.finance.yahoo.com/table.csv?s="

Our getStockPrices function may now be written more simply, without requiring code to parse the CSV, as follows:

let getStockPrices stock =
let msft = Stocks.Load(url + stock)
[ for row in msft.Data ->
{ Open=row.Open; High=row.High; Low=row.Low; Close=row.Close } ]

Finally, the following example application downloads historical stock price information for Microsoft:

getStockPrices "MSFT"

Using type providers we have reduced the size of even the simplest example from 14 lines of code to just 9.