Sunday, 25 November 2012
Improving grid performance
"Many enterprise software applications require the ability to visualize a grid quickly and easily. This article looks at the performance of using the built-in WPF Grid control for a modest 80x120 grid of numbers and describes the design and implementation of a bespoke FastGrid control that uses WPF's low-level rendering APIs to make startup 16x faster..."
To read this article and more, subscribe to The F#.NET Journal today!
Saturday, 24 November 2012
Basic use of LLVM from F#
"The low-level virtual machine (LLVM) is a freely-available high quality code generation library with support for many different architectures and platforms including x86, x64 and ARM. Programs are conveyed to LLVM in its intermediate representation (IR) which is an infinite register virtual machine. LLVM supports both static and JIT compilation of IR to native code. This article describes a compiler written in F# that can compile programs written in a small language to LLVM IR (using PInvoke) in order to evaluate them interactively..."
To read this article and more, subscribe to The F#.NET Journal today!
Sunday, 11 November 2012
The F# Software Foundation (FSSF)
Although the FSSF was only just created, their website is already a gold mine of F#-related links. The "Getting F#" section refers to explanations of how to install F# on Mac, Linux, Windows, Android and iOS as well as HTML5 and GPGPU.
We fully support this movement to encourage wider adoption of F# and, in fact, we are currently experimenting with F# on the Raspberry Pi and intend to look at F# on the Google Nexus 7 tablet and the forthcoming Microsoft Surface Pro tablet.
Wednesday, 7 November 2012
SAT Solver
"Satisfiability solvers or SAT solvers are programs than solve logical problems. A boolean SAT solver is one than solves the problem of finding sets of variable bindings for which a given boolean expression evaluates to true..."
To read this article and more, subscribe to The F#.NET Journal today!
Sunday, 14 October 2012
Numerical integration
"Definite integrals can be calculated approximately using a variety of techniques known collectively as numerical integration algorithms. The ability to calculate definite integrals has many practical applications including solving ordinary differential equations. This article describes several basic algorithms for numerical integration and a technique that allows these algorithms to be applied to both finite and infinite limits..."
To read this article and more, subscribe to The F#.NET Journal today!
Wednesday, 10 October 2012
Simulating a concurrent system
"Distributed concurrency can be complicated and, consequently, there can be great value in having a complete working prototype to study. The simplicity of asynchronous agents in F# makes them ideal for writing simulation code. This article describes the design of a simple concurrent system that distributes incoming messages between two separate concurrent systems in order to duplicate subsequent processing in order to provide fault tolerance..."
To read this article and more, subscribe to The F#.NET Journal today!
Wednesday, 5 September 2012
Structural typing
"Static type systems enforce constraints at compile time by using types as contracts that must be adhered to. Nominal typing compares types according to the explicit interfaces they implement. If two type explicitly implement the same interface then they can be used interchangeably within the context of that interface. Structural typing is an alternative to nominal typing where the comparison of types is based upon the structure of the types rather than explicit interfaces..."
To read this article and more, subscribe to The F#.NET Journal today!
Case study: porting a commercial compiler from OCaml to F#
This post presents one of our recent pieces of consultancy work as a case study. This case is interesting because it concerns a significant commercial code base ported from OCaml to F#, the introduction of parallelism to the code base and the performance of F# compared to OCaml in the context of a compiler (where OCaml is regarded as being very fast).
We recently ported a compiler written in OCaml to the F# programming language for a client and performance turned out to be an issue. The precise details cannot be disclosed under our contract.
The original compiler was 15,000 lines of OCaml code. The compiler was regularly applied to large code bases and the C code that it produces can be considerable so compilation can take minutes. Consequently, the compiler's performance is valued by our client's customers and, therefore, the original code had been optimized for OCaml's performance characteristics.
A direct translation of the OCaml code to F# proved to be over 10× slower. This was so slow that it impeded testing of the initial translation so some optimization was done before the first working version. Specifically, profiling indicated that the biggest problem was the high rate of exceptions being raised and caught. Exceptions are around 600× slower on .NET than in OCaml and are often used for ordinary control flow in OCaml so direct translations to F# often suffer from poor performance as a consequence. In this case, the hot paths were changed to use union types (usually option types) instead of exceptions. Although this incurs a lot of unnecessary boxing in F# the performance improvements were substantial and the F# version became 5x slower than the OCaml.
Surprisingly, thorough testing showed that our almost-blind initial translation of 15,000 lines of code was completely error free. I think this is a real testament to the power of ML's static type system. In fact, the only error found to date was introduced during subsequent optimization.
After demonstrating the correctness of the translation, effort turned to trying to improve performance in an attempt to compete with the original OCaml code. We expected this to be extremely difficult because OCaml is very heavily optimized for exactly this kind of workload. However, we were actually able to exceed OCaml's performance. This was accomplished by taking a regimented approach where the program was profiled to identify the main hot spots and actions were taken to improve overall performance but only the most effective optimizations were retained before repeating the process and all results (including failures) were carefully recorded. This allowed us to produce a fast code base that was still as familiar as possible to its developers. In particular, the support for parallel programming in .NET 4 and F# 2.0 proved invaluable and was the final boost we needed to exceed OCaml's performance.
This case study proves that it is possible to migrate significant commercial code bases from OCaml to F# without suffering performance degradation. Beyond metaprogramming, we have found that many practical applications see substantial speed improvements when moved from OCaml to F# often due to better support for multicore but also better support for unboxing.
Saturday, 18 August 2012
Tic-tac-toe revisited
"This article rearchitects a GUI application from a previous article to use an asynchronous concurrent design. The result is a clean separation between view and model that closely resembles a client-server architecture due to the use of explicit message passing. This design pattern is generally applicable..."
To read this article and more, subscribe to The F#.NET Journal today!
Thursday, 2 August 2012
Cancellable streams
"Applications such as searching and visualization that request information from servers often have the potential to overload the server by sending too many requests before the handling of existing requests has been completed. A simple and effective solution to this problem is cancellation. This article examines a simple system of two communicating sequential processes that use cancellation to implement a server that accepts requests, returns responses and uses cancellation to tear down outstanding requests in order to avoid system overload..."
To read this article and more, subscribe to The F#.NET Journal today!
Sunday, 29 July 2012
Asynchronous bounded queues
"The MailboxProcessor in the F# standard library uses unbounded queues that can grow to be arbitrarily long. Consequently, the queue in a consumer can grow without limit if its producers overwhelm it with data. This article examines an elegant solution to this problem in the form of a bounded queue with asynchronous enqueue and dequeue operations that cause both producers and consumers to cooperatively suspend themselves until capacity or data are available to satisfy their request. Furthermore, this simpler architecture neatly decouples the sequential processes from the concurrent queues they use to communicate..."
To read this article and more, subscribe to The F#.NET Journal today!
Friday, 29 June 2012
Pathological garbage collector behaviour
The F#.NET Journal just published an article about garbage collection:
"When performance is important in a garbage collected language such as F# it is useful to have some understanding of the costs of operations that the garbage collector performs. This article provides an introduction to garbage collection, an overview of the processes performed by generational garbage collectors for multithreaded programs and goes on to examine some F# programs that deliberately exhibit pathological behaviour and presents workarounds that alleviate or even eliminate the costs associated with garbage collection on .NET..."
To read this article and more, subscribe to The F#.NET Journal today!
Thursday, 21 June 2012
Purely functional hash tries
"Purely functional data structures such as the Set and Map collections provided by the F# standard library have many advantages over traditional imperative collections, particularly clarity due to the absence of mutation, but their performance can be much worse. Benchmarks have shown the F# Set and Map running up to 40× slower than the imperative HashSet and Dictionary collections provided by .NET. This article describes the design and implementation of a hash trie implementation of a set that is up to 7× faster to construct and up to 4.5× faster to search than the F# set..."
To read this article and more, subscribe to The F#.NET Journal today!
Monday, 18 June 2012
Taylor series expansion of cosine that is generic over number type
cosine(x) = 1 - x²/2! + x⁴/4! - x⁶/6! + ...
float
-specific version in F#:let cosine n x =
let rec loop i q t c =
if i=n then c else
loop (i + 1) (q + 10 + 8*i) (-t * x * x / float q) (c + t)
loop 0 2 1.0 0.0
cosine 1000000 0.1
inline
in order to remove the performance overhead of this parameterization:let inline cosine zero one ofInt ( ~-. ) ( +. ) ( *. ) ( /. ) n x =
let rec loop i q t c =
if i=n then c else
loop (i + 1) (q + 10 + 8*i) (-.t *. x *. x /. ofInt q) (c +. t)
loop 0 2 one zero
float
like this, which is just as fast as before:cosine 0.0 1.0 float ( ~- ) (+) (*) (/) 1000000 0.1
float
:cosine 0.0f 1.0f float32 ( ~- ) (+) (*) (/) 1000000 0.1f
cosine 0N 1N BigNum.FromInt (~-) (+) (*) (/) 10 (1N / 10N)
type Expr =
| Int of int
| Var of string
| Add of Expr * Expr
| Mul of Expr * Expr
| Pow of Expr * Expr
static member (~-) f = Mul(Int -1, f)
static member (+) (f, g) = Add(f, g)
static member (*) (f, g) = Mul(f, g)
static member (/) (f, g) = Mul(f, Pow(g, Int -1))
cosine (Int 0) (Int 1) Int (~-) (+) (*) (/) 3 (Var "x")
-x*x
out of loop
.Friday, 25 May 2012
Improving matrix-matrix multiply and prime number programs in F#
The parallel matrix-matrix multiply makes heavy use of async for parallel programming, where it is an anti-pattern. So it may be productively rewritten using Tasks as follows:
let randomMatrix = let rnd = new System.Random() Array2D.init 500 500 (fun _ _ -> rnd.NextDouble()) let matrixMultiply (a:float[,]) (b:float[,]) = let rowsA, colsA = Array2D.length1 a, Array2D.length2 a let rowsB, colsB = Array2D.length1 b, Array2D.length2 b let result = Array2D.create rowsA colsB 0.0 System.Threading.Tasks.Parallel.For(0, rowsA, fun i -> for j in 0 .. colsB - 1 do let mutable x = 0.0 for k in 0 .. colsA - 1 do x <- x + a.[i,k] * b.[k,j] result.[i,j] <- x ) |> ignore result
matrixMultiply randomMatrix randomMatrix
This is 60x faster than the original async-based solution. However, this is a poor choice of algorithm because it lacks locality. Better algorithms are tiled or cache oblivious.
Their prime number generator makes heavy use of enumerable sequences which can be productively rewritten using recursion and arrays instead, as follows:
(* A very naive prime number detector *) let isPrime (n:int) = let rec loop i = i*i > n || (n % i <> 0 && loop (i+1)) loop 2 (* Return primes between m and n using multiple threads *) let primes m n = Array.Parallel.init (n-m+1) (fun i -> m+i, isPrime (m+i)) |> Array.Parallel.choose (function n, true -> Some n | _ -> None) (* Run a test *) while true do let timer = System.Diagnostics.Stopwatch.StartNew() primes 1000000000 1000100000 |> ignore printfn "%fs" timer.Elapsed.TotalSeconds
This is 13x faster than their original seq-based solution.
Monday, 14 May 2012
Parallel programming in functional languages
Functional programming has immutable data structures and no side effect which are inherently suitable for parallel programming.
I investigate how to exploit multicore computation in a functional language, and target production code for some numerical applications.
F# has Microsoft behind its back, and its parallel constructs such as PLINQ, TPL, Async Workflow have been well-documented and shown some potentials.
However, research about parallelism in Haskell is very active at the moment, and it posseses many nice features which haven't supported by F# yet:
par
requires that the programmer understand operational properties of the language that are at best implementation-defined (and at worst undefined). This makes par
difficult to use, and pitfalls abound — new users have a high failure rate..."My question is which language should I choose for functional parallelism?
Applications of priority queues
- The A* algorithm for route finding.
- Computing the "sliding median" or "moving median average" by maintaining queues of values in the window less than or equal to and greater than or equal to the current median, respectively.
Wednesday, 9 May 2012
Solving Einstein's Riddle
Monday, 7 May 2012
Visualizing the Stanford Bunny using WPF
"Stanford University maintain a database of 3D models, perhaps the most famous of which is the Stanford Bunny. This is a 3D scan of a statue of a rabbit. The tesselated version contains 65,000 triangles and is stored as plain text in a PLY file. This article describes a program that embeds the mesh data as an embedded resource in a .NET assembly, parses it at run-time and visualizes the results in 3D using Windows Presentation Foundation..."
To read this article and more, subscribe to The F#.NET Journal today!
Sunday, 6 May 2012
How do I sort lexicographically in F#?
> type Point = { X: float; Y: float };;
type Point =
{X: float;
Y: float;}
List.sort
:> [ { X = 2.0; Y = 3.0 }
{ X = 2.0; Y = 2.0 }
{ X = 1.0; Y = 3.0 } ]
|> List.sort;;
val it : Point list = [{X = 1.0;
Y = 3.0;}; {X = 2.0;
Y = 2.0;}; {X = 2.0;
Y = 3.0;}]
X
and then by Y
.compare
function.IComparable
and friends. If you want to use a custom ordering for a few operations then you can use higher-order functions like List.sortBy
and List.sortWith
. For example, List.sortBy (fun p -> p.Y, p.X)
will sort by Y
and then X
because F# generates the lexicographic comparison over 2-tuples for you (!).