Saturday, 24 February 2007

FSharp.net

I just found a nearly empty website at fsharp.net that hosts forums for discussing F#. There are a few registered users already.

As much as I like the Hub, it can be very slow.

F# certainly seems to be gaining popularity. I've had 2,500 visitors to my F# demos so far this month. I've also changed my site to use the less ambiguous term "F#.NET" in the hope that this will catch on and become a better way to Google for the F# programming language.

Cheers,
Jon.

Saturday, 17 February 2007

Symbolic manipulation

The F# programming language excels at the manipulation of symbolic expressions. This includes compiler and interpreter writing (the manipulation of programs) but perhaps the most obvious application is computer algebra.

The language features that make F# ideal for symbolic processing are variant types and pattern matching. The following snippet of F# code defines a variant type and overloads the + and * operators, allowing symbolic mathematical expressions to be constructed using ordinary arithmetic:

type t =
  | Q of int
  | Var of string
  | Add of t * t
  | Mul of t * t

  // Addition: fold constants, drop zeros and keep sums left-associated.
  static member ( + ) (f: t, g: t) : t =
    match f, g with
    | Q n, Q m -> Q (n + m)
    | Q 0, e | e, Q 0 -> e
    | f, Add(g, h) -> f + g + h
    | f, g -> Add(f, g)

  // Multiplication: fold constants, absorb zeros and drop ones.
  static member ( * ) (f: t, g: t) : t =
    match f, g with
    | Q n, Q m -> Q (n * m)
    | Q 0, e | e, Q 0 -> Q 0
    | Q 1, e | e, Q 1 -> e
    | f, Mul(g, h) -> f * g * h
    | f, g -> Mul(f, g)

Composing symbolic expressions using these operators has the added benefit of performing some simplifying rearrangements. For example, adding 0 or multiplying by 1 has no effect.

> Q 2 + Q 1 * Var "x" + Q 0;;
val it : t = Add (Q 2,Var "x")

As well as being expressive, F#'s pattern matching is also very efficient. This implementation can simplify randomly generated expressions with 10^6 atoms in 0.386s on my 2.2GHz AMD64.
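To give a flavour of where this leads, here is a minimal sketch of a symbolic derivative built on the same type and operators. The function name d is my own choice for illustration, and the type is repeated only so that the snippet stands alone:

```fsharp
type t =
  | Q of int
  | Var of string
  | Add of t * t
  | Mul of t * t

  static member ( + ) (f: t, g: t) : t =
    match f, g with
    | Q n, Q m -> Q (n + m)
    | Q 0, e | e, Q 0 -> e
    | f, Add (g, h) -> f + g + h
    | f, g -> Add (f, g)

  static member ( * ) (f: t, g: t) : t =
    match f, g with
    | Q n, Q m -> Q (n * m)
    | Q 0, _ | _, Q 0 -> Q 0
    | Q 1, e | e, Q 1 -> e
    | f, Mul (g, h) -> f * g * h
    | f, g -> Mul (f, g)

// Derivative of e with respect to the variable x (hypothetical helper).
// The overloaded operators simplify the result as it is built.
let rec d (x: string) (e: t) : t =
  match e with
  | Q _ -> Q 0
  | Var y -> if x = y then Q 1 else Q 0
  | Add (f, g) -> d x f + d x g
  | Mul (f, g) -> f * d x g + g * d x f

let example = d "x" (Var "x" * Var "x" + Q 3)
```

Because the sum and product rules go through the overloaded + and *, the zeros and ones produced by differentiation are simplified away without any extra code.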

Cheers,
Jon.

Wednesday, 14 February 2007

Avoiding copying in functional programming languages

This boils down to an absolutely pivotal concept in functional programming
called referential transparency. This concept is one of the main stumbling
blocks for imperative programmers learning FPLs like F#. I
certainly had trouble grokking this concept when I ditched C++.

In essence, when a new immutable data structure is created from an existing
data structure, the new data structure has a lot in common with the original.
An imperative programmer immediately assumes (incorrectly) that the original
data structure must have been copied. In fact, there is never any need to
copy immutable data: because it can never change, the new structure can simply
refer back to the parts of the original that it shares, i.e. referencing is
transparent.

So FPLs effectively handle everything by pointer (mutable or immutable) and
the only copying you'll ever incur is copying pointers. In this case, the
pointers inside the record will be copied, which is nothing to worry about.
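
A small sketch makes this visible. The record type and the names below are made
up for illustration; LanguagePrimitives.PhysicalEquality tests whether two
values are the very same object in memory:

```fsharp
// Hypothetical record type, used only to illustrate sharing.
type Protein = { Name: string; Residues: string list }

let xs = [2; 3; 4]
let ys = 1 :: xs   // allocates one new cell that points at xs

// The tail of ys is xs itself, not a copy of it.
let tailIsShared = LanguagePrimitives.PhysicalEquality (List.tail ys) xs

let a = { Name = "a"; Residues = ["ALA"; "GLY"] }
let b = { a with Name = "b" }   // new record, but only pointers are copied

// Both records refer to the same list of residues.
let fieldIsShared = LanguagePrimitives.PhysicalEquality a.Residues b.Residues

printfn "%b %b" tailIsShared fieldIsShared
```

Both tests should report that the data is shared, so prepending to a list or
updating one field of a record costs a single small allocation however large
the rest of the structure is.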

In the vast majority of cases, succinct F# code is efficient F# code. As
long as you don't explicitly copy lots of data, there won't be much copying
going on "under the hood".

Friday, 2 February 2007

F# is the perfect tool for scientists

I'm in the process of writing my next book "F# for Scientists" and, whilst looking for interesting examples, decided to try to mix web programming with science.

I ended up with a tiny F# script that can be run from an F# interactive session. The first few lines download the XML data on the C terminal domain of rabbit serum haemopexin (a piece of protein that I analysed many years ago). The XML data is stored in GZipped form, which is easily decompressed using built-in .NET library functions.
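
As a rough sketch of the decompression step, the following uses GZipStream from System.IO.Compression. To keep it runnable without a network connection, it compresses a small made-up XML string in memory and then decompresses it again, rather than downloading the real data. The function names gzip and gunzip are my own:

```fsharp
open System.IO
open System.IO.Compression
open System.Text

// Compress a string to GZip bytes (stands in for the downloaded data).
let gzip (text: string) : byte[] =
  use output = new MemoryStream()
  let gz = new GZipStream(output, CompressionMode.Compress)
  let bytes = Encoding.UTF8.GetBytes text
  gz.Write(bytes, 0, bytes.Length)
  gz.Dispose()   // flushes the GZip footer before the buffer is read
  output.ToArray()

// Decompress GZip bytes back into a string.
let gunzip (data: byte[]) : string =
  use input = new MemoryStream(data)
  use gz = new GZipStream(input, CompressionMode.Decompress)
  use reader = new StreamReader(gz, Encoding.UTF8)
  reader.ReadToEnd()

let sample = "<protein><sequence>MKV</sequence></protein>"
let roundTrip = gunzip (gzip sample)
```

In a real script the byte array would come from the web download instead of from gzip.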

The resulting XML can be parsed using other .NET library functions but the result is much easier to handle if it is converted into an F# variant type (rather than a .NET object hierarchy). This can be done by a single, tiny F# function that can then be applied to any XML document. The result is displayed by the interactive session, making it much easier to dissect the XML tree structure and get at the data you want.
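
Here is a minimal sketch of such a conversion, assuming the System.Xml.Linq parser. The Xml type, its constructor names and the function names are illustrative choices of mine, not necessarily the ones used in the script described above:

```fsharp
open System.Xml.Linq

// A variant type for XML: an element has a name, attributes and children.
type Xml =
  | Element of string * (string * string) list * Xml list
  | Text of string

// Convert a .NET XML node into the variant type.
// Comments and processing instructions are dropped.
let rec ofXNode (node: XNode) : Xml option =
  match node with
  | :? XElement as e ->
      let attributes = [ for a in e.Attributes() -> a.Name.LocalName, a.Value ]
      let children = e.Nodes() |> Seq.choose ofXNode |> List.ofSeq
      Some (Element (e.Name.LocalName, attributes, children))
  | :? XText as t -> Some (Text t.Value)
  | _ -> None

let parse (xml: string) : Xml option =
  ofXNode (XElement.Parse xml :> XNode)

let tree = parse "<protein id=\"x\"><sequence>MKV</sequence></protein>"
```

Evaluating tree in an interactive session prints the whole structure as nested constructors, which can then be taken apart with ordinary pattern matching.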

In this case, I extract the amino-acid sequence of the protein and compute the Kyte-Doolittle hydrophobicity profile. I intend to do a little analysis of this data (maybe calculating the pseudoperiodic features using wavelets) to illustrate how tertiary protein structure can be subtly related to simple functions of the sequence. Of course, the results of any computations can even be injected into Excel using .NET's "automation" API.
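
For the curious, a hydrophobicity profile is just a moving average of per-residue scores along the sequence. The sketch below uses the published Kyte-Doolittle hydropathy values for the 20 standard amino acids. The window size is a parameter because the appropriate choice depends on the analysis, and the names hydropathy and profile are my own:

```fsharp
// Kyte-Doolittle hydropathy index, keyed by one-letter amino-acid code.
let hydropathy =
  Map.ofList
    [ ('A', 1.8); ('R', -4.5); ('N', -3.5); ('D', -3.5); ('C', 2.5)
      ('Q', -3.5); ('E', -3.5); ('G', -0.4); ('H', -3.2); ('I', 4.5)
      ('L', 3.8); ('K', -3.9); ('M', 1.9); ('F', 2.8); ('P', -1.6)
      ('S', -0.8); ('T', -0.7); ('W', -0.9); ('Y', -1.3); ('V', 4.2) ]

// Average hydropathy over each window of consecutive residues.
// Raises an exception if the sequence contains a non-standard code.
let profile (window: int) (sequence: string) : float list =
  sequence
  |> Seq.map (fun residue -> Map.find residue hydropathy)
  |> Seq.windowed window
  |> Seq.map Array.average
  |> List.ofSeq

// A made-up five-residue sequence, only to show the call.
let scores = profile 3 "AILKV"
```

A sequence of n residues yields n - window + 1 averages, one per window position.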

The F# interactive mode turns out to be ideal for scientists who want to slice and dice their data without having to write a complete program. Although a few other languages (e.g. Python, Matlab) provide interactive sessions, F# is vastly faster and a much more elegant language.

I believe that visualisation will be the killer reason for scientists to use F# though. The ability to spawn real-time visualisations from an interactive session is nothing short of awesome. The results are vastly superior to anything I've ever seen, even in expensive commercial packages like Matlab and Mathematica.

Consequently, I think it is highly likely that I'll start doing some visualisation-related work as soon as this book is polished. Programmatic visualisations can be done quite easily using .NET and Managed DirectX but F# and its interactive mode pave the way for supremely elegant declarative graphics, where you just write "sphere here" and "cylinder there" to visualise a complete scene.
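
To make the idea of declarative graphics concrete, a scene could be described by a variant type and handed to a renderer. This is purely a hypothetical sketch: the types Vec and Scene and the function primitives are invented here, and no rendering is shown:

```fsharp
// Hypothetical scene description: all names here are illustrative.
type Vec = { X: float; Y: float; Z: float }

type Scene =
  | Sphere of Vec * float           // centre and radius
  | Cylinder of Vec * Vec * float   // both end points and radius
  | Group of Scene list

let origin = { X = 0.0; Y = 0.0; Z = 0.0 }

// "Sphere here" and "cylinder there", written as plain data.
let scene =
  Group
    [ Sphere (origin, 1.0)
      Cylinder (origin, { origin with Z = 3.0 }, 0.2) ]

// Any traversal, such as a renderer, is a pattern match over the scene.
let rec primitives (s: Scene) : int =
  match s with
  | Sphere _ | Cylinder _ -> 1
  | Group children -> List.sumBy primitives children

let count = primitives scene
```

The point of the design is that the scene is ordinary immutable data, so it can be built, inspected and transformed from the interactive session before anything is drawn.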

I'd like to abstract away quite a bit "under the hood". Rendering seemingly simple features like shadows is actually quite technically challenging and these kinds of visual cues are vitally important to the human visual system.

I envisage a future where scientists can visualise their results using sophisticated 2D and 3D graphics with minimal effort and without having to buy books on game programming and learn about adaptive subdivision.

Cheers,
Jon.