Friday, 2 February 2007

F# is the perfect tool for scientists

I'm in the process of writing my next book "F# for Scientists" and, whilst looking for interesting examples, decided to try to mix web programming with science.

I ended up with a tiny F# script that can be run from an F# interactive session. The first few lines download the XML data on the C-terminal domain of rabbit serum haemopexin (a piece of protein that I analysed many years ago). The XML data is stored in gzipped form, which is easily decompressed using built-in .NET library functions.
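As a rough illustration of the decompression step (a sketch, not the script from the book), the following uses the standard System.IO.Compression.GZipStream class to turn a compressed stream into text. The function names gunzipToString and gzipString are made up for this example, UTF-8 encoding is an assumption, and the compressed stream could come from any source, such as an HTTP response.

```fsharp
open System.IO
open System.IO.Compression
open System.Text

/// Read a GZip-compressed stream to the end and decode it as text.
/// UTF-8 is an assumption; the real data may declare another encoding.
let gunzipToString (compressed: Stream) : string =
    use gzip = new GZipStream(compressed, CompressionMode.Decompress)
    use reader = new StreamReader(gzip, Encoding.UTF8)
    reader.ReadToEnd()

/// Hypothetical helper for trying the function offline:
/// GZip-compress a string into an in-memory byte array.
let gzipString (text: string) : byte[] =
    use buffer = new MemoryStream()
    // leaveOpen = true so that disposing gzip does not close buffer
    let gzip = new GZipStream(buffer, CompressionMode.Compress, true)
    let bytes = Encoding.UTF8.GetBytes text
    gzip.Write(bytes, 0, bytes.Length)
    gzip.Dispose()   // writes the final compressed block into buffer
    buffer.ToArray()

// Round trip in memory instead of downloading anything.
let sample = "<protein><seq>ACD</seq></protein>"
let recovered = gunzipToString (new MemoryStream(gzipString sample))
```

Pasting this into an interactive session and evaluating recovered should give back the sample string, which is a quick way to check the plumbing before pointing it at real data.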

The resulting XML can be parsed using other .NET library functions but the result is much easier to handle if it is converted into an F# variant type (rather than a .NET object hierarchy). This can be done by a single, tiny F# function that can then be applied to any XML document. The result is displayed by the interactive session, making it much easier to dissect the XML tree structure and get at the data you want.
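One way such a conversion function might look is sketched below. This is an assumed design rather than the function from the book: the XmlTree type, its case names and the decision to drop comments and processing instructions are all choices made for this example. It relies only on the standard System.Xml.XmlDocument API.

```fsharp
open System.Xml

/// A minimal variant view of an XML tree (hypothetical type for this sketch).
type XmlTree =
    | Element of string * (string * string) list * XmlTree list  // name, attributes, children
    | Text of string

/// Convert a .NET XmlNode into the variant type. Only elements, attributes
/// and text are kept; comments, whitespace nodes and the like become None.
let rec ofXmlNode (node: XmlNode) : XmlTree option =
    match node.NodeType with
    | XmlNodeType.Element ->
        let attrs =
            node.Attributes
            |> Seq.cast<XmlAttribute>
            |> Seq.map (fun a -> a.Name, a.Value)
            |> List.ofSeq
        let children =
            node.ChildNodes
            |> Seq.cast<XmlNode>
            |> Seq.choose ofXmlNode
            |> List.ofSeq
        Some (Element (node.Name, attrs, children))
    | XmlNodeType.Text
    | XmlNodeType.CDATA -> Some (Text node.Value)
    | _ -> None

/// Parse an XML string and convert its root element.
let parse (xml: string) : XmlTree =
    let doc = XmlDocument()
    doc.LoadXml xml
    match ofXmlNode (doc.DocumentElement :> XmlNode) with
    | Some tree -> tree
    | None -> failwith "document has no root element"

// The element and attribute names here are invented for illustration.
let tree = parse "<protein id=\"x\"><seq>ACD</seq></protein>"
```

Because the result is an ordinary F# value, the interactive session prints it as a nested tree, and pattern matching can then pull out the elements of interest.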

In this case, I extract the amino-acid sequence of the protein and compute the Kyte-Doolittle hydrophobicity profile. I intend to do a little analysis of this data (maybe calculating the pseudoperiodic features using wavelets) to illustrate how tertiary protein structure can be subtly related to simple functions of the sequence. Of course, the results of any computations can even be injected into Excel using .NET's "automation" API.
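A hydrophobicity profile of this kind is usually computed as a sliding-window average of per-residue Kyte-Doolittle hydropathy values. The sketch below shows that calculation under some assumptions: the sequence is given in one-letter codes, the window length is left as a parameter because the post does not say which was used, and the names kyteDoolittle, hydropathy and profile are made up for this example.

```fsharp
/// Kyte-Doolittle hydropathy values for the 20 standard amino acids,
/// keyed by one-letter code.
let kyteDoolittle =
    Map.ofList
        [ 'A', 1.8; 'R', -4.5; 'N', -3.5; 'D', -3.5; 'C', 2.5
          'Q', -3.5; 'E', -3.5; 'G', -0.4; 'H', -3.2; 'I', 4.5
          'L', 3.8; 'K', -3.9; 'M', 1.9; 'F', 2.8; 'P', -1.6
          'S', -0.8; 'T', -0.7; 'W', -0.9; 'Y', -1.3; 'V', 4.2 ]

/// Look up the hydropathy of a single residue, rejecting unknown codes.
let hydropathy (residue: char) : float =
    match Map.tryFind (System.Char.ToUpperInvariant residue) kyteDoolittle with
    | Some h -> h
    | None -> invalidArg "residue" (sprintf "unknown amino acid '%c'" residue)

/// Average the hydropathy over every run of `window` consecutive residues.
/// The window length is an assumption left to the caller.
let profile (window: int) (sequence: string) : float[] =
    sequence.ToCharArray()
    |> Array.map hydropathy
    |> Array.windowed window
    |> Array.map Array.average

// A made-up fragment, not the haemopexin sequence.
let example = profile 3 "AIVRK"
```

Each entry of the result corresponds to one window position, so a sequence of n residues and a window of w gives n - w + 1 values, ready to plot or to feed into further analysis.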

The F# interactive mode turns out to be ideal for scientists who want to slice and dice their data without having to write a complete program. Although a few other languages (e.g. Python, Matlab) provide interactive sessions, F# is vastly faster and a much more elegant language.

I believe that visualisation will be the killer reason for scientists to use F# though. The ability to spawn real-time visualisations from an interactive session is nothing short of awesome. The results are vastly superior to anything I've ever seen, even in expensive commercial packages like Matlab and Mathematica.

Consequently, I think it is highly likely that I'll start doing some visualisation-related work as soon as this book is polished. Programmatic visualisations can be done quite easily using .NET and Managed DirectX but F# and its interactive mode pave the way for supremely elegant declarative graphics, where you just write "sphere here" and "cylinder there" to visualise a complete scene.
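To give a flavour of what "sphere here" and "cylinder there" could mean in code, here is a purely hypothetical sketch of a scene described as data. No such library is named in this post; the Vec3 and Scene types, the vec helper and the countPrimitives function are all invented for illustration, and no rendering is attempted.

```fsharp
/// A point in 3D space (hypothetical type for this sketch).
type Vec3 = { X: float; Y: float; Z: float }

/// A scene is either a primitive or a group of sub-scenes.
type Scene =
    | Sphere of Vec3 * float            // centre, radius
    | Cylinder of Vec3 * Vec3 * float   // two end points, radius
    | Group of Scene list

let vec x y z = { X = x; Y = y; Z = z }

/// Count the primitives in a scene, as a stand-in for a real renderer
/// that would walk the same tree.
let rec countPrimitives (scene: Scene) : int =
    match scene with
    | Sphere _
    | Cylinder _ -> 1
    | Group children -> List.sumBy countPrimitives children

// Two "atoms" joined by a "bond", written declaratively.
let bond =
    Group [ Sphere (vec 0.0 0.0 0.0, 0.5)
            Cylinder (vec 0.0 0.0 0.0, vec 1.5 0.0 0.0, 0.1)
            Sphere (vec 1.5 0.0 0.0, 0.5) ]
```

The point of the design is that the scene is just a value: it can be built, inspected and transformed in the interactive session, and the details of drawing it stay hidden behind whatever function consumes it.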

I'd like to abstract away quite a bit "under the hood". Rendering seemingly simple features like shadows is actually quite technically challenging and these kinds of visual cues are vitally important to the human visual system.

I envisage a future where scientists can visualise their results using sophisticated 2D and 3D graphics with minimal effort and without having to buy books on game programming and learn about adaptive subdivision.


1 comment:

Art said...

Agreed -- "... without having to buy books on game programming and learn about adaptive subdivision"