## Sunday, 27 December 2009

### Evolution: the weasel program

An interesting challenge is called Evolutionary Algorithm and draws upon a famous computer simulation written by Richard Dawkins to demonstrate the power of random variation and non-random cumulative selection in natural and artificial evolutionary systems, and how this process differs from chance.

The program begins with a random string of letters and spawns a generation of 200 random mutations of the parent string, selects the fittest mutation (by similarity to an "ideal" string) and uses that individual to spawn the next generation. The ideal string used is "METHINKS IT IS LIKE A WEASEL".

The results of this program clearly demonstrates how selection can drive random processes to solve problems extremely efficiently: only 6 generations were required to reproduce the ideal string exactly. In fact, randomized algorithms are of great practical importance and a similar technique, simulated annealing, is used as an example in our F# for Technical Computing book.

Our F# solution is as follows:

```let target = "METHINKS IT IS LIKE A WEASEL"
let charset = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

let rand = System.Random()

let fitness (trial: string) =
Seq.zip target trial
|> Seq.fold (fun d (c1, c2) -> if c1=c2 then d+1 else d) 0

let mutate parent rate _ =
String.mapi (fun _ c ->
if rand.NextDouble() < rate then c else
charset.[rand.Next charset.Length]) parent

do
let mutable parent =
String.init target.Length (fun _ ->
charset.[rand.Next charset.Length] |> string)
let mutable i = 0
while parent <> target do
let pfit = fitness parent
let best, f =
Seq.init 200 (mutate parent (float pfit / float target.Length))
|> Seq.map (fun s -> (s, fitness s))
|> Seq.append [parent, pfit]
|> Seq.maxBy (fun (_, f) -> f)
if i % 100 = 0 then
printf "%5d - '%s'  (fitness:%2d)\n" i parent f
parent <- best
i <- i + 1
printf "%5d - '%s'\n" i parent```

This program produces the following output:

```    0 - 'CEUMIDXSIXOOTSEHHXVMD IHTFWP'  (fitness: 6)
100 - 'PEPHIZLB NGSIO LCWE AQEKCSZQ'  (fitness:11)
200 - 'MESHIZHB IQ IO LTWGGAQWMKSRX'  (fitness:13)
300 - 'MESHIZHB IQ IO LTWGGAQWMKSRX'  (fitness:13)
400 - 'METHIVKS ITLIN LYKJPABWDASEU'  (fitness:19)
500 - 'METHINKS IT IB LIKEFA WDASEL'  (fitness:25)
518 - 'METHINKS IT IS LIKE A WEASEL'```

### F# examples on Rosetta Code

Rosetta Code is a wonderful community-driven wiki containing examples of dozens of programming tasks written in dozens of languages.

Naturally, we couldn't resist the challenge to solve many of these tasks in F# and we shall be publicizing our results here and even developing some into full-blown F#.NET Journal articles.

Perhaps the most surprising result is that the F# solutions are extremely concise even when compared with bleeding-edge research languages such as Haskell. For example, the Haskell solution to the Data Munging 2 problem is:

```type Date = String
type Value = Double
type Flag = Int
type Record = (Date, [(Value,Flag)])

duplicatedDates :: [Record] -> [Date]
duplicatedDates [] = []
duplicatedDates [_] = []
duplicatedDates (a:b:tl)
| sameDate a b = date a : duplicatedDates tl
| otherwise    = duplicatedDates (b:tl)
where sameDate a b = date a == date b
date = fst

numGoodRecords :: [Record] -> Int
numGoodRecords = length . filter recordOk
where recordOk :: Record -> Bool
recordOk (_,record) = sumOk == 24
where sumOk = length \$ filter isOk record
isOk (_,v) = v >= 1

parseLine :: String -> Record
parseLine line = (date, records')
where (date:records) = words line
records' = mapRecords records

mapRecords :: [String] -> [(Value,Flag)]
mapRecords [] = []
mapRecords [_] = error "invalid data"
mapRecords (value:flag:tail) =

main :: IO ()
main = do
let inputs = map parseLine \$ lines contents
putStrLn \$ show (length inputs) ++ " total lines"
putStrLn "duplicated dates:"
mapM_ putStrLn \$ duplicatedDates inputs
putStrLn \$ "number of good records: " ++ show (numGoodRecords inputs)```

whereas our F# solution is simply:

```let dates = HashSet(HashIdentity.Structural)
let mutable ok = 0

do
match String.split [' '; '\t'] line with
| [] -> ()
| date::xys ->
if dates.Contains date then
printf "Date %s is duplicated\n" date
else
let f (b, t) h = not b, if b then int h::t else t
let _, states = Seq.fold f (false, []) xys
if Seq.forall (fun s -> s >= 1) states then
ok <- ok + 1
printf "%d records were ok\n" ok```

This really highlights the dramatic step forward that F# represents, not only improving upon the previous generation of mainstream programming languages but even leaping beyond research.

## Tuesday, 22 December 2009

### Zach Cox' word count challenge

Zach Cox published an interesting challenge on his blog recently: to traverse a directory hierarchy of files and compute the total distribution of word frequencies. Here is a simple F# solution:

```open System.IO

let rec words_of_dir dir =
seq { for dir in Directory.GetDirectories dir do
yield! words_of_dir dir
for file in Directory.GetFiles dir do

do
|> words_of_dir
|> Seq.countBy id
|> Seq.sortBy (fun (_, n) -> -n)
|> printf "%A\n"```

Taking only 17.5s at only 14 lines of code, this F# combines the brevity of the most concise solution (Ruby) with the performance of the fastest solution (Scala).

## Monday, 21 December 2009

### Trending F# jobs in the UK

As a pioneering functional programming language on the world's favorite platform, F# is set to explode onto the job scene following its first major release as part of Visual Studio 2010 on 22nd March 2010.

The ITJobsWatch website is everyone's first stop for analysing trends in the UK job market and, interestingly, their statistics for the F# language indicate that the number of F# jobs is over 4× higher than the same period last year and the average salary has risen to £70k or €79k or US\$112k:

This is the fastest growth in demand and salary of any functional programming language! Extrapolating the current trends, demand for F# programmers will be higher than for C++ programmers in 2012!

Suffice to say, the demand and salary of F# programmers in the UK will be a trend to watch over the next couple of years.

## Wednesday, 16 December 2009

### Generating and visualizing 2D mazes

The F#.NET Journal just published an article about puzzles:

"Maze generation is a remarkably complex and diverse subject that essentially falls into the category of network or graph theory but is most often seen in the context of games and puzzles. The characteristics that make a maze interesting for humans to try to navigate are subjective and not easily defined and, consequently, the design and implementation of an automatic maze generator is as much an art as it is a science. This article describes the design and implementation of a simple but effective maze generation algorithm including WPF-based visualization. The algorithm is elegantly expressed in terms of recursive functions and purely functional data structures..."

### Parsing and visualizing binary Geographic Information System data

The F#.NET Journal just published an article about parsing binary file formats:

"Detailed vector descriptions of worldwide geographical data are freely available in the Shapefile format. This article describes how polygonal data can be parsed from the binary Shapefile format and visualized using Windows Presentation Foundation easily and efficiently using F#. The result is a simple program that draws a map of the world and can be used as the basis for a wide variety of Geographic Information System (GIS) applications from cartography to climatology..."