Sunday, 3 December 2006

Array manipulation (continued)

I've been studying the slice-based approach to array manipulation used by the Python and Matlab implementations of a 1D, 4-tap discrete wavelet transform. I am particularly interested in this because of the idea of incorporating Matlab-like array slicing in future versions of F#.

Here is the Python code (written by Sturla Molden):

def D4_Transform(x, s1, d1, d2):
odd = x[1::2]
even = x[:-1:2]
d1[:] = odd[:] - C1*even[:]
s1[:-1] = even[:-1] + C2*d1[:-1] + C3*d1[1:]
s1[-1] = even[-1] + C2*d1[-1] + C3*d1[0]
d2[0] = d1[0] + s1[-1]
d2[1:] = d1[1:] + s1[:-1]
even[:] = C4 * s1[:]
odd[:] = C5 * d2[:]
if x.size > 2:
D4_Transform(even,s1[0:even.size/2],
d1[0:even.size/2],d2[0:even.size/2])

The discrete wavelet transform computes downsampled convolutions repeatedly until no data remains. My F# implementation of this transform is:

let rec d4_aux a n =
let n2 = n lsr 1 in
let tmp = Array.make n 0. in
for i=0 to n2-2 do
tmp.(i) <- a.(i*2)*.h0 +. a.(i*2+1)*.h1 +. a.(i*2+2)*.h2 +. a.(i*2+3)*.h3;
tmp.(i+n2) <- a.(i*2)*.g0 +. a.(i*2+1)*.g1 +. a.(i*2+2)*.g2 +. a.(i*2+3)*.g3;
done;
tmp.(n2-1) <- a.(n-2)*.h0 +. a.(n-1)*.h1 +. a.(0)*.h2 +. a.(1)*.h3;
tmp.(2*n2-1) <- a.(n-2)*.g0 +. a.(n-1)*.g1 +. a.(0)*.g2 +. a.(1)*.g3;
Array.blit tmp 0 a 0 n;
if n > 4 then d4_aux a (n lsr 1)

Firstly, this could be simplified by using a new type to represent a cyclic array. However, unless the modulo indexing was hoisted from loops over the array, this would adversely affect performance.

Note that the F# is both shorter and faster than the Python. In fact, the F# is as fast as C++!

The next thing to note is that the the Python code doesn't use slicing because it is a good idea in general but, rather, because it evades touching the arrays element-wise (which would be very slow in an interpreted language like Python). So, are slices useful?

My F# program could be written more succinctly using slices:

let rec d4_aux tmp a n =
let n2 = n lsr 1 in
for i=0 to n2-2 do
tmp.(i) <- dot h a.[i*2:];
tmp.(i+n2) <- dot g a.[i*2+1:];
done;
Array.blit tmp 0 a 0 n;
if n > 4 then d4_aux a (n lsr 1)

Where a.[i:] denotes a kind of IEnumerable starting at index "i".

This has saved quite a bit of code but optimising this back to the original is quite difficult.

So I think slices would be useful in F#. However, I think they will be much more useful if they are part of a generic framework for handling random-access containers, like IList, such that you can take slices of slices, or slices of random-access containers that aren't arrays (e.g. functional arrays implemented as balanced binary trees).

On a related note, I'd like an equivalent of OCaml's stream parsing syntax extension. This could be done using the upcoming feature "active patterns" to dissect an IEnumerable (now called Seq).

In summary, I think slices will be useful in F# but I don't think they will be as prevelant in F# as they are in languages like Matlab and Python because F# already provides very fast element-wise access.

No comments: