Testing FFT with R

Published on

In the previous article, we have built a simple FFT algorithm in Haskell. Now it’s time to test it, for correctness and for (asymptotic) efficiency. If you expect the conclusion to be “it’s good” on both accounts, read on; it may get a bit more interesting than that.

This is also a good occasion to introduce and play with inline-r, a Haskell package that allows to run R code from within Haskell.

Testing correctness

In R, FFT is available straight out of the box, without a need to import a single package.

We use tasty and quickcheck to generate random lists of complex numbers, run R’s FFT and our implementation on these random inputs and compare the results.

The result?

Haskell vs R: OK (0.28s)
  +++ OK, passed 100 tests.

All 1 tests passed (0.28s)

That’s reassuring; at least we are computing the right thing. Now let’s see if we’ve managed to stay within our \(O(n \log n)\) bound.

Testing complexity

Luckily, our implementation already records the number of arithmetic operations it takes to compute the answer.

> snd . fft $ replicate 13 1
312

And we can follow this number as \(n\) grows:

> map (\n -> snd . fft $ replicate n 1) [1..20]
[0,4,12,16,40,36,84,48,144,100,220,96,312,196,420,128,544,324,684,240]

But these numbers don’t tell us much. How do we know if this is \(\Theta(n \log n)\) or \(\Theta(n^2)\)? For this, we will again turn to R.

R’s lm function approximates one value as a linear combination of other values. In this case, we’ll try to find a linear combination of \(n\), \(n \log n\), and \(n^2\) that approximates the number of operations it takes our FFT implementation to complete on an input of size \(n\). If it turns out that the coefficient of \(n^2\) in this combination is a non-negligible positive number, then the complexity of our algorithm is probably not \(O(n \log n)\).

First, we try \(n\)s which are powers of 2. Then the number of points for evaluation halves (while remaining even!) at each level of our divide-and-conquer algorithm.

Call:
lm(formula = ops ~ I(n^2) + I(n * log(n)) + n)

Coefficients:
  (Intercept)         I(n^2)  I(n * log(n))              n
    3.036e-13      1.840e-17      2.885e+00      3.399e-14

The number of steps is rather well approximated by \(2.885\, n \log n\). That’s a win! (By the way, can you tell or guess where the number 2.885 comes from?)

But remember, the number of points only shrinks when it’s even. What the number is odd from the very beginning? Then it’ll never reduce at all!

Call:
lm(formula = ops ~ I(n^2) + I(n * log(n)) + n)

Coefficients:
  (Intercept)         I(n^2)  I(n * log(n))              n
    2.657e-12      2.000e+00      1.834e-13     -2.000e+00

Our fears are confirmed; this time the \(n \log n\) coefficient is negligible, and the number of operations appears to be \(2(n^2-n)\).

For an arbitrary number \(n\), the efficiency of the algorithm depends on the largest power of 2 that divides \(n\). If \(n = 2^m q\), where \(m\) is odd, then \(n\) will halve during the first \(m\) steps, and then stabilize at \(q\).

Concluding remarks on inline-r

The way inline-r works is very cool. It takes advantage of some unique features of R; for example, that the type of R’s AST and the type of its runtime values is the same type, just like in Lisp.

So the quasi-quote [r|...|] works by calling a normal R function that parses the expression and returns its AST as an R value. Then inline-r traverses the AST, replaces metavariables like foo_hs by values dynamically constructed from foo, and passes the expression-as-a-value back to R for evaluation.

To learn more, refer to the paper Project H: Programming R in Haskell. Also, if you understand Russian, listen to our podcast where we discuss inline-r with Alexander Vershilov, one of its authors.

On the other hand, the code using inline-r may get somewhat bulky, as in the r_fft function. It’s hard for me to say yet whether this complexity is justified. The module organization is questionable; in our first program, accessing fairly basic functionality required importing 6 different modules. Finally, the documentation lacks in content and organization.

Yet this is a very impressive young project, and I would love to see its continued development.