Testing FFT with R
In the previous article, we built a simple FFT algorithm in Haskell. Now it’s time to test it, both for correctness and for (asymptotic) efficiency. If you expect the conclusion to be “it’s good” on both accounts, read on; it gets a bit more interesting than that.
This is also a good occasion to introduce and play with inline-r, a Haskell package that lets you run R code from within Haskell.
Testing correctness
In R, FFT is available straight out of the box, without the need to import a single package.
We use tasty and QuickCheck to generate random lists of complex numbers, run both R’s FFT and our implementation on these random inputs, and compare the results.
{-# LANGUAGE QuasiQuotes, ScopedTypeVariables, DataKinds, PartialTypeSignatures, GADTs #-}
import Test.Tasty
import Test.Tasty.QuickCheck
import Test.QuickCheck.Monadic
import Data.Complex
-- inline-r imports
import qualified Foreign.R as R (cast)
import Foreign.R.Type as R -- SComplex
import H.Prelude (runRegion, withEmbeddedR, defaultConfig)
import Language.R.QQ (r)
import Language.R.HExp as HExp (HExp(Complex),hexp)
import Data.Vector.SEXP as R (toList)
-- the module we wrote in the previous article
import FFT
-- Call R's FFT
r_fft :: [Complex Double] -> IO [Complex Double]
r_fft nums = runRegion $ do
  r_result1 <- [r|fft(nums_hs)|]
  let r_result2 = R.cast R.SComplex r_result1
      HExp.Complex r_result3 = hexp r_result2
      r_result4 = R.toList r_result3
  return r_result4

main =
  withEmbeddedR defaultConfig
  . defaultMain
  . testProperty "Haskell vs. R"
  $ \(nums :: [Complex Double]) -> monadicIO $ do
      r_result <- run $ r_fft nums
      let haskell_result = fst $ fft nums
      assert $
        Prelude.length r_result == Prelude.length haskell_result &&
        all ((<1e-8) . magnitude) (zipWith (-) r_result haskell_result)
The result?
Haskell vs. R: OK (0.28s)
+++ OK, passed 100 tests.
All 1 tests passed (0.28s)
That’s reassuring; at least we are computing the right thing. Now let’s see if we’ve managed to stay within our \(O(n \log n)\) bound.
Testing complexity
Luckily, our implementation already records the number of arithmetic operations it takes to compute the answer.
> snd . fft $ replicate 13 1
312
And we can follow this number as \(n\) grows:
> map (\n -> snd . fft $ replicate n 1) [1..20]
[0,4,12,16,40,36,84,48,144,100,220,96,312,196,420,128,544,324,684,240]
But these numbers don’t tell us much. How do we know if this is \(\Theta(n \log n)\) or \(\Theta(n^2)\)? For this, we will again turn to R.
R’s lm function approximates one value as a linear combination of other values. In this case, we’ll try to find a linear combination of \(n\), \(n \log n\), and \(n^2\) that approximates the number of operations it takes our FFT implementation to complete on an input of size \(n\). If it turns out that the coefficient of \(n^2\) in this combination is a non-negligible positive number, then the complexity of our algorithm is probably not \(O(n \log n)\).
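In other words, we ask lm to fit coefficients \(\beta_i\) in the model

\[ \mathrm{ops}(n) \approx \beta_0 + \beta_1\, n^2 + \beta_2\, n \log n + \beta_3\, n, \]

and then inspect the fitted \(\beta_1\), the coefficient of \(n^2\).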
{-# LANGUAGE QuasiQuotes #-}
import H.Prelude (runRegion, withEmbeddedR, defaultConfig)
import qualified H.Prelude as H
import Language.R.QQ (r)
import FFT
fft_ops :: Int -> Int
fft_ops n = snd . fft $ replicate n 1

analyze
  :: [Int] -- sequence of n's
  -> IO ()
analyze ns = runRegion $ do
  let ops = map fft_ops ns
      ns_dbl, ops_dbl :: [Double]
      ns_dbl = map fromIntegral ns
      ops_dbl = map fromIntegral ops
  H.print =<< [r| ops <- ops_dbl_hs; n <- ns_dbl_hs; lm(ops ~ I(n^2) + I(n*log(n)) + n)|]

main = withEmbeddedR defaultConfig $ do
  analyze [2^k | k <- [1..10]]
  analyze [2*k-1 | k <- [1..30]]
First, we try \(n\)s that are powers of 2. Then the number of points halves (while remaining even!) at each level of our divide-and-conquer algorithm: \(16 \to 8 \to 4 \to 2 \to 1\), for example.
Call:
lm(formula = ops ~ I(n^2) + I(n * log(n)) + n)
Coefficients:
(Intercept) I(n^2) I(n * log(n)) n
3.036e-13 1.840e-17 2.885e+00 3.399e-14
The number of steps is rather well approximated by \(2.885\, n \log n\). That’s a win! (By the way, can you tell or guess where the number 2.885 comes from?)
But remember, the number of points only shrinks when it’s even. What if the number is odd from the very beginning? Then it’ll never shrink at all!
Call:
lm(formula = ops ~ I(n^2) + I(n * log(n)) + n)
Coefficients:
(Intercept) I(n^2) I(n * log(n)) n
2.657e-12 2.000e+00 1.834e-13 -2.000e+00
Our fears are confirmed; this time the \(n \log n\) coefficient is negligible, and the number of operations appears to be \(2(n^2-n)\).
For an arbitrary number \(n\), the efficiency of the algorithm depends on the largest power of 2 that divides \(n\). If \(n = 2^m q\), where \(q\) is odd, then \(n\) will halve during the first \(m\) steps, and then stabilize at \(q\).
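If you want to observe this dependence directly, here is a quick sketch using the same operation counter as above (the exact counts depend on the fft from the previous article, so I show only the code):

import FFT

-- Number of arithmetic operations our fft needs for an input of size n
opCount :: Int -> Int
opCount n = snd . fft $ replicate n 1

-- Around n = 96 = 2^5 * 3, the op count should track the largest
-- power of 2 dividing n rather than n itself
main :: IO ()
main = mapM_ report [96..100]
  where
    report n = putStrLn $ show n ++ ": " ++ show (opCount n)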
Concluding remarks on inline-r
The way inline-r works is very cool. It takes advantage of some unique features of R; for example, that the type of R’s AST and the type of its runtime values are one and the same, just like in Lisp.
So the quasi-quote [r|...|] works by calling a normal R function that parses the expression and returns its AST as an R value. Then inline-r traverses the AST, replaces metavariables like foo_hs by values dynamically constructed from foo, and passes the expression-as-a-value back to R for evaluation.
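As a tiny illustration of this mechanism, here is a minimal sketch following the same pattern as the code above (sqrt being an ordinary R function):

{-# LANGUAGE QuasiQuotes #-}
import H.Prelude (runRegion, withEmbeddedR, defaultConfig)
import qualified H.Prelude as H
import Language.R.QQ (r)

main :: IO ()
main = withEmbeddedR defaultConfig $ runRegion $ do
  let xs = [1, 4, 9] :: [Double]
  -- xs_hs is the metavariable: inline-r substitutes the value of the
  -- Haskell variable xs into the parsed R expression before evaluation
  H.print =<< [r| sqrt(xs_hs) |]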
To learn more, refer to the paper Project H: Programming R in Haskell. Also, if you understand Russian, listen to our podcast where we discuss inline-r with Alexander Vershilov, one of its authors.
On the other hand, code using inline-r may get somewhat bulky, as in the r_fft function above. It’s hard for me to say yet whether this complexity is justified. The module organization is questionable; in our first program, accessing fairly basic functionality required importing 6 different modules. Finally, the documentation is lacking in both content and organization.
Yet this is a very impressive young project, and I would love to see its continued development.