QuickCheck vs. SmallCheck
Published on ; updated on
QuickCheck and SmallCheck are the two classical libraries for property testing in Haskell. There are newer developments, like SmartCheck and Hedgehog, but I haven’t tried them much yet.
So here is my take on the comparative strengths and weaknesses of the current versions of QuickCheck and SmallCheck (2.11.3 and 1.1.4, respectively). Being the maintainer of SmallCheck, I am obviously somewhat biased towards it. But quite often I would prefer QuickCheck to SmallCheck for a specific task. It also sometimes happens that I think QuickCheck would work better in principle for a given task, but in practice SmallCheck is more convenient for that task — usually it has to do with generic instances or failure reporting. So here are the considerations I take into account when choosing between the two.
If you do not know much about QuickCheck and/or SmallCheck, the main difference between them is that QuickCheck generates random test cases, whereas SmallCheck enumerates test cases exhaustively up to some depth. Some (though not all) of the pros and cons come from this fundamental difference in paradigms.
Advantages of QuickCheck
Control over the number of tests/execution time
With SmallCheck, the number of tests is usually exponential in their
depth. Say, when enumerating trees (from Data.Tree
), depth
5 gives you 189 tests in less then 0.01s, while depth 6 gives 6479 tests
in 4s. If you only have the patience to run 1000 tests, you have to
settle for 189. QuickCheck, on the other hand, allows you to set the
exact number of tests independently from their depth (which is called
size in QuickCheck).
Update (2018-07-05). In smallcheck-1.1.5, I added a function
limit :: Monad m => Int -> Series m a -> Series m a
which acts similarly to take
for lists. It is still not
as convenient as QuickCheck, because you have to apply it to each “big”
generator as opposed to setting a single global limit on the number of
tests, but you can use it to fight the exponential blowup.
Floating-point numbers
SmallCheck has instances for floating-point numbers, e.g.
> listSeries 5 :: [Positive Float]
1.0,2.0,0.5,4.0,0.25,8.0,0.125,16.0,3.0,6.25e-2,6.0,1.5,12.0,0.75,24.0,0.375,48.0,0.1875] [
These are probably enough if you need to fill some fractional field in a data type, but the logic of your functions doesn’t depend on the number in a complex way.
But if you are testing any truly numerical code (say, matrix operations or physics simulation), it is much more useful to have QuickCheck generate random floating-point numbers.
Advantages of SmallCheck
Generic deriving
If you have a couple of types that rarely change, writing the instances by hand may not a big deal.
Often, especially when writing tests for commercial code, there is a big number of nested algebraic data types. And even if you have the patience to write the instances, they will become incomplete the next time a constructor is added to one of the types (and the compiler won’t warn you).
So it is essential to be able to derive most of the instances. SmallCheck has had generic instances since 2011, thanks to the contribution from Bas van Dijk. QuickCheck still does not have them (github issue).
There is a separate package, generic-arbitrary, but it doesn’t seem to support recursive data types. The recursive types also seem to be the reason why QuickCheck itself doesn’t have the generic implementation.
The solution to the recursive type problem is straightforward: reduce
the depth (size) every time you descend into a nested type. As an
example, here’s an instance for trees from Data.Tree
:
instance Arbitrary a => Arbitrary (Tree a) where
= scale (\n -> max (n-1) 0) $
arbitrary Node <$> arbitrary <*> arbitrary
This approach is not ideal (your size may get down to 0 too quickly), but IMO it is better than no generic instances at all.
Update (2018-07-05). There are some third-party packages that implement generic deriving for QuickCheck and support recursive types, in particular testing-feat and generic-random.
testing-feat looks very cool and powerful from a theoretical perspective, and I’d like to study it in depth some day. In practice it involves definitions like
'MyType
deriveEnumerable 'instance Arbitrary MyType
where
= sized uniform
arbitrary = genericShrink shrink
Sometimes you need to define an Enumerable
instance for
an abstract or primitive type, which I ended up doing like this (for
Text
):
instance Enumerable T.Text where
= share (c1 T.pack) enumerate
generic-random is a bit easier to use, since there’s
no additional Enumerable
class. The declaration looks like
this:
instance Arbitrary MyType
where
= genericArbitraryU'
arbitrary = genericShrink shrink
In both cases, genericShrink
comes from the QuickCheck
package itself, but it’s not the default implementation for
shrink
; the default is shrink _ = []
.
Failure reporting
Say we have a property
= reverse (reverse x) == x prop x
By default, both SmallCheck and QuickCheck only tell you the
generated values that led to the failure (i.e. x
). But I
would also like to know why the test failed — in this case,
what reverse (reverse x)
was.
There is a StackOverflow question about this with two answers, neither of which is perfect.
Edward Z. Yang says:
Because QuickCheck gives you the inputs to the function, and because the code under test is pure (it is, right?), you can just feed those inputs to the function and get the result. This is more flexible, because with those inputs you can also repeatedly test with tweaks to the original function until it’s correct.
There are many reasons why this may not work:
- The code may not be pure by design — both SmallCheck and QuickCheck are capable of testing IO code.
- The code may be intended to be pure, but the very bug you are trying to fix may lead to it being impure/non-deterministic.
- Copy-pasting values may not be very convenient: think large matrices.
- Also, strings produced by
Show
do not always map back to the original values effortlessly. E.g. Maps, Sets, and Vectors all result infromList [...]
when shown. If you have several of them in the same structure, you’ll have to disambiguate thosefromList
s.
The other answer, by Paul Johnson, gives a working solution (using
MkResult
), but it is rather unwieldy, and you have to
include the parameter values manually if you need them.
Update (2018-07-05). I’ve since found and posted a much better way to report a failure in QuickCheck:
import Test.QuickCheck.Property (succeeded, failed, reason)
=
prop a b if res /= []
then succeeded
else failed { reason = reason }
where
= checkCode a b (res, reason)
In SmallCheck, you can return Either String String
instead of Bool
, where the String
s give
explanation for failure and success, respectively. For example, this
property
prop_reverse :: [Int] -> Bool
= reverse x == x prop_reverse x
results in the message
there exists [0,1] such that
condition is false
But we can change the property to explain itself
prop_reverse :: [Int] -> Either String String
=
prop_reverse x if reverse x == x
then Right "Match"
else Left $ "reverse x == " ++ show (reverse x)
which results in
there exists [0,1] such that
reverse x = [1,0]
Better than shrinking
QuickCheck tends to produce complex counterexamples and then tries to
shrink them to find simpler ones. While shrinking does lead to somewhat
simpler examples, they are usually not minimal. Also, it’s on you to
implement the shrinking function for your own types (although there is
genericShrink
).
Because SmallCheck enumerates the values in order, from smaller to bigger, it automatically produces the minimal example.
Determinism and exhaustiveness
Because QuickCheck generates its examples randomly, the set of test cases your property is tested on is rather arbitrary. (It also changes from run to run by default, though this can be fixed by suppplying a fixed seed.)
Sometimes that’s good: if you have a high- or infinite-dimensional space of possibilities, all you can hope for is to test on a few diverse samples from that space. But other times, that space is narrow, and it’s better to test deterministically and exhaustively.