Roman Cheplyaka

Avoid equational function definitions

May 9, 2014

One of the first things that Haskell beginners usually notice is that Haskell has this somewhat unusual but attractive way of defining functions case-by-case:

foldr f z []     = z 
foldr f z (x:xs) = f x (foldr f z xs) 

It looks fun and math-y. The other way to do pattern matching, case expressions, is much less advertized, probably because case invokes associations with dirty old imperative programming. Here’s how the same function could be defined using case:

foldr f z l =
  case l of
    []   -> z 
    x:xs -> f x (foldr f z xs) 

However, there are plenty of reasons to prefer case to multiple function definition clauses.

(If some of these look insignificant at first sight, think of a datatype with tens of constructors, which is quite common when working with abstract syntax trees.)

  1. DRY. Notice how in the equational style the function name and argument names get repeated.

  2. It makes it clear what the function decides upon. The equational style allows you to pattern match on different arguments in different clauses, or even on multiple arguments in the same clause:

    f [] 0 = 0
    f _  1 = 1
    f _  _ = 2

    It gives more power, but also makes it harder to see what’s going on. More importantly, even when this additional power is not used, it’s not obvious from the code itself until you eye-scan all the clauses.

  3. It makes code easier to modify or refactor. Tasks like

    • adding or removing a function argument
    • introducing a local definition common for multiple cases
    • preprocessing function arguments or post-processing the function result

    are trivial with the case expression, and hard to impossible (without rewriting or introducing other top-level functions) with clauses.

  4. When profiling, you often want to add an {-# SCC #-} pragma for a function. If the function is written using multiple cases, you need to attach this pragma to every clause separately. Moreover, even if you do so, they won’t account for the evaluation of arguments due to pattern matching in left-hand sides of the equations.

  5. Once you start reading the Core or STG code, writing functions using case makes it much easier to follow the connection between the original source and its intermediate representation.

Perhaps the only reason to have multiple clauses is if you need that additional power of matching on several arguments at the same time, e.g.

Right a <*> Right b = Right (a b)
Left  a <*> Right _ = Left a
Right _ <*> Left b  = Left b
Left  a <*> Left b  = Left (a <> b)

You could do this with case by matching on tuples, but it isn’t as nice.

Other than this, I rarely ever define functions in the equational style in my code.