Roman Cheplyaka

Recognizing lists

January 29, 2015

import Data.Monoid
import Data.List (isSuffixOf)
import Control.Applicative
import Control.Applicative.Permutation as Perm
import Control.Monad.Trans.State

You are given the following Haskell data type:

data T = A Int | B String

Your task is to write a function

recognize :: [T] -> Bool

that would accept a list iff it contains exactly three elements, of which:

  1. one is an A with a value between 3 and 7
  2. one is an A with an even value
  3. one is a B with a value ending in .hs

The A elements mentioned in conditions (1) and (2) have to be distinct, so [A 4, A 17, B "Prelude.hs"] should not be accepted, even though A 4 satisfies both (1) and (2).

This problem often arises in blackbox testing. You do something to a system; you wait for some time while gathering all events that the system generates. Then you want to check that the system generated all the events it had to generate, and no extra ones. However, the order of events may be non-deterministic. If the system concurrently makes a web request, updates a database and sends a message to a message bus, the order in which these events will be observed by your test suite is arbitrary.

A similar problem arises in input validation.

Interface

We want to solve this problem in a general and scalable way. What would an interface to such solution look like? There are two principal options.

One is a simple function

recognize :: [t -> Bool] -> [t] -> Bool

that takes individual predicates and combines them into a recognizer.

Another is to split the process into the build and use phases by the means of an intermediate data type:

data Parser t
instance Monoid Parser

toParser :: (t -> Bool) -> Parser t
runParser :: Parser t -> [t] -> Bool

In this approach, you convert individual recognizers to monoidal values, combine those values and extract the result.

Semantically, the two are equivalent: if you have one of these interfaces available, you can implement the other one. In a way, this is a consequence of lists being the free monoid, but writing this down explicitly is a good exercise nevertheless.

The advantages of a build-use interface are:

  1. When the cost of the build phase is non-trivial, it can be amortized over multiple uses and/or incremental builds. (This won’t be the case here.)
  2. The use of a specialized type is generally good for the type system hygiene.

We’ll start with the build-use interface because I find it more instructive.

The Parser

To build the monoidal parser, we’ll use the action-permutations library. You may remember it from the article on JSON parsing.

newtype Parser t = Parser { getParser :: Perms (StateT [t] Maybe) () }

StateT [t] Maybe a is the familiar type of parser combinators, [t] -> Maybe (a, [t]), that return the left-over input. We wrap it in Perms to achieve the order independence, and then newtype it to convert its Applicative instance to a Monoid one:

instance Monoid (Parser t) where
  mempty = Parser (pure ())
  mappend (Parser a) (Parser b) = Parser $ a *> b

The build and use functions are straightforward:

toParser :: (t -> Bool) -> Parser t
toParser isX = Parser $ Perm.atom $ StateT $ \ts ->
  case ts of
    (t:ts') | isX t -> Just ((), ts')
    _ -> Nothing

runParser :: Parser t -> [t] -> Bool
runParser p ts =
  case (flip execStateT ts . runPerms . getParser) p of
    Just [] -> True
    _ -> False

Usage

First, let’s express our conditions in Haskell code.

isA1, isA2, isB :: T -> Bool

isA1 x
  | A n <- x, 3 <= n, n <= 7 = True
  | otherwise = False

isA2 x
  | A n <- x, even n = True
  | otherwise = False

isB x
  | B s <- x, ".hs" `isSuffixOf` s = True
  | otherwise = False

Combine them together:

parser :: Parser T
parser = (mconcat . map toParser) [isA1, isA2, isB]

(we could replace mconcat . map toParser with foldMap toParser from Data.Foldable)

recognize :: [T] -> Bool
recognize = runParser parser

Now try it out:

ex1 = recognize [A 4, A 17, B "Prelude.hs"]
-- => False
ex2 = recognize [A 4, A 18, B "Prelude.hs"]
-- => True
ex3 = recognize [B "Prelude.hs", A 18, A 4]
-- => True
ex4 = recognize [B "Prelude.hs", A 18, A 4, A 1]
-- => False