Roman Cheplyaka

Haskell recipe: reading list of lines

October 25, 2012

import Prelude hiding (readList)
import Control.Applicative
import Control.Monad
import Control.Monad.Trans
import Control.Monad.Trans.Maybe
import Data.Maybe

Suppose we need to ask the user to enter a list of lines, terminated by an empty line. The usual way to do it is through recursion:

readList :: IO [String]
readList = do
  l <- getLine
  if null l
    then return []
    else (l :) <$> readList

This is fine, but can we do better?

A good style in Haskell is to replace explicit recursion with high-level recursion combinators where it is possible and doesn’t harm readability.

What is the pattern here? Those who are familiar with parser combinator libraries will easily recognize the many combinator. It tries to recognize as many items as possible and returns the list of recognized items (in this case successful recognition means that we got a non-empty line).

A generalization of the many combinator works for any applicative functor which is an instance of the Alternative class:

many :: Alternative f => f a -> f [a]

The IO functor is not an instance of Alternative, but we can fix it by layering a MaybeT transformer on top of it. MaybeT will be our means to distinguish failure from success.

readList :: IO [String]
readList = fmap (fromMaybe []) $ runMaybeT $ many $ do
  l <- lift getLine
  guard $ not $ null l
  return l

The inner do-block is the code which reads one line, and, depending on whether that line is null, returns it or signals about the failure.

Some more details about what’s going on there.

  • lift getLine lifts getLine from IO String to MaybeT IO String by assuming that it always succeeds (in MaybeT sense).
  • guard is a function from the MonadZero class that checks the condition and, if it does not hold, declares failure. It is somewhat similar to an assert function, except it is used for control flow rather than debugging purposes.
  • Finally, if guard did not abort the computation, we get to the last line that simply returns the line we’ve just read.

(If you want to learn more about how monad transformers work, please refer to Composing monadic effects.)

The outer code does boring stuff: runMaybeT transforms MaybeT IO String to IO (Maybe String) (which is the same thing but newtyped), and fromMaybe interprets Nothing as the empty list.

It is interesting that only the first line of readList definition gives us a hint that MaybeT works here. The rest of the code just uses overloaded functions many, lift, guard and return.

Was the rewrite worth it? It is debatable, since a function did get a bit more complicated. But in my opinion, the answer is yes — we got rid of explicit recursion, and the idea of the function became more clear.