Word vs. Int

Published on ; updated on

When dealing with bounded non-negative integer quantities in Haskell, should we use Word or Int to represent them?

Some argue that we shoud use Word because then we automatically know that our quantities are non-negative.

There is a famous in typed functional programming circles maxim that says “make illegal states unrepresentable”. Following this maxim generally leads to code that is more likely correct, but in each specific instance we should check that we are indeed getting closer to our goal (writing correct code) instead of following a cargo cult.

So let’s examine whether avoiding unrepresentable negative states serves us well here.

If our program is correct and never arrives at a negative result, it does not matter whether we use Int or Word. (I’m setting overflow issues aside for now.)

Thus, we only need to consider a case when we subtract a bigger number from a smaller number because of a logic flaw in our program.

Here is what happens depending on what type we use:

> 2 - 3 :: Int
-1
> 2 - 3 :: Word
18446744073709551615

Which answer would you prefer?

Even though technically -1 doesn’t satisfy our constraints and 18446744073709551615 does, I would choose -1 over 18446744073709551615 any day, for two reasons:

  1. There is a chance that some downstream computation will recognize a negative number and report an error.

    A stock exchange won’t let me buy -1 shares, and the engine won’t let me set the speed to -1 km/h. (These are terrible examples, I know, but hopefully they illustrate my point.)

    Will those systems also reject 18446744073709551615 shares or km/h? If they are well designed, yes, but I’d rather not test this in production.

  2. For a human, it is easier to notice the mistake if the answer does not make any sense at all than if the answer kinda makes sense.

    If an experienced programmer sees an unexpectedly huge number like 18446744073709551615, she will easily connect it to an underflow, although it’s an extra logical step she has to make. A less experienced programmer might spend quite a bit of time figuring this out.

In any case, I don’t see any advantage of replacing an invalid number such as -1 with a technically-valid-but-meaningless number like 18446744073709551615.

Ben Millwood said it well:

I moreover feel like, e.g. length :: [a] -> Word (or things of that ilk) would be even more of a mistake, because type inference will spread that Word everywhere, and 2 - 3 :: Word is catastrophically wrong. Although it seems nice to have an output type for non-negative functions that only has non-negative values, in fact Word happily supports subtraction, conversion from Integer, even negation (!) without a hint that anything has gone amiss. So I just don’t believe that it is a type suitable for general “positive integer” applications.

Moritz Kiefer makes another great point:

One problem that I’ve run into quite a few times with Word is [0 .. n - 1] which is quite common if you’re mapping over array indices. That alone has mostly resulted in me being really careful when I use Word.

Here Moritz is referring to the fact that the expression [0 .. n-1] :: [Word] results in the list of all integers 0 <= i < n unless n is 0, in which case n-1 wraps to maxBound.

Natural

There is a way to get the best of both worlds, though: the Natural type from Numeric.Natural.

  1. It is an arbitrary-precision integer, so we don’t have to worry about overflow.

  2. It is a non-negative type, so the invalid state is not representable.

  3. It raises an exception upon underflow, making the errors in the code prominent:

    > 2 - 3 :: Natural
    *** Exception: arithmetic underflow

There are perhaps a couple of valid use cases for Word that I can think of, but they are fairly exotic. (I am not talking here about the types such as Word32 or Word64, which are indespensible for bit manipulation.) Most of the time we should prefer either Int or Natural to Word.