StateT vs. IORef: a benchmark

Published on ; updated on

Sometimes I’m writing an IO loop in Haskell, and I need some sort of a counter or accumulator. The two main options are to use a mutable reference (IORef) or to put a StateT transformer on top the IO monad.

I was curious, though, if there was a difference in efficiency between these two approaches. Intuitively, IORefs are dedicated heap objects, while a StateT transformer’s state becomes “just” a local variable, so StateT might optimize better. But how much of a difference does it make?

So I benchmarked the four functions, all of which calculate the sum of numbers between 1 and n = 1000.

base_sum simply calls sum from the base package; state_sum and stateT_sum maintain the accumulator using the State Int and StateT Int IO monads, respectively, and ioref_sum uses an IORef within the IO monad. And here are the results, as reported by criterion.

Mean execution times reported by criterion. The error bars are the lower and upper bounds of the mean as reported by criterion, which I think are 95% bootstrap confidence intervals.

I’m not sure how stateT_sum manages to be faster than state_sum and base_sum (this doesn’t appear to be a statistical fluke), but what’s clear is that ioref_sum is significantly slower of them all.

So if 3ns per state access matter to you, go for StateT even when you are in IO.

(Update: also check out the comments on reddit, especially the ones by u/VincentPepper.)

Here’s the full benchmark code. It was compiled with -O2 by GHC 8.8.4 and run on AMD Ryzen 7 3700X.

import Criterion
import Criterion.Main

import Control.Monad.State
import Data.IORef

base_sum :: Int -> Int
base_sum n = sum [1 .. n]

state_sum :: Int -> Int
state_sum n = flip execState 0 $
  forM_ [1..n] $ \i ->
    modify' (+i)

stateT_sum :: Int -> IO Int
stateT_sum n = flip execStateT 0 $
  forM_ [1..n] $ \i ->
    modify' (+i)

ioref_sum :: Int -> IO Int
ioref_sum n = do
  ref <- newIORef 0
  forM_ [1..n] $ \i ->
    modifyIORef' ref (+i)
  readIORef ref

main = do
  let n = 1000
    [ bench "base_sum" $ whnf base_sum n
    , bench "state_sum" $ whnf state_sum n
    , bench "stateT_sum" $ whnfAppIO stateT_sum n
    , bench "ioref_sum" $ whnfAppIO ioref_sum n