Surprises of the Haskell module system (part 2)

It’s been 6 years since I published Surprises of the Haskell module system (part 1), so I thought it was time to follow through on the implicit promise and publish part 2.

Exporting constructors without data type

Is it possible to export a data constructor without exporting its parent data type? To be clear, there’s probably no good reason to want to do that, but it’s a fun puzzle, and we’ll learn something while trying to solve it.

Surprisingly, the answer depends on how our constructor and type are named. First, consider the case when they are named differently, as in

data T = C

The naive attempt

module M (C) where
data T = C

does not work. The Haskell 2010 standard says this about data types in the export list:

An algebraic datatype T declared by a data or newtype declaration may be named in one of three ways:
  • The form T names the type but not the constructors or field names. The ability to export a type without its constructors allows the construction of abstract datatypes (see Section 5.8).
  • The form T(c1,…,cn), names the type and some or all of its constructors and field names.
  • The abbreviated form T(..) names the type and all its constructors and field names that are currently in scope (whether qualified or not).

Therefore, there seems to be no way to export a constructor without exporting its parent type at the same time. But here’s the trick. First, let’s move the data type definition from M to a different module, Internal:

module Internal where
data T = C

Then, in our module M, import Internal like this:

module M where
import Internal hiding (T)

This will import C but not T. Here’s Haskell 2010 on the semantics of hiding lists:

Entities can be excluded by using the form hiding(import1 , … , importn ), which specifies that all entities exported by the named module should be imported except for those named in the list. Data constructors may be named directly in hiding lists without being prefixed by the associated type. Thus, in

  import M hiding (C)

any constructor, class, or type named C is excluded. In contrast, using C in an import list names only a class or type.
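The same rule can be observed without defining any modules of our own. Here is a minimal sketch, assuming only the standard Prelude: hiding the type name Bool leaves its constructors True and False in scope, just as hiding T leaves C in scope. The name flag is made up for illustration.

```haskell
module Main where

-- Hide the type name Bool. The constructors True and False are not named
-- Bool, so they are still imported along with everything else.
import Prelude hiding (Bool)

-- The constructors can be used freely. The inferred type is still the
-- Prelude's Bool, but we cannot write the signature `flag :: Bool` here,
-- because the name Bool is not in scope.
flag = True

main :: IO ()
main = print (flag && not False)
```

Running this should print True; adding the signature flag :: Bool should make it fail with a "not in scope" error.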

So now, within module M, we have only C in scope but not T. But M does not export either of them yet, and it seems like we need T in scope in order to add C to the export list.

Here we use a second trick: instead of naming C specifically, we export our Internal module as a whole:

module M (module Internal) where
import Internal hiding (T)

And this achieves the desired result: M now exports the data constructor C but not the data type T. See Part 1 for more discussion of module exports.
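Putting both tricks together, the whole setup can be sketched as three files. The Client module and the names c and isC are hypothetical, added only to show what an importer of M sees.

```haskell
-- Internal.hs
module Internal where

data T = C

-- M.hs
module M (module Internal) where

-- Brings C into scope (as C and Internal.C) but not T.
import Internal hiding (T)

-- Client.hs
module Client where

import M

-- The constructor is available for construction and pattern matching.
c = C

isC C = True

-- Uncommenting the lines below should fail with a "not in scope" error,
-- since M does not export the type name T.
-- c' :: T
-- c' = C
```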

This all works because the data type and its constructor are named differently. If we have

data T = T

this won’t work, because

import Internal hiding (T)

will hide both the data type and the data constructor. I actually don’t know a way to export the constructor T without the type T; if you do, please let me know.
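The clash is easy to reproduce with a type from base whose constructor shares its name. Here is a minimal sketch, assuming Data.Functor.Identity, which defines newtype Identity a = Identity { runIdentity :: a }. The name five is made up for illustration.

```haskell
module Main where

-- Identity names both the type and its constructor, so both are hidden.
-- Only the field runIdentity remains in scope.
import Data.Functor.Identity hiding (Identity)

-- We can no longer write `Identity 5`, but an Identity value can still be
-- built through its Applicative instance, since instances are always imported.
five :: Int
five = runIdentity (pure 5)

main :: IO ()
main = print five
```

The hidden type is still used by the type checker behind the scenes; only the two names are gone.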

Exporting fields without constructor

A similar question is that of exporting a field name without exporting its parent constructor, as in

module M (T, field) where
data T = C { field :: Bool }

The above code works. But what does it mean? Field names can be used as functions (e.g. field :: T -> Bool), but they can also be used in the record construction (C { field = True }) and record update (x { field = True }) syntaxes.

Since we don’t export the data constructor, clearly record construction won’t work. But what about record update? Does field lose its magic and become a simple function when exported this way—either because it wasn’t exported as T(field), or because the data constructor is not in scope? The answer is no, it is still a valid field name, and x { field = True } still works. Here’s Haskell 2010:

It makes no difference to an importing module how an entity was exported. For example, a field name f from data type T may be exported individually (f, item (1) above); or as an explicitly-named member of its data type (T(f), item (2)); or as an implicitly-named member (T(..), item(2)); or by exporting an entire module (module M, item (5)).

And there’s no requirement that the constructor be in scope in order to use the record update syntax—even though its desugaring would require all of the constructors to be in scope.

Some authors use this to their advantage: for instance, Michael Snoyman suggests exporting fields but not constructors for settings types.
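Here is a minimal sketch of that pattern, reusing the module M from above. The exported value defaultT and the Main module are assumptions added for illustration; without some exported value of type T, a client would have nothing to update.

```haskell
-- M.hs
module M (T, field, defaultT) where

data T = C { field :: Bool }

-- A default value, so that clients can obtain a T without the constructor.
defaultT :: T
defaultT = C { field = False }

-- Main.hs
module Main where

import M

main :: IO ()
main = do
  -- field used in a record update, even though C is not in scope
  let t = defaultT { field = True }
  -- field used as an ordinary function
  print (field t)
```

Since clients never mention C, the author of M can later add more fields to T without breaking code written this way.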

But more often than not, we want the opposite: to make the type abstract and export a getter for it. If you think the module M above exports an abstract type T and a function field :: T -> Bool, then you may believe you are free to change the implementation of T without your users noticing. So in the next version you may change T to

module M (T, field) where
data T = C { items :: [Int] }
field :: T -> Bool
field = not . null . items

This new module still exports a type T and a function field :: T -> Bool, but that function is no longer a record field. If someone was using field in a record update, x { field = True }, their code is going to break.

There’s no way to export a field as a plain function or otherwise strip it of its magic. You can either define a getter yourself, as in

{-# LANGUAGE LambdaCase #-}
data T = C Bool
field :: T -> Bool
field = \case
  C a -> a

or create a copy of a field, as in

data T = C { _field :: Bool }
field :: T -> Bool
field = _field

Hidden types and type synonyms

Consider this example:

module M (x) where

type T = Int

x :: T  
x = 1 

What type does x have in a module that imports M? On the one hand, T is Int, so x :: Int. On the other hand, since T is not in scope, there’s no way to know that T = Int, so it might seem that x is now an opaque value of some abstract type T that we know nothing about. In fact, the former view is correct and the latter isn’t. Haskell 2010 is explicit about this:

The type of an exported entity is unaffected by non-exported type synonyms. For example, in

  module M(x) where  
    type T = Int  
    x :: T  
    x = 1

the type of x is both T and Int; these are interchangeable even when T is not in scope. That is, the definition of T is available to any module that encounters it whether or not the name T is in scope. The only reason to export T is to allow other modules to refer it by name; the type checker finds the definition of T if needed whether or not it is exported.
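To see this from the importing side, here is a minimal sketch with a hypothetical Main module that uses x as a plain Int:

```haskell
-- M.hs
module M (x) where

type T = Int

x :: T
x = 1

-- Main.hs
module Main where

import M

-- x is treated as an Int, although the synonym T is not in scope here.
y :: Int
y = x + 1

main :: IO ()
main = print y
```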

Similarly, you may wonder what it means to export a function but hide some of the types that are part of its signature. Here’s what Haskell 2010 has to say about that:

[E]ntities that the compiler requires for type checking or other compile time analysis need not be imported if they are not mentioned by name. The Haskell compilation system is responsible for finding any information needed for compilation without the help of the programmer. That is, the import of a variable x does not require that the datatypes and classes in the signature of x be brought into the module along with x unless these entities are referenced by name in the user program. The Haskell system silently imports any information that must accompany an entity for type checking or any other purposes. Such entities need not even be explicitly exported: the following program is valid even though T does not escape M1:

  module M1(x) where  
    data T = T  
    x = T  
 
  module M2 where  
    import M1(x)  
    y = x

In this example, there is no way to supply an explicit type signature for y since T is not in scope. Whether or not T is explicitly exported, module M2 knows enough about T to correctly type check the program.
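As a sketch of how far this goes, the hidden type remains usable through anything else the module exports. The function describe and the value s below are hypothetical additions to the report's example.

```haskell
-- M1.hs
module M1 (x, describe) where

data T = T

x :: T
x = T

describe :: T -> String
describe T = "a value of the hidden type T"

-- M2.hs
module M2 where

import M1 (x, describe)

-- Type checks, even though M2 cannot name the type that flows
-- from x into describe.
s :: String
s = describe x
```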

Self-exporting module

By default—when no export list is present in a module—all values, types and classes defined in the module are exported, but not those that are imported. With an explicit export list, we can export a mix of entities from the module itself and the modules it imports. But now you have to explicitly list everything defined in the current module and remember to update the list when you define a new function or type. Or do you?

There’s a nice trick to have an explicit export list and automatically export everything from the current module at the same time: just let the module export itself.

-- exports x, y, and z
module M (module M, Foo.x, Bar.y) where

import Foo
import Bar

z = undefined

Thanks to the clear semantics of module exports (which we discussed in Part 1), self-exporting is well-defined and doesn’t involve any recursion.
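Here is a fuller sketch of the same layout, with hypothetical contents for Foo and Bar. The definitions of x, y, z and fooOnly, and the client module, are assumptions made for illustration.

```haskell
-- Foo.hs
module Foo where

x :: Int
x = 1

fooOnly :: Int
fooOnly = 10

-- Bar.hs
module Bar where

y :: Int
y = 2

-- M.hs
module M (module M, Foo.x, Bar.y) where

import Foo
import Bar

-- Defined here, so it is in scope as both z and M.z and is exported
-- by the `module M` item without being listed.
z :: Int
z = x + y

-- Main.hs
module Main where

import M

main :: IO ()
main = print (x + y + z)
-- fooOnly is not in scope here: M imports it but does not re-export it.
```

The item module M only picks up names that are in scope both unqualified and as M.name, which is why imported names such as fooOnly are left out unless listed explicitly.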