Surprises of the Haskell module system (part 2)
It’s been 6 years since I published Surprises of the Haskell module system (part 1), so I thought it was time to follow through on the implicit promise and publish part 2.
Exporting constructors without data type
Is it possible to export a data constructor without exporting its parent data type? To be clear, there’s probably no good reason to want to do that, but it’s a fun puzzle, and we’ll learn something while trying to solve it.
Surprisingly, the answer depends on how our constructor and type are named. First, consider the case when they are named differently, as in
data T = C
The naive attempt
module M (C) where
data T = C
does not work. The Haskell 2010 standard says this about data types in the export list:
An algebraic datatype T declared by a data or newtype declaration may be named in one of three ways:
- The form T names the type but not the constructors or field names. The ability to export a type without its constructors allows the construction of abstract datatypes (see Section 5.8).
- The form T(c1,…,cn), names the type and some or all of its constructors and field names.
- The abbreviated form T(..) names the type and all its constructors and field names that are currently in scope (whether qualified or not).
Therefore, there seems to be no way to export a constructor without exporting its parent type at the same time. But here’s the trick. First, let’s move the data type definition from M to a different module, Internal:
module Internal where
data T = C
Then, in our module M, import Internal like this:
module M where
import Internal hiding (T)
This will import C but not T. Here’s Haskell 2010 on the semantics of hiding lists:
Entities can be excluded by using the form hiding(import1 , … , importn ), which specifies that all entities exported by the named module should be imported except for those named in the list. Data constructors may be named directly in hiding lists without being prefixed by the associated type. Thus, in
import M hiding (C)
any constructor, class, or type named C is excluded. In contrast, using C in an import list names only a class or type.
So now, within module M, we have only C in scope but not T. But M does not export any of them yet, and it seems like we need T in scope in order to add C to the export list.
Here we use a second trick: instead of naming C specifically, we export our Internal module as a whole:
module M (module Internal) where
import Internal hiding (T)
And this achieves the desired result: M now exports the data constructor C but not the data type T.
See Part 1 for more discussion of module exports.
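As a rough sketch of what this looks like from the outside, here are the two modules from above together with a hypothetical client. The deriving Show clause and the Client.hs file are additions made only for this illustration, and each module has to live in its own file.

```haskell
-- File: Internal.hs
module Internal where

-- `deriving Show` is an addition for this sketch,
-- so that the client can print a value.
data T = C
  deriving Show

-- File: M.hs
module M (module Internal) where

import Internal hiding (T)

-- File: Client.hs (hypothetical client, not part of the original example)
module Main (main) where

import M

-- Naming the type is not possible here. Uncommenting the next two lines
-- should fail with a "not in scope" error for T.
-- w :: T
-- w = C

main :: IO ()
main = do
  -- The constructor is in scope, both for construction and for matching.
  let v = C
  case v of
    C -> print v
```

The client can build and inspect values with C, but it cannot write a type signature that mentions T.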
This all works because the data type and its constructor are named differently. If we have
data T = T
this won’t work, because
import Internal hiding (T)
will hide both the data type and the data constructor. I actually don’t know a way to export the constructor T without the type T; if you do, please let me know.
Exporting fields without constructor
A similar question is that of exporting a field name without exporting its parent constructor, as in
module M (T, field) where
data T = C { field :: Bool }
The above code works. But what does it mean? Field names can be used as functions (e.g. field :: T -> Bool), but they can also be used in the record construction (C { field = True }) and record update (x { field = True }) syntaxes.
Since we don’t export the data constructor, clearly record construction won’t work. But what about record update? Does field lose its magic and become a simple function when exported this way, either because it wasn’t exported as T(field), or because the data constructor is not in scope?
The answer is no: it is still a valid field name, and x { field = True } still works. Here’s Haskell 2010:
It makes no difference to an importing module how an entity was exported. For example, a field name f from data type T may be exported individually (f, item (1) above); or as an explicitly-named member of its data type (T(f), item (2)); or as an implicitly-named member (T(..), item(2)); or by exporting an entire module (module M, item (5)).
And there’s no requirement that the constructor be in scope in order to use the record update syntax—even though its desugaring would require all of the constructors to be in scope.
Some authors use this to their advantage: for instance, Michael Snoyman suggests exporting fields but not constructors for settings types.
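To see why the desugaring mentioned above needs the constructors, here is a small single-module sketch. The two-constructor type and the names setField and setFieldDesugared are made up for this illustration; the second function is written by hand and only roughly follows the case-expression translation that Haskell 2010 gives for record update.

```haskell
module Main (main) where

-- A hypothetical type with two constructors that both carry `field`.
data T
  = A { field :: Bool, count :: Int }
  | B { field :: Bool }
  deriving (Eq, Show)

-- Record update syntax: no constructor is mentioned.
setField :: T -> T
setField x = x { field = True }

-- A hand-written equivalent: it has to match on every constructor
-- that has the field, so it needs both A and B in scope.
setFieldDesugared :: T -> T
setFieldDesugared x = case x of
  A _ n -> A True n
  B _   -> B True

main :: IO ()
main = do
  let samples = [A False 1, B False]
  print (map setField samples)
  -- Both versions agree on every sample.
  print (map setField samples == map setFieldDesugared samples)
```

The update syntax hides the constructors from the person writing the code, which is what lets it work in a module where only the field name is in scope.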
But more often than not, we want the opposite: to make the type abstract and export a getter for it. If you think the module M above exports an abstract type T and a function field :: T -> Bool, then you may believe you are free to change the implementation of T without your users noticing. So in the next version you may change the module to
module M (T, field) where
data T = C { items :: [Int] }
field :: T -> Bool
field = not . null . items
This new module still exports a type T and a function field :: T -> Bool, but that function is no longer a record field. If someone was using field in a record update, x { field = True }, their code is going to break.
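To make the breakage concrete, here is a sketch of a hypothetical client, split across two files. The defaultT value is an addition for this illustration (a client needs some way to obtain a T without the constructor); everything else in M.hs is the first version of the module.

```haskell
-- File: M.hs (first version, plus a hypothetical default value)
module M (T, field, defaultT) where

data T = C { field :: Bool }

-- Hypothetical addition: lets clients obtain a T without the constructor.
defaultT :: T
defaultT = C { field = False }

-- File: Client.hs (hypothetical client)
module Main (main) where

import M (T, defaultT, field)

-- Uses `field` as a record field, even though C is not in scope.
enable :: T -> T
enable x = x { field = True }

main :: IO ()
main = print (field (enable defaultT))
```

If M.hs is swapped for the second version (with defaultT adapted to the new representation), the update in enable should be rejected, because field is then an ordinary function rather than a record field.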
There’s no way to export a field as a plain function or otherwise strip it of its magic. You can either define a getter yourself, as in
{-# LANGUAGE LambdaCase #-}
data T = C Bool
field :: T -> Bool
field = \case
  C a -> a
or create a copy of a field, as in
data T = C { _field :: Bool }
field :: T -> Bool
field = _field
Hidden types and type synonyms
Consider this example:
module M (x) where
type T = Int
x :: T
x = 1
What type does x have in a module that imports M? On the one hand, T is Int, so x :: Int. On the other hand, since T is not in scope, there’s no way to know that T = Int, so it might seem that x is now an opaque value of some abstract type T that we know nothing about. In fact, the former view is correct and the latter isn’t. Here’s Haskell 2010:
The type of an exported entity is unaffected by non-exported type synonyms. For example, in
module M(x) where
  type T = Int
  x :: T
  x = 1
the type of x is both T and Int; these are interchangeable even when T is not in scope. That is, the definition of T is available to any module that encounters it whether or not the name T is in scope. The only reason to export T is to allow other modules to refer it by name; the type checker finds the definition of T if needed whether or not it is exported.
Similarly, you may wonder what it means to export a function but hide some of the types that are part of its signature. Here’s what Haskell 2010 has to say about that:
[E]ntities that the compiler requires for type checking or other compile time analysis need not be imported if they are not mentioned by name. The Haskell compilation system is responsible for finding any information needed for compilation without the help of the programmer. That is, the import of a variable x does not require that the datatypes and classes in the signature of x be brought into the module along with x unless these entities are referenced by name in the user program. The Haskell system silently imports any information that must accompany an entity for type checking or any other purposes. Such entities need not even be explicitly exported: the following program is valid even though T does not escape M1:
module M1(x) where
  data T = T
  x = T

module M2 where
  import M1(x)
  y = x
In this example, there is no way to supply an explicit type signature for y since T is not in scope. Whether or not T is explicitly exported, module M2 knows enough about T to correctly type check the program.
Self-exporting module
By default—when no export list is present in a module—all values, types and classes defined in the module are exported, but not those that are imported. With an explicit export list, we can export a mix of entities from the module itself and the modules it imports. But now you have to explicitly list everything defined in the current module and remember to update the list when you define a new function or type. Or do you?
There’s a nice trick to have an explicit export list and automatically export everything from the current module at the same time: just let the module export itself.
-- exports x, y, and z
module M (module M, Foo.x, Bar.y) where
import Foo
import Bar
z = undefined
Thanks to the clear semantics of module exports (which we discussed in Part 1), self-exporting is well-defined and doesn’t involve any recursion.
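Here is a single-file sketch of the same trick, using modules from base in place of Foo and Bar. The names shout and sorted are invented for this example; since this is module Main, the self-export also covers main.

```haskell
-- Exports shout, sorted and main (via `module Main`),
-- plus the two re-exported imports.
module Main (module Main, Data.Char.toUpper, Data.List.sort) where

import Data.Char (toUpper)
import Data.List (sort)

shout :: String -> String
shout = map toUpper

sorted :: [Int]
sorted = sort [3, 1, 2]

main :: IO ()
main = do
  putStrLn (shout "exported via module Main")
  print sorted
```

Adding another top-level definition to this file exports it automatically, while imported names are still re-exported only when they are listed explicitly.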