Published on December 31, 2014; tags: Haskell
Conal Elliott, in his paper Denotational design with type class morphisms and in a recent podcast, advocates denotational design as a principle of software design. Unfortunately, in my experience it never works for realistically complex problems.
First, I’ll formulate Conal’s approach as I understand it. For any given entity of your model, you should come up with a simple mathematical object — the denotation — that faithfully represents that entity.
The implementation of the type may vary, presumably to maximize its runtime efficiency, but it should not expose any more information than the chosen denotation has. That is considered an «abstraction leak». Conal specifically talks about that in the podcast (31m50s, for example).
Here I need to stress an important but subtle point in Conal’s principle: simplicity. You only follow Conal’s principle if you find a simple denotation, not just any denotation.
This point is important because without it any design is denotational design, trivially. Universal algebra tells us that for any set of operations and any (non-contradictory) laws about them there exists a model that satisfies these laws. For any Haskell module, we can interpret its types in set theory (or a more complex domain if needed), and call that our denotation.
But that’s not what Conal is after. His approach is interesting exactly because he argues that it is possible to find simple denotations. This subtle point makes Conal’s approach simultaneously attractive and unrealistic. I’ll demonstrate this with two examples from my own work experience.
At Barclays I worked on FPF, an embedded domain-specific language for describing financial instruments. In his paper, Conal shows how a denotation for such a DSL can quickly grow in complexity when requirements change. When variables and errors are introduced, the denotation becomes Env -> Result a. Still, this is a very simple DSL that only supports evaluation.
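To make the shape Env -> Result a concrete, here is a minimal sketch of an evaluation-only DSL. Everything in it is an assumption for illustration: the Expr constructors, the list-based Env, and the choice of Either String for Result are hypothetical and are not taken from FPF or from Conal's paper.

```haskell
-- Hypothetical toy DSL; names are illustrative, not taken from FPF or the paper.
type Env = [(String, Double)]

-- Assumption: a result is either an error message or a value.
type Result a = Either String a

data Expr
  = Lit Double
  | Var String
  | Add Expr Expr
  | Div Expr Expr

-- The denotation of an expression: a function from environments to results.
eval :: Expr -> Env -> Result Double
eval (Lit x) _ = Right x
eval (Var v) env =
  maybe (Left ("unbound variable: " ++ v)) Right (lookup v env)
eval (Add a b) env = (+) <$> eval a env <*> eval b env
eval (Div a b) env = do
  x <- eval a env
  y <- eval b env
  if y == 0 then Left "division by zero" else Right (x / y)
```

With this denotation, two expressions mean the same thing exactly when eval maps them to the same function, which is all an evaluator ever needs to know.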
In reality, the main reason people make DSLs instead of using general-purpose languages is the ability to analyze DSL programs. One important feature of FPF is that it could pretty-print a program into a nice PDF. That poses an obvious problem — not every two semantically equivalent programs (under the interpretation semantics) result in equally nice PDFs. Inlining is a semantically sound transformation, but when our users get PDFs with all the definitions inlined, they get angry.
Sure, we could say that now our denotation becomes the domain product (Env -> Result a, String), where String is the pretty printer output. But in reality we have a dozen different analyses, and most of them are not expressible in terms of each other, or any single simple model. They also do not satisfy many laws. For instance, one day a user (quant or trader) could come and tell us that the barrier classifier should classify two mathematically equivalent expressions as different barriers because those expressions follow certain conventions. And even though the quant is mathematically inclined, denotations and type class morphisms would be the last thing he wants to hear about in response to his feature request.
So, in practice, the best denotation for the DSL expressions was the AST itself. Which, according to my interpretation of Conal’s principles, is not an example of a denotational design, but a failure to apply one.
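The tension between evaluation and pretty-printing can be sketched with a hypothetical mini-language that has let-bindings. The types and the functions eval, pretty, inline and subst below are illustrative assumptions, not FPF code. Inlining preserves the evaluation result but changes what the pretty printer produces, so the only thing both analyses agree on is the AST.

```haskell
-- Hypothetical mini-language with let-bindings; all names are illustrative.
data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Let String Expr Expr
  deriving (Eq, Show)

-- Evaluation semantics: Nothing signals an unbound variable.
eval :: [(String, Int)] -> Expr -> Maybe Int
eval _   (Lit n)     = Just n
eval env (Var v)     = lookup v env
eval env (Add a b)   = (+) <$> eval env a <*> eval env b
eval env (Let v e b) = do
  x <- eval env e
  eval ((v, x) : env) b

-- A second analysis: the output depends on syntax, not on the value.
pretty :: Expr -> String
pretty (Lit n)     = show n
pretty (Var v)     = v
pretty (Add a b)   = pretty a ++ " + " ++ pretty b
pretty (Let v e b) = "let " ++ v ++ " = " ++ pretty e ++ " in " ++ pretty b

-- Inline every let-binding by substitution, innermost bindings first.
inline :: Expr -> Expr
inline (Let v e b) = subst v (inline e) (inline b)
inline (Add a b)   = Add (inline a) (inline b)
inline e           = e

-- Substitute e for free occurrences of v. The Let case respects shadowing
-- but is not capture-avoiding in general; that is enough here because
-- inline only calls subst on bodies whose lets are already removed.
subst :: String -> Expr -> Expr -> Expr
subst v e (Var w) | v == w = e
subst v e (Add a b) = Add (subst v e a) (subst v e b)
subst v e (Let w x b)
  | v == w    = Let w (subst v e x) b
  | otherwise = Let w (subst v e x) (subst v e b)
subst _ _ other = other
```

For example, Let "x" (Add (Lit 1) (Lit 2)) (Add (Var "x") (Var "x")) and its inlined form both evaluate to Just 6, yet they print as "let x = 1 + 2 in x + x" and "1 + 2 + 1 + 2" respectively.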
At my current job (Signal Vine), I work on a platform for scripted interaction with students via text messages. For every student enrolled in a messaging campaign, we send a message, receive a reply, process it, and the cycle repeats.
This is very similar to FRP; perhaps not the FRP Conal prefers (in the podcast he stresses the importance of continuous functions as opposed to events), but the kind of discrete FRP that Justin Le models with Mealy machines.
So it would seem that I should model a student as
newtype Student = Student (InboundSMS -> (OutboundSMS, Student))
That would be an exemplary case of denotational design. But that would be far from our customers’ needs. Every student has a set of profile variables that are filled when the student responds to a text, and our customers (counselors who work with that student) want to see those variables. They also want to see which messages were sent, what the student’s replies were, and even what messages will be sent to the student in the future. These requirements defeat the attempt to model a student in a simple, abstract way. Instead, I need to store all the information I have about the student because sooner or later I’ll need to expose that information to the user.
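The difference between the two models can be sketched as follows. All names here are hypothetical stand-ins rather than Signal Vine's actual types: the SMS types are plain strings, counting is a made-up example machine, and StudentRecord with its receive function is only one possible way to keep the information visible.

```haskell
-- All types below are hypothetical stand-ins, not Signal Vine's actual model.
type InboundSMS  = String
type OutboundSMS = String

-- The opaque, Mealy-machine denotation from the text.
newtype Student = Student (InboundSMS -> (OutboundSMS, Student))

-- Run the machine over a list of replies, collecting the outbound messages.
runStudent :: Student -> [InboundSMS] -> [OutboundSMS]
runStudent _ [] = []
runStudent (Student step) (m : ms) =
  let (out, next) = step m in out : runStudent next ms

-- A machine that counts replies; its state is hidden inside the closure.
counting :: Int -> Student
counting n =
  Student $ \_ -> ("reply #" ++ show n ++ " received", counting (n + 1))

-- A transparent alternative: the state is plain data that a UI can display.
data StudentRecord = StudentRecord
  { profile  :: [(String, String)]  -- profile variables filled from replies
  , sent     :: [OutboundSMS]       -- outbound history, newest first
  , received :: [InboundSMS]        -- inbound history, newest first
  } deriving (Eq, Show)

-- Store a reply under a profile variable and log the message we answer with.
receive :: String -> InboundSMS -> StudentRecord -> StudentRecord
receive var msg s = s
  { profile  = (var, msg) : profile s
  , sent     = ("Thanks, noted your " ++ var) : sent s
  , received = msg : received s
  }
```

The only way to observe a Student is to feed it messages, so there is no way to ask counting 1 how many replies it has seen. A StudentRecord gives up that abstraction, and in exchange the profile and both message histories can be shown to a counselor directly.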
Denotational design is a very neat idea, but I believe that it only works in simple cases and when requirements are static. In real-world commercial programming, it breaks for two main reasons:
It is certainly useful to think about denotations of your entities in specific, simple contexts (like the evaluation semantics for a DSL); such thought experiments may help you better understand the problem or even find a flaw in your implementation.
But when you are implementing the actual type, your best bet is to create an algebraic data type with all the information you have, so that when you need to extend the interface (or «leak the abstraction»), it won’t cause you too much pain.