Undefined behavior with StablePtr in Haskell
Published on
What will the following Haskell code snippet do?
<- newStablePtr 1
ptr1 <- newStablePtr 2
ptr2
print =<< deRefStablePtr ptr1
Of course it will print 1… most of the time. But it’s also easy to make it print 2.
Here is the full program:
import Foreign.StablePtr
= do
main <- newStablePtr ()
ptr
freeStablePtr ptr
freeStablePtr ptr
<- newStablePtr 1
ptr1 <- newStablePtr 2
ptr2
print =<< deRefStablePtr ptr1
If I compile it with ghc 8.0.2 on x86-64 Linux, it prints 2 and exits cleanly. (Under ghci or runghc, it also prints 2 but then gets a SIGBUS.) You can imagine how much fun it was to debug this issue in a complex library with a non-obvious double free.
The docs for freeStablePtr
say:
Dissolve the association between the stable pointer and the Haskell value. Afterwards, if the stable pointer is passed to
deRefStablePtr
orfreeStablePtr
, the behaviour is undefined.
As far as undefined behaviors go, this one is fairly benign — at least it didn’t delete my code or install a backdoor.
Let’s see what’s going on here. The relevant definitions are in
includes/stg/Types.h
, includes/rts/Stable.h
,
rts/Stable.h
, and rts/Stable.c
. The excerpts
below are simplified compared to the actual ghc source.
A stable pointer is just an integer index into an array,
stable_ptr_table
, although it is represented as a
void*
.
/*
* Stable Pointers: A stable pointer is represented as an index into
* the stable pointer table.
*
* StgStablePtr used to be a synonym for StgWord, but stable pointers
* are guaranteed to be void* on the C-side, so we have to do some
* occasional casting. Size is not a matter, because StgWord is always
* the same size as a void*.
*/
typedef void* StgStablePtr;
typedef struct {
;
StgPtr addr} spEntry;
*stable_ptr_table; spEntry
This is how the table works:
If an index
i
is allocated to a validStablePtr
, thenstable_ptr_table[i].addr
points to whatever heap object the stable pointer is supposed to point to.(StgStablePtr sp) StgPtr deRefStablePtr{ return stable_ptr_table[(StgWord)sp].addr; }
If the index
i
is free (not allocated to any validStablePtr
), then the corresponding entry in the table acts as a node in a linked list that contains all free entries. The variablestable_ptr_free
contains a pointer to the start of this linked list.static spEntry *stable_ptr_free; /* Free a StgStablePtr */ void freeSpEntry(spEntry *sp) { ->addr = (P_)stable_ptr_free; sp= sp; stable_ptr_free } /* Allocate a fresh StgStablePtr */ (StgPtr p) StgStablePtr getStablePtr{ ; StgWord sp = stable_ptr_free - stable_ptr_table; sp = (spEntry*)(stable_ptr_free->addr); stable_ptr_free [sp].addr = p; stable_ptr_tablereturn (StgStablePtr)(sp); }
Suppose that the first two free entries in
stable_ptr_table
are 18 and 19, which is what I actually
observe at the program startup. (The RTS creates a few stable pointers
of its own for the purpose of running the program, which explains why
these don’t start at 0.) Here’s the double-free code annotated with what
happens on each step.
<- newStablePtr ()
ptr -- ptr == 18
-- stable_ptr_free == &stable_ptr_table[19]
-- stable_ptr_table[18].addr is a pointer to a Haskell value ()
freeStablePtr ptr-- stable_ptr_free == &stable_ptr_table[18]
-- stable_ptr_table[18].addr == &stable_ptr_table[19]
freeStablePtr ptr-- stable_ptr_free == &stable_ptr_table[18]
-- stable_ptr_table[18].addr == &stable_ptr_table[18]
<- newStablePtr 1
ptr1 -- ptr1 == 18
-- stable_ptr_free == &stable_ptr_table[18]
-- stable_ptr_table[18].addr is a pointer to a Haskell value 1
<- newStablePtr 2
ptr2 -- ptr2 == 18
-- stable_ptr_free is a pointer to a Haskell value 1
-- stable_ptr_table[18].addr is a pointer to a Haskell value 2
Because stable_ptr_free
now points into the Haskell heap
and outside of stable_ptr_table
, further allocations of
stable pointers will corrupt Haskell memory and eventually result in
SIGBUS or SIGSEGV.