Undefined behavior with StablePtr in Haskell
Published on
What will the following Haskell code snippet do?
ptr1 <- newStablePtr 1
ptr2 <- newStablePtr 2
print =<< deRefStablePtr ptr1Of course it will print 1… most of the time. But it’s also easy to make it print 2.
Here is the full program:
import Foreign.StablePtr
main = do
ptr <- newStablePtr ()
freeStablePtr ptr
freeStablePtr ptr
ptr1 <- newStablePtr 1
ptr2 <- newStablePtr 2
print =<< deRefStablePtr ptr1If I compile it with ghc 8.0.2 on x86-64 Linux, it prints 2 and exits cleanly. (Under ghci or runghc, it also prints 2 but then gets a SIGBUS.) You can imagine how much fun it was to debug this issue in a complex library with a non-obvious double free.
The docs for freeStablePtr say:
Dissolve the association between the stable pointer and the Haskell value. Afterwards, if the stable pointer is passed to
deRefStablePtrorfreeStablePtr, the behaviour is undefined.
As far as undefined behaviors go, this one is fairly benign — at least it didn’t delete my code or install a backdoor.
Let’s see what’s going on here. The relevant definitions are in
includes/stg/Types.h, includes/rts/Stable.h,
rts/Stable.h, and rts/Stable.c. The excerpts
below are simplified compared to the actual ghc source.
A stable pointer is just an integer index into an array,
stable_ptr_table, although it is represented as a
void*.
/*
* Stable Pointers: A stable pointer is represented as an index into
* the stable pointer table.
*
* StgStablePtr used to be a synonym for StgWord, but stable pointers
* are guaranteed to be void* on the C-side, so we have to do some
* occasional casting. Size is not a matter, because StgWord is always
* the same size as a void*.
*/
typedef void* StgStablePtr;
typedef struct {
StgPtr addr;
} spEntry;
spEntry *stable_ptr_table;This is how the table works:
If an index
iis allocated to a validStablePtr, thenstable_ptr_table[i].addrpoints to whatever heap object the stable pointer is supposed to point to.StgPtr deRefStablePtr(StgStablePtr sp) { return stable_ptr_table[(StgWord)sp].addr; }If the index
iis free (not allocated to any validStablePtr), then the corresponding entry in the table acts as a node in a linked list that contains all free entries. The variablestable_ptr_freecontains a pointer to the start of this linked list.static spEntry *stable_ptr_free; /* Free a StgStablePtr */ void freeSpEntry(spEntry *sp) { sp->addr = (P_)stable_ptr_free; stable_ptr_free = sp; } /* Allocate a fresh StgStablePtr */ StgStablePtr getStablePtr(StgPtr p) { StgWord sp; sp = stable_ptr_free - stable_ptr_table; stable_ptr_free = (spEntry*)(stable_ptr_free->addr); stable_ptr_table[sp].addr = p; return (StgStablePtr)(sp); }
Suppose that the first two free entries in
stable_ptr_table are 18 and 19, which is what I actually
observe at the program startup. (The RTS creates a few stable pointers
of its own for the purpose of running the program, which explains why
these don’t start at 0.) Here’s the double-free code annotated with what
happens on each step.
ptr <- newStablePtr ()
-- ptr == 18
-- stable_ptr_free == &stable_ptr_table[19]
-- stable_ptr_table[18].addr is a pointer to a Haskell value ()
freeStablePtr ptr
-- stable_ptr_free == &stable_ptr_table[18]
-- stable_ptr_table[18].addr == &stable_ptr_table[19]
freeStablePtr ptr
-- stable_ptr_free == &stable_ptr_table[18]
-- stable_ptr_table[18].addr == &stable_ptr_table[18]
ptr1 <- newStablePtr 1
-- ptr1 == 18
-- stable_ptr_free == &stable_ptr_table[18]
-- stable_ptr_table[18].addr is a pointer to a Haskell value 1
ptr2 <- newStablePtr 2
-- ptr2 == 18
-- stable_ptr_free is a pointer to a Haskell value 1
-- stable_ptr_table[18].addr is a pointer to a Haskell value 2Because stable_ptr_free now points into the Haskell heap
and outside of stable_ptr_table, further allocations of
stable pointers will corrupt Haskell memory and eventually result in
SIGBUS or SIGSEGV.