Implementing and reasoning about hash-consed data structures in Coq

Thomas Braibant · Jacques-Henri Jourdan · David Monniaux

November 2013

T. Braibant · J.-H. Jourdan
Inria Paris-Rocquencourt, Domaine de Voluceau, BP 105, 78153 Le Chesnay, France
E-mail: thomas.braibant@inria.fr, jacques-henri.jourdan@inria.fr

D. Monniaux
CNRS / VERIMAG, Centre Équation, 2, avenue de Vignate, 38610 Gières, France
E-mail: david.monniaux@imag.fr

Abstract We report on four different approaches to implementing hash-consing in Coq programs. The use cases include execution inside Coq, or execution of the extracted OCaml code. We explore the different trade-offs between faithful use of pristine extracted code, and code that is fine-tuned to make use of OCaml programming constructs not available in Coq. We discuss the possible consequences in terms of performance and guarantees. We use the running example of binary decision diagrams and then demonstrate the generality of our solutions by applying them to other examples of hash-consed data structures.

1 Introduction

Hash-consing is a programming technique used to share identical immutable values in memory, keeping a single copy of semantically equivalent objects. It is especially useful in order to get a compact representation of abstract syntax trees. A hash-consing library maintains a global pool of expressions and never recreates an expression equal to one already in memory, instead reusing the one already present. In typical implementations, this pool is a global hash table. Hence, the phrase hash-consing denotes the technique of replacing node construction by a lookup in a hash table returning a preexisting object, or creation of the object followed by insertion into the table if previously nonexistent. This makes it possible to get maximal sharing between objects, if hash-consing is used systematically when creating objects. Moreover, a unique identifier is given to each object, allowing fast hashing and comparisons. This makes it possible to do efficient memoization (also known as dynamic programming): the results of an operation are tabulated so as to be returned immediately when an identical sub-problem is encountered.

Hash-consing and memoization are examples of imperative techniques that are of prime importance for performance. The typical way of implementing hash-consing (a global hash table) does not translate easily into Coq. The reason is that the Gallina programming language at the heart of the Coq proof assistant is a purely applicative language, without imperative traits such as hash tables, pointers, or pointer equality. In the following, we discuss how hash-consing can be implemented using the Coq proof assistant (without modifying it) with two possible use cases in mind:

– execution inside Coq with reasonable efficiency, e.g. for proofs by reflection;
– efficient execution when extracted to OCaml, e.g. for use in a model-checking or static analysis tool proved correct in Coq.

More precisely, we present a few design patterns to implement hash-consing and memoization on various examples, and we use binary decision diagrams (BDDs) as our prime example. We evaluate these design patterns based on several criteria. For instance, we discuss for each pattern how we handle the maximal sharing property and how easy it is to prove the soundness of our implementation.
We also discuss whether or not it is possible to compute inside Coq with a given solution.

Reading this article. In the following, we will try to separate some implementation details or Coq-specific comments from the main text of the article. These remarks, titled “For the practical mind”, can be safely skipped on the first reading.

Outline. This paper is organized as follows. Since hash-consing can be seen as a particular kind of memoization, where one memoizes constructor applications, we begin this paper with a short review of the techniques that are available in Coq in order to memoize functions (§2). Then, we give a short primer on BDDs in §3 and present the “menu” of our four implementations of BDD libraries in §4. These implementations are subsequently described in §5 and §6 and use different design patterns. We discuss the pros and cons of each design pattern in §7. We adapt these design patterns to implement strong reduction of hash-consed lambda-terms in §8. We discuss related work and conclude this paper in §9.

2 Memoization in Coq: state of the art

Generally speaking, there are two ways to use Coq code: either to execute it inside Coq, e.g., to define reflective decision procedures; or to extract it to OCaml (or other supported languages), then compile the extracted code. These two usages impose constraints on the possible design patterns. In the remainder of this section, we describe what the state of the art in memoization approaches is with respect to these two use cases.

2.1 State monad

The first idea to implement memoization in Coq is to use a dedicated data structure to tabulate function calls: that is, use a finite map of some sort to store the pairs (argument, result) for a given function.

Section t.
  Context {A : Type}.

  CoInductive lazy : Type :=
  | Thunk : A → lazy.

  CoInductive lazies : Type :=
  | Cons : lazy → lazies → lazies.

  Fixpoint nth n cst :=
    match n, cst with
    | O, Cons (Thunk xi) _ ⇒ xi
    | S p, Cons _ c ⇒ nth p c
    end.

  CoFixpoint mk_lazy {B} f (n : B) := Thunk (f n).

  CoFixpoint memo’ f n := Cons (mk_lazy f n) (memo’ f (S n)).

  Definition memo f := memo’ f 0.
End t.

(* A costly version of the identity *)
Fixpoint big {A} n : A → A :=
  match n with
  | 0 ⇒ fun x ⇒ x
  | S n ⇒ fun x ⇒ big n (big n x)
  end.

Definition id n := big n n.
Definition k := memo id.
Definition run x := nth x k.

Fig. 1: Memoization using co-inductives

Users of a library implemented in this fashion must, then, thread this state through the program using a state monad. While this solution always works, it has a non-negligible verification cost: we must prove that the bindings in the memoization table are correct, i.e., the values held in the table correspond to the results of the would-be corresponding function calls. Then, if we memoize a function of type A → B, we get a function of type A → M B, where M is the type constructor associated with our state monad. Even if the latter seems equivalent to the former, these types are different: therefore, this tiny modification (memoizing a function) requires modifying every caller of the memoized function to use a monadic style. Since it is cumbersome to perform all computations inside a state monad, we continue this section with a review of other (partial) solutions to the memoization problem.
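Before doing so, the following self-contained sketch makes the overhead concrete. It is our own illustration, not code from the development: a memoized Fibonacci function threads its cache, here a plain association list, explicitly through every call.

```coq
(* Illustration only: memoization with an explicit cache threaded in
   state-passing style.  The cache is a mere association list; any
   finite map would do. *)
Require Import List Arith.
Import ListNotations.

Definition cache := list (nat * nat)%type.

Fixpoint lookup (n : nat) (c : cache) : option nat :=
  match c with
  | [] => None
  | (k, v) :: c' => if Nat.eqb k n then Some v else lookup n c'
  end.

(* The memoized function returns its result together with the updated
   cache: its type is nat -> cache -> nat * cache instead of nat -> nat. *)
Fixpoint fib_memo (n : nat) (c : cache) : nat * cache :=
  match lookup n c with
  | Some v => (v, c)                          (* cache hit *)
  | None =>
    match n with
    | 0 => (0, (0, 0) :: c)
    | S p =>
      match p with
      | 0 => (1, (1, 1) :: c)
      | S m =>
        let (v1, c1) := fib_memo p c in       (* the cache is threaded... *)
        let (v2, c2) := fib_memo m c1 in      (* ...through every call *)
        let v := v1 + v2 in
        (v, (n, v) :: c2)
      end
    end
  end.
```

Every caller of fib_memo must be adapted to pass the cache around and to handle the enriched return type, which is precisely the intrusiveness discussed above.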
2.2 Shallow memoization

It is possible in some cases to make a shallow embedding of this memoization table in Coq. Melquiond [17] describes how to use co-inductive objects as a cache that stores previously computed values of a given function. In this section, we present his clever idea in Fig. 1, with a few cosmetic changes. (We found out that similar definitions actually made their way into Coq’s standard library under the name StreamMemo.v, yet we believe that this idea deserves more explanation than what is available.)

A lazy value is defined as a co-inductive, which will effectively be represented as a thunk when executed by Coq’s virtual machine, and will be extracted to OCaml as a Lazy.t thunk. A thunk is a term whose evaluation is frozen until it is actually needed (lazy evaluation); the computed value is then cached in the thunk so that it is instantly returned in case the term is evaluated again (thus, performing a kind of memoization). Then, lazies represents a stream (an infinite list) of thunks, that can be peeked at using nth. Each time the virtual machine has to destruct the cst argument of nth, it will remember the two arguments of this cons cell. Then, it suffices to build a stream memo that computes the value of a function f for each possible natural number: in practice, f will be evaluated lazily, and only once per input value. As an example, the following snippet of code memoizes the time-consuming function id. (Remark that it also demonstrates that the intermediate calls of id are not memoized.)

Time Eval vm_compute in run 26. (* 6s *)
Time Eval vm_compute in run 26. (* 0s *)
Time Eval vm_compute in run 25. (* 3s *)
Time Eval vm_compute in run 25. (* 0s *)

Time Eval vm_compute in id 25. (* 3s *)
Time Eval vm_compute in id 25. (* 3s *)

Note that this clever trick is not limited to functions over Peano numbers, and could be adapted for branching structures, e.g. memoizing functions over binary numbers. Unfortunately, it seems hard to adapt this idea to memoize recursive calls: consider for instance the function

Fixpoint exp n := match n with 0 ⇒ 1 | S n ⇒ exp n + exp n end.

In order to compute exp using a linear number of recursive calls we need a memoizing fixed-point combinator rather than a way to memoize a predefined function. (To be complete, let us add that functions f : nat → A that can be expressed as the iteration of a given function g : A → A can actually be memoized using the aforementioned technique. While we obviously could rewrite exp to fit into that scheme, this is not the case of the other functions that we use throughout our developments.)

This problem is somewhat representative of the one that we have to face when it comes to BDDs: first, memoizing fixed-point combinators are crucial in our work; second, the function that we memoize depends on the global state of the hash-consing machinery, which evolves over time. In both cases, we fall out of the scope of the above trick: there is no predefined function to memoize.

2.3 Memoization in extracted code

Now, we turn to a partial solution to the memoization problem that works when the user is solely interested in executing the code extracted from a Coq program. Vafeiadis [21] recently introduced a restricted form of mutable state called adjustable references. The idea is that adjustable references store some internal value that can only be updated in innocuous ways. That is, adjustable references are equipped with an observation function, and the update function ensures that the result of the observations remains equal across updates.
It is therefore possible to adjust the values held in the references in a way that changes, e.g., the costs of some computations, yet does not change the result of these computations. Vafeiadis demonstrated how to use adjustable references to memoize a function f by tabulating its results: the adjustable reference’s internal state is a finite map, while the observation function simply returns f. Updating the contents of the finite map does not change the observation function; but subsequent reads may make use of the fact that a value was previously computed and stored inside the memoization cache. We refer the reader to Vafeiadis’ article for more details about this idea.

Remark that it is again the case that this technique does not scale to the definition of memoizing fixed-point combinators. Therefore, adjustable references do not quite fit our needs either.

2.4 Summary

Out of the three memoization techniques available in Coq that we described, only the first one suits our needs: it makes it possible to define memoizing fixed-point combinators. In this paper, we first use this state-monad approach, and then come up with a new idiom.

3 A primer on binary decision diagrams

In the following, we use “reduced ordered binary decision diagrams” (ROBDDs, BDDs for short) as a running example of hash-consed data structures. BDDs are representations of Boolean functions and are often used in software and hardware formal verification tools, in particular model checkers [16].

The data structure. A Boolean function f : {0,1}^n → {0,1} can be represented as a complete binary tree with 2^n − 1 decision nodes, labeled by variables xi according to the depth i from the root (thus the adjective ordered); edges are labeled 0 and 1; leaves are labeled T (for true) or F (for false). The semantics of such a diagram is: f(x1, ..., xn) is obtained by traversing from the root and following the 0 or 1 edge of a node labeled by xi according to the value of xi. Such a tree can be reduced by merging identical subtrees, thus becoming a connected directed acyclic graph (DAG; see second diagram in Fig. 2); furthermore, decision nodes with identical children are removed (see third diagram in Fig. 2). These transformations preserve semantics. The reduced representation is canonical: given a variable ordering x1, ..., xn, a function is represented by a unique ROBDD.

Fig. 2: The BDD for the function f(0,0) = T, f(0,1) = F, f(1,0) = T, f(1,1) = F. (The figure shows three diagrams: the complete decision tree, the DAG obtained by merging identical subtrees, and the fully reduced diagram.)

Building BDDs. In practice, one directly constructs the reduced tree; let us see how. All BDD nodes allocated so far are stored in a global hash table, so that a new object is created only if not already in the hash table: there are never two identical objects at two different locations in memory, a crucial invariant that must be maintained throughout the execution. Each BDD node is given a unique identifier: for instance, the current value of a global 64-bit counter incremented following each allocation. (In certain implementations, the unique identifier is the address of the node; this supposes that objects never change address, which is not the case in OCaml.) Since a new object is never created if an identical object already exists, two objects are equal if and only if they have the same unique identifier. The equality test, which is usually expensive over tree structures, since it requires full traversal, is instead implemented by a very fast comparison of unique identifiers. Unique identifiers are also used for hashing.
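The construction discipline just described can be summed up in a few lines of OCaml. The sketch below is an illustration only, not part of the verified development (whose implementations are the subject of §5 and §6): a global table maps a variable and the children’s unique identifiers to the already-allocated node, and a node is created, and given a fresh identifier, only on a miss.

```ocaml
(* Illustration only: the usual imperative recipe for hash-consed,
   reduced BDD nodes. *)
type bdd =
  | True
  | False
  | Node of { uid : int; var : int; low : bdd; high : bdd }

let uid = function True -> 0 | False -> 1 | Node { uid; _ } -> uid

(* Global pool: nodes are keyed by (variable, uid of low, uid of high). *)
let table : (int * int * int, bdd) Hashtbl.t = Hashtbl.create 251
let next = ref 2

let mk_node var low high =
  if uid low = uid high then low          (* keep the diagram reduced *)
  else
    let key = (var, uid low, uid high) in
    match Hashtbl.find_opt table key with
    | Some node -> node                    (* maximal sharing: reuse *)
    | None ->
        let node = Node { uid = !next; var; low; high } in
        incr next;
        Hashtbl.add table key node;
        node
```

Note that the key involves only the children’s identifiers, not the children themselves; this is the shallow hashing described next.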
When attempting to construct a node (v, b0, b1), where v is the variable, b0 is the subtree labeled 0 and b1 is the subtree labeled 1, one computes the hash value of the node as H(v, u0, u1), where u0 and u1 are the respective unique identifiers of the subtrees b0 and b1. This hash value is then used to look up the node in the hash table. Again, such shallow hashing is considerably faster than having to traverse the tree structures.

Operations on BDDs. The principal way to build BDDs is to combine the diagrams of two functions f and g in order to obtain the BDD for other functions such as f ∧ g, f ∨ g or f ⊕ g. The basic combinator of BDDs is called node melding [16]. Intuitively, melding corresponds to a traversal of the internal nodes of two BDDs to build a third diagram. This traversal respects the order on variables, and thus, the third diagram will naturally be ordered. Then, all binary operations on BDDs are defined as one particular instance of melding. Suppose that we have two nodes α = (v, l, h) and α′ = (v′, l′, h′). The melding of α and α′, denoted by α ⋄ α′, is defined as follows:

α ⋄ α′ = (v, l ⋄ l′, h ⋄ h′)    if v = v′
         (v, l ⋄ α′, h ⋄ α′)    if v < v′
         (v′, α ⋄ l′, α ⋄ h′)   if v > v′

Then, the binary operation ⋄ is entirely defined by the results of F ⋄ α, α ⋄ F, T ⋄ α and α ⋄ T. For instance, the conjunction operation f ∧ g can be defined by melding the BDDs for f and g, and using the following rewrite rules for the leaf cases:

F ⋄ α → F    α ⋄ F → F    T ⋄ α → α    α ⋄ T → α

For the sake of clarity, we focus on binary operations in the following: they are representative of the difficulties we had to face. (Note that we only need two rewrite rules for commutative binary operations.)

Memoization. The fact that each node has a unique identifier also makes it possible to memoize the results of BDD operations. One keeps a map from sub-problems (a pair of nodes α and α′) to nodes (the result of α ⋄ α′), so as to be returned immediately when a previously solved sub-problem is encountered. In a BDD library, memoization is crucial to implement the or/and/xor operations with time complexity in O(|a| · |b|), where |a| and |b| are the sizes of the inputs: as the function is executed, its results on the subtrees of the original problem are stored in a structure indexed by (ua, ub), where ua and ub are the unique identifiers of the a and b inputs. In contrast, the naive approach has exponential complexity, since the function may be evaluated exponentially often on the same couple of subtrees.

4 Implementing BDDs in Coq

The following two sections §5 and §6 describe a total of four implementations of BDDs in Coq. To make things clear for the reader, we give each of these implementations a reference name, a pointer to the relevant section of the paper and a short description.

pure-deep. See §5.1. A pure Coq implementation of BDDs that makes a deep embedding of memory as finite maps and uses indices as surrogates of pointers.

pure-shallow. See §5.2. A pure Coq implementation of BDDs that uses a shallow embedding of memory.

smart. See §6.1. An “impure” implementation of BDDs in Coq: we implement hash-consing and memoization through the extraction mechanism of Coq.

smart+uid. See §6.2.
A variation on the previous approach in which we discuss how to expose and axiomatize the operations on the unique identifiers associated with BDD nodes. 5 Pure solutions As the reader can gather from the summary above, we describe two pure Coq implementations of BDDs libraries in this section, that differ mainly by the way we model the memory and the allocation of nodes. 5.1 The pure-deep approach The idea here is to model the memory using finite maps inside Coq and use indices in the maps as surrogates for pointers, implementing BDD operations as manipulations of these persistent maps. Such an implementation4 was described in [22, 23], but we propose a new one here for the sake of completeness, and because the old one did not age well w.r.t. the evolution of Coq. Our implementation is defined as follows. First, we assign a unique identifier to each decision node. Second, we represent the directed acyclic graph underlying a BDD as a Coq finite map from identifiers to decision nodes (that is, tuples that hold the left child, the node variable and the right child). For instance, the following graph, on the left, can be represented using the map on the right. x1 " x2 %  | F " x3 ! 1 2 3 7→ 7 → 7 → (F, x1 , N 2) (F, x2 , N 3) (F, x3 , T) T Then, we implement the hash-consing pool using another map from decision nodes to node identifiers and a next counter that is used to assign a unique identifier to a fresh node. In the situation above, next is equal to 4 and the hash-consing map is defined as follows 4 One should also mention the seminal work by E. Ledinot in 1993 on the canonicity of binary decision dags, available from the Coq contributions. 8 Thomas Braibant et al. (F, x1 , N 2) (F, x2 , N 3) (F, x3 , T) 7→ 7→ 7 → 1 2 3 Notations. Through this paper, we stay as close as possible to our Coq code. Yet we allow us some liberty when it comes to finite maps (that are pervasive in our code): the notation t s denotes a type of efficient maps from type t to type s; and we use indiscriminately the notations get and set that respectively access and update such finite maps. Implicitly, these two notations have the following types: get : A → (A B) → option B set : A → B → (A B) → (A B) (Note that we do use the efficient finite maps from the Coq standard library; but we chose here to abstract from the particular module definitions and names that we have to use in our code, for the sake of legibility.) For the practical mind. In the following, we use Coq’s positive numbers positive as unique identifiers, and use Patricia trees for finite maps.  Fig. 3 shows our inductive definitions and the code of the associated allocation function mk_node, knowing that alloc n st allocates the fresh node n in the hashconsing state st (taking care of updating both finite maps and incrementing the “next fresh” counter). Then, equality between BDDs (eqb) is provided by decidable equality over node identifiers. All operations thread the current global state in a monadic fashion. The correctness of BDD operations corresponds to the facts that 1) the global state is used in a monotonic fashion (that is the structure of the resulting global state is a refinement of the input one, see Fig. 4); 2) the resulting global state is well-formed; 3) the denotation of the resulting BDD expression is correct. Invariants and well-formedness properties. We present the well-formedness properties we preserve over expressions (wf_expr) and over hash-consing states (wf_hashcons) in Fig. 5. 
The inductive predicate wf_expr st v e depends on a variable level v that indicates a bound over the variable identifiers appearing in the BDD e. This ensures that variable are kept ordered. Then, a well-formed expression is either a leaf (cases wf_T and wf_F) or a pointer to a hash-consing node with a head variable w less than the bound v. (Remark that this definition of well-formed expression is not directly recursive, because the actual recursion takes place in the definition of wf_hashcons below; this notion of well-formed expressions coupled with well-formedness of the state suffice to prove that expressions are DAGs with a suitable ordering on the variables.) The predicate wf_hashcons is the general well-formedness property over hashconsing states. The predicate wf_bijection ensures that the two maps hmap and graph form a bijection; wf_expr_lt_next ensures that the next fresh variable counter will indeed produce fresh variables; wf_map_wf_expr_l and wf_map_wf_expr_r ensure that a node stored in the hash-consing structure has well-formed children, respecting the variable order; wf_reduced ensures that all those nodes are reduced (ie. their left child is different from their right child). Note that the statement that the hash-consing structure is correct corresponds to wf_bijection and wf_expr_lt_next. The other statements correspond to properties about BDDs, namely the facts that they are reduced and ordered. Implementing and reasoning about hash-consed data structures in Coq 9 Inductive expr := F | T | N : positive → expr. Definition eqb a b := match a,b with | T, T ⇒ true | F,F ⇒ true | N x, N y ⇒ (x =? y) | _, _ ⇒ false end. Definition node := (expr ∗ var ∗ expr). Record hashcons_state := { graph : positive node; hmap : node positive; next : positive }. Definition alloc u st := (st.( next), {| graph := set st.(next) u st.( graph); hmap := set u st.( next) st.( hmap); next := st.( next) + 1 |}) . Definition mk_node (l : expr) (v : var) (h : expr) st := if eqb l h then (l, st) else match get (l,v,h) st.( hmap) with | Some x ⇒ (N x, st) | None ⇒ let (x, st) := alloc (l,v,h) st in (N x, st) end. Fig. 3: Hash-consing in pure Coq, using a deep-embedding Record incr (st1 st2 : hashcons_state) : Prop := { incr_lt : st1.(next) ≤ st2.(next); incr_find : ∀ p x, get p st1.(graph) = Some x → get p st2.(graph) = Some x }. Fig. 4: Monotonicity predicate over hash-consing states The denotation of BDD expressions. Recall that BDD expression, as we have defined them in this section, do not have an inductive structure we can follow: this means that we cannot define the semantics of BDD expressions as a Coq fixed-point. Rather, we have to define the semantics of BDD expressions as an inductive predicate that uses the state of the hash-consing data structure to go through the graph. The inductive value is shown on Fig. 6. It is defined as a binary relation with two parameters (the valuation of variables env and the state of the hash-consing structure st) and two arguments: the expression and its denotation (a Boolean). For the practical mind. We only use the type of environments env to establish a semantics, thus efficiency is not an issue. However, we prefer to use finite maps of some kinds to represent environments: using a function of type var → bool would force the client to reason about the fact that some BDD expressions are insensitive to the valuation of some variables. Using a finite map make this reasoning internally in our library and eases the client’s life.  10 Thomas Braibant et al. 
Inductive wf_expr st : var → expr → Prop := | wfe_T : ∀ v, wf_expr st v T | wfe_F : ∀ v, wf_expr st v F | wfe_N : ∀ (p : positive) (l : expr) (w: var) (h : expr), get p st.( graph) = Some (l, w, h) → ∀ (v : var), (w < v) → wf_expr st v (N p). Record wf_hashcons (st : hashcons_state) : Prop := { wf_bijection : ∀ p n, get n st.( hmap) = Some p ↔ get p st.(graph) = Some n; wf_expr_lt_next : ∀ p v, wf_expr st v (N p) → p < st.(next); wf_map_wf_expr_l : ∀ p x v y, get p st.( graph) = Some (x, v, y) → wf_expr st v x; wf_map_wf_expr_h : ∀ p x v y, get p st.( graph) = Some (x, v, y) → wf_expr st v y; wf_reduced : ∀ p l v h, get p st.( graph) = Some (l, v, h) → l <> h }. Fig. 5: Well-formedness of hash-consing state Inductive value env st : expr → bool → Prop := | value_T : value env st T true | value_F : value env st F false | value_N : ∀ (p : positive) (l : expr) (v : var) (h : expr), get p st.( graph) = Some (l, v, h) → ∀ (vv : bool), get env v = Some vv → ∀ (vhl : bool), value env st (if vv then h else l) vhl → value env st (N p) vhl. Fig. 6: The denotation of BDD expressions Termination of BDD operations. The first problem that we have to solve in our Coq representation is that, as can be expected from our data structure, BDD operations cannot be defined using structural recursion (there is no inductive structure to follow). Unfortunately, we cannot easily use well-founded recursion here because the well-founded relation involves both parameters of the function and the global state. The problem is that the termination of the BDD operations relies on the fact that the graph of nodes is acyclic; but the graph is not fixed through an execution of the melding operation! Rather, the global state is threaded in the recursive calls of . Therefore, proving the recursive calls to be well-founded requires to prove that the  operation is monotonic w.r.t. graph inclusion. In order to prove termination, we would have to define the fixed-point and prove this monotonicity property at the same time. This would involve embedding invariants directly in the global state, using dependent types. This has two major drawbacks: first, defining terms containing complex dependent types is cumbersome and proving properties about them is often very challenging. Second, we would have to pay close attention to prevent Coq from normalizing those proof terms if we want to use this library for reflection purposes. In the end, we resorted to define partial functions that use a fuel argument to ensure termination: that is, they use an explicit bound on the number of iterations to do. Implementing and reasoning about hash-consed data structures in Coq Record memo := { mand : (positive ∗ positive) mor : (positive ∗ positive) mxor : (positive ∗ positive) mneg : positive expr}. expr; expr; expr; 11 Record state := { ... :> hashcons_state; ... :> memo}. Fig. 7: Adding memoization tables to the global state Record wf_memo2 (st : hashcons_state) (m : (positive ∗ positive) expr) (opb : bool → bool → bool) := { wf_memo2_find_wf_res : ∀ x y e, get (x, y) m = Some e → wf_expr st v (N x) → wf_expr st v (N y) → wf_expr st v e; wf_memo2_find_wf : ∀ x y e, get (x, y) m = Some e → ∃ v, wf_expr st v (N x) ∧ wf_expr st v (N y); wf_memo2_find_sem : ∀ x y res, get (x, y) m = Some res → ∀ env vx vy, value env st (N x) vx → value env st (N y) vy → value env st res (opb va vb) }. Record wf_memo_neg (st : hashcons_state) (m : positive expr) := { ... }. 
Record wf_memo (st : state) := { wf_memo_mand : wf_memo2 st st.(mand) Datatypes.andb; wf_memo_mor : wf_memo2 st st.(mor) Datatypes.orb; wf_memo_mxor : wf_memo2 st st.(mxor) Datatypes.xorb; wf_memo_mneg : wf_memo_neg st st.(mneg) }. Record wf_st (st : state) : Prop := { ... : wf_hashcons st; ... : wf_memo b }. Fig. 8: Invariant over the memoization information Memoizing operations. Finally, it is possible to enrich our hash-consing structure with memoization tables in order to tabulate the results of BDD operations. The memoization tables (see Fig. 7) are passed around by the state monad, just as the hash-consing structure. It is then necessary to maintain invariants on the memoization information. For the practical mind. In the definition of state, we use two field coercions (denoted :>) to declare coercions from the type state to the types hashcons_state and memo. In practice, this means that one can use a state in any expression that would expect the result of one of these two projections.  We present the invariants over the memoization maps for the binary operations5 in Fig. 8. First, we have to prove that the nodes referenced in the domain and in the codomain of those tables are well-formed and that those tables keep the bounds over the variables correct. Then, we have to state that the memoization information is semantically correct. One downside of this data structure definition is that we are forced to define one table per operation that we want to memoize, and that 5 The case of the negation operation is similar and is not detailed. 12 Thomas Braibant et al. this is not modular: adding a new operation requires modifying the definition of the memoization state and add the corresponding field in the wf_memo record. (Note that finding the correct pattern of memoization for a program is still an art rather than a science: using the data structure above, we keep the memoized values from one run of an operation to the other. In this section, we will settle on this conservative strategy, but other memoization strategies are possible that yield different performance and memory consumption profiles.) A mouthful of code. The final version of our code is shown on Fig. 9. We use a do-notation à la Haskell to make it more palatable. Then, we prove that under some hypotheses, this combinator is correct: that is, it produces well-formed hash-consing structures and memoization tables, and the denotation of the resulting expression is correct. For the sake of clarity, we will not expose these hypotheses nor the resulting correctness theorem in this paper, and refer the interested reader to the archive of code that accompanies this paper [5]. Implementing the and operation. However, we demonstrate the use of this combinator on the particular example of the and function; all other binary operations follow the same pattern. First, we have to define a function upd_and that updates the memoization state: it is simply a wrapper that add an element to the right memoization table, and leave the others untouched. Then, we define the function mk_and as a simple call to the binary combinator. The crux here is the choice of the functions Fx and Tx that specify the behavior of the combinator at the leaves of the DAG. Definition upd_and na nb res (st : state) := mk_state st {| mand := set (na,nb) res st.( mand); mor := st.( mor); mxor := st.(mxor); mneg := st.(mneg) |}) . 
Definition mk_and := combinator (fun x st ⇒ Some (F,st) ) (* Fx *) (fun x st ⇒ Some (x,st) ) (* Tx *) (fun a b st ⇒ get (a,b) st.(mand)) upd_and. Then, the semantic correctness theorem for this operation is defined as follows. Theorem mk_and_correct depth v0 (st: state) a b : wf_st st → wf_expr st v0 a → wf_expr st v0 b → ∀ res st’, mk_and depth a b st = Some (res, st’) → wf_st st’ ∧ incr st st’ ∧ wf_expr st’ v0 res ∧ ∀ env va vb, value env st a va → value env st b vb → value env st’ res (va && vb). This statement proves three things under the hypotheses that the given state and BDDs are well-formed : first, mk_and returns a well formed state. Second, it manipulates the state in an increasing manner (and, in particular, previously well-defined BDDs will still be well-defined and keep their semantic values). Third, the returned BDD have the expected semantic interpretations. Implementing and reasoning about hash-consed data structures in Coq Section combinator. (* fx is the base case when one operand is F *) Variable fx : expr → state → option (expr ∗ state). (* tx is the base case when one operand is T *) Variable tx : expr → state → option (expr ∗ state). Variable memo_get : positive → positive → state → option expr. Variable memo_update : positive → positive → expr → state → state. Definition memo_node a b l v h st := let (res, st) := mk_node l v h st in let st := memo_update a b res st in (res,st). Fixpoint combinator depth : expr → expr → state → option (expr ∗ state) := fun a b st ⇒ match depth with | 0 ⇒ None | S depth ⇒ match a,b with | F, _ ⇒ fx b st | _, F ⇒ fx a st | T, _ ⇒ tx b st | _, T ⇒ tx a st | N na, N nb ⇒ match memo_get na nb st with | Some p ⇒ Some (p,st) | None ⇒ do nodea ← get na st.(graph); do nodeb ← get nb st.(graph); let ’( l1,v1,h1) := nodea in let ’( l2,v2,h2) := nodeb in match Pos.compare v1 v2 with | Eq ⇒ do x, st ← combinator depth l1 l2 st; do y, st ← combinator depth h1 h2 st; Some (memo_node na nb x v1 y st) | Gt ⇒ do x, st ← combinator depth l1 b st; do y, st ← combinator depth h1 b st; Some (memo_node na nb x v1 y st) | Lt ⇒ do x, st ← combinator depth a l2 st; do y, st ← combinator depth a h2 st; Some (memo_node na nb x v2 y st) end end end end. End combinator. Fig. 9: Deep combinator for binary operations 13 14 Thomas Braibant et al. Canonicity. We can prove that this representation of BDDs is canonical: that is, well-formed equivalent expressions are mapped to the same nodes. Definition equiv st e1 e2 := ∀ env v1 v2, value env st e1 v1 → value env st e2 v2 → v1 = v2. Lemma canonicity st v e1 e2 : wf_st st → wf_expr st v e1 → wf_expr st v e2 → equiv st e1 e2 → e1 = e2. From this result, it follows that the (non-recursive) eqb function from Fig. 3 is a correct and complete characterization of semantic equivalence of expressions. Lemma eqb_correct st v e1 e2 : wf_st st → wf_expr st v e1 → wf_expr st v e2 → (eqb e1 e2 = true ↔ equiv st e1 e2). Garbage collection. In the above version of BDDs, we have not implemented garbage collection. That is, allocated nodes are never destroyed, until the allocation map becomes unreachable as a whole. Garbage collection could be added, e.g., using a stop and copy operation that preserve a set of roots. This is beyond the scope of this paper. 5.2 The pure-shallow approach The previous implementation uses a deep embedding of the representation of the BDD in memory via the graph map. 
This is a natural way to encode a directed acyclic graph, but, as we saw, makes it difficult (if not unfeasible) to deal properly with termination. Therefore, we would like to be able to reason about BDDs as if they formed an inductive type, while keeping the ability to share sub-terms at runtime. There is actually no need to look further than inductive types to do that. The standard intuition about inductive types is that they define the smallest type closed under application of constructors: the mental image that we get from that is a tree. Yet, there is nothing that prevent us to use the system to share sub-terms.6 In this section, we demonstrate that we can encode BDDs in Coq using a representation that looks like binary decision trees, yet has runtime performances similar to the pure-deep implementation (see §5.1), using explicit sharing. We present on Fig. 10 our inductive definitions, and the associated allocation function mk_node. This has to be compared with Fig. 3: the difference mainly lies in the deletion of the graph map, and the explicit recursive structure of the expr type. For the practical mind. The reader may wonder why we chose to define expr as two mutual inductive types expr and hc_expr. Indeed, we explain in §8.3 that it is possible to inline hc_expr inside the definition of expr. The mutual inductive solution is inspired by the hash-consing library in OCaml by Conchon and Filliâtre [9]. This 6 Such an approach may be interesting even if each structure is a tree, but different trees can share sub-trees. For instance, the Astrée static analyzer implements maps as balanced binary trees with opportunistic sharing [3, §6.2]: e.g. when an idempotent operation f (x, x) = x is to be performed on the images of two maps g and h, returning the map x 7→ f (g(x), h(x)), then if g and h are determined to be identical through pointer equality, g is directly returned. Implementing and reasoning about hash-consed data structures in Coq 15 Inductive expr := | F | T | N : hc_expr → var → hc_expr → expr with hc_expr := HC : expr → positive → hc_expr. (* Two extra definitions that are used as coercions in the following code*) Definition unhc (e : hc_expr) := let ’HC res _ := e in res. Definition ident (e : hc_expr) := let ’HC _ res := e in res. Definition eqb a b := match a,b with | T, T | F, F ⇒ true | N (HC _ id1) x (HC _ id2), N (HC _ id1’) x’ (HC _ id2’) ⇒ (id1 =? id1’) && (x =? x’) && (id2 =? id2’) | _, _ ⇒ false end. Record hashcons_state := {hmap : expr hc_expr; next : positive}. Definition alloc u st := let r := HC u b.( next) in (r, {| hmap := set u r b.( hmap); next := b.(next) + 1|}). Definition hc_node (e : expr) (st : hashcons_state) := match get e st.(hmap) with | Some x ⇒ (x, st) | None ⇒ alloc e st end. Definition mk_node (l : hc_expr) (v : var) (h : hc_expr) st := if (ident l =? ident h) then (l,st) else hc_node (N l v h) st. Fig. 10: A shallow-embedding of sharing library makes a clear distinction between hash-consing nodes (the equivalent of our hc_expr inductive) and the actual values that are hash-consed. In Coq, this makes some inductive proofs a little bit more involved: we need to use mutual induction on the two data-types.  The code of the hc_node function is subtle: a call to hc_node e bdd will perform a lookup in the hash-consing map hmap: if the same expression was previously allocated, then we return the old version; otherwise, we update the map hmap with a mapping from the expression to its hash-consed version. 
The lookup ensures that equal expressions (modulo the comparison function used to index hmap) are mapped to the same hash-consed expression. Then, mk_node l v h bdd will first test the identifiers of l and h for equality: if it is the case, then there is no need to introduce a new node; otherwise, we perform a call to hc_node. As an example, assume that x and y are expressions with different identifiers. The following code let (a, st) := mk_node x v y st in let (b, st) := mk_node x v y st in let (c, st) := mk_node a v’ b st in ... will make a, b and c point to the same memory location! However, if x and y were not shared maximally, then neither are a, b nor c. 16 Thomas Braibant et al. A word on memoization. There is no difference at all in the way we handle memoization in this implementation w.r.t. §5.1. That is, we implement the same state record as before; and pass the same memoization tables around. A word on termination. It is now easier to define recursive functions that operate on BDDs by taking advantage of the inductive definition of expr. We have to stress that easier is not easy because the Coq termination checker requires that recursive calls are made on a structurally smaller argument: there is no built-in support for recursive definitions with pairs of arguments that are decreasing w.r.t. a lexicographic order. Therefore, we have to use nested fixed-points or prove that the recursive calls are well-founded. In this section, we choose the former. Re-implementing the combinator. We are now ready to describe the code of the implementation of the binary combinator presented in Fig. 11. The code is similar to the one in Fig. 9 with a few key differences: thanks to the inductive definition of expressions, we do not have to perform lookups in the graph map and we do not use fuel anymore. This makes the combinator function total ; and we can get rid of the Maybe monad. Canonicity. Again, we prove that this representation of BDDs is canonical: well- formed equivalent expressions are mapped to the same nodes. Again, we have the corollary that the (non-recursive) equality test from Fig. 10 that inspects the (top-level) node identifiers is a complete and correct characterization of semantic equivalence. Definition equiv e1 e2 := ∀ env v1 v2, value env e1 v1 → value env e2 v2 → v1 = v2. Lemma eqb_correct st e1 e2 v : wf_st st → wf_expr st v e1 → wf_expr st v e2 → (eqb e1 e2 = true ↔ equiv e1 e2). Comparison with the previous approach. The implementation presented in this sec- tion is derived from the previous one, with the following improvements: the proofs are roughly 20% shorter; the performances are slightly better when executing the code inside Coq (there is less administrative book-keeping to do in our data structures); the functions that operates on BDDs are total. Furthermore, it would probably be easier to implement garbage collection in this setting than in the previous one, thanks to the simpler definition of the global state. The situation is almost ideal for the equality test: we prove that the (nonrecursive) equality function that inspects the top-level identifiers of nodes is a correct and complete characterization of semantics equivalence of BDD expressions. However, we have no way to prove that it corresponds to physical equality. Actually, we cannot state within Coq that it is never the case that two identical representations of the same term coexists, even if we could argue at a meta-level that it is indeed not the case. 
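For completeness, here is how conjunction can be obtained from this combinator, mirroring mk_and from §5.1. This is a sketch of ours rather than code from the development: it assumes a memoization record whose mand table now stores hc_expr values, the corresponding update function upd_and, and a helper hc_node_st that lifts hc_node to the full state (all three are assumptions of this sketch).

```coq
(* A plausible instantiation of the shallow combinator for conjunction
   (illustration only; hc_node_st, mand and upd_and are assumed). *)
Definition mk_and :=
  combinator
    (fun _ st => hc_node_st F st)        (* when one operand is F, the result is F *)
    (fun x st => (x, st))                (* when one operand is T, the result is the other operand *)
    (fun a b st => get (a, b) st.(mand)) (* lookup in the memoization table *)
    upd_and.                             (* record the result of a sub-problem *)
```

The other binary operations differ only in the two leaf cases and in the memoization table they use, exactly as in the deep version.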
Implementing and reasoning about hash-consed data structures in Coq 17 Section combinator. Variable fx : hc_expr → state → hc_expr ∗ state. Variable tx : hc_expr → state → hc_expr ∗ state. Variable memo_get : positive → positive → state → option (hc_expr). Variable memo_update : positive → positive → hc_expr → state → state. Fixpoint combinator : hc_expr → hc_expr → state → hc_expr ∗ state := fun a ⇒ fix combinator_rec b st := match unhc a, unhc b with | F, _ ⇒ fx b st | _, F ⇒ fx a st | T, _ ⇒ tx b st | _, T ⇒ tx a st | N l1 v1 h1, N l2 v2 h2 ⇒ let ida := ident a in let idb := ident b in match memo_get ida idb st with | Some p ⇒ (p,st) | None ⇒ let ’( res, st) := match Pos.compare v1 v2 with | Eq ⇒ let ’( x, st) := combinator l1 l2 st in let ’( y, st) := combinator h1 h2 st in mk_node x v1 y st | Gt ⇒ let ’( x, st) := combinator l1 b st in let ’( y, st) := combinator h1 b st in mk_node x v1 y st | Lt ⇒ let ’( x, st) := combinator_rec l2 st in let ’( y, st) := combinator_rec h2 st in mk_node x v2 y st end in let ’( _, st) := memo_update ida idb res st in (res, st) end end. End combinator. Fig. 11: Shallow combinator for binary operations 6 From pure data structures to persistent data structures via extraction In the previous section, we use a state monad to store information about hashconsing and memoization. However, one can see that, even if these programming constructs use a mutable state, they behave transparently with respect to the pure Coq definitions. We have seen earlier (see §2.3) that if we abandon (efficient) executability inside Coq, we can express new idioms. In the following, we implement the BDD library in Coq as if manipulating decision trees with neither sharing, nor hash-consing tables, nor memoization tables, then add the hash-consing and memoization code by using special features of the extraction mechanism to remap some constants arbitrarily to custom OCaml code. 18 Thomas Braibant et al. Inductive bdd: Type := | T | F | N : var → bdd → bdd → bdd. Extract Inductive bdd ⇒ "bdd" ["hT" "hF" "hN"] "bdd_match". (a) BDDs in Coq as decision trees type bdd = | T of tag | F of tag | N of positive ∗ bdd ∗ bdd ∗ tag module HCbdd = Hashcons.Make(...) let hT = HCbdd.hashcons (T Weakhtbl.dummy_tag) let hF = HCbdd.hashcons (F Weakhtbl.dummy_tag) let hN (v, t, f) = HCbdd.hashcons (N (v, t, f, Weakhtbl.dummy_tag)) let bdd_match fT fF fN t = match t with | T _ → fT () | F _ → fF () | N (v, t, f, _) → fN v t f (b) Hash-consed OCaml BDD type Fig. 12: Implementing BDDs in Coq, extracting them using smart constructors 6.1 The smart approach More precisely, we define our BDDs as binary decision trees (see Fig. 12a), and implement operations in Coq on this simple data structure. Then, we tell Coq to extract the bdd inductive type to a custom bdd OCaml type (see left of Fig. 12b) and to extract constructors into smart constructors that maintain the maximal sharing property. The type defined in OCaml is identical to the Coq one, except that it carries one extra field of type tag, morally containing the associated unique identifier. The smart constructors make use of the hash-consing library used in Why3 [20], a recent version of a library by Conchon and Filliâtre [9]. It defines the Hashcons.Make functor, that we instantiate. The generated module provides a HCbdd.hashcons function that returns a unique hash-consed representative for the parameter. 
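The functor argument is elided in Fig. 12b; the sketch below shows what a plausible instantiation could look like. It is an illustration of ours rather than the code of the development, and it assumes the interface of the Conchon–Filliâtre library as packaged in Why3 (a HashedType with equal, hash and tag, plus Weakhtbl.create_tag and Weakhtbl.tag_hash); the exact names may differ between versions of the library. The important point is that equality and hashing are shallow: they look only at the variable and at the children’s tags, and never traverse the subtrees.

```ocaml
(* Illustration only: a plausible argument for Hashcons.Make.  All
   functions work one node level deep, relying on the children being
   already hash-consed. *)
module HCbdd = Hashcons.Make (struct
  type t = bdd

  let tag_of = function T t | F t | N (_, _, _, t) -> t

  (* shallow equality: children are compared by physical equality *)
  let equal b1 b2 = match b1, b2 with
    | T _, T _ | F _, F _ -> true
    | N (v1, t1, f1, _), N (v2, t2, f2, _) -> v1 = v2 && t1 == t2 && f1 == f2
    | _, _ -> false

  (* shallow hashing: combine the variable with the children's tags *)
  let hash = function
    | T _ -> 0
    | F _ -> 1
    | N (v, t, f, _) ->
        Hashtbl.hash (v, Weakhtbl.tag_hash (tag_of t), Weakhtbl.tag_hash (tag_of f))

  (* store the unique identifier chosen by the hash-consing library *)
  let tag n = function
    | T _ -> T (Weakhtbl.create_tag n)
    | F _ -> F (Weakhtbl.create_tag n)
    | N (v, t, f, _) -> N (v, t, f, Weakhtbl.create_tag n)
end)
```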
The reader may notice that we choose to name bdd in Coq what is clearly a representation of a binary decision tree, and which corresponds to what was previously named expr. We believe that this particular choice of name makes sense if we consider values of type bdd to represent Boolean functions. Discussion: the status of the equality test. In Coq, we define the obvious recursive function bdd_eqb of type bdd → bdd → bool, that decides structural equality of BDDs. Then, we extract this function into OCaml’s physical equality: Fixpoint bdd_eqb (b1 b2:bdd): bool := match b1, b2 with | T, T | F, F ⇒ true | N v1 b1t b1f, N v2 b2t b2f ⇒ Pos.eqb v1 v2 && bdd_eqb b1t b2t && bdd_eqb b1f b2f | _, _ ⇒ false end. Extract Inlined Constant bdd_eqb ⇒ "(==)". From a meta-level perspective, we argue that the two are equivalent thanks to the physical unicity of hash-consed structures, provided that all values are constructed using our smart-constructors, which is the case if we create all nodes from code extracted from Coq (of course, handwritten OCaml code may break this Implementing and reasoning about hash-consed data structures in Coq 19 invariant). The key point here is that the way we build terms enforces the fact that the BDDs they build are maximally shared. Keeping the BDD reduced. One could be tempted to put in the OCaml code of the smart constructor hN a test that would enforce that BDDs are reduced. That is, it would not build a node if its two children were identical. let hN (v,t,f) = if t == f then t else HCbdd.hashcons (N (v, t, f, Weakhtbl.dummy_tag)) To see the problem with this idea, consider the following Coq code. match N (v, T, T) with | T ⇒ false | N (_, _, _) ⇒ true end In Coq, the above expression reduces to true while the extracted version would reduce to false. Therefore, this idea is wrong: avoiding the node construction makes subsequent case analysis behave inconsistently between Coq and OCaml. In the end, the correct way to implement reduction is to use the following helper function, written in Coq, that builds a node only when necessary. Definition N_check (v : var) (bt bf : bdd) : bdd := if eqb bt bf then bt else N v bt bf. Note that the user is forced to use this function instead of using the N constructor7 . Failing to do so will result in code that is correct, but does not build reduced binary decision diagrams. More precisely, extracting eqb to == would make it possible to prove that BDD operations are semantically correct, but this would make diagrams non-canonical: well-formed expressions could be represented by several different nodes. Implementing the combinator. The last ingredient needed to transform a decision tree library into a BDD library is memoization. We use the same kind of ideas: we define our functions as if not memoized, but we use a special well-founded fixed-point combinator, that we extract to a memoizing fixed-point combinator. The details can be seen on Fig. 13: we declare an abstract type class memoizer of types A such that we know how to memoize functions of type ∀ x : A, P x. This is extracted in OCaml to the type of polymorphic fixed-point combinators, with an extra technical int parameter used to specify the initial size of the used hash map. In Coq, we then define a memoizing combinator memo and a memoizing fixed-point combinator memo_rec as if they were not using memoization, but we ask the extraction mechanism to map them to special OCaml functions, that make use of the type class instance given as parameter. 
These functions are observationally equivalent to the Coq ones, provided that the type class instance is correct, and that the memoized function is pure. It is worth noting that directly exposing the type ’key memoizer in Coq would be unsound, because this allows to use its instances in a non-terminating manner: 7 Still, the N constructor that occurs in the definition N_check must be extracted using a smart-constructor. 20 Thomas Braibant et al. Parameter memoizer : ∀ A : Type, Type. Existing Class memoizer. Extract Constant memoizer "’key" ⇒ "’key Helpers_common.memoizer". Definition memo A {H : memoizer A} P := @id (∀ x : A, P x). Extract Inlined Constant memo ⇒ "Helpers_common.memo". Arguments memo [A] {H} [ P] _ _. (* Set implicit arguments for memo*) Definition memo_rec A {H : memoizer A} := @Coq.Wf.Fix A. Extract Inlined Constant memo_rec ⇒ "Helpers_common.memo_rec". Arguments memo_rec [A] {H} [R] Rwf [ P] F x. (* Set implicit arguments for memo_rec *) (a) Coq part type ’key memoizer = { memo : α. int → ((’key → α) → (’key → α)) → (’key → α) } let memo m f = m.memo 5 (fun _ x → f x) let memo_rec m f = m.memo 5 (fun frec x → f x (fun y _ → frec y)) (b) OCaml part Fig. 13: Memoizing combinators Parameter memoizer_N : memoizer N. Existing Instance memoizer_N. Extract Inlined Constant memoizer_N ⇒ "Helpers_common.memoizer_N". (a) Coq part module NHT = Hashtbl.Make (struct type t = coq_N let equal = (=) let hash = N.to_int end) let memoizer_N = { memo = fun n f → let h = NHT.create n in let rec aux x = try NHT.find h x with Not_found → let r = f aux x in NHT.replace h x r; r in aux } (b) OCaml part Fig. 14: An instance of memoizer for Coq. Numbers.BinNums.N the memo_rec wrapper makes sure the recursive calls are well-founded. Moreover, it is important to understand that we have not axiomatized these combinators. Instead, we give real implementations, semantically equivalent to their OCaml counterparts. This is important, because it then becomes clear that we do not introduce any logical inconsistencies in Coq, and these terms keep a computational content which could be very useful in proofs. This part of the code is modular, and can be used to memoize functions of any domain. It is up to the user to give instances of the memoizer type class: to do so, he can provide dedicated code, as shown in Fig. 14 for the type N of binary Implementing and reasoning about hash-consed data structures in Coq 21 Definition bdd_and : bdd → bdd → bdd. refine (memo_rec (well_founded_ltof _ bdd_size) (fun x ⇒ match x with | T ⇒ fun _ y ⇒ y | F ⇒ fun _ _ ⇒ F | N xv xt xf ⇒ fun recx ⇒ memo_rec (well_founded_ltof _ bdd_size) (fun y ⇒ match y with | T ⇒ fun _ ⇒ x | F ⇒ fun _ ⇒ F | N yv yt yf ⇒ fun recy ⇒ match Pos.compare xv yv with | Eq ⇒ N_check xv (recx xt _ yt) (recx xf _ yf) | Lt ⇒ N_check yv (recy yt _) (recy yf _) | Gt ⇒ N_check xv (recx xt _ y) (recx xf _ y) end end) end)); unfold ltof; simpl; clear; abstract omega. Defined. Fig. 15: The and operation on bdds natural integers, using simple hash tables. Alternatively, for the bdd type, our type class instance is just an OCaml wrapper around the hash-consing library built-in memoization mechanisms. Again, we use the example of the bdd_and operation, shown in Fig. 15. As in Fig. 11, we define this function using two nested fixed-points, in order to handle the special recursion scheme of this function. The definition of bdd_and uses memo_rec twice, in a nested fashion. 
This is needed in order to handle the special recursion scheme of this function (decreasing on one of its two parameters). Garbage collection. The strategy we use in §5.2 for garbage collection is very naive: we kept everything alive, forbidding any garbage collection. Here, the hash-consing library we use allow the garbage collector to reclaim any node that is not referenced by the program, by using suitable weak hash tables. Moreover, it reclaims any memoized value that is associated with a dead key (we will not give more details here and refer the reader to [9]). While it avoids memory leaks, this strategy does not necessarily give the best performances (see §7). We believe the changes necessary to implement a new strategy are small, and do not involve rewriting the proofs: only OCaml code is involved. It is important to note that we have to memoize the functions each argument after the other (keeping them curried). Indeed, one should be tempted to memoize a function bdd ∗ bdd → bdd. This is not a good idea, since we use weak hash-tables for the memoization: in this case, the pairs containing the arguments are no longer accessible after the function call, so that the garbage collector is going to collect most of the memoized data at each cycle. Using our pattern, we make sure a memoized datum is not reclaimed as long as both parameters are still alive in memory. (Note that if we choose to use regular hash-tables instead of weak ones, 22 Thomas Braibant et al. this argument does not stand anymore, but the program has a different performance and memory consumption profile.) Discussion: comparison with previous approaches. In this instance of our BDD library, all Coq definitions are kept simple and proofs are straightforward. That is, we can prove semantic correctness of all operations directly using structural induction on decision trees and there is no state holding structures. There is a nice separation between the hash-consing and memoization code, that is generic, written in OCaml and not proved and the BDD code, mostly written and proved in Coq. We discuss the lack of confidence in the OCaml code and how it can be restored in §9.1. We do not need to use monads in the Coq code, so that the interface of the BDD library remains modular and easy to use. Moreover, it is straightforward to implement garbage collection strategies in order to avoid memory leaks. 6.2 The smart+uid approach The previous smart approach totally hides the unique identifiers from the Coq code. Yet, exposing these unique identifiers may be useful at times. Consider the following use case: from a BDD B we would like to build an equivalent propositional logic formula of linear size, for instance for feeding into a satisfiability modulo theory solver. In order to avoid an exponential blow-up, each shared sub-tree should generate one single sub-formula, used in a “let” binder so that its value can be used in multiple occurrences. The obvious way to implement such a transformation is to first detect which sub-trees are shared, using a set of shared subtrees seen so far, then to perform the transformation, using a table of mappings from subtrees to bound variables. It seems therefore highly desirable to be able to build sets and maps over our hash-consed type. Generic functional sets and maps are usually implemented using balanced trees over a totally ordered data type; for efficiency, the comparison function should be very fast. 
An obvious choice would be to expose the unique identifiers to the Coq code (through a function bdd → uid), or at least the total order that they induce (through a function bdd → bdd → comparison). Unfortunately, doing so without precautions can lead to unsoundness. Consider a program where, in succession, two nodes A and B are allocated, then a node A′ isomorphic to A is created; let uA < uB, uA′ be the successive unique identifiers. If A is collected between the allocations of B and A′, then A′ will be allocated, with uA′ > uB. Yet, A′ and A are, from the point of view of the Coq code, identical; thus uA′ = uA, yielding an inconsistency. The workaround is to use a normal hash table, as opposed to a weak hash table, which prevents the collection of unreachable nodes. Then, two identical nodes created within the same execution are necessarily physically equal and thus share the same identifier.

One difficulty remains. Gallina is a purely functional language; the evaluation of a given term always yields the same result, and one expects the same property to extend to the extracted OCaml program, as long as it does not interact with the external world (e.g. reading from files). Yet, this is not necessarily the case if one exposes the unique identifiers. Consider a program P1 such that the extracted OCaml code allocates two nodes A and B in this order. If P1 is run stand-alone, then uA < uB. Yet, if another program P2 allocating B′ isomorphic to B is first run, and B′ is not collected, then B′ is the same as B and uB = uB′ < uA. It seems questionable that the result of an evaluation should depend on whether or not some other (unrelated) evaluation has taken place, if only because it makes debugging difficult.8

6.2.1 Maps and sets over hash-consed types

Arguably, the final result of an algorithm using functional sets and tables should not depend on the order relation used. One could thus opt to expose neither the unique identifiers, nor the order relation they induce, but only a functional map interface (with find, remove and update). Then, the results cannot depend on garbage collection and other disturbances to the ordering. Yet, one could also want these sets and maps to provide a fold operation; this would need to be restricted to associative and commutative operations (so that their result does not depend on the ordering). Such an interface would fit the map-reduce approach to aggregate objects. One additional advantage is that, with this approach, it is possible to use weak hash tables and garbage collection, as long as the sets and maps point to the nodes used as keys and thus prevent their collection: the ordering relation is then exposed to the Coq code only through equality testing, be it explicit or implicit (through the use of sets or maps).9 While this line of work seems powerful, we did not pursue it: the many axioms needed do not inspire confidence; some nontrivial meta-theoretic reasoning about the sequential execution of OCaml code extracted from Coq would be needed.

6.2.2 Axioms

In the smart approach, hash-consing and memoization are done after the fact, and are completely transparent for the user; yet we felt a need to break this transparency by exposing the unique identifiers. In the following, we instead make more explicit the hypotheses that we make on the representation of BDDs.
That is, we make visible in the inductive type of BDDs that each BDD node has a “unique identifier” field (see Fig. 16) and we take the node construction function mkN as an axiom, which is implemented in OCaml. Note that nothing prevents the Coq program from creating new BDD nodes without calling this function mkN. Yet, only objects created by it (or copies thereof) satisfy the valid predicate; we must declare another axiom stating that unique identifier equality is equivalent to Coq's Leibniz equality for valid nodes. Then, we can use unique identifiers to check for equality.

Footnote 8: One could argue that, with certified programs in the Coq fashion, there is no need to debug: each function or module comes with a proof of its correctness, which compositionally entails the correctness of the whole program. Yet, one commonly proves only that the results are correct, not necessarily optimal, and one very seldom proves that the computation has the expected complexity. Furthermore, some computations are split between an untrusted solving procedure and a trusted checker; a failed check entails having to debug the untrusted procedure, which may be hard if behaviors are difficult to reproduce independently of the rest of the program.

Footnote 9: It is of paramount importance that the sets and maps use the actual nodes as keys (ordered by the order on unique identifiers), not just the unique identifiers: this prevents scenarios such as creating an object A, inserting a key uA into some set S, collecting A, then creating another object A′ isomorphic to A but with a different key uA′, which could be distinguished from A by membership in S. In a nutshell, it is crucial that the Coq program cannot test for equality of unique identifiers separated from their associated objects.

Axiom var : Set.
Axiom uid : Set.
Axiom uid_eqb : uid → uid → bool.
Axiom uid_eq_correct : ∀ x y : uid, (uid_eqb x y = true) ↔ x = y.

Inductive bdd : Set :=
| T
| F
| N : uid → var → bdd → bdd → bdd.

Axiom mkN : var → bdd → bdd → bdd.
Axiom mkN_ok : ∀ v : var, ∀ bt bf : bdd, ∃ id, mkN v bt bf = N id v bt bf.

Inductive valid : bdd → Prop :=
| valid_T : valid T
| valid_F : valid F
| valid_N : ∀ var bt bf, (valid bt) → (valid bf) → (valid (mkN var bt bf)).

Axiom shallow_equal_ok : ∀ id1 id2 : uid, ∀ var1 var2 : var, ∀ bt1 bf1 bt2 bf2 : bdd,
  valid (N id1 var1 bt1 bf1) → valid (N id2 var2 bt2 bf2) →
  id1 = id2 → N id1 var1 bt1 bf1 = N id2 var2 bt2 bf2.

Fig. 16: Axiomatization of equality using unique identifiers

This approach is close to the smart approach. It has one advantage: unique identifiers are directly accessible from the Coq code. The previously mentioned limitation applies: since unique identifiers are visible, one cannot use weak hash tables and the OCaml garbage collector. The use of axioms is debatable. On the one hand, the use of axioms somewhat lowers the confidence we can place in the proofs, and they make the code not executable within Coq. On the other hand, these axioms are actually used implicitly when extracting Coq constructors to “smart constructors”: they correspond to the meta-theoretical statement that these constructors behave as native Coq constructors. Thus, they make explicit some of the magic done during extraction. We could gain confidence in the consistency of our axiomatization by using Coq's module system. That is, we could parameterize our development in this section by a suitable interface specifying the various functions that we axiomatized.
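For reference, an OCaml realizer for mkN could be sketched as follows. This is only an illustration under our own assumptions (realizing uid as int, taking var to be int, and using a single global table); it is not the code we actually use.

    (* Sketch of a possible OCaml realizer for the axiom mkN: a plain (non-weak)
       hash table maps the triple (variable, uid of the true branch, uid of the
       false branch) to the corresponding node, and a counter provides fresh
       unique identifiers. *)
    type uid = int
    type var = int
    type bdd = T | F | N of uid * var * bdd * bdd

    let uid_of = function T -> 0 | F -> 1 | N (u, _, _, _) -> u

    let table : (var * uid * uid, bdd) Hashtbl.t = Hashtbl.create 1024
    let counter = ref 2

    let mkN v bt bf =
      let key = (v, uid_of bt, uid_of bf) in
      match Hashtbl.find_opt table key with
      | Some n -> n                        (* reuse the physically equal node *)
      | None ->
          let n = N (!counter, v, bt, bf) in
          incr counter;
          Hashtbl.add table key n;
          n

Since the table is not weak, two valid nodes built from the same arguments are physically equal and carry the same identifier, as shallow_equal_ok demands; an implementation along these lines is precisely the kind of efficient yet unproven OCaml code that the interface mentioned above could be extracted to.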
Then, we could instantiate this interface either with the axioms, extracted to an efficient yet unproven implementation in OCaml, or with a pure Coq version. The latter would prove that this set of axioms is not unsound in Coq. This general approach was used by Vafeiadis [21] to prove that his interface for adjustable references is sound, and we could do the same here. We have yet to conduct this experiment.

7 Discussion

In this section, we compare our design patterns on various aspects. Let us start with the easy ones:

Executability inside Coq. The pure-deep and pure-shallow implementations can be executed inside Coq, and have decent performance. The smart approach can also be executed inside Coq, but has dreadful performance (when executed inside Coq, it uses binary decision trees). The smart+uid approach cannot be executed inside Coq.

Trust in the extracted code. Unsurprisingly, the smart and the smart+uid approaches yield code that is harder to trust, while the pure-deep and pure-shallow approaches leave the extracted code pristine.

Proof. From a proof-effort perspective, the smart approach is by far the simplest. The smart+uid approach involves the burden of dealing with axioms. However, it makes it easier to trust that what is formally proven corresponds to the real behavior of the underlying runtime. By comparison, the pure-deep and pure-shallow approaches required considerably more proof engineering in order to check the validity of invariants on the global state. Note, however, that our proof arguments are much simpler in the latter one.

Garbage collection. Implementing (and proving correct) garbage collection for the pure-deep or pure-shallow approaches would require a substantial amount of work. By contrast, the smart approach makes it possible to use OCaml's garbage collector to reclaim unreachable nodes “for free”.

Operations. As we have shown, binary operations can be handled with a single parametric combinator. All our implementations use such a combinator to implement conjunction, disjunction, exclusive-or and so on. There is little work to do to add support for any binary operation we may have overlooked. The situation is more complicated when it comes to ternary operations (such as the if-then-else Boolean operation). Implementing it using the smart approach and proving it correct requires around 80 lines of code. It would require a nontrivial amount of work to implement it using our pure-deep or pure-shallow approaches, and we have not conducted this experiment yet. Other operations that are relevant in a BDD library are function composition and quantification. We have yet to implement these in any of our libraries. Quantification over one variable can be defined as a structural recursion over one BDD. As such, we think it can be easily defined in all our frameworks. Unary composition (that is, substitution of one variable with another Boolean function) can be defined as a structural recursion over two BDDs. As a consequence, implementing it in the smart framework seems simple. However, this particular recursion scheme does not fit the combinator of the pure-deep or pure-shallow approaches, and some additional work would be needed in order to implement it using those approaches. Moreover, a non-negligible amount of work would be needed to support variable-arity composition and quantification in all libraries.
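The parametric binary combinator mentioned above can be sketched in OCaml in the style of the classic BDD “apply” operation. This is our illustration, not the combinator of any of our libraries; it reuses the hypothetical bdd type and mkN constructor from the sketch above, assumes that variables increase from the root towards the leaves, and omits the memoization layer that the real combinators add.

    (* Generic binary combinator: op interprets the operation on Boolean leaves. *)
    let rec apply (op : bool -> bool -> bool) (a : bdd) (b : bdd) : bdd =
      match a, b with
      | (T | F), (T | F) ->
          let to_bool x = (x = T) in
          if op (to_bool a) (to_bool b) then T else F
      | N (_, v1, t1, f1), N (_, v2, t2, f2) ->
          if v1 = v2 then mkN v1 (apply op t1 t2) (apply op f1 f2)
          else if v1 < v2 then mkN v1 (apply op t1 b) (apply op f1 b)
          else mkN v2 (apply op a t2) (apply op a f2)
      | N (_, v, t, f), _ -> mkN v (apply op t b) (apply op f b)
      | _, N (_, v, t, f) -> mkN v (apply op a t) (apply op a f)

    (* All binary operations are instances of the same combinator, e.g.: *)
    let bdd_and = apply ( && )
    let bdd_xor = apply ( <> )

With weak tables, memoizing such a combinator one argument after the other, as discussed above, ensures that a memoized result survives as long as both of its arguments do.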
7.1 Performance of the extracted code

We evaluate the performance of the OCaml code that is extracted from our “pure” (see §5) and “impure” (see §6) libraries, and we pit them against a reference library implemented in OCaml (available from J.-C. Filliâtre's web page). This reference library does not keep the memoization table alive from one execution of an operation to another: for instance, each time the and function is called, a new memoization hash table is allocated. Therefore, to ensure a fair benchmark with our implementations, we modified this reference library to use a memoization strategy closer to ours. This alternative reference implementation is designated “reference (conservative)” in what follows.

For this evaluation, we use two standard benchmarks (see [23]). The first one is Urquhart's formula U(n), defined by

U(n) ≜ x1 ⇐⇒ (x2 ⇐⇒ . . . (xn ⇐⇒ (x1 ⇐⇒ . . . (xn−1 ⇐⇒ xn))))

The second kind of formula states the pigeonhole principle P(n) for n + 1 pigeons: that is, if there are n + 1 pigeons in n pigeonholes, then there is at least one hole that contains two pigeons. The plots of the execution time required to check that these formulas are tautologies are given in Fig. 17.

First, we remark that the reference implementation that we use is (almost always; see footnote 10) 4 times faster than our best implementation, smart. There is no significant difference in execution time between the reference implementation with the conservative memoization strategy and the reference implementation. We have not conducted a detailed measurement of the memory consumption (footnote 11), though, and the memory consumption profile is likely to be different between the two.

Then, our smart implementation is roughly 4 times faster than the pure-deep one: interestingly enough, this factor is consistent over our benchmarks, while we could have expected logarithmic factors to show up. Indeed, recall that the pure implementation uses functional finite maps, while the smart one uses OCaml hash tables. Also, we have run the same benchmark on our pure-shallow implementation, and the resulting plot is close to the pure-deep plot: for small instances, the difference in execution times is about 10%. For our largest instance of the pigeonhole problem, we observe that pure-shallow is faster by about 30%. We do not dwell too much on this result: we wonder, for instance, to what extent changing the implementation of the finite maps that we use could result in similar performance differences.

Preliminary investigation shows that the main performance difference between the reference implementation and our smart approach comes from the use of the generic hash-consing library: the reference library uses a specialized hash-consing system, much faster than the generic library we use. Similarly, the difference between the smart and the pure-shallow approaches seems to come from the use of AVL trees to represent dictionaries.

We did not investigate the memory behavior much. Our simple experiments show that it depends on the benchmark we consider. The bottom line is that all our approaches consume less than twice as much memory as the conservative version of the reference implementation. Moreover, we observe that pure-shallow consumes 10%–20% less memory than pure-deep. Concerning the smart approach, it seems that our curried memoization scheme creates a lot of very small hash tables in the OCaml heap, which is not the use case OCaml hash tables are optimized for. Thus, there might be room for improvement here.
Footnote 10: This is not necessarily true for small formulas, due to differences in the start-up costs of these libraries.

Footnote 11: These particular benchmarks were run on a computer with 1 TB of RAM, which made it possible to avoid swapping during our experiments. Indeed, memory usage is the limiting factor on bigger instances of the tests.

Fig. 17: Execution time (in seconds) for the BDD benchmarks, comparing the pure, smart, reference (conservative) and reference implementations: (a) Urquhart's formula, as a function of n in U(n); (b) pigeonhole principle, as a function of n in P(n).

8 Implementing reduction in the λ-calculus

In the previous sections, we have presented several design patterns for implementing hash-consed data structures in Coq. In this section, we apply two of them to the example of reduction of λ-terms (the running example of Conchon and Filliâtre [9]).

8.1 An implementation without hash-consing

We first describe a Coq implementation, without memoization or hash-consing, of λ-terms with de Bruijn indices, inspired by Huet's Constructive Computation Theory [14]. We implement four functions on terms: lift, subst, hnf (that puts a term in head-normal form) and nf (that puts a term in normal form). This amounts to roughly 60 lines of Coq code: we present the code for lift in Fig. 18b as a reference for future comparisons.

Inductive term : Type :=
| Var : N → term
| App : term → term → term
| Abs : term → term.

(a) λ-terms with de Bruijn indices

Fixpoint lifti (n : N) (t : term) (k : N) :=
  match t with
  | Var i ⇒ if N.ltb i k then Var i else Var (i + n)
  | Abs t ⇒ Abs (lifti n t (N.succ k))
  | App t u ⇒ App (lifti n t k) (lifti n u k)
  end.
Definition lift n t := lifti n t 0.

(b) An implementation of lift without hash-consing or memoization

Definition lifti : N → term → N → term.
refine (
  memo (fun n ⇒
    memo_rec (well_founded_ltof _ size)
      (fun t rec k ⇒
        memo (fun k ⇒
          match t with
          | Var i ⇒ fun rec ⇒ if N.ltb i k then Var i else Var (N.add i n)
          | Abs t ⇒ fun rec ⇒ Abs (rec t _ (N.succ k))
          | App t u ⇒ fun rec ⇒ App (rec t _ k) (rec u _ k)
          end) k rec)));
(* Proof obligations elided *)
Definition lift n t := lifti n t 0.

(c) Implementing lift the smart way

Fig. 18: Implementation of terms and lift

Then, we reuse Conchon and Filliâtre's benchmark [9]: we implement a λ-calculus version of quicksort that operates on lists (encoded as λ-terms) of Church numerals, and apply this algorithm to sort the list L = (0 :: 3 :: 5 :: 2 :: 4 :: 1 :: nil). Using Coq's virtual machine, sorting this list requires over 250 s on a recent desktop computer (footnote 12): this involves a number of reductions that is exponential in the size of the list and the size of the numbers. Extracting the code to OCaml makes it more efficient by a constant factor. We compile the extracted code with the native OCaml compiler, and run it to sort the list L.

  time (s)                               30
  number of allocated bytes              47 × 10^9
  max. size of the major heap (words)    1.1 × 10^6

The number of allocated bytes is given as reported by Gc.allocated_bytes: it corresponds to what is allocated during the execution of the sorting algorithm. The maximum size of the major heap is given as reported by OCaml's Gc.stat function. In short, these numbers mean that the program allocates an awful lot of short-lived values.

Footnote 12: Equipped with Intel Xeon cores running at 3.60 GHz.
8.2 Using smart constructors

The smart approach from §6.1 involves a small number of changes to the code, mainly to use memoizing constructs and to add memoizer definitions. To get a rough idea of the changes involved, we present the modified code for lift in Fig. 18c: it is merely a matter of replacing abstractions and fixed-point definitions with their memoizing counterparts. The overall size of the code is now slightly over 80 lines of Coq code. While they do not make it possible to compute efficiently inside Coq, these changes have dramatic consequences on the execution time of the extracted code. Again, we compile the extracted code with the native OCaml compiler, and run it to sort the list L.

  time (s)                               0.15
  number of allocated bytes              0.12 × 10^9
  max. size of the major heap (words)    3.1 × 10^6

Here, we have a larger maximum size of the major heap, which accounts for our use of memoization tables. This is to be weighed against the much lower allocation and the 200× decrease in running time.

8.3 Using the shallow-embedding approach

Finally, we use the pure-shallow approach from §5.2. There are now roughly 300 lines of code and definitions to implement lift, subst, hnf and nf. In this section, we briefly review this code, highlighting the differences with respect to what we presented in §5.2.

We give the modified definition of terms in Fig. 19a: each constructor takes an extra positive argument; that is, we drop the extra indirection we used in the definition of BDD expressions in §5.2. (This choice of definition yields simpler proof arguments because we do not need to perform mutual inductions on expr and hc_expr.) We adapt the notion of global state from §5.2. Again, the hmap finite map is indexed using a non-recursive comparison function: it simply compares the head symbols of the terms and, if they are equal, compares the tags of the sub-terms. Again, we define one memoization table per function we wish to memoize.

Then, let us take the example of abstraction to describe term creation. We define the following smart constructor.

Definition mk_Abs t st := mk_term (Abs t 1) st.

The function mk_Abs is a simple wrapper around mk_term: it applies the term constructor Abs to the term t and a dummy unique identifier. Then, mk_term (see the right of Fig. 19b) performs a lookup in the hash-consing table. If there exists a mapping from Abs t p (for some p) to x in the table, then we return x. Otherwise, the alloc function replaces the dummy unique identifier with the next fresh unique identifier available, and it updates the map hmap with a mapping from the expression t to its tagged version r. In effect, this code maintains the invariant that all the terms in the hmap table are the canonical representatives of their equivalence classes modulo the term comparison used to index hmap. That is, we ensure that the terms we build satisfy the following property.

Definition wf_term st t := get t st.(hmap) = Some t.

Let us jump to the definition of lifti (see Fig. 19c): it is defined using a well-founded fixed-point combinator Fixm (elided here) that ensures that the measure msr decreases through recursive calls. The bulk of this definition is lifti_rec: it is quite similar to the previous definitions of lift, except that we have to add fancy return clauses to the match constructs. They are necessary to be able to prove that the measure actually decreases through recursive calls.
(Also, note that we could have inlined lifti_rec in lifti, but this two-step definition makes for more palatable proof goals later on.) The upside of these definitions is that we are now able to compute the normal forms of our λ-terms inside Coq with decent efficiency: it takes roughly 4 s to sort the list L using vm_compute. The downside of these definitions becomes apparent as soon as we start to prove that our four functions lift, subst, hnf and nf enjoy a simple correctness property: that they always return terms that are well-formed and preserve well-formedness of the global state. Proving this requires around 600 lines of code, which brings the total size of this formalization to 900 lines, before we have even started to prove meta-properties such as the fact that hnf or nf implement beta-reduction!

8.4 Discussion

In this section, we have demonstrated that our design patterns can be applied to settings other than BDDs. This exercise helped us to refine the presentation of smart constructors we gave in §6.1, and gave us the opportunity to present another flavor of the shallow-embedding technique, one that does not rely on the mutually recursive inductives we used in §5.2. In short, this is a nice exercice de style. Yet, we believe that this study is also significant in itself. Reduction in the λ-calculus is representative of the kind of computation that arises in Coq, e.g. in the conversion test. Our case study can be seen as a stepping stone for future work that would attempt to prove the correctness of efficient implementations of symbolic algorithms. On the one hand, one could use the pure-shallow approach to implement hash-consing and memoization, yet remain inside the bounds of what Coq's extraction mechanism can safely handle. On the other hand, one could bite the bullet, use the smart approach, and focus on more interesting details than hash-consing and memoization.

9 Conclusion

In this paper, we proposed two solutions to implement hash-consing in programs certified with the Coq system. The first one is to implement it using Coq data structures; the second is to use the imperative features provided by OCaml through the tuning of the extraction mechanism. The difference in flavor between mapping Coq constants to smart OCaml realizers and axiomatizing these realizers in Coq is a matter of taste. In both cases, some meta-theoretical reasoning is required, and something has to be “swept under the rug”.

Inductive term : Type :=
| Var : N → positive → term
| App : term → term → positive → term
| Abs : term → positive → term.

Definition tag t n :=
  match t with
  | Var x _ ⇒ Var x n
  | App t u _ ⇒ App t u n
  | Abs t _ ⇒ Abs t n
  end.

(a) Definition of terms, and the associated tagging function

Definition alloc t (st : state) :=
  let r := tag t st.(next) in
  (r, mk_state {| hmap := set t r st.(hmap); next := st.(next) + 1 |} (to_memo st)).

Definition mk_term (t : term) st :=
  match get t st.(hmap) with
  | Some x ⇒ (x, st)
  | None ⇒ alloc t st
  end.

(b) Updating the global state, and the term constructor

Let A := (term ∗ N ∗ state).
Let B := (term ∗ state).
Let msr := fun (n : A) ⇒ size (fst (fst n)). (* measure function *)

Definition lifti_rec (n : N) (arg : A) (rec : ∀ x : A, msr x < msr arg → B).
refine (
  match arg as arg’ return arg = arg’ → B with
  | (t, k, st) ⇒ fun Harg ⇒
    match get (n, ident t, k) (memo_lifti st) with
    | Some t ⇒ (t, st)
    | None ⇒
      let (r, st) :=
        match t as t’ return t = t’ → _ with
        | Var i _ ⇒ fun H ⇒
            if N.ltb i k then mk_Var i st else mk_Var (N.add i n) st
        | Abs t _ ⇒ fun H ⇒
            let (t, st) := rec (t, (N.succ k), st) _ in mk_Abs t st
        | App t u _ ⇒ fun H ⇒
            let (t, st) := rec (t, k, st) _ in
            let (u, st) := rec (u, k, st) _ in
            mk_App t u st
        end eq_refl
      in (r, upd_lifti (n, ident t, k) r st)
    end
  end eq_refl).
(* Proof obligations elided *)
Defined.

Definition lifti (n : N) (t : term) (k : N) (st : state) : term ∗ state :=
  Fixm msr (lifti_rec n) (t, k, st).

(c) Definition of lifti

Fig. 19: λ-terms revisited, using the shallow-embedding approach

9.1 Trusting manually-tweaked extracted code

Wrapping up this article, we have to stress the fact that using hand-written code in the extraction process yields code that is less trustworthy. Indeed, the code we use to realize axioms (e.g., inhabitants of the type memoizer) or to replace constructors and destructors with the suitable smart ones may introduce bugs. For instance, we could end up producing code that fails to compile, because the code snippets used to customize the extraction are used as textual replacements. Or, we could end up producing code that is observationally not equivalent to the pristine extracted function because of side effects. Or we could simply ruin all our verification effort by introducing the same kind of bugs that we wanted to catch using formal proofs.

There are two simple ways to restore confidence in this code: code reviews and tests. Unfortunately, doing more than that is hard. For instance, we could imagine using the Why3 framework [20] to reason about our code. However, Why3 is not suited to reasoning about higher-order programs, such as our memoizing fixpoint operators. More promising is Charguéraud's CFML framework [7], which makes it possible to reason about ML programs with effects. We wonder to what extent the verification of our combinators could be conducted in this system. In any case, we would like to point out that using hand-written code only increases the trusted code base, which already contains the whole extraction mechanism.

9.2 Trusting external imperative code

The simple act of calling external libraries that rely on side effects may introduce inconsistencies if these libraries return inconsistent values when called with identical parameters. This is because, in Coq, a given term has a unique value inside a model (or a trace of execution, in practical terms). This is the reason why exposing the internals of mutable state, such as unique identifiers, is fraught with possibilities of introducing subtle inconsistencies. For example, the “else” branch of the following Coq program is unreachable, because foo x and foo (pred (S x)) are equal; hence the theorem stating that it always returns “true”:

Definition essai (x : nat) : bool :=
  if eq_nat_dec (foo x) (foo (pred (S x))) then true else false.

Theorem essai_true: ∀ x: nat, (essai x) = true.

Yet, if foo is extracted to a function such as

let counter = ref O in
fun x → counter := S (!counter); !counter

then the OCaml code for essai O returns “false”.

9.3 Some practical details of implementation

We took care throughout this article not to use Coq's advanced support for recursive definitions, i.e., the Program and Function commands. There are two important reasons for this.
First, as we hinted at in §5.1, the termination arguments for our BDD operations are outside the scope of what both commands can automate. Second, we could have used Program instead of refine in many of our definitions. Both can be used to build terms with holes that are later filled using tactics. However, Program is based on a powerful elaboration mechanism [19], meaning that the final term that is built may be quite different from the one the user has written.

Generally speaking, the performance of the OCaml versions of the libraries we build using the smart approach is quite dependent on side effects. For instance, keeping memoization tables from one run of a given BDD operation to the next yields different performance and memory consumption profiles from what one gets without this conservative strategy. The only change between these conservative and non-conservative strategies is when the memoization tables are allocated. Therefore, we are quite picky when it comes to program transformations that could introduce partial applications or eta-expansions. Using refine, we are more confident that the code that we wrote is the one that is going to be extracted, and we have a better idea of the kind of OCaml code that is produced. This makes it easier to deal with this kind of potential performance issue.

9.4 Related work

Purely functional implementations of BDDs. Christiansen and Huch [8] and Bradley and Davies [15] implemented lazy BDDs in Haskell in a manner similar to our “pure-shallow” approach. In their approaches, not only are BDDs maximally shared, they are also lazily evaluated “on demand”. We did not investigate this possibility.

Imperative features inside the specification language. An obvious solution to implement and reason about imperative algorithms is to have these imperative features present in the modeling language of the prover. Some provers directly target high-level programming languages with data structures, references and imperative features: an example is KeY [2], which targets a large subset of Java. While such features are not available in Coq, there are two conceptual difficulties with this approach that would have made it impractical in our case studies. First, BDD algorithms implemented using hash-consing are functional from a high-level point of view: BDD operations are very clearly given functionally, by induction; moreover, hash-consing is suitable only for immutable structures (footnote 13). It therefore seems strange to have to program them in an imperative language, furthermore one that complicates common functional idioms (e.g. pattern-matching). Second, the meta-theory for such languages is typically huge, with intricate proof rules having to deal with mutable data, references, late binding, etc. It is not obvious how much we can trust the proof system with respect to the semantics of the language.

A second option is to add to an existing functional specification language certain imperative traits, such as monads, and then modify the extraction function from the specification language to the target language so as to translate monadic operations into imperative calls, as has been done for Isabelle/HOL [6]; this approach has been used to verify a BDD package [10].
Special proof idioms have to be used for monadic programs, in addition to the general difficulty of programming in monadic style (see §2.1). To the best of our knowledge, this approach has not been investigated in Coq.

Footnote 13: Or at least for structures behaving as though they were immutable; for instance, we can perform hash-consing on a structure if the mutable information in the structure is just used for caching and does not affect the hash code.

A third option is to integrate into the specification language some essentially imperative data structures (e.g. mutable arrays, from which hash tables can be implemented), but present them in a functional fashion (e.g. an update to a mutable array is treated as returning a new array, the same as the previous one except for the updated location). The imperative features are then implemented efficiently for evaluation of expressions inside the prover, and are mapped to native imperative features of the target language during extraction. For instance, an experimental version of Coq exists, with native integers and arrays [1]. Again, all the difficulties with monads (see §2.1) apply, plus there is the problem of running a non-standard version of Coq.

A fourth option is to deeply embed a subset of an imperative language into Coq: the programs of this imperative language are given a semantics inside Coq, and correctness properties are proved with respect to this semantics. This idea is discussed by Vafeiadis [21], but, as he remarks, this largely precludes the use of regular proof tactics: we have to develop proof steps specific to the language being embedded, and prove these steps correct with respect to the semantics.

Verification of BDD algorithms in Coq. Verma et al. [22, 23] implemented and proved correct in Coq a BDD library featuring efficient negation and disjunction; other operations like conjunction, implication and so on are implemented as derived operations. The BDDs produced are reduced and shared. As we said, we chose to implement a fresh library because the code associated with their paper did not age well with respect to the evolution of Coq. Besides the fact that they investigated garbage collection, there is no conceptual difference between their library and our pure-deep approach.

Verification of BDD algorithms in other theorem provers. In Isabelle/HOL, Ortner and Schirmer [18] verified the implementation of a normalization algorithm for binary decision diagrams. That is, their algorithm takes as input a BDD, and outputs the corresponding ROBDD. Their formalization is built on the Burstall–Bornat memory model: they build one heap of type ref ⇒ value for each component of a BDD node, with ref the abstract type of memory addresses. Using a split-heap model makes it easier to reason about heap-allocated data structures in tools such as Why3. We believe, however, that our formalization is more direct than theirs, and more suitable for efficient implementations in Coq.

Boyer and Hunt [4] developed an extension of ACL2 that uses hash-consing to give canonical representatives to ACL2 objects. This makes it possible to memoize some ACL2 user-defined functions. As a case study, they implemented a BDD library in ACL2. Note that this implementation is based on the fact that ACL2 exposes hash-consing primitives and the associated reasoning principles to the user. It is unclear to what extent these features could be added to the Coq proof assistant.
Hash-consing in the execution language. Jean Goubault-Larrecq's HimML programming language [11–13] is an extension of core Standard ML with primitive finite set and map data types, with a run-time system designed around the concept of maximal sharing, or systematic hash-consing.

9.5 Future work

We conclude with directions for future work. First, we believe that the smart constructors approach is generalizable to a huge variety of inductive types. One can imagine that it could be the job of Coq's extraction mechanism to implement such smart constructors and memoizers on demand, as was the case for other imperative constructs [1]. Where to stop? Should the extraction mechanism also provide built-in functional maps and sets over hash-consed types?

Second, in §5.2, we hinted at the fact that using two mutually inductive datatypes for expressions and hash-consed expressions yields more complicated proofs than the ones we had to do in §8.3. However, these two mutually inductive datatypes are at the heart of Conchon and Filliâtre's hash-consing library. We believe that following their approach closely is a good starting point for investigating to what extent one could provide a certified version of this hash-consing library.

Third, we confess that our implementations of BDDs are a bit rough. We would like to polish the implementations of our pure-shallow and smart approaches, and release them as user-friendly libraries. In particular, we need to add some operations like quantification or functional composition.

References

[1] Michaël Armand et al. “Extending Coq with Imperative Features and Its Application to SAT Verification”. In: Proc. ITP. Vol. 6172. LNCS. Springer, 2010, pp. 83–98. doi: 10.1007/978-3-642-14052-5_8.
[2] Bernhard Beckert, Reiner Hähnle, and Peter H. Schmitt, eds. Verification of Object-Oriented Software: The KeY Approach. Vol. 4334. LNCS. Springer, 2007. doi: 10.1007/978-3-540-69061-0.
[3] Bruno Blanchet et al. “Design and Implementation of a Special-Purpose Static Program Analyzer for Safety-Critical Real-Time Embedded Software”. In: The Essence of Computation: Complexity, Analysis, Transformation. Vol. 2566. LNCS. Springer, 2002, pp. 85–108.
[4] Robert S. Boyer and Warren A. Hunt Jr. “Function memoization and unique object representation for ACL2 functions”. In: ACL2. ACM, 2006, pp. 81–89.
[5] Thomas Braibant, Jacques-Henri Jourdan, and David Monniaux. Supplementary material. url: https://github.com/braibant/hash-consing-coq.
[6] Lukas Bulwahn et al. “Imperative Functional Programming with Isabelle/HOL”. In: Theorem Proving in Higher Order Logics (TPHOLs). Vol. 5170. LNCS. Springer, 2008, pp. 134–149. doi: 10.1007/978-3-540-71067-7_14.
[7] Arthur Charguéraud. “Characteristic Formulae for the Verification of Imperative Programs”. In: Proceedings of the 16th ACM SIGPLAN International Conference on Functional Programming (ICFP). Ed. by Manuel M. T. Chakravarty, Zhenjiang Hu, and Olivier Danvy. ACM, 2011, pp. 418–430.
[8] Jan Christiansen and Frank Huch. “A purely functional implementation of ROBDDs in Haskell”. In: Trends in Functional Programming. Ed. by Henrik Nilsson. Vol. 7. Intellect, 2006, pp. 55–71. isbn: 978-1-84150-188-8.
[9] Sylvain Conchon and Jean-Christophe Filliâtre. “Type-Safe Modular Hash-Consing”. In: ACM SIGPLAN Workshop on ML. Portland, Oregon, Sept. 2006. url: http://www.lri.fr/~filliatr/ftp/publis/hash-consing2.pdf.
[10] Mathieu Giorgino and Martin Strecker. “Correctness of Pointer Manipulating Algorithms Illustrated by a Verified BDD Construction”. In: Formal Methods (FM). Ed. by Dimitra Giannakopoulou and Dominique Méry. Vol. 7436. LNCS. Springer, 2012, pp. 202–216. doi: 10.1007/978-3-642-32759-9_18.
[11] Jean Goubault. “HimML: Standard ML with Fast Sets and Maps”. In: 5th ACM SIGPLAN Workshop on ML and its Applications. 1994. url: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.40.4967.
[12] Jean Goubault. Implementing Functional Languages with Fast Equality, Sets and Maps: an Exercise in Hash Consing. Tech. rep. Bull S.A. Research Center, May 1994. url: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.1757.
[13] Jean Goubault-Larrecq. HimML: HimML is a map-oriented ML. Dec. 2008. url: http://www.lsv.ens-cachan.fr/~goubault/himml-dwnld.html.
[14] Gérard Huet. “Constructive Computation Theory”. Course notes, available at http://yquem.inria.fr/~huet/PUBLIC/CCT.pdf.
[15] Jeremy Bradley and Neil Davies. Compositional BDD Construction: A Lazy Algorithm. Tech. rep. CSTR-98-005. Department of Computer Science, University of Bristol, 1998. url: http://www.cs.bris.ac.uk/Publications/Papers/1000250.pdf.
[16] Donald E. Knuth. The Art of Computer Programming. Vol. 4A. Addison-Wesley, 2011. Chap. 7.1.4: Binary decision diagrams.
[17] Guillaume Melquiond. “Floating-point arithmetic in the Coq system”. In: Information and Computation 216 (2012), pp. 14–23. doi: 10.1016/j.ic.2011.09.005.
[18] Veronika Ortner and Norbert Schirmer. “Verification of BDD Normalization”. In: TPHOLs. Vol. 3603. LNCS. Springer, 2005, pp. 261–277.
[19] Matthieu Sozeau. “Subset Coercions in Coq”. In: TYPES. Vol. 4502. LNCS. Springer, 2007, pp. 237–252.
[20] The Why3 platform 0.80. url: http://why3.lri.fr.
[21] Viktor Vafeiadis. “Adjustable references”. In: ITP. Vol. 7998. LNCS. Springer, 2013, pp. 328–337.
[22] Kumar Neeraj Verma and Jean Goubault-Larrecq. Reflecting BDDs in Coq. Rapport de recherche RR-3859. INRIA, 2000. HAL: inria-00072797.
[23] Kumar Neeraj Verma et al. “Reflecting BDDs in Coq”. In: Proc. ASIAN. Vol. 1961. LNCS. Springer, 2000, pp. 162–181. doi: 10.1007/3-540-44464-5_13.