Binary Decision Diagrams

Fabio SOMENZI
Department of Electrical and Computer Engineering
University of Colorado at Boulder

Abstract

We review Binary Decision Diagrams, presenting the properties and algorithms that are most relevant to their application to the verification of sequential systems.

1 Introduction

Many tasks in the design and verification of digital systems manipulate large propositional formulae. It is therefore important to have efficient ways of representing and manipulating such formulae. In recent times Binary Decision Diagrams (BDDs) have emerged as the representation of choice for many applications. Though BDDs are relatively old [39, 1], it was the work of Bryant [8] that attracted the attention and renewed the interest of many researchers. Bryant observed that reduced, ordered, binary decision diagrams are a canonical representation of boolean functions. The import of this observation can be seen in the fact that nowadays reduction and ordering are usually implied when referring to BDDs. Following the established use, we refer to reduced ordered binary decision diagrams simply as BDDs.

Canonicity reduces the semantic notion of equivalence to the syntactic notion of isomorphism. It is the source of both efficiency and ease of use for BDDs. On the one hand, canonicity enhances the effectiveness of memoization techniques. On the other hand, it makes the test for equivalence inexpensive. Canonicity also has one important drawback: it is the prime reason why BDDs are less concise than circuits in general. There are families of functions for which conjunctive normal forms are exponentially more concise than BDDs, and vice versa. The same happens when comparing some diagrammatic representations. It would therefore be a mistake to use BDDs indiscriminately whenever a representation for boolean functions is required. It would be equally mistaken to dismiss BDDs on the grounds that multipliers have provably exponential BDDs [9]. The best results in many applications come from a flexible approach that combines the strengths of several representations (e.g., [14]).

The application that led to the initial development of BDDs is switch-level simulation of MOS circuits [7]. Since then, their use has proliferated to the point that an exhaustive list of all the problems to which decision diagrams have been applied is not feasible here. We limit ourselves to mentioning some of the more popular and significant applications of BDDs and their derivatives. Verification of the correctness of hardware relies on BDDs for both the representation of the circuits and the manipulation of sets of states. Model checking algorithms based on this implicit representation of sets have successfully verified properties of systems with very large numbers of states (10^20 or more) [12, 44, 6]. In the optimization of logic circuits, BDDs are used, among other things, to represent "don't care" conditions [56], and to translate boolean functions into circuits based on a specific implementation technology [42, 13, 25]. Testing and optimization of sequential circuits also benefit from the use of BDDs [18]. Various forms of decision diagrams can be applied to the solution of large systems of linear equations, in particular those found in the analysis of large Markov chains [30], and to graph algorithms like maximum flow in a network [31].

The rest of this chapter is organized as follows.
In Section 2 we present BDDs informally. After introducing notation and background in Section 3, we give a definition of BDDs in Section 4. In Section 5 we discuss the main algorithms used to manipulate BDDs. Section 6 is devoted to the problem of variable ordering. Section 7 discusses the efficient implementation of BDD packages. Section 8 surveys the application of BDDs to the verification of sequential hardware. Section 9 summarizes and concludes.

2 From Decision Trees to Decision Diagrams

2.1 BDDs Are Reduced Decision Trees

The left part of Figure 1 shows an ordered decision tree for the disjunction of two variables a and b. The node at the top, labeled f, is the function node. The elliptical nodes labeled with variable names are the internal nodes, and the rectangular nodes at the bottom are the terminal nodes. The terminal nodes are labeled by either 1 (signifying true) or 0 (signifying false).

Figure 1: Decision tree and decision diagram for the disjunction of a and b.

Evaluation of f for a given valuation of a and b consists of following a path from the function node to a terminal node. The label of the terminal node is the value sought. At each internal node the path follows the solid arc if the variable labeling the node evaluates to 1, and the dashed arc otherwise. The solid arc is called the then arc, while the dashed arc is called the else arc. The order in which the variables appear is the same along all paths; in this case a comes before b. (The smallest element of the order is the variable closest to the function node.) The right part of Figure 1 shows the Binary Decision Diagram for the same order. It is obtained from the corresponding decision tree by a process called reduction.

Definition 1 Reduction consists of the application of the following two rules, starting from the decision tree and continuing until neither rule can be applied.

1. If two nodes are terminal and have the same label, or are internal and have the same children, they are merged.

2. If an internal node has identical children, it is removed from the graph and its incoming arcs are redirected to the child.

In the case of Figure 1, reduction proceeds by merging the three terminal nodes labeled 1. As a consequence, the children of one of the internal nodes labeled b become the same node. This causes the application of the second rule, which produces the graph on the right of Figure 1. No further application of either rule is possible and reduction terminates. The result of reduction depends only on the function to which it is applied and on the order of the variables; it is independent of the order in which the rules are applied.

2.2 Shared BDDs and Strong Canonical Form

In practice one does not build the decision tree and then reduce it. Rather, BDDs are created starting from the BDDs for the constants and the variables by application of the usual boolean connectives, and they are kept reduced at all times. At the same time, several functions are represented by one multi-rooted diagram. Indeed, each node of a BDD has a function associated with it. If we have several functions, they will have subfunctions in common. For instance, if we have two functions f and g with common subfunctions, we represent them as in Figure 2.

Figure 2: Shared BDDs.

As a special case, two equivalent functions are represented by the same BDD (not just two identical BDDs). This approach therefore makes the equivalence check a constant-time operation. Its implementation is based on a dictionary of all the BDD nodes in existence in an application. This dictionary is called the unique table.
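To make the mechanism concrete, the following Python sketch shows a toy unique table. The representation (integer handles into a node list, and a dictionary keyed by the triple variable, else child, then child) and all of the names are illustrative assumptions, not the interface of any real BDD package; complement arcs are ignored, so both terminal nodes are kept. The discipline enforced when a node is requested is exactly the one described in the next paragraphs.

class Bdd:
    """Toy shared-BDD manager: all nodes live in one table, kept reduced at all times."""
    def __init__(self):
        self.nodes = [('0',), ('1',)]   # handles 0 and 1 are the two terminal nodes
        self.unique = {}                # unique table: (var, else_child, then_child) -> handle

    def mk(self, var, lo, hi):
        """Return the node testing var, with else child lo and then child hi."""
        if lo == hi:                    # second reduction rule: never create a redundant test
            return lo
        key = (var, lo, hi)
        if key not in self.unique:      # first reduction rule: never duplicate a node
            self.unique[key] = len(self.nodes)
            self.nodes.append(key)
        return self.unique[key]

    def var(self, i):
        """BDD of the projection function for the variable at level i."""
        return self.mk(i, 0, 1)

b = Bdd()
x = b.var(0)
assert b.var(0) == x        # equal functions receive the same handle,
                            # so the equivalence check is a single comparison
assert b.mk(1, x, x) == x   # a node with identical children is never created

In a real package the handle would be a pointer into the node store, and the equivalence test would literally be a pointer comparison.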
Operations that build BDDs start from the bottom (the constant functions) and proceed up to the function nodes. Whenever an operation needs to add a node to a BDD that it is building, it knows already the two nodes (say,  and  ) that are going to be the new node’s children, and the variable (say,  ) that is  . If indeed the two children are the same, no going to label it. Therefore the operation first checks if   , it looks up new node needs to be created. (This is in accordance to the second reduction rule.) If  the unique table for the existence of the triple     , that is, for a node with the desired characteristics. Only if such node is not found, it is created (in accordance to the first reduction rule). The representation we have just outlined is called strong canonical form and is quite common in software packages that manipulate BDDs, in which equivalence is tested by a simple pointer comparison.            2.3 Attributed Arcs   The BDDs for and  are very similar. The only difference being the values of the leaves, which are interchanged. This suggests the possibility of actually using the same subgraph to represent both and  .  3    Suppose a BDD represents . If we are interested in   , it is then sufficient to remember that the function we have in the multi-rooted graph is the complement of  . This can be accomplished by attaching an attribute to the arc pointing to the top node of . An arc with the complement attribute is called a complement arc. The arcs without the attributes are called regular arcs. The use of complement arcs slightly complicates the manipulation of BDDs, but has three advantages. Obviously, it decreases the memory requirements. 1 The second, more important consequence of using complement arcs is the fact that complementation can be done in constant time—the BDD is already in place—and checking two functions for one being the complement of the other also takes constant time. Finally, inexpensive complementation allows an implementor to use De Morgan’s laws freely. For instance, once an algorithm for conjunctive decomposition of a function has been coded, disjunctive decomposition can be obtained with negligible overhead by complementing the conjunctive decomposition of the complement. Note that with complement arcs we need only one constant function (we choose 1) and hence only one leaf in the multi-rooted DAG. In Section 5 we shall see that in order to preserve the canonicity of the representation, only the arcs out of function nodes and the else arcs of internal nodes can be complement arcs. In the figures, therefore, a dotted line unambiguously designates a complement arc. Also, in drawings we shall normally align all the nodes labeled by the same variable, and indicate the variable only once at the left. This leaves the interior of the node free for a unique identifier, to which the examples will often refer. The attribute mechanism is quite general: Other attributes have been used for other purposes [47, 36].  3 Preliminaries Binary Decision Diagrams represent boolean functions. In this section we introduce the notation we use for boolean formulae and recall the basic result that we need for BDD manipulation. Given a boolean algebra , the boolean formulae on the variables     are obtained by recursively applying negation, conjunction,  and disjunction to the elements of and to the variables. We denote conjunction by ‘  ’, disjunction by ‘ ’,       is a formula. When and negation by ‘ ’. Unless otherwise noted, we assume  . 
Thus,     no confusion arises we write    instead of    and we drop parentheses assuming that negation has the highest precedence, and conjunction binds more strongly than disjunction. Formulae designate functions in the usual way. Definition 2 Let        be a boolean function. Then      are the positive and negative cofactors of                                with respect to   .    The cofactors are given several different names in the literature, e.g., restrictions. If    then depends on   and   is said to belong to the support of . The cofactors of with respect to   do not depend on   . Cofactors commute, that is:             1 We are assuming that the overhead for storing the attributes is negligible. This indeed the case because it is customary to store the complementation flag in the otherwise unused least significant bit of a pointer. The maximum decrease in memory requirements is by a factor of 2, but in practice this limit is seldom approached. 4 and they distribute over negation, conjunction, and disjunction:                              The following result, known as expansion theorem, is due to Boole and forms the basis for most algorithms that manipulate BDDs. Theorem 1 Let         Example 1 Suppose    be a boolean function. Then                                 (1)  . Then              4 BDDs and Their Canonicity  Definition 3 A BDD is a directed acyclic graph       representing a vector of boolean functions  . The nodes are partitioned into three subsets.  is the set of the internal nodes. The outdegree of   is 2. Every node  has a label    in the support of . 1 is the terminal node: Its outdegree is 0. is the set of function nodes: The outdegree of  is 1 and its indegree is 0. The function nodes are in one-to-one  correspondence with the components of . The outgoing arcs of function nodes may have the complement attribute. The two outgoing arcs for a node  are labeled  and  , respectively. The  arc may have the complement attribute. We use           to indicate an internal node and its two outgoing arcs.  The variables in the support of are ordered: If   is a descendant of   (     ), then         .  The function vector represented by a BDD is defined as follows:  1. The function of the terminal node is the constant function 1. 2. The function of a regular arc is the function of the head node; the function of a complement arc is the complement of the function of the head node. 3. The function of a node  (   ). is given by          "!   , where "! (  ) is the function of    4. The function of #$ is the function of its outgoing arc. The distinction between the function of a node and the function of an arc allows us to deal with attributed arcs in a natural way. Theorem 2 BDDs are canonical (the representation of a vector of boolean functions variable ordering) if: 5  is unique for a given 1. There are no distinct internal nodes   and     .  2. For every node, !  such that        ,       , and     . 3. All internal nodes are descendants of some node in .    Proof. For this proof it is convenient to reverse the established use of having the smallest variable of the  order at the top of the BDD. We suppose therefore that the variables in are       with    closest to the function node. The proof is by induction on . If , is a vector of constant functions whose representation is clearly unique because the BDD contains only one terminal node. 
The constant 1 is represented as a regular arc pointing to the terminal node, and the constant 0 is represented as a complement  and all functions of      have unique arc pointing to the terminal node. Suppose now that   representations as BDDs. Let be a component of . We can write         . Both   and  are uniquely determined by and   and have unique representations. Suppose does not depend on  . Then the representation of is the representation of   , which is unique (and identical to the representation of  ). Indeed the second condition of the theorem prevents the representation of from containing a node labeled   . Suppose now that does depend on   . We need to distinguish the case in which the outgoing arc of the function node for  is regular from the case in which it is complemented. Suppose the arc is regular. Then the representation of must consist of a regular arc pointing to a node labeled   with children representing   and  . Suppose the arc is complemented. Then the representation of must consist of a complement arc pointing to a node labeled   with children representing    and   . Hence, in both cases it is unique by the induction hypothesis and the first condition of the theorem. The last condition  of the theorem simply guarantees that the representation of consists solely of the representations of its components, which are unique.                                  In the following, we only consider BDDs that conform to the requirements of Theorem 2. Note that the restriction that the  arc must be regular is imposed to guarantee canonicity. Suppose we did not impose the restriction. Then from     we would have two distinct ways to represent                  in terms of its cofactors:                 (2) Instead, we use this equivalence to choose the one form in which the then arc is regular. Evaluation of a function represented according to Definition 3 amounts to counting the number of complement arcs on the path from the function node to the terminal node. An even number of complement arcs signals that the function evaluates to 1. A consequence of the requirement that the then arcs be regular is that the value of a function when all variables evaluate to 1 can be determined by inspection of the arc out of the function node, since no other complement arc can be encountered along the path. This simple observation leads to the following interpretation of the restriction of the then arcs. We represent directly only half of the functions of variables: Those that evaluate to 1 when all variables evaluate to 1.   6 5 Basic Manipulation of BDDs 5.1 Conjunction of BDDs and Related Operations The usual way of generating new BDDs is to combine existing BDDs with connectives like conjunction (AND), disjunction (OR), and symmetric difference (EX-OR). As a starting point one takes the simple BDDs 2 We are therefore interested in an for the functions    , for all the variables in the functions of interest.    , where algorithm that, given BDDs for and  , will build the BDD for is a binary connective (a boolean function of two arguments). The basic idea comes—not surprisingly—from the Theorem 1, since:                     So, if  is the top variable of and  , we can first cofactor the two functions with respect to  and solve two simpler problems recursively, and then create a node labeled  that points to the results of the two subproblems (if such a node does not exist yet; otherwise we just return the existing node). 
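As an illustration, here is a minimal Python sketch of this recursion for the case of conjunction, built on the toy manager sketched in Section 2.2. It is only a sketch under simplifying assumptions: complement arcs are ignored, the helper names are invented, and the computed table introduced in Section 5.1.1 is omitted; how the cofactors are obtained and which terminal cases apply is explained in the text that follows.

INF = float('inf')

def top_var(bdd, f):
    return bdd.nodes[f][0] if f > 1 else INF   # terminal nodes carry no variable

def cofactors(bdd, f, v):
    if top_var(bdd, f) != v:                   # f does not depend on v
        return f, f
    _, lo, hi = bdd.nodes[f]
    return lo, hi                              # else and then children of the top node

def bdd_and(bdd, f, g):
    if f == 0 or g == 0: return 0              # terminal cases
    if f == 1: return g
    if g == 1: return f
    if f == g: return f
    v = min(top_var(bdd, f), top_var(bdd, g))  # top variable of f and g
    f0, f1 = cofactors(bdd, f, v)
    g0, g1 = cofactors(bdd, g, v)
    return bdd.mk(v, bdd_and(bdd, f0, g0),     # reduction rules applied by mk
                     bdd_and(bdd, f1, g1))

b = Bdd()
x, y = b.var(0), b.var(1)
assert bdd_and(b, x, y) == bdd_and(b, y, x)    # canonical result: same handle either way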
3 , that Finding the cofactors of and  with respect to  is easy: If does not depend on  ,    is, the cofactors are the function itself. If, on the other hand,  is the top variable of , the two cofactors are the two children of the top node of . Similarly for  .        5.1.1 Conjunction     The algorithm that takes ,  , and as arguments and returns  is called Apply in the literature. We now examine in detail the special case that computes the conjunction of two BDDs. The pseudo-code is shown in Figure 3. As mentioned in Section 2.3, the application of De Morgan’s laws incurs negligible overhead thanks to the complement arcs. Therefore all boolean functions of two operands can be computed efficiently given procedures that compute the conjunction and the symmetric difference of two operands. This is indeed the approach followed in many packages because the specialized algorithms are more efficient than the general one. The algorithm of Figure 3 employs two major data structures. One is the unique table discussed in Section 2.2. Its purpose is to guarantee the canonicity of the BDDs. The second data structure is the computed table, which stores results of previous computations so that repeated computations may be avoided. The computed table is often referred to as the cache. We will not use this name to avoid confusion with the hardware cache of the computed on which the BDD application runs. The terminal cases for the recursion are: case     result 0 0          0  is called a projection function. The function  Here for simplicity we ignore the treatment of complement arcs. It is almost always the case that once an algorithm has been devised to operate on BDDs without complement arcs, it can be easily extended to operate on BDDs with complement arcs by applying Equation 2. 2 3 7 AND     if (terminal case)  return result;   else if (computed table has entry   )  return result;  else   let  be the top variable of   ; AND     ;  AND       ; if ( equals  )  ; else  = findOrAddUniqueTable   insertComputedTable     ; return  ;       ;   Figure 3: Pseudo-code for the conjunction algorithm. All these conditions can be tested in constant time. If none of them prevails the computed table is consulted. Two observations are in order here. First, the results of trivial problems—the terminal cases—are not stored in the computed table, because memory is a precious resource which should not wasted to record results that can be recomputed in about the same time that it would take to look them up. (Or less, in the case of a hardware cache miss caused by the memory access necessary to read the computed table.) Second, the  lookup is for the set   . In other words, the order of the operands is immaterial because conjunction is commutative. If the computed table lookup fails to return a result, the procedure proceeds to apply the expansion theorem (Equation 1) and solve the problem recursively. It then applies the reduction rules to the partial result by making sure that no nodes with two identical children and no duplicates of existing nodes are ever created. Finally, the result is stored in the computed table and is returned to the caller. The details of handling complement arcs are not shown in Figure 3, but are analogous to those discussed in the proof of Theorem 2.  Example 2 Consider   and  of Figure 4. The computation of  AND          AND AND      AND      AND      proceeds as follows. AND     unique table lookup    the procedure looks up the unique table. 
In this case the node is Notice that before creating a node already part of . Therefore, of the three nodes of  (two internal plus the terminal node—the function node is not usually included in the node count) only one needs to be created.  8 f g h a b q c r d t p u s 1 Figure 4: BDDs for Example 2. f g h a b c p q t r d s 1 Figure 5: BDDs for Example 3. 9 u  Example 3 Consider and  of Figure 5. These functions are intended to demonstrate the use of the computed table. There are two paths in from the root to node  . As a result there are two recursive calls with operands  and  . The second benefits from the cached result of the first. The computation of    proceeds as follows. AND       AND  AND        In this case three nodes are created:      AND        AND   AND         AND  AND     AND  AND   AND         is cached here cache lookup   ,     AND  AND          AND AND    AND    , and   . The time complexity of procedure AND is readily established by observing that each pair of nodes    with  in and  in  is examined at most four times thanks to the computed table. (The factor of four is due to possible paths with different complementation parity reaching  and  .) Hence, the number of recursive  calls is less than or equal to    , where   is the number of nodes in the BDD for . Since all steps of the procedure, except the recursive calls, take time bounded by a constant, we have established the following result.      Theorem 3 Procedure AND runs in time         . Though the bound is tight, it is seldom achieved in practice if the order of the variables is reasonable. It is important to observe that without the computed table, the runtime of procedure AND would grow exponentially with the number of variables. On the other hand, one may question the practicality of storing all intermediate results of very large computations. Indeed, in most implementations the computed table has fixed maximum capacity, which depends on the available memory and the computation at hand, and is managed as a hash-based cache. This implies that each new result is stored in an entry of the table determined by computing a hash function. Various replacement policies can be used to determine what results, if any, are evicted from the table to make room for the new one. In the simplest scheme, each entry holds exactly one result, and every conflict result in the eviction of the older result. This simple scheme works well provided enough memory is available, and provided hashing works well. (Something not to be taken for granted when the computed table has several million entries.) In summary, the worst case performance of practical implementations of AND is exponential in the number of variables, but the exponential behavior is seldom observed. Similar remarks apply to most algorithms that operate on BDDs. Almost invariably they use a computed table, and have runtimes that are linear in the size of each operand if a lossless computed table is used. It should be noted also that in many implementations the computed table is not flushed once a top-level call to AND terminates. This proves especially advantageous in some model checking experiments. 10 5.1.2 The If-Then-Else Operator The approach based on the expansion theorem can be applied also to the if-then-else operator (ite)—a ternary operator defined as follows:       ite        where   are three boolean functions. One interesting property of the ITE operator is that all two argument operators can be expressed in terms of it. For instance,    ite    . 
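Continuing the toy sketch of the previous sections (again without complement arcs or a computed table, and with invented names), ite can be written with the same expansion-based recursion; the last line illustrates how disjunction is obtained from it.

def bdd_ite(bdd, f, g, h):
    if f == 1: return g                        # terminal cases
    if f == 0: return h
    if g == h: return g
    if g == 1 and h == 0: return f
    v = min(top_var(bdd, u) for u in (f, g, h))
    f0, f1 = cofactors(bdd, f, v)
    g0, g1 = cofactors(bdd, g, v)
    h0, h1 = cofactors(bdd, h, v)
    return bdd.mk(v, bdd_ite(bdd, f0, g0, h0),
                     bdd_ite(bdd, f1, g1, h1))

b = Bdd()
x, y = b.var(0), b.var(1)
assert bdd_ite(b, x, 1, y) == bdd_ite(b, y, 1, x)   # f + g computed as ite(f, 1, g)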
A detailed analysis of ite can be found in [5].   5.2 Satisfiability Problems with BDDs The satisfiability problem is the problem of deciding whether there is an assignment to the variables of a function that makes the function 1. We have seen that this problem is trivial when the function is given as a BDD, because it is sufficient to check whether the BDD is the constant 0 function and that takes constant time. A related, equally simple problem, is tautology. Checking whether a function is a tautology also takes constant time4 . In this section we consider related problems, namely: Finding one satisfying assignment; Counting the number of satisfying assignments; Finding one minimum-cost satisfying assignment (this is also known as binate covering problem). All these problems have in common the correspondence between satisfying assignments and paths in the BDD that go from a function node to the constant node. With complement arcs, all paths lead to the unique constant node and what determines the value associated with a path is its complementation parity. If a path contains an even number of complement arcs then its associated value is 1, otherwise it is 0. Multiple assignments may correspond to the same path. This occurs whenever one or more variables do not appear along the path. The path actually identifies a cube of the function, that is the conjunction of some variables or their complements. The variables that appear in the cube are those encountered along the path. A variable appears complemented in the cube if and only if the path goes through an else arc coming out of a node labeled with that variable. 5.2.1 Finding One Satisfying Assignment This problem corresponds to finding a path with even complementation parity. We present in Figure 6 a recursive algorithm, though a non recursive algorithm is obviously possible. The algorithm actually computes a cube in the ON-set of the function. Procedure OneSat is called with three arguments. The first ( ) is the  top node of the BDD for which a satisfying assignment is sought. The second ( ) represents the complementation parity of the path. It is initially 1 if the outgoing arc of the function node is regular and 0 if it is complemented. Finally, the third argument ( ) is an array of cells ( being the number of variables), initialized to all don’t cares. The analysis of the performance of this algorithm is based on the observation that every internal node of the BDD has paths to the terminal node with both parities: Otherwise, the node would correspond to a constant function, which contradicts canonicity. Therefore, the procedure may only backtrack at the nodes for which the first child examined is the terminal node. Backtrack occurs in such a case if the parity is odd 4 Clearly, we assume that the BDD has been built already—a non negligible assumption. 11   OneSat   if ( is terminal node) return ;    ;   ) return 1; if (OneSat     ;     is complemented) complement ; if (  ;  return OneSat           Figure 6: Algorithm to find one satisfying assignment.   SatHowMany   if ( is terminal node) return ; if ( is in table) return result from table; ;  countT = SatHowMany  ;  countE = SatHowMany    is complemented) countE = if (  count = (countT countE)/2; insert ( , count) in table; return count;      countE;  Figure 7: Algorithm to compute the number of satisfying assignments.   ). However, in that case the procedure is guaranteed to succeed for the other child. Hence the total (  . 
number of nodes visited is at most and the complexity is   5.2.2 Counting the Number of Satisfying Assignments This problems amounts to computing the number of minterms in the ON-set of a function. The result   changes with the number of variables on which the function depends. For instance,     has one satisfying assignment, whereas          has two. Therefore, the computation of the number of satisfying assignments takes the number of variables as one of its arguments. The algorithm we outline here is based on the post-order traversal of the BDD. For every node (arc) we compute the number ( ) of satisfying assignments for the function represented by that node (arc). We assume that the number of variables is , the index of a function node is 0 and the index of the terminal   node is . If  is the terminal node . For a regular arc connecting node  to node , ,  whereas, for a complement arc, . Letting  and  be the two arcs emanating from an internal node  , we have:              !    This can be seen by noting that the two terms of the expansion with respect to   are disjoint and the cofactors do not depend on   . Figure 7 gives the pseudo-code of a procedure based on these remarks. The two parameters are the top node of the BDD and the number of variables. From the efficiency point of view,   time, whereas the table makes the complexity notice that without a table the procedure would take   12  , assuming additions and multiplications can be done in constant time.5 The table is not normally saved  across top level calls to this procedure. We need a top level procedure—not shown here—to perform the initialization of the table and also to take care of the possible complement attribute of the function pointer. 5.2.3 Finding One Minimum-Cost Satisfying Assignment  In this problem variable   has a non-negative cost  associated with it. The cost of an assignment is the sum of the costs of all the variables set to 1. For instance,      is a satisfying assignment for                  . Its cost is  . This assignment is not of minimum cost unless , because       is     also a satisfying assignment of cost .  When is represented by a BDD , the following result [40] allows the solution of the problem in time  linear in the size of .   arc is 0 and the length of a  arc out of a node labeled  Definition 4 The length of an  Theorem 4 Let a solution to  be a BDD for     .       is  . Then the minimum cost assignment to       that is      is given by the shortest path of even parity connecting the root of  to the terminal node. Proof. Every path of even parity from the root to the terminal node represents a set of satisfying assignments. By taking all the variables that do not appear along the path as 0, one obtains a minimal assignment corresponding to the path. For this assignment, the cost is equal to the length of the path. (If some costs are 0, there may be several minimal assignments.) Let  be a minimal assignment corresponding to a shortest path  of  . We prove that  is minimum. Suppose there is another assignment   such that cost(  ) cost(  ).       corresponds to a path  from the root of to the terminal node. Then cost(  )  length(  )  length( ) = cost( ), a contradiction.   Note that the length of the shortest path does not depend on the variable ordering chosen. Example 4 Consider the function                  The corresponding BDD for the ordering is shown in Figure 8. 
One can see that the path ,   , and (the path goes through the filled nodes) is the shortest path with even parity and that it identifies an assignment of cost 2 (assuming unit costs).  time. (See, Finding the shortest path in a DAG with bounded outdegree (2 in our case) can be done in  for instance, [20, p. 536].) The additional complication arising from the complement arcs does not increase the asymptotic complexity. 5 This assumption is not superfluous, if we consider the large numbers that may be involved. In practice, one would use floating point numbers or arbitrary precision integer arithmetic to prevent integer overflow. 64-bit floating-point arithmetic may give accuracy problems even for relatively few variables, and does not work for, say, 2000 variables. Arbitrary precision arithmetic violates the assumption that additions and multiplications can be done in constant time. Furthermore, storing an integer with hundreds of digits for each node may cause the computed table to grow too large. 13 f a b c d e 1 Figure 8: Example BDD for the minimum-cost assignment problem. 5.3 Cofactors, Function Composition, and Quantification The computation of cofactors is a key ingredient in many BDD-based applications. Other important operations on BDDs, like function composition, and universal and existential quantification, are defined in terms of cofactors. We shall first analyze the algorithms for cofactors with respect to single variables and with respect to cubes in Section 5.3.1. 5.3.1 Cofactors with Respect to Single Variables and Cubes We know how to efficiently compute the cofactors of a BDD with respect to its top variable—they are just the two children of the top node—and with respect to a variable that does not appear in the BDD. In this last case the cofactors coincide with the function. We use this knowledge to solve the general case. To this end,  suppose we want to compute —the cofactor of with respect to —where is an arbitrary function and   is a cube. Let be the index of the top variable of and  be the index of the top variable of . We shall    also write  , where  is either  or   and is a cube, possibly the function 1. We first notice that  if or is constant, then . This provides the termination condition. Assuming and not constant, we consider the three following cases.                  . In this case , and as a consequence, do not depend on the BDD from the root to the leaf.) Hence,         . (We assume that the indices grow in    In practice, we just have to move down along the BDD for and recur. . In this case we write: Hence, we cofactor          with respect to  , we move down along , and recur. 14  . In this case we use the commutativity of cofactors and write:                Hence, we need to recursively compute the cofactors of  them as then and else children to a node labeled  .   and   with respect to  and connect 5.3.2 Function Composition In function composition, we want to compute:                      . From the expansion theorem, where  is also a function of      we derive by substitution:     This implies that we can compose computing                    and  by finding the cofactors of ite  (3)   with respect to and   and then         We have seen the general method to compute the cofactors in Section 5.3.1. Here we present a more efficient procedure for function composition that only requires the computation of the cofactors with respect to the top variable of a BDD and its complement. 
As we saw, these cofactors are simply given by the two children of the top node. The algorithm for this case is indeed reminiscent of that for cofactors with respect to cubes, which we saw in Section 5.3.1. Let be the index of the top variable of . We have three mutually exclusive cases.   . In this case,  . In this case,   does not depend on   is the top variable of   and the result of the composition is simply .  and we can apply Equation 3 directly.   . In this case, we find the variable of least index between the top variable of   . Let   be that variable: We expand with respect to                                           and recur:              and the top variable of                                (Note that we used the commutativity of cofactors here.) 5.3.3 Quantification Given a boolean function respect to   as:        and the universal quantification of   of  variables, we define the existential quantification of        with respect to  as:           15  with Example 5 As an example, let us consider the following function: Suppose we want to quantify    If  and  to verify that            Hence,   in . The two cofactors are:                and   and          are considered as defined over the same -dimensional boolean space as , then it is easy             More specifically,  is the smallest function independent of function independent of   that is contained in . Quantification is monotonic, i.e.,               and   is the largest    that contains   This can be easily seen by writing:     whence:          and               from which the stated inequalities can be derived. Quantifications of the same type commute, i.e.,  and                                 The existential and universal quantifications with respect to a cube are therefore well-defined. However, quantifications of different types do not commute. Existential quantification distributes over disjunction and universal quantification distributes over conjunction, but not vice versa, unless and  have disjoint support.                                                        These properties can be easily verified by substituting the definitions. One can also easily verify the following properties, that are useful when quantifications are performed on BDDs with complement arcs.                  A recursive algorithm for the quantification of the variables of a cube in a function for the computation of cofactors. 16  is derived as the one  Example 6 We compute      , for             using the variable order                                                                              .     Existential quantification is very easy when the function is given in sum of product form. It is indeed  sufficient to suppress the quantified variables from each product term. If all variables are suppressed from a    term, the term becomes the constant 1. In the previous example,         . Likewise, universal quantification is easy when is given in product of sum form. By contrast, quantification of either type may increase the size of a BDD.   5.4 Cofactors with Respect to Functions    A set    of -variable boolean functions is an orthonormal set if it satisfies:       and       A common example of orthonormal set is      : It is used in Boole’s expansion theorem. Also, for every  function       , the set     is orthonormal.   Given an orthonormal set    , we can expand a function with respect to it in the form                The coefficients of the expansion are related to   Theorem 5 Let                    . Then Proof. 
Suppose     Conversely, if                  , then                                                 17  by the following result. an orthonormal set of -variable functions and if and only if    and          an -variable function. Then  This proves the theorem. It should be noted that this theorem does not define uniquely the coefficients of the expansion, but imposes restrictions on what they can be. In other words, the coefficients are incompletely specified. It is possible to prove that every function in the interval          satisfies the condition of the theorem for  . This can also be interpreted by saying that  has to agree with wherever  is 1 and is don’t care otherwise. We shall see in the rest of this section how to exploit this freedom to obtain compact representations of the coefficients.   5.4.1 The Constrain Operator We can now define the cofactors of a function with respect to another function, in terms of orthonormal expansions. Definition 5 Let  and  be two boolean functions and let             be an expansion of with respect to the orthonormal set  generalized cofactor of with respect to  .    . Then  (   ) is a positive (negative) The generalized cofactors are incompletely specified by their defining equality. Hence, the problem arises of finding the best cofactors according to some criterion. A common criterion is the size of the representations, that is, when BDDs are used, the number of nodes. . The problem is therefore to choose the values of As we saw, the values of  are fixed wherever    where  . Coudert et al. [21] introduced an operator on BDDs, called constrain that normally returns compact cofactors. The strategy of constrain is to map each minterm in the offset of  into a minterm of the  onset of  and use this map to define the values of  wherever  . The map depends on  and on the following definition of distance.    Definition 6 Let      be variables with ordering  Let        is given by:    and       be two minterms. The distance between  and , written            ,  This definition of distance reflects the dissimilarity of the paths associated to the two minterms in the BDD with ordering      . We are now ready to define the constrain operator. Definition 7 For functions   constrained by  , written     if     if  and  , the function  where is the minterm such that      and    is minimum.  18  is defined by Notice that implies  . Therefore, is uniquely identified in Definition 6. The  algorithm for constrain visits recursively the BDDs for and  . The recursion is based on the following theorem:   Theorem 6 Let ,  , and   be boolean functions. Let          Then:       and                               Proof. We need to prove that  and   satisfy Theorem 5. We consider  , since the case for   is similar.                                                                         We choose  as the variable with the lowest index among those appearing in and  , so that we can apply cofactoring easily. Note that the straightforward expansion yields an expression that is unsuitable for the recursive formulation:               , because we are dealing with incompletely specified functions. We cannot simply say that   The recursion has several termination cases. If is constant, . If   we return 1 and if   we return 0. These are correct solutions in general and will be seen to be the only ones satisfying Definition 7. If  , then the only solution is  . If  and we are at the top-level of the recursion, any solution is acceptable. 
We choose to return 0, for reasons that will become apparent later. We  take care of avoiding recursive calls with  as follows.   Whenever  or   , we have identified a group of minterms of that need to be mapped.    first and then perform the mapping by taking . We compute We consider the case for      . Since the two cofactors are the same, we don’t create a new node and just return    .      Similarly, if  we return    .                Example 7 An example of application of constrain is illustrated in Figure 9.                               This example demonstrates some of the special cases. In particular,    the other branch. 19  and we return the result from f g h x p y r q t s z 1 Figure 9: BDDs for Example 7. f g h t p a b q c r d u s 1 Figure 10: BDDs for Example 8. 20 c x y z 1 Figure 11: BDD for Example 8 For another example, let us consider        . and  of Figure 10. We have:                   It should be clear that the choices we have made in our algorithm are aimed at producing a result as small as possible. It remains to be proved that the algorithm we have outlined actually conforms to Definition 7. We shall outline a possible proof. We consider first the case when cause  . All minterms for which  to be 1 as well. Hence, minterm of Definition 7 must cause to be 1. This is why we return 1. The case for   is similar.   Suppose next that   at some point of the recursion. Let be the cube of the variables on which whose distance from the splitting has occurred up to that point. We have to find the minterms with    minterms in the cube    is minimum. We first notice that these minterms must be in   . This is because the weights in the definition of distance decrease exponentially from the root to the leaf. Hence, there is   a minterm in   closer to each minterm in    than every other minterm in the rest of the BDD. Take        . If     is such that  , then  . Otherwise, will be the minterm such that  closest to  . But this is exactly what our algorithm does.  Figure 11 shows a BDD for the cube    . It illustrates the typical structure of the BDD of any cube: It is a single string of nodes. All internal nodes except the one at the bottom have one arc pointing to 0 (pointing to 1 with odd parity). Because of this, the constrain algorithm always returns the result of one branch and it ends up computing exactly the ordinary cofactor. The effects of the use of the computed table on the complexity of constrain are the same as for AND. We complete our discussion of constrain by discussing a problem it has and a possible optimization.  has fewer nodes than the BDD for , sometimes the reverse is true. Though in general the BDD for This most often occurs when the BDD for  is large and depends on many variables does not depend on.  , causing an undesirable growth of the BDD. This inconvenience These variables may be introduced in can sometimes be avoided in a rather inexpensive way by existentially quantifying the undesired variables from  (we shall see how in Section 5.4.2), but the resulting operator loses one property that is important            21     restrict   if ( or is constant) return ;  if ( ) return 1;  if (  ) return 0;  if ( ) return 0;   let  be the top variable of  ;   if ( ) return restrict  ;   ; if (  ) return restrict  if ( is not the top variable of )  return restrict ite  ;   restrict   return ite  restrict   ; Figure 12: Procedure restrict (without computed table). for FSM verification, namely, being an image restrictor. 
For this reason, the ‘un-optimized version’ is also of interest and we have presented it here. 5.4.2 The Restrict Operator  In Section 5.4.1 we noted that the constrain operator may introduce variables that do not appear in into  . We now see how we can use existential quantification to solve the problem. We need the following lemma.    Lemma 1 Let ,  , and  be -variable boolean functions and suppose          . Then  In other words, a cofactor computed with respect to  is also a cofactor with respect to  .  Proof. It is sufficient to observe that a function must agree with , whenever  with respect to  . This is certainly the case of , since  .    , in order to be a cofactor    . Hence, we can existentially Lemma 1 applies in particular to the constrain operator and to  quantify all the variables in  that do not appear in before computing  . Let  be the result of these  is a valid cofactor of quantifications. Then with respect to  . It is also possible to integrate the quantification into the computation of constrain. The resulting algorithm is called restrict in [22] and indicated by  . Whenever the top variable of  has a lower index than the top variable of , the procedure returns              The pseudo-code for restrict is given in Figure 12. Note that the call ite   quantifies  in .  Also, some implementations prefer to return a failure value if . This may help catch bugs in the program, but may sometimes be restrictive. Notice that restrict may actually produce smaller BDDs than the approach previously described (quantification followed by constrain). It is indeed possible to quantify variables, which only some cofactors of are independent of, from the corresponding cofactors of  . On the other hand, quantifications are performed one at the time and performance may be substantially slower, if constrain uses a cache table and restrict does not.  22 Example 9 We return now to the example of Figure 9 and we compute                                              .    One can see that in this case  has fewer nodes than  , because we avoided introducing a node  labeled in the positive cofactor with respect to  . In general, however, it may still be the case that has more nodes than . In that case, one may decide to return as the result of cofactoring.     5.4.3 Properties of Generalized Cofactors In the following we discuss properties of generalized cofactors that are useful in sequential verification. We use to indicate a generic operator.        and  be an operator satisfying: Lemma 2 Let     . Let    and                  (i.e., Proof. From operators are monotonic, we have:           Then:                        , and              is a generalized cofactor) we get    be boolean functions,  (4)       The reverse inequality follows directly from (4).   . Since the quantification  In this lemma, is a generalized cofactor operator that satisfies (4). We are obviously interested in seeing if the lemma applies to the cofactoring operators that we use. We recall that the constrain operator is defined in terms of a mapping: if          if         where is the minterm such that  rule. Then:  and  is minimum. Let  be the mapping defined by this            and   . In words,  does not depend on some of the variables in the support Let   of  . We denote by the restriction of the mapping to the variables in  . It is easily seen that:          and     must be minimal. With these preliminaries, we can now because the distance between        prove the following result.      
23    by the definitions of        and          Proof. Suppose    Theorem 7 Let     . Then:         be boolean functions,               , and         and        and of   . Hence     is the  sought in (4) and Lemma 2 applies. . Then:        We can use the fact that restrict is related to constrain to prove a result similar to Theorem 7 for the former. Specifically,          Where  is a function obtained by quantifying variables in some cofactors of  . From the definition of restrict,    , and the support of  is a subset of that of  . The following property of  is of particular interest.   Lemma 3 Let     ,    If        and    . Let     , then there exists    be boolean functions, be such that:                 Proof. Let   and         . Let     if if   such that:       Since  is obtained by existential quantification of some variables in some cofactors of  ,     implies that there is a subset of variables that were quantified along the path that includes . Let this subset     be       , let       , and let      . Notice that . If we cofactor  and  with respect to , the resulting functions,  are constants. For , this derives from the requirement that the quantification is performed only if the current cofactor of does not depend on the current variable. For  , it follows from the quantifications performed on it. Furthermore,  must be 1 for at least one minterm     and    would be 0. Hence, there is a minterm of , or   such that .                 Lemma 3 allows us to prove the analog of Theorem 7 for restrict.   Theorem 8 Let     . Then:      and     Proof. Let    be the function such that                           be boolean functions,                   Then, with argument similar to the proof of Theorem 7,   and 24               . Then,         , and Let that      . If          and      , we are done. Suppose   . Then, by Lemma 3, there exists  . Hence is the  sought in (4) and Lemma 2 applies.      such Notice that it is possible to build generalized cofactors that do not satisfy (4). Each operator has a distinct advantage over the other. On the one hand, constrain distributes over product, as shown by the following result [59], where denotes composition. Lemma 4 Let      ,   , and            be boolean functions. Then:     Proof.                             In particular, by taking     one obtains distributivity with respect to product and can write the following equation that we shall use in Section 8.3.                                              (5) Unfortunately, restrict does not enjoy the same property, due to the different mappings applied to each   For instance, for    ,     , and       , we have       , but           . Therefore we can write:                                                       .                  (6)       (7) However, we cannot simultaneously apply restrict to the terms of the product and eliminate it, as we do with constrain. On the other hand, constrain may introduce in  variables that are in  , but not in . This is likely to occur when  is a complicated expression. This is a problem, especially for the method of Section 8.5.1, that restrict does not have.    5.5 Minimization The constrain and restrict operators described the previous sections are designed to produce small resulting BDDs. For this reason they have been extensively used for the purpose of BDD minimization. Suppose we 25      and the BDD of are given an interval   and that we are asked for a function such that     , the care set of , and then  is as small as possible. 
The problem can be solved by computing computing itself as  or as  . Any operator that computes  could be used instead of constrain or restrict, and any function from   can be used instead of  .6 Normally restrict performs better than constrain in this capacity because it does not pollute the support of the result, but there are exceptions. Besides, neither constrain nor restrict guarantees that the BDD for is smaller than the BDD for  . Indeed, when the BDD for is considerably larger than the BDD for  , it is often the case that the “minimized” BDD is larger than the original one. In the terminology of [33], these  operators are not safe. The lack of safeness is easily explained. Suppose that in computing    a node  of  is visited twice—first in conjunction with node  of , and then in conjunction with node of . Since the constraints are different in the two recursive calls originating at   , different results may be returned. The result may therefore contain more than one node corresponding to   . The compaction algorithm of [33] avoids this problem by performing minimization in two steps. In the first step the BDDs are analyzed and the cases in which duplication may not occur are identified. In the second step, safe minimization is carried out according to the information gathered in the first step. The compaction algorithm does produce on average better results than either constrain or restrict, but it is considerably slower. Another approach to improving BDD minimization relies on extending the remapping mechanism of constrain. Consider the computation of             . Simplification of the result   occurs when either   or    . In such cases the result is either   or  . One can do better by  observing that if         then there are solutions  independent of  for  . One such solution    . The resulting procedure is more powerful is given by minimizing         with the care set   than restrict, but much more expensive. Approaches intermediate in power are often preferable. Details and assessment can be found in [58]. Our discussion has been in terms of a given function to be minimized, and a care set, so that we could easily relate minimization to constrain and restrict; however, minimization algorithms can be directly formulated in terms of the interval   .                      5.6 Approximation BDD approximation is the problem of deriving from a given BDD another BDD smaller in size, and whose function is at a low Hamming distance from the input BDD. (That is, differing from the input BDD in a  be the BDD produced by the application of approximation small number of input assignments.) Let algorithm to the BDD of . Usually, the function of the approximating BDD is required to be either a subset or a superset of the input function, that is,      or            For an underapproximation algorithm (such that ),     . Hence, we only discuss underapproximation. Underapproximation algorithms must trade off the size of the result for the distance from . The two trivial solutions, 0 and itself, are seldom useful. A natural way to rank different approximations is by their   and defined in [53] by: density, denoted by           (  is the number of minterms of  .) High density corresponds to a concise representation, and is therefore desirable. (For overapproximation, we want to maximize the density of the complement of the result.) 6  In the case of constrain, replacing with any function in   does not affect the result. 
26 The decision version of the underapproximation problem can be stated as follow: Given the BDD of a of the boolean function of variables for variable order , and integers and , find a function  same variables that is true for at least assignments to the variables, and such that its BDD for variable order has at most nodes. This problem is conjectured to be NP-complete. Hence, heuristic techniques are used to solve it. The algorithms proposed so far work by redirecting some arcs of the BDD for is such a way that the resulting BDD loses nodes and does not acquire any new satisfying assignment. The simplest approach redirects arcs from internal nodes to the constant 0 node. A more sophisticated approach allows redirection of an arc from one internal node to another, subject to a containment check. As a result of these transformations, some internal nodes of the BDD become unreachable from the root are are dropped. The various algorithms also differ in the order in which they select the arcs to be redirected. All algorithms examine the BDD for before they compute the approximation to gather information on the distribution of nodes and satisfying assignment in the graph. In this Linear time algorithms like Heavy-Branch Subsetting and Short-Path Subsetting [53], the approximation then proceeds on the basis of this information. Quadratic algorithms [57, 52] update the information during the approximation, and produce denser results  and a minimization algorithm at the cost of higher runtimes. Finally, given an approximation algorithm      , it is easy to see that is another approximation algorithm.              5.7 Decomposition  It is often convenient to express a function as either the conjunction or the disjunction of other functions.   If     , then   . Hence, we can concentrate on disjunctive decomposition without loss of generality. A simple approach to this problem is based on the expansion theorem, which can be applied to    obtain , with      and     . The choice of the splitting variable   influences the quality of the result. A simple yet effective procedure to estimate the size of the decomposition was proposed in [15]. Another approach to decomposition is based on the recursive application of the following formula:                        This decomposition is shown to be canonical for a given variable order in [45], which also discusses several interesting properties of constrain and of the decomposition itself. 6 Variable Ordering The size of the BDD for a given function depends on the order chosen for the variables. There are functions—sums of products where each product has support disjoint from the others—for which some orders lead to BDDs that are linear in the number of variables, whereas other orders give numbers of nodes that are exponential in the number of variables. The same occurs for the BDDs of adders. At the other end of the spectrum there are functions for which all possible BDDs are provably exponential in size and functions whose BDDs are linear for all orderings. Though clearly one would not want to choose a worst case ordering for an adder, the importance of the ordering problem cannot be argued from the existence of functions exhibiting such an extreme behavior; equally well, the ordering problem cannot be dismissed simply because there are functions that are insensitive to the ordering. Indeed, even simple heuristics easily avoid the worst case behavior for adders. 
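The order sensitivity of sums of products with disjoint supports, mentioned above, can be reproduced with the toy sketches of the previous sections (the specific function and all names are illustrative). The same function, a disjunction of three two-variable products, is built once with the variables of each product adjacent in the order and once with them spread apart.

def disjoint_sum(bdd, pairs):
    """BDD of the disjunction of the products x_i * x_j over the given (i, j) pairs."""
    f = 0
    for i, j in pairs:
        prod = bdd_and(bdd, bdd.var(i), bdd.var(j))
        f = bdd_ite(bdd, f, 1, prod)              # f OR prod
    return f

good = Bdd()
bad = Bdd()
# the two variables of each product adjacent in the order ...
g = disjoint_sum(good, [(0, 1), (2, 3), (4, 5)])
# ... versus separated by the variables of the other products
h = disjoint_sum(bad, [(0, 3), (1, 4), (2, 5)])
print(bdd_size(good, g), bdd_size(bad, h))        # 6 versus 14 internal nodes

With n products the adjacent order yields 2n internal nodes, while the spread-out order yields 2^(n+1) - 2, so the gap grows exponentially with the number of products.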
The practical relevance of the variable ordering problem rests on the existence of functions lacking a well 27 understood structure, for which different orders—all derived by plausible methods—give widely different results. In Section 6.1 we consider the problem of finding an optimum order. The exact algorithms can be used for the design of various types of CMOS gates, where an order of the input variables that minimizes the transistor count is sought and the number of variables is small. The complexity of the problem is such that the exact algorithms cannot be applied to medium and large-size functions. However, their analysis will illustrate some features of BDDs with respect to variable ordering and will suggest techniques that allow one to iteratively improve an existing BDD by changing the variable order. These techniques will be the subject of Section 6.2. Additional insight is provided by some results—some of general validity and some applicable to special classes of functions—that characterize optimal orders and that will be presented in Section 6.5. Finally, Section 6.6 will cover heuristic algorithms that are used to build a BDD starting from a circuit description. 6.1 Exact Ordering Algorithms The simplest exact method to find an optimum variable order is to try all possible orders. For variables there are different orders and the cost of building a BDD is exponential in in the worst case. Hence, a   brute-force approach requires  time. This time can be improved by observing that two permutations of the variables that have a suffix in common will generate two BDDs with identical lower parts.    Suppose is a boolean function of variables     . Let be the set of the variable indices. An order of     is a permutation of . The BDD for under order is denoted  . Let be a subset of and  the set of orders whose last   members belong to , by BDD that is,                         . The variables We shall refer to the variables whose indices are in as to the bottom variables of whose indices are in will be the top variables. Also, for an order and a variable  , we define  as the number of nodes labeled   in BDD  . The following lemma [26] expresses the fact that  given a horizontal cut in a BDD for that separates top variables from bottom variables, the order of the top variables does not influence the part of the BDD below the cut. (Likewise, the order of the bottom variables does not affect the part of the BDD above the cut.)         Lemma 5 Let that for each   ,      satisfying   and   a variable such that   . we have      . Then there is a constant such  Proof. Note that is the position in the order corresponding to the first bottom variable. We can cofactor with respect to any combination of values (0 and 1) assigned to the top variables. Since cofactors with respect to variables commute, the resulting function is independent of the order of the top variables and  . Repeating this argument for all possible assignments to the top is therefore the same for all  variables, we see that for each unique cofactor of with respect to an assignment to the top variables there must be a node (possibly the constant node) in the lower part of the BDD that is the root for that cofactor or its complement. This node is unique, because the BDD is canonical. The number of unique cofactors does not depend on the order of the top and the bottom variables, but only on and . Of the unique cofactors, some will depend on   . 
Again, the number of those that depend on   only depends on and , but not on the relative order of the bottom variables. For every cofactor that depends on   there will be exactly one node labeled   , because   is the first of the bottom variables, so that it must label the root node if it appears at all. Hence, the number of unique cofactors with respect to all different assignments to the top variables     28  that depend on     gives the constant and the theorem is proved. Lemma 5 has many applications in the study of the size of BDDs. We now see how it is applied to the optimum ordering problem. If the optimal sizes of the lower parts of the BDD for all possible choices of of size are known, we can derive the optimal size of the lower part of the BDD for a given of size as follows [32]. We consider each variable whose index is in as first bottom variable. For   as first   . The minimum size for the lower part of the BDD with  first bottom variable, we consider  bottom variable is the minimum size previously computed for  plus the number of nodes labeled   . Note that Lemma 5 guarantees that this number does not depend on the specific order of the variables in  and hence it is sufficient to save the size of the best order of  . (And the order itself, of course.) The algorithm proceeds then by progressively increasing . The size of the lower part of the BDD for  is known to be 1 (the constant node). The algorithm then works its way up, until the best overall order is found. It remains to discuss how the numbers of nodes labeled with the first bottom variable are computed. One method is to build a BDD for for any order that is optimal for  and that has   as first bottom variable. The number of nodes labeled by   can then be counted by inspecting the BDD. Computing the BDDs for all from scratch and storing them would be wasteful; hence we keep just one BDD and obtain all the desired orders by permuting the variables in that BDD. In general, any permutation can be decomposed in a sequence of pairwise interchanges of variables. However, for our algorithm, all we need is to swap two variables at the time. We start from an arbitrary   and by pairwise interchange we obtain the remaining order  . We build BDD BDDs each such that having a different variable at the bottom. In this way, we have taken care of all the         . We then consider all the subsets of size 2. Let          . We locate the two BDDs  with   and  as bottom variables (they correspond to     and     , respectively). If   is the bottom variable and   is already next to last, it is sufficient to count the nodes labeled   . However, if   is higher in the BDD and the next to last variable is  , we swap  and  before counting. The process is repeated with  as bottom variable and the best result is kept as representative of . The other BDD is freed.  For   we need to consider   subsets. For each of them, we have to perform up to swaps.   Example 10 As an example we consider finding an optimal order for    . The purpose of this example is rather to show how the algorithm works than to illustrate its efficiency. For the latter, larger functions are required.7 Suppose the initial order is      . The corresponding BDD is  of    . The Figure 13 and it represents both and . We represent this by writing      algorithm first creates the BDDs with ( ) and ( ) by swapping  with  and   with  in . This completes all the subsets of size 1. The costs are all the same, namely 2. 
(The constant node plus the one node labeled with the variable whose index is in .) Suppose the first subset  of size 2 that is considered is  . Also, suppose that  is the first variable considered as first bottom variable. The BDD  already has  right on top of   . Hence, it is sufficient to count the number    of nodes labeled  . Since there is only one, the minimum cost for is if  is on top. Next the algorithm tries   on top. To this end, it takes and swaps   and  . The resulting BDD is . The cost associated with this choice is again 3. Since this does not improve on the previous result and there are no other variables to try, the algorithm selects  and releases .                    7 Note that   , but  !  for "!# . 29             f1 f2 f3 x1 x1 x3 x3 x2 x3 x2 x2 x1 f4 f5 f6 x3 x2 x2 x3 x1 x2 x3 x3 x1 x1 x3 Figure 13: BDDs created by the exact ordering algorithm. The arcs with an open circle are regular else arcs. The arcs with a black dot are complement arcs. Dangling arcs point to the constant 1 node. 30      With similar procedure, the BDD for is found to be the best between and  . The latter  and is released. Finally, is found to be  . Having completed the is chosen, so that  generation of the BDDs for subsets of size 2, the BDDs for the subsets of size 1 that are not also used by a larger subset are released. This is the case of in our example. All the costs are again the same, namely 3. The last step involves no swaps, since the only variable not in  must be at the top. In this example there is a three-way tie that is broken in favor of the first solution found.           There are several observations that may speed up the algorithm, sometimes dramatically. First of all, if we ever get a BDD where there is exactly one node for each variable, we can immediately stop, because the order is optimal. In our example, the initial order was optimal; hence, we would have avoided the entire computation. As a second remark, some functions are symmetric in some of their variables. In our example, was symmetric in   and  . The presence of symmetric variables results in ties between BDDs where two or more symmetric variables are permuted. If some variables are known to be symmetric, some swaps may be avoided [35]. Finally, since the algorithm implicitly enumerates the variable orders, it is natural to think of a branch and bound approach [34]. The upper bound to the optimal size is given at any point in time by the size of the can smallest BDD (not just the size of its lower part) representing a subset . The lower bound for be computed as the cost associated to (the size of the lower part of the associated BDD) plus an estimate of the minimum number of nodes required to complete the BDD. If the number of nodes labeled with the nodes are required to produce a connected DAG with a single first bottom variable is , then at least root. Indeed with nodes it is possible to build a binary tree with leaves. Therefore, a lower bound to the cost of an order obtained from is the cost associated to plus . If this exceeds the current upper bound, and its associated BDD are dropped. It is also possible to process the BDD top-down instead of bottom up. In many cases this allows a more effective bounding [24]. The effectiveness of bounding also depends on the initial order. The better the order, the tighter the upper bound. How to determine a good initial order will be covered in Section 6.6.    
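The size measure that drives these exact algorithms can be made concrete with a small illustrative sketch. The Python fragment below is not the dynamic-programming algorithm just described: it counts the internal nodes of the BDD of a function under a given order by counting distinct subfunctions (no complement arcs), and finds an optimum order of a small function by plain enumeration of all n! orders.

from itertools import permutations

def bdd_size(fn, n, order):
    """Internal nodes of the BDD of fn (a function of a list of n bits) under 'order'."""
    # Truth table with order[0] as the most significant index bit.
    rows = []
    for i in range(2 ** n):
        vals = [0] * n
        for k in range(n):
            vals[order[k]] = (i >> (n - 1 - k)) & 1
        rows.append(fn(vals))
    level = {tuple(rows)}                 # distinct subfunctions at the current level
    size = 0
    for k in range(n):
        half = len(next(iter(level))) // 2
        next_level = set()
        for t in level:
            lo, hi = t[:half], t[half:]
            if lo != hi:                  # the subfunction depends on order[k]
                size += 1
            next_level.add(lo)
            next_level.add(hi)
        level = next_level
    return size

def best_order(fn, n):
    return min(permutations(range(n)), key=lambda p: bdd_size(fn, n, p))

# f = x0*x1 + x2*x3: a non-interleaved order needs one node per variable,
# while an interleaved order needs more.
f = lambda v: (v[0] & v[1]) | (v[2] & v[3])
print(bdd_size(f, 4, (0, 1, 2, 3)), bdd_size(f, 4, (0, 2, 1, 3)))   # 4 6
print(best_order(f, 4))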
6.2 Iterative Improvement Techniques

The exact algorithm of Section 6.1 implicitly explores all variable orders by swapping variables in the BDDs. This idea can be applied as a heuristic by limiting the number of permutations that are explored. This results in a local search scheme, whereby an initial order is repeatedly perturbed by rearranging groups of variables until some termination condition is met. Three aspects characterize the possible variants of this method: the rule for generating a new order from the given one; the criterion for accepting a new order; and the termination condition. Among the rules proposed for generating a new order (these rules define the neighborhood of an order) we find:

Pairwise interchange of adjacent variables [28];

All permutations of groups of k adjacent variables, usually for a small value of k [34];

Pairwise interchange of arbitrary variables;

Sifting, that is, the movement of each variable up and down in the order, until it reaches a local minimum [54];

Group sifting, that is, the movement of a group of variables up and down in the order, until it reaches a local minimum [50, 49].

All strategies can achieve any permutation of an initial order given enough time. The advantage of considering only adjacent variables lies in the lower cost of performing the swap, which boils down to local manipulation of the BDDs. The interchange of non-adjacent variables can be effected by a series of swaps of adjacent variables. Sifting normally obtains better results than methods based on pairwise interchange or on the permutations of fixed-size groups, because it performs a more systematic exploration of the search space. Even better results are obtained by group sifting. As for the acceptance and termination criteria, all known combinatorial search strategies can be applied, going from greedy to simulated annealing [4] and genetic algorithms [23]. In selecting a time-consuming strategy, one has to consider carefully whether the time spent in optimizing the size of the BDDs will be recovered in the successive manipulations.

6.3 Swapping Variables in Place

The iterative improvement methods of Section 6.2 are all based on successive swaps of adjacent variables. Each swap can be performed in time that is proportional to the number of nodes labeled by the two variables being swapped, if the data structures are properly chosen. Specifically, if the unique table is organized as a set of subtables (one for each variable), it is possible to access all the nodes labeled by a variable in time that does not depend on the total size of the BDDs being reordered, but only on the size of the subtable and the number of nodes stored in it. To make the swap time independent of the total size of the BDDs it is also necessary to perform the swapping "in place," that is, by modifying the given BDDs, instead of creating a modified copy of them. This is possible thanks to Lemma 5, which guarantees that the parts of the BDDs above and below the variables being swapped are unaffected by the swap.

Suppose variables x_i and x_j must be swapped, and suppose x_i precedes x_j in the order. The subtable corresponding to x_i is scanned and all nodes are considered in turn. Let f be the function associated with one such node; writing f_ab for the cofactor of f with respect to x_i = a and x_j = b, it can be written as:

    f = x_i·x_j·f_11 + x_i·¬x_j·f_10 + ¬x_i·x_j·f_01 + ¬x_i·¬x_j·f_00 .

Then swapping x_i and x_j corresponds to rearranging the terms of f:

    f = x_j·x_i·f_11 + x_j·¬x_i·f_01 + ¬x_j·x_i·f_10 + ¬x_j·¬x_i·f_00 .

Suppose, for simplicity, that f_11, f_10, f_01, and f_00 correspond to four distinct nodes.
Then the node labeled  can be relabeled ,  ,  ,   , and  can be reutilized without changes, while new nodes need to be created for   and   . Reutilization of the node labeled  is very important, because this node will be pointed in general by other nodes, or by the application. If a new node were used, the pointers to the old node would have to be searched for and changed. Thus, the operation would not be local to the subtables for  and . The nodes for  and  are no longer pointed by the node for after the swap. Once the table for  has been scanned entirely, all nodes in the table for that are no longer needed are freed. Such nodes are identified by associating a reference count to them. It is possible for some nodes in the table to retain        32 a non-zero reference count at the end of the process. These nodes have pointers from above  : They are moved to the table that now contains the nodes labeled . We now remove the assumption that  ,  ,   , and  correspond to distinct nodes. If neither child of the node for is labeled , then the node is moved to the other subtable; otherwise swapping proceeds   , no node for   will be created. If in addition, as described above. However, if, for instance,    , and    , then the swap will cause the node count to decrease by one. Each elementary swap operation may increase, decrease, or leave unchanged the node count. In the example just discussed, the decrease in node count is due to the elimination of a node with identical children. It is equally possible for a node to be saved as a result of increased sharing. Consider the two functions  and   . With the order   , they they do no share any node, whereas with the order share the one node labeled  . Hence, the total size decreases, even though the size of each function does not change.               6.4 Dynamic Reordering Techniques that iteratively improve variable orders are frequently employed “dynamically:” if the size of the BDDs grows past a given threshold during the execution of an operation, the operation itself is suspended and reordering is performed, usually by some variant of sifting. After reordering the operation that was interrupted is usually restarted. This dynamic approach has proved very effective in many applications, because it allows reordering to intervene when it is actually needed. On the other hand, in sequential verification it is not uncommon that most of the time is taken by reordering, and it appears that a certain amount of control on the process by the application is beneficial [38]. Writing the manipulation functions in such a way that they can be safely interrupted by reordering requires some care. The details depend on the implementation of the BDD package. 6.5 Some Results on Optimal Ordering When generating BDDs for the outputs of a circuit, we may use information on the structure of the circuit to find a good, sometimes optimal order for the variables. Consider the case of sums of products where each product term has support disjoint from the other terms. For instance:             For these functions every order that does not interleave the variables of two different product terms is optimal because there is exactly one node per variable. We shall refer to any such order as to a non-interleaved order. The reason why non-interleaved orders give linear-size BDDs for these sums of products is that the amount of information that needs to be remembered by a sequential processor that processes the inputs in the noninterleaved order is bound by a constant. 
This argument can be generalized by introducing the notion of width of a circuit. We regard a combina   . The nodes of the graph correspond to the gates of the circuit and the tional circuit as a DAG edges to the interconnections. A topological order  of  is a partial order such that:     $     There are many such orders in general. The width of  !           33       at level  of order  is defined as:            z1 z . . . . q y1 . . . ys . . . . x1 z q+1 z m . . . . . . xp xp+1 xn Figure 14: Illustration of the concept of width of a circuit. that is, the number of edges connecting nodes at level less than  to nodes at levels  and above. The width of with respect to  is:  and finally the width of is given by:    !      !   !     !   Through this definition, the width of a circuit is related to the amount of information that crosses a section of the circuit. The amount of information is related, in turn, to the number of wires that carry it. This is illustrated in Figure 14. Every level partitions the circuit in two. On the left in Figure 14 are all the gates and primary inputs with levels less than  . On the right are the gates and primary inputs with levels greater than or equal to  . Shown between the two sub-circuits are the connections whose number ( ) gives the width of the circuit at level  , for the chosen topological order.        Example 11 A simple circuit is shown in Figure 15. One can see that ! . ! ! However, it is possible to change the topological order so that the width is 4 and by decomposing the three input OR gate into two two-input OR gates, the width can be further reduced to 3. This example illustrates that the width depends on the circuit representation as well as on the topological order. A variable order consistent with this topological order is       It is intuitive that circuits with low width have small BDDs. With reference to Figure 14, suppose the    variables are ordered from   to  . Taking all cofactors of the function of the circuit with respect to  all input variables at levels  or higher (     ) must result in unique cofactors at most. Hence the  number of nodes labeled by  is less than or equal to . Let be a (total) order of the variables compatible  —the BDD with a topological order  8 . It is possible to prove a result [3] linking the size of    . for built with order —to !    Theorem 9 Let    be the DAG of a circuit with input variables,       , and  is a topological order of  , and is a variable order compatible with  , then:  8         Note that a topological order specifies a level for all input variables. 34       outputs. If a b 0 c 0 d 1 1 2 3 1 e F 2 1 4 2 f 2 g r=1 r=2 3 r=3 r=4 Figure 15: An example of a circuit with a topological order and the widths at the different levels. Notice that even with the best topological order the bound may be loose, especially for multiple-output functions, where no sharing among the different outputs is assumed. The interest of Theorem 9 is not in the ability to establish a tight bound for the size of a specific BDD. It can be used to prove that a family of circuits has polynomial-sized BDDs, if it is possible to prove an upper bound on the width of the circuits in the set9 . Non-interleaved orders are optimal for fanout-free circuits. The graph of a fanout-free circuit is a rooted tree. A non-interleaved order is obtained, for instance, by drawing the tree without intersecting arcs and then numbering the leaves—each corresponding to a variable—from left to right. 
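The width just defined is easy to compute for a given level assignment. The sketch below uses a hypothetical netlist representation: a list of wires (u, v) and a map from gates and primary inputs to their levels; the width at level r counts the wires crossing the cut between levels r-1 and r.

def width_profile(edges, level):
    """Width at each level r: wires (u, v) with level[u] < r <= level[v]."""
    levels = sorted(set(level.values()))
    return {r: sum(1 for u, v in edges if level[u] < r <= level[v]) for r in levels[1:]}

def width(edges, level):
    profile = width_profile(edges, level)
    return max(profile.values()) if profile else 0

# a AND b -> g1;  g1 OR c -> g2
edges = [("a", "g1"), ("b", "g1"), ("g1", "g2"), ("c", "g2")]
level = {"a": 1, "b": 1, "c": 1, "g1": 2, "g2": 3}
print(width_profile(edges, level), width(edges, level))    # {2: 3, 3: 2} 3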
6.6 Heuristic Algorithms In this section we consider the problem of finding a good order of the primary inputs, to build a BDD for a combinational circuit. The order may be used as starting point for the exact algorithm of Section 6.1 or the iterative improvement algorithms of Section 6.2. It may also be used directly, when building any BDD for the given function is the objective and the resulting BDD is not too large. In all cases, the quality of the orders found affects the time required or even the ability to achieve the desired result. However, since the methods we shall consider here are intended to work on fairly large circuits, they are rather simple procedures, inspired by the results of Section 6.5 and by other heuristics that we shall introduce. We shall consider single output functions first, and then we shall deal with the case of multiple outputs. 6.6.1 Single-Output Functions We begin by considering a circuit in sum-of-product form. A good choice for the variable to appear at the top of the BDD is the most binate variable, i.e., the variable that explicitly appears in most product terms. This choice minimizes the total number of product terms in the two cofactors. If we assume the number of product terms as a measure of the complexity of a function, then simplifying the cofactors, i.e., the two children of the top node, is likely to yield a smaller BDD. One can also see the most binate variable as the one having the largest effect on the value of the function. A simple, yet effective, method consists therefore of 9 Specifically, the width should grow at most logarithmically. 35 ordering the variables from the most binate to the least binate. An alternative to this method is to transform the two-level circuit into multiple-level and then apply the methods discussed next. For a fanout-free circuit composed of AND, OR, and NOT gates, a BDD with exactly one node per variable can be built by ordering the variables with a simple depth-first search of the circuit graph (a rooted tree in this case)10 . The order of the variables is given by the order in which the corresponding leaves of the tree are visited. Assume for simplicity that the tree is binary. It is easily seen that the depth-first search will reach all the leaves in one sub-tree rooted at the top node of the tree, before reaching any leaf in the other sub-tree. The recursive application of this argument shows that the order produced by depth-first search is non-interleaved. For circuits that contain fanout, it is still possible to use depth-first search to order the variables. This will have the effect of approximating a non-interleaved order, though no guarantee of optimality will be made. Furthermore, the order in which the children of a node are visited is no longer immaterial. We present two criteria to decide the order in which the children of a node are visited. The first criterion tends to place the input variables that fan out closer to the top of the BDD than those variables that do not fan out. More precisely, the children of each node are divided in three sets. The first set contains the input variables that fan out and that are not in the order yet.11 These variables are immediately inserted in the order. The second set contains the internal lines and the third set contains the input variables that do not fan out. The variables in the third set are collected in a list and inserted in the order only after all the recursive calls for the internal lines have been completed. 
The variables that do not fan out are inserted after the last variables that fans out in that part of the tree. Placing the variables that fan out closer to the top of the BDD can be seen as the analogous in the multiple-level context of the most binate variable heuristic. The second criterion to rank the children of a node is given by the depth of the subgraph rooted at that node. In the case of the carry out of an adder, visiting the deepest subgraph first will order the inputs from the least significant to the most significant bit. Visiting the most shallow subgraph first will result in ordering the inputs in the reverse order. This suggests that the method may work well also for less structured circuits. Indeed one may observe that visiting the subgraphs in depth order is related to finding a topological order of the circuit graph with a low width. An alternative approach to ordering the inputs of a circuit is also based on the idea of putting at the top of the BDD the variables that influence the values of the output most. To this purpose, a weight is assigned to each net in the circuit. The output is assigned a weight of 1, while the other nets are assigned weights according to the following rules. Each input of a gate with inputs receives of the weight of the gate output; a fanout stem receives the sum of the weights of its branches. These rules are applied until all inputs have weights assigned to them. The heaviest input is then chosen as the first variable of the order. The chosen input is then removed from the circuit. (I.e., it is assigned a noncontrolling value; for an exclusive OR, the assigned value is immaterial.) The weights are then recomputed for the simplified circuits and the second variable of the order is thus identified. The process continues until all inputs are ordered. Example 12 An example of weight assignment is shown in Figure 16. Assuming ties are broken in favor 10 NAND and NOR gates pose no problem. However, if the circuit contains also exclusive OR and exclusive NOR gates, there may be more than one node per variable and the exact number of nodes may depend on the particular non-interleaved order chosen. This is because exclusive OR and NOR gates have no controlling values. 11 These variables may have been visited during the search of another node. Hence, they may already be in the order. 36 1/16 a b 1/16 c d 1/8 1/8 1 1/8 1/4 3 1/8 1/4 2 5 1/2 1/8 e F 1/4 4 1/8 7 1 1/4 f g 1/2 6 1/4 Figure 16: Example of weight assignment.  of the input that comes first in alphabetic order, the first input selected is . The complete order is:      This method does not directly address the main concern of the ones based on depth-first search, namely, keeping together the inputs that affect the same part of the circuit. However, when an input is removed from the circuit, the weights of its neighbors increase, thus increasing the likelihood that they will be selected next. One may verify that on a fanout-free circuit this method produces one of the depth-first orders with the depths of the subtrees used as tie breakers. 6.6.2 Multiple-Output Functions The usual method to order the inputs of multiple output functions is to order the support of each output in turn. The problem is thus split into finding a good order for the outputs and finding a good order for the inputs. We have seen methods for the latter and hence we briefly discuss here the problem of ordering the outputs. Once again, we take the adder as a paradigm. 
If we want the most significant bits of the inputs at the top of the BDD, we need to consider the most significant output first. This corresponds to considering first the outputs with the deepest subtree. This heuristic based on depth is related, as when applied to input ordering, to the width of the circuit graph. An alternative approach orders the outputs so as to minimize the number of variables that are added to the support when adding a new output to those already ordered. Let    be the outputs of the circuit for which a variable order is sought. Let  be the support of  , i.e., the set of variables on which  depends. Then one may try to minimize                An exact solution to this problem is computationally expensive, and hence a greedy strategy is normally applied. One such strategy tries all the -tuples of outputs as first outputs of the order and then completes the rest of the order by choosing each time the locally best output, i.e., the one whose addition causes the least increase in the size of the support. Once the orders for two or more outputs are given, they must be merged to yield an single order for all the outputs. The simplest approach is initialize the total order to the order of the first output and then append the variables of the successive outputs that do not appear yet in the order. This approach is often ineffective,   37            as argued in [27], because it unnecessarily separates variables that the orders of the individual outputs put   and have been together. Consider the case of two outputs for which the orders   determined. The combined order based on appending yields . However, the order    is equally good for the first output (which does not depend on ) and is presumably better for the second output. This observation is at the basis of the interleaving algorithm of [27].       7 Implementation Issues 7.1 Memory Management Consider building the BDD for the function of a single output circuit. The typical approach is to start by building BDDs for all the input variables of the circuit. One then computes the BDDs for the outputs of the gates fed by primary inputs only, and so on, until the BDD for the primary output is built. In the process, one computes the BDDs for many intermediate functions that are no longer of interest, once the result is obtained. One would like to release the memory used by those BDDs, but there are two problems. First, some subgraphs may be shared by more than one function and we must be sure that none of those functions is of interest any longer, before releasing the associated memory. Second, BDD nodes are pointed from the unique table and the computed table, as well as from other BDD nodes. There are therefore multiple threads and one cannot arbitrarily free a node without taking care of all the threads going through it 12 . A solution to these two problems is garbage collection. Garbage collection consists of deferring the release of memory until there is enough memory to be released to make the required effort worthwhile. A conceptually simple scheme for garbage collection is based on keeping a reference count for each internal and terminal node. This is a count of how many BDD nodes (internal and function nodes) point to an internal node. This count is incremented any time a new arc points to the node (for instance, during the execution of AND) and is decremented when nodes are freed. When a node is freed, its reference count is decremented. 
If the decrement results in a count of 0, then the reference counts of the children are decremented. The process is recursive. A node with a reference count of 0 is called dead. When there are enough dead nodes, garbage collection is started. During garbage collection, the unique and computed tables are scanned and all entries pointing to dead nodes are cleared. The dead nodes are disposed of at this point. The cost of scanning the tables is amortized over a relatively high number of dead nodes. When garbage collection is invoked, if the collision lists are too long, one may increase the size of the tables and rehash all the entries that do not point to dead nodes. One advantage of garbage collection is that a dead node may be resuscitated if a look-up in the computed or unique tables returns a dead node. Had the dead node been freed immediately, the re-computation would have been necessary. In this way, however, a better utilization of the allocated memory is possible. A reclaim operation is performed when a look-up returns a dead node. This operation undoes what the freeing of the node had done. The reference count of the node and of its children is incremented and if the children were also dead, the process is repeated recursively. Both freeing and reclaiming update the count of dead nodes. The memory layout for a 32-bit machine is described in Figure 17. The computed table—not shown—is just a table pointing to some of the nodes. One should note that the collision list is implemented by an extra pointer in the node data structure. This integration of the unique-table with the DAG minimizes memory occupation and access time simultaneously. The reference count and the variable index are shown sharing a 12 Locating the entry of the unique table pointing to a node just requires a hash-table look-up; on the other hand, deciding which entries, if any, of the computed table point to a node require keeping an extra pointer or scanning the table. Both solutions are less than optimal. 38 ref index then else next ref index ref index then then else else next next unique-table ref index then else next Figure 17: The memory picture. 39 single computer word. Assuming each field takes two bytes, we have limitations on the number of variables and the maximum reference count. The limit of 65535 variables is normally not serious, since in general one gets into trouble with much fewer variables for other reasons13 . On the other hand, it is quite possible for the constant node to exceed a reference count of 65535. The solution to this problem—besides the obvious solution of allocating one four-byte word for each field—is based on saturating increments. If a reference count reaches 65535, then it is no longer changed. This may cause a dead node to be considered non-dead, but in practice the nodes that reach saturation are very few and hence the consequences are negligible. 14 All things considered, the memory occupation is about 18 bytes/node on a 32-bit machine. This number is based on a load factor of 2 for the unique table (the average length of the collision list) and is so divided: 4 words for the node itself (see Figure 17); one half of the hash table bucket (2 bytes). To this one must add the memory requirements for the computed table, which vary considerably with the application. It is thus conceivable to store several million nodes on a machine with a large physical memory15 . 
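A stripped-down Python sketch of the reference-count bookkeeping described above follows. It is deliberately simplified (no reduction rule, no computed table, no saturating counts), but it shows the three ingredients of the scheme: freeing only marks nodes dead, dead nodes are disposed of in bulk, and a dead node found again in the unique table is reclaimed instead of being rebuilt.

class Node:
    def __init__(self, var, then, else_):
        self.var, self.then, self.else_ = var, then, else_
        self.ref = 1                        # the caller's reference

class Manager:
    GC_THRESHOLD = 1000                     # collect once this many nodes are dead

    def __init__(self):
        self.unique = {}                    # (var, id(then), id(else)) -> Node
        self.dead = 0

    def deref(self, n):
        """Release one reference; a node whose count reaches 0 becomes dead."""
        if not isinstance(n, Node):
            return                          # terminals are never collected here
        n.ref -= 1
        if n.ref == 0:
            self.dead += 1
            self.deref(n.then)
            self.deref(n.else_)

    def reclaim(self, n):
        """Acquire one reference, resuscitating the node if it was dead."""
        if not isinstance(n, Node):
            return
        n.ref += 1
        if n.ref == 1:                      # the node was dead
            self.dead -= 1
            self.reclaim(n.then)
            self.reclaim(n.else_)

    def find_or_add(self, var, then, else_):
        key = (var, id(then), id(else_))
        n = self.unique.get(key)
        if n is not None:
            self.reclaim(n)                 # a dead node found in the table is reused
            return n
        n = Node(var, then, else_)
        self.reclaim(then)
        self.reclaim(else_)
        self.unique[key] = n
        if self.dead > self.GC_THRESHOLD:
            self.collect_garbage()
        return n

    def collect_garbage(self):
        """Dispose of all dead nodes in one sweep of the unique table."""
        self.unique = {k: n for k, n in self.unique.items() if n.ref > 0}
        self.dead = 0

mgr = Manager()
ZERO, ONE = "0", "1"
x0 = mgr.find_or_add(0, ONE, ZERO)          # the function x0
g = mgr.find_or_add(1, x0, ZERO)            # x1*x0
mgr.deref(g)                                # g becomes dead (x0 is still referenced)
print(mgr.dead)                             # 1
h = mgr.find_or_add(1, x0, ZERO)            # found in the table: g is resuscitated
print(h is g, mgr.dead)                     # True 0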
7.2 Efficient Utilization of the Memory Hierarchy Modern computers rely on hierarchical organization to provide low average access time to large memories. The success of this approach is postulated on the locality of reference of programs. Unfortunately, the BDD algorithms that we have presented so far have poor locality, and therefore tend to incur large penalties in their accesses to memory. In practice, if a process that manipulates BDDs grows in size significantly beyond the available physical memory, its page fault rate normally becomes so severe as to prevent termination. One approach to increase locality of reference in BDD manipulation consists of using contiguous memory locations for all the nodes labeled by the same variables and processing all the nodes for one variable before moving to another variable [48, 2, 55, 61]. This approach is commonly referred to as breadth-first processing of BDDs, though it is properly levelized processing. The algorithm for levelized calculation of a generic boolean connective is shown in Figure 18. It consists of two passes: During the fist, top-down pass requests are generated for the pairs of nodes. These requests correspond to nodes in a non-reduced version of the result. The second, bottom-up pass performs reduction by forwarding some requests to others. The requests generated in the first pass double as entries of the computed table. In other words, the levelized approach features a built-in computed table that stores all intermediate results of a given call to bfOp. (Inter-operation caching of results is not covered by this mechanism.) The lossless computed table may be very memory-consuming for operations in which the non-reduced result is much larger than the reduced one. The extra memory required for the request queues may thus offset the advantages of the increased locality of access. Combining the levelized approach with the conventional recursive one can substantially alleviate the problem [61]. Additional speed-up techniques including superscalarity and pipelining are described in [55]. Reordering tends to destroy the segregation of nodes according to the variables. The typical approach is to restore such segregation after every reordering. The levelized approach to BDD manipulation is quite effective in reducing the number of page faults. When the processes fit in main memory the situation is more complex. On the one hand, improved locality 13 However, some schemes that assign variable indexes sparsely may conflict with such a limit. Sometimes, only 8 bits are reserved for the reference count, so as to free 8 bits for flags without increasing the size of the node. If this is done, care must be exercised to periodically adjust reference counts. Alternatively, an overflow table must be kept for those reference counts that become too large. The details of how this is done depend on the implementation. 15 Virtual memory is of limited help in these cases, as explained in Section 7.2. 14 40  bfOp(op, , )   if terminal case (op, , ) return result;  minIndex = minimum variable index of ( , );  create a R EQUEST ( , ) and insert in Q UEUE[minIndex]; /* Top-down A PPLY phase. 
*/   for (i = minIndex; i numVars; i ) bfApply(op,i); /* Bottom-up reduce phase */ for (i = numVars; i  minIndex; i ) bfReduce(i); return R EQUEST or the node to which it is forwarded;  bfApply(op,i)   is the variable with index i; while (R EQUEST Q UEUE[i] not empty)   R EQUEST( , ) = unprocessed request from R EQUEST Q UEUE [i];  if (not terminal case ((op,  ,  ),result))   nextI = minimum index of (  ,  );   result = find or add (  ,  ) in R EQUEST Q UEUE [nextI]; R EQUEST T HEN = result;  if (not terminal case ((op,   ,   ),result))   nextI = minimum index of (   ,   );   result = find or add (   ,   ) in R EQUEST Q UEUE[nextI]; R EQUEST E LSE = result;     bfReduce(i)   is the variable with index i; while (R EQUEST Q UEUE[i] not empty)   R EQUEST( , ) = unprocessed request from R EQUEST Q UEUE [i]; if (R EQUEST T HEN is forwarded to  ) R EQUEST T HEN =  ; if (R EQUEST E LSE is forwarded to  ) R EQUEST T HEN =  ; if (R EQUEST T HEN == R EQUEST E LSE ) forward R EQUEST to R EQUEST T HEN; else if ((R EQUEST T HEN,R EQUEST E LSE) found in U NIQUE TABLE[i]) forward R EQUEST to that node; else insert R EQUEST in U NIQUE TABLE[i];            Figure 18: Levelized BDD manipulation algorithm. 41 of reference should translate into increased cache hit rate. On the other hand, the overhead inherent to the two-pass approach is fairly sizable [43]. Experimental evidence suggests that the levelized approach is better than the recursive one when building BDDs for combinational circuits, but the opposite is true for sequential verification [60]. Improving memory locality for the conventional recursive approach is also possible. In [41] it is shown that the number of cache misses incurred while accessing the unique table may be substantially reduced by sorting the collision lists and making sure that older nodes occupy lower addresses in memory. An additional benefit of this approach is an efficient generational garbage collection scheme. 8 Sequential Verification We suppose we are given a sequential system modeled as a finite state machine, and a property that the system is supposed to verify. Since the system is finite-state, several logics commonly used to express properties admit decision procedures based on state enumeration. In this section we focus on the most elementary of these procedures: Reachability Analysis. Our choice is motivated by the following observations. First, reachability analysis is sufficient to decide an important class of properties: invariants, that is, propositional formulae that should hold in all possible states of the system. Second, reachability analysis exemplifies the fixed point computations that are the major constituents of more general decision procedures. Therefore reachability analysis gives us the opportunity to examine most of the issues connected with the efficient deployment of BDDs in model checking.   , where is the input We define a deterministic finite state machine (FSM) as a 6-tuple    alphabet, is the set of states,  is the output alphabet,  and  are the next-state (or transition) and output   and functions, respectively, and is the set of initial   (reset) states. In the sequel we assume that     for some and . Therefore  is a multiple-output boolean function. We call such a FSM encoded. A state  is reachable from state  , if there exist a sequence of states   and a sequence of inputs      .         , (possibly repeated), such that   ,   , and   A state is simply reachable if it is reachable from any state in . 
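As a small illustration of these definitions, consider a hypothetical encoded FSM, a modulo-4 counter with a count-enable input, and the explicit enumeration of its reachable states one at a time; the symbolic techniques developed below manipulate whole sets of states instead.

def delta(state, w):
    """Next-state function of a 2-bit encoded FSM (a modulo-4 counter with enable w)."""
    x1, x0 = state
    return (x1 ^ (x0 & w), x0 ^ w)

def reachable(delta, init, inputs):
    """Explicit enumeration of the reachable states, one state at a time."""
    reached, frontier = set(init), set(init)
    while frontier:
        frontier = {delta(s, w) for s in frontier for w in inputs} - reached
        reached |= frontier
    return reached

print(sorted(reachable(delta, {(0, 0)}, (0, 1))))
# [(0, 0), (0, 1), (1, 0), (1, 1)]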
We will define reachability analysis as the problem of enumerating all reachable states of a finite state machine. Note that in this problem we are only concerned with the next-state function  . Hence, in the following, we ignore the output function  . Reachability analysis can be performed in time linear in the number of edges of the state transition graph by depth-first search. However, the number of states of a FSM grows exponentially with the number of memory elements. Hence, even a linear time algorithm may be inadequate in practice. One approach to achieve sub-linear runtimes for some problem instances is based on the use of characteristic functions.          8.1 Characteristic Functions and Relations     Let and let be a set of points of the boolean space called the characteristic function of if it is one exactly for the points of  #  $           . A function  that are in . That is:  is  We can extend the definition of characteristic function to subsets of an arbitrary finite set , by providing  an injective mapping from to . Such a mapping is called a binary encoding of . Let be a binary encoding of , that is,   42  for     . Then the image of     We define the characteristic function of              defined by:  as the characteristic function of  : according to   is  according to    Whenever the encoding is understood, we shall simply write  . With these definitions, we can associate a characteristic function to every finite set. In the following we shall mainly deal with characteristic functions  of subsets of . All set theoretic manipulations can be performed directly on the characteristic functions. In particular  : one can easily see that, for   and for arbitrary finite sets             and encoding               , we have:               Characteristic functions are interesting because they often provide a compact representation and an efficient manipulation of sets. Consider the characteristic function       One can verify that:             However, the representation of  in the form of a BDD only takes two internal nodes. Though almost all sets require exponentially large BDDs to represent their characteristic functions (see [46] for a proof of this classical result), it is true in general that we can manipulate much larger sets by dealing with their characteristic functions than we would by explicitly representing the elements of the sets. Among all sets that can be represented by characteristic functions, we now consider relations, i.e., sub be a binary relation16 . We shall use different sets of sets of a suitable cartesian product. Let  variables to identify the elements of  and and we shall therefore write                           noting that the cartesian product of two sets is obtained by taking the product of the respective characteristic functions, when they have disjoint support. This easily generalizes to -ary relations. Note that we have to use different sets of variables also in the case of  , because the elements of the relation are ordered pairs and we must be able to tell the first element from the second. The most important relation that we shall deal with in the sequel is the transition relation of a FSM:17  16 and  . Binary here means that the elements of the relation are pairs of elements from  In the following, we implicitly assume that we are dealing with characteristic functions, so that, with abuse of notation, we write  instead of  . 
17 43 Definition 8 Let     is given by: relation of    be the transition function of an encoded FSM                         . Then the transition   The transition relation describes all the pairs of states that are connected by an arc in the state transition graph of and the inputs labeling that arc. The transition relation is not restricted to the representation of deterministic machines. We could have defined FSMs directly in terms of their transition relations. We did not follow this approach for two reasons. First, there is no loss of generality in restricting ourselves to deterministic transition structures. We can apply Choueka’s determinization procedure [19] to remove nondeterminism. Second, we need to define  because we are going to discuss an algorithm based on it, and it is both easier and more commonly done to derive  from  than vice versa. 8.2 Symbolic Breadth-First Search Characteristic functions allow us to represent and manipulate large sets of states and relations among them. Depth-first reachability analysis, however, deals with states individually. Breadth-first search, on the other hand, allows one to deal naturally with multiple states simultaneously and has thus become the method of choice for the traversal of large machines. When multiple states are processed simultaneously in breadth-first manner, the algorithm is called symbolic breadth-first search or traversal. The sets of states that are processed are represented by their characteristic functions and the problem is then to evaluate  for a given characteristic function as input. The result of the evaluation is the characteristic function of the set of states that are reachable in one step from any state in the set whose characteristic function is the input to  . This process is sometimes called symbolic simulation of  . We shall view this evaluation as the computation of the image of an appropriate function. From this point on, we assume that all functions are represented by BDDs. In particular,  can be represented in two ways: 1. By a vector of BDDs, one for each next state variable; 2. by the transition relation associated with  . These two methods result in two algorithms, called the transition function algorithm and the transition relation algorithm, that have a common structure. In order to present this structure, we now consider two   . definitions concerning the image of a function with domain and range Definition 9 Let is defined by:    be a multiple-output boolean function. The image of  , denoted by Img   ,   Img      #              Informally, the image of is the set of output values that can produce. If  belongs to the image of , then there is an input  such that    . We want to impose a constraint on the inputs that can be applied to and thus restrict the values that may produce. To this end, we introduce the following definition.       Definition 10 Let the image of constrained by    be a multiple-output boolean function and let , denoted by Img Img        , is defined by:   #    44    be a subset of  . Then    traverse   Reached = From = ; do  To = Img  ,From); New = To Reached; From = New; Reached = Reached  New;  while (New ); return Reached;   Figure 19: Pseudo-code of the symbolic breadth-first traversal.    . The set is called the constraint and the problem Img According to this definition, Img   is called constrained image computation. 
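Definitions 9 and 10 can be illustrated by computing the image through plain enumeration of the domain; the sketch below does exactly that on a hypothetical three-output function (unrelated to Example 13). Symbolic image computation, discussed in the following sections, obtains the same sets without enumerating B^n.

from itertools import product

def img(f, n, constraint=lambda x: True):
    """Img(f, C) = { f(x) : x in B^n and x in C }, by enumeration of the domain."""
    return {f(x) for x in product((0, 1), repeat=n) if constraint(x)}

# A three-output function of three inputs: f = (a*b, a+c, b xor c).
f = lambda x: (x[0] & x[1], x[0] | x[2], x[1] ^ x[2])
print(sorted(img(f, 3)))                        # the unconstrained image Img(f, B^3)
print(sorted(img(f, 3, lambda x: x[0] == 1)))   # the image constrained to a = 1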
In the traversal of a FSM, the constraint of finding Img  corresponds to finding all the states that are represents a group of current states and computing Img   reachable in one step from the states represented by . Also, in the case of FSM traversal, ,  where is the number of state variables (that are both inputs and outputs of the next state function) and  , but it can be regarded as defined is the number of primary inputs. The constraint is thus defined over     as well, by  just assuming that it is defined over , but does not depend on the variables over ranging over the last coordinates. We are now ready to present a first version of the symbolic breadth-first traversal procedure. The pseudo-code is given in Figure 19. The union and difference of sets are actually performed by manipulating the corresponding characteristic functions. The number of iterations performed gives the sequential depth of the FSM, i.e., the longest of the shortest paths connecting each state to an initial state. In other words, traverse reaches each state in the minimum number of steps. This is a general property of breadth-first methods. Before we consider in detail how to compute Img  From  , we examine the assignment: From New  (8)      This statement simply says that the newly reached states will be considered as current states at the next iteration. Notice, however, that from the point of view of correctness, it is possible to assign to From any set such that: New From New  Reached    (Note that Reached has not been updated yet when the assignment to From takes place.) In particular, From could be set to equal To or Reached  New. If From New, then some states will be examined more than once. If states were individually processed that would be a poor choice. However, we are manipulating characteristic functions; the size of the BDDs and the complexity of the operations on them are only loosely related to the number of minterms of the corresponding functions. It is indeed possible that a function with more minterms than another actually has a smaller BDD. In our case, we could find a choice for From that includes some previously reached states, but results in a smaller BDD and/or a faster computation of Img  From  . One may observe that: New  New  Reached   New  Reached  Therefore, a possible choice for From that is expected to result in a compact BDD is given by: New  Reached  45 and this expression could replace New in the right-hand side of (8). The image of  can be derived by exhaustive simulation. If a constraint on the inputs is given, only those inputs that satisfy the constraint are applied. However, this method is impractical for all but the simplest cases. Therefore we seek methods of computing Img  From  , that only manipulate the characteristic functions of the sets involved in the computation. 8.3 Image Computation In this section we concentrate on the problems of image computation for the two algorithms based on the transition function and the transition relation. Throughout this section, we shall use the following example to illustrate the various techniques. Example 13 We want to compute the image of     where:              (9) For this simple example, it is easy to compute the image by exhaustive simulation: 0 0 0 0 1 1 1 1 By inspection, we see that the image of shown in Figure 20.    0 0 1 1 1 1 0 0 0 1 1 0 0 1 1 0   0 0 0 0 1 1 1 0  0 0 1 0 1 1 0 0 0 0 1 0 0 1 1 0  is  000,011,101,110,111 . 
The BDDs for the three functions are 8.4 Image Computation for Transition Functions      We now examine how to compute the image of a function , when the transition function is  by breaking the given as a set of BDDs, one for each next state variable. It is possible to find Img computation in two steps [21]: 1. Find a new function whose image over the entire domain domain . 2. Find the (unconstrained) image of the new function. 46   equals the image of  over the restricted f1 f2 f3 a a a b b b c Figure 20: BDDs for the image computation examples. f Img(f,C) Bm n B C x1 v1 v x Img(f,Bn ) Figure 21: Constrained image computation. 47 The first step of the procedure is pictorially represented in Figure 21. It requires an operator that, when   as another outside onto the same point of applied to a multiple-output function, maps a point #  point   . Any such operator is called an image restrictor. Noting the similarity between the re-mapping performed by an image restrictor and that of a generalized cofactor, we may ask whether any of the operators we already know is indeed an image restrictor. The answer is affirmative, in that the constrain operator (indicated by ) of Section 5.4.1 is indeed an image restricting  , the mapping generalized cofactor. This can be easily seen by observing that, when computing  performed by constrain only depends on  . Therefore, when constrain is applied to all the components of     , it performs the same mapping for all the components. This guarantees that the image of  will become the image of   , as shown in Figure 21. In summary, we have:      Img    Img   where it is understood that constrain is applied to each component of .18 Note that the restrict operator (  ) is not an image restrictor, because the mapping of the points outside  depends also on  , the function being cofactored; indeed,  determines what variables should be quantified in  . Though it is the only one that we shall consider in detail, constrain is not the only image restrictor we can define. For instance, one and map all the points outside onto   . In most cases this will may choose an arbitrary element $ be less efficient than applying constrain, though. We now assume that the constraint has already been taken care of, by applying the constrain operator and we concentrate on two techniques for the unconstrained image computation problem.     8.4.1 Image Computation by Input Splitting The first technique is based on the following expansion: Img    Img     Img     (10) Intuitively this identity says that the image of the function is the union of the set of output values that can  and the set of output values that can be obtained by setting   ; hence be obtained by setting   the name of image computation by input splitting. Equation 10 is applied recursively until a terminal case      , where    is found. The simple terminal case is when is constant. Suppose . Then   if  the characteristic function of the image of is   , where  and   otherwise. Though this terminal case is sufficient to build a correct algorithm, efficiency requires a more sophisticated analysis of terminal conditions. In particular, we consider three mechanisms.            Decomposition due to disjoint support; Identical and complementary components; Identical subproblems. Decomposition due to Disjoint Support. Consider the case of are disjoint. 
Then, Img  Img    Img    18       , where the supports of   and  It is important for the variable order not to change during image restriction with constrain because the mapping, although independent of the component of , does depend on the variable order. 48 Img            Img     disjoint  Img           Figure 22: Example of image decomposition. In terms of characteristic functions, it is sufficient in this case to take the product of the characteristic functions computed for the images of  and . This result can be easily generalized to the case of a multiway partition of     , such that the component functions in one block have support disjoint from the functions of the other blocks. An example is shown in Figure 22. The effect of partitioning, when it occurs, is dramatic, so that one should try to cause partitioning as early and as often as possible, by appropriately choosing the splitting variable.            and   Identical and Complementary Components. Suppose . Assuming  is not     constant, then Img  . If   , then Img  . In general, it is possible to remove all but one identical or complementary components from a problem, solve the simplified problem, and finally  or    . This technique does reintroduce the missing variables, by adding clauses of the form   not directly affect the size of the search space. However, it reduces the amount of work for each recursive invocation of the algorithm and also increases the chances of early termination due to the technique to be discussed next.      Identical Subproblems. It is often the case that different cofactors of are identical or strictly related. By maintaining a table of the problems solved thus far, an algorithm may avoid repeated computations of the same result. This is the familiar technique applied in BDD manipulation algorithms. As in that case, optimal exploitation of the table of computed results is an issue that may be addressed by trying to normalize the calls. The normalization is obtained by eliminating all repeated and complement components, complementing all the components pointed by complement arcs, and sorting the components according to the order of the pointers. The recognition of identical subproblems is particularly important when the image being computed is represented by a small BDD. Example 14 Consider the case when the image of a function       49    is defined by:    f x n-1 yn- 1 yn-1 xn-2 y n-2 yn-2 xn-3 x0 y0  Figure 23: A BDD for  . (Here and  are both sets of output variables.) The corresponding BDD, for an appropriate variable order, is shown in Figure 23. This BDD grows linearly with . This happens frequently with arithmetic functions that can be expressed iteratively. If identical subproblems are not detected, though, an exponential number of recursive calls will be performed, thus effectively expanding the BDD into a tree. Example 15 For the function defined by (9), the computation based on input splitting proceeds as follows. We initially select an input variable for splitting. From symmetry considerations, we see that in this case all variables give similar results; hence we choose . We then compute Img   . We have:               This is not yet a terminal case, so we split again, this time with respect to , and find:       This is a terminal case for which the solution is        We then consider Img                       50          . Therefore:              . We have:     . 
Considering then the negative cofactors, we get: This is again a terminal case, from which we get Img       This is a terminal case and we get:  Img       Finally, adding the two partial results, we have: Img           in agreement with the results obtained by exhaustive enumeration. 8.4.2 Image Computation by Output Splitting Rather than splitting the problem with respect to the input variables, one can partition with respect to the output functions, by using the following formula.  Img                     Img  Img                   (11)  Intuitively, this identity says that the image of can be dichotomized into the points for which  and  those for which  . The resulting method is again recursive and at each step we have to compute two constrained images. This can be done by applying constrain, as we have seen at the beginning of Section 8.4. The termination conditions are similar to those used with input splitting. The number of component functions decreases at each iteration. The number of inputs may or may not decrease, depending on the simplification that occurs in applying constrain. When the problem is reduced  the image is  , and if  , the to a single component  , if  is not constant, the image is 1. If  image is   . The efficiency of this procedure can be increased by application of the same techniques we have seen for input splitting, namely decomposition, recognition of identical or complementary components, and the use of a computed table. It is also possible to combine the input and output splitting techniques, by choosing at every step whether to decompose the domain or the range of the function. There are cases where input splitting is better than output splitting, and vice versa. At first glance, output splitting has the advantage of performing a partitioning of the problem (there are no common image points in the two sub-problems) and of visiting a ). However, this is often offset by the smaller search space in the case of FSM traversal (because increased complexity brought about by constrain, and by the decreased occurrence of decomposition. In most cases, it has been observed that input splitting is more efficient.       Example 16 We now apply the output splitting method to our example. We choose the splitting order as    . Figure 24 shows the effect of applying constrain to  . We initially compute Img    and  . We now compute and . Let                    Img and Img Hence, Img                   51 Img  Img      f 2 f1 f 3 f1 b b c Figure 24: First step in the image computation by output splitting. f 2 f’1 f 3 f’1 a a b b c c Figure 25: Second step in the image computation by output splitting. 52         . The result of applying constrain with respect to   is shown in We must now compute Img Figure 25. Since the two BDDs are identical and non-constant, Finally, combining the two sub-problems,  Img        Img                     in agreement with our previous results. 8.5 Image Computation for Transition Relations The image of a next-state function described by the associated transition relation  can be computed as:                        where is a constraint on the inputs, describing a set of present states. If no such constraint is given— alternatively, if —we simply have to existentially quantify the input variables in  . Example 17 For the function defined by (9), the transition relation is given by:                              and we want to compute: Img                      In this case the order of quantification is immaterial. We follow the alphabetic order.    
                                                                                                                                                                             This simple example shows that the computation is more complex when using the transition relation than when using transition function. This is mostly due to the increased number of variables. In its original form, this method is indeed less efficient than the ones previously seen. However, there are ways to speed it up that make it competitive. We begin their study by proving some results on generalized cofactors, and constrain and restrict in particular, that are of interest in this context. 53 8.5.1 Partitioned Image Computation The major computational problem in image computation with the transition relation derives from taking  may be small, their product the conjunction of the components. Even if the individual BDDs for   may be too large. The final result—after quantification—may also be small, but we may exceed the allotted resources while building some intermediate result. A substantial increase in speed an memory efficiency comes from the following simple observation, whose proof is immediate.     Lemma 6 Let function of            be a function of   . Then:  ,                          and                be a  Lemma 6 says that we can partially distribute the quantifications over the product, if the terms of the product do not all depend on all the variables. Since the quantification of a variable normally simplifies the result, we can hope that by intertwining products and quantifications, the size of the intermediate BDDs will be better kept under control [11, 59]. The following example illustrates this point.  Example 18 We want to compute Img   , with         Img          ,             and      . According to (7), we have:                                                                           Applying restrict is simple in these cases. In particular, the second and third term of the transition relation  do not depend on   . This variable is therefore quantified in    , resulting in the constant 1 function. Hence: Img                                                                                                                                          54              Figure 26: Different tree organizations for the partitioned computation of images.                                In this example, Equation 7 was particularly handy. However, it has been observed in practice that Equation 5 is often more efficient. Several issues need to be considered in defining an algorithm based on Equations 5–7, besides the choice of what equation to adopt. The first is the general organization of the image computation, which consists of the product of several factors, and the quantification of part of the variables. If the product of two factors is taken at each step, a binary tree results. The leaves of the tree correspond to the factors of Equations 5–7. Each internal node corresponds to the product of its two children and the quantification of all the variables appearing in the result that do not appear in the rest of the tree. The tree is visited in post-order and at the end of the visit, one of Equations 5–7 is computed at the root of the tree. There are many different trees with the same number of leaves. Figure 26 shows the two extreme cases: A perfectly balanced tree and a totally unbalanced one. The number of nodes in the two trees is the same. For leaves, both trees have internal nodes. 
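To make the interplay of conjunction and early quantification concrete, the following small Python sketch processes the factors of a partitioned transition relation along a totally unbalanced tree, quantifying each variable as soon as no remaining factor depends on it, in the spirit of Lemma 6. Explicit truth tables stand in for BDDs, and the names (table, conjoin, exists, image_partitioned) and the two-bit example relation are chosen only for this illustration; they are not part of the chapter's notation or of any BDD package.

from itertools import product

# A relation is a pair (names, rows): 'names' is a tuple of variable names and
# 'rows' is the set of satisfying assignments, each a tuple of 0/1 values.

def table(names, pred):
    """Build the explicit table of the predicate 'pred' over 'names'."""
    rows = {r for r in product((0, 1), repeat=len(names))
            if pred(dict(zip(names, r)))}
    return names, rows

def conjoin(f, g):
    """Conjunction of two relations, joining on their shared variables."""
    fv, fr = f
    gv, gr = g
    shared = [v for v in fv if v in gv]
    out = tuple(fv) + tuple(v for v in gv if v not in fv)
    rows = set()
    for a in fr:
        for b in gr:
            if all(a[fv.index(v)] == b[gv.index(v)] for v in shared):
                merged = dict(zip(fv, a))
                merged.update(zip(gv, b))
                rows.add(tuple(merged[v] for v in out))
    return out, rows

def exists(f, drop):
    """Existentially quantify the variables in 'drop'."""
    fv, fr = f
    keep = tuple(v for v in fv if v not in drop)
    idx = [fv.index(v) for v in keep]
    return keep, {tuple(r[i] for i in idx) for r in fr}

def image_partitioned(terms, to_quantify):
    """Conjoin the terms one at a time (a totally unbalanced tree), quantifying
    each variable of 'to_quantify' as soon as no remaining term mentions it."""
    result = ((), {()})                     # the constant-1 relation
    remaining = list(terms)
    while remaining:
        result = conjoin(result, remaining.pop(0))
        needed = {v for tv, _ in remaining for v in tv}
        early = [v for v in result[0] if v in to_quantify and v not in needed]
        if early:
            result = exists(result, early)
    return result

# Example: T1 encodes y1 = x1 AND x2, T2 encodes y2 = NOT x2.
T1 = table(('x1', 'x2', 'y1'), lambda a: a['y1'] == (a['x1'] & a['x2']))
T2 = table(('x2', 'y2'), lambda a: a['y2'] == 1 - a['x2'])
img = image_partitioned([T1, T2], to_quantify={'x1', 'x2'})
print(img[0], sorted(img[1]))   # ('y1', 'y2') [(0, 0), (0, 1), (1, 0)]

In a BDD-based implementation the same schedule is followed, with conjoin and exists replaced by the BDD product and existential quantification, or by the combined operation discussed in Section 8.5.2.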
A second, important issue is the order of the factors, that is, how the factors of Equations 5–7 are mapped onto the leaves of the binary tree. The order of the factors should try to minimize the size of the intermediate results. In one strategy, the factors  are ordered so as to minimize the increase in total support (see the output ordering for the BDDs of multiple-output functions). In another approach, the order is chosen so as to maximize the number of variables that will be quantified early. This is done by looking at the private support of each term, defined as the variables that only appear in that term. Practical solutions combine in various ways these and other heuristics [29, 51]. A final consideration concerns the clustering of the factors in Equations 5–7. If many image computations must be performed, it may be convenient to partition the factors into a few blocks, and then take the product of each block. This decreases the number of products that must be taken during image computation. There is a trade-off between the reduction in the number of products and the cost of each product. The usual approach to decide how many blocks should be created is to impose a limit on the size of the BDD for each block. One factor is chosen as the seed for a block, and other factors are conjoined to it, until the product grows larger than a given threshold. The procedure is then repeated on the remaining factors. Notice that clustering the blocks has the effect of changing the structure of the computation tree.   55  8.5.2 Combining Conjunction and Quantification In the method described in the previous section, at each node of the tree two BDDs are conjoined and some variables are existentially quantified from the result. It is possible to gain some efficiency in terms of both CPU time and memory by combining the two steps into a single operation. The algorithm that computes      is based on the usual recursion. As in the case of the quantification algorithm of Section 5.3.3, is the cube of the variables to be quantified, is the index of the lowest indexed variable of and  and      is the index of the top variable of . We shall write , where is a cube, possibly the function 1. If     or   is constant (either factor is 0 or both are 1) then     . Otherwise, we consider the three following cases.       If  and  do not depend on  . In this case we write:    . In this case                                AND           AND                       AND  AND     OR    . Hence,     , then the result is 1 and it is not necessary to perform the second recursive call.  and write:   AND     . In this case we expand with respect to                    AND          8.5.3 Partitioning, Identical Subproblems, and Splitting Partitioning works in the case of the transition relation as in the case of the transition function, thanks to the following identity:                    The recognition of identical sub-problems can also be made to work in the following way. Suppose that      , the -th term of Equations 5–7, has  as top variable. Suppose next that no , appears in the the -th term. This is certainly true if no terms are combined. Terms   and  represent  identical sub-problems if   and:          They represent complementary subproblems if         or     or        and:         If terms  and  represent identical sub-problems,     is replaced by         . If they represent   complementary sub-problems,      is replaced by    can be tested      . The condition   efficiently. 
It is also possible to test for a stronger condition, namely   , in constant time. Notice that 56 before restrict or constrain are applied, this stronger condition is always satisfied, since all terms are of the . form  Finally, it is possible to use a table to store the results of image computations, with an approach similar to the one taken for the transition function methods. Identical problems may occur at different iterations of the breadth-first traversal. It is also possible that identical sub-problems arise in the course of one computation, if the image computation is decomposed according to the following identity:                                                                                        The similarity of this decomposition to the output splitting technique is apparent. The identity can be generalized by observing that the essential requirement for output splitting is that no term except   depend on  . Notice that Theorem 8 can be applied to the two sub-problems. There is also the equivalent of input splitting, thanks to the following identity:                                                                  We finally notice that, although we have so far assumed that there is one term for each next state variable, it is possible to reduce the number of terms by selectively taking the product of groups of terms. This may be advantageous if the BDD obtained by taking the product is not larger than the sum of the operands. This technique can be seen as a way of altering the structure of the evaluation tree. 8.6 Renaming Variables in the Traversal Procedure In a FSM we distinguish present state variables from next state variables. When the image of the next state function is computed, a set of next states is found that is reachable in one transition from a given set of present states. At each iteration of the traversal procedure, after the set of newly reached states is computed, we need to manipulate it as if it were a set of present states. This requires that the result of the image computation be expressed with the same variables that are used to represent the sets of present states (Reached, From). This problem is solved differently depending on the algorithm used for image computation. In the case of the transition function, one can directly use the same variables for From and To. There is no problem there, because the BDDs for  From and To are separate entities at all times. Therefore, one can implicitly apply the renaming of the variables that corresponds to clocking the state register to make the next states present.  57 In the case of the transition relation, it is not possible to use the same variables for both present and next states, because these variables appear together in the characteristic function of the transition relation. We have to wait until a present state variables has been quantified, before renaming the corresponding next state variable. (The simplest approach is to rename the variables after the image is computed.) The renaming of the variables is best seen as a case of function composition. Replacing  by   in the characteristic equation of the image is equivalent to computing: To    To             that we know we can compute as:   To      To     To   When all next state variables must be replaced, it is advantageous to perform the substitution in a single pass, by applying the following identity: To           To             To          This identity is the basis for a simple recursive algorithm. 
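The traversal loop, the image computation based on the transition relation, and the renaming of next state variables can be put together in a short sketch. In the following Python fragment, characteristic functions are ordinary predicates over valuations, explicit enumeration stands in for BDD operations, a modulo-4 counter with an enable input serves as the example machine, and the names (image, rename_to_present, states_of) are illustrative assumptions rather than the chapter's notation. The renaming step is exactly the composition To(y <- x) discussed above.

from itertools import product

# Present state variables X, primary inputs W, next state variables Y.
X = ('x1', 'x2')
W = ('w',)
Y = ('y1', 'y2')

# Transition relation of a modulo-4 counter with enable input w:
# (y1 y2) = (x1 x2) + w (mod 4), x1 being the most significant bit.
def T(a):
    nxt = (2 * a['x1'] + a['x2'] + a['w']) % 4
    return a['y1'] == nxt // 2 and a['y2'] == nxt % 2

def image(From, T):
    """To(Y) = exists X, W . T(X, W, Y) AND From(X), by explicit enumeration."""
    def To(a):                               # 'a' assigns values to Y only
        for xs in product((0, 1), repeat=len(X)):
            if From(dict(zip(X, xs))):
                for ws in product((0, 1), repeat=len(W)):
                    full = dict(zip(X, xs))
                    full.update(zip(W, ws))
                    full.update(a)
                    if T(full):
                        return True
        return False
    return To

def rename_to_present(To):
    """Express To, computed over Y, in terms of the present state variables:
    the composition To(Y <- X), substituting x_i for y_i."""
    return lambda a: To({y: a[x] for x, y in zip(X, Y)})

def states_of(chi, names):
    return {s for s in product((0, 1), repeat=len(names))
            if chi(dict(zip(names, s)))}

# Breadth-first forward traversal from the initial state (0, 0).
Reached = From = lambda a: a['x1'] == 0 and a['x2'] == 0
while True:
    To = rename_to_present(image(From, T))
    New = lambda a, To=To, R=Reached: To(a) and not R(a)
    if not states_of(New, X):
        break
    Reached = lambda a, R=Reached, N=New: R(a) or N(a)
    From = New
print(sorted(states_of(Reached, X)))         # all four counter states

With BDDs, rename_to_present would be realized by the simultaneous substitution identity above, rather than by wrapping a predicate, and the sets Reached, From, and New would be kept as BDDs over the present state variables.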
8.7 Variable Selection in Image Computation We have seen that for both the transition function and the transition relation methods image computation can be decomposed by input and output splitting. An obvious question concerns the criterion by which to choose the variable to be used for the expansion. We first consider input splitting for the transition function method. Remember that in this case, we have a vector of BDDs, one for each output. These BDDs only depend on the input variables. The simplest choice is therefore the variable with lowest index appearing in those BDDs. The advantages of this method are that the cofactors can be computed trivially and that no new nodes are created. The choice of the top variable is therefore attractive from the point of view of memory. However, we may be willing to trade some memory occupation for faster computation, by choosing a splitting variable that reduces the search space more than the top variable. To that effect, we notice that the single most dramatic factor in reducing the size of the search space is partitioning based on disjoint support. Hence we look for splitting variables whose choice leads to earlier decomposition of the image computation. According to this criterion, good candidates are those variables on which many outputs depend. This simple heuristic is fairly simple to implement and has excellent performance [37]. In the method based on the transition function, output splitting does not necessarily reduce the support of the residual functions. In the transition relation method, though, both kinds of splitting can be used to reduce the supports of the residual functions. 8.8 Backward Traversal The form of reachability analysis we have examined so far can be considered forward traversal: We start from the initial states and we trace paths forward. Those states that are along the paths are clearly reachable from some initial state. If we want to know whether a given property holds at all reachable states of a finite state machine, we can find all the reachable states and then examine them for possible violations of the property. In finite state machine equivalence, the property is  for the product of the two machines being compared. 58 There is an alternative approach to the same problem: We identify all the states where the property is violated, and then find whether any of these states is reachable from any initial state. Finding whether a state is reachable from any initial state can be done by tracing a path backward in the state transition graph. Tracing paths backward requires the computation of the states from which a given set of states is reachable. This goes under the name of pre-image computation. Therefore, we now examine pre-image computation, and then we discuss the relative merits of forward and backward traversal. As in the case of image computation, there are two algorithms for pre-image computation: One is based on the transition function, and one is based on the transition relation. For the sake of conciseness, we only consider the latter. 8.8.1 Pre-Image Computation Based on the Transition Relation The computation of the pre-image based on the transition relation can be performed with the image computation algorithm. It is sufficient to interchange the roles of input and output variables. Example 19 Let us consider the following transition relation:                                          where , , and are the input variables and  , , and are the output variables. Suppose we want to compute the pre-image of  . 
With argument similar to that employed for image computation, we can write: Pre                                 In this case the two generalized cofactors are identical, because the constraint is a cube. Performing the computation, we get: Pre                        In general, when the transition relation is given in the form:        and the constraint is a cube, considerable simplifications occur. Since the cofactor of a function with respect to a cube does not depend on the variables appearing in the cube and cofactoring with respect to a cube distributes, all terms of the product contain zero or one output variables after cofactoring. The existential quantification fully distributes, since each output variable appears in at most one term. Finally, if a term is   after the cofactoring (this is the case if  does not appear in the cube), then the still of the form   existential quantification returns 1. Otherwise, it returns either  or   . Therefore we see that the result of the pre-image computation is simply the product of those  such that  appears in the constraint cube, taken with the appropriate phase. In our example, we can write:              59       and one can verify that the previous result is correct with minimum computation. When the constraint is not a cube, the general algorithm based on interleaving quantification and product applies, as in the case of image computation. 8.8.2 Forward versus Backward Traversal The relative effectiveness of the two methods depends primarily on the structure of the graph being traversed. Suppose, for instance, that we are comparing two identical counters for equivalence. We know that the forward traversal method has difficulty in dealing with this example, because of the large sequential depth of the produce machine. If we are comparing two modulo- counters, each with one initial state, breadthfirst forward traversal will require iterations. At each iteration, one new state is added to the set of reachable states, and this accounts for the ineffectiveness of the procedure. By contrast, in backward traversal, we initialize our search to all states of the product machine that give different counts in the two counters. This set of states has a compact characteristic function. A single pre-image computation leads to convergence, as a one minute’s thought will confirm: The pre-image is precisely the same set of states that we started from, and it does not include the initial state. Hence, we arrive immediately at the conclusion that the two counters are equivalent. Clearly, not all examples work like the counters. If the set of states where a violation occurs does not have a compact characteristic function, or if the machine to be traversed is not deep, forward traversal will typically perform better.   8.9 Transitive Closure and Iterative Squaring Breadth-first symbolic traversal is remarkably efficient, in that it allows realistic machines with very many    ) to be traversed. The efficiency stems from the ability to process many states at states (more than one time. It is then easy to see that the advantage of breadth-first symbolic traversal is lost when the FSM to be traversed is, for instance, a counter. A counter is characterized by a very high sequential depth—it actually equals the number of states. In this section we consider a technique, called iterative squaring, that  be the transition relation of FSM addresses this problem [12]. The basic idea is quite simple. 
Let   , where  is the vector of present state variables, is the vector of primary input variables, and is the  such that from state  of , state can set of next state variables.  describes triples of the form  be reached by applying input . If we existentially quantify all primary input variables from  , we obtain another relation:                  that describes pairs of adjacent states, i.e., states that are connected by at least one edge in the transition   in a way to be described graph of . We can compute the transitive closure of  , that we call shortly. The transitive closure describes the pairs of states that are connected by at least one path in the  state graph of . Finding the set of states reachable from then amounts to computing: Reached              since the product of and yields the pairs of states connected by a path, such that the first state of the   is pair is an initial state; the quantification returns the second components of the pairs. The term necessary, because some initial states may not be reachable from any other initial state. The transitive closure can be found by computing the least fixed point of the following recurrence equation:         60        00 10 01 11 Figure 27: A state transition graph.  gives the name of iterative squaring to this technique and is responsible for its  The term      ,    is set to    and then efficiency. In order to find                       for some . To see how the recurrence equation actually computes     is computed until    describes all the pairs of states   such that can the transitive closure of  , suppose that       be reached from  in at least 1 and at most steps. This is certainly true for . We can then prove    such that can be reached from in at least 1 and at most that  describes all the pairs of states  steps. This follows from observing that       describes triples of states   , such that   steps. Hence is reachable in at least 2 and ( ) is reachable from  ( ) in at least 1 and at most   obtained by quantification give all pairs of states that satisfy that at most steps from  . The pairs   , the pairs of states that are precisely one step apart are included, thus proving condition. By adding   our statement.      Example 20 Consider the state transition graph of Figure 27, where the primary inputs do not appear for simplicity. We have:                      The computation of the transitive closure proceeds as follows:                                                          One can verify that                                                                                                                                            , consistent with the fact that the graph of Figure 27 has no cycles. It should be noted that the number of iterations required to compute the transitive closure is logarithmic in the maximum distance between two states19 . The transitive closure also considers paths originating at 19 This is also called the diameter of the state transition graph. 61 states that cannot be reached from the initial states. Hence it may be inefficient, if a large fraction of the state graph is unreachable or if the sequential depth of the machine is low. The major problem with the transitive closure is that the BDDs that are produced during the computation may grow too large. 8.10 Approximate Reachability Analysis Even with the best algorithms and heuristics, there remain many circuits that are too difficult for the reachability techniques that we have discussed so far. 
Circuits with a few thousand latches are normally too difficult, and sometimes even circuits with fewer than a hundred latches pose significant problems. In such cases, however, it may still be possible to perform an approximate reachability analysis. Specifically, it may be possible to find a superset of all reachable states. Alternatively, it may be possible to compute a subset of all reachable states. Both approximations may be useful.

Suppose we want to prove an invariant. Let us examine the superset first. If the invariant holds for all the states of the superset, then it holds for all reachable states. On the other hand, if the invariant does not hold for some states of the superset, it may be that the states where violations occur are not really reachable. Therefore, verification based on supersets of the reachable states may produce false negatives, or, in other words, it is conservative. If we prove a circuit correct by applying a conservative technique, we know it is correct. On the other hand, for conservative verification to be useful, false negatives should be relatively rare, because showing that a false negative is indeed spurious may be impractical, or at least very time consuming. The closer the superset of reachable states is to the actual set, the lower the chance of getting false negatives. On the other hand, better approximations require more time and memory to be computed. Notice that a trivial solution to the approximation of the reachable state set from above is given by the set of all 2^k states, where k is the number of latches.

Let us now turn our attention to the computation of subsets of the reachable states. A reachability analysis method based on forward traversal produces a sequence of sets of states that eventually converges to the set of all reachable states. The elements of the sequence provide increasingly accurate approximations from below to the exact solution. If we cannot complete reachability analysis, we can always stop when we run out of memory or time, and claim that we have obtained an approximation from below to the solution. The challenge, therefore, lies in devising algorithms that produce good approximations with limited resources. When verification of an invariant is based on a subset of all reachable states, it is partial: It may be that the invariant does not hold, but the states where violations occur have not been visited. On the other hand, if a violation is reported for a state in the subset, then we know that the invariant does not hold. The usefulness of partial verification relies on the fact that a new design normally contains many mistakes, and partial verification may uncover most such mistakes quickly. In this respect, partial verification is not different from simulation: Indeed, simulation is one form of partial verification. Methods based on the reachability analysis techniques we have examined thus far, however, can be much more efficient than simulation.

8.10.1 Computing Supersets of the Reachable States

In a sequential circuit, each state variable depends in general on some primary inputs and some state variables. If we replace some connections to state variables with connections to new primary inputs, the resulting circuit will exhibit a wider variety of behaviors: It will be able to do more things than the original circuit. This simple idea is at the heart of efficient methods for the computation of supersets of the reachable states. The circuit is partitioned into submachines, and the connections among the submachines are cut. (See Figure 28.)

Figure 28: Partitioning for Approximate Reachability Analysis.
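The effect of cutting connections can be seen on a toy example. In the following Python sketch, explicit enumeration again stands in for symbolic traversal, and the two-latch machine and the names (step_concrete, step_abstract, reachable) are invented for the illustration: replacing the connection from one latch into the next-state logic of the other with a fresh free input can only add behaviors, so the reachable states of the abstracted machine form a superset of those of the original machine.

from itertools import product

# A toy circuit with two latches x1, x2 and one primary input w.
# The next-state logic of x1 reads the other latch x2 (a latch-to-latch
# connection), and x2 simply samples x1.
def step_concrete(x1, x2, w):
    return (x2 and w, x1)

# Abstraction: the connection from x2 into the logic of x1 is cut and
# replaced by a fresh free primary input u.
def step_abstract(x1, x2, w, u):
    return (u and w, x1)

def reachable(step, n_inputs, init=(False, False)):
    """Explicit breadth-first reachability analysis (a stand-in for the
    symbolic traversal discussed in the text)."""
    reached = frontier = {init}
    while frontier:
        image = {step(*s, *ins)
                 for s in frontier
                 for ins in product((False, True), repeat=n_inputs)}
        frontier = image - reached
        reached = reached | frontier
    return reached

exact = reachable(step_concrete, 1)   # only the initial state is reachable
upper = reachable(step_abstract, 2)   # all four states are reachable
assert exact <= upper                 # the abstraction over-approximates
print(sorted(exact), sorted(upper))

Here the concrete machine never leaves its initial state, while the abstracted machine reaches all four states: the over-approximation is safe but, as the text notes, possibly coarse if the cut connections are simply ignored.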
Once a circuit is partitioned, we can traverse each submachine in isolation. Then, we can take the superset of the reachable states of the original machine to be the Cartesian product of the sets of states of the submachines. This procedure is simple, but a bit too drastic in simplifying the problem: the fact that the connections among the submachines are completely ignored causes a considerable loss of accuracy. Instead, we can devise several methods to retain in part the flow of information among the submachines. We outline one such approach.

We process the submachines in some order. For the first machine, we perform reachability analysis, assuming that the inputs from the other machines are unrestricted. When we consider the second machine, though, we assume that the inputs coming from the first machine cannot take values corresponding to states that were found to be unreachable during the first traversal. We then consider the third machine, constraining the inputs from the first and second machines, and so on. Once all machines have been considered, we can return to the first one. This time, instead of using unconstrained inputs from the other machines, we use the constraints determined by the first round of reachability analyses. This may produce a smaller reachable set of states than the initial traversal. We continue to refine the approximation, until we reach a point at which no reachable state set shrinks further. At that point we have converged to a superset of the reachable states for the original circuit.

Constraining the traversal of each submachine can be accomplished with minor extensions to the image computation techniques that we already know. Specifically, the constraint used in image computation comes from both the state variables of the submachine and the inputs fed by the other submachines. The scheme we have discussed can be improved and modified in several ways. Another important problem is how to partition the circuit into submachines. For these aspects, the reader is referred to [16, 17].

8.10.2 Computing Subsets of the Reachable States

The standard forward traversal algorithm provides a sequence of approximations from below for the reachable states. The challenge is how to provide better approximations within given time and memory limits. We outline one possible approach. We assume that we use BDDs to represent the sets of states visited during reachability analysis. We observe that BDDs are successful because they provide a dense representation for sets. Here dense means that the ratio of the cardinality of the set to the size of the representation is high. Breadth-first traversal is more or less successful, depending on how dense the BDDs are. Therefore, if we want to improve the efficiency of traversal, we may want to increase the density of the BDDs. One way to improve the density is to improve the variable order. However, for some circuits, the set of reached states may still have a large BDD in spite of our best reordering efforts. Our next observation is that breadth-first search goes through a fixed sequence of sets of states: All the initial states, followed by all the states at distance 0 or 1 from the initial states, and so on. If the states within a given distance from the initial states are randomly scattered through the state space, their representation as a BDD is likely to be large.
However, by visiting the states in some order other than the one imposed by breadth-first search, we may visit states that are tightly packed in some subspace. Such states will have a dense representation. We are therefore going to mix breadth-first and depth-first search, letting the density of the BDDs guide our meandering through the state graph: We still use the breadth-first search algorithm as the basic approach. However, at each iteration we check the size of the To set. If it is too large, we extract a dense subset from To using the algorithms of Section 5.6 and we use it instead of To as the constraint in the image computation. The states in To that are not in the dense subset are discarded. Therefore, we do not have a breadth-first search any more. A few interesting properties of breadth-first search are lost. For instance, when the set of new states is empty, we are not guaranteed that all states have been reached. In spite of this and similar disadvantages, the traversal based on density is appealing, because it may visit many more states than pure breadth-first search in the same amount of time [53, 52].

9 Conclusions

Binary Decision Diagrams are a popular data structure for verification algorithms and for several tasks in the design of calculational systems. We have reviewed the properties of BDDs that make them amenable to such applications and we have presented the main algorithms for their manipulation. We have emphasized the application of BDDs to reachability analysis, because reachability analysis is the computational task at the basis of model checking algorithms. Many variants of BDDs have been proposed and several are in use. We have not attempted to give even a cursory treatment of them, even though some of the variants have found applications in verification algorithms. The interested reader will find two good references on the subject in [10, 46].

References

[1] S. B. Akers. Binary decision diagrams. IEEE Transactions on Computers, C-27(6):509–516, June 1978.

[2] P. Ashar and M. Cheong. Efficient breadth-first manipulation of binary decision diagrams. In Proceedings of the International Conference on Computer-Aided Design, pages 622–627, San Jose, CA, November 1994.

[3] C. L. Berman. Circuit width, register allocation, and reduced function graphs. Technical Report RC 14129, IBM T. J. Watson Research Center, Yorktown Heights, NY, October 1988.

[4] B. Bollig, M. Löbbing, and I. Wegener. Simulated annealing to improve variable orderings for OBDDs. Presented at the International Workshop on Logic Synthesis, Granlibakken, CA, May 1995.

[5] K. S. Brace, R. L. Rudell, and R. E. Bryant. Efficient implementation of a BDD package. In Proceedings of the 27th Design Automation Conference, pages 40–45, Orlando, FL, June 1990.

[6] R. K. Brayton et al. VIS: A system for verification and synthesis. In T. Henzinger and R. Alur, editors, Eighth Conference on Computer Aided Verification (CAV'96), pages 428–432. Springer-Verlag, Rutgers University, 1996. LNCS 1102.

[7] R. E. Bryant. Symbolic manipulation of boolean functions using a graphical representation. In Proceedings of the 22nd Design Automation Conference, June 1985.

[8] R. E. Bryant. Graph-based algorithms for boolean function manipulation. IEEE Transactions on Computers, C-35(8):677–691, August 1986.

[9] R. E. Bryant. On the complexity of VLSI implementations and graph representations of boolean functions with application to integer multiplication. IEEE Transactions on Computers, 40(2):205–213, February 1991.

[10] R. E.
Bryant. Binary decision diagrams and beyond: Enabling technologies for formal verification. In Proceedings of the International Conference on Computer-Aided Design, pages 236–243, San Jose, CA, November 1995. [11] J. R. Burch, E. M. Clarke, and D. E. Long. Representing circuits more efficiently in symbolic model checking. In Proceedings of the Design Automation Conference, pages 403–407, San Francisco, CA, June 1991.  [12] J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J. Hwang. Symbolic model checking: and beyond. In Proceedings of the Fifth Annual Symposium on Logic in Computer Science, June 1990. states [13] J. R. Burch and D. E. Long. Efficient boolean function matching. In Proceedings of the International Conference on Computer-Aided Design, pages 408–411, Santa Clara, CA, November 1992. [14] J. R. Burch and V. Singhal. Tight integration of combinational verification methods. In Proceedings of the International Conference on Computer-Aided Design, pages 570–576, San Jose, CA, November 1998. [15] G. Cabodi, P. Camurati, and S. Quer. Improved reachability analysis of large finite state machines. In Proceedings of the International Conference on Computer-Aided Design, pages 354–360, Santa Clara, CA, November 1996. [16] H. Cho, G. D. Hachtel, E. Macii, B. Plessier, and F. Somenzi. Algorithms for approximate FSM traversal based on state space decomposition. IEEE Transactions on Computer-Aided Design, 15(12):1465–1478, December 1996. [17] H. Cho, G. D. Hachtel, E. Macii, M. Poncino, and F. Somenzi. Automatic state space decomposition for approximate FSM traversal based on circuit analysis. IEEE Transactions on Computer-Aided Design, 15(12):1451–1464, December 1996. [18] H. Cho, G. D. Hachtel, and F. Somenzi. Redundancy identification/removal and test generation for sequential circuits using implicit state enumeration. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 12(7):935–945, July 1993. [19] Y. Choueka. Theories of automata on  -tapes: A simplified approach. Journal of Computer and System Sciences, 8:117–141, 1974. [20] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. An Introduction to Algorithms. McGraw-Hill, New York, 1990. [21] O. Coudert, C. Berthet, and J. C. Madre. Verification of sequential machines using boolean functional vectors. In L. Claesen, editor, Proceedings IFIP International Workshop on Applied Formal Methods for Correct VLSI Design, pages 111–128, Leuven, Belgium, November 1989. [22] O. Coudert and J. C. Madre. A unified framework for the formal verification of sequential circuits. In Proceedings of the IEEE International Conference on Computer Aided Design, pages 126–129, November 1990. [23] R. Drechsler, B. Becker, and N. Göckel. A genetic algorithm for variable ordering of OBDDs. Presented at the International Workshop on Logic Synthesis, Granlibakken, CA, May 1995. [24] R. Drechsler, N. Drechsler, and W. Günther. Fast exact minimization of BDDs. In Proceedings of the Design Automation Conference, pages 200–205, San Francisco, CA, June 1998. [25] F. Ferrandi, A. Macii, E. Macii, M. Poncino, R. Scarsi, and F. Somenzi. Symbolic algorithms for layout-oriented synthesis of pass transistor logic circuits. In Proceedings of the International Conference on Computer-Aided Design, pages 235–241, San Jose, CA, November 1998. [26] S. J. Friedman and K. J. Supowit. Finding the optimal variable ordering for binary decision diagrams. IEEE Transactions on Computers, 39(5):710–713, May 1990. [27] H. Fujii, G. Ootomo, and C. Hori. 
Interleaving based variable ordering methods for ordered binary decision diagrams. In Proceedings of the International Conference on Computer-Aided Design, pages 38–41, Santa Clara, CA, November 1993. 65 [28] M. Fujita, Y. Matsunaga, and T. Kakuda. On variable ordering of binary decision diagrams for the application of multi-level logic synthesis. In Proceedings of the European Conference on Design Automation, pages 50–54, Amsterdam, February 1991. [29] D. Geist and I. Beer. Efficient model checking by automated ordering of transition relation partitions. In D. L. Dill, editor, Sixth Conference on Computer Aided Verification (CAV’94), pages 299–310, Berlin, 1994. SpringerVerlag. LNCS 818. [30] G. D. Hachtel, E. Macii, A. Pardo, and F. Somenzi. Markovian analysis of large finite state machines. IEEE Transactions on Computer-Aided Design, 15(12):1479–1493, December 1996. [31] G. D. Hachtel and F. Somenzi. A symbolic algorithm for maximum flow in 0-1 networks. Journal of Formal Methods in Systems Design, 10(2/3):207–219, 1997. [32] M. Held and R. M. Karp. A dynamic programming approach to sequencing problems. J. SIAM, 10(1):196–210, 1962. [33] Y. Hong, P. A. Beerel, J. R. Burch, and K. L. McMillan. Safe BDD minimization using don’t cares. In Proceedings of the Design Automation Conference, pages 208–213, Anaheim, CA, June 1997. [34] N. Ishiura, H. Sawada, and S. Yajima. Minimization of binary decision diagrams based on exchanges of variables. In Proceedings of the International Conference on Computer-Aided Design, pages 472–475, Santa Clara, CA, November 1991. [35] S.-W. Jeong, T.-S. Kim, and F. Somenzi. An efficient method for optimal BDD ordering computation. In International Conference on VLSI and CAD (ICVC’93), Taejon, Korea, November 1993. [36] S.-W. Jeong, B. Plessier, G. D. Hachtel, and F. Somenzi. Extended BDD’s: Trading off canonicity for structure in verification algorithms. In Proceedings of the IEEE International Conference on Computer Aided Design, pages 464–467, Santa Clara, CA, November 1991. [37] S.-W. Jeong, B. Plessier, G. D. Hachtel, and F. Somenzi. Variable ordering and selection for FSM traversal. In Proceedings of the IEEE International Conference on Computer Aided Design, pages 476–479, Santa Clara, CA, November 1991. [38] G. Kahmi and L. Fix. Adaptive variable reordering for symbolic model checking. In Proceedings of the International Conference on Computer-Aided Design, pages 359–365, San Jose, CA, November 1998. [39] C. Y. Lee. Binary decision programs. Bell System Technical Journal, 38(4):985–999, July 1959. [40] B. Lin and F. Somenzi. Minimization of symbolic relations. In Proceedings of the IEEE International Conference on Computer Aided Design, pages 88–91, Santa Clara, CA, November 1990. [41] D. E. Long. The design of a cache-friendly BDD library. In Proceedings of the International Conference on Computer-Aided Design, pages 639–645, San Jose, CA, November 1998. [42] F. Mailhot and G. De Micheli. Technology mapping using boolean matching. In Proceedings of the European Conference on Design Automation, pages 180–185, Glasgow, UK, March 1990. [43] S. Manne, D. C. Grunwald, and F. Somenzi. Remembrance of things past: Locality and memory in BDDs. In Proceedings of the Design Automation Conference, pages 196–201, Anaheim, CA, June 1997. [44] K. L. McMillan. Symbolic Model Checking. Kluwer Academic Publishers, Boston, MA, 1994. [45] K. L. McMillan. A conjunctively decomposed boolean representation for symbolic model checking. In R. Alur and T. A. 
Henzinger, editors, 8th Conference on Computer Aided Verification (CAV’96), pages 13–25. SpringerVerlag, Berlin, August 1996. LNCS 1102. [46] C. Meinel and T. Theobald. Algorithms and Data Structures in VLSI Design. Springer, Berlin, 1998. 66 [47] S.-I. Minato, N. Ishiura, and S. Yajima. Shared binary decision diagram with attributed edges for efficient boolean function manipulation. In Proceedings of the Design Automation Conference, pages 52–57, Orlando, FL, June 1990. [48] H. Ochi, K. Yasuoka, and S. Yajima. Breadth-first manipulation of very large binary-decision diagrams. In Proceedings of the International Conference on Computer-Aided Design, pages 48–55, Santa Clara, CA, November 1993. [49] S. Panda and F. Somenzi. Who are the variables in your neighborhood. In Proceedings of the International Conference on Computer-Aided Design, pages 74–77, San Jose, CA, November 1995. [50] S. Panda, F. Somenzi, and B. F. Plessier. Symmetry detection and dynamic variable ordering of decision diagrams. In Proceedings of the International Conference on Computer-Aided Design, pages 628–631, San Jose, CA, November 1994. [51] R. K. Ranjan, A. Aziz, R. K. Brayton, B. F. Plessier, and C. Pixley. Efficient BDD algorithms for FSM synthesis and verification. Presented at IWLS95, Lake Tahoe, CA., May 1995. [52] K. Ravi, K. L. McMillan, T. R. Shiple, and F. Somenzi. Approximation and decomposition of decision diagrams. In Proceedings of the Design Automation Conference, pages 445–450, San Francisco, CA, June 1998. [53] K. Ravi and F. Somenzi. High-density reachability analysis. In Proceedings of the International Conference on Computer-Aided Design, pages 154–158, San Jose, CA, November 1995. [54] R. Rudell. Dynamic variable ordering for ordered binary decision diagrams. In Proceedings of the International Conference on Computer-Aided Design, pages 42–47, Santa Clara, CA, November 1993. [55] J. V. Sanghavi, R. K. Ranjan, R. K. Brayton, and A. Sangiovanni-Vincentelli. High performance BDD package based on exploiting memory hierarchy. In Proceedings of the Design Automation Conference, pages 635–640, Las Vegas, NV, June 1996. [56] H. Savoj, R. K. Brayton, and H. J. Touati. Extracting local don’t cares for network optimization. In Proceedings of the International Conference on Computer-Aided Design, pages 514–517, Santa Clara, CA, November 1991. [57] T. R. Shiple. Formal Analysis of Synchronous Circuits. PhD thesis, University of California at Berkeley, Electronics Research Laboratory, College of Engineering, University of California, Berkeley, CA 94720, October 1996. Memorandum No. UCB/ERL M96/76. [58] T. R. Shiple, R. Hojati, A. L. Sangiovanni-Vincentelli, and R. K. Brayton. Heuristic minimization of BDDs using don’t cares. In Proceedings of the Design Automation Conference, pages 225–231, San Diego, CA, June 1994. [59] H. Touati, H. Savoj, B. Lin, R. K. Brayton, and A. Sangiovanni-Vincentelli. Implicit enumeration of finite state machines using BDD’s. In Proceedings of the IEEE International Conference on Computer Aided Design, pages 130–133, November 1990. [60] B. Yang, R. E. Bryant, D. R. O’Hallaron, A. Biere, O. Coudert, G. Janssen, R. K. Ranjan, and F. Somenzi. A performance study of BDD-based model checking. In Formal Methods in Computer Aided Design, November 1998. [61] B. Yang, Y.-A. Chen, R. Bryant, and D. O’Hallaron. Space- and time-efficient BDD construction via working set control. In Proc. of Asia and South Pacific Design Automation Conference (ASPDAC’98), Yokohama, Japan, February 1998. 67