Garbage Collection Algorithms
Project for CSEP 521, Winter 2007
Rick Byers – rbyers@cs.washington.edu

1 Problem and applications

Almost all programming languages have a notion of a heap, or free store, from which the programmer can request memory. Historically, the heap was managed explicitly by the programmer using allocate and release calls in a library (such as malloc/free in C, and new/delete in C++). Explicit memory management has been a significant source of software bugs and programmer pain (for example, using memory after it has been freed, freeing it twice, or forgetting to free it at all).

Starting around 1960, algorithms for automatic memory management began to be studied [McCarthy, 1960] [Collins, 1960]. With automatic memory management, a programmer can request space from the heap (e.g. by instantiating a new object with a statement like “new MyObject()” in C# or Java), and the run-time system does the job of allocating the necessary memory and releasing it for reuse when it is no longer needed. Automatic memory management is easier to use and less error-prone than explicit memory management, but is also usually considered to be less efficient and less flexible. It is usually implemented by the runtime system using a family of algorithms called “garbage collection”.

Lisp was the first widespread programming language to adopt garbage collection, in the early 60s. For various reasons (e.g. complexity of implementation, efficiency, and stubbornness of programmers), garbage collection did not become popular in mainstream industrial programming until the 90s, with the rise in popularity of the Java programming language. Today, virtually every modern high-level programming language has garbage collection as a core feature of its design. Such languages include Java, C#, ML, Visual Basic, Python, Ruby, Scheme, and Haskell – basically everything in widespread use except for C and C++.
Garbage collection is generally considered to be one of the biggest improvements in programmer productivity since the advent of the high-level programming language. The basic problem of garbage collection is to identify the memory which is no longer needed by the application, and to make it available for other allocations. Memory is determined to be no longer needed by the application if there is no longer any way for the program to refer to the region of memory in question. The main challenge in designing a GC algorithm is to find one which is “efficient” for typical uses, where efficiency is measured according to many metrics (see part 3).

2 Formalization of the problem

A heap can be thought of as a set of objects, together with a distinguished object known as the “root”. Each object has an address, a size, and a set of references (which are addresses of other objects). Addresses can be thought of as non-negative integers in a contiguous address space (from 0 to some maximum limit N). In practice, GC heaps are normally implemented using a subset of a real process address space (with potentially non-contiguous “segments”), but we can conceptually map this onto a single contiguous logical address space.

A heap has the constraint that no two objects may overlap. That is, given any two objects in the heap o1=(a1,s1) and o2=(a2,s2), it is always the case that either a1+s1≤a2 (i.e. o1 occurs entirely before o2) or a2+s2≤a1 (o2 occurs entirely before o1). A heap can be modeled as a directed graph with the vertices o=(a,s) representing objects, and the edges representing the references between objects.

The problem is then: given a heap at any point, modify the heap to improve the desired criteria (mainly availability of space for new allocations) while preserving the property that any object which is reachable from the root still exists in the modified heap. The most common and simplest modification is to remove some or all of the objects which have no path from the root.
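As a concrete illustration, the heap-as-graph model and the reachability property can be sketched in a few lines of Python. The Obj class and its field names here are illustrative, not part of any particular runtime:

```python
# A minimal sketch of the heap model above: each object has an address,
# a size, and a list of references to other objects. Reachability from
# the root is then a simple depth-first graph traversal.

class Obj:
    def __init__(self, address, size):
        self.address = address
        self.size = size
        self.references = []   # the edges of the heap graph

def reachable(root):
    """Return the set of object ids reachable from the root."""
    seen = set()
    stack = [root]
    while stack:
        o = stack.pop()
        if id(o) not in seen:
            seen.add(id(o))
            stack.extend(o.references)
    return seen

# Example heap: root -> o1 -> o2, while o3 has no path from the root.
root, o1, o2, o3 = Obj(0, 8), Obj(8, 16), Obj(24, 8), Obj(32, 8)
root.references = [o1]
o1.references = [o2]
o3.references = [o1]           # garbage may still reference live objects
live = reachable(root)
print(id(o3) in live)          # False: o3 is garbage
```

Note that o3 referencing a live object does not keep o3 alive; only paths starting at the root count.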
Other modifications are also possible, such as relocating (changing the address of) objects to coalesce regions of free space, or to preserve some ordering property. For example, at some point in a program execution, the heap may look like the graph at right.

[Figure: example heap graph with the root and objects o1–o5, in which o3, o4 and o5 have no path from the root.]

If the garbage collector were to run at this point, it would determine that o3, o4 and o5 are no longer reachable from the root, and therefore should be considered garbage. The collector may choose to remove some or all of these garbage objects, and possibly relocate some of the other reachable objects.

3 Garbage collection algorithms

Garbage collection algorithms have been an active field of research since 1960. There are many different variations on the basic GC algorithms, all of which attempt to maximize some metrics for typical allocation patterns. The dependence of an algorithm on the allocation pattern of the program means that there is usually no precise way to compare GC algorithms without also considering the exact context in which they will be used. In practice, GC algorithms are compared using both imprecise general statements of benefits and precise measurements of their behavior in specific benchmark scenarios. Some of the most important metrics for comparing GC algorithms include:
- Minimizing the time spent reclaiming memory.
- Minimizing the amount of wasted memory at all times.
- Minimizing the amount of memory necessary to perform a collection.
- Minimizing the time and resources necessary for the program to access the memory during normal execution (including maximizing CPU cache hit rate and minimizing OS page faults).
- The above usually implies maximizing locality of reference: the tendency for objects which are used together to be near each other in memory.
- Minimizing the pause-time experienced by an application during a collection.
- Minimizing the complexity of the algorithm itself (which in practice often translates into performance, adaptability, maintainability, correctness and security benefits).
- In specialized scenarios (such as small devices) there are often other metrics, like maximizing battery life or minimizing the number of writes to flash memory.

3.1 Mark-sweep

The earliest and most basic garbage collection algorithm is mark-sweep garbage collection [McCarthy, 1960], and most modern algorithms are a variant of it. Mark-sweep is a “stop-the-world” collector, which means that at some point when the program requests memory and none is available, the program is stopped and a full garbage collection is performed to free up space. In mark-sweep, each object has a “mark-bit” which is used during the collection process to track whether the object has been visited.

Here is an algorithm for mark-sweep garbage collection implemented on top of some underlying explicit memory management routines, in which free regions of the heap are also considered objects with mark bits and a known size.

mark_sweep_collect() =
    mark(root)
    sweep()

mark(o) =
    If mark-bit(o) = 0
        mark-bit(o) = 1
        For p in references(o)
            mark(p)
        EndFor
    EndIf

sweep() =
    o = 0
    While o < N
        If mark-bit(o) = 1
            mark-bit(o) = 0
        Else
            free(o)
        EndIf
        o = o + size(o)
    EndWhile

The mark-sweep algorithm operates in time linear in the size of the heap (i.e. O(N)). This doesn’t directly tell us how much overhead it imposes on a program, because it must be invoked whenever an allocation fails, and so the overhead depends on parameters such as how big the heap is and how much memory has become unreachable since the last GC. In practice, the overhead, as well as the pause-time, of mark-sweep collectors is high compared to other algorithms. Mark-sweep does however have the advantage of freeing all unused memory, but this free memory easily becomes fragmented (limiting the availability of larger contiguous regions).
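The mark-sweep pseudocode above can be turned into a small runnable sketch. Here the heap is modeled as a set of Python objects with explicit mark bits rather than raw addresses, and free() simply drops the object from the set; the names (HeapObject, Heap) are illustrative:

```python
# A runnable sketch of mark-sweep. Each object carries a mark bit;
# collect() marks everything reachable from the root, then sweeps
# away (removes) every unmarked object.

class HeapObject:
    def __init__(self, name):
        self.name = name
        self.marked = False
        self.references = []

class Heap:
    def __init__(self, root):
        self.root = root
        self.objects = {root}

    def allocate(self, name, referrer=None):
        o = HeapObject(name)
        self.objects.add(o)
        if referrer is not None:
            referrer.references.append(o)
        return o

    def mark(self, o):
        if not o.marked:
            o.marked = True
            for p in o.references:
                self.mark(p)

    def sweep(self):
        for o in list(self.objects):
            if o.marked:
                o.marked = False        # reset for the next collection
            else:
                self.objects.remove(o)  # "free" the unreachable object

    def collect(self):
        self.mark(self.root)
        self.sweep()

root = HeapObject("root")
heap = Heap(root)
a = heap.allocate("a", referrer=root)
b = heap.allocate("b", referrer=a)
c = heap.allocate("c")        # never referenced: garbage
heap.collect()
print(sorted(o.name for o in heap.objects))  # ['a', 'b', 'root']
```

Note that sweep visits every object while mark visits only reachable ones, matching the O(N) whole-heap cost discussed above.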
There is technically a space overhead for the mark-bit, but in practice a bit is usually repurposed from some other run-time data structure, since it is only needed while the program is stopped.

3.2 Semi-space

Semi-space garbage collection [Fenichel, 1969] is a copying algorithm, which means that reachable objects are relocated from one address to another during a collection. Available memory is divided into two equal-size regions called “from-space” and “to-space”. Allocation is simply a matter of keeping a pointer into to-space which is incremented by the amount of memory requested for each allocation (that is, memory is allocated sequentially out of to-space). When there is insufficient space in to-space to fulfill an allocation, a collection is performed.

A collection consists of swapping the roles of the two regions and copying the live objects from from-space to to-space, leaving a block of free space (corresponding to the memory used by all unreachable objects) at the end of to-space. Since objects are moved during a collection, all references to them must be updated. This is done by storing a “forwarding-address” for an object when it is copied out of from-space. Like the mark-bit, this forwarding-address can be thought of as an additional field of the object, but is usually implemented by temporarily repurposing some space within the object.
initialize() =
    tospace = 0
    fromspace = N/2
    allocPtr = tospace

collect() =
    swap( fromspace, tospace )
    allocPtr = tospace
    root = copy(root)

allocate(n) =
    If allocPtr + n > tospace + N/2
        collect()
    EndIf
    If allocPtr + n > tospace + N/2
        fail “insufficient memory”
    EndIf
    o = allocPtr
    allocPtr = allocPtr + n
    return o

copy(o) =
    If o has no forwarding address
        o’ = allocPtr
        allocPtr = allocPtr + size(o)
        copy the contents of o to o’
        forwarding-address(o) = o’
        ForEach reference r from o’
            r = copy(r)
        EndForEach
    EndIf
    return forwarding-address(o)

The primary benefits of semi-space collection over mark-sweep are that allocation costs are extremely low (there is no need to maintain and search lists of free memory), and fragmentation is avoided. In addition to improving the efficiency and reliability of allocation, avoiding fragmentation also improves locality of reference, which means the program will typically run faster (due to paging and CPU cache effects). The primary drawback of semi-space is that it requires twice as much memory – at any given time during program execution, half of the available memory cannot be used.

Semi-space collection executes in time proportional to the amount of reachable memory, and so, unlike mark-sweep, can be very efficient if most memory is garbage at the time of collection. However, for a given heap size, semi-space requires many more collections than mark-sweep (since it only has half the space to work with), and so if most of the objects are reachable at the time of collection, semi-space becomes much less efficient than mark-sweep.

3.3 Other variations

There are many other variations on the basic GC algorithms above. Due to space constraints, I will give only a brief mention of a few of these variations. Compacting garbage collectors typically use an algorithm like mark-sweep, but also re-arrange the objects to coalesce free space and avoid fragmentation.
This also often has the benefit of keeping the objects in memory ordered by their allocation time, which typically improves locality of reference. Compaction has most of the benefits of the semi-space algorithm (efficient and reliable allocation) without the cost of the additional memory. Compaction does, however, require significantly more time and bookkeeping during collection to copy objects and update object references appropriately.

Generational garbage collectors are designed under the assumption that recently created objects are more likely to be garbage than objects which have been alive for a long time. Such collectors typically divide the heap into two or three generations, promoting objects from one generation to the next older one when they survive a collection. With some careful bookkeeping, it is then possible to perform partial collections of only the one or two younger generations, avoiding the cost of scanning through old objects which are likely to still be alive. This results in lower overhead in most scenarios, at the cost of some additional complexity.

Incremental garbage collectors attempt to minimize the pause time incurred by a collection, at the expense of more overhead overall relative to stop-the-world collectors. This typically involves doing a little bit of the GC work at every allocation. This trade-off is appropriate for some interactive applications, such as graphics-intensive programs, where the user may notice a pause of even a fraction of a second. Concurrent garbage collectors are incremental collectors which perform collection in parallel with the program’s execution by using multiple CPUs.

Most high-performance modern industrial GCs (including the one in Microsoft’s Common Language Runtime) are generational mark-sweep compacting collectors, with an optional concurrent mode for low-latency applications.
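The generational idea can be sketched with a minimal two-generation model. This is an illustrative simplification: survivors of a minor collection are promoted immediately, and the brute-force scan of old objects' reference lists stands in for the remembered-set bookkeeping a real collector would maintain. All names here are hypothetical:

```python
# A two-generation sketch: new objects are born young; a minor
# collection traces only the young generation, using the root plus
# old-to-young references as its roots, and promotes survivors.

class Obj:
    def __init__(self, name):
        self.name = name
        self.references = []

class GenHeap:
    def __init__(self, root):
        self.root = root
        self.young = set()
        self.old = {root}          # the root lives in the old generation

    def allocate(self, name, referrer=None):
        o = Obj(name)
        self.young.add(o)          # new objects are born young
        if referrer is not None:
            referrer.references.append(o)
        return o

    def minor_collect(self):
        # Roots for a minor collection: the real root, plus every
        # reference held by an old object (a real collector would
        # track these incrementally in a remembered set).
        stack = [self.root]
        for o in self.old:
            stack.extend(o.references)
        live = set()
        while stack:
            o = stack.pop()
            if o in self.young and o not in live:
                live.add(o)
                stack.extend(o.references)
        # Survivors are promoted; all other young objects are garbage.
        self.old |= live
        self.young = set()

root = Obj("root")
heap = GenHeap(root)
a = heap.allocate("a", referrer=root)   # reachable: promoted to old
b = heap.allocate("b")                  # unreachable: collected
heap.minor_collect()
print(sorted(o.name for o in heap.old))  # ['a', 'root']
```

The payoff shown here is that minor_collect never traverses old objects themselves, which is exactly how generational collectors avoid repeatedly scanning long-lived data.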
4 Open problems

Despite active research over the past 40 years, there are many open problems in garbage collection. Working on the Common Language Runtime team at Microsoft, I see that the GC team (which has some extremely bright people) is constantly faced with new challenges and unusual scenarios in which performance should be improved. GC algorithm design seems like more of an art than a science – constantly trading off various parameters based on the priority of expected usage models. I suspect that GC algorithms will continue to be an active area of research for the next 40 years. In addition to improving performance in various situations, some concrete areas of research include:

1. Interaction with non-memory resources. How should a GC account for the fact that objects often hold other resources which have an associated cost, and sometimes strict requirements on when they must be released for correct behavior of the program? For example, a process can typically only have a certain number of file handles open at once. Objects representing files may keep the associated file handles open until they are collected. How should a GC know to collect unused file objects? Could the idea of garbage collection be generalized to cover all OS resources in addition to just memory?

2. How should a GC algorithm and the OS memory management system co-operate? For example, when should the OS notify a GC to perform a collection rather than page out some memory to disk? When should a GC avoid a collection because the cost of bringing in the pages from disk would be higher than the benefit of freeing any memory? Could the GC-related properties of a memory page somehow be summarized in memory when the page is paged out to disk by the OS?

3. How can we design GC algorithms which have good average-case performance, while limiting their worst-case performance to some provable bound?
This is necessary for real-time applications which need hard guarantees on the time that certain operations will take to complete.

4. How can a GC algorithm make effective use of highly parallel computers (e.g. more than 32 processor cores)?

5. How should we design good tools and techniques for diagnosing and visualizing problematic memory usage, such as memory leaks and other performance problems?

5 Conclusions

Garbage collection has been a mixed blessing for the software industry. On the one hand, it does seem to have greatly improved the productivity of software developers, making it easier to write correct software in the common cases. On the other hand, it adds a lot of complexity and uncertainty to the implementation of a programming environment. There are many different variations on GC algorithms, and they are usually impossible to evaluate strictly mathematically without considering a specific usage model. With garbage collection it is easier to reason about the correctness of a program, but usually harder to reason about its performance. Overall, garbage collection shifts the burden of a difficult problem from the programmer to the programming environment implementer, which for most software engineering scenarios is an excellent trade-off.

6 References

[Collins, 1960] George E. Collins. A method for overlapping and erasure of lists. Communications of the ACM, 3(12):655-657, December 1960.

[Fenichel, 1969] Robert R. Fenichel and Jerome C. Yochelson. A Lisp garbage collector for virtual memory computer systems. Communications of the ACM, 12(11):611-612, November 1969.

[McCarthy, 1960] John McCarthy. Recursive functions of symbolic expressions and their computation by machine. Communications of the ACM, 3:184-195, 1960.