Revocation Transparency
Ben Laurie (benl@google.com)
Emilia Kasper (ekasper@google.com)

Introduction
Like Certificate Transparency, only for revocation. Unlike CT, RT needs a recent status for the
certificate, so the general idea is that client will somehow obtain and check a recent proof from the
log - for example the server could retrieve it from the log regularly and serve that along with the
certificate, or clients could be sent the entire revocation list.
As in CT, consistency with claimed changes from one version to the next must be efficiently
demonstrable (strictly, what you want is an efficient update mechanism so monitors can only pay
attention to changes instead of having to download and process the whole list every 5 minutes - I
claim these are equivalent). Unlike CT, deletions are allowed as well as additions. It must also be
possible to efficiently prove that an entry is or is not in any particular version of the log.
This document does not attempt to answer all questions about revocation, such as who can revoke
and under what circumstances. It provides a mechanism for transparency: i.e. the ability to know,
efficiently, that the list of revoked certificates you see is the same as the list everyone else sees,
and that every revocation (or unrevocation) is available for scrutiny by any interested party.
There are currently two possible mechanisms, one based on a Sparse Merkle Tree, the other on
more traditional Merkle trees where the leaf nodes contain pairs from a sorted list of all revoked
certificates.

Sparse Merkle Tree1
The idea is that we store the status of every possible certificate in a Merkle tree of uncomputable
size. For example, we could hash certificates with SHA-256 and store a 1 or 0 (for revoked or not
revoked) in the corresponding leaf. The idea is, of course, general, and can store sparse values for
any list of long numbers.
Normally a tree this size would not be computable, because there are 2^256 leaves. However, the
observation is that if non-zero values are sparse, then almost all the leaves are 02. This means that
almost all level 2 nodes have the value H(0 || 0), almost all level 3 nodes have the value H(H(0 || 0)
|| H(0 || 0)), and so on. Only nodes that lead to a non-zero leaf value actually need to be computed
individually - the rest have the same value and we can compute them efficiently.
Thus, a certificate can be shown to be unrevoked by showing the path to the root in the Merkle tree
in the usual way. Of course, this path is 256 hashes long, but almost all of those hashes will have
default values, on average, and so do not need to be shown explicitly.
Computing a tree from scratch can be done efficiently. For example, here is code in Python:
hStarEmptyCache = ['0']
def HStarEmpty(n):
1
2

We have not been able to find any reference to this mechanism in the literature and believe it is novel.
In general any repeating value (or even pattern of values) can be used and the construction is still efficient.

if len(hStarEmptyCache) <= n:
t = HStarEmpty(n - 1)
t = Hash(t + t)
assert len(hStarEmptyCache) == n
hStarEmptyCache.append(t)
return hStarEmptyCache[n]
def HStar2b(n, l, lo, hi, offset):
t = hi - lo
if n == 0:
if t == 0:
return '0'
assert t == 1
return '1'
if t == 0:
return HStarEmpty(n)
split = (1 << (n - 1)) + offset
i = bisect_left(l, split, lo, hi)
return Hash(HStar2b(n - 1, l, lo, i, offset) +
HStar2b(n - 1, l, i, hi, split))
def HStar2(n, l):
l.sort()
return HStar2b(n, l, 0, len(l), 0)
Where the argument n is the depth of the tree (i.e. 256 in our example) and l is a list of the
indexes of nodes that take the value 1. Note that we actually use the characters ‘0’ and ‘1’ not the
bit values, for no particular reason.
As in CT, we need the ability to also show consistency from one version of the tree to the next.
Again, this can be done relatively easily - for each addition or deletion from the list we need to
show the path in the tree from the corresponding leaf to the root, which should yield the existing
root hash when the value 0 or 1 is used and the new root hash when the value 1 or 0 is used,
respectively.
Again, most of this path will consist of default values, which do not need to be shown.
Intuition says that a typical path will contain O(log N) non-default hash values, where N is the total
number of non-zero leaves.

Sorted List
The second mechanism takes the list of all revoked certificates and sorts it into order. A tree is
constructed where each leaf is a pair of consecutive entries from this list. Note that the leaves do
not need to be in order. Non-revocation is shown by showing the pair in the tree that bracket the
unrevoked certificate.
Consistency is shown as follows. For an insertion into the list, say w, find the pair (x, y) that bracket
w. Replace it with (x, w) and add a new leaf containing (w, y). Show the paths for the replaced
node and the new node - this proves that the new tree contains all the elements of the old tree, in
the same order, and so long as x < w < y, the new tree corresponds to a sorted list.
For a deletion, say of w as added above, find the two pairs (x, w) and (w, y). Replace the first with
(x, y) and the second with a special value (null). Again, show consistency by showing the two
paths.

Note that (null) leaves can be backfilled as new entries are added.
This mechanism is probably less efficient than the sparse tree as each change must show two
paths, each of size log N. However, it has better worst case behaviour. Note that inducing a bad
case is hard, because it requires a partial hash collision.

Consistency with Claimed Changes Is Not Enough
Suppose an evil log wants to fool you into thinking a revoked cert is, in fact, not revoked, without
raising an alarm with anyone else. This is easy to achieve: the log creates a new version of the
tree with the cert unrevoked, and then changes it back to revoked. The final state is consistent
with everyone else’s view, but it need only show you these two changes - and in between, you will
believe the cert is valid, whilst the rest of the world continues to believe it is invalid.
This is easy to solve, using the existing append-only log used by CT: each change of root hash in
the RT tree must be logged in a CT-like tree. Then the change to unrevoked and back to revoked
becomes part of the public log and must be shown to everyone.
This additional mechanism is only needed if deletions or changes of non-default leaf nodes are
allowed.