Structural Digital Signature for Image Authentication:
An Incidental Distortion Resistant Scheme
Chun-Shien Lu and Hong-Yuan Mark Liao

Institute of Information Science,
Academia Sinica, Taipei, Taiwan.
E-mail: flcs, liaog@iis.sinica.edu.tw

Abstract
The existing digital data veri cation methods are able to detect regions that have been tampered with,
but are too fragile to resist incidental manipulations. This paper proposes a new digital signature
scheme which makes use of an image's contents (in the wavelet transform domain) to construct a
structural digital signature (SDS ) for image authentication. The characteristic of the SDS is that it
can tolerate content-preserving modi cations while detecting content-changing modi cations. Many incidental manipulations, which were detected as malicious modi cations in the previous digital signature
veri cation or fragile watermarking schemes, can be bypassed in the proposed scheme. Performance
analysis is conducted and experimental results show that the new scheme is indeed superb for image
authentication.
keywords: Digital signature, Wavelet transform, Authentication, Fragility, Robustness.

 The preliminary version of this paper will be published in [14] (http://smart.iis.sinica.edu.tw/lcs).

1

1 Introduction
Because of the easy-to-copy nature of digitized media, it is very easy for one to tamper with digital data
without leaving any clues. Under these circumstances, integrity veri cation has become an important
issue in the digital world. Conventionally, the methods used for media veri cation can be classi ed into
two kinds: digital signature-based [2, 3, 5, 7, 8] and watermark-based [4, 6, 9, 12, 17, 18, 19, 20, 21, 23].
A digital signature is a set of features extracted from a media, and these features are stored as a le,
which will be used later for authentication. A very important characteristic of a digital signature is
that it suciently represents the content of the original media. Watermarking, on the other hand, is
a media authentication/protection technique that embeds invisible (or inaudible) information into a
media. For content authentication, the embedded watermark can be extracted and used for veri cation
purposes. The major di erence between a watermark and a digital signature is that the embedding
process of the former requires the content of a media to change. However, both the watermarkbased approach and the digital signature-based approach are expected to be sensitive to any malicious
modi cation applied to the media. For an incidental modi cation such as JPEG compression or
blurring, a good authentication system should be able to tolerate it. Unfortunately, most of the
existing media authentication systems, though they can detect malicious tampering successfully, are
vulnerable to incidental modi cations. The main reason for the above mentioned problem is that the
existing methods do not consider carefully the tradeo between robustness and fragility. In the whole
course of this study, we shall focus our discussion on the image authentication system.
The underlying techniques used to implement the digital signature-based or watermark-based
approaches can be roughly classi ed into quantization-based [6, 12, 22], feature point-based [2, 3],
and relation-based [7, 8]. As to a quantization-based approach, Kundur and Hatzinakos [6] designed
a quantization technique to encode a watermark so that the hidden watermark is more/less sensitive
to modi cations at high/low frequency in the wavelet domain. Usually, over-sensitivity may occur
at the small-to-medium scale while under-sensitivity may only happen at the medium-to-large scale.
With this understanding, one could make application-dependent decisions on whether an image is
credible or not when encountering some modi cations. The major problem associated with [6] is that
the tampering detection results are very unstable. It is well known that the perturbation applied to
a wavelet coecient may make the extracted mark di erent from or still the same as the embedded
one. In other words, the extracted result may be completely unpredictable. Another drawback of [6]
is that the method cannot resist incidental modi cations. Recently, we have proposed a multipurpose
watermarking scheme [12, 13] for image/audio authentication and protection. Our method combines
2

a media data-dependent quantization technique and a complementary watermark hiding strategy
[10, 11] to conceal watermarks. We have also proposed several detection methods to optimize the
tradeo between robustness and fragility.
As to feature point-based authentication systems, Bhattacharjee and Kutter [2] proposed to generate a digital signature by encrypting the feature points' positions in an image. Authentication is
then accomplished by comparing the positions of the feature points extracted from a questionable
image with those decrypted from the previously encrypted digital signature. It is not certain that this
approach can resist JPEG compression with middle-to-high compression ratios because the feature
points are liable to be shifted. Recently, Dittmann et al. [3] presented a content-based digital signature approach for image/video authentication using edge characteristics. Their content features are
similar to [2], but di erent extraction techniques are used.
A typical relation-based technique for developing an image authentication system has been reported
by Lin and Chang [7, 8]. In order to make the designed image authentication system tolerate JPEG
compression, Lin and Chang [7, 8] dedicated themselves to exploring the operation in a JPEG-based
system. They proposed to extract a digital signature by using the invariant relation existing between
any two DCT coecients, which are at the same position of two di erent 8  8 blocks. They found
that the invariance properties could always be preserved before and after JPEG compression. However, they didn't mention clearly whether their method could survive other incidental manipulations.
Although they used the invariance property to achieve their goal, the extracted relation is random by
nature. In other words, the merit of the image structure, which is a very important feature, was not
utilized.
In this paper, we will develop a new digital signature-based image authentication scheme which is
completely di erent from the existing methods. In the proposed method, commonly adopted features
such as the position of feature points or the relationship of any two random coecients are not used
at all. On the contrary, we propose to use the \structure " of an image as a digital signature. In the
proposed scheme, the structure of an image's contents is composed of a number of parent-child pairs
in the wavelet domain. We build up a structural digital signature and check to see if it is robust under
content-preserving manipulations and fragile under content-changing manipulations. Performance
analysis on the proposed new image authentication system has been conducted and the experimental
results have proven the powerfulness of the system.
The remainder of this paper is organized as follows. In Sec. 2, we will present the proposed
structural digital signature-based image authentication scheme. This will include the construction
3

and veri cation of a structural digital signature. An analysis on the performance of our proposed
scheme will be conducted in Sec. 3. We will discuss the false positive and false negative problems
when incidental distortions and/or malicious tampering are encountered. In addition, we will analyze
the e ect that occurs when the size of a structural digital signature changes. Based on the analysis,
a systematic way can be derived to determine the best size for use. In Sec. 4, a series of experiments
will be conducted and their results will be reported. Concluding remarks will be given in Sec. 5.

2 Structural Digital Signature (SDS )
Our digital signature scheme is based on the wavelet transform due to its excellent multiscale and
precise localization properties. Basically, the multiscale representation of an image is by nature highly
suitable for designing a structural digital signature. In Sec. 2.1, we will introduce how to de ne
a structural digital signature based on the interscale relation of wavelet coecients. The rules for
instructing how to label an SDS will be described in Sec. 2.2. The metric and the procedure used to
authenticate an incoming unknown image will be detailed in Sec. 2.3. Analysis issues about the size
and the complexity of an SDS will be elaborated on in Secs. 2.4 and 2.5, respectively.

2.1 De ning SDS based on Interscale Relation of Wavelet Coecients
Let ws;o(x; y) represent a wavelet coecient (at scale s, orientation o, and position (x; y)) in the
orthogonally downsampled wavelet transform domain of an image I. Suppose a J -scale wavelet transform is performed, then 0  s < J . It is well known that a large/small scale represents a coarser/ ner
resolution of an image, i.e., the low/high frequency part. The orientation o may be in a horizontal, vertical, or diagonal direction. The interscale relationships of wavelet coecients can then be converted
into the relationships between the parent node ws+1;o (x; y) and its four child nodes ws;o (2x + i; 2y + j )
with
jws+1;o(x; y)j  jws;o(2x + i; 2y + j )j;
(1)
or

jws

;o (x; y)j < jws;o (2x + i; 2y + j )j;

+1

(2)

where 0  s < J , 0  i; j  1, and 1  x  N and 1  y  M (N  M is the image size). Combining
Eqs. (1) and (2), the above two relations can be rewritten as

jjws

;o (x; y)j ? jws;o (2x + i; 2y + j )jj  0:

+1

4

(3)

In order to design a reliable scheme for image authentication, we propose a new signature method called
structural digital signature (SDS ). The new signature can be obtained by observing the interscale
relations of wavelet coecients of an image. The basic concept of the new scheme relies on the
following: (i) the interscale relationship should be dicult to be destroyed after content-preserving
manipulations; and (ii) this interscale relationship should be dicult to be preserved after contentchanging manipulations. Because these interscale relationships result from the structure of an image
(say I), we de ne them as the structural digital signature of I and call it SDS (I).
The structural digital signature of an image consists of a set of parent-child pairs which satisfy

jjws

;o (x; y)j ? jws;o (2x + i; 2y + j )jj   ( > 0):

+1

(4)

The above constraint is stricter than the original interscale relationship of wavelet coecients shown
in Eq. (3). The size of  will determine the number of parent-child pairs recorded in an SDS (I). The
smaller the  is, the larger the amount of elements in an SDS . We do not intend to keep all the parentchild pairs as elements of an SDS because some of the elements may not be signi cant enough. The
signi cance of a parent-child pair is completely dependent on their magnitude di erence. The larger the
di erence, the more signi cant the parent-child pair is. A parent-child pair whose magnitude di erence
is small is equivalent to having a \small" quantization interval in the quantization-based approaches
[6, 12, 22]. Therefore, it will be very sensitive to modi cations including some minor incidental ones.
In order to design a robust image authentication scheme, we only consider those parent-child pairs
whose magnitude di erences are large as the elements of a structural digital signature. In order to
appropriately detect malicious tampering while tolerating an incidental modi cation, we use the size
of a structural digital signature to control the tradeo between fragility and robustness. In general,
the construction of a structural digital signature is very easy because there is no feature point selection
involved [2, 3].
Once the parent-child pairs are selected by the constraint de ned in Eq. (4), each pair is assigned
a symbol that represents what kind of relationship this pair carries. These symbols will be formally
de ned in Sec. 2.2. The above mentioned symbols and their locations in the wavelet domain will be
encrypted by a public key algorithm such as the famous RSA method [15]. Finally, the encrypted
information will be stored and used for image authentication later.

5

2.2 Labeling an SDS
According to the interscale relationship existing among wavelet coecients, there are four possible
relationship types of an SDS . Assume the magnitude of a parent node p is larger than that of
its child node c (i.e., jpj > jcj), then the four possible relationships of the pair, < p; c >, are: (i)
p  0; c  0; (ii) p  0; c  0; (iii) p  0; c  0; (iv) p  0; c  0. Consider the case when jpj > jcj and
c is small. In order to make < p; c > still credible when incidental modi cations are encountered, the
value of c is not important. Therefore, the relations (i) and (ii) can be merged to form a signature
symbol I under the condition that p  0 and c don't care. On the other hand, the relations (iii) and
(iv) can be merged to form another signature symbol II , under the condition that p  0 and c don't
care. In other words, we intend to keep the sign of the larger element unchanged while disregarding the
smaller one under the constraint that their original interscale relationship is still preserved. Similarly,
signature symbol III (under the condition that c  0 and p don't care) and IV (under the condition
that c  0 and p don't care) can be de ned under the constraint jpj < jcj. For those pairs that are not
recorded in an SDS are all labeled by the fth signature symbol V . Hence, we represent the signature
symbol of a parent-child pair as sym(< p; c >), which can be one of the above de ned symbol types.
In the following section, we shall describe how the veri cation process is executed.

2.3 Veri cation
In the veri cation process, if one would like to verify an unknown image ~I, it is rst wavelet transformed
and then its structural digital signature SDS (~I) that should be constructed. The encrypted structural
digital signature of the original image I is retrieved and then decrypted to obtain its corresponding
SDS (I). One can say the interscale relationship of a pair < p; c > in I is still unchanged in ~I if their
signature symbols are the same. That is, the relation

sym(< p; c >) = sym(< p~; c~ >)

(5)

holds, where the pair < p~; c~ > in ~I is the corresponding pair of < p; c > in I. Finally, we calculate
the completeness of the SDS (CoSDS ) in ~I, which is de ned as the similarity degree, Sim, between
SDS (I) and SDS (~I):
+
? N? ;
CoSDS (~I) = Sim(SDS (I); SDS (~I)) = NjSDS
(I)j

(6)

where N + represents the number of pairs satisfying Eq. (5) and N ? represents the number of pairs
violating Eq. (5). jSDS (I)j is used to denote the number of parent-child pairs in SDS (I). From
6

Eq. (6), we know that CoSDS (~I) will fall into the interval [?1 1]. In other words, the completeness
of SDS represents the ratio of how many parent-child pairs are preserved to satisfy their interscale
relationships. A larger CoSDS means the suspect image ~I is reliable; otherwise, it means ~I has been
maliciously tampered with. In addition, the location of a tampering region can be easily detected
from those parent-child pairs whose signature symbols have been updated.

2.4 How the Size of an jSDS j in uences the Compromise between Robustness and
Fragility
In this subsection, we shall discuss how the constituent parent-child pairs of an SDS in uence a
compromise between robustness and fragility. Let the magnitudes of the di erences of parent-child
pairs in a structural digital signature be arranged in a decreasing order. It is known that the elements
(parent-child pairs) with larger magnitudes are not vulnerable to attacks while those with smaller
magnitudes tend to be easily attacked. Therefore, one can use the larger elements to indicate robustness and use the smaller elements to re ect fragility. Under the circumstances, when the size of a
structural digital signature becomes large, the elements with smaller magnitudes tend to be changed
so that the robustness property is more or less a ected. On the other hand, the modi cation of the
smaller elements will re ect accurately the degree of fragility. So, if jSDS j is small enough such that
elements are all with larger magnitudes, then the fragility property may disappear. In Sec. 3, we will
give a systematic way to determine  (which also determines the jSDS j) by a statistical analysis of
the distributions on an SDS and the behavior of an attack.

2.5 Complexity Analysis on an SDS
In this section, the complexity of a structural digital signature will be analyzed. Let the number of
parent-child pairs in an SDS be n. The rst part of an SDS we should store is the child locations of the
n parent-child pairs. The reason why the child locations are examined instead of the parent locations is
that they are easily tracked. For example, if a child node's location is (x; y), then its parent's location
is (b x2 c,b y2 c). On the contrary, if a parent node's location is (x; y), there are four possible locations
for a child. They are (2x + i; 2y + j ) where 0  i; j  1. For the n parent-child pairs, 2  n bytes are
required to store their locations because each location needs two bytes. In addition, each parent-child
pair in an SDS has four possible interscale relationships. Since each interscale relationship needs two
bits to express it, a total of n4 bytes is required to store all the interscale relationships.
In fact, the storage can be further reduced if the locations of child nodes are stored based on their
7

pre-de ned ordering. Under the circumstances, the number of occurrences of every signature symbol
is counted. For the rst four types of symbols, we store the number of parent-child pairs and then
the locations of these pairs. In this way, the memory used for storing the signature symbols will be
reduced from n4 bytes to 4 bytes. That is, a total of (2n + 4) bytes is required to store a structural
digital signature before encryption.

3 Performance Analysis
Usually, a watermark-based or digital signature-based authentication method must be justi ed by the
false positive (false alarm) and false negative (miss detection) probability analyses like those that have
been done in [6, 7, 11]. For an image authentication system, a false positive probability means an
image is detected to be maliciously tampered but in fact it is not. On the other hand, a false negative
probability means an image is actually modi ed by a malicious tampering but some tampered areas
are not detected. A practical signature system should ensure that both the false positive and false
negative probabilities are reasonably small. The analysis on the false positive and the false negative
probabilities will be elaborated in Secs. 3.1 and 3.2, respectively. The relationship between the
predetermined threshold  and the strength of attacks will be discussed in Sec. 3.3. The security
issues will be discussed in Sec. 3.4

3.1 False Positive due to Incidental Manipulations
An incidental modi cation like the JPEG compression is a kind of \attack" that we would like to
bypass. If an incidental attack is detected, it will cause a false positive type error. Let I be an image,
A be any incidental manipulation, and be a wavelet function. A distorted image, IA, can be derived
by I  A, where  is a convolution operator. Since the authentication process is conducted in the
wavelet domain, the whole transformation process can be denoted as

 (I  A) = (  I)  Af = I  Af ;

(7)

where I is the wavelet transformed image in the space-frequency domain and Af is a version of A
in the frequency domain. Eq. (7) indicates that the wavelet transform of the distorted image IA is
equivalent to the modi cation (by Af ) of the wavelet transformed image I . If Af is a quantization
operation of some compression methods, any coecient in I will only be a ected by itself through
Af . Because the behavior of compression like SPIHT [16] is easily predicted and its corresponding
tree structure is required in constructing an SDS , we will analyze its e ects. SPIHT is a progressive
8

image coding scheme in which the most signi cant bits are transmitted rst. Suppose p (a parent
node) and c (a child node) form a parent-child pair in an SDS and their wavelet coecients satisfy
the relation 2k  jpj  2k?1      2k?j  jcj  2k?(j +1) with j  1. When a SPIHT compression
is executed, we may encounter three di erent possibilities: (1) when the compression ratio is high,
suppose 2t is the threshold nally used in the dominant process [16] and t  k, the reconstructed
parent-child pair, pr and cr , are both zeros. This means the original relationship jpj  jcj is preserved
when pr = cr = 0; (2) when the compression ratio is medium, suppose 2k?1  2t  2k?j , we will
have jpr j > jcr j = 0. Again, the parent-child pair's relationship is preserved; (3) for a compression
with a small ratio, suppose 2k?(j +1)  2t , we will have jpr j > jcr j 6= 0. Once again, the parent-child
pair's relationship is preserved. From the above derivation, it is guaranteed that the proposed SDS
will survive a SPIHT compression at any ratio. A similar conclusion can be applied to the JPEG
compression.
On the other hand, if A is another incidental manipulation (excluding compressions), its behavior
may not be easily analyzed because the change of a speci c coecient may be determined by its
neighbors. However, it is known that an incidental manipulation tends not to destroy the semantics of
an image. Based on this understanding, an SDS will not be signi cantly destroyed when an incidental
manipulation is encountered. Therefore, one can expect that a structural digital signature is indeed a
good mechanism for tolerating incidental modi cations.
Another advantageous point of using SDS is its stable nature against rounding errors. The reason
why this is true is due to the large chosen value of  (by Eq. (4)). When the constituent elements
of an SDS are all with a large , rounding errors that emerge won't in uence the relationship of a
parent-child pair.

3.2 False Negative due to Content Replacement
When a malicious modi cation like content replacement is applied to an image, its corresponding SDS
will have a signi cant change that is very easy to detect. Therefore, we can expect the false negative
probability in this case to be very low. Suppose a parent node p (p > 0) and a child node c is a pair
in an SDS . They have the relation jpj  jcj with jjpj ? jcjj = i (i > ). For simplicity, let p be
attacked by a malicious manipulation with the modi cation quantity Mp . If jp ? Mp j > jcj holds under
the condition that jpj > jcj, then a false negative occurs because 0  Mp  i . If the e ect caused by
Mp forms a Gaussian distribution with variance 2, then the false negative probability can be de ned
9

as

R i

t2
2

?i Ce dt
t2
R1
2 dt
Ce
?1

(C is a constant). When a malicious distortion is applied to an image, if (0   1)

represents the proportion of the parent-child pairs that has been maliciously tampered with but still
maintains their interscale relations, then the total false negative probability will be
R i

t2

2 dt
= jSDS j ?i Ce
Pfn = ii=1
:
t2
R1
2 dt
Ce
?1

(8)

From Eq. (8), it is not dicult to imagine that Pfn will be very low. In other words, the false negative
probability will be very low when a content replacement operation is applied to an image.

3.3 The Relation between  and the Strength of Attacks
In this subsection, we will discuss an issue regarding the relationship between  and the strength of
an attack. Recall that jSDS j denotes the number of parent-child pairs whose interscale relationships
are recorded in a structural digital signature. Attacks can be roughly classi ed into two categories:
incidental manipulation and malicious distortion. To simplify the analysis, we assume the strength of
an attack, a, is a Gaussian distribution, G A , with a mean of zero. According to the Gaussian modeling
of attacks [6, 12, 22], we have the following analysis. Usually, an incidental manipulation tends to have
a small standard deviation I while a malicious tampering tends to have a large standard deviation
M , i.e., I < M . Some reference values regarding I and M were provided in [7] for a speci c image.
Based on our scheme, a structural digital signature is constructed by selecting those parent-child pairs
whose di erences in magnitudes are larger than . The di erence in magnitude, d, may have two
forms: positive di erence (d  0) and negative di erence (d < 0). The positive di erence portion and
the negative di erence portion both form a Gaussian distribution, G S , without a mean of zero. Their
standard deviations are denoted as S , which is usually very large (scale of hundreds) because the
variance of d is large in the wavelet domain and is larger than I . The possible relationships between
G A and G S are depicted in Fig. 1. In Fig. 1, the Gaussian distributions shown in the middle part
are G A , whereas the right/left one is G S corresponding to a positive/negative d.  is de ned as the
intersection point of G A and G S . The shaded areas, which represent the parent-child pairs with a
smaller di erence jdj (in the tails of G S ), are assumed to be updated based on the value in the tails of
G A. Next, we will analyze the e ect of I and M on , respectively.
First, let an incoming attack be an incidental one such as JPEG=SPIHT compression or rescaling.
The probability that the relationship of parent-child pairs may be destroyed (i.e., d's sign is changed)
10

is denoted as pI (the shaded areas in Fig. 1) and can be calculated by

pI = 2  (P f0 < d <  ? g + P f < a < 1g)
= 2  (P f0 < d <  ? g + (1 ? P f0 < a   g))
= 2  (erf ( 2?  ) + [1 ? erf ( 2 )]);
S

I

(9)

where erf () represents the error function [1] which is de ned as:
Z "
2
erf (") = p
e?u2 du:



0

In Eq. (9), the constant 2 represents the two symmetric G S 's that belong, respectively, to the positive
and negative d. Because the attack under consideration is incidental,  ?  is usually small. Since
the standard deviation S of GS is on the scale of hundreds, 2?S is, thus, very small. Under the
circumstances, the rst term in Eq. (9), erf ( 2?S ), approximates zero. On the other hand,  satis es
 >  and  is chosen to be large (Eq. (4)), so  is also large enough. For an incidental attack, we
know the value of I is usually small. Therefore, 2I is large. As a consequence, the second term,
[1 ? erf ( 2I ))], should be very small. In summary, the above discussion explains why the probability
P I can be suciently small if the incoming attack is incidental with a small I . That is,

pI  2  [1 ? erf ( 2 )]  0:
I

(10)

The near-optimal  can be derived based on the condition that the incoming attack is incidental and
the value of pI is smaller than a pre-determined threshold  (e.g.,  = 0:1). Under the circumstances,
the near-optimal  can be derived by

pI  2  [1 ? erf ( 2 )] < :
I

Thus, we have

1 ? 2 < erf ( 2 ):
I

(11)

Using a predetermined  together with I and checking the tables of error function [1], we should
be able to obtain the lower bound of  . From this  , the lower bound of a near-optimal  can be
approximately determined because based on the Gaussian models shown in Fig. 1  is close to  .
Now, let the incoming attack such as object placement/replacement or cloning be malicious. The
probability that the relationships of parent-child pairs in a structural digital signature may be destroyed is de ned as

pM = 2  (P f0 < d <  ? g + P f < a < 1g)
11

= 2  (P f0 < d <  ? g + (1 ? P f0 < a   g)

= 2  (erf ( 2?  ) + [1 ? erf ( 2 )]):
S

M

(12)

In Eq. (12),  ?  is known to be small and, thus, 2?S is very small. As a consequence, the rst term
in Eq. (12), erf ( 2?S ), has a value close to zero because it corresponds to an incidental modi cation.
It is also known that M is usually large and that it may lead to a small 2M . Therefore, the second
term of Eq. (12), [1 ? erf ( 2M ))], has a value which is far from zero. In general, the detection rate
of regions that are maliciously tampered with is determined mainly based on the second term. If we
assume P M is large enough, and M and the tables of error function [1] are available, we will be able
to determine the upper bound of  . From the above  , the upper bound of a near-optimal  will be
approximately obtained as in the case of incidental modi cations.
To sum up, the interval where a near-optimal  should fall can be mathematically derived from
the above analysis. In Sec. 4, we will provide a numerical example to show how di erent values of 
a ect pI .

3.4 Security Problem
In this section, we will discuss the issues regarding (1) the positions of the elements in a structural
digital signature which are known or are correctly guessed; (2) the image intensity is constantly
changed.

3.4.1 Tampering at the Locations Where SDS Does not Record
If the locations of the elements in an SDS are correctly guessed, the attacker may try to tamper with
those positions which are not recorded in the corresponding SDS (I) and thus disable our method.
Fortunately, the attackers cannot succeed in this case because if the parent-child pairs are not recorded
in an SDS (I), then their interscale relationships do not satisfy the condition in Eq. (4). In other
words, we can verify it easily by checking the signature symbols of those parent-child pairs that are
not recorded in SDS (I) and SDS (~I). Let < ws;o(x; y); ws+1;o (2x + i; 2y + j ) > be a parent-child pair
which is not in SDS (I) and assume its corresponding pair < w~s;o (x; y); w~s+1;o (2x + i; 2y + j ) > is not
in SDS (~I), where 0  i; j  1. We can determine whether the < ws;o (x; y); ws+1;o (2x + i; 2y + j ) >
pair is tampered with or not by checking sym < w~s;o (x; y); w~s+1;o (2x + i; 2y + j ) >. If sym <
w~s;o(x; y); w~s+1;o (2x + i; 2y + j ) > is not equal to V , then it has been tampered with. It is known that
the condition for sym < w~s;o(x; y); w~s+1;o (2x + i; 2y + j ) > to belong to V is jjw~s;o (x; y)j?jw~s+1;o (2x +
i; 2y + j )jj < .
12

3.4.2 The Condition that Image Intensity Is Constantly Changed
Attackers may think that they can modify the image's intensity without triggering our authentication
scheme. One possible method is to constantly increase or decrease the intensity of an image I so that
the interscale relationships of all parent-child pairs are not changed. One solution to conquer this
problem is to record the wavelet coecients of the lowest frequency band because they represent the
approximate information of a whole image. In addition, the high frequency bands will not be altered
because a constant convolved with a wavelet will be zero due to the nature of wavelets. Once an image
is tampered with by a constant update, its lowest frequency band will re ect this change. Lin and
Chang [7] used a similar method to solve the above mentioned problem in the DCT domain.

4 Experimental Results
Our structural digital signature-based image authentication scheme was rst tested against a Beach
image with 256  256 size, as shown in Fig. 2(a). A large \umbrella" was placed in Fig. 2(a) and
formed a tampered image as shown in Fig. 2(b). We used a 4?scale wavelet transform to transform
the images so that the resolution of the lowest-frequency channel had the size of 16  16. At rst, the
parent-child pairs whose di erence d satisfying jdj >  = 256 were chosen to construct an SDS . The
detected tampering areas were shown in Figs. 2(c)(e). Another set of detected results using  = 128
was shown in Figs. 2(f)(h). As we expected, the SDS with a smaller size will lose some tampered
pixels. However, the integration of multiscale results was sucient to re ect the area tampered with.
Another set of experiments was conducted by placing a \small" object at the bottom-right corner
of the \peppers" image. Fig. 3(a) and Fig. 3(b) show, respectively, the host image and the image
tampered with. Figs. 3(c)(e) and Figs. 3(f)(h) show, respectively, the detected multiscale results
when  = 256 and  = 128. The above experiments provided a good example of the compromise
between robustness and fragility using two structural digital signatures with di erent sizes.
In the second part of our experiments, we applied several incidental distortions to Fig. 2(a) to test
the robustness of our scheme. Three structural digital signatures with a di erent number of parentchild pairs were constructed, and their corresponding positions in the wavelet domain were shown in
Fig. 4. It can be seen that the SDS with a smaller/larger jSDS j (corresponding to a larger/smaller
) would result in fewer/more elements. Table 1 shows the completeness of SDS obtained under
di erent SPIHT compression ratios using three di erent . It is obvious that when the compression
ratio was smaller than 32, most of the derived CoSDS were perfect. However, when the compres13

sion ratio reached 64, some fragile results emerged for  = 64. For the JPEG compression, perfect
preservations of SDS (except for the results obtained from  = 64) were obtained for quality factors
ranging from 60% (7 : 1) to 10% (21:7 : 1), as shown in Table 2. Table 3 summarized the veri cation
results obtained under other incidental distortions including rescaling, histogram equalization, blurring, median ltering, sharpening, and Gaussian noise adding. These manipulations are sometimes
unavoidable in image processing and, thus, cannot be considered as malicious modi cations. From
Tables 13, we can nd that the completeness of a structural digital signature was consistently very
high for incidental manipulations when  > 64. This indicates that our method can tolerate common
incidental modi cations very well. However, the above conclusion is true only when the value of  is
large enough (e.g.,  > 64 in our experiments). Theoretically, a reasonable  can be determined based
on the analysis described in Sec. 3.
Next, we shall show how the value of  in uences the probability that the relationship of the
parent-child pairs in an SDS is destroyed. Table 3 illustrated six incidental modi cations which were
used in this experiment. The minimum distance () used for thresholding were 256, 128, and 64,
respectively. The curves shown in Fig. 5 indicated that when  was set to 128 or 256, the probability
that the relationship of the parent-child pairs in an SDS being destroyed was zero. From Fig. 5,
we found that the values obtained by theoretical analysis were not necessarily consistent with the
experimental results. This phenomenon can be explained by the following potential reasons: (1) The
behavior of an incidental manipulation and the elements of a structural digital signature are both
assumed to be Gaussian distributed for the sake of simplicity. However, it may not be the case; (ii)
We propose the shaded areas in Fig. 1 that re ect the relationship of those parent-child pairs with
small jdj will be destroyed, but in a practical situation this may not be true. In fact, any parent-child
pair in a SDS could possibly be destroyed. We can only say that the pair with a smaller di erence
has a higher probability of being destroyed. Even when the  of Eq. (11) is set in advance and the
near-optimal  is determined, one cannot decide whether an incoming attack is incidental or not. This
is because when the regions that have been maliciously tampered with are very small, the number
of destroyed parent-child pairs is small too and, thus, its value has the probability of being smaller
than . Therefore, we suggest that the nal decision on whether an attack is incidental or malicious
still needs human intervention so that a perfect perceptual judgement can be made. Under the above
circumstances, if the regions detected as having been tampered with are very small and spread over
a whole image but are still recognizable and meaningful, the imposed attack should be regarded as
malicious. Except for the example of a tiny content-changing modi cation shown in Fig. 3, our scheme
14

is able to determine whether the imposed attack is malicious or incidental by merely comparing the
value of  and 1 ? CoSDS (~I).
In the following, we shall use our scheme to authenticate the images that were modi ed by an
incidental manipulation and a malicious distortion simultaneously. Fig. 6(a) shows a beach image
which was rst JPEG compressed with a quality factor of 10% and then an \umbrella" object was
placed. The veri cation results obtained at 22  24 scales using  = 128 were shown in Figs.
6(b)(d), respectively. As we can see from these results, the area where the umbrella was placed could
be approximately detected and the JPEG compression did not a ect the veri cation results. The
experiment indicated that the structural digital signature eciently tolerated the JPEG compression
while sensitively detecting object placement. Another set of experiments was shown in Fig. 6(e)(h).
The beach image was rst scaled down to 128  128 from 256  256, and then the umbrella object
was placed on it. Finally, the image was rescaled to the original size 256  256, as shown in Fig. 6(e).
When  was set to be 128, Figs. 6(f)(h) showed the placed umbrella was detected at 22  24 scales.
It can be seen that some small fragments which were not the targets were mistakenly detected. This
is because the changes of wavelet coecients that resulted from rescaling are more liable to destroy
the structural digital signature than the JPEG. However, we can also see that the regions belonging
to the \umbrella" tend to be clustered together. By comparing the values shown in Table 2 and Table
3, it is easy to see that the CoSDS values obtained by applying JPEG with any quality factors are
higher than those obtained by applying rescaling.
Finally, we conducted an experiment to demonstrate if malicious tampering occurred on areas
which were not recorded in an SDS , then they could also be detected as we have analyzed in Sec. 3.4.
In Fig. 7(a), a helicopter was placed on the sky portion of the beach image (Fig. 2(a)). As we can see
from Fig. 4, the wavelet coecients in the sky area did not belong to the structural digital signature.
Using the proposed scheme, the area tampered with could be detected and shown, respectively, in
Figs. 7(b) (d) when  = 128. The blocky e ect shown in Fig. 7(b) (d) was the natural result
inherited from the multiresolution representation of the wavelet transform.
From the above experiments, we could make a conclusion about the selection of . The value of 
can be mathematically determined from the analysis described in Sec. 3. However, the assumptions
used in Sec. 3 may not always hold, so we can empirically choose  to be at least 128 which has been
con rmed by several experimental results.

15

5 Conclusion
For image authentication, it is desired that the veri cation method be able to resist content-preserving
modi cations while being sensitive to content-changing modi cations. In this paper, a new structural
digital signature scheme has been proposed for image authentication. We make use of the structure of
an image to construct a digital signature. The only way to destroy the structure of our digital signature
is to signi cantly change the image's content and that would be detected as malicious. In addition,
some unavoidable image processing techniques will preserve a great many of the SDS which would
be detected as incidental. Performance analysis of the structural digital signature has been provided
and experimental results show that our scheme is really robust to content-preserving manipulations
and fragile to content-changing distortions.
Our future work will consider geometric distortions such as rotation and translation, which cannot
be tolerated in this paper because the structural digital signature built in the wavelet domain is variant
to rotation and translation. Another future work will focus on developing structural watermarking,
which can be used for public-key detection from the viewpoint that a watermark structure can only
be removed if its structure is destroyed.
Acknowledgment: The authors thank Dr. Martin Kutter for providing the beach image and the
umbrella image used in the experiments.

16

References
[1] M. Abramowitz and I. A. Stegun, \Handbook of Mathematical Functions with Formulas, Graphs,
and Mathematical Tables", Dover Publications , Inc., New York, 1965.
[2] S. Bhattacharjee and M. Kutter, \Compression Tolerant Image Authentication", IEEE Inter. Conf.
on Image Processing , USA, pp. 435-439, 1998.
[3] J. Dittmann, A. Steinmetz, and R. Steinmetz, \Content-based Digital Signature for Motion Pictures Authentication and Content-Fragile Watermarking", IEEE Inter. Conf. Multimedia Computing and Systems , Vol. II, Italy, pp. 209-213, 1999.
[4] J. Fridrich, \Methods for Detecting Changes in Digital Images", Proc. IEEE Int. Workshop on
Intell. Signal Processing and Communication Systems , 1998.
[5] G. L. Friedman, \The Trustworthy Digital Camera: Restoring Credibility to the Photographic
Image", IEEE Trans. Consumer Electronics , Vol. 39, pp. 905-910, 1993.
[6] D. Kundur and D. Hatzinakos, \Digital Watermarking for TellTale Tamper Proo ng and Authentication", Procceedings of the IEEE , Vol. 87, pp. 1167-1180, 1999.
[7] C.-Y. Lin and S.-F. Chang, \A Robust Image Authentication Method Surviving JPEG Lossy
Compression", SPIE Storage and Retrieval of Image/Video Database , Vol. 3312, San Jose, 1998
(www.ctr.columbia.edu/cylin/auth/auth.html).
[8] C.-Y. Lin and S.-F. Chang, \Generating Robust Digital Signature for Image/Video Authentication", Multimedia and Security Workshop at ACM Multimedia , UK, 1998.
[9] E. T. Lin and E. J. Delp, \A Review of Fragile Image Watermarks", Proc. of the Multimedia and
Security Workshop (ACM Multimedia '99), Orlando, pp. 25-29, 1999.
[10] C. S. Lu, H. Y. Mark Liao, S. K. Huang, and C. J. Sze, \Cocktail Watermarking on Images", 3rd
Inter. Workshop on Information Hiding , LNCS 1768, pp. 333-347, Sept. 29-Oct. 1, 1999.
[11] C. S. Lu, H. Y. Mark Liao, S. K. Huang, and C. J. Sze, \Highly Robust Image Watermarking
Using Complementary Modulations", Proc. 2nd International Information Security Workshop ,
Malaysia, LNCS 1729, pp. 136-153, Nov. 6-7, 1999.

17

[12] C. S. Lu, H. Y. Mark Liao and C. J. Sze, \Combined Watermarking for Image Authentication
and Protection", Proc. 1st IEEE Int. Conf. on Multimedia and Expo , USA, 2000.
[13] C. S. Lu, H. Y. Mark Liao and L. H. Chen, \Multipurpose Audio Watermarking", Proc. 15th Int.
Conf. on Pattern Recognition , Barcelona, Spain, Vol. III, pp. 286-289, 2000.
[14] C. S. Lu and H. Y. Mark Liao, \Structural Digital Signature for Image Authentication: An
Incidental Distortion Resistant Scheme", to appear in Proc. Multimedia and Security Workshop at
the ACM Int. Conf. on Multimedia , Los Angeles, California, USA, 2000.
[15] A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone, \Handbook of Applied Cryptography",
CRC Press , 1997.
[16] A. Said and W. A. Pearlman, "A New, Fast, and Ecient Image Codec based on Set Partitioning
in Hierarchical Trees", IEEE Trans. Circuit and Systems for Video Technology , Vol. 6, pp. 243-250,
1996.
[17] S. Walton, \Image Authentication for A Slippery New Age", Dr. Dobb's Journal , Vol. 20, pp.
18-26, 1995.
[18] R. B. Wolfgang and E. J. Delp, \Fragile Watermarking Using the VW2D Watermark", Proc.
SPIE/IS&T Inter. Conf. Security and Watermarking of multimedia Contents , Vol. 3657, pp. 4051, 1999.
[19] M. Wu and B. Liu, \Watermarking for Image Authentication", IEEE Inter. Conf. on Image
Processing , 1998.
[20] L. Xie and G. R. Arce, \A Blind Wavelet Based Digital Signature for Image Authentication",
Proc. of the EUSIPCO 98 , Island of Rhodes, Greece, Sep 1998.
[21] M. M. Yeung and F. Mintzer, \An Invisible Watermarking Technique for Image Veri cation",
IEEE Conf. Image Processing , Vol. 2, pp. 680-683, 1997.
[22] G. J. Yu, C. S. Lu, H. Y. Mark Liao and J. P. Sheu, \Mean Quantization Blind Watermarking
for Image Authentication," Proc. IEEE Int. Conf. on Image Processing , Vancouver, Canada, Vol.
III, pp. 706-709, 2000.

18

[23] B. Zhu, M. D. Swanson, and A. H. Tew k, \Transparent Robust Authentication and Distortion
Measurement Technique for Images", The 7th IEEE Digital Signal Processing Workshop , pp. 45-48,
1996.

19

Figure 1: The relationship between the attack's distribution G A (with standard deviation I or M )
and the SDS 's distribution G S (with standard deviation S ).

20

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

Figure 2: Content tampering: (a) host image; (b) original image with a large object placed; (c)(e)
detected results at 22  24 scales when  = 256; (f)(h) detected results at 22  24 scales when
 = 128.

21

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

Figure 3: Content tampering: (a) host image; (b) original image with a small object placed at the
bottom-right; (c)(e) detected results at 22  24 scales when  = 256; (f)(h) detected results at
22  24 scales when  = 128.

22

(a)

(b)

(c)

Figure 4: The positions of the elements (illustrated in black color in the wavelet domain) of an SDS
constructed from Fig. 2(a) with (a)  = 256, (b)  = 128, and (c)  = 64.

23

Table 1: CoSDS of Fig. 2(a) under SPIHT with various compression ratios (CR).
Completeness of SDS
 = 256  = 128  = 64

CR

1:000
1:000
1:000
1:000

8:1
16 : 1
32 : 1
64 : 1

1:000
1:000
1:000
0:994

1:000
1:000
0:997
0:816

Table 2: CoSDS of Fig. 2(a) under JPEG with various quality factors (QF).
QF(CR)

60( 7:1 : 1)
50( 8:2 : 1)
40( 9:7 : 1)
30(11:7 : 1)
20(15:0 : 1)
10(21:7 : 1)

Completeness of SDS
 = 256  = 128  = 64

1:000
1:000
1:000
1:000
1:000
1:000

1:000
1:000
1:000
1:000
1:000
0:996

1:000
1:000
0:999
0:992
0:988
0:969

Table 3: CoSDS of Fig. 2(a) under a set of incidental distortions (among them, sharpening
and Gaussian noise adding with amount 16 were run using Photoshop).
Incidental distortions

Standard deviation I

rescaling
equalization
blurring(7  7)
medain ltering(5  5)
sharpening
Gaussian noise(16)

26:8
27:3
22:9
23:0
23:4
15:9
24

Completeness of SDS
 = 256  = 128  = 64

0:993
0:983
0:988
0:943
1:000
1:000

0:918
0:961
0:915
0:830
0:990
1:000

0:808
0:946
0:807
0:682
0:954
1:000

The curves with respect to three minimum differences (256, 128, and 64)
0.5

Probability of parent−child pairs destroyed in an SDS

0.45

0.4

0.35

0.3

0.25

0.2

0.15

0.1

0.05

0

minimum diff. = 256
minimum diff. = 128
minimum diff. = 64
1

1.5
2
2.5
3
3.5
4
4.5
5
5.5
Six incidental manipulations (listed in Table 3 and labeled as 1, 2, 3, 4, 5, and 6)

6

Figure 5: The probability (vertical axis) that the relationship of the parent-child pairs in an SDS
might be destroyed with respect to six incidental manipulations (horizontal axis) listed in Table 3.
The minimum distances () used for thresholding are 256, 128, and 64, respectively.

25

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

Figure 6: Combined attacks with incidental and malicious manipulations: (a) beach image after
JPEG+\umbrella" placement; (b)(d) detected results of (a) at 22  24 scales when  = 128; (e)
beach image after rescaling(scaling+\umbrella" placement); (f)(h) detected results of (e) at 22  24
scales when  = 128.

26

(a)

(b)

(c)

(d)

Figure 7: Malicious manipulations of non-SDS areas: (a) maliciously tampered with image with a
\helicopter" in the sky; (b)(d) detected results of (a) at 22  24 scales when  = 128.

27