Structural Digital Signature for Image Authentication: An Incidental Distortion Resistant Scheme Chun-Shien Lu and Hong-Yuan Mark Liao Institute of Information Science, Academia Sinica, Taipei, Taiwan. E-mail: flcs, liaog@iis.sinica.edu.tw Abstract The existing digital data veri cation methods are able to detect regions that have been tampered with, but are too fragile to resist incidental manipulations. This paper proposes a new digital signature scheme which makes use of an image's contents (in the wavelet transform domain) to construct a structural digital signature (SDS ) for image authentication. The characteristic of the SDS is that it can tolerate content-preserving modi cations while detecting content-changing modi cations. Many incidental manipulations, which were detected as malicious modi cations in the previous digital signature veri cation or fragile watermarking schemes, can be bypassed in the proposed scheme. Performance analysis is conducted and experimental results show that the new scheme is indeed superb for image authentication. keywords: Digital signature, Wavelet transform, Authentication, Fragility, Robustness.  The preliminary version of this paper will be published in [14] (http://smart.iis.sinica.edu.tw/lcs). 1 1 Introduction Because of the easy-to-copy nature of digitized media, it is very easy for one to tamper with digital data without leaving any clues. Under these circumstances, integrity veri cation has become an important issue in the digital world. Conventionally, the methods used for media veri cation can be classi ed into two kinds: digital signature-based [2, 3, 5, 7, 8] and watermark-based [4, 6, 9, 12, 17, 18, 19, 20, 21, 23]. A digital signature is a set of features extracted from a media, and these features are stored as a le, which will be used later for authentication. A very important characteristic of a digital signature is that it suciently represents the content of the original media. Watermarking, on the other hand, is a media authentication/protection technique that embeds invisible (or inaudible) information into a media. For content authentication, the embedded watermark can be extracted and used for veri cation purposes. The major di erence between a watermark and a digital signature is that the embedding process of the former requires the content of a media to change. However, both the watermarkbased approach and the digital signature-based approach are expected to be sensitive to any malicious modi cation applied to the media. For an incidental modi cation such as JPEG compression or blurring, a good authentication system should be able to tolerate it. Unfortunately, most of the existing media authentication systems, though they can detect malicious tampering successfully, are vulnerable to incidental modi cations. The main reason for the above mentioned problem is that the existing methods do not consider carefully the tradeo between robustness and fragility. In the whole course of this study, we shall focus our discussion on the image authentication system. The underlying techniques used to implement the digital signature-based or watermark-based approaches can be roughly classi ed into quantization-based [6, 12, 22], feature point-based [2, 3], and relation-based [7, 8]. As to a quantization-based approach, Kundur and Hatzinakos [6] designed a quantization technique to encode a watermark so that the hidden watermark is more/less sensitive to modi cations at high/low frequency in the wavelet domain. Usually, over-sensitivity may occur at the small-to-medium scale while under-sensitivity may only happen at the medium-to-large scale. With this understanding, one could make application-dependent decisions on whether an image is credible or not when encountering some modi cations. The major problem associated with [6] is that the tampering detection results are very unstable. It is well known that the perturbation applied to a wavelet coecient may make the extracted mark di erent from or still the same as the embedded one. In other words, the extracted result may be completely unpredictable. Another drawback of [6] is that the method cannot resist incidental modi cations. Recently, we have proposed a multipurpose watermarking scheme [12, 13] for image/audio authentication and protection. Our method combines 2 a media data-dependent quantization technique and a complementary watermark hiding strategy [10, 11] to conceal watermarks. We have also proposed several detection methods to optimize the tradeo between robustness and fragility. As to feature point-based authentication systems, Bhattacharjee and Kutter [2] proposed to generate a digital signature by encrypting the feature points' positions in an image. Authentication is then accomplished by comparing the positions of the feature points extracted from a questionable image with those decrypted from the previously encrypted digital signature. It is not certain that this approach can resist JPEG compression with middle-to-high compression ratios because the feature points are liable to be shifted. Recently, Dittmann et al. [3] presented a content-based digital signature approach for image/video authentication using edge characteristics. Their content features are similar to [2], but di erent extraction techniques are used. A typical relation-based technique for developing an image authentication system has been reported by Lin and Chang [7, 8]. In order to make the designed image authentication system tolerate JPEG compression, Lin and Chang [7, 8] dedicated themselves to exploring the operation in a JPEG-based system. They proposed to extract a digital signature by using the invariant relation existing between any two DCT coecients, which are at the same position of two di erent 8  8 blocks. They found that the invariance properties could always be preserved before and after JPEG compression. However, they didn't mention clearly whether their method could survive other incidental manipulations. Although they used the invariance property to achieve their goal, the extracted relation is random by nature. In other words, the merit of the image structure, which is a very important feature, was not utilized. In this paper, we will develop a new digital signature-based image authentication scheme which is completely di erent from the existing methods. In the proposed method, commonly adopted features such as the position of feature points or the relationship of any two random coecients are not used at all. On the contrary, we propose to use the \structure " of an image as a digital signature. In the proposed scheme, the structure of an image's contents is composed of a number of parent-child pairs in the wavelet domain. We build up a structural digital signature and check to see if it is robust under content-preserving manipulations and fragile under content-changing manipulations. Performance analysis on the proposed new image authentication system has been conducted and the experimental results have proven the powerfulness of the system. The remainder of this paper is organized as follows. In Sec. 2, we will present the proposed structural digital signature-based image authentication scheme. This will include the construction 3 and veri cation of a structural digital signature. An analysis on the performance of our proposed scheme will be conducted in Sec. 3. We will discuss the false positive and false negative problems when incidental distortions and/or malicious tampering are encountered. In addition, we will analyze the e ect that occurs when the size of a structural digital signature changes. Based on the analysis, a systematic way can be derived to determine the best size for use. In Sec. 4, a series of experiments will be conducted and their results will be reported. Concluding remarks will be given in Sec. 5. 2 Structural Digital Signature (SDS ) Our digital signature scheme is based on the wavelet transform due to its excellent multiscale and precise localization properties. Basically, the multiscale representation of an image is by nature highly suitable for designing a structural digital signature. In Sec. 2.1, we will introduce how to de ne a structural digital signature based on the interscale relation of wavelet coecients. The rules for instructing how to label an SDS will be described in Sec. 2.2. The metric and the procedure used to authenticate an incoming unknown image will be detailed in Sec. 2.3. Analysis issues about the size and the complexity of an SDS will be elaborated on in Secs. 2.4 and 2.5, respectively. 2.1 De ning SDS based on Interscale Relation of Wavelet Coecients Let ws;o(x; y) represent a wavelet coecient (at scale s, orientation o, and position (x; y)) in the orthogonally downsampled wavelet transform domain of an image I. Suppose a J -scale wavelet transform is performed, then 0  s < J . It is well known that a large/small scale represents a coarser/ ner resolution of an image, i.e., the low/high frequency part. The orientation o may be in a horizontal, vertical, or diagonal direction. The interscale relationships of wavelet coecients can then be converted into the relationships between the parent node ws+1;o (x; y) and its four child nodes ws;o (2x + i; 2y + j ) with jws+1;o(x; y)j  jws;o(2x + i; 2y + j )j; (1) or jws ;o (x; y)j < jws;o (2x + i; 2y + j )j; +1 (2) where 0  s < J , 0  i; j  1, and 1  x  N and 1  y  M (N  M is the image size). Combining Eqs. (1) and (2), the above two relations can be rewritten as jjws ;o (x; y)j ? jws;o (2x + i; 2y + j )jj  0: +1 4 (3) In order to design a reliable scheme for image authentication, we propose a new signature method called structural digital signature (SDS ). The new signature can be obtained by observing the interscale relations of wavelet coecients of an image. The basic concept of the new scheme relies on the following: (i) the interscale relationship should be dicult to be destroyed after content-preserving manipulations; and (ii) this interscale relationship should be dicult to be preserved after contentchanging manipulations. Because these interscale relationships result from the structure of an image (say I), we de ne them as the structural digital signature of I and call it SDS (I). The structural digital signature of an image consists of a set of parent-child pairs which satisfy jjws ;o (x; y)j ? jws;o (2x + i; 2y + j )jj   ( > 0): +1 (4) The above constraint is stricter than the original interscale relationship of wavelet coecients shown in Eq. (3). The size of  will determine the number of parent-child pairs recorded in an SDS (I). The smaller the  is, the larger the amount of elements in an SDS . We do not intend to keep all the parentchild pairs as elements of an SDS because some of the elements may not be signi cant enough. The signi cance of a parent-child pair is completely dependent on their magnitude di erence. The larger the di erence, the more signi cant the parent-child pair is. A parent-child pair whose magnitude di erence is small is equivalent to having a \small" quantization interval in the quantization-based approaches [6, 12, 22]. Therefore, it will be very sensitive to modi cations including some minor incidental ones. In order to design a robust image authentication scheme, we only consider those parent-child pairs whose magnitude di erences are large as the elements of a structural digital signature. In order to appropriately detect malicious tampering while tolerating an incidental modi cation, we use the size of a structural digital signature to control the tradeo between fragility and robustness. In general, the construction of a structural digital signature is very easy because there is no feature point selection involved [2, 3]. Once the parent-child pairs are selected by the constraint de ned in Eq. (4), each pair is assigned a symbol that represents what kind of relationship this pair carries. These symbols will be formally de ned in Sec. 2.2. The above mentioned symbols and their locations in the wavelet domain will be encrypted by a public key algorithm such as the famous RSA method [15]. Finally, the encrypted information will be stored and used for image authentication later. 5 2.2 Labeling an SDS According to the interscale relationship existing among wavelet coecients, there are four possible relationship types of an SDS . Assume the magnitude of a parent node p is larger than that of its child node c (i.e., jpj > jcj), then the four possible relationships of the pair, < p; c >, are: (i) p  0; c  0; (ii) p  0; c  0; (iii) p  0; c  0; (iv) p  0; c  0. Consider the case when jpj > jcj and c is small. In order to make < p; c > still credible when incidental modi cations are encountered, the value of c is not important. Therefore, the relations (i) and (ii) can be merged to form a signature symbol I under the condition that p  0 and c don't care. On the other hand, the relations (iii) and (iv) can be merged to form another signature symbol II , under the condition that p  0 and c don't care. In other words, we intend to keep the sign of the larger element unchanged while disregarding the smaller one under the constraint that their original interscale relationship is still preserved. Similarly, signature symbol III (under the condition that c  0 and p don't care) and IV (under the condition that c  0 and p don't care) can be de ned under the constraint jpj < jcj. For those pairs that are not recorded in an SDS are all labeled by the fth signature symbol V . Hence, we represent the signature symbol of a parent-child pair as sym(< p; c >), which can be one of the above de ned symbol types. In the following section, we shall describe how the veri cation process is executed. 2.3 Veri cation In the veri cation process, if one would like to verify an unknown image ~I, it is rst wavelet transformed and then its structural digital signature SDS (~I) that should be constructed. The encrypted structural digital signature of the original image I is retrieved and then decrypted to obtain its corresponding SDS (I). One can say the interscale relationship of a pair < p; c > in I is still unchanged in ~I if their signature symbols are the same. That is, the relation sym(< p; c >) = sym(< p~; c~ >) (5) holds, where the pair < p~; c~ > in ~I is the corresponding pair of < p; c > in I. Finally, we calculate the completeness of the SDS (CoSDS ) in ~I, which is de ned as the similarity degree, Sim, between SDS (I) and SDS (~I): + ? N? ; CoSDS (~I) = Sim(SDS (I); SDS (~I)) = NjSDS (I)j (6) where N + represents the number of pairs satisfying Eq. (5) and N ? represents the number of pairs violating Eq. (5). jSDS (I)j is used to denote the number of parent-child pairs in SDS (I). From 6 Eq. (6), we know that CoSDS (~I) will fall into the interval [?1 1]. In other words, the completeness of SDS represents the ratio of how many parent-child pairs are preserved to satisfy their interscale relationships. A larger CoSDS means the suspect image ~I is reliable; otherwise, it means ~I has been maliciously tampered with. In addition, the location of a tampering region can be easily detected from those parent-child pairs whose signature symbols have been updated. 2.4 How the Size of an jSDS j in uences the Compromise between Robustness and Fragility In this subsection, we shall discuss how the constituent parent-child pairs of an SDS in uence a compromise between robustness and fragility. Let the magnitudes of the di erences of parent-child pairs in a structural digital signature be arranged in a decreasing order. It is known that the elements (parent-child pairs) with larger magnitudes are not vulnerable to attacks while those with smaller magnitudes tend to be easily attacked. Therefore, one can use the larger elements to indicate robustness and use the smaller elements to re ect fragility. Under the circumstances, when the size of a structural digital signature becomes large, the elements with smaller magnitudes tend to be changed so that the robustness property is more or less a ected. On the other hand, the modi cation of the smaller elements will re ect accurately the degree of fragility. So, if jSDS j is small enough such that elements are all with larger magnitudes, then the fragility property may disappear. In Sec. 3, we will give a systematic way to determine  (which also determines the jSDS j) by a statistical analysis of the distributions on an SDS and the behavior of an attack. 2.5 Complexity Analysis on an SDS In this section, the complexity of a structural digital signature will be analyzed. Let the number of parent-child pairs in an SDS be n. The rst part of an SDS we should store is the child locations of the n parent-child pairs. The reason why the child locations are examined instead of the parent locations is that they are easily tracked. For example, if a child node's location is (x; y), then its parent's location is (b x2 c,b y2 c). On the contrary, if a parent node's location is (x; y), there are four possible locations for a child. They are (2x + i; 2y + j ) where 0  i; j  1. For the n parent-child pairs, 2  n bytes are required to store their locations because each location needs two bytes. In addition, each parent-child pair in an SDS has four possible interscale relationships. Since each interscale relationship needs two bits to express it, a total of n4 bytes is required to store all the interscale relationships. In fact, the storage can be further reduced if the locations of child nodes are stored based on their 7 pre-de ned ordering. Under the circumstances, the number of occurrences of every signature symbol is counted. For the rst four types of symbols, we store the number of parent-child pairs and then the locations of these pairs. In this way, the memory used for storing the signature symbols will be reduced from n4 bytes to 4 bytes. That is, a total of (2n + 4) bytes is required to store a structural digital signature before encryption. 3 Performance Analysis Usually, a watermark-based or digital signature-based authentication method must be justi ed by the false positive (false alarm) and false negative (miss detection) probability analyses like those that have been done in [6, 7, 11]. For an image authentication system, a false positive probability means an image is detected to be maliciously tampered but in fact it is not. On the other hand, a false negative probability means an image is actually modi ed by a malicious tampering but some tampered areas are not detected. A practical signature system should ensure that both the false positive and false negative probabilities are reasonably small. The analysis on the false positive and the false negative probabilities will be elaborated in Secs. 3.1 and 3.2, respectively. The relationship between the predetermined threshold  and the strength of attacks will be discussed in Sec. 3.3. The security issues will be discussed in Sec. 3.4 3.1 False Positive due to Incidental Manipulations An incidental modi cation like the JPEG compression is a kind of \attack" that we would like to bypass. If an incidental attack is detected, it will cause a false positive type error. Let I be an image, A be any incidental manipulation, and be a wavelet function. A distorted image, IA, can be derived by I  A, where  is a convolution operator. Since the authentication process is conducted in the wavelet domain, the whole transformation process can be denoted as  (I  A) = (  I)  Af = I  Af ; (7) where I is the wavelet transformed image in the space-frequency domain and Af is a version of A in the frequency domain. Eq. (7) indicates that the wavelet transform of the distorted image IA is equivalent to the modi cation (by Af ) of the wavelet transformed image I . If Af is a quantization operation of some compression methods, any coecient in I will only be a ected by itself through Af . Because the behavior of compression like SPIHT [16] is easily predicted and its corresponding tree structure is required in constructing an SDS , we will analyze its e ects. SPIHT is a progressive 8 image coding scheme in which the most signi cant bits are transmitted rst. Suppose p (a parent node) and c (a child node) form a parent-child pair in an SDS and their wavelet coecients satisfy the relation 2k  jpj  2k?1      2k?j  jcj  2k?(j +1) with j  1. When a SPIHT compression is executed, we may encounter three di erent possibilities: (1) when the compression ratio is high, suppose 2t is the threshold nally used in the dominant process [16] and t  k, the reconstructed parent-child pair, pr and cr , are both zeros. This means the original relationship jpj  jcj is preserved when pr = cr = 0; (2) when the compression ratio is medium, suppose 2k?1  2t  2k?j , we will have jpr j > jcr j = 0. Again, the parent-child pair's relationship is preserved; (3) for a compression with a small ratio, suppose 2k?(j +1)  2t , we will have jpr j > jcr j 6= 0. Once again, the parent-child pair's relationship is preserved. From the above derivation, it is guaranteed that the proposed SDS will survive a SPIHT compression at any ratio. A similar conclusion can be applied to the JPEG compression. On the other hand, if A is another incidental manipulation (excluding compressions), its behavior may not be easily analyzed because the change of a speci c coecient may be determined by its neighbors. However, it is known that an incidental manipulation tends not to destroy the semantics of an image. Based on this understanding, an SDS will not be signi cantly destroyed when an incidental manipulation is encountered. Therefore, one can expect that a structural digital signature is indeed a good mechanism for tolerating incidental modi cations. Another advantageous point of using SDS is its stable nature against rounding errors. The reason why this is true is due to the large chosen value of  (by Eq. (4)). When the constituent elements of an SDS are all with a large , rounding errors that emerge won't in uence the relationship of a parent-child pair. 3.2 False Negative due to Content Replacement When a malicious modi cation like content replacement is applied to an image, its corresponding SDS will have a signi cant change that is very easy to detect. Therefore, we can expect the false negative probability in this case to be very low. Suppose a parent node p (p > 0) and a child node c is a pair in an SDS . They have the relation jpj  jcj with jjpj ? jcjj = i (i > ). For simplicity, let p be attacked by a malicious manipulation with the modi cation quantity Mp . If jp ? Mp j > jcj holds under the condition that jpj > jcj, then a false negative occurs because 0  Mp  i . If the e ect caused by Mp forms a Gaussian distribution with variance 2, then the false negative probability can be de ned 9 as R i t2 2 ?i Ce dt t2 R1 2 dt Ce ?1 (C is a constant). When a malicious distortion is applied to an image, if (0   1) represents the proportion of the parent-child pairs that has been maliciously tampered with but still maintains their interscale relations, then the total false negative probability will be R i t2 2 dt = jSDS j ?i Ce Pfn = ii=1 : t2 R1 2 dt Ce ?1 (8) From Eq. (8), it is not dicult to imagine that Pfn will be very low. In other words, the false negative probability will be very low when a content replacement operation is applied to an image. 3.3 The Relation between  and the Strength of Attacks In this subsection, we will discuss an issue regarding the relationship between  and the strength of an attack. Recall that jSDS j denotes the number of parent-child pairs whose interscale relationships are recorded in a structural digital signature. Attacks can be roughly classi ed into two categories: incidental manipulation and malicious distortion. To simplify the analysis, we assume the strength of an attack, a, is a Gaussian distribution, G A , with a mean of zero. According to the Gaussian modeling of attacks [6, 12, 22], we have the following analysis. Usually, an incidental manipulation tends to have a small standard deviation I while a malicious tampering tends to have a large standard deviation M , i.e., I < M . Some reference values regarding I and M were provided in [7] for a speci c image. Based on our scheme, a structural digital signature is constructed by selecting those parent-child pairs whose di erences in magnitudes are larger than . The di erence in magnitude, d, may have two forms: positive di erence (d  0) and negative di erence (d < 0). The positive di erence portion and the negative di erence portion both form a Gaussian distribution, G S , without a mean of zero. Their standard deviations are denoted as S , which is usually very large (scale of hundreds) because the variance of d is large in the wavelet domain and is larger than I . The possible relationships between G A and G S are depicted in Fig. 1. In Fig. 1, the Gaussian distributions shown in the middle part are G A , whereas the right/left one is G S corresponding to a positive/negative d.  is de ned as the intersection point of G A and G S . The shaded areas, which represent the parent-child pairs with a smaller di erence jdj (in the tails of G S ), are assumed to be updated based on the value in the tails of G A. Next, we will analyze the e ect of I and M on , respectively. First, let an incoming attack be an incidental one such as JPEG=SPIHT compression or rescaling. The probability that the relationship of parent-child pairs may be destroyed (i.e., d's sign is changed) 10 is denoted as pI (the shaded areas in Fig. 1) and can be calculated by pI = 2  (P f0 < d <  ? g + P f < a < 1g) = 2  (P f0 < d <  ? g + (1 ? P f0 < a   g)) = 2  (erf ( 2?  ) + [1 ? erf ( 2 )]); S I (9) where erf () represents the error function [1] which is de ned as: Z " 2 erf (") = p e?u2 du:  0 In Eq. (9), the constant 2 represents the two symmetric G S 's that belong, respectively, to the positive and negative d. Because the attack under consideration is incidental,  ?  is usually small. Since the standard deviation S of GS is on the scale of hundreds, 2?S is, thus, very small. Under the circumstances, the rst term in Eq. (9), erf ( 2?S ), approximates zero. On the other hand,  satis es  >  and  is chosen to be large (Eq. (4)), so  is also large enough. For an incidental attack, we know the value of I is usually small. Therefore, 2I is large. As a consequence, the second term, [1 ? erf ( 2I ))], should be very small. In summary, the above discussion explains why the probability P I can be suciently small if the incoming attack is incidental with a small I . That is, pI  2  [1 ? erf ( 2 )]  0: I (10) The near-optimal  can be derived based on the condition that the incoming attack is incidental and the value of pI is smaller than a pre-determined threshold  (e.g.,  = 0:1). Under the circumstances, the near-optimal  can be derived by pI  2  [1 ? erf ( 2 )] < : I Thus, we have 1 ? 2 < erf ( 2 ): I (11) Using a predetermined  together with I and checking the tables of error function [1], we should be able to obtain the lower bound of  . From this  , the lower bound of a near-optimal  can be approximately determined because based on the Gaussian models shown in Fig. 1  is close to  . Now, let the incoming attack such as object placement/replacement or cloning be malicious. The probability that the relationships of parent-child pairs in a structural digital signature may be destroyed is de ned as pM = 2  (P f0 < d <  ? g + P f < a < 1g) 11 = 2  (P f0 < d <  ? g + (1 ? P f0 < a   g) = 2  (erf ( 2?  ) + [1 ? erf ( 2 )]): S M (12) In Eq. (12),  ?  is known to be small and, thus, 2?S is very small. As a consequence, the rst term in Eq. (12), erf ( 2?S ), has a value close to zero because it corresponds to an incidental modi cation. It is also known that M is usually large and that it may lead to a small 2M . Therefore, the second term of Eq. (12), [1 ? erf ( 2M ))], has a value which is far from zero. In general, the detection rate of regions that are maliciously tampered with is determined mainly based on the second term. If we assume P M is large enough, and M and the tables of error function [1] are available, we will be able to determine the upper bound of  . From the above  , the upper bound of a near-optimal  will be approximately obtained as in the case of incidental modi cations. To sum up, the interval where a near-optimal  should fall can be mathematically derived from the above analysis. In Sec. 4, we will provide a numerical example to show how di erent values of  a ect pI . 3.4 Security Problem In this section, we will discuss the issues regarding (1) the positions of the elements in a structural digital signature which are known or are correctly guessed; (2) the image intensity is constantly changed. 3.4.1 Tampering at the Locations Where SDS Does not Record If the locations of the elements in an SDS are correctly guessed, the attacker may try to tamper with those positions which are not recorded in the corresponding SDS (I) and thus disable our method. Fortunately, the attackers cannot succeed in this case because if the parent-child pairs are not recorded in an SDS (I), then their interscale relationships do not satisfy the condition in Eq. (4). In other words, we can verify it easily by checking the signature symbols of those parent-child pairs that are not recorded in SDS (I) and SDS (~I). Let < ws;o(x; y); ws+1;o (2x + i; 2y + j ) > be a parent-child pair which is not in SDS (I) and assume its corresponding pair < w~s;o (x; y); w~s+1;o (2x + i; 2y + j ) > is not in SDS (~I), where 0  i; j  1. We can determine whether the < ws;o (x; y); ws+1;o (2x + i; 2y + j ) > pair is tampered with or not by checking sym < w~s;o (x; y); w~s+1;o (2x + i; 2y + j ) >. If sym < w~s;o(x; y); w~s+1;o (2x + i; 2y + j ) > is not equal to V , then it has been tampered with. It is known that the condition for sym < w~s;o(x; y); w~s+1;o (2x + i; 2y + j ) > to belong to V is jjw~s;o (x; y)j?jw~s+1;o (2x + i; 2y + j )jj < . 12 3.4.2 The Condition that Image Intensity Is Constantly Changed Attackers may think that they can modify the image's intensity without triggering our authentication scheme. One possible method is to constantly increase or decrease the intensity of an image I so that the interscale relationships of all parent-child pairs are not changed. One solution to conquer this problem is to record the wavelet coecients of the lowest frequency band because they represent the approximate information of a whole image. In addition, the high frequency bands will not be altered because a constant convolved with a wavelet will be zero due to the nature of wavelets. Once an image is tampered with by a constant update, its lowest frequency band will re ect this change. Lin and Chang [7] used a similar method to solve the above mentioned problem in the DCT domain. 4 Experimental Results Our structural digital signature-based image authentication scheme was rst tested against a Beach image with 256  256 size, as shown in Fig. 2(a). A large \umbrella" was placed in Fig. 2(a) and formed a tampered image as shown in Fig. 2(b). We used a 4?scale wavelet transform to transform the images so that the resolution of the lowest-frequency channel had the size of 16  16. At rst, the parent-child pairs whose di erence d satisfying jdj >  = 256 were chosen to construct an SDS . The detected tampering areas were shown in Figs. 2(c)(e). Another set of detected results using  = 128 was shown in Figs. 2(f)(h). As we expected, the SDS with a smaller size will lose some tampered pixels. However, the integration of multiscale results was sucient to re ect the area tampered with. Another set of experiments was conducted by placing a \small" object at the bottom-right corner of the \peppers" image. Fig. 3(a) and Fig. 3(b) show, respectively, the host image and the image tampered with. Figs. 3(c)(e) and Figs. 3(f)(h) show, respectively, the detected multiscale results when  = 256 and  = 128. The above experiments provided a good example of the compromise between robustness and fragility using two structural digital signatures with di erent sizes. In the second part of our experiments, we applied several incidental distortions to Fig. 2(a) to test the robustness of our scheme. Three structural digital signatures with a di erent number of parentchild pairs were constructed, and their corresponding positions in the wavelet domain were shown in Fig. 4. It can be seen that the SDS with a smaller/larger jSDS j (corresponding to a larger/smaller ) would result in fewer/more elements. Table 1 shows the completeness of SDS obtained under di erent SPIHT compression ratios using three di erent . It is obvious that when the compression ratio was smaller than 32, most of the derived CoSDS were perfect. However, when the compres13 sion ratio reached 64, some fragile results emerged for  = 64. For the JPEG compression, perfect preservations of SDS (except for the results obtained from  = 64) were obtained for quality factors ranging from 60% (7 : 1) to 10% (21:7 : 1), as shown in Table 2. Table 3 summarized the veri cation results obtained under other incidental distortions including rescaling, histogram equalization, blurring, median ltering, sharpening, and Gaussian noise adding. These manipulations are sometimes unavoidable in image processing and, thus, cannot be considered as malicious modi cations. From Tables 13, we can nd that the completeness of a structural digital signature was consistently very high for incidental manipulations when  > 64. This indicates that our method can tolerate common incidental modi cations very well. However, the above conclusion is true only when the value of  is large enough (e.g.,  > 64 in our experiments). Theoretically, a reasonable  can be determined based on the analysis described in Sec. 3. Next, we shall show how the value of  in uences the probability that the relationship of the parent-child pairs in an SDS is destroyed. Table 3 illustrated six incidental modi cations which were used in this experiment. The minimum distance () used for thresholding were 256, 128, and 64, respectively. The curves shown in Fig. 5 indicated that when  was set to 128 or 256, the probability that the relationship of the parent-child pairs in an SDS being destroyed was zero. From Fig. 5, we found that the values obtained by theoretical analysis were not necessarily consistent with the experimental results. This phenomenon can be explained by the following potential reasons: (1) The behavior of an incidental manipulation and the elements of a structural digital signature are both assumed to be Gaussian distributed for the sake of simplicity. However, it may not be the case; (ii) We propose the shaded areas in Fig. 1 that re ect the relationship of those parent-child pairs with small jdj will be destroyed, but in a practical situation this may not be true. In fact, any parent-child pair in a SDS could possibly be destroyed. We can only say that the pair with a smaller di erence has a higher probability of being destroyed. Even when the  of Eq. (11) is set in advance and the near-optimal  is determined, one cannot decide whether an incoming attack is incidental or not. This is because when the regions that have been maliciously tampered with are very small, the number of destroyed parent-child pairs is small too and, thus, its value has the probability of being smaller than . Therefore, we suggest that the nal decision on whether an attack is incidental or malicious still needs human intervention so that a perfect perceptual judgement can be made. Under the above circumstances, if the regions detected as having been tampered with are very small and spread over a whole image but are still recognizable and meaningful, the imposed attack should be regarded as malicious. Except for the example of a tiny content-changing modi cation shown in Fig. 3, our scheme 14 is able to determine whether the imposed attack is malicious or incidental by merely comparing the value of  and 1 ? CoSDS (~I). In the following, we shall use our scheme to authenticate the images that were modi ed by an incidental manipulation and a malicious distortion simultaneously. Fig. 6(a) shows a beach image which was rst JPEG compressed with a quality factor of 10% and then an \umbrella" object was placed. The veri cation results obtained at 22  24 scales using  = 128 were shown in Figs. 6(b)(d), respectively. As we can see from these results, the area where the umbrella was placed could be approximately detected and the JPEG compression did not a ect the veri cation results. The experiment indicated that the structural digital signature eciently tolerated the JPEG compression while sensitively detecting object placement. Another set of experiments was shown in Fig. 6(e)(h). The beach image was rst scaled down to 128  128 from 256  256, and then the umbrella object was placed on it. Finally, the image was rescaled to the original size 256  256, as shown in Fig. 6(e). When  was set to be 128, Figs. 6(f)(h) showed the placed umbrella was detected at 22  24 scales. It can be seen that some small fragments which were not the targets were mistakenly detected. This is because the changes of wavelet coecients that resulted from rescaling are more liable to destroy the structural digital signature than the JPEG. However, we can also see that the regions belonging to the \umbrella" tend to be clustered together. By comparing the values shown in Table 2 and Table 3, it is easy to see that the CoSDS values obtained by applying JPEG with any quality factors are higher than those obtained by applying rescaling. Finally, we conducted an experiment to demonstrate if malicious tampering occurred on areas which were not recorded in an SDS , then they could also be detected as we have analyzed in Sec. 3.4. In Fig. 7(a), a helicopter was placed on the sky portion of the beach image (Fig. 2(a)). As we can see from Fig. 4, the wavelet coecients in the sky area did not belong to the structural digital signature. Using the proposed scheme, the area tampered with could be detected and shown, respectively, in Figs. 7(b) (d) when  = 128. The blocky e ect shown in Fig. 7(b) (d) was the natural result inherited from the multiresolution representation of the wavelet transform. From the above experiments, we could make a conclusion about the selection of . The value of  can be mathematically determined from the analysis described in Sec. 3. However, the assumptions used in Sec. 3 may not always hold, so we can empirically choose  to be at least 128 which has been con rmed by several experimental results. 15 5 Conclusion For image authentication, it is desired that the veri cation method be able to resist content-preserving modi cations while being sensitive to content-changing modi cations. In this paper, a new structural digital signature scheme has been proposed for image authentication. We make use of the structure of an image to construct a digital signature. The only way to destroy the structure of our digital signature is to signi cantly change the image's content and that would be detected as malicious. In addition, some unavoidable image processing techniques will preserve a great many of the SDS which would be detected as incidental. Performance analysis of the structural digital signature has been provided and experimental results show that our scheme is really robust to content-preserving manipulations and fragile to content-changing distortions. Our future work will consider geometric distortions such as rotation and translation, which cannot be tolerated in this paper because the structural digital signature built in the wavelet domain is variant to rotation and translation. Another future work will focus on developing structural watermarking, which can be used for public-key detection from the viewpoint that a watermark structure can only be removed if its structure is destroyed. Acknowledgment: The authors thank Dr. Martin Kutter for providing the beach image and the umbrella image used in the experiments. 16 References [1] M. Abramowitz and I. A. Stegun, \Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables", Dover Publications , Inc., New York, 1965. [2] S. Bhattacharjee and M. Kutter, \Compression Tolerant Image Authentication", IEEE Inter. Conf. on Image Processing , USA, pp. 435-439, 1998. [3] J. Dittmann, A. Steinmetz, and R. Steinmetz, \Content-based Digital Signature for Motion Pictures Authentication and Content-Fragile Watermarking", IEEE Inter. Conf. Multimedia Computing and Systems , Vol. II, Italy, pp. 209-213, 1999. [4] J. Fridrich, \Methods for Detecting Changes in Digital Images", Proc. IEEE Int. Workshop on Intell. Signal Processing and Communication Systems , 1998. [5] G. L. Friedman, \The Trustworthy Digital Camera: Restoring Credibility to the Photographic Image", IEEE Trans. Consumer Electronics , Vol. 39, pp. 905-910, 1993. [6] D. Kundur and D. Hatzinakos, \Digital Watermarking for TellTale Tamper Proo ng and Authentication", Procceedings of the IEEE , Vol. 87, pp. 1167-1180, 1999. [7] C.-Y. Lin and S.-F. Chang, \A Robust Image Authentication Method Surviving JPEG Lossy Compression", SPIE Storage and Retrieval of Image/Video Database , Vol. 3312, San Jose, 1998 (www.ctr.columbia.edu/cylin/auth/auth.html). [8] C.-Y. Lin and S.-F. Chang, \Generating Robust Digital Signature for Image/Video Authentication", Multimedia and Security Workshop at ACM Multimedia , UK, 1998. [9] E. T. Lin and E. J. Delp, \A Review of Fragile Image Watermarks", Proc. of the Multimedia and Security Workshop (ACM Multimedia '99), Orlando, pp. 25-29, 1999. [10] C. S. Lu, H. Y. Mark Liao, S. K. Huang, and C. J. Sze, \Cocktail Watermarking on Images", 3rd Inter. Workshop on Information Hiding , LNCS 1768, pp. 333-347, Sept. 29-Oct. 1, 1999. [11] C. S. Lu, H. Y. Mark Liao, S. K. Huang, and C. J. Sze, \Highly Robust Image Watermarking Using Complementary Modulations", Proc. 2nd International Information Security Workshop , Malaysia, LNCS 1729, pp. 136-153, Nov. 6-7, 1999. 17 [12] C. S. Lu, H. Y. Mark Liao and C. J. Sze, \Combined Watermarking for Image Authentication and Protection", Proc. 1st IEEE Int. Conf. on Multimedia and Expo , USA, 2000. [13] C. S. Lu, H. Y. Mark Liao and L. H. Chen, \Multipurpose Audio Watermarking", Proc. 15th Int. Conf. on Pattern Recognition , Barcelona, Spain, Vol. III, pp. 286-289, 2000. [14] C. S. Lu and H. Y. Mark Liao, \Structural Digital Signature for Image Authentication: An Incidental Distortion Resistant Scheme", to appear in Proc. Multimedia and Security Workshop at the ACM Int. Conf. on Multimedia , Los Angeles, California, USA, 2000. [15] A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone, \Handbook of Applied Cryptography", CRC Press , 1997. [16] A. Said and W. A. Pearlman, "A New, Fast, and Ecient Image Codec based on Set Partitioning in Hierarchical Trees", IEEE Trans. Circuit and Systems for Video Technology , Vol. 6, pp. 243-250, 1996. [17] S. Walton, \Image Authentication for A Slippery New Age", Dr. Dobb's Journal , Vol. 20, pp. 18-26, 1995. [18] R. B. Wolfgang and E. J. Delp, \Fragile Watermarking Using the VW2D Watermark", Proc. SPIE/IS&T Inter. Conf. Security and Watermarking of multimedia Contents , Vol. 3657, pp. 4051, 1999. [19] M. Wu and B. Liu, \Watermarking for Image Authentication", IEEE Inter. Conf. on Image Processing , 1998. [20] L. Xie and G. R. Arce, \A Blind Wavelet Based Digital Signature for Image Authentication", Proc. of the EUSIPCO 98 , Island of Rhodes, Greece, Sep 1998. [21] M. M. Yeung and F. Mintzer, \An Invisible Watermarking Technique for Image Veri cation", IEEE Conf. Image Processing , Vol. 2, pp. 680-683, 1997. [22] G. J. Yu, C. S. Lu, H. Y. Mark Liao and J. P. Sheu, \Mean Quantization Blind Watermarking for Image Authentication," Proc. IEEE Int. Conf. on Image Processing , Vancouver, Canada, Vol. III, pp. 706-709, 2000. 18 [23] B. Zhu, M. D. Swanson, and A. H. Tew k, \Transparent Robust Authentication and Distortion Measurement Technique for Images", The 7th IEEE Digital Signal Processing Workshop , pp. 45-48, 1996. 19 Figure 1: The relationship between the attack's distribution G A (with standard deviation I or M ) and the SDS 's distribution G S (with standard deviation S ). 20 (a) (b) (c) (d) (e) (f) (g) (h) Figure 2: Content tampering: (a) host image; (b) original image with a large object placed; (c)(e) detected results at 22  24 scales when  = 256; (f)(h) detected results at 22  24 scales when  = 128. 21 (a) (b) (c) (d) (e) (f) (g) (h) Figure 3: Content tampering: (a) host image; (b) original image with a small object placed at the bottom-right; (c)(e) detected results at 22  24 scales when  = 256; (f)(h) detected results at 22  24 scales when  = 128. 22 (a) (b) (c) Figure 4: The positions of the elements (illustrated in black color in the wavelet domain) of an SDS constructed from Fig. 2(a) with (a)  = 256, (b)  = 128, and (c)  = 64. 23 Table 1: CoSDS of Fig. 2(a) under SPIHT with various compression ratios (CR). Completeness of SDS  = 256  = 128  = 64 CR 1:000 1:000 1:000 1:000 8:1 16 : 1 32 : 1 64 : 1 1:000 1:000 1:000 0:994 1:000 1:000 0:997 0:816 Table 2: CoSDS of Fig. 2(a) under JPEG with various quality factors (QF). QF(CR) 60( 7:1 : 1) 50( 8:2 : 1) 40( 9:7 : 1) 30(11:7 : 1) 20(15:0 : 1) 10(21:7 : 1) Completeness of SDS  = 256  = 128  = 64 1:000 1:000 1:000 1:000 1:000 1:000 1:000 1:000 1:000 1:000 1:000 0:996 1:000 1:000 0:999 0:992 0:988 0:969 Table 3: CoSDS of Fig. 2(a) under a set of incidental distortions (among them, sharpening and Gaussian noise adding with amount 16 were run using Photoshop). Incidental distortions Standard deviation I rescaling equalization blurring(7  7) medain ltering(5  5) sharpening Gaussian noise(16) 26:8 27:3 22:9 23:0 23:4 15:9 24 Completeness of SDS  = 256  = 128  = 64 0:993 0:983 0:988 0:943 1:000 1:000 0:918 0:961 0:915 0:830 0:990 1:000 0:808 0:946 0:807 0:682 0:954 1:000 The curves with respect to three minimum differences (256, 128, and 64) 0.5 Probability of parent−child pairs destroyed in an SDS 0.45 0.4 0.35 0.3 0.25 0.2 0.15 0.1 0.05 0 minimum diff. = 256 minimum diff. = 128 minimum diff. = 64 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 Six incidental manipulations (listed in Table 3 and labeled as 1, 2, 3, 4, 5, and 6) 6 Figure 5: The probability (vertical axis) that the relationship of the parent-child pairs in an SDS might be destroyed with respect to six incidental manipulations (horizontal axis) listed in Table 3. The minimum distances () used for thresholding are 256, 128, and 64, respectively. 25 (a) (b) (c) (d) (e) (f) (g) (h) Figure 6: Combined attacks with incidental and malicious manipulations: (a) beach image after JPEG+\umbrella" placement; (b)(d) detected results of (a) at 22  24 scales when  = 128; (e) beach image after rescaling(scaling+\umbrella" placement); (f)(h) detected results of (e) at 22  24 scales when  = 128. 26 (a) (b) (c) (d) Figure 7: Malicious manipulations of non-SDS areas: (a) maliciously tampered with image with a \helicopter" in the sky; (b)(d) detected results of (a) at 22  24 scales when  = 128. 27