ICA for Watermarking Digital Images


Journal of Machine Learning Research 1 (2002) 1-25

Submitted 10/02; Published 10/02

Stephane Bounkong, Boremi Toch, David Saad, David Lowe

bounkons@aston.ac.uk, tochb@aston.ac.uk, saadd@aston.ac.uk, lowed@aston.ac.uk

Neural Computing Research Group, Aston University, Birmingham, B4 7ET, United Kingdom

Editor:

Abstract

A domain independent ICA-based approach to watermarking is presented. This approach can be used on images, music or video to embed either a robust or fragile watermark. In the case of robust watermarking, the method shows high information rate and robustness against malicious and non-malicious attacks, while keeping a low induced distortion. The fragile watermarking scheme, on the other hand, shows high sensitivity to tampering attempts while keeping the requirement for high information rate and low distortion. The improved performance is achieved by employing a set of statistically independent sources (the independent components) as the feature space and principled statistical decoding methods. The performance of the suggested method is compared to other state of the art approaches. The paper focuses on applying the method to digitized images although the same approach can be used for other media, such as music or video.

Keywords: Steganography, Watermarking, Information-Hiding, Authentication, ICA

© 2002 Stephane Bounkong, Boremi Toch, David Saad and David Lowe.

1. Introduction

Modern society increasingly relies on digitized information that can be easily accessed, copied and transmitted. The need for secure authentication and watermarking techniques stimulated research in information hiding over the past three decades. Information hiding, or steganography, has a broad range of applications from copyright protection and transaction tracking, to broadcast monitoring, data integrity, authentication and fingerprinting (Cox et al., 2002). Some applications of steganography, especially in the area of fingerprinting, are an essential component in the sale of copyright protected electronic goods, such as software, music and picture files, both off and on-line. Fragile watermarking, on the other hand, plays an important role in authenticating images and audio signals, offering an alternative to traditional methods that often require the transmission of additional metadata; this may be easily identified and removed.

General robust watermarking is based on embedding an imperceptible watermark in the original data (covertext), aiming at either identifying the sender/receiver of the data or as a copyright mark. An efficient watermarking technique allows one to embed as much information as possible, while minimizing the distortion of the watermarked data with respect to the original and being robust against attacks. A trade-off between these conflicting requirements has to be found.

For authentication purposes, in addition to being imperceptible, the watermark has to be sensitive to the slightest modification. This is termed fragile watermarking and allows the detection of tampering attempts. In some cases, the watermark is required to be sensitive only to some attacks while not being affected by others, such as common processing techniques (semi-fragile watermarking).

Current state of the art watermarking methods mostly operate in a feature space, such as the Fourier domain, rather than on the raw data. The allowed level of distortion due to watermarking, based either on conventional measures or on a perceptual model, is limited to a predefined threshold. The embedding process typically relies on modulation or quantization methods, whereas decoding often relies on correlation detection or mapping to the nearest discrete value.

Most modern watermarking techniques have emerged in the last decade. Researchers in this field come from different backgrounds and typically bring with them the knowledge from their previous field; this is reflected in the watermarking techniques devised so far, both in the methods suggested for the watermark embedding process and the feature space chosen for this purpose. The plethora of watermarking methods on offer and the narrow suitability to specific domains make it difficult to provide a principled comprehensive theoretical approach to watermarking; such an approach is a prerequisite to any optimization scheme aimed at maximizing the information embedding rate and the robustness against various attacks, and minimizing the information degradation.

In this paper, we propose a novel approach to watermarking which is independent of the application domain and is supported by existing results from information theory of watermarking. It is based on embedding the message in a set of statistically independent sources obtained in our case by independent component analysis (ICA) (Hyvarinen et al., 2001). These sources span a feature space, and represent the covertext in conjunction with the corresponding set of constant mixing matrices. The distortion measure and the ICA mixing and demixing matrices may differ greatly from one application domain to another, but the watermarking scheme principle remains the same. Indeed, the demixing process gives a set of independent sources, which share similar characteristics and have little correlation with the original application domain. Different generative models may be used for identifying the set of independent sources; ICA, on which we focus here, is clearly one of the most principled methods to identify the statistically independent sources used as the feature space in the suggested scheme.

Recent information theoretical analyses have provided clear upper bounds to the watermark information capacity for a given distortion, based on the model of a communication channel with side information; concrete bounds have also been derived in special cases by Cohen and Lapidoth (2002) and Moulin and O'Sullivan (2003).
Unfortunately, the analyses do not provide any information on how to design an optimal system and have been carried out under certain assumptions about both source and watermark distributions. However, an important result of this research is that the upper bound, in the case of parallel channels, is reachable only if the sources are statistically independent. Any selection of statistically dependent sources results in theoretically inferior performance; this motivates the use of ICA as our feature space.

In addition, we replace suboptimal threshold-based decoding methods by maximum a posteriori (MAP) decoding using noise and source models derived from experiments. The resulting domain independent watermarking method was examined against existing techniques and was found to be competitive with, and often superior to, state of the art approaches.

The paper is organized as follows: in section 2, we introduce the general watermarking framework and existing techniques. In sections 3 and 4, we present our ICA-based approach and details of the method used, respectively. Comparative experiments with other watermarking techniques are carried out and their results analysed in section 5. Discussion and conclusions are presented in section 6.

2. Watermarking: General Framework

The problem of robust watermarking can be described as a game between Alice, Bob and Mallory. Alice is the legitimate owner of a digital data X. She wants to embed in it some information m to protect her intellectual property rights (IPR) and to be able to prove her ownership (or alternatively to embed fingerprinting information about the data transaction, such as details of the buyer). Mallory is the attacker or forger. He wants to counterfeit the watermarked data $\hat{X}$ by uprooting the embedded information and/or embedding his own personal data. The induced modification is denoted by n. Bob is the receiver of the digital data $\tilde{X}$; he wants to be sure of buying the data from an authorized seller (or legitimate owner, Alice). He therefore investigates the presence of a potentially hidden information $\hat{m}$ from the received data $\tilde{X}$ (alternatively, Alice may want to find the source of an illegal distribution). This general watermarking problem is also described in figure 1.

[Figure: block diagram with three stages, Embedding, Attack(s) and Decoding.]

Figure 1: A general watermarking scheme where m is the embedded message, X is the covertext, $\hat{X}$ the watermarked covertext, $\tilde{X}$ the attacked covertext and $\hat{m}$ an estimate of the original message m.

Fragile watermarking follows roughly along the same lines, except for the fact that Alice embeds a watermark that is highly sensitive to tampering attempts. Mallory would like to modify the data to his advantage (e.g., remove information from CCTV footage), while Bob wants to be sure that the received data $\tilde{X}$ has not been tampered with.

Clearly watermarking has more than one objective. It can be used to prove ownership or to fingerprint data (robust watermarking); for this purpose, the embedded information has to be non-removable unless the original data (covertext) is irreversibly damaged and carries no value. Robustness against various attacks is therefore an important requirement for robust watermarking; the watermark acts as a serial number carved on real products.

Watermarking can also be used for authentication and verification of data integrity. The watermark proves that the data has not been modified. Fragility is the important feature to focus on. The scheme has to enable the detection of slight changes that a robust watermark is expected to survive. Authentication should identify both global and local changes, such as the replacement of a character in an image, and possibly locate them. As watermarked images may be compressed or transmitted over a noisy channel, fragile watermarking may also be required to survive some non-malicious attacks, while allowing the detection of deliberate attacks such as feature replacements. This case is usually defined as semi-fragile watermarking.

2.1 Watermarking Requirements

Generic watermarking can be seen as a constrained communication problem, that of a channel with side information, where distortions induced by the sender and the attacker are limited. The three main constraints are imperceptibility, robustness and information rate. A formal model presenting the relation between these three constraints is described by Moulin and O'Sullivan (2003) using an information theoretical framework of information hiding. In this paper, we give an intuitive description of these opposing requirements:

Imperceptibility: This is defined as the similarity between the original covertext and the watermarked version. It can be expressed as $d_1(X, \hat{X}) \le \delta_1$, with $d_1$ being a distortion measure and $\delta_1$ the desired threshold. Typical distortion measures found in the literature are mean square error, signal-to-noise ratio, etc. (Cox et al., 2002). More complicated distortion functions, relying on domain specific perceptual models, can also be used. Note that the attacker is also limited by a similar distortion constraint $d_2(\hat{X}, \tilde{X}) \le \delta_2$, where the distortion measure $d_2$ and its threshold $\delta_2$ may be different from $d_1$ and $\delta_1$, although the same distortion measure is typically used. In the latter case, the distortion caused by an attack is usually expected to be higher than that of the watermark itself ($\delta_1 \le \delta_2$).

Robustness: A watermarking scheme should survive various distortions or attacks. Common attacks may be different depending on the nature of the covertext, for example audio or images. Malicious attacks aim at removing the watermark, while non-malicious attacks are due to common transmission and signal processing practices that might harm the embedded signal. Robust schemes should at least survive non-malicious attacks as well as some malicious attacks. Generally, these attacks are global to the whole covertext. Robustness is often achieved by embedding globally a watermark with little information content compared to the covertext. Robust watermarking aims at maximizing the rate of correctly identifying the original message, $p(\hat{m} = m \mid \tilde{X})$. The message estimate $\hat{m}$ is obtained by some decoding method.

Fragility: In fragile watermarking, one replaces the requirement for robustness by fragility, which is the ability to detect an alteration of the watermark due to tampering attempts. Furthermore, it is often needed to localize the distortion. Such a requirement makes it necessary to have a watermark with higher information content than in the robust watermarking framework, spread over the entire covertext. Our method has different performance with respect to different attacks. Among them, we focus on fragility with respect to replacement attacks, while maintaining some robustness against non-malicious attacks. Fragile watermarking aims at optimizing the two following constraints:

$$\max\, p(\hat{m} = m \mid \tilde{X}_r), \qquad \min\, p(\hat{m} = m \mid \|\tilde{X}_f - \hat{X}\| > \delta_f), \qquad (1)$$

where $\tilde{X}_r$ is the watermarked data attacked by a non-malicious attack against which the scheme has to be robust, and $\tilde{X}_f$ is the watermarked data attacked by another type of attack, typically localised, which needs to be detected by the scheme. The parameter $\delta_f$ sets the fragility of the scheme.

Information rate: The information capacity of a watermarking system is the supremum of all achievable information hiding rates for a given distortion constraint and a given set of attacks. A more formal definition, relying on information theory, is given by Moulin and O'Sullivan (2003). We are interested in maximizing the information embedding rate for given distortion and attack constraints, which is clearly upper bounded by the capacity.
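As a minimal sketch of the imperceptibility constraint listed above, the following Python snippet checks whether a watermarked array stays within a distortion budget. It assumes the distortion measure is the mean square error (one of the measures the text mentions); the function names `mse` and `within_budget` and the toy data are illustrative choices, not part of the paper.

```python
import numpy as np

def mse(original, modified):
    """Mean square error between two equally shaped arrays."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(modified, dtype=float)
    return float(np.mean((a - b) ** 2))

def within_budget(original, modified, delta):
    """True when d(original, modified) <= delta, with d assumed to be the MSE."""
    return mse(original, modified) <= delta

# Toy 4x4 "image" and a watermarked copy that differs by +1 in two pixels.
X = np.arange(16, dtype=float).reshape(4, 4)
X_hat = X.copy()
X_hat[0, 0] += 1.0
X_hat[3, 3] += 1.0

print(mse(X, X_hat))                 # two unit changes over 16 pixels
print(within_budget(X, X_hat, 0.2))  # budget delta_1 = 0.2 (hypothetical value)
```

The same check, with a larger threshold, could stand in for the attacker's constraint; a perceptual model would replace `mse` with a domain specific function.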

There is no general solution to this problem, as it depends on the final purpose of the system. For instance, if the watermark has to survive attacks that induce high distortion, error-correcting codes can be used to improve robustness, but it may be at the expense of the information rate. From now on, we will focus on digital image watermarking although most of the topics may also be relevant for other domains, such as music or video.

2.2 Watermarking Feature Spaces

Early watermarking schemes operated directly in the covertext domain. However, such methods have been shown to be quite poor in their robustness properties. Many feature spaces have been studied in recent years for improving the efficiency of watermarking systems; most of these choices are for reasons of convenience and traditional use in the specific domain, such as Fourier and Cosine transforms in images. A theoretical approach to find an optimal feature space in a particular case, based on an information theoretical approach, has recently been suggested by Ramkumar (2000). However, it is yet to be seen if this approach can be of practical use. Research has focused more on the design of practical systems using well known feature spaces. The three main spaces are:

Cosine Transform Domain: This domain is widely used, being a good approximation for the Karhunen-Loeve transform for highly correlated data, such as images. The cosine transform has a good variance compaction property (Jain, 1989). Therefore, using this feature space facilitates spreading the watermark across a large part of the image information content. Schemes based on this transformation (Cox et al., 1997) show good robustness against various non-malicious attacks. The window cosine transform (on 8×8 pixel windows) is also used (Koch and Zhao, 1995) in conjunction with the JPEG quantization table to be robust against the well-known JPEG compression standard (Wallace, 1992), arguably the most common non-malicious attack.

Fourier Transform Domain: Watermarking schemes based on this feature space are also usually robust against non-malicious attacks for similar reasons. Moreover, some variants of this transformation, such as the Fourier-Mellin transform, inherently include properties such as invariance against affine transformations, thus allowing some watermarking schemes (Ruanaidh and Pun, 1997) to handle geometric attacks, such as rotations. Notice that geometric attacks are so far among the most efficient attacks against general watermarking schemes (Petitcolas and Kuhn, 2002, Stirmark).

Wavelet Transform Domain: Motivated by the upcoming JPEG2000 standard (Committee, 2000), this feature space may play an important role in the near future similar to that of the cosine transform space at present. As shown in (Meerwald, 2001), watermarking in the wavelet domain has recently been the focus of many research projects in this area. Moreover, this frequency feature space allows the embedding process to be spatially localized.

2.3 Embedding, Attacks and Decoding

Various methods have been used to efficiently embed/retrieve information in a medium subject to different attacks. Among them, two main classes have emerged: quantization and modulation methods. Quantization methods have been widely studied in the coding area for decades, and have recently been used in the context of watermarking. Modulation methods are often paired with a correlation detector and a given decision threshold; they show better performance when the original data is available for decoding. The schemes have been studied against common attacks in the literature, among them: noise, lossy compression, band filtering, cropping and collusion.

2.3.1 Embedding

Embedding a watermark usually follows one of the two following schemes or can be a variant of one of them.

Modulation: Embedding information through modulation is usually carried out using one of the three formulae suggested by Cox et al. (1997):

$$\hat{X} = X + \alpha m, \qquad \hat{X} = X(1 + \alpha m), \qquad \hat{X} = X e^{\alpha m}, \qquad (2)$$

where m is the embedded message, X the original covertext value, $\alpha$ is a pre-defined strength factor, and $\hat{X}$ is the watermarked value. In this approach, the value of $\alpha$, common to all components, is chosen heuristically.

Quantization: Another widely used method to embed information is to quantize the original data X, using a quantization function q that provides different quantization values/grids to different embedded message values m. The embedding strength is determined by the minimal distance $\delta$ between two adjacent quantization values corresponding to two different m symbols.

Other methods, such as modifying a couple of feature space coefficients while preserving their absolute difference or some other predefined criteria, may be used to embed data. However, in most cases, these are merely variants of modulation and/or quantization.
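The three modulation rules of equation (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the strength factor is written `alpha`, the function names are invented for this example, and the message values and strength used below are arbitrary.

```python
import numpy as np

def embed_additive(x, m, alpha):
    """Additive rule: x + alpha * m."""
    return x + alpha * m

def embed_multiplicative(x, m, alpha):
    """Multiplicative rule: x * (1 + alpha * m)."""
    return x * (1.0 + alpha * m)

def embed_exponential(x, m, alpha):
    """Exponential rule: x * exp(alpha * m)."""
    return x * np.exp(alpha * m)

x = np.array([10.0, -4.0])   # two covertext (feature) values
m = np.array([1.0, -1.0])    # message values, chosen arbitrarily
alpha = 0.1                  # heuristic strength factor (hypothetical value)

print(embed_additive(x, m, alpha))
print(embed_multiplicative(x, m, alpha))
print(embed_exponential(x, m, alpha))
```

The additive rule changes every component by the same absolute amount, whereas the other two scale the change with the magnitude of the covertext value.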


2.3.2 Attacks

A watermarked image may undergo some attacks. We distinguish two kinds of attack: non-malicious attacks, which are common signal processing methods that are not aimed at removing or tampering with the embedded watermark; and malicious attacks, which are deliberate attempts to remove/disable the watermark, possibly using the embedding algorithm itself. It is obvious that some non-malicious attacks can also be used as malicious attacks, especially if the watermarking process is known to be weak against them. Common non-malicious attacks include:

Noise: Data may be altered due to transmission through a noisy communication channel.

Lossy compression: Compression algorithms are often used to transfer image data efficiently. Compression algorithms such as JPEG or JPEG2000 allow excellent compression rates, while they introduce moderate levels of distortion, which depend on the chosen quality level or compression rate. Such attacks are completely deterministic but also difficult to model.

Enhancement: Very common image processing techniques fall into this category, such as luminosity adjustment, sharpening/blurring, contrast adjustment and edge enhancement.

De-synchronization: Other common processing techniques do not remove the watermark or affect the quality of the picture, but may disable the watermark detection. These include rescaling, cropping and rotation.

2.3.3 Decoding

Different decoding methods are used for the various embedding techniques. For instance, correlation detection is used when the watermark has been embedded using a modulation scheme. The correlation is computed between the attacked covertext $\tilde{X}$ (or the difference $\tilde{X} - X$ between the attacked and original data) and the watermark m. The correlation value is then compared with a predefined detection threshold. A watermark is detected if the correlation is above it. Notice that the original data is required for the decoder to perform efficiently.

Quantization decoding is usually carried out by mapping the attacked value to the nearest quantized value. Knowledge of the quantization process is sufficient for decoding.

A principled decoding method, examined later in this paper, is maximum a posteriori (MAP) decoding. Using probabilistic models of the data, watermark, embedding, noise and corruption process in conjunction with Bayesian statistics, one may obtain posterior mean values of the message as well as error-bars. Drawbacks of this method are its sensitivity to the accuracy of the probabilistic models used, the fact that decoding is carried out in the feature space, and its high computational cost.

3. ICA for Watermarking

ICA was introduced several years ago as a blind source separation technique, but since then has been used in a broad range of applications, from sparse coding and denoising to feature extraction (Hyvarinen et al., 2001). The main assumption in ICA is that a given signal can be represented as a linear mixture of statistically independent sources. This property, combined with the simplicity of a linear mixture model, has made ICA a powerful and useful tool in various research fields.

In the context of watermarking, an ICA-based technique has been studied by Gonzalez-Serrano et al. (2001). The latter, unlike our approach, is related to a least significant bit modification in the ICA domain. Also, the reported results show quite poor performance.

In our approach, ICA allows the maximization of the information content and minimization of the induced distortion by decomposing the covertext (in this case the image) into statistically independent sources. Embedding information in one of these independent sources minimizes the emerging cross-channel interference. In fact, for a broad class of attacks and fixed capacity values, one can show that distortion is minimized when the message is embedded in statistically independent sources (Appendix A). Information theoretical analysis also shows that the information hiding capacity of statistically independent sources is maximal (Moulin and O'Sullivan, 2003).

Finally, this extremely simple transformation facilitates the use of Bayesian decoding techniques based on statistical models. They can be constructed due to the simple factorized statistics of the sources. Principled Bayesian techniques are expected to improve the decoding performance in real systems.

Another significant advantage of the ICA-based approach is its independence with respect to the application domain. The distortion measure and the ICA mixing and demixing matrices may differ from one application domain to another, but the watermarking scheme principle remains the same.

3.1 ICA-based Watermarking as a Communication Problem

ICA-based watermarking can be described, from an information theoretical perspective, as a communication channel with side information (Cox et al., 2002). We use the information theoretical description but focus on the application of ICA-based watermarking within it. Exploiting the fact that ICA is a simple linear transformation, we concentrate on the feature space in modelling the sources, attacks and induced distortion.

[Figure: block diagram of the ICA demixing, message encoding, attack and ICA mixing stages.]

Figure 2: ICA watermarking as a communication problem. In this figure, I is the original image, s are the demixed signals, m is the message to embed, $\hat{s}$ are the watermarked signals, n is the corruption noise, $\tilde{s}$ are the corrupted or attacked watermarked signals and $\tilde{I}$ is the attacked watermarked image.

Figure 2 describes the complete watermarking problem, separating the linear mixing/demixing operations from the communication channel itself (within the dashed line). In order to construct a proper statistical model of the process, we need to model the demixed signals s, which are the source realizations, any possible noise n, the attack process $p(\tilde{s} \mid \hat{s}, n)$, the message m and the embedding process $p(\hat{s} \mid s, m)$. For convenience, the distribution of m is set to be uniform on $\{0, 1\}$. We generate a statistical model of s based on real data, in this case, a set of representative images. The embedding process we use is based on Quantization Index Modulation (QIM) studied by Chen and Wornell (2001). The reasons for selecting this particular technique are:

  • its reported high performance (it has been shown to approach optimal performance for some models),
  • its inherent non-linearity, which makes it more secure (Craver, 1996),
  • the simple statistical model it offers, and
  • the simplicity of its implementation.

Finally, we choose to define the attack process as an additive process and fit the noise distribution according to this assumption.

3.2 ICA for Images

ICA is a versatile technique used in various applications, including image processing (Hyvarinen et al., 2001). The ICA process derives features that best represent the data via a set of components that are as statistically independent as possible. The main assumption behind ICA is that any typical given signal X can be represented as a linear combination of statistically independent sources s using a mixing matrix A such that X = As; we also have the inverse (demixing) relation s = WX, where W denotes the corresponding demixing matrix. Various methods allow ICA basis vector estimation. In experiments, we used the FastICA algorithm developed by Hyvarinen and Oja (1997), which provides good decomposition results efficiently.

Since it is often impractical to use full size images (bigger than 32×32 pixels) as inputs, we apply the FastICA algorithm to square image patches. Various patch sizes are used in the literature, from 8×8 to 32×32 pixels. Two practical aspects have to be considered: processing time and the size of relevant features. Large patches are theoretically feasible, but their basis estimation is computationally demanding; on the other hand, a small patch size leads to poor performance in the watermarking process. A trade-off between these two conflicting constraints has to be found.

Based on this consideration and practical experiments, we constructed our basis (Fig. 5) from a training set of 11 natural scene images (Fig. 4), from which a set of 11,000 image patches of 16×16 pixels has been randomly sampled. The data obtained have then been centered. To remove noise and improve energy compaction, the data dimensionality was reduced using principal component analysis (PCA). The remaining 60 largest eigenvalues (Fig. 3) preserved 98.68% of the data variance. The preprocessed dataset was used as input to the FastICA algorithm.
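The preprocessing described above (random patch sampling, centering and PCA reduction) can be sketched with NumPy as follows. This is a rough illustration rather than the authors' code: the helper names are invented, centering is assumed to mean subtracting the per-pixel mean over patches, and a small synthetic image replaces the natural scene training set. The subsequent FastICA step is not shown.

```python
import numpy as np

def sample_patches(image, patch_size, n_patches, rng):
    """Draw square patches at random positions; each row is one flattened patch."""
    h, w = image.shape
    rows = rng.integers(0, h - patch_size + 1, size=n_patches)
    cols = rng.integers(0, w - patch_size + 1, size=n_patches)
    return np.stack([image[r:r + patch_size, c:c + patch_size].ravel()
                     for r, c in zip(rows, cols)])

def pca_reduce(data, n_components):
    """Center the data and project it onto its leading principal components."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    kept = eigvecs[:, :n_components]
    explained = eigvals[:n_components].sum() / eigvals.sum()
    return centered @ kept, kept, float(explained)

rng = np.random.default_rng(0)
# Synthetic smooth image standing in for a natural scene (assumption for the demo).
image = rng.normal(size=(64, 64)).cumsum(axis=0).cumsum(axis=1)
patches = sample_patches(image, 16, 500, rng)     # 500 patches of 16x16 pixels
reduced, basis, explained = pca_reduce(patches, 60)
print(reduced.shape, round(explained, 4))
```

The array `reduced` would then be passed to an ICA routine; the fraction `explained` plays the role of the retained variance quoted in the text, although its value here depends entirely on the synthetic image.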

[Figure: plot titled "Images Patches PCA Eigenvalues"; horizontal axis "Eigenvalues Index", vertical axis "Eigenvalues Values".]

Figure 3: Image patch eigenvalues.

[Figure: example natural scene image, 512 × 512 pixels.]

Figure 4: Example of natural image.

Figure 5: ICA basis obtained from natural images.

4. ICA Watermarking Process

In this section, ICA-based algorithms for robust and fragile watermarking are proposed. Both algorithms rely on an ICA feature space and QIM. Some differences, which we will highlight later on, make them more suitable to their respective purposes. The general ICA-based watermarking process comprises four stages as described in Fig. 6.

1. The image is divided into contiguous image patches giving a set of mixed signals. Each patch is then demixed resulting in s, using a predetermined ICA demixing matrix W, prepared using an ensemble of typical images.
2. For each patch, a set of coefficients is selected according to the specific task (fragile or robust watermarking) as described below.
3. The selected coefficients are quantized and watermarked. The difference between the watermarked and original values is denoted by $\Delta$.
4. $\Delta$ is multiplied by the mixing matrix A to produce w, which is then added to the original picture I.
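The four stages can be sketched as follows. This is an illustrative sketch under several assumptions, not the authors' implementation: a random orthogonal matrix stands in for a learned demixing matrix W (so that A is simply its transpose), the quantizer uses one possible two-grid convention (even multiples of the step for bit 0, odd multiples for bit 1), the difference between watermarked and original coefficients is called `diff`, and the compensation applied to non-selected components in the robust scheme is left out.

```python
import numpy as np

def split_patches(image, p):
    """Cut an image (sides divisible by p) into contiguous p x p patches, one per row."""
    h, w = image.shape
    return image.reshape(h // p, p, w // p, p).swapaxes(1, 2).reshape(-1, p * p)

def join_patches(patches, shape, p):
    """Inverse of split_patches."""
    h, w = shape
    return patches.reshape(h // p, w // p, p, p).swapaxes(1, 2).reshape(h, w)

def quantize(values, bits, delta):
    """Assumed convention: bit 0 -> even multiples of delta, bit 1 -> odd multiples."""
    return 2 * delta * np.round((values - bits * delta) / (2 * delta)) + bits * delta

def embed(image, W, A, selected, bits, delta, p):
    patches = split_patches(image, p)            # stage 1: mixed signals
    s = patches @ W.T                            # stage 1: demix, s = W x per patch
    diff = np.zeros_like(s)
    for k, idx in enumerate(selected):           # stage 2: selected coefficients
        diff[:, idx] = quantize(s[:, idx], bits[:, k], delta) - s[:, idx]  # stage 3
    w = diff @ A.T                               # stage 4: w = A * diff
    return join_patches(patches + w, image.shape, p)

rng = np.random.default_rng(1)
W, _ = np.linalg.qr(rng.normal(size=(16, 16)))   # orthogonal stand-in for a learned W
A = W.T                                          # its inverse acts as the mixing matrix
image = rng.uniform(0, 255, size=(8, 8))
bits = np.array([[0], [1], [1], [0]])            # one bit for each of the four patches
marked = embed(image, W, A, [3], bits, 8.0, 4)
```

Because the stand-in W is invertible, demixing `marked` again places the selected coefficient of every patch exactly on the grid of its bit; with a reduced (non-square) W this would hold only within the retained subspace.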

[Figure: block diagram. ICA demixing (W), coefficient processing (with inputs m and $\delta$), ICA mixing (A), and addition of w to I giving $I_w$.]

Figure 6: ICA watermarking scheme.

Robust Watermarking Scheme, ICs set selection: In order to achieve the best robustness, all allowed distortion is concentrated, using a large quantization step, in one IC that has been selected according to some criteria; for instance, robustness against certain types of attacks or suitable statistical properties. Furthermore, to improve imperceptibility we use slight modifications in non-selected ICs to compensate for the distortion induced by the quantization of the selected IC; this is carried out by minimizing $\|A\Delta\|$ for the given $\hat{s}_i$, where $\hat{s}_i$ is the selected and quantized value and i the index of the quantized source or IC; A is the mixing matrix.

Fragile Watermarking Scheme, ICs set selection: For authentication purposes, one is required to detect changes in any given patch; therefore, a large set of ICs is selected in each patch to be quantized. The probability for a random patch to have the same binary watermark is therefore $p = 2^{-n}$, where n is the number of quantized ICs, chosen to be sufficiently high so as to limit the feasibility of data counterfeiting. In the case of a replacement larger than the patch size, the probability of having the same binary signature can be approximated by $p^m$, where m is here the number of involved patches (Fig. 7). The allowed distortion characterized by $\delta_1$ is distributed across this set. The lower $\delta_1$ is, the more fragile and imperceptible the watermark becomes. However, if $\delta_1$ is too small, the finite resolution of the digital image makes it impossible to embed any watermark. Such a watermark is easily destroyed as intended, but one may also want to ensure that the watermark survives mild non-malicious attacks that may occur in the recording or transmission process.
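To give a feel for the counterfeiting probabilities in the fragile scheme, the short Python sketch below evaluates the chance that a random replacement reproduces the binary watermark when n ICs are quantized per patch and m patches are involved. The function name and the numerical values (8 quantized ICs, 16 patches) are hypothetical choices for illustration only.

```python
def match_probability(n_ics, n_patches=1):
    """Chance of matching the binary watermark by accident: (2 ** -n) ** m."""
    return (2.0 ** -n_ics) ** n_patches

# One patch with 8 quantized ICs (hypothetical setting).
print(match_probability(8))

# A replaced region covering 16 patches, e.g. a 64 x 64 area aligned
# with 16 x 16 patches, again with 8 quantized ICs per patch.
print(match_probability(8, 16))
```

Even a modest number of quantized ICs per patch makes an undetected replacement of a multi-patch region extremely unlikely under this approximation.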

4.1 Watermark Embedding: Quantization Index Modulation

QIM (Chen and Wornell, 2001) is the embedding method used in our experiments. It can be seen as a quantization process, which uses two grids corresponding to the value of the message bit $m_i \in \{0, 1\}$ (Fig. 8). As noted above, the relative simplicity of this embedding process facilitates the use of a statistical model in the decoding process.
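A minimal scalar sketch of two-grid QIM embedding is given below. It assumes one particular convention that the text does not fix: `delta` is the spacing between neighbouring points of the combined grid, bit 0 uses the even multiples of `delta` and bit 1 the odd multiples. The function name is illustrative.

```python
def qim_embed(value, bit, delta):
    """Move value to the nearest point of the grid assigned to bit.

    Assumed grids: bit 0 -> even multiples of delta, bit 1 -> odd multiples,
    so each grid has period 2 * delta and the two grids interleave.
    """
    offset = bit * delta
    return 2 * delta * round((value - offset) / (2 * delta)) + offset

print(qim_embed(3.3, 0, 1.0))   # nearest even multiple of 1.0
print(qim_embed(3.3, 1, 1.0))   # nearest odd multiple of 1.0
```

With this convention the embedding never moves a value by more than `delta`, which is what ties the quantization step to the induced distortion.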

4.2 Watermark Decoding

After an attack, the task is to infer the embedded message vector m from the attacked (corrupted) value of the watermarked covertext $\tilde{s}$. We will focus on two methods, a simple threshold-based approach and a principled Bayesian decoding method.

[Figure: two image panels (left and right) with pixel-coordinate axes.]

Figure 7: Detection of a tampering attack using ICA-based fragile watermarking. In the left image, a square patch of 64 × 64 pixels has been attacked by a Gaussian noise of standard deviation 0.5. The modified area is delimited by a white line. In the right image, the grey regions show the image patches where a potential tampering attempt has been detected. A black line denotes the area where the modification has been carried out.

[Figure: two interleaved quantization grids, one for m = 0 and one for m = 1; x is mapped to $\hat{x}_0$ and y to $\hat{y}_1$.]

Figure 8: Quantization index modulation (QIM) - the original value is represented by *, with $\delta$ representing the quantization step; the dashed-dot arrows show the quantization index modulation. On the left, we embed the message bit m = 0 in the real value x while on the right we embed the message bit m = 1 in the original value y.

Decoding to the nearest grid point, described in Fig. 9, is probably the simplest decoding for a quantization embedding scheme. The robustness of the coding/decoding process is directly linked to the quantization step $\delta$ used. The only requirement of this decoder is knowledge of the quantization grids.

Figure 9: Decoding to the nearest grid point; in this case the retrieved message m is 0. (Diagram labels: m = 1, m = 0, magnitude, *, $j\delta$, $(j+1)\delta$.)
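Nearest-grid-point decoding can be illustrated with a short sketch. The helper names and the grid convention (adjacent points $j\delta$ and $(j+1)\delta$ of the combined grid carry different bits, with even multiples carrying bit 0) are assumptions made for this example, not details confirmed by the text.

```python
import numpy as np


def qim_embed(values, bits, delta):
    """Embed bits by quantizing onto even (bit 0) or odd (bit 1) multiples of delta."""
    values = np.asarray(values, dtype=float)
    bits = np.asarray(bits)
    return 2.0 * delta * np.rint((values - bits * delta) / (2.0 * delta)) + bits * delta


def decode_nearest(received, delta):
    """Return the bit carried by the nearest point of the combined grid."""
    index = np.rint(np.asarray(received, dtype=float) / delta).astype(np.int64)
    return index % 2


bits = np.array([0, 1, 1, 0])
marked = qim_embed([0.37, -1.2, 2.9, 5.4], bits, delta=0.5)
# Each perturbation is smaller than delta / 2, so the nearest grid point is unchanged.
attacked = marked + np.array([0.2, -0.2, 0.1, -0.24])
decoded = decode_nearest(attacked, delta=0.5)
print(decoded)
```

With this convention, decoding is error-free as long as the attack moves a coefficient by less than half the spacing of the combined grid, which is why robustness is tied directly to the quantization step.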

MAP decoding relies on selecting the values of $m$ that maximize the posterior probability $p(m \mid \tilde{s})$ as an estimate of the message, $\hat{m} = \arg\max_m p(m \mid \tilde{s})$. Using Bayes rule, we obtain

$$ p(m \mid \tilde{s}) = \frac{p(\tilde{s} \mid m)\, p(m)}{p(\tilde{s})} . $$

Exploiting the fact that all components of the source $s$ and message $m$ are independent and identically distributed, and assuming that correlations emerging from the attack process are negligible, one can reduce the multidimensional problem to a factorized single-variable inference problem. As $p(m)$ has been chosen uniform and $p(\tilde{s})$ is a normalization term independent of $m$, one may reduce the inference problem to $\hat{m}_i = \arg\max_{m_i} p(\tilde{s}_i \mid m_i)$ over the two values of $m_i$, where $i$ is the index in the vector. From now on, the index $i$ will be omitted to simplify the notation.

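Figure 10: Probability of having $\hat{s}_{0,n}$ given m = 0. (The plot shows the source density f(x) over the consecutive grid points $\hat{s}_{1,n-1}$, $\hat{s}_{0,n}$ and $\hat{s}_{1,n+1}$.)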

The probability of having a given quantized value, say $\hat{s}_{m=0,n}$, which embeds the message $m = 0$, is central to our calculation. The expression provided in equation (3) is based on the fact that any value between two consecutive grid points $\hat{s}_{m=1}$, watermarked by a message $m = 0$, will be quantized to the same value $\hat{s}_{m=0,n}$, as explained graphically in Fig. 10, where the first index represents the embedded bit value and the second represents the running quantization number on the combined grid. One straightforwardly derives the expression

$$ P(\hat{s}_{0,n} \mid m = 0) = \int_{\hat{s}_{1,n-1}}^{\hat{s}_{1,n+1}} f(x)\, dx , \tag{3} $$

which relies on the source probability density function $f$, which is not known analytically. From the Markov chain in Fig. 2, and using the probabilistic attack model we constructed previously, one derives the following conditional probabilities that facilitate the MAP estimate of $m$.

$$ p(\tilde{s} \mid m = 0) = \sum_{\hat{s}_{0,n}} p_n(\tilde{s} - \hat{s}_{0,n})\, P(\hat{s}_{0,n} \mid m = 0) \tag{4} $$

$$ \phantom{p(\tilde{s} \mid m = 0)} = \sum_{n} p_n(\tilde{s} - \hat{s}_{0,n}) \int_{\hat{s}_{1,n-1}}^{\hat{s}_{1,n+1}} f(x)\, dx , \tag{5} $$

$$ p(\tilde{s} \mid m = 1) = \sum_{\hat{s}_{1,n}} p_n(\tilde{s} - \hat{s}_{1,n})\, P(\hat{s}_{1,n} \mid m = 1) \tag{6} $$

$$ \phantom{p(\tilde{s} \mid m = 1)} = \sum_{n} p_n(\tilde{s} - \hat{s}_{1,n}) \int_{\hat{s}_{0,n-1}}^{\hat{s}_{0,n+1}} f(x)\, dx , \tag{7} $$

with $p_n(\tilde{s} - \hat{s}_{\cdot,n})$ representing a noise model, the argument of which is the difference between the watermarked and attacked IC values.

Clearly the method relies heavily on obtaining reliable probabilistic models for both the sources and the attack process. In this study, we constructed a statistical model for the watermarking problem based on an ICA feature space of digital images. A model for $f$, based on the family of Generalized Gaussian Exponential (GGE) distributions, has been derived and statistically tested in Appendix B. In order to obtain a better model, more elaborate distributions, such as mixtures of Gaussians, may be used. The disparity between the image and noise sources, $s$ and $n$, and the constructed probabilistic models $p(s)$ and $p(n)$, respectively, is measured by the $\chi^2$ test (Appendices C and D). Three MAP based decoders are devised for three different attacks: JPEG compression, set partitioning in hierarchical trees (SPIHT) compression and Gaussian noise.
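The MAP decision built from the likelihoods $p(\tilde{s} \mid m = 0)$ and $p(\tilde{s} \mid m = 1)$ can be sketched numerically. The sketch below is illustrative only: it assumes a Laplacian source model and a Gaussian noise model as stand-ins for the fitted distributions, a combined grid with spacing `delta` on which even multiples carry bit 0, and a truncated sum over grid points; all function and parameter names are hypothetical.

```python
import numpy as np


def laplace_cdf(x, scale):
    """CDF of a zero-mean Laplacian, used here as an assumed source model F."""
    x = np.asarray(x, dtype=float)
    return 0.5 + 0.5 * np.sign(x) * (1.0 - np.exp(-np.abs(x) / scale))


def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian density, used here as an assumed noise model p_n."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def likelihood(s_tilde, bit, delta, source_scale, noise_sigma, half_width=200):
    """Truncated sum over the grid points of one bit: p_n(s~ - s^) * P(s^ | m)."""
    j = np.arange(-half_width, half_width + 1)
    grid = (2 * j + bit) * delta
    # Source mass quantized onto each grid point: integral of f between the
    # two neighbouring points of the other grid.
    mass = laplace_cdf(grid + delta, source_scale) - laplace_cdf(grid - delta, source_scale)
    return float(np.sum(gaussian_pdf(s_tilde - grid, noise_sigma) * mass))


def map_decode(s_tilde, delta, source_scale, noise_sigma):
    """Pick the bit with the larger likelihood (uniform prior on the bit)."""
    l0 = likelihood(s_tilde, 0, delta, source_scale, noise_sigma)
    l1 = likelihood(s_tilde, 1, delta, source_scale, noise_sigma)
    return int(l1 > l0)


print(map_decode(4.0, delta=1.0, source_scale=2.0, noise_sigma=0.2))
print(map_decode(0.6, delta=1.0, source_scale=0.3, noise_sigma=1.0))
```

In the second call the received value 0.6 is closer to the bit-1 grid point at 1.0, so nearest-grid decoding would return 1, whereas the MAP rule under these assumed models returns 0 because the source mass is concentrated around the bit-0 point at the origin. This is the kind of difference that the source and noise models are meant to exploit.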

5. Experiments and Results

5.1 Experiments

To test the performance of our watermarking scheme against existing state-of-the-art methods, we carried out a set of experiments for each of the watermarking tasks. The distortion constraint threshold $\delta_1$ related to the embedding process is set to 43 dB using the peak signal-to-noise ratio (PSNR) measure, which ensures the imperceptibility of the watermark for all studied methods. All the tests are carried out on a test set (different from the ICA training set) of 11 greyscale natural scene images (Fig. 4) of 512 × 512 pixels. For a given attack, various strengths are tested. The embedded messages $m$ are randomly generated binary sequences on {0, 1}.

Robust Watermarking Scheme - To study the performance of our robust watermarking scheme, we use three variants of our method, two DCT based watermarking algorithms and a DWT based algorithm; their performances are tested under Gaussian noise, JPEG compression (Wallace, 1992) and SPIHT compression (Said and Pearlman, 1996) attacks. For a given image, attack type and attack strength, the test is repeated 100 times with a different embedded message. The embedded messages $m$ are of length 1024. In the next subsection, a brief description of the algorithms will be given. The complete settings of the experiments can be found in Appendix E.

Fragile Watermarking - The ability of our fragile watermarking scheme to detect modified patches is tested. In order to simulate a tampering process, a randomly located square patch of 16 × 16 pixels in the watermarked picture is modified by the addition of random noise. Since the size of our image patches is 16 × 16, this attack can clearly affect up to four image patches of our scheme. Each one of the four potentially affected patches has only 128 pixels involved on average. This means that 75% of the patch is unaffected by the tampering. The detection is considered successful if some of the fragile watermark bits cannot be correctly retrieved from the corresponding region. For a given image, attack strength and length of message, the test is repeated 100 times with a different message. The embedded messages $m$ are of length 1024 × n, where n is the number of bits embedded per patch and 1024 is the number of patches per image. In this test, we set $\delta_n$ to 0.1, which gives a different distortion for each n, between 49 and 50 dB PSNR.

The robustness of our scheme is also tested against non-malicious mild attacks, such as Gaussian noise or JPEG compression, with respect to the number of bits embedded per patch. The trade-off between robustness and fragility has to be set according to the final purpose of the watermarking application. In our scheme, the watermark fragility is increased by increasing the number of ICs to modify per patch; doing so also reduces the probability for a random patch to carry the same binary signature. It is also possible to increase the fragility by decreasing the quantization step $\delta_n$, or by considering several adjacent patches together if the relevant feature size in the picture is large. Physical limits of the digital image storage (quantization) set a lower bound on $\delta_n$. For our experiments, we set the watermarking distortion threshold to 43 dB PSNR and distribute all this distortion allowance across the set of selected ICs. The details of the parameters used can be found in Appendix E. For a given image, attack strength and length of message, the test is repeated 10 times with a different message. The length of the embedded messages $m$ is 1024 × n, where n is the number of bits embedded per patch.

5.2 Algorithm Descriptions

The various algorithms described below are based on quantization of a selected set of coefficients in their respective feature space. The preselection of these sets is also described below.

ICA Sel: This is an ICA based algorithm, where we preselect a small subset of ICs that are particularly robust against a specific attack; a single IC is then randomly selected from this subset to be watermarked in each patch. Decoding is carried out by mapping to the nearest grid point.

ICA Ne: This ICA based algorithm is introduced as a benchmark for the ICA Map algorithm, to show the improvement gained from using a principled decoding method instead of decoding to the nearest grid point. A single IC is selected to be watermarked in all patches for a given attack; the selection criterion is not directly related to its robustness against specific attacks, but to its good agreement with the corresponding source and attack models, according to the $\chi^2$ test. Decoding to the nearest grid point is used.

ICA Map: This algorithm is similar to ICA Ne, except for the use of MAP decoding instead of decoding to the nearest grid point.

DCT: A standard, commonly used, DCT based algorithm. It quantizes (QIM) the DCT representation of the entire picture. Among the DCT coefficients which represent a signal with at least one cycle per image patch of 16 × 16 pixels, the 1024 lowest frequency ones convey the watermark. The watermarked picture is then obtained by application of an inverse DCT. Decoding to the nearest grid point is used.

DCTX: A local DCT based algorithm. It relies on a partitioning of the picture into contiguous patches of 16 × 16 pixels. The DCT is applied to each of them. For each patch, a single coefficient is randomly selected among the low frequency ones, and quantized (QIM). An inverse DCT is then applied to obtain each watermarked patch. Decoding to the nearest grid point is used.

DWT: A multiresolution wavelet transform based watermarking algorithm. It basically relies on embedding the message in the third level of the Haar wavelet decomposition with a strength parameter (Q = 2 in our case) that determines both the robustness achieved and the level of imperceptibility. A correlation based decoder is used. Full details of the method and its parameters are given in (Kundur and Hatzinakos, 1998).

ICA Fra: A fragile watermarking scheme using the ICA feature space and QIM. The main difference with respect to the robust ICA-based watermarking method is that here one embeds more information per patch by choosing several ICs in each patch, but with a lower quantization step for each IC. The number of embedded bits and the quantization steps can be tuned to adapt to fragility/robustness requirements. Here, decoding to the nearest grid point is also used.
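The patch-wise detection principle behind the fragile scheme (a patch is flagged when some of its embedded bits are not retrieved correctly) can be sketched as below. The arrays stand in for IC values; the QIM grid convention, the helper names and the toy tampering are assumptions made for illustration, not the experimental setup of the paper.

```python
import numpy as np


def qim_embed(values, bits, delta):
    """Embed bits by quantizing onto even (bit 0) or odd (bit 1) multiples of delta."""
    values = np.asarray(values, dtype=float)
    bits = np.asarray(bits)
    return 2.0 * delta * np.rint((values - bits * delta) / (2.0 * delta)) + bits * delta


def decode_nearest(received, delta):
    """Return the bit carried by the nearest point of the combined grid."""
    return np.rint(np.asarray(received, dtype=float) / delta).astype(np.int64) % 2


def flag_tampered_patches(received, expected_bits, delta):
    """Flag a patch (row) when at least one of its bits is decoded incorrectly."""
    return np.any(decode_nearest(received, delta) != expected_bits, axis=1)


def false_match_probability(bits_per_patch):
    """Chance that a random patch carries a given signature, for uniform random bits."""
    return 2.0 ** (-bits_per_patch)


rng = np.random.default_rng(0)
n_patches, bits_per_patch, delta = 8, 10, 0.1
sources = rng.normal(size=(n_patches, bits_per_patch))  # stand-in for IC values
message = rng.integers(0, 2, size=(n_patches, bits_per_patch))
marked = qim_embed(sources, message, delta)

tampered = marked.copy()
tampered[3] += delta  # toy attack: moves every coefficient of patch 3 to the next grid point
flags = flag_tampered_patches(tampered, message, delta)
print(flags, false_match_probability(bits_per_patch))
```

With 10 bits per patch the false-match probability is $2^{-10} \approx 0.1\%$, which illustrates why embedding more bits per patch makes the signature harder to reproduce by chance.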

5.3 Results

Robust Watermarking Results: The Gaussian noise attack results (Fig. 11) show that ICA Sel, ICA Ne and DCTX perform as well as the global DCT method, while DWT is less robust across the entire range of noise levels. ICA Map, on the other hand, although based on an IC with sub-optimal robustness properties, outperforms all other schemes owing to the good source and noise statistical models exploited in the decoding process.

The JPEG compression attack results (Fig. 12) show that ICA Sel performs as well as the DCTX algorithm. On the other hand, DCT and DWT perform quite poorly in general, while ICA Ne and ICA Map perform as well as DCTX and ICA Sel over the entire range of acceptable compression rates (30-90), but show bad results for high compression. This is due to a breakdown of the statistical models at such high compression rates, indicated by the poor $\chi^2$ test results, and to the sub-optimal robustness properties of the selected IC.

The SPIHT compression attack results (Fig. 13) show that ICA Sel, ICA Ne, DCT and DCTX have similar performances, with slightly better performance shown by ICA Sel and DCTX. DWT performs quite poorly. The ICA Map algorithm does not perform very well, presumably due to inaccurate source and/or noise models used in the decoding process.

Overall, the results are quite promising for the ICA algorithms, which are either competitive with, or outperform, other state-of-the-art methods. Improving the source and noise models may facilitate further improvements in performance.

Figure 11: Performances of watermarking schemes against Gaussian noise. (Plot titled "Gaussian Noise Attack on Watermarking Schemes": error rate against noise standard deviation, 10 to 30, for ICA Sel, ICA Ne, ICA Map, DCT, DCTX and DWT.)

Figure 12: Performances of watermarking schemes against JPEG compression. (Plot titled "JPEG Compression Attack on Watermarking Schemes": error rate against JPEG quality level, 10 to 90, for ICA Sel, ICA Ne, ICA Map, DCT, DCTX and DWT.)

Fragile Watermarking Results: An example of the detection of attacked patches has already been given in Fig. 7. In this example, a patch is attacked by Gaussian noise above the value that the watermark was designed to tolerate. The probability of identifying and locating the attacked patch then follows directly from the number of bits embedded in each patch. To study the fragility and robustness properties of our fragile watermarking scheme, ICA Fra, we conducted a set of experiments to determine the probability of identifying an attacked patch, and the percentage of decoding errors observed under mild Gaussian and JPEG attacks. The results of these experiments are shown in Fig. 14 and Fig. 15, respectively.

Figure 13: Performances of watermarking schemes against SPIHT compression. (Plot titled "SPIHT Compression Attack on Watermarking Schemes": error rate against SPIHT compression rate, for ICA Sel, ICA Ne, ICA Map, DCT, DCTX and DWT.)

In the first set of experiments, we attacked an arbitrary single patch (16 × 16 pixels) in each image using Gaussian noise, and monitored the probability of identifying the attacked patch. The experiment was carried out at low Gaussian noise levels, with variance smaller than 1 (keeping in mind that intensity levels are in the range 0-255), and with different numbers of marked bits per patch. The probability of detection is marked on the curved lines with respect to the number of bits embedded per patch (horizontal axis) and the noise level (vertical axis). We see that even at relatively low noise levels, it is still possible to identify the attacked patches.

In the second set of experiments, we studied the robustness of our method against global non-malicious attacks, Gaussian noise and JPEG compression, as shown in Fig. 15. The figures describe the fraction of decoding errors (marked on the curved lines) as a function of the number of bits marked per patch (horizontal axis) and the noise level (vertical axis). Clearly, a higher number of bits marked per patch, for a given distortion, will result in smaller quantization steps and lower robustness.

The results show that ICA Fra provides an efficient fragile watermarking method even in the presence of mild non-malicious attacks. The embedded fragile watermark, of a given low distortion (43 dB PSNR), can easily and reliably be identified using 10 bits per patch (with a probability of 0.1% for a random patch to carry the same signature). Another aspect of our method that one should emphasize is that A and W are unknown to the forger; this, combined with small typical quantization steps, will make it very difficult to forge a watermark.

6. Conclusion

A novel approach to both robust and fragile watermarking, using ICA as the feature space in which watermarks are embedded, is presented. The new approach, being based on embedding information in statistically independent sources, shows a high information embedding rate and minimal distortion. Its performance is examined on a set of representative images, random messages and various attacks. Experiments show highly promising performance on all the attacks examined. The main advance is that, being based on embedding information using statistically independent sources, the same watermarking method can be easily applied across different media. Based on local information and a linear transform, our method is computationally efficient, offering additional security in the use of specific mixing/de-mixing matrices that are not easy to obtain. The provided statistical models for both sources and attacks facilitate the use of a Bayesian decoding method that has the potential to provide an optimal decoding scheme.

Further research may improve the performance by refining the existing statistical models, devising a new approach for selecting the ICs to be watermarked, and improving the distortion measures used (for instance, a measure based on the human visual system). Applying the same approach to other domains, such as audio signals (Toch et al., 2003), may also require some adaptations due to the different nature of the signals.

Figure 14: ICA Fra watermarking for local distortion detection. (Contour plot titled "Average Error Detection induced by a Gaussian Noise": detection probability as a function of the number of bits per patch, 5 to 30, and the noise standard deviation, 0.2 to 1, with the corresponding PSNR in dB.)

Figure 15: ICA Fra watermarking for global distortion detection. (Two contour plots, "Average Error Detection induced by a Gaussian Noise" and "Average Error Detection induced by JPEG Compression": fraction of decoding errors as a function of the number of bits per patch, 5 to 30, and of the noise standard deviation or the JPEG quality factor, respectively, with the corresponding PSNR in dB.)

Appendix A. Proof

In this appendix, we will show that a blockwise memoryless watermarking process, for a given power-limited class of blockwise memoryless attacks and a fixed capacity, minimizes the distortion it induces when the sources $s$ to watermark are independent. This result will be derived from proposition 8.3 of (Moulin and O'Sullivan, 2003).

It is assumed that $s$ is a blockwise memoryless source with block size $L$, that the attack is also blockwise memoryless with blocks of the same size $L$, and that the class of attacks $\mathcal{A}$ is limited by a distortion constraint. Let $\tilde{p}(s) = \prod_{i=1}^{L} p(s_i)$ be the product of the marginals $p(s_i)$. Theorem 4.4 in (Moulin and O'Sullivan, 2003) gives us the following expression for the capacity $C$ of the watermarking game against an attacker, subject to a distortion constraint for embedding, assuming the distribution $p(s)$:

$$ C = \max_{Q \in \mathcal{Q}} \min_{A \in \mathcal{A}} J(Q, A) , \tag{8} $$

where $Q$ denotes the probability density function of the embedding channel, $A$ the probability distribution function of the attack channel and $J$ represents the cost function (information rate), described in section 4.3 of (Moulin and O'Sullivan, 2003).

Let us first fix the class of attacks $\mathcal{A}$. From proposition 8.3 in (Moulin and O'Sullivan, 2003) and using the same notations, we derive that, subject to a given maximum embedding distortion $D_1$, the capacity $C$ of any distribution $p(s)$ and the capacity $\tilde{C}$ of $\tilde{p}(s)$ are related by $C \le \tilde{C}$.

For a given distribution $p(s)$, if $D$ and $D'$ are two maximum embedding distortions and $C$ and $C'$ are the respective capacities, then $D \le D'$ is equivalent to having $C \le C'$. This can be easily proved, since the capacity is defined as the maximum over a set of probability density functions $\mathcal{Q}$. Increasing the distortion enlarges the set $\mathcal{Q}$, thus resulting in a higher capacity.

If there exists a maximum embedding distortion $D_2$ such that the capacity $C_2$ of $\tilde{p}(s)$ subject to $D_2$ satisfies $C_2 = C$, then, according to the result from the previous paragraph applied to the distribution $\tilde{p}(s)$, we obtain $D_2 \le D_1$. We have proved that, for a given capacity, the lowest maximum distortion induced by the embedding process is achieved when the distribution of the block elements is independent (factorised).

Appendix B. Generalized Gaussian or Exponential Distributions for $\nu = 1/k$ and $\nu = 2/k$

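The general expression of densities belonging to this family is (for zero mean)

$$ p_x(x) = C \exp\left( -\frac{|x|^{\nu}}{\nu\, E\{|x|^{\nu}\}} \right) . \tag{9} $$

The positive real-valued power $\nu$ determines the type of distribution, and $C$ is a normalizing constant. Two families of probability density functions and probability functions, for $\nu = 1/k$ and $\nu = 2/k$, are given below.

The GGE(1/k) family probability density function and probability function are given below, where $\sigma$ represents the distribution's standard deviation.

$$ f_{1/k}(x) = \frac{1}{2\, k!\, t} \exp\left\{ -\left( \frac{|x|}{t} \right)^{1/k} \right\} , \tag{10} $$

$$ F_{1/k}(x) = \frac{1}{2} + \frac{1}{2}\, \mathrm{sign}(x) \left[ 1 - \exp\left\{ -\left( \frac{|x|}{t} \right)^{1/k} \right\} \sum_{i=1}^{k} \frac{1}{(k-i)!} \left( \frac{|x|}{t} \right)^{(k-i)/k} \right] , \tag{11} $$

$$ t = \sigma \left( \frac{(k-1)!}{(3k-1)!} \right)^{1/2} . \tag{12} $$

The GGE(2/(2m + 1)) family expressions are given by the following equations, where $\sigma$ is the standard deviation.

$$ f_{\frac{2}{2m+1}}(x) = \frac{1}{\sqrt{2\pi}\, \prod_{i=1}^{m} (2i+1)\; \tau^{2m+1}} \exp\left\{ -\frac{|x|^{\frac{2}{2m+1}}}{2\tau^2} \right\} , \tag{13} $$

$$ F_{\frac{2}{2m+1}}(x) = \frac{1}{2} + \mathrm{sign}(x) \left[ \frac{1}{2}\, \Phi\!\left( \frac{|x|^{\frac{1}{2m+1}}}{\sqrt{2}\, \tau} \right) - \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{|x|^{\frac{2}{2m+1}}}{2\tau^2} \right\} \sum_{i=1}^{m} \frac{2^{m-i}\, (m-i)!}{(2(m-i)+1)!} \left( \frac{|x|^{\frac{1}{2m+1}}}{\tau} \right)^{2(m-i)+1} \right] , \tag{14} $$

$$ \tau^2 = \left( \sigma^2\, \frac{2^{2m}\, (2m)!\, (3m)!}{m!\, (6m+1)!} \right)^{\frac{1}{2m+1}} , \tag{15} $$

$$ \Phi(u) = \frac{2}{\sqrt{\pi}} \int_0^u \exp(-t^2)\, dt . \tag{16} $$

Appendix C. The $\chi^2$ Fitting Test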

The $\chi^2$ test is a method for testing the relevance of a model against real data. It uses a data distribution model, also called hypothesis, and a set of real data samples. The disparity between model and data is measured by a normalised quadratic difference, Eq. (17). Then, comparing the $\chi^2$ value obtained to a given threshold, the hypothesis is rejected or accepted. The $\chi^2$ expression is given by the following equation.

$$ \chi^2 = \sum_{i=1}^{k} \frac{(M_i - u p_i)^2}{u p_i} = \sum_{i=1}^{k} \frac{M_i^2}{u p_i} - u , \tag{17} $$

where $p_i = F(b_i) - F(a_i)$ is the theoretical probability of $x$ falling in the $i$-th bin $[a_i, b_i)$, $M_i$ being the number of sample values in that bin, with $\sum_{i=1}^{k} M_i = u$. The border bins must satisfy $u p_i \ge 1$ and the others $u p_i \ge 5$; $m = k - r - 1$ is the number of degrees of freedom of $\chi^2$, where $r$ is the number of estimated parameters. In our study, we used 26 bins of size 0.3 from $-3.9$ to $3.9$; two border bins are also added, from $-\infty$ to $-3.9$ and from $3.9$ to $\infty$, so $m = k - r - 1 = 26 - 1 - 1 = 24$. The number of samples per signal is $u = 11000$; the critical value of $\chi^2$ is then 36.4 ($\log_{10}(\chi^2) = 1.56$) for a confidence value $\alpha = 0.05$; see Bronshtein and Semendyayev (1997) for further details.

Appendix D. Images and Attacks Modelling

D.1 Image Models

Experiments using the $\chi^2$ test, aiming at modelling the image sources with the two GGE families on randomly sampled square patches, show that about 13% of the 60 ICs can be modelled by a GGE distribution with $\nu = 2/3$ or $\nu = 1/2$, as seen in Fig. 16. In the case of MAP decoding, these ICs are therefore preferred, as explained in the text. Further research on more complex models may overcome the limitation represented by the restricted choice of ICs to watermark.

Figure 16: $\chi^2$ fitting test. Different GGE distributions ($\nu \in \{1/4, 1/3, 2/5, 1/2, 2/3, 1, 2\}$) are tested against real data. The latter are obtained from a set of 11,000 randomly sampled image patches of 16 × 16 pixels, demixed by an ICA demixing matrix W of 60 ICs. (Plot: $\log_{10}(\chi^2)$ for each of the 60 ICs, one curve per value of $\nu$; $\nu = 1$ is the Laplacian and $\nu = 2$ the Gaussian.)

D.2 Attacks Models

JPEG Compression: As shown in Fig. 17, JPEG compression with a high quality level (low compression), such as JPEG 90, is quite well described by a Gaussian distribution. When the quality level decreases, the Laplacian distribution model becomes more suitable. However, as for the source models, only a few ICs have their model validated, and none have their source and noise models validated at the same time. In order to improve the decoding performance, further research is required to refine the models.

Figure 17: $\chi^2$ fitting test for JPEG compression models. (Panels: JPEG 10, JPEG 20, JPEG 30, JPEG 50 and JPEG 90; each shows $\log_{10}(\chi^2)$ against the IC number, one curve per value of $\nu$.)
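The $\chi^2$ statistic of Eq. (17) that underlies these fitting tests can be sketched as below. This is an illustrative implementation only: the function name and arguments are hypothetical, the model CDF is passed in as a plain Python function, and two open border bins are added automatically around the supplied inner bin edges.

```python
import numpy as np


def chi2_statistic(samples, cdf, edges):
    """Chi-square disparity between samples and a model CDF.

    `edges` are the finite inner bin edges; open border bins
    (-inf, edges[0]) and [edges[-1], +inf) are added around them.
    """
    samples = np.asarray(samples, dtype=float)
    edges = np.asarray(edges, dtype=float)
    u = samples.size
    # Bin index of each sample; bin i is the half-open interval [a_i, b_i).
    index = np.searchsorted(edges, samples, side="right")
    counts = np.bincount(index, minlength=edges.size + 1)
    # Theoretical bin probabilities p_i = F(b_i) - F(a_i).
    cdf_values = np.array([cdf(e) for e in edges], dtype=float)
    probs = np.diff(np.concatenate(([0.0], cdf_values, [1.0])))
    expected = u * probs
    return float(np.sum((counts - expected) ** 2 / expected))


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1), used as a toy model."""
    return min(max(x, 0.0), 1.0)


inner_edges = [0.25, 0.5, 0.75]
good_fit = chi2_statistic([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9], uniform_cdf, inner_edges)
poor_fit = chi2_statistic([0.1, 0.1, 0.2, 0.2, 0.6, 0.7, 0.8, 0.9], uniform_cdf, inner_edges)
print(good_fit, poor_fit)
```

For the bin layout described in Appendix C, one would pass `np.linspace(-3.9, 3.9, 27)` as the inner edges and compare the resulting statistic with the critical value quoted there.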

SPIHT Compression: As shown in Fig. 18, SPIHT compression is best modelled by the Laplacian distribution. However, as for JPEG compression, the $\chi^2$ values also show the need for refinements. As previously, we will use the closest distribution, here the Laplacian distribution, as a first approximation for our experiments.

Figure 18: $\chi^2$ fitting test for SPIHT compression models. (Panels: SPIHT 0.4 bpp, 0.56 bpp, 0.8 bpp, 1.28 bpp and 2.8 bpp; each shows $\log_{10}(\chi^2)$ against the IC number, one curve per value of $\nu$.)

Gaussian Noise: ICA is a linear transform, and a linear combination of centered i.i.d. Gaussian variables of the same variance remains Gaussian. Therefore, Gaussian noise on images remains Gaussian in the ICA feature space, with its standard deviation being a function of the original standard deviation and the demixing matrix $W$. In our study, if $\sigma$ is the standard deviation of the noise attack and $\sigma_i$ is the standard deviation of the attack on a given IC, the latter can be expressed as $\sigma_i = \sigma \|(W^T)_i\|$, where $\|\cdot\|$ is the 2-norm and $(W^T)_i$ is the $i$th column of the matrix $W^T$.
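The relation $\sigma_i = \sigma \|(W^T)_i\|$ can be checked with a few lines of code. The function name and the small example matrix are hypothetical; the only fact used is that column $i$ of $W^T$ is row $i$ of $W$.

```python
import numpy as np


def ic_noise_std(W, sigma):
    """Standard deviation of i.i.d. Gaussian pixel noise seen on each IC.

    Row i of the demixing matrix W is column i of W^T, so the 2-norm is
    taken along each row of W.
    """
    return sigma * np.linalg.norm(np.asarray(W, dtype=float), axis=1)


W_example = np.array([[3.0, 4.0],
                      [1.0, 0.0]])  # toy 2 x 2 demixing matrix, for illustration only
per_ic_std = ic_noise_std(W_example, sigma=2.0)
print(per_ic_std)
```

The result follows from the variance of a weighted sum of independent variables: each IC is a weighted sum of the noisy pixels, so its noise variance is $\sigma^2$ times the sum of the squared weights.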

Appendix E. Settings

In this section, we specify the different experimental settings for each algorithm and each attack; $\delta$ denotes the quantization step of the QIM process.

ICA Sel/DCT/DCTX/DWT: For these four algorithms, the same settings (Table 1) are used for all attacks and attack strengths. A set of ICs is first selected, then a proper quantization step is set, such that the distortion constraint requirement is obeyed.

| | ICA Sel | DCT | DCTX | DWT |
|---|---|---|---|---|
| Selected ICs/Coef | {5, 55, 59} | low frequency with at least one cycle per 16 × 16 pixels patch | {3, 4, 18, 19, 20, 33, 34, 35, 49, 50}, see Fig. 19 | The coefficients are randomly selected among the 3rd level decomposition |
| $\delta$ (or Q for DWT) | 0.7 | 50 | 50 | 2 |

Table 1: ICA Sel/DCT/DCTX/DWT algorithm experimental settings.

Figure 19: DCTX algorithm coefficient selection. (Diagram: 16 × 16 coefficient grid showing the coefficient numbers 1-4, 17-20, 33-35 and 49-50.)

ICA Ne/Map: Table 2 summarizes the noise and source models chosen for the ICA Map algorithm. The ICs have been selected such that the $\chi^2$ value is minimal for both. This means that for any other IC, either the $\chi^2$ of the noise or source model is higher than both $\chi^2$ of the selected IC. Unfortunately, as shown in Table 2, all the presented models, except in the Gaussian noise attack case, are rejected. This might also explain the lack of improvement shown by the ICA Map algorithm in simulations for this particular attack.

| Attack | GN | JPEG | JPEG | JPEG | JPEG | JPEG | SPIHT | SPIHT | SPIHT | SPIHT | SPIHT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Strength | 10-30 | 10 | 20 | 30 | 50 | 90 | 0.4 | 0.56 | 0.80 | 1.28 | 2.8 |
| Selected ICs | 19 | 16 | 34 | 43 | 28 | 15 | 20 | 23 | 17 | 23 | 20 |
| $\delta$ | 1.2 | 2.3 | 3 | 1.2 | 2.2 | 1.9 | 2 | 2 | 2 | 2 | 2 |
| Source Model | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| $\chi^2$ (source model) | 34.0 | 25.4 | 68.3 | 92.5 | 49.4 | 22.9 | 34.5 | 43.0 | 26.7 | 43.0 | 34.5 |
| Noise Model | 7 | 6 | 6 | 7 | 7 | 7 | 6 | 6 | 6 | 6 | 6 |

$\chi^2$ (noise model), ten values listed in source order: 50.6, 57.1, 55.3, 78.5, 35.5, 50.7, 49.3, 36.6, 45.1, 98.8.

Table 2: ICA Ne/Map algorithm experimental settings, where GN stands for Gaussian Noise, and the model correspondences are: 5 for GGE(2/3), 6 for GGE(1) or Laplacian and 7 for GGE(2) or Gaussian.

ICA Fra: A set of ICs is randomly drawn from a pre-selected set and quantized using a quantization step $\delta_n$, where n is the number of quantized ICs. The pre-selected ICs are the ones within the range 11-40. The different quantization steps are given in Table 3, for a fixed induced distortion of 43 dB PSNR.

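| Number of quantized ICs - v | 1 | 2 | 3 | 4 | 5 | 6 | 7-8 |
|---|---|---|---|---|---|---|---|
| $\delta_v$ | 1.6 | 1.15 | 0.9 | 0.8 | 0.7 | 0.65 | 0.55 |

| Number of quantized ICs - v | 9 | 10-11 | 12-13 | 14-18 | 19-24 | 25-30 |
|---|---|---|---|---|---|---|
| $\delta_v$ | 0.5 | 0.45 | 0.4 | 0.35 | 0.3 | 0.25 |

Table 3: ICA Fra algorithm quantization step.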

Appendix F. Numerical Results

Robust watermarking results:

| STD | ICA Sel | ICA Ne | ICA Map | DCT | DCTX | DWT |
|---|---|---|---|---|---|---|
| 10 | 0.0206 | 0.0118 | 0.0098 | 0.0125 | 0.0123 | 0.3308 |
| 15 | 0.1064 | 0.0929 | 0.0750 | 0.0955 | 0.0957 | 0.3919 |
| 20 | 0.2165 | 0.2079 | 0.1641 | 0.2112 | 0.2110 | 0.4277 |
| 25 | 0.3130 | 0.3107 | 0.2414 | 0.3149 | 0.3150 | 0.4506 |
| 30 | 0.3832 | 0.3895 | 0.3009 | 0.3919 | 0.3924 | 0.4653 |
| MSE | 3.3714 | 3.1249 | 3.1249 | 2.9285 | 3.1556 | 2.1830 |
| PSNR | 42.5588 | 42.8884 | 42.8884 | 43.1656 | 42.8466 | 45.6137 |

Table 4: Gaussian noise attack on watermarking schemes.

| QL | ICA Sel | ICA Ne | ICA Map | DCT | DCTX | DWT |
|---|---|---|---|---|---|---|
| 10 | 0.2594 | 0.3785 | 0.5548 | 0.3373 | 0.2979 | 0.4125 |
| 20 | 0.0769 | 0.2593 | 0.2708 | 0.1705 | 0.0611 | 0.3157 |
| 30 | 0.0130 | 0.0072 | 0.0071 | 0.1306 | 0.0052 | 0.2559 |
| 50 | 0.0001 | 0.0007 | 0.0007 | 0.0909 | 0.0000 | 0.1947 |
| 90 | 0 | 0 | 0 | 0.0000 | 0 | 0.1136 |
| MSE | 3.3680 | 3.2757 | 3.2757 | 2.9326 | 2.0303 | 2.1941 |
| PSNR | 42.5633 | 42.7187 | 42.7187 | 43.1596 | 44.7611 | 45.6032 |

Table 5: JPEG attack on watermarking schemes.

| Rate | ICA Sel | ICA Ne | ICA Map | DCT | DCTX | DWT |
|---|---|---|---|---|---|---|
| 0.40 | 0.1812 | 0.1613 | 0.2138 | 0.1942 | 0.1558 | 0.3725 |
| 0.56 | 0.1065 | 0.0823 | 0.1424 | 0.1362 | 0.0738 | 0.3106 |
| 0.80 | 0.0350 | 0.0298 | 0.0931 | 0.0541 | 0.0210 | 0.2477 |
| 1.28 | 0.0049 | 0.0018 | 0.0381 | 0.0080 | 0.0029 | 0.1787 |
| 2.80 | 0 | 0 | 0.0026 | 0 | 0 | 0.1180 |
| MSE | 3.3702 | 3.4640 | 3.4640 | 2.9284 | 2.0279 | 2.1836 |
| PSNR | 42.5602 | 42.4620 | 42.4620 | 43.1658 | 44.7665 | 45.6141 |

Table 6: SPIHT attack on watermarking schemes.

References

I.N. Bronshtein and K.A. Semendyayev. Handbook of Mathematics. Springer-Verlag, 1997.

B. Chen and G.W. Wornell. Quantization index modulation: a class of provably good methods for digital watermarking and information embedding. IEEE Transactions on Information Theory, 47(4):1423-1443, 2001.

A.S. Cohen and A. Lapidoth. The gaussian watermarking game. IEEE Transactions on Information Theory, 48(6):1639-1667, 2002.

Final Committee. JPEG2000 part 1 draft version 1.0. Tech. rep. FCD15444-1, ISO/IEC, March 2000.

I.J. Cox, J. Kilian, T. Leighton, and T. Shamoon. Secure spread spectrum watermarking for multimedia. IEEE Transactions on Image Processing, 6(12):1673-1687, 1997.

I.J. Cox, M.L. Miller, and J.A. Bloom. Digital Watermarking. Morgan Kaufmann Publishers, 2002.

S. Craver. Can invisible watermarks resolve rightful ownership? Technical Report RC 20509, IBM Research Report, July 1996.

F.J. Gonzalez-Serrano, H.Y. Molina-Bulla, and J.J. Murillo-Fuentes. Independent component analysis applied to digital watermarking. In International Conference on Acoustic, Speech and Signal Processing (ICASSP), volume 3, pages 1997-2000, 2001.

A. Hyvarinen, J. Karhunen, and E. Oja. Independent Component Analysis. Wiley-Interscience, 2001.

A. Hyvarinen and E. Oja. A fast fixed-point algorithm for independent component analysis. Neural Computation, 9(7):1483-1492, 1997.

A.K. Jain. Fundamentals of Digital Image Processing. Pearson Higher Education, 1989.

E. Koch and J. Zhao. Towards robust and hidden image copyright labelling. IEEE Workshop on Nonlinear Signal and Image Processing, pages 452-455, October 1995.

D. Kundur and D. Hatzinakos. Digital watermarking using multiresolution wavelet decomposition. In International Conference on Acoustic, Speech and Signal Processing (ICASSP), volume 5, pages 2969-2972, 1998.

P. Meerwald. Digital Image Watermarking in the Wavelet Transform Domain. PhD thesis, University Salzburg, January 2001.

P. Moulin and J.A. O'Sullivan. Information-theoretic analysis of information hiding. IEEE Transactions on Information Theory, 49(3):563-593, March 2003.

F.A.P. Petitcolas and M.G. Kuhn. Stirmark 4.0. Available from http://www.cl.cam.ac.uk/~fapp2/watermarking/stirmark/, 2002.

M. Ramkumar. Data Hiding in Multimedia - Theory and Applications. PhD thesis, New Jersey Institute of Technology, January 2000.

J.J.K. O Ruanaidh and T. Pun. Rotation, scale and translation invariant digital image watermarking. In Proceedings of the International Conference on Image Processing, pages 536-539, October 1997.

A. Said and W.A. Pearlman. A new fast and efficient image codec based on set partitioning in hierarchical trees. IEEE Transactions on Circuits and Systems for Video Technology, 6:243-250, June 1996.

B. Toch, D. Lowe, and D. Saad. Watermarking of audio signals using independent component analysis. In International conference on WEB delivering of music, 2003.

G.K. Wallace. The JPEG still picture compression standard. IEEE Transactions on Consumer Electronics, 38(1):18-34, February 1992.