Multiplicative watermarking of audio in DFT magnitude
Multimed Tools Appl (2014) 71
DOI /s y

Jyotsna Singh, Parul Garg, Aloknath De

Published online: 21 November 2012
© Springer Science+Business Media New York 2012

J. Singh (B), P. Garg: Division of Electronics and Communication Engineering, Netaji Subhas Institute of Technology, Sector 3, Dwarka, New Delhi, India. e-mail: jsingh.nsit@gmail.com (J. Singh), parul_saini@yahoo.co.in (P. Garg)
A. De: Samsung India Software Operations, No. 66/1, Bagmane Lakeview, Bagmane Tech Park, C.V. Raman Nagar, Byrasandra, Bangalore, India. e-mail: aloknath.de@samsung.com

Abstract In this paper, a watermark is multiplicatively embedded in the discrete Fourier transform magnitude of an audio signal using a spread-spectrum based technique. A new perceptual model for the magnitude of discrete Fourier transform coefficients is developed, which finds the regions of highest watermark embedding capacity with least perceptual distortion. A theoretical evaluation of detector performance using a correlation detector and a likelihood ratio detector is undertaken under the assumption that the host feature follows a Weibull distribution. Experimental results are also presented to show the performance of the proposed scheme under various attacks such as the presence of multiple watermarks, additive white Gaussian noise and audio compression.

Keywords Audio, Correlation detector, Discrete Fourier transform, Log-likelihood detection, Watermarking

1 Introduction

Various watermark embedding techniques have been proposed which embed a watermark additively or multiplicatively in an audio signal using the imperfections of the human auditory system (HAS). These techniques exploit the fact that the HAS is
insensitive to small amplitude changes, either in the time [3, 4, 21, 25] or frequency [6, 7, 11, 15–18] domains. Boney et al. [4] generated the watermark by filtering a PN-sequence with a filter approximating the frequency masking characteristics of the HAS [21]. This filtered watermark was then weighted in the time domain to account for temporal masking. Swanson et al. [25] proposed an audio-dependent watermarking procedure which directly exploited temporal and frequency masking properties to guarantee that the embedded watermark is inaudible and robust. The watermark is shaped using a masking curve computed on the original signal; this masking curve is obtained by psychoacoustic modeling of the host audio signal. Bassia et al. [3] presented an audio watermarking algorithm that adds a perceptually shaped spread-spectrum (SS) sequence in the time domain.

In the other category, the watermark is embedded in the frequency domain. Cox et al. [6] suggested that a watermark should be constructed as an independent and identically distributed (i.i.d.) Gaussian random vector that can be imperceptibly inserted in the perceptually most significant spectral components of the data. Garcia [11], Lee and Ho [16] and Kirovski and Malvar [15] exploited a psychoacoustic auditory model to shape the watermark and embed it into the Short-Time Fourier Transform (STFT), cepstral and Modulated Complex Lapped Transform (MCLT) coefficients of an audio signal, respectively. These schemes used blind detection techniques. Another technique [7] proposed watermark embedding and detection based on the frequency hopping method in the spectral domain. The scheme of Megías et al. [18] uses MPEG 1 Layer 3 compression to determine the position of the mark bits in the frequency domain. The scheme introduces some randomness in the embedding locations by introducing a secret key in the embedding and detection processes.
The secret key includes the seed of a pseudo-random number generator which is used to compute the exact marking positions. The scheme of Megías is non-blind, that is, the spectrum of the original signal is needed to detect the embedded watermark bits. An audio watermarking scheme based on the frequency-selective spread spectrum (FSSS) technique in combination with subband decomposition of the audio signal was presented by Malik et al. [17].

Fujimoto et al. [10], Garcia-Hernandez et al. [12], Fallahpour and Megias [9] and Megías et al. [19] developed high bit-rate audio watermarking techniques with robustness against common attacks and good transparency. The algorithms developed by Fujimoto et al. [10] and Fallahpour and Megias [9] are based on spline interpolation, a technique for constructing new data points within the range of a set of discrete data. These techniques are often designed to provide simplicity of implementation and good perceptual quality from known sample values. Fujimoto et al. [10] proposed a time-domain algorithm in which the original audio signal is divided into distinct frames and a secret bit is then embedded in each frame using spline interpolation. The algorithm proposed in [9] embeds watermark bits based on the spline interpolation of data derived from the FFT. The watermark bits are embedded by manipulating the spline-interpolated magnitudes of the even bins, derived from the magnitudes of the odd bins. The computational efficiency of this algorithm is high because the interpolation technique is simple. Its disadvantage is that the embedded watermark bits are easily removed because the embedding position is known. Another problem with the scheme [9] is that its embedding rule is based on comparing the magnitudes of the components, which makes it vulnerable to certain attacks that could distort the
magnitudes. Garcia-Hernandez et al. [12] developed a watermarking technique based on rational dither modulation and achieved an embedding capacity of 689 bps. A self-synchronized algorithm was introduced by Megías et al. [19] with an embedding rate of bps. However, these two techniques are not robust to MP3 compression at 64 kbps.

The embedding techniques in [6, 7, 11, 15–18] exploit the psychoacoustic characteristics of the HAS while embedding the watermark additively or multiplicatively in the spectral domain. These techniques rely on the fact that the HAS is insensitive to small amplitude changes in the spectral domain. By contrast, a phase discontinuity in an audio signal causes perceptible distortion when the phase relation between the frequency components of the signal is changed. Hence the discrete Fourier transform (DFT) magnitude is a better option for inserting the watermark. However, no perceptual model has been defined in the literature for the DFT magnitude that can decide the location and strength of the watermark to be embedded in the audio spectrum. The existing techniques also have two major drawbacks. First, the psychoacoustic modeling they use requires rigorous, complex computations. Second, their watermark embedding capacity is low, i.e. there is not much space to accommodate the watermark in the host feature within the defined perceptual limits. To overcome these two problems, a new method of evaluating the masking threshold for the DFT magnitude is proposed, which requires fewer computations than traditional thresholds based on psychoacoustic models. The technique finds the best possible locations in the spectrum for watermark embedding and the corresponding scaling factor.

In this paper, we present a blind, robust watermarking system based on pseudo-random signals embedded in the magnitude of the DFT coefficients of an audio signal. The scheme obviates the use of complex HAS calculations.
Also, it allows us to build a model which can decide the location and strength of the watermark in the DFT spectrum.

The paper is organized as follows. The watermarking system model is presented in Section 3. In the next section, the signal model is presented and the distribution of DFT magnitude coefficients is shown. Then, in Section 3.4, the construction of the optimal detector is described. In Sections 4 and 5, the experimental results and the conclusions are presented.

2 Discrete Fourier transform (DFT)

The DFT is used to calculate the spectrum of a waveform in terms of a set of harmonically related sinusoids, each with a particular amplitude and phase. This transform is the one most commonly used in audio signal processing, because the Fast Fourier Transform (FFT) algorithm computes it quickly. It is also an important representation of audio data because human hearing is based on a kind of real-time spectrogram encoded by the cochlea of the inner ear. A spectrogram is a sequence of FFTs of windowed audio segments. The angular frequencies of these sinusoids are $\omega_k = k\omega$, where $k$ is an integer varying from $0$ to $N-1$ and $\omega = 2\pi f_s/N = 2\pi/(NT)$. Here $f_s = 1/T$ denotes the sampling frequency of the discrete-time signal $\mathbf{s}$, given as

$$\mathbf{s} = [s(0), s(T), \ldots, s((N-1)T)] \tag{1}$$
For convenience, $s(nT)$ is often written as $s(n)$ in the literature. The $k$th component of the DFT, $S(k)$, of the signal $s(n)$ is given as

$$S(k) = \sum_{n=0}^{N-1} s(n)\, e^{-j2\pi kn/N} \tag{2}$$

The samples of the discrete-time signal $s(n)$ are recovered using the inverse discrete Fourier transform of $S(k)$ as

$$s(n) = \frac{1}{N} \sum_{k=0}^{N-1} S(k)\, e^{j2\pi nk/N} \tag{3}$$

3 Description of watermarking model

A watermarking system encompasses three major functionalities: watermark generation, watermark embedding, and watermark detection. The aim of watermark generation is to construct a sequence $W$ using an appropriate function $f$. Hence the watermark vector $W = [W(0), W(1), \ldots, W(N-1)]$, with $W(i) \in \mathbb{R}$, where $\mathbb{R}$ is the set of real numbers, is given as

$$W = f(K, N) \tag{4}$$

where $K$ is the watermark key and $N$ is the length of the watermark. The watermarked feature $F'$ is obtained by multiplicatively embedding the watermark $W$ in the host feature $F$:

$$F' = F(1 + aW) \tag{5}$$

where $F' = [F'(0), F'(1), \ldots, F'(N-1)]$ and $a$ is the scaling factor, lying between 0 and 1. The scaling factor is introduced to keep the distortion caused to the host signal by watermarking imperceptible. The watermark detector examines whether the signal under test, $F_t$, contains a watermark $W$ or not, under a binary-decision hypothesis test framework. Each module is now discussed in detail in the following subsections.

3.1 Watermark generation

The steps required to generate the watermark are as follows.

To construct the watermark $W$, a white pseudo-random (PN) sequence or chip $W_0$ is generated such that $W_0 = [W_0(0), W_0(1), \ldots, W_0(N_w - 1)]$, where $W_0(i) \in (-1, 1)$. The sequence is generated using a secret key $K$ such that its elements are mutually independent with respect to the host signal.

The magnitude nature of the host feature needs to be preserved, implying that $F'$ given in (5) should always be greater than zero. This condition is obtained when the products $aW(i)$, $0 \le i \le N-1$, take values in the finite interval $[-1, 1]$, keeping the scaling factor $a \le 1$.
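The keyed chip generation and the multiplicative rule (5) can be illustrated with a minimal sketch. It assumes a bipolar chip with values in {-1, +1} and uses Python's `random.Random` as the keyed generator; the function names, the key value and the toy host magnitudes are hypothetical and are not taken from the paper.

```python
import random

def generate_chip(key, n_w):
    """Keyed PN chip W0 of length n_w with values in {-1, +1} (assumed alphabet)."""
    rng = random.Random(key)  # the secret key K seeds the generator
    return [rng.choice((-1, 1)) for _ in range(n_w)]

def embed_multiplicative(host, watermark, a):
    """Rule (5): F'(i) = F(i) * (1 + a * W(i))."""
    return [f * (1 + a * w) for f, w in zip(host, watermark)]

chip = generate_chip(key=1234, n_w=16)
host = [0.5 + 0.1 * i for i in range(16)]  # toy positive magnitudes
marked = embed_multiplicative(host, chip, a=0.1)
```

With |a W(i)| < 1 every factor 1 + a W(i) is positive, so the watermarked values remain valid magnitudes, and the same key always reproduces the same chip at the detector.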
The N-point DFT region hosting the watermark is usually split into a number of subregions, which in our case are the critical bands. The start location ($m$) and end location ($n$) of the watermark embedded in these critical bands are decided
by a pre-defined masking threshold. Hence the length of the watermark $N_w$ is evaluated as

$$N_w = (n - m)N \tag{6}$$

To maintain the symmetry of the DFT magnitude, a reflected version $W_0'$ of $W_0$ has to be generated as

$$W_0'(i) = W_0(N_w - i - 1), \quad 0 \le i \le N_w - 1 \tag{7}$$

The reflected chip $W_0'$ is embedded in the frequency components around coefficient $N - 1$. This is essential to obtain real-valued audio in the time domain.

3.2 Masking threshold for DFT magnitude

In this paper, the magnitudes of the DFT coefficients of the host audio signal are modified by adding the watermark such that the modified spectrum is always below a predefined masking threshold, termed the maximum amplitude spread (MAS). The MAS is defined as the maximum of all amplitude spreads (AS) of DFT components at a particular frequency location within a frame. The following steps are involved in finding the MAS.

Step I Finding amplitude spread (AS)

The AS of the DFT components is evaluated from the energy spreading function given by Schroeder et al. [22], and its effect is seen at all $N$ frequency locations of a frame. Schroeder presented a real, nonnegative energy spreading function which approximates the basilar spreading as a triangular spreading function:

$$SF_{dB}(i, j) = 15.81 + 7.5(\Delta z + 0.474) - 17.5\sqrt{1 + (\Delta z + 0.474)^2} \tag{8}$$

Here $SF_{dB}(i, j)$ is the energy spread in decibels (dB) from the $i$th to the $j$th frequency location. The Bark separation between these two points is $\Delta z = z_j - z_i$, where $z_i$ and $z_j$ denote the Bark frequencies of the $i$th and $j$th frequency locations, respectively.

Let the audio signal $\mathbf{s}$, given by (1), be sampled at frequency $f_s$ hertz (Hz). Since audio is a real-valued signal, its DFT satisfies the symmetry property $|S(k)| = |S(N - k)|$, where $k = 1, \ldots, N/2 - 1$. The DFT coefficients $S(k)$ correspond to frequencies $f_k$ given as

$$f_k = f_s k / N \tag{9}$$

where $0 \le k \le N - 1$, $N$ being a power of 2. Considering the duplication in the spectrum for $k \ge N/2$, we evaluate the masking spread $A_1(i, j)$ for the amplitude of $N/2$ components only, given as

$$A_1(i, j) = \sqrt{SF(i, j)}, \quad 0 \le i \le N/2 - 1 \tag{10}$$
where $SF(i, j)$ is the inverse decibel of $SF_{dB}(i, j)$, i.e. $SF(i, j) = 10^{SF_{dB}(i, j)/10}$. The square root converts the masking spread from the energy scale to the amplitude scale. Respecting the symmetry property of the DFT components, we define $A(i, j)$ as

$$A(i, j) = \begin{cases} A_1(i, j), & 0 \le j \le N/2 \\ A_1(i, N - j), & N/2 + 1 \le j \le N - 1 \end{cases} \tag{11}$$

The amplitude spread of the $i$th DFT component is then defined as

$$A'(i, j) = A(i, j)\,|S(i)|, \quad 0 \le i \le N/2 - 1, \; 0 \le j \le N - 1 \tag{12}$$

where $S(i)$ is given by (2). This gives an $N/2 \times N$ matrix showing the amplitude spread of each of the $N/2$ DFT components at the $N$ frequency locations. Figure 1 shows a plot of the amplitude spread $A'(i, j)$ of the $i = 17$th and 20th frequency components at all the frequency locations $f_j$, $0 \le j \le N - 1$, given in (9), where $N = 512$ and $f_s = 44.1$ kHz.

Step II Evaluation of maximum amplitude spread (MAS)

The amplitude spreads of neighboring DFT components overlap each other. The maximum amplitude spread (MAS) is the maximum of all the overlapping amplitude spreads at frequency $f_i$ due to the DFT coefficients $S(j)$, $0 \le j \le N/2 - 1$ and $j \ne i$. The MAS, $Y(i)$, at location $i$ can therefore be evaluated as

$$Y(i) = \max(A'(i, j)) \quad \text{for } 0 \le j \le N/2 - 1 \tag{13}$$

Fig. 1 Overlapping of amplitude spread of 17th and 20th DFT components and magnitude of 19th DFT component (axes: energy spread versus frequency in Hz)
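Steps I and II can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Hz-to-Bark mapping is Zwicker's approximation (the paper does not say which mapping it uses), only the lower N/2 bins and locations are evaluated, and Y(i) is computed, following the prose definition, as the largest spread arriving at location i from the other components. All function names are hypothetical.

```python
import math

def bark(f_hz):
    """Hz to Bark (Zwicker's approximation; an assumed choice of mapping)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def spread_db(dz):
    """Schroeder spreading function (8) in dB for a Bark separation dz."""
    x = dz + 0.474
    return 15.81 + 7.5 * x - 17.5 * math.sqrt(1.0 + x * x)

def amplitude_spread(mag_half, fs):
    """A'(i, j) of (10) and (12) for the lower N/2 bins and locations.

    mag_half holds |S(i)| for 0 <= i <= N/2 - 1.
    """
    half = len(mag_half)
    n = 2 * half
    z = [bark(fs * k / n) for k in range(half)]  # Bark value of each bin, from (9)
    return [
        [math.sqrt(10.0 ** (spread_db(z[j] - z[i]) / 10.0)) * mag_half[i]
         for j in range(half)]
        for i in range(half)
    ]

def mas(spread):
    """Y(i): maximum spread reaching location i from the components j != i."""
    half = len(spread)
    return [max(spread[j][i] for j in range(half) if j != i) for i in range(half)]

# Toy frame: a single strong component in bin 2 (N = 16, fs = 8 kHz).
mag = [0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0]
Y = mas(amplitude_spread(mag, fs=8000.0))
```

In this toy frame the threshold decays on both sides of bin 2 and is zero at bin 2 itself, because no other component is present to mask it.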
The maximum amplitude spread for a critical band $z$ is the minimum of all $Y(i)$ in that critical band. From (13), we evaluate the maximum amplitude spread $Y(z)$ for critical bands $z = 1, 2, \ldots, z_t$ as

$$Y(z) = \min(Y(i)) \quad \text{for } LB_z \le i \le HB_z \tag{14}$$

where $LB_z$ and $HB_z$ are the lower and upper frequency components of the $z$th critical band. Figure 2 plots the maximum amplitude spread $Y(i)$ and the magnitude of the DFT coefficients $F(i)$ at all the frequency locations $f_i$ for $i = 0, 1, \ldots, N - 1$.

Fig. 2 Maximum amplitude spread of DFT magnitude for a given audio of frame length N = 512 (maximum amplitude spread, dot-dash, and DFT magnitude, dots, versus frequency in Hz)

3.3 Watermark embedding

In watermark embedding, the watermark $W$ is added to the host signal $F$ in a way that does not disturb the symmetry of $F$. The DC component and the Nyquist component of the DFT spectrum should also remain unchanged. This is essential in order to retrieve a real-valued audio signal after the watermarking process. The magnitudes of the DFT coefficients of the host audio signal are modified by multiplicative watermarking such that the modified spectrum is always below the maximum amplitude spread of the original signal. Hence, the DFT magnitudes are modified only in certain critical bands to maintain the transparency of the audio signal. The embedding steps are described as follows.

The magnitude $F(k) = |S(k)|$ and phase $\phi(k) = \angle S(k)$ of the spectral coefficients are evaluated for $k = 0, 1, \ldots, N - 1$, where $S(k)$ is given by (2).

The distribution of the magnitude of the DFT coefficients per critical band, $F_z(k)$ for $LB_z \le k \le HB_z$, is found by translating frequency into the Bark scale. Here
$z = 1, 2, \ldots, z_t$ are the critical bands, $z_t$ is the total number of critical bands, and $LB_z$ and $HB_z$ are the respective lower and higher frequencies in critical band $z$.

The watermark is embedded in the critical bands in which the magnitude of the DFT coefficients is less than the defined masking threshold, $Y(z)$. The final watermark is now generated as

$$W(k) = \begin{cases} W_0(i), & mN \le k \le nN \\ W_0'(i), & (1 - n)N \le k \le (1 - m)N \\ 0, & \text{otherwise} \end{cases} \tag{15}$$

where $0 \le i < N_w$, $0 \le k \le N - 1$ and $0 < m < n < 0.5$ to maintain the symmetry of the final watermark.

Once the embedding location is decided, the watermark scaling factor $a$ has to be calculated for each critical band to ensure inaudibility of the embedded watermark. The scale factor $a_z$ of the $z$th critical band is obtained by dividing the masking threshold $Y(z)$ by the maximum magnitude of the DFT coefficients in that critical band:

$$a_z = A\,\frac{Y(z)}{\max(F(k))}, \quad z = 1, 2, \ldots, z_t \tag{16}$$

Here $A$ is the gain factor that controls the overall magnitude of the watermarked signal $F'(k)$ given in (5). The value of $A$ varies from 0 to 1. The scaling factor $a_z$ decides how much the amplitude of the watermark is to be suppressed in the selected critical band before adding it to the spectrum of the host signal.

The scaled watermark is now added according to the rule

$$F'(k) = \begin{cases} F(k), & F(k) \ge Y(z) \\ F(k)\,(1 + a_z W(k)), & F(k) < Y(z) \end{cases} \tag{17}$$

where $0 \le k \le N - 1$.

The modified magnitudes $F'(k)$ of the DFT coefficients are now combined with their corresponding phases $\phi(k)$ to get the watermarked DFT coefficients $S'(k)$. The corresponding time-domain watermarked signal $s'(n)$ is obtained by calculating the inverse discrete Fourier transform (IDFT) of $S'(k)$, given by (3).

3.4 Optimal watermark detection

The aim of watermark detection is to verify whether or not the given watermark $W_d$ at the receiver end resides in the test signal $F_t$. The detection is blind, i.e. the secret key is the only information that the detector has at the receiver end.
The detector uses salient points for synchronizing the embedded information, so the audio can be analyzed for salient point extraction. Watermark detection can be considered as a binary hypothesis test, solved by means of a correlation detector [13] or a log-likelihood ratio detector [2, 5]. However, a few assumptions are made before performing the detector tests.
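The embedding steps of Section 3.3 can be sketched for a single frame as follows. This is a simplified illustration under stated assumptions rather than the authors' procedure: a single scaling factor `a` stands in for the per-band factors of (16), the optional `threshold` argument stands in for the band threshold Y(z) of (17), the chip is bipolar, and a plain O(N²) DFT from the standard library is used. The names `dft`, `idft` and `embed_frame` are hypothetical.

```python
import cmath
import math
import random

def dft(x):
    """Direct evaluation of (2)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Direct evaluation of (3)."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def embed_frame(frame, key, lo, hi, a, threshold=float("inf")):
    """Embed a keyed chip in bins lo..hi-1 and mirror it into bins N-k.

    Requires 0 < lo < hi <= N/2 so the DC and Nyquist bins stay untouched.
    Returns the watermarked frame and its largest residual imaginary part.
    """
    n = len(frame)
    spec = dft(frame)
    rng = random.Random(key)
    w = [0.0] * n
    for k in range(lo, hi):
        w[k] = rng.choice((-1.0, 1.0))
        w[n - k] = w[k]  # mirrored bin keeps |S'(k)| = |S'(N - k)|
    marked = []
    for k, c in enumerate(spec):
        mag = abs(c)
        if mag < threshold:  # rule (17): modify only below the threshold
            mag *= 1.0 + a * w[k]
        marked.append(cmath.rect(mag, cmath.phase(c)))  # keep the original phase
    time = idft(marked)
    return [v.real for v in time], max(abs(v.imag) for v in time)

frame = [math.sin(2 * math.pi * 3 * t / 32) + 0.5 * math.cos(2 * math.pi * 5 * t / 32)
         for t in range(32)]
watermarked, imag_leak = embed_frame(frame, key=7, lo=2, hi=8, a=0.1)
```

Because the chip is mirrored and the phases are left alone, the imaginary part of the reconstructed frame is at rounding-error level, which is the property the symmetric watermark in (15) is designed to guarantee.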
Assumptions about the host signal and watermark

- It is assumed that the magnitude of the DFT coefficients of a speech signal follows a Weibull distribution.
- The host signal $F$ and the watermark $W$ are independent and identically distributed (i.i.d.) random variables, hence the detector is optimum.
- The DFT magnitude is a wide-sense stationary process.
- For a large number of samples, the likelihood ratio and the correlation coefficient attain a Gaussian distribution due to the central limit theorem.

Likelihood ratio detector

The watermarked signal $F'$ given in (5) may undergo various signal processing or noisy channel attacks before reaching the receiver end. The received signal $F_t$ is then used for watermark detection by means of the log-likelihood ratio test. The best suited distribution for the magnitude of the DFT coefficients $F = [f(1), \ldots, f(N)]$ is the two-parameter Weibull distribution [27], which is defined on the positive real axis only. The probability density function (pdf) of the Weibull distribution is defined as

$$p_F(f) = \frac{\beta}{\alpha}\left(\frac{f(i)}{\alpha}\right)^{\beta - 1} \exp\left[-\left(\frac{f(i)}{\alpha}\right)^{\beta}\right] \tag{18}$$

where $f(i) > 0$ for $i = 1, 2, \ldots, N$. The scale parameter ($\alpha$) and the shape parameter ($\beta$) are positive real-valued parameters which control the mean, variance and shape of the distribution. The mean $\mu_f$ and variance $\sigma_f^2$ can be written in terms of the scale and shape parameters as

$$\mu_f = \alpha\,\Gamma\!\left(1 + \frac{1}{\beta}\right), \qquad \sigma_f^2 = \alpha^2\,\Gamma\!\left(1 + \frac{2}{\beta}\right) - \mu_f^2 \tag{19}$$

where the gamma function $\Gamma(x)$ is defined as

$$\Gamma(x) = \int_0^{\infty} t^{x - 1} \exp(-t)\,dt \tag{20}$$

Once the underlying distribution has been chosen, the next step is the estimation of the parameters that govern the characteristics of the selected probability function. The parameter estimation problem consists of finding the underlying distribution parameters by observing samples of the random variable, as described in [26].
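The moment expressions in (19) can be checked numerically with a short sketch. The parameter values below are illustrative only and are not the paper's estimates; `random.weibullvariate(alpha, beta)` takes the scale first and the shape second.

```python
import math
import random

def weibull_mean_var(alpha, beta):
    """Mean and variance of a Weibull(alpha, beta) variable, as in (19)."""
    mean = alpha * math.gamma(1.0 + 1.0 / beta)
    var = alpha ** 2 * math.gamma(1.0 + 2.0 / beta) - mean ** 2
    return mean, var

alpha, beta = 2.0, 1.5  # illustrative scale and shape
mean, var = weibull_mean_var(alpha, beta)

# Compare with the sample moments of synthetic Weibull data.
rng = random.Random(0)
samples = [rng.weibullvariate(alpha, beta) for _ in range(50_000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)
```

For beta = 1 the Weibull law reduces to an exponential one, so the formulas give a mean of alpha and a variance of alpha squared, which is a quick sanity check on (19).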
Given $N$ sample values $[f(1), \ldots, f(N)]$ from the random variable $F$, which can be modeled by a two-parameter Weibull distribution with the pdf given by (18), the maximum likelihood estimators $\hat{\alpha}$ and $\hat{\beta}$ of $\alpha$ and $\beta$, respectively [24], are known to satisfy the equations

$$\hat{\alpha} = \left(\frac{1}{N} \sum_{i=1}^{N} (f(i))^{\hat{\beta}}\right)^{1/\hat{\beta}} \tag{21}$$
and

$$\hat{\beta} = \left[\left(\sum_{i=1}^{N} (f(i))^{\hat{\beta}} \log f(i)\right)\left(\sum_{i=1}^{N} (f(i))^{\hat{\beta}}\right)^{-1} - \frac{1}{N}\sum_{i=1}^{N} \log f(i)\right]^{-1} \tag{22}$$

The value of $\hat{\beta}$ has to be obtained from (22) by a standard iterative procedure (e.g. the Newton–Raphson method) and then used in (21) to obtain $\hat{\alpha}$.

Although the optimum decoder structure requires knowledge of the distribution underlying the magnitude of the non-watermarked coefficients $f_i$, this information is not present at the decoder side. Since decoding is done without resorting to the original audio, the decoder has no access to the original coefficients. Hence the distribution of the non-watermarked coefficients $f_i$ needs to be approximated by the distribution of the watermarked coefficients $f_i'$. As long as the embedding strength, and thus the watermark power, is kept small, the difference between the two distributions will be negligible. The values of α and β obtained from maximum likelihood estimator are and respectively.

Having identified a suitable model for the host feature, we now find the likelihood ratio, as given in [1]. The performance of a log-likelihood based technique can be measured in terms of the probability of false alarm $P_f$ and the probability of misdetection $P_m$. The plot of $P_f$ versus $P_m$ is called the Receiver Operating Characteristic (ROC) curve of the corresponding watermarking system. This curve conveys all the information required to judge the detection performance of such a system.

Correlation detector

The correlation detector, which is the Maximum Likelihood (ML) optimal detector, is applied to additive or multiplicative watermarking systems. These detectors give optimal results when a Gaussian distribution is considered for the host signals.
The correlation detection can be performed by computing the correlation $c$ between the pseudo-random sequence $W$ and the watermarked signal $F_t$, in the time or frequency domain, given as

$$c = F_t W = [F(1 + aW)]W = FW + aFWW \tag{23}$$

The correlation is compared to a predefined threshold to determine whether the watermark is present in the signal or not. The most popular pseudo-random sequence is the maximum length sequence (also known as M-sequence) [8]. The received signal $F_t$ is used for watermark detection by means of the correlation test.

4 Experimental results

To generate experimental results, a total of 10 standard audio test sequences are taken, which are listed in Table 1. These test sequences are adopted to analyze the performance of the proposed watermarking algorithm. Each signal was sampled at 44.1 kHz, represented by 16 bits per sample, and 8 s in length. The DFT magnitude of audio signal was assumed to follow Weibull distribution and the value of the parameters was evaluated using maximum likelihood method as shape parameter, β = and scale parameter, α =
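The maximum likelihood fit of (21) and (22) can be sketched as follows. One substitution is made plainly: the root of (22) is located by bisection instead of the Newton–Raphson iteration mentioned in the text, because bisection needs no derivative and no starting guess. The search bracket, the tolerance and the synthetic test data are assumptions of this sketch, and `weibull_mle` is a hypothetical name.

```python
import math
import random

def weibull_mle(samples, lo=1e-3, hi=50.0, tol=1e-10):
    """Return (alpha_hat, beta_hat) satisfying (21) and (22).

    Solves g(beta) = 1/beta + mean(log f) - sum(f^beta log f) / sum(f^beta) = 0,
    which is (22) rearranged; g decreases with beta, so bisection applies.
    """
    logs = [math.log(f) for f in samples]
    mean_log = sum(logs) / len(logs)

    def g(beta):
        powers = [f ** beta for f in samples]
        weighted = sum(p * l for p, l in zip(powers, logs))
        return 1.0 / beta + mean_log - weighted / sum(powers)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid  # root lies above mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    alpha = (sum(f ** beta for f in samples) / len(samples)) ** (1.0 / beta)  # (21)
    return alpha, beta

# Synthetic magnitudes with known parameters (illustrative values only).
rng = random.Random(1)
data = [rng.weibullvariate(2.0, 1.5) for _ in range(20_000)]
alpha_hat, beta_hat = weibull_mle(data)
```

On synthetic data drawn with scale 2.0 and shape 1.5 the estimates land close to the true values, which is a convenient way to validate the estimator before applying it to measured DFT magnitudes.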
Table 1 Audio test sequences (44.1 kHz, 16 bit)

TS no.  Audio             TS no.  Audio
1       Drums             6       Clarinet
2       Flute             7       Waltz
3       Speech (mono)     8       Jazz
4       Speech (stereo)   9       Synth
5       Violin            10      Haffner

4.1 Experimental performance evaluation

The value of the scaling factor $a$ is varied and its effect on the performance of the likelihood ratio detector and the correlation detector is observed. For this, the values of $a$ for the various critical bands are obtained using (16).

Effect on detection threshold

In the case of the LLR detector, the effect of the scale factor $a$ is observed on the detection threshold $\Lambda$. First we show the curves of $\Lambda$ against $P_f$, keeping the value of $a$ fixed. The upper and lower portions of Fig. 3 show the variation of $\Lambda$ with respect to $P_f$ for two values of $a$. The first value is obtained from the MAS threshold; the second, 0.8, is selected close to the maximum limit of $a$ to make the effects clearly visible. As can be seen from the figure, for the same range of $P_f$ the variation in $\Lambda$ is only $0 < \Lambda < 0.08$ for the MAS-derived value of $a$, whereas the variation is quite large ($0 < \Lambda < 20$) for $a = 0.8$.

Fig. 3 Threshold versus probability of false detection for LLR detector for two values of scaling factor a (detection threshold Λ versus probability of false alarm P_f; lower panel a = 0.8)

Next we analyse the variation of $\Lambda$ with respect to $a$ for all the values in
the range $0 < a \le 1$. Figure 4 shows the variation of $\Lambda$ with respect to $a$ for a fixed value of $P_f$ ($10^{-6}$). From the plot we observe that the value of $\Lambda$ remains nearly constant until $a$ reaches 0.04; as $a$ is increased beyond 0.04, a steep rise in $\Lambda$ is obtained. Another observation from the figure is that with decreasing $a$, the value of the detection threshold also decreases, which in turn degrades the detector response.

Fig. 4 Threshold versus scaling factor for Log-LLR detector for P_f = 10^{-6} (detection threshold Λ versus scaling factor)

Fig. 5 Threshold versus probability of false detection for correlation detector for two values of scaling factor a (detection threshold T_c versus probability of false alarm P_f; lower panel a = 0.8)

In the case of the correlation detector, the effect of $a$ is observed on the detection threshold $T_c$. A plot of $T_c$ against $P_f$ for two different values of $a$ (the MAS-derived value and 0.8) is shown in the upper and lower portions of Fig. 5. From the figure we observe that
$T_c = 0.27$ when $P_f = 10^{-3}$ for both values of $a$. Also, the value of the threshold lies within the range $0 \le T_c \le 0.5$ for a wide variation of $a$ (i.e. $0 \le a \le 1$). Hence it can be inferred from the curves that the output of the correlation detector is not much affected by the scaling factor $a$. Instead, the output depends mainly on the PN sequence taken as the watermark during the embedding process.

Effect on ROC

The Receiver Operating Characteristic (ROC) curves obtained from the likelihood ratio and correlation watermark detectors are shown in Fig. 6. The results are compared with the actual experimental curve for both detectors with two different values of $a$, the MAS-derived value and 0.8. It is observed that for the MAS-derived value of $a$ the three curves nearly coincide with each other, whereas the same is not true for the case $a = 0.8$. For the proposed value of $a$, given in Section 3.2, the statistical detectors give optimum results which are close to the actual experimental values.

Fig. 6 Receiver operating characteristic curve for scaling factor α = (upper curve) and α = 0.4 (lower curve), respectively (P_f versus P_m for the correlation detector, the likelihood ratio detector and the experimental curve)

Further we observe that the LLR detector gives a better approximation to
experimental results as compared to the correlation detector, for all values of the scaling factor.

4.2 Objective and subjective quality evaluation

Subjective and objective quality tests are performed to evaluate the quality of the watermarked audio signal [20]. The subjective audio quality of the watermarked audio is evaluated by a double-blind A-B-C triple-stimulus hidden reference comparison test. Stimulus A contains the reference signal, whereas B and C are pseudo-randomly selected from the watermarked and the reference signal. After listening to all three, the subject was asked to identify either B or C as the hidden reference, and then to grade the watermarked signal relative to the reference stimulus using the SDG. The standard [14] specifies 20 subjects as an adequate size for the listening panel. Since expert listeners participated in the test, the number of listeners was reduced to 10 for an informal test. A training session preceded the grading session, where a trial was conducted for each signal. The tests were performed with headphones in a special cabin dedicated to listening tests. Ten test signals - selected from the sound with a length of s have been presented to the listeners.

The results of the listening test are shown in Fig. 7. For the different audio files, the mean SDG value and the 95 % confidence interval are plotted as a function of the different audio tracks to clearly reveal the distance to transparency (SDG = 0). It is observed from these results that the quality degradation of the proposed watermarking scheme is very small for the vast majority of the test items given in Table 1. For all test items the SDG is within 0.7 to which indicates that there is no significant distortion introduced by this scheme.

Fig. 7 Subjective quality evaluation of watermarked audio
Table 2 Results of ODG and scaling factor for multiplicative embedding in DFT magnitude of audio signals (columns: S. no., ODG, a)

For the objective quality measure, the software PQevalAudio for perceptual evaluation of audio quality (PEAQ) is used to evaluate an objective difference grade (ODG), which is an objective measurement of the SDG. Table 2 lists the average value of PEAQ/ODG for the given test items for varying values of $a$. It shows that as the value of the scaling factor $a$ decreases, the perceptual quality of the watermarked audio becomes better. However, if the value of the scaling factor is lowered below the value obtained from MAS (0.0024), the ODG obtained is positive. The ITU recommendation does not allow positive ODGs, because this could also happen in listening tests, where the file under test is rated better than the reference file. The value of ODG obtained from watermarked audio is for the optimum value of a =

From the ROCs plotted in Fig. 6 and the objective quality given in Table 2, we observe that for small values of $a$ the detector response is poor, but the perceptual quality is within acceptable limits. On the contrary, for larger values of the scale factor ($a \ge 0.04$) the detector response improves, but the perceptual transparency deteriorates. It can be inferred from these results that the proposed technique gives a good tradeoff between perceptual transparency and detector performance.

4.3 Watermark embedding capacity

The proposed scheme provides high watermark embedding capacity with least perceptual distortion. Table 3 compares the embedding capacity and perceptual quality of the proposed scheme with other schemes in the literature. The scheme of Megías et al. [18] achieves ODG values between −0.5 and −2, which is not an acceptable range. The technique proposed by Fujimoto et al. [10] provides a high watermark embedding rate of 1 kbps and is simple to implement. However, the scheme only considers the MP3 compression attack, and the ODG value is not mentioned.
The algorithm proposed by Fallahpour and Megias [9] achieved a high capacity of about 3 kbps and is robust against most attacks. The average ODG score achieved is −0.5, which is not very satisfactory. This could be because the manipulation based on the estimated FFT coefficients introduces distortions.

Table 3 Comparison of ODG and watermark embedding capacity between available literature schemes

Technique                    ODG           Embedding capacity
Megías et al. [18]           −0.5 to −2    61 bps
Fujimoto et al. [10]                       1 kbps
Fallahpour and Megias [9]                  kbps
Proposed                     to            kbps
The embedding capacity of the proposed scheme was found to be 1.4 kbps (with ODG = −0.065) when embedding was done in only one critical band. The average watermark capacity increased to 4 kbps (with ODG = −0.7) when embedding was performed in more than one critical band (i.e. 3). Compared with HAS-based models, MAS enables a relatively higher watermark embedding rate in the DFT magnitude within acceptable limits of perceptual quality. The proposed method is thus able to provide large capacity whilst keeping imperceptibility in the admitted range (−1 to 0).

4.4 Robustness to attacks

The other major issue in watermarking is robustness to various attacks. We now present the robustness of the watermark against additive white Gaussian noise (AWGN) and the presence of multiple watermarks.

Fig. 8 Upper curve shows percentage watermark recovery with respect to SNR and lower curve shows correlation detector response for scaling factor a = (upper panel: percentage watermark recovery versus Eb/No; lower panel: correlation detector response to 1001 random watermarks, the correct watermark being the 501st)
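The lower panel of Fig. 8, the correlation detector's response to 1001 candidate watermarks of which only the 501st is embedded, can be imitated with a small simulation of the correlation statistic (23). Everything specific here is an assumption of the sketch: the host magnitudes are synthetic Weibull samples, the candidates are bipolar chips, `a = 0.3` is an illustrative strength, and the threshold is derived from a Gaussian approximation of the statistic under the no-watermark hypothesis.

```python
import math
import random
from statistics import NormalDist

def bipolar(rng, n):
    """Random chip with values in {-1, +1} (assumed alphabet)."""
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def correlate(feature, w):
    """Normalized correlation c between a received feature and a candidate, cf. (23)."""
    return sum(f * x for f, x in zip(feature, w)) / len(w)

def threshold(feature, p_f):
    """Threshold for false-alarm probability p_f under a Gaussian approximation.

    For a candidate independent of the feature, c has zero mean and
    variance sum(f^2) / n^2, so only the received feature is needed (blind).
    """
    sigma = math.sqrt(sum(f * f for f in feature)) / len(feature)
    return NormalDist().inv_cdf(1.0 - p_f) * sigma

rng = random.Random(42)
n, a, true_index = 2048, 0.3, 500           # index 500 is the 501st candidate
host = [rng.weibullvariate(1.0, 1.5) for _ in range(n)]  # stand-in DFT magnitudes
candidates = [bipolar(rng, n) for _ in range(1001)]
marked = [f * (1.0 + a * w) for f, w in zip(host, candidates[true_index])]

responses = [correlate(marked, w) for w in candidates]
t_c = threshold(marked, p_f=1e-6)
best = max(range(len(responses)), key=responses.__getitem__)
```

The embedded candidate produces a response near a times the mean host magnitude, while the responses of the other candidates scatter around zero, so the peak at position 501 stands clear of the threshold.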
4.4.1 Addition of AWGN noise

The performance of the watermark channel is evaluated in the presence of AWGN. A plot of percentage watermark recovery against SNR (Eb/No) is shown by the upper curve in Fig. 8. As can be seen from the figure, more than 99 % watermark recovery is achieved for SNR values of 6 dB and above. This implies high robustness of the watermark against AWGN.

4.4.2 Presence of multiple watermarks

To see the effect of the presence of multiple watermarks, 1,000 normally distributed pseudo-random sequences with mean 0 and variance 1 are generated. These sequences are used as test sequences W_t for the watermark detection process. The correct test watermark W_d was used at the 501st iteration. Both types of detector, i.e. the likelihood ratio and correlation detectors, are used. For the LLR detector the threshold obtained is 9.8 for P_f = 10^-6 with scaling factor a = 0.8. The LLR detector output is shown for a high value of a because the response of this detector is poor for small values of a, as can be seen from Fig. 4. The log-likelihood ratio of the correct watermark is above 9.8, whereas the LLR of the other watermarks is well below the threshold. Similarly, for the correlation detector the threshold T_c was obtained statistically. As can be seen from the lower curve of Fig. 8, the correct watermark at the 501st position can be very easily distinguished from the other watermarks. The correlation coefficient c of the correct watermark is above 0.271, whereas the other watermarks are well below the threshold. Hence the threshold values evaluated statistically match the experimental results.

4.5 Robustness to common audio manipulations

Further, we test the robustness of our work against several kinds of common audio manipulations (or attacks).
The audio editing tools used in our experiment to generate all the following attacks are Cool Edit Pro v2.1 and Goldwave v5.10. The correlation of template matching is given in Table 4 to show the applicability of the proposed scheme in searching for watermark-protected audio clips.

Table 4 Robustness to common audio processing operations
Attacks | Correlation (mono) | Correlation (drum) | Correlation (flute)
Time stretch (preserve pitch)
Pitch Shift (preserve tempo)
Resample (preserve neither)
Low pass filter
High pass filter
MP3 (128 kbps)
Resample (16 k/16 bps)
Cropping (with half left)
Delay (10.2 ms)
Invert

MP3 compression: To test the robustness against lossy compression, the watermarked audio is compressed and decompressed by MPEG-1 Layer 3 (MP3) at 128 kbps. The results indicate high values of the correlation.

Re-sampling: The watermarked audio, originally at a 44,100 Hz sampling rate and 16 bits/sample, is re-sampled down to 16,000 Hz and 16 bits/sample. The low-resolution audio is then up-sampled to 44,100 Hz and re-quantized to 16 bits/sample. Although this procedure caused audible noise, there is almost no effect on the correlation of template matching, and the extracted owner's information is hardly affected.

Low-pass filtering: To test the robustness against filtering, a low-pass filter was applied to the watermarked audio sampled at 44,100 Hz. The filter was designed with less than 3 dB of ripple in the passband from 0 to 4 kHz and at least 40 dB of attenuation in the stopband from 6 kHz to the Nyquist frequency (22,050 Hz). The loss of high-frequency components is clearly audible; however, the symmetrically embedded watermark can be detected successfully from the low-frequency components.

Random cropping: The watermarked audio is randomly cropped, leaving a segment of half the original length. Because each slice is an independent processing unit, we can extract watermarks from the remaining frames after block synchronization. We can successfully recognize the hidden information, and the correlations of template matching are reasonably high.

High-pass filtering: A 6th-order high-pass Butterworth filter with a cutoff frequency of 7,000 Hz was applied to the watermarked data sampled at 44,100 Hz. The symmetrically embedded watermark can be detected successfully from the high-frequency components.

Time scaling: The watermarked audio is scaled by 1.2 % for testing, in three different ways: time stretching (preserves pitch), pitch shifting (preserves tempo) and resampling (preserves neither). All three attacks have very little effect on our extraction scheme. The shifting and scaling of each slice can be detected by template matching.

It has been observed from the results that, owing to the symmetrically embedded watermark in DFT magnitude, the proposed scheme is robust against most signal processing attacks.

4.6 Robustness to Stirmark Audio Benchmark

Stirmark for Audio [23] is a standard robustness evaluation benchmark tool for audio watermarking techniques. The test results for all test functions in Stirmark Benchmark for Audio V0.2 are listed in Table 5; they were obtained with the default parameters included in the version of the tool available online [23]. For that experiment, we selected 10 standard audio clips (Table 1), watermarked them, and then detected watermarks in the original, the marked copy, and all 49 clips created by the Stirmark Audio suite of attacks. The detection results are presented in Table 5. The detection threshold is set to T_c = 0.27, which results in an estimated probability of a false positive smaller than 10^-3 for a variety of audio clips.

Table 5 Watermark detection results on audio clips attacked with the Stirmark Audio Benchmark
Attacks | Correlation
addbrumm_ |
addbrumm_ |
addbrumm_ |
addbrumm_ |
addbrumm_ |
addbrumm_ |
addbrumm_ |
addbrumm_ |
addbrumm_ |
addbrumm_ |
addbrumm_ |
addfftnoise |
addnoise_ |
addnoise_ |
addnoise_ |
addnoise_ |
addnoise_ |
addsinus |
amplify |
compressor |
copysample |
cutsamples | 1
dynnoise |
echo |
exchange |
extrastereo_30 | 1
extrastereo_50 | 1
extrastereo_70 | 1
fft_hlpass |
fft_invert |
fft_real_reverse |
fft_stat1 | 1
fft_test | 1
flippsample |
invert |
lsbzero | 1
normalize | 1
nothing | 1
original |
rc_highpass |
rc_lowpass | 1
resampling |
smooth |
smooth |
stat1 | fail
stat2 | fail
voiceremove | fail
zerocross | 1
zerolength |
zeroremove | 1

From Table 5, we observe that most of the attacks had minimal effect on the correlation value. The attacks that significantly reduced the correlation value or removed the watermark (such as Stat1, Stat2 and VoiceRemove) had a strong impact on the fidelity of the recording, so that the attacked clip barely resembled the original. The Stat1 and Stat2 attacks average each sample with its neighbouring samples and hence change the DFT magnitude. Similarly, the VoiceRemove attack removes the mono part of the file. If the audio signal isn't multichannel (i.e., it is mono), everything is removed. Hence the test failed when speech (mono) was used.

4.7 Computational complexity

The execution time (etime) for evaluating the masking threshold from the existing psychoacoustic model and from the proposed model was measured in MATLAB 7.7 on an Intel Core i3 CPU with a 32-bit operating system. Table 6 shows that the computation time required for evaluating the masking threshold from DFT magnitude (proposed technique) is much less than the execution time for the frequency masking threshold of the MPEG/audio psychoacoustic model.

Table 6 Comparison of execution time between proposed and existing model
Execution time | Mono (speech) (ms) | Drum (ms) | Flute (ms) | Stereo (ms)
Global masking threshold
Maximum amplitude spread

5 Conclusion

The proposed multiplicative spread-spectrum audio watermarking technique embeds the watermark in the DFT magnitude of the audio signal. To improve two parameters, the embedding capacity and the computational complexity, a new perceptual model for the magnitude of DFT coefficients is developed. This model finds the regions of highest watermark embedding capacity with least perceptual distortion.
Also, the proposed method reduces computation by bypassing the complex psychoacoustic modeling required for fulfilling the condition of transparency. Further, the scheme uses blind watermark detection, i.e. the detector does not require the original copy of the audio signal to detect the watermark in the received audio signal. Theoretical evaluation of detector performance using the correlation detector and the likelihood ratio detector is undertaken under the assumption that the host feature (DFT magnitude) follows a Weibull distribution. The performance of the scheme is investigated experimentally and statistically, and the results are compared with existing schemes in terms of perceptual quality and embedding capacity. The results show that the proposed scheme gives higher embedding capacity than existing high-capacity watermark embedding techniques while keeping the perceptual quality well within limits. Also, it was observed from experimental results that the proposed scheme is robust to various signal processing attacks such as the presence of multiple watermarks, AWGN and MP3 compression.

References

1. Barni M, Bartolini F (2004) Watermarking systems engineering: enabling digital assets security and other applications. Marcel Dekker, New York
2. Barni M, Bartolini F, De Rosa A, Piva A (2001) A new decoder for the optimum recovery of nonadditive watermarks. IEEE Trans Image Process 10(5)
3. Bassia P, Pitas I, Nikolaidis NN (2001) Robust audio watermarking in the time domain. IEEE Trans Multimedia 3
4. Boney L, Tewfik AH, Hamdy KN (1996) Digital watermarks for audio signal. In: Proc. IEEE int. conf. multimedia comput. syst. (ICMCS), Hiroshima, Japan
5. Cheng Q, Huang TS (2003) Robust optimum detection of transform domain multiplicative watermarks. IEEE Trans Signal Process 51(4)
6. Cox IJ, Kilian J, Leighton T, Shamoon T (1997) Secure spread spectrum watermarking for multimedia. IEEE Trans Image Process 6(12)
7. Cvejic N, Seppänen T (2004) Spread spectrum audio watermarking using frequency hopping and attack characterization. Signal Process 84(1)
8. Cvejic N, Keskinarkaus A, Seppänen T (2001) Audio watermarking using m-sequences and temporal masking. In: Proceedings of IEEE workshops on applications of signal processing to audio and acoustics, New Paltz, New York
9. Fallahpour M, Megias D (2009) High capacity audio watermarking using FFT amplitude interpolation. IEICE Electronics Express 6(14)
10. Fujimoto R, Iwaki M, Kiryu T (2006) A method of high bit rate data hiding in music using spline interpolation. In: Proceedings of the 2006 international conference on intelligent information hiding and multimedia signal processing (IIH-MSP '06)
11. Garcia RA (1999) Digital watermarking of audio signals using a psychoacoustic auditory model and spread spectrum theory. In: 107th convention: Audio Engineering Society, New York
12. Garcia-Hernandez JJ, Nakano M, Perez-Meana H (2008) Data hiding in audio signal using rational dither modulation. IEICE Electronics Express 5(7)
13. Hyun KW, Dooseop C, Hyuk C, Taejeong K (2010) Selective correlation detector for additive spread spectrum watermarking in transform domain. Signal Process 90(8)
14. ITU-R (1993) Recommendation BS Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems. Technical report, ITU
15. Kirovski D, Malvar H (2003) Spread-spectrum watermarking of audio signals. IEEE Trans Signal Process 51(4)
16. Lee S-K, Ho Y-S (2000) Digital audio watermarking in Cepstral domain. IEEE Trans Consumer Electron 46(3)
17. Malik H, Ansari R, Khokhar A (2008) Robust audio watermarking using frequency-selective spread spectrum. Inf Secur (IET) 2(4)
18. Megías D, Herrera J, Minguillón J (2005) Total disclosure of the embedding and detection algorithms for a secure digital watermarking scheme for audio. In: Information and communications security. Lecture notes in computer science, vol 3783
19. Megías D, Serra-Ruiz J, Fallahpour M (2010) Efficient self-synchronised blind audio watermarking system based on time domain and FFT amplitude modification. Signal Process 90(12)
20. Neubauer C, Herre J (1998) Digital watermarking and its influence on audio quality. In: Proceedings of 105th Audio engineering society convention, San Francisco, CA
21. Painter EM, Spanias AS (1997) A review of algorithms for perceptual coding of digital audio signals. In: 13th international conference on digital signal processing proceedings DSP-97, vol 1
22. Schroeder MR, Atal BS, Hall JL (1979) Optimizing digital speech coders by exploiting properties of the human ear. J Acoust Soc Am 66(6)
23. Steinebach M, Petitcolas FAP, Raynal F, Dittmann J, Fontaine C, Seibel S, Fates N, Ferr LC (2001) StirMark benchmark: audio watermarking attacks. In: Proceedings international conference information technology: coding and computing, pp 49-54
24. Stone GC, Van HG (1977) Parameter estimation for the Weibull distribution. IEEE Trans Electr Insul EI-12(4)
25. Swanson MD, Zhu B, Tewfik AH, Boney L (1998) Robust audio watermarking using perceptual masking. Signal Process 66(3)
26. van Trees HL (1968) Detection, estimation and modulation theory, part I. Wiley, New York
27. Weibull W (1951) A statistical distribution function of wide applicability. J Appl Mech 18(3)

Jyotsna Singh received her B. Tech degree in Electronics from Harcourt Butler Technological Institute, Kanpur, India in 1995 and her M. Tech degree in Signal Processing from Netaji Subhas Institute of Technology, Delhi University, Delhi, India. She works as Assistant Professor at Netaji Subhas Institute of Technology, Delhi University, New Delhi. She is currently working towards the Ph.D. degree in Electronics and Communication Engineering at the University of Delhi, India. Her research interests include Speech Recognition and Digital Watermarking of Multimedia.

Parul Garg received B.Sc. (Engg.) and M.Sc. (Engg.) degrees from Aligarh Muslim University, Aligarh, India, in 1990 and 1994, respectively, both in Electronics Engineering, and her Ph.D. degree in Electrical Engineering from Indian Institute of Technology, Delhi. From May 1996 to July 2000 she worked as a faculty member at the Institute of Engineering and Technology, Lucknow, India. Since July 2000, she has been working as a faculty member at the Netaji Subhas Institute of Technology, New Delhi, India. Her current work mainly focuses on different aspects of wireless communications with emphasis on channel estimation, diversity techniques, cooperative communication, network coding and cognitive radio. She is also working on data hiding in audio signals.
Dr. Aloknath De is Country Director for ST-Ericsson India. He holds a B.Tech. from Indian Institute of Technology (IIT), Kharagpur; an M.E. from Indian Institute of Science (IISc), Bangalore; and a Ph.D. from McGill University, Montreal. He is a recipient of the Alexander Graham Bell Prize in Canada for his research work in the Speech Communication area. He received IETE Memorial Awards in 2003 and 2008 for distinguished contributions in the fields of Electronics and Communication with emphasis on Industrial R&D and Mobile Communication, respectively. Dr. De has over twenty years of industrial and research experience, including BEL, Nortel (Montreal), Hughes and STMicroelectronics, prior to leading ST-Ericsson in India. He has been chair of the Media Processing group of the International Multimedia Telecommunication Consortium (IMTC), California. He has also served on the technical program committees of various international conferences such as Eurospeech, Supercomm India, VLSI Conf., IEEE CCNC, IEEE Intl Conf on Communications, IEEE Globecom Workshop and others. In late 2009, he co-chaired an INAE workshop on Making India Powerhouse for Semiconductor Design. He's a Senior Member of IEEE and a Fellow of IE, IETE and the Indian National Academy of Engg (INAE). He has held an AICTE-INAE Distinguished Visiting Professorship with IIT-Roorkee and is currently an Adjunct Professor with IIT-Delhi. His current thrust is on system-on-chip (SoC) and embedded solutions for mobile devices and other multimedia communication appliances.
More informationDigital Watermarking and its Influence on Audio Quality
Preprint No. 4823 Digital Watermarking and its Influence on Audio Quality C. Neubauer, J. Herre Fraunhofer Institut for Integrated Circuits IIS D-91058 Erlangen, Germany Abstract Today large amounts of
More informationSpeech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm
International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,
More informationWARPED FILTER DESIGN FOR THE BODY MODELING AND SOUND SYNTHESIS OF STRING INSTRUMENTS
NORDIC ACOUSTICAL MEETING 12-14 JUNE 1996 HELSINKI WARPED FILTER DESIGN FOR THE BODY MODELING AND SOUND SYNTHESIS OF STRING INSTRUMENTS Helsinki University of Technology Laboratory of Acoustics and Audio
More informationImproving Channel Estimation in OFDM System Using Time Domain Channel Estimation for Time Correlated Rayleigh Fading Channel Model
International Journal of Engineering Science Invention ISSN (Online): 2319 6734, ISSN (Print): 2319 6726 Volume 2 Issue 8 ǁ August 2013 ǁ PP.45-51 Improving Channel Estimation in OFDM System Using Time
More informationImproved Detection by Peak Shape Recognition Using Artificial Neural Networks
Improved Detection by Peak Shape Recognition Using Artificial Neural Networks Stefan Wunsch, Johannes Fink, Friedrich K. Jondral Communications Engineering Lab, Karlsruhe Institute of Technology Stefan.Wunsch@student.kit.edu,
More informationMel Spectrum Analysis of Speech Recognition using Single Microphone
International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree
More informationCommunications Theory and Engineering
Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 Speech and telephone speech Based on a voice production model Parametric representation
More informationIMPROVED CODING OF TONAL COMPONENTS IN MPEG-4 AAC WITH SBR
IMPROVED CODING OF TONAL COMPONENTS IN MPEG-4 AAC WITH SBR Tomasz Żernici, Mare Domańsi, Poznań University of Technology, Chair of Multimedia Telecommunications and Microelectronics, Polana 3, 6-965, Poznań,
More informationReal-time Attacks on Audio Steganography
Journal of Information Hiding and Multimedia Signal Processing c 12 ISSN 73-4212 Ubiquitous International Volume 3, Number 1, January 12 Real-time Attacks on Audio Steganography M. Nutzinger Theobroma
More informationCG401 Advanced Signal Processing. Dr Stuart Lawson Room A330 Tel: January 2003
CG40 Advanced Dr Stuart Lawson Room A330 Tel: 23780 e-mail: ssl@eng.warwick.ac.uk 03 January 2003 Lecture : Overview INTRODUCTION What is a signal? An information-bearing quantity. Examples of -D and 2-D
More informationWatermarking-based Image Authentication with Recovery Capability using Halftoning and IWT
Watermarking-based Image Authentication with Recovery Capability using Halftoning and IWT Luis Rosales-Roldan, Manuel Cedillo-Hernández, Mariko Nakano-Miyatake, Héctor Pérez-Meana Postgraduate Section,
More informationAccurate Delay Measurement of Coded Speech Signals with Subsample Resolution
PAGE 433 Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution Wenliang Lu, D. Sen, and Shuai Wang School of Electrical Engineering & Telecommunications University of New South Wales,
More informationEvaluation of Audio Compression Artifacts M. Herrera Martinez
Evaluation of Audio Compression Artifacts M. Herrera Martinez This paper deals with subjective evaluation of audio-coding systems. From this evaluation, it is found that, depending on the type of signal
More informationDigital Image Watermarking by Spread Spectrum method
Digital Image Watermarking by Spread Spectrum method Andreja Samčovi ović Faculty of Transport and Traffic Engineering University of Belgrade, Serbia Belgrade, november 2014. I Spread Spectrum Techniques
More informationExperimental Validation for Hiding Data Using Audio Watermarking
Australian Journal of Basic and Applied Sciences, 5(7): 135-145, 2011 ISSN 1991-8178 Experimental Validation for Hiding Data Using Audio Watermarking 1 Mamoun Suleiman Al Rababaa, 2 Ahmad Khader Haboush,
More informationSpread Spectrum Watermarking Using HVS Model and Wavelets in JPEG 2000 Compression
Spread Spectrum Watermarking Using HVS Model and Wavelets in JPEG 2000 Compression Khaly TALL 1, Mamadou Lamine MBOUP 1, Sidi Mohamed FARSSI 1, Idy DIOP 1, Abdou Khadre DIOP 1, Grégoire SISSOKO 2 1. Laboratoire
More informationBiomedical Signals. Signals and Images in Medicine Dr Nabeel Anwar
Biomedical Signals Signals and Images in Medicine Dr Nabeel Anwar Noise Removal: Time Domain Techniques 1. Synchronized Averaging (covered in lecture 1) 2. Moving Average Filters (today s topic) 3. Derivative
More informationDigital Watermarking Using Homogeneity in Image
Digital Watermarking Using Homogeneity in Image S. K. Mitra, M. K. Kundu, C. A. Murthy, B. B. Bhattacharya and T. Acharya Dhirubhai Ambani Institute of Information and Communication Technology Gandhinagar
More informationEE 435/535: Error Correcting Codes Project 1, Fall 2009: Extended Hamming Code. 1 Introduction. 2 Extended Hamming Code: Encoding. 1.
EE 435/535: Error Correcting Codes Project 1, Fall 2009: Extended Hamming Code Project #1 is due on Tuesday, October 6, 2009, in class. You may turn the project report in early. Late projects are accepted
More informationAcoustic Communication System Using Mobile Terminal Microphones
Acoustic Communication System Using Mobile Terminal Microphones Hosei Matsuoka, Yusuke Nakashima and Takeshi Yoshimura DoCoMo has developed a data transmission technology called Acoustic OFDM that embeds
More informationChapter 2: Signal Representation
Chapter 2: Signal Representation Aveek Dutta Assistant Professor Department of Electrical and Computer Engineering University at Albany Spring 2018 Images and equations adopted from: Digital Communications
More informationPerformance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches
Performance study of Text-independent Speaker identification system using & I for Telephone and Microphone Speeches Ruchi Chaudhary, National Technical Research Organization Abstract: A state-of-the-art
More informationSubband Analysis of Time Delay Estimation in STFT Domain
PAGE 211 Subband Analysis of Time Delay Estimation in STFT Domain S. Wang, D. Sen and W. Lu School of Electrical Engineering & Telecommunications University of ew South Wales, Sydney, Australia sh.wang@student.unsw.edu.au,
More informationRESEARCH ON METHODS FOR ANALYZING AND PROCESSING SIGNALS USED BY INTERCEPTION SYSTEMS WITH SPECIAL APPLICATIONS
Abstract of Doctorate Thesis RESEARCH ON METHODS FOR ANALYZING AND PROCESSING SIGNALS USED BY INTERCEPTION SYSTEMS WITH SPECIAL APPLICATIONS PhD Coordinator: Prof. Dr. Eng. Radu MUNTEANU Author: Radu MITRAN
More informationProblem Sheet 1 Probability, random processes, and noise
Problem Sheet 1 Probability, random processes, and noise 1. If F X (x) is the distribution function of a random variable X and x 1 x 2, show that F X (x 1 ) F X (x 2 ). 2. Use the definition of the cumulative
More informationPerformance Analysis of Cognitive Radio based on Cooperative Spectrum Sensing
Performance Analysis of Cognitive Radio based on Cooperative Spectrum Sensing Sai kiran pudi 1, T. Syama Sundara 2, Dr. Nimmagadda Padmaja 3 Department of Electronics and Communication Engineering, Sree
More informationAudio Data Verification and Authentication using Frequency Modulation Based Watermarking
Dublin Institute of Technology ARROW@DIT Articles School of Electrical and Electronic Engineering 2008-01-01 Audio Data Verification and Authentication using Frequency Modulation Based Watermarking Jonathan
More informationFilter Banks I. Prof. Dr. Gerald Schuller. Fraunhofer IDMT & Ilmenau University of Technology Ilmenau, Germany. Fraunhofer IDMT
Filter Banks I Prof. Dr. Gerald Schuller Fraunhofer IDMT & Ilmenau University of Technology Ilmenau, Germany 1 Structure of perceptual Audio Coders Encoder Decoder 2 Filter Banks essential element of most
More informationFOR THE PAST few years, there has been a great amount
IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 53, NO. 4, APRIL 2005 549 Transactions Letters On Implementation of Min-Sum Algorithm and Its Modifications for Decoding Low-Density Parity-Check (LDPC) Codes
More informationTesting of Objective Audio Quality Assessment Models on Archive Recordings Artifacts
POSTER 25, PRAGUE MAY 4 Testing of Objective Audio Quality Assessment Models on Archive Recordings Artifacts Bc. Martin Zalabák Department of Radioelectronics, Czech Technical University in Prague, Technická
More informationSINOLA: A New Analysis/Synthesis Method using Spectrum Peak Shape Distortion, Phase and Reassigned Spectrum
SINOLA: A New Analysis/Synthesis Method using Spectrum Peak Shape Distortion, Phase Reassigned Spectrum Geoffroy Peeters, Xavier Rodet Ircam - Centre Georges-Pompidou Analysis/Synthesis Team, 1, pl. Igor
More informationStudy of Turbo Coded OFDM over Fading Channel
International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 3, Issue 2 (August 2012), PP. 54-58 Study of Turbo Coded OFDM over Fading Channel
More informationNOISE ESTIMATION IN A SINGLE CHANNEL
SPEECH ENHANCEMENT FOR CROSS-TALK INTERFERENCE by Levent M. Arslan and John H.L. Hansen Robust Speech Processing Laboratory Department of Electrical Engineering Box 99 Duke University Durham, North Carolina
More informationDigital Processing of Continuous-Time Signals
Chapter 4 Digital Processing of Continuous-Time Signals 清大電機系林嘉文 cwlin@ee.nthu.edu.tw 03-5731152 Original PowerPoint slides prepared by S. K. Mitra 4-1-1 Digital Processing of Continuous-Time Signals Digital
More informationAudio and Speech Compression Using DCT and DWT Techniques
Audio and Speech Compression Using DCT and DWT Techniques M. V. Patil 1, Apoorva Gupta 2, Ankita Varma 3, Shikhar Salil 4 Asst. Professor, Dept.of Elex, Bharati Vidyapeeth Univ.Coll.of Engg, Pune, Maharashtra,
More informationJournal of mathematics and computer science 11 (2014),
Journal of mathematics and computer science 11 (2014), 137-146 Application of Unsharp Mask in Augmenting the Quality of Extracted Watermark in Spatial Domain Watermarking Saeed Amirgholipour 1 *,Ahmad
More informationVoice Activity Detection for Speech Enhancement Applications
Voice Activity Detection for Speech Enhancement Applications E. Verteletskaya, K. Sakhnov Abstract This paper describes a study of noise-robust voice activity detection (VAD) utilizing the periodicity
More informationMUS421/EE367B Applications Lecture 9C: Time Scale Modification (TSM) and Frequency Scaling/Shifting
MUS421/EE367B Applications Lecture 9C: Time Scale Modification (TSM) and Frequency Scaling/Shifting Julius O. Smith III (jos@ccrma.stanford.edu) Center for Computer Research in Music and Acoustics (CCRMA)
More information