Watermarked Movie Soundtrack Finds the Position of the Camcorder in a Theater

Yuta Nakashima, Ryuki Tachibana, and Noboru Babaguchi, Senior Member, IEEE

Abstract—In recent years, the problem of camcorder piracy in theaters has become more serious due to technical advances in camcorders. In this paper, as a new deterrent to camcorder piracy, we propose a system for estimating the recording position from which a camcorder recording is made. The system is based on spread-spectrum audio watermarking of the multichannel movie soundtrack. It utilizes a stochastic model of the detection strength, which is calculated in the watermark detection process. Our experimental results show that the system estimates recording positions in an actual theater with a mean estimation error of 0.44 m. The results of our MUSHRA subjective listening tests show that the method does not significantly spoil the subjective acoustic quality of the soundtrack. These results indicate that the proposed system is applicable for practical use.

Index Terms—Audio watermarking, recording position estimation, movie soundtrack, prevention of movie piracy

Fig. 1. A scenario for identifying a pirate.

I. INTRODUCTION

CAMCORDER piracy in theaters is movie theft by persons who bring a camcorder into a theater and record a movie from the screen. Recently, camcorder piracy has become a serious problem due to technical advances in camcorders. The Motion Picture Association claims that the annual loss caused by pirated movies is 6.1 billion dollars, and that over 90% of the pirated copies of newly released titles are illegal recordings made by camcorder piracy [1], [2].

Camcorder piracy in theaters is explicitly banned by law in many countries. For instance, in the United States, the Family Entertainment and Copyright Act, which became law in 2005, bans the use of recording devices in theaters. The law also imposes a strict penalty on any person who makes pre-release works (not only movies) publicly available. In Japan, in response to the significant loss of box-office revenues, an anti-camcorder law has been enforced since 2007. This law prohibits recording movies even for private use, which was permitted by the previous copyright law. The law also encourages the movie industry to prevent any person from making illegal recordings.

As a deterrent against camcorder piracy in theaters, several watermarking techniques have been proposed [3], [4], [5], [6], [7]. The main idea of these techniques is to embed a secret message into the movie, where the message indicates where and when the movie was shown. If movies are pirated and the illegal recordings are made available via the Internet or some other route, then the secret message can be extracted to determine where and when the illegal recordings were made. This sort of technique is very effective since it can help to specify the theater and showtime at which the illegal recordings were made, for further surveillance. However, the previously proposed techniques cannot identify the pirate who made the illegal recordings.

Manuscript received xxx xx, 20xx; revised xxx xx, 20xx. This work was partly supported by a research grant of the Okawa Foundation. Y. Nakashima and N. Babaguchi are with the Graduate School of Engineering, Osaka University, 2-1 Yamadaoka, Suita, Osaka, Japan, {nakashima, babaguchi}@nanase.comm.eng.osaka-u.ac.jp. R. Tachibana is with the Tokyo Research Laboratory, IBM Japan, Shimotsuruma, Yamato, Kanagawa, Japan, ryuki@jp.ibm.com.
We consider the following scenario for identifying the pirate: (1) The pirate illegally records watermarked movies and uploads the illegal recordings to the Internet. (2) A conventional watermarking system such as [5] finds the illegal recordings on the Internet and analyzes the embedded message to determine the theater and the showtime at which the illegal recordings were made. (3) The position estimation system estimates the position in the theater where the pirate was, precisely enough to specify the seat. (4) A person identification system identifies the pirate by matching the seat to the person who occupied it. A ticketing system or a video surveillance system may be used as the person identification system. This scenario is illustrated in Fig. 1. This paper focuses on the position estimation system enclosed by the thick lines in Fig. 1, which is a key component of this scenario.

The position estimation system uses an audio watermark signal embedded into the movie soundtrack to estimate the recording position. It is not easy to embed audio watermark signals into movie soundtracks; in fact, most of the watermarking methods that have been proposed for movies are video watermarking methods. This difficulty comes from the nature of movie soundtracks. They are composed of several types of audio such as music, sound effects, voice, and silent portions. In the voice and silent portions, which seem to dominate large parts of a soundtrack, the watermark embedder cannot embed a strong watermark signal without degrading the acoustic quality. We call this the sparseness problem of movie soundtracks. However, we can overcome this problem by maximum-likelihood analysis using the entire recorded signal, and achieve precise recording position estimation by watermarking the multiple-channel soundtrack.

Fig. 2. An overview of the proposed position estimation system.

An overview of our system is shown in Fig. 2. We now explain how the position estimation system works. We call each channel of the soundtrack a host signal (HS). The watermark signal for each HS is generated using a spread-spectrum (SS) technique with a different SS code. The watermark embedder generates a watermark signal for each HS and adds it to the HS to generate a watermarked host signal (WHS). Each WHS is emitted into the air from a separate loudspeaker. If the movie is recorded with a camcorder, the monaural recorded signal (RS) of the audio will be a mixture of all of the WHSs. In the RS, the signal from each loudspeaker is delayed in proportion to the distance from that loudspeaker to the microphone of the camcorder. Our main idea is to utilize these delays for the position estimation. The watermark detector calculates detection strengths, which are defined as the correlations between the SS codes and the RS. Therefore, the detection strength of each watermark signal will have a peak at a particular time, dependent on the delay times. Taking this into account, we construct a stochastic model of the detection strength. The system calculates the probability of obtaining the detection strengths based on the model and finds an optimal recording position from this probability using maximum-likelihood analysis.

The main contributions of this paper are as follows. We demonstrate that digital watermarking of multiple-channel audio signals can be used for finding recording positions precisely enough to specify a particular seat in a large auditorium; this is a brand-new application of the digital watermarking technique. We present a recording position estimation method that is usable even for sparse movie soundtracks; the problem of unreliable watermark signals in the silent portions of the movie soundtracks is addressed by a one-step approach utilizing the detection strength model and the entire RS. We also present the results of subjective listening tests assessing the acoustic quality of the watermarked multichannel movie soundtracks; as far as we know, this is the first effort to assess the acoustic quality of audio watermarking in an environment with more than two speakers.

The rest of this paper is organized as follows. In Section II, related work is introduced. Section III describes our watermarking algorithm, and Section IV describes the position estimator. Experimental evaluations of our system and a discussion are given in Section V. We conclude this paper in Section VI.

II. RELATED WORK

A. Copy Prevention Using Watermarking Techniques

For music, Tachibana et al. [8] proposed sonic watermarking, which allows us to search for illegal recordings made available on the Internet by embedding a secret message into the audio signals.
The most distinctive characteristic of this sonic watermarking is that it is applicable even to unplugged live performances. For digital cinema, some studies reveal and classify the sources of pirated movies and assert the importance of copy prevention using digital watermarking techniques [3], [4]. As copy prevention methods, several watermarking techniques have been proposed. Haitsma et al. [5] developed a video watermarking method for detecting illegal recordings. Watermark detection from such illegal recordings is a very tough problem because of their geometric distortion. They overcame this problem by relying only on the time axis of the movie, and their system allows us to identify the theater, presentation time, and other characteristics. Another way to overcome the geometric distortion was proposed by Nguyen et al. [6], who cancel the geometric distortion by considering a model of the geometric deformations that occur according to the positions of the projector and the camcorder. Gohshi et al. [9] presented a watermarking method designed to detect watermarks in illegally recorded footage made from CRT screens. Manually canceling the geometric distortion of the footage, they achieved accurate detection. Lubin et al. [7] proposed a video watermarking method aimed at digital cinema applications. This method includes a scheme to cancel the geometric deformations. All of these methods can identify illegal recordings made available on the Internet, and are effective in deterring camcorder piracy. However, they are not able to specify the recording locations where the illegal recordings were made.

B. Digital Watermarking Algorithms for Audio Signals

There are many digital watermarking algorithms for audio signals. A watermarking algorithm that exploits a psychoacoustic model to maintain the inaudibility of the watermark signal was presented by Swanson et al. [10]. Their psychoacoustic model takes the temporal and frequency masking effects of the human auditory system (HAS) into account. The watermarking algorithm proposed by Kirovski and Malvar uses an SS technique [11]. They improved the robustness against distortion by arranging the SS code on the time-frequency plane of the HS. For watermarking algorithms that use SS techniques, desynchronization attacks are a serious problem because they make watermark detection impossible; Kirovski and Malvar overcame this problem by searching exhaustively for the synchronization position. The algorithm proposed by Tachibana et al. [12] also uses the time-frequency plane of the HS to embed the watermark signal. Another interesting algorithm, called echo hiding, was presented by Gruhl et al. [13]. Echo hiding embeds a watermark by adding an echo to the HS. The inaudibility of the watermark relies on the temporal masking effect of the HAS. However, since this algorithm uses only one echo to embed a watermark, anyone can detect the watermark, and the algorithm is not capable of embedding multiple watermarks. To overcome these problems, Ko et al. [14] proposed a time-spread echo method, which spreads the echoes in the time domain with a pseudo-random sequence.

Fig. 3. (a) A pattern block consisting of W_B × H_B tiles. (b) A tile comprised of the H_T amplitude spectra of two consecutive frames. (c) Repeated pattern blocks on the time-frequency plane of an HS.

C. Position Estimation Using Information Hiding

For position estimation using information hiding techniques, only a few methods have been proposed. Lazic and Aarabi [15] presented a data hiding method that uses an audio signal as a communication channel between a loudspeaker and a microphone, and they applied the method to a position estimation system. Their position estimation system exploits a property of their SS-based data hiding method: the detection strength decreases with the distance between the loudspeaker and the microphone. They reported that their system is able to specify the loudspeaker nearest to the microphone. However, since this is done based only on a comparison among the detection strengths of the watermarked signals from the loudspeakers, it cannot give the precise position of the recording. Nakashima et al. [16], [17], [18] proposed a position estimation system using an audio watermarking technique to specify the recording position of an illegal recording. This system uses delays of the watermark signals embedded in a multi-channel piece of music and is able to estimate the recording position with a mean estimation error of 1.21 m in a 6 × 6 m² room. Our position estimation system is based on [18] and is extended to be applicable to movie soundtracks, by using a stochastic model of the detection strength, so that the system can be used as a deterrent to camcorder piracy.

The existing method [18] has a problem when it is applied to sparse soundtracks: it often fails to estimate the recording position from such soundtracks because it takes a two-step approach, first calculating the delays of the watermark signals and then estimating the recording position. For accurate position estimation, the delays of at least two channels are required.
This is a tough condition for soundtracks, since the insufficient energy of the watermark signal in the silent portions causes a large error in the delay calculation, and hence the position estimation resulted in a large error. For this reason, the system of [18] targeted only multiple-channel music pieces. In contrast, the proposed method employs a one-step approach utilizing the detection strength model and the whole RS. This allows accurate estimation of the recording positions even for movie soundtracks.

III. WATERMARKING ALGORITHM

Our algorithm is based on [12], which can detect watermark signals in recorded signals, and we modify [12] to improve the estimation accuracy. In this section, we describe the basic concepts of the watermark embedding and the watermark detection, and then describe them in more detail.

A. Basic Concepts

1) Pattern Block and Tile: The watermark embedder constructs the time-frequency plane of the HS by using the discrete Fourier transform (DFT), and modifies the amplitudes of segmented areas called pattern blocks, as shown in Fig. 3 (a). A pattern block has W_B × H_B tiles, each of which consists of the H_T amplitude spectra of two consecutive DFT frames. The tile in the wth column and hth row is referred to as the tile at (w, h).

2) Pseudo-Random Array: The amplitude spectra in each tile are modified according to the pseudo-random number in {+1, -1} assigned to the tile. The pseudo-random numbers of the tiles in a pattern block form a two-dimensional pseudo-random array (PRA), as shown in Fig. 4 (a). The pseudo-random number for the tile at (w, h) is denoted by ω(w, h).

3) Multiple Watermark Detection: For the recording position estimation, we need to detect multiple watermark signals in an RS. This is achieved by using a different PRA for the watermark signal of each channel of the soundtrack. The value of ω(w, h) for the watermark signal of the cth channel (c = 1, 2, ..., N_C) is denoted by ω_c(w, h).
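As a minimal illustration of Sections III-A.2 and III-A.3, the sketch below generates one such ±1 pseudo-random array per channel. The seeds, sizes, and helper name are our own illustrative choices; the paper additionally selects PRAs with low mutual cross-correlation (Section V-A), which this sketch omits.

```python
import numpy as np

def generate_pra(seed: int, w_b: int = 20, h_b: int = 24) -> np.ndarray:
    """Generate a W_B x H_B pseudo-random array of {+1, -1} values.

    Each channel is given a different seed (i.e., a different SS code),
    so that its watermark can be detected independently in the monaural
    recorded signal.
    """
    rng = np.random.default_rng(seed)
    return rng.choice([+1, -1], size=(w_b, h_b))

# One PRA per soundtrack channel (N_C = 3 in the experiments of Section V).
pras = {c: generate_pra(seed=c) for c in range(1, 4)}
print(pras[1].shape)  # (20, 24): W_B tile columns by H_B tile rows
```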

Fig. 4. (a) Pseudo-random numbers assigned to the tiles in a pattern block. The pseudo-random numbers form a PRA. (b) A tile assigned with +1. In this case, in the two consecutive frames of the tile, the amplitudes in the first frame are increased (represented by +) and those in the second frame are decreased (represented by -).

4) Fine Detection: Since the position estimation system requires accurate delay times, we need to calculate the detection strengths at a fine resolution. We call this fine detection. To achieve the fine detection, the detection strength, which is basically a normalized cross-correlation between the RS and each PRA, is repeatedly calculated while shifting the PRA by Δ samples. The detection shift Δ determines the resolution of the detection strengths. A sufficiently small Δ should be used so as not to spoil the sharpness of the peaks of the detection strengths.

5) Modulus Operator: A modulus operator and the pseudo-random number assigned to a tile determine how the amplitude spectra in the tile are modified. The modulus operator m is defined by

m = (m_0, m_1) = (+1, -1).   (1)

The signs of the amplitude modifications for the first and the second frames in the tile at (w, h) of the cth channel are determined by ω_c(w, h)·m_0 and ω_c(w, h)·m_1, respectively. This means that ω_c(w, h) = +1 increases the amplitude spectra in the first frame of the tile and decreases the spectra in the second frame. In the opposite case, ω_c(w, h) = -1 decreases the amplitude spectra in the first frame and increases the spectra in the second frame. Figure 4 (b) shows a tile which is assigned +1. In the watermark detection, taking the difference of the adjacent frames in a tile enhances the watermark signal and reduces the influence of the HS. In [12], the modulus operator is defined by

m = (m_0, m_1, m_2, m_3) = (+1, +1, -1, -1),   (2)

and a tile consists of four consecutive frames so that the watermarks can be detected even when the starting positions of the frames in the WHS and in the RS are different. In other words, modifying the amplitudes of two consecutive frames with the same sign broadens the peaks of the detection strengths, enabling us to detect watermarks without knowing the exact starting position of the PRA. However, since the broadened peak degrades the accuracy of the delay times, we use the modulus operator defined by (1).

6) Psychoacoustic Model: To make the watermark signals inaudible, we use a psychoacoustic model to decide the amount of the amplitude modifications. There are several kinds of psychoacoustic effects of the human auditory system, such as the absolute threshold of hearing, the temporal masking, and the frequency masking [19]. We use the ISO/MPEG-1 audio psychoacoustic model 2 for Layer 3 [20] as the basis of our psychoacoustic model, and alter it as described in [8].

B. Watermark Embedder

The watermark embedder generates a WHS. The energy of the watermark signal is spread over the pattern block using the PRA. The WHS for the cth channel, y_c(t), is generated by the following steps.

1) The HS in the time domain, x_c(t), is divided into frames, each of which consists of N samples, using the sine window. Adjacent frames overlap each other by N/2 samples to avoid discontinuities. The tth sample of the fth frame is

x̃_c(f, t) = x_c(t + fN/2)·win(t),   (3)

where win(t) is the sine window defined as

win(t) = sin(πt/N) for 0 ≤ t ≤ N - 1.   (4)

2) The frames are transformed into the frequency domain using the DFT.
The kth complex spectrum of the fth frame, X_c(f, k), is obtained as

X_c(f, k) = DFT[x̃_c(f, t)](k).   (5)

The amplitude spectrum, X_A^c(f, k), and the phase spectrum, X_P^c(f, k), are given by

X_A^c(f, k) = |X_c(f, k)|,   (6)
X_P^c(f, k) = arg X_c(f, k).   (7)

3) The psychoacoustic model determines the inaudible amount of amplitude modification, A_c(f, k).

4) The amplitude modification sign, Sign_c(f, k), for an amplitude spectrum in the tile at (w, h) is calculated as

Sign_c(f, k) = ω_c(w, h)·m_(f mod 2).   (8)

5) The amplitude spectrum of the WHS, Y_A^c(f, k), is obtained as

Y_A^c(f, k) = X_A^c(f, k) + α·A_c(f, k)·Sign_c(f, k),   (9)

where α is the watermarking rate, which controls the trade-off between the acoustic quality of the WHS and the position estimation accuracy.

6) The time-domain representation of the WHS in each frame is constructed with the inverse DFT (IDFT), using the original phases of the HS:

ỹ_c(f, t) = IDFT[Y_A^c(f, k)·exp{√(-1)·X_P^c(f, k)}](t).   (10)

7) The final WHS in the time domain, y_c(t), is generated by the overlap-and-add technique using the sine window as follows:

y_c(t) = Σ_{f=0}^{F-1} ỹ_c(f, t - fN/2)·win(t - fN/2),   (11)

where F is the number of frames in the HS.
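A minimal sketch of embedding steps 1)-7) is given below, assuming that the host channel x is a 1-D NumPy array, that the inaudible modification amounts A_c(f, k) from the psychoacoustic model are passed in as a precomputed array A, and that the channel's PRA is given. The function name, the frame-to-tile mapping, and the clipping of negative amplitudes are our own reading of the algorithm, so treat this as an approximation rather than the authors' implementation.

```python
import numpy as np

def embed_channel(x, pra, A, alpha=0.1, N=512, H_T=6):
    """Embed a spread-spectrum watermark into one host channel.

    x     : host signal x_c(t), 1-D float array
    pra   : (W_B, H_B) array of +/-1, one pseudo-random number per tile
    A     : inaudible modification amounts A_c(f, k), shape (frames, N//2 + 1)
    alpha : watermarking rate (quality vs. estimation accuracy trade-off)
    """
    W_B, H_B = pra.shape
    win = np.sin(np.pi * np.arange(N) / N)          # sine window, eq. (4)
    modulus = (+1, -1)                              # modulus operator, eq. (1)
    hop = N // 2
    F = (len(x) - N) // hop + 1                     # number of frames
    y = np.zeros(len(x))

    for f in range(F):
        frame = x[f * hop : f * hop + N] * win      # eq. (3)
        X = np.fft.rfft(frame)                      # eq. (5)
        amp, phase = np.abs(X), np.angle(X)         # eqs. (6) and (7)
        w = (f % (2 * W_B)) // 2                    # tile column of this frame
        for h in range(H_B):                        # tile rows along frequency
            band = slice(h * H_T, (h + 1) * H_T)
            sign = pra[w, h] * modulus[f % 2]       # eq. (8)
            # eq. (9), clipped at zero to keep amplitudes nonnegative
            amp[band] = np.maximum(amp[band] + alpha * A[f, band] * sign, 0.0)
        wm_frame = np.fft.irfft(amp * np.exp(1j * phase), n=N)   # eq. (10)
        y[f * hop : f * hop + N] += wm_frame * win  # eq. (11), overlap-and-add
    return y
```

With the parameters of Table III (N = 512, W_B = 20, H_B = 24, H_T = 6), one pattern block spans W_B·N = 10,240 samples in time and the lowest H_B·H_T = 144 frequency bins of each frame.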

Fig. 5. Frames in the watermark detection.

C. Watermark Detector

The watermark detector calculates the detection strengths using the fine detection; it detects multiple watermark signals with different SS codes and gives the detection strengths at a fine resolution. The detection strength of the cth channel with an i-sample time delay, s_c(i), is calculated from the RS by the following steps.

1) The RS, z(t), is divided into frames by the sine window. Each frame is comprised of N samples and overlaps with its neighbors by N/2 samples. The first frame starts at the ith sample, as shown in Fig. 5. That is,

z̃_i(f, t) = z(t + i + fN/2)·win(t).   (12)

2) The frames are transformed into the frequency domain by the DFT. The kth amplitude spectrum of the fth frame, Z_i(f, k), is computed by

Z_i(f, k) = |DFT[z̃_i(f, t)](k)|.   (13)

3) The amplitudes are normalized as

Z̄_i(f, k) = Z_i(f, k) / [(1/(N/2)) Σ_{k=0}^{N/2-1} Z_i(f, k)].   (14)

4) The difference between the logarithmic amplitudes of the two frames in the tile at (w, h), D_i(w, k), is calculated as

D_i(w, k) = log Z̄_i(2w, k) - log Z̄_i(2w + 1, k).   (15)

This alleviates the influence of the HS, because the amplitudes of the consecutive frames have close values, while the watermark signal is enhanced by the modulus operator.

5) The amplitude of the tile at (w, h), ρ_i(w, h), is given by

ρ_i(w, h) = Σ_k D_i(w, k),   (16)

where the summation is computed over the k included in the tile at (w, h).

6) The ith detection strength of the cth channel, s_c(i), is calculated as

s_c(i) = [Σ_{w=0}^{W_B-1} Σ_{h=0}^{H_B-1} ω_c(w, h)·(ρ_i(w, h) - ρ̄_i)] / sqrt(Σ_{w=0}^{W_B-1} Σ_{h=0}^{H_B-1} {ω_c(w, h)·(ρ_i(w, h) - ρ̄_i)}²),   (17)

where

ρ̄_i = (1/(W_B·H_B)) Σ_{w=0}^{W_B-1} Σ_{h=0}^{H_B-1} ρ_i(w, h).   (18)

From the central limit theorem, s_c(i) follows a normal distribution. If the RS is not watermarked, since the standard deviation of the numerator of (17) is given by the denominator, s_c(i) asymptotically follows the standard normal distribution.

Fig. 6. (a) An RS containing multiple watermark signals. The pseudo-random number assigned to each tile in the pattern blocks is also shown. (b) The detection strengths calculated from the RS. (c) The detection strength blocks.

IV. POSITION ESTIMATOR

In this section, we describe the maximum-likelihood position estimator in detail. An algorithm which reduces the computational cost of finding the maximum of the likelihood function is also presented.

A. Basic Concepts

1) Detection Strength Model: As described above, the detection strengths asymptotically follow the normal distribution, with unknown mean and variance. Hence, we model a detection strength as a random value which follows the normal distribution. We call this model the detection strength model. The mean and variance of the distribution are determined as follows. The watermark signal in an RS is shifted by a time delay proportional to the distance from the loudspeaker to the microphone. Therefore, when we view the sequence of detection strengths along the time axis, it forms a pattern with peaks corresponding to the time delays, as shown in Fig. 6 (b). Based on this, we assume that the mean of the distribution can be determined by a function of the time delay of the peaks. The variance is assumed to be 1 to maintain simplicity.
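As an illustration of detection steps (12)-(18), the sketch below computes the detection strength of one channel at a single shift i; the fine detection simply repeats it for i = 0, Δ, 2Δ, .... The function name and the small constants added before the division and the logarithm are our own, so this is a sketch under those assumptions rather than the authors' detector.

```python
import numpy as np

def detection_strength(z, pra, i, N=512, H_T=6):
    """Detection strength s_c(i) of one channel's watermark at shift i.

    z   : monaural recorded signal (1-D array)
    pra : (W_B, H_B) +/-1 pseudo-random array of the channel
    i   : starting sample of the first frame (fine-detection shift)
    """
    W_B, H_B = pra.shape
    win = np.sin(np.pi * np.arange(N) / N)
    hop = N // 2
    rho = np.zeros((W_B, H_B))

    for w in range(W_B):                       # one tile column = frames 2w and 2w+1
        log_amp = []
        for f in (2 * w, 2 * w + 1):
            frame = z[i + f * hop : i + f * hop + N] * win        # eq. (12)
            Z = np.abs(np.fft.rfft(frame))[: N // 2]              # eq. (13)
            Z = Z / (Z.mean() + 1e-12)                            # eq. (14)
            log_amp.append(np.log(Z + 1e-12))
        D = log_amp[0] - log_amp[1]                               # eq. (15)
        for h in range(H_B):
            rho[w, h] = D[h * H_T : (h + 1) * H_T].sum()          # eq. (16)

    v = pra * (rho - rho.mean())               # eq. (17) numerator terms, using eq. (18)
    return float(v.sum() / np.sqrt((v ** 2).sum() + 1e-12))       # eq. (17)
```

If z carries no watermark generated with this PRA, the returned value is approximately standard normal, which is exactly the property the detection strength model of this section builds on.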
Since the recording position theoretically determines the time delay, we can calculate the likelihood of the detection strengths given the recording position. The position estimator finds the recording position which maximizes this likelihood.

2) Fast Maximization Using an Upper Bound: Our position estimation is a maximization problem of the likelihood function. Since deriving an analytical solution of the maximization is too difficult or even impossible, we must maximize the likelihood function by exhaustively searching for the best value from a set of possible parameter values, which is computationally expensive. To reduce the computational cost, we introduce pruning using an upper bound of the likelihood function. If the upper bound is lower than the maximum value that has been obtained so far, we need not search further with that value of the parameter.

B. Derivation of the Detection Strength Model

We model the detection strength by a normal distribution whose mean is determined by the recording position and the recording conditions. In this section, we determine the mean of the distribution from the position and shape of the peaks. The shape of a peak is determined by the watermarking algorithm, the recording conditions, and the HS. For the fine detection, the correlation of the PRA and the RS, s_c(i), is calculated every Δ samples in the RS; Δ must be smaller than the length of a tile, N. Therefore, strong correlation values are obtained not only at the exact starting position of the pattern block, but also around that time position, as in Fig. 6 (b). Furthermore, the recording conditions (i.e., the volume, the bandwidth of the recording device, and noise) and the HS affect the shape; these factors mainly alter the height of the peak.

Taking these into account, we compute the averaged shape of the detection strength peak as follows. First, a watermark signal with a single pattern block is generated using a PRA, and the watermark detector is applied to the signal. In the calculation of (17), since the pattern block is arranged repeatedly in the actual embedding process, we assume that the signal is periodic, and the detection strength is calculated for i = 0 to I - 1, where I is the repetition period. Since the pattern block in the signal starts at the beginning of the signal, the peak is at i = 0. This process is repeated using different PRAs. Then, for each i, the average of the detection strengths over the PRAs is calculated. We denote this averaged shape of the detection strength peak by g(i). Using g(i) and assuming that a pattern block starts at i = i′, we obtain the mean of the distribution as

μ(β, i′) = β·g(i - i′),   (19)

where β is a parameter which determines the height of the peak, dependent on the recording conditions and the HS.

As mentioned in Section III-C, the variance of the detection strength is asymptotically 1 for time positions i away from the peaks, because the mean of the numerator of (17) is 0 and thus the denominator can serve as a sample standard deviation of the numerator. On the other hand, the variance is not 1 for time positions close to the peaks, because the watermark signal shifts the mean of the distribution of the numerator. However, we ignore this to maintain simplicity.

C. Position of the Peak

We formulate the relationship between the recording position and the peaks of the detection strengths as follows.
We calculate the relative time delay of the peak time position of the cth channel, relative to a reference channel r, theoretically. Let x_m and x_sp^c denote the recording position and the position of the loudspeaker for the cth channel, respectively. The relative time delay of the cth channel is given, as a function of x_m, by

ι_c(x_m) = F_S·(‖x_sp^c - x_m‖ - ‖x_sp^r - x_m‖)/V_S,   (20)

where F_S is the sampling frequency and V_S is the speed of sound. From this equation, the time position of the peak of the cth channel is given as i_c(x_m, i_r) = i_r + ι_c(x_m), where i_r is the time position of the peak of the reference channel. To simplify the notation, we omit the argument (x_m, i_r), which is common for every c, unless it is ambiguous.

D. Derivation of the Position Estimator

Since the PRA is arranged repeatedly on the time-frequency plane, the detection strengths form a peak at the beginning of each pattern block. We segment the detection strengths into J detection strength blocks, as shown in Fig. 6 (c), so that each detection strength block contains a single peak. The length of a detection strength block is equal to that of a pattern block, I. A pattern block consists of W_B·N samples: there are W_B tiles in each row, and each tile occupies two consecutive frames overlapping each other by N/2 samples. Since the detection strength is calculated every Δ samples, the length of a detection strength block is I = W_B·N/Δ. The jth detection strength block of the cth channel, o_j^c, is represented as

o_j^c = (o_{j,0}^c, o_{j,1}^c, ..., o_{j,I-1}^c),   (21)

where o_{j,i}^c = s_c(jI + i). The detection strengths of the cth channel are denoted as

O^c = {o_0^c, o_1^c, ..., o_{J-1}^c}.   (22)

The value of J depends on the duration of the RS.

Now, we derive the maximum-likelihood estimator of the recording position. First, we calculate the probability of O = {O^1, O^2, ..., O^{N_C}}. Since the peak in o_j^c is at i_c, o_{j,i}^c follows the normal distribution N(μ(β_j^c, i_c), 1), where β_j^c is the height of the peak. Therefore, the conditional probability of o_{j,i}^c is

Pr[o_{j,i}^c | x_m, i_r, β_j^c] = (1/√(2π))·exp[-{o_{j,i}^c - μ(β_j^c, i_c)}²/2].   (23)

Thus, the conditional probability of O is given as

Pr[O | Θ] = Π_{c=1}^{N_C} Pr[O^c | Θ] = Π_{c=1}^{N_C} Π_{j=0}^{J-1} Pr[o_j^c | Θ]   (24)
          = Π_{c=1}^{N_C} Π_{j=0}^{J-1} Π_{i=0}^{I-1} Pr[o_{j,i}^c | Θ],   (25)
where Θ = {x_m, i_r, B} and B = {β_j^c | c = 1, 2, ..., N_C; j = 0, 1, ..., J - 1}. We define the log-likelihood function, L(Θ), as

L(Θ) = -Σ_{c=1}^{N_C} Σ_{j=0}^{J-1} Σ_{i=0}^{I-1} {o_{j,i}^c - μ(β_j^c, i_c)}²/2.   (26)

Eliminating β_j^c by setting ∂L(Θ)/∂β_j^c = 0, and ignoring the irrelevant terms, we obtain the following maximization criterion equivalent to (26):

L′(Θ′) = Σ_{c=1}^{N_C} Σ_{j=0}^{J-1} [Σ_{i=0}^{I-1} o_{j,i}^c·g(i - i_c)]²,   (27)

where Θ′ = {x_m, i_r}. The recording position is estimated by finding the parameters which maximize this criterion. That is, the maximum-likelihood estimator of Θ′ is

Θ̂′ = arg max_{Θ′} L′(Θ′),   (28)

and its element x̂_m is the maximum-likelihood estimator of x_m. The simplest solution for this maximization problem is an exhaustive search over the set of possible values of Θ′.

E. Maximization Algorithm to Reduce the Computational Cost

Finding the maximum of L′(Θ′) by exhaustive search is computationally too expensive, since the parameter space is three-dimensional when x_m is two-dimensional, and each possible Θ′ requires the calculation of (27). In this section, we propose an algorithm which can drastically reduce the computational cost by using an upper bound of L′(Θ′). We calculate the upper bound for each value of i_r. If the upper bound is lower than the maximum that has been obtained so far in the search, further search over x_m for that i_r is unnecessary.

We define the following function:

λ_c(i_c) = Σ_{j=0}^{J-1} [Σ_{i=0}^{I-1} o_{j,i}^c·g(i - i_c)]².   (29)

Since i_c can be calculated from Θ′, the maximization criterion (27) can be rewritten as

L′(Θ′) = Σ_{c=1}^{N_C} λ_c(i_c) = λ_r(i_r) + Σ_{c=1, c≠r}^{N_C} λ_c(i_c)   (30)

by separating out λ_r(i_r), which is irrelevant to x_m. In the exhaustive search, the summation on the rightmost side of the equation is maximized for each given i_r. That is,

max_{x_m} [λ_r(i_r) + Σ_{c≠r} λ_c(i_c)] = λ_r(i_r) + max_{x_m} Σ_{c≠r} λ_c(i_c).   (31)

Since the maximum of the last term is less than or equal to the sum of the maxima of λ_c(i_c), we obtain the following inequality:

max_{x_m} Σ_{c≠r} λ_c(i_c) ≤ Σ_{c≠r} max_{i_c} λ_c(i_c) = M.   (32)

The maximization of λ_c(i_c) is not too computationally expensive, since it is a maximization problem involving only one parameter, i_c. In other words, although i_c is determined by x_m and i_r, the maximization of λ_c(i_c) simply finds the best value of i_c regardless of x_m and i_r. Moreover, this maximization is done only once because it is irrelevant to the value of i_r. Thus, we obtain an upper bound u(i_r) of L′(Θ′) given i_r as

L′(Θ′) ≤ u(i_r) ≡ λ_r(i_r) + M.   (33)

Now that we have the upper bound, we can prune the search over x_m for a given i_r if u(i_r) is less than the maximum value that has already been computed for a different value of i_r. Figure 7 shows the maximization algorithm using the upper bound.

Fig. 7. Maximization algorithm:
  ĩ_c ← arg max_{i_c} λ_c(i_c) for each channel c
  r ← arg max_{c=1,...,N_C} λ_c(ĩ_c)
  current_maximum ← 0
  for i_r = ĩ_r to I - 1 and i_r = 0 to ĩ_r - 1 do
    if current_maximum < λ_r(i_r) + M then
      x̌_m ← search possible x_m exhaustively
      Θ̌′ ← {i_r, x̌_m}
      if L′(Θ̌′) > current_maximum then
        current_maximum ← L′(Θ̌′)
        Θ′_cand ← Θ̌′
      end if
    end if
  end for
  return Θ′_cand

This algorithm can drastically reduce the number of candidate i_r values, whereas the exhaustive search must find the x_m that maximizes L′(Θ′) for every i_r. Furthermore, the earlier we obtain large values, the more effective the pruning of the algorithm becomes.
Therefore, we choose the reference channel as

r = arg max_{c=1,...,N_C} λ_c(ĩ_c),   (34)

where

ĩ_c = arg max_{i_c} λ_c(i_c),   (35)

and the search begins from ĩ_r, where we can expect L′(Θ′) to be large.

V. EXPERIMENTAL RESULTS

To evaluate the estimation accuracy of our position estimation system, we conducted experiments in a circular auditorium with 250 seats. The effect of the watermarking rate α, which controls the volume of the watermark signals, on the estimation accuracy was investigated by simulation experiments. We also subjectively assessed the acoustic quality of the WHSs by using MUSHRA listening tests [21].
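As a concrete summary of the estimator in Section IV, the sketch below evaluates the relative-delay model of (20) and the criterion (27) over a grid of candidate positions, using the upper-bound pruning of Fig. 7. The sampling rate, loudspeaker coordinates, candidate grid, detection strength blocks O, and peak shape g are placeholder inputs of our own; this is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

F_S = 48000     # sampling frequency [Hz]; assumed value, not recovered from Table III
V_S = 340.0     # speed of sound [m/s], as in Table III

def relative_delay(x_m, spk, r, c):
    """Relative time delay iota_c(x_m) in samples, eq. (20)."""
    return F_S * (np.linalg.norm(spk[c] - x_m) - np.linalg.norm(spk[r] - x_m)) / V_S

def lam(c, i_c, O, g):
    """lambda_c(i_c) of eq. (29): per-block correlations with the peak shape g,
    shifted periodically to position i_c, squared and summed over the blocks."""
    I = O.shape[2]
    shifted = np.roll(g, int(round(i_c)) % I)
    return float(((O[c] * shifted).sum(axis=1) ** 2).sum())

def estimate_position(O, g, spk, grid):
    """Maximum-likelihood position search with upper-bound pruning (Fig. 7).

    O    : detection strength blocks, shape (N_C, J, I)
    g    : averaged detection strength peak shape, length I, peak at index 0
    spk  : loudspeaker positions, shape (N_C, 2)
    grid : iterable of candidate recording positions x_m (2-D points)
    """
    N_C, J, I = O.shape
    tilde_i = [int(np.argmax([lam(c, i, O, g) for i in range(I)])) for c in range(N_C)]
    r = int(np.argmax([lam(c, tilde_i[c], O, g) for c in range(N_C)]))   # eq. (34)
    M = sum(lam(c, tilde_i[c], O, g) for c in range(N_C) if c != r)      # eq. (32)

    best, best_pos = -np.inf, None
    order = list(range(tilde_i[r], I)) + list(range(0, tilde_i[r]))      # start near the peak
    for i_r in order:
        if lam(r, i_r, O, g) + M <= best:      # upper bound u(i_r), eq. (33): prune this i_r
            continue
        for x_m in grid:
            x_m = np.asarray(x_m, dtype=float)
            L = sum(lam(c, i_r + relative_delay(x_m, spk, r, c), O, g)
                    for c in range(N_C))       # criterion (27)/(30)
            if L > best:
                best, best_pos = L, x_m
    return best_pos
```

In the paper, this pruning reduced the execution time by 99.7% compared with the exhaustive search (Section V-B), without any loss of accuracy.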

TABLE I. THE TEST SAMPLES USED IN THE EXPERIMENTS (label, title, starting position, and per-channel RMS; DS1: Saw, DS2: Pretty Woman, DS3: The Bourne Identity, DS4: Harry Potter and the Goblet of Fire, DS5: RENT).

TABLE II. ROOT MEAN SQUARE (RMS) VALUES OF THE TEST SAMPLES (per channel, c = 1, 2, 3, in dB).

TABLE III. EXPERIMENTAL PARAMETERS.
  Number of tiles in a column of a pattern block  W_B   20
  Number of tiles in a row of a pattern block     H_B   24
  Height of a tile                                H_T   6
  Number of channels                              N_C   3
  Frame length [samples]                          N     512
  Detection shift [samples]                       Δ     16
  Sampling frequency [Hz]                         F_S
  Sound velocity [m/s]                            V_S   340

TABLE IV. ROOT MEAN SQUARE (RMS) VALUES OF THE WATERMARK SIGNALS (per channel, c = 1, 2, 3, in dB).

Fig. 8. The experimental environment for the estimation accuracy evaluation (loudspeaker and microphone positions in the x-y plane, in meters).

A. Estimation Accuracy Evaluation

To evaluate the estimation accuracy of our system in a semi-realistic environment, we conducted experiments in the Hankyu Sanwa Conference Hall in the Alumnus Union Building of the Osaka University Medical School. This is a circular auditorium with a radius of 8.8 m and 250 seats. Three loudspeakers and 16 microphones (represented by the dots in Fig. 8) were arranged in the same plane, as shown in Fig. 8. The experimental setup is shown in Fig. 9. We recorded the sound with all 16 microphones simultaneously. The volume of the two powered mixers was manually adjusted to be the same.

The test samples used in these experiments are listed in Table I. They are excerpts from the right (c = 1), center (c = 2), and left (c = 3) channels of the original movie soundtracks, and the starting positions were randomly chosen. The duration of each test sample is 1,800 seconds (30 minutes). The root mean square (RMS) values of each test sample are listed in Table II. The parameters used in the experiments are listed in Table III. To reduce the cross-correlation effects among the PRAs, we generated the PRAs for each test sample by exhaustively searching a set of PRAs for pairs that had low cross-correlation values. The watermarking rate α was set to 1.0. The RMS values of the watermark signals are listed in Table IV.

Figure 10 shows the estimation errors for each microphone position. Almost all microphone positions were accurately estimated, except for microphone positions (3, 4), (1, 4), and (-1, 4) for DS2, where the estimation errors were large. One of the reasons is that there were not enough watermark signals of the first and third channels in the RS to form peaks in the detection strengths: since the energy of the first and third channels of DS2 was low, the watermark embedder could not embed the watermark signals with sufficient energy. The directional characteristics of the loudspeakers and the distances from the loudspeakers to these microphones enhanced this energy imbalance. Furthermore, the effect of cross-correlation among the three PRAs enlarged the error.
If the first and second channels are correlated, the strong watermark signal of the second channel forms a false peak in the detection strength of the first channel, even if the correlation is weak. If the false peak is larger than the actual peak, the estimator cannot give the correct estimate. Therefore, in practical use, some technique that controls the volume of the watermark signals to balance the energy among the channels may be necessary.

The mean and standard deviation of the estimation errors over all of the microphone positions were 0.40 m and 1.33 m, respectively. Although the standard deviation is large due to the large errors of DS2, this is good enough to reduce the number of suspected seats to a few.

B. Watermarking Rate versus Estimation Accuracy

In the previous section, we showed that our system accurately estimated the recording positions for α = 1.0. However, the acoustic quality was heavily degraded because α was too large. To maintain the acoustic quality, the watermarking rate should be small so as to keep the energy of the watermark signals at a low level, which may cause larger estimation errors. We therefore investigated the relationship between α and the estimation errors by simulation experiments.

Fig. 9. The experimental setup (PCs, EDIROL UA-101 audio interfaces, audio-technica AT-MA2 microphone amplifiers, YAMAHA EMX66M and EMX312SC powered mixers, YAMAHA HS-50M loudspeakers, and SHURE SM63L microphones).

Fig. 10. Estimation errors of the experiment in the auditorium, per test sample (DS1-DS5) and microphone position.

Fig. 11. The relationship between the watermarking rate and the estimation error. The means and the standard deviations of the estimation errors are calculated for each α value.

First, we model the RS received by the microphone at position x as

z_x(t) = Σ_{c=1}^{N_C} y_c(t) * h_x^c(t) + n_B(t),   (36)

where y_c(t) is the WHS of the cth channel, h_x^c(t) is the impulse response of the path from the loudspeaker for the cth channel to the microphone at x, n_B(t) is the noise (including background noise and thermal noise), and * is the convolution operator. We measured the impulse responses h_x^c(t) by the time-stretched pulse method [22] under the same experimental setup as in Section V-A. The noise n_B(t) is assumed to follow the normal distribution N(0, σ_B²), and its variance σ_B² is determined from an RS recorded with no sound coming from the loudspeakers. Although the impulse response characterizes the linear aspects of the experimental system (i.e., the powered mixers, the loudspeakers, the microphones, and so forth), the experimental system actually has nonlinear aspects (e.g., amplifier clipping). However, the effect of the nonlinearity is small in general, and thus we consider that the difference between the detection strengths of the simulated RS and of the actually recorded RS can be neglected.

Applying (36) to the WHSs with various values of α, we generated simulated versions of the RSs. The other parameter values were the same as in Section V-A. The mean and standard deviation of the estimation errors were calculated for each α. The result is shown in Fig. 11. The mean and standard deviation of the estimation error for α = 1.0 are 0.41 m and 1.26 m, respectively. These are close enough to the result in Section V-A, so we consider the result of this simulation experiment to be reliable. The mean of the estimation errors was large for α < 0.1. Meanwhile, the microphone positions were estimated with small errors for α ≥ 0.1, although the standard deviations were relatively large due to the large estimation errors of DS2, as mentioned in Section V-A. The mean of the estimation error for α = 0.1 was 0.44 m. This result indicates that, in this experimental environment, the peaks of the detection strengths are buried in the noise for α < 0.1. In other words, we can reduce the value of α to as small as 0.1 without significant estimation errors. Note that the appropriate value of α may depend on the frequency response of the acoustic system of the auditorium, including the background noise.
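As a rough illustration of the simulation model (36), the sketch below convolves each WHS with a measured impulse response and adds Gaussian background noise. How the impulse responses and the noise variance are obtained is outside this snippet; they are simply assumed to be given, so this is a sketch of the model rather than the authors' measurement procedure.

```python
import numpy as np
from scipy.signal import fftconvolve

def simulate_recording(whs, irs, sigma_b, seed=0):
    """Simulated RS at one microphone position, following eq. (36).

    whs     : list of N_C watermarked host signals y_c(t)
    irs     : list of N_C impulse responses h_x^c(t) for that position
    sigma_b : standard deviation of the background/thermal noise n_B(t)
    """
    length = max(len(y) + len(h) - 1 for y, h in zip(whs, irs))
    z = np.zeros(length)
    for y, h in zip(whs, irs):
        conv = fftconvolve(y, h)                 # y_c(t) * h_x^c(t)
        z[: len(conv)] += conv
    z += np.random.default_rng(seed).normal(0.0, sigma_b, size=length)  # n_B(t)
    return z
```

In the paper, the impulse responses were measured with the time-stretched pulse method and σ_B was estimated from a recording made with no sound from the loudspeakers; sweeping α in the embedder and feeding the resulting WHSs through this model corresponds to the simulation procedure behind Fig. 11.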
To show the effectiveness of the algorithm in reducing the computational cost, we also measured, in this experiment, the time needed to estimate the positions. A PC with an Intel Core 2 Duo processor running at 1.6 GHz, using Windows XP (Service Pack 2) with 1 GB of memory, was used in these experiments. The average time was 596 seconds to process a 1,800-second RS with three embedded watermark signals. For comparison, we also measured the time to estimate the positions with the exhaustive-search version. However, since this was very time consuming, the estimation was executed only twice; the average time for these two estimations was 179,573 seconds. Hence, the proposed algorithm achieved a 99.7% reduction in execution time compared to the exhaustive search, without any loss of accuracy.

TABLE V. THE SAMPLES USED IN THE SUBJECTIVE ASSESSMENT OF THE ACOUSTIC QUALITY.
  Label  Excerpt from  Starts at  Ends at
  SUB1   DS2           454 [s]    473 [s]
  SUB2   DS3           111 [s]    129 [s]
  SUB3   DS4           1,229 [s]  1,248 [s]
  SUB4   DS5           326 [s]    349 [s]
  SUB5   DS2           1,229 [s]  1,046 [s]

TABLE VI. THE DESCRIPTIONS OF THE TEST SIGNALS USED IN THE SUBJECTIVE ASSESSMENT OF THE ACOUSTIC QUALITY.
  Label  Description
  REF    Reference signal
  HREF   Hidden reference
  ALPF   Low-pass filtered signal as an anchor
  AM48   Compressed signal using MP3 at 48 kbps as an anchor
  AM32   Compressed signal using MP3 at 32 kbps as an anchor
  WR01   Watermarked signal with α = 0.1
  WR03   Watermarked signal with α = 0.3
  WR05   Watermarked signal with α = 0.5

C. Subjective Evaluation of Acoustic Quality

We subjectively assessed the acoustic quality of the WHSs by using MUSHRA listening tests [21]. MUSHRA is a method for assessing the acoustic quality of audio signals that have undergone some audio signal processing, such as encoding and decoding. A subject listens to multiple audio signals, including not only the signal processed by the system under test but also the original signal, called the hidden reference, and signals processed by other systems, called anchors, for comparison, and is required to grade all of the signals relative to one another.

In this assessment, we used the test samples listed in Table V, which are excerpts from the samples used in Section V-A. Each of the test samples was processed as described in Table VI. For each test sample, 17 inexperienced listeners, who underwent training sessions in which they were exposed in advance to all of the signals used in the tests, graded the processed signals.

Fig. 12. Listening position in the room for the (a) condition (loudspeakers at (0, 0), (3, 0), and (0, 3), and the listening position at (3, 3), in meters).

TABLE VII. SUMMARY OF THE CONDITIONS UNDER WHICH THE ACOUSTIC QUALITY WAS ASSESSED.
  Listening method   Loudspeaker   Headphone
  Office room        (a)           (b)
  Auditorium                       (c)

Since MUSHRA listening tests take a long time, we could not conduct the subjective listening tests in the auditorium with the loudspeakers. Instead, the subjects assessed the test signals under the following conditions.
(a) Assessment in a small office with three loudspeakers. The subjects were at the listening position corresponding to (3, 3) in a 6 × 6 m² office, as shown in Fig. 12, and assessed the test signals coming from the three loudspeakers.
(b) Assessment of signals simulating listening in the office, using headphones. The test signals were convolved with the impulse responses measured by a dummy head at the listening position in the same room as used for the (a) condition, and the subjects listened to the simulated signals with headphones.
(c) Assessment of signals simulating listening in the auditorium, using headphones.
This condition is almost the same as (b), but the impulse responses were measured at (0, 6) in the auditorium of Fig. 8.

These conditions are summarized in Table VII. Since the test signals for (b) were generated using the impulse responses measured in the same room as used for (a), the results of (a) and (b) should be similar. If this is satisfied, the results of (c) can be considered similar to those of a subjective assessment in which the subjects listen to the test signals from the loudspeakers in the auditorium.

Fig. 13. The means and the 95% confidence intervals for the acoustic quality of the test signals under (a) for all of the subjects.

Fig. 14. The means and the 95% confidence intervals for the acoustic quality of the test signals under (b) for all of the subjects.

Fig. 15. The means and the 95% confidence intervals for the acoustic quality of the test signals under (c) for all of the subjects.

The means and 95% confidence intervals for the acoustic quality of the test signals under (a) and (b) are shown in Figs. 13 and 14, respectively. The degradation of the acoustic quality for WR01 and WR03 was almost imperceptible, and that for WR05 was perceptible, though still acceptably low. We can say that the subjective acoustic qualities under (a) and (b) were almost the same. Therefore, the results under (c) should be similar to the results that would have been obtained had the subjects assessed the acoustic quality in the auditorium. Figure 15 shows the means and 95% confidence intervals for the acoustic quality of the test signals under (c). Although the watermark signals were relatively audible compared to (a) or (b), the subjective acoustic qualities of WR01 and WR03 were still good enough for practical use.

D. Discussion

From the results of Sections V-B and V-C, with α = 0.1, our system was able to estimate the recording position with a mean estimation error of 0.44 m while the subjective acoustic quality remained in the range of excellent quality. By increasing α to 0.3, the estimation error can be reduced to 0.34 m at the expense of acoustic quality degradation down to the range of good quality. Therefore, we successfully showed that the proposed system is able to estimate the recording position without significantly spoiling the acoustic quality of movie soundtracks. However, the difference between the results of (b) and (c) indicates that the acoustic quality depends largely on the environment in which the system is used. It is also supposed that the estimation accuracy depends on the frequency response of the auditorium, the background noise, and so forth. Hence, a preliminary experiment in the actual environment is needed before practical use to determine the appropriate value of α.

VI. CONCLUSION

In this paper, we have presented a position estimation system to prevent camcorder piracy in theaters as a new application of the audio watermarking technique. The core idea of our system is to utilize the delays of the multiple-channel watermark signals in a recorded signal. Our system consists of a watermarking algorithm and a position estimator. The presented watermarking algorithm is designed to obtain the delay times accurately. To implement the position estimation from recorded movie soundtracks, we have developed a position estimator using a stochastic model of the detection strengths. The long duration of the recorded movie soundtrack enables us to improve the estimation accuracy. Our experimental results showed that the system was able to estimate the recording position with a mean estimation error of 0.44 m without significantly spoiling the acoustic quality, as assessed by MUSHRA listening tests. However, the acoustic quality seemed to depend on the environment in which the system is used. To clarify the effect of environmental factors (i.e., the frequency response of the auditorium, the background noise, and so forth) on the acoustic quality and the estimation accuracy, we need more experiments in various environments. Furthermore, the robustness of our system against attacks such as pitch shifting and lossy compression should be investigated. Pitch shifting is especially serious since it may be caused by a slight difference between the sampling rates of the playback device and the recording device. A collusion attack is also a tough problem: our system may estimate an irrelevant position if two or more recordings made at different positions are mixed.
However, it would be possible to detect the collusion attack by examining the detection strength.

REFERENCES

[1] 2005 US Piracy Fact Sheet. Motion Picture Association of America. [Online]. Available:
[2] Anti-Piracy Fact Sheet, Asia-Pacific Region. Motion Picture Association. [Online]. Available:
[3] J. A. Bloom and C. Polyzois, "Watermarking to track motion picture theft," in Proc. of Signals, Systems and Computers, Conference Record of the Thirty-Eighth Asilomar Conference on, vol. 1, 2004.
[4] S. Byers, L. Cranor, E. Cronin, D. Kormann, and P. McDaniel, "Analysis of security vulnerabilities in the movie production and distribution process," Telecommunications Policy, vol. 28, no. 7-8, August-September.
[5] J. Haitsma and T. Kalker, "A watermarking scheme for digital cinema," in Proc. of the International Conference on Image Processing, vol. 2, October 2001.
[6] P. Nguyen, R. Balter, N. Montfort, and S. Baudry, "Registration methods for non-blind watermark detection in digital cinema applications," in Proc. of Security and Watermarking of Multimedia Contents V, SPIE vol. 5020, June 2003.
[7] J. Lubin, J. A. Bloom, and H. Cheng, "Robust, content-dependent, high-fidelity watermark for tracking in digital cinema," in Security and Watermarking of Multimedia Contents V, Proc. of SPIE, vol. 5020, January.
[8] R. Tachibana, "Sonic watermarking," EURASIP Journal on Applied Signal Processing, vol. 13.
[9] S. Gohshi, H. Nakamura, H. Ito, R. Fujii, M. Suzuki, S. Takai, and Y. Tani, "A new watermark surviving after re-shooting the images displayed on a screen," KES (2), vol. 3682.
[10] M. D. Swanson, B. Zhu, A. H. Tewfik, and L. Boney, "Robust audio watermarking using perceptual masking," Signal Processing, vol. 66.
[11] D. Kirovski and H. S. Malvar, "Spread-spectrum watermarking of audio signals," IEEE Transactions on Signal Processing, vol. 51, no. 4.
[12] R. Tachibana, S. Shimizu, S. Kobayashi, and T. Nakamura, "An audio watermarking method using a two-dimensional pseudo-random array," Signal Processing, vol. 82.
[13] D. Gruhl, A. Lu, and W. Bender, "Echo hiding," in Proc. of the First International Workshop on Information Hiding, vol. 1174, 1996.
[14] B.-S. Ko, R. Nishimura, and Y. Suzuki, "Time-spread echo method for digital audio watermarking," IEEE Transactions on Multimedia, vol. 7, no. 2, April.
[15] N. Lazic and P. Aarabi, "Communication over an acoustic channel using data hiding techniques," IEEE Transactions on Multimedia, vol. 8, no. 5.
[16] Y. Nakashima, R. Tachibana, M. Nishimura, and N. Babaguchi, "Estimation of recording location using audio watermarking," in Proc. of ACM Multimedia and Security Workshop 2006, Geneva, September 2006.
[17] Y. Nakashima, R. Tachibana, M. Nishimura, and N. Babaguchi, "Determining recording location based on synchronization positions of audio watermarking," in Proc. of the International Conference on Acoustics, Speech, and Signal Processing 2007, Hawaii, April 2007, pp. II-253 to II-256.
[18] Y. Nakashima, R. Tachibana, M. Nishimura, and N. Babaguchi, "Maximum-likelihood estimation of recording position based on audio watermarking," in Proc. of the Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing.
[19] E. Zwicker and H. Fastl, Psychoacoustics: Facts and Models. Springer-Verlag.
[20] Information technology - Coding of moving pictures and associated audio for digital storage media up to about 1.5 Mbit/s - Part 3: Audio, ISO/IEC Std., 1993.
[21] Method for the subjective assessment of intermediate quality levels of coding systems, ITU Std. BS.1534.
[22] Y. Suzuki, F. Asano, H.-Y. Kim, and T.
Sone, "An optimum computer-generated pulse signal suitable for the measurement of very long impulse responses," The Journal of the Acoustical Society of America, vol. 97, no. 2, February.

Yuta Nakashima received the B.E. and M.E. degrees in communication engineering from Osaka University, Osaka, Japan, in 2006 and 2008, respectively. He was with Texas Instruments Japan Limited, where he engaged in research and development on audio signal processing. He is currently pursuing the doctoral degree at Osaka University.

Ryuki Tachibana is a researcher at the Tokyo Research Laboratory of IBM Japan. He received his B.E. and M.E. degrees in aerospace engineering from the University of Tokyo, Japan, in 1996 and 1998, and his Dr. Eng. degree from Osaka University. Since he joined IBM Japan in 1998, his main research interests have been in the fields of digital audio watermarking and text-to-speech synthesis. In 2003, he was awarded the Digital Watermarking Industry Gathering Event's Best Paper Award at Security and Watermarking of Multimedia Contents V of Electronic Imaging. He is a member of the ASJ and the IEICE.

Noboru Babaguchi (M'90, SM'07) received the B.E., M.E., and Ph.D. degrees in communication engineering from Osaka University in 1979, 1981, and 1984, respectively. He is currently a Professor in the Department of Communication Engineering, Osaka University. From 1996 to 1997, he was a Visiting Scholar at the University of California, San Diego. His research interests include image analysis, multimedia computing, and intelligent systems, currently content-based video indexing and summarization. He has published over 100 journal and conference papers and several textbooks. Dr. Babaguchi received the Best Paper Award of the 2006 Pacific-Rim Conference on Multimedia (PCM2006). He is on the editorial boards of Multimedia Tools and Applications and New Generation Computing. He served as a Workshop Co-chair of the 3rd International Workshop on Multimedia Information Retrieval (MIR2001), a Track Co-chair of the 2006 IEEE International Conference on Multimedia & Expo (ICME2006), and a General Co-chair of the 14th International MultiMedia Modeling Conference (MMM2008). He has also served on the program committees of international conferences in these fields. He is a senior member of the IEEE, and a member of the ACM, the IEICE, the IPSJ, the ITE, and the JSAI.


More information

ROBUST echo cancellation requires a method for adjusting

ROBUST echo cancellation requires a method for adjusting 1030 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 3, MARCH 2007 On Adjusting the Learning Rate in Frequency Domain Echo Cancellation With Double-Talk Jean-Marc Valin, Member,

More information

Omnidirectional Sound Source Tracking Based on Sequential Updating Histogram

Omnidirectional Sound Source Tracking Based on Sequential Updating Histogram Proceedings of APSIPA Annual Summit and Conference 5 6-9 December 5 Omnidirectional Sound Source Tracking Based on Sequential Updating Histogram Yusuke SHIIKI and Kenji SUYAMA School of Engineering, Tokyo

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb 2009. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence

More information

Chapter IV THEORY OF CELP CODING

Chapter IV THEORY OF CELP CODING Chapter IV THEORY OF CELP CODING CHAPTER IV THEORY OF CELP CODING 4.1 Introduction Wavefonn coders fail to produce high quality speech at bit rate lower than 16 kbps. Source coders, such as LPC vocoders,

More information

WIND SPEED ESTIMATION AND WIND-INDUCED NOISE REDUCTION USING A 2-CHANNEL SMALL MICROPHONE ARRAY

WIND SPEED ESTIMATION AND WIND-INDUCED NOISE REDUCTION USING A 2-CHANNEL SMALL MICROPHONE ARRAY INTER-NOISE 216 WIND SPEED ESTIMATION AND WIND-INDUCED NOISE REDUCTION USING A 2-CHANNEL SMALL MICROPHONE ARRAY Shumpei SAKAI 1 ; Tetsuro MURAKAMI 2 ; Naoto SAKATA 3 ; Hirohumi NAKAJIMA 4 ; Kazuhiro NAKADAI

More information

Sound Processing Technologies for Realistic Sensations in Teleworking

Sound Processing Technologies for Realistic Sensations in Teleworking Sound Processing Technologies for Realistic Sensations in Teleworking Takashi Yazu Makoto Morito In an office environment we usually acquire a large amount of information without any particular effort

More information

Method to Improve Watermark Reliability. Adam Brickman. EE381K - Multidimensional Signal Processing. May 08, 2003 ABSTRACT

Method to Improve Watermark Reliability. Adam Brickman. EE381K - Multidimensional Signal Processing. May 08, 2003 ABSTRACT Method to Improve Watermark Reliability Adam Brickman EE381K - Multidimensional Signal Processing May 08, 2003 ABSTRACT This paper presents a methodology for increasing audio watermark robustness. The

More information

Multiple Sound Sources Localization Using Energetic Analysis Method

Multiple Sound Sources Localization Using Energetic Analysis Method VOL.3, NO.4, DECEMBER 1 Multiple Sound Sources Localization Using Energetic Analysis Method Hasan Khaddour, Jiří Schimmel Department of Telecommunications FEEC, Brno University of Technology Purkyňova

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb 2008. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum,

More information

Encoding a Hidden Digital Signature onto an Audio Signal Using Psychoacoustic Masking

Encoding a Hidden Digital Signature onto an Audio Signal Using Psychoacoustic Masking The 7th International Conference on Signal Processing Applications & Technology, Boston MA, pp. 476-480, 7-10 October 1996. Encoding a Hidden Digital Signature onto an Audio Signal Using Psychoacoustic

More information

High capacity robust audio watermarking scheme based on DWT transform

High capacity robust audio watermarking scheme based on DWT transform High capacity robust audio watermarking scheme based on DWT transform Davod Zangene * (Sama technical and vocational training college, Islamic Azad University, Mahshahr Branch, Mahshahr, Iran) davodzangene@mail.com

More information

SIGNALS AND SYSTEMS LABORATORY 13: Digital Communication

SIGNALS AND SYSTEMS LABORATORY 13: Digital Communication SIGNALS AND SYSTEMS LABORATORY 13: Digital Communication INTRODUCTION Digital Communication refers to the transmission of binary, or digital, information over analog channels. In this laboratory you will

More information

Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution

Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution PAGE 433 Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution Wenliang Lu, D. Sen, and Shuai Wang School of Electrical Engineering & Telecommunications University of New South Wales,

More information

Live multi-track audio recording

Live multi-track audio recording Live multi-track audio recording Joao Luiz Azevedo de Carvalho EE522 Project - Spring 2007 - University of Southern California Abstract In live multi-track audio recording, each microphone perceives sound

More information

Audio Watermarking Based on Multiple Echoes Hiding for FM Radio

Audio Watermarking Based on Multiple Echoes Hiding for FM Radio INTERSPEECH 2014 Audio Watermarking Based on Multiple Echoes Hiding for FM Radio Xuejun Zhang, Xiang Xie Beijing Institute of Technology Zhangxuejun0910@163.com,xiexiang@bit.edu.cn Abstract An audio watermarking

More information

Lecture 3 Concepts for the Data Communications and Computer Interconnection

Lecture 3 Concepts for the Data Communications and Computer Interconnection Lecture 3 Concepts for the Data Communications and Computer Interconnection Aim: overview of existing methods and techniques Terms used: -Data entities conveying meaning (of information) -Signals data

More information

BEAT DETECTION BY DYNAMIC PROGRAMMING. Racquel Ivy Awuor

BEAT DETECTION BY DYNAMIC PROGRAMMING. Racquel Ivy Awuor BEAT DETECTION BY DYNAMIC PROGRAMMING Racquel Ivy Awuor University of Rochester Department of Electrical and Computer Engineering Rochester, NY 14627 rawuor@ur.rochester.edu ABSTRACT A beat is a salient

More information

Performance Evaluation of STBC-OFDM System for Wireless Communication

Performance Evaluation of STBC-OFDM System for Wireless Communication Performance Evaluation of STBC-OFDM System for Wireless Communication Apeksha Deshmukh, Prof. Dr. M. D. Kokate Department of E&TC, K.K.W.I.E.R. College, Nasik, apeksha19may@gmail.com Abstract In this paper

More information

THE STATISTICAL ANALYSIS OF AUDIO WATERMARKING USING THE DISCRETE WAVELETS TRANSFORM AND SINGULAR VALUE DECOMPOSITION

THE STATISTICAL ANALYSIS OF AUDIO WATERMARKING USING THE DISCRETE WAVELETS TRANSFORM AND SINGULAR VALUE DECOMPOSITION THE STATISTICAL ANALYSIS OF AUDIO WATERMARKING USING THE DISCRETE WAVELETS TRANSFORM AND SINGULAR VALUE DECOMPOSITION Mr. Jaykumar. S. Dhage Assistant Professor, Department of Computer Science & Engineering

More information

Background Dirty Paper Coding Codeword Binning Code construction Remaining problems. Information Hiding. Phil Regalia

Background Dirty Paper Coding Codeword Binning Code construction Remaining problems. Information Hiding. Phil Regalia Information Hiding Phil Regalia Department of Electrical Engineering and Computer Science Catholic University of America Washington, DC 20064 regalia@cua.edu Baltimore IEEE Signal Processing Society Chapter,

More information

THE problem of acoustic echo cancellation (AEC) was

THE problem of acoustic echo cancellation (AEC) was IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 13, NO. 6, NOVEMBER 2005 1231 Acoustic Echo Cancellation and Doubletalk Detection Using Estimated Loudspeaker Impulse Responses Per Åhgren Abstract

More information

Localized Robust Audio Watermarking in Regions of Interest

Localized Robust Audio Watermarking in Regions of Interest Localized Robust Audio Watermarking in Regions of Interest W Li; X Y Xue; X Q Li Department of Computer Science and Engineering University of Fudan, Shanghai 200433, P. R. China E-mail: weili_fd@yahoo.com

More information

Surround: The Current Technological Situation. David Griesinger Lexicon 3 Oak Park Bedford, MA

Surround: The Current Technological Situation. David Griesinger Lexicon 3 Oak Park Bedford, MA Surround: The Current Technological Situation David Griesinger Lexicon 3 Oak Park Bedford, MA 01730 www.world.std.com/~griesngr There are many open questions 1. What is surround sound 2. Who will listen

More information

DWT based high capacity audio watermarking

DWT based high capacity audio watermarking LETTER DWT based high capacity audio watermarking M. Fallahpour, student member and D. Megias Summary This letter suggests a novel high capacity robust audio watermarking algorithm by using the high frequency

More information

Different Approaches of Spectral Subtraction Method for Speech Enhancement

Different Approaches of Spectral Subtraction Method for Speech Enhancement ISSN 2249 5460 Available online at www.internationalejournals.com International ejournals International Journal of Mathematical Sciences, Technology and Humanities 95 (2013 1056 1062 Different Approaches

More information

Lecture 9: Spread Spectrum Modulation Techniques

Lecture 9: Spread Spectrum Modulation Techniques Lecture 9: Spread Spectrum Modulation Techniques Spread spectrum (SS) modulation techniques employ a transmission bandwidth which is several orders of magnitude greater than the minimum required bandwidth

More information

Chapter 2: Digitization of Sound

Chapter 2: Digitization of Sound Chapter 2: Digitization of Sound Acoustics pressure waves are converted to electrical signals by use of a microphone. The output signal from the microphone is an analog signal, i.e., a continuous-valued

More information

ECMA-108. Measurement of Highfrequency. emitted by Information Technology and Telecommunications Equipment. 4 th Edition / December 2008

ECMA-108. Measurement of Highfrequency. emitted by Information Technology and Telecommunications Equipment. 4 th Edition / December 2008 ECMA-108 4 th Edition / December 2008 Measurement of Highfrequency Noise emitted by Information Technology and Telecommunications Equipment COPYRIGHT PROTECTED DOCUMENT Ecma International 2008 Standard

More information

Spread Spectrum Techniques

Spread Spectrum Techniques 0 Spread Spectrum Techniques Contents 1 1. Overview 2. Pseudonoise Sequences 3. Direct Sequence Spread Spectrum Systems 4. Frequency Hopping Systems 5. Synchronization 6. Applications 2 1. Overview Basic

More information

Laboratory Assignment 2 Signal Sampling, Manipulation, and Playback

Laboratory Assignment 2 Signal Sampling, Manipulation, and Playback Laboratory Assignment 2 Signal Sampling, Manipulation, and Playback PURPOSE This lab will introduce you to the laboratory equipment and the software that allows you to link your computer to the hardware.

More information

ECE 476/ECE 501C/CS Wireless Communication Systems Winter Lecture 6: Fading

ECE 476/ECE 501C/CS Wireless Communication Systems Winter Lecture 6: Fading ECE 476/ECE 501C/CS 513 - Wireless Communication Systems Winter 2005 Lecture 6: Fading Last lecture: Large scale propagation properties of wireless systems - slowly varying properties that depend primarily

More information

Perception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner.

Perception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner. Perception of pitch AUDL4007: 11 Feb 2010. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum, 2005 Chapter 7 1 Definitions

More information

A Comparison of the Convolutive Model and Real Recording for Using in Acoustic Echo Cancellation

A Comparison of the Convolutive Model and Real Recording for Using in Acoustic Echo Cancellation A Comparison of the Convolutive Model and Real Recording for Using in Acoustic Echo Cancellation SEPTIMIU MISCHIE Faculty of Electronics and Telecommunications Politehnica University of Timisoara Vasile

More information

A Soft-Limiting Receiver Structure for Time-Hopping UWB in Multiple Access Interference

A Soft-Limiting Receiver Structure for Time-Hopping UWB in Multiple Access Interference 2006 IEEE Ninth International Symposium on Spread Spectrum Techniques and Applications A Soft-Limiting Receiver Structure for Time-Hopping UWB in Multiple Access Interference Norman C. Beaulieu, Fellow,

More information

A Computational Efficient Method for Assuring Full Duplex Feeling in Hands-free Communication

A Computational Efficient Method for Assuring Full Duplex Feeling in Hands-free Communication A Computational Efficient Method for Assuring Full Duplex Feeling in Hands-free Communication FREDRIC LINDSTRÖM 1, MATTIAS DAHL, INGVAR CLAESSON Department of Signal Processing Blekinge Institute of Technology

More information

DESIGN OF VOICE ALARM SYSTEMS FOR TRAFFIC TUNNELS: OPTIMISATION OF SPEECH INTELLIGIBILITY

DESIGN OF VOICE ALARM SYSTEMS FOR TRAFFIC TUNNELS: OPTIMISATION OF SPEECH INTELLIGIBILITY DESIGN OF VOICE ALARM SYSTEMS FOR TRAFFIC TUNNELS: OPTIMISATION OF SPEECH INTELLIGIBILITY Dr.ir. Evert Start Duran Audio BV, Zaltbommel, The Netherlands The design and optimisation of voice alarm (VA)

More information

FOURIER analysis is a well-known method for nonparametric

FOURIER analysis is a well-known method for nonparametric 386 IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 54, NO. 1, FEBRUARY 2005 Resonator-Based Nonparametric Identification of Linear Systems László Sujbert, Member, IEEE, Gábor Péceli, Fellow,

More information

Introduction to More Advanced Steganography. John Ortiz. Crucial Security Inc. San Antonio

Introduction to More Advanced Steganography. John Ortiz. Crucial Security Inc. San Antonio Introduction to More Advanced Steganography John Ortiz Crucial Security Inc. San Antonio John.Ortiz@Harris.com 210 977-6615 11/17/2011 Advanced Steganography 1 Can YOU See the Difference? Which one of

More information

EWGAE 2010 Vienna, 8th to 10th September

EWGAE 2010 Vienna, 8th to 10th September EWGAE 2010 Vienna, 8th to 10th September Frequencies and Amplitudes of AE Signals in a Plate as a Function of Source Rise Time M. A. HAMSTAD University of Denver, Department of Mechanical and Materials

More information

Acoustic Communication System Using Mobile Terminal Microphones

Acoustic Communication System Using Mobile Terminal Microphones Acoustic Communication System Using Mobile Terminal Microphones Hosei Matsuoka, Yusuke Nakashima and Takeshi Yoshimura DoCoMo has developed a data transmission technology called Acoustic OFDM that embeds

More information

Performance Analysis of Parallel Acoustic Communication in OFDM-based System

Performance Analysis of Parallel Acoustic Communication in OFDM-based System Performance Analysis of Parallel Acoustic Communication in OFDM-based System Junyeong Bok, Heung-Gyoon Ryu Department of Electronic Engineering, Chungbuk ational University, Korea 36-763 bjy84@nate.com,

More information

Proceedings of the 5th WSEAS Int. Conf. on SIGNAL, SPEECH and IMAGE PROCESSING, Corfu, Greece, August 17-19, 2005 (pp17-21)

Proceedings of the 5th WSEAS Int. Conf. on SIGNAL, SPEECH and IMAGE PROCESSING, Corfu, Greece, August 17-19, 2005 (pp17-21) Ambiguity Function Computation Using Over-Sampled DFT Filter Banks ENNETH P. BENTZ The Aerospace Corporation 5049 Conference Center Dr. Chantilly, VA, USA 90245-469 Abstract: - This paper will demonstrate

More information

MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS

MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS 1 S.PRASANNA VENKATESH, 2 NITIN NARAYAN, 3 K.SAILESH BHARATHWAAJ, 4 M.P.ACTLIN JEEVA, 5 P.VIJAYALAKSHMI 1,2,3,4,5 SSN College of Engineering,

More information

EFFECTS OF PHASE AND AMPLITUDE ERRORS ON QAM SYSTEMS WITH ERROR- CONTROL CODING AND SOFT DECISION DECODING

EFFECTS OF PHASE AND AMPLITUDE ERRORS ON QAM SYSTEMS WITH ERROR- CONTROL CODING AND SOFT DECISION DECODING Clemson University TigerPrints All Theses Theses 8-2009 EFFECTS OF PHASE AND AMPLITUDE ERRORS ON QAM SYSTEMS WITH ERROR- CONTROL CODING AND SOFT DECISION DECODING Jason Ellis Clemson University, jellis@clemson.edu

More information

On Event Signal Reconstruction in Wireless Sensor Networks

On Event Signal Reconstruction in Wireless Sensor Networks On Event Signal Reconstruction in Wireless Sensor Networks Barış Atakan and Özgür B. Akan Next Generation Wireless Communications Laboratory Department of Electrical and Electronics Engineering Middle

More information

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,

More information

EE 791 EEG-5 Measures of EEG Dynamic Properties

EE 791 EEG-5 Measures of EEG Dynamic Properties EE 791 EEG-5 Measures of EEG Dynamic Properties Computer analysis of EEG EEG scientists must be especially wary of mathematics in search of applications after all the number of ways to transform data is

More information

1.Discuss the frequency domain techniques of image enhancement in detail.

1.Discuss the frequency domain techniques of image enhancement in detail. 1.Discuss the frequency domain techniques of image enhancement in detail. Enhancement In Frequency Domain: The frequency domain methods of image enhancement are based on convolution theorem. This is represented

More information

Digital Audio Watermarking With Discrete Wavelet Transform Using Fibonacci Numbers

Digital Audio Watermarking With Discrete Wavelet Transform Using Fibonacci Numbers Digital Audio Watermarking With Discrete Wavelet Transform Using Fibonacci Numbers P. Mohan Kumar 1, Dr. M. Sailaja 2 M. Tech scholar, Dept. of E.C.E, Jawaharlal Nehru Technological University Kakinada,

More information

Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter

Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter 1 Gupteswar Sahu, 2 D. Arun Kumar, 3 M. Bala Krishna and 4 Jami Venkata Suman Assistant Professor, Department of ECE,

More information

Audio Watermark Detection Improvement by Using Noise Modelling

Audio Watermark Detection Improvement by Using Noise Modelling Audio Watermark Detection Improvement by Using Noise Modelling NEDELJKO CVEJIC, TAPIO SEPPÄNEN*, DAVID BULL Dept. of Electrical and Electronic Engineering University of Bristol Merchant Venturers Building,

More information

CHAPTER. delta-sigma modulators 1.0

CHAPTER. delta-sigma modulators 1.0 CHAPTER 1 CHAPTER Conventional delta-sigma modulators 1.0 This Chapter presents the traditional first- and second-order DSM. The main sources for non-ideal operation are described together with some commonly

More information

High-speed Noise Cancellation with Microphone Array

High-speed Noise Cancellation with Microphone Array Noise Cancellation a Posteriori Probability, Maximum Criteria Independent Component Analysis High-speed Noise Cancellation with Microphone Array We propose the use of a microphone array based on independent

More information

Maximum Likelihood Sequence Detection (MLSD) and the utilization of the Viterbi Algorithm

Maximum Likelihood Sequence Detection (MLSD) and the utilization of the Viterbi Algorithm Maximum Likelihood Sequence Detection (MLSD) and the utilization of the Viterbi Algorithm Presented to Dr. Tareq Al-Naffouri By Mohamed Samir Mazloum Omar Diaa Shawky Abstract Signaling schemes with memory

More information

Analysis of Processing Parameters of GPS Signal Acquisition Scheme

Analysis of Processing Parameters of GPS Signal Acquisition Scheme Analysis of Processing Parameters of GPS Signal Acquisition Scheme Prof. Vrushali Bhatt, Nithin Krishnan Department of Electronics and Telecommunication Thakur College of Engineering and Technology Mumbai-400101,

More information

DIGITAL IMAGE PROCESSING Quiz exercises preparation for the midterm exam

DIGITAL IMAGE PROCESSING Quiz exercises preparation for the midterm exam DIGITAL IMAGE PROCESSING Quiz exercises preparation for the midterm exam In the following set of questions, there are, possibly, multiple correct answers (1, 2, 3 or 4). Mark the answers you consider correct.

More information

FPGA implementation of DWT for Audio Watermarking Application

FPGA implementation of DWT for Audio Watermarking Application FPGA implementation of DWT for Audio Watermarking Application Naveen.S.Hampannavar 1, Sajeevan Joseph 2, C.B.Bidhul 3, Arunachalam V 4 1, 2, 3 M.Tech VLSI Students, 4 Assistant Professor Selection Grade

More information

Advanced Digital Signal Processing Part 2: Digital Processing of Continuous-Time Signals

Advanced Digital Signal Processing Part 2: Digital Processing of Continuous-Time Signals Advanced Digital Signal Processing Part 2: Digital Processing of Continuous-Time Signals Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Institute of Electrical Engineering

More information

Subband Analysis of Time Delay Estimation in STFT Domain

Subband Analysis of Time Delay Estimation in STFT Domain PAGE 211 Subband Analysis of Time Delay Estimation in STFT Domain S. Wang, D. Sen and W. Lu School of Electrical Engineering & Telecommunications University of ew South Wales, Sydney, Australia sh.wang@student.unsw.edu.au,

More information

IMPROVING AUDIO WATERMARK DETECTION USING NOISE MODELLING AND TURBO CODING

IMPROVING AUDIO WATERMARK DETECTION USING NOISE MODELLING AND TURBO CODING IMPROVING AUDIO WATERMARK DETECTION USING NOISE MODELLING AND TURBO CODING Nedeljko Cvejic, Tapio Seppänen MediaTeam Oulu, Information Processing Laboratory, University of Oulu P.O. Box 4500, 4STOINF,

More information

Biomedical Signals. Signals and Images in Medicine Dr Nabeel Anwar

Biomedical Signals. Signals and Images in Medicine Dr Nabeel Anwar Biomedical Signals Signals and Images in Medicine Dr Nabeel Anwar Noise Removal: Time Domain Techniques 1. Synchronized Averaging (covered in lecture 1) 2. Moving Average Filters (today s topic) 3. Derivative

More information

EE228 Applications of Course Concepts. DePiero

EE228 Applications of Course Concepts. DePiero EE228 Applications of Course Concepts DePiero Purpose Describe applications of concepts in EE228. Applications may help students recall and synthesize concepts. Also discuss: Some advanced concepts Highlight

More information

CHAPTER 3 ADAPTIVE MODULATION TECHNIQUE WITH CFO CORRECTION FOR OFDM SYSTEMS

CHAPTER 3 ADAPTIVE MODULATION TECHNIQUE WITH CFO CORRECTION FOR OFDM SYSTEMS 44 CHAPTER 3 ADAPTIVE MODULATION TECHNIQUE WITH CFO CORRECTION FOR OFDM SYSTEMS 3.1 INTRODUCTION A unique feature of the OFDM communication scheme is that, due to the IFFT at the transmitter and the FFT

More information

BLIND DETECTION OF PSK SIGNALS. Yong Jin, Shuichi Ohno and Masayoshi Nakamoto. Received March 2011; revised July 2011

BLIND DETECTION OF PSK SIGNALS. Yong Jin, Shuichi Ohno and Masayoshi Nakamoto. Received March 2011; revised July 2011 International Journal of Innovative Computing, Information and Control ICIC International c 2012 ISSN 1349-4198 Volume 8, Number 3(B), March 2012 pp. 2329 2337 BLIND DETECTION OF PSK SIGNALS Yong Jin,

More information

Distributed Collaborative Path Planning in Sensor Networks with Multiple Mobile Sensor Nodes

Distributed Collaborative Path Planning in Sensor Networks with Multiple Mobile Sensor Nodes 7th Mediterranean Conference on Control & Automation Makedonia Palace, Thessaloniki, Greece June 4-6, 009 Distributed Collaborative Path Planning in Sensor Networks with Multiple Mobile Sensor Nodes Theofanis

More information

SOPA version 2. Revised July SOPA project. September 21, Introduction 2. 2 Basic concept 3. 3 Capturing spatial audio 4

SOPA version 2. Revised July SOPA project. September 21, Introduction 2. 2 Basic concept 3. 3 Capturing spatial audio 4 SOPA version 2 Revised July 7 2014 SOPA project September 21, 2014 Contents 1 Introduction 2 2 Basic concept 3 3 Capturing spatial audio 4 4 Sphere around your head 5 5 Reproduction 7 5.1 Binaural reproduction......................

More information

TE 302 DISCRETE SIGNALS AND SYSTEMS. Chapter 1: INTRODUCTION

TE 302 DISCRETE SIGNALS AND SYSTEMS. Chapter 1: INTRODUCTION TE 302 DISCRETE SIGNALS AND SYSTEMS Study on the behavior and processing of information bearing functions as they are currently used in human communication and the systems involved. Chapter 1: INTRODUCTION

More information

inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering August 2000, Nice, FRANCE

inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering August 2000, Nice, FRANCE Copyright SFA - InterNoise 2000 1 inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering 27-30 August 2000, Nice, FRANCE I-INCE Classification: 6.1 AUDIBILITY OF COMPLEX

More information

Auditory modelling for speech processing in the perceptual domain

Auditory modelling for speech processing in the perceptual domain ANZIAM J. 45 (E) ppc964 C980, 2004 C964 Auditory modelling for speech processing in the perceptual domain L. Lin E. Ambikairajah W. H. Holmes (Received 8 August 2003; revised 28 January 2004) Abstract

More information

Complex Sounds. Reading: Yost Ch. 4

Complex Sounds. Reading: Yost Ch. 4 Complex Sounds Reading: Yost Ch. 4 Natural Sounds Most sounds in our everyday lives are not simple sinusoidal sounds, but are complex sounds, consisting of a sum of many sinusoids. The amplitude and frequency

More information

Mel Spectrum Analysis of Speech Recognition using Single Microphone

Mel Spectrum Analysis of Speech Recognition using Single Microphone International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree

More information

ECE 476/ECE 501C/CS Wireless Communication Systems Winter Lecture 6: Fading

ECE 476/ECE 501C/CS Wireless Communication Systems Winter Lecture 6: Fading ECE 476/ECE 501C/CS 513 - Wireless Communication Systems Winter 2004 Lecture 6: Fading Last lecture: Large scale propagation properties of wireless systems - slowly varying properties that depend primarily

More information

Digitally controlled Active Noise Reduction with integrated Speech Communication

Digitally controlled Active Noise Reduction with integrated Speech Communication Digitally controlled Active Noise Reduction with integrated Speech Communication Herman J.M. Steeneken and Jan Verhave TNO Human Factors, Soesterberg, The Netherlands herman@steeneken.com ABSTRACT Active

More information

Sound Source Localization using HRTF database

Sound Source Localization using HRTF database ICCAS June -, KINTEX, Gyeonggi-Do, Korea Sound Source Localization using HRTF database Sungmok Hwang*, Youngjin Park and Younsik Park * Center for Noise and Vibration Control, Dept. of Mech. Eng., KAIST,

More information

Robust Low-Resource Sound Localization in Correlated Noise

Robust Low-Resource Sound Localization in Correlated Noise INTERSPEECH 2014 Robust Low-Resource Sound Localization in Correlated Noise Lorin Netsch, Jacek Stachurski Texas Instruments, Inc. netsch@ti.com, jacek@ti.com Abstract In this paper we address the problem

More information

FREQUENCY RESPONSE AND LATENCY OF MEMS MICROPHONES: THEORY AND PRACTICE

FREQUENCY RESPONSE AND LATENCY OF MEMS MICROPHONES: THEORY AND PRACTICE APPLICATION NOTE AN22 FREQUENCY RESPONSE AND LATENCY OF MEMS MICROPHONES: THEORY AND PRACTICE This application note covers engineering details behind the latency of MEMS microphones. Major components of

More information

TRANSFORMS / WAVELETS

TRANSFORMS / WAVELETS RANSFORMS / WAVELES ransform Analysis Signal processing using a transform analysis for calculations is a technique used to simplify or accelerate problem solution. For example, instead of dividing two

More information

Since the advent of the sine wave oscillator

Since the advent of the sine wave oscillator Advanced Distortion Analysis Methods Discover modern test equipment that has the memory and post-processing capability to analyze complex signals and ascertain real-world performance. By Dan Foley European

More information

Physical Layer: Outline

Physical Layer: Outline 18-345: Introduction to Telecommunication Networks Lectures 3: Physical Layer Peter Steenkiste Spring 2015 www.cs.cmu.edu/~prs/nets-ece Physical Layer: Outline Digital networking Modulation Characterization

More information

Signals, Sound, and Sensation

Signals, Sound, and Sensation Signals, Sound, and Sensation William M. Hartmann Department of Physics and Astronomy Michigan State University East Lansing, Michigan Л1Р Contents Preface xv Chapter 1: Pure Tones 1 Mathematics of the

More information

Matched filter. Contents. Derivation of the matched filter

Matched filter. Contents. Derivation of the matched filter Matched filter From Wikipedia, the free encyclopedia In telecommunications, a matched filter (originally known as a North filter [1] ) is obtained by correlating a known signal, or template, with an unknown

More information

Signal Processing for Digitizers

Signal Processing for Digitizers Signal Processing for Digitizers Modular digitizers allow accurate, high resolution data acquisition that can be quickly transferred to a host computer. Signal processing functions, applied in the digitizer

More information

The Scientist and Engineer's Guide to Digital Signal Processing By Steven W. Smith, Ph.D.

The Scientist and Engineer's Guide to Digital Signal Processing By Steven W. Smith, Ph.D. The Scientist and Engineer's Guide to Digital Signal Processing By Steven W. Smith, Ph.D. Home The Book by Chapters About the Book Steven W. Smith Blog Contact Book Search Download this chapter in PDF

More information

REAL-TIME BROADBAND NOISE REDUCTION

REAL-TIME BROADBAND NOISE REDUCTION REAL-TIME BROADBAND NOISE REDUCTION Robert Hoeldrich and Markus Lorber Institute of Electronic Music Graz Jakoministrasse 3-5, A-8010 Graz, Austria email: robert.hoeldrich@mhsg.ac.at Abstract A real-time

More information

Statistical Pulse Measurements using USB Power Sensors

Statistical Pulse Measurements using USB Power Sensors Statistical Pulse Measurements using USB Power Sensors Today s modern USB Power Sensors are capable of many advanced power measurements. These Power Sensors are capable of demodulating the signal and processing

More information

A COMPARISON OF SITE-AMPLIFICATION ESTIMATED FROM DIFFERENT METHODS USING A STRONG MOTION OBSERVATION ARRAY IN TANGSHAN, CHINA

A COMPARISON OF SITE-AMPLIFICATION ESTIMATED FROM DIFFERENT METHODS USING A STRONG MOTION OBSERVATION ARRAY IN TANGSHAN, CHINA A COMPARISON OF SITE-AMPLIFICATION ESTIMATED FROM DIFFERENT METHODS USING A STRONG MOTION OBSERVATION ARRAY IN TANGSHAN, CHINA Wenbo ZHANG 1 And Koji MATSUNAMI 2 SUMMARY A seismic observation array for

More information

DSRC using OFDM for roadside-vehicle communication systems

DSRC using OFDM for roadside-vehicle communication systems DSRC using OFDM for roadside-vehicle communication systems Akihiro Kamemura, Takashi Maehata SUMITOMO ELECTRIC INDUSTRIES, LTD. Phone: +81 6 6466 5644, Fax: +81 6 6462 4586 e-mail:kamemura@rrad.sei.co.jp,

More information

Measuring impulse responses containing complete spatial information ABSTRACT

Measuring impulse responses containing complete spatial information ABSTRACT Measuring impulse responses containing complete spatial information Angelo Farina, Paolo Martignon, Andrea Capra, Simone Fontana University of Parma, Industrial Eng. Dept., via delle Scienze 181/A, 43100

More information