MULTIPLE SOUND SOURCE TRACKING AND IDENTIFICATION VIA DEGENERATE UNMIXING ESTIMATION TECHNIQUE AND CARDINALITY BALANCED MULTI-TARGET MULTI-BERNOULLI FILTER (DUET-CBMEMBER) WITH TRACK MANAGEMENT

Nicholas Chong, Shanhung Wong, Sven Nordholm, Iain Murray
Department of Electrical and Computer Engineering, Curtin University, Kent Street, Bentley, WA 6102, Australia

ABSTRACT

The cocktail party problem is a challenging problem that research into source separation aims to solve, and many attempts have been made to address it. A logical approach is to break this complex problem into several smaller problems which are solved in different stages, each considering a different aspect. In this paper, we provide a robust solution to one part of the problem by localizing and tracking multiple moving speech sources in a room environment. Here we study the separation problem for an unknown number of moving sources. The DUET-CBMeMBer method we outline is capable of estimating the number of sound sources as well as tracking and labelling them. This paper proposes a track management technique that identifies sound sources based on their trajectories as an extension to the DUET-CBMeMBer technique.

Index Terms: BSS, DUET, CBMeMBer, multiple speaker tracking, track management

1. INTRODUCTION

The classic cocktail party problem has been widely researched and various blind source separation systems have been proposed to solve it [1], [2], [3], [4], [5]. Most of these methods assume that the sound sources are not moving. Such an assumption does not hold in a practical setting, as people move around in most acoustic communication scenarios, and a blind source separation system should be able to account for such movements as well as the time-varying number of sound sources.
The DUET-CBMeMBer [6], inspired by both the Degenerate Unmixing Estimation Technique (DUET) [2] and the Cardinality-Balanced Multi-Target Multi-Bernoulli Filter (CBMeMBer) [7], was previously proposed to address the problem of localizing and tracking multiple speakers in a room. The proposed method uses Random Finite Set (RFS) theory to track multiple speakers in the room and also to automatically follow new speakers when they appear, as such a technique was shown to be viable in [8] and [9]. This paper seeks to address the problem of multiple speaker tracking and speaker identification in acoustic environments. There have been previous attempts to separate moving sound source mixtures [10], [11], but these methods require additional processing, such as speech recognition or image recognition, in order to identify the separated sound sources. This means additional computational complexity, and such methods are only suited for offline processing. [12], [13] and [14] are earlier works which also utilize the spatio-temporal nature of speech to perform sound source separation. However, these techniques only track the sound sources' Direction of Arrival (DoA), which is one-dimensional. The proposed algorithm aims to track the Cartesian coordinates and the velocity of the speakers within a room, and not just the DoA of the sound sources. The addition of another dimension adds a new layer of complexity to the problem, as the Time Differences of Arrival (TDOA) obtained from the different microphone pairs need to be fused in a cohesive manner in order to estimate the positions of the speakers in the room. DUET-CBMeMBer provides the means to fuse the extracted information and use it to estimate the source locations, but it still lacks the ability to identify and distinguish the sound features which have been separated.
Sound source reconstruction is very challenging without information about the sound source identity, as there is a permutation problem in the features required to construct the mask. Using ideas from [15] and [16], this paper proposes an intuitive method of track management based on the trajectories of sound sources as an extension to the DUET-CBMeMBer algorithm. The addition of the track management system addresses the problem of track labelling, which was lacking in the original DUET-CBMeMBer algorithm [6]. The rest of this paper is separated into four sections: System Overview, Proposed Track Management, Experiment and Conclusion. A brief overview of the DUET-CBMeMBer is given in the System Overview section. In Section 3, the proposed track management is discussed in greater detail. The experiments conducted and the analysis of the results are presented in Section 4. Closing remarks and future works are given in Section 5.

APSIPA 2014

2. DUET-CBMEMBER OVERVIEW

The DUET-CBMeMBer is an algorithm proposed in [6] to deal with the problem of tracking multiple moving sound sources. There are two main stages in the DUET-CBMeMBer algorithm. The first stage is feature extraction, which is performed using the DUET algorithm [17]. This is followed by the CBMeMBer filter [7], which performs the sound source tracking.

2.1. DUET

DUET is a robust and efficient BSS technique as long as the W-disjoint orthogonality assumption, that only one source is active at any time-frequency point, holds true. This can be expressed as:

$S_j(\tau, \omega) S_k(\tau, \omega) = 0, \quad \forall \tau, \omega, \; j \neq k$ (1)

where $S_j$ and $S_k$ refer to the $j$th and $k$th Short Time Fourier Transformed sound sources, while $\tau$ and $\omega$ refer to the time frame and frequency respectively. The two anechoic sound mixtures used by DUET to recover the original sound sources can be expressed as:

$y_1(t) = \sum_{j=1}^{N} s_j(t)$ (2)

$y_2(t) = \sum_{j=1}^{N} a_j s_j(t - \delta_j)$ (3)

with $y(t)$ being the received signal and $s_j(t)$ being the sound source, while $a_j$ and $\delta_j$ are the attenuation and the delay respectively. The Short Time Fourier Transform (STFT) is performed on both of the observed sound mixtures in order to transform them into the time-frequency domain, in which the source signals are sparse. Feature extraction is achieved using the ratio of the mixtures to obtain the features: relative attenuation and relative delay. The time-frequency ratio can be expressed as:

$\frac{Y_2(\tau, \omega)}{Y_1(\tau, \omega)} = \frac{\sum_{j=1}^{N} a_j e^{-i\omega\delta_j} S_j(\tau, \omega)}{\sum_{j=1}^{N} S_j(\tau, \omega)}$ (4)

Only the relative delay is used in CBMeMBer to localize the sound sources after it is transformed into the TDOA.
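As an illustrative sketch of this feature extraction (not the authors' implementation; the function name and parameters are hypothetical), the per-bin attenuation and delay estimates can be computed from the STFT ratio of the two mixtures:

```python
import numpy as np
from scipy.signal import stft

def duet_features(y1, y2, fs, nperseg=1024):
    """DUET-style feature extraction from two anechoic mixtures.

    Under W-disjoint orthogonality, each time-frequency point is
    dominated by a single source, so the mixture ratio Y2/Y1 reduces
    to a_j * exp(-i*omega*delta_j) at that point.
    """
    f, _, Y1 = stft(y1, fs=fs, nperseg=nperseg)
    _, _, Y2 = stft(y2, fs=fs, nperseg=nperseg)
    eps = 1e-12                                # guard against empty bins
    ratio = (Y2 + eps) / (Y1 + eps)
    a = np.abs(ratio)                          # relative attenuation
    alpha = a - 1.0 / a                        # symmetric attenuation used by DUET
    omega = 2.0 * np.pi * f / fs               # bin frequency in rad/sample
    omega[0] = np.inf                          # avoid division by zero at DC
    delta = -np.angle(ratio) / omega[:, None]  # relative delay in samples
    tdoa = delta / fs                          # delay in seconds
    return alpha, delta, tdoa, Y1, Y2
```

The estimates are only reliable at bins where a source is actually active and where the phase does not wrap, which is exactly the spacing condition on the microphone pair discussed below.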
The relative delay can be obtained from the time-frequency ratio based on the following equation:

$\frac{Y_2(\tau, \omega)}{Y_1(\tau, \omega)} = \frac{a_j e^{-i\omega\delta_j} S_j(\tau, \omega)}{S_j(\tau, \omega)} = a_j e^{-i\omega\delta_j}$ (5)

while the equation used to transform the relative delay into the TDOA is:

$t_j = \frac{\delta_j}{f_s}$ (6)

where $t_j$ refers to the TDOA of the $j$th peak, $\delta_j$ is the relative delay and $f_s$ is the sampling frequency. This only holds true if no aliasing occurs, so the distance between the two microphones must be less than $\frac{c}{2 f_s}$, where $c$ is the speed of sound and $f_s$ is the frequency of interest [17]. A power-weighted histogram is constructed to cluster the estimated parameters into groups. The power-weighted features, rather than the raw relative features, are used to construct the histogram because they are more accurate estimates of the true parameters [17]. The power-weighted histogram can be used to estimate the mixing parameters, as the relative parameters cluster around the true parameters. The power-weighted delay estimate can be expressed as:

$\tilde{\delta}_j := \frac{\int_{(\tau,\omega) \in \Omega_j} |Y_1(\tau, \omega) Y_2(\tau, \omega)|^p \, \omega^q \, \hat{\delta}(\tau, \omega) \, d\tau \, d\omega}{\int_{(\tau,\omega) \in \Omega_j} |Y_1(\tau, \omega) Y_2(\tau, \omega)|^p \, \omega^q \, d\tau \, d\omega}$ (7)

The time-frequency weight of a given time-frequency point is $|Y_1(\tau, \omega) Y_2(\tau, \omega)|^p \omega^q$, whereas $\Omega_j$ refers to the set of $(\tau, \omega)$ points determined to be associated with the $j$th cluster. $p=1$ and $q=0$ is a good default choice [17], but $p=2$ and $q=2$ is a better choice for low Signal-to-Noise-Ratio scenarios or speech. In the scenario where two sound sources have vastly different time-frequency weights, a compression is performed by setting the value of $p$ to be less than 1 and the value of $q = 0$.

2.2. CBMeMBer

The CBMeMBer filter is used to estimate the time-varying number of targets as well as the states of the targets based on the observations made by DUET. The CBMeMBer was implemented using the Sequential Monte Carlo (SMC) method.
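A minimal sketch of this power-weighted clustering (hypothetical helper names, assuming the per-bin delays and STFTs computed earlier) is:

```python
import numpy as np

def tf_weights(Y1, Y2, omega, p=1, q=0):
    """Time-frequency weight |Y1*Y2|^p * |omega|^q from Eq. (7)."""
    return (np.abs(Y1 * Y2) ** p) * (np.abs(omega[:, None]) ** q)

def weighted_delay_histogram(delta, Y1, Y2, omega, p=1, q=0,
                             bins=100, drange=(-10.0, 10.0)):
    """Power-weighted histogram of the per-bin relative delays.

    Strong, reliable time-frequency points dominate the count, so the
    histogram peaks cluster around the true mixing delays.
    """
    w = tf_weights(Y1, Y2, omega, p, q)
    return np.histogram(delta.ravel(), bins=bins, range=drange,
                        weights=w.ravel())

def power_weighted_delay(delta, Y1, Y2, omega, mask, p=1, q=0):
    """Eq. (7): weighted mean of delta over one cluster Omega_j (mask)."""
    w = tf_weights(Y1, Y2, omega, p, q)
    return float(np.sum(w[mask] * delta[mask]) / np.sum(w[mask]))
```

Peak picking on the histogram then yields one delay estimate per active source, which Eq. (6) converts into the TDOA observations consumed by the tracker.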
Similar to the particle filter used for acoustic localization and tracking [18], the CBMeMBer filter operates in two spaces: the multi-target state space, $\mathcal{F}(\mathcal{X})$, and the multi-target observation space, $\mathcal{F}(\mathcal{Z})$. The multi-target state, $X_k$, which consists of the Cartesian coordinates and the velocities of the targets, is recursively estimated from the multi-target observation, $Z_k$, which contains the TDOAs of the observed targets.

$X_k = \{x_{k,1}, \ldots, x_{k,N(k)}\} \in \mathcal{F}(\mathcal{X})$, (8)

$Z_k = \{z_{k,1}, \ldots, z_{k,M(k)}\} \in \mathcal{F}(\mathcal{Z})$, (9)

The states of the targets are predicted and updated with each propagation of the multi-target multi-Bernoulli density, $\pi_k(\cdot \mid Z_{1:k})$. The multi-target posterior density is propagated recursively according to [7]:

$\pi_{k|k-1}(X_k \mid Z_{1:k-1}) = \int f_{k|k-1}(X_k \mid X) \, \pi_{k-1}(X \mid Z_{1:k-1}) \, \delta X$ (10)

$\pi_k(X_k \mid Z_{1:k}) = \frac{g_k(Z_k \mid X_k) \, \pi_{k|k-1}(X_k \mid Z_{1:k-1})}{\int g_k(Z_k \mid X) \, \pi_{k|k-1}(X \mid Z_{1:k-1}) \, \delta X}$ (11)
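The link between the Cartesian state space and the TDOA observation space can be illustrated with a minimal measurement model; the Gaussian likelihood form mirrors the normally distributed likelihood used in the simulation, but the helper names here are hypothetical:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, as used in the simulation below

def expected_tdoa(src_xy, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Measurement function: predicted TDOA (seconds) for one mic pair.

    This maps a Cartesian state in F(X) to an observation in F(Z):
    the difference in propagation time to the two microphones.
    """
    src = np.asarray(src_xy, dtype=float)
    da = np.linalg.norm(src - np.asarray(mic_a, dtype=float))
    db = np.linalg.norm(src - np.asarray(mic_b, dtype=float))
    return (da - db) / c

def tdoa_likelihood(z, src_xy, mic_a, mic_b, sigma):
    """Gaussian single-measurement likelihood g_k(z | x) for the SMC update."""
    mu = expected_tdoa(src_xy, mic_a, mic_b)
    return float(np.exp(-0.5 * ((z - mu) / sigma) ** 2)
                 / (sigma * np.sqrt(2.0 * np.pi)))
```

In the SMC implementation, each particle's weight is updated by evaluating this likelihood against the TDOAs observed at every microphone pair, which is how the pairwise measurements are fused into a single position estimate.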

where $f_{k|k-1}(\cdot \mid \cdot)$ is the multi-target transition density and $g_k(\cdot \mid \cdot)$ is the multi-target likelihood at any given time $k$. The CBMeMBer filter uses the parameterized multi-target multi-Bernoulli density, described by the parameters $r^{(i)}$ and $p^{(i)}$, for propagation, as it is less computationally intensive to propagate than the exact multi-target density. By using the parameterized approximation, a finite set of existence probabilities and a corresponding density based on the kinematic state of each sound source is propagated in time. Readers are referred to [7] for further details on the SMC implementation of the CBMeMBer filter.

3. PROPOSED TRACK MANAGEMENT

3.1. Track Management based on Sound Source Trajectory

Target state estimation is performed by the CBMeMBer technique without the need for data association in the filtering process. In order to perform data association of the estimated states, a track management algorithm is required to associate the states with the corresponding tracks by labelling the states. The track management system used is similar to the ones used in [16] and [19], with modifications made to suit acoustic signal analysis. There are multiple factors, such as silent periods, speaker interaction and room reverberation, that need to be considered when speech sound sources are to be tracked. To account for these factors, certain constraints are placed on the track management system. Due to the nature of speech, there may be silent periods which provide no information that allows a source to be tracked. As a result, the track management system has to be able to retain the track information for that particular speech sound source instead of terminating it. By applying the idea of the track management used in [15], the labels of several time steps are taken into consideration for the data association. The track management based on [15] has four basic stages.
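A minimal greedy sketch of such label retention and association (a hypothetical structure, not the authors' code: each track is assumed to store a position, a missed-step counter and a label, and the association gate is assumed to grow with the number of missed steps):

```python
import numpy as np

def associate(tracks, estimates, k, assoc_thresh, max_missed):
    """One time step of trajectory-based track management.

    tracks: dict label -> {"pos": ndarray, "missed": int, "last": int}.
    Live and recently missed tracks are matched greedily to the new
    estimates; unmatched estimates get new labels; tracks that stay
    unassociated for too long are forgotten.
    """
    unassigned = list(range(len(estimates)))
    # Stage 1+2: match each track (least-missed first) to its nearest estimate.
    for label in sorted(tracks, key=lambda l: tracks[l]["missed"]):
        trk = tracks[label]
        matched = False
        if unassigned:
            d = [np.linalg.norm(np.asarray(estimates[i], dtype=float) - trk["pos"])
                 for i in unassigned]
            j = int(np.argmin(d))
            gate = assoc_thresh * (1 + trk["missed"])  # wider gate after misses
            if d[j] <= gate:
                trk["pos"] = np.asarray(estimates[unassigned[j]], dtype=float)
                trk["missed"] = 0
                trk["last"] = k
                del unassigned[j]
                matched = True
        if not matched:
            trk["missed"] += 1
    # Stage 3: new labels for estimates with no association.
    for i in unassigned:
        new_label = max(tracks, default=-1) + 1
        tracks[new_label] = {"pos": np.asarray(estimates[i], dtype=float),
                             "missed": 0, "last": k}
    # Stage 4: forget tracks missed for longer than the retention period.
    for label in [l for l, t in tracks.items() if t["missed"] > max_missed]:
        del tracks[label]
    return tracks
```

Because a missed track is retained for `max_missed` steps, a speaker who falls silent keeps the same label when speech resumes, while spurious reverberation-induced estimates die out quickly.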
The first stage is to associate the estimated data at the current time step with the data from the previous time step. Non-associated data in the current time step are then considered for data association with previously missed data. If the current data have no association with the data from previous time steps, a new identity is assigned to the current data. Memory retention is only available for a set number of time steps: if a datum remains unassociated for longer than this period, it is removed from the data association memory. In the case of moving sound sources, the locations of the sound sources change based on the trajectories of the individual sound sources. The true estimates follow a trajectory, whereas the false estimates which result from noise or reverberation are spurious. Hence, the true tracks are considered for data association while the spurious tracks that result from noise or reverberation are removed from track association. A modification made to the track management system is the addition of a track merging algorithm in the CBMeMBer filter. Tracks within a short distance of each other are merged into a single track, as such scenarios are highly unlikely in real life. There will always be a personal space between speakers in most social settings, so the boundary of this personal space is used as the threshold for track merging. As the track management accounts for missed data, the merged track will not be lost as long as the tracks do not overlap for a period of time exceeding the memory retention period of the track management algorithm. The details of the track management algorithm are given in Algorithm 1.

4. SIMULATION AND DISCUSSION

Fig. 1: Room dimension and setup

In [6], the DUET-CBMeMBer was shown to be capable of localizing and tracking two sound sources in an ideal scenario with no noise and reverberation.
A similar simulation was performed to further test the system's tracking and sound source identification capabilities in the presence of reverberation. The simulated room setup, shown in Fig. 1, is the same as the one used in [6], with four pairs of microphones, each 0.4 m apart, spread out across a 10 m x 10 m room. The main difference is the addition of a T60 reverberation time of 0.15 s. The distance between the microphones in each pair fulfills the condition $d < \frac{c}{2 f_s}$, as the speed of sound is 343 m/s while the frequency of interest is 8 kHz. The sound sources used in the scenario are still synthetic male and female speech signals sampled at 16 kHz. For the settings in the CBMeMBer, the steady-state velocity, $\bar{v}$, and the rate constant, $B$, of a speaker in the Langevin model are set to 1.4 m/s and 1 Hz respectively. The birth model used is a Gaussian spawning at the birth locations of the speakers. The standard deviation, $\sigma$, of the normally distributed likelihood is set at 0.5% of the maximum TDOA between the sensor pair, and the clutter rate, $\kappa$, is set to 6. The merging threshold used in this scenario is 0.5 m, so speakers which are within 0.5 m of each other are merged into a single track. An example of the simulation results is shown in Fig. 2.
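The Langevin prediction step for the particles can be sketched as follows. The discretization (a = exp(-beta*dt), b = v_bar*sqrt(1 - a^2)) is the form commonly used for acoustic speaker tracking [18] and may differ in detail from the authors' exact implementation; `v_bar` and `beta` correspond to the steady-state velocity and rate constant quoted above:

```python
import numpy as np

def langevin_predict(x, v, dt, v_bar, beta, rng=None):
    """One prediction step of the Langevin motion model, per coordinate.

    The velocity is an AR(1) process whose steady-state standard
    deviation is v_bar, so particle motion stays consistent with a
    walking speaker regardless of the frame rate dt.
    """
    rng = np.random.default_rng() if rng is None else rng
    a = np.exp(-beta * dt)
    b = v_bar * np.sqrt(1.0 - a * a)
    v_new = a * v + b * rng.standard_normal(np.shape(v))  # AR(1) velocity
    x_new = x + v_new * dt                                # position update
    return x_new, v_new
```

Applying this independently to the x and y components of every particle implements the transition density $f_{k|k-1}$ used in the prediction step of Eq. (10).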

Algorithm 1 Pseudocode for Multiple Sound Source Track Management

Data Initialization
1: Acquire the estimated sound source locations from the DUET-CBMeMBer filter starting from the first time frame
2: Set the track merging threshold, association distance threshold and maximum memory retention period
3: Set all association bits for the estimated sound source locations to 1
4: Set all missed bits for the estimated sound source locations to 0
5: Assign a new label to each estimated sound source location

Track Merging
1: Compare the distances between all the estimated sound source locations at each time frame
2: while Distance between two estimated locations < merging threshold do
3:   Combine the particle clouds of those two estimates
4:   Reweight the particles
5:   Combine the probabilities of existence of the two particle clouds
6: end while
7: Estimate new sound source locations based on the combined particle clouds

Iterative Data Association
1: while time frame <= last time frame do
2:   Compare estimates from the current time frame with the directly previous time frame
3:   if Distance between current estimate and previous estimate is within the association distance threshold then
4:     Assign the previous label to the current estimate and set the corresponding association bit to 1
5:   end if
6:   Compare estimates from the current time frame with previously missed data
7:   if Distance between the current estimate and the previously missed estimate is within the association distance threshold scaled by the number of missed time steps then
8:     Assign the previously missed estimate's label to the current estimate and set the corresponding association bit to 1
9:   end if
10:  if Current estimate has no association to previous data or previously missed data then
11:    Assign a new label to the current estimate and set the corresponding association bit to 1
12:  end if
13:  if Previously missed data has been missed for no more than the maximum memory retention period then
14:    Increase its missed estimate counter by 1
15:  else
16:    Remove it from memory
17:  end if
18: end while

Fig. 2: Result of source tracking and labelling

As shown in the result, the proposed algorithm is able to track and assign an identity to both speakers while they are moving. The gaps in the tracks are due to the silent periods in speech. No observations can be extracted during the silent periods, so the DUET-CBMeMBer is unable to produce location estimates during these periods. As the track management algorithm is able to account for the silent periods and retain the track information, the tracks continue to be propagated after these silent periods. As the tracks do not cross paths with each other in the scenario shown, the track management is mainly used to clean up the tracks by eliminating most of the false estimates resulting from room reverberation.

5. CONCLUSION

In conclusion, the addition of a track management algorithm to the DUET-CBMeMBer filter is demonstrated to be a viable method of adding identity information to the tracked sound sources. Apart from that, the addition of the track management algorithm also produces cleaner results with significantly fewer false estimates due to room reverberation. Identity association of a tracked sound source is an important stepping stone towards the overall research aim of separating multiple moving sound sources. The output of the proposed algorithm provides the ability to construct time-frequency masks based on each sound source's state information and identity. Future work involves the use of the tracking information, including identities, for time-frequency mask design and subsequently the reconstruction of the separated sound sources based on these time-frequency masks.

6. REFERENCES

[1] N. Grbic, X.J. Tao, S. Nordholm, and I. Claesson, "Blind signal separation using overcomplete subband representation," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 5, 2001.

[2] O. Yilmaz and S. Rickard, "Blind separation of speech mixtures via time-frequency masking," IEEE Transactions on Signal Processing, vol. 52, no. 7, 2004.

[3] S.Y. Low, S. Nordholm, and R. Togneri, "Convolutive blind signal separation with post-processing," IEEE Transactions on Speech and Audio Processing, vol. 12, no. 5, 2004.

[4] V.G. Reju, S.N. Koh, and Y. Soon, "Underdetermined convolutive blind source separation via time frequency masking," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 1, 2010.

[5] I. Jafari, S. Haque, R. Togneri, and S. Nordholm, "Evaluations on underdetermined blind source separation in adverse environments using time-frequency masking," EURASIP Journal on Advances in Signal Processing, vol. 2013, no. 1, p. 162, 2013.

[6] N. Chong, S. Wong, B-T Vo, S. Nordholm, and I. Murray, "Multiple sound source tracking via degenerate unmixing estimation technique and cardinality balanced multi-target multi-Bernoulli filter (DUET-CBMeMBer) with track management," in Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), 2014 IEEE Ninth International Conference on. IEEE, 2014.

[7] B-T Vo, B-N Vo, and A. Cantoni, "The cardinality balanced multi-target multi-Bernoulli filter and its implementations," IEEE Transactions on Signal Processing, vol. 57, no. 2, 2009.

[8] B-N Vo, S. Singh, and W.K. Ma, "Tracking multiple speakers using random sets," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '04). IEEE, 2004, vol. 2, pp. ii-357.

[9] N.T. Pham, W. Huang, and S.H. Ong, "Tracking multiple speakers using CPHD filter," in Proceedings of the 15th International Conference on Multimedia. ACM, 2007.

[10] A. Koutvas, E. Dermatas, and G. Kokkinakis, "Blind speech separation of moving speakers in real reverberant environments," in Acoustics, Speech, and Signal Processing (ICASSP '00), Proceedings of the 2000 IEEE International Conference on. IEEE, 2000, vol. 2, pp. II1133-II1136.

[11] S.M. Naqvi, M. Yu, and J.A. Chambers, "A multimodal approach to blind source separation of moving sources," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, 2010.

[12] P. Pertilä, "Online blind speech separation using multiple acoustic speaker tracking and time frequency masking," Computer Speech & Language, vol. 27, no. 3, 2013.

[13] B. Loesch and B. Yang, "Online blind source separation based on time-frequency sparseness," in Acoustics, Speech and Signal Processing (ICASSP 2009), IEEE International Conference on. IEEE, 2009.

[14] A. Brutti and F. Nesta, "Multiple source tracking by sequential posterior kernel density estimation through GSCT," in Proc. of EUSIPCO, 2011.

[15] K. Shafique and M. Shah, "A noniterative greedy algorithm for multiframe point correspondence," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 1, 2005.

[16] S. Wong, B-T Vo, B-N Vo, and R. Hoseinnezhad, "Multi-Bernoulli based track-before-detect with road constraints," in Information Fusion (FUSION), 2012 15th International Conference on. IEEE, 2012.

[17] S. Rickard, "The DUET blind source separation algorithm," in Blind Speech Separation, 2007.

[18] D.B. Ward, E.A. Lehmann, and R.C. Williamson, "Particle filtering algorithms for tracking an acoustic source in a reverberant environment," IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, 2003.

[19] R. Hoseinnezhad, B-N Vo, B-T Vo, and D. Suter, "Visual tracking of numerous targets via multi-Bernoulli filtering of image data," Pattern Recognition, vol. 45, no. 10, 2012.


More information

Dynamically Configured Waveform-Agile Sensor Systems

Dynamically Configured Waveform-Agile Sensor Systems Dynamically Configured Waveform-Agile Sensor Systems Antonia Papandreou-Suppappola in collaboration with D. Morrell, D. Cochran, S. Sira, A. Chhetri Arizona State University June 27, 2006 Supported by

More information

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,

More information

Subband Analysis of Time Delay Estimation in STFT Domain

Subband Analysis of Time Delay Estimation in STFT Domain PAGE 211 Subband Analysis of Time Delay Estimation in STFT Domain S. Wang, D. Sen and W. Lu School of Electrical Engineering & Telecommunications University of ew South Wales, Sydney, Australia sh.wang@student.unsw.edu.au,

More information

TARGET SPEECH EXTRACTION IN COCKTAIL PARTY BY COMBINING BEAMFORMING AND BLIND SOURCE SEPARATION

TARGET SPEECH EXTRACTION IN COCKTAIL PARTY BY COMBINING BEAMFORMING AND BLIND SOURCE SEPARATION TARGET SPEECH EXTRACTION IN COCKTAIL PARTY BY COMBINING BEAMFORMING AND BLIND SOURCE SEPARATION Lin Wang 1,2, Heping Ding 2 and Fuliang Yin 1 1 School of Electronic and Information Engineering, Dalian

More information

Two-channel Separation of Speech Using Direction-of-arrival Estimation And Sinusoids Plus Transients Modeling

Two-channel Separation of Speech Using Direction-of-arrival Estimation And Sinusoids Plus Transients Modeling Two-channel Separation of Speech Using Direction-of-arrival Estimation And Sinusoids Plus Transients Modeling Mikko Parviainen 1 and Tuomas Virtanen 2 Institute of Signal Processing Tampere University

More information

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks Alfredo Zermini, Qiuqiang Kong, Yong Xu, Mark D. Plumbley, Wenwu Wang Centre for Vision,

More information

Time Delay Estimation: Applications and Algorithms

Time Delay Estimation: Applications and Algorithms Time Delay Estimation: Applications and Algorithms Hing Cheung So http://www.ee.cityu.edu.hk/~hcso Department of Electronic Engineering City University of Hong Kong H. C. So Page 1 Outline Introduction

More information

Applying the Filtered Back-Projection Method to Extract Signal at Specific Position

Applying the Filtered Back-Projection Method to Extract Signal at Specific Position Applying the Filtered Back-Projection Method to Extract Signal at Specific Position 1 Chia-Ming Chang and Chun-Hao Peng Department of Computer Science and Engineering, Tatung University, Taipei, Taiwan

More information

Blind Blur Estimation Using Low Rank Approximation of Cepstrum

Blind Blur Estimation Using Low Rank Approximation of Cepstrum Blind Blur Estimation Using Low Rank Approximation of Cepstrum Adeel A. Bhutta and Hassan Foroosh School of Electrical Engineering and Computer Science, University of Central Florida, 4 Central Florida

More information

Distance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks

Distance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks Distance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks Mariam Yiwere 1 and Eun Joo Rhee 2 1 Department of Computer Engineering, Hanbat National University,

More information

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Ching-Ta Lu, Kun-Fu Tseng 2, Chih-Tsung Chen 2 Department of Information Communication, Asia University, Taichung, Taiwan, ROC

More information

AUDIO VISUAL TRACKING OF A SPEAKER BASED ON FFT AND KALMAN FILTER

AUDIO VISUAL TRACKING OF A SPEAKER BASED ON FFT AND KALMAN FILTER AUDIO VISUAL TRACKING OF A SPEAKER BASED ON FFT AND KALMAN FILTER Muhammad Muzammel, Mohd Zuki Yusoff, Mohamad Naufal Mohamad Saad and Aamir Saeed Malik Centre for Intelligent Signal and Imaging Research,

More information

Joint recognition and direction-of-arrival estimation of simultaneous meetingroom acoustic events

Joint recognition and direction-of-arrival estimation of simultaneous meetingroom acoustic events INTERSPEECH 2013 Joint recognition and direction-of-arrival estimation of simultaneous meetingroom acoustic events Rupayan Chakraborty and Climent Nadeu TALP Research Centre, Department of Signal Theory

More information

Adaptive f-xy Hankel matrix rank reduction filter to attenuate coherent noise Nirupama (Pam) Nagarajappa*, CGGVeritas

Adaptive f-xy Hankel matrix rank reduction filter to attenuate coherent noise Nirupama (Pam) Nagarajappa*, CGGVeritas Adaptive f-xy Hankel matrix rank reduction filter to attenuate coherent noise Nirupama (Pam) Nagarajappa*, CGGVeritas Summary The reliability of seismic attribute estimation depends on reliable signal.

More information

ONLINE REPET-SIM FOR REAL-TIME SPEECH ENHANCEMENT

ONLINE REPET-SIM FOR REAL-TIME SPEECH ENHANCEMENT ONLINE REPET-SIM FOR REAL-TIME SPEECH ENHANCEMENT Zafar Rafii Northwestern University EECS Department Evanston, IL, USA Bryan Pardo Northwestern University EECS Department Evanston, IL, USA ABSTRACT REPET-SIM

More information

MULTIMODAL BLIND SOURCE SEPARATION WITH A CIRCULAR MICROPHONE ARRAY AND ROBUST BEAMFORMING

MULTIMODAL BLIND SOURCE SEPARATION WITH A CIRCULAR MICROPHONE ARRAY AND ROBUST BEAMFORMING 19th European Signal Processing Conference (EUSIPCO 211) Barcelona, Spain, August 29 - September 2, 211 MULTIMODAL BLIND SOURCE SEPARATION WITH A CIRCULAR MICROPHONE ARRAY AND ROBUST BEAMFORMING Syed Mohsen

More information

BLIND SOURCE SEPARATION FOR CONVOLUTIVE MIXTURES USING SPATIALLY RESAMPLED OBSERVATIONS

BLIND SOURCE SEPARATION FOR CONVOLUTIVE MIXTURES USING SPATIALLY RESAMPLED OBSERVATIONS 14th European Signal Processing Conference (EUSIPCO 26), Florence, Italy, September 4-8, 26, copyright by EURASIP BLID SOURCE SEPARATIO FOR COVOLUTIVE MIXTURES USIG SPATIALLY RESAMPLED OBSERVATIOS J.-F.

More information

Speech Enhancement using Wiener filtering

Speech Enhancement using Wiener filtering Speech Enhancement using Wiener filtering S. Chirtmay and M. Tahernezhadi Department of Electrical Engineering Northern Illinois University DeKalb, IL 60115 ABSTRACT The problem of reducing the disturbing

More information

Tracking Unknown Number of Stealth Targets in a Multi-Static Radar with Unknown Receiver Detection Profile Using RCS Model

Tracking Unknown Number of Stealth Targets in a Multi-Static Radar with Unknown Receiver Detection Profile Using RCS Model Progress In Electromagnetics Research M, Vol. 7, 45 55, 8 Tracking Unknown Number of Stealth Targets in a Multi-Static Radar with Unknown Receiver Detection Profile Using RCS Model Amin Razmi, Mohammad

More information

Localization of underwater moving sound source based on time delay estimation using hydrophone array

Localization of underwater moving sound source based on time delay estimation using hydrophone array Journal of Physics: Conference Series PAPER OPEN ACCESS Localization of underwater moving sound source based on time delay estimation using hydrophone array To cite this article: S. A. Rahman et al 2016

More information

MULTIPLE HARMONIC SOUND SOURCES SEPARA- TION IN THE UDER-DETERMINED CASE BASED ON THE MERGING OF GONIOMETRIC AND BEAMFORM- ING APPROACH

MULTIPLE HARMONIC SOUND SOURCES SEPARA- TION IN THE UDER-DETERMINED CASE BASED ON THE MERGING OF GONIOMETRIC AND BEAMFORM- ING APPROACH MULTIPLE HARMONIC SOUND SOURCES SEPARA- TION IN THE UDER-DETERMINED CASE BASED ON THE MERGING OF GONIOMETRIC AND BEAMFORM- ING APPROACH Patrick Marmaroli, Xavier Falourd, Hervé Lissek EPFL - LEMA, Switzerland

More information

Robust Low-Resource Sound Localization in Correlated Noise

Robust Low-Resource Sound Localization in Correlated Noise INTERSPEECH 2014 Robust Low-Resource Sound Localization in Correlated Noise Lorin Netsch, Jacek Stachurski Texas Instruments, Inc. netsch@ti.com, jacek@ti.com Abstract In this paper we address the problem

More information

Study Of Sound Source Localization Using Music Method In Real Acoustic Environment

Study Of Sound Source Localization Using Music Method In Real Acoustic Environment International Journal of Electronics Engineering Research. ISSN 975-645 Volume 9, Number 4 (27) pp. 545-556 Research India Publications http://www.ripublication.com Study Of Sound Source Localization Using

More information

Enhanced Waveform Interpolative Coding at 4 kbps

Enhanced Waveform Interpolative Coding at 4 kbps Enhanced Waveform Interpolative Coding at 4 kbps Oded Gottesman, and Allen Gersho Signal Compression Lab. University of California, Santa Barbara E-mail: [oded, gersho]@scl.ece.ucsb.edu Signal Compression

More information

Advanced Techniques for Mobile Robotics Location-Based Activity Recognition

Advanced Techniques for Mobile Robotics Location-Based Activity Recognition Advanced Techniques for Mobile Robotics Location-Based Activity Recognition Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Activity Recognition Based on L. Liao, D. J. Patterson, D. Fox,

More information

Boldt, Jesper Bünsow; Kjems, Ulrik; Pedersen, Michael Syskind; Lunner, Thomas; Wang, DeLiang

Boldt, Jesper Bünsow; Kjems, Ulrik; Pedersen, Michael Syskind; Lunner, Thomas; Wang, DeLiang Downloaded from vbn.aau.dk on: januar 14, 19 Aalborg Universitet Estimation of the Ideal Binary Mask using Directional Systems Boldt, Jesper Bünsow; Kjems, Ulrik; Pedersen, Michael Syskind; Lunner, Thomas;

More information

Artificial Beacons with RGB-D Environment Mapping for Indoor Mobile Robot Localization

Artificial Beacons with RGB-D Environment Mapping for Indoor Mobile Robot Localization Sensors and Materials, Vol. 28, No. 6 (2016) 695 705 MYU Tokyo 695 S & M 1227 Artificial Beacons with RGB-D Environment Mapping for Indoor Mobile Robot Localization Chun-Chi Lai and Kuo-Lan Su * Department

More information

About Multichannel Speech Signal Extraction and Separation Techniques

About Multichannel Speech Signal Extraction and Separation Techniques Journal of Signal and Information Processing, 2012, *, **-** doi:10.4236/jsip.2012.***** Published Online *** 2012 (http://www.scirp.org/journal/jsip) About Multichannel Speech Signal Extraction and Separation

More information

Auditory System For a Mobile Robot

Auditory System For a Mobile Robot Auditory System For a Mobile Robot PhD Thesis Jean-Marc Valin Department of Electrical Engineering and Computer Engineering Université de Sherbrooke, Québec, Canada Jean-Marc.Valin@USherbrooke.ca Motivations

More information

Classification of ships using autocorrelation technique for feature extraction of the underwater acoustic noise

Classification of ships using autocorrelation technique for feature extraction of the underwater acoustic noise Classification of ships using autocorrelation technique for feature extraction of the underwater acoustic noise Noha KORANY 1 Alexandria University, Egypt ABSTRACT The paper applies spectral analysis to

More information

Automotive three-microphone voice activity detector and noise-canceller

Automotive three-microphone voice activity detector and noise-canceller Res. Lett. Inf. Math. Sci., 005, Vol. 7, pp 47-55 47 Available online at http://iims.massey.ac.nz/research/letters/ Automotive three-microphone voice activity detector and noise-canceller Z. QI and T.J.MOIR

More information

Channel Modeling ETI 085

Channel Modeling ETI 085 Channel Modeling ETI 085 Overview Lecture no: 9 What is Ultra-Wideband (UWB)? Why do we need UWB channel models? UWB Channel Modeling UWB channel modeling Standardized UWB channel models Fredrik Tufvesson

More information

Super-Resolution and Reconstruction of Sparse Sub-Wavelength Images

Super-Resolution and Reconstruction of Sparse Sub-Wavelength Images Super-Resolution and Reconstruction of Sparse Sub-Wavelength Images Snir Gazit, 1 Alexander Szameit, 1 Yonina C. Eldar, 2 and Mordechai Segev 1 1. Department of Physics and Solid State Institute, Technion,

More information

Relative phase information for detecting human speech and spoofed speech

Relative phase information for detecting human speech and spoofed speech Relative phase information for detecting human speech and spoofed speech Longbiao Wang 1, Yohei Yoshida 1, Yuta Kawakami 1 and Seiichi Nakagawa 2 1 Nagaoka University of Technology, Japan 2 Toyohashi University

More information

Real-time Adaptive Concepts in Acoustics

Real-time Adaptive Concepts in Acoustics Real-time Adaptive Concepts in Acoustics Real-time Adaptive Concepts in Acoustics Blind Signal Separation and Multichannel Echo Cancellation by Daniel W.E. Schobben, Ph. D. Philips Research Laboratories

More information

Separation of Multiple Speech Signals by Using Triangular Microphone Array

Separation of Multiple Speech Signals by Using Triangular Microphone Array Separation of Multiple Speech Signals by Using Triangular Microphone Array 15 Separation of Multiple Speech Signals by Using Triangular Microphone Array Nozomu Hamada 1, Non-member ABSTRACT Speech source

More information

Audio Engineering Society Convention Paper Presented at the 110th Convention 2001 May Amsterdam, The Netherlands

Audio Engineering Society Convention Paper Presented at the 110th Convention 2001 May Amsterdam, The Netherlands Audio Engineering Society Convention Paper Presented at the th Convention May 5 Amsterdam, The Netherlands This convention paper has been reproduced from the author's advance manuscript, without editing,

More information

An Adaptive Algorithm for Speech Source Separation in Overcomplete Cases Using Wavelet Packets

An Adaptive Algorithm for Speech Source Separation in Overcomplete Cases Using Wavelet Packets Proceedings of the th WSEAS International Conference on Signal Processing, Istanbul, Turkey, May 7-9, 6 (pp4-44) An Adaptive Algorithm for Speech Source Separation in Overcomplete Cases Using Wavelet Packets

More information

Applications of Flash and No-Flash Image Pairs in Mobile Phone Photography

Applications of Flash and No-Flash Image Pairs in Mobile Phone Photography Applications of Flash and No-Flash Image Pairs in Mobile Phone Photography Xi Luo Stanford University 450 Serra Mall, Stanford, CA 94305 xluo2@stanford.edu Abstract The project explores various application

More information

Introduction to Audio Watermarking Schemes

Introduction to Audio Watermarking Schemes Introduction to Audio Watermarking Schemes N. Lazic and P. Aarabi, Communication over an Acoustic Channel Using Data Hiding Techniques, IEEE Transactions on Multimedia, Vol. 8, No. 5, October 2006 Multimedia

More information

The psychoacoustics of reverberation

The psychoacoustics of reverberation The psychoacoustics of reverberation Steven van de Par Steven.van.de.Par@uni-oldenburg.de July 19, 2016 Thanks to Julian Grosse and Andreas Häußler 2016 AES International Conference on Sound Field Control

More information

A Novel Technique or Blind Bandwidth Estimation of the Radio Communication Signal

A Novel Technique or Blind Bandwidth Estimation of the Radio Communication Signal International Journal of ISSN 0974-2107 Systems and Technologies IJST Vol.3, No.1, pp 11-16 KLEF 2010 A Novel Technique or Blind Bandwidth Estimation of the Radio Communication Signal Gaurav Lohiya 1,

More information

Adaptive Fingerprint Binarization by Frequency Domain Analysis

Adaptive Fingerprint Binarization by Frequency Domain Analysis Adaptive Fingerprint Binarization by Frequency Domain Analysis Josef Ström Bartůněk, Mikael Nilsson, Jörgen Nordberg, Ingvar Claesson Department of Signal Processing, School of Engineering, Blekinge Institute

More information

DIRECTION of arrival (DOA) estimation of audio sources. Real-Time Multiple Sound Source Localization and Counting using a Circular Microphone Array

DIRECTION of arrival (DOA) estimation of audio sources. Real-Time Multiple Sound Source Localization and Counting using a Circular Microphone Array 1 Real-Time Multiple Sound Source Localization and Counting using a Circular Microphone Array Despoina Pavlidi, Student Member, IEEE, Anthony Griffin, Matthieu Puigt, and Athanasios Mouchtaris, Member,

More information

BEAMFORMING WITHIN THE MODAL SOUND FIELD OF A VEHICLE INTERIOR

BEAMFORMING WITHIN THE MODAL SOUND FIELD OF A VEHICLE INTERIOR BeBeC-2016-S9 BEAMFORMING WITHIN THE MODAL SOUND FIELD OF A VEHICLE INTERIOR Clemens Nau Daimler AG Béla-Barényi-Straße 1, 71063 Sindelfingen, Germany ABSTRACT Physically the conventional beamforming method

More information

Applications of Music Processing

Applications of Music Processing Lecture Music Processing Applications of Music Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Singing Voice Detection Important pre-requisite

More information

Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012

Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012 Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012 o Music signal characteristics o Perceptual attributes and acoustic properties o Signal representations for pitch detection o STFT o Sinusoidal model o

More information

SONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS

SONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS SONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS AKSHAY CHANDRASHEKARAN ANOOP RAMAKRISHNA akshayc@cmu.edu anoopr@andrew.cmu.edu ABHISHEK JAIN GE YANG ajain2@andrew.cmu.edu younger@cmu.edu NIDHI KOHLI R

More information

Grouping Separated Frequency Components by Estimating Propagation Model Parameters in Frequency-Domain Blind Source Separation

Grouping Separated Frequency Components by Estimating Propagation Model Parameters in Frequency-Domain Blind Source Separation 1 Grouping Separated Frequency Components by Estimating Propagation Model Parameters in Frequency-Domain Blind Source Separation Hiroshi Sawada, Senior Member, IEEE, Shoko Araki, Member, IEEE, Ryo Mukai,

More information

Performance Analysis of Feedforward Adaptive Noise Canceller Using Nfxlms Algorithm

Performance Analysis of Feedforward Adaptive Noise Canceller Using Nfxlms Algorithm Performance Analysis of Feedforward Adaptive Noise Canceller Using Nfxlms Algorithm ADI NARAYANA BUDATI 1, B.BHASKARA RAO 2 M.Tech Student, Department of ECE, Acharya Nagarjuna University College of Engineering

More information

IMPROVEMENT OF SPEECH SOURCE LOCALIZATION IN NOISY ENVIRONMENT USING OVERCOMPLETE RATIONAL-DILATION WAVELET TRANSFORMS

IMPROVEMENT OF SPEECH SOURCE LOCALIZATION IN NOISY ENVIRONMENT USING OVERCOMPLETE RATIONAL-DILATION WAVELET TRANSFORMS 1 International Conference on Cyberworlds IMPROVEMENT OF SPEECH SOURCE LOCALIZATION IN NOISY ENVIRONMENT USING OVERCOMPLETE RATIONAL-DILATION WAVELET TRANSFORMS Di Liu, Andy W. H. Khong School of Electrical

More information

Implementation of Adaptive Coded Aperture Imaging using a Digital Micro-Mirror Device for Defocus Deblurring

Implementation of Adaptive Coded Aperture Imaging using a Digital Micro-Mirror Device for Defocus Deblurring Implementation of Adaptive Coded Aperture Imaging using a Digital Micro-Mirror Device for Defocus Deblurring Ashill Chiranjan and Bernardt Duvenhage Defence, Peace, Safety and Security Council for Scientific

More information

Reducing comb filtering on different musical instruments using time delay estimation

Reducing comb filtering on different musical instruments using time delay estimation Reducing comb filtering on different musical instruments using time delay estimation Alice Clifford and Josh Reiss Queen Mary, University of London alice.clifford@eecs.qmul.ac.uk Abstract Comb filtering

More information

An analysis of blind signal separation for real time application

An analysis of blind signal separation for real time application University of Wollongong Research Online University of Wollongong Thesis Collection 1954-2016 University of Wollongong Thesis Collections 2006 An analysis of blind signal separation for real time application

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 12 Speech Signal Processing 14/03/25 http://www.ee.unlv.edu/~b1morris/ee482/

More information

AD-HOC acoustic sensor networks composed of randomly

AD-HOC acoustic sensor networks composed of randomly IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 24, NO. 6, JUNE 2016 1079 An Iterative Approach to Source Counting and Localization Using Two Distant Microphones Lin Wang, Tsz-Kin

More information

Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter

Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter 1 Gupteswar Sahu, 2 D. Arun Kumar, 3 M. Bala Krishna and 4 Jami Venkata Suman Assistant Professor, Department of ECE,

More information

Adaptive Waveforms for Target Class Discrimination

Adaptive Waveforms for Target Class Discrimination Adaptive Waveforms for Target Class Discrimination Jun Hyeong Bae and Nathan A. Goodman Department of Electrical and Computer Engineering University of Arizona 3 E. Speedway Blvd, Tucson, Arizona 857 dolbit@email.arizona.edu;

More information

Speech/Music Discrimination via Energy Density Analysis

Speech/Music Discrimination via Energy Density Analysis Speech/Music Discrimination via Energy Density Analysis Stanis law Kacprzak and Mariusz Zió lko Department of Electronics, AGH University of Science and Technology al. Mickiewicza 30, Kraków, Poland {skacprza,

More information

SURVEILLANCE SYSTEMS WITH AUTOMATIC RESTORATION OF LINEAR MOTION AND OUT-OF-FOCUS BLURRED IMAGES. Received August 2008; accepted October 2008

SURVEILLANCE SYSTEMS WITH AUTOMATIC RESTORATION OF LINEAR MOTION AND OUT-OF-FOCUS BLURRED IMAGES. Received August 2008; accepted October 2008 ICIC Express Letters ICIC International c 2008 ISSN 1881-803X Volume 2, Number 4, December 2008 pp. 409 414 SURVEILLANCE SYSTEMS WITH AUTOMATIC RESTORATION OF LINEAR MOTION AND OUT-OF-FOCUS BLURRED IMAGES

More information

3rd International Conference on Machinery, Materials and Information Technology Applications (ICMMITA 2015)

3rd International Conference on Machinery, Materials and Information Technology Applications (ICMMITA 2015) 3rd International Conference on Machinery, Materials and Information echnology Applications (ICMMIA 015) he processing of background noise in secondary path identification of Power transformer ANC system

More information

Voice Activity Detection

Voice Activity Detection Voice Activity Detection Speech Processing Tom Bäckström Aalto University October 2015 Introduction Voice activity detection (VAD) (or speech activity detection, or speech detection) refers to a class

More information

Calibration of Microphone Arrays for Improved Speech Recognition

Calibration of Microphone Arrays for Improved Speech Recognition MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Calibration of Microphone Arrays for Improved Speech Recognition Michael L. Seltzer, Bhiksha Raj TR-2001-43 December 2001 Abstract We present

More information

Speech Enhancement Using Beamforming Dr. G. Ramesh Babu 1, D. Lavanya 2, B. Yamuna 2, H. Divya 2, B. Shiva Kumar 2, B.

Speech Enhancement Using Beamforming Dr. G. Ramesh Babu 1, D. Lavanya 2, B. Yamuna 2, H. Divya 2, B. Shiva Kumar 2, B. www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 4 Issue 4 April 2015, Page No. 11143-11147 Speech Enhancement Using Beamforming Dr. G. Ramesh Babu 1, D. Lavanya

More information

Frugal Sensing Spectral Analysis from Power Inequalities

Frugal Sensing Spectral Analysis from Power Inequalities Frugal Sensing Spectral Analysis from Power Inequalities Nikos Sidiropoulos Joint work with Omar Mehanna IEEE SPAWC 2013 Plenary, June 17, 2013, Darmstadt, Germany Wideband Spectrum Sensing (for CR/DSM)

More information

Acoustic Source Tracking in Reverberant Environment Using Regional Steered Response Power Measurement

Acoustic Source Tracking in Reverberant Environment Using Regional Steered Response Power Measurement Acoustic Source Tracing in Reverberant Environment Using Regional Steered Response Power Measurement Kai Wu and Andy W. H. Khong School of Electrical and Electronic Engineering, Nanyang Technological University,

More information

Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues

Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues DeLiang Wang Perception & Neurodynamics Lab The Ohio State University Outline of presentation Introduction Human performance Reverberation

More information