Classification of ships using autocorrelation technique for feature extraction of the underwater acoustic noise
|
|
- Letitia Stone
- 5 years ago
- Views:
Transcription
1 Classification of ships using autocorrelation technique for feature extraction of the underwater acoustic noise Noha KORANY 1 Alexandria University, Egypt ABSTRACT The paper applies spectral analysis to the underwater acoustic noise radiated by ships. A frequency band is specified to characterize the sound produced by different platforms. Hence the paper proposes a new technique for feature extraction by applying autocorrelation technique and discrete cosine transform. The features extracted are employed by a recognition engine that uses Gaussian mixture model for the classification of the underwater acoustic noise radiated by ships. The performance of the recognition is investigated for various settings of the proposed technique. Keywords: ships noise, underwater signals, identification 54.3 I-INCE Classification of Subjects Number(s): 1. INTRODUCTION Automatic classification of underwater acoustic signals has an increased amount of attention in the last decades. Ships radiate noise whose spectral features are related to machinery, propellers, generators, etc. These signals propagate through the underwater and form the tonal signature of the platform (1, 2). The paper aims to recognize ships according to their unique sound signature. Underwater acoustic signals are recorded, preprocessed, and analyzed. Then acoustic features are extracted, and are employed by a recognition algorithm for platform identification. The ships radiate noise in all frequency bands. The key is to develop methods for extracting features that fit better with the recognizer. Various techniques have been developed for extracting speech features, such as linear predictive coefficients (LPC), Mel-frequency cepstral coefficients (MFCC), perceptual linear predictive cepstral coefficients (PLPCC) and relative spectral perceptual linear predictive cepstral coefficients (RASTA_PLPCC) (3,4,5). Those techniques have been used for feature extraction to characterize the noise radiated by ships, and MFCC show the best performance when it is employed by Gaussian mixture model in clean environments (6, 7). The paper proposes a method to extract important features at various amplitudes and frequencies. These features form the acoustic signature of the platform. The method is based on the application of autocorrelation technique and the discrete cosine transform (DCT) to the signals produced by various platform. The objective is to specify the frequency bands and/or energy levels that provide the best performance for the classification problem. The paper is organized as follows. The next section describes the proposed method for feature extraction. Section 3 presents the database, whereas section 4 develops the experiments and discuss the results. Finally, section 5 summarizes the paper. 2. FEATURE EXTRACTION METHOD The first operation for automatic classification of underwater acoustic signals is extracting features to describe their spectral characteristics. The noise produced by various platforms is periodic, fairly continuous, and appears over a large time area. To characterize these signals, tonal signatures are extracted. These tonal signatures are full of discrete frequencies at various amplitudes. Figure 1 shows the noise radiated from a ship. 1 Noha.korany@alexu.edu.eg 7097
2 Figure 1 - Noise radiated from a ship over 1 s. The autocorrelation technique is a useful tool for detecting periodicity. For a discrete signal x, the autocorrelation function, R, is defined in equation (1), where N is the number of samples in x, and Mo is the number of autocorrelation points to be computed. The autocorrelation function for 1s of ship s noise is shown in figure 2. R N m 1 n 0 m 1 N x n x n m 0 m Mo (1) Figure 2 - The autocorrelation function for 1s of ship s noise The objective is to represent the signal energy at various frequencies. This is accomplished by applying DCT to the autocorrelation function. DCT has a higher degree of spectral compaction as compared to discrete Fourier transform (DFT). This is a good choice, as the signal is represented using a relatively small set of DCT coefficients that contain significant amount of energy (8). Figure 3 shows the block diagram of the proposed feature extraction method. First, the noise radiated from ships are recorded, and are stored in a digital format using analog-to-digital converter card. The next step is to compute the autocorrelation function. The autocorrelation signal is blocked into short time segments, normally from ms, and each segment is multiplied by a window function. Hamming window is often used. Then, DCT is applied to the windowed signal, and the autocorrelation-based DCT feature vector is computed for each segment. Finally, the compression step reduces the number of coefficients within the feature vector. The reduction takes place according to the amplitude and the frequency of the coefficients, as will be discussed in section 4. The main goal of reduction is to extract the most relevant features using a minimum number of coefficients. 7098
3 Ship noise Preprocessing x(n) Autocorrelation R(m) Windowing Reduced dimension of autocorrelation-based DCT feature vectors Compression Autocorrelation-based DCT feature vectors DCT Figure 3 The block diagram of the proposed feature extraction method 3. DATABASE Sounds that are produced from ships are simulated using predefined characteristics like number of shafts, blades per shaft, speed, direction and distance for each ship. The database consists of three-surface and three-subsurface ships with different numbers of blades and shafts. Every ship was recorded in course 000 with four different speeds, also was recorded with four different ranges and finally was recorded with four different directions degree. Every range and direction was measured in relation to own ship with course 000. Now the data sets contain 6 x 3 x 4 = 72 audio files from six different types of simulated ships. Each file was recorded using mono-format with same microphone, same sound card, and has approximately durations of 9 seconds long. Each file was sampled at Hz; 16-bit quantization level was used. Next the data were segmented into approximately 23.2 ms frame s length, overlapped by 50% overlapped frame. A Hamming window was then applied to each frame. 4. EXPERIMENTS AND RESULTS 4.1 Experiments description In this part, the autocorrelation-based DCT feature vector is first extracted, and then it is employed by the recognition engine. Gaussian mixture model is employed for the identification problem (9). The recognition engine employs a train signal, and a test signal each of 3 s duration. Two Gaussian components are used (7). Three experiments are conducted to investigate the performance of targets identification employing various number of the autocorrelation-based DCT feature vector. The identification rate is determined for each case. The aim of the first experiment is to determine the effect of the number of the autocorrelation-based DCT coefficients on the identification rate. The second experiment aims to specify the relevant frequency band that contains the most important features for the identification problem. The dimension of the autocorrelation-based DCT feature vector is reduced according to the frequency band chosen. Octave bands are used. The feature vector is limited to those coefficients that belong to a certain octave band centered at frequency f c, whereas the remaining coefficients are discarded. The reduced feature vector is employed by the recognition engine, targets are identified and the identification rate is calculated. The recognition process is repeated for the different octave bands whose center frequencies are from 125 Hz to 4 khz. Moreover, the coefficients that belong to multiple number of frequency bands are combined, then they are employed for target identification, and the identification rate is calculated. Equation (2) relates the coefficient number, k, to its frequency, f. f s is the sampling frequency, and N is the number of samples per frame. k 2 N 1 f f s (2) The third experiment is conducted to determine if the most important DCT coefficients are those having high energy or if those coefficients with low amplitudes within a certain frequency band are the relevant features for the identification problem. The feature vector is reduced by selecting those coefficients having the highest energy within a specified range of frequencies. The reduced feature vector is employed by the recognition engine, and the identification rate is calculated. 7099
4 4.2 Results and discussion Figure 4 shows the results for experiment1. The number of the coefficients varies from 10 to 1024, discarding the zero-order coefficient. The maximum identification rate equals 93%, and it is reached when 512 coefficients are employed for the identification problem. Figure 4 The identification rate employing various number of autocorrelation-based DCT coefficients The results of the second experiment are presented on tables 1 and 2. Table 1 shows the identification rate when those coefficients belonging to single octave bands are employed by the recognition engine, whereas table 2 shows the identification rate when coefficients belonging to multiple bands are used for the identification problem. Table 2 shows that maximum identification rate is obtained when coefficients that belong to the frequency band centered at 1 khz are employed by the recognition engine. It is found that identification rate reaches 86% when employing 34 coefficients. Moreover, table 3 shows that combining those coefficients that belong to the frequency band centered at 500 Hz and those belonging to that band centered at 1 khz, the identification rate increases, and it reaches 90%. It is concluded that the most important features are those within the frequency band centered at 1 khz. Hence, the identification rate is calculated employing a number of the coefficients that belong to the band centered at 1 khz. The number of the coefficients varies from 15 to 30. Table 3 shows the identification rate for various number of coefficients within that frequency band. Table 3 concludes that maximum identification rate is reached when selecting the most relevant coefficients within that band. It is also concluded that the coefficients lying at the highest frequency range within the 1 khz band are the most relevant for targets identification, as the identification rate reaches 84% for high-order 15 coefficients, whereas 67% of the targets are identified when low-order 15 coefficients are employed by the recognition engine. Table 1 The identification rate versus number of coefficients within a certain octave band Center frequency of an octave band (Hz) Lower higher frequency within a band (Hz) Number of coefficients employed Identification rate (%)
5 Table 2 The identification rate versus number of coefficients within combined octave bands Center frequency of Number of Identification rate combined octave coefficients (%) bands (Hz) employed 1000, 2000, , , 500, , 250, 500, , 500, Table 3 The identification rate versus various number of coefficients within the 1 khz octave band Lower higher frequency (Hz) Number of coefficients employed Identification rate (%) The results for the third experiment are demonstrated on tables 4 and 5 where the identification rate for each setting of reduced feature vector is shown. Tables 4 and 5 show the identification rate for various number of coefficients within the 1 khz frequency band. For table 4 the coefficients that have the highest energy are selected, whereas the coefficients having the lowest energy are employed in table 5. Tables 4 and 5 show that maximum identification rate of 47% is reached when highest energy coefficients within the 1 khz band are selected and are employed for targets identification. Comparing table 4 to table 5, it is found that the highest energy coefficients are more significant that the lowest energy ones for the identification problem. Comparing table 3 to table 4, it is shown that the identification rate reaches 92% for 30 consecutive coefficients within the 1 khz band, whereas the rate decreases to 47% for 30 coefficients that having the highest energy within the same band. Then, important features are found at low amplitudes within the frequency band. It is concluded that the spectral distribution of the feature vector affects the identification rate more than the energy level of discrete frequencies. Table 4 The identification rate versus various number of coefficients within the 1 khz octave band, coefficients with highest energy are selected Number of coefficients employed Identification rate (%)
6 Table 5 The identification rate versus various number of coefficients within the 1 khz octave band, coefficients with lowest energy are selected Number of coefficients employed Identification rate (%) CONCLUSIONS The paper proposes the extraction of the autocorrelation-based DCT coefficients to characterize the noise radiated by ships. Those coefficients are employed by the Gaussian mixture model to identify the targets. The performance of the recognition system is investigated. The number of coefficients employed by the recognition engine affects the identification rate. Coefficients at various amplitudes and frequencies are selected and are employed for the identification problem. The goal is to find those coefficients that fit better with the recognizer. It is concluded that the most important features are those within the frequency band centered at 1 khz, and they yield to the highest identification rate. Moreover, relevant features are found at low amplitudes within that frequency band, and it is concluded that the spectral distribution of the feature vector affects the identification rate more than the energy level of discrete frequencies. High identification rate is obtained when consecutive coefficients within the 1 khz band are employed by the recognition engine. On the other hand this rate decreases significantly when employing the highest energy coefficients within the same band. ACKNOWLEDGEMENTS The author would like to acknowledge gratefully Eng. Mohammed abd Elzaher, and Dr. Hatem Khater for the construction of database. REFERENCES 1. McKenna M. F., Ross D., Wiggins S.M., Hildebrand J. A. Underwater radiated noise from modern commercial ships. J Acoust Soc Am. 2012; 131(1): Wang L.S., Robinson S.P., Theobald P., Lepper P.A., Hayman G., Humphrey V.F. Measurements of radiated ship noise. Proceedings of meetings on acoustics; 2-6 July 2012; Edinburgh, Scottland p Davis S., Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust, speech, signal processing 1980; 28(4): Hermansky H. Perceptual Linear Predictive (PLP) analysis of speech. J Acoust Soc Am. 1990; 87(4): Hermansky H., Morgan N. RASTA processing of speech. IEEE Trans. speech and audio processing 1994; 2: Korany N., Abd Elzaher M., Khater H. Classification of underwater acoustic signals using various extraction methods. Fortschritte der Akustik Deutsche Gesellschaft fuer Akustik DAGA 2012; March 2012; Darmstadt, Germany p Korany N., Abd Elzaher M., Khater H. Investigation about the performance of GMM for recognition of underwater acoustic signals. Fortschritte der Akustik Deutsche Gesellschaft fuer Akustik DAGA 2012; March 2012; Darmstadt, Germany p Wihelm B., Burge M.J. Digital image processing: an algorithmic introduction using JAVA. Springer London; Reynolds D.A., Rose R.C. Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans. speech and audio processing 1995; 3(1):
AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND TRANSFER FUNCTIONS
AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND TRANSFER FUNCTIONS Kuldeep Kumar 1, R. K. Aggarwal 1 and Ankita Jain 2 1 Department of Computer Engineering, National Institute
More informationPerformance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches
Performance study of Text-independent Speaker identification system using & I for Telephone and Microphone Speeches Ruchi Chaudhary, National Technical Research Organization Abstract: A state-of-the-art
More informationPerformance Analysis of MFCC and LPCC Techniques in Automatic Speech Recognition
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume - 3 Issue - 8 August, 2014 Page No. 7727-7732 Performance Analysis of MFCC and LPCC Techniques in Automatic
More informationMel Spectrum Analysis of Speech Recognition using Single Microphone
International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree
More informationI D I A P. Mel-Cepstrum Modulation Spectrum (MCMS) Features for Robust ASR R E S E A R C H R E P O R T. Iain McCowan a Hemant Misra a,b
R E S E A R C H R E P O R T I D I A P Mel-Cepstrum Modulation Spectrum (MCMS) Features for Robust ASR a Vivek Tyagi Hervé Bourlard a,b IDIAP RR 3-47 September 23 Iain McCowan a Hemant Misra a,b to appear
More informationRobust Speaker Identification for Meetings: UPC CLEAR 07 Meeting Room Evaluation System
Robust Speaker Identification for Meetings: UPC CLEAR 07 Meeting Room Evaluation System Jordi Luque and Javier Hernando Technical University of Catalonia (UPC) Jordi Girona, 1-3 D5, 08034 Barcelona, Spain
More informationI D I A P. On Factorizing Spectral Dynamics for Robust Speech Recognition R E S E A R C H R E P O R T. Iain McCowan a Hemant Misra a,b
R E S E A R C H R E P O R T I D I A P On Factorizing Spectral Dynamics for Robust Speech Recognition a Vivek Tyagi Hervé Bourlard a,b IDIAP RR 3-33 June 23 Iain McCowan a Hemant Misra a,b to appear in
More informationAudio Fingerprinting using Fractional Fourier Transform
Audio Fingerprinting using Fractional Fourier Transform Swati V. Sutar 1, D. G. Bhalke 2 1 (Department of Electronics & Telecommunication, JSPM s RSCOE college of Engineering Pune, India) 2 (Department,
More informationDimension Reduction of the Modulation Spectrogram for Speaker Verification
Dimension Reduction of the Modulation Spectrogram for Speaker Verification Tomi Kinnunen Speech and Image Processing Unit Department of Computer Science University of Joensuu, Finland Kong Aik Lee and
More informationSPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT
SPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT RASHMI MAKHIJANI Department of CSE, G. H. R.C.E., Near CRPF Campus,Hingna Road, Nagpur, Maharashtra, India rashmi.makhijani2002@gmail.com
More informationInternational Journal of Engineering and Techniques - Volume 1 Issue 6, Nov Dec 2015
RESEARCH ARTICLE OPEN ACCESS A Comparative Study on Feature Extraction Technique for Isolated Word Speech Recognition Easwari.N 1, Ponmuthuramalingam.P 2 1,2 (PG & Research Department of Computer Science,
More informationDERIVATION OF TRAPS IN AUDITORY DOMAIN
DERIVATION OF TRAPS IN AUDITORY DOMAIN Petr Motlíček, Doctoral Degree Programme (4) Dept. of Computer Graphics and Multimedia, FIT, BUT E-mail: motlicek@fit.vutbr.cz Supervised by: Dr. Jan Černocký, Prof.
More informationSpeech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter
Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter 1 Gupteswar Sahu, 2 D. Arun Kumar, 3 M. Bala Krishna and 4 Jami Venkata Suman Assistant Professor, Department of ECE,
More informationComparison of Spectral Analysis Methods for Automatic Speech Recognition
INTERSPEECH 2013 Comparison of Spectral Analysis Methods for Automatic Speech Recognition Venkata Neelima Parinam, Chandra Vootkuri, Stephen A. Zahorian Department of Electrical and Computer Engineering
More informationAudio Similarity. Mark Zadel MUMT 611 March 8, Audio Similarity p.1/23
Audio Similarity Mark Zadel MUMT 611 March 8, 2004 Audio Similarity p.1/23 Overview MFCCs Foote Content-Based Retrieval of Music and Audio (1997) Logan, Salomon A Music Similarity Function Based On Signal
More informationChange Point Determination in Audio Data Using Auditory Features
INTL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 0, VOL., NO., PP. 8 90 Manuscript received April, 0; revised June, 0. DOI: /eletel-0-00 Change Point Determination in Audio Data Using Auditory Features
More informationA CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION
17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 A CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION
More informationDimension Reduction of the Modulation Spectrogram for Speaker Verification
Dimension Reduction of the Modulation Spectrogram for Speaker Verification Tomi Kinnunen Speech and Image Processing Unit Department of Computer Science University of Joensuu, Finland tkinnu@cs.joensuu.fi
More informationDetermining Guava Freshness by Flicking Signal Recognition Using HMM Acoustic Models
Determining Guava Freshness by Flicking Signal Recognition Using HMM Acoustic Models Rong Phoophuangpairoj applied signal processing to animal sounds [1]-[3]. In speech recognition, digitized human speech
More informationMODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS
MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS 1 S.PRASANNA VENKATESH, 2 NITIN NARAYAN, 3 K.SAILESH BHARATHWAAJ, 4 M.P.ACTLIN JEEVA, 5 P.VIJAYALAKSHMI 1,2,3,4,5 SSN College of Engineering,
More informationSound Recognition. ~ CSE 352 Team 3 ~ Jason Park Evan Glover. Kevin Lui Aman Rawat. Prof. Anita Wasilewska
Sound Recognition ~ CSE 352 Team 3 ~ Jason Park Evan Glover Kevin Lui Aman Rawat Prof. Anita Wasilewska What is Sound? Sound is a vibration that propagates as a typically audible mechanical wave of pressure
More informationRhythmic Similarity -- a quick paper review. Presented by: Shi Yong March 15, 2007 Music Technology, McGill University
Rhythmic Similarity -- a quick paper review Presented by: Shi Yong March 15, 2007 Music Technology, McGill University Contents Introduction Three examples J. Foote 2001, 2002 J. Paulus 2002 S. Dixon 2004
More informationGammatone Cepstral Coefficient for Speaker Identification
Gammatone Cepstral Coefficient for Speaker Identification Rahana Fathima 1, Raseena P E 2 M. Tech Student, Ilahia college of Engineering and Technology, Muvattupuzha, Kerala, India 1 Asst. Professor, Ilahia
More informationSOUND SOURCE RECOGNITION AND MODELING
SOUND SOURCE RECOGNITION AND MODELING CASA seminar, summer 2000 Antti Eronen antti.eronen@tut.fi Contents: Basics of human sound source recognition Timbre Voice recognition Recognition of environmental
More informationSpeech Signal Analysis
Speech Signal Analysis Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 2&3 14,18 January 216 ASR Lectures 2&3 Speech Signal Analysis 1 Overview Speech Signal Analysis for
More informationElectronic disguised voice identification based on Mel- Frequency Cepstral Coefficient analysis
International Journal of Scientific and Research Publications, Volume 5, Issue 11, November 2015 412 Electronic disguised voice identification based on Mel- Frequency Cepstral Coefficient analysis Shalate
More informationAuditory motivated front-end for noisy speech using spectro-temporal modulation filtering
Auditory motivated front-end for noisy speech using spectro-temporal modulation filtering Sriram Ganapathy a) and Mohamed Omar IBM T.J. Watson Research Center, Yorktown Heights, New York 10562 ganapath@us.ibm.com,
More informationIsolated Digit Recognition Using MFCC AND DTW
MarutiLimkar a, RamaRao b & VidyaSagvekar c a Terna collegeof Engineering, Department of Electronics Engineering, Mumbai University, India b Vidyalankar Institute of Technology, Department ofelectronics
More informationMFCC AND GMM BASED TAMIL LANGUAGE SPEAKER IDENTIFICATION SYSTEM
www.advancejournals.org Open Access Scientific Publisher MFCC AND GMM BASED TAMIL LANGUAGE SPEAKER IDENTIFICATION SYSTEM ABSTRACT- P. Santhiya 1, T. Jayasankar 1 1 AUT (BIT campus), Tiruchirappalli, India
More informationSpeech Synthesis; Pitch Detection and Vocoders
Speech Synthesis; Pitch Detection and Vocoders Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University May. 29, 2008 Speech Synthesis Basic components of the text-to-speech
More informationFeature Extraction Using 2-D Autoregressive Models For Speaker Recognition
Feature Extraction Using 2-D Autoregressive Models For Speaker Recognition Sriram Ganapathy 1, Samuel Thomas 1 and Hynek Hermansky 1,2 1 Dept. of ECE, Johns Hopkins University, USA 2 Human Language Technology
More informationUsing RASTA in task independent TANDEM feature extraction
R E S E A R C H R E P O R T I D I A P Using RASTA in task independent TANDEM feature extraction Guillermo Aradilla a John Dines a Sunil Sivadas a b IDIAP RR 04-22 April 2004 D a l l e M o l l e I n s t
More informationNOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or
NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or other reproductions of copyrighted material. Any copying
More informationScienceDirect. Unsupervised Speech Segregation Using Pitch Information and Time Frequency Masking
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 46 (2015 ) 122 126 International Conference on Information and Communication Technologies (ICICT 2014) Unsupervised Speech
More informationSignal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2
Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter
More informationIsolated Word Recognition Based on Combination of Multiple Noise-Robust Techniques
Isolated Word Recognition Based on Combination of Multiple Noise-Robust Techniques 81 Isolated Word Recognition Based on Combination of Multiple Noise-Robust Techniques Noboru Hayasaka 1, Non-member ABSTRACT
More informationRASTA-PLP SPEECH ANALYSIS. Aruna Bayya. Phil Kohn y TR December 1991
RASTA-PLP SPEECH ANALYSIS Hynek Hermansky Nelson Morgan y Aruna Bayya Phil Kohn y TR-91-069 December 1991 Abstract Most speech parameter estimation techniques are easily inuenced by the frequency response
More informationRelative phase information for detecting human speech and spoofed speech
Relative phase information for detecting human speech and spoofed speech Longbiao Wang 1, Yohei Yoshida 1, Yuta Kawakami 1 and Seiichi Nakagawa 2 1 Nagaoka University of Technology, Japan 2 Toyohashi University
More informationspeech signal S(n). This involves a transformation of S(n) into another signal or a set of signals
16 3. SPEECH ANALYSIS 3.1 INTRODUCTION TO SPEECH ANALYSIS Many speech processing [22] applications exploits speech production and perception to accomplish speech analysis. By speech analysis we extract
More informationPerceptually Motivated Linear Prediction Cepstral Features for Network Speech Recognition
Perceptually Motivated Linear Prediction Cepstral Features for Network Speech Recognition Aadel Alatwi, Stephen So, Kuldip K. Paliwal Signal Processing Laboratory Griffith University, Brisbane, QLD, 4111,
More informationAutomatic Text-Independent. Speaker. Recognition Approaches Using Binaural Inputs
Automatic Text-Independent Speaker Recognition Approaches Using Binaural Inputs Karim Youssef, Sylvain Argentieri and Jean-Luc Zarader 1 Outline Automatic speaker recognition: introduction Designed systems
More informationSPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN. Yu Wang and Mike Brookes
SPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN Yu Wang and Mike Brookes Department of Electrical and Electronic Engineering, Exhibition Road, Imperial College London,
More informationSpeech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 7, Issue, Ver. I (Mar. - Apr. 7), PP 4-46 e-issn: 9 4, p-issn No. : 9 497 www.iosrjournals.org Speech Enhancement Using Spectral Flatness Measure
More informationOriginal Research Articles
Original Research Articles Researchers A.K.M Fazlul Haque Department of Electronics and Telecommunication Engineering Daffodil International University Emailakmfhaque@daffodilvarsity.edu.bd FFT and Wavelet-Based
More informationRECENTLY, there has been an increasing interest in noisy
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 52, NO. 9, SEPTEMBER 2005 535 Warped Discrete Cosine Transform-Based Noisy Speech Enhancement Joon-Hyuk Chang, Member, IEEE Abstract In
More informationChapter 4 SPEECH ENHANCEMENT
44 Chapter 4 SPEECH ENHANCEMENT 4.1 INTRODUCTION: Enhancement is defined as improvement in the value or Quality of something. Speech enhancement is defined as the improvement in intelligibility and/or
More informationEffective post-processing for single-channel frequency-domain speech enhancement Weifeng Li a
R E S E A R C H R E P O R T I D I A P Effective post-processing for single-channel frequency-domain speech enhancement Weifeng Li a IDIAP RR 7-7 January 8 submitted for publication a IDIAP Research Institute,
More informationSpeech/Music Change Point Detection using Sonogram and AANN
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 6, Number 1 (2016), pp. 45-49 International Research Publications House http://www. irphouse.com Speech/Music Change
More informationText and Language Independent Speaker Identification By Using Short-Time Low Quality Signals
Text and Language Independent Speaker Identification By Using Short-Time Low Quality Signals Maurizio Bocca*, Reino Virrankoski**, Heikki Koivo* * Control Engineering Group Faculty of Electronics, Communications
More informationSpeech Compression Using Voice Excited Linear Predictive Coding
Speech Compression Using Voice Excited Linear Predictive Coding Ms.Tosha Sen, Ms.Kruti Jay Pancholi PG Student, Asst. Professor, L J I E T, Ahmedabad Abstract : The aim of the thesis is design good quality
More informationIntroduction of Audio and Music
1 Introduction of Audio and Music Wei-Ta Chu 2009/12/3 Outline 2 Introduction of Audio Signals Introduction of Music 3 Introduction of Audio Signals Wei-Ta Chu 2009/12/3 Li and Drew, Fundamentals of Multimedia,
More informationAn Improved Voice Activity Detection Based on Deep Belief Networks
e-issn 2455 1392 Volume 2 Issue 4, April 2016 pp. 676-683 Scientific Journal Impact Factor : 3.468 http://www.ijcter.com An Improved Voice Activity Detection Based on Deep Belief Networks Shabeeba T. K.
More informationAudio Signal Compression using DCT and LPC Techniques
Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,
More informationRobust telephone speech recognition based on channel compensation
Pattern Recognition 32 (1999) 1061}1067 Robust telephone speech recognition based on channel compensation Jiqing Han*, Wen Gao Department of Computer Science and Engineering, Harbin Institute of Technology,
More informationEnhancement of Speech Signal Based on Improved Minima Controlled Recursive Averaging and Independent Component Analysis
Enhancement of Speech Signal Based on Improved Minima Controlled Recursive Averaging and Independent Component Analysis Mohini Avatade & S.L. Sahare Electronics & Telecommunication Department, Cummins
More informationTopic. Spectrogram Chromagram Cesptrogram. Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Topic Spectrogram Chromagram Cesptrogram Short time Fourier Transform Break signal into windows Calculate DFT of each window The Spectrogram spectrogram(y,1024,512,1024,fs,'yaxis'); A series of short term
More informationKONKANI SPEECH RECOGNITION USING HILBERT-HUANG TRANSFORM
KONKANI SPEECH RECOGNITION USING HILBERT-HUANG TRANSFORM Shruthi S Prabhu 1, Nayana C G 2, Ashwini B N 3, Dr. Parameshachari B D 4 Assistant Professor, Department of Telecommunication Engineering, GSSSIETW,
More informationIdentification of disguised voices using feature extraction and classification
Identification of disguised voices using feature extraction and classification Lini T Lal, Avani Nath N.J, Dept. of Electronics and Communication, TKMIT, Kollam, Kerala, India linithyvila23@gmail.com,
More informationAn Audio Fingerprint Algorithm Based on Statistical Characteristics of db4 Wavelet
Journal of Information & Computational Science 8: 14 (2011) 3027 3034 Available at http://www.joics.com An Audio Fingerprint Algorithm Based on Statistical Characteristics of db4 Wavelet Jianguo JIANG
More informationReal time speaker recognition from Internet radio
Real time speaker recognition from Internet radio Radoslaw Weychan, Tomasz Marciniak, Agnieszka Stankiewicz, Adam Dabrowski Poznan University of Technology Faculty of Computing Science Chair of Control
More informationANALYSIS OF ACOUSTIC FEATURES FOR AUTOMATED MULTI-TRACK MIXING
th International Society for Music Information Retrieval Conference (ISMIR ) ANALYSIS OF ACOUSTIC FEATURES FOR AUTOMATED MULTI-TRACK MIXING Jeffrey Scott, Youngmoo E. Kim Music and Entertainment Technology
More informationSpeech Synthesis using Mel-Cepstral Coefficient Feature
Speech Synthesis using Mel-Cepstral Coefficient Feature By Lu Wang Senior Thesis in Electrical Engineering University of Illinois at Urbana-Champaign Advisor: Professor Mark Hasegawa-Johnson May 2018 Abstract
More informationSpectral analysis of seismic signals using Burg algorithm V. Ravi Teja 1, U. Rakesh 2, S. Koteswara Rao 3, V. Lakshmi Bharathi 4
Volume 114 No. 1 217, 163-171 ISSN: 1311-88 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu Spectral analysis of seismic signals using Burg algorithm V. avi Teja
More informationDWT and LPC based feature extraction methods for isolated word recognition
RESEARCH Open Access DWT and LPC based feature extraction methods for isolated word recognition Navnath S Nehe 1* and Raghunath S Holambe 2 Abstract In this article, new feature extraction methods, which
More information651 Analysis of LSF frame selection in voice conversion
651 Analysis of LSF frame selection in voice conversion Elina Helander 1, Jani Nurminen 2, Moncef Gabbouj 1 1 Institute of Signal Processing, Tampere University of Technology, Finland 2 Noia Technology
More informationSpeech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm
International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,
More informationENHANCEMENT OF THE TRANSMISSION LOSS OF DOUBLE PANELS BY MEANS OF ACTIVELY CONTROLLING THE CAVITY SOUND FIELD
ENHANCEMENT OF THE TRANSMISSION LOSS OF DOUBLE PANELS BY MEANS OF ACTIVELY CONTROLLING THE CAVITY SOUND FIELD André Jakob, Michael Möser Technische Universität Berlin, Institut für Technische Akustik,
More informationSynchronous Overlap and Add of Spectra for Enhancement of Excitation in Artificial Bandwidth Extension of Speech
INTERSPEECH 5 Synchronous Overlap and Add of Spectra for Enhancement of Excitation in Artificial Bandwidth Extension of Speech M. A. Tuğtekin Turan and Engin Erzin Multimedia, Vision and Graphics Laboratory,
More informationA Real Time Noise-Robust Speech Recognition System
A Real Time Noise-Robust Speech Recognition System 7 A Real Time Noise-Robust Speech Recognition System Naoya Wada, Shingo Yoshizawa, and Yoshikazu Miyanaga, Non-members ABSTRACT This paper introduces
More informationAutonomous Vehicle Speaker Verification System
Autonomous Vehicle Speaker Verification System Functional Requirements List and Performance Specifications Aaron Pfalzgraf Christopher Sullivan Project Advisor: Dr. Jose Sanchez 4 November 2013 AVSVS 2
More informationAudio processing methods on marine mammal vocalizations
Audio processing methods on marine mammal vocalizations Xanadu Halkias Laboratory for the Recognition and Organization of Speech and Audio http://labrosa.ee.columbia.edu Sound to Signal sound is pressure
More informationPerformance Analysiss of Speech Enhancement Algorithm for Robust Speech Recognition System
Performance Analysiss of Speech Enhancement Algorithm for Robust Speech Recognition System C.GANESH BABU 1, Dr.P..T.VANATHI 2 R.RAMACHANDRAN 3, M.SENTHIL RAJAA 3, R.VENGATESH 3 1 Research Scholar (PSGCT)
More informationMMSE STSA Based Techniques for Single channel Speech Enhancement Application Simit Shah 1, Roma Patel 2
MMSE STSA Based Techniques for Single channel Speech Enhancement Application Simit Shah 1, Roma Patel 2 1 Electronics and Communication Department, Parul institute of engineering and technology, Vadodara,
More informationIEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING. Department of Signal Theory and Communications. c/ Gran Capitán s/n, Campus Nord, Edificio D5
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING Javier Hernando Department of Signal Theory and Communications Polytechnical University of Catalonia c/ Gran Capitán s/n, Campus Nord, Edificio D5 08034
More informationSingle Channel Speaker Segregation using Sinusoidal Residual Modeling
NCC 2009, January 16-18, IIT Guwahati 294 Single Channel Speaker Segregation using Sinusoidal Residual Modeling Rajesh M Hegde and A. Srinivas Dept. of Electrical Engineering Indian Institute of Technology
More informationLOCALIZATION AND IDENTIFICATION OF PERSONS AND AMBIENT NOISE SOURCES VIA ACOUSTIC SCENE ANALYSIS
ICSV14 Cairns Australia 9-12 July, 2007 LOCALIZATION AND IDENTIFICATION OF PERSONS AND AMBIENT NOISE SOURCES VIA ACOUSTIC SCENE ANALYSIS Abstract Alexej Swerdlow, Kristian Kroschel, Timo Machmer, Dirk
More informationInternational Journal of Modern Trends in Engineering and Research e-issn No.: , Date: 2-4 July, 2015
International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 Analysis of Speech Signal Using Graphic User Interface Solly Joy 1, Savitha
More informationAdvanced audio analysis. Martin Gasser
Advanced audio analysis Martin Gasser Motivation Which methods are common in MIR research? How can we parameterize audio signals? Interesting dimensions of audio: Spectral/ time/melody structure, high
More informationIMPROVING QUALITY OF SPEECH SYNTHESIS IN INDIAN LANGUAGES. P. K. Lehana and P. C. Pandey
Workshop on Spoken Language Processing - 2003, TIFR, Mumbai, India, January 9-11, 2003 149 IMPROVING QUALITY OF SPEECH SYNTHESIS IN INDIAN LANGUAGES P. K. Lehana and P. C. Pandey Department of Electrical
More informationSIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS
SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS 1 WAHYU KUSUMA R., 2 PRINCE BRAVE GUHYAPATI V 1 Computer Laboratory Staff., Department of Information Systems, Gunadarma University,
More informationSound source localization accuracy of ambisonic microphone in anechoic conditions
Sound source localization accuracy of ambisonic microphone in anechoic conditions Pawel MALECKI 1 ; 1 AGH University of Science and Technology in Krakow, Poland ABSTRACT The paper presents results of determination
More informationA DEVICE FOR AUTOMATIC SPEECH RECOGNITION*
EVICE FOR UTOTIC SPEECH RECOGNITION* ats Blomberg and Kjell Elenius INTROUCTION In the following a device for automatic recognition of isolated words will be described. It was developed at The department
More informationLearning to Unlearn and Relearn Speech Signal Processing using Neural Networks: current and future perspectives
Learning to Unlearn and Relearn Speech Signal Processing using Neural Networks: current and future perspectives Mathew Magimai Doss Collaborators: Vinayak Abrol, Selen Hande Kabil, Hannah Muckenhirn, Dimitri
More informationSPEech Feature Toolbox (SPEFT) Design and Emotional Speech Feature Extraction
SPEech Feature Toolbox (SPEFT) Design and Emotional Speech Feature Extraction by Xi Li A thesis submitted to the Faculty of Graduate School, Marquette University, in Partial Fulfillment of the Requirements
More informationNew Techniques to Suppress the Sidelobes in OFDM System to Design a Successful Overlay System
Bahria University Journal of Information & Communication Technology Vol. 1, Issue 1, December 2008 New Techniques to Suppress the Sidelobes in OFDM System to Design a Successful Overlay System Saleem Ahmed,
More informationEC 6501 DIGITAL COMMUNICATION UNIT - II PART A
EC 6501 DIGITAL COMMUNICATION 1.What is the need of prediction filtering? UNIT - II PART A [N/D-16] Prediction filtering is used mostly in audio signal processing and speech processing for representing
More informationBasic Characteristics of Speech Signal Analysis
www.ijird.com March, 2016 Vol 5 Issue 4 ISSN 2278 0211 (Online) Basic Characteristics of Speech Signal Analysis S. Poornima Assistant Professor, VlbJanakiammal College of Arts and Science, Coimbatore,
More informationSYNTHETIC SPEECH DETECTION USING TEMPORAL MODULATION FEATURE
SYNTHETIC SPEECH DETECTION USING TEMPORAL MODULATION FEATURE Zhizheng Wu 1,2, Xiong Xiao 2, Eng Siong Chng 1,2, Haizhou Li 1,2,3 1 School of Computer Engineering, Nanyang Technological University (NTU),
More informationA STUDY ON CEPSTRAL SUB-BAND NORMALIZATION FOR ROBUST ASR
A STUDY ON CEPSTRAL SUB-BAND NORMALIZATION FOR ROBUST ASR Syu-Siang Wang 1, Jeih-weih Hung, Yu Tsao 1 1 Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan Dept. of Electrical
More informationRobust Voice Activity Detection Based on Discrete Wavelet. Transform
Robust Voice Activity Detection Based on Discrete Wavelet Transform Kun-Ching Wang Department of Information Technology & Communication Shin Chien University kunching@mail.kh.usc.edu.tw Abstract This paper
More informationSeparating Voiced Segments from Music File using MFCC, ZCR and GMM
Separating Voiced Segments from Music File using MFCC, ZCR and GMM Mr. Prashant P. Zirmite 1, Mr. Mahesh K. Patil 2, Mr. Santosh P. Salgar 3,Mr. Veeresh M. Metigoudar 4 1,2,3,4Assistant Professor, Dept.
More informationLinguistic Phonetics. Spectral Analysis
24.963 Linguistic Phonetics Spectral Analysis 4 4 Frequency (Hz) 1 Reading for next week: Liljencrants & Lindblom 1972. Assignment: Lip-rounding assignment, due 1/15. 2 Spectral analysis techniques There
More informationIN recent decades following the introduction of hidden. Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. X, NO. X, MONTH, YEAR 1 Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition Chanwoo Kim and Richard M. Stern, Member,
More informationAdaptive Filters Application of Linear Prediction
Adaptive Filters Application of Linear Prediction Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Electrical Engineering and Information Technology Digital Signal Processing
More informationAuditory modelling for speech processing in the perceptual domain
ANZIAM J. 45 (E) ppc964 C980, 2004 C964 Auditory modelling for speech processing in the perceptual domain L. Lin E. Ambikairajah W. H. Holmes (Received 8 August 2003; revised 28 January 2004) Abstract
More informationJOURNAL OF OBJECT TECHNOLOGY
JOURNAL OF OBJECT TECHNOLOGY Online at http://www.jot.fm. Published by ETH Zurich, Chair of Software Engineering JOT, 2009 Vol. 9, No. 1, January-February 2010 The Discrete Fourier Transform, Part 5: Spectrogram
More informationAuditory Based Feature Vectors for Speech Recognition Systems
Auditory Based Feature Vectors for Speech Recognition Systems Dr. Waleed H. Abdulla Electrical & Computer Engineering Department The University of Auckland, New Zealand [w.abdulla@auckland.ac.nz] 1 Outlines
More informationMonophony/Polyphony Classification System using Fourier of Fourier Transform
International Journal of Electronics Engineering, 2 (2), 2010, pp. 299 303 Monophony/Polyphony Classification System using Fourier of Fourier Transform Kalyani Akant 1, Rajesh Pande 2, and S.S. Limaye
More informationSignal Analysis Using Autoregressive Models of Amplitude Modulation. Sriram Ganapathy
Signal Analysis Using Autoregressive Models of Amplitude Modulation Sriram Ganapathy Advisor - Hynek Hermansky Johns Hopkins University 11-18-2011 Overview Introduction AR Model of Hilbert Envelopes FDLP
More informationAudio and Speech Compression Using DCT and DWT Techniques
Audio and Speech Compression Using DCT and DWT Techniques M. V. Patil 1, Apoorva Gupta 2, Ankita Varma 3, Shikhar Salil 4 Asst. Professor, Dept.of Elex, Bharati Vidyapeeth Univ.Coll.of Engg, Pune, Maharashtra,
More information