Phoneme Recognizer Based Verification of the Voice Segment Type Determination
OLDŘICH HORÁK
Faculty of Economics and Administration, Institute of System Engineering and Informatics
University of Pardubice
Studentská 84, Pardubice
CZECH REPUBLIC

Abstract: - This paper describes the verification of the results of voice segment type determination. The verification is performed using a phoneme recognizer. The voice segment type determination is based on the presence of the fundamental frequency in the voice segment signal, and several methods are used to demonstrate its presence. The fundamental frequency of the speaker's voice, which can be extracted from the voiced segments of the speech signal, is one of the basic characteristic features used in the speaker recognition process.

Key-Words: - fundamental frequency, signal processing, speaker recognition, voice signal, phoneme recognizer

1 Introduction
The identification of an information system's user is a very important task in common information system security. One of the less common biometric methods is speaker recognition. This method extracts parameters of the speaker's vocal tract anatomy from the voice signal; the timbre of the speaker's voice depends on and is given by these parameters. In voice signal processing, the signal is divided into blocks: short segments of speech with durations of tens of milliseconds. The voice characteristic is based on features extracted from these segments. Some features can be extracted from voiced segments only, others from surd (unvoiced) segments. Therefore, the determination of the voice segment type is important for these tasks [1, 2].
The fundamental frequency is one of the basic features of the voice. It is the common tone level of the speaker's voice. This characteristic feature can be extracted from voiced segments only.
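The segmentation described above — cutting the signal into short blocks with durations of tens of milliseconds — can be sketched as follows. This is a minimal illustration; the function name and the 30 ms default are assumptions, not taken from the paper:

```python
def segment_signal(samples, fs, seg_ms=30):
    """Split a sampled signal into consecutive, non-overlapping segments
    of seg_ms milliseconds each; a trailing remainder shorter than one
    full segment is discarded."""
    seg_len = int(fs * seg_ms / 1000)  # samples per segment
    return [samples[i:i + seg_len]
            for i in range(0, len(samples) - seg_len + 1, seg_len)]
```

With the assumed 30 ms segments and an 8 kHz sampling rate, each segment holds 240 samples, so the 70 segments per word used later in the experiment would correspond to roughly 2.1 s of signal.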
The presence of the fundamental frequency in a voice segment can also be used to determine the type of the given segment. The detection of the fundamental frequency is sufficient for determining a voiced segment; the exact value of the frequency is not important in this case [11].

2 Principles
The voice segment type depends on the part of the voice signal it is cut from. Simplified, a segment from a vowel is voiced, and a part of a consonant signal is surd. This is only an approximation; the exact determination of the segment type has to be evaluated by a signal processing method.

2.1 Segment Type Determination Methods
There are several methods to distinguish the voice segment type:

2.1.1 Fundamental Frequency Presence
This is the method used, verified, and published by several authors [1, 2, 4, 5, 7, 10, 14]. The dependency is simple: if the fundamental frequency is detected in the voice segment, the segment is voiced; otherwise, the segment is surd. The presence of the fundamental frequency can be verified by several signal processing methods. It can be found as a peak in the real cepstrum of the signal [1, 2, 11]. The calculation of the cepstrum is slow; it can be substituted by faster methods, e.g. the autocorrelation [1, 2, 6, 11, 14-18].

2.1.2 Comparison of the Energy Spread
Three or more frequency ranges are defined, and the spread of energy over all of these ranges leads to the proper segment type determination. Each segment type has a typical spread of energy in the given frequency sub-ranges. Preprocessing of the typical spreads is needed before this method can be used, which leads to higher time consumption [1, 2, 7, 10].
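The fundamental-frequency presence test described above — here in its faster autocorrelation variant rather than the cepstral one — can be sketched in pure Python. The lag range bounds (70–400 Hz) and the 0.3 peak threshold are illustrative assumptions that would have to be tuned:

```python
def is_voiced_autocorr(segment, fs, fmin=70.0, fmax=400.0, threshold=0.3):
    """Voiced decision: the normalized autocorrelation of the (DC-removed)
    segment must show a peak above `threshold` at some lag inside the
    plausible fundamental-period range fs/fmax .. fs/fmin."""
    n = len(segment)
    mean = sum(segment) / n
    x = [s - mean for s in segment]
    r0 = sum(v * v for v in x)      # energy = autocorrelation at lag 0
    if r0 == 0.0:
        return False                # silent segment: nothing periodic
    lag_lo = int(fs / fmax)         # shortest fundamental period considered
    lag_hi = min(int(fs / fmin), n - 1)
    peak = max(sum(x[i] * x[i + lag] for i in range(n - lag)) / r0
               for lag in range(lag_lo, lag_hi + 1))
    return peak >= threshold
```

A strongly periodic segment (e.g. a sustained vowel) yields a normalized autocorrelation peak close to 1 at the lag of the fundamental period, while silence or noise stays well below the threshold.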
2.1.3 ZCR to Short-Time Energy Relation
Another method uses the relation of the mean value of the zero-crossing rate (ZCR) to the short-time energy of the voice signal segment. Voiced segments have a higher short-time energy and a lower mean zero-crossing rate. Both of these characteristics are relative values and are defined without units [1, 2, 3, 10].

2.1.4 Statistical Methods
Other methods use statistical processing as well [8, 9, 14, 15]. The autocorrelation function applied to the signal segment provides an option to detect the presence of the fundamental frequency and to estimate its value [2, 3].

2.2 Phoneme Recognizer
A phoneme recognizer based on long temporal context was developed at the Brno University of Technology, Faculty of Information Technology [12, 13]. It was successfully applied to tasks including language identification, indexing and search of audio records, and keyword spotting. Outputs from this phoneme recognizer can be used as a baseline for subsequent processing. It is an open source tool: source codes and binaries can be redistributed and/or modified under the terms of the GNU General Public License as published by the Free Software Foundation, either version 2 of the License or any later version. The model files can be used for research and educational purposes [12]. There are model files for the Czech, Hungarian, Russian, and English languages.
The tool is able to recognize the phonemes in an audio signal (human voice, given language) and provides a text output. The output consists of the information about the recognized phonemes and their positions in the processed sample of the voice signal. It is usable for the extraction of particular phonemes from the given signal.

3 Experimental Verification
The voice segment type determination experiment was carried out using two methods, and both result sets were compared and evaluated. Six recordings of cardinal numerals of the Czech language were made and used.
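The zero-crossing-rate to short-time-energy relation described earlier can be sketched as follows. Both quantities are unit-less, as the text notes; the decision thresholds in `is_voiced_zcr_energy` are illustrative assumptions only:

```python
def zero_crossing_rate(segment):
    """Fraction of adjacent sample pairs whose signs differ."""
    pairs = list(zip(segment, segment[1:]))
    return sum(1 for a, b in pairs if a * b < 0) / len(pairs)

def short_time_energy(segment):
    """Mean squared amplitude of the segment."""
    return sum(s * s for s in segment) / len(segment)

def is_voiced_zcr_energy(segment, zcr_max=0.1, energy_min=0.1):
    """Voiced segments show high short-time energy and a low
    zero-crossing rate; both thresholds are illustrative and
    would be tuned on real data."""
    return (short_time_energy(segment) >= energy_min
            and zero_crossing_rate(segment) <= zcr_max)
```

A low-frequency tone crosses zero only a few times per segment (low ZCR, high energy), whereas a rapidly alternating, noise-like signal crosses zero at almost every sample and is rejected.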
These words sound different from each other, which provides sufficient variability of the signal; see [11] for more details and additional information.

3.1 Process of Verification
Fig.1 shows the process of verification. Each voice sample has the same duration, and the signal is divided into 70 segments. The voice segment type determination is performed by the autocorrelation and by the cepstral method. The same sample of the signal is processed by the Phoneme Recognizer [12] to recognize the phonemes of the word and their positions in the given sample. Each phoneme is then mapped to the segments so that the segment type determination can be verified.

3.2 Results of Verification
The results of both determination methods and of the phoneme recognition are verified by mapping (see Fig.2 to Fig.7 for all the used words). The first row in these tables gives the number of the segment N. The next two rows show the voiced segments (marked as X) determined by the autocorrelation (A) and by the cepstral method (C).

Fig.1 Schema of the Verification Process (blocks: Signal, Segmentation, Autocorrelation, Cepstral Method, Phoneme Recognizer, Verification)
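The mapping step described above — labelling each fixed-length segment with the phoneme whose recognized time span covers it — might look like this. The interval format and the midpoint rule are illustrative assumptions about the recognizer's output, not the paper's exact procedure:

```python
def map_phonemes_to_segments(phoneme_spans, n_segments, seg_len):
    """phoneme_spans: list of (phoneme, start_sample, end_sample) tuples
    as produced by a phoneme recognizer. Each segment is labelled by the
    phoneme covering its midpoint; segments covered by no span are
    labelled '<sil>' (silence)."""
    labels = []
    for i in range(n_segments):
        mid = i * seg_len + seg_len // 2
        label = "<sil>"
        for phoneme, start, end in phoneme_spans:
            if start <= mid < end:
                label = phoneme
                break
        labels.append(label)
    return labels
```

The resulting label row can then be compared segment by segment against the X marks of the autocorrelation and cepstral determinations.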
Fig.2 Verification for word jedna (phonemes: <sil> j e d n a <sil>)
Fig.3 Verification for word dvě (phonemes: <sil> d v ě [je] <sil>)
Fig.4 Verification for word tři (phonemes: <sil> t ř i <sil>)
Fig.5 Verification for word čtyři (phonemes: <sil> č t y ř i <sil>)
Fig.6 Verification for word pět (phonemes: <sil> p ě [je] t <sil>)
Fig.7 Verification for word šest (phonemes: <sil> š e s t <sil>)

In the last row of each figure, the positions of the recognized phonemes are mapped. The <sil> symbol means that silence was recognized at that position. The Czech character ě is phonetically identical to the phoneme couple je in many words; this is corrected in the verification tables.
The verification shows that the results of the segment type determination and the phoneme recognition match at first sight. The voiced segments are determined predominantly at the positions of vowels, which conforms to the expectation stated above. On closer inspection, some segments (e.g. Fig.6, segment no. 62) were determined as voiced, although the phoneme recognition indicates silence there. This means the recorded voice signal had a higher noise level at that time. However, the rate of falsely determined segments caused by this type of error is low: there are only four such cases among the 420 processed segments.
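The error count quoted above (four falsely voiced segments among 6 × 70 = 420) corresponds to a rate under one percent; the bookkeeping can be sketched as follows. The per-word counts in the example call are an illustrative breakdown; only their total of four cases in 420 segments is stated in the text:

```python
def false_voiced_rate(false_counts_per_word, segments_per_word=70):
    """Total count and relative rate of segments determined as voiced
    where the phoneme recognizer reported silence, across all words."""
    total_false = sum(false_counts_per_word)
    total_segments = segments_per_word * len(false_counts_per_word)
    return total_false, total_false / total_segments

# Illustrative per-word counts (six words, 70 segments each):
count, rate = false_voiced_rate([0, 2, 1, 0, 1, 0])
```

For four false segments out of 420, the rate evaluates to roughly 0.95 %.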
To recap the previous experiments [11]: the comparison of the voice segment type determination methods shows a difference of only units of segments. The corresponding error rate is less than 10% (see Tab.1, cited from [11]). This means the difference between the two methods can be neglected for the verification using the phoneme recognition. The next task is to analyze the verification results.

Tab.1 Comparison Results [11] — columns: Compared word; Total segments; Autocorrelation (Voiced, Surd); Cepstral method (Voiced, Surd); Difference (Absolute, Relative); rows: 1 jedna, 2 dvě, 3 tři, 4 čtyři, 5 pět, 6 šest.

The analyses of the verification results are provided in Tab.2 and Tab.3. They list the counts of false determinations of the voiced segment type in the signal parts recognized as silence and as surd phonemes. The absolute and the corresponding relative values are given in the tables, as well as the total values. The results that differ between the two methods (the word pět) can be seen by comparing the two tables.

Tab.2 Verification Results (Autocorrelation)

Verified word   Total segments   Silence  Perc.   Surd  Perc.   Total  Perc.
1 jedna               70             0     0.0%     3    4.3%     3    4.3%
2 dvě                 70             2     2.9%     3    4.3%     5    7.1%
3 tři                 70             1     1.4%     1    1.4%     2    2.9%
4 čtyři               70             0     0.0%     1    1.4%     1    1.4%
5 pět                 70             1     1.4%     2    2.9%     3    4.3%
6 šest                70             0     0.0%     0    0.0%     0    0.0%
Total                420             4     1.0%    10    2.4%    14    3.3%

Tab.3 Verification Results (Cepstral method)

Verified word   Total segments   Silence  Perc.   Surd  Perc.   Total  Perc.
1 jedna               70             0     0.0%     3    4.3%     3    4.3%
2 dvě                 70             2     2.9%     3    4.3%     5    7.1%
3 tři                 70             1     1.4%     1    1.4%     2    2.9%
4 čtyři               70             0     0.0%     1    1.4%     1    1.4%
5 pět                 70             1     1.4%     1    1.4%     2    2.9%
6 šest                70             0     0.0%     0    0.0%     0    0.0%
Total                420             4     1.0%     9    2.1%    13    3.1%
4 Conclusion and Future Work
The results of the verification show that the false determination rate found by the phoneme-recognition check is below the error rate of the two determination methods themselves [11]. This means that the determination of the voice segment type by both methods corresponds to the phoneme recognition with an error rate lower than 10%, and both methods can be used reliably.
This result will be used to support the theoretical base for the identification of information system users by the speaker's voice. The next steps will lie in the extraction of a sufficient set of voice features from the speaker's speech. Several methods provide feature extraction from both types of voice segments (voiced and surd); therefore, the type determination is an important part of the preprocessing for the extraction.
This type of user identification is simple for the users: it is not necessary to remember passwords, standard multimedia inputs can be used for the identification, and no expensive equipment is needed. Communication using multimedia is very popular nowadays; multimedia applications as interactive systems of digital media are favored by many users [19]. The multimedia information can be selected by the users themselves according to their individual needs, and the system can recognize the users by their voices. This can be the goal of this method.

5 Acknowledgement
This work was supported by the project No. CZ.1.07/2.2.00/ Innovation and Support of Doctoral Study Program (INDOP), financed from EU and Czech Republic funds.

References:
[1] H. Atassi, Metody detekce základního tónu řeči [Methods of Speech Fundamental Tone Detection], Elektrorevue, Vol.4, 2008.
[2] J. Psutka, et al., Mluvíme s počítačem česky [We Speak Czech with a Computer], Praha, Academia, 2006.
[3] Y.
Tadokoro, et al., Pitch Estimation for Musical Sound Including Percussion Sound Using Comb Filters and Autocorrelation Function, Proceedings of the 8th WSEAS International Conference on Acoustics & Music: Theory & Applications, Vancouver, Canada, June 19-21, 2007.
[4] C. Moisa, H. Silaghi, A. Silaghi, Speech and Speaker Recognition for the Command of an Industrial Robot, Proceedings of the 12th WSEAS International Conference on Mathematical Methods and Computational Techniques in Electrical Engineering, Stevens Point, Wisconsin, USA, 2010.
[5] M. Vondra, Kepstrální analýza řečového signálu [Cepstral Analysis of the Speech Signal], Elektrorevue, Vol.48, 2001.
[6] M. E. Torres, et al., A Multiresolution Information Measure Approach to Speech Recognition, Proceedings of the 6th WSEAS International Conference on Signal, Speech and Image Processing, Lisbon, Portugal, September 22-24, 2006.
[7] E. Marchetto, F. Avanzini, and F. Flego, An Automatic Speaker Recognition System for Intelligence Applications, Proceedings of the 17th European Signal Processing Conference (EUSIPCO 2009), Glasgow, Scotland, August 24-28, 2009.
[8] J. Sohn, N. S. Kim, and W. Sung, A Statistical Model-Based Voice Activity Detection, IEEE Signal Processing Letters, Vol. 6, No. 1, January 1999.
[9] A. Stolcke, S. Kajarekar, and L. Ferrer, Nonparametric Feature Normalization for SVM-based Speaker Verification, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008).
[10] J. P. Campbell, Jr., Speaker Recognition: A Tutorial, Proceedings of the IEEE, Vol. 85, 1997.
[11] O. Horák, The Voice Segment Type Determination Using the Autocorrelation Compared to Cepstral Method, WSEAS Transactions on Signal Processing, Vol. 8, Issue 1, January 2012.
[12] P. Schwarz, P. Matejka, L. Burget, O. Glembek, Description: Phoneme Recognizer Based on Long Temporal Context, Brno, 2012, online:
[13] P. Schwarz, Phoneme Recognition Based on Long Temporal Context, PhD Thesis, Brno University of Technology.
[14] A. Kabir, Sh. Md. M.
Ahsan, Vector Quantization in Text Dependent Automatic Speaker Recognition Using Mel-frequency Cepstrum Coefficient, 6th WSEAS International Conference on Circuits, Systems,
Electronics, Control & Signal Processing, Cairo, Egypt, Dec 29-31, 2007.
[15] A. Petry, S. da S. Soares, G. F. Marchioro, A. S. M. de Franceschi, A Distributed Speaker Authentication System, Applied Computing Conference (ACC '08), Istanbul, Turkey, May 27-30, 2008.
[16] W. Al-Sawalmeh, Kh. Daqrouq, and A.-R. Al-Qawasmi, Multistage Speaker Feature Tracking Identification System Based on Continuous and Discrete Wavelet Transform, Proceedings of the 9th WSEAS International Conference on Multimedia Systems & Signal Processing, Hangzhou, China, May 20-22, 2009.
[17] W. H. Abdulla, Auditory Based Feature Vectors for Speech Recognition Systems, Advances in Communications and Software Technologies, WSEAS Press, 2002.
[18] J. S. Jung, J. K. Kim, and M. J. Bae, Speaker Recognition System Using the Prosodic Information, WSEAS Transactions on Systems, Vol. 3, Issue 3, May 2004.
[19] E. Milková, What Can Multimedia Add to the Optimization of Teaching and Learning at Universities?, 7th WSEAS International Conference on Advances on Applied Computer & Applied Computational Science (ACACOS '08), Hangzhou, China, April 6-8, 2008.
More informationPreeti Rao 2 nd CompMusicWorkshop, Istanbul 2012
Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012 o Music signal characteristics o Perceptual attributes and acoustic properties o Signal representations for pitch detection o STFT o Sinusoidal model o
More informationAutomatic Text-Independent. Speaker. Recognition Approaches Using Binaural Inputs
Automatic Text-Independent Speaker Recognition Approaches Using Binaural Inputs Karim Youssef, Sylvain Argentieri and Jean-Luc Zarader 1 Outline Automatic speaker recognition: introduction Designed systems
More informationVOICE COMMAND RECOGNITION SYSTEM BASED ON MFCC AND DTW
VOICE COMMAND RECOGNITION SYSTEM BASED ON MFCC AND DTW ANJALI BALA * Kurukshetra University, Department of Instrumentation & Control Engineering., H.E.C* Jagadhri, Haryana, 135003, India sachdevaanjali26@gmail.com
More informationFundamental Frequency Detection
Fundamental Frequency Detection Jan Černocký, Valentina Hubeika {cernocky ihubeika}@fit.vutbr.cz DCGM FIT BUT Brno Fundamental Frequency Detection Jan Černocký, Valentina Hubeika, DCGM FIT BUT Brno 1/37
More informationOptimization of FSS Filters
Optimization of FSS Filters P. Tomasek Abstract This work aims at description of the optimization process of frequency selective surfaces. The method of moments is used to analyze the planar periodic structure
More informationDWT and LPC based feature extraction methods for isolated word recognition
RESEARCH Open Access DWT and LPC based feature extraction methods for isolated word recognition Navnath S Nehe 1* and Raghunath S Holambe 2 Abstract In this article, new feature extraction methods, which
More informationDWT BASED AUDIO WATERMARKING USING ENERGY COMPARISON
DWT BASED AUDIO WATERMARKING USING ENERGY COMPARISON K.Thamizhazhakan #1, S.Maheswari *2 # PG Scholar,Department of Electrical and Electronics Engineering, Kongu Engineering College,Erode-638052,India.
More informationCommunications Theory and Engineering
Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 Speech and telephone speech Based on a voice production model Parametric representation
More informationA Wavelet Based Approach for Speaker Identification from Degraded Speech
International Journal of Communication Networks and Information Security (IJCNIS) Vol., No. 3, December A Wavelet Based Approach for Speaker Identification from Degraded Speech A. Shafik, S. M. Elhalafawy,
More informationSpeech Synthesis; Pitch Detection and Vocoders
Speech Synthesis; Pitch Detection and Vocoders Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University May. 29, 2008 Speech Synthesis Basic components of the text-to-speech
More informationTime-Frequency Distributions for Automatic Speech Recognition
196 IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 9, NO. 3, MARCH 2001 Time-Frequency Distributions for Automatic Speech Recognition Alexandros Potamianos, Member, IEEE, and Petros Maragos, Fellow,
More informationKeywords Decomposition; Reconstruction; SNR; Speech signal; Super soft Thresholding.
Volume 5, Issue 2, February 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Speech Enhancement
More informationChapter 4 SPEECH ENHANCEMENT
44 Chapter 4 SPEECH ENHANCEMENT 4.1 INTRODUCTION: Enhancement is defined as improvement in the value or Quality of something. Speech enhancement is defined as the improvement in intelligibility and/or
More informationVoice Recognition Technology Using Neural Networks
Journal of New Technology and Materials JNTM Vol. 05, N 01 (2015)27-31 OEB Univ. Publish. Co. Voice Recognition Technology Using Neural Networks Abdelouahab Zaatri 1, Norelhouda Azzizi 2 and Fouad Lazhar
More informationDimension Reduction of the Modulation Spectrogram for Speaker Verification
Dimension Reduction of the Modulation Spectrogram for Speaker Verification Tomi Kinnunen Speech and Image Processing Unit Department of Computer Science University of Joensuu, Finland tkinnu@cs.joensuu.fi
More informationA CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION
17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 A CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION
More informationVocal Command Recognition Using Parallel Processing of Multiple Confidence-Weighted Algorithms in an FPGA
Vocal Command Recognition Using Parallel Processing of Multiple Confidence-Weighted Algorithms in an FPGA ECE-492/3 Senior Design Project Spring 2015 Electrical and Computer Engineering Department Volgenau
More informationInternational Journal of Engineering and Techniques - Volume 1 Issue 6, Nov Dec 2015
RESEARCH ARTICLE OPEN ACCESS A Comparative Study on Feature Extraction Technique for Isolated Word Speech Recognition Easwari.N 1, Ponmuthuramalingam.P 2 1,2 (PG & Research Department of Computer Science,
More informationAn Approach to Very Low Bit Rate Speech Coding
Computing For Nation Development, February 26 27, 2009 Bharati Vidyapeeth s Institute of Computer Applications and Management, New Delhi An Approach to Very Low Bit Rate Speech Coding Hari Kumar Singh
More informationON THE RELATIONSHIP BETWEEN INSTANTANEOUS FREQUENCY AND PITCH IN. 1 Introduction. Zied Mnasri 1, Hamid Amiri 1
ON THE RELATIONSHIP BETWEEN INSTANTANEOUS FREQUENCY AND PITCH IN SPEECH SIGNALS Zied Mnasri 1, Hamid Amiri 1 1 Electrical engineering dept, National School of Engineering in Tunis, University Tunis El
More informationSpeech/Music Change Point Detection using Sonogram and AANN
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 6, Number 1 (2016), pp. 45-49 International Research Publications House http://www. irphouse.com Speech/Music Change
More informationIntelligent Identification System Research
2016 International Conference on Manufacturing Construction and Energy Engineering (MCEE) ISBN: 978-1-60595-374-8 Intelligent Identification System Research Zi-Min Wang and Bai-Qing He Abstract: From the
More informationNon-intrusive intelligibility prediction for Mandarin speech in noise. Creative Commons: Attribution 3.0 Hong Kong License
Title Non-intrusive intelligibility prediction for Mandarin speech in noise Author(s) Chen, F; Guan, T Citation The 213 IEEE Region 1 Conference (TENCON 213), Xi'an, China, 22-25 October 213. In Conference
More informationDEEP LEARNING BASED AUTOMATIC VOLUME CONTROL AND LIMITER SYSTEM. Jun Yang (IEEE Senior Member), Philip Hilmes, Brian Adair, David W.
DEEP LEARNING BASED AUTOMATIC VOLUME CONTROL AND LIMITER SYSTEM Jun Yang (IEEE Senior Member), Philip Hilmes, Brian Adair, David W. Krueger Amazon Lab126, Sunnyvale, CA 94089, USA Email: {junyang, philmes,
More informationReading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday.
L105/205 Phonetics Scarborough Handout 7 10/18/05 Reading: Johnson Ch.2.3.3-2.3.6, Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday Spectral Analysis 1. There are
More informationText and Language Independent Speaker Identification By Using Short-Time Low Quality Signals
Text and Language Independent Speaker Identification By Using Short-Time Low Quality Signals Maurizio Bocca*, Reino Virrankoski**, Heikki Koivo* * Control Engineering Group Faculty of Electronics, Communications
More informationAn Optimization of Audio Classification and Segmentation using GASOM Algorithm
An Optimization of Audio Classification and Segmentation using GASOM Algorithm Dabbabi Karim, Cherif Adnen Research Unity of Processing and Analysis of Electrical and Energetic Systems Faculty of Sciences
More informationBasic Characteristics of Speech Signal Analysis
www.ijird.com March, 2016 Vol 5 Issue 4 ISSN 2278 0211 (Online) Basic Characteristics of Speech Signal Analysis S. Poornima Assistant Professor, VlbJanakiammal College of Arts and Science, Coimbatore,
More informationCOMP 546, Winter 2017 lecture 20 - sound 2
Today we will examine two types of sounds that are of great interest: music and speech. We will see how a frequency domain analysis is fundamental to both. Musical sounds Let s begin by briefly considering
More informationDesign and Implementation on a Sub-band based Acoustic Echo Cancellation Approach
Vol., No. 6, 0 Design and Implementation on a Sub-band based Acoustic Echo Cancellation Approach Zhixin Chen ILX Lightwave Corporation Bozeman, Montana, USA chen.zhixin.mt@gmail.com Abstract This paper
More informationCO-CHANNEL SPEECH DETECTION APPROACHES USING CYCLOSTATIONARITY OR WAVELET TRANSFORM
CO-CHANNEL SPEECH DETECTION APPROACHES USING CYCLOSTATIONARITY OR WAVELET TRANSFORM Arvind Raman Kizhanatham, Nishant Chandra, Robert E. Yantorno Temple University/ECE Dept. 2 th & Norris Streets, Philadelphia,
More informationspeech signal S(n). This involves a transformation of S(n) into another signal or a set of signals
16 3. SPEECH ANALYSIS 3.1 INTRODUCTION TO SPEECH ANALYSIS Many speech processing [22] applications exploits speech production and perception to accomplish speech analysis. By speech analysis we extract
More informationCepstrum alanysis of speech signals
Cepstrum alanysis of speech signals ELEC-E5520 Speech and language processing methods Spring 2016 Mikko Kurimo 1 /48 Contents Literature and other material Idea and history of cepstrum Cepstrum and LP
More informationBook Chapters. Refereed Journal Publications J11
Book Chapters B2 B1 A. Mouchtaris and P. Tsakalides, Low Bitrate Coding of Spot Audio Signals for Interactive and Immersive Audio Applications, in New Directions in Intelligent Interactive Multimedia,
More informationParticipant Identification in Haptic Systems Using Hidden Markov Models
HAVE 25 IEEE International Workshop on Haptic Audio Visual Environments and their Applications Ottawa, Ontario, Canada, 1-2 October 25 Participant Identification in Haptic Systems Using Hidden Markov Models
More information