Monophony/Polyphony Classification System using Fourier of Fourier Transform
International Journal of Electronics Engineering, 2 (2), 2010, pp. 299-303

Kalyani Akant 1, Rajesh Pande 2, and S.S. Limaye 3
1 Department of Electronics and Communication Engineering, Manoharbhai Patel Institute of Engineering and Technology, Gondia, Maharashtra, India, kalyaniakant@gmail.com
2 Department of Electronics Engineering, Shri Ramdeobaba Kamla Nehru Engineering College, Nagpur, Maharashtra, India, panderaj@yahoo.com
3 Department of Electronics Engineering, Jhulelal Institute of Technology, Nagpur, Maharashtra, India, shyam_limaye@hotmail.com

Abstract: In this paper we propose a method for distinguishing monophonic signals from polyphonic ones using the Fourier of Fourier Transform (FFT2). Pitch estimation for monophonic signals is much simpler than for polyphonic signals, and prior knowledge of the number of notes played (in the case of polyphony) facilitates multi-pitch estimation. Since different methods may be used for pitch estimation in the monophonic and polyphonic contexts, identifying a signal as monophonic or polyphonic becomes essential. Investigating the harmonic pattern of a sound in the frequency domain yields its fundamental frequency (pitch). The periodicity of the Fourier transform is detected by again taking its Fourier transform, giving the Fourier of Fourier transform (FFT2) [7]. Classification is based on the fact that music signals are harmonic. For monophonic signals, we obtain a series of peaks in the FFT2 domain at a near-constant bin spacing related to the pitch of the single note, whereas for polyphonic signals this regularity is disturbed, since the FFT2 spectrum contains multiple series of peaks corresponding to multiple notes. We have tested our method on the database available at [15].

Keywords: Monophony, Polyphony, Fourier of Fourier Transform, Pitch.

INTRODUCTION

Many methods for the estimation of pitch have been proposed in the literature [1].
In the case of monophony, the pitch is easier to determine than in polyphony. The problem of pitch estimation for monophonic signals is considered solved, whereas multi-pitch estimation is still a challenging issue. Methods for monophonic pitch estimation include time-domain methods [2], [3], [4] and frequency-domain methods [5], [6], [7]. Methods for pitch estimation in the polyphonic context include [8], [9], and [10]. In [11], monophony/polyphony classification is done based on a confidence indicator used by de Cheveigné [12]. The short-term mean and variance of this indicator are calculated, and the bivariate repartition of these two parameters is modeled with a bivariate Weibull distribution for each class. Classification is made by computing the likelihood over one second for each class and taking the best one. The problem of singing voice detection in monophonic and polyphonic contexts is addressed in [13], where again the method of de Cheveigné [12] is used to classify the signal as monophonic or polyphonic.

Our method is based on the fact that music signals are harmonic. Harmonicity is detected in the FFT2 [7] domain. For monophonic signals, all peaks in the FFT2 spectrum obey a single harmonic relation. For polyphonic signals, the FFT2 spectrum is a mixture of multiple harmonic peak series corresponding to multiple notes, so not all peaks follow one harmonic relation. By checking whether all peaks are harmonically related, the signal is classified as monophonic or polyphonic. Our main objective is singing voice detection from mono recordings for query-by-humming applications; the present method is one component of that objective.

This paper is organized as follows. Section 1 presents the details of the Fourier of Fourier Transform. The proposed method is explained in Section 2. Results and conclusions are given in Sections 3 and 4 respectively.

1. FOURIER OF FOURIER TRANSFORM

In our analysis we apply two Fourier transforms in sequence, referred to as the Fourier of Fourier Transform (FFT2).
Our method works very well for harmonic sounds, i.e. sounds rich in harmonics; it is not suited to pure sinusoids. The Fourier transform (FT, the first Fourier transform of the signal) of a typical musical sound has a series of peaks in its magnitude spectrum corresponding to the harmonics of the sound, at frequencies close to multiples of the fundamental frequency F. The peak at the fundamental frequency is not always dominant, so a single Fourier transform is insufficient to identify the correct peak. The Fourier of Fourier Transform is of great interest in locating this peak, which helps avoid octave errors. To compute the Fourier of Fourier Transform,
we first compute the magnitude spectrum of the Fourier transform of the singing voice, and then the magnitude spectrum of the Fourier transform of that first magnitude spectrum. Note that this transform is not the same as the well-known cepstrum, which is the (inverse) Fourier transform of the logarithm of the magnitude spectrum.

Figure 1 shows the FT of piano C# of the 5th octave. This FT has a series of uniformly spaced peaks, corresponding to the harmonics of the fundamental frequency.

Fig. 1: Fourier Transform of Piano C# of 5th Octave

We can clearly see that the peak corresponding to the fundamental frequency is not dominant. If the fundamental frequency is F, the distance between two consecutive peaks corresponds to a period of Δ1 bins, where:

    Δ1 = F · N1 / Fs    ... (1)

N1: size of the first Fourier transform.
Fs: sampling frequency.

The first peak is at bin 0 and corresponds to the DC level. The difference between the second peak (shown by an arrow in Fig. 1) and the first peak is Δ1 bins.

Figure 2 shows the spectrum of the Fourier of Fourier Transform of piano C# of the 5th octave. This FFT2 spectrum also contains a series of peaks. Here too, the first peak is at bin 0 and corresponds to the DC level; the second peak is shown by an arrow in Fig. 2. The distance between two consecutive peaks corresponds to a period of Δ2 bins, where:

    Δ2 = N2 / Δ1    ... (2)

N2: size of the second Fourier transform.

From Eqs (1) and (2), we get

    Δ2 = N2 · Fs / (N1 · F)    ... (3)

Fig. 2: Fourier of Fourier Transform of Piano C# of 5th Octave

If the sizes of the first and second Fourier transforms are the same (N2 = N1), the fundamental frequency F is given by:

    F = Fs / Δ2    ... (4)

1.1 Advantage of FFT2 Over FT

The peaks in FFT2 are more widely spaced, as illustrated in Table 1. Here the 12 notes of the 4th octave are analyzed with a sampling frequency of 44100 Hz and an FFT size of 4096, and the bin indices of the second peak in the FT and FFT2 spectra are tabulated.
(Note that due to slight mistuning of the piano, A has a frequency slightly different from the nominal 440 Hz.) The frequency of a note is found by applying parabolic interpolation [14] to the peak found in FFT2.
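The FFT2 construction and the relation F = Fs / Δ2 of Eq. (4) can be sketched as follows. This is a minimal pure-Python illustration on a synthetic harmonic tone; the direct O(N²) DFT, the 8000 Hz sampling rate, the 500 Hz test tone, and the 300-700 Hz search range are choices made for this sketch, not values from the paper.

```python
import cmath
import math

def dft_mag(x):
    # Magnitude spectrum via a direct O(N^2) DFT, kept simple for clarity
    # (an FFT would be used in practice).
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N)]

def fft2_spectrum(signal):
    # Fourier of Fourier Transform: the magnitude spectrum of the magnitude spectrum.
    return dft_mag(dft_mag(signal))

# Synthetic harmonic tone with a weak fundamental: F = 500 Hz, Fs = 8000 Hz, N = 512.
Fs, N, F = 8000, 512, 500.0
amps = (0.2, 1.0, 0.8)  # fundamental deliberately weaker than its harmonics
sig = [sum(a * math.sin(2 * math.pi * (h + 1) * F * n / Fs) for h, a in enumerate(amps))
       for n in range(N)]

spec2 = fft2_spectrum(sig)

# FFT2 peaks repeat every delta2 = N / delta1 bins. As in Step 4 of the proposed
# method, we search a plausible pitch range (here 300-700 Hz) for the strongest peak.
lo, hi = round(Fs / 700), round(Fs / 300)
delta2 = max(range(lo, hi + 1), key=lambda k: spec2[k])
print(delta2, round(Fs / delta2))  # Eq. (4): F = Fs / delta2, recovering 500 Hz
```

Even though the fundamental is the weakest partial, the FFT2 peak spacing recovers it, which is the octave-error robustness argued for above.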
Table 1: Indices of Harmonics in Terms of Bins in FT and FFT2 for Notes in 4th Octave
(Columns: musical note, index in FT, index in FFT2, frequency of note, MIDI note number. The rows cover the 12 notes C, C#, D, Eb, E, F, F#, G, Ab, A, Bb, B of the 4th octave, MIDI numbers 60-71; the numeric index and frequency entries are not preserved here.)

We observe from the table that in FT there is a difference of only one or two bins per semitone, while in FFT2 the difference is five to nine bins. Also, as we move to lower octaves, the index in FT decreases while the index in FFT2 increases. For some pairs of consecutive semitones in the third octave, the indices in FT are identical even though the actual frequencies differ; estimating the fundamental frequency without parabolic interpolation would give the same value for both. Hence parabolic interpolation plays an important role in finding the correct frequency of such semitones. Another feature of FFT2 is its ability to detect peaks of harmonics corresponding to multiple pitches.

1.2 Ability of FFT2 to Detect Multiple Pitches

In the FFT2 domain the spectral peaks are not as closely spaced as in the FT domain, so it becomes easier for a peak detector to locate the peaks without ambiguity. Figure 3 shows the FFT2 spectrum when A flat and C sharp of the 4th octave are played together. This ability of FFT2 is of great interest for music segregation in a polyphonic environment.

2. PROPOSED METHOD

Step 1: A signal frame of size N is selected and its FFT2 is computed. The size of the first and second FT was chosen to be 2N to improve frequency resolution. In our case, N = 2048.

Step 2: The bin numbers of all peaks in the FFT2 spectrum from 0 to N are stored in a vector V = {V1, V2, ..., Vn}.

Step 3: The bin number of the maximum amplitude in the FFT2 spectrum is detected; denote it by K. (If bin 0 is indexed as 1, as in MATLAB, K should be taken as K - 1.)
Step 4: If the singing voice lies in the frequency range f1 to f2, the FFT2 bin number of the maximum amplitude will lie between Fs/f2 and Fs/f1. All bin numbers in this range from vector V are stored in a vector X = {X1, X2, ..., Xm}.

Step 5: From X, those bin numbers whose peak values are less than 30% of the peak value at K are rejected. Let the remaining bin numbers be Y = {Y1, Y2, ..., Yi}.

Step 6: It is then checked whether each of the bin numbers Yj + K (or Yj + K - 1 in the case of MATLAB indexing), for 1 ≤ j ≤ i, falls within the vicinity V - 5 to V + 5 of some element of V. If so, the signal is monophonic; otherwise it is polyphonic.

The above condition is strict for monophonic signals, so the probability of misclassifying monophonic signals is higher than for polyphonic ones. We have therefore tested our method on a large database of monophonic signals [15].

3. RESULTS

The algorithm is explained using the following examples. For the frame in Fig. 4, K = 195, K - 1 = 194.

V = {1, 49, 97, 145, 195, 245, 292, 340, 391, 441, 487, 535, 586, 636, 682, 730, 781, 831, 877, 925, 976, 1027, 1073, 1119, 1171, 1222, 1269, 1314, 1366, 1418, 1464, 1508, 1559, 1616, 1661, 1698, 1735, 1768, 1800, 1832, 1907, 1955, 2006}

We considered f1 = 100 Hz and f2 = 800 Hz. With Fs = 44100 Hz, vector X in Step 4 contains the bin numbers from 55 to 441. So X in this example is {97, 145, 195, 245, 292, 340, 391, 441}, and Y = {145, 195, 245, 340, 391, 441}. Now Step 6 is performed.

Fig. 4: FFT2 Spectrum of one Frame of a Monophonic Note

Fig. 3: FFT2 Spectrum when A Flat and C Sharp of 4th Octave Played Together

The condition in Step 6 is satisfied, hence the signal is monophonic.
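The Step 6 test on this monophonic example can be checked directly. A minimal sketch in Python; V, Y and K are taken from the example above, while the function name and structure are ours, not the paper's.

```python
# Peak bins, retained candidate bins, and the maximum-amplitude bin from the
# monophonic example (MATLAB-style indexing, so K - 1 = 194 is used in the sums).
V = [1, 49, 97, 145, 195, 245, 292, 340, 391, 441, 487, 535, 586, 636, 682, 730,
     781, 831, 877, 925, 976, 1027, 1073, 1119, 1171, 1222, 1269, 1314, 1366,
     1418, 1464, 1508, 1559, 1616, 1661, 1698, 1735, 1768, 1800, 1832, 1907,
     1955, 2006]
Y = [145, 195, 245, 340, 391, 441]
K = 195

def is_monophonic(V, Y, K, tol=5):
    # Monophonic when every Y_j + K - 1 lands within +/- tol bins of some peak in V.
    return all(any(abs((y + K - 1) - v) <= tol for v in V) for y in Y)

print([y + K - 1 for y in Y])   # shifted bins tested against V
print(is_monophonic(V, Y, K))   # True -> monophonic
```

Each shifted bin (339, 389, 439, 534, 585, 635) lies within 5 bins of a peak in V, which is exactly the pattern tabulated in Table 2.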
Table 2: Illustration of Step 6 for Monophonic Signal

Yj + K - 1    Falls within V - 5 to V + 5 of some element of V?
339           Yes (element 340 in V)
389           Yes (element 391 in V)
439           Yes (element 441 in V)
534           Yes (element 535 in V)
585           Yes (element 586 in V)
635           Yes (element 636 in V)

For the frame in Fig. 5, K = 330, K - 1 = 329.

V = {1, 44, 83, 126, 202, 246, 285, 330, 380, 421, 468, 534, 579, 617, 663, 721, 762, 804, 847, 882, 925, 1005, 1052, 1096, 1139, 1179, 1212, 1253, 1294, 1334, 1379, 1418, 1455, 1495, 1536, 1579, 1620, 1662, 1700, 1726, 1765, 1808, 1858, 1894, 1926, 1963, 2002, 2038}

We considered f1 = 100 Hz and f2 = 800 Hz. With Fs = 44100 Hz, vector X in Step 4 contains the bin numbers from 55 to 441. So X in this example is {83, 126, 202, 246, 285, 330, 380, 421}, and Y = {83, 126, 202, 246, 285, 330, 380}. Now Step 6 is performed.

Fig. 5: FFT2 Spectrum of one Frame of Polyphonic Signals

Table 3: Illustration of Step 6 for Polyphonic Signal

Yj + K - 1    Falls within V - 5 to V + 5 of some element of V?
412           No
455           No
531           Yes (element 534 in V)
575           Yes (element 579 in V)
614           Yes (element 617 in V)
659           Yes (element 663 in V)
709           No

The condition in Step 6 is not satisfied, hence the signal is polyphonic.

The accuracy of our algorithm is tested using the global error rate: Error = number of misclassified seconds / total number of seconds. We observed that the error reduces for frames with larger signal amplitude. The signal amplitude of a frame is calculated by summing the modulus of each sample value in the frame. We run the algorithm only on those frames whose amplitude exceeds a threshold; the larger the threshold, the lower the error. In Table 4, Threshold/Maximum amplitude of signal = 0 means threshold = 0, so the algorithm is run on the entire signal. This effect is shown in Table 4. All the files are available at [15].
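The same vicinity test applied to the polyphonic example above shows why the classification flips. A self-contained sketch; V, Y and K come from the example, the helper name is our illustrative choice.

```python
# Peak bins, retained candidate bins, and the maximum-amplitude bin from the
# polyphonic example (K - 1 = 329 with MATLAB-style indexing).
V = [1, 44, 83, 126, 202, 246, 285, 330, 380, 421, 468, 534, 579, 617, 663, 721,
     762, 804, 847, 882, 925, 1005, 1052, 1096, 1139, 1179, 1212, 1253, 1294,
     1334, 1379, 1418, 1455, 1495, 1536, 1579, 1620, 1662, 1700, 1726, 1765,
     1808, 1858, 1894, 1926, 1963, 2002, 2038]
Y = [83, 126, 202, 246, 285, 330, 380]
K = 330

def near_some_peak(bin_no, peaks, tol=5):
    # True when bin_no lies within +/- tol bins of some detected FFT2 peak.
    return any(abs(bin_no - v) <= tol for v in peaks)

hits = [near_some_peak(y + K - 1, V) for y in Y]
print(hits)       # [False, False, True, True, True, True, False]
print(all(hits))  # False -> classified as polyphonic
```

The three failing candidates (shifted bins 412, 455, 709) belong to a second harmonic series, so the single-pitch regularity of Step 6 breaks, matching Table 3.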
Table 4: % Error
(Columns: name of file, threshold/maximum amplitude of signal, % error. The files tested were AltoFlute_ff_C4B, BassFlute_pp_C4B, Bassoon_pp_C4B, EbClar_pp_C4B, Flute_novib_pp_B3B, Horn_pp_C4B, and TenorTrombone_pp_C4B; the numeric threshold and error entries are not preserved here.)

4. CONCLUSION AND FUTURE WORK

Real-world signals are noisy, and our algorithm may fail for noisy signals. The signal should therefore be band-pass filtered prior to applying this algorithm, to reject peaks corresponding to noise. This algorithm will be merged with our main goal: pitch tracking of the singing voice in a polyphonic context. Monophonic pitch tracking is simple and requires less time. Once the signal is classified at each frame, a different pitch-tracking algorithm is run for each class.

REFERENCES

[1] Zhenyu Zhao, Lyndon J. Brown, Musical Pitch Tracking using Internal Model Control Based Frequency Cancellation, 42nd IEEE Conference on Decision and Control, 5, December 2003.
[2] L.R. Rabiner, et al., A Comparative Performance Study of Several Pitch Detection Algorithms, IEEE Trans. ASSP, 24 (5), October 1976.
[3] J.C. Brown and M.S. Puckette, Calculation of a Narrowed Autocorrelation Function, J. Acoust. Soc. Am., 85 (4), April 1989.
[4] J.C. Brown and B. Zhang, Musical Frequency Tracking using the Methods of Conventional and Narrowed Autocorrelation, J. Acoust. Soc. Am., 89 (5), May 1991.
[5] M. Piszczalski and B.A. Galler, Predicting Musical Pitch from Component Frequency Ratios, J. Acoust. Soc. Am., 66 (3), September 1979.
[6] J.C. Brown, Musical Fundamental Frequency Tracking using a Pattern Recognition Method, J. Acoust. Soc. Am., 92 (3), September 1992.
[7] Sylvain Marchand, An Efficient Pitch-tracking Algorithm using a Combination of Fourier Transforms, Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-01), Limerick, Ireland, December 6-8, 2001.
[8] P.J. Walmsley, S.J. Godsill, P.J.W. Rayner, Polyphonic Pitch Tracking using Joint Bayesian Estimation of Multiple Frame Parameters, Proc. 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, New York, October 17-20, 1999.
[9] A.P. Klapuri, Multiple Fundamental Frequency Estimation Based on Harmonicity and Spectral Smoothness, IEEE Transactions on Speech and Audio Processing, 11 (6), 2003, pp. 804-816.
[10] Chunghsin Yeh, A. Röbel, X. Rodet, Multiple Fundamental Frequency Estimation of Polyphonic Music Signals, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), 3, pp. iii/225-iii/228.
[11] H. Lachambre, R. André-Obrecht, J. Pinquier, Monophony vs Polyphony: A New Method Based on Weibull Bivariate Models, Content-Based Multimedia Indexing (CBMI '09).
[12] A. de Cheveigné and H. Kawahara, YIN, A Fundamental Frequency Estimator for Speech and Music,
Journal of the Acoustical Society of America, 111 (4), April 2002.
[13] Hélène Lachambre, Régine André-Obrecht, Julien Pinquier, Singing Voice Detection in Monophonic and Polyphonic Contexts, 17th European Signal Processing Conference (EUSIPCO 2009).
[14] J.O. Smith and X. Serra, PARSHL: An Analysis/Synthesis Program for Non-Harmonic Sounds Based on a Sinusoidal Representation, Proceedings of the 1987 International Computer Music Conference, International Computer Music Association, San Francisco, 1987.
[15]
I-Hao Hsiao, Chun-Tang Chao*, and Chi-Jo Wang (2016). A HHT-Based Music Synthesizer. Intelligent Technologies and Engineering Systems, Lecture Notes in Electrical Engineering (LNEE), Vol.345, pp.523-528.
More informationVoice Activity Detection
Voice Activity Detection Speech Processing Tom Bäckström Aalto University October 2015 Introduction Voice activity detection (VAD) (or speech activity detection, or speech detection) refers to a class
More informationSingle-channel Mixture Decomposition using Bayesian Harmonic Models
Single-channel Mixture Decomposition using Bayesian Harmonic Models Emmanuel Vincent and Mark D. Plumbley Electronic Engineering Department, Queen Mary, University of London Mile End Road, London E1 4NS,
More informationDominant Voiced Speech Segregation Using Onset Offset Detection and IBM Based Segmentation
Dominant Voiced Speech Segregation Using Onset Offset Detection and IBM Based Segmentation Shibani.H 1, Lekshmi M S 2 M. Tech Student, Ilahia college of Engineering and Technology, Muvattupuzha, Kerala,
More informationMeasuring the complexity of sound
PRAMANA c Indian Academy of Sciences Vol. 77, No. 5 journal of November 2011 physics pp. 811 816 Measuring the complexity of sound NANDINI CHATTERJEE SINGH National Brain Research Centre, NH-8, Nainwal
More informationSGN Audio and Speech Processing
Introduction 1 Course goals Introduction 2 SGN 14006 Audio and Speech Processing Lectures, Fall 2014 Anssi Klapuri Tampere University of Technology! Learn basics of audio signal processing Basic operations
More informationJOURNAL OF OBJECT TECHNOLOGY
JOURNAL OF OBJECT TECHNOLOGY Online at http://www.jot.fm. Published by ETH Zurich, Chair of Software Engineering JOT, 2009 Vol. 9, No. 1, January-February 2010 The Discrete Fourier Transform, Part 5: Spectrogram
More informationSound Source Localization using HRTF database
ICCAS June -, KINTEX, Gyeonggi-Do, Korea Sound Source Localization using HRTF database Sungmok Hwang*, Youngjin Park and Younsik Park * Center for Noise and Vibration Control, Dept. of Mech. Eng., KAIST,
More informationSignal segmentation and waveform characterization. Biosignal processing, S Autumn 2012
Signal segmentation and waveform characterization Biosignal processing, 5173S Autumn 01 Short-time analysis of signals Signal statistics may vary in time: nonstationary how to compute signal characterizations?
More informationNon-stationary Analysis/Synthesis using Spectrum Peak Shape Distortion, Phase and Reassignment
Non-stationary Analysis/Synthesis using Spectrum Peak Shape Distortion, Phase Reassignment Geoffroy Peeters, Xavier Rodet Ircam - Centre Georges-Pompidou, Analysis/Synthesis Team, 1, pl. Igor Stravinsky,
More informationAudio Imputation Using the Non-negative Hidden Markov Model
Audio Imputation Using the Non-negative Hidden Markov Model Jinyu Han 1,, Gautham J. Mysore 2, and Bryan Pardo 1 1 EECS Department, Northwestern University 2 Advanced Technology Labs, Adobe Systems Inc.
More informationTIME-FREQUENCY ANALYSIS OF MUSICAL SIGNALS USING THE PHASE COHERENCE
Proc. of the 6 th Int. Conference on Digital Audio Effects (DAFx-3), Maynooth, Ireland, September 2-6, 23 TIME-FREQUENCY ANALYSIS OF MUSICAL SIGNALS USING THE PHASE COHERENCE Alessio Degani, Marco Dalai,
More informationComparison of Spectral Analysis Methods for Automatic Speech Recognition
INTERSPEECH 2013 Comparison of Spectral Analysis Methods for Automatic Speech Recognition Venkata Neelima Parinam, Chandra Vootkuri, Stephen A. Zahorian Department of Electrical and Computer Engineering
More informationTIMIT LMS LMS. NoisyNA
TIMIT NoisyNA Shi NoisyNA Shi (NoisyNA) shi A ICA PI SNIR [1]. S. V. Vaseghi, Advanced Digital Signal Processing and Noise Reduction, Second Edition, John Wiley & Sons Ltd, 2000. [2]. M. Moonen, and A.
More informationCombining Pitch-Based Inference and Non-Negative Spectrogram Factorization in Separating Vocals from Polyphonic Music
Combining Pitch-Based Inference and Non-Negative Spectrogram Factorization in Separating Vocals from Polyphonic Music Tuomas Virtanen, Annamaria Mesaros, Matti Ryynänen Department of Signal Processing,
More informationCarrier Frequency Offset Estimation in WCDMA Systems Using a Modified FFT-Based Algorithm
Carrier Frequency Offset Estimation in WCDMA Systems Using a Modified FFT-Based Algorithm Seare H. Rezenom and Anthony D. Broadhurst, Member, IEEE Abstract-- Wideband Code Division Multiple Access (WCDMA)
More informationMultiple Fundamental Frequency Estimation by Modeling Spectral Peaks and Non-peak Regions
Multiple Fundamental Frequency Estimation by Modeling Spectral Peaks and Non-peak Regions Zhiyao Duan Student Member, IEEE, Bryan Pardo Member, IEEE and Changshui Zhang Member, IEEE 1 Abstract This paper
More informationUsing RASTA in task independent TANDEM feature extraction
R E S E A R C H R E P O R T I D I A P Using RASTA in task independent TANDEM feature extraction Guillermo Aradilla a John Dines a Sunil Sivadas a b IDIAP RR 04-22 April 2004 D a l l e M o l l e I n s t
More informationSignal Analysis. Peak Detection. Envelope Follower (Amplitude detection) Music 270a: Signal Analysis
Signal Analysis Music 27a: Signal Analysis Tamara Smyth, trsmyth@ucsd.edu Department of Music, University of California, San Diego (UCSD November 23, 215 Some tools we may want to use to automate analysis
More informationApplications of Music Processing
Lecture Music Processing Applications of Music Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Singing Voice Detection Important pre-requisite
More informationSONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS
SONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS AKSHAY CHANDRASHEKARAN ANOOP RAMAKRISHNA akshayc@cmu.edu anoopr@andrew.cmu.edu ABHISHEK JAIN GE YANG ajain2@andrew.cmu.edu younger@cmu.edu NIDHI KOHLI R
More informationNCCF ACF. cepstrum coef. error signal > samples
ESTIMATION OF FUNDAMENTAL FREQUENCY IN SPEECH Petr Motl»cek 1 Abstract This paper presents an application of one method for improving fundamental frequency detection from the speech. The method is based
More informationADDITIVE synthesis [1] is the original spectrum modeling
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 3, MARCH 2007 851 Perceptual Long-Term Variable-Rate Sinusoidal Modeling of Speech Laurent Girin, Member, IEEE, Mohammad Firouzmand,
More informationSingle Channel Speaker Segregation using Sinusoidal Residual Modeling
NCC 2009, January 16-18, IIT Guwahati 294 Single Channel Speaker Segregation using Sinusoidal Residual Modeling Rajesh M Hegde and A. Srinivas Dept. of Electrical Engineering Indian Institute of Technology
More informationSurvey Paper on Music Beat Tracking
Survey Paper on Music Beat Tracking Vedshree Panchwadkar, Shravani Pande, Prof.Mr.Makarand Velankar Cummins College of Engg, Pune, India vedshreepd@gmail.com, shravni.pande@gmail.com, makarand_v@rediffmail.com
More informationIMPROVED CODING OF TONAL COMPONENTS IN MPEG-4 AAC WITH SBR
IMPROVED CODING OF TONAL COMPONENTS IN MPEG-4 AAC WITH SBR Tomasz Żernici, Mare Domańsi, Poznań University of Technology, Chair of Multimedia Telecommunications and Microelectronics, Polana 3, 6-965, Poznań,
More informationANALYSIS OF EFFECTS OF VECTOR CONTROL ON TOTAL CURRENT HARMONIC DISTORTION OF ADJUSTABLE SPEED AC DRIVE
ANALYSIS OF EFFECTS OF VECTOR CONTROL ON TOTAL CURRENT HARMONIC DISTORTION OF ADJUSTABLE SPEED AC DRIVE KARTIK TAMVADA Department of E.E.E, V.S.Lakshmi Engineering College for Women, Kakinada, Andhra Pradesh,
More informationMulti-Pitch Estimation of Audio Recordings Using a Codebook-Based Approach Hansen, Martin Weiss; Jensen, Jesper Rindom; Christensen, Mads Græsbøll
Aalborg Universitet Multi-Pitch Estimation of Audio Recordings Using a Codebook-Based Approach Hansen, Martin Weiss; Jensen, Jesper Rindom; Christensen, Mads Græsbøll Published in: Proceedings of the 4th
More informationIsolated Digit Recognition Using MFCC AND DTW
MarutiLimkar a, RamaRao b & VidyaSagvekar c a Terna collegeof Engineering, Department of Electronics Engineering, Mumbai University, India b Vidyalankar Institute of Technology, Department ofelectronics
More informationSpeech Endpoint Detection Based on Sub-band Energy and Harmonic Structure of Voice
Speech Endpoint Detection Based on Sub-band Energy and Harmonic Structure of Voice Yanmeng Guo, Qiang Fu, and Yonghong Yan ThinkIT Speech Lab, Institute of Acoustics, Chinese Academy of Sciences Beijing
More informationMultimedia Signal Processing: Theory and Applications in Speech, Music and Communications
Brochure More information from http://www.researchandmarkets.com/reports/569388/ Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Description: Multimedia Signal
More informationThe Music Retrieval Method Based on The Audio Feature Analysis Technique with The Real World Polyphonic Music
The Music Retrieval Method Based on The Audio Feature Analysis Technique with The Real World Polyphonic Music Chai-Jong Song, Seok-Pil Lee, Sung-Ju Park, Saim Shin, Dalwon Jang Digital Media Research Center,
More information(i) Understanding the basic concepts of signal modeling, correlation, maximum likelihood estimation, least squares and iterative numerical methods
Tools and Applications Chapter Intended Learning Outcomes: (i) Understanding the basic concepts of signal modeling, correlation, maximum likelihood estimation, least squares and iterative numerical methods
More informationEnsemble Empirical Mode Decomposition: An adaptive method for noise reduction
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735. Volume 5, Issue 5 (Mar. - Apr. 213), PP 6-65 Ensemble Empirical Mode Decomposition: An adaptive
More informationLecture 5: Sinusoidal Modeling
ELEN E4896 MUSIC SIGNAL PROCESSING Lecture 5: Sinusoidal Modeling 1. Sinusoidal Modeling 2. Sinusoidal Analysis 3. Sinusoidal Synthesis & Modification 4. Noise Residual Dan Ellis Dept. Electrical Engineering,
More informationSinging Voice Detection. Applications of Music Processing. Singing Voice Detection. Singing Voice Detection. Singing Voice Detection
Detection Lecture usic Processing Applications of usic Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Important pre-requisite for: usic segmentation
More informationSELECTIVE NOISE FILTERING OF SPEECH SIGNALS USING AN ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM AS A FREQUENCY PRE-CLASSIFIER
SELECTIVE NOISE FILTERING OF SPEECH SIGNALS USING AN ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM AS A FREQUENCY PRE-CLASSIFIER SACHIN LAKRA 1, T. V. PRASAD 2, G. RAMAKRISHNA 3 1 Research Scholar, Computer Sc.
More informationINFLUENCE OF FREQUENCY DISTRIBUTION ON INTENSITY FLUCTUATIONS OF NOISE
INFLUENCE OF FREQUENCY DISTRIBUTION ON INTENSITY FLUCTUATIONS OF NOISE Pierre HANNA SCRIME - LaBRI Université de Bordeaux 1 F-33405 Talence Cedex, France hanna@labriu-bordeauxfr Myriam DESAINTE-CATHERINE
More information