THE USE OF ARTIFICIAL NEURAL NETWORKS IN THE ESTIMATION OF THE PERCEPTION OF SOUND BY THE HUMAN AUDITORY SYSTEM


INTERNATIONAL JOURNAL ON SMART SENSING AND INTELLIGENT SYSTEMS VOL. 8, NO. 3, SEPTEMBER 2015

THE USE OF ARTIFICIAL NEURAL NETWORKS IN THE ESTIMATION OF THE PERCEPTION OF SOUND BY THE HUMAN AUDITORY SYSTEM

D. Riordan*, P. Doody and J. Walsh
Intelligent Mechatronic and RFID Technology Gateway, Institute of Technology, Tralee, Co. Kerry, Rep. of Ireland.

Submitted: Apr. 30, 2015; Accepted: July 29, 2015; Published: Sep. 1, 2015

Abstract- The human auditory system perceives sound in a very different manner from how sound is measured by modern audio sensing systems. The most commonly referenced aspects of auditory perception are loudness and pitch, which are related to the objective measures of audio signal frequency and sound pressure level. Here we describe an efficient and accurate method for the conversion of the sensed factors of frequency and sound pressure level to perceived loudness and pitch. This method is achieved through the modeling of the physical auditory system and the biological neural network of the primary auditory cortex using artificial neural networks. The behavior of artificial neural networks, both during and after the training process, has been found to mimic that of biological neural networks, and this method will be shown to have certain advantages over previous methods in the modeling of auditory perception. This work will describe the nature of artificial neural networks and investigate their suitability over other modeling methods for the task of perception modeling, taking into account development and implementation complexity. It will be shown that while known points on the perception scales of loudness and pitch can be used to objectively test the suitability of artificial neural networks, it is in the estimation of the perception of sound from unknown (or unseen) data points that this method excels.
Index terms: auditory system modeling, audio sensors, artificial neural networks, perception of sound, digital signal processing, loudness, pitch.

I. INTRODUCTION

The modeling of the perception of sound by the human auditory system is vital in fields such as speech recognition, speech synthesis and audio quality measurement. The perception of sound is governed by two main perceptual measures: the Perceptual Loudness measure (Phon or Sone scale) and the Perceived Pitch (Critical-Band Rate or Bark scale) of an audio signal.

The perceived loudness of an audio signal presented to the ear is influenced by both the signal's frequency and sound pressure level (S.P.L.). The current method for the conversion from frequency and S.P.L. to perceptual loudness is outlined in ISO 226:2003, Acoustics: Normal equal-loudness-level contours [1]. This conversion involves a complex calculation which incorporates three 29-entry look-up tables.

The conversion from frequency to pitch was originally presented by Zwicker [2] in table format. Zwicker's table documents the Critical Frequency Bands along with their corresponding center frequency, maximum cut-off frequency and bandwidth. The currently most widely used method for this conversion is a function approximation of Zwicker's data created by Traunmuller [3]. These conversions are described in detail in Section II.

The existing measures for modeling the auditory system's perception of pure-tone audio signals attempt to model the behavior of the ear (and to a certain extent the filtering effects of the head and torso) in conjunction with the primary auditory cortex. The primary auditory cortex is a biological neural network. Therefore, it may be beneficial to model the conversion from the analytical measures of frequency and S.P.L. to loudness and pitch using Artificial Neural Networks (A.N.N.s).
A.N.N.s are regarded as a good candidate for the estimation of this perceptual mapping function as their structure is based upon biological neural networks. Their behavior, both during and after the training process, has also been found to mimic that of biological neural networks [4]. This paper describes the development and testing of a system which uses A.N.N.s to model both features of sound perception mentioned above. It will also be investigated whether a single A.N.N. model can be used to model both of these aspects of sound perception simultaneously, as in Figure 1.

[Figure 1: An A.N.N. Model of Pure-Tone Perception. Inputs: Frequency (Hz), Sound Pressure Level (dB); outputs: Loudness (Phon / Sone), Pitch (Bark).]

II. THE PERCEPTION OF SOUND BY THE HUMAN AUDITORY SYSTEM

Sound is loosely defined as vibrations which travel through the medium of air (although any medium or combination of media will suffice) as longitudinal waves and are perceived by the human ear. There are two main analytical parameters that define the characteristics of a sound: the Sound Pressure Level (SPL) and the frequency components of the longitudinal waveform. Similarly, there are two main characteristics which define a sound as perceived by the auditory system: pitch (measured in Bark) and perceived loudness (measured in Phon or Sone).

An otologically normal person is a person who has a fully functioning auditory system, free from impairments. For such a person, the magnitude of the vibrations that can be perceived is generally accepted to be those with an SPL of greater than 20 µPa, i.e. 0 dB. This is known as the Absolute Hearing Threshold (AHT). This value is actually the AHT for a signal of frequency 1 kHz; the AHT is known to vary with the frequency of the signal being perceived [5]. For a similarly otologically normal person, the frequencies of vibrations which can be perceived are those within the range of 20 Hz to 20 kHz and of sufficient SPL. This detectable frequency range generally deteriorates with the age of the listener, and may also be adversely affected by overexposure to loud sounds causing hearing damage [5].

a. Pitch

Critical-Band Rate is a perceptual measure, usually quantified in Bark, of the perceived pitch of an audio signal. This measure is directly related to the frequency of the sound being perceived. The conversion from frequency to perceived pitch is often referred to as frequency-warping.
The critical-band rate is a sub-division of the audible frequency range

into critical bands. These critical bands are more closely related to the manner in which the mechanics of the basilar membrane of the human inner ear operate [6].

[Figure 2: Critical-Band Rate versus Frequency]

The conversion from frequency to pitch was originally presented by Zwicker [2] in table format. This table is presented in the Appendix as Table A.1. Zwicker's table documents the Critical-Band number along with the corresponding center frequency, maximum cut-off frequency and bandwidth of each band [5]. A plot of the relationship between Frequency and Critical-Band Rate outlined by Zwicker is shown in Figure 2.

Since the first publication of this table in 1961 the conversion from frequency to Critical-Band Rate has been modelled using many function approximations of the data. Resulting equations and algorithms have been proposed by [7], [8] and [9]. The current, most widely used and accepted method for this conversion is outlined by Traunmuller [3]. Traunmuller's equations for the conversion from frequency to Critical-Band Rate are

    z' = 26.81 f / (1960 + f) - 0.53
    z  = z' + 0.15 (2 - z'),       if z' < 2
    z  = z' + 0.22 (z' - 20.1),    if z' > 20.1
    z  = z',                       otherwise        (1)

where z is the critical-band rate (Bark) and f is the frequency (Hz).
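Traunmuller's conversion (Eq. 1) is straightforward to implement directly. A minimal sketch follows; the function name is our own, and the constants are those of the published formula.

```python
def hz_to_bark(f):
    """Approximate critical-band rate (Bark) from frequency f (Hz),
    following Traunmuller's formula (Eq. 1)."""
    z = 26.81 * f / (1960.0 + f) - 0.53
    if z < 2.0:            # low-frequency correction
        z += 0.15 * (2.0 - z)
    elif z > 20.1:         # high-frequency correction
        z += 0.22 * (z - 20.1)
    return z
```

For example, a 1000 Hz tone maps to roughly 8.5 Bark, and the mapping is monotonic across the audible range.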

b. Loudness

The Perceptual Loudness Measure is a psychoacoustic measure correlating to the physical intensity of an audio signal. Perceived loudness is usually measured in the units Phon or Sone. As well as being sensitive to the SPL of the signal being observed, the perceived loudness of a signal is also highly dependent on the frequency components of the signal. This has led to the creation of the Equal-Loudness Curves, shown in Figure 3 [5].

[Figure 3: The Equal Loudness Contours (I.S.O. 2003)]

The Equal-Loudness Contours depict the sound pressure levels (SPL) which are required to ensure a perceived constant loudness over the audible frequency band. As can be seen from the contours of Figure 3, for a perceived loudness of 10 Phon at 1000 Hz an SPL of 10 dB is required, while to maintain a perceived loudness of 10 Phon at 50 Hz an SPL of approximately 55 dB is required.

The Equal-Loudness contours were initially devised by Fletcher and Munson in 1933. The contours were derived using subjective measures, involving a panel of test subjects. Each listener was presented with a pure tone of 1 kHz of certain intensity and then a second pure tone of a different frequency. The intensity of the second tone was then varied until the listener perceived the two tones to be of equal loudness. The results obtained from the various test subjects were then averaged to obtain the final contours [10].

This experiment was repeated in 1956 by Robinson and Dadson, who found their results to differ greatly from those of Fletcher and Munson [11]. Robinson and Dadson's results were accepted as the International Standardisation Organization's (I.S.O.) official standard until replaced by the current standard in 2003 [1].

The current I.S.O. standard is documented in ISO 226:2003. This document gives information on the conditions under which the subjective testing for the definition of the curves took place. The derived equations which may be used for the conversion of sound intensity data to perceptual loudness data are also included. These consist of equations for the conversion from frequency and SPL to perceptual loudness (in Phon) and vice versa, given here as Eq. 2 and Eq. 3 respectively. These equations are accompanied by a look-up table which is required to implement them. This look-up table can be found in the Appendix, Table A.2 [1].

    L_N = 40 log10(B_f) + 94        (2)

    where B_f = (0.4 * 10^((L_p + L_u)/10 - 9))^a_f - (0.4 * 10^((T_f + L_u)/10 - 9))^a_f + 0.005135

In Eq. 2, L_N is the perceived loudness level in Phon, T_f is the threshold of hearing, a_f is the exponent for loudness perception, L_u is the magnitude of the linear transfer function normalized at 1000 Hz, and L_p is the SPL. The three factors T_f, a_f and L_u each have values determined by the 29 frequencies specified in the look-up table. The inverse conversion is

    L_p = (10 / a_f) log10(A_f) - L_u + 94        (3)

    where A_f = 4.47 * 10^-3 (10^(0.025 L_N) - 1.15) + (0.4 * 10^((T_f + L_u)/10 - 9))^a_f

and all symbols represent the same factors as in Eq. 2.

The Sone scale of perceived loudness is very similar to the Phon scale; in fact it is a direct translation of the calculated Phon value. In certain instances, the Sone scale can be a more useful measure than the Phon measure. The Sone unit of perceived loudness is analogous to the manner in which the human auditory system perceives a change in loudness.
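Eq. 2 can be sketched as a small function taking the SPL together with the three table factors for the chosen frequency. The 1 kHz table entries used in the usage example (T_f = 2.4, a_f = 0.25, L_u = 0) are quoted from common implementations of the standard and should be treated as illustrative; at 1000 Hz the loudness level in Phon equals the SPL in dB by definition, which gives a useful sanity check.

```python
import math

def loudness_phon(Lp, Tf, af, Lu):
    """Loudness level L_N (Phon) from SPL Lp (dB) at one of the 29
    ISO 226:2003 reference frequencies (Eq. 2).  Tf, af and Lu are the
    hearing threshold, loudness-perception exponent and transfer-function
    magnitude read from the standard's look-up table for that frequency."""
    Bf = ((0.4 * 10.0 ** ((Lp + Lu) / 10.0 - 9.0)) ** af
          - (0.4 * 10.0 ** ((Tf + Lu) / 10.0 - 9.0)) ** af
          + 0.005135)
    return 40.0 * math.log10(Bf) + 94.0

# Sanity check at the 1 kHz reference (assumed table row: Tf=2.4, af=0.25, Lu=0):
phon_at_60dB = loudness_phon(60.0, 2.4, 0.25, 0.0)   # expected close to 60 Phon
```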
In the Phon scale of perceived loudness, a doubling of the perceived loudness is associated with a rise of 10 Phon [6]. Using the Sone scale, the

perceived loudness of two different signals would be in ratio to the resulting positions on the Sone scale. In other words, a perceived doubling of the loudness of a signal would result in a doubling of the units of the perceptual loudness measure on the Sone scale. The equation for the conversion from the Phon scale to the Sone scale is shown in Eq. 4, where S is the resulting perceived loudness in Sone and l is the loudness level in Phon [12].

    S = 2^((l - 40)/10),       if l >= 40
    S = (l / 40)^2.642,        otherwise        (4)

III. ARTIFICIAL NEURAL NETWORKS

A biological neural network is an interconnection of processing elements (neurons) responsible for the processing of information in the nervous systems of animals. Each connection between neurons has a certain strength or weight, which may be strengthened or weakened, and which allows the neural network to learn and thus perform processing operations. It is the use of neural networks that allows animals to perform with ease various tasks which have proved excessively difficult to achieve by computational means [4].

An Artificial Neural Network (A.N.N.) is a computational method which is modeled on biological neural networks. An A.N.N. consists of an interconnection of processing elements (artificial neurons) which each carry out a simple computational operation. The neurons are interconnected by weighted connections, similar to the connections in biological neural networks. The weights of each connection are updatable during the training process. It is this ability that allows the A.N.N. to learn functions and processes in the same way as biological neural networks. A.N.N.s are a branch of the inductive machine learning subfield of Artificial Intelligence (A.I.) techniques, and are based upon the behavior, structure and architecture of biological neural networks.
For this reason A.N.N.s are very well suited to the modeling of biological functions which have traditionally been extremely difficult for other computing methods to model. Their advantages over traditional processing techniques include their ability to learn from pre-existing training material. An A.N.N. generally learns in much the same way as biological neural networks learn. When presented with training material, the connection strengths within

the A.N.N. are either strengthened or weakened until the desired associations are made. Many A.N.N. architectures and training algorithms have been developed to date, each having specific advantages and disadvantages.

a. Architectures of Artificial Neural Networks

A.N.N. architecture is the arrangement of neurons into layers and the patterns of interconnection between those layers and the neurons within them. Neural nets are often separated into single-layer and multi-layer architectures. Single-layer nets usually comprise an input layer, a single layer of connections and an output layer. The input layer of a neural net does not perform any computation and, therefore, is rarely counted when determining the number of layers in a net. Single-layer nets are often used for pattern classification problems where the output of each output neuron represents a specific class of input pattern. Minsky and Papert proved that single-layer neural networks can only be used effectively in problems that are linearly separable; for more complex problems, more complex multi-layer nets need to be used [13].

Multi-layer nets contain an input layer, any number of hidden layers and an output layer. In multi-layer networks it is common to have a layer of connections between each successive layer; however, connection of any individual neuron or layer of neurons to any other is possible. A layered network architecture allows neurons to be connected only to neurons of the same or subsequent layers. No intra-layer connection is allowed within the input layer. This architecture ensures that no closed-loop feedback occurs in the network. Acyclic networks are a form of layered network in which no connection between neurons of the same layer is permitted.
Only connections from a neuron to neurons of a subsequent layer are permitted. A special case of the acyclic network architecture is the Feed-Forward network [14]. Feed-Forward A.N.N.s are the most popular form of A.N.N., with the term A.N.N. often being used to describe only Feed-Forward type networks [14]. In this architecture the flow of the signal is always forward through the network towards the output neurons. Connections leading from a neuron to neurons in the same or previous network layers are prohibited.
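A single forward pass through such a network, with a tan-sigmoid hidden layer and a linear output (the same layout used for the loudness models later in this paper), can be sketched as follows. The weight shapes and values here are arbitrary illustrations; in practice the inputs would be normalized before being presented to the network, since raw frequency and SPL values saturate the tanh units.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """One forward pass of a two-layer feed-forward net:
    tan-sigmoid hidden layer followed by a linear output layer."""
    h = np.tanh(W1 @ x + b1)   # hidden-layer activations
    return W2 @ h + b2         # linear output (unbounded range)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)   # 2 inputs  -> 4 hidden units
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # 4 hidden  -> 1 output
y = forward(np.array([1000.0, 80.0]), W1, b1, W2, b2)  # (frequency, S.P.L.)
```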

[Figure 4: A Feed-Forward A.N.N., showing an input layer, a hidden layer and an output layer.]

Modular Networks are formed of an interconnection of separately developed A.N.N.s. This allows a large problem to be separated into smaller problems, with the developer using an A.N.N. to solve each smaller problem. These smaller A.N.N.s are then combined in a Modular A.N.N. to solve the larger problem [14].

b. Training Algorithms and Supervised Learning

The purpose of a training algorithm is to optimise a neural network so that the network will perform in the manner desired by the user. Upon creation of an A.N.N., the weight values of the network are often assigned at random. The A.N.N. must then be optimised using a training algorithm to perform a useful computational function. This is achieved by altering the weights according to a predefined set of rules [14]. Learning algorithms can be divided into two wide-ranging types: Supervised Learning algorithms and Unsupervised Learning algorithms.

Supervised Learning algorithms are very similar to function approximation algorithms. The A.N.N. is provided with a training set from which to learn. Each training vector consists of an input vector with a corresponding target output vector. The inputs are presented at the input nodes of the network and the resulting output is logged. The difference between the A.N.N.'s outputs and the target output vector contained in the training vector is said to be the error vector. The supervised learning algorithm then performs some form of optimisation in order to minimise this error. Depending upon the type of training algorithm being used and the purpose for which the A.N.N. will be used, either the mean square error (M.S.E.) or the number of misclassifications is minimised. This involves a measured alteration of the weights of the connections within the network.
Most training algorithms are repeated for a number of iterations until some termination criterion is met. This criterion is often a predefined number of iterations, a goal M.S.E. or number of misclassifications, or a minimum reduction of error per iteration.
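The three termination criteria just listed can be sketched in a single training loop. Plain gradient descent on a toy least-squares problem stands in here for an A.N.N. training algorithm; the stopping thresholds are illustrative values, not those used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
t = X @ np.array([1.5, -0.7])        # toy linear target to be learned
w = np.zeros(2)                       # "network weights"

max_iters, goal_mse, min_improvement = 1000, 1e-6, 1e-9
prev_mse = np.inf
for i in range(max_iters):            # criterion 1: iteration budget
    e = X @ w - t
    mse = float(np.mean(e ** 2))
    if mse <= goal_mse:               # criterion 2: goal M.S.E. reached
        break
    if prev_mse - mse < min_improvement:  # criterion 3: error reduction stalls
        break
    w -= 0.1 * (X.T @ e) / len(t)     # gradient step on the M.S.E.
    prev_mse = mse
```

Whichever criterion fires first ends training; here the goal M.S.E. is typically reached well before the iteration budget is exhausted.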

c. Over-Fitting & Generalization

Generalization is the ability of an A.N.N. to perform well when presented with unseen data, based upon what it has learnt during the training process. One of the major problems which occurs during the training of A.N.N.s is the memorization of training material. When memorization occurs, the A.N.N. has over-fitted to the requirements of the training data: it has exactly learned the input-output values in the training set, but performs poorly when presented with unseen data. Over-fitting of the training data may occur when the network has been excessively trained. If a suitably large A.N.N. is trained repeatedly until its M.S.E. is a minimum, it may have memorized the input-output relationship and perform poorly on unseen data [14].

Over-fitting can be avoided by limiting the number of iterations of the training algorithm. By dividing the available data into a training set and a validation set, the generalization of the A.N.N. can be monitored. Once trained, the A.N.N. is presented with the unseen validation set and the M.S.E. of the resulting output is monitored. This operation is known as Cross-Validation. Often many A.N.N.s of varying architectures are trained to solve a single problem. By implementing the Cross-Validation technique with each A.N.N., the A.N.N. with the best generalisation can be identified. This is often not the A.N.N. with the best performance on the training data [15].

Another method to ensure generalization is to limit the degrees of freedom present in the A.N.N. By limiting the number of neurons in the A.N.N., the net will be unable to memorize the data due to its lack of flexibility. In this way the A.N.N. is forced to generalize the relationship between the input and target values of the training set [16].

IV. A.N.N.
IMPLEMENTATION OF THE PERCEPTUAL LOUDNESS MEASURE (PHON)

a. Motivation for A.N.N. Implementation

ISO 226:2003 specifies combinations of sound pressure levels and frequencies of pure continuous tones which are perceived as equally loud by human listeners [1]. The algorithm to calculate the loudness level, L_N, given the frequency, f, and the S.P.L., L_p, of an audio

signal is shown in Eq. 2 earlier in this paper. This equation uses three 29-entry look-up tables to perform the calculation outlined in Eq. 2; see Appendix Table A.2 [1]. The algorithm outlined in ISO 226:2003 for the calculation of the perceived loudness can only be implemented accurately for 29 discrete values on the frequency scale. Therefore the frequency components of all audio signals need to be approximated by one of the 29 frequencies specified by ISO 226:2003. This can result in a digitization of such features as uniform tones rising steadily in the frequency domain.

The algorithm outlined in ISO 226:2003 is attempting to model the behavior of a biological function. As mentioned in Section III, A.N.N.s are modeled upon biological neural networks, and the behavior of A.N.N.s both during and after the training process has been found to mimic the behavior of biological neural networks. Therefore, it may be beneficial to model this perception-based function using A.N.N. techniques.

b. Development of A.N.N. Architecture

For simplicity, a two-layer feed-forward architecture was used for the A.N.N. Two nodes are required in the input layer to take the values of frequency and S.P.L. of the audio signal. A single node is used in the output layer to accommodate the output of the loudness level in Phon. The number of nodes to implement in the hidden layer was decided during the training process, based upon the performance of various networks during the training/testing process. A tan-sigmoid activation function is used in the nodes of the hidden layer; this allows for the use of efficient backpropagation-based training algorithms. A linear activation function is used in the output layer node. The linear output function is required to allow the output of the A.N.N. to take on any value.
This is required as the desired output of the network will be in the range 0 to 90 of the Phon scale.

c. Training / Testing

The data used in the training of the A.N.N.s to mimic the manner in which an audio signal's loudness is perceived was generated from Eq. 2 and Eq. 3. These equations were implemented for all 29 specified frequencies at each S.P.L. level from 1 dB to 90 dB (those specified to be accurately catered for by the equations), which resulted in 2581 training vectors. Each training vector contained a frequency value (Hz) and an S.P.L. level (dB) as the input values. A corresponding perceptual loudness level (Phon), calculated by Eq. 2 and Eq. 3, was included in each training vector as a target value. With a large quantity of both input values and corresponding target values, a supervised training algorithm may be used to train the A.N.N.s.
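Generating such a training set from Eq. 2 amounts to sweeping the integer S.P.L. levels at each table frequency. The sketch below shows the shape of this procedure with only the assumed 1 kHz table row filled in; the remaining 28 rows of Appendix Table A.2 would follow the same (f, T_f, a_f, L_u) layout. Note that 29 x 89 = 2581, which matches the paper's vector count if 89 integer S.P.L. steps are taken per frequency.

```python
import math

def loudness_phon(Lp, Tf, af, Lu):
    """ISO 226:2003 Eq. 2: loudness level (Phon) from SPL (dB)."""
    Bf = ((0.4 * 10.0 ** ((Lp + Lu) / 10.0 - 9.0)) ** af
          - (0.4 * 10.0 ** ((Tf + Lu) / 10.0 - 9.0)) ** af + 0.005135)
    return 40.0 * math.log10(Bf) + 94.0

# Illustrative single table row (assumed 1 kHz entries); the full 29-row
# table from the standard would be listed here in the same shape.
iso_table = [(1000.0, 2.4, 0.25, 0.0)]

# Each training vector: ((frequency, S.P.L.), target loudness in Phon).
training = [((f, Lp), loudness_phon(Lp, Tf, af, Lu))
            for f, Tf, af, Lu in iso_table
            for Lp in range(1, 90)]   # 89 integer S.P.L. steps per frequency
```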

The A.N.N. was designed, trained and tested using the Matlab Neural Network Toolbox. The Levenberg-Marquardt backpropagation algorithm was used to train the A.N.N. The Levenberg-Marquardt algorithm is a least-squares error minimization technique used in the supervised training of A.N.N.s. It is noted as having an appropriate trade-off between efficiency and accuracy, which makes it suitable for function approximation problems with randomized initial weights [15].

The number of nodes in the hidden layer was decided based on the performance observed during successive training/testing iterations, as follows. The number of hidden nodes was varied from 10 to 60, with each configuration being tested for ten training sessions. Each session consisted of 1000 epochs, and began with the weights of the A.N.N. being randomized. The resulting M.S.E. at the end of each training session was noted and the associated network weights logged. For each configuration, the A.N.N. with the least M.S.E. after the ten training sessions is taken as the best initial approximation of the function. These best A.N.N.s are then trained to the maximum number of epochs as defined by the stop conditions of the Levenberg-Marquardt backpropagation algorithm.

Testing is also carried out to determine the level of generalisation achieved by each A.N.N. This is done by observing the performance of each A.N.N. configuration when presented with unseen data. In this instance, unseen data consists of frequency values other than those included in the look-up table associated with Eq. 2 and Eq. 3.

Table 1: Results from Training of A.N.N. Model of the Perceptual Loudness Conversion (Phon)
(Columns: No. of Nodes in Hidden Layer; Min. M.S.E. of Training Sessions; Max. Error of Net with Min. M.S.E.; Best M.S.E. Result Received; Standard Deviation; Max. Error)
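The selection procedure above (vary the hidden-layer size, run ten randomly initialised sessions per configuration, keep the best) can be sketched as follows. A tiny hand-rolled tanh network trained by plain gradient descent on a toy target stands in for the Matlab toolbox's Levenberg-Marquardt sessions, and the hidden sizes swept are scaled down from the paper's 10 to 60 range.

```python
import numpy as np

def train_session(rng, n_hidden, steps=300, lr=0.05):
    """One stand-in training session: fit a tanh-hidden-layer net to a toy
    1-D target by gradient descent from random initial weights, and return
    the final M.S.E.  (The paper uses Levenberg-Marquardt instead.)"""
    x = np.linspace(-1.0, 1.0, 64)
    t = x ** 2                                   # toy target function
    W1 = rng.normal(size=n_hidden)
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(size=n_hidden) * 0.1
    b2 = 0.0
    for _ in range(steps):
        h = np.tanh(np.outer(x, W1) + b1)        # (64, n_hidden) activations
        e = h @ W2 + b2 - t                      # output error
        dW2 = h.T @ e / x.size                   # gradients of the M.S.E.
        db2 = e.mean()
        dh = np.outer(e, W2) * (1.0 - h ** 2)
        dW1 = (dh * x[:, None]).mean(axis=0)
        db1 = dh.mean(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    h = np.tanh(np.outer(x, W1) + b1)
    return float(np.mean((h @ W2 + b2 - t) ** 2))

# Ten randomly initialised sessions per hidden-layer size; keep the minimum
# M.S.E. of each configuration as its best initial approximation.
rng = np.random.default_rng(0)
best = {n: min(train_session(rng, n) for _ in range(10))
        for n in (2, 4, 8)}                      # scaled-down sweep
```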

d. Results

Table 1 shows a selection of the results achieved during the investigation of the performance of A.N.N.s with varying numbers of neurons in the hidden layer. It can be seen that a number of A.N.N.s suitable for the estimation of the Perceptual Loudness measure in the Phon scale were created. A network comprising 40 nodes in the hidden layer was developed and trained; its M.S.E., standard deviation and maximum individual error are listed in Table 1. It is of benefit, with regard to the size, execution time and generalisation of the A.N.N., that the number of neurons is kept to a minimum. Therefore, a certain trade-off between the number of neurons and the accuracy of the results must be made. With this in mind, and based on the results shown in Table 1, an A.N.N. comprising 28 neurons in the hidden layer would be suitable for use in the estimation of the perceptual loudness measure. Of course, where greater accuracy is needed an increase in the number of neurons may be made. If the situation requires a smaller A.N.N. with a shorter execution time, an A.N.N. with fewer neurons may be used at the expense of accuracy.

[Figure 5: Results from Testing of Perceptual Loudness (Phon) Mapping A.N.N. Axes: Frequency (Hz) versus Perceptual Loudness (Phon); curves: DSP Output and A.N.N. Output.]

Figure 5 depicts a comparison of the performance of the A.N.N. method developed here and the method outlined by Eq. 2 and Eq. 3. Both methods were implemented with a constant S.P.L. of 80 dB and a frequency value varying from 20 Hz to 12500 Hz. The equation-based method was presented with those frequency values associated with the look-up table. The A.N.N. was presented with frequency values rising from 20 Hz to 12500 Hz in increments of 1 Hz. The resulting estimation of the perceived loudness from both methods is plotted in the

figure. From this figure it can be seen that the A.N.N. shows a very high level of correlation with the values generated by the method outlined in the I.S.O. standard. It was also found that the A.N.N. method produced a highly continuous curve when presented with a constant SPL and varying frequency. This is in contrast to the discontinuous nature of the results generated by the method outlined by the I.S.O. standard.

Figure 6 highlights the digitization effects introduced by the implementation outlined by ISO 226:2003 (labeled DSP output). These effects have been overcome by the A.N.N. method of perceptual loudness evaluation (implemented with 28 neurons in the hidden layer). Figure 6 shows the resulting curves when both methods were presented with a constant SPL of 80 dB and the frequency was varied from 20 Hz to 12500 Hz. A good level of generalisation is shown by this A.N.N. configuration, as evidenced by the smooth continuous curve shown in Figures 5 and 6 [17].

[Figure 6: Close-Up of Figure 5. Axes: Frequency (Hz) versus Perceptual Loudness (Phon); curves: DSP Output and A.N.N. Output.]

V. AN A.N.N. IMPLEMENTATION OF THE PERCEPTUAL LOUDNESS MEASURE (SONE)

a. Motivation for A.N.N. Implementation

Perceived loudness may also be measured on the Sone scale. The Sone measure is generally calculated directly from a previously determined Phon measure. The algorithm for this conversion is given in Eq. 4 earlier in this paper [12]. The Sone scale of perceived loudness is often thought to be a more accurate representation of the manner in which

loudness is perceived by the human auditory system. For this reason it may be more desirable to have a direct conversion from frequency and S.P.L. to the Sone scale rather than the Phon scale. The previous section of this paper showed that the conversion from frequency and S.P.L. to the loudness measure in Phon can be implemented accurately with an A.N.N. This section will show that an A.N.N. can also be used to implement the conversion from frequency and S.P.L. to the loudness measure on the Sone scale.

b. Development of A.N.N. Architecture

"[Artificial Neural] Networks with just two layers of weights are capable of approximating any continuous functional mapping" [16]. The continuity of a function has many different levels. C0 continuity denotes that the function is continuous and does not exhibit any discrete behavior. C1 continuity deals with the first derivative of the function and denotes that this derivative is also continuous.

The conversion from the Phon measure to the Sone measure presented in Eq. 4 was found to be a C0 continuous function, as shown in Eq. 5 and Eq. 6, where both branches give a result of 1 at L = 40 and in the limit as L goes to 40 respectively:

    S = 2^((L - 40)/10) = 1 at L = 40        (5)

    lim_{L -> 40} (L / 40)^2.642 = 1         (6)

However, Eq. 7 shows that at L = 40 the first derivative of the upper branch is dS/dL = 0.1 ln(2), approximately 0.0693, while Eq. 8 shows that the limit as L goes to 40 of the first derivative of the lower branch is dS/dL = 2.642/40, approximately 0.0661:

    dS/dL = 0.1 ln(2) 2^((L - 40)/10)        (7)

    dS/dL = (2.642 / 40) (L / 40)^1.642      (8)

These values are not equal and therefore the function is not C1 continuous.
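Eq. 4 and the C0/C1 analysis in Eq. 5 to Eq. 8 can be verified directly: the two branches agree at L = 40, but their one-sided slopes there do not.

```python
import math

def phon_to_sone(l):
    """Eq. 4: perceived loudness (Sone) from loudness level l (Phon)."""
    if l >= 40.0:
        return 2.0 ** ((l - 40.0) / 10.0)
    return (l / 40.0) ** 2.642

# C0 continuity (Eq. 5 and Eq. 6): both branches equal 1 at l = 40.
# C1 discontinuity (Eq. 7 and Eq. 8): the one-sided slopes differ.
slope_above = math.log(2.0) / 10.0   # derivative of 2^((l-40)/10) at l = 40
slope_below = 2.642 / 40.0           # limiting derivative of (l/40)^2.642
```

Here slope_above is about 0.0693 and slope_below about 0.0661, confirming that the conversion is continuous but not smooth at 40 Phon.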

To approximate such a function efficiently, an A.N.N. with at least two hidden layers is required. For this reason a three-layer Feed-Forward A.N.N. architecture was implemented; "a three-layer network with threshold activation functions could represent an arbitrary decision boundary to arbitrary accuracy" [16]. Again, a tan-sigmoid function was used in the nodes of the hidden layers and a linear function in the output layer node. Two nodes were required in the input layer to take the values of frequency and S.P.L. of the audio signal. A single node was used in the output layer to accommodate the output of the loudness level in Sone. The number of nodes in each of the hidden layers was decided during the training process, based upon the performance of various network implementations during the training/testing process.

c. Training/Testing

The data used here in the training of the A.N.N.s was generated by calculating the Phon values from the equations provided in ISO 226:2003 and then converting these to Sone with Eq. 4. This resulted in 2581 training vectors, each containing a frequency value (Hz) and an S.P.L. level (dB) as inputs and a corresponding perceptual loudness level (Sone) as the target value. Again, this training set facilitates supervised training methods.

The A.N.N. was designed, trained and tested using the Matlab Neural Network Toolbox. The Levenberg-Marquardt backpropagation algorithm was used to train the A.N.N. The number of nodes in each hidden layer was decided based on the performance observed during successive training/testing iterations, as follows. The number of neurons in the first hidden layer was varied from 5 to 40 and the number of neurons in the second hidden layer was varied from 1 to 5.
Each possible configuration was tested for ten training sessions, each of 1000 epochs, with every session beginning with a randomisation of the A.N.N.'s weights. For each configuration, the A.N.N. with the lowest Mean Square Error (M.S.E.) over the ten training sessions is stored as the best initial approximation of the function. These A.N.N.s are then trained to the maximum number of epochs as defined by the stop conditions of the Levenberg-Marquardt backpropagation algorithm. The results are then logged and analysed to decide upon a suitable A.N.N. configuration for this function approximation. Each A.N.N. configuration is also tested for the level of generalisation achieved. As before, each network is provided with unseen input data and the resulting outputs are analysed for instances of over-fitting.
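The configuration search described above can be sketched as follows. This is not the authors' MATLAB code: scikit-learn's MLPRegressor with the lbfgs solver stands in for the MATLAB toolbox's Levenberg-Marquardt training, a toy analytic target stands in for the 2581 I.S.O. 226:2003-derived vectors, and the sweep ranges and session counts are reduced for brevity (the paper swept 5-40 and 1-5 nodes over ten sessions each).

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Toy training set standing in for the (frequency, S.P.L.) -> Sone vectors.
X = rng.uniform([20.0, 0.0], [12500.0, 100.0], size=(300, 2))
y = 2.0 ** ((X[:, 1] - 40.0) / 10.0)      # loudness depends only on S.P.L. here
Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # scale inputs for stable training

best = None
for n1 in (5, 10):                         # first hidden layer sweep (reduced)
    for n2 in (1, 2):                      # second hidden layer sweep (reduced)
        for session in range(3):           # restarts with fresh random weights
            net = MLPRegressor(hidden_layer_sizes=(n1, n2), solver="lbfgs",
                               max_iter=1000, random_state=session)
            net.fit(Xs, y)
            mse = mean_squared_error(y, net.predict(Xs))
            if best is None or mse < best[0]:
                best = (mse, (n1, n2))     # keep the best approximation so far

print("best configuration:", best[1], "M.S.E.:", best[0])
```

The same restart-and-keep-best pattern guards against a single unlucky weight initialisation dominating the comparison between configurations.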

d. Results

The table of results populated during the investigation of the performance of the various A.N.N. configurations can be found in the Appendix, Table A.3. From this table it can be seen that a number of A.N.N.s suitable for the estimation of the perceptual loudness measure on the Sone scale were created. A network comprising 30 nodes in the first hidden layer and 5 in the second was designed and trained; its M.S.E., standard deviation and maximum individual error are reported in Table A.3.

Figure 7: Results from Testing Perceptual Loudness (Sone) Mapping A.N.N.

Figure 7 depicts the performance of the A.N.N. comprising 30 neurons in the first hidden layer and 5 in the second. This A.N.N. was presented with a constant S.P.L. of 80 dB and a frequency value varying from 20 Hz to 12500 Hz in increments of 1 Hz. For reference, the results generated by presenting Eq. 2, 3 and 4 with the same input S.P.L. value and those frequencies present in the associated look-up table are also shown in the figure. A high degree of correlation between the A.N.N.-based method and the equation-based method is again shown. The digitisation effect of the equation-based method is still present, while the A.N.N. produces a smooth, continuous curve.

Figure 8: Close-Up of Figure 7 (1)
Figure 9: Close-Up of Figure 7 (2)

Figure 8 and Figure 9 are magnified versions of Figure 7 and show in greater detail the digitisation effect which has been overcome by the use of A.N.N.s. The smooth continuous curves shown in Figures 7, 8 and 9 depict the output of a network with a high level of generalisation. A suitable trade-off between network size and performance may be the network containing 20 neurons in the first hidden layer and 1 in the second; its M.S.E. and standard deviation are reported in Table A.3. Upon investigation it was also found to produce a smooth continuous curve when tested as above, demonstrating that the A.N.N. possesses a high degree of generalisation. The actual choice of A.N.N. from those presented will be application-specific, dependent on such features as the accuracy required and the system requirements.

VI AN A.N.N. FOR THE FREQUENCY TO CRITICAL BAND RATE CONVERSION

a. Motivation for A.N.N. Implementation

The conversion from frequency to pitch was originally presented by Zwicker in table format [2]. Zwicker's table documents the Critical-Band number along with the corresponding center frequency, maximum cut-off frequency and bandwidth. Since the first publication of this table in 1961, many function approximations of the data, with varying degrees of accuracy, have been presented [7], [18] and [9]. The most widely used and accepted method for this conversion is outlined by Traunmuller in his paper "Analytical expressions for the Tonotopic Sensory Scale" [3]. From Zwicker's table outlining the limits of the Critical-Bands, only the Bark value at the specific frequencies listed can be discerned accurately. The Bark values of all other

frequency values are no more than educated estimates. While this is generally acceptable in the field of speech processing, there is room for improvement. These improvements may be of use when accurate representations of the perceived pitch are required, such as in models of the cognitive aspects of sound perception. Traunmuller's equation for the conversion from the frequency scale to the Bark scale is a function approximation of the information presented in Zwicker's critical-band rate table. This function approximation is shown in Eq. 1 earlier in this paper. The values calculated in this way agree with the table for f > 100 Hz to within ±0.05 Bark [3]. This error measurement can only be taken from the frequency values present in the table; the errors associated with frequencies not listed are unknown. Thus the values generated by this equation which lie between Zwicker's values are, again, an educated guess. Both Traunmuller and Zwicker, along with many others, are attempting to model the behavior of a fundamentally biological function. It is therefore logical to suggest that it may be beneficial to model this conversion using A.I. techniques. The structure of A.N.N.s is based upon biological neural networks, and their behavior, both during and after the training process, has been found to mimic that of biological neural networks [15].

b. Development of A.N.N. Architecture

As with the A.N.N. for the estimation of perceived loudness, a two-layer feed-forward architecture was used for this A.N.N.. A single node was required in the input layer to take the frequency value of the audio signal, and a single node was used in the output layer to accommodate the output of the perceived pitch in Bark. A tan sigmoid function was used in the nodes of the hidden layer and a linear function in the output layer node.
The number of nodes in the hidden layer was decided based on the performance of various networks during the training/testing process.

c. Training / Testing

The data used here in the training of the A.N.N.s was taken directly from Zwicker's table of Critical-Band limits. This supplied 25 input frequency values for the network, with 25 corresponding output values, allowing a supervised training algorithm to be used in the training of the network. The A.N.N. was designed, trained and tested using the Matlab Neural Network Toolbox. The Levenberg-Marquardt backpropagation algorithm was used to train the A.N.N..
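For reference, Traunmuller's analytical conversion (Eq. 1), the accepted equation-based alternative to the A.N.N. trained here, can be sketched as a one-line function:

```python
def hz_to_bark(f):
    """Traunmuller's frequency-to-Bark conversion (Eq. 1):
    z = 26.81 * f / (1960 + f) - 0.53, which agrees with Zwicker's
    table to within +/- 0.05 Bark for f > 100 Hz."""
    return 26.81 * f / (1960.0 + f) - 0.53

# 1 kHz lies at roughly 8.5 on the Bark scale.
print(round(hz_to_bark(1000.0), 2))  # → 8.53
```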

The number of nodes in the hidden layer was decided based on the performance observed during successive training/testing iterations, as described in section 4.3 of this paper. In this instance the number of hidden nodes was varied from 1 to 20. The results were logged and examined to determine which A.N.N. is the most suitable for the implementation of this function approximation problem. Each trained A.N.N. configuration is also tested for instances of over-fitting and its performance on unseen data. This is done here by presenting each A.N.N. with frequency values not present in the data set and ensuring the result is consistent with known values.

Table 2: Results from Training of A.N.N. Model of the Perceived Pitch Conversion
(Columns: Neurons in Hidden Layer | Min M.S.E. of 1000-Epoch Training Sessions | Max Error of Net with Min M.S.E. | Best M.S.E. Result Received | Standard Deviation | Max Error)

d. Results

Table 2 shows a sample of the results achieved during the investigation of the performance of various A.N.N. configurations. It can be seen that a number of A.N.N.s suitable for the warping of the frequency scale to Critical-Band Rate (the Bark scale) were created. Based on these results (and other factors to be dealt with later), a network comprising 10 nodes in the hidden layer would be a suitable candidate for use in the field of auditory system modeling; its M.S.E., standard deviation and maximum individual error are given in Table 2. A plot of the output of this A.N.N., when presented with an input of frequency values ranging from … Hz in increments of …

Hz, is shown in Figure 10. It can be seen to compare very well with a plot of the data listed in Zwicker's table, shown in the same figure. The curve generated by this A.N.N. can be seen to be an extremely smooth continuous curve, indicating that this A.N.N. has a high level of generalization.

Figure 10: Results of Testing Perceived Pitch (Bark) Mapping A.N.N.

While networks with higher numbers of hidden nodes provided a better M.S.E. on the training data provided, an effect known as over-fitting was witnessed during testing. In an attempt to match the training data more closely, the values generated between the training points became increasingly non-uniform, as demonstrated in Figures 11 and 12. This results in poor performance of the A.N.N. on unseen input data. The plots in Figures 11 and 12 show the continuous output from the 15-neuron network when presented with a continuous input varying from … Hz. Large variations can be seen in the region of … Hz, even though all of the expected outputs of the 25-entry training set have been met to within ±….

Figure 11: Over-Fitting
Figure 12: Close-Up of Fig 11

For a neural network to perform uniformly for unknown inputs, the A.N.N. will need to have a high degree of generalization; a degradation of generalization manifests as over-fitting. Over-fitting of a neural network occurs when the flexibility, or degrees of freedom, of the network is too great. The degrees of freedom of an A.N.N. can be controlled by a process called structural stabilization. This involves limiting the number of changeable factors (neurons or weights) in a network: the fewer the changeable factors, the less likely it is that over-fitting will occur [16]. This leads to good generalisation within the network [19].

VII AN ALL-IN-ONE A.N.N. PURE-TONE PERCEPTION MODEL

a. Motivation for A.N.N.

The three previous sections of this paper have shown that A.N.N.s can be used to model individual features of the human auditory system. This section will present the development of a single A.N.N. with the ability to generate both the perceived loudness and pitch of an audio signal simultaneously.

b. Development of A.N.N. Architecture

As this A.N.N. is being designed to generate both the perceived pitch and loudness, a minimum of three layers is required in the network. This is due to the non-linear characteristic of the conversion from frequency and S.P.L. to the Sone scale of perceived loudness. For simplicity, a network with two hidden layers was implemented. Two neurons were required in the input layer to take the frequency and S.P.L. values of the audio signal, and two nodes were required in the output layer to accommodate the output of the perceived pitch in Bark and the perceived loudness in Sone. A tan sigmoid function was used in the nodes of the hidden layers and a linear function in the output layer nodes. The number of nodes in the hidden layers was decided based on the performance of various networks during the training/testing process.

c.
Training / Testing

A training set of 2581 vectors was compiled from the data used to train the A.N.N.s described in the previous sections. Each training vector contains two input values, frequency and S.P.L., and two corresponding target values, the pitch in Bark and the loudness in Sone.
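The structural-stabilization argument above comes down to counting a network's trainable parameters. The helper below is a hypothetical illustration (not from the paper) that counts weights and biases of a fully connected feed-forward network; it shows how quickly the degrees of freedom grow with the hidden layer sizes considered for this two-input, two-output model.

```python
# Hypothetical helper: count trainable parameters (weights + biases) of a
# fully connected feed-forward network. Each consecutive layer pair
# contributes n_in * n_out weights plus n_out biases.
def n_parameters(layer_sizes):
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# All-in-one model: 2 inputs, two hidden layers, 2 outputs (Bark and Sone).
print(n_parameters([2, 30, 2, 2]))  # → 158
print(n_parameters([2, 20, 2, 2]))  # → 108
```

The smaller network has roughly a third fewer changeable factors, which is the quantity structural stabilization limits to discourage over-fitting.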

Again, the A.N.N. was designed, trained and tested using the Matlab Neural Network Toolbox, with the Levenberg-Marquardt backpropagation algorithm used for training. The number of neurons in the hidden layers was decided based on the performance observed during successive training/testing iterations, as described in section 5.3. In this instance the number of neurons in the first hidden layer was varied from 5 to 40 and the number of neurons in the second hidden layer was varied from 1 to 5. Testing was also carried out to determine the level of generalisation achieved by each A.N.N. on both the estimations of perceived loudness and pitch, as before.

d. Results

The table documenting the performance of the A.N.N.s during training is presented in the Appendix, Table A.4. It can be seen from this table that a number of A.N.N.s have been developed and trained which are suitable for the implementation of both the perceived pitch and loudness measures.

Figure 13: Over-Fitting
Figure 14: Close-Up of Figure 13

When the inevitable trade-off between network size and performance is taken into account, the network containing 30 neurons in the first hidden layer and 2 neurons in the second seemed to be a viable choice. For the estimation of perceived pitch, this A.N.N. produces a low M.S.E. and standard deviation with respect to the values obtained from Zwicker's table, and similarly good results are produced for the estimation of perceived loudness on the Sone scale when compared with the results outlined in I.S.O. 226:2003; the values are reported in Table A.4. Upon further investigation, however, it appears that over-fitting has occurred with this A.N.N.. Figure 13 shows the estimation of perceived loudness resulting from inputs of 60 dB S.P.L. and a frequency varying from 20 Hz to 12500 Hz. Figure 14 is a magnified version of Figure 13 which highlights the irregularities which are not supported by the training data.
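The irregularities visible in Figures 13 and 14 can be quantified rather than judged by eye. As a sketch of the generalisation test described above, sweep a trained model over a dense, unseen input grid and measure how jagged the output curve is, here via the largest second difference (a hypothetical metric, not one used in the paper): a smooth, well-generalising network gives a small value, while an over-fitted one shows spikes between training points.

```python
import numpy as np

def roughness(curve):
    """Largest absolute second difference of a sampled output curve;
    large values indicate jagged, over-fitted behaviour between points."""
    curve = np.asarray(curve, dtype=float)
    return float(np.max(np.abs(np.diff(curve, n=2))))

# Stand-in outputs: a smooth sweep versus the same sweep with small
# alternating spikes of the kind an over-fitted network produces.
smooth = np.sin(np.linspace(0.0, 3.0, 200))
jagged = smooth + 0.05 * (np.arange(200) % 2)
print(roughness(smooth) < roughness(jagged))  # → True
```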

Figure 15 & Figure 16: Generalisation in All-In-One Perceptual Model A.N.N.

Investigations into another suitable A.N.N., with 20 nodes in the first hidden layer and 2 nodes in the second, showed that this network possessed a high level of generalisation. Figures 15 and 16 show the estimations of perceived loudness and pitch, respectively, produced for the same input values mentioned above by the A.N.N. containing 20 nodes in the first hidden layer.

VIII Conclusions

The results which have been presented here clearly show that the conversion from frequency and S.P.L. to perceived loudness and Critical-Band Rate (or Bark) can be implemented using an A.N.N.. It has also been shown that the use of the A.I. techniques presented here has certain advantages over the existing and accepted methods. The use of A.N.N.s in the estimation of perceived loudness has been shown to eliminate the need to approximate the frequency value of the signal to one of 29 specified frequencies. The values generated for frequencies between those specified are produced purely by the A.N.N. and cannot be validated without subjective testing; some validation can be inferred from the fact that A.N.N.s have been noted to possess very similar characteristics to biological neural networks and to be adept at modeling biological functions. Similarly, the implementation of the frequency to Critical-Band Rate conversion through A.N.N.s is shown to bridge the gap between the 25 critical-band values specified by Zwicker. While this has been done in the past by many function approximation attempts, an A.N.N. approach may prove to be a more suitable method: again, due to its nature, the A.N.N. is well suited to the modeling of biological functions. Therefore the intermediary values generated by the A.N.N.
implementation may be more representative of the true operation of the auditory system.


More information

Auditory Based Feature Vectors for Speech Recognition Systems

Auditory Based Feature Vectors for Speech Recognition Systems Auditory Based Feature Vectors for Speech Recognition Systems Dr. Waleed H. Abdulla Electrical & Computer Engineering Department The University of Auckland, New Zealand [w.abdulla@auckland.ac.nz] 1 Outlines

More information

Perception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner.

Perception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner. Perception of pitch AUDL4007: 11 Feb 2010. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum, 2005 Chapter 7 1 Definitions

More information

A Novel Fuzzy Neural Network Based Distance Relaying Scheme

A Novel Fuzzy Neural Network Based Distance Relaying Scheme 902 IEEE TRANSACTIONS ON POWER DELIVERY, VOL. 15, NO. 3, JULY 2000 A Novel Fuzzy Neural Network Based Distance Relaying Scheme P. K. Dash, A. K. Pradhan, and G. Panda Abstract This paper presents a new

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 MODELING SPECTRAL AND TEMPORAL MASKING IN THE HUMAN AUDITORY SYSTEM PACS: 43.66.Ba, 43.66.Dc Dau, Torsten; Jepsen, Morten L.; Ewert,

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb 2009. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence

More information

Chapter IV THEORY OF CELP CODING

Chapter IV THEORY OF CELP CODING Chapter IV THEORY OF CELP CODING CHAPTER IV THEORY OF CELP CODING 4.1 Introduction Wavefonn coders fail to produce high quality speech at bit rate lower than 16 kbps. Source coders, such as LPC vocoders,

More information

AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND TRANSFER FUNCTIONS

AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND TRANSFER FUNCTIONS AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND TRANSFER FUNCTIONS Kuldeep Kumar 1, R. K. Aggarwal 1 and Ankita Jain 2 1 Department of Computer Engineering, National Institute

More information

Performance Improvement of Contactless Distance Sensors using Neural Network

Performance Improvement of Contactless Distance Sensors using Neural Network Performance Improvement of Contactless Distance Sensors using Neural Network R. ABDUBRANI and S. S. N. ALHADY School of Electrical and Electronic Engineering Universiti Sains Malaysia Engineering Campus,

More information

Enhancement of Speech Signal by Adaptation of Scales and Thresholds of Bionic Wavelet Transform Coefficients

Enhancement of Speech Signal by Adaptation of Scales and Thresholds of Bionic Wavelet Transform Coefficients ISSN (Print) : 232 3765 An ISO 3297: 27 Certified Organization Vol. 3, Special Issue 3, April 214 Paiyanoor-63 14, Tamil Nadu, India Enhancement of Speech Signal by Adaptation of Scales and Thresholds

More information

MINE 432 Industrial Automation and Robotics

MINE 432 Industrial Automation and Robotics MINE 432 Industrial Automation and Robotics Part 3, Lecture 5 Overview of Artificial Neural Networks A. Farzanegan (Visiting Associate Professor) Fall 2014 Norman B. Keevil Institute of Mining Engineering

More information

NEURAL NETWORK BASED MAXIMUM POWER POINT TRACKING

NEURAL NETWORK BASED MAXIMUM POWER POINT TRACKING NEURAL NETWORK BASED MAXIMUM POWER POINT TRACKING 3.1 Introduction This chapter introduces concept of neural networks, it also deals with a novel approach to track the maximum power continuously from PV

More information

MUS 302 ENGINEERING SECTION

MUS 302 ENGINEERING SECTION MUS 302 ENGINEERING SECTION Wiley Ross: Recording Studio Coordinator Email =>ross@email.arizona.edu Twitter=> https://twitter.com/ssor Web page => http://www.arts.arizona.edu/studio Youtube Channel=>http://www.youtube.com/user/wileyross

More information

Terminology (1) Chapter 3. Terminology (3) Terminology (2) Transmitter Receiver Medium. Data Transmission. Direct link. Point-to-point.

Terminology (1) Chapter 3. Terminology (3) Terminology (2) Transmitter Receiver Medium. Data Transmission. Direct link. Point-to-point. Terminology (1) Chapter 3 Data Transmission Transmitter Receiver Medium Guided medium e.g. twisted pair, optical fiber Unguided medium e.g. air, water, vacuum Spring 2012 03-1 Spring 2012 03-2 Terminology

More information

POWER TRANSFORMER PROTECTION USING ANN, FUZZY SYSTEM AND CLARKE S TRANSFORM

POWER TRANSFORMER PROTECTION USING ANN, FUZZY SYSTEM AND CLARKE S TRANSFORM POWER TRANSFORMER PROTECTION USING ANN, FUZZY SYSTEM AND CLARKE S TRANSFORM 1 VIJAY KUMAR SAHU, 2 ANIL P. VAIDYA 1,2 Pg Student, Professor E-mail: 1 vijay25051991@gmail.com, 2 anil.vaidya@walchandsangli.ac.in

More information

Efficient Computation of Resonant Frequency of Rectangular Microstrip Antenna using a Neural Network Model with Two Stage Training

Efficient Computation of Resonant Frequency of Rectangular Microstrip Antenna using a Neural Network Model with Two Stage Training www.ijcsi.org 209 Efficient Computation of Resonant Frequency of Rectangular Microstrip Antenna using a Neural Network Model with Two Stage Training Guru Pyari Jangid *, Gur Mauj Saran Srivastava and Ashok

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb 2008. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum,

More information

Initialisation improvement in engineering feedforward ANN models.

Initialisation improvement in engineering feedforward ANN models. Initialisation improvement in engineering feedforward ANN models. A. Krimpenis and G.-C. Vosniakos National Technical University of Athens, School of Mechanical Engineering, Manufacturing Technology Division,

More information

Computational Intelligence Introduction

Computational Intelligence Introduction Computational Intelligence Introduction Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Fall 2011 Farzaneh Abdollahi Neural Networks 1/21 Fuzzy Systems What are

More information

describe sound as the transmission of energy via longitudinal pressure waves;

describe sound as the transmission of energy via longitudinal pressure waves; 1 Sound-Detailed Study Study Design 2009 2012 Unit 4 Detailed Study: Sound describe sound as the transmission of energy via longitudinal pressure waves; analyse sound using wavelength, frequency and speed

More information

Audio Restoration Based on DSP Tools

Audio Restoration Based on DSP Tools Audio Restoration Based on DSP Tools EECS 451 Final Project Report Nan Wu School of Electrical Engineering and Computer Science University of Michigan Ann Arbor, MI, United States wunan@umich.edu Abstract

More information

ECMA TR/105. A Shaped Noise File Representative of Speech. 1 st Edition / December Reference number ECMA TR/12:2009

ECMA TR/105. A Shaped Noise File Representative of Speech. 1 st Edition / December Reference number ECMA TR/12:2009 ECMA TR/105 1 st Edition / December 2012 A Shaped Noise File Representative of Speech Reference number ECMA TR/12:2009 Ecma International 2009 COPYRIGHT PROTECTED DOCUMENT Ecma International 2012 Contents

More information

CHAPTER 4 MONITORING OF POWER SYSTEM VOLTAGE STABILITY THROUGH ARTIFICIAL NEURAL NETWORK TECHNIQUE

CHAPTER 4 MONITORING OF POWER SYSTEM VOLTAGE STABILITY THROUGH ARTIFICIAL NEURAL NETWORK TECHNIQUE 53 CHAPTER 4 MONITORING OF POWER SYSTEM VOLTAGE STABILITY THROUGH ARTIFICIAL NEURAL NETWORK TECHNIQUE 4.1 INTRODUCTION Due to economic reasons arising out of deregulation and open market of electricity,

More information

Processor Setting Fundamentals -or- What Is the Crossover Point?

Processor Setting Fundamentals -or- What Is the Crossover Point? The Law of Physics / The Art of Listening Processor Setting Fundamentals -or- What Is the Crossover Point? Nathan Butler Design Engineer, EAW There are many misconceptions about what a crossover is, and

More information

III. Publication III. c 2005 Toni Hirvonen.

III. Publication III. c 2005 Toni Hirvonen. III Publication III Hirvonen, T., Segregation of Two Simultaneously Arriving Narrowband Noise Signals as a Function of Spatial and Frequency Separation, in Proceedings of th International Conference on

More information

The psychoacoustics of reverberation

The psychoacoustics of reverberation The psychoacoustics of reverberation Steven van de Par Steven.van.de.Par@uni-oldenburg.de July 19, 2016 Thanks to Julian Grosse and Andreas Häußler 2016 AES International Conference on Sound Field Control

More information

Impulse Noise Removal Based on Artificial Neural Network Classification with Weighted Median Filter

Impulse Noise Removal Based on Artificial Neural Network Classification with Weighted Median Filter Impulse Noise Removal Based on Artificial Neural Network Classification with Weighted Median Filter Deepalakshmi R 1, Sindhuja A 2 PG Scholar, Department of Computer Science, Stella Maris College, Chennai,

More information

Acoustics, signals & systems for audiology. Week 9. Basic Psychoacoustic Phenomena: Temporal resolution

Acoustics, signals & systems for audiology. Week 9. Basic Psychoacoustic Phenomena: Temporal resolution Acoustics, signals & systems for audiology Week 9 Basic Psychoacoustic Phenomena: Temporal resolution Modulating a sinusoid carrier at 1 khz (fine structure) x modulator at 100 Hz (envelope) = amplitudemodulated

More information

ALTERNATING CURRENT (AC)

ALTERNATING CURRENT (AC) ALL ABOUT NOISE ALTERNATING CURRENT (AC) Any type of electrical transmission where the current repeatedly changes direction, and the voltage varies between maxima and minima. Therefore, any electrical

More information

Fundamentals of Digital Audio *

Fundamentals of Digital Audio * Digital Media The material in this handout is excerpted from Digital Media Curriculum Primer a work written by Dr. Yue-Ling Wong (ylwong@wfu.edu), Department of Computer Science and Department of Art,

More information

Multiple-Layer Networks. and. Backpropagation Algorithms

Multiple-Layer Networks. and. Backpropagation Algorithms Multiple-Layer Networks and Algorithms Multiple-Layer Networks and Algorithms is the generalization of the Widrow-Hoff learning rule to multiple-layer networks and nonlinear differentiable transfer functions.

More information

CHAPTER 6 ANFIS BASED NEURO-FUZZY CONTROLLER

CHAPTER 6 ANFIS BASED NEURO-FUZZY CONTROLLER 143 CHAPTER 6 ANFIS BASED NEURO-FUZZY CONTROLLER 6.1 INTRODUCTION The quality of generated electricity in power system is dependent on the system output, which has to be of constant frequency and must

More information

Digital Signal Processing Audio Measurements Custom Designed Tools. Loudness measurement in sone (DIN ISO 532B)

Digital Signal Processing Audio Measurements Custom Designed Tools. Loudness measurement in sone (DIN ISO 532B) Loudness measurement in sone (DIN 45631 ISO 532B) Sound can be described with various physical parameters e.g. intensity, pressure or energy. These parameters are very limited to describe the perception

More information

Machine recognition of speech trained on data from New Jersey Labs

Machine recognition of speech trained on data from New Jersey Labs Machine recognition of speech trained on data from New Jersey Labs Frequency response (peak around 5 Hz) Impulse response (effective length around 200 ms) 41 RASTA filter 10 attenuation [db] 40 1 10 modulation

More information

A Compact DGS Low Pass Filter using Artificial Neural Network

A Compact DGS Low Pass Filter using Artificial Neural Network A Compact DGS Low Pass Filter using Artificial Neural Network Vitthal Chaudhary Department of Electronics, Madhav Institute of Technology and Science Gwalior, India Gwalior, India Vandana Vikas Thakare

More information

Artificial Neural Network Approach to Mobile Location Estimation in GSM Network

Artificial Neural Network Approach to Mobile Location Estimation in GSM Network INTL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 2017, VOL. 63, NO. 1,. 39-44 Manuscript received March 31, 2016; revised December, 2016. DOI: 10.1515/eletel-2017-0006 Artificial Neural Network Approach

More information

Can binary masks improve intelligibility?

Can binary masks improve intelligibility? Can binary masks improve intelligibility? Mike Brookes (Imperial College London) & Mark Huckvale (University College London) Apparently so... 2 How does it work? 3 Time-frequency grid of local SNR + +

More information

Technical University of Denmark

Technical University of Denmark Technical University of Denmark Masking 1 st semester project Ørsted DTU Acoustic Technology fall 2007 Group 6 Troels Schmidt Lindgreen 073081 Kristoffer Ahrens Dickow 071324 Reynir Hilmisson 060162 Instructor

More information

SOUND SOURCE RECOGNITION AND MODELING

SOUND SOURCE RECOGNITION AND MODELING SOUND SOURCE RECOGNITION AND MODELING CASA seminar, summer 2000 Antti Eronen antti.eronen@tut.fi Contents: Basics of human sound source recognition Timbre Voice recognition Recognition of environmental

More information

Audio Signal Compression using DCT and LPC Techniques

Audio Signal Compression using DCT and LPC Techniques Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,

More information

Lesson 3 Measurement of sound

Lesson 3 Measurement of sound Lesson 3 Measurement of sound 1.1 CONTENTS 1.1 Contents 1 1.2 Measuring noise 1 1.3 The sound level scale 2 1.4 Instruments used to measure sound 6 1.5 Recording sound data 14 1.6 The sound chamber 15

More information

Mel Spectrum Analysis of Speech Recognition using Single Microphone

Mel Spectrum Analysis of Speech Recognition using Single Microphone International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree

More information

DC Motor Speed Control Using Machine Learning Algorithm

DC Motor Speed Control Using Machine Learning Algorithm DC Motor Speed Control Using Machine Learning Algorithm Jeen Ann Abraham Department of Electronics and Communication. RKDF College of Engineering Bhopal, India. Sanjeev Shrivastava Department of Electronics

More information

Psycho-acoustics (Sound characteristics, Masking, and Loudness)

Psycho-acoustics (Sound characteristics, Masking, and Loudness) Psycho-acoustics (Sound characteristics, Masking, and Loudness) Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University Mar. 20, 2008 Pure tones Mathematics of the pure

More information

EBU UER. european broadcasting union. Listening conditions for the assessment of sound programme material. Supplement 1.

EBU UER. european broadcasting union. Listening conditions for the assessment of sound programme material. Supplement 1. EBU Tech 3276-E Listening conditions for the assessment of sound programme material Revised May 2004 Multichannel sound EBU UER european broadcasting union Geneva EBU - Listening conditions for the assessment

More information

Distortion products and the perceived pitch of harmonic complex tones

Distortion products and the perceived pitch of harmonic complex tones Distortion products and the perceived pitch of harmonic complex tones D. Pressnitzer and R.D. Patterson Centre for the Neural Basis of Hearing, Dept. of Physiology, Downing street, Cambridge CB2 3EG, U.K.

More information

Journal of the Acoustical Society of America 88

Journal of the Acoustical Society of America 88 The following article appeared in Journal of the Acoustical Society of America 88: 97 100 and may be found at http://scitation.aip.org/content/asa/journal/jasa/88/1/10121/1.399849. Copyright (1990) Acoustical

More information

Speech Synthesis using Mel-Cepstral Coefficient Feature

Speech Synthesis using Mel-Cepstral Coefficient Feature Speech Synthesis using Mel-Cepstral Coefficient Feature By Lu Wang Senior Thesis in Electrical Engineering University of Illinois at Urbana-Champaign Advisor: Professor Mark Hasegawa-Johnson May 2018 Abstract

More information

Overview of Code Excited Linear Predictive Coder

Overview of Code Excited Linear Predictive Coder Overview of Code Excited Linear Predictive Coder Minal Mulye 1, Sonal Jagtap 2 1 PG Student, 2 Assistant Professor, Department of E&TC, Smt. Kashibai Navale College of Engg, Pune, India Abstract Advances

More information

Enhanced MLP Input-Output Mapping for Degraded Pattern Recognition

Enhanced MLP Input-Output Mapping for Degraded Pattern Recognition Enhanced MLP Input-Output Mapping for Degraded Pattern Recognition Shigueo Nomura and José Ricardo Gonçalves Manzan Faculty of Electrical Engineering, Federal University of Uberlândia, Uberlândia, MG,

More information

Temporal resolution AUDL Domain of temporal resolution. Fine structure and envelope. Modulating a sinusoid. Fine structure and envelope

Temporal resolution AUDL Domain of temporal resolution. Fine structure and envelope. Modulating a sinusoid. Fine structure and envelope Modulating a sinusoid can also work this backwards! Temporal resolution AUDL 4007 carrier (fine structure) x modulator (envelope) = amplitudemodulated wave 1 2 Domain of temporal resolution Fine structure

More information

MAGNITUDE-COMPLEMENTARY FILTERS FOR DYNAMIC EQUALIZATION

MAGNITUDE-COMPLEMENTARY FILTERS FOR DYNAMIC EQUALIZATION Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-), Limerick, Ireland, December 6-8, MAGNITUDE-COMPLEMENTARY FILTERS FOR DYNAMIC EQUALIZATION Federico Fontana University of Verona

More information

Laboratory 1: Uncertainty Analysis

Laboratory 1: Uncertainty Analysis University of Alabama Department of Physics and Astronomy PH101 / LeClair May 26, 2014 Laboratory 1: Uncertainty Analysis Hypothesis: A statistical analysis including both mean and standard deviation can

More information

Acoustic Echo Cancellation using LMS Algorithm

Acoustic Echo Cancellation using LMS Algorithm Acoustic Echo Cancellation using LMS Algorithm Nitika Gulbadhar M.Tech Student, Deptt. of Electronics Technology, GNDU, Amritsar Shalini Bahel Professor, Deptt. of Electronics Technology,GNDU,Amritsar

More information