Monaural and binaural processing of fluctuating sounds in the auditory system


Eric R. Thompson

September 23, 2005

MSc Thesis
Acoustic Technology, Ørsted DTU
Technical University of Denmark

Supervisor: Torsten Dau

Abstract

Two models of the effective signal processing in the human auditory system have been developed recently. One model, from Dau et al. (1997), can predict human listeners' performance with static and dynamic monaural signals. The other, from Breebaart et al. (2001a), is a binaural model based on the same peripheral processing as the Dau model, but was developed mainly for static signals. Because of their success and common roots, these models were selected as the basis for developing a consolidated model that can effectively process static and dynamic, monaural and binaural signals in reverberant environments. Such a model could eventually serve as the basis for real-time speech recognition and auditory object identification systems. In this project, the first steps toward this goal were made with measurements of the ability of the auditory system to detect monaural, diotic and dichotic envelope fluctuations. The measurements were performed using artificial probe signals in a series of headphone experiments with pure-tone and 3, 30 and 300 Hz wide noise carriers centered at 5 kHz, aimed at measuring thresholds for the detection and discrimination of amplitude modulation. The amplitude modulation was presented monaurally, diotically and interaurally in antiphase, with diotic and interaurally uncorrelated carriers. The pure-tone results showed a very different shape than the binaural temporal modulation transfer function (TMTF) reported in the literature. This suggests a new paradigm for characterizing the envelope processing capabilities of the binaural system. The intrinsic fluctuations of both diotic and uncorrelated carriers were seen to create masking in the binaural domain, similar in some respects to that reported in previous studies of the monaural domain.
Preliminary tests indicated that, without further development for dynamic signals, the Breebaart model would not be able to predict the relative thresholds measured with narrowband carriers. Further simulations were recommended to test the dynamic capabilities of the model in order to find its strengths and weaknesses. Once the model can successfully predict thresholds similar to those measured, it will be well equipped to process both monaural and binaural, static and dynamic signals in anechoic conditions. The next steps will be to gather data and to continue development of the model in more complex, reverberant environments.

Contents

1 Introduction
2 Background
   2.1 Signal envelopes
      2.1.1 The envelope of speech
      2.1.2 Intrinsic envelope fluctuations
      2.1.3 Imposed envelopes and amplitude modulation
   2.2 Amplitude modulation and the monaural system
      2.2.1 Detection of amplitude modulation
      2.2.2 Detection of AM with narrowband noise carriers
   2.3 Model of monaural AM detection
   2.4 Binaural processing
      2.4.1 Interaural time and level differences
      2.4.2 Binaural masking level difference
      2.4.3 Dynamic interaural parameters
   2.5 Binaural models
      2.5.1 Jeffress model
      2.5.2 Equalization-Cancellation model
      2.5.3 Breebaart model
3 Experimental Methods
   3.1 Procedure
   3.2 Test Subjects
   3.3 Apparatus and sessions
   3.4 Stimuli
4 Results and Discussion
5 Toward a consolidated model
6 Overall summary and conclusions

List of Figures

1 Envelope and fine structure of a 3 Hz wide signal
2 Spectrogram of a speech sample, "Frequency selectivity"
3 Modulation spectrum for two minutes of spoken discourse from a single speaker
4 Beats created by adding a 40 Hz and a 44 Hz sine wave
5 Intrinsic envelope fluctuations for a 5 Hz wide noise
6 Theoretical power spectrum and envelope power spectrum of Gaussian bandpass noise
7 Typical envelope imposed on a tone by a music synthesizer
8 Sinusoidally amplitude modulated (SAM) pure-tone example
9 Approximate regions of amplitude modulation perception
10 Inner hair cell receptor potential response to pure-tone stimuli
11 Temporal modulation transfer function with pure-tone carriers
12 TMTF measured with pure-tone and narrowband noise carriers
13 Block diagram of the modulation processing model from Dau et al. (1997)
14 Transfer functions of the modulation filters used for processing monaural signals in the monaural modulation detection model
15 Monaural modulation phase discrimination probability as a function of modulation frequency
16 Theoretical interaural time difference (ITD) vs. azimuth
17 Measured interaural level differences (ILD) vs. azimuth
18 Illustration of binaural masking level differences
19 Monaural and binaural TMTFs and masked TMTFs from Grantham and Bacon (1991)
20 Instantaneous ILD vs. envelope amplitude for homophasic and antiphasic modulation with a narrowband carrier
21 Basic concept of the Jeffress model
22 Block diagram of the Equalization-Cancellation (EC) model
23 Block diagram of the Breebaart model
24 Array of EI-elements from the binaural model
25 Detail of the EI-elements from the binaural model
26 Amplitude modulation detection and discrimination thresholds with 5 kHz pure-tone carriers
27 Amplitude modulation detection and discrimination thresholds with 3 Hz wide carriers, f_c = 5 kHz
28 Amplitude modulation detection and discrimination thresholds with 30 Hz wide carriers, f_c = 5 kHz

29 Amplitude modulation detection and discrimination thresholds with 300 Hz wide carriers, f_c = 5 kHz
30 Interaural AM detection and discrimination thresholds with pure-tone and 3 Hz wide noise carriers
31 Binaural modulation masking caused by intrinsic fluctuations of diotic narrowband carriers
32 Monaural modulation masking caused by intrinsic fluctuations of narrowband carriers
33 Binaural modulation masking caused by intrinsic interaural fluctuations from uncorrelated narrowband carriers
34 Model predicted thresholds for discriminating interaurally antiphasic amplitude modulation
35 Average EI activity vs. time for low-frequency uncorrelated noise carriers
36 Average EI activity vs. time for correlated narrowband and pure-tone carriers
37 Possible implementation of the consolidated model

List of Acronyms

AC    Alternating Current
ADSR  Attack, Decay, Sustain, Release
AM    Amplitude Modulation
BMLD  Binaural Masking Level Difference
DC    Direct Current
EC    Equalization-Cancellation
EEG   Electroencephalography
EI    Excitation-Inhibition
fMRI  Functional Magnetic Resonance Imaging
FFT   Fast Fourier Transform
IFC   Interval, Forced Choice
ILD   Interaural Level Difference
IPD   Interaural Phase Difference
ITD   Interaural Time Difference (or Delay)
SAM   Sinusoidal Amplitude Modulation
SNR   Signal-to-Noise Ratio
TMTF  Temporal Modulation Transfer Function

1 Introduction

The world around us is constantly changing, and the acoustic environment is no exception. All real sounds change in some manner if given enough time. Even steady-state sounds will eventually be turned off, or the sound-producing mechanism will degrade so that the spectral characteristics of the sound change over time. There may also be inherent properties of the sound that cause continuous change in level or frequency. These changes can be described by, for example, their speed, magnitude, or periodicity. Since our ears move through this constantly changing soundscape, it is of great interest to know how audible those changes are, as part of a deeper understanding of the human auditory system.

One of the most important tasks of the auditory system is the processing of speech, often in the presence of background noise. Speech signals are characterized by variations in level, with many pauses between words, and by variations in frequency, with low-frequency vowels and high-frequency fricative consonants, as well as the small fluctuations in pitch that make speech more colorful. A normal-hearing human listener can make sense of these fluctuations, even in complex, reverberant acoustic environments, and understand the speech. Hearing-impaired listeners and computer-based speech recognition systems often have difficulties understanding speech under such conditions.

In many ways, the auditory system is still a black box into which sounds flow through two input channels (ears) and from which come many sensations and perceptions. The exact components of the system and their functions cannot be directly viewed or measured because of ethical and other restrictions.
There are non-invasive methods of measuring the response of the brain to audio signals, such as functional magnetic resonance imaging (fMRI) and electroencephalography (EEG), which can reveal general locations of activity and the far-field electrical response, respectively, but these are limited in spatial resolution and cannot pinpoint precisely which cell responds to a signal in what way. Therefore, these tools should be used in conjunction with well-designed psychoacoustic experiments in order to reverse-engineer the auditory system and understand how it is supposed to work. With this knowledge, it should be possible to identify more causes and symptoms of damage and defects in the auditory system, and to design better systems for repairing lost hearing ability or for more effectively utilizing the residual hearing ability in an impaired system. In addition, knowledge of which changes in a sound are audible can help improve the sound quality of, e.g., loudspeakers or audio processing algorithms by pointing out which artifacts are audible and which are not, as has already been done in the

development of the MPEG Audio Layer-3 (MP3) audio compression algorithm (Fraunhofer IIS, 1998). As wireless and battery technologies improve, hearing aid manufacturers are looking at designing binaural hearing aids and need to know which interaural fluctuations are most important for the formation and localization of sound sources and for speech intelligibility in real acoustic environments. A computer model that can simulate these aspects of a human listener's abilities would be a powerful tool in the development cycle, saving time and expense by reducing the need for extensive listening tests.

In this project, many experiments were performed to gather data to better understand how the auditory system processes changes in the overall sound pressure level of a sound when those changes are only in one ear (monaural), the same in both ears (diotic, homophasic), or opposite in the two ears (dichotic, antiphasic). To this end, artificial signals were generated using pure-tone and narrowband high-frequency carriers with an imposed sinusoidal amplitude modulation. The response of a listener to these signals should provide insight into the temporal resolution and frequency selectivity of the auditory subsystems that process these amplitude fluctuations. This insight could be used to enhance existing models and to work toward a model that can parse a complex acoustic signal into auditory objects, much like a human listener might do.

Two existing models were selected as a starting point to develop a new model that can simulate the envelope processing abilities of a human listener in some tasks in anechoic environments. The first is a monaural model from Dau et al. (1997) that includes envelope processing capabilities. The second, from Breebaart et al. (2001a), simulates binaural processing but was developed mainly for processing static signals.
The goal of this project is to combine the dynamic signal processing of the Dau model with the binaural processing of the Breebaart model to develop a comprehensive model that can simulate human performance in detection tasks with static and dynamic, monaural and binaural signals.

2 Background

2.1 Signal envelopes

Any sound signal can be described in terms of its fine structure and envelope, where the fine structure contains the rapid fluctuations in the sound pressure level and the envelope is a smooth, slowly varying curve tangent to the peaks of the fine structure. Figure 1 shows an example of a signal with the fine

Figure 1: Envelope (heavy line) and fine structure (fine line) of a 1 s long sample of a 3 Hz wide noise signal centered at 40 Hz. The envelope was extracted using the Hilbert transform.

structure drawn with a fine line and its envelope drawn with a heavy line. The fluctuations seen in the envelope of this signal are actually the result of interference between frequency components of the signal. The envelope of the signal shown in Figure 1 was calculated using the Hilbert transform. This transform convolves a real signal x(t) with the filter 1/(πt) to calculate the imaginary part of the analytic signal x_a(t), as defined in Equation 1 (from Proakis and Manolakis, 1995), where * denotes convolution.

x_a(t) = x(t) + j [1/(πt)] * x(t)    (1)

The envelope of the signal x(t) is then equal to the magnitude of the analytic signal x_a(t), and is therefore always positive and real-valued.

2.1.1 The envelope of speech

As described in the introduction, speech signals have large variations in level over time. There are pauses between words and a rise and fall during syllables. Figure 2 shows a frequency analysis vs. time for a sample of speech from Rosen and Fourcin (1986). In this plot, the shade indicates the amount of energy at that frequency at that instant in time. Darker shades indicate

Figure 2: Top panel: wideband spectrogram of a speech sample, "Frequency selectivity", shown in the bottom panel (from Rosen and Fourcin, 1986). In the spectrogram, the shade indicates the energy level for that frequency at that moment in time. Darker shades represent higher energy levels; white indicates low energy or pauses.

Figure 3: Modulation spectrum for two minutes of spoken discourse from a single speaker (from Greenberg et al., 1996). A peak is seen at about 4-5 Hz, which corresponds to a typical syllable duration of about 200-250 ms.
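The envelope extraction of Equation 1 is straightforward to reproduce numerically. The sketch below (an illustration in Python, not code from the thesis) uses `scipy.signal.hilbert`, which returns the analytic signal directly; a modulated 40 Hz tone stands in for the fluctuating signal of Figure 1:

```python
import numpy as np
from scipy.signal import hilbert

fs = 1000                      # sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)  # 1 s of signal

# A 40 Hz tone with a slow 4 Hz amplitude fluctuation.
x = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 40 * t)

# hilbert() returns the analytic signal x_a(t) = x(t) + j*H{x(t)} of
# Equation 1; the envelope is its magnitude.
envelope = np.abs(hilbert(x))

# The envelope is real and non-negative, and here it recovers the
# imposed fluctuation 1 + 0.5*sin(2*pi*4*t).
assert np.all(envelope >= 0)
```

Because the magnitude of the analytic signal discards the carrier phase, the same code works for noise carriers such as the one in Figure 1.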

Figure 4: Beats created by adding a 40 Hz and a 44 Hz sine wave. The time-domain signal (fine line) is shown in the left panel with its Hilbert envelope (heavy line). The top-right panel shows the magnitude of the same signal in the frequency domain. The lower-right panel shows the magnitude of the FFT of the envelope of the signal.

more energy and white indicates little or no energy (i.e., a pause). The fricative phonemes (e.g., /s/ and /f/) have a lot of high-frequency energy, and the vowels (e.g., /i/, /@/ and /E/) have more low-frequency energy. An analysis of the speech envelope can be done on the total signal or by frequency band, as is shown in Figure 3 for a different speech sample from Greenberg et al. (1996). The modulation spectrum for the 1-2 kHz frequency band is shown for two minutes of running speech from a single speaker. A maximum can be seen at about 4-5 Hz, which corresponds to a typical syllable duration of about 200-250 ms.

2.1.2 Intrinsic envelope fluctuations

When two pure-tone signals with similar frequencies are added together, the result is a signal whose envelope varies in amplitude at a frequency equal to the difference between the frequencies of the pure tones. This effect, often referred to as beats, is shown in the left panel of Figure 4. This signal was created by adding a 40 Hz and a 44 Hz sine wave, as can be seen from the

Figure 5: Intrinsic envelope fluctuations for a 5 Hz wide noise. The time-domain signal (fine line) is shown in the left panel with its Hilbert envelope (heavy line). The top-right panel shows the magnitude of the same signal in the frequency domain. The lower-right panel shows the magnitude of the FFT of the envelope of the signal.

frequency-domain representation of the signal in the top-right panel of the figure, calculated using the fast Fourier transform (FFT). The envelope was extracted from the signal using the Hilbert transform and can be seen to have the shape of a sinusoid with the frequency of the beats, 4 Hz (i.e., 44 − 40 Hz). Since an envelope is always positive-valued, the result here is the absolute value of the sinusoid. Taking the absolute value creates a non-linearity at the zero crossings that produces harmonics, which can be seen along with the fundamental beat frequency in the envelope (or modulation) frequency domain, calculated as an FFT of the envelope and shown in the lower-right panel of Figure 4.

If many sequential frequency components are added together with random phase, the result is a narrowband noise, as shown in Figure 5. This signal was created by bandpass filtering a Gaussian noise between 38 and 43 Hz. Each component of the signal creates interference with each other component, and each of these beats contributes envelope energy at the frequency equal to the difference between the components' frequencies. In the example shown in Figure 5, there are six frequency components with a 1 Hz spacing.
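The bookkeeping of beating component pairs can be checked directly. This short sketch (a hypothetical illustration, not thesis code) counts the difference frequencies among six components spaced 1 Hz apart, as in the 38-43 Hz noise of Figure 5:

```python
import itertools
from collections import Counter

# Six frequency components with 1 Hz spacing, as in the noise of Figure 5.
components = [38, 39, 40, 41, 42, 43]  # Hz

# Every pair of components beats at its difference frequency, contributing
# envelope energy there.
beats = Counter(abs(f2 - f1) for f1, f2 in itertools.combinations(components, 2))

print(dict(sorted(beats.items())))  # {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}
```

No pair differs by more than the 5 Hz bandwidth, which is why the intrinsic envelope energy is confined below the bandwidth of the noise.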

Figure 6: Theoretical power spectrum (left panel) and envelope power spectrum (right panel) of Gaussian bandpass noise (from Dau et al., 1999); for details see also Lawson and Uhlenbeck (1950).

This means that there are five pairs of components with 1 Hz spacing, four pairs with 2 Hz spacing, and so on, up to one pair with 5 Hz spacing. There are no component pairs with a frequency difference greater than the bandwidth of the noise, in this case 5 Hz, so, in theory, there should not be any intrinsic envelope energy at frequencies above the bandwidth (Lawson and Uhlenbeck, 1950).

The theoretical representation of the intrinsic fluctuations in narrowband noises is shown in Figure 6. Given a narrowband noise with bandwidth Δf and power density ρ, the envelope power density forms a triangle between the origin, πρ/4 along the power-density axis, and the bandwidth of the noise along the frequency axis, plus a DC component reflecting the non-zero mean of the envelope. The shape of the envelope power density function will be used later in discussions of amplitude modulation detection with narrowband noise carriers.

2.1.3 Imposed envelopes and amplitude modulation

A signal can also have an envelope imposed on it. In the simplest case, switching a sound source on and off again imposes a rectangular envelope on the sound. In music synthesizers, the envelope of a sound played when a key is pressed is often described in terms of its Attack time, Decay time, Sustain level and Release time (ADSR) (Kientzle, 1998), where the attack time describes how quickly the sound reaches its maximum

Figure 7: Typical envelope imposed on a tone by a music synthesizer (from Kientzle, 1998). The control parameters are the attack and decay times, the sustain level and the release time.

Figure 8: Example of a sinusoidally amplitude modulated (SAM) pure tone (f_c = 40 Hz; f_m = 4 Hz). The time-domain signal is shown in the left panel with its Hilbert envelope. The top-right panel shows the magnitude of the same signal in the frequency domain. The lower-right panel shows the magnitude of the FFT of the envelope of the signal.

amplitude when the key is depressed, the decay time is how long it takes the sound to settle into its long-term (sustain) level after reaching the maximum, and the release time is how quickly the volume returns to zero after the key is released (see Figure 7). One question that an accurate model of the auditory system could answer is how well a human listener can actually perceive the initial peak in the envelope, so that the developers of a synthesizer might shape the envelope to enhance features of the sound based on their audibility.

Another common way of imposing an envelope on a signal is by adding amplitude modulation (AM). With AM, the amplitude of a carrier c(t) is changed in time in proportion to the magnitude of a modulator m(t), as given by Equation 2.

[1 + m(t)] c(t)    (2)

The carrier can be any signal, for example a pure tone, and the modulator can also be any signal with a lower frequency than the carrier. If a sinusoidal amplitude modulator (SAM) with amplitude, or modulation depth, m and frequency f_m is used with a pure-tone carrier of frequency f_c, as in Equation 3 and the left panel of Figure 8, it can be shown using trigonometric identities that there are three resultant frequency components, at f_c, f_c − f_m and f_c + f_m. These three components can also be seen in the frequency spectrum of the 40 Hz sinusoid with 4 Hz SAM shown in the top-right panel of Figure 8.

[1 + m sin(2πf_m t)] sin(2πf_c t) = sin(2πf_c t) + (m/2) cos[2π(f_c − f_m)t] − (m/2) cos[2π(f_c + f_m)t]    (3)

These sidebands can be used as detection cues for the presence of amplitude modulation in listening tests. The power of a periodic signal can be calculated from the sum of the squares of its frequency components. If a carrier has power P, then when a sinusoidal amplitude modulation is imposed on it, its power becomes P(1 + m^2/2). This can be shown easily for the example signal from Equation 3.
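The three components of Equation 3 can be verified numerically. In this sketch (with assumed parameters f_c = 40 Hz, f_m = 4 Hz and m = 0.5, matching the example of Figure 8), a 1 s window makes the FFT bins fall on integer frequencies:

```python
import numpy as np

fs, dur = 1000, 1.0
t = np.arange(0, dur, 1 / fs)
fc, fm, m = 40.0, 4.0, 0.5

# SAM tone as in Equation 3.
s = (1 + m * np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)

# One-sided amplitude spectrum; bin k corresponds to k Hz here.
spec = np.abs(np.fft.rfft(s)) / (len(s) / 2)

# Carrier at f_c with amplitude 1, sidebands at f_c +/- f_m with amplitude m/2.
print(spec[40], spec[36], spec[44])
```

With `m = 0.5` the sidebands have one quarter of the carrier amplitude, and no other bin carries energy, exactly as Equation 3 predicts.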
Given a carrier signal x(t) = sin(2πf_c t) with power P and an imposed sinusoidal amplitude modulation with modulation depth m, the power of the resultant signal, P_AM, will be:

P_AM = P + (m/2)^2 P + (m/2)^2 P = P(1 + m^2/2)    (4)

This change in signal intensity could also be used as a cue for detection, so amplitude-modulated signals should be scaled appropriately in order for the dynamic AM cue to be the most salient cue.
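Equation 4 can likewise be checked by computing the mean-square power directly; the same assumed example parameters are redefined here so the sketch is self-contained:

```python
import numpy as np

fs, dur = 1000, 1.0
t = np.arange(0, dur, 1 / fs)
fc, fm, m = 40.0, 4.0, 0.5

carrier = np.sin(2 * np.pi * fc * t)
sam = (1 + m * np.sin(2 * np.pi * fm * t)) * carrier

P = np.mean(carrier ** 2)   # carrier power (0.5 for a unit-amplitude sinusoid)
P_am = np.mean(sam ** 2)    # power of the modulated signal

print(P_am / P)             # -> 1 + m**2 / 2, as in Equation 4

# To remove the overall intensity cue, the modulated signal can be scaled
# by 1/sqrt(1 + m**2/2) so that its power matches the carrier's.
sam_eq = sam / np.sqrt(1 + m ** 2 / 2)
```

The last line shows the kind of level normalization the text calls for, so that only the dynamic AM cue remains salient.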

Figure 9: Approximate regions of amplitude modulation perception for carrier and modulation frequencies (adapted from Joris et al., 2004). Below about 20 Hz, fluctuations in level are perceived (hatched region). This percept makes a smooth transition to a percept of roughness above about 15 Hz (region between solid lines). Sidebands may be resolved for modulation frequencies above the upper solid line.

2.2 Amplitude modulation and the monaural system

The cochlea is the transducer in the inner ear that converts mechanical vibrations into neural impulses. It contains the basilar membrane, which vibrates with incident sound and serves as a frequency-to-place filter, where low frequencies activate the apical end of the membrane and high frequencies activate the basal end. It is often modeled as a sort of filterbank, dividing the audible frequency range into bands, often called critical bands (Zwicker, 1961). These filters are assumed not to be fixed to specific center frequencies, but rather can be moved to optimally improve signal-to-noise ratios (SNR). Among other effects, the auditory filters influence the perception of amplitude-modulated signals. If the sidebands of a modulated signal fall within the same auditory filter as the carrier, then the sound may be perceived as a single auditory object. At very low modulation frequencies, the fluctuations of the envelope can be followed and are perceived as changes in loudness. As the modulation frequency increases, the loudness of the sound is not perceived to change, but a

Figure 10: Inner hair cell receptor potential response to pure-tone stimuli at the frequencies indicated in Hz to the right of each trace (adapted from Palmer and Russell, 1986). The upper scale bar corresponds to the traces from 300 to 900 Hz, and the lower scale bar to those from 1-5 kHz. Data measured in a guinea pig.

quality of roughness is perceived instead. These regions of perceived loudness fluctuations and roughness are depicted in Figure 9. If the modulation frequency is further increased, so that the sidebands can be resolved in adjacent auditory filters, then the sidebands will be perceived as independent tones.

The inner hair cells ride on the basilar membrane, and their inner electrical potential varies in response to the membrane motion. The inner potential of the hair cells is the driver for action potentials in neurons of the auditory nerve. For low frequencies, the inner hair cell potential follows, or is phase-locked to, the fine structure of the driving frequency (see Figure 10). However, as the frequency increases, the DC component of the inner hair cell response increases and the AC component decreases, until for very high frequencies the receptor potential is essentially just following the envelope of the stimulus. Therefore, high-frequency signals, in which no fine-structure information is coded, are often used in tests of envelope processing in the auditory system.
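The loss of phase locking can be mimicked with a toy hair-cell stage: half-wave rectification followed by a lowpass filter. The 1 kHz cut-off below matches the hair-cell lowpass of the model described in Section 2.3; the first-order Butterworth realization and the sampling rate are assumptions of this sketch, not a fit to the guinea-pig data:

```python
import numpy as np
from scipy.signal import butter, lfilter

fs = 32000  # Hz (assumed sampling rate)

def haircell_stage(x, fs, f_cut=1000.0):
    """Toy inner-hair-cell stage: half-wave rectification followed by a
    first-order Butterworth lowpass at f_cut (filter order is an
    assumption of this sketch)."""
    b, a = butter(1, f_cut / (fs / 2))
    return lfilter(b, a, np.maximum(x, 0.0))

t = np.arange(0, 0.2, 1 / fs)
low = haircell_stage(np.sin(2 * np.pi * 100 * t), fs)    # fine structure survives
high = haircell_stage(np.sin(2 * np.pi * 5000 * t), fs)  # mostly DC remains

# AC-to-DC ratio in the steady state: large at 100 Hz, small at 5 kHz.
def ac_dc(y):
    y = y[len(y) // 2:]
    return np.std(y) / np.mean(y)

print(ac_dc(low), ac_dc(high))
```

The output reproduces the trend of Figure 10: the 100 Hz response keeps a strong AC component, while the 5 kHz response is dominated by its DC (envelope-following) component.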

Figure 11: Temporal modulation transfer functions with pure-tone carriers (from Kohlrausch et al., 2000). The data were measured for carriers at 1, 3, 5, 8 and 10 kHz. The TMTF curves show a lowpass shape for carriers above 1 kHz, with a sharp increase in sensitivity when the modulation sidebands can be resolved.

2.2.1 Detection of amplitude modulation

Wideband noise carriers are also frequently used when measuring modulation detection thresholds, or the temporal modulation transfer function (TMTF), in order to eliminate the possible spectral cues resulting from the sidebands. In Viemeister (1979), TMTFs were measured using wideband Gaussian noise carriers. The results showed great sensitivity at low modulation frequencies and a decrease in sensitivity of about 3 dB/oct. above about 64 Hz. This led to a model of envelope processing that simply used a lowpass filter with a cut-off frequency of 64 Hz. In Kohlrausch et al. (2000), TMTFs were measured using sinusoidal carriers with frequencies at and above 1 kHz (see Figure 11). The shapes of the curves were similar to those found by Viemeister for carrier frequencies over 1 kHz, although the cut-off frequencies were higher, typically around 130 Hz. In addition, Kohlrausch and colleagues measured thresholds in regions where the sidebands could be resolved and found dramatic increases in sensitivity with the presence of these spectral cues. The frequency at which this increase occurs rises with increasing carrier frequency, which shows that the bandwidth of the auditory filters also increases with increasing center frequency. With the 1 kHz carrier, the sidebands were resolved below the 130 Hz cut-off frequency, so the lowpass shape was not seen.
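Viemeister's lowpass description can be sketched as a single filter acting on the envelope. Only the 64 Hz cut-off comes from the text; the first-order realization is an assumption of this sketch:

```python
import numpy as np

F_CUT = 64.0  # Hz, envelope lowpass cut-off from Viemeister (1979)

def envelope_gain_db(f_m):
    """Attenuation (dB) that a first-order lowpass at F_CUT applies to
    sinusoidal modulation at frequency f_m (filter order is assumed)."""
    return 20.0 * np.log10(1.0 / np.sqrt(1.0 + (f_m / F_CUT) ** 2))

for f_m in [4, 16, 64, 128, 256]:
    print(f"{f_m:4d} Hz: {envelope_gain_db(f_m):6.1f} dB")
```

In such a model, the more the imposed modulation is attenuated, the larger the modulation depth needed at threshold, which reproduces the measured high-frequency roll-off; as discussed next, however, a lowpass alone cannot account for the narrowband-carrier results.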

Figure 12: TMTF measured with pure-tone and narrowband noise carriers with bandwidths of 3, 31 and 314 Hz, centered at 5 kHz (adapted from Moore, 2003; data from Dau et al., 1997). Also shown are the theoretical envelope spectra of the 3, 31 and 314 Hz wide noise bands.

2.2.2 Detection of AM with narrowband noise carriers

The shape of the TMTF looks quite different when narrowband noises are used as carriers instead of broadband noise or pure tones, as can be seen in Figure 12 (see also Fleischer, 1982; Dau et al., 1997). The TMTFs for narrowband noises with bandwidths of 3, 31 and 314 Hz are shown together with the TMTF measured with a pure-tone carrier (data from Dau et al., 1997). The theoretical envelope spectra of the carriers, as discussed above (see also Figure 6), are also plotted, only now with logarithmic axes. From these results, it can be seen that the intrinsic fluctuations of the carrier mask SAM detection at frequencies that are in or close to the range of the carrier fluctuations. The fact that detection immediately outside the range of intrinsic fluctuations is not exactly the same as with the sinusoidal carrier indicates that the frequency resolution of envelope processing is limited. A lowpass filter model would still predict the same TMTF shape as measured with broadband noise; therefore, a model with a modulation filterbank was proposed in Dau et al. (1997). Joris et al. (2004) also reported that there are neurons that respond in proportion to the degree of modulation and show a bandpass characteristic with respect to modulation frequency. This adds a physiological case in support of the modulation filterbank.
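The layout of that filterbank (detailed in the next section: constant 5 Hz bandwidth below 10 Hz, constant Q = 2 from 10 Hz to 1000 Hz) can be sketched as below. The spacing rule above 10 Hz, with adjacent filters meeting at their band edges so that center frequencies grow by a factor (2Q+1)/(2Q−1) = 5/3, is an assumption of this sketch:

```python
Q = 2.0
ratio = (2 * Q + 1) / (2 * Q - 1)   # 5/3 per step (assumed spacing rule)

centers = [0.0, 5.0, 10.0]          # 2.5 Hz lowpass plus two 5 Hz wide filters
bandwidths = [2.5, 5.0, 5.0]

f = 10.0
while f * ratio <= 1000.0:
    f *= ratio
    centers.append(round(f, 1))
    bandwidths.append(round(f / Q, 1))

for fc_mod, bw in zip(centers, bandwidths):
    print(f"center {fc_mod:6.1f} Hz, bandwidth {bw:5.1f} Hz")
```

The listing makes the key property explicit: above 10 Hz the filter bandwidth grows in proportion to the center frequency, so a flat carrier envelope spectrum feeds progressively more intrinsic noise into higher modulation filters.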

Figure 13: Block diagram of the modulation processing model from Dau et al. (1997). The model includes basilar membrane filtering in the form of a gammatone filterbank and inner hair cell processing with half-wave rectification and a lowpass filter. Adaptation loops add a compressive non-linearity. Then a modulation filterbank is applied to each peripheral channel, internal noise is added to limit the resolution of the system, and an optimal detector generates the decision.

2.3 Model of monaural AM detection

The model proposed in Dau et al. (1997) was based largely on the peripheral processing from Dau et al. (1996). Figure 13 shows a block diagram of the model. The first stage simulates the bandpass characteristics of the basilar membrane with the gammatone filterbank from Patterson et al. (1988). The output of each gammatone filter is then half-wave rectified, to simulate the property of the inner hair cells of firing only when displaced in one direction, and lowpass filtered with a cut-off frequency of 1 kHz, to simulate the loss of phase locking at higher frequencies. A series of five adaptation loops with different time constants is applied next to provide a compressive non-linearity while maintaining sensitivity to fast temporal variations (see Dau et al., 1996, for details). This adaptation step

Figure 14: Transfer functions of the modulation filters used for processing monaural signals in the monaural modulation detection model from Dau et al. (1997). Below 10 Hz, the filters have a constant bandwidth of 5 Hz, starting with a lowpass filter with a cut-off frequency of 2.5 Hz. Above 10 Hz, the filters have a constant Q-value of 2.

also enables the simulation of forward masking results. Then a modulation filterbank is applied to each channel. Finally, an internal noise is added to limit the resolution of the system and an optimal detector generates the decision.

The modulation filterbank is shown in more detail in Figure 14. Below 10 Hz, the filters have a constant bandwidth of 5 Hz, starting with a lowpass filter with a cut-off frequency of 2.5 Hz. From 10 Hz to 1000 Hz, the filters have a constant Q-value of 2. Listening tests indicated that human listeners could detect an inversion of modulation phase with better than chance probability for frequencies below approximately 10 Hz (see Figure 15) (Dau, 1996). Therefore, the modulation phase is preserved below 10 Hz, and only the magnitude of the envelope is maintained for the higher modulation frequencies.

With the modulation filterbank, this model is able to predict human listeners' modulation detection thresholds with narrowband noise carriers. The energy of the intrinsic fluctuations of the 3 Hz wide carrier (see Figure 12) falls mostly within the first filter, greatly increasing the modulation depth required to detect modulation below 2.5 Hz. The subsequent filters also pass some of the modulation energy from this carrier, but the amount of energy decreases, and the threshold slowly decreases toward the pure-tone carrier threshold with increasing filter center frequency. With the 30 Hz wide carrier, the envelope energy appears fairly flat on a logarithmic scale until

Figure 15: Monaural modulation phase discrimination probability as a function of modulation frequency, for a 5 kHz pure-tone carrier with full modulation (m = 1). Data from three test subjects (dashed and dotted lines) and chance probability (solid line) (from Dau, 1996).

almost 20 Hz. The modulation detection threshold is also very flat in the same range and then slowly rolls off. The intrinsic modulation energy of the 300 Hz wide carrier is very flat over the whole frequency range of interest. Since the bandwidth of the modulation filters increases with increasing filter center frequency, the total amount of energy passed through each filter also increases with filter center frequency when the envelope frequency spectrum is flat. Assuming that a constant signal-to-noise ratio is required at the output of a filter for detection, this model also predicts a lowpass shape similar to Viemeister's model for a wideband noise carrier: given a flat carrier envelope spectrum and the logarithmic scaling of the filters, each doubling of filter center frequency results in a doubling of the carrier envelope power (noise) passed by the filter, and a subsequent 3 dB/oct. increase in the threshold required for signal modulation detection.

2.4 Binaural processing

Up to this point, all discussion of the auditory system has dealt with processes requiring only one input, i.e., monaural processing. The complete auditory system is, however, a binaural system, meaning that there are two ear input channels. The main functions of the binaural system are to identify

Figure 16: Theoretical interaural time difference (ITD) as a function of azimuth. From Moore (2003), adapted from Feddersen et al. (1957). Calculated from the difference in distance traveled from the sound source to each of the ears.

and localize sound sources and to improve the signal-to-noise ratio (SNR) by selectively tuning in on one particular sound source. Again, the exact mechanisms are unknown, but the physics of the acoustic waves is fairly well understood, and there are several models that can simulate some aspects of binaural hearing.

2.4.1 Interaural time and level differences

The most basic physics behind sound localization can be explained with a simple model of a spherical head with two receivers (ears) at opposite ends of an axis, placed in an anechoic sound field. When a sound is incident from directly in front of this head, the signals at the two ears should be identical. However, if the sound source moves to the right, as seen from the head, then the sound has to travel a greater distance to reach the left ear than the right ear. Since sound travels at a finite rate, this difference in distance creates an interaural time difference (ITD), which can be derived based on assumptions about the shape and size of the head. Figure 16 shows the theoretical ITD as a function of the angle from directly ahead, or azimuth. The figure shows the increase in ITD with azimuth until 90° and the subsequent decrease until 180°. It also shows how there can be front-to-back confusions with cues based solely on ITD, since angles symmetric about 90° azimuth produce the same ITD. The ITD can be converted to an interaural phase difference (IPD)

Figure 17: Measured interaural level differences (ILD) as a function of azimuth for pure tones from 200 to 6000 Hz. These curves show almost no measurable ILD for low frequencies and up to almost 20 dB ILD for high frequencies. Average of five subjects. From Moore (2003), adapted from Feddersen et al. (1957).

for sinusoidal tones or for frequency components of a complex sound. This cue is only unambiguous for phase differences between ±π/2. For a low-frequency sinusoid at 250 Hz, an ITD of 0.4 ms is equivalent to an IPD of π/5. However, for a 5 kHz sinusoid, the same ITD of 0.4 ms is equivalent to an IPD of 4π (two full cycles), and it is impossible for the auditory system to know whether the two signals are in phase or n cycles out of phase.

In addition to the ITD, there is another basic physical difference between the sounds that reach the two ears from an angle other than straight ahead. The physical presence of the head between the source and the ear aimed away from it attenuates the level of the sound, creating the so-called head shadow effect and an interaural level difference (ILD). The measured ILDs for various frequencies are shown in Figure 17 (from Feddersen et al., 1957). In these curves, it is obvious that the head shadow effect is strongest at high frequencies, with almost 20 dB ILD for large azimuth angles, and creates almost no difference at low frequencies.
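The ITD-to-IPD conversion and its high-frequency ambiguity can be sketched in a few lines; the following is only an illustration of the arithmetic above (the function names are mine, not part of any model discussed here):

```python
import math

def ipd_from_itd(itd_s, freq_hz):
    """Interaural phase difference (radians) implied by an ITD at one frequency."""
    return 2.0 * math.pi * freq_hz * itd_s

def observable_ipd(ipd):
    """Wrap a phase difference into (-pi, pi]: whole cycles cannot be observed."""
    return math.atan2(math.sin(ipd), math.cos(ipd))

# The two cases from the text, both with a 0.4 ms ITD:
ipd_250 = ipd_from_itd(0.0004, 250.0)   # pi/5, well inside the unambiguous range
ipd_5k = ipd_from_itd(0.0004, 5000.0)   # 4*pi, i.e. two full cycles
# After wrapping, the 5 kHz tone pair looks perfectly in phase, so the
# fine-structure phase cue is useless at this frequency.
```

With a dense grid of frequencies, the same two functions reproduce the general rule that an ITD only yields an unambiguous phase cue while the implied IPD stays small relative to a cycle.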

A human listener is extremely sensitive to ITDs, with difference limens as small as 10 µs (Klumpp and Eady, 1956) for low-frequency tones and noise. In that study, it was possible to measure ITD thresholds for pure tones with frequencies up to 1300 Hz, but the listeners were unable to detect ITDs for any higher-frequency pure tones. This can be explained by the loss of fine structure coding for high-frequency sounds (see Figure 10). If only the envelope of the signal is preserved in the peripheral processing, then any timing differences in the fine structure would not be seen at higher processing levels in the brain. It is interesting to note that the listeners could detect ITDs with narrowband noises that consisted exclusively of frequencies above the range of pure-tone ITD detectability (in one case, Hz). This indicates that the listeners were able to compare the timing of the intrinsic fluctuations of the envelope in the detection task. In the high-frequency range, Mills (1960) measured static ILD detection thresholds of about db using pure tones.

When listening through headphones, a diotic sound is typically perceived to be located in the center of the head on a line connecting the two ears. If an ITD or ILD is then artificially imposed on the signal, the signal is perceived to move toward either the leading ear or the ear with the greater sound pressure level, respectively. Together, the ITD and ILD resulting from the size and shape of the head and the location of the ears provide cues that can be used by a human listener or a computer to localize a sound source. Additional cues are provided to the human listener by reflections and resonances from the outer ear, or pinna, which are not yet fully understood and are beyond the scope of this project.

2.4.2 Binaural masking level difference

In addition to providing information about the location of a sound source, the binaural system can actually improve our ability to hear a signal in background noise.
This system is used, for example, in the cocktail party scenario, where a listener's task is to understand a conversation in the presence of multiple other sound sources (e.g., a stereo system and other conversations) that are usually incident from random angles. In the laboratory, this effect has been demonstrated using white noise maskers (N) and pure-tone signals (S), where the task is to detect the tone in the noise (see illustration in Figure 18). If the noise and signal are presented monaurally (NmSm; subscript m for monaural), then a certain detection threshold is measured. If the same noise is added to the opposite ear (N0Sm; subscript 0 for zero interaural phase shift), it becomes easier to detect the signal again. If the signal is then added to the second ear (N0S0), it is again more difficult to detect the

Figure 18: Illustration of binaural masking level differences (BMLD). The N and S descriptors are for the noise and signal, respectively. The subscripts indicate the method of presentation: m for monaural, 0 for homophasic and π for antiphasic. Smiling faces indicate better detectability relative to the monaural case (from Moore, 2003).

signal. However, if the phase of the signal is inverted (N0Sπ; subscript π for the interaural phase shift), the signal is again easily detectable. It is assumed that the binaural system somehow suppresses the information that is the same in the two ears (in this case N0), leaving the information that is different (Sπ). Since the binaural system reduces the effective masking of the noise, the effect is called the binaural masking level difference (BMLD). Numerous models have been developed that can simulate this behavior in simple environments.

2.4.3 Dynamic interaural parameters

As with the monaural system, it is relevant to measure the temporal acuity of the binaural system using dynamic signals. This can be done by varying the interaural parameters in time at a given rate of change. Grantham and colleagues measured the ability of human listeners to detect sinusoidally varying interaural time and intensity differences (Grantham and Wightman, 1978; Grantham, 1984). They concluded that the binaural system is sluggish in that only modulation rates in the range of 1-5 Hz are perceived to have motion. Higher interaural modulation rates can still be discriminated from diotic noise, even above 100 Hz, but instead of having a perceived motion, the sounds have an increased perceived width.

In Grantham and Bacon (1991), the TMTF and modulation masking curves in the binaural domain were measured. They used wideband Gaussian noise carriers, and the task was to detect an applied sinusoidal signal modulation with modulation frequencies from Hz in monaural (Sm) and binaural (Sπ) presentations.[1] Their monaural results were very similar to those reported in Viemeister (1979), and the binaural TMTF showed no significant difference from the monaural case except at 512 Hz, where the binaural case had an advantage of approximately 5 dB over the monaural case, as seen in the mean data (see left panel of Figure 19; note inverted ordinate).

Figure 19: Left panel: Monaural and binaural TMTFs from Grantham and Bacon (1991). Right panel: Monaural and binaural masked TMTFs measured in the presence of a second noise with a diotic 16 Hz sinusoidal amplitude modulation. Monaural thresholds (Sm) and binaural thresholds (Sπ) are shown with separate symbols. Note inverted ordinate.

When measuring modulation masking, a wideband Gaussian noise with a 16 Hz amplitude modulation was added to the signals from the previous experiment, and the task was to detect the applied signal modulation as a function of signal modulation frequency. The presence of the masking 16 Hz modulation made it more difficult to detect the signal modulation, particularly for signal modulation frequencies close to the masker frequency, thereby creating the dip in the masked TMTF curves. In these masked conditions, binaural detection had an advantage at the lowest frequencies and at 512 Hz, where the thresholds were very similar to those measured in the unmasked condition, and around the masker frequency (16 Hz), where the threshold was about the same as that measured at 512 Hz, as seen in the mean data.

[1] Note that the signals in the dynamic interaural parameter experiments are the modulated parameters (ITD or ILD) themselves, which can be presented monaurally (Sm), interaurally in phase (S0) or interaurally in antiphase (Sπ). This differs from the BMLD experiments, where the signals were pure tones added to a noise masker, as shown in Figure 18.

Figure 20: Instantaneous ILD vs. normalized envelope amplitude (max. of antiphasic signal amplitude = 1) for homophasic (red) and antiphasic (blue) modulation with a diotic 3 Hz wide carrier. fm = 4 Hz and m = 10 dB. The projections of these two signals in the ILD-time and amplitude-time planes are also shown, drawn with fine lines.

The cues for detection of homophasic (or monaural) and antiphasic amplitude modulation are very different. The homophasic modulations cause changes in perceived loudness or roughness, depending on the frequency of modulation. On the other hand, the antiphasic modulations cause a perceived motion or an increase in perceived source width without a change in loudness, again depending on the frequency of modulation. Figure 20 shows the instantaneous ILD and envelope amplitude for a diotic 3 Hz wide noise carrier with an imposed homophasic (red) and antiphasic (blue) 4 Hz amplitude modulation with m = 10 dB. For a low-frequency amplitude modulation, a change in ILD with time can be perceived as a change in localization, and a change in level can be perceived as a change in loudness. These changes can be seen more clearly in the projections in the ILD-time and amplitude-time planes, drawn with fine lines in the figure. The plot in Figure 20 shows that the homophasic modulation causes a

change in the level of the sound without a change in ILD, so this sound would be localized in the center of the head, while the antiphasic modulation causes a change in lateralization with little change in perceived level. Since the carrier was diotic and only 3 Hz wide, it had intrinsic fluctuations in level but not in ILD. These changes in level make it much more difficult to detect diotic amplitude modulation (see section 2.2.2), but have no effect on ILD. The level of the carrier does fluctuate in time, so a moment of maximum ILD resulting from the imposed antiphasic modulation may coincide with a low carrier level, making it harder to hear; thus, the shape of the diotic carrier could influence the detection of antiphasic modulation. However, it should be much more effective to use an antiphasic amplitude modulation masker or uncorrelated narrowband carriers, which have intrinsic fluctuations in ILD, in measurements of masking of modulation detection in the binaural system.

2.5 Binaural models

The task of the binaural models is to simulate human performance in the binaural experiments described above. In this section, some of the historical models that have provided a foundation for today's models and one state-of-the-art model are described.

2.5.1 Jeffress model

Almost all current models of binaural hearing are in some way based on the Jeffress coincidence detector model (Jeffress, 1948). The basic concept is that there are arrays of neurons that receive inputs from both ears via delay lines that serve to delay one signal relative to the other. Each neuron has a characteristic interaural delay and serves as a coincidence detector, so that the neuron has a maximum output if its two inputs arrive simultaneously. A conceptual drawing of the Jeffress model is shown in Figure 21.
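The coincidence-detector idea can be sketched as a discrete correlator: each candidate internal delay plays the role of one delay-line "neuron", and the neuron with the largest output indicates the ITD. This is only a toy illustration of the concept under my own choice of signal and parameters, not the physiological model:

```python
import math

def jeffress_itd(left, right, fs, max_lag):
    """Pick the internal delay whose 'neuron' responds most strongly, i.e.,
    the lag (in samples) that maximizes the correlation of the ear signals."""
    best_lag, best_score = 0, -float("inf")
    for lag in range(-max_lag, max_lag + 1):
        # coincidence count for the neuron with internal delay `lag`
        score = sum(left[i] * right[i + lag]
                    for i in range(max_lag, len(left) - max_lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs

fs = 48000
true_itd = 10 / fs  # impose a delay of 10 samples (~0.21 ms) on the right ear
t = [i / fs for i in range(2400)]
left = [math.sin(2.0 * math.pi * 500.0 * x) for x in t]
right = [math.sin(2.0 * math.pi * 500.0 * (x - true_itd)) for x in t]
estimated_itd = jeffress_itd(left, right, fs, max_lag=20)
```

With a dense set of internal delays, this is exactly a cross-correlation of the two ear signals, which is why the Jeffress array is so often described in those terms.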
This model can also be thought of as performing a sort of cross-correlation of the left and right signals.

2.5.2 Equalization-Cancellation model

A new model was proposed in Durlach (1963) that should account for ITDs, ILDs and the BMLD effect. This model is called the equalization-cancellation (EC) model (see block diagram in Figure 22). The equalization stage applies an optimal interaural delay and interaural gain so that the two signals become as similar as possible. However, there is jitter added to the delay and noise added

Figure 21: Basic concept of the Jeffress (1948) model showing different-length delay lines with Δt spacing between neurons that create an array for detecting coincidence for a range of ITDs.

Figure 22: Block diagram of the Equalization-Cancellation (EC) model (Durlach, 1963). The signals from the two ears are first equalized in time and level (E mechanism), with internal noise and jitter to limit resolution, and then subtracted (C mechanism). The decision device makes use of the input channel with the best signal-to-noise ratio (SNR) to make its decision.

to the gain in order to limit the resolution of the system. The cancellation stage then simply takes the difference of the two signals, removing what is diotic after equalization and leaving the dichotic part of the signal. If an N0Sπ signal from the BMLD experiments is fed into the EC model, the optimal imposed ITD and ILD in the equalization stage would both be zero. In the cancellation stage, the diotic noise (N0) would cancel out, leaving the antiphasic signal and thereby making the signal detection easier. Finally, a detection mechanism determines which of the two monaural channels and the EC channel has the best SNR and generates the decision.

2.5.3 Breebaart model

One of the latest models of binaural processing was proposed in Breebaart et al. (2001a) (see block diagram in Figure 23). This model makes use of the peripheral processing from the Dau et al. (1996) monaural model (see description in section 2.3). After the adaptation loops, the two monaural channels are fed through an array of excitation-inhibition (EI) elements. Each of the EI elements has a characteristic ITD (Δτ) and ILD (Δα), as shown in Figure 24. The EI name stems from a type of neuron that receives excitatory input from the ipsilateral side and inhibitory input from the contralateral side, thereby effectively canceling out identical inputs. The output from each EI element is passed through a sliding integrator, implemented as a double-sided exponential with time constants of 30 ms (see upper-right panel of Figure 25), which smooths the output signals and accounts for binaural sluggishness. A logarithmic compression is then applied to each channel (see lower-right panel of Figure 25). This compression function is essentially linear for low input levels and compressive for higher input levels. As with the Dau models, an internal noise is added to limit the resolution of the system, and an optimal detector is used over all monaural and binaural channels to generate the decision.
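The E and C steps can be illustrated with a deliberately simplified N0Sπ example. All levels, the random seed and the internal-noise magnitude below are arbitrary choices of mine; the real EC model optimizes the equalization parameters and uses calibrated internal noise and jitter:

```python
import math
import random

random.seed(1)
fs, n = 16000, 16000
# N0Spi stimulus: a diotic noise masker plus a faint 500 Hz tone
# presented in antiphase across the two ears
tone = [0.05 * math.sin(2.0 * math.pi * 500.0 * i / fs) for i in range(n)]
masker = [random.gauss(0.0, 1.0) for _ in range(n)]
internal_l = [random.gauss(0.0, 0.01) for _ in range(n)]  # internal noise
internal_r = [random.gauss(0.0, 0.01) for _ in range(n)]

left = [m + s + e for m, s, e in zip(masker, tone, internal_l)]
right = [m - s + e for m, s, e in zip(masker, tone, internal_r)]

# E step: for N0Spi the optimal interaural delay and gain are zero/unity,
# so equalization changes nothing in this configuration.
# C step: subtract the ears; the diotic masker cancels, the antiphasic
# tone adds up, and only the internal noise remains as masker.
ec_channel = [l - r for l, r in zip(left, right)]

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

snr_monaural_db = 20.0 * math.log10(rms(tone) / rms(masker))
doubled_tone = [2.0 * s for s in tone]
residual = [a - b for a, b in zip(internal_l, internal_r)]
snr_ec_db = 20.0 * math.log10(rms(doubled_tone) / rms(residual))
bmld_db = snr_ec_db - snr_monaural_db  # large: the masker has been cancelled
```

Feeding an N0S0 configuration through the same C step instead cancels the tone along with the masker, in line with the ranking of conditions in the BMLD experiments above.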
This model was quite successful in simulating a human listener's performance in a range of binaural listening tests based on spectral and temporal cues (see Breebaart et al., 2001b,c, for details). Among the tests of the model, Breebaart and colleagues tried to simulate the dynamic ITD and ILD experiments from Grantham and Wightman (1978) and Grantham (1984). The model was unable to simulate human listeners' performance in the dynamic ITD tests, but had fairly good results with the dynamic ILD experiments. This will be described in more detail in the Modeling section. Since the model was very successful with static binaural signals and had some success with dynamic interaural parameters, and because it was based on the same peripheral processing as the monaural modulation processing


More information

PERFORMANCE COMPARISON BETWEEN STEREAUSIS AND INCOHERENT WIDEBAND MUSIC FOR LOCALIZATION OF GROUND VEHICLES ABSTRACT

PERFORMANCE COMPARISON BETWEEN STEREAUSIS AND INCOHERENT WIDEBAND MUSIC FOR LOCALIZATION OF GROUND VEHICLES ABSTRACT Approved for public release; distribution is unlimited. PERFORMANCE COMPARISON BETWEEN STEREAUSIS AND INCOHERENT WIDEBAND MUSIC FOR LOCALIZATION OF GROUND VEHICLES September 1999 Tien Pham U.S. Army Research

More information

Laboratory Assignment 5 Amplitude Modulation

Laboratory Assignment 5 Amplitude Modulation Laboratory Assignment 5 Amplitude Modulation PURPOSE In this assignment, you will explore the use of digital computers for the analysis, design, synthesis, and simulation of an amplitude modulation (AM)

More information

Technical University of Denmark

Technical University of Denmark Technical University of Denmark Masking 1 st semester project Ørsted DTU Acoustic Technology fall 2007 Group 6 Troels Schmidt Lindgreen 073081 Kristoffer Ahrens Dickow 071324 Reynir Hilmisson 060162 Instructor

More information

Study on method of estimating direct arrival using monaural modulation sp. Author(s)Ando, Masaru; Morikawa, Daisuke; Uno

Study on method of estimating direct arrival using monaural modulation sp. Author(s)Ando, Masaru; Morikawa, Daisuke; Uno JAIST Reposi https://dspace.j Title Study on method of estimating direct arrival using monaural modulation sp Author(s)Ando, Masaru; Morikawa, Daisuke; Uno Citation Journal of Signal Processing, 18(4):

More information

Psychoacoustic Cues in Room Size Perception

Psychoacoustic Cues in Room Size Perception Audio Engineering Society Convention Paper Presented at the 116th Convention 2004 May 8 11 Berlin, Germany 6084 This convention paper has been reproduced from the author s advance manuscript, without editing,

More information

Signals, Sound, and Sensation

Signals, Sound, and Sensation Signals, Sound, and Sensation William M. Hartmann Department of Physics and Astronomy Michigan State University East Lansing, Michigan Л1Р Contents Preface xv Chapter 1: Pure Tones 1 Mathematics of the

More information

THE PERCEPTION OF ALL-PASS COMPONENTS IN TRANSFER FUNCTIONS

THE PERCEPTION OF ALL-PASS COMPONENTS IN TRANSFER FUNCTIONS PACS Reference: 43.66.Pn THE PERCEPTION OF ALL-PASS COMPONENTS IN TRANSFER FUNCTIONS Pauli Minnaar; Jan Plogsties; Søren Krarup Olesen; Flemming Christensen; Henrik Møller Department of Acoustics Aalborg

More information

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals 16 3. SPEECH ANALYSIS 3.1 INTRODUCTION TO SPEECH ANALYSIS Many speech processing [22] applications exploits speech production and perception to accomplish speech analysis. By speech analysis we extract

More information

Computational Perception /785

Computational Perception /785 Computational Perception 15-485/785 Assignment 1 Sound Localization due: Thursday, Jan. 31 Introduction This assignment focuses on sound localization. You will develop Matlab programs that synthesize sounds

More information

A binaural auditory model and applications to spatial sound evaluation

A binaural auditory model and applications to spatial sound evaluation A binaural auditory model and applications to spatial sound evaluation Ma r k o Ta k a n e n 1, Ga ë ta n Lo r h o 2, a n d Mat t i Ka r ja l a i n e n 1 1 Helsinki University of Technology, Dept. of Signal

More information

Signals and Systems Lecture 9 Communication Systems Frequency-Division Multiplexing and Frequency Modulation (FM)

Signals and Systems Lecture 9 Communication Systems Frequency-Division Multiplexing and Frequency Modulation (FM) Signals and Systems Lecture 9 Communication Systems Frequency-Division Multiplexing and Frequency Modulation (FM) April 11, 2008 Today s Topics 1. Frequency-division multiplexing 2. Frequency modulation

More information

Preface A detailed knowledge of the processes involved in hearing is an essential prerequisite for numerous medical and technical applications, such a

Preface A detailed knowledge of the processes involved in hearing is an essential prerequisite for numerous medical and technical applications, such a Modeling auditory processing of amplitude modulation Torsten Dau Preface A detailed knowledge of the processes involved in hearing is an essential prerequisite for numerous medical and technical applications,

More information

What is Sound? Part II

What is Sound? Part II What is Sound? Part II Timbre & Noise 1 Prayouandi (2010) - OneOhtrix Point Never PSYCHOACOUSTICS ACOUSTICS LOUDNESS AMPLITUDE PITCH FREQUENCY QUALITY TIMBRE 2 Timbre / Quality everything that is not frequency

More information

Binaural Mechanisms that Emphasize Consistent Interaural Timing Information over Frequency

Binaural Mechanisms that Emphasize Consistent Interaural Timing Information over Frequency Binaural Mechanisms that Emphasize Consistent Interaural Timing Information over Frequency Richard M. Stern 1 and Constantine Trahiotis 2 1 Department of Electrical and Computer Engineering and Biomedical

More information

Imagine the cochlea unrolled

Imagine the cochlea unrolled 2 2 1 1 1 1 1 Cochlea & Auditory Nerve: obligatory stages of auditory processing Think of the auditory periphery as a processor of signals 2 2 1 1 1 1 1 Imagine the cochlea unrolled Basilar membrane motion

More information

Phase and Feedback in the Nonlinear Brain. Malcolm Slaney (IBM and Stanford) Hiroko Shiraiwa-Terasawa (Stanford) Regaip Sen (Stanford)

Phase and Feedback in the Nonlinear Brain. Malcolm Slaney (IBM and Stanford) Hiroko Shiraiwa-Terasawa (Stanford) Regaip Sen (Stanford) Phase and Feedback in the Nonlinear Brain Malcolm Slaney (IBM and Stanford) Hiroko Shiraiwa-Terasawa (Stanford) Regaip Sen (Stanford) Auditory processing pre-cosyne workshop March 23, 2004 Simplistic Models

More information

AUDL Final exam page 1/7 Please answer all of the following questions.

AUDL Final exam page 1/7 Please answer all of the following questions. AUDL 11 28 Final exam page 1/7 Please answer all of the following questions. 1) Consider 8 harmonics of a sawtooth wave which has a fundamental period of 1 ms and a fundamental component with a level of

More information

Since the advent of the sine wave oscillator

Since the advent of the sine wave oscillator Advanced Distortion Analysis Methods Discover modern test equipment that has the memory and post-processing capability to analyze complex signals and ascertain real-world performance. By Dan Foley European

More information

Modeling auditory processing of amplitude modulation II. Spectral and temporal integration Dau, T.; Kollmeier, B.; Kohlrausch, A.G.

Modeling auditory processing of amplitude modulation II. Spectral and temporal integration Dau, T.; Kollmeier, B.; Kohlrausch, A.G. Modeling auditory processing of amplitude modulation II. Spectral and temporal integration Dau, T.; Kollmeier, B.; Kohlrausch, A.G. Published in: Journal of the Acoustical Society of America DOI: 10.1121/1.420345

More information

BIOLOGICALLY INSPIRED BINAURAL ANALOGUE SIGNAL PROCESSING

BIOLOGICALLY INSPIRED BINAURAL ANALOGUE SIGNAL PROCESSING Brain Inspired Cognitive Systems August 29 September 1, 2004 University of Stirling, Scotland, UK BIOLOGICALLY INSPIRED BINAURAL ANALOGUE SIGNAL PROCESSING Natasha Chia and Steve Collins University of

More information

Exploiting envelope fluctuations to achieve robust extraction and intelligent integration of binaural cues

Exploiting envelope fluctuations to achieve robust extraction and intelligent integration of binaural cues The Technology of Binaural Listening & Understanding: Paper ICA216-445 Exploiting envelope fluctuations to achieve robust extraction and intelligent integration of binaural cues G. Christopher Stecker

More information

Auditory Based Feature Vectors for Speech Recognition Systems

Auditory Based Feature Vectors for Speech Recognition Systems Auditory Based Feature Vectors for Speech Recognition Systems Dr. Waleed H. Abdulla Electrical & Computer Engineering Department The University of Auckland, New Zealand [w.abdulla@auckland.ac.nz] 1 Outlines

More information

Measurement of the binaural auditory filter using a detection task

Measurement of the binaural auditory filter using a detection task Measurement of the binaural auditory filter using a detection task Andrew J. Kolarik and John F. Culling School of Psychology, Cardiff University, Tower Building, Park Place, Cardiff CF1 3AT, United Kingdom

More information

Outline. Communications Engineering 1

Outline. Communications Engineering 1 Outline Introduction Signal, random variable, random process and spectra Analog modulation Analog to digital conversion Digital transmission through baseband channels Signal space representation Optimal

More information

Results of Egan and Hake using a single sinusoidal masker [reprinted with permission from J. Acoust. Soc. Am. 22, 622 (1950)].

Results of Egan and Hake using a single sinusoidal masker [reprinted with permission from J. Acoust. Soc. Am. 22, 622 (1950)]. XVI. SIGNAL DETECTION BY HUMAN OBSERVERS Prof. J. A. Swets Prof. D. M. Green Linda E. Branneman P. D. Donahue Susan T. Sewall A. MASKING WITH TWO CONTINUOUS TONES One of the earliest studies in the modern

More information

Distortion products and the perceived pitch of harmonic complex tones

Distortion products and the perceived pitch of harmonic complex tones Distortion products and the perceived pitch of harmonic complex tones D. Pressnitzer and R.D. Patterson Centre for the Neural Basis of Hearing, Dept. of Physiology, Downing street, Cambridge CB2 3EG, U.K.

More information

Introduction to cochlear implants Philipos C. Loizou Figure Captions

Introduction to cochlear implants Philipos C. Loizou Figure Captions http://www.utdallas.edu/~loizou/cimplants/tutorial/ Introduction to cochlear implants Philipos C. Loizou Figure Captions Figure 1. The top panel shows the time waveform of a 30-msec segment of the vowel

More information

Musical Acoustics, C. Bertulani. Musical Acoustics. Lecture 14 Timbre / Tone quality II

Musical Acoustics, C. Bertulani. Musical Acoustics. Lecture 14 Timbre / Tone quality II 1 Musical Acoustics Lecture 14 Timbre / Tone quality II Odd vs Even Harmonics and Symmetry Sines are Anti-symmetric about mid-point If you mirror around the middle you get the same shape but upside down

More information

AUDITORY ILLUSIONS & LAB REPORT FORM

AUDITORY ILLUSIONS & LAB REPORT FORM 01/02 Illusions - 1 AUDITORY ILLUSIONS & LAB REPORT FORM NAME: DATE: PARTNER(S): The objective of this experiment is: To understand concepts such as beats, localization, masking, and musical effects. APPARATUS:

More information

Audio Engineering Society. Convention Paper. Presented at the 124th Convention 2008 May Amsterdam, The Netherlands

Audio Engineering Society. Convention Paper. Presented at the 124th Convention 2008 May Amsterdam, The Netherlands Audio Engineering Society Convention Paper Presented at the 124th Convention 2008 May 17 20 Amsterdam, The Netherlands The papers at this Convention have been selected on the basis of a submitted abstract

More information

MUS 302 ENGINEERING SECTION

MUS 302 ENGINEERING SECTION MUS 302 ENGINEERING SECTION Wiley Ross: Recording Studio Coordinator Email =>ross@email.arizona.edu Twitter=> https://twitter.com/ssor Web page => http://www.arts.arizona.edu/studio Youtube Channel=>http://www.youtube.com/user/wileyross

More information

Shift of ITD tuning is observed with different methods of prediction.

Shift of ITD tuning is observed with different methods of prediction. Supplementary Figure 1 Shift of ITD tuning is observed with different methods of prediction. (a) ritdfs and preditdfs corresponding to a positive and negative binaural beat (resp. ipsi/contra stimulus

More information

A102 Signals and Systems for Hearing and Speech: Final exam answers

A102 Signals and Systems for Hearing and Speech: Final exam answers A12 Signals and Systems for Hearing and Speech: Final exam answers 1) Take two sinusoids of 4 khz, both with a phase of. One has a peak level of.8 Pa while the other has a peak level of. Pa. Draw the spectrum

More information

Neural Processing of Amplitude-Modulated Sounds: Joris, Schreiner and Rees, Physiol. Rev. 2004

Neural Processing of Amplitude-Modulated Sounds: Joris, Schreiner and Rees, Physiol. Rev. 2004 Neural Processing of Amplitude-Modulated Sounds: Joris, Schreiner and Rees, Physiol. Rev. 2004 Richard Turner (turner@gatsby.ucl.ac.uk) Gatsby Computational Neuroscience Unit, 02/03/2006 As neuroscientists

More information

SGN Audio and Speech Processing

SGN Audio and Speech Processing Introduction 1 Course goals Introduction 2 SGN 14006 Audio and Speech Processing Lectures, Fall 2014 Anssi Klapuri Tampere University of Technology! Learn basics of audio signal processing Basic operations

More information

Acoustics, signals & systems for audiology. Week 4. Signals through Systems

Acoustics, signals & systems for audiology. Week 4. Signals through Systems Acoustics, signals & systems for audiology Week 4 Signals through Systems Crucial ideas Any signal can be constructed as a sum of sine waves In a linear time-invariant (LTI) system, the response to a sinusoid

More information

Instruction Manual for Concept Simulators. Signals and Systems. M. J. Roberts

Instruction Manual for Concept Simulators. Signals and Systems. M. J. Roberts Instruction Manual for Concept Simulators that accompany the book Signals and Systems by M. J. Roberts March 2004 - All Rights Reserved Table of Contents I. Loading and Running the Simulators II. Continuous-Time

More information

Different Approaches of Spectral Subtraction Method for Speech Enhancement

Different Approaches of Spectral Subtraction Method for Speech Enhancement ISSN 2249 5460 Available online at www.internationalejournals.com International ejournals International Journal of Mathematical Sciences, Technology and Humanities 95 (2013 1056 1062 Different Approaches

More information

Using the Gammachirp Filter for Auditory Analysis of Speech

Using the Gammachirp Filter for Auditory Analysis of Speech Using the Gammachirp Filter for Auditory Analysis of Speech 18.327: Wavelets and Filterbanks Alex Park malex@sls.lcs.mit.edu May 14, 2003 Abstract Modern automatic speech recognition (ASR) systems typically

More information

Estimating critical bandwidths of temporal sensitivity to low-frequency amplitude modulation

Estimating critical bandwidths of temporal sensitivity to low-frequency amplitude modulation Estimating critical bandwidths of temporal sensitivity to low-frequency amplitude modulation Allison I. Shim a) and Bruce G. Berg Department of Cognitive Sciences, University of California, Irvine, Irvine,

More information

Lateralisation of multiple sound sources by the auditory system

Lateralisation of multiple sound sources by the auditory system Modeling of Binaural Discrimination of multiple Sound Sources: A Contribution to the Development of a Cocktail-Party-Processor 4 H.SLATKY (Lehrstuhl für allgemeine Elektrotechnik und Akustik, Ruhr-Universität

More information

Final Exam Study Guide: Introduction to Computer Music Course Staff April 24, 2015

Final Exam Study Guide: Introduction to Computer Music Course Staff April 24, 2015 Final Exam Study Guide: 15-322 Introduction to Computer Music Course Staff April 24, 2015 This document is intended to help you identify and master the main concepts of 15-322, which is also what we intend

More information

Sound is the human ear s perceived effect of pressure changes in the ambient air. Sound can be modeled as a function of time.

Sound is the human ear s perceived effect of pressure changes in the ambient air. Sound can be modeled as a function of time. 2. Physical sound 2.1 What is sound? Sound is the human ear s perceived effect of pressure changes in the ambient air. Sound can be modeled as a function of time. Figure 2.1: A 0.56-second audio clip of

More information

Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012

Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012 Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012 o Music signal characteristics o Perceptual attributes and acoustic properties o Signal representations for pitch detection o STFT o Sinusoidal model o

More information

Recurrent Timing Neural Networks for Joint F0-Localisation Estimation

Recurrent Timing Neural Networks for Joint F0-Localisation Estimation Recurrent Timing Neural Networks for Joint F0-Localisation Estimation Stuart N. Wrigley and Guy J. Brown Department of Computer Science, University of Sheffield Regent Court, 211 Portobello Street, Sheffield

More information

Auditory Localization

Auditory Localization Auditory Localization CMPT 468: Sound Localization Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University November 15, 2013 Auditory locatlization is the human perception

More information

Experiments in two-tone interference

Experiments in two-tone interference Experiments in two-tone interference Using zero-based encoding An alternative look at combination tones and the critical band John K. Bates Time/Space Systems Functions of the experimental system: Variable

More information

An introduction to physics of Sound

An introduction to physics of Sound An introduction to physics of Sound Outlines Acoustics and psycho-acoustics Sound? Wave and waves types Cycle Basic parameters of sound wave period Amplitude Wavelength Frequency Outlines Phase Types of

More information

EXPERIMENTAL INVESTIGATION INTO THE OPTIMAL USE OF DITHER

EXPERIMENTAL INVESTIGATION INTO THE OPTIMAL USE OF DITHER EXPERIMENTAL INVESTIGATION INTO THE OPTIMAL USE OF DITHER PACS: 43.60.Cg Preben Kvist 1, Karsten Bo Rasmussen 2, Torben Poulsen 1 1 Acoustic Technology, Ørsted DTU, Technical University of Denmark DK-2800

More information

IMPLEMENTATION AND APPLICATION OF A BINAURAL HEARING MODEL TO THE OBJECTIVE EVALUATION OF SPATIAL IMPRESSION

IMPLEMENTATION AND APPLICATION OF A BINAURAL HEARING MODEL TO THE OBJECTIVE EVALUATION OF SPATIAL IMPRESSION IMPLEMENTATION AND APPLICATION OF A BINAURAL HEARING MODEL TO THE OBJECTIVE EVALUATION OF SPATIAL IMPRESSION RUSSELL MASON Institute of Sound Recording, University of Surrey, Guildford, UK r.mason@surrey.ac.uk

More information

Monaural and Binaural Speech Separation

Monaural and Binaural Speech Separation Monaural and Binaural Speech Separation DeLiang Wang Perception & Neurodynamics Lab The Ohio State University Outline of presentation Introduction CASA approach to sound separation Ideal binary mask as

More information

Signals & Systems for Speech & Hearing. Week 6. Practical spectral analysis. Bandpass filters & filterbanks. Try this out on an old friend

Signals & Systems for Speech & Hearing. Week 6. Practical spectral analysis. Bandpass filters & filterbanks. Try this out on an old friend Signals & Systems for Speech & Hearing Week 6 Bandpass filters & filterbanks Practical spectral analysis Most analogue signals of interest are not easily mathematically specified so applying a Fourier

More information

Modeling binaural signal detection

Modeling binaural signal detection Modeling binaural signal detection Breebaart, D.J. DOI: 1.61/IR546322 Published: 1/1/21 Document Version Publisher s PDF, also known as Version of Record (includes final page, issue and volume numbers)

More information

Loudspeaker Distortion Measurement and Perception Part 2: Irregular distortion caused by defects

Loudspeaker Distortion Measurement and Perception Part 2: Irregular distortion caused by defects Loudspeaker Distortion Measurement and Perception Part 2: Irregular distortion caused by defects Wolfgang Klippel, Klippel GmbH, wklippel@klippel.de Robert Werner, Klippel GmbH, r.werner@klippel.de ABSTRACT

More information

Biomedical Signals. Signals and Images in Medicine Dr Nabeel Anwar

Biomedical Signals. Signals and Images in Medicine Dr Nabeel Anwar Biomedical Signals Signals and Images in Medicine Dr Nabeel Anwar Noise Removal: Time Domain Techniques 1. Synchronized Averaging (covered in lecture 1) 2. Moving Average Filters (today s topic) 3. Derivative

More information