Neural Processing of Amplitude-Modulated Sounds: Joris, Schreiner and Rees, Physiol. Rev. 2004

Richard Turner (turner@gatsby.ucl.ac.uk)
Gatsby Computational Neuroscience Unit, 02/03/2006

Introduction
As neuroscientists we would like to:
- Discover the key perceptual dimensions and features that neurons directly process (e.g. periodicity pitch), which involves...
- Finding the corresponding physical variables (e.g. repetition rate).
Things appear much more intuitive in vision, and this is one reason why more progress has been made toward this end in vision than in audition.
This paper posits amplitude modulation as such a dimension and attempts to resolve:
- Is the temporal envelope dimension of sounds a fundamental organising principle in the auditory system?
- Is it essential to understand the modulation domain in order to understand the architecture of the auditory system?

Key Findings
- Amplitude modulation is a feature of most natural signals.
- Psychophysics: AM is important in many tasks over many time-scales.
- Specialised neural mechanisms seem to be present in the auditory system for the extraction of AM.
- Peripheral neural structures are synchronised to the modulation waveform over limited ranges.
- In higher structures, the firing rates are tuned to AM and the upper limit of modulation frequency tuning decreases.

Three Health warnings
1. This paper is like a dictionary...
   - Over 300 references.
   - Studies vary in anaesthetic, animal, stimulus, part of the auditory system, cell-type(s) and analysis.
   - Documentation is a combinatoric task and needs theory for compression.
2. ...with many pages containing the most interesting words missing.
   - Little has been done with interesting stimuli yet (the simplest stimuli mathematically are not necessarily the simplest for understanding the system).
3. Beware epiphenomena, i.e. does modulation have a representation specific to itself, or are we observing the results of other types of processing?
   - E.g. a toaster will put out the lights if you poke a knife into it, but it is really designed for making toast. Telling the two apart requires some intelligence.

Time scales in sounds and AM - everything is temporal
- Fine structure = fast pressure variations (e.g. formants of speech, 500 Hz).
- Amplitude modulation = envelope (e.g. in speech 3-4 Hz, up to 20 Hz, and up to the pitch (100-200 Hz) in voiced sections).

Natural Scene statistics: Acoustic ecology
- Low-frequency AMs are prominent in natural environments over different frequency regions (Nelken et al.).
- The modulation often carries the important information (Schroeder).
- Arbitrarily soft sounds have finite probability: 1/f scaling over a few decades (Voss and Clarke).
- AM statistics are non-Gaussian, cover a wide range of modulation frequencies and scale universally (Attias and Schreiner):
  - Stimuli: speech, music and animal vocalisations, split into narrow-band frequency channels.
  - Finding: a power-law distribution for the modulation $m_{\omega_c}$ in each channel, $p(m_{\omega_c}) = \exp[-\alpha m_{\omega_c}] / (b^2 + m_{\omega_c}^2)^{\beta/2}$.
  - The distribution is $\omega_c$- (translation-) and ensemble-invariant.
  - Long temporal correlations (100 ms) and long correlations across $\omega_c$.

Stimuli
- The envelope and fine structure interact to produce the spectrum: beats.
- They do not correspond to separate components of the spectrum, e.g.
  $\sin[(\omega + \Delta\omega)t] + \sin[(\omega - \Delta\omega)t] = 2\cos(\Delta\omega t)\sin(\omega t)$
- The most common AM signal used in experiments used to be the sinusoidally amplitude-modulated tone with modulation depth $m$, modulation frequency $\omega_m$ and carrier frequency $\omega_c$:
  $s(t) = [1 + m\sin(\omega_m t)]\sin(\omega_c t)$ (1)
  $\phantom{s(t)} = \sin(\omega_c t) + \frac{m}{2}\left(\cos[(\omega_c - \omega_m)t] - \cos[(\omega_c + \omega_m)t]\right)$ (2)
- In contrast, real-world signals have a range of modulations present.
- The modulation spectrum describes the distribution of modulation energy for each of the carrier frequencies in the waveform.
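The claim that a sinusoidally modulated tone has energy only at the carrier and at two sidebands, and none at the modulation frequency itself, can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the parameter values (8 kHz sampling, 1 kHz carrier, 100 Hz modulator, m = 0.5) are illustrative assumptions, not values from the paper.

```python
import math

def sam_tone(fc, fm, m, fs, n):
    """Sample s(t) = [1 + m*sin(2*pi*fm*t)] * sin(2*pi*fc*t) at rate fs."""
    return [(1.0 + m * math.sin(2 * math.pi * fm * k / fs))
            * math.sin(2 * math.pi * fc * k / fs) for k in range(n)]

def amplitude_at(signal, freq, fs):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

# Illustrative (assumed) parameters; one second of signal so every
# frequency of interest falls exactly on a DFT bin.
fs, fc, fm, m = 8000, 1000, 100, 0.5
s = sam_tone(fc, fm, m, fs, n=fs)
for f in (fm, fc - fm, fc, fc + fm):
    print(f, "Hz:", round(amplitude_at(s, f, fs), 3))
```

The carrier should come out with amplitude 1 and each sideband with amplitude m/2, while the component at the modulation frequency is essentially zero: the envelope is not itself a spectral component.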

Psychophysics 1
- The spectrum of an AM signal $x(t) = m(t)\,c(t)$ is a convolution, $X(\omega) = M(\omega) * C(\omega)$ up to a constant factor (spectral splatter).
- Spectral cues can therefore be used to detect the modulation frequency (e.g. when the side bands fall in different critical bands).
- One way round this is to modulate noise rather than a sinusoid. Broad-band noise eliminates spectral cues (the only signatures are now at the edge of the spectrum).
- Plot $20\log_{10}(1/m)$ at threshold as a function of $f_m$ to get a modulation transfer function, MTF (Moore, pg 233).
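As a small worked example of the quantity plotted in a psychophysical MTF, the sketch below converts a threshold modulation depth into decibels. The function name is hypothetical and the threshold values are made up purely for illustration; they are not measured data.

```python
import math

def modulation_threshold_db(m):
    """Express a threshold modulation depth m (0 < m <= 1) as 20*log10(1/m)."""
    if not 0.0 < m <= 1.0:
        raise ValueError("modulation depth must lie in (0, 1]")
    return 20.0 * math.log10(1.0 / m)

# Hypothetical thresholds for illustration only (not measured data):
# modulation frequency in Hz -> just-detectable modulation depth m.
illustrative_thresholds = {4: 0.05, 64: 0.10, 512: 0.40}
for f_m, m in illustrative_thresholds.items():
    print(f_m, "Hz:", round(modulation_threshold_db(m), 1), "dB")
```

A larger value means a smaller detectable modulation depth, i.e. better sensitivity, so a low-pass MTF shows up as values that fall with increasing modulation frequency.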

Psychophysics 2
- Pitch at the frequency of modulation can be heard even when the carrier is noise, so there must be some kind of temporal analysis.
- MTFs are also used for neural responses (see later).

Psychophysics 3
- Modulation detection interference (MDI): detection of AM is influenced by modulation at the same frequency but on a different carrier.
- Comodulation masking release (CMR): masking in noise is reduced if the noise becomes modulated.

Neural Response Measures 1
Consider two main encodings of modulation frequency:
1. envelope phase locking (predominant at early stages)
2. rate coding (further along the neuraxis)
E.g. auditory nerve fibres phase lock to the fine structure and to the envelope.
NB: the AM signal appears in the spectrum of the PSTH, i.e. non-linearities demodulate the modulator.
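The remark that non-linearities demodulate the modulator can be illustrated with the simplest possible non-linearity. The sketch below is an assumption-laden toy, not a model of a hair cell or nerve fibre: it half-wave rectifies a sinusoidally modulated tone and compares the spectral component at the modulation frequency before and after. All names and parameter values are illustrative.

```python
import math

def amplitude_at(signal, freq, fs):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

# Illustrative (assumed) parameters.
fs, fc, fm, m = 8000, 1000, 100, 0.5
sam = [(1.0 + m * math.sin(2 * math.pi * fm * k / fs))
       * math.sin(2 * math.pi * fc * k / fs) for k in range(fs)]

# Half-wave rectification stands in for the non-linearity (an assumption).
rectified = [max(x, 0.0) for x in sam]

print("component at f_m before rectification:", round(amplitude_at(sam, fm, fs), 4))
print("component at f_m after rectification: ", round(amplitude_at(rectified, fm, fs), 4))
```

Before rectification there is no energy at the modulation frequency; afterwards a clear component appears there (for an ideal continuous-time half-wave rectifier it would be of the order of m/π), which is the sense in which the envelope shows up in the spectrum of a PSTH.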

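Neural Response Measures 2
Need a measure of the phase locking. One rather arbitrary metric (used by most protagonists) is to define the phase of the $n$-th spike, fired at time $t_n$, relative to the modulation period $T_m$:
$\theta_n = 2\pi t_n / T_m \pmod{2\pi}$, with $x_n = \cos\theta_n$ and $y_n = \sin\theta_n$
$R = \frac{1}{N}\sqrt{\left(\sum_{n=1}^{N} x_n\right)^2 + \left(\sum_{n=1}^{N} y_n\right)^2}$ (3)
So, if a neuron always fires at the same phase $\theta$, $R = 1$:
$R = \frac{1}{N}\sqrt{(N\cos\theta)^2 + (N\sin\theta)^2} = 1$ (4)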
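The synchronisation measure R defined above is straightforward to compute from a list of spike times. The following is a minimal sketch; the function name and the example spike train (one spike per 10 ms modulation cycle, always 2 ms after the cycle start) are assumptions made for illustration.

```python
import math

def vector_strength(spike_times, period):
    """R in [0, 1]: length of the mean resultant vector of the spike phases."""
    if not spike_times:
        raise ValueError("need at least one spike")
    phases = [2 * math.pi * (t % period) / period for t in spike_times]
    x = sum(math.cos(p) for p in phases)
    y = sum(math.sin(p) for p in phases)
    return math.hypot(x, y) / len(phases)

# Hypothetical perfectly locked spike train: one spike per cycle,
# always at the same phase of a 100 Hz modulation (period 10 ms).
period = 0.01
locked = [k * period + 0.002 for k in range(50)]
print(round(vector_strength(locked, period), 6))
```

Because every spike falls at the same phase, the resultant has full length and R comes out at 1 (to numerical precision), matching equation (4).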

Neural Response Measures 3
BUT
1. Although a value of R = 1 is unambiguous, a low value is not, e.g. spikes equally divided between φ and φ + π have R = 0.
2. This measure likes bunched-up spikes, so if the cell is representing the modulation waveform with its probability of firing, then it might score poorly on the above, even though this is a faithful representation of the envelope. This is a consequence of sinusoid thinking in a sinusoid world.
3. Finally, it is not obvious how to calculate the above when the modulation spectrum is more complex than a delta function: what $T_m$ do you use?
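The first two caveats can be made concrete with two idealised phase distributions. This is a sketch under stated assumptions: instead of sampling random spikes, each phase carries a weight that plays the role of a spike count, and the second case assumes a firing probability that follows a fully modulated sinusoidal envelope exactly. The names are hypothetical.

```python
import math

def resultant_length(weighted_phases):
    """Vector strength for (phase, weight) pairs; weights act as spike counts."""
    total = sum(w for _, w in weighted_phases)
    x = sum(w * math.cos(p) for p, w in weighted_phases)
    y = sum(w * math.sin(p) for p, w in weighted_phases)
    return math.hypot(x, y) / total

# Case 1: spikes split evenly between phase phi and phi + pi.
phi = 0.7
two_clusters = [(phi, 50), (phi + math.pi, 50)]

# Case 2: firing probability proportional to the envelope 1 + m*sin(theta),
# evaluated on a fine phase grid (m = 1, i.e. full modulation).
m, bins = 1.0, 360
envelope_follower = [(2 * math.pi * k / bins,
                      1 + m * math.sin(2 * math.pi * k / bins))
                     for k in range(bins)]

print("two opposite clusters:", round(resultant_length(two_clusters), 6))
print("envelope follower:    ", round(resultant_length(envelope_follower), 6))
```

The two tightly locked clusters cancel and give R of essentially 0, while the envelope follower reaches only m/2, so a cell that reproduces a sinusoidal envelope perfectly in its firing probability can never score above 0.5 on this metric.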

Auditory Nerve - type 1 cells

Auditory Nerve - type 1 cells, explanation
A. Increase m: R increases monotonically, eventually saturating. The rate remains constant.
B. Increase SPL: R increases to a maximum, and then decreases. This is expected from a sigmoidal rate-level function: at low SPL the neuron doesn't fire; for intermediate levels the stimulus distribution sits on the steepest part of the rate-level curve and there is good modulation; for high SPL the neuron just fires all the time and there is little modulation.
C. Increase $f_m$: the side-bands move away from the carrier frequency and become attenuated as they move out of the filter. This causes the modulation to drop. The bandwidth of the MTFs increases with CF (as you'd expect from the increase in filter bandwidth). The highest modulation frequency at which envelope phase locking is observed is 2 kHz.
D. For moderate or loud stimuli the strongest phase locking will be from fibres with CF ≠ $f_c$. However, the effect of $f_c$ relative to CF has not been investigated.
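The level dependence described in point B can be reproduced with a toy model. The sketch below passes the instantaneous level of a modulated envelope through a sigmoidal rate-level function and measures how strongly the resulting rate is modulated. Everything here is an assumption for illustration: the sigmoid shape, its parameters (midpoint 30 dB, slope 5 dB, spontaneous rate 10 spikes/s, maximum 200 spikes/s) and the use of a spontaneous-rate floor to stand in for the unmodulated response below threshold.

```python
import math

def rate(level_db, spont=10.0, r_max=200.0, l50=30.0, slope=5.0):
    """Hypothetical sigmoidal rate-level function (spikes/s) with a spontaneous floor."""
    return spont + (r_max - spont) / (1.0 + math.exp(-(level_db - l50) / slope))

def envelope_sync(spl_db, m=0.5, bins=360):
    """Vector strength of the firing-rate profile over one modulation cycle."""
    x = y = total = 0.0
    for k in range(bins):
        theta = 2 * math.pi * k / bins
        # Instantaneous level of the envelope 1 + m*sin(theta), in dB.
        level = spl_db + 20 * math.log10(1 + m * math.sin(theta))
        r = rate(level)
        x += r * math.cos(theta)
        y += r * math.sin(theta)
        total += r
    return math.hypot(x, y) / total

for spl in (0, 30, 80):
    print(spl, "dB:", round(envelope_sync(spl), 3))
```

In this toy model synchronisation is weak well below threshold, strongest when the envelope sweeps the steep part of the sigmoid, and weak again at saturation, which is the non-monotonic pattern described in B.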

High SR versus low SR
Reminder: there are two sorts of type 1 cells:
- High-SR cells: > 18 spikes/s, low thresholds, limited range.
- Low-SR cells: high thresholds, wider range, don't really saturate, adapt less.
Cells with low and medium SR tend to have higher $R_{max}$ values, especially if they have low CFs. Different metrics give different answers though!
Synchronisation is robust in high-SR cells at low SPL and in low-SR cells at high and medium SPL.
Low-SR fibres have a larger dynamic range over which the modulation is present.

Summary of Auditory Nerve tuning
- Envelope information is abundant.
- Each nerve fibre transmits information over a stereotypical range of modulation frequencies, carrier frequencies and intensities.
- The main bottleneck for the processing of AM is the extent of modulation frequencies over which synchronisation occurs.

Superior Olivary Complex
- SOC transforms the stimulus-locked ITD temporal code into a rate code (labelled line).
- SOC seems to have two binaural circuits for localisation: ITDs for low-frequency sounds (mainly with low-CF neurons), ILDs for high-frequency sounds (mainly with high-CF neurons).
- JNDs for ITDs of high-frequency sounds are almost the same as for low-frequency sounds if the high-frequency carrier is modulated by a time-varying envelope: $r_{\mathrm{MSO}} \sim e(t - t_i)\, e(t - t_c)$
- In general, the upper limit of phase locking reduces as we move up the system, so this might be a general principle.
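One way to read the expression for the MSO rate above is as a coincidence detector acting on envelopes: the output is taken to be the time-average of the product of the envelope delayed by $t_i$ on one side and by $t_c$ on the other. The sketch below implements that reading; the averaging step, the sinusoidal envelope, the 100 Hz modulation rate and the 0.5 ms delay are all assumptions made for illustration.

```python
import math

def envelope(t, fm=100.0, m=1.0):
    """Sinusoidal envelope e(t) = 1 + m*sin(2*pi*fm*t) (assumed shape)."""
    return 1.0 + m * math.sin(2 * math.pi * fm * t)

def coincidence_rate(t_i, t_c, fm=100.0, steps=1000):
    """Time-average of e(t - t_i) * e(t - t_c) over one modulation period."""
    period = 1.0 / fm
    total = 0.0
    for k in range(steps):
        t = k * period / steps
        total += envelope(t - t_i, fm) * envelope(t - t_c, fm)
    return total / steps

# Hypothetical delay of 0.5 ms on one side; scan candidate delays on the other.
t_i = 0.5e-3
candidates = [k * 0.1e-3 for k in range(-20, 21)]  # -2 ms .. +2 ms
best = max(candidates, key=lambda t_c: coincidence_rate(t_i, t_c))
print("best matching delay (ms):", round(best * 1e3, 3))
```

The averaged product is largest when the two delays match, so a bank of such units with different internal delays would turn an envelope time difference into a place (labelled-line) code, even though the carrier itself is never used.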

IC - tMTFs
- Strongly modulated responses: larger modulation gains than in the CN.
- But restricted to a smaller range, up to 200-300 Hz.
- The tMTF is either low-pass (most) or band-pass.

IC - rMTFs
- The rMTF is much more peaked than in the CN (for which the rMTF is usually flat).
- rMTFs seem to have a wider range of patterns than tMTFs: band-pass (most common), low-pass, band-reject, complex.
- tMTFs and rMTFs generally match, but in a number of neurons they do not.
- Rate codes might represent modulation frequencies higher than can be supported temporally in the IC.
- But: the evidence that rMTF peaks tend to be higher than tMTF peaks is debated.

Topographic mapping of AM?
Schreiner and Langner (251):
- In the central nucleus, disc cells and stellate cells form twisted laminae of cells that support the highly tonotopic frequency organisation of the IC.
- Evidence for a modulation filter bank:
  - reconstruct the location of recording sites
  - create maps of best modulation frequency (BMF)
  - iso-best-modulation-frequency contours are cones with the tip at low CF and the base at high CF
- In support of this map, response latency (which should be inversely proportional to BMF) is also topographically mapped across the iso-frequency laminae.
- There is debate about whether this map exists.

Topographic mapping of modulation frequency 2

Cortex 1
- Temporal coding in AC is substantially reduced: max following rates 30 Hz.
- A high percentage of cells have band-pass tMTFs.
- tBMFs seem to be independent of CF, suggesting independent processing of modulation frequency in each spectral band.
- This could allow spectral components to be sorted according to their modulation rate, with commonly modulated components bound at a later stage.
- However, this is still unclear, and before we jump to the conclusion of modulation filter banks more work needs to be done, e.g. is the phase of the envelope components reflected in the synchronisation or in the rate?

Cortex 2
- Other pathways show a distinct movement from temporal to rate coding (e.g. ITD, and envelope ITD), and sensitivity is retained in the new encoding.
- However, although band-pass rMTFs are found, they are far less common than band-pass tMTFs.
- rMTF BMFs do move to higher frequencies, but they are still low compared to the brain stem.
- Lu et al. and Bieser and Müller-Preuss suggest that low modulation rates are encoded by the phase-locked neurons and high modulation rates by the rate variations.
- But if this is true, why can we detect the envelope up to 1000 Hz, when the BMFs are only a few hundred Hz?

Conclusion
Patchy picture, many unsolved issues. But we have seen:
1. a recoding of modulation selectivity from temporal to rate based
2. a decrease in the highest modulation frequencies coded (in the temporal or rate code)
Two views:
1. Skeptical view: modulation coding is epiphenomenal, i.e. a necessary by-product of other types of processing.
2. Non-skeptical view: neural mechanisms are dedicated to modulation processing.
   - tuning to modulation is prominent in the rate and temporal domains
   - the range spans perceptually relevant ranges
   - topographic mapping evidence...
Where next?
- More complex stimuli: sinusoidally modulated sinusoids are very simple in both the modulation and the carrier. Alternatives include modulated noise, complex modulation by a sum of sines, different modulations for different carriers, etc.