Singing Expression Transfer from One Voice to Another for a Given Song

1 Singing Expression Transfer from One Voice to Another for a Given Song
Sangeon Yong, Juhan Nam
MACLab (Music and Audio Computing), Korea Advanced Institute of Science and Technology

2 Introduction

3 Introduction [Figure: source, target]

4 Related Works
- Antares Auto-Tune 8 (graphical mode)
- Steinberg VariAudio

5 Related Works
- Cano et al. (ICMC, 2000): voice morphing system with a source and a target voice; score information is used for temporal alignment
- Nakano et al. (SMC, 2009): similar to the above, but uses a singing synthesizer (i.e. Vocaloid) instead of the source voice; tunes the synthesizer parameters with the lyric information of the song
- However, they require additional score information!

6 Research Goal
- Voice color? Rhythm, Pitch, Dynamics
- Transfer musical expressions without any additional information

7 System Structure
- Inputs: Source and Target
- Temporal Alignment: Feature Extraction -> DTW (stretching ratio) -> Smoothing (smoothed stretching ratio) -> Time-Scale Modification
- Pitch Alignment: HPSS (harmonic signal) -> Pitch Detector (pitch ratio) -> Pitch Shifting
- Dynamics Alignment: Envelope Detector (gain ratio) -> Gain
- Signal flow: Source s -> Time-Scale Modification -> s_T -> Pitch Shifting -> s_TP -> Gain -> s_TPE (Modified)

10 Temporal Alignment [Figure: Singer A and Singer B singing the lyrics "Let it go, let it go"]

11 Temporal Alignment Dynamic Time Warping
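
To make the DTW step concrete, here is a minimal pure-Python sketch. It assumes each recording has already been converted to a list of frame-level feature vectors, and it uses cosine distance as the local cost; the slides do not state which distance, step pattern, or implementation the authors used, so `cosine_distance` and `dtw_path` are hypothetical names chosen only for illustration.

```python
import math


def cosine_distance(u, v):
    """Local cost: 1 - cosine similarity between two feature frames."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm > 0 else 1.0


def dtw_path(source, target, dist=cosine_distance):
    """Return (total_cost, path) aligning source frames to target frames."""
    n, m = len(source), len(target)
    inf = float("inf")
    # acc[i][j]: minimal accumulated cost of aligning source[:i+1] with target[:j+1]
    acc = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(source[i], target[j])
            if i == 0 and j == 0:
                acc[i][j] = d
                continue
            best = min(
                acc[i - 1][j] if i > 0 else inf,
                acc[i][j - 1] if j > 0 else inf,
                acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            acc[i][j] = d + best
    # Backtrack from the last cell to (0, 0), always taking the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return acc[n - 1][m - 1], path
```

The returned path is a list of (source frame, target frame) index pairs; its local slope is what the later stages would interpret as a stretching ratio.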

13 Temporal Alignment: Feature Extraction [Figure: spectrogram of source, spectrogram of target]

14 Temporal Alignment: Feature Extraction [Figure: similarity matrix computed from the spectrograms]

16 Feature Extraction Strategy
- Preserve common elements: note-level melody, lyrics
- Suppress differing characteristics: vibrato or other pitch-related articulations, singer timbre

17 Proposed Features: Max-filtered Constant-Q Transform
- Semitone pitch resolution: handles vibrato narrower than one semitone
- Frequency-wise max-filtering: handles vibrato wider than one semitone
[Figure: constant-Q transform; constant-Q transform with maximum filtering]
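
The frequency-wise max-filtering idea can be sketched as follows. This assumes the constant-Q magnitudes are already available as a list of frames with one bin per semitone; the filter width of 3 bins and the function name `max_filter_frequency` are illustrative assumptions, not values taken from the slides.

```python
def max_filter_frequency(cqt, width=3):
    """Apply a running maximum along the frequency axis of each frame.

    cqt is a list of frames; each frame is a list of magnitudes, one per
    semitone bin. width is the (odd) filter length in bins.
    """
    half = width // 2
    filtered = []
    for frame in cqt:
        n_bins = len(frame)
        filtered.append([
            # Maximum over the bins within +/- half of bin k, clipped at the edges.
            max(frame[max(0, k - half):min(n_bins, k + half + 1)])
            for k in range(n_bins)
        ])
    return filtered
```

With this filter, a peak that wobbles between neighbouring semitone bins from frame to frame still produces overlapping active bins, so a frame-wise distance sees the two frames as similar.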

18 Proposed Features: Phoneme Score
- Phoneme score (phoneme classifier posteriorgram)
- Frame-level features for accurate temporal alignment
- Singer-invariant lyrical features
[Figure: phoneme score; axis label: Phonemes]

19 Temporal Alignment: Feature Comparison [Figure: spectrogram vs. max-filtered constant-Q transform]

20 Temporal Alignment: Feature Comparison [Figure: spectrogram vs. phoneme score]

21 Temporal Alignment: Feature Comparison [Figure: spectrogram vs. phoneme score + constant-Q transform]

23 Temporal Alignment Path Smoothing

24 Temporal Alignment: Path Smoothing
- Savitzky-Golay filter: Savitzky, Abraham, and Marcel J. E. Golay. "Smoothing and differentiation of data by simplified least squares procedures." Analytical Chemistry 36.8 (1964).
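
As a rough illustration of Savitzky-Golay smoothing, the sketch below uses the standard 5-point, quadratic-fit coefficients (-3, 12, 17, 12, -3)/35. The window length and polynomial order are assumptions, since the slides do not give the settings used, and whether the filter is applied to the path itself or to the stretching ratio is also not stated. Edge samples are left unchanged for simplicity.

```python
# Savitzky-Golay coefficients for a 5-point window and a quadratic fit.
SG5 = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0


def savgol5(values):
    """Smooth a sequence with a 5-point Savitzky-Golay filter.

    The first and last two samples are copied through unchanged.
    """
    out = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG5, window)) / SG5_NORM
    return out


def local_slope(values):
    """First difference of a warping function, one value per frame step."""
    return [b - a for a, b in zip(values, values[1:])]
```

A straight warping function passes through unchanged, while an isolated jump is attenuated, so the slope derived from the smoothed curve fluctuates less than the raw DTW slope.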

25 Temporal Alignment Path Smoothing

26 Temporal Alignment Path Smoothing

27 System Structure: Time-Scale Modification is performed with WSOLA (Waveform Similarity Overlap-Add)

29 Pitch Alignment
- Harmonic-Percussive Source Separation (HPSS): pre-processing for pitch detection to increase detection accuracy; median filter (IEEE Signal Processing Letters 2014)
- Pitch detector: YIN
- Pitch shifting: Pitch-Synchronous Overlap-Add (PSOLA) with formant preservation
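
The pitch ratio that drives the pitch shifter can be sketched as below. It assumes frame-wise f0 tracks in Hz are already available for the time-aligned source and the target (for example from a YIN implementation), with 0 marking unvoiced frames; leaving unvoiced frames at a ratio of 1.0 is an assumption made here, and `pitch_ratio` and `ratio_to_semitones` are hypothetical helper names.

```python
import math


def pitch_ratio(source_f0, target_f0):
    """Per-frame ratio target_f0 / source_f0.

    Frames where either voice is unvoiced (f0 <= 0) get a ratio of 1.0,
    i.e. no pitch shift is requested there.
    """
    ratios = []
    for fs, ft in zip(source_f0, target_f0):
        if fs > 0 and ft > 0:
            ratios.append(ft / fs)
        else:
            ratios.append(1.0)
    return ratios


def ratio_to_semitones(ratio):
    """Express a frequency ratio as a shift in semitones."""
    return 12.0 * math.log2(ratio)
```

For example, a source frame at 220 Hz and a target frame at 440 Hz give a ratio of 2.0, which corresponds to a shift of 12 semitones.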

30 Pitch Alignment [Figure: source, target, result]

32 Dynamics Alignment [Figure: source, target, result]
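
One way to picture the gain ratio is as the quotient of two amplitude envelopes. The sketch below uses a frame-wise RMS envelope, which is an assumption: the slides only name an "Envelope Detector" without specifying it. The frame length, hop size, and the floor `eps` are illustrative values, and both signals are assumed to be already time-aligned.

```python
import math


def rms_envelope(signal, frame_len, hop):
    """Frame-wise root-mean-square amplitude of a list of samples."""
    env = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        env.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return env


def gain_ratio(source, target, frame_len=4, hop=4, eps=1e-8):
    """Per-frame gain that scales the source envelope to the target envelope."""
    src_env = rms_envelope(source, frame_len, hop)
    tgt_env = rms_envelope(target, frame_len, hop)
    # eps keeps the division finite when the source frame is silent.
    return [t / max(s, eps) for s, t in zip(src_env, tgt_env)]
```

Multiplying each source frame by its gain value would then bring its loudness contour toward that of the target.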

33 Evaluation: Datasets
- 4 recordings for each of 4 songs (16 recordings in total)
- One of the 4 recordings of each song is the target singing voice (professional or skilled)
- 12 source-target pairs of singing voices in total

               Song 1      Song 2     Song 3        Song 4
Gender         female      male       male          male
No. of source
Remarks        high pitch  low pitch  swing rhythm  swing rhythm
               English     English    Korean        Korean

34 Evaluation: Temporal alignment
- Better alignment has less fluctuation of the DTW slope
- Metric: standard deviation of the slope angle θ = arctan(slope)
[Figure: results for song 1, song 2, song 3, song 4]
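
The metric above can be computed along these lines. The sketch assumes the alignment is given as a warping function (one target frame index per source frame), takes the slope as its first difference, and reports the population standard deviation in radians; these details, and the name `slope_angle_std`, are assumptions rather than the authors' exact procedure.

```python
import math
import statistics


def slope_angle_std(warp):
    """Standard deviation of theta = arctan(slope) along a warping function.

    warp[i] is the target frame index matched to source frame i.
    """
    slopes = [b - a for a, b in zip(warp, warp[1:])]
    angles = [math.atan(s) for s in slopes]
    return statistics.pstdev(angles)
```

A perfectly straight path yields 0, and a path that alternates between flat and steep segments yields a larger value, matching the reading that less fluctuation means better alignment.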

35 Evaluation: Pitch alignment

36 Evaluation: Dynamics alignment

37 Audio Examples
- Songs: "Let It Go", "Cherry Blossom Ending"
- For each: source, target, result
More examples are available on

38 Summary
- Proposed a method to transfer vocal expressions from one voice to another in terms of tempo, pitch and dynamics without any additional information
- Showed that the proposed method effectively transforms the source voices so that they mimic the singing skills of the target voice

39 Future Plan
- A limitation of this work is that the target voice must be available
- A possible solution is to build a target singer model (e.g. a singing synthesizer with natural expressions) and generate a target example using melody and lyrics information extracted from the source voice
- Improve the audio quality using other time-scale/pitch modification algorithms


More information

Separating Voiced Segments from Music File using MFCC, ZCR and GMM

Separating Voiced Segments from Music File using MFCC, ZCR and GMM Separating Voiced Segments from Music File using MFCC, ZCR and GMM Mr. Prashant P. Zirmite 1, Mr. Mahesh K. Patil 2, Mr. Santosh P. Salgar 3,Mr. Veeresh M. Metigoudar 4 1,2,3,4Assistant Professor, Dept.

More information

Adaptive time scale modification of speech for graceful degrading voice quality in congested networks

Adaptive time scale modification of speech for graceful degrading voice quality in congested networks Adaptive time scale modification of speech for graceful degrading voice quality in congested networks Prof. H. Gokhan ILK Ankara University, Faculty of Engineering, Electrical&Electronics Eng. Dept 1 Contact

More information

Using Audio Onset Detection Algorithms

Using Audio Onset Detection Algorithms Using Audio Onset Detection Algorithms 1 st Diana Siwiak Victoria University of Wellington Wellington, New Zealand 2 nd Dale A. Carnegie Victoria University of Wellington Wellington, New Zealand 3 rd Jim

More information

Psychology of Language

Psychology of Language PSYCH 150 / LIN 155 UCI COGNITIVE SCIENCES syn lab Psychology of Language Prof. Jon Sprouse 01.10.13: The Mental Representation of Speech Sounds 1 A logical organization For clarity s sake, we ll organize

More information

WK-7500 WK-6500 CTK-7000 CTK-6000 BS A

WK-7500 WK-6500 CTK-7000 CTK-6000 BS A WK-7500 WK-6500 CTK-7000 CTK-6000 Windows and Windows Vista are registered trademarks of Microsoft Corporation in the United States and other countries. Mac OS is a registered trademark of Apple Inc. in

More information

Isolated Digit Recognition Using MFCC AND DTW

Isolated Digit Recognition Using MFCC AND DTW MarutiLimkar a, RamaRao b & VidyaSagvekar c a Terna collegeof Engineering, Department of Electronics Engineering, Mumbai University, India b Vidyalankar Institute of Technology, Department ofelectronics

More information

Contents. Sevana Voice Quality Analyzer Copyright (c) 2009 by Sevana Oy, Finland. All rights reserved.

Contents. Sevana Voice Quality Analyzer Copyright (c) 2009 by Sevana Oy, Finland. All rights reserved. Sevana Voice Quality Analyzer 3.4.10.327 Contents Contents... 1 Introduction... 2 Functionality... 2 Requirements... 2 Generate test signals... 2 Test voice codecs... 2 Compare wav files... 2 Testing parameters...

More information

BEAT DETECTION BY DYNAMIC PROGRAMMING. Racquel Ivy Awuor

BEAT DETECTION BY DYNAMIC PROGRAMMING. Racquel Ivy Awuor BEAT DETECTION BY DYNAMIC PROGRAMMING Racquel Ivy Awuor University of Rochester Department of Electrical and Computer Engineering Rochester, NY 14627 rawuor@ur.rochester.edu ABSTRACT A beat is a salient

More information

- CROWD REVIEW FOR - The Silent Me enge

- CROWD REVIEW FOR - The Silent Me enge - CROWD REVIEW FOR - The Silent Me enge JOHN DANZEN - FEB 28, 2016 Word cloud THIS VISUALIZATION REVEALS WHAT EMOTIONS AND KEY THEMES THE REVIEWERS MENTIONED MOST OFTEN IN THE REVIEWS. THE LARGER T HE

More information

Project 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing

Project 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing Project : Part 2 A second hands-on lab on Speech Processing Frequency-domain processing February 24, 217 During this lab, you will have a first contact on frequency domain analysis of speech signals. You

More information

Speech Synthesis using Mel-Cepstral Coefficient Feature

Speech Synthesis using Mel-Cepstral Coefficient Feature Speech Synthesis using Mel-Cepstral Coefficient Feature By Lu Wang Senior Thesis in Electrical Engineering University of Illinois at Urbana-Champaign Advisor: Professor Mark Hasegawa-Johnson May 2018 Abstract

More information

Worship Team Expectations

Worship Team Expectations Worship Team Expectations General Expectations: To participate on the worship team, you must consider FaithBridge to be your home church. Being an active member of the FaithBridge family means: Participate

More information

Digital Speech Processing and Coding

Digital Speech Processing and Coding ENEE408G Spring 2006 Lecture-2 Digital Speech Processing and Coding Spring 06 Instructor: Shihab Shamma Electrical & Computer Engineering University of Maryland, College Park http://www.ece.umd.edu/class/enee408g/

More information

TIME DOMAIN ATTACK AND RELEASE MODELING Applied to Spectral Domain Sound Synthesis

TIME DOMAIN ATTACK AND RELEASE MODELING Applied to Spectral Domain Sound Synthesis TIME DOMAIN ATTACK AND RELEASE MODELING Applied to Spectral Domain Sound Synthesis Cornelia Kreutzer, Jacqueline Walker Department of Electronic and Computer Engineering, University of Limerick, Limerick,

More information

Single-channel Mixture Decomposition using Bayesian Harmonic Models

Single-channel Mixture Decomposition using Bayesian Harmonic Models Single-channel Mixture Decomposition using Bayesian Harmonic Models Emmanuel Vincent and Mark D. Plumbley Electronic Engineering Department, Queen Mary, University of London Mile End Road, London E1 4NS,

More information

Perception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner.

Perception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner. Perception of pitch AUDL4007: 11 Feb 2010. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum, 2005 Chapter 7 1 Definitions

More information

Enhanced Harmonic Content and Vocal Note Based Predominant Melody Extraction from Vocal Polyphonic Music Signals

Enhanced Harmonic Content and Vocal Note Based Predominant Melody Extraction from Vocal Polyphonic Music Signals INTERSPEECH 016 September 8 1, 016, San Francisco, USA Enhanced Harmonic Content and Vocal Note Based Predominant Melody Extraction from Vocal Polyphonic Music Signals Gurunath Reddy M, K. Sreenivasa Rao

More information

Sinusoidal Modelling in Speech Synthesis, A Survey.

Sinusoidal Modelling in Speech Synthesis, A Survey. Sinusoidal Modelling in Speech Synthesis, A Survey. A.S. Visagie, J.A. du Preez Dept. of Electrical and Electronic Engineering University of Stellenbosch, 7600, Stellenbosch avisagie@dsp.sun.ac.za, dupreez@dsp.sun.ac.za

More information

SOUND SOURCE RECOGNITION AND MODELING

SOUND SOURCE RECOGNITION AND MODELING SOUND SOURCE RECOGNITION AND MODELING CASA seminar, summer 2000 Antti Eronen antti.eronen@tut.fi Contents: Basics of human sound source recognition Timbre Voice recognition Recognition of environmental

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb 2009. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence

More information

SurferEQ 2. User Manual. SurferEQ v Sound Radix, All Rights Reserved

SurferEQ 2. User Manual. SurferEQ v Sound Radix, All Rights Reserved 1 SurferEQ 2 User Manual 2 RADICALLY MUSICAL, CREATIVE TIMBRE SHAPER SurferEQ is a ground-breaking pitch-tracking equalizer plug-in that tracks a monophonic instrument or vocal and moves the selected bands

More information

The Deep Sound of a Global Tweet: Sonic Window #1

The Deep Sound of a Global Tweet: Sonic Window #1 The Deep Sound of a Global Tweet: Sonic Window #1 (a Real Time Sonification) Andrea Vigani Como Conservatory, Electronic Music Composition Department anvig@libero.it Abstract. People listen music, than

More information

COMPUTATIONAL RHYTHM AND BEAT ANALYSIS Nicholas Berkner. University of Rochester

COMPUTATIONAL RHYTHM AND BEAT ANALYSIS Nicholas Berkner. University of Rochester COMPUTATIONAL RHYTHM AND BEAT ANALYSIS Nicholas Berkner University of Rochester ABSTRACT One of the most important applications in the field of music information processing is beat finding. Humans have

More information

Speech Processing. Undergraduate course code: LASC10061 Postgraduate course code: LASC11065

Speech Processing. Undergraduate course code: LASC10061 Postgraduate course code: LASC11065 Speech Processing Undergraduate course code: LASC10061 Postgraduate course code: LASC11065 All course materials and handouts are the same for both versions. Differences: credits (20 for UG, 10 for PG);

More information

HCS 7367 Speech Perception

HCS 7367 Speech Perception HCS 7367 Speech Perception Dr. Peter Assmann Fall 212 Power spectrum model of masking Assumptions: Only frequencies within the passband of the auditory filter contribute to masking. Detection is based

More information

Dept. of Computer Science, University of Copenhagen Universitetsparken 1, DK-2100 Copenhagen Ø, Denmark

Dept. of Computer Science, University of Copenhagen Universitetsparken 1, DK-2100 Copenhagen Ø, Denmark NORDIC ACOUSTICAL MEETING 12-14 JUNE 1996 HELSINKI Dept. of Computer Science, University of Copenhagen Universitetsparken 1, DK-2100 Copenhagen Ø, Denmark krist@diku.dk 1 INTRODUCTION Acoustical instruments

More information

City, University of London Institutional Repository

City, University of London Institutional Repository City Research Online City, University of London Institutional Repository Citation: Benetos, E., Holzapfel, A. & Stylianou, Y. (29). Pitched Instrument Onset Detection based on Auditory Spectra. Paper presented

More information