Resynthesizing audiovisual perception with augmented reality
1 Resynthesizing audiovisual perception with augmented reality Parag K Mital, Department of Computing, Goldsmiths, University of London http://pkmital.com Presented for Lunch BITES, CULTURE Lab, Newcastle on 30/06/11
2 Questions What computational processes describe audiovisual perception in the real world? What can augmented reality reveal about our underlying perception? Objectives Build computational models of audiovisual attention using controlled experiments Interpret these models in a real-time context situated in real-life scenarios using augmented reality and re-synthesis techniques
3 Modeling Attention Prior Spectral/Region Segmentation Temporal Event Segmentation Synthesis Retrieval/Indexing Scene Reconstruction
4 Modeling Attention Prior Spectral/Region Segmentation Temporal Event Segmentation Synthesis Retrieval/Indexing Scene Reconstruction
5 Experimental Psychology What processes describe human cognition? Visual cognition Vision research Auditory scene analysis Auditory attention Psychophysics Psychoacoustics Multisensory/Crossmodal perception Film cognition
6 Computational Cognition What computational models best describe human cognition? Computer vision Computational neuroscience Machine learning Speech recognition Saliency models
7 Dynamic Images and Eye Movements John Henderson, Tim Smith, Robin Hill, Parag K Mital awarded to John Henderson and funded by Leverhulme and ESRC Question What drives human attention and eye-movement behavior during moving images? Objectives Build a corpus of eye-movement data and corresponding moving images Develop theories and tools for understanding active visual cognition
8 82 videos ranging between 30 seconds and 3 minutes. 200+ viewers. Broad range of stimuli: adverts, film clips, real-world scenes, social scenes, film trailers, video game trailers, music videos, documentaries, news clips, animation
9 Eye-tracking data CARPE X/Y coords of eyes per millisecond per eye per person, plus various eye-movement events and messages. >1000 lines of 8-column data per second! Gaze videos Gaussian Mixture Models Low-level feature visualizations Optical flow, edges, Gabors, flicker, chromaticity, luminance Dynamic heatmap videos
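The dynamic heatmap videos mentioned above can be sketched by splatting a Gaussian at each gaze coordinate and normalising the result for display. A minimal numpy illustration (the function name and parameters are hypothetical, not CARPE's actual API):

```python
import numpy as np

def gaze_heatmap(gaze_xy, width, height, sigma=20.0):
    """Accumulate a Gaussian-weighted fixation map from (x, y) gaze samples."""
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width))
    for x, y in gaze_xy:
        heat += np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    if heat.max() > 0:
        heat /= heat.max()  # normalise to [0, 1] for display
    return heat

# Two viewers fixating near the same region produce a single hotspot
heat = gaze_heatmap([(40, 30), (42, 31)], width=80, height=60, sigma=5.0)
```

In practice one such map is rendered per video frame, using only the gaze samples falling inside a short temporal window around that frame.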
17 Auditory Attention Modeling
18 Modeling Attention Prior Spectral/Region Segmentation Temporal Event Segmentation Synthesis Retrieval/Indexing Scene Reconstruction
19 Vision Processing Detection Features (SIFT, SURF, Harris corners) Regions (Mean-shift, MSER) Haar features (boosted cascades, Viola-Jones) Templates (MI, SSD, Lucas-Kanade) Description Vector codes (GIST, SIFT, SURF, BRIEF) Trees (FLANN, LSH) Model-based reconstruction (PCA, pLSA, LDA)
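Of the detectors listed, the Harris corner response is the simplest to sketch. A minimal numpy version, assuming a 3x3 box filter in place of the usual Gaussian smoothing of the structure tensor (helper names are illustrative only):

```python
import numpy as np

def harris_response(img, k=0.04):
    """Harris corner response R = det(M) - k*trace(M)^2 per pixel."""
    # image gradients via central differences
    Iy, Ix = np.gradient(img.astype(float))

    def box(a):
        # 3x3 box smoothing with edge padding
        p = np.pad(a, 1, mode='edge')
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0

    # smoothed structure-tensor entries
    Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# A white square on black: response is positive at corners,
# negative along edges, and zero in flat regions
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
```

Corner candidates are then the local maxima of R above a threshold; in practice one would use a library implementation such as OpenCV's rather than this sketch.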
20 J. Matas, O. Chum, M. Urban, and T. Pajdla. "Robust wide baseline stereo from maximally stable extremal regions." Proc. of British Machine Vision Conference. Stanislav Basovnik, Lukas Mach, Andrej Mikulik, and David Obdrzalek. "Detecting Scene Elements Using Maximally Stable Colour Regions." IEEE Computer Vision and Pattern Recognition, 2007.
24 Source Separation Question How can we describe a chunk of audio in terms of semantic factors? Paris Smaragdis et al., Sparse and Shift-Invariant Feature Extraction From Non-Negative Data
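The idea behind such factorisations can be illustrated with plain non-negative matrix factorisation (NMF) using Lee & Seung multiplicative updates; this is a simplified stand-in for the sparse, shift-invariant extraction cited above, not Smaragdis's actual algorithm:

```python
import numpy as np

def nmf(V, rank, iters=500, seed=0):
    """Factorise a non-negative matrix V (freq x time) as V ~ W @ H.

    W holds spectral templates (columns), H their activations over time.
    Multiplicative updates keep both factors non-negative.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 1e-3
    H = rng.random((rank, V.shape[1])) + 1e-3
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

# Toy "spectrogram": two spectral templates active at alternating times
V = np.array([[1, 0, 1, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1.0]])
W, H = nmf(V, rank=2)
```

Each column of W recovers one "semantic factor" (here, one of the two toy spectra), and the rows of H say when it is active; separation then amounts to resynthesising each factor's contribution W[:, k:k+1] @ H[k:k+1, :] on its own.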
25 Modeling Attention Prior Spectral/Region Segmentation Temporal Event Segmentation Synthesis Retrieval/Indexing Scene Reconstruction
26 Modeling Attention Prior Spectral/Region Segmentation Temporal Event Segmentation Synthesis Retrieval/Indexing Scene Reconstruction
27 Interpreting the Model in Real-Time Question How can technology employing cognitive models help us to better understand the model?
28 Human-Computer Interaction Question How can we build interfaces to our own perceptual processes? Augmented reality Interfaces for musical expression Robot perception
29 Corpus-based resynthesis CataRT SoundSpotter "A new approach to creating musical streams by selecting and concatenating source segments from a large audio database using methods from music information retrieval" (Casey, 2009) Casey, M. "Soundspotting: a new kind of process?" In The Oxford Handbook of Computer Music, ed. R. Dean. New York: Oxford University Press.
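The core loop of corpus-based concatenative resynthesis is a nearest-neighbour match: for each segment of the target, find the corpus unit whose feature vector is closest and splice in its audio. A minimal sketch, assuming precomputed per-unit features (everything here is illustrative; real systems like CataRT and SoundSpotter use richer descriptors and matching):

```python
import numpy as np

def resynthesize(target_feats, corpus_feats, corpus_segments):
    """For each target frame, splice in the corpus unit with the nearest feature vector."""
    out = []
    for t in target_feats:
        d = np.linalg.norm(corpus_feats - t, axis=1)    # distance to every corpus unit
        out.append(corpus_segments[int(np.argmin(d))])  # pick the best-matching unit
    return np.concatenate(out)

# Corpus of two 4-sample units with 1-D features 0.0 and 1.0
corpus_segments = [np.zeros(4), np.ones(4)]
corpus_feats = np.array([[0.0], [1.0]])
# Target asks for low, high, low
result = resynthesize(np.array([[0.1], [0.9], [0.2]]), corpus_feats, corpus_segments)
# result is [0,0,0,0, 1,1,1,1, 0,0,0,0]
```

Real implementations additionally crossfade adjacent units and may weight the feature distance per dimension, but the retrieval step is exactly this argmin.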
31 Modeling Attention Prior Spectral/Region Segmentation Temporal Event Segmentation Synthesis Retrieval/Indexing Scene Reconstruction
32 Sound Spatialization HRIR using both MIT and IRCAM LISTEN [1] Perceptual filter encoding source of sound [1] http://recherche.ircam.fr/equipes/salles/listen
33 Location of Impulse Responses
34 Convolution Impulse response Binaural Audio
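The spatialisation step on this slide amounts to convolving the mono source with a left/right head-related impulse response (HRIR) pair, such as those from the MIT or IRCAM LISTEN databases. A minimal numpy sketch with toy two-sample HRIRs (real HRIRs are a few hundred samples long):

```python
import numpy as np

def binaural(mono, hrir_left, hrir_right):
    """Convolve a mono source with a left/right HRIR pair to spatialise it."""
    return np.stack([np.convolve(mono, hrir_left),
                     np.convolve(mono, hrir_right)], axis=1)

# Toy HRIRs: right ear delayed by one sample and attenuated,
# as for a source located to the listener's left
mono = np.array([1.0, 0.5])
out = binaural(mono,
               hrir_left=np.array([1.0, 0.0]),
               hrir_right=np.array([0.0, 0.6]))
```

For a moving source, the HRIR pair is swapped per block as the source direction changes, typically with crossfading (or frequency-domain interpolation) to avoid clicks.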
35 http://pkmital.com