The revolution of the empiricists: Machine Translation (Word alignment & Statistical MT)


The revolution of the empiricists

Machine Translation: Word alignment & Statistical MT
Jörg Tiedemann
Department of Linguistics and Philology, Uppsala University

Classical approaches require lots of manual work:
- long development times
- low coverage, not robust
- disambiguation at various levels
- slow!

Learn from translation data instead:
- example databases for CAT and MT
- bilingual lexicon/terminology extraction
- statistical translation models

Jörg Tiedemann 1/37

Machine Translation as Search

    Maria no daba una bofetada a la bruja verde
    [figure: lattice of partial translation options, e.g. "Mary", "not", "did not", "no", "give a slap", "a slap", "slap", "to the", "by", "the", "the witch", "green witch", ...]

Look at all alternatives and find the best (most likely) one.

Motivation for Data-Driven MT

How do we learn to translate?
- grammar vs. examples
- teacher vs. practice
- intuition vs. experience

Is it possible to create an MT engine without any human effort?
- no writing of grammar rules
- no bilingual lexicography
- no writing of preference & disambiguation rules

Motivation for Data-Driven MT

Learning to translate:
- there is a lot of translated material out there (collect it all)
- learn common word/phrase translations from this collection
- look at typical sentences in the target language
- learn how to write a sentence in the target language

Translation:
- try various translations of words/phrases in the given sentence
- put them together, shuffle them around
- check which translation candidate looks best

Learning from Parallel Corpora

Word alignment: required for many purposes!

Translating: Example-Based MT

The classical example (Sato & Nagao, 1990)

translate: He buys a book on international politics.

examples:
1. He buys a notebook.
   Kare wa nōto o kau.
   HE topic NOTEBOOK obj BUY.
2. I read a book on international politics.
   Watashi wa kokusai seiji nitsuite kakareta hon o yomu.
   I topic INTERNATIONAL POLITICS ABOUT CONCERNED BOOK obj READ.

output: Kare wa kokusai seiji nitsuite kakareta hon o kau.

Components of Example-Based MT

Issues to be addressed:
- acquiring & storing large example databases
- matching suitable example fragments
- alignment of fragments to the target language
- recombination of translated fragments
- ranking of alternative solutions

Can we do this in a formal computational model?

Statistical Machine Translation

Noisy channel for MT: what could have been the sentence that generated the observed source-language sentence?

Ideas borrowed from speech recognition... what a strange idea!
(Thanks to Markus Saers for this and other pictures)

A brief history of Statistical Machine Translation

1947/49: "When I look at an article in Russian, I say: This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode." (Warren Weaver)
1988: first SMT publication (IBM Candide); "Every time I fire a linguist the performance goes up" (Fred Jelinek)
1999: Johns Hopkins University summer workshop: Egypt toolkit (GIZA + Cairo)
2003: phrase-based SMT
2004: Pharaoh: phrase-based decoder
2007: Moses: open-source toolbox; factored, phrase-based SMT, hierarchical SMT, ...

Probabilistic view on MT (E = target language, F = source language):

    Ê = argmax_E P(E|F) = argmax_E P(F|E) P(E)
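The decision rule Ê = argmax_E P(F|E) P(E) can be sketched with made-up numbers. All scores below are invented for illustration; the point is only how a translation-model score and a language-model score combine, and how the language model rewards fluent English:

```python
# Toy illustration of the noisy-channel decision rule
# E_hat = argmax_E P(F|E) * P(E), with invented probabilities.

# Hypothetical translation-model scores P(F|E) for one fixed Spanish input F
tm = {
    "mary did not slap the green witch": 0.004,
    "mary not give a slap to the witch green": 0.009,
}

# Hypothetical language-model scores P(E): fluent English gets a higher P(E)
lm = {
    "mary did not slap the green witch": 0.02,
    "mary not give a slap to the witch green": 0.0001,
}

def best_translation(candidates):
    """Return the candidate E maximizing P(F|E) * P(E)."""
    return max(candidates, key=lambda e: tm[e] * lm[e])

print(best_translation(tm.keys()))
```

Even though the word-for-word gloss has a higher translation-model score here, the language model sinks it, so the fluent candidate wins.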

Statistical Machine Translation

Informally: model translation as an optimization (search) problem
- look for the most likely translation E for a given input F
- use a probabilistic model that assigns these conditional likelihoods
- use Bayes' theorem to split the model into 2 parts:
  - a language model (for the target language)
  - a translation model (source language given target language)

Some (very) basic concepts of probability theory

- a probability P(X) maps event X to a number between 0 and 1
- P(X) represents the likelihood of observing event X in some kind of experiment (trial)
- discrete probability distribution: Σ_i P(X = x_i) = 1
- P(X|Y) = conditional probability (likelihood of event X given that event Y has been observed before)
- joint probability P(X,Y) (likelihood of seeing both events)
- P(X,Y) = P(X) P(Y|X) = P(Y) P(X|Y), therefore:

    Bayes' theorem: P(X|Y) = P(X) P(Y|X) / P(Y)

Some quick words on probability theory & statistics

Where do the probabilities come from? Experience! Use experiments (and repeat them often...).

Maximum likelihood estimation (rely on N experiments only):

    P(X) ≈ count(X) / N

For conditional probabilities:

    P(X|Y) = P(X,Y) / P(Y) ≈ (count(X,Y)/N) / (count(Y)/N) = count(X,Y) / count(Y)

Also important: marginalizing out joint probabilities:

    Σ_i P(X,Y_i) = Σ_i P(X) P(Y_i|X) = P(X) Σ_i P(Y_i|X) = P(X)

... more details in Matematik för språkteknologer

(Classical) Statistical Machine Translation

    Ê = argmax_E P(E|F) = argmax_E P(F|E) P(E) / P(F) = argmax_E P(F|E) P(E)

Translation model: P(F|E), estimated from (big) parallel corpora
Language model: P(E), estimated from (huge) monolingual target-language corpora; takes care of fluency
Decoder: global search for argmax_E P(F|E) P(E) for a given sentence F
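Maximum likelihood estimation and Bayes' theorem can be checked numerically on counts. A minimal sketch; the event pairs are a made-up toy sample, not anything from the lecture:

```python
from collections import Counter

# MLE from joint counts, plus a numeric check of Bayes' theorem:
# P(X|Y) = P(X) * P(Y|X) / P(Y). Toy data: N = 6 observed (X, Y) pairs.
pairs = [("rain", "wet"), ("rain", "wet"), ("rain", "dry"),
         ("sun", "dry"), ("sun", "dry"), ("sun", "wet")]

N = len(pairs)
joint = Counter(pairs)                    # count(X, Y)
x_counts = Counter(x for x, _ in pairs)   # count(X)
y_counts = Counter(y for _, y in pairs)   # count(Y)

def p_x(x): return x_counts[x] / N
def p_y(y): return y_counts[y] / N
def p_y_given_x(y, x): return joint[(x, y)] / x_counts[x]
def p_x_given_y(x, y): return joint[(x, y)] / y_counts[y]

# Both sides of Bayes' theorem agree on the same counts
lhs = p_x_given_y("rain", "wet")
rhs = p_x("rain") * p_y_given_x("wet", "rain") / p_y("wet")
assert abs(lhs - rhs) < 1e-12
```

Note how the conditional estimate count(X,Y)/count(Y) appears directly in `p_x_given_y`, exactly as in the MLE derivation above.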

Statistical Machine Translation

Training: how do we get the probabilistic model?
- estimating probabilities from training data (machine learning)
- tuning model/learner parameters using development sets

Decoding: how do we find the most likely translation?
- search problem (argmax), assuming our model is correct
- gigantic search space! optimization & heuristics

Word-based SMT models

Why do we need word alignment? We cannot estimate P(F|E) directly... Why not?
- almost all sentences are unique
- sparse counts! no good estimates
- decompose into smaller chunks!

Word-based model: assume that the words in one language have been generated by the words in another; a (hidden) word alignment explains this process.

Word-based Translation Models

What do we need to estimate model parameters?
- lexical translation
- distortion/re-ordering
- fertility
- NULL insertion

We need a word-aligned parallel corpus!
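Once a word-aligned corpus is available, the lexical translation parameters reduce to relative frequencies, just like the MLE estimates earlier. A minimal sketch; the corpus of alignment links is invented, and the alignments are assumed to be given:

```python
from collections import Counter

# Estimating lexical translation probabilities t(f|e) from a toy
# word-aligned corpus: each item is one (source word f, target word e)
# alignment link. The data is invented for illustration.
aligned_links = [
    ("das", "the"), ("Haus", "house"), ("ist", "is"), ("klein", "small"),
    ("das", "the"), ("Buch", "book"), ("ist", "is"), ("klein", "small"),
    ("dem", "the"), ("Mann", "man"),
]

link_counts = Counter(aligned_links)
e_counts = Counter(e for _, e in aligned_links)

def t(f, e):
    """MLE lexical translation probability t(f|e) = count(f,e) / count(e)."""
    return link_counts[(f, e)] / e_counts[e]

print(t("das", "the"))  # "the" links to "das" twice and to "dem" once
```

This is exactly the count(X,Y)/count(Y) recipe from the probability slides, applied to alignment links instead of abstract events.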

Word alignment

How do we formalize word alignment? A simple example:

    das Haus ist klein
    the house is small

Define an alignment function a based on positions:

    a : {1→1, 2→2, 3→3, 4→4}

Natural languages are not that easy...
- not always a 1:1 relation between words
- some words may be dropped
- word order can be quite different

Example of reordering:

    klein ist das Haus
    the house is small

What does the alignment function look like?

    a : {1→3, 2→4, 3→2, 4→1}

One-to-many alignments:

    huset är jättelitet
    the house is very small

    a : {1→1, 2→1, 3→2, 4→3, 5→3}
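The alignment functions above map each target (English) word position to a source word position, and they are easy to represent directly. A small sketch; the helper and its names are mine:

```python
# Representing the slide's alignment function a as a dict from target
# positions to source positions (1-based; 0 is reserved for NULL).

def align(source, target, a):
    """List (target word, aligned source word) pairs; position 0 = NULL."""
    return [(target[j - 1], source[a[j] - 1] if a[j] > 0 else "NULL")
            for j in sorted(a)]

# Reordering example: "klein ist das Haus" -> "the house is small"
src = ["klein", "ist", "das", "Haus"]
tgt = ["the", "house", "is", "small"]
a = {1: 3, 2: 4, 3: 2, 4: 1}
print(align(src, tgt, a))
```

One-to-many alignments need no extra machinery: several target positions may simply map to the same source position, as in the "jättelitet" example.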

Word alignment

Dropping words ("ganska" has no English counterpart):

    huset är ganska litet
    the house is small

    a : {1→1, 2→1, 3→2, 4→4}

Inserting words ("just" has no Swedish counterpart; it aligns to a special NULL token at position 0):

    NULL huset är litet
    the house is just small

    a : {1→1, 2→1, 3→2, 4→0, 5→3}

Statistical word alignment models

Standard word-based translation models:
- IBM 1: lexical translation probabilities
- IBM 2: adds absolute reordering
- IBM 3: adds fertility
- IBM 4: adds relative reordering

How can we learn model parameters from parallel corpora without explicit word alignment? More about this next time...

Statistical Machine Translation

Remember: Ê = argmax_E P(F|E) P(E)

Aligned parallel corpora give us the translation model P(F|E). What is missing? We still need the language model P(E): standard N-gram language models.
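Of the models listed, IBM Model 1 keeps only the lexical translation parameters: given an alignment a, it scores P(F, a | E) = ε / (l+1)^m · Π_j t(f_j | e_{a_j}), with l = |E|, m = |F|, a uniform alignment prior, and position 0 for NULL. A toy sketch; the lexical table and sentence pair are invented, and here a maps source positions to target positions (the generative direction of P(F|E)):

```python
# Hypothetical IBM-Model-1-style score for a source sentence F under a
# given alignment a: P(F, a | E) = eps / (l+1)^m * prod_j t(f_j | e_{a_j}).
# The lexical table t is invented for illustration.

t = {
    ("huset", "the"): 0.4, ("huset", "house"): 0.5,
    ("är", "is"): 0.9, ("litet", "small"): 0.7,
}

def model1_score(F, E, a, eps=1.0):
    """Score P(F, a | E) with a uniform alignment prior; a[j] = 0 means NULL."""
    l, m = len(E), len(F)
    p = eps / (l + 1) ** m          # uniform choice among l+1 positions per f_j
    for j, f in enumerate(F, start=1):
        e = E[a[j] - 1] if a[j] > 0 else "NULL"
        p *= t.get((f, e), 0.0)     # lexical translation probability
    return p

score = model1_score(["huset", "är", "litet"],
                     ["the", "house", "is", "small"],
                     {1: 2, 2: 3, 3: 4})
```

Models 2 to 4 then replace the uniform alignment prior with learned reordering and fertility parameters.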

Statistical Machine Translation: Language Modeling

Language modeling: a (probabilistic) LM predicts the likelihood of any given string.

What is the likelihood P(E) of observing sentence E?

    P_LM(the house is small) > P_LM(small the is house)

Estimate probabilities from corpora: P(E) = P(e_1, e_2, e_3, ..., e_j)

What is the problem here again? Remember maximum likelihood estimation (MLE):

    P(E) = P(e_1, ..., e_j) ≈ count(e_1, e_2, ..., e_j) / N

Can we estimate reliable probabilities for arbitrary sentences?

Remember P(X,Y) = P(X) P(Y|X); the chain rule:

    P(E) = P(e_1) · P(e_2|e_1) · P(e_3|e_1,e_2) · ... · P(e_j|e_1,...,e_{j-1})

Does this help? Remember MLE for conditional probabilities:

    P(e_j|e_1,...,e_{j-1}) ≈ count(e_1, e_2, ..., e_j) / count(e_1, e_2, ..., e_{j-1})

Again: what is the problem? Sparse counts for large N-grams!

Markov assumption: limit the dependencies!
(bigram model: P(e_3|e_1,e_2) ≈ P(e_3|e_2))

Compute sentence probabilities based on overlapping N-grams, e.g. for "the house is too small .": the | the house | house is | is too | too small | small .

- unigram model: P(E) = P(e_1) · P(e_2) · ... · P(e_n)
- bigram model: P(E) = P(e_1) · P(e_2|e_1) · P(e_3|e_2) · ... · P(e_n|e_{n-1})
- trigram model: P(E) = P(e_1) · P(e_2|e_1) · P(e_3|e_1,e_2) · ... · P(e_n|e_{n-2},e_{n-1})

What is P_trigram("the house is too small.")?
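The bigram model above is just the chain rule with a first-order Markov assumption, and its MLE version is a few lines of counting. A sketch over an invented three-sentence corpus:

```python
from collections import Counter

# An MLE bigram language model over a toy corpus, applying
# P(e_1 .. e_n) ~= P(e_1) * prod_i P(e_i | e_{i-1}).
corpus = [
    "the house is small".split(),
    "the house is big".split(),
    "the witch is green".split(),
]

unigrams = Counter(w for s in corpus for w in s)
bigrams = Counter((s[i], s[i + 1]) for s in corpus for i in range(len(s) - 1))
N = sum(unigrams.values())

def p_bigram(sentence):
    """MLE bigram probability of a whitespace-tokenized sentence."""
    words = sentence.split()
    p = unigrams[words[0]] / N                      # P(e_1)
    for prev, w in zip(words, words[1:]):
        p *= bigrams[(prev, w)] / unigrams[prev]    # P(e_i | e_{i-1})
    return p

# A fluent word order outscores a scrambled one
assert p_bigram("the house is small") > p_bigram("small the is house")
```

The scrambled sentence contains the unseen bigram "small the", so its MLE probability collapses to zero, which already hints at the zero-count problem discussed next.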

Statistical Machine Translation: Language Modeling

Another problem: zero counts!
- some N-grams are never observed (count(e_i, e_j) = 0)
- ... but they appear in real data (e.g. in a translation candidate)
- what happens if we need the probability for ... e_i e_j ...?
- multiplying with one zero factor makes everything zero. BAD IDEA! This must be avoided!

Solutions:
- smoothing (reserve probability mass for unseen events)
- backoff models (back off from higher-order models to lower-order models)

... there would be so much more to say about LMs (see ch. 7)

Statistical Machine Translation: Decoding

    Maria no daba una bofetada a la bruja verde
    [figure: lattice of partial translation options, e.g. "Mary", "not", "did not", "no", "give a slap", "a slap", "slap", "to the", "by", "the", "the witch", "green witch", ...]

    Ê = argmax_E P(F|E) P(E)

Decoding = searching for a solution Ê given F using Ê = argmax_E P(F|E) P(E)
- far too many possible E's to search globally!
- approximate search using good partial candidates
- more about this later...

Summary

- MT can be put into a probabilistic framework
- translation models: estimated from parallel corpora
- language models: estimated from monolingual corpora
- global search = decoding = translating
- fully automatic (!!!)
- various simplifications/assumptions necessary
- a probabilistic variant of direct translation
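The smoothing idea can be made concrete with add-one (Laplace) smoothing, one simple way to reserve probability mass for unseen events. A sketch on invented toy data; this is only one of many smoothing schemes:

```python
from collections import Counter

# Add-one (Laplace) smoothing for bigram probabilities: reserve
# probability mass for unseen events so that a single zero count does
# not zero out the whole sentence probability. Toy corpus, invented.
corpus = ["the house is small".split(), "the house is big".split()]
vocab = {w for s in corpus for w in s}
V = len(vocab)

unigrams = Counter(w for s in corpus for w in s)
bigrams = Counter((s[i], s[i + 1]) for s in corpus for i in range(len(s) - 1))

def p_mle(w, prev):
    """Unsmoothed MLE estimate P(w|prev); zero for unseen bigrams."""
    return bigrams[(prev, w)] / unigrams[prev] if unigrams[prev] else 0.0

def p_laplace(w, prev):
    """Add-one smoothed estimate: (count(prev,w) + 1) / (count(prev) + V)."""
    return (bigrams[(prev, w)] + 1) / (unigrams[prev] + V)

# The bigram "small the" never occurs in the corpus:
assert p_mle("the", "small") == 0.0
assert p_laplace("the", "small") > 0.0
```

Backoff models take a different route to the same goal: instead of inflating all counts, they fall back to a lower-order estimate (e.g. the unigram P(w)) when the higher-order count is missing.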

What's next?
- statistical word alignment
- phrase-based SMT
- decoding


More information

Convolutional neural networks

Convolutional neural networks Convolutional neural networks Themes Curriculum: Ch 9.1, 9.2 and http://cs231n.github.io/convolutionalnetworks/ The simple motivation and idea How it s done Receptive field Pooling Dilated convolutions

More information

The Automatic Classification Problem. Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification

The Automatic Classification Problem. Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification Parallel to AIMA 8., 8., 8.6.3, 8.9 The Automatic Classification Problem Assign object/event or sequence of objects/events

More information

L19: Prosodic modification of speech

L19: Prosodic modification of speech L19: Prosodic modification of speech Time-domain pitch synchronous overlap add (TD-PSOLA) Linear-prediction PSOLA Frequency-domain PSOLA Sinusoidal models Harmonic + noise models STRAIGHT This lecture

More information

Computing Science (CMPUT) 496

Computing Science (CMPUT) 496 Computing Science (CMPUT) 496 Search, Knowledge, and Simulations Martin Müller Department of Computing Science University of Alberta mmueller@ualberta.ca Winter 2017 Part IV Knowledge 496 Today - Mar 9

More information

Lecture 1 What is AI? EECS 348 Intro to Artificial Intelligence Doug Downey

Lecture 1 What is AI? EECS 348 Intro to Artificial Intelligence Doug Downey Lecture 1 What is AI? EECS 348 Intro to Artificial Intelligence Doug Downey Outline 1) What is AI: The Course 2) What is AI: The Field 3) Why to take the class (or not) 4) A Brief History of AI 5) Predict

More information

1 This work was partially supported by NSF Grant No. CCR , and by the URI International Engineering Program.

1 This work was partially supported by NSF Grant No. CCR , and by the URI International Engineering Program. Combined Error Correcting and Compressing Codes Extended Summary Thomas Wenisch Peter F. Swaszek Augustus K. Uht 1 University of Rhode Island, Kingston RI Submitted to International Symposium on Information

More information

Can Linguistics Lead a Digital Revolution in the Humanities?

Can Linguistics Lead a Digital Revolution in the Humanities? Can Linguistics Lead a Digital Revolution in the Humanities? Martin Wynne Martin.wynne@it.ox.ac.uk Digital Humanities Seminar Oxford e-research Centre & IT Services (formerly OUCS) & Nottingham Wednesday

More information

arxiv: v2 [eess.sp] 10 Sep 2018

arxiv: v2 [eess.sp] 10 Sep 2018 Designing communication systems via iterative improvement: error correction coding with Bayes decoder and codebook optimized for source symbol error arxiv:1805.07429v2 [eess.sp] 10 Sep 2018 Chai Wah Wu

More information

The indirect object (IO) tells us where the direct object (DO) is going.

The indirect object (IO) tells us where the direct object (DO) is going. Indirect Object Pronouns The indirect object (IO) tells us where the direct object (DO) is going. He gives the book to María. DO=Book Where is the book going? To María. IO=María He gives María the book.

More information

Teddy Mantoro.

Teddy Mantoro. Teddy Mantoro Email: teddy@ieee.org 1. Title and Abstract 2. AI Method 3. Induction Approach 4. Writing Abstract 5. Writing Introduction What should be in the title: Problem, Method and Result The title

More information

Learning Structured Predictors

Learning Structured Predictors Learning Structured Predictors Xavier Carreras Xerox Research Centre Europe Supervised (Structured) Prediction Learning to predict: given training data { (x (1), y (1) ), (x (2), y (2) ),..., (x (m), y

More information

Learning Artificial Intelligence in Large-Scale Video Games

Learning Artificial Intelligence in Large-Scale Video Games Learning Artificial Intelligence in Large-Scale Video Games A First Case Study with Hearthstone: Heroes of WarCraft Master Thesis Submitted for the Degree of MSc in Computer Science & Engineering Author

More information

A Brief Introduction to Information Theory and Lossless Coding

A Brief Introduction to Information Theory and Lossless Coding A Brief Introduction to Information Theory and Lossless Coding 1 INTRODUCTION This document is intended as a guide to students studying 4C8 who have had no prior exposure to information theory. All of

More information

INTENSIVE READING & VOCABULARY STUDY JOURNAL

INTENSIVE READING & VOCABULARY STUDY JOURNAL INTENSIVE READING & VOCABULARY STUDY JOURNAL 1. What article or collection of abstracts have you chosen? Please give details such as the name of the journal, volume and issue numbers, URL, etc. WP document

More information

Midterm for Name: Good luck! Midterm page 1 of 9

Midterm for Name: Good luck! Midterm page 1 of 9 Midterm for 6.864 Name: 40 30 30 30 Good luck! 6.864 Midterm page 1 of 9 Part #1 10% We define a PCFG where the non-terminals are {S, NP, V P, V t, NN, P P, IN}, the terminal symbols are {Mary,ran,home,with,John},

More information

Announcements. Today. Speech and Language. State Path Trellis. HMMs: MLE Queries. Introduction to Artificial Intelligence. V22.

Announcements. Today. Speech and Language. State Path Trellis. HMMs: MLE Queries. Introduction to Artificial Intelligence. V22. Introduction to Artificial Intelligence Announcements V22.0472-001 Fall 2009 Lecture 19: Speech Recognition & Viterbi Decoding Rob Fergus Dept of Computer Science, Courant Institute, NYU Slides from John

More information

Mathematics Explorers Club Fall 2012 Number Theory and Cryptography

Mathematics Explorers Club Fall 2012 Number Theory and Cryptography Mathematics Explorers Club Fall 2012 Number Theory and Cryptography Chapter 0: Introduction Number Theory enjoys a very long history in short, number theory is a study of integers. Mathematicians over

More information

Teddy Mantoro.

Teddy Mantoro. Teddy Mantoro Email: teddy@ieee.org Marshal D Carper Hannah Heath The secret of good writing is rewriting The secret of rewriting is rethinking 1. Title and Abstract 2. AI Method 3. Induction Approach

More information

Frictional Force (32 Points)

Frictional Force (32 Points) Dual-Range Force Sensor Frictional Force (32 Points) Computer 19 Friction is a force that resists motion. It involves objects in contact with each other, and it can be either useful or harmful. Friction

More information

DERIVATION OF TRAPS IN AUDITORY DOMAIN

DERIVATION OF TRAPS IN AUDITORY DOMAIN DERIVATION OF TRAPS IN AUDITORY DOMAIN Petr Motlíček, Doctoral Degree Programme (4) Dept. of Computer Graphics and Multimedia, FIT, BUT E-mail: motlicek@fit.vutbr.cz Supervised by: Dr. Jan Černocký, Prof.

More information

Cheap, Fast and Good Enough: Speech Transcription with Mechanical Turk. Scott Novotney and Chris Callison-Burch 04/02/10

Cheap, Fast and Good Enough: Speech Transcription with Mechanical Turk. Scott Novotney and Chris Callison-Burch 04/02/10 Cheap, Fast and Good Enough: Speech Transcription with Mechanical Turk Scott Novotney and Chris Callison-Burch 04/02/10 Motivation Speech recognition models hunger for data ASR requires thousands of hours

More information

Paper Presentation. Steve Jan. March 5, Virginia Tech. Steve Jan (Virginia Tech) Paper Presentation March 5, / 28

Paper Presentation. Steve Jan. March 5, Virginia Tech. Steve Jan (Virginia Tech) Paper Presentation March 5, / 28 Paper Presentation Steve Jan Virginia Tech March 5, 2015 Steve Jan (Virginia Tech) Paper Presentation March 5, 2015 1 / 28 2 paper to present Nonparametric Multi-group Membership Model for Dynamic Networks,

More information

Signal segmentation and waveform characterization. Biosignal processing, S Autumn 2012

Signal segmentation and waveform characterization. Biosignal processing, S Autumn 2012 Signal segmentation and waveform characterization Biosignal processing, 5173S Autumn 01 Short-time analysis of signals Signal statistics may vary in time: nonstationary how to compute signal characterizations?

More information

Contents 2.1 Basic Concepts of Probability Methods of Assigning Probabilities Principle of Counting - Permutation and Combination 39

Contents 2.1 Basic Concepts of Probability Methods of Assigning Probabilities Principle of Counting - Permutation and Combination 39 CHAPTER 2 PROBABILITY Contents 2.1 Basic Concepts of Probability 38 2.2 Probability of an Event 39 2.3 Methods of Assigning Probabilities 39 2.4 Principle of Counting - Permutation and Combination 39 2.5

More information

Probability (Devore Chapter Two)

Probability (Devore Chapter Two) Probability (Devore Chapter Two) 1016-351-01 Probability Winter 2011-2012 Contents 1 Axiomatic Probability 2 1.1 Outcomes and Events............................... 2 1.2 Rules of Probability................................

More information

Enabling New Speech Driven Services for Mobile Devices: An overview of the ETSI standards activities for Distributed Speech Recognition Front-ends

Enabling New Speech Driven Services for Mobile Devices: An overview of the ETSI standards activities for Distributed Speech Recognition Front-ends Distributed Speech Recognition Enabling New Speech Driven Services for Mobile Devices: An overview of the ETSI standards activities for Distributed Speech Recognition Front-ends David Pearce & Chairman

More information

Joint Distributions, Independence Class 7, Jeremy Orloff and Jonathan Bloom

Joint Distributions, Independence Class 7, Jeremy Orloff and Jonathan Bloom Learning Goals Joint Distributions, Independence Class 7, 8.5 Jeremy Orloff and Jonathan Bloom. Understand what is meant by a joint pmf, pdf and cdf of two random variables. 2. Be able to compute probabilities

More information

Simple Large-scale Relation Extraction from Unstructured Text

Simple Large-scale Relation Extraction from Unstructured Text Simple Large-scale Relation Extraction from Unstructured Text Christos Christodoulopoulos and Arpit Mittal Amazon Research Cambridge Alexa Question Answering Alexa, what books did Carrie Fisher write?

More information

ARGUMENTATION MINING

ARGUMENTATION MINING ARGUMENTATION MINING Marie-Francine Moens joint work with Raquel Mochales Palau and Parisa Kordjamshidi Language Intelligence and Information Retrieval Department of Computer Science KU Leuven, Belgium

More information

Image Forgery. Forgery Detection Using Wavelets

Image Forgery. Forgery Detection Using Wavelets Image Forgery Forgery Detection Using Wavelets Introduction Let's start with a little quiz... Let's start with a little quiz... Can you spot the forgery the below image? Let's start with a little quiz...

More information

Analysis and Improvements of Linear Multi-user user MIMO Precoding Techniques

Analysis and Improvements of Linear Multi-user user MIMO Precoding Techniques 1 Analysis and Improvements of Linear Multi-user user MIMO Precoding Techniques Bin Song and Martin Haardt Outline 2 Multi-user user MIMO System (main topic in phase I and phase II) critical problem Downlink

More information

Progress in the BBN Keyword Search System for the DARPA RATS Program

Progress in the BBN Keyword Search System for the DARPA RATS Program INTERSPEECH 2014 Progress in the BBN Keyword Search System for the DARPA RATS Program Tim Ng 1, Roger Hsiao 1, Le Zhang 1, Damianos Karakos 1, Sri Harish Mallidi 2, Martin Karafiát 3,KarelVeselý 3, Igor

More information

A Bayesian rating system using W-Stein s identity

A Bayesian rating system using W-Stein s identity A Bayesian rating system using W-Stein s identity Ruby Chiu-Hsing Weng Department of Statistics National Chengchi University 2011.12.16 Joint work with C.-J. Lin Ruby Chiu-Hsing Weng (National Chengchi

More information