Introduction to Markov Models

Size: px
Start display at page:

Download "Introduction to Markov Models"

Transcription

1 Introduction to Markov Models But first: A few preliminaries Estimating the probability of phrases of words, sentences, etc. CIS Intro to AI 2 What counts as a word? A tricky question. How to find Sentences?? CIS Intro to AI 3 CIS Intro to AI 4 Q1: How to estimate the probability of a given sentence W? A crucial step in speech recognition (and lots of other applications) First guess: products of unigrams Pˆ( W ) P ( w ) Given word lattice: form subsidy for farm subsidies far Unigram counts (in 1.7 * 10 6 words of AP text): form 183 subsidy 15 for farm 74 subsidies 55 far 570 Not quite right CIS Intro to AI 5 ww Predicting a word sequence II Next guess: products of bigrams For W=w 1 w 2 w 3 w n, Given word lattice: form subsidy for farm subsidies far Bigram counts (in 1.7 * 10 6 words of AP text): form subsidy 0 subsidy for 2 form subsidies 0 subsidy far 0 farm subsidy 0 subsidies for 6 farm subsidies 4 subsidies far 0 Better (if not quite right) (But the counts are tiny! Why?) CIS 391 Intr)o to AI 6 n1 Pˆ( W ) P ( wiwi 1 ) i1 1

2 How can we estimate P correctly? Problem: Naïve Bayes model for bigrams violates independence assumptions. Let s do this right. Let W=w 1 w 2 w 3 w n. Then, by the chain rule, P ( W ) P ( w )* P ( w w )* P ( w w w )*...* P ( w w... w ) We can estimate P(w 2 w 1 ) by the Maximum Likelihood Estimator and P(w 3 w 1 w 2 ) by and so on n n Count( w1w 2) Count( w ) Count( w1w 2w3 ) Count( w w ) CIS Intro to AI and finally, Estimating P(w n w 1 w 2 w n-1 ) Again, we can estimate P(w n w 1 w 2 w n-1 ) with the MLE Count( w w... w ) Count( w w... ) 1 2 n 1 2 wn 1 So to decide pat vs. pot in Heat up the oil in a large p?t, compute for pot Count("Heat up the oil in a large pot") 0 Count("Heat up the oil in a larg e") 0 CIS Intro to AI 8 Hmm..The Web Changes Things (2008 or so) Statistics and the Web II Even the web in 2008 yields low counts! So, P( pot heat up the oil in a large ) = 8/ CIS Intro to AI 9 CIS Intro to AI 10 But the web has grown!!!. 165/891=0.185 CIS Intro to AI 11 CIS Intro to AI 12 2

3 So. A larger corpus won t help much unless it s HUGE. but the web is!!! A BOTEC Estimate of What We Can Estimate What parameters can we estimate with 100 million words of training data?? Assuming (for now) uniform distribution over only 5000 words But what if we only have 100 million words for our estimates?? So even with 10 8 words of data, for even trigrams we encounter the sparse data problem.. CIS Intro to AI 13 CIS Intro to AI 14 The Markov Assumption: Only the Immediate Past Matters The Markov Assumption: Estimation We estimate the probability of each w i given previous context by P(w i w 1 w 2 w i-1 ) = P(w i w i-1 ) which can be estimated by Count( wi 1wi) Count( w ) i1 So we re back to counting only unigrams and bigrams!! AND we have a correct practical estimation method for P(W) given the Markov assumption! CIS Intro to AI 15 CIS Intro to AI 16 Markov Models Visualizing an n-gram based language model: the Shannon/Miller/Selfridge method To generate a sequence of n words given unigram estimates: Fix some ordering of the vocabulary v 1 v 2 v 3 v k. For each word w i, 1 i n Choose a random value r i between 0 and 1 w i = the first v j such that j m1 P( v ) r m i CIS Intro to AI 17 CIS Intro to AI 18 3

4 Visualizing an n-gram based language model: the Shannon/Miller/Selfridge method The Shannon/Miller/Selfridge method trained on Shakespeare To generate a sequence of n words given a 1 st order Markov model (i.e. conditioned on one previous word): Fix some ordering of the vocabulary v 1 v 2 v 3 v k. Use unigram method to generate an initial word w 1 For each remaining w i, 2 i n Choose a random value r i between 0 and 1 j w i = the first v j such that m1 P( v w ) r m i1 i CIS Intro to AI 19 (This and next two slides from Jurafsky) CIS Intro to AI 20 Wall Street Journal just isn t Shakespeare Shakespeare as corpus N=884,647 tokens, V=29,066 Shakespeare produced 300,000 bigram types out of V 2 = 844 million possible bigrams. So 99.96% of the possible bigrams were never seen (have zero entries in the table) Quadrigrams worse: What's coming out looks like Shakespeare because it is Shakespeare The Sparse Data Problem Again English word frequencies well described by Zipf s Law Zipf (1949) characterized the relation between word frequency and rank as: f r C (for constant C) r C/f log(r) log(c) - log (f) Purely Zipfian data plots as a straight line on a loglog scale How likely is a 0 count? Much more likely than I let on!!! CIS Intro to AI 23 *Rank (r): The numerical position of a word in a list sorted by decreasing frequency (f ). CIS Intro to AI 24 4

5 Word frequency & rank in Brown Corpus vs Zipf Zipf s law for the Brown corpus Lots of area under the tail of this curve! From: Interactive mathematics CIS Intro to AI 25 CIS Intro to AI 26 Smoothing Smoothing At least one unknown word likely per sentence given Zipf!! To fix 0 s caused by this, we can smooth the data. Assume we know how many types never occur in the data. Steal probability mass from types that occur at least once. Distribute this probability mass over the types that never occur. This black art is why NLP is taught in the engineering school Jason Eisner CIS Intro to AI 28 Smoothing.is like Robin Hood: it steals from the rich and gives to the poor Review: Add-One Smoothing Estimate probabilities ˆP by assuming every possible word type v V actually occurred one extra time (as if by appending an unabridged dictionary) So if there were N words in our corpus, then instead of estimating Count( w) Pw ˆ( ) N we estimate Count( w 1) Pw ˆ( ) N V CIS Intro to AI 29 CIS Intro to AI 30 5

6 Add-One Smoothing (again) Pro: Very simple technique Cons: Probability of frequent n-grams is underestimated Probability of rare (or unseen) n-grams is overestimated Therefore, too much probability mass is shifted towards unseen n- grams All unseen n-grams are smoothed in the same way Using a smaller added-count improves things but only some More advanced techniques (Kneser Ney, Witten-Bell) use properties of component n-1 grams and the like... (Hint for this homework ) CIS Intro to AI 31 6

Introduction to Markov Models. Estimating the probability of phrases of words, sentences, etc.

Introduction to Markov Models. Estimating the probability of phrases of words, sentences, etc. Introduction to Markov Models Estimating the probability of phrases of words, sentences, etc. But first: A few preliminaries on text preprocessing What counts as a word? A tricky question. CIS 421/521

More information

Introduction to Markov Models

Introduction to Markov Models Itroductio to Markov Models But first: A few prelimiaries o text preprocessig Estimatig the probability of phrases of words, seteces, etc. What couts as a word? A tricky questio. How to fid Seteces?? CIS

More information

Part of Speech Tagging & Hidden Markov Models (Part 1) Mitch Marcus CIS 421/521

Part of Speech Tagging & Hidden Markov Models (Part 1) Mitch Marcus CIS 421/521 Part of Speech Tagging & Hidden Markov Models (Part 1) Mitch Marcus CIS 421/521 NLP Task I Determining Part of Speech Tags Given a text, assign each token its correct part of speech (POS) tag, given its

More information

The revolution of the empiricists. Machine Translation. Motivation for Data-Driven MT. Machine Translation as Search

The revolution of the empiricists. Machine Translation. Motivation for Data-Driven MT. Machine Translation as Search The revolution of the empiricists Machine Translation Word alignment & Statistical MT Jörg Tiedemann jorg.tiedemann@lingfil.uu.se Department of Linguistics and Philology Uppsala University Classical approaches

More information

/665 Natural Language Processing

/665 Natural Language Processing 601.465/665 Natural Language Processing Prof: Jason Eisner Webpage: http://cs.jhu.edu/~jason/465 syllabus, announcements, slides, homeworks 1 Goals of the field Computers would be a lot more useful if

More information

Log-linear models (part 1I)

Log-linear models (part 1I) Log-linear models (part 1I) Lecture, Feb 2 CS 690N, Spring 2017 Advanced Natural Language Processing http://people.cs.umass.edu/~brenocon/anlp2017/ Brendan O Connor College of Information and Computer

More information

Log-linear models (part 1I)

Log-linear models (part 1I) Log-linear models (part 1I) CS 690N, Spring 2018 Advanced Natural Language Processing http://people.cs.umass.edu/~brenocon/anlp2018/ Brendan O Connor College of Information and Computer Sciences University

More information

Lecture 4: n-grams in NLP. LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han

Lecture 4: n-grams in NLP. LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han Lecture 4: n-grams in NLP LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han Objectives Frequent n-grams in English n-grams and statistical NLP n-grams and conditional probability Large

More information

Machine Translation - Decoding

Machine Translation - Decoding January 15, 2007 Table of Contents 1 Introduction 2 3 4 5 6 Integer Programing Decoder 7 Experimental Results Word alignments Fertility Table Translation Table Heads Non-heads NULL-generated (ct.) Figure:

More information

The Log-Log Term Frequency Distribution

The Log-Log Term Frequency Distribution The Log-Log Term Frequency Distribution Jason D. M. Rennie jrennie@gmail.com July 14, 2005 Abstract Though commonly used, the unigram is widely known as being a poor model of term frequency; it assumes

More information

신경망기반자동번역기술. Konkuk University Computational Intelligence Lab. 김강일

신경망기반자동번역기술. Konkuk University Computational Intelligence Lab.  김강일 신경망기반자동번역기술 Konkuk University Computational Intelligence Lab. http://ci.konkuk.ac.kr kikim01@kunkuk.ac.kr 김강일 Index Issues in AI and Deep Learning Overview of Machine Translation Advanced Techniques in

More information

The Automatic Classification Problem. Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification

The Automatic Classification Problem. Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification Parallel to AIMA 8., 8., 8.6.3, 8.9 The Automatic Classification Problem Assign object/event or sequence of objects/events

More information

Speech Recognition. Mitch Marcus CIS 421/521 Artificial Intelligence

Speech Recognition. Mitch Marcus CIS 421/521 Artificial Intelligence Speech Recognition Mitch Marcus CIS 421/521 Artificial Intelligence A Sample of Speech Recognition Today's class is about: First, why speech recognition is difficult. As you'll see, the impression we have

More information

Statistical Machine Translation. Machine Translation Phrase-Based Statistical MT. Motivation for Phrase-based SMT

Statistical Machine Translation. Machine Translation Phrase-Based Statistical MT. Motivation for Phrase-based SMT Statistical Machine Translation Machine Translation Phrase-Based Statistical MT Jörg Tiedemann jorg.tiedemann@lingfil.uu.se Department of Linguistics and Philology Uppsala University October 2009 Probabilistic

More information

Chapter 11. Sampling Distributions. BPS - 5th Ed. Chapter 11 1

Chapter 11. Sampling Distributions. BPS - 5th Ed. Chapter 11 1 Chapter 11 Sampling Distributions BPS - 5th Ed. Chapter 11 1 Sampling Terminology Parameter fixed, unknown number that describes the population Statistic known value calculated from a sample a statistic

More information

Laws of Text. Lecture Objectives. Text Technologies for Data Science INFR Learn about some text laws. This lecture is practical 9/26/2018

Laws of Text. Lecture Objectives. Text Technologies for Data Science INFR Learn about some text laws. This lecture is practical 9/26/2018 Text Technologies for Data Science INFR11145 Laws of Text Instructor: Walid Magdy 26-Sep-2018 Lecture Objectives Learn about some text laws Zipf s law Benford s law Heap s law Clumping/contagion This lecture

More information

CS 188: Artificial Intelligence Spring Speech in an Hour

CS 188: Artificial Intelligence Spring Speech in an Hour CS 188: Artificial Intelligence Spring 2006 Lecture 19: Speech Recognition 3/23/2006 Dan Klein UC Berkeley Many slides from Dan Jurafsky Speech in an Hour Speech input is an acoustic wave form s p ee ch

More information

Understanding Apparent Increasing Random Jitter with Increasing PRBS Test Pattern Lengths

Understanding Apparent Increasing Random Jitter with Increasing PRBS Test Pattern Lengths JANUARY 28-31, 2013 SANTA CLARA CONVENTION CENTER Understanding Apparent Increasing Random Jitter with Increasing PRBS Test Pattern Lengths 9-WP6 Dr. Martin Miller The Trend and the Concern The demand

More information

Social media sentiment analysis and topic detection for Singapore English

Social media sentiment analysis and topic detection for Singapore English Calhoun: The NPS Institutional Archive DSpace Repository Theses and Dissertations 1. Thesis and Dissertation Collection, all items 2013-09 Social media sentiment analysis and topic detection for Singapore

More information

System Identification and CDMA Communication

System Identification and CDMA Communication System Identification and CDMA Communication A (partial) sample report by Nathan A. Goodman Abstract This (sample) report describes theory and simulations associated with a class project on system identification

More information

Statistical tests. Paired t-test

Statistical tests. Paired t-test Statistical tests Gather data to assess some hypothesis (e.g., does this treatment have an effect on this outcome?) Form a test statistic for which large values indicate a departure from the hypothesis.

More information

Guess the Mean. Joshua Hill. January 2, 2010

Guess the Mean. Joshua Hill. January 2, 2010 Guess the Mean Joshua Hill January, 010 Challenge: Provide a rational number in the interval [1, 100]. The winner will be the person whose guess is closest to /3rds of the mean of all the guesses. Answer:

More information

Statistical Analysis of Modern Communication Signals

Statistical Analysis of Modern Communication Signals Whitepaper Statistical Analysis of Modern Communication Signals Bob Muro Application Group Manager, Boonton Electronics Abstract The latest wireless communication formats like DVB, DAB, WiMax, WLAN, and

More information

Laboratory 1: Uncertainty Analysis

Laboratory 1: Uncertainty Analysis University of Alabama Department of Physics and Astronomy PH101 / LeClair May 26, 2014 Laboratory 1: Uncertainty Analysis Hypothesis: A statistical analysis including both mean and standard deviation can

More information

News English.com Ready-to-use ESL/EFL Lessons by Sean Banville Facebook creator is Time Person of the Year

News English.com Ready-to-use ESL/EFL Lessons by Sean Banville Facebook creator is Time Person of the Year www.breaking News English.com Ready-to-use ESL/EFL Lessons by Sean Banville 1,000 IDEAS & ACTIVITIES FOR LANGUAGE TEACHERS The Breaking News English.com Resource Book http://www.breakingnewsenglish.com/book.html

More information

Midterm 2 Practice Problems

Midterm 2 Practice Problems Midterm 2 Practice Problems May 13, 2012 Note that these questions are not intended to form a practice exam. They don t necessarily cover all of the material, or weight the material as I would. They are

More information

FAST LEMPEL-ZIV (LZ 78) COMPLEXITY ESTIMATION USING CODEBOOK HASHING

FAST LEMPEL-ZIV (LZ 78) COMPLEXITY ESTIMATION USING CODEBOOK HASHING FAST LEMPEL-ZIV (LZ 78) COMPLEXITY ESTIMATION USING CODEBOOK HASHING Harman Jot, Rupinder Kaur M.Tech, Department of Electronics and Communication, Punjabi University, Patiala, Punjab, India I. INTRODUCTION

More information

10/12/2015. SHRDLU: 1969 NLP solved?? : A sea change in AI technologies. SHRDLU: A demonstration proof. 1990: Parsing Research in Crisis

10/12/2015. SHRDLU: 1969 NLP solved?? : A sea change in AI technologies. SHRDLU: A demonstration proof. 1990: Parsing Research in Crisis SHRDLU: 1969 NLP solved?? 1980-1995: A sea change in AI technologies Example: Natural Language Processing The Great Wave off Kanagawa by Hokusai, ~1830 ] Person: PICK UP A BIG RED BLOCK. Computer: OK.

More information

Speech Processing. Simon King University of Edinburgh. additional lecture slides for

Speech Processing. Simon King University of Edinburgh. additional lecture slides for Speech Processing Simon King University of Edinburgh additional lecture slides for 2018-19 assignment Q&A writing exercise Roadmap Modules 1-2: The basics Modules 3-5: Speech synthesis Modules 6-9: Speech

More information

MAS160: Signals, Systems & Information for Media Technology. Problem Set 4. DUE: October 20, 2003

MAS160: Signals, Systems & Information for Media Technology. Problem Set 4. DUE: October 20, 2003 MAS160: Signals, Systems & Information for Media Technology Problem Set 4 DUE: October 20, 2003 Instructors: V. Michael Bove, Jr. and Rosalind Picard T.A. Jim McBride Problem 1: Simple Psychoacoustic Masking

More information

Discriminative Training for Automatic Speech Recognition

Discriminative Training for Automatic Speech Recognition Discriminative Training for Automatic Speech Recognition 22 nd April 2013 Advanced Signal Processing Seminar Article Heigold, G.; Ney, H.; Schluter, R.; Wiesler, S. Signal Processing Magazine, IEEE, vol.29,

More information

Bayesian Positioning in Wireless Networks using Angle of Arrival

Bayesian Positioning in Wireless Networks using Angle of Arrival Bayesian Positioning in Wireless Networks using Angle of Arrival Presented by: Rich Martin Joint work with: David Madigan, Eiman Elnahrawy, Wen-Hua Ju, P. Krishnan, A.S. Krishnakumar Rutgers University

More information

The Game-Theoretic Approach to Machine Learning and Adaptation

The Game-Theoretic Approach to Machine Learning and Adaptation The Game-Theoretic Approach to Machine Learning and Adaptation Nicolò Cesa-Bianchi Università degli Studi di Milano Nicolò Cesa-Bianchi (Univ. di Milano) Game-Theoretic Approach 1 / 25 Machine Learning

More information

Algorithms and Data Structures

Algorithms and Data Structures Algorithms and Data Structures Self-Organizing Lists Marius Kloft Assumptions for Searching Until now, we implicitly assumed that every element of our list is searched with the same probability, i.e.,

More information

TO PLOT OR NOT TO PLOT?

TO PLOT OR NOT TO PLOT? Graphic Examples This document provides examples of a number of graphs that might be used in understanding or presenting data. Comments with each example are intended to help you understand why the data

More information

Science Binder and Science Notebook. Discussions

Science Binder and Science Notebook. Discussions Lane Tech H. Physics (Joseph/Machaj 2016-2017) A. Science Binder Science Binder and Science Notebook Name: Period: Unit 1: Scientific Methods - Reference Materials The binder is the storage device for

More information

Introduction to Artificial Intelligence

Introduction to Artificial Intelligence Introduction to Artificial Intelligence Mitch Marcus CIS521 Fall, 2017 Welcome to CIS 521 Professor: Mitch Marcus, mitch@ Levine 503 TAs: Eddie Smith, Heejin Jeong, Kevin Wang, Ming Zhang

More information

Shaftesbury Park Primary School. Wandsworth test examples

Shaftesbury Park Primary School. Wandsworth test examples Shaftesbury Park Primary School Wandsworth test examples Non-verbal reasoning Non-verbal reasoning is problem-solving based around pictures, diagrams and shapes, rather than words. Unlike verbal reasoning,

More information

Lesson Activity Toolkit

Lesson Activity Toolkit Lesson Activity Toolkit Tool name Tool definition Ideas for tool use Screen shot of tool Activities Anagram Unscramble given letters to solve problems. Unscramble for word games Category sort - Category

More information

Failures of Intuition: Building a Solid Poker Foundation through Combinatorics

Failures of Intuition: Building a Solid Poker Foundation through Combinatorics Failures of Intuition: Building a Solid Poker Foundation through Combinatorics by Brian Space Two Plus Two Magazine, Vol. 14, No. 8 To evaluate poker situations, the mathematics that underpin the dynamics

More information

Midterm for Name: Good luck! Midterm page 1 of 9

Midterm for Name: Good luck! Midterm page 1 of 9 Midterm for 6.864 Name: 40 30 30 30 Good luck! 6.864 Midterm page 1 of 9 Part #1 10% We define a PCFG where the non-terminals are {S, NP, V P, V t, NN, P P, IN}, the terminal symbols are {Mary,ran,home,with,John},

More information

Log-linear models (part III)

Log-linear models (part III) Log-linear models (part III) Lecture, Feb 7 CS 690N, Spring 2017 Advanced Natural Language Processing http://people.cs.umass.edu/~brenocon/anlp2017/ Brendan O Connor College of Information and Computer

More information

News English.com Ready-to-use ESL / EFL Lessons

News English.com Ready-to-use ESL / EFL Lessons www.breaking News English.com Ready-to-use ESL / EFL Lessons The Breaking News English.com Resource Book 1,000 Ideas & Activities For Language Teachers http://www.breakingnewsenglish.com/book.html Workers

More information

MFL and Numeracy. Teachers of MFL in KS2 and KS3 reinforce:

MFL and Numeracy. Teachers of MFL in KS2 and KS3 reinforce: MFL and Numeracy "When evaluating the achievement of pupils, inspectors consider...how well pupils develop a range of skills, including reading, writing, communication and mathematical skills, and how

More information

The Odds Calculators: Partial simulations vs. compact formulas By Catalin Barboianu

The Odds Calculators: Partial simulations vs. compact formulas By Catalin Barboianu The Odds Calculators: Partial simulations vs. compact formulas By Catalin Barboianu As result of the expanded interest in gambling in past decades, specific math tools are being promulgated to support

More information

Why Should We Care? Everyone uses plotting But most people ignore or are unaware of simple principles Default plotting tools are not always the best

Why Should We Care? Everyone uses plotting But most people ignore or are unaware of simple principles Default plotting tools are not always the best Elementary Plots Why Should We Care? Everyone uses plotting But most people ignore or are unaware of simple principles Default plotting tools are not always the best More importantly, it is easy to lie

More information

The Self-Avoiding Walk (Probability And Its Applications) By Neal Madras;Gordon Slade

The Self-Avoiding Walk (Probability And Its Applications) By Neal Madras;Gordon Slade The Self-Avoiding Walk (Probability And Its Applications) By Neal Madras;Gordon Slade If you are searching for a book by Neal Madras;Gordon Slade The Self-Avoiding Walk (Probability and Its Applications)

More information

Motif finding. GCB 535 / CIS 535 M. T. Lee, 10 Oct 2004

Motif finding. GCB 535 / CIS 535 M. T. Lee, 10 Oct 2004 Motif finding GCB 535 / CIS 535 M. T. Lee, 10 Oct 2004 Our goal is to identify significant patterns of letters (nucleotides, amino acids) contained within long sequences. The pattern is called a motif.

More information

Named Entity Recognition. Natural Language Processing Emory University Jinho D. Choi

Named Entity Recognition. Natural Language Processing Emory University Jinho D. Choi Named Entity Recognition Natural Language Processing Emory University Jinho D. Choi Named Entity Recognition 2 Named Entity Recognition Classify the named entity tag of each chunk. 2 Named Entity Recognition

More information

Battleship as a Dialog System Aaron Brackett, Gerry Meixiong, Tony Tan-Torres, Jeffrey Yu

Battleship as a Dialog System Aaron Brackett, Gerry Meixiong, Tony Tan-Torres, Jeffrey Yu Battleship as a Dialog System Aaron Brackett, Gerry Meixiong, Tony Tan-Torres, Jeffrey Yu Abstract For our project, we built a conversational agent for Battleship using Dialog systems. In this paper, we

More information

Dance Movement Patterns Recognition (Part II)

Dance Movement Patterns Recognition (Part II) Dance Movement Patterns Recognition (Part II) Jesús Sánchez Morales Contents Goals HMM Recognizing Simple Steps Recognizing Complex Patterns Auto Generation of Complex Patterns Graphs Test Bench Conclusions

More information

Real Time Word to Picture Translation for Chinese Restaurant Menus

Real Time Word to Picture Translation for Chinese Restaurant Menus Real Time Word to Picture Translation for Chinese Restaurant Menus Michelle Jin, Ling Xiao Wang, Boyang Zhang Email: mzjin12, lx2wang, boyangz @stanford.edu EE268 Project Report, Spring 2014 Abstract--We

More information

The study of probability is concerned with the likelihood of events occurring. Many situations can be analyzed using a simplified model of probability

The study of probability is concerned with the likelihood of events occurring. Many situations can be analyzed using a simplified model of probability The study of probability is concerned with the likelihood of events occurring Like combinatorics, the origins of probability theory can be traced back to the study of gambling games Still a popular branch

More information

Latest trends in sentiment analysis - A survey

Latest trends in sentiment analysis - A survey Latest trends in sentiment analysis - A survey Anju Rose G Punneliparambil PG Scholar Department of Computer Science & Engineering Govt. Engineering College, Thrissur, India anjurose.ar@gmail.com Abstract

More information

If a series of games (on which money has been bet) is interrupted before it can end, what is the fairest way to divide the stakes?

If a series of games (on which money has been bet) is interrupted before it can end, what is the fairest way to divide the stakes? Interrupted Games of Chance Berkeley Math Circle (Advanced) John McSweeney March 13th, 2012 1 The Problem If a series of games (on which money has been bet) is interrupted before it can end, what is the

More information

Machine Learning for Language Technology

Machine Learning for Language Technology Machine Learning for Language Technology Generative and Discriminative Models Joakim Nivre Uppsala University Department of Linguistics and Philology joakim.nivre@lingfil.uu.se Machine Learning for Language

More information

HW1 is due Thu Oct 12 in the first 5 min of class. Read through chapter 5.

HW1 is due Thu Oct 12 in the first 5 min of class. Read through chapter 5. Stat 100a, Introduction to Probability. Outline for the day: 1. Bayes's rule. 2. Random variables. 3. cdf, pmf, and density. 4. Expected value, continued. 5. All in with AA. 6. Pot odds. 7. Violette vs.

More information

Backward induction is a widely accepted principle for predicting behavior in sequential games. In the classic

Backward induction is a widely accepted principle for predicting behavior in sequential games. In the classic Published online ahead of print November 9, 212 MANAGEMENT SCIENCE Articles in Advance, pp. 1 18 ISSN 25-199 (print) ISSN 1526-551 (online) http://dx.doi.org/1.1287/mnsc.112.1645 212 INFORMS A Dynamic

More information

THE CHALLENGES OF SENTIMENT ANALYSIS ON SOCIAL WEB COMMUNITIES

THE CHALLENGES OF SENTIMENT ANALYSIS ON SOCIAL WEB COMMUNITIES THE CHALLENGES OF SENTIMENT ANALYSIS ON SOCIAL WEB COMMUNITIES Osamah A.M Ghaleb 1,Anna Saro Vijendran 2 1 Ph.D Research Scholar, Department of Computer Science, Sri Ramakrishna College of Arts and Science,(India)

More information

Sketching Interface. Larry Rudolph April 24, Pervasive Computing MIT SMA 5508 Spring 2006 Larry Rudolph

Sketching Interface. Larry Rudolph April 24, Pervasive Computing MIT SMA 5508 Spring 2006 Larry Rudolph Sketching Interface Larry April 24, 2006 1 Motivation Natural Interface touch screens + more Mass-market of h/w devices available Still lack of s/w & applications for it Similar and different from speech

More information

Information Extraction. CS6200 Information Retrieval (and a sort of advertisement for NLP in the spring)

Information Extraction. CS6200 Information Retrieval (and a sort of advertisement for NLP in the spring) Information Extraction CS6200 Information Retrieval (and a sort of advertisement for NLP in the spring) 1 Informa(on Extrac(on Automa(cally extract structure from text annotate document using tags to iden(fy

More information

Multimedia Systems Entropy Coding Mahdi Amiri February 2011 Sharif University of Technology

Multimedia Systems Entropy Coding Mahdi Amiri February 2011 Sharif University of Technology Course Presentation Multimedia Systems Entropy Coding Mahdi Amiri February 2011 Sharif University of Technology Data Compression Motivation Data storage and transmission cost money Use fewest number of

More information

Basics of Five Card Draw

Basics of Five Card Draw Basics of Five Card Draw Jasonpariah 19 January 2009 Introduction and Bio: I ve been asked to write an article for this site in relation to five card draw. Of course your first question reading this should

More information

Sketching Interface. Motivation

Sketching Interface. Motivation Sketching Interface Larry Rudolph April 5, 2007 1 1 Natural Interface Motivation touch screens + more Mass-market of h/w devices available Still lack of s/w & applications for it Similar and different

More information

Efficiency and detectability of random reactive jamming in wireless networks

Efficiency and detectability of random reactive jamming in wireless networks Efficiency and detectability of random reactive jamming in wireless networks Ni An, Steven Weber Modeling & Analysis of Networks Laboratory Drexel University Department of Electrical and Computer Engineering

More information

Recent Advances in Image Deblurring. Seungyong Lee (Collaboration w/ Sunghyun Cho)

Recent Advances in Image Deblurring. Seungyong Lee (Collaboration w/ Sunghyun Cho) Recent Advances in Image Deblurring Seungyong Lee (Collaboration w/ Sunghyun Cho) Disclaimer Many images and figures in this course note have been copied from the papers and presentation materials of previous

More information

Skill, Matchmaking, and Ranking. Dr. Josh Menke Sr. Systems Designer Activision Publishing

Skill, Matchmaking, and Ranking. Dr. Josh Menke Sr. Systems Designer Activision Publishing Skill, Matchmaking, and Ranking Dr. Josh Menke Sr. Systems Designer Activision Publishing Outline I. Design Philosophy II. Definitions III.Skill IV.Matchmaking V. Ranking Design Values Easy to Learn, Hard

More information

CPS331 Lecture: Heuristic Search last revised 6/18/09

CPS331 Lecture: Heuristic Search last revised 6/18/09 CPS331 Lecture: Heuristic Search last revised 6/18/09 Objectives: 1. To introduce the use of heuristics in searches 2. To introduce some standard heuristic algorithms 3. To introduce criteria for evaluating

More information

AUTOMATIC SPEECH RECOGNITION FOR NUMERIC DIGITS USING TIME NORMALIZATION AND ENERGY ENVELOPES

AUTOMATIC SPEECH RECOGNITION FOR NUMERIC DIGITS USING TIME NORMALIZATION AND ENERGY ENVELOPES AUTOMATIC SPEECH RECOGNITION FOR NUMERIC DIGITS USING TIME NORMALIZATION AND ENERGY ENVELOPES N. Sunil 1, K. Sahithya Reddy 2, U.N.D.L.mounika 3 1 ECE, Gurunanak Institute of Technology, (India) 2 ECE,

More information

News English.com Ready-to-use ESL / EFL Lessons

News English.com Ready-to-use ESL / EFL Lessons www.breaking News English.com Ready-to-use ESL / EFL Lessons 1,000 IDEAS & ACTIVITIES FOR LANGUAGE TEACHERS The Breaking News English.com Resource Book http://www.breakingnewsenglish.com/book.html New

More information

Frictional Force (32 Points)

Frictional Force (32 Points) Dual-Range Force Sensor Frictional Force (32 Points) Computer 19 Friction is a force that resists motion. It involves objects in contact with each other, and it can be either useful or harmful. Friction

More information

the question of whether computers can think is like the question of whether submarines can swim -- Dijkstra

the question of whether computers can think is like the question of whether submarines can swim -- Dijkstra the question of whether computers can think is like the question of whether submarines can swim -- Dijkstra Game AI: The set of algorithms, representations, tools, and tricks that support the creation

More information

SELECTING RELEVANT DATA

SELECTING RELEVANT DATA EXPLORATORY ANALYSIS The data that will be used comes from the reviews_beauty.json.gz file which contains information about beauty products that were bought and reviewed on Amazon.com. Each data point

More information

1 2-step and other basic conditional probability problems

1 2-step and other basic conditional probability problems Name M362K Exam 2 Instructions: Show all of your work. You do not have to simplify your answers. No calculators allowed. 1 2-step and other basic conditional probability problems 1. Suppose A, B, C are

More information

Pixel Response Effects on CCD Camera Gain Calibration

Pixel Response Effects on CCD Camera Gain Calibration 1 of 7 1/21/2014 3:03 PM HO M E P R O D UC T S B R IE F S T E C H NO T E S S UP P O RT P UR C HA S E NE W S W E B T O O L S INF O C O NTA C T Pixel Response Effects on CCD Camera Gain Calibration Copyright

More information

An Adaptive Intelligence For Heads-Up No-Limit Texas Hold em

An Adaptive Intelligence For Heads-Up No-Limit Texas Hold em An Adaptive Intelligence For Heads-Up No-Limit Texas Hold em Etan Green December 13, 013 Skill in poker requires aptitude at a single task: placing an optimal bet conditional on the game state and the

More information

Chapter 11. Sampling Distributions. BPS - 5th Ed. Chapter 11 1

Chapter 11. Sampling Distributions. BPS - 5th Ed. Chapter 11 1 Chapter 11 Sampling Distributions BPS - 5th Ed. Chapter 11 1 Sampling Terminology Parameter fixed, unknown number that describes the population Example: population mean Statistic known value calculated

More information

the ARTICLE (for teachers)

the ARTICLE (for teachers) the ARTICLE (for teachers) Oliver Curry studies evolution and the future, and he has an interesting idea about the future of the human race. He suggests that in 100,000 years, humankind will split into

More information

Objective: the student will gain speed and accuracy in letter recognition.

Objective: the student will gain speed and accuracy in letter recognition. ALPHABET ARC Objective: the student will gain speed and accuracy in letter recognition. Materials: Alphabet arc (enlarge 200 percent and attach to 12 x 18 construction paper). 12 x18 construction paper

More information

Huffman Coding with Non-Sorted Frequencies

Huffman Coding with Non-Sorted Frequencies Huffman Coding with Non-Sorted Frequencies Shmuel T. Klein and Dana Shapira Abstract. A standard way of implementing Huffman s optimal code construction algorithm is by using a sorted sequence of frequencies.

More information

Alternation in the repeated Battle of the Sexes

Alternation in the repeated Battle of the Sexes Alternation in the repeated Battle of the Sexes Aaron Andalman & Charles Kemp 9.29, Spring 2004 MIT Abstract Traditional game-theoretic models consider only stage-game strategies. Alternation in the repeated

More information

Objective: Plot points, using them to draw lines in the plane, and describe

Objective: Plot points, using them to draw lines in the plane, and describe NYS COMMON CORE MATHEMATICS CURRICULUM Lesson 7 5 6 Lesson 7 Objective: Plot points, using them to draw lines in the plane, and describe patterns within the coordinate pairs. Suggested Lesson Structure

More information

MAS.160 / MAS.510 / MAS.511 Signals, Systems and Information for Media Technology Fall 2007

MAS.160 / MAS.510 / MAS.511 Signals, Systems and Information for Media Technology Fall 2007 MIT OpenCourseWare http://ocw.mit.edu MAS.160 / MAS.510 / MAS.511 Signals, Systems and Information for Media Technology Fall 2007 For information about citing these materials or our Terms of Use, visit:

More information

WHAT THE COURSE IS AND ISN T ABOUT. Welcome to CIS 391. Introduction to Artificial Intelligence. Grading & Homework. Welcome to CIS 391

WHAT THE COURSE IS AND ISN T ABOUT. Welcome to CIS 391. Introduction to Artificial Intelligence. Grading & Homework. Welcome to CIS 391 Welcome to CIS 391 Introduction to Artificial Intelligence Lecturer: Mitch Marcus, mitch@ Levine 503 Office hours will be announced on Piazza Mitch Marcus CIS391 Fall, 2015 TA: Daniel Moroz,

More information

NEWS ENGLISH LESSONS.com

NEWS ENGLISH LESSONS.com NEWS ENGLISH LESSONS.com Superman comic sells for $1 million MANY FLASH AND ONLINE ACTIVITIES FOR THIS LESSON, PLUS A LISTENING, AT: http://www.newsenglishlessons.com/1002/100223-superman.html IN THIS

More information

Outcome Forecasting in Sports. Ondřej Hubáček

Outcome Forecasting in Sports. Ondřej Hubáček Outcome Forecasting in Sports Ondřej Hubáček Motivation & Challenges Motivation exploiting betting markets performance optimization Challenges no available datasets difficulties with establishing the state-of-the-art

More information

JIGSAW ACTIVITY, TASK # Make sure your answer in written in the correct order. Highest powers of x should come first, down to the lowest powers.

JIGSAW ACTIVITY, TASK # Make sure your answer in written in the correct order. Highest powers of x should come first, down to the lowest powers. JIGSAW ACTIVITY, TASK #1 Your job is to multiply and find all the terms in ( 1) Recall that this means ( + 1)( + 1)( + 1)( + 1) Start by multiplying: ( + 1)( + 1) x x x x. x. + 4 x x. Write your answer

More information

News English.com Ready-to-use ESL / EFL Lessons

News English.com Ready-to-use ESL / EFL Lessons www.breaking News English.com Ready-to-use ESL / EFL Lessons 1,000 IDEAS & ACTIVITIES FOR LANGUAGE TEACHERS The Breaking News English.com Resource Book http://www.breakingnewsenglish.com/book.html UN boss

More information

Why Should We Care? More importantly, it is easy to lie or deceive people with bad plots

Why Should We Care? More importantly, it is easy to lie or deceive people with bad plots Elementary Plots Why Should We Care? Everyone uses plotting But most people ignore or are unaware of simple principles Default plotting tools (or default settings) are not always the best More importantly,

More information

Technologists and economists both think about the future sometimes, but they each have blind spots.

Technologists and economists both think about the future sometimes, but they each have blind spots. The Economics of Brain Simulations By Robin Hanson, April 20, 2006. Introduction Technologists and economists both think about the future sometimes, but they each have blind spots. Technologists think

More information

The fundamentals of detection theory

The fundamentals of detection theory Advanced Signal Processing: The fundamentals of detection theory Side 1 of 18 Index of contents: Advanced Signal Processing: The fundamentals of detection theory... 3 1 Problem Statements... 3 2 Detection

More information

Name: Exam 01 (Midterm Part 2 take home, open everything)

Name: Exam 01 (Midterm Part 2 take home, open everything) Name: Exam 01 (Midterm Part 2 take home, open everything) To help you budget your time, questions are marked with *s. One * indicates a straightforward question testing foundational knowledge. Two ** indicate

More information

Quiddler Skill Connections for Teachers

Quiddler Skill Connections for Teachers Quiddler Skill Connections for Teachers Quiddler is a game primarily played for fun and entertainment. The fact that it teaches, strengthens and exercises an abundance of skills makes it one of the best

More information

One-Sample Z: C1, C2, C3, C4, C5, C6, C7, C8,... The assumed standard deviation = 110

One-Sample Z: C1, C2, C3, C4, C5, C6, C7, C8,... The assumed standard deviation = 110 SMAM 314 Computer Assignment 3 1.Suppose n = 100 lightbulbs are selected at random from a large population.. Assume that the light bulbs put on test until they fail. Assume that for the population of light

More information

Biased Opponent Pockets

Biased Opponent Pockets Biased Opponent Pockets A very important feature in Poker Drill Master is the ability to bias the value of starting opponent pockets. A subtle, but mostly ignored, problem with computing hand equity against

More information

2. Review of Pawns p

2. Review of Pawns p Critical Thinking, version 2.2 page 2-1 2. Review of Pawns p Objectives: 1. State and apply rules of movement for pawns 2. Solve problems using pawns The main objective of this lesson is to reinforce the

More information

Card counting meets hidden Markov models

Card counting meets hidden Markov models University of New Mexico UNM Digital Repository Electrical and Computer Engineering ETDs Engineering ETDs 2-7-2011 Card counting meets hidden Markov models Steven J. Aragon Follow this and additional works

More information

CCMR Educational Programs

CCMR Educational Programs CCMR Educational Programs Title: Date Created: August 6, 2006 Author(s): Appropriate Level: Abstract: Time Requirement: Joan Erickson Should We Count the Beans one at a time? Introductory statistics or

More information

TRANSFORMS / WAVELETS

TRANSFORMS / WAVELETS RANSFORMS / WAVELES ransform Analysis Signal processing using a transform analysis for calculations is a technique used to simplify or accelerate problem solution. For example, instead of dividing two

More information

News English.com Ready-to-use ESL / EFL Lessons

News English.com Ready-to-use ESL / EFL Lessons www.breaking News English.com Ready-to-use ESL / EFL Lessons The Breaking News English.com Resource Book 1,000 Ideas & Activities For Language Teachers http://www.breakingnewsenglish.com/book.html World

More information