Introduction to Markov Models. Estimating the probability of phrases of words, sentences, etc.
- Tyler Long
- 6 years ago
Transcription
1 Introduction to Markov Models Estimating the probability of phrases of words, sentences, etc.
2 But first: A few preliminaries on text preprocessing
3 What counts as a word? A tricky question. CIS 421/521 - Intro to AI 3
4 How to find Sentences??
5 Q1: How to estimate the probability of a given sentence W?
A crucial step in speech recognition (and lots of other applications).
First guess: "bag of words": $\hat{P}(W) = \prod_{w \in W} P(w)$
Given word lattice:
    form   subsidy    for
    farm   subsidies  far
Unigram counts (in 1.7 * 10^6 words of AP text): form 183, subsidy 15, for (count missing), farm 74, subsidies 55, far 570
The most likely word string given $\hat{P}(W)$ isn't quite right.
6 Predicting a word sequence II
Next guess: products of bigrams. For $W = w_1 w_2 w_3 \dots w_n$,
$\hat{P}(W) = \prod_{i=1}^{n-1} P(w_i w_{i+1})$
Given word lattice:
    form   subsidy    for
    farm   subsidies  far
Bigram counts (in 1.7 * 10^6 words of AP text):
    form subsidy    0      subsidy for    2
    form subsidies  0      subsidy far    0
    farm subsidy    0      subsidies for  6
    farm subsidies  4      subsidies far  0
Much better (if not quite right). (Q: the counts are tiny! Why?)
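As a rough illustration of the bigram-product guess, the sketch below scores every path through the lattice using the counts on this slide. Treating the lattice as three positions with two alternatives each, and turning counts into relative frequencies by dividing by the corpus size, are assumptions made for the example; the names `BIGRAM_COUNTS`, `LATTICE`, `bigram_product`, and `best_path` are hypothetical.

```python
from itertools import product

# Bigram counts copied from the slide (1.7 * 10^6 words of AP text).
BIGRAM_COUNTS = {
    ("form", "subsidy"): 0, ("form", "subsidies"): 0,
    ("farm", "subsidy"): 0, ("farm", "subsidies"): 4,
    ("subsidy", "for"): 2, ("subsidy", "far"): 0,
    ("subsidies", "for"): 6, ("subsidies", "far"): 0,
}
N_TOKENS = 1.7e6  # corpus size from the slide

# Assumed lattice layout: one list of alternatives per word position.
LATTICE = [["form", "farm"], ["subsidy", "subsidies"], ["for", "far"]]


def bigram_product(words):
    """Product of relative bigram frequencies over adjacent word pairs."""
    score = 1.0
    for pair in zip(words, words[1:]):
        score *= BIGRAM_COUNTS.get(pair, 0) / N_TOKENS
    return score


def best_path(lattice):
    """Return the highest-scoring word string through the lattice."""
    return max(product(*lattice), key=bigram_product)


print(" ".join(best_path(LATTICE)))  # farm subsidies for
```

Only one path has two non-zero bigrams, so it wins outright; every other path scores exactly zero, which already hints at the sparse data problem.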
7 How can we estimate P(W) correctly?
Problem: the Naïve Bayes model for bigrams violates independence assumptions.
Let's do this right. Let $W = w_1 w_2 w_3 \dots w_n$. Then, by the chain rule,
$P(W) = P(w_1) \cdot P(w_2 \mid w_1) \cdot P(w_3 \mid w_1 w_2) \cdots P(w_n \mid w_1 \dots w_{n-1})$
We can estimate $P(w_2 \mid w_1)$ by the Maximum Likelihood Estimator
$\frac{\mathrm{Count}(w_1 w_2)}{\mathrm{Count}(w_1)}$
and $P(w_3 \mid w_1 w_2)$ by
$\frac{\mathrm{Count}(w_1 w_2 w_3)}{\mathrm{Count}(w_1 w_2)}$
and so on.
8 and finally, estimating $P(w_n \mid w_1 w_2 \dots w_{n-1})$
Again, we can estimate $P(w_n \mid w_1 w_2 \dots w_{n-1})$ with the MLE
$\frac{\mathrm{Count}(w_1 w_2 \dots w_n)}{\mathrm{Count}(w_1 w_2 \dots w_{n-1})}$
So to decide pat vs. pot in "Heat up the oil in a large p?t", compute for pot
Count("Heat up the oil in a large pot") / Count("Heat up the oil in a large")
Unless our corpus is really huge, both counts will be 0, yielding 0/0.
9 The Web is HUGE!! (2016 version) 48.9/403 = 0.121
10 But what if we only have 100 million words for our estimates??
11 A BOTEC Estimate of What We Can Estimate
What parameters can we estimate with 100 million words of training data??
Assuming (for now) a uniform distribution over only 5000 words.
So even with 10^8 words of data, for even trigrams we encounter the sparse data problem.
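The slide's own table did not survive transcription, so the following is only a back-of-the-envelope reconstruction from the two numbers that did: 5000 equiprobable words and 10^8 training words. It assumes the number of n-gram tokens in the corpus is roughly the number of words; `samples_per_parameter` is a hypothetical helper name.

```python
VOCAB_SIZE = 5000          # from the slide
TRAINING_WORDS = 10 ** 8   # 100 million words, from the slide


def samples_per_parameter(n):
    """Average number of training n-grams per distinct n-gram (uniform assumption)."""
    return TRAINING_WORDS / VOCAB_SIZE ** n


for n, name in [(1, "unigram"), (2, "bigram"), (3, "trigram")]:
    print(f"{name}: {VOCAB_SIZE ** n:.3g} parameters, "
          f"{samples_per_parameter(n):.3g} observations each")
```

Under these assumptions each bigram is seen about 4 times on average, and each trigram about 0.0008 times, so almost every trigram count is zero.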
12 Review: How can we estimate P(W) correctly?
Problem: the Naïve Bayes model for bigrams violates independence assumptions.
Let's do this right. Let $W = w_1 w_2 w_3 \dots w_n$. Then, by the chain rule,
$P(W) = P(w_1) \cdot P(w_2 \mid w_1) \cdot P(w_3 \mid w_1 w_2) \cdots P(w_n \mid w_1 \dots w_{n-1})$
We can estimate $P(w_2 \mid w_1)$ by the Maximum Likelihood Estimator
$\frac{\mathrm{Count}(w_1 w_2)}{\mathrm{Count}(w_1)}$
and $P(w_3 \mid w_1 w_2)$ by
$\frac{\mathrm{Count}(w_1 w_2 w_3)}{\mathrm{Count}(w_1 w_2)}$
and so on.
13 The Markov Assumption: Only the Immediate Past Matters
14 The Markov Assumption: Estimation
We estimate the probability of each $w_i$ given previous context by
$P(w_i \mid w_1 w_2 \dots w_{i-1}) = P(w_i \mid w_{i-1})$
which can be estimated by
$\frac{\mathrm{Count}(w_{i-1} w_i)}{\mathrm{Count}(w_{i-1})}$
So we're back to counting only unigrams and bigrams!!
AND we have a correct practical estimation method for P(W) given the Markov assumption!
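A minimal sketch of this estimator, assuming a corpus that is already tokenized into a flat list of words. The function name `train_bigram_mle` is hypothetical, and returning 0.0 for an unseen context is a placeholder choice for the example, since the MLE itself is undefined (0/0) there. The toy corpus is the two-sentence one from the CDF slide, with sentence boundaries ignored.

```python
from collections import Counter


def train_bigram_mle(tokens):
    """Return prob(word, prev) = Count(prev word) / Count(prev)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    def prob(word, prev):
        if unigrams[prev] == 0:
            return 0.0  # unseen context: MLE is 0/0; 0.0 is a placeholder
        return bigrams[(prev, word)] / unigrams[prev]

    return prob


tokens = "the mouse ran up the clock the spider ran up the waterspout".split()
p = train_bigram_mle(tokens)
print(p("ran", "mouse"))  # 1.0
print(p("mouse", "the"))  # 0.25
```

Here "the" occurs four times and is followed by four different words, so each of those continuations gets probability 1/4.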
15 Markov Models
16 Review (and crucial for upcoming homework): Cumulative Distribution Functions (CDFs)
The CDF of a random variable X is denoted by $F_X(x)$ and is defined by $F_X(x) = \Pr(X \le x)$.
F is monotonic nondecreasing: $\forall x \le y,\ F(x) \le F(y)$.
If X is a discrete random variable that attains values $x_1, x_2, \dots, x_n$ with probabilities $p(x_1), p(x_2), \dots$, then
$F_X(x_i) = \sum_{j \le i} p(x_j)$
17 CDF for a very small English corpus
Corpus: the mouse ran up the clock. The spider ran up the waterspout.
P(the) = 4/12, P(ran) = P(up) = 2/12
P(mouse) = P(clock) = P(spider) = P(waterspout) = 1/12
Arbitrarily fix an order: w1 = the, w2 = ran, w3 = up, w4 = mouse, ...
F(the) = 4/12, F(ran) = 6/12, F(up) = 8/12, F(mouse) = 9/12, ...
[Figure: step plot of F over the ordered vocabulary (the, ran, up, mouse, clock, spider, waterspout); y-axis in twelfths from 1/12 to 1]
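The CDF values on this slide can be reproduced with a few lines of Python. This is a sketch that assumes lower-cased tokens with punctuation dropped; the order of the last three words is taken from the figure's axis labels, and `ORDER` and `cdf` are hypothetical names.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate

tokens = "the mouse ran up the clock the spider ran up the waterspout".split()
ORDER = ["the", "ran", "up", "mouse", "clock", "spider", "waterspout"]

counts = Counter(tokens)
probs = [Fraction(counts[w], len(tokens)) for w in ORDER]
# Running sum of probabilities in the fixed order gives F(w) for each word.
cdf = dict(zip(ORDER, accumulate(probs)))

for word in ORDER:
    print(word, cdf[word])
```

`Fraction` prints values in lowest terms, so F(the) appears as 1/3 rather than 4/12; the values are the same.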
18 Visualizing an n-gram based language model: the Shannon/Miller/Selfridge method
To generate a sequence of n words given unigram estimates:
Fix some ordering of the vocabulary $v_1 v_2 v_3 \dots v_k$.
For each word position i, $1 \le i \le n$:
    Choose a random value $r_i$ between 0 and 1.
    Choose $w_i$ = the first $v_j$ such that $F_V(v_j) \ge r_i$,
    i.e. the first $v_j$ such that $\sum_{m=1}^{j} P(v_m) \ge r_i$
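The unigram procedure above can be sketched with a binary search over the cumulative sums. The vocabulary and probabilities are those of the toy corpus from the CDF slide; `make_sampler` and `pick` are hypothetical names, and the clamp on the index is an added guard against floating-point round-off, not part of the method as stated.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def make_sampler(vocab, probs):
    """Return pick(r): the first v_j whose cumulative probability is >= r."""
    cdf = list(accumulate(probs))

    def pick(r):
        j = bisect_left(cdf, r)  # first index with cdf[j] >= r
        return vocab[min(j, len(vocab) - 1)]  # guard for round-off near r = 1

    return pick


VOCAB = ["the", "ran", "up", "mouse", "clock", "spider", "waterspout"]
PROBS = [4 / 12, 2 / 12, 2 / 12, 1 / 12, 1 / 12, 1 / 12, 1 / 12]

pick = make_sampler(VOCAB, PROBS)
rng = random.Random(0)  # seeded only so that runs are repeatable
print(" ".join(pick(rng.random()) for _ in range(8)))
```

Because each word is drawn independently, the output has the right word frequencies but no word-to-word structure.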
19 Visualizing an n-gram based language model: the Shannon/Miller/Selfridge method
To generate a sequence of n words given a 1st order Markov model (i.e. conditioned on one previous word):
Fix some ordering of the vocabulary $v_1 v_2 v_3 \dots v_k$.
Use the unigram method to generate an initial word $w_1$.
For each remaining position i, $2 \le i \le n$:
    Choose a random value $r_i$ between 0 and 1.
    Choose $w_i$ = the first $v_j$ such that $\sum_{m=1}^{j} P(v_m \mid w_{i-1}) \ge r_i$
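A sketch of the first-order version, under a few assumptions that are not on the slide: the vocabulary ordering is alphabetical, conditional probabilities are MLE bigram estimates, and when the previous word was never followed by anything in training the sampler falls back to the unigram distribution. The names `train`, `pick`, and `generate` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train(tokens):
    """Collect unigram counts and, for each word, counts of the words following it."""
    unigrams = Counter(tokens)
    followers = defaultdict(Counter)
    for prev, word in zip(tokens, tokens[1:]):
        followers[prev][word] += 1
    return unigrams, followers


def pick(counter, r):
    """First word (alphabetical order) whose cumulative probability reaches r."""
    total = sum(counter.values())
    running = 0.0
    for word in sorted(counter):
        running += counter[word] / total
        if running >= r:
            return word
    return word  # round-off fallback: last word in the ordering


def generate(tokens, n, rng):
    """Generate n words: w_1 from unigrams, each later word given its predecessor."""
    unigrams, followers = train(tokens)
    words = [pick(unigrams, rng.random())]
    while len(words) < n:
        # Assumed fallback: use unigrams if the previous word has no followers.
        context = followers[words[-1]] or unigrams
        words.append(pick(context, rng.random()))
    return words


tokens = "the mouse ran up the clock the spider ran up the waterspout".split()
print(" ".join(generate(tokens, 6, random.Random(1))))
```

Every adjacent pair in the output is a bigram that occurred in training (apart from the fallback case), which is why text generated this way looks locally plausible.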
20 The Shannon/Miller/Selfridge method trained on Shakespeare (This and next two slides from Jurafsky)
21 Wall Street Journal just isn't Shakespeare
22 Shakespeare as corpus
N = 884,647 tokens, V = 29,066
Shakespeare produced 300,000 bigram types out of V^2 = 844 million possible bigrams.
So 99.96% of the possible bigrams were never seen (have zero entries in the table).
Quadgrams are worse: what's coming out looks like Shakespeare because it is Shakespeare.
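The figures on this slide can be checked with a short calculation. The sketch uses only the numbers quoted above; the 4-gram line is an added illustration of why longer contexts are sparser still.

```python
V = 29_066               # vocabulary size, from the slide
SEEN_BIGRAMS = 300_000   # bigram types observed, from the slide

possible = V ** 2
unseen_fraction = 1 - SEEN_BIGRAMS / possible

print(f"{possible:,} possible bigrams")
print(f"{unseen_fraction:.2%} never seen")
# Added illustration: the space of 4-grams dwarfs the 884,647-token corpus.
print(f"{V ** 4:.2e} possible 4-grams")
```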
23 The Sparse Data Problem Again
So we smooth. How likely is a 0 count? Much more likely than I let on!!!
24 English word frequencies well described by Zipf's Law
Zipf (1949) characterized the relation between word frequency and rank as:
$f \cdot r = C$ (for constant $C$)
$r = C / f$
$\log(r) = \log(C) - \log(f)$
Purely Zipfian data plots as a straight line on a log-log scale.
*Rank (r): the numerical position of a word in a list sorted by decreasing frequency (f).
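One way to test a word list against this relation is to fit a line to log(f) versus log(r); for purely Zipfian data the slope is -1. The sketch below is illustrative only: `rank_frequency` and `loglog_slope` are hypothetical helpers, and the data fed to the fit is synthetic (f = C / r with C = 1000), not measurements from any corpus.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, frequency) triples; rank 1 is the most frequent word."""
    ranked = Counter(tokens).most_common()
    return [(r, w, f) for r, (w, f) in enumerate(ranked, start=1)]


def loglog_slope(pairs):
    """Least-squares slope of log(f) against log(r) for (rank, frequency) pairs."""
    xs = [math.log(r) for r, _ in pairs]
    ys = [math.log(f) for _, f in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Synthetic, idealised frequencies f = C / r with C = 1000 (not corpus data).
ideal = [(r, 1000 / r) for r in range(1, 51)]
print(round(loglog_slope(ideal), 6))  # -1.0
```

For real text, `rank_frequency` output can be passed to `loglog_slope` as `[(r, f) for r, _, f in rank_frequency(tokens)]`; how close the slope comes to -1 depends on the corpus.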
25 Word frequency & rank in Brown Corpus vs Zipf
Lots of area under the tail of this curve! From: Interactive mathematics
26 Zipf's law for the Brown corpus
27 Exploiting Zipf to do Language ID
    # The following filters out Arabic words that are also frequent in Spanish and English...
    arabic_top_12 = ['7ata', 'ana', 'ma', 'w', 'bs', 'fe', 'b3d', '3adou', 'mn', 'kan', 'men', 'ahmed']
    # The following filters out Urdu words common in English
    urdu_top_17 = ['hai', 'ko', 'ki', 'main', 'na', 'se', 'ho', 'bhi', 'mein', 'ka', 'tum', 'nahi', 'meri', 'jo', 'wo', 'dil', 'hain']
    spanish_top_16 = ['de', 'la', 'que', 'el', 'en', 'y', 'es', 'un', 'los', 'por', 'se', 'para', 'con']
    english_top_20 = ['the', 'to', 'of', 'in', 'i', 'a', 'is', 'and', 'you', 'for', 'on', 'it', 'that', 'are', 'with', 'am', 'my', 'be', 'at', 'not', 'we']
28 All the code you need.
    # To get the best language as a string: lid_pick_best(lid_process_tweet(tweet))
    import collections
    import re

    counts = collections.Counter()

    def lid_process_tweet(tweet):
        counts.clear()
        for word in re.split(r'[\.?!,]*\s+', tweet.encode('ascii', 'replace').strip().lower()):
            if not re.match(r'    # the rest of this line is truncated in the transcript
                for lang in languages:
                    if word in topwords[lang]:    # ('english', 'arabic', ...)
                        counts[lang] += 1         # topwords: dict of word lists indexed by lang
        return counts.most_common()

    def lid_pick_best(count_list):
        if count_list:
            return count_list[0][0]
        else:
            return 'UNKNOWN'
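The slide's code is written for Python 2 and one of its lines is cut off, so it cannot be run as transcribed. Below is a self-contained Python 3 sketch of the same idea, not the author's exact code: the truncated `re.match` filter is simply omitted because its pattern is not recoverable, the word lists are abbreviated from the previous slide, and the `TOPWORDS` dictionary and its keys are assumed names.

```python
import collections
import re

# Abbreviated word lists from the previous slide; dictionary layout is assumed.
TOPWORDS = {
    "english": {"the", "to", "of", "in", "is", "and", "you", "for"},
    "spanish": {"de", "la", "que", "el", "en", "y", "es", "un"},
    "urdu": {"hai", "ko", "ki", "main", "na", "ho", "bhi", "mein"},
    "arabic": {"7ata", "ana", "ma", "bs", "fe", "b3d", "mn", "kan"},
}


def lid_process_tweet(tweet):
    """Count, per language, how many tokens of the tweet are in its top-word list."""
    counts = collections.Counter()
    # Replace non-ASCII characters, then decode back to str for Python 3.
    text = tweet.encode("ascii", "replace").decode("ascii").strip().lower()
    for word in re.split(r"[.?!,]*\s+", text):
        for lang, words in TOPWORDS.items():
            if word in words:
                counts[lang] += 1
    return counts.most_common()


def lid_pick_best(count_list):
    """Return the language with the most hits, or 'UNKNOWN' if nothing matched."""
    return count_list[0][0] if count_list else "UNKNOWN"


print(lid_pick_best(lid_process_tweet("the cat is in the hat")))
print(lid_pick_best(lid_process_tweet("zzz")))
```

The approach leans on Zipf's law: a handful of very frequent function words covers a large share of tokens, so even a short tweet is likely to contain a few of them.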
Introduction to Markov Models
Introduction to Markov Models But first: A few preliminaries Estimating the probability of phrases of words, sentences, etc. CIS 391 - Intro to AI 2 What counts as a word? A tricky question. How to find
More informationIntroduction to Markov Models
Itroductio to Markov Models But first: A few prelimiaries o text preprocessig Estimatig the probability of phrases of words, seteces, etc. What couts as a word? A tricky questio. How to fid Seteces?? CIS
More informationPart of Speech Tagging & Hidden Markov Models (Part 1) Mitch Marcus CIS 421/521
Part of Speech Tagging & Hidden Markov Models (Part 1) Mitch Marcus CIS 421/521 NLP Task I Determining Part of Speech Tags Given a text, assign each token its correct part of speech (POS) tag, given its
More informationThe revolution of the empiricists. Machine Translation. Motivation for Data-Driven MT. Machine Translation as Search
The revolution of the empiricists Machine Translation Word alignment & Statistical MT Jörg Tiedemann jorg.tiedemann@lingfil.uu.se Department of Linguistics and Philology Uppsala University Classical approaches
More informationLog-linear models (part 1I)
Log-linear models (part 1I) Lecture, Feb 2 CS 690N, Spring 2017 Advanced Natural Language Processing http://people.cs.umass.edu/~brenocon/anlp2017/ Brendan O Connor College of Information and Computer
More informationLog-linear models (part 1I)
Log-linear models (part 1I) CS 690N, Spring 2018 Advanced Natural Language Processing http://people.cs.umass.edu/~brenocon/anlp2018/ Brendan O Connor College of Information and Computer Sciences University
More informationLecture 4: n-grams in NLP. LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han
Lecture 4: n-grams in NLP LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han Objectives Frequent n-grams in English n-grams and statistical NLP n-grams and conditional probability Large
More informationThe fundamentals of detection theory
Advanced Signal Processing: The fundamentals of detection theory Side 1 of 18 Index of contents: Advanced Signal Processing: The fundamentals of detection theory... 3 1 Problem Statements... 3 2 Detection
More informationHW1 is due Thu Oct 12 in the first 5 min of class. Read through chapter 5.
Stat 100a, Introduction to Probability. Outline for the day: 1. Bayes's rule. 2. Random variables. 3. cdf, pmf, and density. 4. Expected value, continued. 5. All in with AA. 6. Pot odds. 7. Violette vs.
More informationSpeech Recognition. Mitch Marcus CIS 421/521 Artificial Intelligence
Speech Recognition Mitch Marcus CIS 421/521 Artificial Intelligence A Sample of Speech Recognition Today's class is about: First, why speech recognition is difficult. As you'll see, the impression we have
More information24.09 Minds and Machines Fall 11 HASS-D CI
24.09 Minds and Machines Fall 11 HASS-D CI self-assessment the Chinese room argument Image by MIT OpenCourseWare. 1 derived vs. underived intentionality Something has derived intentionality just in case
More informationStatistical Machine Translation. Machine Translation Phrase-Based Statistical MT. Motivation for Phrase-based SMT
Statistical Machine Translation Machine Translation Phrase-Based Statistical MT Jörg Tiedemann jorg.tiedemann@lingfil.uu.se Department of Linguistics and Philology Uppsala University October 2009 Probabilistic
More informationGreat Is the Love/Hay Gran Amor. Jaime Cortez. Unison Keyboard
887 OCP sheet music Great Is the Love/Hay Gran Amor aime Cortez Unison Keyboard The material that you have requested is copyrighted. Copyright la requires you to obtain a license from the copyright holder
More informationBayesian Positioning in Wireless Networks using Angle of Arrival
Bayesian Positioning in Wireless Networks using Angle of Arrival Presented by: Rich Martin Joint work with: David Madigan, Eiman Elnahrawy, Wen-Hua Ju, P. Krishnan, A.S. Krishnakumar Rutgers University
More informationSpecimen 2018 Morning Time allowed: 1 hour
SPECIMEN MATERIAL GCSE SPANISH Foundation Tier Paper 4 Writing F Specimen 2018 Morning Time allowed: 1 hour Materials: You will need no other materials. Instructions Use black ink or black ball-point pen.
More informationSystem Identification and CDMA Communication
System Identification and CDMA Communication A (partial) sample report by Nathan A. Goodman Abstract This (sample) report describes theory and simulations associated with a class project on system identification
More informationCS 188: Artificial Intelligence Spring Speech in an Hour
CS 188: Artificial Intelligence Spring 2006 Lecture 19: Speech Recognition 3/23/2006 Dan Klein UC Berkeley Many slides from Dan Jurafsky Speech in an Hour Speech input is an acoustic wave form s p ee ch
More informationBackward induction is a widely accepted principle for predicting behavior in sequential games. In the classic
Published online ahead of print November 9, 212 MANAGEMENT SCIENCE Articles in Advance, pp. 1 18 ISSN 25-199 (print) ISSN 1526-551 (online) http://dx.doi.org/1.1287/mnsc.112.1645 212 INFORMS A Dynamic
More informationThe study of probability is concerned with the likelihood of events occurring. Many situations can be analyzed using a simplified model of probability
The study of probability is concerned with the likelihood of events occurring Like combinatorics, the origins of probability theory can be traced back to the study of gambling games Still a popular branch
More informationAlternation in the repeated Battle of the Sexes
Alternation in the repeated Battle of the Sexes Aaron Andalman & Charles Kemp 9.29, Spring 2004 MIT Abstract Traditional game-theoretic models consider only stage-game strategies. Alternation in the repeated
More information/665 Natural Language Processing
601.465/665 Natural Language Processing Prof: Jason Eisner Webpage: http://cs.jhu.edu/~jason/465 syllabus, announcements, slides, homeworks 1 Goals of the field Computers would be a lot more useful if
More informationStatistical Analysis of Modern Communication Signals
Whitepaper Statistical Analysis of Modern Communication Signals Bob Muro Application Group Manager, Boonton Electronics Abstract The latest wireless communication formats like DVB, DAB, WiMax, WLAN, and
More informationCard counting meets hidden Markov models
University of New Mexico UNM Digital Repository Electrical and Computer Engineering ETDs Engineering ETDs 2-7-2011 Card counting meets hidden Markov models Steven J. Aragon Follow this and additional works
More informationMotif finding. GCB 535 / CIS 535 M. T. Lee, 10 Oct 2004
Motif finding GCB 535 / CIS 535 M. T. Lee, 10 Oct 2004 Our goal is to identify significant patterns of letters (nucleotides, amino acids) contained within long sequences. The pattern is called a motif.
More informationLog-linear models (part III)
Log-linear models (part III) Lecture, Feb 7 CS 690N, Spring 2017 Advanced Natural Language Processing http://people.cs.umass.edu/~brenocon/anlp2017/ Brendan O Connor College of Information and Computer
More informationMachine Learning for Language Technology
Machine Learning for Language Technology Generative and Discriminative Models Joakim Nivre Uppsala University Department of Linguistics and Philology joakim.nivre@lingfil.uu.se Machine Learning for Language
More informationGuess the Mean. Joshua Hill. January 2, 2010
Guess the Mean Joshua Hill January, 010 Challenge: Provide a rational number in the interval [1, 100]. The winner will be the person whose guess is closest to /3rds of the mean of all the guesses. Answer:
More informationNaive Bayes text classification. Sumin Han
Naive Bayes text classification Sumin Han (hsm69@kaist.ac.kr) Contents - Introduction Bayes theorem Likelihood Text categorization Tips & Reference 2 Introduction 3 Artificial Intelligence Rule-based AI
More informationLaws of Text. Lecture Objectives. Text Technologies for Data Science INFR Learn about some text laws. This lecture is practical 9/26/2018
Text Technologies for Data Science INFR11145 Laws of Text Instructor: Walid Magdy 26-Sep-2018 Lecture Objectives Learn about some text laws Zipf s law Benford s law Heap s law Clumping/contagion This lecture
More informationThe Automatic Classification Problem. Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification
Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification Parallel to AIMA 8., 8., 8.6.3, 8.9 The Automatic Classification Problem Assign object/event or sequence of objects/events
More informationMidterm for Name: Good luck! Midterm page 1 of 9
Midterm for 6.864 Name: 40 30 30 30 Good luck! 6.864 Midterm page 1 of 9 Part #1 10% We define a PCFG where the non-terminals are {S, NP, V P, V t, NN, P P, IN}, the terminal symbols are {Mary,ran,home,with,John},
More informationA Maximum Likelihood TOA Based Estimator For Localization in Heterogeneous Networks
Int. J. Communications, Network and System Sciences, 010, 3, 38-4 doi:10.436/ijcns.010.31004 Published Online January 010 (http://www.scirp.org/journal/ijcns/). A Maximum Likelihood OA Based Estimator
More informationNovember 8, Chapter 8: Probability: The Mathematics of Chance
Chapter 8: Probability: The Mathematics of Chance November 8, 2013 Last Time Probability Models and Rules Discrete Probability Models Equally Likely Outcomes Crystallographic notation The first symbol
More informationUniversity of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005
University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Lecture 5 Slides Jan 26 th, 2005 Outline of Today s Lecture Announcements Filter-bank analysis
More informationMachine Translation - Decoding
January 15, 2007 Table of Contents 1 Introduction 2 3 4 5 6 Integer Programing Decoder 7 Experimental Results Word alignments Fertility Table Translation Table Heads Non-heads NULL-generated (ct.) Figure:
More informationLaboratory 1: Uncertainty Analysis
University of Alabama Department of Physics and Astronomy PH101 / LeClair May 26, 2014 Laboratory 1: Uncertainty Analysis Hypothesis: A statistical analysis including both mean and standard deviation can
More informationIf a series of games (on which money has been bet) is interrupted before it can end, what is the fairest way to divide the stakes?
Interrupted Games of Chance Berkeley Math Circle (Advanced) John McSweeney March 13th, 2012 1 The Problem If a series of games (on which money has been bet) is interrupted before it can end, what is the
More informationIsmaila Ba MSc Student, Department of Mathematics and Statistics Université de Moncton
Discrimination between statistical distributions for hydrometeorological frequency modeling Ismaila Ba MSc Student, Department of Mathematics and Statistics Université de Moncton INTRODUCTION The identification
More information10/12/2015. SHRDLU: 1969 NLP solved?? : A sea change in AI technologies. SHRDLU: A demonstration proof. 1990: Parsing Research in Crisis
SHRDLU: 1969 NLP solved?? 1980-1995: A sea change in AI technologies Example: Natural Language Processing The Great Wave off Kanagawa by Hokusai, ~1830 ] Person: PICK UP A BIG RED BLOCK. Computer: OK.
More informationLatest trends in sentiment analysis - A survey
Latest trends in sentiment analysis - A survey Anju Rose G Punneliparambil PG Scholar Department of Computer Science & Engineering Govt. Engineering College, Thrissur, India anjurose.ar@gmail.com Abstract
More information(Small Group Sydney, Emma, Carson, Lucas) What ya gonna do when the lake goes dry, honey What ya gonna do when the lake goes dry?
The Crawdad Song You get a line and I ll get a pole, honey You get a line and I ll get a pole, babe You get a line and I ll get a pole, and we ll go down to the crawdad hole, honey sugar ba-by mine Sittin
More informationTotal. STAT/MATH 394 A - Autumn Quarter Midterm. Name: Student ID Number: Directions. Complete all questions.
STAT/MATH 9 A - Autumn Quarter 015 - Midterm Name: Student ID Number: Problem 1 5 Total Points Directions. Complete all questions. You may use a scientific calculator during this examination; graphing
More information3.5 Marginal Distributions
STAT 421 Lecture Notes 52 3.5 Marginal Distributions Definition 3.5.1 Suppose that X and Y have a joint distribution. The c.d.f. of X derived by integrating (or summing) over the support of Y is called
More informationPeak-based EMG Detection Via CWT
41 Chapter 3 Peak-based EMG Detection Via CWT 3.1 Existing Methods In the EMG signal detection problem, one of the main tasks is to identify transient peaks of the muscle responses, or Motor Evoked Potentials
More informationLesson 6.1 Linear Equation Review
Name: Lesson 6.1 Linear Equation Review Vocabulary Equation: a math sentence that contains Linear: makes a straight line (no Variables: quantities represented by (often x and y) Function: equations can
More informationCS 540: Introduction to Artificial Intelligence
CS 540: Introduction to Artificial Intelligence Mid Exam: 7:15-9:15 pm, October 25, 2000 Room 1240 CS & Stats CLOSED BOOK (one sheet of notes and a calculator allowed) Write your answers on these pages
More informationDiscriminative Training for Automatic Speech Recognition
Discriminative Training for Automatic Speech Recognition 22 nd April 2013 Advanced Signal Processing Seminar Article Heigold, G.; Ney, H.; Schluter, R.; Wiesler, S. Signal Processing Magazine, IEEE, vol.29,
More information1 What s in the shipping package?
SST 900B 900 MHz RS 232/RS 485 Wireless Modem Quick Start Guide 1 What s in the shipping package? SST-900B Wireless Modem CA-0910 Quick Start CD 3dBi 900M Hz Antenna Guide 2 External switch introduction
More informationFAST LEMPEL-ZIV (LZ 78) COMPLEXITY ESTIMATION USING CODEBOOK HASHING
FAST LEMPEL-ZIV (LZ 78) COMPLEXITY ESTIMATION USING CODEBOOK HASHING Harman Jot, Rupinder Kaur M.Tech, Department of Electronics and Communication, Punjabi University, Patiala, Punjab, India I. INTRODUCTION
More informationRecap from previous lecture. Information Retrieval. Topics for Today. Recall: Basic structure of an Inverted index. Dictionaries & Tolerant Retrieval
Recap from previous lecture nformation Retrieval Dictionaries & Tolerant Retrieval Jörg Tiedemann jorg.tiedemann@lingfil.uu.se Department of Linguistics and Philology Uppsala University nverted indexes
More informationSome Parameter Estimators in the Generalized Pareto Model and their Inconsistency with Observed Data
Some Parameter Estimators in the Generalized Pareto Model and their Inconsistency with Observed Data F. Ashkar, 1 and C. N. Tatsambon 2 1 Department of Mathematics and Statistics, Université de Moncton,
More informationEfficiency and detectability of random reactive jamming in wireless networks
Efficiency and detectability of random reactive jamming in wireless networks Ni An, Steven Weber Modeling & Analysis of Networks Laboratory Drexel University Department of Electrical and Computer Engineering
More informationThe Log-Log Term Frequency Distribution
The Log-Log Term Frequency Distribution Jason D. M. Rennie jrennie@gmail.com July 14, 2005 Abstract Though commonly used, the unigram is widely known as being a poor model of term frequency; it assumes
More informationAlgorithms and Data Structures
Algorithms and Data Structures Self-Organizing Lists Marius Kloft Assumptions for Searching Until now, we implicitly assumed that every element of our list is searched with the same probability, i.e.,
More informationChapter 11. Sampling Distributions. BPS - 5th Ed. Chapter 11 1
Chapter 11 Sampling Distributions BPS - 5th Ed. Chapter 11 1 Sampling Terminology Parameter fixed, unknown number that describes the population Statistic known value calculated from a sample a statistic
More informationUnderstanding Apparent Increasing Random Jitter with Increasing PRBS Test Pattern Lengths
JANUARY 28-31, 2013 SANTA CLARA CONVENTION CENTER Understanding Apparent Increasing Random Jitter with Increasing PRBS Test Pattern Lengths 9-WP6 Dr. Martin Miller The Trend and the Concern The demand
More informationVeracity Managing Uncertain Data. Skript zur Vorlesung Datenbanksystem II Dr. Andreas Züfle
Veracity Managing Uncertain Data Skript zur Vorlesung Datenbanksystem II Dr. Andreas Züfle Geo-Spatial Data Huge flood of geo-spatial data Modern technology New user mentality Great research potential
More informationSimple Large-scale Relation Extraction from Unstructured Text
Simple Large-scale Relation Extraction from Unstructured Text Christos Christodoulopoulos and Arpit Mittal Amazon Research Cambridge Alexa Question Answering Alexa, what books did Carrie Fisher write?
More informationMAS.160 / MAS.510 / MAS.511 Signals, Systems and Information for Media Technology Fall 2007
MIT OpenCourseWare http://ocw.mit.edu MAS.160 / MAS.510 / MAS.511 Signals, Systems and Information for Media Technology Fall 2007 For information about citing these materials or our Terms of Use, visit:
More informationThe Self-Avoiding Walk (Probability And Its Applications) By Neal Madras;Gordon Slade
The Self-Avoiding Walk (Probability And Its Applications) By Neal Madras;Gordon Slade If you are searching for a book by Neal Madras;Gordon Slade The Self-Avoiding Walk (Probability and Its Applications)
More informationOutlier-Robust Estimation of GPS Satellite Clock Offsets
Outlier-Robust Estimation of GPS Satellite Clock Offsets Simo Martikainen, Robert Piche and Simo Ali-Löytty Tampere University of Technology. Tampere, Finland Email: simo.martikainen@tut.fi Abstract A
More informationDiscrete Structures for Computer Science
Discrete Structures for Computer Science William Garrison bill@cs.pitt.edu 6311 Sennott Square Lecture #23: Discrete Probability Based on materials developed by Dr. Adam Lee The study of probability is
More informationSimple Large-scale Relation Extraction from Unstructured Text
Simple Large-scale Relation Extraction from Unstructured Text Christos Christodoulopoulos and Arpit Mittal Amazon Research Cambridge Alexa Question Answering Alexa, what books did Carrie Fisher write?
More informationCH 20 NUMBER WORD PROBLEMS
187 CH 20 NUMBER WORD PROBLEMS Terminology To double a number means to multiply it by 2. When n is doubled, it becomes 2n. The double of 12 is 2(12) = 24. To square a number means to multiply it by itself.
More informationMachine Learning. Classification, Discriminative learning. Marc Toussaint University of Stuttgart Summer 2014
Machine Learning Classification, Discriminative learning Structured output, structured input, discriminative function, joint input-output features, Likelihood Maximization, Logistic regression, binary
More informationOutcome Forecasting in Sports. Ondřej Hubáček
Outcome Forecasting in Sports Ondřej Hubáček Motivation & Challenges Motivation exploiting betting markets performance optimization Challenges no available datasets difficulties with establishing the state-of-the-art
More informationResearch Seminar. Stefano CARRINO fr.ch
Research Seminar Stefano CARRINO stefano.carrino@hefr.ch http://aramis.project.eia- fr.ch 26.03.2010 - based interaction Characterization Recognition Typical approach Design challenges, advantages, drawbacks
More informationSTAT Statistics I Midterm Exam One. Good Luck!
STAT 515 - Statistics I Midterm Exam One Name: Instruction: You can use a calculator that has no connection to the Internet. Books, notes, cellphones, and computers are NOT allowed in the test. There are
More informationA Weighted Least Squares Algorithm for Passive Localization in Multipath Scenarios
A Weighted Least Squares Algorithm for Passive Localization in Multipath Scenarios Noha El Gemayel, Holger Jäkel, Friedrich K. Jondral Karlsruhe Institute of Technology, Germany, {noha.gemayel,holger.jaekel,friedrich.jondral}@kit.edu
More informationLiterature Look for these books in a library. Point out shapes and how they can be found in everyday objects. Vocabulary Builder. Home Activity.
12 Chapter Dear Family, My class started Chapter 12 this week. In this chapter, I will describe and combine two-dimensional shapes. I will learn about equal shares, halves, and fourths. Love, Vocabulary
More informationLecture 15. Turbo codes make use of a systematic recursive convolutional code and a random permutation, and are encoded by a very simple algorithm:
18.413: Error-Correcting Codes Lab April 6, 2004 Lecturer: Daniel A. Spielman Lecture 15 15.1 Related Reading Fan, pp. 108 110. 15.2 Remarks on Convolutional Codes Most of this lecture ill be devoted to
More informationLesson 47. A 30X zoom lens
Lesson 47. A 30X zoom lens Lesson 38 showed how to design an 8X zoom lens with no starting configuration. Now we will do a more difficult job, aiming for a zoom ratio of 30X. This exercise will use many
More informationMobility Patterns in Microcellular Wireless Networks
Carnegie Mellon University Research Showcase @ CMU Department of Engineering and Public Policy Carnegie Institute of Technology 3-23 Mobility Patterns in Microcellular Wireless Networks Suttipong Thajchayapong
More informationfinal examination on May 31 Topics from the latter part of the course (covered in homework assignments 4-7) include:
The final examination on May 31 may test topics from any part of the course, but the emphasis will be on topic after the first three homework assignments, which were covered in the midterm. Topics from
More informationEssential Question How can you list the possible outcomes in the sample space of an experiment?
. TEXAS ESSENTIAL KNOWLEDGE AND SKILLS G..B Sample Spaces and Probability Essential Question How can you list the possible outcomes in the sample space of an experiment? The sample space of an experiment
More informationBattleship as a Dialog System Aaron Brackett, Gerry Meixiong, Tony Tan-Torres, Jeffrey Yu
Battleship as a Dialog System Aaron Brackett, Gerry Meixiong, Tony Tan-Torres, Jeffrey Yu Abstract For our project, we built a conversational agent for Battleship using Dialog systems. In this paper, we
More informationChapter 3: Resistive Network Analysis Instructor Notes
Chapter 3: Resistive Network Analysis Instructor Notes Chapter 3 presents the principal topics in the analysis of resistive (DC) circuits The presentation of node voltage and mesh current analysis is supported
More information15 Discrete-Time Modulation
15 Discrete-Time Modulation The modulation property is basically the same for continuous-time and discrete-time signals. The principal difference is that since for discrete-time signals the Fourier transform
More informationThe Munich 2011 CHiME Challenge Contribution: BLSTM-NMF Speech Enhancement and Recognition for Reverberated Multisource Environments
The Munich 2011 CHiME Challenge Contribution: BLSTM-NMF Speech Enhancement and Recognition for Reverberated Multisource Environments Felix Weninger, Jürgen Geiger, Martin Wöllmer, Björn Schuller, Gerhard
More informationInstrumental Considerations
Instrumental Considerations Many of the limits of detection that are reported are for the instrument and not for the complete method. This may be because the instrument is the one thing that the analyst
More informationTHE CHALLENGES OF SENTIMENT ANALYSIS ON SOCIAL WEB COMMUNITIES
THE CHALLENGES OF SENTIMENT ANALYSIS ON SOCIAL WEB COMMUNITIES Osamah A.M Ghaleb 1,Anna Saro Vijendran 2 1 Ph.D Research Scholar, Department of Computer Science, Sri Ramakrishna College of Arts and Science,(India)
More informationThe Game-Theoretic Approach to Machine Learning and Adaptation
The Game-Theoretic Approach to Machine Learning and Adaptation Nicolò Cesa-Bianchi Università degli Studi di Milano Nicolò Cesa-Bianchi (Univ. di Milano) Game-Theoretic Approach 1 / 25 Machine Learning
More informationJoint Distributions, Independence Class 7, Jeremy Orloff and Jonathan Bloom