Potential and Limitations of Commercial Sentiment Detection Tools

Fatih Uzdilli, joint work with Mark Cieliebak and Oliver Dürr
03.12.2013 @ ESSEM 13

About Me
Fatih Uzdilli
Institute of Applied Information Technology (InIT), ZHAW, Winterthur, Switzerland
More about me: home.zhaw.ch/~uzdi
Research Interest: Information Retrieval, Machine Learning, Sentiment Analysis
Background: Software Engineer, Social Media Monitoring, Search Technologies

Abstract
Evaluation of 9 commercial sentiment tools on approx. 30,000 short texts.
Best commercial tools have accuracy of only 60%.
Combining all tools using Random Forest improved the accuracy.

Motivation
Scientific results for sentiment detection: «very good performance: > 80% accuracy»
Blog posts about commercial tools: «very poor quality, unusable»

How good is commercial Sentiment Detection?
Is there potential for improvement?

Evaluation Setup
7 Public Text Corpora
  Single Statements
  Different Media Types: Tweet, News, Review, Speech Transcript
  Total: 28653 Texts
9 Commercial APIs
  Stand-alone
  Free for this evaluation
  Input: Arbitrary Text
  Output: POSITIVE, NEGATIVE, or OTHER (neutral / mixed)

Tool Accuracy
[Figure: accuracy (y-axis, 0.2 to 0.8) per corpus: C1 (Tweets), C2 (Quotations), C3 (Reviews), C4 (Headlines), C5 (Reviews), C6 (Reviews), C7 (News).]
Averages over all corpora:
  Best Tool per Corpus: 61%
  Average of All Tools: 52%
  Worst Tool per Corpus: 40%
  Overall Best Tool: 59%
  Overall Worst Tool: 45%
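The slides do not spell out how accuracy is computed. A minimal sketch, assuming it is the fraction of texts for which a tool's label matches the corpus annotation; the function name `accuracy` and the sample labels are hypothetical and not taken from the evaluation:

```python
def accuracy(predicted, annotated):
    """Fraction of texts where the tool's label equals the annotation.

    Labels are "+", "-" or "o" (positive, negative, other).
    Assumption: every text counts equally and no text is skipped.
    """
    if len(predicted) != len(annotated):
        raise ValueError("need one prediction per annotated text")
    if not annotated:
        raise ValueError("cannot compute accuracy on an empty corpus")
    hits = sum(1 for p, a in zip(predicted, annotated) if p == a)
    return hits / len(annotated)


# Hypothetical example: one tool's output on four texts.
tool_output = ["+", "-", "o", "+"]
annotations = ["+", "-", "-", "+"]
print(accuracy(tool_output, annotations))
```

Computing this per tool and per corpus would give the values needed for curves such as "Best Tool per Corpus" and "Average of All Tools".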

Further Findings
Longer texts are hard to classify
Corpus annotations might be erroneous

Can a Meta-Classifier do better?
1st Approach: Majority Classifier
Sentiment with most votes chosen
Illustration:

         api1  api2  api3  api4  api5  api6  api7  Majority
Text 1    +     +     -     o     -     +     o       +
Text 2    -     +     +     -     -     -     -       -
Text 3    -     o     +     +     +     +     -       +
Text n    o     o     +     o     -     o     o       o
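The majority vote from the illustration can be sketched in a few lines. The slides do not say how ties are resolved; this sketch assumes a tie goes to the label that appears first among the tool outputs, and the name `majority_vote` is hypothetical:

```python
from collections import Counter


def majority_vote(votes):
    """Return the sentiment label with the most votes.

    votes: one label per tool, each "+", "-" or "o".
    Assumption (not from the slides): on a tie, the label that
    was seen first wins, which is how Counter.most_common orders
    equal counts.
    """
    if not votes:
        raise ValueError("need at least one vote")
    return Counter(votes).most_common(1)[0][0]


# Rows from the illustration (api1 .. api7).
rows = {
    "Text 1": "++-o-+o",
    "Text 2": "-++----",
    "Text 3": "-o++++-",
    "Text n": "oo+o-oo",
}
for name, votes in rows.items():
    print(name, majority_vote(list(votes)))
```

On these four rows the sketch reproduces the Majority column of the illustration.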

Majority Classifier beats Average
[Figure: accuracy (y-axis, 0.2 to 0.8) per corpus C1 (Tweets) to C7 (News). Series: Best Tool per Corpus, Average of All Tools, Worst Tool per Corpus, Majority Classifier.]

2nd Approach: Random Forest

         api1  api2  api3  api n  annotation  used for
Text 1    +     -     +     o     +           Train
Text 2    -     +     o     +     -           Train
Text 3    -     o     -     +     -           Train
Text 4    +     o     +     -     +           Train
Text 5    +     o     +     o     o           Train
Text 6    +     o     o     -     o           Train
Text 7    +     -     +     o     unknown     Predict
Text 8    +     +     o     -     unknown     Predict
Text 9    o     -     +     o     unknown     Predict

Random Forest Classifier output for Texts 7 to 9: +, +, o
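The train/predict setup above can be sketched with scikit-learn's `RandomForestClassifier`. The slides do not name a library, the feature encoding, or the hyperparameters, so the ordinal encoding, `n_estimators=100` and `random_state=0` below are assumptions. With only six training rows, the predictions need not match the illustrative "+, +, o" on the slide.

```python
from sklearn.ensemble import RandomForestClassifier

# Assumption: tool outputs are encoded ordinally; one-hot would also work.
ENCODE = {"-": -1, "o": 0, "+": 1}


def encode(rows):
    """Turn rows of tool labels into numeric feature vectors."""
    return [[ENCODE[label] for label in row] for row in rows]


# Texts 1-6 from the slide: tool outputs (api1, api2, api3, api n)
# together with the human annotation used as the training target.
train_votes = ["+-+o", "-+o+", "-o-+", "+o+-", "+o+o", "+oo-"]
train_labels = ["+", "-", "-", "+", "o", "o"]

# Texts 7-9: tool outputs only, annotation unknown.
new_votes = ["+-+o", "++o-", "o-+o"]

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(encode(train_votes), train_labels)

predictions = forest.predict(encode(new_votes))
print(list(predictions))
```

Unlike the majority vote, the forest can learn which tools to trust and how their outputs interact, which is one plausible reason it can outperform the best single tool.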

Random Forest Beats Best Single Tool
[Figure: accuracy (y-axis, 0.2 to 0.8) per corpus C1 (Tweets) to C7 (News). Series: Best Tool per Corpus, Average of All Tools, Worst Tool per Corpus, Majority Classifier, Random Forest Classifier.]

Summary
Best Tool: 59% Accuracy
Random Forest combination: Up to 9% improvement