computational social media, lecture 07: crowdsourcing. Daniel Gatica-Perez, 03.06.2016
reminders HW3 (Algorithmic Bias): check email (also on course website); due Thu 09.06.2016. Last lecture of the semester: Fri 10.06.2016, same time
this lecture
1. introduction
2. definitions & categorization of crowdsourcing systems
3. understanding crowdsourcing research: analyzing crowdsourcing workers & systems; designing crowdsourcing systems; open issues
1. introduction
Human Computation (Luis von Ahn, 2004) espgame.org, gwap.com L. von Ahn & L. Dabbish, Labeling Images with a Computer Game, in Proc. ACM CHI 2004 http://www.cs.cmu.edu/~biglou/esp.pdf
image: http://www.interaction-design.org/encyclopedia/social_computing.html
2. definitions & categorization of crowdsourcing systems
term coined by Jeff Howe (2005)
definition of crowdsourcing (source: cooltownstudios): crowdsourcing is an online, distributed problem-solving and production model that leverages the collective intelligence of online communities to serve specific organizational goals. Crowds are given the opportunity to respond to tasks promoted by the organization, and are motivated to respond for a variety of reasons. D. C. Brabham, Crowdsourcing, MIT Press, 2013
key ingredients of crowdsourcing
1. an organization (or entity) that has a task it needs performed
2. a community that is willing to perform the task voluntarily
3. an environment that allows the work to take place and the community to interact with the organization
4. mutual benefit for the organization and the community
D. C. Brabham, Crowdsourcing, MIT Press, 2013
credit (cc): https://www.flickr.com/photos/winton/5837240004/
a problem-focused categorization
knowledge discovery & management: the crowd finds & collects information into a common format; ideal for information gathering, reporting problems, and creation of collective resources (peer-to-patent.org, seeclickfix.com)
broadcast search: the crowd solves empirical problems; ideal for ideation problems with empirically provable solutions, such as scientific problems (innocentive.com)
peer-vetted creative production: the crowd creates & selects creative ideas; ideal for ideation problems where solutions are a matter of taste, such as design & aesthetics (threadless.com, crashthesuperbowl.com, nextstopdesign.com)
D. C. Brabham, Crowdsourcing, MIT Press, 2013
one world (MadV, Nov. 2006) https://www.youtube.com/watch?v=fhc0xjrgi60
https://www.youtube.com/watch?v=z-bzxpoch-e 2000 video replies: the most-responded video
http://hypetrak.com/2014/03/celebrate-international-happy-day-with-pharrell-and-the-un-foundation/ http://24hoursofhappiness.com/
http://24hoursofhappiness.com/
a problem-focused categorization
knowledge discovery & management: the crowd finds & collects information into a common format; ideal for information gathering, reporting problems, and creation of collective resources (peer-to-patent.org, seeclickfix.com)
broadcast search: the crowd solves empirical problems; ideal for ideation problems with empirically provable solutions, such as scientific problems (innocentive.com)
peer-vetted creative production: the crowd creates & selects creative ideas; ideal for ideation problems where solutions are a matter of taste, such as design & aesthetics (threadless.com, crashthesuperbowl.com, nextstopdesign.com)
distributed human intelligence tasking: the crowd analyzes large amounts of information; ideal for large-scale data analysis where human intelligence is more effective/efficient than machine intelligence (mturk.com, crowdflower.com)
D. C. Brabham, Crowdsourcing, MIT Press, 2013
HIT: Human Intelligence Task https://www.mturk.com/mturk/welcome
https://www.mturk.com/mturk/preview?groupid=3synypc5i7oueocqcjj9deue2zzhyh
connections between crowdsourcing & social media
crowdsourcing & social media face similar challenges:
+ motivating / incentivizing users & workers
+ sustainability of site activity
+ community management
social media as implicit crowdsourcing
crowdsourcing can use social media as channels:
+ use Twitter to crowdsource city reports
+ post videos on YouTube
+ use Facebook to recruit workers
crowdsourcing can involve online social interaction:
+ vote / rate / comment on other people's contributions
credit (cc): https://www.flickr.com/photos/jamescridland/613445810
3. understanding crowdsourcing research
categorizing crowdsourcing research:
analyzing crowdsourcing workers & systems
designing crowdsourcing systems
crowdsourcing uses & applications
3.1 analyzing crowdsourcing workers & systems
example 1: who are the crowdworkers? J. Ross, L. Irani, M.S. Silberman, A. Zaldivar, and B. Tomlinson. Who are the Crowdworkers? Shifting Demographics in Mechanical Turk, in Proc. ACM CHI Extended Abstracts, 2010.
reported annual household income by country (USD) J. Ross, L. Irani, M.S. Silberman, A. Zaldivar, and B. Tomlinson. Who are the Crowdworkers? Shifting Demographics in Mechanical Turk, in Proc. ACM CHI Extended Abstracts, 2010.
how well do crowdworkers do their job?
example 2: crowdworker modeling. issues: 1. annotators might not be of the same quality; 2. objects might not have the same difficulty; 3. ground truth might not exist at all. this and the next 3 slides are taken from: Marco Tagliasacchi, Crowdsourcing for Multimedia Retrieval, Summer School on Social Media Modeling and Search, 2012. http://www.slideshare.net/cubrikproject/crowdsourcing-for-multimedia-retrieval
aggregating judgments a classifier M. Tagliasacchi, Crowdsourcing for Multimedia Retrieval, Summer School on Social Media Modeling and Search, Sep. 2012.
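Before any annotator modeling, the simplest way to aggregate redundant judgments is majority voting. A minimal sketch in Python (the `annotations` data is a hypothetical example, not from the slides):

```python
from collections import Counter

def majority_vote(votes):
    """Aggregate one object's labels by plurality; ties go to the first-seen label."""
    return Counter(votes).most_common(1)[0][0]

# hypothetical crowd annotations: object id -> list of worker labels
annotations = {
    "img_01": ["cat", "cat", "dog"],
    "img_02": ["dog", "dog", "dog"],
}
labels = {obj: majority_vote(v) for obj, v in annotations.items()}
# labels == {"img_01": "cat", "img_02": "dog"}
```

Majority voting treats every worker as equally reliable, which is exactly the assumption the Dawid & Skene model below relaxes.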
the basic idea (Dawid & Skene, 1979)
i indexes objects, j indexes annotators
alpha_j = P(y_ij = 1 | z_i = 1) (a.k.a. sensitivity)
beta_j = P(y_ij = 0 | z_i = 0) (a.k.a. specificity)
A. P. Dawid and A. M. Skene, Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm, J. of the Royal Statistical Society, Series C, Vol. 28, No. 1, pp. 20-28, 1979
M. Tagliasacchi, Crowdsourcing for Multimedia Retrieval, Summer School on Social Media Modeling and Search, Sep. 2012.
the basic idea (2) (Dawid & Skene, 1979): estimates of the true labels are produced in the E-step; estimates of the annotator parameters are produced in the M-step. The learned parameters can be used to remove poor annotators from a crowdsourced annotation task and to better inform the process. M. Tagliasacchi, Crowdsourcing for Multimedia Retrieval, Summer School on Social Media Modeling and Search, Sep. 2012.
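The E-step/M-step loop can be sketched for the binary case with per-annotator sensitivity alpha_j and specificity beta_j. This is a simplified illustration under the assumption of complete votes and two classes, not the full multi-class Dawid & Skene formulation:

```python
import numpy as np

def dawid_skene(labels, n_iter=50):
    """Simplified binary Dawid-Skene EM.
    labels: (n_objects, n_annotators) array of 0/1 votes, no missing entries.
    Returns posterior P(z_i = 1), and per-annotator sensitivity and specificity."""
    mu = labels.mean(axis=1)  # init posteriors from per-object vote fractions
    for _ in range(n_iter):
        # M-step: alpha_j = P(vote 1 | true 1), beta_j = P(vote 0 | true 0)
        alpha = (mu[:, None] * labels).sum(axis=0) / (mu.sum() + 1e-9)
        beta = ((1 - mu)[:, None] * (1 - labels)).sum(axis=0) / ((1 - mu).sum() + 1e-9)
        p = mu.mean()  # class prior
        # E-step: posterior over true labels given current annotator parameters
        like1 = p * np.prod(np.where(labels == 1, alpha, 1 - alpha), axis=1)
        like0 = (1 - p) * np.prod(np.where(labels == 0, beta, 1 - beta), axis=1)
        mu = like1 / (like1 + like0 + 1e-12)
    return mu, alpha, beta
```

On synthetic data with three accurate annotators and one who always flips the label, the recovered alpha/beta expose the bad annotator, which is the "remove poor annotators" use mentioned above.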
further reading, for those who want to know more: A. Sheshadri and M. Lease, SQUARE: A Benchmark for Research on Computing Crowd Consensus, in Proc. 1st AAAI Conference on Human Computation (HCOMP), 2013
3.2. designing crowdsourcing systems
let's say you have a job for MTurk
crowdsourcing tasks in MTurk: tricks of the trade
ownership: host your data on MTurk or not?
biases: MTurkers aren't a fair sample of the world population
engagement: is the task fun? entertaining?
complexity: how difficult or time-consuming is the task?
payment: what is a fair rate for workers?
interaction: create a community of trusted workers
geography: limit workers to one country/region?
spam: use tricks to verify that no spammers fill tasks
qualifications: engage workers of recognized quality
quality: use a model to reduce the effect of bad responses
W. Mason and S. Suri, Conducting behavioral research on Amazon's Mechanical Turk, Behavior Research Methods, Vol. 44, No. 1, pp. 1-23, Mar. 2012
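The spam and qualification points above are commonly implemented by embedding gold questions (tasks with known answers) and scoring each worker on them. A minimal sketch, with hypothetical data structures for responses and gold answers:

```python
def worker_gold_accuracy(responses, gold):
    """responses: {worker_id: {task_id: answer}}; gold: {task_id: correct_answer}.
    Returns each worker's accuracy on the embedded gold tasks."""
    acc = {}
    for worker, answers in responses.items():
        scored = [(t, a) for t, a in answers.items() if t in gold]
        if scored:
            acc[worker] = sum(a == gold[t] for t, a in scored) / len(scored)
    return acc

def trusted_workers(responses, gold, threshold=0.8):
    """Keep workers whose gold accuracy meets the threshold; drop likely spammers."""
    return {w for w, a in worker_gold_accuracy(responses, gold).items() if a >= threshold}

# hypothetical example: g1/g2 are gold tasks, t9 is a regular task
responses = {
    "w1": {"g1": "a", "g2": "b", "t9": "x"},
    "w2": {"g1": "a", "g2": "c"},
    "w3": {"g1": "z", "g2": "z"},
}
gold = {"g1": "a", "g2": "b"}
keep = trusted_workers(responses, gold)
# keep == {"w1"}
```

Only answers from the retained workers would then be passed to an aggregation step such as majority voting or Dawid-Skene.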
for further details paper: W. Mason and S. Suri, Conducting behavioral research on Amazon's Mechanical Turk, Behavior Research Methods, Vol. 44, No. 1, pp. 1-23, Mar. 2012 slides: http://www.slideshare.net/cloud/sa1-how-to-use-mechanical-turkfor-behavioral-research
why do crowds participate? motivations (Deci & Ryan, 1985)
intrinsic: doing an activity for its inherent satisfactions rather than for some separable consequence
+ fun
+ challenge
extrinsic: doing an activity in order to attain some separable outcome
+ financial reward
+ social pressure
extrinsic motivations tend to undermine intrinsic ones
E. L. Deci and R. M. Ryan, Intrinsic Motivation and Self-Determination in Human Behavior, Plenum Press, 1985
D. C. Brabham, Crowdsourcing, MIT Press, 2013
why do crowds participate?
+ to earn money
+ to develop creative skills
+ to network with other creative professionals
+ to build a portfolio for future employers
+ to challenge oneself to solve a tough problem
+ to socialize and make friends
+ to pass time when bored
+ to contribute to a larger project of common interest
+ to share with others
+ to have fun
D. C. Brabham, Crowdsourcing, MIT Press, 2013
credit (cc): https://www.flickr.com/photos/quintanomedia/13076281575
3.3. (a few other) open issues
innovation: crowdsourcing & Google Glass. OpenGlass project: applications that help blind and visually impaired users identify objects via crowdsourcing https://www.youtube.com/watch?v=cedg0k1hsh8 http://www.openshades.com/
discovery: citizen science. "Foldit attempts to predict the structure of a protein by taking advantage of humans' puzzle-solving intuitions and having people play competitively to fold the best proteins." http://www.scientificamerican.com/article/foldit-gamers-solve-riddle/
https://www.galaxyzoo.org/
crowdsourcing as practice
intellectual property & copyright
+ who owns my idea if it wins? if it doesn't?
unfair business practices
+ crowdsourcing used for manipulation: fake reviews of products & services
labor rights
+ fair payment, spec work, transnational jobs
+ does crowdsourcing drive down salaries?
credit (cc): https://www.flickr.com/photos/61115981@n06/7476310372
D. C. Brabham, Crowdsourcing, MIT Press, 2013
Mary L. Gray (MSR): ethnographic study of the people who offer projects and the workers who provide labor through Mechanical Turk. "There's a conversation in media studies right now about 'immaterial labor' as part of the new economy of digital media. I want to look at this other world of labor to find out who 'Mechanical Turks' are, what motivates 'buyers' and 'sellers' of MTurk skills, and what kind of labor they feel they're exchanging. The project offers a way to think through the politics and ethics of MTurk and the material labor it reflects." http://www.latimes.com/opinion/op-ed/la-oe-0110-digital-turk-work-20160110-story.html http://research.microsoft.com/en-us/people/mlg/
what to remember
crowdsourcing & social media have many common points: from online participation to incentives & sustainability
crowdsourcing is a very active research field: understanding of workers & systems; models & frameworks for system design; new uses & applications
many issues remain open, both technical & societal
questions? gatica@idiap.ch daniel.gatica-perez@epfl.ch @dgaticaperez