The Uses of Big Data in Social Research. Ralph Schroeder, Professor & MSc Programme Director

The Uses of Big Data in Social Research Ralph Schroeder, Professor & MSc Programme Director Hong Kong University of Science and Technology, March 6, 2013

Source: Leonard John Matthews, CC-BY-SA (http://www.flickr.com/photos/mythoto/3033590171)

Big data are data that are unprecedented in scale and scope in relation to a given phenomenon. They are often streams of data (rather than fixed datasets), accumulating large volumes, often at high velocity. Is the tail of the availability of big data and computational methods wagging the dog of good research questions and advancing social science? If not, how do big data advance research? What are the opportunities and challenges?

Business Value versus Academic Value
Strategic Knowledge
- Generally time-limited (with exceptions)
- Value comes from knowing what your competitors don't
- Often has high monetary value if it can be exploited

Business Value versus Academic Value
Durable Knowledge
- Less time-limited (with exceptions)
- Value comes from adding to the world's knowledge (the global brain is cumulative/scientific)
- Rarely has direct monetary value, but has value in creating the possibility of both future knowledge and future exploitation and commercial uses

Commercial/Governmental versus Social Science Research: Diverging Aims, with Overlap
Manipulation of Behaviour: For aims limited to research in social science. The threat of social science knowledge, and of commercial/governmental knowledge and control of the natural environment.

Big Data Analytics
- Access to data
- Cost of analytical tools
- Skills to use the tools
- Why should anyone share?
- How different skills and disciplines work together
- Starting with questions, or starting with data?
- Prediction?
- A/B and other experiments
- Gaps?
- Futures

From Big Data to Big (Hi-res) Picture
- Marketing
- Tailoring
- Forecasting
- Prediction
- Complex Trends
- Linking datasets plus modelling

See http://www.oii.ox.ac.uk/research/projects/?id=98

Twitter-bots
OII master's students Alexander Furnas and Devin Gaffney saw a large spike in then-US presidential candidate Mitt Romney's Twitter followers and decided to look at the new followers:
Furnas, A. and Gaffney, D. (2012). Statistical Probability That Mitt Romney's New Twitter Followers Are Just Normal Users: 0%. The Atlantic, July 31, http://www.theatlantic.com/technology/archive/2012/07/statistical-probability-that-mitt-romneys-new-twitter-followers-are-just-normal-users-0/260539/ (accessed August 31, 2012).

Source: http://www.flickr.com/photos/nakedcharlton/597075830/
Source: http://www.flickr.com/photos/jamescridland/613445810/

the distinctiveness of the network of mathematical practitioners is that they focus their attention on the pure, contentless form of human communicative operations: on the gestures of marking items as equivalent and of ordering them in series, and on the higher-order operations which reflexively investigate the combinations of such operations
mathematical rapid-discovery science
the lineage of techniques for manipulating formal symbols representing classes of communicative operations

Case 1: Search engine behaviour
Waller's analysis of Australian Google Users
Key findings:
- Mainly leisure
- > 2% contemporary issues
- No perceptible class differences
Novel advance:
- Unprecedented insight into what people search for
Challenge:
- Replicability
- Securing access to commercial data

Source: Waller, V. (2011). Not Just Information: Who Searches for What on the Search Engine Google? Journal of the American Society for Information Science & Technology 62(4): 761-775.
"Surprisingly, the distribution of types of search query did not vary significantly across the different Lifestyle Groups (p>0.01)."

Case 2: Large-scale text analysis
Michel et al.'s culturomic analysis of 5 million digitized Google Books, and Heuser & Le-Khac's analysis of 2,779 19th-century British novels
Key findings:
- Patterns of key terms
- Industrialization tied to shift from abstract to concrete words
Novel advance:
- Replicability, extension to other areas, systematic analysis of cultural materials
Challenge:
- Data quality

J Michel et al. Science 2011;331:176-182
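
As a rough illustration of the kind of key-term tracking described in Case 2, the sketch below computes how often a term occurs per year as a share of all words in a small dated corpus. The toy corpus, the function name `relative_frequency_by_year`, and the simple regular-expression tokenizer are assumptions made for this example; this is not the pipeline used by Michel et al. or by Heuser & Le-Khac.

```python
import re
from collections import Counter

def relative_frequency_by_year(docs, term):
    """Return {year: occurrences of term / total tokens published that year}."""
    term = term.lower()
    hits = Counter()    # occurrences of the term per year
    totals = Counter()  # all tokens per year
    for year, text in docs:
        # Naive tokenizer (assumption): lowercase runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        totals[year] += len(tokens)
        hits[year] += tokens.count(term)
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}

# Hypothetical toy corpus of (publication year, text) pairs.
toy_corpus = [
    (1800, "virtue and honour guide the heart"),
    (1800, "the virtue of patience"),
    (1900, "the hard road and the red door"),
]
freqs = relative_frequency_by_year(toy_corpus, "virtue")
```

Normalizing by the total number of words per year matters because the number of digitized books differs greatly between periods, so raw counts alone would mostly reflect corpus size.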

| Platform | Paper | Size of data in relation to phenomenon investigated | Theoretical question/practical aim | Key findings |
|---|---|---|---|---|
| Facebook | Backstrom et al. (2012) | 69 billion friendship links between 721 million Facebook users | Re-examine Milgram's "six degrees of separation" online | Four degrees of separation on Facebook |
| Facebook | Ugander et al. (2012) | 54 million invitation emails to Facebook users | How does structure of contacts affect invitation acceptance? | Not number of contacts, but number of distinct contexts, matters for acceptance |
| Facebook | Bond et al. (2012) | 600,000 Facebook users | Facebook experiment about how to mobilize voters | Voters can be mobilized via Facebook friends more than via informational messages |
| Twitter | Kwak et al. (2010) | 1.47 billion directed Twitter relations | Is Twitter a broadcast medium or a social network? | Most use is for information, not as a social network |
| Twitter | Cha et al. (2010) | 1.7 billion tweets among 54 million users | Who influences whom? | Top influentials dominate, but some variation by topic |
| Twitter | Bakshy et al. (2011) | 1.6 million Twitter users | Who influences whom? | Ordinary user influencers can sometimes be more effective than top influencers |
| Wikipedia | Loubser (2009) | All Wikipedia activity | How is editing organized? | Administrators can impact negatively on participation |
| Wikipedia | Yasseri, Kertesz (2012) | Editorial activity on Wikipedia, especially reverts | Understanding conflict and collaboration | Types of conflicts can be modelled |
| Wikipedia | West, Weber and Castillo (2012) | Wikipedia contributions related to Yahoo! browsing | What characterizes Wikipedia contributors' information behaviour compared to Wikipedia readers and non-readers? | Wikipedia contributors are more information hungry, especially about their topics |
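
The "degrees of separation" figure in the table can be read as the average shortest-path length between pairs of users in the friendship graph. The following is a minimal sketch of that measure using exact breadth-first search on a hypothetical four-person graph; the graph, the names, and the function `average_separation` are assumptions for illustration. Exact search of this kind is only practical for small graphs and is not presented here as the method of the cited studies.

```python
from collections import deque

def average_separation(graph):
    """Mean shortest-path length over all connected ordered pairs.

    `graph` is an undirected graph given as {node: [neighbours]}.
    """
    total = 0  # sum of all pairwise distances found
    pairs = 0  # number of reachable (source, target) pairs
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0

# Hypothetical friendship chain: ann - bob - cat - dan.
toy_friends = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat"],
}
mean_hops = average_separation(toy_friends)
```

On this toy chain the two people at the ends are three hops apart, while neighbours are one hop apart; the function averages over all such pairs.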

Scientificity and Big Data: Pro and Con
Pro
- Replicability, extension to new domain
- Total datasets
- Powerful relation of data to object
Con
- Limited access to object, skills needed
- Datasets capture limited dimensions, and about one object
- Object in isolation, not framed for social change significance

Conclusions
- Savage and Burrows, who ask: are commercial data outpacing social science?
- Boyd and Crawford, who ask: do big data raise ethical and epistemological conundrums?
- ... No ...
- The connection between research technologies and the advance of knowledge
- The threats and opportunities represented by unprecedented windows into people's minds and thoughts
- Does this lead to more scientific (i.e. cumulative) social sciences and humanities?

Implications
For research
- Develop theoretical frame in which to embed big data (for new media), including power/function, relation to traditional media, and role in society
For research policy
- Robust base for advancing research, including shared and open databases
For society
- Awareness of how research can generate transparency and manipulability

Additional readings and references
Bond, Robert et al. (2012). A 61-million-person experiment in social influence and political mobilization, Nature 489: 295-298.
Bruns, A. and Liang, Y.E. (2012). Tools and methods for capturing Twitter data during natural disasters, First Monday, 17(4), http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewarticle/3937/3193
Furnas, A. and Gaffney, D. (2012). Statistical Probability That Mitt Romney's New Twitter Followers Are Just Normal Users: 0%. The Atlantic, July 31, http://www.theatlantic.com/technology/archive/2012/07/statistical-probability-that-mitt-romneys-new-twitter-followers-are-just-normal-users-0/260539/ (accessed August 31, 2012).
Giles, J. (2012). Making the Links: From E-mails to Social Networks, the Digital Traces Left by Life in the Modern World are Transforming Social Science, Nature, 488: 448-450.
Kwak, H. et al. (2010). What is Twitter, a Social Network or a News Media? Proceedings of the 19th International World Wide Web (WWW) Conference, April 26-30, 2010, Raleigh NC.
Manyika, J. et al. (2011). Big data: the next frontier for innovation, competition and productivity, McKinsey Global Institute, available at: http://www.mckinsey.com/insights/mgi/research/technology_and_innovation/big_data_the_next_frontier_for_innovation (last accessed August 29, 2012).
Silver, Nate. (2012). The Signal and the Noise: The Art and Science of Prediction. London: Allen Lane.
Tancer, B. (2009). Click: What Millions of People are Doing Online and Why It Matters. New York: Harper Collins.
Wu, S., J.M. Hofman, W.A. Mason, and D.J. Watts (2011). Who says what to whom on Twitter, Proceedings of the 20th International Conference on World Wide Web (on Duncan Watts' webpage, http://research.microsoft.com/en-us/people/duncan/, last accessed August 29, 2012).

Ralph Schroeder
ralph.schroeder@oii.ox.ac.uk
http://www.oii.ox.ac.uk/people/?id=26
Oxford Internet Institute
See http://www.oii.ox.ac.uk/research/projects/?id=98