Ethical, Epistemological, Methodological, Social and Other Issues in Web/Social Media Mining

Ethical, Epistemological, Methodological, Social and Other Issues in Web/Social Media Mining
Marko M. Skoric
Department of Communication
PhD Student Workshop
Web Mining for Communication Research
April 22-25, 2014

The Promise of Computational Social Science
- Using all available digital traces to create comprehensive explanations of individual, group and societal behavior
  - Longitudinal, networked, behavioral, cross-linked data
- An increasing share of social interactions is mediated by technology
- Leveraging the exponential increase in both data availability and computational power ("Big Data")
  - Terabytes vs. megabytes of data
- Physics, biology, and chemistry have been transformed by such approaches
  - Why not social science?
- Expected to bring benefits to industries, governments, communities and citizens
  - Increasing productivity and competitiveness, improving the functioning of the public sector

Traditional Social Science
- Typically uses data from:
  - Small-scale, qualitative designs (interviews, focus groups, ethnography)
  - Small-scale, laboratory-based experiments
  - Larger-scale snapshots of mainly self-reported data (surveys) or larger-scale content analyses
- Each of the approaches has inherent limitations, which we have learned to accept and address
- Absence of large-scale behavioral or self-reported data (with some exceptions, e.g. elections and censuses)
- Many issues of validity, generalizability, representativeness, social desirability, artificiality, and invasiveness

Computational Social Science/Web Mining/Social Media Analytics
- Low(er) cost
- (Near) real-time analysis
- Greater variety of topics and contexts
- Unobtrusive measures
- Continuous, longitudinal, panel-type data
- Cross-national/comparative data
- Captures the structure, not just the content
  - Behavioral logs
  - Social conversations (text, audio, & video)
  - Social networks and relationships
- Universe vs. sample

Surveys vs. Web/Social Media Mining
- A survey is a structured, systematic method of collecting a fairly large number of solicited responses
  - Survey data are frequently criticized for being based on self-reports
  - (Preferably) utilizing probability sampling
  - Ability to generalize based on sound statistical/mathematical principles
- Social media messages are more similar to short conversations and are typically neither solicited (at least not by researchers) nor structured
  - Word-of-mouth (WoM) messages
  - Diaries
  - Ethnography

Challenges in Web/Social Media Mining
- Data collection
  - Know-how
  - Open vs. closed/limited API
  - Constantly changing APIs/lack of published specs
  - Scale
- Sampling and representativeness
  - Social media users are different from average citizens
  - Yes, but how, and for how much longer?
  - Self-selection issues
  - Spam and astroturfing
- Analysis
  - Techniques and approaches (problems with black-boxing)
  - Statistical assumptions are often violated (which ones?)
  - Scalability/computational issues
  - Lack of good theories: data-driven research is dominant

Data Collection and Sampling: Lack of Established Reporting Procedures and Standards
- Sampling procedure description from a survey study (from Public Opinion Quarterly)
- Twitter data collection procedure description (from ICWSM)

Challenges in Web/Social Media Mining (continued)
- Social science PhDs are (typically) not sufficiently trained
  - Inherent need for interdisciplinary collaboration (e.g. social scientists + physicists + computational scientists)
- Issues of consent, anonymity and privacy
  - Even anonymized data can be deanonymized
  - 87% of the US population can be identified with 3 variables: gender, date of birth and ZIP code
  - Physicists don't need IRBs, or do they?
- Lack of suitable theoretical models
  - "The end of theory"
- Inertia and lack of understanding in the social science community
  - Challenges with peer review and professional evaluation
  - Where to publish? Journals or conferences?
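The deanonymization point can be made concrete with a small sketch. It counts how many records in a name-stripped table are still unique on the three quasi-identifiers named on the slide. The records are made up, and the function name `unique_fraction` and the field names are hypothetical choices for illustration, not part of the talk.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Share of records whose quasi-identifier combination occurs exactly once."""
    if not records:
        return 0.0
    # Build one tuple per record from the chosen quasi-identifier fields.
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Toy, invented records with names already removed ("anonymized").
records = [
    {"gender": "F", "dob": "1980-03-14", "zip": "02139"},
    {"gender": "F", "dob": "1980-03-14", "zip": "02139"},
    {"gender": "M", "dob": "1975-11-02", "zip": "02139"},
    {"gender": "F", "dob": "1991-07-30", "zip": "10027"},
    {"gender": "M", "dob": "1975-11-02", "zip": "60614"},
]

share = unique_fraction(records, ["gender", "dob", "zip"])
print(f"Unique on gender + date of birth + ZIP: {share:.0%}")
```

A record that is unique on these fields can be re-identified by anyone who holds a second dataset linking the same fields to names, such as a voter roll. In this toy table three of the five records are unique on all three fields, while none is unique on gender alone.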

Challenges in Web/Social Media Mining (continued)
- Broader issues of epistemology and ethics
  - What is knowledge? What is research? What's reality?
  - Numbers (don't) speak for themselves
  - Quantification vs. objectivity
  - Tools R Us
- Is bigger data better data?
  - Self-explanatory? Methodologically sound?
- Are messages on social media platforms genuine and authentic?
  - Or curated, managed, edited?
  - Long tail of participation and content creation (80/20 rule)
- Is Twitter data collected via APIs representative? If so, of what?
  - Firehose, gardenhose & spritzer types of access
  - API characteristics may shift over time (without warning)
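One way to check the 80/20 claim on a given dataset is to compute the share of all messages written by the most active fifth of accounts. The sketch below does this for invented per-user post counts. The function name `top_share` and the numbers are illustrative assumptions.

```python
import math


def top_share(posts_per_user, top_fraction=0.2):
    """Share of all posts written by the most active `top_fraction` of users."""
    counts = sorted(posts_per_user, reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Always include at least one user in the "top" group.
    n_top = max(1, math.ceil(len(counts) * top_fraction))
    return sum(counts[:n_top]) / total


# Invented post counts for ten accounts.
posts_per_user = [120, 40, 10, 8, 6, 5, 4, 3, 2, 2]

print(f"Top 20% of users wrote {top_share(posts_per_user):.0%} of posts")
```

A high value is a reminder that message-level findings mostly describe a small group of heavy users, so results per message and results per user can differ sharply.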

Challenges in Web/Social Media Mining (continued)
- (Really) big data is mostly proprietary (Google, Facebook, Weibo) or owned by governments (national security agencies)
  - Lack of proper data-sharing norms, protocols and procedures
  - Only big players (countries, companies, universities) have the privilege of full access
  - Difficulty in replicating findings
- Supply of web/social media data varies highly across different societies and contexts
  - Dependent on level of technological/infrastructural development
  - Big data divide?
- Global shortage of computational talent

Some Future Directions
- Triangulation of traditional and web/social media mining methods
  - Establishing validity of measures
  - Surveys combined with web/social media mining
- Combining human coders and machine-learning algorithms
  - Amazon.com's Mechanical Turk
- Developing standardized sets of methods and procedures for data collection, processing and analysis
  - Preserving comparability and allowing for replicability
  - Data sharing
- Doing RQ- or theory-driven research
- Social science as scholarship aimed at understanding and/or improving the lives of human beings
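When human coders and a machine-learning classifier label the same messages, a chance-corrected agreement statistic is one common way to validate the automated measure. The sketch below computes Cohen's kappa, a standard choice that the slides do not name, so treat it as one option among several. The labels and the function name `cohens_kappa` are invented for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both raters labeled independently at their own base rates.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented sentiment codes for ten messages.
human = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "neg", "pos"]
machine = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg", "neg", "pos"]

print(f"kappa = {cohens_kappa(human, machine):.2f}")
```

Here the two sets of labels agree on 8 of 10 messages, but with balanced classes half of that agreement is expected by chance, so kappa comes out at about 0.6 rather than 0.8.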

Scientists and Conferences to Follow
- Scientists: Albert-László Barabási, Alex "Sandy" Pentland, Lada Adamic, David Lazer, Gary King, Jure Leskovec, Michael Macy, Noshir Contractor
- Conferences: ICWSM, iConference, WSDM, WebSci, WWW

Useful References
- boyd, d. & Crawford, K. 2012. Critical Questions for Big Data. Information, Communication & Society 15 (5): 662-679.
- King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science 331 (11 February): 719-721.
- Manovich, L. 2012. Trending: The Promises and the Challenges of Big Social Data. In Debates in the Digital Humanities, edited by Matthew K. Gold. University of Minnesota Press.

Thank you! Questions? Any comments?