@jamescosullivan · josullivan.org · University of Sheffield

#UoRopen
THE CHALLENGES OF DIGITAL HUMANITIES: COMMON REQUIREMENTS FOR HUMANITIES RESEARCHERS
James O'Sullivan, University of Sheffield
@jamescosullivan
josullivan.org

DIGITAL HUMANITIES INSTITUTE
@DHISHEF
~20-25 live projects, all externally funded
External partners in higher education and other public and commercial sectors
DH as service
Director: Michael Pidd
Digital Humanities Research Associate
  Build capacity
  Computer Science / Literary Studies
  Electronic Literature / Computer-assisted Criticism

OVERVIEW
Defining DH
Requirements & Challenges
Case Studies

WHAT IS A DH PROJECT?
It can be big: computational linguistic techniques and data visualisation used to identify lexical patterns in 250,000 printed books (approx. 30 million pages)...
...but it doesn't have to be.

THE DH LIFECYCLE
Acquisition & Processing: Scope, Preparation, Management
Analysis: Adding value, Visualisation, Interpretation
Dissemination: Representation, Sharing

ACQUISITION & PROCESSING
Scope
What is your research question?
How can computation help you answer that question?
What data do you need? How can you get it?
Big data is relative to current infrastructure and processing conventions.
The total archive of documents held by The National Archives dating up to the 1970s is smaller than the Home Office's annual deposits.

ACQUISITION & PROCESSING
How you prepare and structure your data can have significant repercussions for the questions you ask and the results you receive.

ACQUISITION & PROCESSING
Gathering your data
How reliable is your source?
Do you need to clean the content?
  OCR vs born-digital
  Outsourced, in-house, or crowdsourced transcription?
  How clean is clean?
Which edition are you getting?
What labour will be required?
  Correcting errors
  Spelling conventions
  Removing boilerplate
  Critical commentary, etc.
Copyright/licensing issues
Example: British Library Nineteenth Century Newspapers
  A keyword search for "pidd" gives 2,730 results.
  Regularly cited by Jisc as the most used online resource, but it has an extremely high error rate; it is little more than a substitute for microfilm.
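The cleaning tasks listed above (removing boilerplate, repairing OCR line breaks, regularising spelling) can be sketched in a few lines of Python. This is a minimal illustration, not a recommended pipeline: the boilerplate patterns and the spelling table are hypothetical and would need to be derived from your own sources.

```python
import re

# Hypothetical boilerplate markers; a real project would build this list
# by inspecting its own source material.
BOILERPLATE_PATTERNS = [
    re.compile(r"^\s*page \d+\s*$", re.IGNORECASE),
    re.compile(r"^\s*digitised by .*$", re.IGNORECASE),
]


def clean_text(raw: str) -> str:
    """Drop boilerplate lines, rejoin hyphenated line breaks, collapse whitespace."""
    kept = [
        line
        for line in raw.splitlines()
        if not any(pattern.match(line) for pattern in BOILERPLATE_PATTERNS)
    ]
    text = "\n".join(kept)
    # Rejoin words split across a line break by a hyphen (common in OCR output).
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse any run of whitespace, including newlines, to a single space.
    return re.sub(r"\s+", " ", text).strip()


def normalise_spelling(text: str, variants: dict[str, str]) -> str:
    """Replace whole-word spelling variants using a caller-supplied table."""
    for old, new in variants.items():
        text = re.sub(r"\b" + re.escape(old) + r"\b", new, text)
    return text


if __name__ == "__main__":
    raw = "Page 12\nThe digi-\ntal   archive\nDigitised by Example Ltd\nis large."
    print(clean_text(raw))  # The digital archive is large.
    print(normalise_spelling("They shew the shewing", {"shew": "show"}))
```

Every step here is an editorial decision. Rejoining hyphens, for instance, will also merge a genuine compound such as "born-digital" if it happens to break across lines, so "how clean is clean?" remains a judgement the researcher has to make and document.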

ACQUISITION & PROCESSING
And all of that depends on whether your data is:
  Structured
  Semi-structured
  Unstructured
Datasets in the Humanities are usually:
1. Small (discrete sources created by individuals)
2. Broad (many different types of sources have to be assembled)
3. Complex (because humans are not spreadsheets)

ACQUISITION & PROCESSING
Management
Data management is all about stewardship! It is an ongoing commitment in any DH project. An effort to deliberately care for information resources that enable and inform research saves time and effort both during a project and after it has ended or evolved along new paths of inquiry. A dedicated effort to care for the physical or even digital content of relatively traditional studies aided by our computers seems like a reasonable undertaking, but it will not happen on its own; we have to commit to it!
-- Sarah Pickle (Assessment Librarian at the Claremont Colleges)

ACQUISITION & PROCESSING
Practical Considerations
  Ensure your work has the best chance of being useful to you and others.
  Down the line, you should know where and what your materials are, where they came from, how you produced them in the first place, and how to work with them.
  Create a plain text README!
Technical Considerations
  What are your data's current and long-term storage needs?
  How are you structuring your dataset? Naming conventions etc.
  What backup policies are in place?
  What formats are you using? Plain text, XML, TIFF, JPG?
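One way to act on the "plain text README" advice is to generate a file inventory alongside a short provenance note. The sketch below is only an illustration under stated assumptions: the function names, the `README.txt` filename, and the choice of SHA-256 checksums are not prescribed by the talk.

```python
import hashlib
from datetime import date
from pathlib import Path

README_NAME = "README.txt"  # assumed filename, not a required convention


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_readme(data_dir: Path, title: str, source: str, method: str) -> str:
    """Describe where the data came from and list every file with size and checksum."""
    lines = [
        title,
        "=" * len(title),
        "",
        f"Generated: {date.today().isoformat()}",
        f"Source: {source}",
        f"Method: {method}",
        "",
        "Files (relative path, size in bytes, SHA-256):",
    ]
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        if path.name == README_NAME:
            continue  # do not list the README inside itself
        relative = path.relative_to(data_dir).as_posix()
        lines.append(f"{relative}\t{path.stat().st_size}\t{sha256_of(path)}")
    return "\n".join(lines) + "\n"


def write_readme(data_dir: Path, title: str, source: str, method: str) -> Path:
    """Write the README into the dataset directory and return its path."""
    target = data_dir / README_NAME
    target.write_text(build_readme(data_dir, title, source, method), encoding="utf-8")
    return target
```

Calling `write_readme(Path("corpus"), "My corpus", "where it came from", "how it was produced")` on a hypothetical `corpus` directory would leave a human-readable record next to the data. The checksums also give a simple way to confirm that a backup copy matches the original.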

ANALYSIS
The application of sophisticated computer-assisted methods to Humanities research
What is the added value?
DH methods aren't better, they're just different; they lend new types of evidence.
DH methods allow us to read in different ways, lending quantitative evidence to qualitative arguments.
Capacity: machines are stupid, but they can process information a lot quicker than we can.
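As a very small example of "quantitative evidence", the sketch below counts how often each word occurs in a text relative to its length. The tokeniser is deliberately crude and is an assumption for illustration only; real projects usually rely on established text-analysis tools and more careful tokenisation.

```python
import re
from collections import Counter
from typing import Optional

# Crude tokeniser: runs of lowercase ASCII letters and apostrophes.
# This is an illustrative assumption and will mishandle accented characters.
TOKEN_PATTERN = re.compile(r"[a-z']+")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def relative_frequencies(text: str, top_n: Optional[int] = None) -> dict[str, float]:
    """Return each word's share of all tokens, most frequent first."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {word: count / total for word, count in counts.most_common(top_n)}


if __name__ == "__main__":
    sample = "The cat and the hat"
    for word, share in relative_frequencies(sample, top_n=3).items():
        print(f"{word}\t{share:.2f}")
```

Relative rather than raw counts make texts of different lengths comparable. Whether a difference in frequency actually means anything is still for the researcher to decide.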

We cannot say precisely what literature is, but we can recognise the literary when we see it, and by extension, the various characteristics of that which makes it so. If we view literary texts as systems, however formal or informal, these systems will be comprised of a series of elements, which, while detectable yet insignificant to a machine, can be useful to a human in the construction of meaning. Digital Literary Studies seeks to equip literary and cultural scholars with the instruments necessary to isolate specific literary elements and to use these to conduct some experiment or calculation in an effort to provide additional insight.

ANALYSIS
Visualisation & Interpretation
How are you going to represent your results?
Macro-level information is most intuitive when visual.
Visualisation is not just about dissemination, it's about interpretation.
  Detecting trends and anomalies
Misinterpretation is always possible.
  Discoveries vs mistakes
  Researcher determines significance

ANALYSIS
Will your analysis be limited by your data?
Irish Literary Network
(This is on top of typical issues like canonicity etc.)

ANALYSIS
What constitutes a finding?

O'Sullivan, James. "Finn's Hotel and the Joycean Canon." Genetic Joyce Studies 14 (2014).

ANALYSIS
How do we present our findings to Humanities scholars?
Weidman, Sean G. "The Limits of Distinctive Words: Re-evaluating the Gender Marker Debate." Forthcoming in Digital Scholarship in the Humanities (2017).
Should we be conducting experiments that aren't reproducible?
How do you overcome the scholarly issues posed by black box tools?
Is collaboration the answer?
O'Sullivan, James, Diane Jakacki, and Mary Galvin. "Programming in the Digital Humanities." Digital Scholarship in the Humanities 30.1 (2015): i142-47.
Understanding is essential, coding is not.

DISSEMINATION
Representation & Sharing
If you can share your dataset, you should.
  Reproducibility!
  Datasets are most effectively shared when accompanied by documentation of how they were obtained and processed.
Where do you publish your research?
  Many editors and reviewers are still very suspicious of DH methods.
  If we just continue to publish in DH journals, are we simply building a silo?
  Who is our audience?

THANK YOU!
j.c.osullivan@sheffield.ac.uk
@jamescosullivan
josullivan.org