Social Networks and Archival Context R&D to Cooperative

Transcription:

Social Networks and Archival Context R&D to Cooperative Library Science Talks September 2017 CERN Geneva / Zentralbibliothek Zürich

Overview
Archival records and the description of people
R&D: Objectives and Results
From R&D to Cooperative: Objectives and Results
Social Networks and Archival Context Cooperative (SNAC) within the global cultural heritage landscape
Brief look at the soon-to-be-released, revised and enhanced SNAC public interface

Records
People living and working together record information.
Such information may serve a variety of purposes.
Sometimes the recorded information is intended to be a reliable witness to human activity: birth and marriage certificates are examples.
But even when the information is not primarily intended to be a record, it is evidence of human activity.
If you want to understand a publication, a building, a work of art, or an event, historical records are essential!

Archival Description
Archivists describe not only the records themselves, but also the contexts in which the records were created, accumulated, and used.
A key component of the context is describing the people who created and used the records, as well as, selectively at least, the people documented in them.
Records are largely unintelligible without intellectually preserving their context through description.

Archival Description
Thus archivists describe the creators and the contexts in which they worked and lived.
Their names, of course, but facts about them too: when and where they were active, what they did, and with whom.
Traditional library authority control was about managing the headings or entry points that appeared in catalog records.
For archivists, it is more about identities of persons, corporate bodies, and families (CPF entities).
Though library authority control is also increasingly about identities.

Quick History Overview
Research & Development, 2010-2015: NEH and Mellon Foundation
Cooperative Planning, 2011-2015: IMLS and Mellon Foundation
Transformation into a Cooperative, Phase One, 2015-2019: Mellon Foundation

R&D Objectives
Demonstrate that data describing people in existing archival description can be used:
  To address the challenge of finding, discovering, locating, and understanding distributed historical resources
    Integrated access to geographically dispersed historical records
    Access to the social-professional networks that created and are documented in the records
  To lay the foundation for an international cooperative for centrally maintaining the collectively created biographical-historical data

R&D Activities
Data sources: 2.25M WorldCat archival descriptions, 190K EAD-encoded finding aids, 400K or so British Library and NARA authority records, agency records from Smithsonian Institution Archives and NY State Archives, and more
From sources, extracted and assembled descriptions of corporate bodies, persons, and families (CPF entities)
Identity resolution within the assembled set and against VIAF to create the final set of CPF descriptions
Created a prototype History Research Tool (HRT)
  Social-professional-intellectual networks (CPF to CPF relations)
  Links to archival resources documenting the CPF entities (integrated access)

Identity and Identity Resolution
Extracting the data from MARC, EAD, and other sources presented challenges, as did the development of the History Research Tool (HRT).
But the central challenge was and is identity resolution: two or more people with the same name; two or more names for the same person.
A challenge for people and computers.
Quality of computational resolution depends on the a priori quality of human resolution.

Identity and Identity Resolution
Names are weak identifiers.
Life dates help, but they are still not enough.
Additional evidence is needed for reliable resolution; the more the better.
Each additional fact makes the identification more certain, more reliable.
A persistent identifier is just another name.
Essential is the set of facts associated with each identifier: name or names, existence dates, affiliated places, occupations, functions, significant events.
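The point that names are weak identifiers and that each additional fact makes a match more certain can be illustrated with a small sketch. This is not the SNAC matching algorithm: the `Identity` class, the `evidence` and `same_entity` functions, the threshold of three agreeing facts, and the sample data are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Identity:
    """Hypothetical, simplified description of a person (not EAC-CPF)."""
    name: str
    birth: Optional[int] = None
    death: Optional[int] = None
    places: FrozenSet[str] = frozenset()
    occupations: FrozenSet[str] = frozenset()


def evidence(a: Identity, b: Identity) -> Tuple[int, int]:
    """Count facts on which two descriptions agree and on which they conflict."""
    agree = conflict = 0
    if a.name.casefold() == b.name.casefold():
        agree += 1
    # Existence dates count only when both descriptions record them.
    for x, y in ((a.birth, b.birth), (a.death, b.death)):
        if x is not None and y is not None:
            if x == y:
                agree += 1
            else:
                conflict += 1
    # Every shared place or occupation is one more agreeing fact.
    agree += len(a.places & b.places)
    agree += len(a.occupations & b.occupations)
    return agree, conflict


def same_entity(a: Identity, b: Identity, min_agree: int = 3) -> bool:
    """Assumed rule: no conflicting facts and at least `min_agree` agreements."""
    agree, conflict = evidence(a, b)
    return conflict == 0 and agree >= min_agree
```

Under this assumed rule, two descriptions that share only a name are never merged, while a shared name plus a matching birth year, place, and occupation are enough. A real resolver would also need to handle variant name forms, which this sketch ignores.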

R&D Results
Original source records: 6,719,064
  4,653,365 persons
  1,868,448 corporate bodies
  197,251 families
Merged records: 3,741,262
  2,466,425 persons
  1,077,588 corporate bodies
  197,249 families
All linked to archival resources in ~4000 repositories
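The totals on this slide can be checked with a few lines of arithmetic. The sketch below uses only the counts given above; the variable names and the "reduction" measure (the share of source records absorbed by merging) are illustrative additions, not part of the original presentation.

```python
# Record counts as reported on the R&D Results slide.
source = {"persons": 4_653_365, "corporate_bodies": 1_868_448, "families": 197_251}
merged = {"persons": 2_466_425, "corporate_bodies": 1_077_588, "families": 197_249}

source_total = sum(source.values())
merged_total = sum(merged.values())

# Share of source records that disappeared through match/merge, per entity type.
reduction = {kind: 1 - merged[kind] / source[kind] for kind in source}
overall_reduction = 1 - merged_total / source_total

print(f"source total: {source_total:,}")
print(f"merged total: {merged_total:,}")
for kind, share in reduction.items():
    print(f"{kind}: {share:.1%} fewer records after merging")
print(f"overall: {overall_reduction:.1%} fewer records after merging")
```

By these figures the per-type counts sum to the stated totals, and merging reduced the record count by roughly 44 percent overall. Almost all of that reduction came from persons and corporate bodies; the family count dropped by only two.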

From R&D to Cooperative Program: Objectives
Primary objectives are practical
  Sharing description of archives
  Making description more effective
  Improving the economy of research
Intended as a contribution to the humanities and social science research infrastructure
Improving the scholarly communication research economy
  For curators
  For researchers
For more detail: http://socialarchive.iath.virginia.edu/snac-c_rationale.pdf

From R&D to Cooperative Program: Transformation
Social
  Administration
  Governance
  Building a community of editors
Technological
  From a set of independent steps that led to an aggregation of identity descriptions
  To a dynamic, human-curated collection of identity descriptions

Administration
The University of Virginia Library hosts both the secretariat and the technology infrastructure of the Cooperative
  Director and deputy director
  Two programmers
  Additional administrative assistance provided as needed by Library staff
The long-term home is to be determined in conjunction with developing a business model to ensure sustainability without regular grant funding

Governance
Building a community with shared understanding and purpose
Transitioning from central R&D decision-making to community governance
  Editorial policy and standards
  Technology infrastructure
  Relation to other archival description systems (local; ArchivesSpace)
  Relation to other identity resources (VIAF, national authority files, Wikidata)
  Communication: within the community and outreach
  Training (SNAC School): building a community of expert editors (international; cultural heritage professionals and humanities scholars)
Operations Committee to coordinate

Cooperative Members
American Institute of Physics
American Museum of Natural History
Archives, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
Archives nationales de France
Brigham Young University
California Digital Library
Cecilia Preston (individual scholar)
George Washington University
Getty Research Institute
Harvard University
Indiana University Purdue University Indianapolis
Jane Addams Papers (documentary editing)
Library of Congress
Mojave Desert Archives
National Archives and Records Administration
New York Public Library
Princeton University
Smith College
Smithsonian Institution
Tufts University
University of California, Irvine
University of Miami
University of Nebraska Library Walt Whitman Archive (documentary editing)
University of North Carolina, Chapel Hill
University of Oregon
University of Virginia
Utah State Archives
Yale University

Technology Transformation
From an R&D pipeline with three steps
  Extraction of data and assembling of CPF descriptions
  Identity resolution (match/merge)
  History Research Tool
To an integrated maintenance and publishing platform
  Supporting human editing of CPF descriptions
  Supporting batch ingest of new data from new members or data donors
  Robust History Research Tool
  LOD exposure of the social-document network

[Diagram: SNAC Cooperative]
Sources: archives, libraries, museums, scholarly research projects
Source formats: MARC21, EAD, TEI, local formats
Smart algorithms: EAC-CPF identities that are sparse and uncertain
Smart people (human editors): evaluate, verify, add new evidence; create, edit, link evidence
Result: EAC-CPF identities that are dense and certain

[Diagram: SNAC Cooperative platform]
Archivists at SNAC Cooperative institutions: Getty Research Institute; American Institute of Physics; New York Public Library; Smithsonian Institution; University of California, Irvine; NARA; Princeton University; American Museum of Natural History; George Washington University; Library of Congress; University of Virginia; Indiana University-Purdue University Indianapolis; Smith College; Harvard University; University of Miami; Tufts University; Yale University
Diagram labels: Dashboard (Create & Maintain); RESTful JSON API; ArchivesSpace; Other Tools (shown twice); Public HRT; Linked Data; JSON API
SNAC Server: PostgreSQL; Elastic Search; Neo4J; User Authorization; Identity Reconciliation

[Architecture diagram: SNAC server]
Outside clients (web browsers, curl, wget, ArchivesSpace): HTML/JSON/JS
Server-side clients
  Web UI (WebUI Executor): user interface for editing and viewing
  REST API (JSON), filter: textual interface for machine access; exposes a portion of the API
  Dev/Test: internal interface for testing the server
Server API (JSON): API exposed to internal clients
Server-side Server Executor
  Parses and interprets internal Server API commands
  Interacts directly with internal server components
Internal server components
  PostgreSQL: CPF record data, user data
  Reporting Tool
  Data Validation Engine
  Authentication, Authorization
  Elastic Search: indexing tool for searching and matching
  Identity Reconciliation Engine
  EAC-CPF Serializer, EAC-CPF Parser, Date Parser
  Neo4J: graph database
Key: API Interface, Custom Designs, OTC Components
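The layering described in this diagram, with a Server Executor that interprets JSON commands and a REST filter that exposes only a portion of the API to outside clients, can be sketched as follows. This is a toy illustration under stated assumptions, not SNAC code: the class names, the command names `read` and `insert`, the message fields, and the in-memory dictionary standing in for PostgreSQL are all hypothetical.

```python
import json


class ServerExecutor:
    """Stand-in for the Server Executor: interprets internal JSON commands."""

    def __init__(self):
        self._records = {}  # in-memory substitute for the CPF record store
        self._handlers = {"read": self._read, "insert": self._insert}

    def execute(self, request: str) -> str:
        message = json.loads(request)
        handler = self._handlers.get(message.get("command"))
        if handler is None:
            return json.dumps({"error": "unknown command"})
        return json.dumps(handler(message))

    def _insert(self, message):
        self._records[message["id"]] = message["record"]
        return {"result": "ok"}

    def _read(self, message):
        record = self._records.get(message["id"])
        if record is None:
            return {"error": "not found"}
        return {"record": record}


class RestFilter:
    """Stand-in for the REST API filter: passes on only an allowed subset."""

    ALLOWED = frozenset({"read"})  # assumed: outside clients may only read

    def __init__(self, server: ServerExecutor):
        self._server = server

    def handle(self, request: str) -> str:
        if json.loads(request).get("command") not in self.ALLOWED:
            return json.dumps({"error": "command not exposed"})
        return self._server.execute(request)
```

In this sketch an internal client such as the editing interface would call `ServerExecutor.execute` directly, while outside clients would go through `RestFilter.handle` and could not issue `insert`. Which commands the real REST API exposes is not stated in the slides.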

SNAC and the Global Cultural Heritage Landscape
Broad perspective
  The scope of archives: any person or group of persons that has ever lived and has left a recorded trace
  The archival social-document network can provide context for other cultural heritage communities
VIAF, Wikipedia/Wikidata, national library authority files, museum authority files (ULAN)
ORCID and ISNI
Multiple authorities undermine the function of authority
Multiple is a political reality and ethically right
Alignment of the multiple authorities will be an ongoing challenge
Identifying is an ongoing activity and negotiation
Though we can never get it right (once and for all), we can continue to make it better based on the available evidence, and based on shared intellectual and ethical values

Preview: snaccooperative.org