National Archival Authorities Infrastructure. Social Networks and Archival Context & National Archival Authorities Cooperative


SNAC
- 2010-2012: National Endowment for the Humanities, Preservation and Access, Research and Development grant
- 2012-2014: Mellon Foundation

Project Team
- Daniel Pitti (PI) and Worthy Martin (Institute for Advanced Technology in the Humanities, University of Virginia)
- Adrian Turner and Brian Tingle (California Digital Library, University of California)
- Ray Larson (School of Information, University of California, Berkeley)

Project Objectives
- Archival finding aids currently intermix description of records with description of the creators of records and of persons evident in the records
- Further the ongoing process of transforming archival description using advanced technologies
  - By facilitating the separation of the description of people from the description of records
  - Using EAC-CPF, an international archival authority control standard
- Goal: enhance the economy and effectiveness of archival description to improve access to and understanding of archival resources

Rationale for Separation
- Authority control of forms of names
- Flexible description
- Integrated access to cultural heritage
- Biographical/historical resource
- Social/historical context (social-professional networks)
- Cooperative authority control (more later)

The Data 2010-2012
- EAD-encoded finding aids
  - Library of Congress (1,546)
  - Online Archive of California (~15,400)
  - Northwest Digital Archive (5,160)
  - Virginia Heritage (8,390)
- Authority records
  - Library of Congress: NACO/LCNAF (3.8M personal names; 900K corporate names)
  - Getty Vocabulary Program: Union List of Artist Names (293K personal and corporate names)
  - Virtual International Authority File (16M+ personal names, corporate, uniform titles, jurisdictions)

Methods and Processing
- Extract EAC-CPF records from existing EAD-encoded archival descriptions
  - Extracting both creators and referenced CPF names
- Match EAC-CPF records against one another and against existing authority records (ULAN, VIAF, LCNAF); merge records for the same entity
- Enhance EAC-CPF by normalizing entries, adding alternative entries, titles (VIAF), and historical data (ULAN)
- Key challenge: two or more people with the same name; two or more names for the same person
- Create a prototype historical resource and access system
  - Historical data and social-professional networks
  - Links to archive, library, and museum resources (by and about)
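The matching step can be illustrated with a minimal sketch. It assumes a deliberately crude comparison key (case folding, diacritic stripping, punctuation removal); the function names `match_key` and `group_candidates` are hypothetical and are not taken from the SNAC code base. A shared key only nominates candidates for merging: as the key challenge above notes, two people can share a name, so a real matcher would also weigh dates, occupations, and other context.

```python
import re
import unicodedata
from collections import defaultdict


def match_key(name: str) -> str:
    """Reduce a name heading to a crude comparison key (illustrative rules only)."""
    # Strip diacritics so that "José" and "Jose" compare equal.
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case, replace punctuation (except hyphens) with spaces, collapse whitespace.
    cleaned = re.sub(r"[^\w\s-]", " ", plain.casefold())
    return " ".join(cleaned.split())


def group_candidates(headings):
    """Group headings that share a key; each group is a merge candidate, not a verdict."""
    groups = defaultdict(list)
    for heading in headings:
        groups[match_key(heading)].append(heading)
    return dict(groups)


headings = [
    "Oppenheimer, J. Robert, 1904-1967",
    "oppenheimer, J. Robert, 1904-1967.",
    "Bush, Vannevar, 1890-1974",
]
candidates = group_candidates(headings)
for key, members in candidates.items():
    print(key, "<-", members)
```

Keeping the original headings inside each group means a later step can still choose a preferred form and record the others as alternative entries.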

EAD Source Data
- Encoded Archival Description
- Intermixes description of creators of records and, at the discretion of the archivists, names associated with the content of the records
- Detailed description of creators of records
- Widely varying quality
  - In the number of names identified and encoded
  - In the formation of the names (direct or inverted, capitalization, punctuation, and so on)
  - In the categorization of names (personal, corporate, or family)
- Many names given but not identified as such
  - Most important of these in biographies/histories and in correspondence description
- Extraction has focused on the low-hanging fruit, that is, the names tagged as names
- Attention shifting to names not identified as such

Archival Records
- Records are the by-products of people living and working as individuals, in organized groups, in families
- Records document people living and working
- People exist in social-professional contexts, in relation to others
- Records document these relations
- All records created by the same entity are described together (a fonds or collection)
- Creators documented in detail
- Many of the people documented in the records are referenced in the description
- Archival descriptions document interrelations among people and records (documents)

Source: J. Robert Oppenheimer Papers (LoC)

<origination>
  <persname source="lcnaf">Oppenheimer, J. Robert, 1904-1967</persname>
</origination>
<controlaccess>
  <persname source="lcnaf" encodinganalog="100" role="creator">Oppenheimer, J. Robert, 1904-1967</persname>
  <persname source="lcnaf" encodinganalog="600" role="subject">Bethe, Hans Albrecht, 1906- --Correspondence</persname>
  <!-- ... -->
  <persname source="lcnaf" encodinganalog="600" role="subject">Born, Max, 1882-1970 --Correspondence</persname>
  <persname source="lcnaf" encodinganalog="600" role="subject">Boyd, Julian P. (Julian Parks), 1903- --Correspondence</persname>
  <persname source="lcnaf" encodinganalog="600" role="subject">Bush, Vannevar, 1890-1974 --Correspondence</persname>
  <persname source="lcnaf" encodinganalog="600" role="subject">Casals, Pablo, 1876-1973 --Correspondence</persname>
  <!-- ... -->
  <corpname source="lcnaf" encodinganalog="610" role="subject">Institute for Advanced Study (Princeton, N.J.)</corpname>
  <corpname source="lcnaf" encodinganalog="610" role="subject">Los Alamos Scientific Laboratory</corpname>
  <!-- ... -->
</controlaccess>
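A fragment like this one shows what extracting "both creators and referenced CPF names" could look like in practice. The sketch below uses only the Python standard library. The `<archdesc>` wrapper, the shortened sample, the `NAME_TAGS` mapping, and the `extract_names` function are assumptions made for illustration, not the project's actual extraction code; it treats anything inside `<origination>` or carrying `role="creator"` as a creator, and everything else as a referenced name with any `--Correspondence` style subdivision dropped.

```python
import xml.etree.ElementTree as ET

# Shortened sample; the <archdesc> wrapper is added so the fragment has one root.
EAD_FRAGMENT = """
<archdesc>
  <origination>
    <persname source="lcnaf">Oppenheimer, J. Robert, 1904-1967</persname>
  </origination>
  <controlaccess>
    <persname source="lcnaf" role="creator">Oppenheimer, J. Robert, 1904-1967</persname>
    <persname source="lcnaf" role="subject">Bethe, Hans Albrecht, 1906- --Correspondence</persname>
    <persname source="lcnaf" role="subject">Bush, Vannevar, 1890-1974 --Correspondence</persname>
    <corpname source="lcnaf" role="subject">Los Alamos Scientific Laboratory</corpname>
  </controlaccess>
</archdesc>
"""

# EAD name elements mapped to EAC-CPF entity types.
NAME_TAGS = {"persname": "person", "corpname": "corporateBody", "famname": "family"}


def extract_names(xml_text):
    """Return (creators, referenced) as de-duplicated lists of (entity_type, heading)."""
    root = ET.fromstring(xml_text)
    # Remember which elements sit inside <origination>; those name the creator.
    in_origination = {id(e) for o in root.iter("origination") for e in o.iter()}
    creators, referenced = [], []
    for elem in root.iter():
        kind = NAME_TAGS.get(elem.tag)
        if kind is None:
            continue
        heading = " ".join("".join(elem.itertext()).split())
        # Drop subject subdivisions such as " --Correspondence".
        heading = heading.split(" --", 1)[0].strip()
        target = creators if (id(elem) in in_origination
                              or elem.get("role") == "creator") else referenced
        target.append((kind, heading))
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return list(dict.fromkeys(creators)), list(dict.fromkeys(referenced))


creators, referenced = extract_names(EAD_FRAGMENT)
print("creator:", creators)
print("referenced:", referenced)
```

Only names already tagged as names are found this way, which corresponds to the "low-hanging fruit" the slides describe.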

Source: Leonard Bernstein Collection (LoC)

<c02>
  <did>
    <container type="box">1</container>
    <unittitle>Aaltonen, Erkki <unitdate era="ce" calendar="gregorian">1981</unitdate></unittitle>
    <physdesc>
      <extent>1</extent>
    </physdesc>
  </did>
</c02>
<c02>
  <did>
    <unittitle>Abbado, Claudio <unitdate era="ce" calendar="gregorian">1963-90</unitdate></unittitle>
    <physdesc>
      <extent>5</extent>
    </physdesc>
  </did>
</c02>
[...]

<bioghist>
  <head>Biographical Sketch</head>
  <p>José Marcos Mugarrieta, prior to his term as Mexican consul in San Francisco 1857-1863, served in the Mexican army from 1837. He saw action in numerous battles and campaigns: Jamaica, under General Canalizo in 1841; Campeche, 1842-1843; Merida, 1843; Veracruz, 1845; Mexico City, 1846; Angostura and Cerro-gordo, 1847; Guanajuato, 1848, and Sierra-Gorda under Bustamante, 1848-1849; and Matamoros, 1849-1850. [...]</p>
  <p>In April 1857 Mugarrieta received an appointment from the Comonfort government for the consulship in San Francisco. He did not actually begin his new duties until September 1, 1859, due to illness and to the political situation in Mexico. [...]</p>
</bioghist>

<bioghist>
  <head>Chronology</head>
  <chronlist>
    <chronitem>
      <date>1900</date>
      <event>Born on Jan. 20 in Hastings, Minnesota.</event>
    </chronitem>
    <chronitem>
      <date>1922</date>
      <event>Received baccalaureate from Princeton University, major in philosophy.</event>
    </chronitem>
    <!-- ... -->
    <chronitem>
      <date>1965</date>
      <event>Died on April 4.</event>
    </chronitem>
  </chronlist>
</bioghist>

EAC-CPF
- Encoded Archival Context-Corporate bodies, Persons, Families
- An international communication standard for archival authority control
- Based on the International Council on Archives' International Standard Archival Authority Record for Corporate Bodies, Persons and Families (ISAAR(CPF))
- SAA Standards Committee, Technical Subcommittee on Encoded Archival Context
  - Co-chairs
    - Katherine Wisser, Simmons College
    - Anila Angjeli, Bibliothèque nationale de France

Library and Archive Authority Control
- Library (or bibliographic) authority control is almost exclusively about the control of names
- Archival authority control involves biographical-historical description of the CPF entity
  - Descriptions based on controlled vocabularies or values, for example, occupations, place of birth and death
  - But also biographical-historical description
    - Prose
    - Chronological list
- Archival authority control provides context for understanding records, the context of their creation, the provenance

<identity>
  <entityType>person</entityType>
  <nameEntry xml:lang="en-latn">
    <part>Oppenheimer, J. Robert, 1904-1967.</part>
    <authorizedForm>aacr2</authorizedForm>
  </nameEntry>
  <nameEntry localType="viaf:mainheading">
    <part>Oppenheimer, J. Robert (Julius Robert), 1904-1967</part>
    <alternativeForm>viaf</alternativeForm>
  </nameEntry>
  <nameEntry localType="viaf:mainheading">
    <part>Oppenheimer, Julius Robert, 1904-1967</part>
    <alternativeForm>viaf</alternativeForm>
  </nameEntry>
  <nameEntry localType="viaf:x400">
    <part>Oppenheimer, Robert</part>
    <alternativeForm>viaf</alternativeForm>
  </nameEntry>
  <nameEntry localType="viaf:x400">
    <part>ou-pẽn-hai-mo, 1904-1967</part>
    <alternativeForm>viaf</alternativeForm>
  </nameEntry>
</identity>
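To show how the authorized and alternative forms in an `<identity>` block might be read back out, here is a small sketch using the standard library. The sample is a shortened, un-namespaced copy of the fragment above (a full EAC-CPF record normally declares a namespace, which this sketch ignores), and the `name_forms` function is a hypothetical helper, not part of any SNAC tool.

```python
import xml.etree.ElementTree as ET

# Shortened, un-namespaced copy of the <identity> fragment (illustration only).
IDENTITY = """
<identity>
  <entityType>person</entityType>
  <nameEntry>
    <part>Oppenheimer, J. Robert, 1904-1967.</part>
    <authorizedForm>aacr2</authorizedForm>
  </nameEntry>
  <nameEntry localType="viaf:mainheading">
    <part>Oppenheimer, Julius Robert, 1904-1967</part>
    <alternativeForm>viaf</alternativeForm>
  </nameEntry>
  <nameEntry localType="viaf:x400">
    <part>Oppenheimer, Robert</part>
    <alternativeForm>viaf</alternativeForm>
  </nameEntry>
</identity>
"""


def name_forms(identity_xml):
    """Split the <nameEntry> headings into authorized and alternative forms."""
    identity = ET.fromstring(identity_xml)
    forms = {"authorized": [], "alternative": []}
    for entry in identity.iter("nameEntry"):
        part = entry.findtext("part", default="").strip()
        if entry.find("authorizedForm") is not None:
            forms["authorized"].append(part)
        elif entry.find("alternativeForm") is not None:
            forms["alternative"].append(part)
    return forms


forms = name_forms(IDENTITY)
print("entity type:", ET.fromstring(IDENTITY).findtext("entityType"))
print("authorized :", forms["authorized"])
print("alternative:", forms["alternative"])
```

An access system could index all of these forms while displaying only the authorized one, so that a search on any variant reaches the same entity.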

<existDates>
  <dateRange>
    <fromDate standardDate="1904-04-22">1904, Apr. 22</fromDate>
    <toDate standardDate="1967-02-18">1967, Feb. 18</toDate>
  </dateRange>
</existDates>
<!--... -->
<localDescription localType="subject">
  <term>Science--societies, etc.</term>
</localDescription>
<localDescription localType="viaf:nationality">
  <placeEntry countryCode="us"/>
</localDescription>
<localDescription localType="viaf:gender">
  <term>male</term>
</localDescription>
<languageUsed>
  <language languageCode="eng"/>
</languageUsed>
<occupation>
  <term>Physicists.</term>
</occupation>
<!--... -->
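The machine-readable `standardDate` values are what make dates usable for matching and for disambiguating people with the same name. A minimal sketch, assuming both dates are full ISO `YYYY-MM-DD` values as in this fragment (records with year-only dates would need extra handling); the `life_span` helper is hypothetical:

```python
import xml.etree.ElementTree as ET
from datetime import date

EXIST_DATES = """
<existDates>
  <dateRange>
    <fromDate standardDate="1904-04-22">1904, Apr. 22</fromDate>
    <toDate standardDate="1967-02-18">1967, Feb. 18</toDate>
  </dateRange>
</existDates>
"""


def life_span(exist_dates_xml):
    """Return (start, end, completed_years) from the standardDate attributes."""
    root = ET.fromstring(exist_dates_xml)
    start = date.fromisoformat(root.find("dateRange/fromDate").get("standardDate"))
    end = date.fromisoformat(root.find("dateRange/toDate").get("standardDate"))
    # Subtract one year if the anniversary had not yet been reached in the end year.
    years = end.year - start.year - ((end.month, end.day) < (start.month, start.day))
    return start, end, years


start, end, years = life_span(EXIST_DATES)
print(start.isoformat(), "to", end.isoformat(), "=", years, "completed years")
```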

<chronList>
  <chronItem>
    <date>1904, Apr. 22</date>
    <placeEntry>New York, N.Y.</placeEntry>
    <event>Born, New York, N.Y.</event>
  </chronItem>
  <!--... -->
  <chronItem>
    <date>1943-1945</date>
    <placeEntry>Los Alamos, N. Mex.</placeEntry>
    <event>Director, Los Alamos Scientific Laboratory, Los Alamos, N. Mex.</event>
  </chronItem>
  <!--... -->
  <chronItem>
    <date>1954</date>
    <event>(1) Denied security clearance [...] (2) Published Science and the Common Understanding [...]</event>
  </chronItem>
  <!--... -->
  <chronItem>
    <date>1967, Feb. 18</date>
    <placeEntry>Princeton, N.J.</placeEntry>
    <event>Died, Princeton, N.J.</event>
  </chronItem>
</chronList>

<cpfRelation xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:role="http://rdvocab.info/uri/schema/frbrentitiesrda/person" xlink:arcrole="correspondedwith">
  <relationEntry>Bush, Vannevar, 1890-1974.</relationEntry>
  <descriptiveNote>
    <p>recordid: DLC.ms998007.r007</p>
  </descriptiveNote>
</cpfRelation>
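Relations of this kind are the raw material for the social-professional networks mentioned earlier. The sketch below turns `<cpfRelation>` elements into edges and then into an undirected adjacency map. The `<relations>` wrapper, the second (Bethe) relation, the fallback arcrole `"associatedWith"`, and both function names are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XLINK = "{http://www.w3.org/1999/xlink}"  # ElementTree spells namespaced attributes this way

# The <relations> wrapper and the second entry are illustrative additions.
RELATIONS = """
<relations xmlns:xlink="http://www.w3.org/1999/xlink">
  <cpfRelation xlink:type="simple" xlink:arcrole="correspondedwith">
    <relationEntry>Bush, Vannevar, 1890-1974.</relationEntry>
  </cpfRelation>
  <cpfRelation xlink:type="simple" xlink:arcrole="correspondedwith">
    <relationEntry>Bethe, Hans Albrecht, 1906-</relationEntry>
  </cpfRelation>
</relations>
"""


def relation_edges(subject, relations_xml):
    """Return (subject, arcrole, target) triples for every <cpfRelation>."""
    root = ET.fromstring(relations_xml)
    edges = []
    for rel in root.iter("cpfRelation"):
        target = (rel.findtext("relationEntry") or "").strip()
        arcrole = rel.get(XLINK + "arcrole", "associatedWith")  # hypothetical default
        if target:
            edges.append((subject, arcrole, target))
    return edges


def build_network(edges):
    """Build an undirected adjacency map: name -> set of related names."""
    graph = defaultdict(set)
    for source, _arcrole, target in edges:
        graph[source].add(target)
        graph[target].add(source)
    return graph


edges = relation_edges("Oppenheimer, J. Robert, 1904-1967.", RELATIONS)
graph = build_network(edges)
for name in sorted(graph):
    print(name, "->", sorted(graph[name]))
```

Because edges from many records can be pooled into one map, a correspondent named in one repository's finding aid connects to the creator described in another's.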

<resourceRelation xmlns:xlink="http://www.w3.org/1999/xlink" xlink:arcrole="creatorof" xlink:role="archivalrecords" xlink:type="simple" xlink:href="http://hdl.loc.gov/loc.mss/eadmss.ms998007">
  <relationEntry>J. Robert Oppenheimer Papers, 1799-1980 (bulk 1947-1967)</relationEntry>
  <objectXMLWrap>
    <did xmlns="urn:isbn:1-931666-22-9">
      <unittitle>Papers <unitdate normal="1799/1980" era="ce" calendar="gregorian">1799-1980 </unitdate><unitdate label="bulk Dates" type="bulk" normal="1947/1967" era="ce" calendar="gregorian">(bulk 1947-1967)</unitdate></unittitle>
      <unitid countrycode="us" repositorycode="us-dlc">mss35188</unitid>
      <origination label="creator">
        <persname>Oppenheimer, J. Robert, 1904-1967</persname>
      </origination>
      <!--... -->
      <repository><corpname>Manuscript Division. Library of Congress</corpname></repository>
      <abstract>Physicist and director of the Institute for Advanced Study, Princeton, New Jersey. [...] Topics include theoretical physics, development of the atomic bomb, the relationship between government and science, nuclear energy, security, and national loyalty.</abstract>
    </did>
  </objectXMLWrap>
</resourceRelation>

Year Two Results-Extraction
- Library of Congress: 43,702 EAC-CPF from 1,546 finding aids
  - corporateBody: 7,243
  - person: 36,012
  - family: 447
- Northwest Digital Archive: 24,949 from 5,160
  - corporateBody: 10,303
  - person: 13,294
  - family: 1,352
- Online Archive of California: 91,811 from ~15,400
  - corporateBody: 24,860
  - person: 66,329
  - family: 622

Year Two Results-Extraction (continued)
- Virginia Heritage: 15,175 from 8,390
  - corporateBody: 4,783
  - person: 9,919
  - family: 473
- Total: 175,637 EAC-CPF from 30,496
  - corporateBody: 47,189
  - person: 125,554
  - family: 2,894

Year Two Matching and Merging Results
- Total: 128,783 EAC-CPF from 175,637
  - corporateBody: 31,282 from 47,189
  - person: 95,583 from 125,554
  - family: 1,918 from 2,894
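The extraction and merging figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported on these slides; the variable and function names are illustrative. It confirms that the per-source counts add up to the stated totals and computes how much merging reduced each entity type, which comes to roughly 27 percent fewer records overall.

```python
# Extracted EAC-CPF records per source, as reported on the slides.
EXTRACTED = {
    "Library of Congress":          {"corporateBody": 7243,  "person": 36012, "family": 447},
    "Northwest Digital Archive":    {"corporateBody": 10303, "person": 13294, "family": 1352},
    "Online Archive of California": {"corporateBody": 24860, "person": 66329, "family": 622},
    "Virginia Heritage":            {"corporateBody": 4783,  "person": 9919,  "family": 473},
}

# Records remaining after matching and merging, as reported on the slides.
MERGED = {"corporateBody": 31282, "person": 95583, "family": 1918}


def totals_by_type(extracted):
    """Sum the extracted counts across all sources for each entity type."""
    totals = {}
    for counts in extracted.values():
        for entity_type, n in counts.items():
            totals[entity_type] = totals.get(entity_type, 0) + n
    return totals


def reduction(before, after):
    """Fraction of records eliminated by merging."""
    return 1 - after / before


totals = totals_by_type(EXTRACTED)
grand_before = sum(totals.values())
grand_after = sum(MERGED.values())

for entity_type, before in totals.items():
    after = MERGED[entity_type]
    print(f"{entity_type}: {before} -> {after} ({reduction(before, after):.1%} merged away)")
print(f"total: {grand_before} -> {grand_after} ({reduction(grand_before, grand_after):.1%} merged away)")
```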

Early Observations-Extraction
- Depth of analysis and quality of description of CPF entities vary widely in EAD-encoded finding aids
  - LoC: a lot of names under authority control
  - OAC and NWDA have fewer names, and control varies
  - VH: still fewer names, more variance
- To be fair, the finding aids were created without SNAC processing in mind!

Next on Extraction
- Refine extraction processing, incorporating some NLP-like processing, for example
  - Verifying type of name: C or P or F
  - Massaging poorly formed names into better formed names
  - Identifying names in strings that are names-plus (but name not identified as such)
- Provide context information to enhance matching, for example, date or dates of correspondence, or occupation of creator of records for referenced names

SNAC 2012-2014
- SNAC II: Mellon 2012-2014
- 150,000 EAD-encoded finding aids
  - Most from U.S., but also U.K. and France
- 1-2M WorldCat MARC archival descriptions
- British Library: 300K names from mss. collections
- Smithsonian Institution: entire agency history; expeditions; and correspondents of Joseph Henry
- National Archives and Records Administration (80K authority records)
- 16M VIAF clusters
- And more...

For more information on SNAC
- http://socialarchive.iath.virginia.edu/ (Project website)
- http://socialarchive.iath.virginia.edu/xtf/search (Public prototype)

National Archival Authorities Cooperative
- Building a National Archival Authorities Infrastructure
- IMLS-funded two-year project, October 2011-September 2013
  - EAC-CPF SAA workshops: 140 scholarships
  - National Archival Authorities Cooperative planning
- Transforming SNAC into a sustainable national cooperative program

Benefits for Archivists
- Archival authority control at last!
- Best done cooperatively
  - Consistent use of the same form of name across descriptions
  - This can only be accomplished effectively by maintaining a single, shared authority file
- Economic benefits to cooperating: the creator in one description is the correspondent in another; people exist in social contexts, and records document these contexts
- Working cooperatively will ensure that the interrelations of different collections are identified
- Cooperative authorities will enable integrated access to distributed records: all of the records relevant to one person, corporate body, or family
- A shared national authority file would be a substantial historical resource, quite apart from the access enabled by it

Benefits for Users
- For scholars
  - Integrated access to distributed archival resources
  - Contextual data for not only the records of one creator, but other related records
  - Access to the socio-historical networks in which people lived and worked
  - A biographical-historical resource
- Time for an anecdote

But Not Only Scholars
- Use in K-12 education
  - Time for an anecdote
- Life-time learners
  - Historical curiosity
  - Genealogy

Building the Infrastructure
- Institute of Museum and Library Services
- Funding two activities
  - 140 scholarships to seven regional workshops on EAC-CPF (administered by Simmons College)
  - Series of meetings to develop a blueprint for a sustainable National Archival Authorities Cooperative (NAAC)
- Transforming SNAC into NAAC, project into program

NAAC
- Series of three meetings leading to the development of a blueprint
- All hosted by the National Archives and Records Administration
- Soliciting community input on the business, governance, and technological requirements
- First meeting broad, consensus building and idea gathering, followed by two meetings of three teams to address the requirements

First Meeting
- May 21-22
- Around 90 people
  - Archivists, librarians, scholars (40 or so)
  - Representatives of the federal repositories (40 or so)
  - Funders (10 or so)
  - Other stakeholders (OCLC and Getty Vocabulary Program)
- One and one-half day meeting

Federal Repositories
- National Archives and Records Administration
  - Including two presidential libraries
- Library of Congress
- Smithsonian Institution
- National Library of Medicine
- National Agricultural Library
- National Park Service

Conclusion
- This may well be a groundbreaking moment for the national archival profession
- An opportunity to do something really important, really useful
- To accomplish together what none can accomplish alone
- I hope (or is it now hopefully?)