Stewardship of Cultural Heritage Data. In the shoes of a researcher.

To cite this version: Charles Riondet. Stewardship of Cultural Heritage Data. In the shoes of a researcher. Cultural Heritage Data Re-use Charter Feedback workshop hosted by the LIBER Digital Humanities Digital Cultural Heritage Working group, Apr 2018, The Hague, Netherlands. <hal-01762295>

HAL Id: hal-01762295
https://hal.inria.fr/hal-01762295
Submitted on 9 Apr 2018

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Charles Riondet, Inria (charles.riondet@inria.fr), 9 April 2018

Outline
- Stewardship?
- Extrapolation from a real life example
- Challenges
- How can the Charter help?

Stewardship?

Definition

Principle: Long-time preservation, persistence, accessibility and legibility of cultural heritage data should be a priority.

Commitment: Cultural Heritage Institutions, Researchers and Research Institutions will take the necessary steps and precautions to guarantee long-term stewardship of the original item or record and the resulting research. Whether primary digital surrogates or further representations, forms or enrichments, the various parties involved in the creation and curation of cultural heritage data will ensure a proper hosting and preservation of all contents.

What content? Who is responsible?

Content                                  Responsible
Primary material (physical artefacts)    CHI
Digital surrogates                       CHI & Researcher
Metadata                                 CHI & Researcher
Enrichments                              Researcher only?

CHI: Cultural Heritage Institution

Hosting and preservation

For all these different contents, we have to determine:
- What to keep, and in which format?
- Who owns the rights?
- Who manages the versioning?
- Where is it stored?
- For how long?
- How is it identified?
- What if it is, for some reason, taken offline?

Extrapolation from a real life example

Digital edition of Léo Hamon's clandestine diary

Léo Hamon, born Lew Goldenberg (1908-1993), was a French lawyer of Russian origin and one of the leaders of the Parisian Resistance. His diary relates his daily life underground and records his comments on the course of the war, the meetings he attended, the organization of the Resistance and the preparation of the seizure of power in Paris.

Research project

Study the evolution of the discourse on the actors of the war (at every level) in the diary. The tasks are:
- OCR the typed version (creation of a training model, several versions of the texts, output in PDF, txt, ...)
- HTR (Handwritten Text Recognition) of the manuscript (the same kind of data is produced)
- Structure the text of the diary in chronological entries (using XML-TEI, merging the two sources)
- Automatically annotate the named entities (persons, organisations, places)
- Build a graph with the following elements:
  - All the named entities mentioned (especially persons and organisations)
  - The different ways they are named in the text (nicknames, metonymies, ...)
  - How they are qualified (modifiers)
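The graph-building task above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the annotation dictionaries, their field names (`entity_id`, `kind`, `surface`, `modifiers`, `entry_date`) and the rule that two entities are linked when they appear in the same diary entry are all assumptions made for the example, and the sample values are invented placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EntityNode:
    """One named entity and what the text says about it."""
    entity_id: str
    kind: str  # "person", "organisation" or "place"
    surface_forms: set = field(default_factory=set)  # ways it is named
    modifiers: set = field(default_factory=set)      # how it is qualified
    entries: set = field(default_factory=set)        # diary entries citing it


def build_graph(annotations):
    """Return (nodes, edges) from a list of annotation dictionaries.

    Edges count how many diary entries mention both entities
    (an assumed linking rule, chosen only for illustration).
    """
    nodes = {}
    by_entry = defaultdict(set)
    for ann in annotations:
        node = nodes.setdefault(
            ann["entity_id"], EntityNode(ann["entity_id"], ann["kind"])
        )
        node.surface_forms.add(ann["surface"])
        node.modifiers.update(ann.get("modifiers", []))
        node.entries.add(ann["entry_date"])
        by_entry[ann["entry_date"]].add(ann["entity_id"])

    edges = defaultdict(int)
    for ids in by_entry.values():
        ordered = sorted(ids)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                edges[(first, second)] += 1
    return nodes, dict(edges)


# Invented placeholder annotations, not taken from the diary.
sample = [
    {"entity_id": "P1", "kind": "person", "surface": "Person A",
     "modifiers": ["cautious"], "entry_date": "1944-01-01"},
    {"entity_id": "P1", "kind": "person", "surface": "the Chief",
     "entry_date": "1944-01-02"},
    {"entity_id": "O1", "kind": "organisation", "surface": "the Committee",
     "entry_date": "1944-01-01"},
]
nodes, edges = build_graph(sample)
```

Keeping surface forms and modifiers on the node, rather than as separate nodes, is one possible design; a richer model could attach them to individual mentions so that their evolution over time can be traced.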

Source divided between two institutions

Centre d'Histoire de Sciences Po, Paris
- Part of the academic institution Sciences Po Paris.
- Holds archives from major modern French politicians, including Léo Hamon's.
- Manuscript version of the diary (years 1940-43)

Archives nationales, Paris
- In the fonds of the WW2 Historical committee: hundreds of testimonies and documents from 1945 to the early 1980s.
- Typed version of the diary (year 1944)

Outcome

Possibly a lot of data:
- Images
- Uncorrected OCR text (several formats)
- Corrected OCR text (several formats and versions)
- Metadata of the OCR process
- Structured text
- Images aligned with OCR (PDF)
- Annotations as output by the annotation tools
- Manually corrected annotations
- Annotations aligned with the structured text
- Dictionaries and gazetteers with extracted information, possibly enriched with other sources
- Graph combining everything
- Visualization interfaces

Challenges

Dialogue and consultation are needed in every case, from the beginning:
- Results hosted by the originating institution
- Results hosted by the research institution
- Results hosted elsewhere: agreement between three parties, new constraints.
  - Research infrastructure like DARIAH (Huma-Num)
  - Repository like HAL...

Ensure the connection between the original data and the enrichment, and keep all the data clean:
- Quality repository:
  - Metadata
  - Persistent identifiers
  - Versioning
- Good research practices:
  - Use of the appropriate standards according to the datatype (image, text, annotation, ...)
  - Documentation for all the data created or transformed
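One way to keep an enrichment connected to its original is to store, with each object, its identifier, its version, the standard it follows and the identifier of the object it was derived from. The sketch below is only an illustration under stated assumptions: the `DataRecord` fields, the `make_record` and `lineage` helpers and the identifier strings are hypothetical and do not correspond to any particular repository's schema.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataRecord:
    identifier: str              # persistent identifier (hypothetical scheme)
    version: int
    datatype: str                # e.g. "image", "text", "annotation"
    standard: str                # standard used for this datatype
    derived_from: Optional[str]  # identifier of the source object, if any
    sha256: str                  # checksum of the stored content


def make_record(identifier, version, datatype, standard, content,
                derived_from=None):
    """Describe a piece of content (bytes) and fingerprint it."""
    digest = hashlib.sha256(content).hexdigest()
    return DataRecord(identifier, version, datatype, standard,
                      derived_from, digest)


def lineage(identifier: str, registry: Dict[str, DataRecord]) -> List[str]:
    """Walk back from an enrichment to the original it derives from."""
    chain, seen = [], set()
    current = registry.get(identifier)
    while current is not None and current.identifier not in seen:
        seen.add(current.identifier)
        chain.append(current.identifier)
        parent = current.derived_from
        current = registry.get(parent) if parent else None
    return chain


# Hypothetical identifiers and contents, for illustration only.
image = make_record("id:image-1", 1, "image", "TIFF", b"raw scan bytes")
ocr = make_record("id:ocr-1", 2, "text", "plain text", b"corrected text",
                  derived_from="id:image-1")
tei = make_record("id:tei-1", 1, "text", "XML-TEI", b"<TEI/>",
                  derived_from="id:ocr-1")
registry = {r.identifier: r for r in (image, ocr, tei)}
chain = lineage("id:tei-1", registry)
```

Here `chain` lists the structured text, then the OCR text, then the image, so every enrichment can be traced back to the digital surrogate it came from. The checksum lets either party verify that a stored copy has not changed.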

Possible limits

A survey conducted in summer 2017 on the Data Reuse Charter commitments highlighted some concerns of potential users about stewardship:
- It is time-consuming and possibly costly
- Not everything has to be stored for the long term
- CHIs and researchers may put more effort into more valuable data

How can the Charter help?

No ready-made answers, but a framework in which solutions can be designed:
- Initiate communication between Libraries and Researchers
- Distribute responsibilities
- Concrete stewardship agreement
- Good Data Management

Initiate communication between Libraries and Researchers: support the dialogue when several parties are involved. How far should we go?
- Release the data in the most appropriate format
- Allow for the reuse of the metadata
- Allow for remote and persistent access

Distribute stewardship responsibilities: allow researchers and CHIs to make proposals and to make commitments.
- What is the duty of the Library?
- What can the researcher take care of?
- Do they need to share everything?

Concrete stewardship agreement:
- Where do we store, and in which format?
- How many copies?
- For how long?
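As a hedged illustration, the questions of such an agreement can be written down as a small record whose empty fields show what the parties still have to decide. The `StewardshipAgreement` class, its field names and the sample values are hypothetical; the Charter does not prescribe any such structure.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class StewardshipAgreement:
    content: str                            # what is being preserved
    storage_location: Optional[str] = None  # where do we store?
    file_format: Optional[str] = None       # in which format?
    copies: Optional[int] = None            # how many copies?
    retention_years: Optional[int] = None   # for how long?


def open_questions(agreement: StewardshipAgreement) -> List[str]:
    """Return the names of the fields that are still undecided."""
    return [f.name for f in fields(agreement)
            if getattr(agreement, f.name) is None]


# Hypothetical draft agreement for one kind of content.
draft = StewardshipAgreement(
    content="corrected OCR text",
    storage_location="research repository",
    file_format="XML-TEI",
)
pending = open_questions(draft)
```

In this draft, `pending` contains `copies` and `retention_years`, the two points the library and the researcher have not yet settled.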

Good Data Management:
- Provides a long-term roadmap
- Builds confidence

In summary: no ready-made answers, but a framework in which solutions can be designed.
- Communication between Libraries and Researchers
- Distribute responsibilities
- Concrete stewardship agreement
- Good Data Management