THE PRESERVATION OF DIGITAL DOCUMENTARY HERITAGE: LESSONS FROM AUSTRALIAN EXPERIENCE

Ross Harvey
School of Information Studies
Charles Sturt University
Locked Bag 675
Wagga Wagga, NSW 2678
Australia
Ph: +61 2 6933 2369, Fax +61 2 6933 2733
Email: rossharvey@csu.edu.au

INTRODUCTION

Ensuring the long-term preservation of information in digital form is the greatest challenge for the information professions this century, a recently published piece (Gorman 2004: xxii) indicates. The issues surrounding digital preservation (maintaining access over time to information in digital form) demand action. Any guidance that allows us to get closer to widely applicable solutions to the challenges of digital preservation, solutions that apply to both large and small organisations, is worth seeking out.

Librarians are experiencing upheavals in traditional practice because of the increasingly widespread use of networked computing. This includes preservation activities, which are being significantly affected by the changing paradigm. For example, the old preservation paradigm recognises that copying (as in refreshing from tape to tape) is the basis of digital preservation, but it does not offer any understanding of the complexity of copying digital materials. This is more than simply preserving a bit stream; it must also take account of other attributes of the digital object that we also need to preserve. New ways of thinking about preservation, and new skills, are needed.

WHERE DO WE LOOK FOR GUIDANCE? AUSTRALIAN EXEMPLARS

Earlier in 2004 I interviewed senior Australian information management professionals about current digital preservation strategies in Australian cultural heritage institutions. They are in charge of preservation activities in their institutions or are active commentators on digital preservation, and come from the library, recordkeeping, audiovisual archiving, and geospatial communities.
These interviews took as their starting point the contention that Australian digital preservation practice is world best practice. This contention is suggested in the literature. One expression of it is found in a 2003 report by Neil Beagrie:

"For a country with a relatively small population, Australia has a relatively large number of leading-edge online projects across all sectors. Archiving these online materials has become a significant area of effort for Australia's memory institutions, and both the NLA [National Library of Australia] and the national archive activities and guidelines are frequently cited internationally as exemplars in this area." (Beagrie 2003: 14)

The National Library of Australia's experience with digital preservation exemplifies Beagrie's view. The NLA has been active in digital preservation since at least 1994, practically the pre-history of digital preservation. It has actively shared its experience and expertise with the rest of the library community both in Australia and internationally. Early digital preservation activities include the founding of one of the first library digital preservation sections in 1995, establishing an archive of online publications, PANDORA, in 1996, and establishing the PADI (Preserving Access to Digital Information) Web site in 1997. Its experience and expertise are frequently sought out, a recent example being the authorship by Colin Webb, the NLA's Digital Preservation Manager, of the UNESCO Guidelines for the Preservation of Digital Heritage (UNESCO 2003).

This paper considers the responses to two questions asked in the interviews:
1) What makes an effective digital preservation strategy?
2) How long is long-term in digital preservation thinking?

WHAT MAKES AN EFFECTIVE DIGITAL PRESERVATION STRATEGY?

One question posed during the interviews was: What makes an effective digital preservation strategy? This is asked because, in the words of the UNESCO Guidelines, "strategies are still evolving ... there is, as yet, no universally applicable and practical solution to the problem of technological obsolescence for digital materials" (UNESCO 2003: 122, 124-125). We are, I suggest, likely to see a small number of strategies emerging from the current bewilderingly large range. The interviews with senior Australian digital preservation experts sought to establish what the characteristics of these strategies would be. Determining these characteristics could guide us to more rapid development of viable strategies.
Five themes emerged from these interviews:
1) Societal and organisational missions
2) Know what you are preserving
3) Standards
4) Operational concerns
5) Technical issues.

Theme 1: Societal and Organisational Missions

Probably most important, the respondents indicated, was a sustainable environment that supports digital preservation over time. One pessimistically put this in terms of how to assure funding over a long period:

"In Australia, we live in a political environment where [long-term resourcing is] at risk every year ... you are working on assumptions that you'll have at least the level of resourcing you've got now and indefinitely into the future. ... traditionally for national institutions there is a convention that each year you get what you had last year, maybe with some increment, maybe not ... but we have no guarantee that convention will continue into the future."

Another respondent suggested that the necessary sustainability would best be assured by building digital preservation activities into normal operating activity. Another indicated that within the organisation there needed to be "a clarity of vision [because] the higher you went in organisations, the lesser the understanding was and the more there was an attachment to a pre-digital paradigm".

The need to understand the context in which digital preservation is operating was emphasised. One respondent from the geosciences community (which is interested in keeping digital satellite mapping data and petroleum survey data for long periods) expanded this point:

"In our case you have to have a very good understanding of the business drivers ... our core business is about making it accessible in the short term. The preservation and access in the long term is only seen by the business as being valuable if it's continuing to meet a business need. So if I want to get funding for a preservation strategy, the only way I'm going to be able to do that is to link it into the drivers in that particular sector."

Another respondent suggested that no real progress would be made until "we have a reason or we have a will as a community to hang onto things".

Theme 2: Know What You Are Preserving

There was considerable comment about the need to think clearly about exactly what it is we are trying to preserve. Respondents from the archives community labelled this as "essence":

"We have this idea of what we call essence ... the essential parts, the things about a record that we think are significant enough to be retained."

This essence needs to be defined in advance, and only the data that is part of the essence is captured:

"So when we migrate to our XML format we've already decided for this particular input format like Word that these are the essential parts of that as a record ... basically we keep all content and all contextual information, i.e. metadata, that's associated with the contents but the look and feel and structure in information in records are less essential generally. ... we make decisions about the different data formats, about what we call the essence, about what we see as the essence, so it's different for every data format."

Theme 3: Standards

Most respondents commented that standards were essential in any preservation strategy. The importance of standard data formats was emphasised in a comment about the multiple attractions of XML:

"One is that it is about ordering and managing data ... The second is it's a fairly old standard ... it's actually got a very good track record. It's been in existence for over 25 years now. So for an IT standard that's extraordinarily stable."

XML's widespread and increasing use was also attractive:

"Our general policy from an overall strategy approach for digital preservation has been to say, as an organisation we must stay where the industry is. That's one of the attractions of XML. It's actually become much more widely used since 2000 [when we adopted it] and you can get XML expertise relatively easily. There are a lot of applications, there are an increasing number of applications which actually can interpret it and it's not platform-specific."

Metadata was noted by only two respondents. One noted the importance of building in "sufficient management information - metadata, preservation metadata, whatever you want - so that you don't have to go back and reconstruct information because you won't be able to do that".

Theme 4: Operational Concerns

I have grouped here several points made about operational issues. One is the need to capture the data first: "you can't preserve it unless you've got it. So that's trying to get the thing into the cycle of preservation in the first place."

Another is the importance of ensuring that digital preservation was established as part of normal operating activity, so that it was considered as a mainstream activity and became part of the normal resourcing issues. For one organisation this was articulated as a conscious decision to make digital preservation part of the preservation strategy:

"We've actually quite consciously said this is a preservation strategy, this is a preservation matter. It's not an IT matter ... We say: Well the preservation program is concerned to preserve all formats of records ... one of the formats is digital preservation. It requires different skill sets and so on, but it's about preservation. It's not about digital."

Another respondent described this in terms of an integrated response:

"integration is the key word here, because unless you have an integrated approach within an organisation, because so many people are involved here, it's all going to flounder. You need an integrated approach that just inhabits every facet of what people do."

Another theme was that data needed to be kept alive: "The stuff has to be able to migrate just to keep moving ... so migration has to be part of the system. You know, there's no question about that." This theme was also described as recognition that "it is an active process ... you do have to think about capture and creation and then active management of that object through its life, through time [and therefore a] basic tenet [was] that digital content needs active maintenance. So as soon as you put it onto a medium and put it up on the shelf, you've lost it."

Theme 5: Technical Issues

There was relatively little comment about technical issues: indeed, one respondent commented that his organisation has solved its digital preservation problem and therefore "we need to move on to other things which are to do with access and description and other issues that confront the professional".

One technical issue that dominated the comments of several respondents was the importance of not relying on proprietary data formats or systems. One respondent was especially critical of the influence of IT people, who, unless they were what he described as "hacktivists ... kind of open source type people, and IT people are usually not", were invariably "priests of proprietary systems, because that's what's given them their job and their edge and everything. So they tend to lead organisations down proprietary paths which is in the end a disaster." He considered that the light at the end of the tunnel is that "the whistle has been blown on the ride we've been taken by the vendors and the IT ... the priests of those systems that make up IT departments".
Components of an Effective Digital Preservation Strategy

The interviews so far provisionally suggest that an effective digital preservation strategy has these characteristics:
1) A sustainable environment that supports digital preservation over time
2) Digital preservation activities are built into normal operating activities
3) Definition of what it is that we are trying to preserve, the essence of a record
4) Adoption of stable, widely used and clearly defined standards
5) Building in sufficient management information metadata and preservation metadata
6) Standardising of data formats wherever possible
7) Recognition that digital preservation is an active process; keeping the data alive
8) Not reliant on proprietary data formats or systems.

HOW LONG IS LONG-TERM IN DIGITAL PRESERVATION THINKING?

Another question posed during the interviews was: How long is long-term in digital preservation thinking? This question is worth asking because it helps us to determine the resourcing we might need to carry out digital preservation effectively. It also suggests some of the ways in which our thinking and our procedures may need to change.

Responses to the question ranged from 30 to 250,000 years. A respondent from the geosciences community, which closely links its preservation activities to business requirements, offered the very conservative response of 30 to 50 years:

"So in terms of long-term for us ... for business requirements, with the information we're acquiring at the moment, we're possibly thinking up to, say, 30 to 50 years. The things I think of are, say, a satellite imagery archive and that dates back to 1979, in digital form."

Most respondents considered 100 years as a minimum requirement, although they provided different reasons for this. One argued it in economic terms: "the return on the investment in digital preservation that we make ... for me has got to be of a 100-year plus time-span". This length of time was also sufficiently long to "get over the immediacy of some of the politics surrounding [digital preservation]" and firmly establish it. His library was about to celebrate its 150th birthday and he found it useful to be able to popularise the case for preserving digital information in relation to the length of time the library had been established: "when we are 300 years old as an institution we will want to be able to retrieve those that we still consider to be valuable from 2004". In addition, he recognised that the question had another dimension, so that another answer was "as long as the community thinks it is valuable to keep".
Another respondent explained 100 years in more pragmatic terms:

"At least a life-span and effectively that's got to approach 100 years ... pragmatically it's the photo of you as the baby that you can look at as the old person ... the pragmatism of a life-span where we can still access things of our childhood in old age."

A third suggested that 100 years was notional. The key was active management of the data:

"It's a bit of a moot question, because, if you work on the basis of providing accessibility to content over time then if it's still of value then it's got to be actively managed, so it doesn't really matter if it's a million years."

One organisation had deliberately decided on 300 years, their rationale being that setting a specific period informed their current practice:

"There's no point in saying forever, because we know that nothing lasts forever, but what we also know is that, in relation to a lot of conservation science you can say, if I do these, then that will be the consequence ... I can say to get to 300 years, you know, inversely, coming back, I have to do this, I have to do this other thing ... saying 200 or 300 years informs your current practice. It actually says, well, for it to last 200 or 300 years, I need to do this now, and I need to take these actions now, or I need to find out what would be the appropriate action for me to take for it to be available in 200 years."

At the other end of the spectrum were the respondents who indicated that the only possible answer was indefinitely. One specified 250,000 years. This was based on his work with the international nuclear industry and its requirement to be able to access records relating to radioactive waste (these records are digital these days). This industry was keen to "hand to [future generations] everything that they need to know about the past. And so they can then make decisions about what they keep and what they can let go." Another indicated "I don't have an end-date in my thinking". His view was formed by working in a deposit library where the responsibility for preservation was commonly considered to be for an indefinite period.

A technical point was raised. Unlike analog copying processes, which would eventually result in the decay of the information, for digital data "there are no theoretical limitations any more on how long you can keep things going, except things like the death of the sun ... once we get to a point where we have successfully migrated the data through a few generations, we can start to feel fairly comfortable that the data will survive as long as our current society survives".
Another respondent considered that "it's a question you can't answer" because the principal determinant was not the technical preservation processes but "the organisational context in which these things survive ... you can do all the things that are necessary to continue the physical life of something, but it sits within some sort of organisational context, which we know can be disturbed or changed or destroyed, and, if that context has gone, what happens to the material".

CONCLUSION

This is preliminary material based on research in progress. These respondents are not a balanced sample: for example, the views of one large archive dominate. Still to be interviewed are staff from the National Library of Australia, although, as already noted, their points of view are already well documented, for instance in the UNESCO Guidelines (UNESCO 2003), and also personnel working in some smaller organisations such as PARADISEC, a newly established organisation that aims to preserve endangered languages.

The experience of archives with digital preservation has suggested that this profession has thought more deeply about the issues and has developed more sophisticated responses than the library community. However, there are significant differences in the nature of the information each community collects and maintains, so perhaps the challenge for libraries is how to interpret the archives experience in their own context.

The responses so far do not note much that is new, but they place different weight on some aspects of what we supposed previously. For example, relatively little importance is placed on technical matters, perhaps suggesting that the IT issues around systems, equipment and software are considered on the way to being solved. The need to place digital preservation activities in an institutional context and to integrate them into that organisation's normal operating procedures is an area that received heavier emphasis than it would have in the past.

A final comment: these lessons come from large and relatively well-funded institutions. I suggest that the real challenge is how to develop strategies and techniques that will allow the smaller and less well resourced libraries and archives to also successfully preserve digital information.

REFERENCES

Beagrie, Neil (2003). National Digital Preservation Initiatives: An Overview of Developments in Australia, France, the Netherlands, and the United Kingdom and of Related International Activity. Washington, D.C.: CLIR and Library of Congress.

Gorman, G.E. (2004). Introduction. In Gorman, G.E. and Dorner, Daniel G. (eds). Metadata Applications and Management. London: Facet Publishing, 2004, pp. xv-xxix. (International Yearbook of Library and Information Management 2003/2004).

UNESCO (2003). Guidelines for the Preservation of Digital Heritage. Prepared by the National Library of Australia. http://www.unesco.org/images/0013/001300/13007e.pdf. Accessed 20 June 2003.