Digital Preservation Policy

Preserving Archival Digital Records Transferred from Commonwealth Agencies

20 February 2018

Version 1.3

Revision history 

1.2, 10 Jan 2018 
1.3, 20 Feb 2018

1. Policy statement

The National Archives of Australia will secure, preserve and provide access to digital records of enduring value.

2. Policy Aims

Digital preservation aims to address the following risks:

  • loss of access to the content of digital records due to future software obsolescence,
  • data loss due to the obsolescence or failure of the hardware or media used to store digital records,
  • data loss due to inadvertent or malicious alteration of content, and
  • inauthentic or unreliable data due to incomplete or inadequate capture of digital records and metadata at the time of transfer.

This Policy describes the digital archiving principles and approaches adopted by the National Archives of Australia (Archives) to ensure these risks are mitigated as much as possible.

Further policy documents, procedures, standards, and guidance will be developed in future to address specific aspects of the Policy.

This Policy addresses the following target groups:

  • Archives' staff
  • Commonwealth Government agencies
  • Expert groups in the digital archiving community
  • Public clients

3. Related documents

The Policy relates to other key policies and strategies, including:

4. Scope

This Policy applies to digital records of enduring value. These include:

  • Born-digital records, which were created and managed digitally for business purposes and subsequently transferred into the custody of the Archives. Born-digital records include not only the common image and document formats, but also emerging formats, email, audiovisual records, mixed media, structured datasets and computer code.
  • Digitised records, which were created in analogue form, but have been subsequently converted to digital form for one of the following reasons:
    • Business: Created by an agency.
    • Preservation: Created by an agency or the Archives to preservation standard.
    • Access: Created by an agency or the Archives.
    • General Records Authority 31 and Disposal of Records in the Archives' Custody Following Digitisation: Created by an agency or the Archives to preservation standard and the analogue record subsequently destroyed.

Digitised records are digital material that is subject to the same broad challenges of preservation and access as born-digital records.

For the purposes of this policy, a digital record consists of content (encoded in an object such as a data file) and metadata describing the content. Both the content and metadata are essential components of a record. This policy provides for the preservation of the content and associated metadata, the maintenance of a persistent link between the two, and the creation of new metadata to document the preservation actions undertaken.

5. Standards

Standards play an important role in digital preservation. In particular, they provide clear benchmarks for defining requirements and measuring outcomes, and support interoperability between contemporary and future systems. Internal and external standards that the Archives applies to digital preservation include:

  • Conceptual models and standards such as the Reference Model for Open Archival Information Systems and the Archives' Performance Model
  • Metadata standards such as the Commonwealth Record Series (CRS) system, the Australian Government Recordkeeping Metadata Standard (AGRkMS), and Preservation Metadata: Implementation Strategies (PREMIS)
  • File format standards such as ISO/IEC 26300: Open Document Format for Office Applications, ISO/IEC 15948: Portable Network Graphics, and TIFF (Tagged Image File Format) Revision 6.0
  • Internal standards for digitisation, preservation formats, transfer and storage and retrieval.

5.1 Performance Model

This policy is informed by the concepts described in the performance model. The Archives developed the performance model as a way of understanding the nature of digital records. The model states that data (the source) requires the application of a set of technologies (the process) to render its information content (the performance).

Data is processed using hardware and software and rendered on screen for the client to view
Fig. 1 Experience of a digital record

According to the model, the preservation of a digital record should focus on maintaining the performance of the record over time, rather than attempting to preserve the specific combination of source and process that supported the record when it was created. Provided that the performance can be replicated over time, the particular source and process used to render it can be changed.

The performance model is a powerful framework for describing digital preservation strategies. For example, under this model normalisation strategies change the source so that it can always be rendered by current processes, while emulation strategies maintain the source, but change the processes required to render it. The performance is defined in terms of the significant properties of the record, which must always be preserved in the performance. The persistence of these properties is fundamental to the authenticity of the record.

5.2 Reference Model for Open Archival Information Systems (OAIS), ISO 14721

The Archives will align its policies with OAIS wherever practicable. OAIS is a highly influential ISO standard that provides a common framework, including terminology and concepts, for describing a digital archive.

OAIS does not just focus on the technical architecture and processes for a digital archive, but gives equal importance to people and policies as constituent parts of a digital archive.

Data from the producer including descriptive information is ingested into archival storage and data management. Access is enabled for the consumer to query the data via the descriptive information, retrieve results and place orders.
Fig. 2 OAIS Model diagram

5.3 Metadata

Metadata is essential for the effective management, preservation and use and reuse of records. The types of metadata used to support the management of authentic digital records include:

  • Descriptive metadata, which describes the intellectual record and its context
  • Technical metadata, such as the file format of the digital record, resolution, the size of the digital record, the software in which the record was created and managed
  • Access and rights metadata, defining who is allowed to view the record under what conditions, and what they can do with it (reuse)
  • Preservation metadata, which keeps track of actions taken to preserve or sustain the record for later access and use.

Different metadata standards, and different schemas, are required for these purposes. The Archives' standard for descriptive metadata on records in Archives' custody is the Commonwealth Record Series (CRS) system. The industry standard for preservation metadata is PREMIS. Agencies may have developed bespoke technical metadata schemas to manage particular digital resources. These metadata schemas will be recorded in a metadata registry and metadata repository with persistent links to the digital objects.

Subject to an assessment of the reliability of the source, the Archives will always generate and process metadata systematically and through highly automated processes such as an application programming interface (API). Wherever possible, the Archives sources recordkeeping and technical metadata from the creating agency and reuses it for archival management and control. This is supplemented by metadata which is systematically extracted from the digital objects themselves. Manual metadata creation will normally be feasible only at a very high level of description, for example to describe the Series or the Agency. The Archives acknowledges that data is imperfect and we cannot always be confident that it is accurate. We will mitigate this risk by ensuring that agencies are compliant with our standards and advice, and by assessing, securing and validating metadata at the point of transfer.
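
As an illustration of metadata that is systematically extracted from the digital objects themselves, the following minimal Python sketch derives a few technical fields from a file. The function name and field names are hypothetical and do not represent the Archives' schema or tooling. The media type is only guessed from the file extension, which is a weak hint; a dedicated format identification tool would normally be used instead.

```python
import hashlib
import mimetypes
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict


def extract_technical_metadata(path: Path) -> Dict[str, object]:
    """Derive basic technical metadata from the file itself (illustrative fields only)."""
    stat = path.stat()
    # Extension-based guess only: an assumption for this sketch, not format identification.
    media_type, _encoding = mimetypes.guess_type(path.name)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "file_name": path.name,
        "size_bytes": stat.st_size,
        "media_type_guess": media_type,
        "last_modified_utc": modified.isoformat(),
        "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
    }
```

Reading the last modified date before any copying takes place matters, because copying a file can reset that value.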

The metadata model used by the Archives must be capable of describing multiple versions of a digital record, and creating a complete audit trail of the preservation actions that have generated those versions. This model must also accurately document the relationships between digital records, and between records and their creators. These relationships do not always reflect the traditional archival hierarchy of Organisation, Agency, Function, Series, Item. Digital records can include threaded discussions in a web-based application, a tweet with embedded video, websites, structured datasets and computer code. The metadata model used by the Archives must therefore allow for more fluid and interlinked relationships between entities and objects. The Archives will draw on initiatives like the International Council on Archives’ Records in Contexts (RiC) to develop new models of contextual description that extend the flexibility of the CRS system.
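
The requirement to describe multiple versions of a record and the preservation actions that produced them can be sketched as a small data structure. This is a simplified illustration under stated assumptions: the class names, field names and event labels are hypothetical, and the structure is far simpler than PREMIS or the CRS system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PreservationEvent:
    """One audit-trail entry: which action produced which version, and from what."""
    event_type: str                 # hypothetical label, e.g. "ingest" or "normalisation"
    agent: str                      # person or system that performed the action
    source_version: Optional[str]   # None for the original bitstream
    outcome_version: str
    timestamp: datetime


@dataclass
class DigitalRecord:
    identifier: str
    versions: List[str] = field(default_factory=list)
    events: List[PreservationEvent] = field(default_factory=list)

    def add_version(self, version_id: str, event_type: str, agent: str,
                    derived_from: Optional[str] = None) -> None:
        """Register a new version together with the event that created it."""
        if version_id in self.versions:
            raise ValueError(f"version already exists: {version_id}")
        if derived_from is not None and derived_from not in self.versions:
            raise ValueError(f"unknown source version: {derived_from}")
        self.versions.append(version_id)
        self.events.append(PreservationEvent(
            event_type, agent, derived_from, version_id,
            datetime.now(timezone.utc)))

    def lineage(self, version_id: str) -> List[str]:
        """Walk from a version back to the original it was derived from."""
        parents: Dict[str, Optional[str]] = {
            e.outcome_version: e.source_version for e in self.events}
        if version_id not in parents:
            raise KeyError(version_id)
        chain: List[str] = []
        current: Optional[str] = version_id
        while current is not None:
            chain.append(current)
            current = parents[current]
        return chain
```

Because every version is added only through an event, the audit trail cannot fall out of step with the list of versions, and the original is never overwritten.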

5.4 Preservation formats

The Archives endorses the use of open, standards-based formats and accepted, well-documented industry standards, and will publish its preferred and acceptable preservation file formats for digital records.

The Archives prescribes long-term preservation file formats for categories of digital records. The Archives will investigate and conduct risk assessments of file formats to determine which formats are low risk for the long-term preservation of information. This work will draw on national and international best practice.

5.4.1 Preferred formats

In determining long-term preservation formats, the Archives will take account of format characteristics:

  • Usage: The format is widely used and supported around the world (including as an archival format).
  • Restrictions: The format is free from patents and legal encumbrance (including intellectual property rights) and preferably embodies open-source principles. Technical protection methods (DRM/Encryption/Password protection) are not used.
  • Documentation: The format is identifiable and is well documented by the format creator (the specification is publicly available, anyone with sufficient skills and incentive can build software to read it accurately).
  • Robustness: The file format is stable (rare releases of newer versions) and is backward and forward compatible or has a clear migration path.
  • Interoperability: Formats that are supported by a wide range of software or are platform-independent are most desirable.
  • Compression: The format uses lossless compression.
  • Technical support: Technical support is readily available from vendors, community or third parties.
  • Support for metadata: File formats with metadata support are preferred.

These criteria are based on the work on long-term preservation formats undertaken by the Australasian Digital Recordkeeping Initiative (ADRI).

5.4.2 At-risk formats

'At-risk' formats are formats that the Archives has determined are at high risk of becoming inaccessible within a short period of time. Records in these formats are normalised into 'preferred' preservation formats. The original bitstream is also retained. In determining 'at-risk' formats, the Archives will take account of one or more of the following characteristics:

  • the format is poorly documented
  • patent or licence restrictions apply to the format
  • there is limited choice of software implementing the format
  • the format is not supported
  • the international digital preservation community has determined that the format is at-risk.
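
The characteristics above could be recorded per format and screened automatically. The following Python sketch is one possible reading, assuming that a single matching characteristic is enough to flag a format for review. The class, the field names and the `min_software` threshold are hypothetical; the policy does not define how the characteristics are weighed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FormatProfile:
    """Assessment of one file format against the 'at-risk' characteristics."""
    name: str
    well_documented: bool
    patent_or_licence_restricted: bool
    implementing_software_count: int
    actively_supported: bool
    flagged_by_community: bool


def at_risk_reasons(profile: FormatProfile, min_software: int = 2) -> List[str]:
    """Return the characteristics that apply; an empty list means none matched."""
    reasons: List[str] = []
    if not profile.well_documented:
        reasons.append("poorly documented")
    if profile.patent_or_licence_restricted:
        reasons.append("patent or licence restrictions")
    # Threshold is an assumption: fewer than `min_software` implementations counts as limited.
    if profile.implementing_software_count < min_software:
        reasons.append("limited choice of software")
    if not profile.actively_supported:
        reasons.append("not supported")
    if profile.flagged_by_community:
        reasons.append("flagged by preservation community")
    return reasons


def is_at_risk(profile: FormatProfile) -> bool:
    """Flag the format if at least one characteristic applies."""
    return bool(at_risk_reasons(profile))
```

Returning the list of reasons, rather than only a yes or no answer, keeps the basis for each decision available for documentation.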

5.5 Digitisation Standards

The Archives publicly documents its internal standards for the creation of digital surrogates, including preservation master files, and provides guidance to agencies on the creation of digital surrogates.

6. Principles

The following principles underpin the Archives' approach to digital preservation. The principles are designed to ensure that digital records in the Archives’ custody remain authentic and accessible to anybody who wants to access and use them in the future.

6.1 Authenticity

The record must be trusted as an accurate representation of the original record. The Archives will ensure authenticity through the operation of transparent and fully documented preservation strategies, and by capturing and providing the metadata required to describe the content, context and provenance of the record.

6.2 Integrity

The record must be complete and protected against unauthorised or accidental alteration. The Archives ensures the integrity of the record by always keeping the original record (bitstream preservation), performing fixity checks on digital objects, and capturing a full audit trail of all preservation actions performed on the record.

6.3 Ongoing Access and Use of Records

The record must be capable of being accessed by users, across time and changing technical environments. This requires that the record be locatable and retrievable, that it can be rendered in a current technical environment, and that it can be understood and interpreted by users. The Archives ensures usability through:

  • digital preservation strategies to ensure that content remains accessible over time, for example migration or emulation strategies
  • capture and provision of metadata sufficient to allow the record to be located, retrieved and understood
  • making digital records accessible without restriction, unless there is a justifiable reason to restrict or close access, for example an exemption under Section 33 of the Archives Act.

The Archives will continue to develop its ability to provide access to digital records in new and innovative ways to meet the evolving needs and expectations of our users.

6.4 Digital Preservation Strategies

The Archives takes a risk-based approach to digital preservation and employs several strategies to enable effective preservation of the digital collection.

The preservation of digital records is a complex task: the physical media, encoding format, preservation software and method must be carefully and persistently managed in order to preserve the records. The Archives will implement different preservation strategies over time, for different types of digital records.

These strategies will be selected according to the requirements of particular record types, evolving international best practice, and the resources required to carry out the preservation strategy. Irrespective of the strategies employed at a given time, the Archives will always retain the original digital record (bitstream preservation), and may retain all intermediate versions created as a result of preservation actions.

6.5 Copyright

Both digital preservation and digitisation involve the copying of content. Therefore, copying for preservation is subject to copyright legislation. The Archives will comply with intellectual property rights and with other legal and moral rights related to copying, storage, modification of content, and the use of digital records.

7. Policy Requirements

7.1 Digital Records Capture

The Archives will ensure the complete and full capture of digital records at the time of transfer. Inadequate or incomplete capture results in inauthentic and unreliable records. The Archives will ensure complete capture through:

Transfer validation: Archives staff will quality check transfers to ensure the transfer is properly authorised and that the digital records have been sentenced as 'Retain as National Archives' (RNA) against a current Records Authority. Archives staff will check the quality, comprehensiveness and accessibility of digital records and enforce minimum standards at the time of transfer. The Archives will not accept for transfer digital records that are password protected, encrypted, or which have viruses or other malware.

Metadata validation: this ensures that the metadata we receive is technically correct and prevents changes to metadata we receive in the transfer process, for example the last modified date.

Fixity checking: this is used to prevent any alterations to the digital records during the transfer process. Checksums are generated prior to transfer and are checked when the transfer is received by the Archives.

Transfer complete datasets and metadata: Archives staff will actively seek to transfer complete datasets and metadata. The future opportunities to analyse complete sets of data (as opposed to partial or sample datasets) can fundamentally change the value of that information.
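
The fixity check described above, in which checksums are generated before transfer and checked on receipt, can be sketched in Python as follows. This is a minimal illustration, not the Archives' transfer tooling: the function names and the manifest structure are hypothetical, and SHA-256 is chosen here as one of the checksum algorithms named in the Glossary.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Before transfer: map each file's relative path to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: Dict[str, str]) -> Dict[str, List[str]]:
    """On receipt: report files that are missing, altered or not in the manifest."""
    received = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(received)),
        "altered": sorted(
            name for name in set(manifest) & set(received)
            if manifest[name] != received[name]
        ),
        "unexpected": sorted(set(received) - set(manifest)),
    }
```

A transfer would pass this check only when all three lists in the report are empty. Note that a checksum comparison detects alteration after the fact; it does not itself stop an alteration from happening.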

8. Skills and Training

The Archives will ensure that its digital preservation activities are carried out by sufficient staff with the appropriate skills. The Archives may use a combination of in-house staff, contractors, and consultants to achieve its objectives. The Archives will provide training opportunities to allow staff to develop, maintain or enhance their digital preservation expertise. These opportunities may include participation in courses, self-directed learning, attendance at national and international seminars, workshops and conferences, study visits, internships, and working exchanges with other institutions and professional bodies.

9. Research and collaboration

The Archives will maintain professional relationships with the wider digital preservation community in Australia and internationally. Where appropriate it may actively participate in initiatives, through partnerships and collaboration with appropriate organisations, e.g. the Australasian Digital Recordkeeping Initiative, the Council of Australasian Archives and Records Authorities, the International Council on Archives, the UNESCO PERSIST Programme, the International Organisation of Sound and Audiovisual Archives, the Open Preservation Foundation, and the Digital Preservation Coalition.

10. Roles and responsibilities

Implementing a digital preservation solution will require working across traditional boundaries within the Archives and, from time to time, with external partners.

Broadly, Collection Management Branch will be responsible for setting, maintaining, and monitoring compliance with digital preservation strategy and policy; transfer; technology watch (with Applications and Business Engagement); storage and lending of digital records; preservation planning; preservation action (with ICT Infrastructure); and outreach.

ICT Infrastructure is responsible for ensuring that the ICT Technology Roadmap and all ICT projects comply with this Policy and associated standards.

Different areas of the Archives are responsible for creating or modifying digital resources. For example, Digitisation and Imaging Services create preservation master versions and other surrogates, the Declassification Unit creates redacted versions for access, and Public Programs create digital content for exhibitions. Content creators are responsible for ensuring that digital resources are created and managed in accordance with this Policy and associated strategies.

11. Digital Archiving and other standards

This policy is aligned with or references the following standards:

11.1 Commonwealth legislation

Archives Act 1983

Privacy Act 1988

Freedom of Information Act 1982

Evidence Act 1995

Electronic Transactions Act 1999

11.2 National/International standards

ISO 14721:2012 Space data and information transfer systems – Open archival information system (OAIS) – Reference model

ISO 20652:2006 Space data and information transfer systems – Producer-archive interface

ISO/TR 18492:2005 Long-term preservation of electronic-based information

ISO/IEC 20000 Information technology – Service management

ISO/IEC 27000:2009 Information technology – Security techniques – Information security management systems – Overview and vocabulary

AS/NZS ISO/IEC 27001:2006 Information technology – Security techniques – Information security management systems – Requirements

AS/NZS ISO/IEC 27002:2006 Information technology – Security techniques – Code of practice for information security management

AS/NZS ISO/IEC 27005:2012 Information technology – Security techniques – Information security risk management

11.3 Other standards

National Archives of Australia, Commonwealth Record Series (CRS) Manual (2004)

National Archives of Australia, Australian Government Recordkeeping Metadata Standard (AGRkMS) Version 2.2

PREMIS Data Dictionary for Preservation Metadata, version 3.0

National Archives of Australia, Performance Model – An Approach to the Preservation of Digital Records

Records in Contexts (RiC) – Conceptual Model – International Council on Archives

ISAD(G) – International Council on Archives: General International Standard Archival Description, 2nd ed., (2000)

Australian Government Information Security Manual (ISM)

Protective Security Policy Framework

12. Glossary

Access
The right, opportunity, means of finding, using or retrieving information (AS ISO 15489.1)

Accessible
Records that can be identified, located and accessed as required.

Bitstream
A contiguous sequence of bits, representing a stream of data. In digital archiving, the "original bitstream" is the record in its original format.

Born-digital record
Records created and managed digitally.

Digital archiving
The identification, appraisal, description, storage, preservation, management and retrieval of digital records, including all the policies, guidelines and systems associated with those processes, so that the logical and physical integrity of the records is securely maintained over time.

Digital object
An object composed of a set of bit sequences.

Digital preservation
An essential and necessary component of digital archiving ensuring longevity of a digital record. Digital preservation covers the processes and operations involved in ensuring the technical and intellectual survival of authentic records over time (such as the ongoing monitoring, migration and storage of records and managing the metadata which describes the origin and successive treatment of the record).

Digital record
A record produced, stored or transmitted by digital means rather than physical means. A digital record includes born-digital records and digitised records.

Digital surrogate
The record produced as a result of a digitisation process, or photographic imaging.

Digitisation
The process of creating digital files by scanning or otherwise converting analogue materials.

Emulation
A means of overcoming technological obsolescence of hardware and software by developing techniques for imitating obsolete systems on future generations of computers.

Fixity check
A method for ensuring the integrity of a file and verifying it has not been altered or corrupted. It is most often accomplished by computing checksums such as MD5, SHA-1 or SHA-256 for a file and comparing them to a stored value.

Metadata
Structured information that describes and/or allows users to find, manage, control, understand or preserve other information over time. Record title and creating agency are examples of metadata.

Migration
A means of overcoming technological obsolescence by transferring digital resources from one hardware/software generation to the next.

Normalisation
The process of transforming a digital record from one data format (typically proprietary) to an archival data format (typically an open standard).

Preservation master
Produced when original records are at risk for loss of information, typically deterioration or obsolescence. Preservation master files are created at high to maximum capture specifications and can therefore serve a variety of purposes, including satisfying long-term preservation needs as well as fulfilling client requests for high-quality files.

Significant properties
Characteristics of a particular object subjectively determined to be important to maintain through preservation actions.

Copyright National Archives of Australia 2018