20 February 2018
1.2, 10 Jan 2018
1.3, 20 Feb 2018
The National Archives of Australia will secure, preserve and provide access to digital records of enduring value.
Digital preservation aims to address the following risks:
This Policy describes the digital archiving principles and approaches adopted by the National Archives of Australia (Archives) to ensure these risks are mitigated as much as possible.
Further policy documents, procedures, standards, and guidance will be developed in future to address specific aspects of the Policy.
This Policy addresses the following target groups:
The Policy relates to other key policies and strategies, including:
This Policy applies to digital records of enduring value. These include:
Digitised records are digital material that is subject to the same broad challenges of preservation and access as born-digital records.
For the purposes of this policy, a digital record consists of content (encoded in an object such as a data file) and metadata describing the content. Both the content and metadata are essential components of a record. This policy provides for the preservation of the content and associated metadata, the maintenance of a persistent link between the two, and the creation of new metadata to document the preservation actions undertaken.
Standards play an important role in digital preservation. In particular, they provide clear benchmarks for defining requirements and measuring outcomes, and support interoperability between contemporary and future systems. Internal and external standards that the Archives applies to digital preservation include:
This policy is informed by the concepts described in the performance model. The Archives developed the performance model as a way of understanding the nature of digital records. The model states that data (the source) requires the application of a set of technologies (the process) to render its information content (the performance).
Fig. 1 Experience of a digital record
According to the model, the preservation of a digital record should focus on maintaining the performance of the record over time, rather than attempting to preserve the specific combination of source and process that supported the record when it was created. Provided that the performance can be replicated over time, the particular source and process used to render it can be changed.
The performance model is a powerful framework for describing digital preservation strategies. For example, under this model normalisation strategies change the source so that it can always be rendered by current processes, while emulation strategies maintain the source, but change the processes required to render it. The performance is defined in terms of the significant properties of the record, which must always be preserved in the performance. The persistence of these properties is fundamental to the authenticity of the record.
The Archives will align its policies with OAIS wherever practicable. OAIS is a highly influential ISO standard that provides a common framework, including terminology and concepts, for describing a digital archive.
OAIS does not just focus on the technical architecture and processes for a digital archive, but gives equal importance to people and policies as constituent parts of a digital archive.
Fig. 2 OAIS Model diagram
Metadata is essential for the effective management, preservation, use and reuse of records. The types of metadata used to support the management of authentic digital records include:
Different metadata standards, and different schemas, are required for these purposes. The Archives' standard for descriptive metadata on records in Archives' custody is the Commonwealth Record Series (CRS) system. The industry standard for preservation metadata is PREMIS. Agencies may have developed bespoke technical metadata schemas to manage particular digital resources. These metadata schemas will be recorded in a metadata registry and metadata repository with persistent links to the digital objects.
Subject to an assessment of the reliability of the source, the Archives will always generate and process metadata systematically and through highly automated processes such as an application programming interface (API). Wherever possible, the Archives sources recordkeeping and technical metadata from the creating agency and reuses it for archival management and control. This is supplemented by metadata that is systematically extracted from the digital objects themselves. Manual metadata creation will normally be feasible only at a very high level of description, for example to describe the Series or the Agency. The Archives acknowledges that data is imperfect and we cannot always be confident that it is accurate. We will mitigate this risk by ensuring that agencies are compliant with our standards and advice, and by assessing, securing and validating metadata at the point of transfer.
The metadata model used by the Archives must be capable of describing multiple versions of a digital record, and creating a complete audit trail of the preservation actions that have generated those versions. This model must also accurately document the relationships between digital records, and between records and records’ creators. These relationships do not always reflect the traditional archival hierarchy of Organisation, Agency, Function, Series, Item. Digital records can be threaded discussions using a web-based application, a tweet with embedded video, websites, structured datasets and computer code. The metadata model used by the Archives must allow for more fluid and interlinked relationships between entities and objects. The Archives will draw on initiatives like the International Council on Archives’ Records in Contexts (RiC) to develop new models of contextual description that extend the flexibility of the CRS system.
The Archives endorses the use of open, standards-based formats and accepted, well-documented industry standards, and will publish its preferred and acceptable preservation file formats for digital records.
The Archives prescribes long-term preservation file formats for categories of digital records. The Archives will investigate and conduct risk assessments of file formats to determine which formats are low risk for the long-term preservation of information. This work will draw on national and international best practice.
In determining long-term preservation formats, the Archives will take account of format characteristics:
These criteria are based on the work on long-term preservation formats undertaken by the Australasian Digital Recordkeeping Initiative (ADRI).
'At-risk' formats are formats that the Archives has determined are at high risk of becoming inaccessible within a short period of time. Records in these formats are normalised into 'preferred' preservation formats. The original bitstream is also retained. In determining 'at-risk' formats, the Archives will take account of one or more of the following characteristics:
The Archives publicly documents our internal standards for the creation of digital surrogates, including preservation master files, and provides guidance to agencies on creation of digital surrogates.
The following principles underpin the Archives' approach to digital preservation. The principles are designed to ensure that digital records in the Archives’ custody remain authentic and accessible to anybody who wants to access and use them in the future.
The record must be trusted as an accurate representation of the original record. The Archives will ensure authenticity through the operation of transparent and fully documented preservation strategies, and by capturing and providing the metadata required to describe the content, context and provenance of the record.
The record must be complete and protected against unauthorised or accidental alteration. The Archives ensures the integrity of the record by always keeping the original record (bitstream preservation), checking the fixity of digital objects, and capturing a full audit trail of all preservation actions performed on the record.
The record must be capable of being accessed by users, across time and changing technical environments. This requires that the record be locatable and retrievable, that it can be rendered in a current technical environment, and that it can be understood and interpreted by users. The Archives ensures usability through:
The Archives will continue to develop its ability to provide access to digital records in new and innovative ways to meet the evolving needs and expectations of our users.
The Archives takes a risk-based approach to digital preservation and employs several strategies to enable effective preservation of the digital collection.
The preservation of digital records is a complex task: the physical media, encoding format, preservation software and method must be carefully and persistently managed in order to preserve the records. The Archives will implement different preservation strategies over time, for different types of digital records.
These strategies will be selected according to the requirements of particular record types, evolving international best practice, and the resources required to carry out the preservation strategy. Irrespective of the strategies employed at a given time, the Archives will always retain the original digital record (bitstream preservation), and may retain all intermediate versions created as a result of preservation actions.
Both digital preservation and digitisation involve the copying of content. Therefore, copying for preservation is subject to copyright legislation. The Archives will comply with intellectual property rights and with other legal and moral rights related to copying, storage, modification of content, and the use of digital records.
The Archives will ensure the complete and full capture of digital records at the time of transfer. Inadequate or incomplete capture results in inauthentic and unreliable records. The Archives will ensure complete capture through:
Transfer validation: Archives staff will quality check transfers to ensure the transfer is properly authorised and that the digital records have been sentenced as 'Retain as National Archives' (RNA) against a current Records Authority. Archives staff will check the quality, comprehensiveness and accessibility of digital records and enforce minimum standards at the time of transfer. The Archives will not accept for transfer digital records that are password protected, encrypted, or which have viruses or other malware.
Metadata validation: this ensures that the metadata we receive is technically correct and prevents changes during the transfer process to the metadata we receive, for example to the last modified date.
Fixity checking: this is used to detect any alterations to the digital records during the transfer process. Checksums are generated prior to transfer and are checked when the transfer is received by the Archives.
Transfer complete datasets and metadata: Archives staff will actively seek to transfer complete datasets and metadata. The future opportunities to analyse complete sets of data (as opposed to partial or sample datasets) can fundamentally change the value of that information.
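The fixity check described above can be illustrated with a minimal Python sketch. It is not the Archives' implementation. It assumes SHA-256 as the checksum algorithm and a simple manifest that maps file names to checksums generated by the transferring agency. The names `sha256_of` and `verify_transfer` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(manifest: dict, received_dir: Path) -> dict:
    """Compare received files against the checksums generated before transfer.

    `manifest` maps file name -> expected SHA-256 hex digest (assumed format).
    Returns a mapping of file name -> "ok", "missing" or "mismatch".
    """
    results = {}
    for name, expected in manifest.items():
        target = received_dir / name
        if not target.is_file():
            results[name] = "missing"
        elif sha256_of(target) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

A transfer would be accepted only when every entry is reported as "ok". Any "missing" or "mismatch" result indicates that the record set changed between the agency and the Archives.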
The Archives will achieve its objectives through digital preservation infrastructure that ensures data integrity, format sustainability, and information security. The Archives will do this through:
Storage selection: carefully select storage technologies to maximise the periods between refreshment cycles, and simplify the refreshment process itself, in addition to providing the most secure storage environment possible. Detailed criteria and methods for selecting appropriate storage solutions, including cloud services, will be developed as part of the storage procurement process.
Redundancy and back-up: maintain redundant copies of digital records through appropriate replication and backup processes. The viability and integrity of backup copies, including the ability to restore from backups, will be periodically tested.
Data integrity: malware scanning and checking for file fixity. Files with fixity issues will be repaired and/or replaced.
Information security: identify and enforce who has access to digital archive systems and services, and create and maintain complete record logs of processing actions. Provide a secure and stable digital platform to protect against cyber threats.
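As an illustration of how redundancy and fixity checking work together, the sketch below audits several copies of one record against its recorded checksum and replaces any damaged copy with a verified one. It is a simplified model under stated assumptions: replicas are held in memory as bytes, SHA-256 is the checksum, and the names `digest` and `audit_and_repair` are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas: dict, expected: str) -> list:
    """Check every replica against the recorded checksum and repair failures.

    `replicas` maps a storage location name -> the bytes held there.
    Damaged copies are overwritten with a copy that passed the check.
    Returns the names of the locations that were repaired.
    """
    good = next(
        (data for data in replicas.values() if digest(data) == expected), None
    )
    if good is None:
        # No intact copy remains in the replicas; a backup restore is needed.
        raise ValueError("no intact replica available")
    repaired = []
    for name, data in replicas.items():
        if digest(data) != expected:
            replicas[name] = good
            repaired.append(name)
    return repaired
```

In practice each repair would itself be logged, so that the audit trail records when a copy was replaced and which copy it was replaced from.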
The Archives undertakes two fundamental preservation activities:
Bitstream preservation: this approach involves maintaining the digital record in its original format. The purpose of bitstream preservation is to ensure the continuing integrity of the digital record.
Normalisation: this approach involves converting a digital record from an at-risk data format to a preservation format in order to ensure its continued accessibility with current technology.
The Archives conducts format assessments to determine the risk level of particular formats. The risk assessment is used to determine whether a digital record is normalised or maintained in its original format.
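The decision described above can be sketched as a lookup against a format risk register. The example is illustrative only: the register contents, the format identifiers and the names `Plan` and `plan_preservation` are hypothetical and do not reflect the Archives' published format lists. The original bitstream is retained in every case, consistent with this policy.

```python
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    action: str                   # "bitstream-only", "normalise" or "assess"
    target_format: Optional[str]  # preservation format, if normalising
    keep_original: bool           # always True under this policy


# Hypothetical register: format identifier -> (risk level, preferred format).
RISK_REGISTER = {
    "example/open-text": ("low", None),
    "example/legacy-wordprocessor": ("high", "example/open-text"),
}


def plan_preservation(fmt: str) -> Plan:
    """Choose a preservation action for a record in the given format."""
    entry = RISK_REGISTER.get(fmt)
    if entry is None:
        # Format not yet assessed: keep the bitstream and flag for assessment.
        return Plan("assess", None, True)
    risk, preferred = entry
    if risk == "high":
        # At-risk format: normalise, and still retain the original bitstream.
        return Plan("normalise", preferred, True)
    return Plan("bitstream-only", None, True)
```

Returning "assess" for unknown formats, rather than guessing, keeps the risk assessment as the single point at which a format is classified.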
Preservation processes that result in any physical or logical change to a digital object will be logged and recorded in the associated metadata, to provide an audit trail. All changes to metadata will themselves be audited.
The relationship between any digital object and its metadata will be maintained persistently. A persistent, unique identifier will be assigned to every digital object at the point of ingest, and this identifier will be recorded within the associated metadata to provide a persistent link.
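The persistent link and audit trail described above can be sketched as follows. This is a minimal illustration, not the Archives' data model: it assumes a UUID as the persistent identifier and uses in-memory dictionaries in place of a repository and a metadata registry. The names `MetadataRecord` and `ingest` are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MetadataRecord:
    object_id: str  # persistent identifier shared with the stored object
    title: str
    events: list = field(default_factory=list)  # audit trail of actions

    def log_event(self, event_type: str, detail: str) -> None:
        """Append a timestamped entry to the audit trail."""
        self.events.append({
            "type": event_type,
            "detail": detail,
            "at": datetime.now(timezone.utc).isoformat(),
        })


def ingest(title: str, content: bytes, store: dict, registry: dict) -> str:
    """Assign an identifier at ingest and record it in the metadata."""
    object_id = str(uuid.uuid4())  # assumed identifier scheme
    store[object_id] = content
    record = MetadataRecord(object_id=object_id, title=title)
    record.log_event("ingest", f"{len(content)} bytes received")
    registry[object_id] = record
    return object_id
```

Because the same identifier keys both the stored object and its metadata record, later preservation actions such as normalisation can be logged against the record with `log_event` without breaking the link to the original.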
Digital records and digitised records will be delivered to users in ways that do not limit the use and re-use of the records' content.
The Archives will ensure that its digital preservation activities are carried out by sufficient staff with the appropriate skills. The Archives may use a combination of in-house staff, contractors, and consultants to achieve its objectives. The Archives will provide training opportunities to allow staff to develop, maintain or enhance their digital preservation expertise. These opportunities may include participation in courses, self-directed learning, attendance at national and international seminars, workshops and conferences, study visits, internships, and working exchanges with other institutions and professional bodies.
The Archives will maintain professional relationships with the wider digital preservation community in Australia and internationally. Where appropriate, it may actively participate in initiatives through partnerships and collaboration with appropriate organisations, e.g. the Australasian Digital Recordkeeping Initiative, the Council of Australasian Archives and Records Authorities, the International Council on Archives, the UNESCO PERSIST Programme, the International Association of Sound and Audiovisual Archives, the Open Preservation Foundation, and the Digital Preservation Coalition.
Implementing a digital preservation solution will require working across traditional boundaries within the Archives and, from time to time, with external partners.
Broadly, Collection Management Branch will be responsible for setting, maintaining, and monitoring compliance with digital preservation strategy and policy; transfer; technology watch (with Applications and Business Engagement); storage and lending of digital records; preservation planning; preservation action (with ICT Infrastructure); and outreach.
ICT Infrastructure is responsible for ensuring that the ICT Technology Roadmap and all ICT projects comply with this Policy and associated standards.
Different areas of the Archives are responsible for creating or modifying digital resources. For example, Digitisation and Imaging Services create preservation master versions and other surrogates, the Declassification Unit creates redacted versions for access, and Public Programs create digital content for exhibitions. Content creators are responsible for ensuring that digital resources are created and managed in accordance with this Policy and associated strategies.
This policy is aligned with or references the following legislation and standards:
Archives Act 1983
Privacy Act 1988
Freedom of Information Act 1982
Evidence Act 1995
Electronic Transactions Act 1999
ISO 20652:2006 Space data and information transfer systems – Producer-archive interface
ISO/TR 18492:2005 Long-term preservation of electronic-based information
ISO/IEC 20000 Information technology – Service management
ISO/IEC 27000:2009 Information technology – Security techniques – Information security management systems – Overview and vocabulary
AS/NZS ISO/IEC 27001:2006 Information technology – Security techniques – Information security management systems – Requirements
AS/NZS ISO/IEC 27002:2006 Information technology – Security techniques – Code of practice for information security management
AS/NZS ISO/IEC 27005:2012 Information technology – Security techniques – Information security risk management
National Archives of Australia, Commonwealth Record Series (CRS) Manual (2004)
National Archives of Australia, Australian Government Recordkeeping Metadata Standard (AGRkMS) Version 2.2
National Archives of Australia, Performance Model – An Approach to the Preservation of Digital Records
Records in Contexts (RiC) – Conceptual Model – International Council on Archives