Into the digital archive

Quarantine

As part of the National Archives' records transfer process, media containing digital objects for preservation arrive at the National Archives from Commonwealth Agencies or Persons. Once the digital objects are received, they are processed in the quarantine facility:

  1. The transfer media (such as CD, DVD or USB hard disk) is connected to the quarantine facility

  2. An automated manifest check is run to confirm that the correct files have been received

  3. An antivirus scan is run against the transfer media

  4. The contents of the transfer media are copied to a carrying device (such as a USB hard disk) and the checksums are verified. A checksum is a value calculated from a digital data object by an algorithm, and is used to determine the integrity and authenticity of that object. Comparing a freshly calculated checksum with the recorded one shows whether errors or alterations have occurred during the transmission or storage of a data object.

  5. The carrying device is then disconnected and placed into storage for 28 days. During the 28-day quarantine period, the antivirus definitions in the quarantine facility are updated daily. Any new virus found 'in the wild' during this time should have its signature added to the antivirus definition files by the antivirus provider, and hence to the quarantine facility.

  6. After 28 days, the carrying device is reconnected to the quarantine facility and the contents are scanned again. If all is well, the carrying device is disconnected and taken to the next stage. If any virus is detected, the transfer is halted and the originating agency is consulted.
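
The manifest and checksum checks in steps 2 and 4 can be sketched as follows. This is an illustration only, not the National Archives' tooling: the manifest layout (a mapping of relative file path to digest), the choice of SHA-256 and the function names are all assumptions, since the text does not specify them.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare files under root with a manifest of {relative path: digest}.

    The manifest layout is hypothetical; a real transfer manifest may differ.
    """
    found = {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}
    expected = set(manifest)
    return {
        # Listed in the manifest but not present on the media.
        "missing": sorted(expected - found),
        # Present on the media but not listed in the manifest.
        "unexpected": sorted(found - expected),
        # Present and listed, but the contents do not match the digest.
        "mismatched": sorted(
            name for name in expected & found
            if sha256_of(root / name) != manifest[name]
        ),
    }
```

A transfer would pass this check only if all three lists in the result are empty.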

Preservation

The carrying device from the quarantine facility is connected to the preservation facility and the checksums of its contents are checked. Each data object is processed via Xena to produce:

  • a base64-encoded version (ASCII text) of the file, with an XML metadata header and footer
  • an open format conversion of the original file.

The two data objects created for each incoming data object are written to another carrying device and new checksums are created for each data object.
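
To make the first of these outputs concrete, the sketch below wraps a binary file as base64 text between XML metadata. It does not reproduce Xena's actual output: the element names (`wrapped_object`, `metadata`, `content`, `checksum`) and the use of SHA-256 are hypothetical, chosen only to show the general shape of an ASCII-only wrapper.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def wrap_in_xml(data: bytes, filename: str) -> bytes:
    """Wrap raw bytes as base64 text inside an XML document.

    All element names here are illustrative, not the Xena schema.
    """
    root = ET.Element("wrapped_object")

    # "Header": descriptive metadata placed before the content.
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "source_name").text = filename
    ET.SubElement(meta, "size_bytes").text = str(len(data))

    # The original bytes, encoded so the document contains only ASCII text.
    content = ET.SubElement(root, "content", encoding="base64")
    content.text = base64.b64encode(data).decode("ascii")

    # "Footer": a checksum of the original bytes placed after the content.
    checksum = ET.SubElement(root, "checksum", algorithm="SHA-256")
    checksum.text = hashlib.sha256(data).hexdigest()

    # "us-ascii" makes ElementTree escape any non-ASCII characters.
    return ET.tostring(root, encoding="us-ascii")
```

Because the result is plain ASCII text, it can be read with any text tool, while the original bytes remain recoverable by decoding the `content` element.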

Digital archive

The carrying device from the preservation facility is connected to the digital archive and the checksums for each data object are checked. The data objects are copied to the digital archive RAID storage.

The resulting data objects are referred to as Archival Information Packages (AIPs). The Digital Preservation Recorder (DPR) software captures descriptive and preservation metadata relating to each AIP. The metadata is used to manage the preserved AIPs, for example when generating reports and retrieving requested records.

The checksum of each AIP in the digital archive is monitored to ensure it has not changed. A changed checksum means that the AIP has been altered in some way. Continually rechecking the checksums gives assurance that the AIPs in the digital archive remain intact.
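
A recurring check of this kind might look like the sketch below. It assumes that checksums recorded at ingest are available as a mapping of AIP file name to SHA-256 digest; that mapping, the function name and the algorithm are illustrative assumptions and say nothing about how the DPR software or the archive storage actually works.

```python
import hashlib
from pathlib import Path


def fixity_audit(recorded: dict[str, str], archive_root: Path) -> list[str]:
    """Return the names of AIPs that are missing or whose digest has changed.

    `recorded` maps a file name to the digest stored at ingest (hypothetical).
    """
    failed = []
    for name, expected in sorted(recorded.items()):
        path = archive_root / name
        # A missing file is as much a fixity failure as altered content.
        if not path.is_file():
            failed.append(name)
            continue
        # read_bytes() keeps the sketch short; large files should be
        # hashed in chunks instead of being loaded into memory at once.
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
        if actual != expected:
            failed.append(name)
    return failed
```

An empty result means every recorded AIP is present and unchanged; any name returned would need investigation.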

When data is copied out of the archive for access, the base64-encoded version can be exported via Xena to produce an exact replica of the original data object.
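
The reason an exact replica is possible is that base64 encoding is lossless: decoding returns precisely the bytes that were encoded. The short sketch below demonstrates this with the Python standard library rather than Xena; the sample data is an arbitrary stand-in for a real data object, and SHA-256 is an assumed choice of checksum.

```python
import base64
import hashlib

# Stand-in for any binary data object: every possible byte value, repeated.
original = bytes(range(256)) * 4

encoded = base64.b64encode(original)   # ASCII-only representation
restored = base64.b64decode(encoded)   # export step: decode back to bytes

same_bytes = restored == original
same_checksum = (
    hashlib.sha256(restored).hexdigest() == hashlib.sha256(original).hexdigest()
)
print(same_bytes, same_checksum)  # prints "True True"
```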

Further advice

The National Archives can provide you with detailed and specialised advice to assist in the digital preservation process. For more information on how we manage digital records transfers, see our information on transferring records to the National Archives. For other advice, contact the Agency Service Centre.