Transfer media containing digital objects for preservation arrive at the National Archives. The transfer media (eg CD, DVD or USB hard disk) are connected to the quarantine facility. An automated manifest check is run to ensure that the correct files have been received, and an antivirus scan is run against the transfer media. The contents of the transfer media are then copied to a carrying device (eg USB hard disk) and their MD5 checksums are checked. A checksum is an algorithm-based method of determining the integrity and authenticity of a digital data object. It is used to check whether errors or alterations have occurred during the transmission or storage of a data object.
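The manifest and checksum check described above can be illustrated with a minimal Python sketch. It assumes the manifest is available as a simple mapping of relative file paths to expected MD5 values; the text does not specify the manifest format, and the function names md5_of_file and verify_against_manifest are hypothetical, not part of any National Archives tool.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_manifest(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Compare each manifest entry with the file found under root.

    The manifest layout (relative path -> expected MD5) is an assumption.
    Returns a mapping of relative path -> "ok", "missing" or "mismatch".
    """
    results = {}
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            results[relative_path] = "missing"
        elif md5_of_file(target) == expected.lower():
            results[relative_path] = "ok"
        else:
            results[relative_path] = "mismatch"
    return results
```

A transfer would pass this check only if every entry is reported as "ok"; a "missing" or "mismatch" entry indicates that the wrong files were received or that a file changed in transit.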
The carrying device is then disconnected and placed into storage for 28 days.
During the 28-day quarantine period, the antivirus definitions in the quarantine facility are updated daily. Any new virus found 'in the wild' during this time should have its signature added to the antivirus definition files by the antivirus provider, and hence to the quarantine facility.
After 28 days, the carrying device is reconnected to the quarantine facility and the contents are scanned again. If all is well, the carrying device is disconnected and taken to the next stage. If any virus is detected, the transfer is halted and the originating agency is consulted.
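The quarantine timing and the decision after the second scan can be sketched as follows. This is only an illustration of the rule described above: the function names and the outcome labels ("hold", "rescan", "release", "halt") are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

QUARANTINE_DAYS = 28  # quarantine period stated in the text


def rescan_date(placed_in_storage: date) -> date:
    """Return the earliest date on which the second scan may be run."""
    return placed_in_storage + timedelta(days=QUARANTINE_DAYS)


def quarantine_outcome(
    placed_in_storage: date,
    today: date,
    rescan_clean: Optional[bool] = None,
) -> str:
    """Decide what happens to a carrying device.

    rescan_clean is None until the second antivirus scan has been run.
    """
    if today < rescan_date(placed_in_storage):
        return "hold"      # still inside the 28-day period
    if rescan_clean is None:
        return "rescan"    # period over, second scan still to be run
    # A clean scan moves the device on; a detection halts the transfer
    # so that the originating agency can be consulted.
    return "release" if rescan_clean else "halt"
```

For example, a device placed into storage on 1 January 2024 would be due for its second scan on 29 January 2024.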
The carrying device from the quarantine facility is connected to the preservation facility and the MD5 checksums of its contents are checked. Each data object is processed via Xena to produce two data objects: a normalised version and a base64 encoded version of the original.
The two data objects created for each incoming data object are written to another carrying device (eg another USB hard disk) and new MD5 checksums are created for each data object.
The carrying device from the preservation facility is connected to the digital archive and the MD5 checksums for each data object are checked. The data is copied to RAID storage on the repository and a continuous rechecking of the MD5 checksum of each data object commences.
Digital Preservation Recorder (DPR) software captures descriptive and preservation metadata relating to each Archival Information Package (AIP). This metadata is used to manage the preserved AIPs, for example to generate reports and to retrieve requested records.
When data is copied out of the archive for access, the normalised version may be presented using the Xena application as a 'viewer', or the base64 encoded version may be exported via Xena to produce an exact replica of the original data object.
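Base64 encoding is reversible, which is why an encoded copy can yield an exact replica of the original. The sketch below demonstrates that property with Python's standard library. It does not reproduce Xena's actual file format or interface, and the function names are hypothetical.

```python
import base64
import hashlib


def encode_original(data: bytes) -> str:
    """Return the base64 text representation of the original bytes."""
    return base64.b64encode(data).decode("ascii")


def export_original(encoded: str) -> bytes:
    """Decode base64 text back into the original bytes.

    validate=True rejects characters outside the base64 alphabet
    instead of silently discarding them.
    """
    return base64.b64decode(encoded, validate=True)


def is_exact_replica(original: bytes, exported: bytes) -> bool:
    """Compare MD5 checksums of the original and the exported copy."""
    return hashlib.md5(original).digest() == hashlib.md5(exported).digest()
```

Encoding any byte sequence and then decoding it returns the same bytes, so the checksum of the exported object matches the checksum of the object that was originally transferred.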
The checksum of each data object in the digital archive is monitored to ensure it has not changed. A changed checksum means that the file has been altered in some way, so constant rechecking lets us detect any unplanned changes.
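A single pass of this kind of checksum monitoring can be sketched in Python. The example works on in-memory data for brevity and assumes the recorded checksums are held as a mapping of object identifier to MD5 value. How the digital archive actually stores and schedules its checks is not described here, and all names in the sketch are hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted, simulating corruption."""
    altered = bytearray(data)
    altered[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(altered)


def changed_objects(recorded: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """List the identifiers whose content no longer matches its recorded checksum.

    An object that has disappeared from storage is reported as well.
    """
    return sorted(
        object_id
        for object_id, checksum in recorded.items()
        if object_id not in current or md5_hex(current[object_id]) != checksum
    )
```

Running changed_objects repeatedly against the repository contents would approximate continuous rechecking: an empty list means nothing has changed since the checksums were recorded, while even a single inverted bit causes the affected object to be listed.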
The National Archives can provide you with detailed and specialised advice to assist in the digital preservation process. For more information on how we manage digital records transfers, see our information on transferring records to the National Archives. For other advice, contact the Agency Service Centre.