Preservation Policy and Strategy
This document describes Canadiana’s Preservation Policy and Strategy, including:
- What content is accepted for preservation in the Trustworthy Digital Repository
- The nature of preservation activities
- How access to preserved content is facilitated
- Strategies employed to ensure the ongoing preservation of, and access to, preserved content
Canadiana preserves only digital content and information. Canadiana accepts digital content and metadata for ingest and preservation. Canadiana will also accept analog materials, in which case it will create digital surrogates which it will ingest and preserve. The original analog materials are returned to their owners. Canadiana makes no claims or assumptions about the continuing preservation or availability of these analog sources. Metadata ingested with the digital content is also preserved, whether supplied by the depositor or created by Canadiana.
Canadiana produces, preserves and hosts its own digital content. The original source material is obtained from multiple sources, but the digitized versions belong to Canadiana and are preserved and hosted as part of Canadiana’s own collections.
Canadiana also preserves and hosts digital content belonging to third party depositors. Depositors may supply Canadiana with the digitized content or may engage the services of Canadiana to digitize analog material. A depositor agreement between Canadiana and the depositor is required prior to ingest of any content.
Not all content digitized by Canadiana for third parties is ingested or preserved. From time to time, Canadiana will digitize material for a client and return copies of the digitized content to the client without retaining or preserving a copy itself. Such arrangements do not require a depositor agreement.
Content belonging to Canadiana is preserved indefinitely. Content belonging to third party depositors is preserved in accordance with their respective depositor agreements.
Content is ingested into the Canadiana repository in the form of a SIP. Canadiana preserves the SIP in its original form and guarantees the ability of the depositor to extract an identical SIP from the repository at any point in the future. Any revisions or migrations of SIPs are also preserved such that a depositor can access each revision or migration as well as the original. Previous revisions of SIPs may contain standards or formats no longer supported by Canadiana’s access tools. This means that, while older revisions are preserved and can be retrieved, Canadiana may not maintain the tools needed to provide mediated access to those formats. Canadiana will migrate a SIP to currently-supported standards and formats before phasing out mediated access to any existing standards or formats within that SIP.
From time to time, Canadiana may modify the standards and requirements for a valid SIP. All previously ingested SIPs will continue to be supported, but any newly ingested SIPs must conform to the current requirements. This means that the repository may contain SIPs that were valid at the time of ingest and remain supported by Canadiana, but which would not be ingestible under the current specifications.
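The retention guarantee for SIP revisions can be pictured as an append-only record. The following is a minimal sketch and not Canadiana's implementation: the names `SipRecord`, `add_revision`, `original` and `get` are hypothetical, and the use of SHA-256 to compare payloads is an assumption made for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SipRecord:
    """All preserved revisions of one SIP, oldest first (hypothetical structure)."""

    revisions: list[bytes] = field(default_factory=list)

    def add_revision(self, payload: bytes) -> int:
        # Append only: earlier revisions are never overwritten or removed.
        self.revisions.append(payload)
        return len(self.revisions) - 1  # index of the revision just added

    def original(self) -> bytes:
        # The SIP exactly as first ingested.
        return self.revisions[0]

    def get(self, index: int) -> bytes:
        # Any later revision or migration, by position.
        return self.revisions[index]


def digest(payload: bytes) -> str:
    """SHA-256 hex digest, used here only to compare payloads (an assumption)."""
    return hashlib.sha256(payload).hexdigest()


record = SipRecord()
deposited = b"<mets>original deposit</mets>"  # placeholder payload
record.add_revision(deposited)
record.add_revision(b"<mets>migrated revision</mets>")  # placeholder payload

# The depositor can still extract a byte-identical copy of the original SIP.
assert digest(record.original()) == digest(deposited)
```

Because revisions are only ever appended, a migration adds a new entry rather than replacing the deposited SIP, which is what allows both to be returned to the depositor.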
Access to Content
Canadiana makes preserved content, and/or derivatives thereof, available through a series of online services. This content may be freely accessible to the public or restricted to subscribers or other authorized users. Content may also be held dark: preserved within the repository but inaccessible except by special request from the depositor.
Content belonging to Canadiana is made available in accordance with Canadiana’s collection policies and agreements with subscribers and stakeholders. Content belonging to a third party depositor is made available in accordance with the depositor agreement between Canadiana and the depositor. Canadiana makes both the original deposited content and any revisions or migrations of that content available to the depositor under terms described in the depositor agreement.
Risks to Content Preservation and Access
Long-term risks to the preservation of and access to digital content fall into three main categories:
- Formats and Standards: Inability to migrate or provide access to content and metadata due to obsolete or unsupported standards and formats.
- Preservation Network and Infrastructure: Loss or corruption of data due to hardware or software errors, failures or intrusion.
- Succession Plan: Inability to maintain preservation and access infrastructure due to loss of financial viability.
Formats and Standards
Digital content and metadata must conform to the supported standards and formats described in Canadiana’s METS application profile. Canadiana may amend the list of supported standards from time to time. Canadiana will continue to preserve existing content within the repository even if it no longer accepts and/or facilitates access to new content of that type.
Canadiana supports only those standards and formats that it has evaluated and for which it has determined that it can reliably support ingest, preservation and access. In particular, Canadiana must ensure that:
- The format or standard is established, well-documented, and shows evidence of widespread and consistent implementation
- There are no proprietary licences or encumbrances which may restrict Canadiana’s ability to store, manipulate or migrate data from the format or standard to another
- There is a viable forward migration path for the standard or format in the event that it becomes obsolete or unsupported
- There exist sufficient non-proprietary tools to manipulate, evaluate and provide access to the standard or format, and these tools are efficiently and effectively incorporated into Canadiana’s preservation and access network
Each supported format and standard is evaluated on an annual basis to determine whether it continues to meet these criteria. When Canadiana determines that a format or standard no longer meets these criteria, it will migrate any content or metadata in that format to an alternate format which does meet the criteria. Both the original and migrated content will be preserved, but access to the original content may be limited.
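The annual review can be read as a simple rule: a format stays supported only while all four criteria hold, and otherwise its content is queued for migration. The sketch below illustrates that rule under stated assumptions. The class `FormatReview`, its field names and the format names are hypothetical and do not describe Canadiana's actual tooling or supported formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatReview:
    """Outcome of one annual review of a format or standard (hypothetical)."""

    name: str
    established_and_documented: bool
    free_of_restrictive_licences: bool
    has_forward_migration_path: bool
    has_non_proprietary_tools: bool

    def meets_criteria(self) -> bool:
        # All four criteria from the list above must hold.
        return all(
            (
                self.established_and_documented,
                self.free_of_restrictive_licences,
                self.has_forward_migration_path,
                self.has_non_proprietary_tools,
            )
        )


def formats_to_migrate(reviews: list[FormatReview]) -> list[str]:
    """Names of formats whose content should be migrated to an alternate format."""
    return [review.name for review in reviews if not review.meets_criteria()]


# Placeholder format names, not real entries from the application profile.
reviews = [
    FormatReview("format-a", True, True, True, True),
    FormatReview("format-b", True, True, False, True),  # no migration path
]
to_migrate = formats_to_migrate(reviews)
```

In this example only `format-b` is selected for migration, because it fails a single criterion.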
Preservation Network and Infrastructure
Ensuring that content is not lost, corrupted or altered as a result of system error, hardware failure, user error or intrusion is achieved through a combination of server redundancy, limited access and regular fixity checking. Key aspects include:
- A minimum of five copies of each object (and all associated metadata) are preserved.
- Copies are preserved in a minimum of four different physical sites.
- All files have their checksums computed at ingest time. Each copy of each file is validated against its checksum on a regular basis. Missing files and files that do not match their originally computed checksums are reported automatically to the lead systems engineer. The standard validation period is one check of each copy of each file every 30 days.
- Hardware and software are designed to be fault-tolerant. Industry standard hardware and open source software are used to eliminate the risk of dependence on a particular vendor. Disk arrays are configured with built-in redundancy (RAID) to prevent data loss. In the event that data loss does occur, restoration can be done from one of the remaining copies.
- Access to repository servers is limited to IT staff who have a specific need to access those systems.
- Industry standard system and network security processes are observed, including intrusion detection, system logging and allowing remote login via encrypted (SSH) protocols only.
- One copy of the repository is kept offline at all times in order to prevent a remote intruder from being able to access all copies of the repository.
- All manipulations of the repository, including ingesting, modifying and updating Archival Information Packages (AIPs), are mediated by a software application which ensures all changes result in valid AIPs.
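The fixity and replication rules in the list above can be sketched in a few lines. This is an illustration under stated assumptions, not Canadiana's software: the policy does not name a checksum algorithm, so SHA-256 is assumed here, and the function names, the manifest layout (relative path mapped to ingest-time checksum) and the site labels are hypothetical.

```python
import hashlib
from pathlib import Path

MIN_COPIES = 5  # minimum copies of each object, per the policy
MIN_SITES = 4   # minimum distinct physical sites, per the policy


def sha256_of(path: Path) -> str:
    """Compute a file's SHA-256 checksum, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def check_fixity(manifest: dict[str, str], copy_root: Path) -> list[tuple[str, str]]:
    """Validate one copy of the repository against ingest-time checksums.

    Returns (relative_path, problem) pairs, where problem is either
    "missing" or "mismatch". An empty list means the copy is intact.
    """
    problems = []
    for rel_path, expected in sorted(manifest.items()):
        target = copy_root / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif sha256_of(target) != expected:
            problems.append((rel_path, "mismatch"))
    return problems


def meets_replication_policy(copy_sites: list[str]) -> bool:
    """copy_sites holds one site label per stored copy of an object."""
    return len(copy_sites) >= MIN_COPIES and len(set(copy_sites)) >= MIN_SITES
```

A scheduler would run `check_fixity` against every copy at least once every 30 days and forward any non-empty result to the lead systems engineer. How that report is delivered is not specified in the policy and is left out of the sketch.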
Succession Plan
In the event that Canadiana becomes incapable of continuing operations, it has made arrangements to provide copies of the repository and all associated tools, software and systems to three institutions: University of Alberta, University of Toronto and Library and Archives Canada.
Individual depositor agreements stipulate whether each depositor’s content will be maintained and preserved by Canadiana’s successors. Canadiana’s successors have agreed to preserve all content in the repository (other than that which is excluded by depositor agreements) where no legal prohibition on doing so exists.