Community News
Sustainable Preservation That Works
Distributed Custodial Archival Preservation Environments
In the Information Age, society produces a flood of electronic records. State governments, universities, and all sectors of society now rely heavily on records that spend most if not all of their life in electronic form.
But the freedoms that give digital materials their dazzling power also make them ephemeral and challenging to preserve. Although electronic records have been around for decades, archivists still face a serious gap between the need to archive this growing stream of records in digital form and the real-world capabilities to do so.
Now, a broad new collaboration, the Distributed Custodial Archival Preservation Environments project, or DCAPE, is taking an important step toward meeting the needs of archival repositories for trusted archival preservation services by giving archivists the tools they need to ensure that software-dependent electronic records created today will be usable with tomorrow’s technology.
The innovative two-and-a-half-year project, supported by the National Historical Publications and Records Commission (NHPRC), includes state archives in California, Kansas, Michigan, Kentucky, North Carolina, and New York; university archives at Tufts University and Carleton University (Canada); a cultural institution, the Getty Research Institute; and cyberinfrastructure partners at West Virginia University, the Renaissance Computing Institute (RENCI), and the UNC School of Information and Library Science (SILS). DCAPE is also building on the experience of partners who have worked together in previous projects such as the Persistent Archives Testbed (PAT).
“Many archival repositories are under-funded and struggling to fulfill their responsibilities to preserve and provide access to electronic records,” said project leader Richard Marciano, director of the Sustainable Archives and Library Technologies (SALT) Lab in the Data Intensive Cyber Environments group, Professor in the School of Information and Library Science at the University of North Carolina at Chapel Hill, and Chief Scientist, Persistent Archives and Digital Preservation at RENCI. “So our goal in the DCAPE collaboration is to demonstrate a working cost-effective preservation system and show archivists that this capability may be closer than they think.”
Once practical preservation systems are in place, it is clear that the cost to store a unit of information will continue to plummet for electronic records, in contrast to the ever-rising costs of maintaining hard copy records. As part of this transition, the DCAPE collaboration is developing a sustainable business model based on open source preservation infrastructure.
“We’ll keep the costs far less than most people would expect in several ways,” said Marciano. “We’ll minimize labor costs by automating administration of the preservation environment with iRODS, the Integrated Rule-Oriented Data System.” The iRODS system, developed by the Data Intensive Cyber Environments (DICE) group at the School of Information and Library Science (SILS) and RENCI at UNC Chapel Hill and UC San Diego, has a powerful Rule Engine that lets archivists automate labor-intensive procedures. As freely available open source software, the iRODS system is advancing rapidly through a growing community of users who collaborate on its development.
“Together our DCAPE project includes 33 participants across 12 institutions,” said Marciano. “We can take advantage of this partnership to reduce the risk of data loss and save resources through distributed archives, for example, by using multiple sites in the DCAPE preservation environment to back up each other’s data.”
DCAPE will consist of a distributed and easily expanded preservation environment that enables archivists to manage their records, located at both the DICE facility at UNC and partner institutions, in a unified “virtual environment.” To ensure redundancy, replicas of records may be stored at multiple locations. And by enabling archives to collaborate with each other, DCAPE lets resource-strapped institutions reduce the need to invest in so much in-house infrastructure and expertise.
The project will also minimize storage costs by using low-cost commodity storage and a small subset of the massive storage systems that support research at RENCI at the University of North Carolina.
The DCAPE project will address the risk of technological obsolescence by taking advantage of iRODS’ “infrastructure independence,” the ability to seamlessly migrate electronic records onto newer, more cost-effective infrastructure as it becomes available. While archivists can start small with iRODS, in the future they can expand their archives to terabytes or petabytes of data.
iRODS, supported by NARA and the NSF, incorporates more than a decade of award-winning research, and offers archivists state-of-the-art automated rules-based capabilities to appraise, describe, accession, replicate, manage, and provide access to large, complex electronic records collections long-term.
For more information on DCAPE see http://www.dcape.org.
Related Links
Data Intensive Cyber Environments (DICE) group http://diceresearch.org
Integrated Rule-Oriented Data System (iRODS) https://www.irods.org/
UNC School of Information and Library Science (SILS) http://sils.unc.edu
Sustainable Archives and Library Technologies (SALT) http://salt.diceresearch.org
November 7, 2008