Rutgers University
RUcore. Archival Standards for Born-Digital Documents
IBB RUcore Preservation Standards Born Digital Documents Rev: 8/9/2010
Archival standards for born-digital documents:
Recommended methods for keeping stable preservation copies
As part of our plans to preserve student theses, dissertations, newer editions of faculty texts, and
other culturally or academically significant documents, we will inevitably be tasked with preserving an
increasing number of documents that originated electronically. Such documents have been authored
with word processing and digital publishing software for decades, but the common practice remained
to print the final copy and treat the paper form as the finished product: the master original.
Consequently, digital preservation consisted of scanning these analog objects back into digital form
and preserving them electronically as scanned surrogates. Until very recently, we envisioned that
digitizing from analog sources would make up the bulk of our digital preservation work.
However, the increasing use of web-based publishing, online journals, and essentially paperless
production has highlighted the benefits of seeking out the born-digital masters of preservation-worthy
items whenever possible. Doing so offers clear advantages: we can store the original in
its most efficient digital form, often requiring less overhead and disk space while doing away with the
quality challenges associated with scanning.
On the other hand, born-digital preservation brings new challenges. Development of
preservation standards for analog objects proved to be relatively simple, as the imaging industry laid
much of the groundwork for us in terms of standardization across platforms. Further, development of
future standards for digitized images, sound and video continues in an organized and orderly fashion,
giving us plenty of time to contemplate migration to newer and better preservation formats.
Unfortunately, the same cannot be said for born-digital documents. File formats for such objects
vary widely, and the responsibility falls on us to identify a uniform set of file formats to adopt
for preservation purposes.
As a result, we must adopt and follow a strategy for born-digital document preservation that
accomplishes the following:
• Accurately renders the formatting and content of the document, as intended by the
  creator of the document.
• Maintains the stability of the file format as far as possible. This may involve converting
  the document to an archival format and storing both the original and the converted
  surrogate file.
Proposed Preservation Format Strategy: Multiple standards in play
Historically, born-digital documents have been authored using a variety of software packages,
each with its own proprietary file format. Early on, programs such as WordStar, WordPerfect,
Microsoft Works, ClarisWorks/AppleWorks, Adobe PageMaker, QuarkXPress, and others were
all in use across the electronic document landscape.
Over the past decade, Microsoft Office has emerged as a de facto standard for general
usage, with most businesses using it to create and distribute common document types. This usage has
