Survey Results: Survey Questions and Responses
A procedure using a combination of Adobe Photoshop and Adobe Bridge was developed locally to batch-process files to accomplish this task. Sensitive data: we have yet to work out issues surrounding born-digital institutional records with restricted access, e.g., promotion & tenure files, President’s Office files, etc. An organization uses an online service to process applications that in the past had been delivered in paper format. Acquiring the records in a format that is usable by the archive may require a contract of some sort with the vendor. This remains to be resolved.
In 2010 the library acquired a collection of nearly 50 floppy disks and a number of CDs; most were unlabeled (or labeled unhelpfully), meaning that we had to view each one and try to deduce at least minimal information so we could describe the contents. However, the most challenging item was a hard drive, carefully wrapped, with a label reading “The contents of this drive can only be accessed at the original computer from the New York Times. If installed at any other computer, you may damage the contents and you may format (wipe out) the drive.” We have no idea quite how to approach this, so we have simply left it alone as is!
Inability to access content saved on obsolete media or in obsolete programs. Lack of secure, redundant, geographically distributed, and reliable preservation storage systems. Lack of a system for managing and providing access to born-digital materials that will allow for restricting some content for a period of time and will also help automate processes such as generating checksums, virus checking, extraction of technical metadata from file headers, etc.
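Several of the routine processes this respondent lists lend themselves to scripting even without a full repository system. The following is a minimal, illustrative Python sketch, not any respondent’s actual tool: it walks a hypothetical accession directory, generates SHA-256 checksums, and records basic technical metadata. Virus scanning and full format identification would normally be delegated to external tools such as ClamAV or DROID.

```python
import hashlib
import os
from datetime import datetime, timezone

def sha256_of(path, chunk_size=1 << 20):
    """Compute a SHA-256 checksum without loading the whole file into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def inventory(root):
    """Yield one technical-metadata record per file under `root`."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            stat = os.stat(path)
            with open(path, "rb") as f:
                header = f.read(8)  # leading bytes; a crude aid to format identification
            yield {
                "path": os.path.relpath(path, root),
                "size_bytes": stat.st_size,
                "modified_utc": datetime.fromtimestamp(
                    stat.st_mtime, tz=timezone.utc
                ).isoformat(),
                "sha256": sha256_of(path),
                "header_hex": header.hex(),
            }

if __name__ == "__main__":
    for record in inventory("accession_2010_042"):  # hypothetical accession directory
        print(record)
```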
Ingestion of compound/complex objects (i.e., objects made up of many types of materials at once). We use Google Spreadsheets to compile metadata and file locations, but a solution like BagIt is likely to be more effective. Presentation of complex objects. Determining how to show a user an object consisting of many disparate parts (e.g., a video with a transcript, screenshots, and an associated web page). This is usually considered a prerequisite to ingestion, since an object is accessible only if it can be usefully retrieved. We still address this question on an ad hoc basis. Providing granular security options for all content. The technology required to provide very granular control over rights and permissions makes it difficult to build services for ingesting and reusing repository content. Few repository systems (we use Fedora) have a fully developed solution in this regard, so we have built our own, based on the university’s Shibboleth identity system.
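For context on the packaging approach this respondent is weighing: BagIt (RFC 8493) wraps a directory of files in checksum manifests plus a small metadata file, and the Library of Congress maintains a Python implementation, bagit-python. A minimal sketch, assuming that package is installed; the directory name and bag-info fields below are hypothetical:

```python
import bagit

# Package a directory of compound-object files in place as a BagIt bag.
bag = bagit.make_bag(
    "compound_object_001",  # hypothetical directory to convert
    {
        "Source-Organization": "Example University Library",
        "External-Description": "Video with transcript, screenshots, and web page",
    },
    checksums=["sha256"],  # algorithm(s) for the payload manifests
)

# Later (e.g., after a transfer), confirm the payload still matches its manifests.
bag = bagit.Bag("compound_object_001")
print("bag is intact" if bag.is_valid() else "bag FAILED validation")
```

Because the bag carries its own manifests, fixity can be re-verified wherever the object travels, which is the main advantage over tracking file locations in a spreadsheet.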
Lack of a standard set of best practice guidelines for dealing with the original context (e.g., file system hierarchy) of born-digital files when ingesting. Lack of a policy on file format normalization and on what a “record copy” means in the born-digital context. Fear and misunderstanding of the nature of born-digital material.
Lack of software and/or hardware to read files and physical media: We rely on library and college IT departments to access file content, and we acquire legacy hardware when possible. Lack of server space for transferring records from digital media: We recently acquired server space hosted by the university’s IT department for use in backing up digital media. Maintaining the privacy and security of confidential records in compliance with university policy as well as federal and state privacy laws: We have policies governing access to confidential records, but procedures specific to born-digital materials are still being developed.
Legacy file format normalization: We have a collection that includes over 25 different file extensions, mostly text-based documents, many of which were unrecognized and/or produced significant artifacts or “garbage” when rendered in modern programs. Many of these files were created in the now defunct and unsupported Nota Bene annotation/bibliography software. We used a conversion tool called FileMerlin to convert as many of the troublesome files as we could, and a Windows command-line script driving Microsoft Word to convert WordPerfect and other legacy file formats that Word would recognize (a sketch of that approach appears after this response). After a significant amount of manual and automated work, we increased the proportion of legible files in the collection from around 40% to around 95%.
Legacy media recovery: Like many institutions, we have many “hybrid” collections that include legacy media such as 3.5″ and 5.25″ floppies, hard drives, CDs/DVDs, and even whole computing environments. We are building a Legacy Archival Media Migration Platform (LAMMP) and an accompanying manual as an environment and workflow for capturing images of these media and generating metadata.
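The Word-driven conversion step described above can be scripted in several ways; the respondent’s actual script is not reproduced in the survey. As one hedged illustration, here is a Python sketch using the pywin32 package to drive Word over COM. All paths are hypothetical, 16 is Word’s wdFormatDocumentDefault constant, and Word’s WordPerfect import converter must be installed:

```python
import glob
import os

import win32com.client  # pywin32; Windows-only, requires an installed Microsoft Word

WD_FORMAT_DOCX = 16  # wdFormatDocumentDefault: save in the default .docx format

word = win32com.client.Dispatch("Word.Application")
word.Visible = False
try:
    # Hypothetical source directory of WordPerfect files awaiting normalization.
    for src in glob.glob(r"C:\accessions\legacy\*.wpd"):
        dst = os.path.splitext(src)[0] + ".docx"
        doc = word.Documents.Open(src, ConfirmConversions=False, ReadOnly=True)
        doc.SaveAs(dst, FileFormat=WD_FORMAT_DOCX)
        doc.Close(SaveChanges=False)
finally:
    word.Quit()  # always release the Word COM instance
```

Files Word cannot open raise a COM error, so a production run would wrap the Open call in a try/except and log failures for manual treatment with a dedicated converter such as FileMerlin.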