40 · Survey Results: Survey Questions and Responses
presents significant problems with readability because the data is stored in file formats that are no longer compatible
with modern operating systems or for which we simply don’t have software to read. An example is architectural
drawings created with CAD software in the early 1990s. In this case we are able to work with our School of Architecture
to locate some software to read these files, but it does mean that we need to keep this software viable, which in
many cases means running older operating systems or alternative operating systems to what we currently use on the
forensic imaging hardware (which is primarily Windows). One of the pieces of software we have purchased is Forensic
Tool Kit, which can identify and “read” thousands of file formats. However, these formats are primarily those that would
be most commonly seen in criminal investigations, since that is what the software is designed for. So, things like CAD
software from the early 90s are not included in their list of recognized formats. We have not seriously discussed trying
to emulate any software or operating systems at this time, although we have watched with interest other projects that
have done so. We do not view emulation as a viable approach at this time since our collections are so diverse and we
do not have the type of technology staff in the library to really do this work efficiently. It would simply be impossible
to have the resources available to emulate each and every program we are likely to encounter and to keep those
emulations running in current environments. While there are some things we are likely to see a lot of (Microsoft Word
documents, for example) we also feel that it is not worth the effort at this time to create an emulated environment
when a migrated format (a PDF in this case) would be adequate. This is not to say that in the future emulation may not
be attempted in special circumstances. A third very significant challenge is related to the lack of available tools for doing
archival work with born-digital collections, as well as infrastructure in terms of repository and preservation networks
that can meet the needs of access, management and preservation. There are several open-source and commercial
products that can do pieces of the workflow, but as they are not designed to work together there are inefficiencies
in stringing these workflows together. As an example, we use the Forensic Tool Kit software to extract some basic
technical metadata, identify duplicate materials, and flag those that might contain predictable sensitive information such
as SSNs or credit card numbers. The output of FTK, however, is some proprietary XML and a PDF report. We then use
Archivematica to further extract technical details and establish a provenance through the creation of PREMIS metadata.
We would then record information about a duplicate removed from the accession in Archivematica, by ingesting the
duplicate file and then removing it manually per the FTK report. Finally, the PREMIS metadata record that Archivematica
creates is nested inside a METS record for the entire accession. Our current storage network, however, wants only the
individual PREMIS records for each file, rather than the combined METS, so more work needs to be done to transfer
the file between these two tools. Once the material goes through this network of tools, we still need to work on our
repository and other digital asset management and discovery systems in order to suit the needs of this material which
differs in many ways from the needs of other digital materials we store and manage such as e-books and e-journals and
digitized resources. This infrastructure needs to handle the preservation, management, access, and discovery of these
materials. We are watching with interest the developments of open-source tools created by the archival community such
as Archivematica, BitCurator, ArchivesSpace, and Curator’s Workbench as well as potentially doing some work on the
further development of Hypatia.
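The response above describes a gap between a combined METS record for the whole accession and a storage network that wants one PREMIS record per file. The following is a minimal sketch of how that split might be approached, not the respondent's actual workflow. The function name `split_premis_objects`, the sample document `SAMPLE_METS`, and the rule of matching any element named `object` in a namespace containing "premis" are all assumptions made for illustration. Real Archivematica output is considerably richer than this sample.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical, heavily simplified METS document wrapping two PREMIS objects.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                            xmlns:premis="info:lc/xmlns/premis-v2">
  <mets:amdSec>
    <mets:techMD ID="techMD_1">
      <mets:mdWrap MDTYPE="PREMIS:OBJECT">
        <mets:xmlData>
          <premis:object>
            <premis:originalName>a.dwg</premis:originalName>
          </premis:object>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:techMD>
    <mets:techMD ID="techMD_2">
      <mets:mdWrap MDTYPE="PREMIS:OBJECT">
        <mets:xmlData>
          <premis:object>
            <premis:originalName>b.doc</premis:originalName>
          </premis:object>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:techMD>
  </mets:amdSec>
</mets:mets>"""


def split_premis_objects(mets_xml: str) -> List[str]:
    """Return each PREMIS <object> found in a METS document as its own XML string."""
    root = ET.fromstring(mets_xml)
    records = []
    for elem in root.iter():
        # ElementTree spells qualified tags as "{namespace}localname".
        if not elem.tag.startswith("{"):
            continue
        namespace, local = elem.tag[1:].split("}", 1)
        # Assumption: any <object> in a PREMIS namespace is a per-file record.
        if local == "object" and "premis" in namespace.lower():
            records.append(ET.tostring(elem, encoding="unicode"))
    return records


if __name__ == "__main__":
    for record in split_premis_objects(SAMPLE_METS):
        print(record)
```

Each returned string could then be written to its own file for transfer. Matching on the namespace rather than a fixed path is a design choice here, so that the sketch does not depend on exactly where the PREMIS sections sit inside the METS record.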
Adequate digital infrastructure to securely store and describe born-digital content. Adding these responsibilities onto
existing staff: training, workload. No formal records management policy at the university.
Appropriately secure storage. Staffing resources. Policies and workflow development.
Copying/reformatting from old redundant file formats. Network latency and storage; lack of server space. Lack of
software to support integrity of file reformatting.
Copyright: all our metadata contains a copyright statement for our digital object. Other options we can apply are
banding and watermarking to objects. We include the copyright holder when it isn’t our institution and we know who
that is, but this becomes a challenge when unknown. In some instances, we have put up digitized objects, asking
for input from our patrons for ownership. Fixity: we don’t currently have a systematic way of guaranteeing fixity! We