Managing Born-Digital Special Collections and Archival Materials, SPEC Kit 329 (August 2012)

Nelson, Naomi L.; Shaw, Seth; Deromedi, Nancy; Shallcross, Michael; Ghering, Cynthia; Schmidt, Lisa; Belden, Michelle; Esposito, Jackie R.; Goldman, Ben; Pyatt, Tim

are actively working on a preservation plan that will address this issue. Authentication: we don’t currently have a
mechanism to authenticate born-digital objects – we “trust” the source and ingest. We are hoping to make this part of
our Digitization Preservation Policy, which is currently in development.
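
One common building block for the kind of mechanism this respondent describes is a fixity check: record a checksum for every file at transfer time, then compare the checksums again after ingest. The sketch below is a minimal illustration and not any respondent's actual workflow. It assumes SHA-256 is an acceptable algorithm, and the function names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical. A matching checksum shows only that a file is unchanged since the manifest was made; it does not establish who created the file or whether the source can be trusted.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """List files that were added, removed, or altered since the manifest was built."""
    current = build_manifest(root)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

A manifest built when the material is received could be stored with the accession record and re-checked with `changed_files` after each copy or migration.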
Developing policies and procedures relating to the acquisition and ingest of born-digital content: the Digital Archivist
has recently completed a research leave during which he drafted a digital preservation policy that could apply to born-digital
materials. Developing an open-source digital asset management system: the ingest process for our digital asset
management system has been unreliable in its early stages of development. The Libraries has dedicated an IT person to
this system and has hired a vendor to further development of the system, particularly regarding its stability. Creating an
inventory of born-digital material on legacy media: the Digital Archivist will soon be compiling such an inventory based
on existing finding aids.
Developing secure hardware infrastructure to protect PII collected and retained: we have worked closely with the campus IT
security office. Obtaining secure, backed-up server space for a dark archive. Planning access strategy for restricted content.
Digital storage space. We have recently conducted an inventory of all of our special collection digital assets (not
just born-digital). This will be used to more effectively plan our storage needs—the amount and types of storage.
Sustainability of digital library and preservation platform. We haven’t yet adequately addressed this issue.
File format is an enormous challenge. We are receiving research data proprietary to specific data collection and
analysis tools, such as the SURF surface mapping data produced by the software MountainsMap. Another is the gene
sequencing data, FASTA, produced by the SOLiD gene sequencing system. We don’t have non-proprietary formats
in which to store this data and we don’t know enough about persistence and backward compatibility for the tools.
Our researchers are skilled at using the tools and interpreting the data but aren’t able to answer our questions about
persistence and longevity for the data. Thus far, our only strategy is to document the instruments that created the
data, document as much as we know about the data (which is often in multiple files), and raise this issue at every
research data gathering, suggesting that conversations with these instrument providers are needed. File size is another
challenge. Large files take a very long time to process and can make born-digital files difficult to manipulate in the
repository and for end users to download. We currently bundle large files into zip files for downloading but need an
effective background methodology for ingest.
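
The zip bundling this respondent mentions could look roughly like the sketch below. It is an illustration under stated assumptions, not the respondent's actual process: the function name `bundle_for_download` is hypothetical, the output path is assumed to lie outside the folder being bundled, and Python's standard `zipfile` module stands in for whatever tool the repository uses. It does not address the background ingest method the respondent says is still needed.

```python
import zipfile
from pathlib import Path


def bundle_for_download(source_dir, zip_path):
    """Pack every file under source_dir into one zip and return the stored names.

    Assumes zip_path is outside source_dir, so the archive never includes itself.
    """
    source_dir = Path(source_dir)
    with zipfile.ZipFile(
        zip_path, "w", compression=zipfile.ZIP_DEFLATED, allowZip64=True
    ) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir so the bundle unpacks cleanly.
                archive.write(path, arcname=path.relative_to(source_dir).as_posix())
    # Re-open the finished archive to report what it actually contains.
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()
```

`allowZip64=True` (the default in current Python versions) lets the archive exceed 4 GiB, which matters for the large files described here.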
File format on legacy tape drives from punch card data that has Census/private information for different nations. Need
for old hardware on site for conversions and ingest with immediate time demands. Scaling up for the demand.
File formats: e.g., Word 1.0 documents. Hardware: e.g., receipt of records on 5 1/4” or 3 1/2” discs with no computers that
will read such discs. Uncertainty about the authenticity of the records we have received. Do we have the only copy or
are there multiple copies/versions available elsewhere?
Hardware and software. We don’t always have the hardware and/or software to access legacy file formats, and don’t
know how to access files without changing their metadata. We try to collect obsolete hardware when possible, and
sometimes outsource accessing these legacy files. Selection of file formats for streaming media: we are currently
working on this with library IT staff. We face challenges trying to educate the university community about giving us their
born-digital files, and lack confidence that we can preserve them and make them accessible because of lack of resources and
internal technical expertise. We are working on outreach to university offices, and working on developing necessary
skills for archiving born-digital content.
Hardware: lack of secure storage and backup. We are attempting to implement this now, working with university IT. Privacy/
security. We hope to develop written policies.
Images received in digital format but named idiosyncratically by the photographer. In order for these files to be used in a
local digital environment it is necessary to provide meaningful file names in relation to existing or new local directories.
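
A renaming step like the one described above might be sketched as follows. This is a hypothetical illustration: the naming pattern (a collection prefix plus a four-digit sequence number), the function names, and the CSV log are all assumptions, not the respondent's convention. The sketch copies files instead of renaming them in place, so the photographer's original names remain on the source copy and in the log.

```python
import csv
import shutil
from pathlib import Path


def plan_names(source_dir, prefix):
    """Pair each original file name with a sequential local name (hypothetical pattern)."""
    files = sorted(p for p in Path(source_dir).iterdir() if p.is_file())
    return [
        (p.name, f"{prefix}_{index:04d}{p.suffix.lower()}")
        for index, p in enumerate(files, start=1)
    ]


def copy_with_new_names(source_dir, target_dir, prefix, log_path):
    """Copy files under their new names and log the original-to-local mapping as CSV.

    Assumes log_path is outside source_dir so the log is not treated as an image.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    plan = plan_names(source_dir, prefix)
    with open(log_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["original_name", "local_name"])
        for original, local in plan:
            # copy2 also attempts to carry over file timestamps.
            shutil.copy2(Path(source_dir) / original, target_dir / local)
            writer.writerow([original, local])
    return plan
```

Keeping the mapping in a log file means the photographer's original names stay recoverable even after the local names are in use.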
