46 · Survey Results: Survey Questions and Responses
These challenges from Presidential Libraries are representative of the challenges other parts of NARA also experience.
Volume of data to be ingested in as short a time frame as possible: We receive the vast majority of our electronic
records in large transfers at the end of a Presidential administration. Because we need to provide asset-level access
to electronic records as soon as the records are in our legal custody, we must ingest these large volumes in as short
a time frame as possible. In our last large transfer we worked with the records creators and with our system vendor
to devise a means of transfer that employed storage area networks (SANs) to move large volumes (tens of terabytes)
of data copied from the creator’s data center to the data center for our Electronic Records Archives (ERA). Four physical
shipments of data stored on SANs over the course of several months moved more than 70 TB of data from the source
data center to our data center, where the files could be staged for ingest and then moved into our system environment.
File-level access control policy: Our system users are located across the country. All users fill the same role in the
system, but each user should have access to only a subset of the electronic records maintained in the system (Presidential
records from one administration versus Vice Presidential records from another administration, for instance). To maintain
asset-level access control (among other needs) we established asset catalog entries (ACEs) that were assigned to each
asset upon ingest. These ACEs (XML files) include elements that define each asset by Presidential administration and
by records status (Presidential, Vice Presidential, or Federal). When users log in to the Executive Office of the President
instance of the Electronic Records Archives (EOP ERA), the system compares the rights of the user to the
characteristics of assets to determine whether the user can have access to the files.
Need to make electronic message files
accessible: The storage architecture deployed in EOP ERA makes hundreds of formats available for indexing, including
.eml files for emails. One set of electronic messages planned for transfer to us during the last transition (more than 20
million files) was stored in a journal format that maintained the messages as text files. Because we wanted to access
the messages as emails (i.e., using parametric searches of email fields – To, From, Date, etc.) our vendor (Lockheed
Martin) developed a script that transformed the text files into discrete .eml files that could be ingested into EOP ERA
and managed as email files. As part of this transformation process the vendor used sample data to inform a discussion
with our archivists on the fields we wanted to maintain in the .eml target files. As part of testing we were able to assure
ourselves that the content of the messages came through the transformation intact, including any files attached to the
original message files.
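To make the text-to-.eml transformation described above concrete, the following is a minimal sketch in Python. It is not the vendor's script. The journal record layout (header lines, a blank line, then the body), the list of kept fields, the addresses, and the function name are all assumptions made for illustration; the actual journal format is not described in this response.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical journal record (assumed layout, not the real journal format):
# "Field: value" header lines, one blank line, then the message body.
SAMPLE_RECORD = """To: archivist@example.org
From: staff@example.org
Date: Mon, 05 Jan 2009 10:15:00 -0500
Subject: Transfer schedule

The next shipment is ready for pickup.
"""

# Fields to carry into the .eml target file (an assumed selection).
KEPT_FIELDS = ("To", "From", "Date", "Subject")


def journal_text_to_eml(record, attachments=()):
    """Convert one text record into the bytes of a discrete .eml file.

    `attachments` is an iterable of (filename, bytes) pairs.
    """
    header_text, _, body = record.partition("\n\n")
    msg = EmailMessage()
    for line in header_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() in KEPT_FIELDS:
            msg[name.strip()] = value.strip()
    msg.set_content(body)
    for filename, data in attachments:
        msg.add_attachment(
            data,
            maintype="application",
            subtype="octet-stream",
            filename=filename,
        )
    return msg.as_bytes()


# Round-trip check in the spirit of the testing described above:
# parse the generated .eml and confirm fields, body, and attachment survive.
eml_bytes = journal_text_to_eml(SAMPLE_RECORD, [("notes.bin", b"\x00\x01payload")])
parsed = message_from_bytes(eml_bytes, policy=policy.default)
body_text = parsed.get_body(preferencelist=("plain",)).get_content()
attachment = next(parsed.iter_attachments())
```

Once records are in this form, the To, From, and Date values are ordinary email headers, which is what allows field-based searches of the kind mentioned above.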
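The asset-level access check described under the file-level access control policy above can be sketched in the same way. This is an illustration only, assuming a hypothetical ACE layout: the element names (administration, recordsStatus), the sample values, and the UserRights structure are invented here and are not taken from EOP ERA.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical ACE document; element names and values are assumptions.
SAMPLE_ACE = """<ace>
  <assetId>asset-0001</assetId>
  <administration>Administration A</administration>
  <recordsStatus>Presidential</recordsStatus>
</ace>"""


@dataclass(frozen=True)
class UserRights:
    # (administration, records status) pairs this user may access.
    allowed: frozenset


def can_access(user, ace_xml):
    """Compare the user's rights to the asset's ACE characteristics."""
    root = ET.fromstring(ace_xml)
    administration = root.findtext("administration", default="").strip()
    status = root.findtext("recordsStatus", default="").strip()
    return (administration, status) in user.allowed


# All users share one role; only the permitted subsets differ.
user_a = UserRights(frozenset({("Administration A", "Presidential")}))
user_b = UserRights(frozenset({("Administration B", "Vice Presidential")}))
print(can_access(user_a, SAMPLE_ACE))
print(can_access(user_b, SAMPLE_ACE))
```

Keeping the decision in the asset's catalog entry rather than in the user's role is what lets every user fill the same role while still seeing only the permitted subset of records.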
There is no Digital Asset Management System (DAMS) in place to ingest born-digital material. System-wide initiatives
would address this problem. The necessary hardware to transfer born-digital material from legacy media is not available
at our repository. A few pieces of legacy hardware have been purchased. Staff expertise to deal with ingesting born-
digital materials is limited. This has not yet been addressed.
Time: Reformatting legacy media, and arranging and describing born-digital content, are time-consuming activities.
The volume of data that can be found within a single item such as a hard drive can be staggering. Migrating content
from legacy media is also time-consuming, as there is little automation or batch handling of these materials. We are
investigating ways to reduce the time spent on individual items.
Migrating unidentified content: With unidentified content on an obsolete media format, it’s difficult to determine
whether the content is a reformatting priority without accessing the material. If we do not have the equipment in-house
for the obsolete media format, the item must be accessed by a vendor. Sending an item out to a vendor is expensive and
may not be the best use of our resources. At this point, we are investigating ways to address this issue without
overuse of resources.
Software licensing: Acquiring the software needed for file migration is a challenge, both because state regulations on
software purchasing are stringent and because we need obsolete software titles to access files that may be generations
removed from current software (or that have no contemporary equivalent). We are looking into software titles that can
bridge generations; that is, software that can open older files and convert them to a newer generation that can be
accessed with current software. We are also examining software designed to open obsolete file formats, such as Quick
View Pro.