Survey Results: Executive Summary
Twenty-nine libraries answered a question on which tools and applications they use in their
curation treatments. The most commonly used are BagIt (13 responses) and Fixity (12). BitCurator,
FITS, and JHOVE are each used by nine institutions. A few also mentioned DROID and OpenRefine. Half
of the respondents use two or more different tools, depending upon their service level.
One tool that many institutions use to ensure access and the citability of research data is a
persistent identifier. Many repository platforms and software applications facilitate the creation of
persistent identifiers for digital assets, and there are a variety of identifier types available for institutions
to adopt. The survey responses indicate that Handles are the most commonly employed persistent
identifier (26 responses or 59%), followed by DataCite DOIs (25 or 57%) and, to a lesser extent, CrossRef
DOIs (9 or 21%), PURLs (5 or 11%), and ARKs (4 or 9%).
Preservation Services
One key component of the data curation lifecycle is data preservation. Preservation services (such as
emulation, file audits, migration, secure storage, and succession planning) help ensure that the data and
technology are reusable and stable over the long term. Of the 50 respondents to a preservation question, 34
(68%) provide these services for curated data. Fourteen of these indicated they will preserve data for at
least 10 years, four others reported between 12 and 25 years, and at least 10 indicated their commitment is
to preserve data indefinitely. Others did not specify a time commitment.
The platforms and tools these libraries use for preserving data vary widely, with most
respondents selecting “other platform” from the list of answer choices. Those platforms include
DSpace, EPrints, LOCKSS, OpenStack Swift, APTrust, and DPN. We suspect this variety is due to the
varying degrees of preservation and the difficulty of pinning down definitions. As one respondent
commented, “We presently steer clear of the word preservation, relying instead on long-term stewardship
as our nomenclature.”
Figure 3. Platforms used for archiving/preservation
The most common preservation-compliant metadata standards used are MODS and PREMIS (12
of 28 responses each, or 43%). There is little standardization across institutions in backup services. Many
are employing tape systems and cloud services to ensure redundant copies of the data remain available.
