45 SPEC Kit 354: Data Curation
Interoperability: Formatting the data using a disciplinary standard for better integration with other
datasets and/or systems.
Peer-review: The review of a data set by an expert with similar credentials and subject knowledge as
the data creator for the purposes of validating the soundness and trustworthiness of the file contents.
Persistent Identifier: A URL (or Uniform Resource Locator) that is monitored by an authority
to ensure a stable web location for consistent citation and long-term discoverability. Provides
redirection when necessary (e.g., a Digital Object Identifier or DOI).
Quality Assurance: Ensure that all documentation and metadata are comprehensive and complete.
Example actions might include: open and run the data files; inspect the contents in order to validate,
clean, and/or enhance data for future use; look for missing documentation about codes used, the
significance of “null” and “blank” values, or unclear acronyms.
Restructure: Organize and/or reformat poorly structured data files to clarify their meaning
and importance.
Software Registry: Maintain copies of modern and obsolete versions of software (and any relevant
code libraries) so that data may be opened/used over time.
Transcoding: With audio and video files, detect technical metadata (min resolution, audio/video
codec) and encode files in ways that optimize reuse and long-term preservation actions (e.g., convert
QuickTime files to MPEG4).
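The Quality Assurance entry above mentions looking for undocumented "null" and "blank" values. One such check might be sketched as follows, assuming the data arrive as CSV text. The function name `count_null_like` and the `NULL_LIKE` marker set are hypothetical choices made for illustration and are not part of the survey.

```python
import csv
import io
from collections import Counter

# Assumed set of strings treated as "null" or "blank"; adjust per dataset.
NULL_LIKE = {"", "null", "na", "n/a", "none", "nan"}


def count_null_like(csv_text):
    """Count null-like cells per column in CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        for column, value in row.items():
            if column is None:
                # Extra cells beyond the header are not attributed to a column.
                continue
            # DictReader fills cells missing from short rows with None.
            if value is None or value.strip().lower() in NULL_LIKE:
                counts[column] += 1
    return dict(counts)


sample = "id,temp,site\n1,20.5,A\n2,,B\n3,NULL,\n"
report = count_null_like(sample)
print(report)
```

A curator could compare such a report against the dataset's documentation to see whether each flagged column has its missing-value convention explained.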
31. Please indicate your institution’s level of support for these data curation processing and review
activities on a scale of 1 to 5 where 1=currently providing; 2=will provide in the near future;
3=would like to provide, but unable to at this time; 4=no interest/desire to provide; 5=unsure. N=48
Activity                      1    2    3    4    5
Persistent Identifier        40    2    5    0    1
Indexing                     25    2   16    3    2
File renaming                22    2   14    9    1
Quality Assurance            22    1   16    6    3
File Inventory or Manifest   21    2   19    4    2
Restructure                  17    2   15   11    3
Transcoding                  13    2   20    8    5
Interoperability             11    3   25    5    4
Software Registry             4    2   21   12    9
Peer-review                   1    0   22   20    5
# of respondents             42    9   40   25   15
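The counts above can be turned into shares of the 48 respondents with a short calculation. The sketch below copies the counts from the table. The names `COUNTS` and `share_currently_providing` are hypothetical and used only for illustration.

```python
N = 48  # respondents to question 31

# Counts per rating (1..5), copied from the table above.
COUNTS = {
    "Persistent Identifier": (40, 2, 5, 0, 1),
    "Indexing": (25, 2, 16, 3, 2),
    "File renaming": (22, 2, 14, 9, 1),
    "Quality Assurance": (22, 1, 16, 6, 3),
    "File Inventory or Manifest": (21, 2, 19, 4, 2),
    "Restructure": (17, 2, 15, 11, 3),
    "Transcoding": (13, 2, 20, 8, 5),
    "Interoperability": (11, 3, 25, 5, 4),
    "Software Registry": (4, 2, 21, 12, 9),
    "Peer-review": (1, 0, 22, 20, 5),
}


def share_currently_providing(counts, n=N):
    """Percentage of respondents who chose rating 1 for each activity."""
    return {activity: round(100 * row[0] / n, 1) for activity, row in counts.items()}


# Consistency check: every activity row should account for all N respondents.
rows_consistent = all(sum(row) == N for row in COUNTS.values())

shares = share_currently_providing(COUNTS)
for activity, pct in shares.items():
    print(f"{activity}: {pct}% currently providing")
```

Each activity row sums to 48, so the shares are directly comparable across activities. The "# of respondents" row is left out because it counts respondents who chose each rating at least once, not a per-activity total.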
Comments N=9
For some of these activities, we already support some, but not all, aspects described herein (e.g., we
verify metadata but don’t crosswalk, we ensure documentation are comprehensive and complete,
but we don’t open and run data files). We have not yet received AV materials as part of our data
management programs.
For those marked 1, we do a pretty minimal amount, e.g., might do file renaming or restructuring, or
metadata for a group or set of files, but not for each individual file.
Most of the above is for libraries collections.