11 SPEC Kit 354: Data Curation
Interestingly, respondents disagreed about the value of providing the data curation activities
on this list. In addition to responses indicating a strong interest in these activities, a
number of respondents indicated that they had no interest in providing them or were unsure
whether they wanted to provide them. The numbers of respondents who indicated no
interest or were unsure are listed in the table below.
Activity                   No Interest   Unsure   Total % of those providing a response
Repository Certification             5       10   31%
Code Review                         10        6   33%
Emulation                           14        7   44%
Peer Review                         20        5   52%
Software Registry                   12        9   44%
Deidentification                    11        5   33%
Interoperability                     5        4   19%
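As a rough check on the table above, the final column appears to be the sum of the No Interest and Unsure counts divided by the number of respondents who answered the question. The report does not state that number. The short Python sketch below assumes, purely for illustration, that every row shares one denominator and that percentages were rounded half up to whole numbers. It then searches for response counts consistent with all seven rows. The variable and function names are hypothetical.

```python
# Rows copied from the table: (activity, no_interest, unsure, reported_percent)
rows = [
    ("Repository Certification", 5, 10, 31),
    ("Code Review", 10, 6, 33),
    ("Emulation", 14, 7, 44),
    ("Peer Review", 20, 5, 52),
    ("Software Registry", 12, 9, 44),
    ("Deidentification", 11, 5, 33),
    ("Interoperability", 5, 4, 19),
]


def rounded_percent(count, total):
    """Return count/total as a percentage, rounded half up to a whole number."""
    return int(100 * count / total + 0.5)


# Assumption (not stated in the report): all rows share one denominator.
# Try every candidate from 1 to 200 and keep those matching every row.
consistent = [
    n
    for n in range(1, 201)
    if all(rounded_percent(no + unsure, n) == pct for _, no, unsure, pct in rows)
]

print(consistent)
```

Under the shared-denominator assumption, the search yields a single candidate, 48. If the number of responses actually varied by activity, this check does not apply.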
In both the processing and preservation categories, a large number of respondents
(close to half) indicated they have no interest in performing these curation activities in the future.
The survey results and comments made about data curation activities reflected librarians’ ambivalence
around incorporating them into library services. As one respondent commented, “We believe all this is
important, just not things the LIBRARY needs to do or should do.”
Peer review appears to be a particularly problematic area for librarians, as many respondents
appeared to recognize its importance to scholarship but felt that the complexities of peer review for data
put it outside of what libraries can offer. Some of the comments also indicated that while activities such
as repository certification and emulation are important, they are not necessary for every library to achieve
or to offer. Other comments expressed concern about the ability of libraries to offer these services given
limited resources and expertise. Instead, some respondents felt that these data curation activities would be
better performed by others, particularly the researcher depositing the data or an IT unit. This schism in
the survey responses, with some respondents aspiring to provide particular data curation activities and
others indicating uncertainty or no interest, is a further indication that the library community has not yet
come to a shared understanding of the roles it expects to play in providing data curation services.
Respondents indicated that they expect to face numerous challenges in providing data curation services
in the near future. The survey listed seven aspects of providing these services and all of them were seen
as challenging by respondents, receiving an average rating of 3.54 or higher on a 5-point Likert scale (5 =
very challenging). The most challenging aspect is having expertise in curating data from particular domains. The
lowest-ranked challenge is changing requirements for data sharing. The comments indicate there is considerable
concern about institutional priorities for data curation and funding, increasing demand for services and
the library’s capacity to scale up to respond to anticipated demand, and the challenges of recruiting and
retaining skilled personnel to provide services.
Perceived Importance of Curation Activities
The respondents who reported they are not currently offering data curation services were asked to assign
an importance rating to each of the 47 possible curation activities listed in the survey, with a rating of “1”
meaning that they consider the activity to be essential and a rating of “5” meaning that it is not important.
Overall, the activities that received the highest importance rating are in the ingest and access categories.