SPEC Kit 354: Data Curation
28. Please indicate your institution’s level of support for these data curation ingest activities on a scale
of 1 to 5 where 1=currently providing; 2=will provide in the near future; 3=would like to provide,
but unable to at this time; 4=no interest/desire to provide; 5=unsure. N=49
Activity              1    2    3    4    5
Metadata             43    1    4    0    1
Deposit agreement    38    5    5    1    0
Authentication       36    1    8    2    2
Documentation        36    3    8    0    2
File validation      32    3   12    0    2
Chain of custody     22    2   16    3    6
# of respondents     45    9   22    5    9
Authentication and chain of custody are not done at the level described here, in part because we allow
for unmediated ingest. We are using ORCID to log in to Zenodo to ingest data from a GitHub account by
linking to the UFID/Gatorlink authentication.
Deposit agreements have been done on an ad hoc basis. Formal agreement currently making its way
through legal for approval.
For file validation and chain of custody, we are using whatever is provided by Bepress during
IR is currently undergoing policy changes that affect this area.
Like many groups, the infrastructure and work was quickly rushed to production while not all the
services, policies and procedures, and distribution of tasks have been fully formed and vetted.
RE Chain of custody: we do this currently, but it’s not consistent enough for me to say it’s rigorous
enough to provide a true record of provenance.
Self-deposit IR supports these activities.
These levels have changed over time but the ratings reflect our current situation.
This is a mediated process that allows us to ensure authentication, chain of custody, and metadata. We
are working to provide better file validation.
We provide support but some elements (metadata, documentation) are not as robust as they could be
given that our repository is self-service.
SUPPORT FOR APPRAISAL ACTIVITIES
Here are descriptions of three data curation appraisal activities.
Rights Management: The process of tracking and managing ownership and copyright inherent to
a data set as well as monitoring conditions and policies for access and reuse (e.g., licenses and data
Risk Management: The process of reviewing data for known risks such as confidentiality issues
inherent to human subjects data, sensitive information (e.g., sexual histories, credit card information)
or data regulated by law (e.g., HIPAA, FERPA) and taking actions to reject or facilitate remediation
(e.g., de-identification services) when necessary.