172 · Representative Documents: Workflows
University of Michigan
Quality Assurance for BHL Web Archives
9/21/2011
Quality assurance (QA) refers to the systematic evaluation of an activity or product
“to maximize the probability that minimum standards of quality are being attained.”1
BHL staff involved in the preservation and QA of archived websites should have
some understanding of the design and architecture of websites (including links,
embedded content, web forms, navigational menus, etc.) as well as basic knowledge
of HTML, Cascading Style Sheets (CSS), JavaScript (JS), and other significant web page
features. A familiarity with the curatorial interface and basic functions of the
California Digital Library (CDL)’s Web Archiving Service (WAS) is also important.
In performing QA on websites preserved by the University Archives and Records
Program (UARP) and Michigan Historical Collections (MHC), the Bentley Historical
Library (BHL) seeks to ensure the accuracy and integrity of its web archives.
During this process, a BHL QA specialist will:
Identify incomplete, inaccurate, or unsuccessful web captures
Determine the underlying causes or issues that led to the substandard
captures. This step may require the QA specialist to:
o Verify crawl settings
o Review crawl reports and logs
o Inspect the content, layout, features, and source code of the target site
Document the results of the QA review, including:
o Any technical limitations, robots.txt exclusions, or other issues that may
have prevented a faithful and accurate capture of a website
o Contact information for webmasters (if necessary)
o Recommendations to delete captures or initiate new crawls
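
One of the checks listed above, whether robots.txt exclusions kept a crawler away from part of a site, can be sketched with Python's standard library. This is only an illustration and is not part of WAS or any BHL tooling; the robots.txt rules, the user-agent name "example-crawler", and the example.edu URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as it might be copied from a target site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""


def blocked_urls(robots_txt, urls, user_agent="example-crawler"):
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical URLs that a QA specialist found missing from a capture.
urls = [
    "http://www.example.edu/index.html",
    "http://www.example.edu/private/report.pdf",
    "http://www.example.edu/cgi-bin/search",
]

blocked = blocked_urls(ROBOTS_TXT, urls)
for url in blocked:
    print("Excluded by robots.txt:", url)
```

A URL reported here was off limits to a crawler that honors robots.txt, which may explain why it is absent from a capture; a missing URL that is not reported points to some other cause.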
Given the inherent challenges of various content types and the technical limitations of
the WAS infrastructure, it is not feasible to perfectly preserve the content,
appearance, functionality, and structure of all targeted websites. Although QA may
not resolve all issues with a given archived website, careful documentation will help
to establish the provenance of content and record actions taken by the archives.
Information gathered during QA will also enable the library to revisit problematic
captures as web archiving technology continues to mature.
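
As a rough sketch of how such documentation might be kept in a consistent, machine-readable form, the record below stores the kinds of information named in these guidelines (issues found, webmaster contact, recommendation). The class name, field names, and sample values are hypothetical and do not reflect an actual BHL or WAS record format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class QARecord:
    """Hypothetical QA note for one capture of one website."""
    site_url: str
    capture_date: str
    issues: List[str] = field(default_factory=list)  # limitations, exclusions, etc.
    webmaster_contact: Optional[str] = None          # recorded only if necessary
    recommendation: Optional[str] = None             # e.g. delete capture, new crawl


# Sample values are invented for illustration only.
record = QARecord(
    site_url="http://www.example.edu/",
    capture_date="2011-05-06",
    issues=["Navigation menu built with JavaScript did not render"],
    recommendation="Initiate new crawl",
)

# Serialize to JSON so the note can be stored alongside other documentation.
record_json = json.dumps(asdict(record), indent=2)
print(record_json)
```

Keeping the same fields for every capture would make it easier to find and revisit problematic captures later.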
The CDL’s release of additional quality assurance tools and reporting features for
WAS in late May/early June 2011 will require the revision of these guidelines and
procedures. This document will also be reviewed on an annual basis to ensure that
the information and procedures contained herein are current and applicable.
“Quality assurance.” Wikipedia (May 5, 2011). Retrieved on May 6, 2011 from