through Flickr. Oral Roberts University’s Written Rummage project2 to transcribe a Frederick Douglass diary uses Mechanical Turk, a free Amazon cloud service, to manage the transcription workflow.3 Better-funded projects have produced efforts such as What’s on the Menu?,4 a New York Public Library project to transcribe historic restaurant menus. As of the end of November 2011, volunteers had transcribed 645,517 dishes from 10,960 menus.
Some projects rely on specialized crowdsourcing software. Among the first to enter this arena was software engineer Ben Brumfield, who built the web-based tool From the Page5 for transcribing, indexing, and annotating handwritten material. At the time Iowa was starting its project, this was the only open-source solution available. Since then, with the help of grants from the National Endowment for the Humanities Office of Digital Humanities, the Roy Rosenzweig Center for History and New Media has developed an open-source tool, Scripto, and applied it to transcribe 45,000 papers of the War Department.6 This solution is gaining momentum, in part because it integrates with existing content management systems.
Libraries considering crowdsourcing should also look to the Australian and European library communities, as well as non-library efforts, for innovative and more seasoned examples of engaging the crowd. The National Library of Australia and the non-profit Distributed Proofreaders have organized extensive projects to correct OCR-generated text from scanned images and to enhance access by adding tags and other markup.7 International university collaborations such as Galaxy Zoo, a Zooniverse project, ask volunteers to classify millions of photographs of galaxies, while still other projects invite the public to upload their own artifacts and recollections for inclusion in an online collection.
The Iowa Approach
In preparation for the Civil War sesquicentennial beginning in 2011, the UI Libraries conducted a two-year reformatting project to provide comprehensive digital access to the Civil War manuscript materials in its Special Collections department, comprising approximately 50 collections containing more than 20,000 pages of correspondence and diaries. As the scanning effort was drawing to a close, curators began to discuss ways to promote the resulting digital collection. Most of the items were handwritten and lacked transcriptions (with the exception of a small number provided by the families who donated the materials), so the idea of a transcription crowdsourcing project had strong
RLI 277 10 Experimenting with Strategies for Crowdsourcing Manuscript Transcription (CONTINUED) DECEMBER 2011 RESEARCH LIBRARY ISSUES: A QUARTERLY REPORT FROM ARL, CNI, AND SPARC