News: OCR Grant Awarded to the Initiative for Digital Humanities, Media, and Culture (IDHMC) at TAMU

News from the TEI-L list:

English Professor Laura Mandell, Director of the Initiative for Digital Humanities, Media, and Culture (IDHMC), and her two co-PIs, Professor Ricardo Gutierrez-Osuna and Professor Richard Furuta, are very pleased to announce that Texas A&M has received a 2-year, $734,000 development grant from the Andrew W. Mellon Foundation for the Early Modern OCR Project (eMOP). The two other project leaders, Anton DuPlessis and Todd Samuelson, are book historians from Cushing Rare Books Library.
Over the next two years, eMOP will work to improve scholarly access to an extensive early modern text corpus. The overarching goal of eMOP is to develop new methods and tools to improve the digitization, transcription, and preservation of early modern texts.
The peculiarities of early printing technology make it difficult for Optical Character Recognition (OCR) software to discern discrete characters and, thus, to render readable digital output. By creating a database of early modern fonts, training the software that mechanically types page images (OCR) to read those typefaces, and creating crowd-sourced correction tools, eMOP promises to improve the quality of digital surrogates for early modern texts. This grant makes it possible to improve the machine-translation of digital page images with cutting-edge crowd-sourcing and OCR technologies, both guided by book history. Our goal is to further the digital preservation processes currently taking place in institutions, libraries, and museums globally.
The IDHMC, along with our participating institutions and individuals, will aggregate and re-tool many of the recent innovations in OCR in order to provide a stable community and expanded canon for future scholarly pursuits. Thanks to the efforts of the Advanced Research Consortium (ARC) and its digital hubs, NINES, 18thConnect, ModNets, REKn and MESA, eMOP has received permissions to work with over 300,000 documents from Early English Books Online (EEBO) and Eighteenth-Century Collections Online (ECCO), totaling 45 million page images of documents published before 1800.
The IDHMC is committed to the improvement and growth of digital projects and resources, and the Mellon Foundation’s grant to Texas A&M for the support of eMOP will enable us to fulfill our promise to the scholarly community to educate, preserve, and develop the future of humanities scholarship.
For further information, including webcasts describing the problem and the grant application as submitted, please see the eMOP website.
For more information on our project partners, please see the following links.
ECCO at Gale-Cengage Learning
EEBO at ProQuest
Performant Software
Professor Raghavan Manmatha at the University of Massachusetts Amherst
The IMPACT project at the Koninklijke Bibliotheek – National Library of the Netherlands
PRImA at the University of Salford Manchester
Department of Computer Science and Engineering, Texas A&M University
The Initiative for Digital Humanities, Media, and Culture, Texas A&M University
Cushing Memorial Library and Archives
The OCR Summit Meeting Participants
For more ARC and IDHMC news, please see the following links.
Texas A&M to House Digital Literary Consortium
MESA to Receive Funding
REKn to Partner with ARC