The newly-released DocSouth Data makes the full text of hundreds of nineteenth-century books and pamphlets available for easy download as text-only files. The materials come from four text-heavy Documenting the American South collections: The Church in the Southern Black Community; First-Person Narratives of the American South; Library of Southern Literature; and North American Slave Narratives. (http://blogs.lib.unc.edu/news/index.php/2015/01/introducing-docsouth-data-old-texts-for-new-readings/)
The page for DocSouth Data provides a basic overview of the data included, then offers ZIP files of the data for the major collections, with an explanation of the folder contents and structure. It also links to Voyant, a fabulous tool for folks who are new to text and data mining and want to explore and learn about it.
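For anyone who wants to go one small step beyond Voyant, here is a minimal sketch of what "playing with the data" could look like in Python. It assumes a downloaded collection ZIP contains plain-text files ending in `.txt`; the function name `count_words_in_zip` and the simple word-matching rule are my own illustrative choices, not anything DocSouth Data prescribes.

```python
import re
import zipfile
from collections import Counter


def count_words_in_zip(zip_path):
    """Count word frequencies across every .txt file inside a ZIP archive."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Assumption: the texts are stored as .txt files; skip everything else.
            if not name.lower().endswith(".txt"):
                continue
            # Assumption: UTF-8 encoding; undecodable bytes are replaced, not fatal.
            text = archive.read(name).decode("utf-8", errors="replace")
            # Very rough tokenizer: runs of ASCII letters and apostrophes, lowercased.
            counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts
```

Calling `count_words_in_zip("collection.zip").most_common(20)` (with the hypothetical file name swapped for whichever ZIP you downloaded) would list the twenty most frequent words across the collection, which is roughly the kind of first look Voyant gives you in the browser.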
I’m very excited to see what scholars, and especially students in classes, do with the newly released DocSouth Data. I’m super-duper excited to have such a concise model to follow for releasing data from other projects. DocSouth Data is an amazing resource for its content, and for the clear and simple model it offers all digital collections and repositories: releasing the data takes little internal effort, and people outside the project need little effort to start playing with and making use of it.