In August, the Digital Library Center proudly announced breaking the one million page mark, with over a million pages online for more than “20 collections, representing more than 44,000 titles in more than 52,000 volumes.” Now, just 7 months later, we’ve added slightly over another 60% of that to the collections, for a total of 1,621,841 pages, 51,746 titles (up from 44,000), and 67,487 volumes (up from 52,000). That means we’ve been adding almost 10% of that August page total each month for the past 7 months, at nearly 100,000 pages a month!
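The figures above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes the August baseline was exactly 1,000,000 pages (the announcement says only "over a million"), and the variable and function names are illustrative rather than anything the DLC uses.

```python
# Baseline figures from the August announcement.
# Assumption: "over a million pages" is treated as exactly 1,000,000.
AUG_PAGES = 1_000_000
AUG_TITLES = 44_000
AUG_VOLUMES = 52_000

# Current figures, seven months later.
NOW_PAGES = 1_621_841
NOW_TITLES = 51_746
NOW_VOLUMES = 67_487
MONTHS = 7


def growth(before: int, after: int) -> float:
    """Return the increase from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100


pages_added = NOW_PAGES - AUG_PAGES
pages_per_month = pages_added / MONTHS
# Average monthly production as a share of the assumed August baseline.
monthly_share = pages_per_month / AUG_PAGES * 100

print(f"Pages added:   {pages_added:,} ({growth(AUG_PAGES, NOW_PAGES):.1f}% growth)")
print(f"Titles added:  {NOW_TITLES - AUG_TITLES:,} ({growth(AUG_TITLES, NOW_TITLES):.1f}% growth)")
print(f"Volumes added: {NOW_VOLUMES - AUG_VOLUMES:,} ({growth(AUG_VOLUMES, NOW_VOLUMES):.1f}% growth)")
print(f"Average pages per month: {pages_per_month:,.0f}")
print(f"Monthly share of August baseline: {monthly_share:.1f}%")
```

Under that assumption, the average works out to about 89,000 pages a month, or roughly 9% of the August baseline each month, which is in line with the rounded figures quoted above. A baseline above one million would lower both numbers slightly.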
The production rate is even more impressive when the types of pages are considered. Large-scale digitization initiatives produce far more pages than this, but any comparison would be apples to alligators because our pages come from all sorts of materials: documents, photographs, maps, video, audio, and more. Each item requires metadata (title, author, and a lot more), so books are relatively quick to process for their page count. Letters, maps, and photographs are much slower, because the same information is often required for each page. Plus, these pages were produced over the fall-to-spring semester break and spring break, times when student workers are in short supply and when many staff members take their own vacations. The page count also can’t accurately reflect audio and video files, since any video or audio clip is counted as a single page, even if that one page really means an hour-long video with all of the required processing. Even large printed materials aren’t accurately represented by page counts, given that a single-page map will often be several square feet in size, requiring additional processing and time for that one page. Despite the difficulties in reporting fully accurate statistics, the production rate remains extremely impressive, and what’s even more impressive is thinking about how many people all of these pages, and all of these materials, will help. Of course, many of the materials are also books, like the cover image above, which is from Susan Proudleigh by Herbert G. de Lisser. The book is long out of print and was rare, so hard to find that this copy was actually digitized from a photocopy because that’s what was readily available. But now Susan Proudleigh is available to all.
The fast and steady production is due to a great crew of dedicated workers (including students, some of whom have been at the DLC for multiple years), great technologies that we all work to constantly improve, and constant work to streamline digitization workflows. While we may not be able to go much faster than the current speed of roughly 100,000 pages a month, and we may go slower during the summer with fewer student workers, 100,000 pages a month is still a great service for the University of Florida, the University Libraries, and every citizen, as all benefit from more information being openly available.
2008-03-23