UFDC Usage Stats!

We now have usage statistics for our collections as a whole and by collection! They’re online here in an Excel spreadsheet. Our overall usage stats, while good, are far smaller than they will be, both because so much of our content is recent (over a million pages since July alone) and because UFDC had to block search engine robots entirely for several months in early 2008, when the search engines were misbehaving and overtaxing the database.
As expected, our most heavily used collections are the oldest and largest: the Baldwin, the Florida Newspaper Digital Library, and the Digital Library of the Caribbean.
Given the addition of so much content and the new static pages (which are exposed to search engines), UFDC’s hits should double (or more) in the next year!

3 Comments

  1. Thanks for posting. I like the organization by month. Your numbers represent page hits per collection, correct? We (Duke Libraries) are working on preparing and publishing our Digital Collections usage stats for 2008 and hope to share soon…
    http://library.duke.edu/blogs/digital-collections

  2. Yes, those are page hits per collection, with the totals for all collections in the black row, so 343,608 hits in December 2008.

  3. The statistics we stored in the database were kept on a monthly basis. We stored:
    1) The total number of hits and sessions across all of UFDC
    2) At each collection hierarchy level (subcollection, collection, collection group), the number of sessions and hits on the:
    a) home page
    b) advanced search page
    c) search results page (# of searches)
    d) browses
    3) For each institution, we stored the same data as above
    4) The number of hits and sessions at each title level (less commonly used; the title view is the list of all items under a title, for example all issues of a newspaper without a single issue selected)
    5) The number of hits and sessions for each item, including the number of hits on specific view types:
    a) jpeg view
    b) zoomable jp2 views
    c) citation views
    d) google map views
    e) flash views
    f) download views
    g) static view pages (exposed to search engines).
    We attempted to remove all robot hits by inspecting information in the original HTTP request (OS, browser type, and user-agent strings such as MSNBOT, Crawler, Slurp, etc.).
    I have the C# code I used for much of this analysis and can share the database design if anyone is interested.
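    For illustration only, here is a minimal C# sketch of that kind of user-agent filtering. This is not the actual UFDC code: the class name, bot token list, and collection codes below are placeholders, and real log parsing is omitted.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class RobotFilterSketch
    {
        // Substrings that commonly identify crawlers in a User-Agent header.
        // This list is illustrative, not the production UFDC list.
        private static readonly string[] RobotTokens =
            { "msnbot", "slurp", "crawler", "spider", "googlebot", "bot" };

        // Treat a request as a robot if the User-Agent is missing or
        // contains any known crawler token.
        public static bool IsRobot(string userAgent)
        {
            if (string.IsNullOrEmpty(userAgent))
                return true;
            string ua = userAgent.ToLowerInvariant();
            return RobotTokens.Any(token => ua.Contains(token));
        }

        public static void Main()
        {
            // (collection code, User-Agent) pairs standing in for parsed
            // log rows; the collection codes are placeholders.
            var requests = new (string Collection, string UserAgent)[]
            {
                ("baldwin", "Mozilla/5.0 (Windows; U; en-US) Firefox/3.0"),
                ("fnl",     "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"),
                ("dloc",    "Mozilla/5.0 (compatible; Yahoo! Slurp)")
            };

            // Monthly hit counts per collection with robot traffic excluded.
            var hits = new Dictionary<string, int>();
            foreach (var request in requests)
            {
                if (IsRobot(request.UserAgent))
                    continue;
                if (!hits.ContainsKey(request.Collection))
                    hits[request.Collection] = 0;
                hits[request.Collection]++;
            }

            foreach (var pair in hits)
                Console.WriteLine(pair.Key + ": " + pair.Value + " hits");
        }
    }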
