Story of a Printed Database
A colleague recently shared her dissertation research files with me. The files are relevant to our conversations and activities around several digital programs and research curation efforts. She’s a Professor of History, so I’m out of my depth (and always trying to catch up) in considering the meaning and contents of her research, but I was extremely excited by the form of her research files. They include over 300 pages of typed notes (in a computer format that is still readable) that are extremely detailed, with full citations for the materials and the holding institutions (different archives, repositories, etc.).
To build her argument from the research data, she printed copies of all of her research notes and parsed them by different elements and their attributes. The printed notes were pasted onto cards, which were arranged in specific groupings in boxes. She built her argument from the sources and notes she had assembled by sorting, reviewing, cross-referencing, and analyzing the materials together under different searches and sortings.
This paper database has information on the source institution holdings, which is especially important for archival materials given their need for context (provenance, original order, etc.). The holdings information includes attributes for the top-level institution, institutional divisions, archival collections, particular paper series, other organizational units, and items. In addition to the holdings and their attributes, the paper database has primary source citation items with information at the item level (date, author, theme/topic, and recipient for correspondence items). The primary source items were also structured for specific events, with attributes for date(s), people, theme/topic, etc.
Because this was only a short side discussion during a meeting, I’m not clear on the full extent, but I believe that larger super-constructs like historical period or social change-event were not represented as formal structures. That makes sense: those constructs are in the knowledge of the expert user, and keeping assumed structures out of the data model is important in grounded theory research, which builds from the evidence on the ground to develop new understandings and inform existing ones. In using the paper database, the cards were eventually arranged and configured to create the argument. The cards now include notes on where they belonged in the structure of the dissertation, with notes on changes over time.
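For readers who think in code, here is a minimal sketch of how the card structure described above might look as a data model. It is only an illustration under my own assumptions, not my colleague's actual scheme: the class names (Holding, Card), the field names, the helper functions (group_by, by_topic), and every sample value are hypothetical placeholders. In keeping with the grounded approach, it deliberately has no formal structure for larger constructs like historical period.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Holding:
    """Where a source lives, from the top-level institution down to the item."""
    institution: str
    division: Optional[str] = None
    collection: Optional[str] = None
    series: Optional[str] = None
    unit: Optional[str] = None   # any other organizational unit (hypothetical field)
    item: Optional[str] = None


@dataclass
class Card:
    """One note card: a transcribed note plus item-level attributes."""
    note: str
    holding: Holding
    date: Optional[str] = None        # ISO-style string so cards sort chronologically
    author: Optional[str] = None
    recipient: Optional[str] = None   # only meaningful for correspondence
    event: Optional[str] = None       # the specific event the item relates to, if any
    topics: List[str] = field(default_factory=list)
    placement_notes: List[str] = field(default_factory=list)  # place in the dissertation, over time


def group_by(cards, key):
    """Regroup cards into 'boxes' by any attribute, like re-sorting the paper cards."""
    boxes = defaultdict(list)
    for card in cards:
        boxes[key(card)].append(card)
    return dict(boxes)


def by_topic(cards):
    """A card can carry several topics, so it may land in several boxes."""
    boxes = defaultdict(list)
    for card in cards:
        for topic in card.topics:
            boxes[topic].append(card)
    return dict(boxes)


# Invented placeholder data, not drawn from any real archive or research file.
archive_a = Holding(institution="Archive A", collection="Collection 1", series="Correspondence")
archive_b = Holding(institution="Archive B", collection="Collection 2")
cards = [
    Card(note="Note on letter 1", holding=archive_a, date="1900-01-15",
         author="Person X", recipient="Person Y", topics=["labor", "migration"]),
    Card(note="Note on report", holding=archive_b, date="1899-06-01",
         author="Person Z", topics=["labor"]),
    Card(note="Note on letter 2", holding=archive_a, date="1901-03-02",
         author="Person Y", recipient="Person X", topics=["migration"],
         placement_notes=["chapter 2, later moved to chapter 3"]),
]

by_institution = group_by(cards, lambda c: c.holding.institution)
by_theme = by_topic(cards)
chronological = sorted(cards, key=lambda c: c.date or "")
```

The point of the sketch is that each "sorting" of the paper cards is just a different grouping key over the same records, while the holding information travels with every card so that archival context is never lost.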
Printed Databases Matter for Now and the Future
My colleague’s printed database was made necessary by many factors, including lack of access to digital copies (or any copies outside of the holding archive) of the primary source materials; controlled use of technology in the archives (institutional controls, and practical controls where some support was simply not feasible); and limited technologies for integrating source materials and their contents with research notes and processes. More archives (more, not all) now allow more technology in the archives. The authors of Supporting the Changing Research Practices of Historians: Final Report from ITHAKA S+R (December 10, 2012) state:
The introduction of digital cameras to archival research is altering interactions with materials and dislocating the process of analysis, with potential impacts not only for support service providers but for the nature of history scholarship itself. (4)
Transcription remains an important part of the research method for many historians, and they reported spending hours in an archive taking notes by hand or on computer. In some instances – though rarer by the day – transcription is the only option available to archival researchers for capturing the content of the sources. (11)
The ability to organize and access photographs in a constructive way after a trip is a sticking point for many of those who worked with digital cameras. Because the digital images are typically JPEGS, there is no metadata inherently associated with the file that relates it to the content of the image. Scholars rely on complex file structures and good memories to access their files once home from the archive. One interviewee includes call slips in her photographs, which stated the name of the archive and the collection, so that she could always orient herself to the source. Again, the displacement of the intellectual engagement with the material appears to have some downsides, given the lack of tools or software to facilitate the process of capturing and using digital photographs for scholars. (12)
Some of the needs and concerns for better supporting historians and historical research processes can be addressed with Zotero and similar tools that support reference management with attached materials (e.g., a PDF of an article stored with its citation). Zotero also supports work in groups. There’s clearly more that could be done in many areas, as the ITHAKA S+R report authors and others have shown.
One area that needs more work, and that represents an opportunity, is integrative support for data management and digital collections through collaboration among librarians, archivists, and historians. Research time for historians includes time to learn about material holdings, travel to archives, take photos and transcribe sources, arrange sources (paper piles and book stacks around desks, digital folders, etc.), and other activities related to accessing resources and building relationships, as well as time in isolation for curation, arrangement, sorting, databasing, and data clean-up. After the analysis and scholarly work, the findings are published as books and articles and shared in many ways. However, the heavily curated resources behind them are normally not made available. Any effort to build on this data management and make the resources available needs to be integrated with other processes, so that it increases the benefits of the final outcome without increasing the workload on scholars.
The problem, and the opportunity, could be to build out integrative data support. Libraries already have systems and supports in place, including institutional repositories and digital libraries with primary materials. Building on them to implement integrated support for historians’ research data management (data here being digital files: all of the files of primary documents plus the research the historian adds) would make more resources available immediately. It would also provide an example framework that has its own value, as well as collateral and potential value for informing and supporting the necessary discussions and translations among historians, librarians, technologists, and others in defining needs and developing new supports for the full data lifecycle.
Printed Databases and Beyond: History and Other Fields
The need for integrated data support across the full data lifecycle is clear in Supporting the Changing Research Practices of Historians: Final Report from ITHAKA S+R (December 10, 2012):
Researchers widely and consistently reported that managing analog and digital research notes and sources is a primary challenge for them. Collocating and accessing research notes, and relating them to the writing in an effective way, is an organizational challenge, especially for large book projects that can last multiple years and cover hundreds, if not thousands, of resources. And yet, this is perhaps the most tangible component of the analytical work conducted by historians. (23)
Digital systems do not appear to address all the needs of even those scholars who seek to use them. One scholar’s process for collecting and organizing source material incorporated a database to capture passages and collect notes. From the database he then prints each note or quote onto an index card, and the words are then organized into chapters. He manually reviews the stack of note cards for a section of a chapter, arranges them into a narrative, and writes from this tangible tool. (25)
Historians are not alone in finding digital tools insufficient for their needs, and in finding inventive ways to meet those needs through alternate designs. Alternate designs are particularly useful for surfacing the critical complexities and dependencies involved in meeting user needs, institutional needs (integrating with good processes, leveraging existing capacity, aligning with mission-critical goals), and community needs such as community building, change management, and more. For the scholar’s primary field and for other fields, alternate designs can inform and evoke ideas and ways of thinking that are productive for conceptualizing and creating new systems for data management, scholarly communications, and scholarly practices.