Week Four Reading Blog: Deconstructing the Database’s Perks and Perils

Our previous weeks’ readings consistently reflected both enthusiasm and caution about the new and not-so-new digital tools available to all historians. In keeping with this message, the digital “database” has now become the next digital-history resource that can potentially elevate historical scholarship to a new level (“digital history 2.0,” according to James Mussell) while still retaining the potential to “scald” unwary and neophyte users. For me, the readings suggested that the digital database, if used properly and for selected purposes, can be a boon to the average historian, even the most digitally challenged of us. Yet in spite of my enthusiasm, Lev Manovich’s characterization of the digital database as the ‘natural enemy’ of narrative resonates strongly with me. My own experience with Edward Ayers’s The Valley of the Shadow Web site gave me reason to believe that Manovich’s point has merit. Thus, the message to me is clear: proceed with caution, and understand the strengths and weaknesses of the database before putting it to use.

The first message I took from the readings was the need to recognize the true nature, both good and bad, of the digital database as described by Manovich. I am inclined to agree that a database is simply a digital archive in which every stored component holds equal status and which depends upon the skill and intent of the user to unleash its potential. But, as Patrick Spedding cautioned about the limits of the ECCO database, recognizing the shortcomings of those digitized holdings is absolutely essential to making effective use of them. Understanding the shortcomings of OCR and recognizing error rates in digitization, points that Simon Tanner raised in an earlier reading and that Spedding further underscored, are critical to making effective use of the database.

In my own tinkering with the historical newspapers archived on ProQuest’s site, I learned quickly that each search for a key term produced not simply easy-to-digest results but actually a whole new database. In other words, by the simple act of saving my results to the “My Research” feature in ProQuest, I engaged in the kind of digital manipulation of a database that Manovich discussed. In effect, I had created a new database to which I could potentially apply further searches with greater granularity. But I stumbled a bit here, since I could not figure out how to conduct these more refined searches within my newly created database. What I did discover, though, was that ProQuest’s search function queried the actual OCR-produced text of the scanned newspapers. And instead of identifying complete phrases, the search produced results according to each word in a phrase. Thus, instead of returning only matches for the complete phrase “invasion of Europe,” my new database became filled with needless hits based upon individual words in the phrase, most notably the preposition “of.” Even so, the results quickly narrowed the database for me and presented some discernible patterns. Most importantly, they packaged into a newly defined database the actual primary sources to which I needed to apply the historian’s traditional qualitative analysis. I kept in mind Sean Takats’s concerns about the abundance of source material as I tinkered with ProQuest, but my ability to reconfigure and limit the initial database helped alleviate some of those concerns. I was able to follow my “hits” directly to the OCR-scanned facsimile of the articles and assess them individually. But I still found myself sorting through a lot of superfluous stuff. Thus, the act of sifting through numerous sources, some useful but many not, showed me that the traditional approach to researching history still applies, but the research part now seems much more efficient thanks to the digital database.
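
The difference between word-by-word matching and true phrase matching can be sketched in a few lines of Python. This is only an illustration of the general idea, not a description of how ProQuest’s search actually works; the sample headlines and the function names (tokenize, any_word_hits, phrase_hits) are invented for the example.

```python
import re

# Hypothetical headlines standing in for OCR text; not actual ProQuest results.
ARTICLES = [
    "Allies Plan Invasion of Europe, Officials Say",
    "Secretary of State Meets Envoys",
    "Invasion Rumors Sweep the Capitals of Europe",
    "Price of Wheat Climbs Again",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def any_word_hits(query, docs):
    """Return every document that shares at least one word with the query."""
    wanted = set(tokenize(query))
    return [doc for doc in docs if wanted & set(tokenize(doc))]


def phrase_hits(query, docs):
    """Return only documents where the query words appear consecutively, in order."""
    phrase = tokenize(query)
    hits = []
    for doc in docs:
        words = tokenize(doc)
        last_start = len(words) - len(phrase)
        if any(words[i:i + len(phrase)] == phrase for i in range(last_start + 1)):
            hits.append(doc)
    return hits


if __name__ == "__main__":
    query = "invasion of Europe"
    print(len(any_word_hits(query, ARTICLES)), "hits when any word matches")
    print(len(phrase_hits(query, ARTICLES)), "hits when the whole phrase must match")
```

With these invented headlines, the any-word search returns all four items, because each one contains “of,” while the phrase search returns only the first. The third headline shows the subtler case: it contains every word of the query, but not as a consecutive phrase.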

The second point from the readings that grabbed my attention was the idea that databases can now allow historians to make subordinate points within a broader argument without engaging in extensive (and possibly digressive) research. W. Caleb McDaniel described how some scholars used search “hits” to quantify and support “points that were secondary to their arguments,” but the danger rests in what Lara Putnam cautioned as “superficiality or topical narrowness.” I am inclined to agree, yet I find this use of the database particularly intriguing for my own dissertation research. For example, my focus will be on examining how the radio and print media portrayed D-Day as it was happening on 6 June 1944. But I wanted to explore as a subordinate matter the degree to which newspapers “talked up” the invasion in the six months leading up to the event. The idea of reviewing extensive six-month samplings of multiple American newspapers to support a smaller point contained in one or two paragraphs seemed quite daunting and not the best use of my time. I liked the terms that Lara Putnam used to describe the possibilities of making transnational connections to historical arguments through searches among multiple databases: “side-glancing” and “term-fishing.” These terms helped me to conceptualize how databases can enable the inclusion of subordinate points within an argument without necessarily crossing external boundaries as Putnam intended. In effect, the point made is strictly contingent upon a targeted (but hopefully not superficial) acknowledgement of another factor that bears directly on the core argument, without the need for an in-depth examination of numerous primary sources. But “hope” isn’t a method, and in spite of my attraction to the concept, I’m concerned that such results may in fact make narrow, tenuous points that won’t withstand scrutiny. My greater fear is that historians more broadly may tend to rely on basic patterns gleaned from databases to make many of their key points. My own thinking here is incomplete, and I may have actually talked myself into the very pitfall that concerned Putnam: a tendency toward superficiality. Frankly, I won’t know how I feel about my own re-defined notions of side-glancing and term-fishing until I put them to the test.

In sum, I see databases as a great thing for historians, but I’m skeptical about characterizing them as a genre unto themselves. Manovich’s article makes an interesting case for the database as a new “cultural form,” but I can’t help but see databases (at least for the moment) as just another digital tool historians may leverage to accelerate and enrich their research. In other words, a database is really just an archive without the dust. And so I proceed with cautious enthusiasm!

Steve Rusiecki


