Wednesday, June 17, 2009

Google Book Search and my wasted youth

In college I worked as a library researcher for the Oxford English Dictionary. I had two main roles. First, I would verify quotations. They would send me a slip of paper with a citation from a book, journal, etc. I would locate that item in the library and verify the citation's accuracy.

More interesting, and more time-consuming, was antedating work. As a historical dictionary of English, the OED strives to give the earliest quotation it can for each word. When the editors were working on a new entry, they would send me their earliest citations and ask me to find earlier ones.

Antedating led to some interesting adventures. I emailed Richard Stallman to ask about the origin of the term POSIX; he refused to offer any help until the OED was released into the public domain. Researching "ribbit," I combed the script archives at UCLA to find an annotation in an old Smothers Brothers script.

Most of my work, though, turns out to have been largely wasted. Fifteen years later, early uses that I spent hours finding can be beaten within minutes using Google's Book Search. A sad (for me) example is "bow hunting". I looked through dozens of books about hunting with a bow, not to mention dozens of volumes of old magazines, to find an early use of that term. The oldest citation the OED has is from 1947; that's the earliest one I could find after hours of work in 1993. Using Google Book Search today, it took me less than a minute to locate a citation from 1923. (Interestingly, Popular Mechanics won't allow the full citation to be displayed.) A little more digging could probably locate even earlier examples. As Google Book Search expands its corpus, the date could go even further back.

With all the controversy surrounding Google Book Search, it's easy to overlook its incredible importance for all kinds of scholarship. I was hired by the OED partly because I was in Berkeley and had access to its vast library holdings. Now anyone, anywhere in the world can do a vastly more thorough search. It's only a matter of time before each quotation in the OED is linked directly to the source material; someone could easily write a Firefox add-on to do just that, right now.

I'm not bitter, though. I think of myself as monk-like, creating a beautiful sand mandala, only to have it swept away upon completion. This is how it ought to be.

