About this blog

My "two cents" on being an old-fashioned librarian in the digital age.


Wednesday, February 16, 2011

Google Metadata and Librarians

I went to an amazing library conference last November - the Charleston Conference.  During the last plenary session, we got to hear Jon Orwant, a Google engineering manager, talk about the Google Books project.

Now, you've probably seen old-fashioned librarians like me reel and hiss at the very idea, nay, the very notion, of Google presuming to scan books and make them wholly or partially available online.  Yes, there are plenty of lawsuits out there.  Yes, of course, having Google send a very charismatic speaker to a premier conference is self-serving.  And yes, there are issues about intellectual property rights and access and digitization, and the fact that so many cataloging mistakes have happened, and all that other stuff.

But this guy was COOL.

I won't go into the more technical aspects of his talk.  But I do want to tell you about one thing he shared with us.  One of the reasons for this project is to facilitate research, and Google gives academic scholarships and access to their databases for various projects.  Back in the 1950s, a definitive book about the Victorian era was published.  To write it, the author basically read every book he could get his hands on that was written during that time, and mined them for data about language, and custom, and culture, and so forth. 

Google gave a grant to a group of scholars who wanted to recreate this guy's work, but digitally.  Since so many books from that era have now been digitized, keyword searches can replace tedious page-by-page reviewing.  So this group of folks (I think they were from George Mason University, though I may be wrong on that) gained access to the Google databases and took a set of 50 keywords - words like "leisure" and "work" - then tracked how common each one was over time.  And in fact, the word "leisure" started to disappear over the era, and the word "work" increased exponentially.  Ok, so that's cool already, right?
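For the curious, here is a rough sketch of what tracking a keyword over time might look like. To be clear, this is my own toy illustration and not the scholars' actual method: the four "books" in `BOOKS` are invented, and the function name `keyword_rate_by_decade` is made up for this example. It counts how often a word appears per 1,000 words, grouped by decade, so that decades with more surviving text don't automatically look busier.

```python
import re
from collections import defaultdict

# Invented toy corpus of (publication year, text) pairs. Not real data.
BOOKS = [
    (1841, "A life of leisure, and leisure alone, was his aim."),
    (1848, "Leisure suited her; work did not."),
    (1885, "Work, work, and more work filled his days."),
    (1892, "There was little leisure; the work never ended."),
]


def keyword_rate_by_decade(books, keyword):
    """Return {decade: occurrences of `keyword` per 1,000 words}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in books:
        decade = year // 10 * 10
        # Lowercase the text and split it into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        totals[decade] += len(words)
        hits[decade] += words.count(keyword)
    # Normalize by the total word count so decades are comparable.
    return {d: 1000 * hits[d] / totals[d] for d in sorted(totals)}


if __name__ == "__main__":
    for word in ("leisure", "work"):
        print(word, keyword_rate_by_decade(BOOKS, word))
```

Run over a real digitized collection instead of four made-up sentences, a table like this is what would let you watch one word fade while another takes over.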

Well, this Google engineer went one better to show us how digitization could expedite research.  He created a graph of every book published, plotted by date and subject, that could be zoomed in on specific periods.  So at the dawn of the printing press - BOOM - a spike in books.  Zoom in on French literature from 1650 to 1750 - BOOM - you see a spike because of the French Enlightenment.  (It was really funny - he zoomed in on that quite by accident, and said "Huh - I wonder why there's a spike here?" And the room of 1,000 librarians all started shouting "It's the French Enlightenment!"  Heh.)
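I have no idea how his chart was actually built, but the "zoom" idea can be sketched in a few lines: filter catalog records by subject and date range, then count them per decade. Everything here is an assumption for illustration. The records in `RECORDS` are invented and `books_per_decade` is a made-up name, not anything from Google.

```python
from collections import Counter

# Invented catalog records of (publication year, subject). Not real data.
RECORDS = [
    (1640, "French literature"),
    (1700, "French literature"),
    (1705, "French literature"),
    (1710, "French literature"),
    (1705, "English literature"),
    (1800, "French literature"),
]


def books_per_decade(records, subject, start, end):
    """Count records for one subject per decade, for years in [start, end]."""
    counts = Counter(
        year // 10 * 10
        for year, subj in records
        if subj == subject and start <= year <= end
    )
    # Sort by decade so the result reads like a timeline.
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(books_per_decade(RECORDS, "French literature", 1650, 1750))
```

A spike in those counts is only as trustworthy as the dates and subjects in the catalog records behind it, which is exactly where those cataloging mistakes come back to bite.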

COOLEST THING EVER.

I wish he'd put that chart online so folks could play with it - heck, those of us at his talk would probably be willing to pay!
