Tuesday, January 25, 2011

Distant Reading by Using a Written Index

Today, gauging the general pulse of what people are saying or talking about or reading is fairly easy. Twitter's "Trending Topics" are one of many methods for seeing what people are interested in right now. Others include a scan of the top stories in today's newspapers, or a comparison of today's blog posts by keyword.

But how do we gauge that same pulse historically? What were people interested in 200 years ago? Can we really know?

One method is to look at what people read. For example, in 18th century London, the Gentleman's Magazine was the most successful magazine that targeted a wealthy and powerful audience. The Gentleman's Magazine was published for over 200 years, starting in 1731, and during the late eighteenth century it was England's most respectable magazine. In the first issue, the editor declared the magazine was originally conceived to "give monthly a view of all the pieces of wit, humour, or intelligence, daily offered to the Public in the Newspapers, (which of late are so multiplied, as to render it impossible, unless a man makes it a business, to consult them all)".

Since we know that the magazine sold well, it's fair to assume that the editors were providing content that people wanted to read, and that the magazine is therefore a relevant measure of what people found interesting, broadly construed. That's not to say everyone who read it thought every article spoke to their soul. It's just to suggest that people vote with their wallets, and that one of the strongest indicators of a publication's influence is the number of copies it sells and how long it's able to continue publishing.

This magazine has since been digitized by Google Books, but to my knowledge, there is no freely available machine-readable text version. However, there is a paper subject index of all the essays written for the magazine. The index was compiled in 1821 by some poor chap whose father insisted upon the task. The index won't give you the same full-text search option we've come to love in search engines. It is organized thematically into categories selected by the indexer. This means there is a fair degree of bias in terms of what was categorized, how it was categorized, and what was overlooked. However, with 675 pages of topics, along with the page references needed to find each article, the index itself provides a useful gauge of what was appearing in the magazine during the roughly 30-year period it covers (1787-1818).

[Image: reading John Bull's mind]

Each page is split into two columns. Using the columns as a unit of measure, I have compiled a list of the 34 most prevalent subjects from the Index and graphed them as a word cloud. France had the most, with 11.5 columns of entries, and Africa had the least, with half a column. Given the status of the magazine with Britain's upper classes, and given that the magazine was successful and was therefore publishing content its readers wanted to consume, I believe it is fair to say that this word cloud is a fairly good, if rough, illustration of what was important to wealthy Englishmen from 1787 to 1818.
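
For anyone who wants to try the same thing, here is a minimal sketch of how column counts could be turned into word cloud weights. It is not the tool used for the graphic above. It includes only the three counts quoted in this post (France, Fires, Africa); the other subjects would have to be filled in from the index itself. The function names and the 10 to 72 point font range are assumptions made for illustration.

```python
# Column counts quoted in the post. The remaining subjects are left out
# because their counts are not given in the text.
columns_by_subject = {
    "France": 11.5,
    "Fires": 7.0,
    "Africa": 0.5,
}


def ranked(counts):
    """Return (subject, columns) pairs, most columns first."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


def font_sizes(counts, min_pt=10, max_pt=72):
    """Linearly scale each count to a font size between min_pt and max_pt.

    The point range is a hypothetical choice, not taken from the post.
    """
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    sizes = {}
    for subject, count in counts.items():
        if span == 0:
            # All subjects have the same count, so give them the same size.
            sizes[subject] = float(max_pt)
        else:
            share = (count - low) / span
            sizes[subject] = round(min_pt + share * (max_pt - min_pt), 1)
    return sizes


if __name__ == "__main__":
    sizes = font_sizes(columns_by_subject)
    for subject, count in ranked(columns_by_subject):
        print(f"{subject}: {count} columns -> {sizes[subject]} pt")
```

Linear scaling keeps the picture honest about proportions: a subject with twice the columns above the minimum gets twice the extra font size. Many word cloud tools use other scalings, so the result may not match a generated graphic exactly.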

Given the state of international affairs between Britain and various countries of the world during this era, the countries that appear most prominently are perhaps not a surprise. However, there are a few observations that I think are worth noting. Overall, I would suggest this word cloud shows us what the wealthy people of late eighteenth and early nineteenth century England were worried about. Without reading the articles, or even knowing what the titles of the articles are, it would seem that these "Gentlemen" were writing essays about countries with which England was at war or at odds: France, Ireland, America. Appearing less frequently are Scotland and Wales, both passive at this point in terms of relations with England, along with Switzerland, which tended to stay out of things.

The third most common term is "Naval Action" and "Buonaparte" is high on the list. Fires (7 columns), Storms and Murder all appear on the list. The first two threaten commerce, and murder is an obvious concern. Cowpox (which had been discovered to be usable as a vaccine against smallpox) is as prominent as Scotland or England. However, notice that theft, rents, and poverty don't appear at all.

By applying my historical knowledge of Britain during this era, my distant reading of the Gentleman's Magazine suggests to me the following conclusions:

Wealthy Englishmen in the late eighteenth and early nineteenth centuries were interested in whichever country was currently causing the most trouble. They wanted to be kept informed of things that could kill them, or things that could disrupt their trade. They were interested in discussing the structure of the Anglican church, but less interested in discussing other religions, or directly engaging with the Bible. And finally, London was more important than America.

Is this perfectly accurate? No, of course not. But it is a macro-analysis of the interests of wealthy Englishmen from 1787 to 1818. And it only took 20 minutes.

I would be interested to hear from others who are engaged in distant reading, and to hear any comments or concerns that digital humanists and historians alike may have about my methodology and conclusions.

Monday, January 3, 2011

My C.V. is Better Than Yours

Sure, you may have more publications, conference presentations and funding. You may have a better job than I do or more awards. The contents of your C.V. may make mine seem underwhelming.


But my C.V. is better. In fact, I’m confident that if we applied for the same competition, it would be my C.V. that had the committee members talking over lunch. “Did you see that guy’s C.V.?”


My C.V. makes noise. It moves by itself. It’s shiny.


See why My C.V. is better than yours: My C.V.


I’ve taken up the challenge of Dr. Tom Scheinfeldt in his blog post “New Wine in Old Skins: Why the CV needs hacking”. In that post, Scheinfeldt urged academics to come up with new ways of presenting their achievements, ways that move beyond the traditional lists that have remained largely unchanged since the eighteenth century.


So I'd like to extend a challenge, particularly to the digital humanists out there. Let 2011 be the year you make your C.V. better than mine. And let me know about it.