Using a Google database of about 5.2 million digitized books, Harvard researchers have created a "cultural genome" of the humanities, reports the Chronicle of Higher Education's Marc Parry. The researchers, who published their findings in Science, looked through the "digital fossil record" of about 15 million digitized books, selected a data set of 5.2 million books, and examined how often certain words appeared over time in that data set in order to quantify cultural trends. They found, among other things, that the English language is still growing, that humanity forgets its history more quickly with each passing year, and that celebrities are losing their fame faster than in previous years. The data set, which the researchers say is the largest to date for this type of study, includes books in several languages published between 1500 and 2008 and contains more than 500 billion words, which, Parry says, amounts to a sequence of letters 1,000 times as long as the human genome.
The 'Cultural Genome'
Dec 18, 2010