There's a lot of data on the Internet, and researchers have struggled to organize and use it. But Princeton researcher David Blei has been trying to teach computers to do the organizing themselves, The Economist says. Blei starts by defining topics as sets of words that tend to come up in connection with each other. For example, "Big Bang" and "black hole" often appear together, but neither is often connected to "genome," the article says. Once the sets are defined, computers can start to figure out which terms are part of a topic and which aren't. Blei's model also supports very narrow topic searches, letting the user decide how fine-grained the analysis should be, the article says, adding that eventually, "networks of linked words will emerge." Beyond offering a novel way to organize information, Blei believes the approach can shed light on how the scientific method works. One version of his model looks for topics in published texts and then tracks how those topics evolve in the literature from year to year, The Economist says. This "allows important shifts in terminology to be tracked down to their origins, which offers a way to identify truly ground-breaking work," the article adds, meaning "the sort of stuff that introduces new concepts, or mixes old ones in novel and useful ways that are picked up and replicated in subsequent texts. So a paper's impact can be determined by looking at how big a shift it creates in the structure of the relevant topic."
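To make the idea of words that "come up in connection with each other" concrete, here is a toy sketch in Python. It is not Blei's model, whose details the article does not give. It simply counts how often pairs of terms share a document and groups terms that are linked often enough. The corpus, the function names, and the `min_count` threshold are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each document is reduced to the set of terms it mentions.
docs = [
    {"big bang", "black hole", "galaxy"},
    {"big bang", "black hole"},
    {"genome", "gene", "protein"},
    {"genome", "gene"},
    {"black hole", "galaxy"},
]


def cooccurrence_counts(documents):
    """Count, for every pair of terms, how many documents contain both."""
    counts = Counter()
    for terms in documents:
        # Sort so each pair is always stored in the same (a, b) order.
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
    return counts


def group_terms(documents, min_count=1):
    """Group terms joined by a chain of pairs seen together at least min_count times."""
    counts = cooccurrence_counts(documents)
    parent = {}

    def find(term):
        # Union-find lookup with path compression.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    # Register every term so that unlinked terms still form their own group.
    for terms in documents:
        for term in terms:
            find(term)

    # Merge the groups of any two terms that co-occur often enough.
    for (a, b), n in counts.items():
        if n >= min_count:
            parent[find(a)] = find(b)

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    print(group_terms(docs))
    print(group_terms(docs, min_count=2))
```

In this sketch, raising `min_count` plays the role of asking for a finer analysis: weakly linked terms split off into their own groups. A real topic model assigns words to topics probabilistically rather than by a hard threshold.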
Science and the Web
Apr 29, 2011