European academic publisher Elsevier says it is opening up its vaults, or databases, and making it much easier for scientists to text-mine its more than 11 million online scientific papers, Nature reports.
The move should be welcome to scientists seeking to scour vast seas of information for their large-scale analysis projects, which are considered a particularly useful approach for finding meaning in genomic data. It may also prod other publishers to follow suit, Nature expects.
Elsevier says researchers at academic institutions may now use its online interface to batch-download documents in XML format. The Dutch publisher plans to provisionally limit investigators to 10,000 articles per week, and asks that they sign a legal agreement stating that they will publish the products of their text-mining only under a non-commercial use license and will use only snippets of the original text.
Elsevier changed its policy after many in the research community said going through the publisher for data was an onerous and time-consuming process.
Max Haeussler, a computational biologist at the University of California, Santa Cruz, once spent over three years trying to get DNA data from 3 million articles for his genome research, Nature reports.
Elsevier's Chris Shillum says the criticism was "legitimate," and that the publisher has now tried to remove "the practical barriers" to text-mining.
“Finally, someone is showing that there is no need to be afraid of text-mining analysis anymore,” Haeussler says.