NEW YORK (GenomeWeb News) – US funding agencies may have awarded millions of dollars in duplicate grants, according to an analysis by Virginia Bioinformatics Institute researchers presented as a peer-reviewed comment in Nature today.
Using text-mining software followed by manual review, Virginia Tech professor Harold Garner and his colleagues identified 39 pairs of grants awarded between 2007 and 2011 that were suspiciously similar. Altogether, those grants totaled more than $20 million, the researchers said.
"It is quite possible that our detection software missed many cases of duplication," Garner said in a statement. "If text similarity software misses as many cases of funding duplications as it does plagiarism of scientific papers we've studied, then the extent of duplication could be much larger. It could be as much as 2.5 percent of total research funding, equivalent to $5.1 billion since 1985."
However, the researchers noted that, because of the limitations presented by the data they relied on, they could not definitively say that they uncovered true duplications. To come to more certain conclusions, they said that they would need to make Freedom of Information Act requests to have full access to grant documents.
For their analysis, Garner and his team combed through more than 850,000 funded grant and contract summaries from the National Institutes of Health, the National Science Foundation, the Department of Defense, the Department of Energy, and the Susan G. Komen for the Cure charity. About a quarter of those records were excluded from the study because their summaries were too short.
With their eTBLAST text similarity engine, the researchers calculated similarity scores for the remaining grant summaries. Grants with scores that met an arbitrary cutoff, totaling about 1,300 summaries, were selected for manual review.
The manual review aimed to differentiate grants that merely used similar phrases in introductory or background sections from ones that had overlapping project aims, hypotheses, or goals.
From this, they found 167 pairs of grants amounting to about $200 million in funding that appeared to be duplications.
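The screening step described above (score pairs of summaries, keep those above a cutoff, and pass them to manual review) can be illustrated with a minimal Python sketch. It does not reproduce eTBLAST: it swaps in plain Jaccard similarity on word sets, and the function names, the 0.6 cutoff, the minimum word count, and the sample summaries are all hypothetical choices made for illustration.

```python
import re
from itertools import combinations

def tokenize(text):
    """Lowercase a summary and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two word sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def flag_pairs(summaries, cutoff=0.6, min_words=5):
    """Return (id1, id2, score) for pairs scoring at or above the cutoff.

    summaries maps a grant id to its summary text. Summaries with fewer
    than min_words distinct words are skipped, mirroring the exclusion of
    summaries that were too short. The cutoff is an arbitrary assumption.
    """
    tokens = {gid: tokenize(text) for gid, text in summaries.items()}
    usable = {gid: t for gid, t in tokens.items() if len(t) >= min_words}
    flagged = []
    # Compare every usable summary with every other one exactly once.
    for id1, id2 in combinations(sorted(usable), 2):
        score = jaccard(usable[id1], usable[id2])
        if score >= cutoff:
            flagged.append((id1, id2, score))
    # Highest-scoring pairs first, ready for manual review.
    return sorted(flagged, key=lambda pair: pair[2], reverse=True)

# Hypothetical summaries, invented for this example.
summaries = {
    "A1": "Identify genetic markers of breast cancer risk using sequencing of tumor samples",
    "B7": "Identify genetic markers of breast cancer risk using sequencing of patient tumor samples",
    "C3": "Develop low cost solar cells from perovskite thin films",
    "D9": "Too short",
}
for id1, id2, score in flag_pairs(summaries):
    print(id1, id2, round(score, 3))
```

Comparing every pair this way grows quadratically with the number of summaries, so a collection of hundreds of thousands of records would call for an indexed search rather than this brute-force loop. As in the study, a high score only marks a pair for human review; it does not establish that the funding was duplicated.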
"Our analysis does not determine whether any likenesses in funded grant pairs are inappropriate, only that the short summaries contained highly similar aims, goals, objectives, and hypotheses," the researchers cautioned. "To identify true duplicates, we would need to compare the full applications, the awards made and any adjustments made to the awards on the basis of disclosures of duplicate funding. This information is not publicly available."
The researchers did not publish the grant summaries that they flagged as possible duplications.
At the researchers' behest, Susan G. Komen for the Cure did examine 30 pairs of grants that it funded. Two pairs, the foundation said, had already been identified, and their funding had been adjusted in ways that were not reflected in the summaries searched by Garner and his team. The foundation then began a review of two other projects, but could not evaluate any other possible duplicates without data from other funding agencies.
The researchers contended, though, that their analysis likely missed many instances of grant duplications.
For example, they noted that some principal investigators appeared to apply for a grant that included support for lab members while also applying for funding from another agency to support students. They also found similar grants submitted by different PIs, including pairs of researchers who appear to publish together. Additionally, the researchers said that some research articles cite support from a number of grants, and that further investigation of some of those grants uncovered possible duplicates that their analysis had missed.
Further, similar text-mining techniques have found instances of plagiarism in 0.04 percent of biomedical manuscripts, even though 1.4 percent of researchers admitted to plagiarism in a survey. Extrapolating from that, Garner and his colleagues suggested that instances of duplication could reach as high as 12,441 grant pairs, or up to $5.1 billion in funding, since 1985.
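The gap between the two plagiarism rates cited above can be put as a single ratio. The short Python sketch below computes only that ratio. The function name is hypothetical, and the researchers' full extrapolation to 12,441 grant pairs relies on inputs not given here, so the sketch does not attempt to reproduce that figure.

```python
def underdetection_factor(admitted_pct, detected_pct):
    """Ratio of the self-reported rate to the rate found by text mining."""
    if detected_pct <= 0:
        raise ValueError("detected_pct must be positive")
    return admitted_pct / detected_pct

# Rates cited in the article: 1.4 percent admitted, 0.04 percent detected.
factor = underdetection_factor(admitted_pct=1.4, detected_pct=0.04)
print(f"Self-reported rate is about {factor:.0f} times the detected rate")
```

On these figures, the self-reported rate is roughly 35 times the rate that text mining detected.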
Their results, the researchers said, underline the need for a centralized grant information database that includes detailed grant application information.
"In line with the Government Accountability Office report issued February 2012, these findings suggest the research community should undertake a more thorough investigation of the true extent of duplication and establish clearer and more consistent guidance and coordination of grant and contract funding across agencies, both public and private," Michael Waitzkin, a researcher on the analysis from Genomeon, said.