The Microarray Gene Expression Data Society has sent a letter to about 90 scientific journals suggesting that they require authors to submit microarray data to public repositories as part of the process of publication.
The effort is aimed at encouraging researchers to share their data and make their results more transparent.
“There are public data repositories, [such as] GEO and ArrayExpress, that are able to accept data in a meaningful way. We wanted to make sure that journals, editors and reviewers knew about them and were making sure that data were getting deposited in those repositories,” said Catherine Ball, director of the Stanford Microarray Database and president of the MGED Society.
The letter follows on an earlier initiative launched by the MGED Society to get journals to follow a set of guidelines and a checklist based on the organization’s Minimal Information About a Microarray Experiment (MIAME) standard. The MIAME guidelines were published in Nature Genetics in 2001, and a year later the MGED Society fired off a letter to journals that publish articles on microarray research suggesting that they require research papers to follow the guidelines as a condition of publication (see BAN 10/11/2002).
The initiative met with success as several of the major scientific journals contacted by MGED immediately agreed to the suggestion. The Nature journals even went a step further by requiring authors to submit data that was integral to a paper’s conclusions to either the ArrayExpress or Gene Expression Omnibus repositories.
While the response to that earlier letter was “extremely positive,” the MGED Society said it believed the journals should now take that extra step.
According to the letter, sent out last week, “While the adoption of these standards has greatly improved the accessibility of microarray data … obtaining and comparing datasets remains a significant challenge. Clearly we need additional requirements for publication that include submission of expression data to public data repositories.”
It is unclear how many scientific journals require researchers to follow the MIAME guidelines or submit their microarray data to a public repository. But one journal that strongly recommends it, without requiring it, is Genome Research.
According to the journal’s policy, which was sent to BioArray News by Genome Research Managing Editor Hillary Sussman, “Material from a publication must be easily available to the broader community in publicly held databases and repositories when available, and at the Genome Research Web site, and if desired at the author’s Web site, when they are not.” The policy states that there are no exceptions to this rule.
Researchers Still Face Obstacles
Although the MGED Society is pushing for journals to follow its suggestions, it recognizes that there are still obstacles for some researchers in complying.
“It’s still pretty hard if you don’t have a software package that helps you. If you don’t have access to a package [that can export data in an appropriate form for public repositories], you will probably spend an afternoon or two clicking on a Web site, annotating your data,” said Ball.
“The journals, reviewers and authors are all interested in doing the right thing. It’s just a question of making the right thing easy to deal with,” she said.
Another potential obstacle for researchers submitting data, for the short term at least, is access to the particular markup language needed to submit data to a given repository. While ArrayExpress and GEO accept microarray gene expression markup language (MAGE-ML), some other public repositories do not accept documents in that language.
But Ball told BioArray News that ArrayExpress, GEO, and Japan’s CIBEX repository are working on methods to “within a reasonable time replicate the data at different sites, very much along the lines of the GenBank model. So, eventually whether one demands or accepts MAGE-ML or not, I think that will become moot.”
Another concern voiced by the MGED Society is that some corporate researchers are not making their data publicly available, in effect asking other researchers to trust that their sequences work.
“Recently there have been a lot of array manufacturers who are resisting releasing their actual sequences that are on the arrays,” Ball said.
But some manufacturers may have concerns about divulging proprietary information.
Agilent spokesperson Christina Maehr explained, “A large percentage of our business comes from custom microarrays, and in some cases the reason those microarrays are made custom is because a pharma company or a biotech company or even consortia believes they have identified unique genes or something proprietary.
“For biotech and pharmaceutical companies, that type of information to them is often a competitive advantage.”
A final suggestion from the MGED Society was that the DNA Databank of Japan, the European Bioinformatics Institute, and the National Center for Biotechnology Information repositories collaborate on exchanging all MIAME-compliant microarray data.
Thus far, the feedback to this suggestion has been positive, according to Ball. “All three fully intend to do it, and I know they’ve started work on a few of the initial stumbling blocks.”