The US Food and Drug Administration is organizing a new research study to “objectively assess the technical performance of different next-generation sequencing technologies” for DNA and RNA analyses, and to evaluate the pros and cons of various data analysis methods, according to a notice published in the Federal Register last month.
The study, called Sequencing Quality Control, or SEQC, is open to the research community and is expected to be completed by the end of this year. According to its organizers, the project is “a natural extension” of the MicroArray Quality Control, or MAQC, project that the FDA has been running since 2005 in order to evaluate the performance of microarray and related platforms for gene expression analysis.
The FDA is interested in conducting the new study because it expects the new sequencing technologies “to be adopted by the pharmaceutical and medical industries for advancing personalized nutrition and medicine,” according to the notice. SEQC will “help prepare FDA for the next wave of submission of genomic data generated from the next-generation sequencing technologies.”
According to Leming Shi, a SEQC organizer and an FDA researcher, there have been “serious discussions” for more than a year among MAQC participants about a performance assessment of the new sequencing technologies and data analysis methods for RNA sequencing.
Over the last two years or so, several MAQC members have already been sequencing the two reference RNA samples from the first phase of the MAQC project on second-generation sequencing platforms (see In Sequence 4/22/2008). In addition, Shi and his colleagues have received inquiries from scientists outside of MAQC about the possibility of such a study.
The first phase of the MAQC project, MAQC I, measured gene expression levels in two standardized RNA samples on seven microarray and three other platforms, including qPCR, at three independent test sites. Results from that study, which involved participants from 51 organizations, were published in Nature Biotechnology in 2006. The second phase of the project, which includes participants from 60 groups, is focusing on data analysis and predictive models. Its results are currently being prepared for publication.
SEQC, which the organizers consider the third phase of MAQC, was formally launched at a meeting at an FDA campus in Silver Spring, Md., on Dec. 16 and 17 that hosted more than 40 researchers interested in the project. Among them were representatives from Illumina, Life Technologies, Roche’s 454 Life Sciences, and Helicos BioSciences.
Researchers and reviewers from several FDA centers are also participating in SEQC, according to Shi, a computational chemist at the FDA’s National Center for Toxicological Research in Jefferson, Ark.
The study is currently seeking additional participants, and interested parties can submit requests until Jan. 9. Besides vendors of sequencing technologies, institutions “interested in the generation, management, analysis, and interpretation” of sequencing data are invited to join, according to the FDA.
The study design is expected to be finished next month, and researchers will start collecting RNA sequence data on the two MAQC reference RNA samples this spring. These samples will be spiked with external RNA controls of known sequence and abundance, according to Shi.
The two MAQC reference RNA samples were “a natural choice” for benchmarking RNA sequencing data in SEQC, Shi said in an e-mail message, because “a huge amount of expression data has already been collected” on them. “In fact, all major sequencing players have been using the two RNA samples internally for quality control and protocol optimization purposes,” he said.
Each sequencing platform will be tested at three sites using the same set of RNA samples, and the results will be compared with those from the MAQC I study.
According to Shi, SEQC will also solicit proposals on sequencing other RNA samples “if they are deemed ‘interesting’ and certain references/standards are available for comparing results.” In addition, “we are also interested in applications other than gene expression,” which will be “addressed separately,” he said, without providing further details.
Besides comparing new sequencing platforms, SEQC will also evaluate a variety of bioinformatic solutions for analyzing and handling the data. “We will not limit ourselves to any predetermined approaches,” Shi said. “In fact, we welcome SEQC participants to explore many different methods for sequence mapping and assembly, and to compare the resulting data against each other or with the ‘truth’ embedded in the reference RNA samples.”
Results from this evaluation may be especially interesting, he suggested. “We predict that, like previous phases of MAQC, the impact of data analysis will be the more interesting part of SEQC.”
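As a rough illustration of what comparing sequencing results with the “truth” in spiked controls of known abundance could look like, the sketch below correlates known control abundances with read counts on a log scale. It is not the SEQC analysis plan, which had not been finalized at the time of writing. The function names, control names, pseudocount, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spike_in_agreement(known_abundance, read_counts, pseudocount=1.0):
    """Correlate log2 known abundance with log2 read counts.

    Only controls present in both dictionaries are used. Known abundances
    must be positive; the pseudocount (an assumption) keeps zero read
    counts from breaking the logarithm.
    """
    shared = sorted(set(known_abundance) & set(read_counts))
    log_known = [math.log2(known_abundance[name]) for name in shared]
    log_measured = [math.log2(read_counts[name] + pseudocount) for name in shared]
    return pearson(log_known, log_measured)


# Made-up illustrative values, not data from SEQC or MAQC.
known = {"ctrl_A": 1.0, "ctrl_B": 10.0, "ctrl_C": 100.0, "ctrl_D": 1000.0}
counts = {"ctrl_A": 12, "ctrl_B": 95, "ctrl_C": 1100, "ctrl_D": 9800}

agreement = spike_in_agreement(known, counts)
print(f"log-log Pearson r = {agreement:.3f}")
```

A value close to 1 would suggest that read counts track the known abundances across the tested range. A real evaluation would likely also look at accuracy at low abundance and at agreement between test sites.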
After collecting sequence data this spring, SEQC plans to analyze the results over the summer and submit a manuscript for publication by the end of the year. “Participants expressed the need to work with a tight timeline because of the evolving nature of the sequencing technologies,” according to Shi.
Like MAQC, the new study will provide a “neutral environment” in which participants can “openly discuss ideas, debate scientific issues, and share expertise,” which is “critical to cultivate a new generation of users of the new technologies,” Shi said.
More information for those interested in participating in SEQC can be found here.