The US National Institute of Standards and Technology plans to make standard reference RNA controls available to microarray users by early next year, BioArray News has learned.
Developed by the External RNA Controls Consortium, an ad hoc group of participants from industry, the government, and academia, the spike-in RNA controls will be designed to enable researchers to measure the performance of their assays, according to Marc Salit, the chairman of the ERCC and a research chemist at NIST.
Salit told BioArray News this week that the ERCC has agreed upon a library of 96 controls composed of synthetic mammalian RNA that NIST will sell to array users at cost. As part of the controls development process, array platform manufacturers agreed to add probes to their arrays that will detect the control RNA without interfering with existing content.
"People really want to have confidence in their measurements," Salit said. "These controls are designed to give people that confidence."
Gaithersburg, Md.-based NIST has developed a plasmid DNA library that will be used to create the RNA controls using in vitro transcription. ERCC members Affymetrix, Atactic Technologies, Invitrogen, and the Stanford Genome Technology Center aided NIST in the construction of the library. Ultimately, NIST will make its library available to users like core labs or commercial vendors that will use it to manufacture RNA controls.
Salit said that the ERCC is now certifying the sequences of the plasmid inserts in the library as standard reference material. He said that certification of the library of 96 controls, which comprises about 95,000 bases, should be completed by the end of this year, allowing NIST to make the reference RNAs available in 2010.
ERCC members Ambion and Commonwealth Biotechnologies are developing traceable RNA controls from the library, Salit added. "We are working with Ambion and CBI to make sure that these controls will be commercially available," he said. "This way, they can assert traceability to the reference material.
"We will work with any manufacturers that are interested in making derivatives of that," Salit said. "We are trying to enable the world to have traceable reference material to facilitate gene expression measurements in regular applications."
'A Long Process'
Established in 2003, the ERCC has delayed the targeted release date for its controls several times. After agreeing on a test plan for developing the spike-in controls in 2005, the ERCC embarked on testing 176 proposed controls with the aim of narrowing them down to a set of 96 that could work reliably across all array platforms. The controls were tested at Illumina, Affymetrix, Agilent Technologies, and the National Institute of Allergy and Infectious Diseases. The target date for making them available was pushed back from mid-2006 to the end of 2007 and then to mid-2008 (see BAN 7/17/2007).
Salit said that part of the reason for that delay is that the certification process for the library has taken longer than expected. For instance, the ERCC had to work with the International Organization for Standardization (ISO) to amend its definition of certified reference material to include sequences.
"We had to develop a new approach to certification of a reference material in order to certify sequence; this hasn't been done before," said Salit. "We worked to have the ISO definition of certified reference material changed to accommodate this need. Our new work here was in developing a way to express confidence in the base calls, when aggregating sequence data from many alternate measurements."
The ERCC test sites have also carried out an extended dynamic range study on the controls, and the ERCC has performed "authoritative sequencing" on the spike-ins. "NIST will issue this as standard reference material," Salit said. "That means the controls are backed up with multiple measurement methods and sufficient experimental measurements to allow us to estimate any uncertainty."
Salit said that, having gathered data from multiple labs using multiple sequencing instruments, the ERCC is now integrating all those data and classifying each base. Based on these evaluations, each base in NIST's library will have an estimate of confidence, ranging from "high confidence" to "ambiguous." Salit said he expects the "overwhelming majority" of the bases in the library to fall into the first category. He added that the ERCC is doing another round of second-generation sequencing to include more data in the certification process.
"We are expecting to have reference material on the street in 2010," Salit said. "It's been a long process and we've learned a lot along the way."
According to Salit, there has been "great interest" from array firms and users in gaining early access to the controls. Currently, there are about 14 different groups using them in a beta testing mode, referred to as 'Phase V testing' within the consortium.
Seattle-based NanoString Technologies, for example, is using the spike-ins as calibration controls in its nCounter system. Lianne McLean, NanoString's vice president of marketing, told BioArray News this week that the firm recently implemented the set of RNA control transcripts and complementary probes for its nCounter gene expression assay.
"Since the ERCC control sequences are alien to all genomes, they will have greater specificity than strategies that use, for example, Arabidopsis sequences as controls in human assays," McLean said. For nCounter users, the assay and data analysis workflows remain the same, she added.
Salit said that researchers at the Allen Institute for Brain Science are using 20 spike-in controls to identify samples from dissected tissue. "They are doing a spatial map of gene expression in human brains, and the controls will be used to assure that samples can be tracked through the measurement process," Salit said.
He said the 20 spikes are mixed in such a manner that the presence or absence of each one represents a binary digit, meaning about a million possible mixtures (2^20, or 1,048,576) can be used as "serial numbers" for the samples.
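The arithmetic behind the "serial number" scheme can be illustrated with a short sketch. It is not the Allen Institute's actual protocol, which the article does not describe; the function names and the choice to number the spikes 0 through 19 are assumptions made for illustration only.

```python
# Illustrative sketch only: encode a sample "serial number" as the set of
# spike-in controls that are present in the mixture. The spike numbering
# (0..19) and the function names are hypothetical, not from the ERCC.

NUM_SPIKES = 20  # number of spike-in controls cited in the article


def serial_to_mixture(serial, num_spikes=NUM_SPIKES):
    """Return the indices of the spikes to add for a given serial number."""
    if not 0 <= serial < 2 ** num_spikes:
        raise ValueError("serial number does not fit in the available spikes")
    # Bit i of the serial number decides whether spike i is present.
    return [i for i in range(num_spikes) if (serial >> i) & 1]


def mixture_to_serial(present_spikes, num_spikes=NUM_SPIKES):
    """Recover the serial number from the spikes detected in a sample."""
    serial = 0
    for i in present_spikes:
        if not 0 <= i < num_spikes:
            raise ValueError("unknown spike index")
        serial |= 1 << i
    return serial


if __name__ == "__main__":
    print(2 ** NUM_SPIKES)                            # 1048576 mixtures
    print(serial_to_mixture(5))                       # [0, 2]
    print(mixture_to_serial(serial_to_mixture(5)))    # 5
```

In a scheme like this, serial number 0 corresponds to a mixture with no spikes at all, which would look the same as an unlabeled sample, so a real implementation would probably skip it.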
"Those are the kinds of applications we are seeing already in early access mode," Salit said. "It's got good momentum."
Salit said he expects early access users will publish on their use of the controls. NIST is also likely to publish on the sequence certification process.
Going forward, the ERCC is looking for ways to expand the use of the ERCC controls into other application areas, Salit said. "We are thinking of how to leverage our understanding and experience with this reference material to support structural genomic assays, such as next-gen sequencing," he said.
NIST is also looking to deploy its reference materials in "applications that give users more confidence in their molecular assays," Salit said. He cited species identification, food identification, and biosecurity as areas where NIST could next turn its attention.
"These molecular assays are often PCR-based, and validation using controls may be useful to establish measurement performance," Salit said. "Imagine routine proficiency testing where 'blind' samples are distributed to a variety of test sites, with known correct answers, and ROC curves being constructed for each test site — the various communities would be able to demonstrate, with objective evidence, the quality of their data," he said.
He said NIST will evaluate the use of the ERCC controls for these applications at the outset, because "they’re really well characterized, and we’ve got them at hand," but stressed that these projects are at an early stage.
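The proficiency-testing scenario Salit describes, in which blind samples with known answers are scored and a ROC curve is built for each test site, can be sketched roughly as follows. This is a generic textbook construction, not a NIST or ERCC procedure; the function names and the example scores are hypothetical.

```python
# Illustrative sketch only: build a ROC curve for one test site from its
# scores on blind samples whose true status is known to the organizer.
# Function names and data are hypothetical.


def roc_points(scores, truths):
    """Return (false positive rate, true positive rate) points.

    scores: the site's reported values, higher meaning "more likely positive"
    truths: the known correct answers (True = positive sample)
    """
    positives = sum(1 for t in truths if t)
    negatives = len(truths) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    ranked = sorted(zip(scores, truths), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve (1.0 = perfect ranking)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    site_scores = [0.9, 0.8, 0.7, 0.6]        # made-up results for one site
    known_truth = [True, False, True, False]  # answers held by the organizer
    curve = roc_points(site_scores, known_truth)
    print(curve)
    print(area_under_curve(curve))
```

Comparing the resulting curves or areas across sites would give the kind of objective evidence of data quality that Salit refers to.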