An adaptation of Illumina's sequencing technology could improve quantitative measurements of protein-DNA binding affinity over current array-based methods, Massachusetts Institute of Technology researchers have reported.
The group harnessed Illumina's Genome Analyzer II to measure the binding of proteins to DNA, a technique they call high-throughput sequencing–fluorescent ligand interaction profiling, or HiTS-FLIP. They published their method in Nature Biotechnology last month.
Tested on the yeast protein Gcn4 and compared to a standard protein-binding microarray, the HiTS-FLIP method offered more quantitative results and allowed "improved discrimination of in vivo Gcn4p binding sites and regulatory targets," the authors wrote in their paper.
"One core problem we are interested in [in my lab] is RNA splicing," Christopher Burge, who's team led the study, told In Sequence this week. The team initially planned to use the Illumina sequencer to study that, but realized it would be difficult to generate RNA on the flow cell, so they decided to establish the method with DNA first "to work out the technical details and assess the potential to obtain the sort of deep, quantitative portrait of binding that we were after," Burge said.
HiTS-FLIP essentially takes the basic process of the Illumina GA II and changes up a few of the additive and software elements, Burge said. Fluorescently tagged proteins are added to the flow cell and the location of their binding to each DNA cluster is imaged in the same way as that of fluorophore-tagged nucleotides.
To create and test the HiTS-FLIP method, the research team chose Gcn4p, a basic leucine zipper transcription factor in the yeast Saccharomyces cerevisiae, and a master regulator of the amino acid starvation response, the authors wrote. Burge said Gcn4p offered not only an interesting subject, but one for which there was already a lot of microarray data available, allowing for a comparison with what Burge said is currently the most popular method for examining protein binding: microarrays.
The team tagged Gcn4p with an mOrange fluorescent tag, and applied the protein at several different concentrations to a flow cell with a library of "randomized 25-base pair synthetic DNA" molecules in approximately 88 million clusters, according to the report.
"[Proteins] only bind to the clusters that contain specific 7-mers or 8-mers," Burge said. "Some of those have a strong binding site and some have a weak binding site, so at a given concentration, the strong site might be completely saturated, and the weak ones might have just a few protein molecules."
This manifests on the resultant image as a constellation of binding points with a range of intensities and very little background noise, which, Burge said, was exciting to the researchers. "What we realized when we saw that image was, 'Wow, it really works great' – it's binding very specifically, and there is essentially no background, there's no haze back there," he said.
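Burge's point about strong and weak sites at a given concentration can be illustrated with a minimal sketch. It assumes a simple one-site equilibrium binding model, in which occupancy equals the protein concentration divided by the sum of the concentration and the dissociation constant. The model choice, the function name, and all concentrations and dissociation constants below are hypothetical values chosen for illustration, not figures from the paper.

```python
def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Equilibrium occupancy of one site under a one-site binding model.

    Assumption: simple equilibrium binding, occupancy = [P] / ([P] + Kd).
    """
    return protein_nM / (protein_nM + kd_nM)


# Hypothetical dissociation constants (nM), for illustration only.
SITES = {"strong": 10.0, "weak": 1000.0}

# Hypothetical protein concentrations (nM), for illustration only.
for conc in (1.0, 25.0, 125.0, 625.0):
    row = ", ".join(
        f"{name} site {fraction_bound(conc, kd):.2f}" for name, kd in SITES.items()
    )
    print(f"{conc:6.0f} nM: {row}")
```

Under these assumed values, the strong site is mostly occupied at a concentration where the weak site is only sparsely bound, which is the kind of intensity range described for the image.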
In their report, the researchers wrote that HiTS-FLIP of the Gcn4 protein yielded approximately 440 million binding measurements, "enabling determination of dissociation constants for all 12-mer sequences having submicromolar affinity" and revealing a "complex interdependency between motif positions."
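A rough back-of-the-envelope sketch shows how quickly sequence space grows with motif length, and why a library of this size can plausibly cover every 12-mer. It uses the 25-base random inserts and approximately 88 million clusters reported above. It assumes, as a simplification, that every cluster is usable and that reverse complements are not counted separately. The function names are hypothetical, and this is not the authors' analysis.

```python
def distinct_kmers(k: int) -> int:
    """Number of distinct DNA sequences of length k (four bases per position)."""
    return 4 ** k


def windows_per_insert(insert_len: int, k: int) -> int:
    """Number of overlapping k-mer windows in one insert of length insert_len."""
    return insert_len - k + 1


CLUSTERS = 88_000_000  # approximate cluster count reported in the article
INSERT_LEN = 25        # random insert length reported in the article

for k in (8, 12):
    total_windows = CLUSTERS * windows_per_insert(INSERT_LEN, k)
    mean_copies = total_windows / distinct_kmers(k)
    print(f"k={k}: {distinct_kmers(k):,} distinct k-mers, "
          f"about {mean_copies:,.0f} windows per k-mer on average")
```

There are 65,536 possible 8-mers and 16,777,216 possible 12-mers, so moving from 8 to 12 positions multiplies the number of sequences to be measured by 256.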
According to Burge, the new technique's main competitor is the protein-binding microarray, or PBM. PBMs are widely used but, according to the study, are typically limited to "directly estimating binding enrichments to motifs of size eight nucleotides or fewer."
The team conducted a direct comparison of the two methods, lining up the HiTS-FLIP data from their experiment with available PBM data from a previous study by researchers at Brigham and Women's Hospital and Harvard Medical School.
"A PBM directly measures affinity to at most hundreds of thousands of oligonucleotides, while HiTS-FLIP collects [tens] or hundreds of millions of binding measurements," said Burge. "As a result, PBMs can directly estimate binding affinities to motifs of eight nucleotides or fewer, while the current version of HiTS-FLIP can measure affinities to motifs up to 12 nucleotides long, enabling characterization of more complex binding affinity landscapes."
To directly compare data quality, controlling for quantity, the group compared affinities to octamers estimated by HiTS-FLIP to those estimated by PBM and found that the HiTS-FLIP values consistently correlated better with published data on in vivo binding and activity. "The accuracy and throughput are both improved relative to PBMs, and we expect that HiTS-FLIP can detect lower affinity interactions because the wash step is much shorter [at] two minutes, versus 20 minutes for PBM," Burge said.
On a microarray, proteins can bind non-specifically to the surface, Burge said, which necessitates a long wash. "We don’t know what the surface of the flow cell is made of, that’s a trade secret … but it's designed to be like molecular Teflon," Burge said. Also, the optics of the Illumina system only register what is less than 100 nanometers from the surface of the flow cell, resulting in high discrimination between the bound proteins and those floating in solution.
In their paper, the group reported doing a two-minute wash of their flow cell, but wrote that similar results were obtained in a pilot experiment with no washing at all. "That's very important," Burge said. "It means the method has intrinsically low background, and it means you can hope to get quantitative biophysical measurements. If you do a lot of washing, you lose the weak binders and it's not quantitative at all."
Overall, the researchers report that HiTS-FLIP offered an appreciable, though modest, improvement in classification over PBMs. However, HiTS-FLIP predicted the magnitude of binding much more accurately, Burge said.
There are drawbacks to the technique, Burge said, particularly the relatively high cost. However, he said, "cost can be reduced in various ways, like applying multiple proteins to the same flow cell in succession, or in different lanes."
Additionally, "if you just want to know what the consensus 7-mer for a certain factor is, there are several ways to get there," he said. "You may not need all this detail."
Burge's team has no plans to commercialize the method. He said HiTS-FLIP "is an application that any researcher can do on their own, provided they can express tagged DNA binding factor and can work with their local sequencing facility to use the modified recipes needed."
"We are working to make the method straightforward to use, by providing needed recipes and software," he said, adding that Illumina also further optimized the method after the group's initial proof-of-concept experiments.
Burge also said the team has been talking with Illumina scientists about extending the approach to the HiSeq and MiSeq platforms.
"There are a few details to work out, but it looks promising," said Burge. "HiTS-FLIP on a HiSeq would give massive depth of information, enabling detailed characterization of factors with very long and complex binding motifs, like p53. The appeal of the MiSeq would be the lower cost per run and fast turnaround… The method might also be adaptable to platforms from other vendors," he said.
Most importantly for their own goals, Burge said, the researchers are now working on using HiTS-FLIP to measure RNA-protein interactions, the application they originally set out to pursue and one they think could have great impact because there is no established array-based method for directly examining RNA-protein binding.
"There is no PBM for RNA," Burge said. "My lab is working on [applying HiTS-FLIP to RNA] right now and we think we have a good solution."