NEW YORK – Next-generation sequencing technology from Illumina is helping biophysicists study DNA and protein binding by increasing the number of molecules they can interrogate in a single experiment.
In twin papers published Thursday in Science, researchers from Sweden's Uppsala University and The Netherlands' Leiden University and Delft University of Technology described their methods for using NGS to tie specific DNA molecules to imaging-based studies of molecular interactions of CRISPR-Cas9/guide RNA complexes and Holliday junctions, respectively.
"We're trying to find a relation between sequence and structure and function," said Leiden's John van Noort, senior author of the paper describing the method, called single-molecule parallel analysis for rapid exploration of sequence space (SPARXS). "We're basically automating the whole fluorescence microscopy workflow."
Both SPARXS and Uppsala's multiplexed single-molecule characterization at the library scale (MUSCLE) method use the Illumina MiSeq to link sequences with data from fluorescence microscopy. Molecules of interest are seeded on a MiSeq flow cell and analyzed with imaging; the imaging data are then connected with the spatial coordinates of the sequences.
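The coordinate-linking step can be illustrated with a minimal sketch. This is not either team's pipeline: the function name, the distance cutoff, and the toy coordinates are all hypothetical, and the sketch assumes the microscope and sequencer positions have already been registered into one shared coordinate frame, which in practice is likely a substantial step of its own.

```python
import math

def link_molecules_to_sequences(molecules, clusters, max_distance=1.0):
    """Pair each imaged molecule with the nearest sequenced cluster.

    molecules: list of (x, y) positions from fluorescence imaging.
    clusters: list of (x, y, sequence) tuples from the sequencing run.
    Both are assumed (hypothetically) to share one coordinate frame.
    Returns (molecule_index, sequence) pairs; molecules with no cluster
    within max_distance are left unassigned.
    """
    links = []
    for i, (mx, my) in enumerate(molecules):
        best_seq, best_dist = None, max_distance
        for cx, cy, seq in clusters:
            dist = math.hypot(mx - cx, my - cy)
            if dist <= best_dist:
                best_seq, best_dist = seq, dist
        if best_seq is not None:
            links.append((i, best_seq))
    return links

# Toy data, invented for illustration only.
molecules = [(10.2, 5.1), (40.0, 40.0), (99.0, 3.0)]
clusters = [(10.0, 5.0, "ACGTACGT"), (40.3, 39.8, "TTGACCAA")]
links = link_molecules_to_sequences(molecules, clusters)
print(links)
```

The brute-force search keeps the sketch dependency-free; at the scale of millions of molecules, a spatial index such as a k-d tree would be the more practical choice.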
Until now, single-molecule biophysics has been extremely limited by throughput. "Basically, you'd just pick a couple of representative sequences and do one experiment or settle on one sequence and maybe look at another," said Uppsala's Sebastian Deindl, senior author of the MUSCLE study. "It would take years and is a tedious, long, drawn-out process."
Van Noort added that, when looking at single fields of view, researchers were limited to around 100 different DNA molecules per experiment. In the new study, his team followed approximately 2 million molecules for about one minute. That meant they could probe the entire sequence space for 8-bp stretches of DNA. "It sounds small, but for our application it's sufficient to probe everything that's available," van Noort said.
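A quick back-of-the-envelope calculation shows why 2 million molecules is enough to cover every 8-bp sequence. The sketch below assumes four possible bases per position and, as a simplification not stated in the study, that molecules are spread evenly across sequences.

```python
BASES = 4                        # A, C, G, T
STRETCH_LENGTH_BP = 8            # length of the variable stretch
MOLECULES_FOLLOWED = 2_000_000   # approximate figure quoted above

# Number of distinct 8-bp sequences.
sequence_space = BASES ** STRETCH_LENGTH_BP

# Average molecules per sequence, assuming an even spread (a simplification).
mean_molecules_per_sequence = MOLECULES_FOLLOWED / sequence_space

print(sequence_space)                          # 65536
print(round(mean_molecules_per_sequence, 1))   # 30.5
```

Under those assumptions, each of the 65,536 possible sequences would be represented by roughly 30 molecules on average; real libraries are unlikely to be perfectly uniform.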
Deindl said his lab has been working on MUSCLE for nearly seven years. "At some point at a conference, I realized that [the SPARXS team] had a similar approach. My first instinct was I freaked out," he said. However, he approached the other researchers and proposed a "friendly competitive situation."
"I have to say, the other team was really amenable to that," he said. "There was still a bit of competition and timing mattered... but we were roughly on the same timeline." They made a point of communicating with each other and even submitted a joint cover letter to Science.
Van Noort said the projects remain "two parallel efforts," but the teams still wanted to find a common name for the technique because the two approaches are so similar. While his team analyzed Holliday junctions (four-way DNA structures that form during homologous recombination), they also included supplemental data showing the ability to analyze protein-DNA interactions.
For MUSCLE, the team showed proof-of-concept experiments calculating the equilibrium when an R loop is formed in Cas9-RNA binding. "We looked at time traces and determined what fraction is spent in one or the other state," Deindl said. "It really has both the thermodynamics and kinetics that are really [important] for understanding the mechanisms."
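The idea of reading both thermodynamics and kinetics from a time trace can be sketched as follows. This is an illustrative simplification rather than the MUSCLE analysis: it assumes a two-state molecule, a fixed threshold separating the states, and a hypothetical function name, threshold, and frame time. The equilibrium constant is estimated as the ratio of frames spent in each state, and dwell times are the lengths of uninterrupted runs.

```python
def two_state_summary(trace, threshold, frame_time_s):
    """Summarize a two-state single-molecule time trace.

    trace: signal values, one per frame (non-empty).
    threshold: value separating the 'low' state from the 'high' state.
    frame_time_s: duration of one frame in seconds.
    """
    states = [1 if value > threshold else 0 for value in trace]
    n_high = sum(states)
    n_low = len(states) - n_high

    # Collect consecutive runs of identical states as (state, length).
    runs = []
    for s in states:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])

    # The first and last runs are cut off by the recording window,
    # so only the runs in between count as complete dwells.
    dwells = {"low": [], "high": []}
    for state, length in runs[1:-1]:
        dwells["high" if state else "low"].append(length * frame_time_s)

    return {
        "fraction_high": n_high / len(states),
        # Occupancy ratio as an estimate of the equilibrium constant.
        "k_eq": n_high / n_low if n_low else float("inf"),
        "dwells_s": dwells,
    }

# Toy trace, invented for illustration only.
trace = [0.2, 0.2, 0.8, 0.8, 0.8, 0.2, 0.8, 0.8, 0.8, 0.2]
summary = two_state_summary(trace, threshold=0.5, frame_time_s=0.5)
print(summary)
```

Simple thresholding is used here only for clarity; noisy single-molecule data would normally call for a more robust state-assignment method.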
Deindl noted that MUSCLE immobilizes molecules on the flow cell with high efficiency, approximately 70 percent.
Both groups are trying to help spread the use of the NGS-based approach, though neither is pursuing commercialization. "We're doing everything we can to make it accessible," Deindl said, even providing designs for their 3D-printed mechanism that helps image the flow cell before sequencing. They've also begun some collaborations, though he did not provide any details.
Developing a robust data analysis pipeline was "particularly challenging, as single molecules are fragile and yield only a tiny amount of light, making the data inherently noisy," van Noort said. "Furthermore, the resulting data do not directly provide insights into how the sequence affects the structure and dynamics of DNA, even for the relatively simple DNA structures that we studied. To really test our understanding, we set up a model that incorporates our knowledge of the DNA structure, and compared it with the experimental data."
Illumina has not been actively involved in the project, van Noort said. "We've had some contact, but so far we've done this independently. It would be great if we could more flexibly adjust their workflow," including aspects of flow cell chemistry. He noted that moving the process to an Illumina HiSeq instrument is one way to increase throughput even more.
Van Noort plans to use SPARXS to answer outstanding questions about the role of sequence in chromatin structure, which in turn influences regulation of transcription. "The sequence mechanics of double-stranded DNA is heavily ignored," he said.
Deindl said he is looking to push the method to do more than just analyze nucleic acid interactions with other nucleic acids or proteins.
"The method is not limited to just studying the behavior of different DNA molecules," he said. "We can really look at proteins interacting with any [DNA barcoded] library."