It's not just what a DNA sequence says, but how it looks. NHGRI's Elliott Margulies and Boston University's Thomas Tullius found that certain topographical features of a stretch of DNA, particularly when conserved across species, are correlated with its function. "When we think of primary sequences being conserved throughout evolution, we think that those sequences are maintained because they do some important function. And now we are extending that onto the structural topography of DNA," Margulies says.
Margulies had been trying to think up new ways to look at DNA and study its function when he came across Steve Parker, who was investigating how DNA topography related to functional elements within a species. "This made me think, gosh, if we can convert this DNA topography across a bunch of species, it'll give us a different way of looking at similarities in DNA," Margulies says. Parker is the first author on the new Science paper.
To see how the structure of a length of DNA is affected by single base changes, Margulies and his colleagues determined the structural profiles of all possible 11-mer sequences that differ by a single substitution. Many single base changes didn't affect the structure of the DNA, but some affected it dramatically. "This, to me, makes the statement that, potentially, all 92 percent identical sequences weren't created equal," Margulies says.
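The idea of that comparison can be illustrated with a toy sketch in Python. This is not the researchers' method: the per-base scores, the sliding-window "profile," and every function name below are hypothetical stand-ins for a real structural measurement. It only shows how each single-base variant of an 11-mer might be enumerated and ranked by how far its profile moves from the original.

```python
import math

BASES = "ACGT"

# HYPOTHETICAL per-base scores, chosen only for illustration; not measured data.
TOY_SCORE = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}


def toy_profile(seq, window=3):
    """Sliding-window mean of the toy scores (placeholder for a real structural profile)."""
    scores = [TOY_SCORE[base] for base in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def single_substitution_variants(seq):
    """Yield (position, new_base, variant) for every single-base substitution of seq."""
    for i, old in enumerate(seq):
        for new in BASES:
            if new != old:
                yield i, new, seq[:i] + new + seq[i + 1:]


def profile_distance(p, q):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rank_substitutions(seq):
    """Return (distance, position, new_base) tuples, largest profile change first."""
    reference = toy_profile(seq)
    changes = [
        (profile_distance(reference, toy_profile(variant)), pos, new)
        for pos, new, variant in single_substitution_variants(seq)
    ]
    return sorted(changes, reverse=True)


# An arbitrary example 11-mer; each of its variants shares 10 of 11 bases with it.
ranked = rank_substitutions("ACGTACGTACG")
for distance, pos, new in ranked[:3]:
    print(f"position {pos} -> {new}: profile change {distance:.3f}")
```

Even in this toy model, variants that are equally similar in sequence differ in how much the profile shifts, which is the point Margulies makes about sequences with the same percent identity.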
The researchers then developed a computer program called Chai that takes into account structural information while it searches for evolutionarily conserved sequences. Margulies used the new algorithm and binCons, which is an established sequence-based conservation algorithm, to analyze comparative sequence data from the ENCODE project. "We are able to connect more sequences being evolutionarily conserved, which is interesting," Margulies says. He also says that when they used the same statistical cutoffs for each approach, Chai could detect twice as many conserved sequences as binCons. "The other thing is these Chai elements seem to be better coordinated with functional sequences," he says.
Margulies and his colleagues then looked into how structural changes could have an impact on biological function. They examined disease-associated SNPs in the ENCODE database and found that those SNPs are more likely to cause dramatic structural changes than neutrally evolving SNPs. Citing protein-binding assays, he also notes that "base changes that affected structure more dramatically also affected protein binding," giving a possible way to connect structure to disease. "There is this whole world trying to figure out how noncoding SNPs [can be] causative in disease and not just links to disease. One way is through protein binding," Margulies says.