Thanks to their group’s participation in the Mouse Sequencing Consortium, David Haussler and Gill Bejerano, a postdoc in Haussler’s lab at the University of California, Santa Cruz, were deliberately looking for conserved sequence regions between mouse and human. But what they discovered, since dubbed ultraconserved regions, came as quite a surprise even to them.
Haussler remembers that about a year ago Bejerano came to talk to him about a 200-base sequence that was identical in human, mouse, and rat. “I immediately started thinking with three genome sequences this can’t be an accident,” Haussler says. He asked Bejerano to figure out how many such sequences there were across the genome, and “he comes back with 481 sequences that are longer than 200 bases,” Haussler says. “I really didn’t expect anything remotely like that. I was floored.”
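For readers curious what a scan of this kind involves, here is a minimal sketch in Python. It is not the team's actual pipeline. It assumes the genomes have already been aligned into equal-length strings with "-" marking gaps, and the function name, the toy sequences, and the lowered length threshold in the usage lines are all hypothetical.

```python
def ultraconserved_runs(aligned, min_len=200):
    """Find stretches where every aligned sequence is identical.

    `aligned` is a list of equal-length strings from a multiple alignment
    (an assumption of this sketch). Returns (start, end) alignment-column
    ranges, end exclusive, that are at least `min_len` columns long and
    contain no gaps or ambiguous bases.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None  # first column of the run currently being extended
    for i in range(length):
        column = {seq[i].upper() for seq in aligned}
        conserved = len(column) == 1 and "-" not in column and "N" not in column
        if conserved:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and length - start >= min_len:
        runs.append((start, length))
    return runs


# Toy example with a threshold of 5 instead of 200 (illustrative data only).
human = "ACGTACGTAC-GTTT"
mouse = "ACGTACGTACAGTAT"
rat = "ACGTACGTACTGTTT"
print(ultraconserved_runs([human, mouse, rat], min_len=5))  # [(0, 10)]
```

The sketch makes a single pass over the alignment columns, so its cost grows linearly with alignment length. Producing the whole-genome alignment in the first place is the hard part, and it is not shown here.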
The regions tend to fall in noncoding sequence, and their function is still a mystery to Haussler’s team. “Before we could align these genomes one to another,” Bejerano says, no one would have suspected that there was anything interesting about these regions. “All of this comparative [work] is really enabling us to say this region which five minutes ago we thought was boring has to have something unique about it. Something special must be happening. … Something that maintains it so well even when it’s not functioning — that’s pretty amazing.”
With some of the regions conserved perfectly among human, mouse, and rat for as many as 800 bases, something is clearly going on. “It doesn’t match our usual model for regulatory regions,” Haussler says, adding that the regions have been found only in vertebrates; invertebrates have no comparable sequences. “The key mystery,” he adds, “is what molecular function would be going on that would require such extreme conservation for hundreds of bases.”
The initial discovery of these ultraconserved regions has set the stage for Haussler’s team to throw a battery of statistical tests at the data set. The regions could conceivably harbor disease-linked mutations, Haussler suggests, and tracking down the functional elements in noncoding sequence could be a boon to the more clinical side of genomics. Bejerano, at a minimum, will continue to examine the statistical characteristics of the regions to see whether they point to any function or purpose.
Bejerano adds, “I would be very surprised if there was a singular mechanism underlying all of [the regions].” He expects to find that the regions, possibly divided into groups, represent many different functions or are critical for different reasons.
Haussler is relishing the challenge of the seemingly cryptic conservation pattern. “I often feel like an explorer of a new continent,” he says. “If you just zoom into a random place with the UCSC genome browser you start looking around and you have this funny feeling that, ‘Wow, I’m the first person to ever really look at this.’ So much of it is uncharted territory.”
— Meredith Salisbury