Comparative Genomics Questions Abound with UCSC's Ultraconserved Sequences


Thanks to their group’s participation in the Mouse Sequencing Consortium, David Haussler and Gill Bejerano, a postdoc in Haussler’s lab at the University of California, Santa Cruz, were deliberately looking for conserved sequence regions between mouse and human. But what they discovered, since dubbed ultraconserved regions, came as quite a surprise even to them.

Haussler remembers that about a year ago Bejerano had come to talk to him about a 200-base sequence that was identical in human, mouse, and rat. “I immediately started thinking with three genome sequences this can’t be an accident,” Haussler says. He asked Bejerano to figure out how many such sequences there were across the genome, and “he comes back with 481 sequences that are longer than 200 bases,” Haussler says. “I really didn’t expect anything remotely like that. I was floored.”
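As a rough illustration of the kind of scan Haussler describes, the sketch below walks along a three-way alignment and reports every stretch where human, mouse, and rat carry the same base for at least a minimum number of columns. It is not the UCSC team's actual method. The function name, the assumption that the three sequences are already aligned to equal length with `-` marking gaps, and the decision to treat the length cutoff as inclusive are all hypothetical choices made for the example.

```python
def ultraconserved_runs(human, mouse, rat, min_len=200):
    """Return (start, end) pairs, end exclusive, for runs of alignment
    columns where all three sequences carry the same base.

    Assumes the inputs are pre-aligned strings of equal length and that
    '-' marks a gap; both are illustrative assumptions.
    """
    if not (len(human) == len(mouse) == len(rat)):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None  # index where the current identical run began, if any
    columns = zip(human.upper(), mouse.upper(), rat.upper())
    for i, (h, m, r) in enumerate(columns):
        # A column counts only if all three bases match and are real bases
        # (gaps and ambiguous characters such as 'N' break a run).
        if h == m == r and h in "ACGT":
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None

    # Close a run that extends to the end of the alignment.
    if start is not None and len(human) - start >= min_len:
        runs.append((start, len(human)))
    return runs


# Toy alignment: the three sequences differ only at column 8.
human = "ACGTACGTAAGG"
mouse = "ACGTACGTCAGG"
rat = "ACGTACGTAAGG"
print(ultraconserved_runs(human, mouse, rat, min_len=5))
```

On real data the same idea would be run with `min_len=200` over whole-genome alignments, which involves far more bookkeeping (chromosome coordinates, alignment blocks, strand) than this toy version shows.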

The regions tend to fall in noncoding sequence, and their function is still a mystery to Haussler’s team. “Before we could align these genomes one to another,” Bejerano says, no one would have suspected that there was anything interesting about these regions. “All of this comparative [work] is really enabling us to say this region which five minutes ago we thought was boring has to have something unique about it. Something special must be happening. … Something that maintains it so well even when it’s not functioning — that’s pretty amazing.”

Clearly, with some of the regions that are perfectly conserved among human, mouse, and rat stretching as long as 800 bases, something is indeed going on. “It doesn’t match our usual model for regulatory regions,” Haussler says, adding that the regions have been found only in vertebrates; invertebrates have no comparable sequences. “The key mystery,” he adds, “is what molecular function would be going on that would require such extreme conservation for hundreds of bases.”

The initial discovery of these ultraconserved regions has set the stage for Haussler’s team to throw a battery of statistical tests at the data set. The regions could conceivably harbor disease-linked mutations, Haussler predicts, and tracking down the functional elements in noncoding sequence could be a boon to the more clinical side of genomics. Bejerano will continue to examine the statistical characteristics of the regions, at a minimum, to see whether they point to any function or purpose.

Bejerano adds, “I would be very surprised if there was a singular mechanism underlying all of [the regions].” He expects to find that the regions, possibly divided in groups, represent many different functions or that they’re critical for different reasons.

Haussler is relishing the challenge of the seemingly cryptic conservation pattern. “I often feel like an explorer of a new continent,” he says. “If you just zoom into a random place with the UCSC genome browser you start looking around and you have this funny feeling that, ‘Wow, I’m the first person to ever really look at this.’ So much of it is uncharted territory.”

— Meredith Salisbury

