NEW YORK (GenomeWeb News) – The DNA sequence patterns residing near insertions and deletions can offer clues about how these indels form, according to a paper in the latest issue of Genome Research.
A trio of researchers from Pennsylvania State University used wavelet transformation analyses, among other methods, to investigate how local DNA sequence patterns relate to insertions and deletions in the non-coding, non-repeat regions of the human genome. Their results suggest that although the two types of mutations share some local sequence features, the patterns around each are relatively distinct. Overall, the team concluded, two main processes, replication and recombination, appear to drive indel formation in non-coding parts of the genome.
"We were surprised to find that the patterns of DNA sequences surrounding insertions versus deletions are unique because scientists previously have lumped the two types of mutations together," lead author Erika Kvikstad, a genomics and bioinformatics graduate student in Kateryna Makova's lab at Penn State, said in a statement.
"What's striking is that most insertions and deletions are thought to occur by replication errors," Kvikstad added, "and, while this is a primary source generating the mutations, we discovered that recombination also is very important."
To investigate how local sequence affects indel formation, the team identified indels in the non-coding, non-repeat part of the human genome through comparisons with chimpanzee and macaque sequence. They then sub-divided the human genome into five different "sub-genomes."
From there, the team scoured the genome for motifs related to various sequence-altering processes, such as recombination, replication stalling, or DNA stability, and assessed how frequently different motifs appeared in each of the sub-genomes.
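For readers who want a concrete picture of this kind of motif tally, the following is a minimal sketch in Python. It is not the researchers' code: the function name `motif_rate`, the sub-genome labels, the toy sequences, and the motif "TGGA" are all hypothetical placeholders chosen for illustration.

```python
import re


def motif_rate(sequences, motif):
    """Return motif occurrences per 1,000 bases across a list of DNA strings."""
    # A lookahead pattern lets overlapping occurrences be counted as well.
    pattern = re.compile(f"(?={re.escape(motif)})")
    total_hits = sum(len(pattern.findall(seq)) for seq in sequences)
    total_bases = sum(len(seq) for seq in sequences)
    return 1000.0 * total_hits / total_bases if total_bases else 0.0


# Toy data only: labels, sequences, and motif are hypothetical placeholders,
# not values from the study.
sub_genomes = {
    "insertion_flanks": ["ACGTTGGAACGT", "TGGATGGACCCC"],
    "deletion_flanks": ["ACGTACGTACGT", "CCCCTGGAAAAA"],
}

rates = {name: motif_rate(seqs, "TGGA") for name, seqs in sub_genomes.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} motif hits per kb")
```

Normalizing by total sequence length is one simple way to make counts comparable between groups of different sizes; the study itself may have used a different normalization.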
Their results suggest that although sequences around insertions and deletions share some features, some motifs are more prevalent around one type of mutation or the other. For instance, they reported that sequences characteristic of DNA polymerase pausing or frameshifting — replication-related processes — were more frequently found near insertions.
On the other hand, sequences susceptible to topoisomerase cleavage — a process that can lead to recombination — were preferentially located in the vicinity of deletions.
Next, the researchers used the wavelet transformation method to try to tease apart how the patterns of different sequence motifs changed over various genomic distances — from small to large scales. For example, they noted that DNA polymerase pause or frameshift motifs could typically be found close to indels — within 10 or 20 base pairs of deletions and within about 80 base pairs of insertions.
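As a rough illustration of how a wavelet transform separates small-scale from large-scale patterns, here is a minimal sketch using the Haar wavelet. The choice of Haar, the function name `haar_detail_energy`, and the toy motif counts are assumptions made for this example; the article does not say which wavelet or software the team used.

```python
import numpy as np


def haar_detail_energy(signal):
    """Return the energy of Haar detail coefficients at each scale, finest first."""
    approx = np.asarray(signal, dtype=float)
    if approx.size == 0 or approx.size & (approx.size - 1):
        raise ValueError("signal length must be a power of two")
    energies = []
    while approx.size > 1:
        pairs = approx.reshape(-1, 2)
        # Detail coefficients capture differences between neighbors at this scale.
        detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)
        # Approximation coefficients carry the smoothed signal to the next scale.
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)
        energies.append(float(np.sum(detail ** 2)))
    return energies


# Hypothetical motif counts in eight consecutive windows along a sequence.
fine_pattern = [4, 0, 4, 0, 4, 0, 4, 0]    # varies from window to window
coarse_pattern = [1, 1, 1, 1, 0, 0, 0, 0]  # varies only across the whole span

print(haar_detail_energy(fine_pattern))
print(haar_detail_energy(coarse_pattern))
```

In this toy case, the rapidly alternating signal puts its energy in the finest scale, while the slowly changing signal puts its energy in the coarsest one, which is the kind of scale separation the wavelet approach is meant to expose.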
Based on their findings, the researchers suggested that replication and recombination both contribute to the genesis of insertions and deletions. But, they emphasized, the motif patterns around each type of mutation tend to be quite distinct.
"Our present study reinforces the previous observation that small insertions and deletions arise, at least in part, by different mechanisms: subtle differences in the interplay of replication and recombination processes further distinguish insertions from deletions at the DNA context level," the authors wrote.
And, researchers say, the results may ultimately have implications for understanding how insertions and deletions arise, as well as their role in genetic diseases. "Males undergo more rounds of DNA replication than females and the number of replication rounds increases with a male's age," senior author Makova said in a statement. "If we know that a disease is due to a replication error rather than a recombination error, doctors can provide better genetic counseling to couples."