NEW YORK (GenomeWeb News) – New research is highlighting the interplay between genome structure and short DNA repeats — and revealing how this relationship could affect gene expression, protein function, and disease risk.
In a paper scheduled to appear in an upcoming issue of the journal Genome Research, a team of researchers from the US and the UK used structural analyses to determine why some microsatellites, or short DNA repeats, are more common in vertebrate genomes than others. Their findings indicated that the number and length of DNA repeats, which expand in certain diseases, are determined by the structure of the repeats themselves and by the way the DNA bases stack on top of one another in the genome.
“Researchers knew that some DNA repeats are found to be very abundant in the human genome … whereas others are extremely rare and sometimes absent, but they did not know why,” lead author Albino Bacolla, a researcher at Texas A&M University’s Center for Genome Research, said in a statement.
Although microsatellites are common in vertebrate genomes, certain repeats are more common than others. In addition, the length of repeats can vary depending on the loci examined and seems to be influenced by the sequence composition of each repeat.
The lengths and characteristics of nucleotide repeats are of interest to researchers, in part, because of their links to disease. For instance, the expansion of three-nucleotide repeats has been associated with neurological disease. Variation in microsatellite length and stability also seems to drive the formation of some cancers.
In an effort to understand the nature and distribution of these repeats in vertebrate genomes, the researchers created 82 synthetic oligonucleotides representing different three- and four-nucleotide repeats. They then looked at how often each of these was present in short repeats in the human reference genome and eight additional vertebrate genomes.
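To give a rough sense of what tallying short repeats in a genome sequence involves, here is a minimal Python sketch. It is not the authors' pipeline, which the article does not describe. The function name `count_repeat_tracts`, the `min_copies` threshold, and the toy sequence are all assumptions made for illustration.

```python
import re
from collections import Counter


def is_primitive(unit):
    """Return True if the unit is not itself a repeat of a shorter motif."""
    k = len(unit)
    return not any(
        k % d == 0 and unit == unit[:d] * (k // d) for d in range(1, k)
    )


def count_repeat_tracts(sequence, unit_lengths=(3, 4), min_copies=3):
    """Count tandem repeat tracts in a DNA string, keyed by repeat unit.

    Hypothetical helper: a tract is a unit of 3 or 4 bases repeated
    back to back at least `min_copies` times (an assumed threshold).
    """
    sequence = sequence.upper()
    counts = Counter()
    for k in unit_lengths:
        # One unit of k bases, then at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            # Skip units such as AAA or ACAC, which are really
            # mono- or dinucleotide repeats.
            if is_primitive(unit):
                counts[unit] += 1
    return counts


# Toy sequence (made up) with one CAG tract and one GATA tract.
toy = "TTCAGCAGCAGCAGTTGATAGATAGATATT"
result = count_repeat_tracts(toy)
```

The sketch reports each unit in the phase and strand in which it is first encountered, so CAG, AGC, and GCA tracts would be counted separately. A real survey would also need to merge such equivalent units and their reverse complements.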
They found that repeats forming more stable hairpin or quadruplex structures were less likely to appear in vertebrate genomes. The authors postulated that such repeats are systematically weeded out of the genome during evolution, either because they are more prone to DNA repair and get chopped out of the genome or because they are overlooked by the DNA replication machinery.
On the other hand, repeats that form unstable structures were more common in the vertebrate genomes, particularly in coding regions. The team discovered that some of the mini- and microsatellites were recruited to regulatory genes functioning in transcription and signaling pathways — especially nervous system signaling pathways.
The results support the notion that specific repeats can influence cellular processes, such as gene expression and protein function.
The researchers also found evidence that repeats are influenced by the way DNA is stacked and interacts with itself in the genome. Their data indicate that the repeats most prone to expansion are those found in regions of the genome with high base stacking energy.
“[C]ertain DNA secondary structures have prevented the accumulation of specific repeating sequences in the genome over evolutionary time,” the authors suggested, “whereas strong base stacking interactions have favored their expansion.”
Robert Wells, director of the Center for Genome Research at Texas A&M and senior author on the paper, called the findings “a simple and elegant molecular explanation at why chromosomes, our genetic material, contain certain types of repeating sequences.”
This intimate relationship between repeats and gene and protein function may also start to explain how unwanted expansions and mutations in DNA repeats predispose some individuals to disease, the authors noted. In addition, since DNA sequences could become rearranged or unnecessarily chopped from the genome when hairpins or other stable repeat structures form, understanding the structural nature of these repeats could prove crucial for understanding some disease states.