NEW YORK (GenomeWeb) – Researchers at the University of California, Davis are working on an assay to record the CGG repeat count and methylation status of the fragile X mental retardation gene using Pacific Biosciences' long-read sequencing platform, with the ultimate goal of developing a screening and diagnostic test that improves on current approaches.
During a workshop sponsored by Roche at the American College of Medical Genetics and Genomics annual meeting in Tampa last month, Paul Hagerman, a professor of biochemistry and molecular medicine at UC Davis, reported on his team's recent advances and spoke with GenomeWeb last week about their work.
In 2012, his group published a paper describing how they sequenced CGG repeat alleles of the fragile X gene using the PacBio platform.
Fragile X syndrome is a leading cause of inherited intellectual disability, and almost half of patients are also diagnosed with autism. The disorder is associated with an expansion of the number of CGG trinucleotide repeats in the 5' untranslated region of the fragile X mental retardation 1 (FMR1) gene, located on the X chromosome. This leads to increased methylation in the FMR1 promoter and CGG repeats and ultimately shuts down the expression of the FMR protein, which has an important role in synaptic development.
Unaffected individuals usually have fewer than 55 CGG repeats in the FMR1 gene. Premutation carriers have alleles with 55 to 200 CGG repeats and are at risk for fragile X-associated tremor/ataxia syndrome (FXTAS), a neurodegenerative disease. In addition, women who are premutation carriers are predisposed to premature infertility.
Full mutation alleles, which often develop in female premutation carriers during meiosis, contain more than 200 CGG repeats and tend to be hypermethylated and silenced.
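The repeat-count categories described above can be written down as a short lookup. The following Python sketch is illustrative only: the function name and category labels are hypothetical, the thresholds are the ones quoted in this article, and the result is a label for allele size, not a prediction of symptoms.

```python
def classify_fmr1_allele(cgg_repeats: int) -> str:
    """Label an FMR1 allele by CGG repeat count.

    Thresholds follow the ranges quoted in the article:
    fewer than 55 repeats, 55 to 200 repeats, and more than 200 repeats.
    The function name and labels are hypothetical, chosen for illustration.
    """
    if cgg_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if cgg_repeats < 55:
        return "normal"
    if cgg_repeats <= 200:
        return "premutation"
    return "full mutation"


if __name__ == "__main__":
    # Example repeat counts, chosen arbitrarily to exercise each range.
    for count in (30, 55, 90, 200, 450):
        print(count, classify_fmr1_allele(count))
```

Because a single individual can carry a distribution of alleles, a label like this would apply to each sequenced molecule rather than to the person as a whole.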
However, in terms of symptoms, the boundaries are fluid. "You can certainly see individuals who are clearly symptomatic who are in the upper end of the premutation range, and those who are high functioning in the low end of the full mutation range," Hagerman said.
Fragile X syndrome is currently diagnosed using PCR-based testing to determine the number of CGG repeats and restriction enzyme-based methods to gauge methylation status at specific sites.
But the FMR1 locus is highly polymorphic, both in terms of repeat length and methylation status. "You might have one individual who has 30 to 40 different alleles, a whole distribution. They may have some repeats that are in the 300, 400, 500 repeat range, others that may be in the 80 or 90 repeat range, and depending on the ratios of those repeats, that can profoundly affect the degree of clinical involvement, so we want to be able to characterize that in an unbiased fashion," Hagerman said.
"We're [also] very interested in knowing what the pattern of methylation is, because there is a lot of methylation mosaicism in these patients," he said, "and we want to know what patterns of methylation permit gene activity and which ones cause it to silence." In addition, he said, he and his colleagues want to find out whether methylation occurs first in the promoter and then spreads to the CGG repeats, or vice versa.
"From the research standpoint, we're very interested in knowing the patterning and the temporal aspects of the methylation process," he said. "From the standpoint of clinical interest, we're interested in knowing: 'Can you predict the clinical outcome of a particular case based on the sequence and methylation pattern?'"
Pacific Biosciences' single-molecule real-time sequencing platform lends itself to these types of studies, he said, because its long reads can determine the size of the CGG repeats, and at the same time, it can record epigenetic modifications to the DNA. "That's a real advantage for us, being able to do direct sequencing and identification of the methylation groups," he said.
To be able to specifically target the FMR1 gene, Hagerman and his team, in collaboration with PacBio scientists, have developed a single locus enrichment method that requires no amplification, which they published earlier this year in Molecular Genetics and Genomics.
The method, which can enrich a locus almost 700,000-fold, takes advantage of a class of restriction enzymes that cut a short distance away from their DNA recognition site. That way, each locus of interest will have overhangs with a specific sequence after the enzyme has cut, which can be targeted with complementary adapters. In the next step, these adapters protect the fragment from exonuclease treatment, which destroys the rest of the DNA. "The beauty of the method is that it allows you to substantially enrich the fragment of interest simply because you have designed the adapters to target your region of interest," Hagerman said.
Prior to using the method to sequence the FMR1 region in patient samples, the researchers are working on further improving it. One thing that needs to be optimized is the amount of target in the library, Hagerman said, which depends in part on how many correct versus incorrect ligation events take place. "We're basically tuning the steps in the process to optimize yield of the targeted fragment," he said.
He hopes to have the method optimized within a few months, after which his team plans to validate the approach on known clinical samples. Following that, they plan to use it on a large number of patient samples from an existing collection.
"We would like to use that information for predicting clinical outcome," he said. "We have probably the world's largest population of fragile X individuals — we have a few thousand samples — so we can go through a very large range of methylation stages and CGG repeat lengths, trying to correlate the known phenotypes with the methylation status."
So far, Hagerman's group has used the PacBio RS II platform for its work, outsourcing most of its sequencing to a lab at Washington State University that has been able to tweak the sequencing protocol to allow for smaller amounts of input DNA.
Each sample currently costs on the order of $350 to $400 to analyze on the PacBio system, Hagerman said, but his team is working on barcoding methods that would allow it to multiplex several samples, lowering the cost per sample.
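To make the multiplexing arithmetic concrete, here is a rough Python sketch. It assumes, purely for illustration, that the quoted $350 to $400 is a fixed cost per sequencing run that would be split evenly across barcoded samples, and it ignores any added cost of barcoding itself. The multiplexing levels shown are hypothetical and were not reported by Hagerman.

```python
def cost_per_sample(run_cost_usd: float, samples_per_run: int) -> float:
    """Split a fixed run cost evenly across multiplexed samples.

    Assumption (not from the article): the run cost stays the same
    regardless of how many barcoded samples are pooled.
    """
    if samples_per_run < 1:
        raise ValueError("at least one sample per run is required")
    return run_cost_usd / samples_per_run


if __name__ == "__main__":
    # $350 and $400 are the bounds quoted in the article;
    # the multiplexing levels below are hypothetical.
    for n in (1, 2, 4, 8):
        low = cost_per_sample(350, n)
        high = cost_per_sample(400, n)
        print(f"{n} sample(s) per run: ${low:.2f} to ${high:.2f} each")
```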
They also hope to use the PacBio Sequel in the future, which has a higher throughput. "It doesn't matter to me what SMRT sequencing platform we use for this initial optimization, because this really has to do with the chemistry of developing the libraries, and once we have them developed, then having a machine that has [higher throughput], and uses less material, and makes longer reads would just improve things," he said.
Nanopore sequencing technologies, such as Oxford Nanopore's MinION, could characterize fragile X samples in a fashion similar to PacBio's platform, but Hagerman said the technology is not yet mature enough. "We'd like to see them further develop their platform and demonstrate its capabilities," he said. "In principle, such a platform would allow us to get similar information. But in practice, I'm still reserving judgment."
Eventually, the direct sequencing and methylation analysis approach could have broad applications in screening and diagnostics, he said, including carrier and newborn screening. "There are different stages at which such screening can take place, and I think this will fit just in with the current armamentarium," he said.