NEW YORK – Researchers at Children's Mercy Kansas City are among the first to try out a new capability from Pacific Biosciences: the ability to measure both genetic and epigenetic variation in a single sequencing run.
With a software update available now, whole-genome sequencing runs on PacBio's Sequel IIe instrument will report 5-methylcytosine (5mC) patterns by default.
"Anyone running the HiFi pipeline will get the four standard bases plus methylation calls for CpG motifs," said PacBio VP of Product Marketing David Miller. Previously, evaluating rare disease cases for both sequence and methylation variation on the PacBio platform required multiple tests.
As part of an expanded collaboration, the Children's Mercy researchers will analyze this combination of data for 100 genomes to look for extreme cases of genome methylation.
"We have had success using five-base HiFi sequencing at Children's Mercy Kansas City to identify abnormal methylation in repeat expansion cases, and we plan to apply it to all the future genomes we sequence," Emily Farrow, director of laboratory operations at Children's Mercy and associate professor of pediatrics at the University of Missouri Kansas City School of Medicine, said in a statement.
Tomi Pastinen, director of the Children's Mercy Genomic Medicine Center, added that his team would also be combining those data with PacBio's Iso-Seq RNA sequencing. "We are now working on integrated analyses across hundreds of our unsolved cases and we hope to identify new, previously veiled genomic variants that may be associated with rare disease," he said.
PacBio's chemistry has long been able to detect epigenetically modified bases by analyzing hiccups in base incorporation at those and nearby bases. These hiccups in polymerase activity show up as changes in the time between fluorescence pulses, called the interpulse duration (IPD). PacBio's algorithm uses these differences in IPD to make the epigenetic modification calls.
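To make the idea concrete, here is a toy sketch of how a slowdown in IPD might be flagged. It is not PacBio's algorithm, which the article does not describe in detail: the function names, the per-position baseline values, and the 1.5 threshold are all assumptions chosen for illustration.

```python
from statistics import mean

def ipd_ratios(observed_ipds, baseline_ipds):
    """Divide the mean observed IPD at each position by an expected baseline.

    observed_ipds: one list of IPD measurements (seconds) per position.
    baseline_ipds: hypothetical expected IPD for unmodified DNA at each position.
    """
    return [mean(obs) / base for obs, base in zip(observed_ipds, baseline_ipds)]

def flag_slow_positions(ratios, threshold=1.5):
    """Return positions where the polymerase paused noticeably longer than expected."""
    return [i for i, ratio in enumerate(ratios) if ratio >= threshold]

# Made-up measurements for three positions; the middle one shows a pause.
observed = [[0.20, 0.22, 0.18], [0.61, 0.58, 0.64], [0.19, 0.21, 0.20]]
baseline = [0.20, 0.20, 0.20]

flagged = flag_slow_positions(ipd_ratios(observed, baseline))
print(flagged)  # [1]
```

A production caller would presumably model the sequence context and the signal at neighboring bases as well, as the description above suggests, rather than apply a single fixed cutoff.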
The PacBio platform can detect other types of modifications besides 5mC, but only that modification will be included in the BAM file for every run on updated Sequel IIes, for free. "And by free, we mean no change to protocol in terms of the wet lab, no change to sample inputs, no change to informatics; it all comes straight out of the instrument," Miller said. The change does increase data file size by up to 5 percent, he noted.
PacBio CEO Christian Henry previewed this feature in February on a call with investors. Now, the firm is releasing it as part of its SMRT Link v11.0 software, alongside other improvements, including new SMRTbell kits that halve sample prep time and lower DNA input requirements for human genome sequencing to 3 micrograms per genome.
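For readers who want to inspect such calls, the sketch below decodes base-modification tags of the kind defined in the SAM optional-fields specification: MM for positions and ML for probabilities. That the updated instrument writes its calls in these tags is an assumption here, since the article says only that they appear in the BAM file. The read, the tag values, and the function name are invented for illustration, and the sketch handles forward-strand reads only.

```python
def decode_5mc_calls(seq, mm, ml):
    """Return (read_position, probability) pairs for 5mC calls.

    seq: read sequence as stored in the record (forward strand assumed).
    mm:  MM tag string, e.g. "C+m,0,0,1;" (counts of unmodified Cs to skip).
    ml:  ML tag values, one 0-255 integer per call listed in MM.
    """
    calls = []
    ml_index = 0
    for entry in mm.split(";"):
        if not entry:
            continue
        fields = entry.split(",")
        code = fields[0].rstrip("?.")  # drop the optional "?" or "." suffix
        skips = [int(x) for x in fields[1:]]
        if code != "C+m":
            ml_index += len(skips)  # keep ML aligned past other modification types
            continue
        c_positions = [i for i, base in enumerate(seq) if base == "C"]
        cursor = 0
        for skip in skips:
            cursor += skip
            # ML value v stands for the probability interval [v/256, (v+1)/256);
            # use the midpoint of that interval.
            calls.append((c_positions[cursor], (ml[ml_index] + 0.5) / 256))
            ml_index += 1
            cursor += 1
    return calls

# Made-up read with CpG sites at positions 1, 4 and 8.
calls = decode_5mc_calls("ACGTCGACCG", "C+m,0,0,1;", [250, 10, 128])
print([pos for pos, _ in calls])  # [1, 4, 8]
```

In practice a BAM library would supply the sequence and tags for each read; the decoding is written out by hand here so that the example stays self-contained.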
PacBio's process enables methylation sequencing without the use of bisulfite chemical conversion in sample preparation, which can be harsh and can miss certain SNPs. As with standard bisulfite sequencing, PacBio's method cannot distinguish 5mC from 5-hydroxymethylcytosine. PacBio's is not the only long-read sequencing platform that can detect epigenetic modifications: Oxford Nanopore Technologies can call several modifications in DNA and RNA based on raw signal patterns.
Children's Mercy was able to try out this new algorithm during an early-access partnership. Now, the organizations are expanding their collaboration on diagnosing rare childhood disease using long-read sequencing.
Pastinen's team is using the algorithm to generate methylation calls for samples from several rare disease cohorts that were previously sequenced with PacBio's platform.
"Some specific types of mutations, such as repeat expansions, can cause methylation signatures in the diseased allele," Pastinen explained. Repeat expansions are not necessarily straightforward to detect, so having methylation data can increase the confidence of calling those variants.
For example, the Children's Mercy team has associated hypermethylation with a repeat expansion in the DMPK gene of a patient diagnosed with the form of muscular dystrophy linked to that gene, helping them make a diagnostic variant call. "This was a known case where we knew long read had challenges in reading through whole repeat," Pastinen said. With the help of the associated methylation, they were able to make the call.
They'll also look at "extreme methylation outliers and how they play into rare disease genetics," he added. "Yesterday, I picked up a new repeat expansion that silenced alleles in a potentially new disease. That's the next frontier here, with all these unsolved genomes we can annotate some function to them using the direct methylation [detection] in parallel."
In addition to aiding diagnosis, methylation data "has really exciting prospects of new genome biology," he said, especially around parent-of-origin effects, also called parental imprinting. Bisulfite sequencing on short-read platforms, namely Illumina's, doesn't provide good visibility on the parent of origin for a methylated strand of DNA.
Now, Pastinen's team will be tagging whether DNA from across the genome comes from someone's mother or father. "The early results are stunning," he said. "In germ cell development, most of the methylation in the genome is erased, but there are a few loci in eukaryotic genomes that are left methylated." Those regions receive either the paternal or the maternal methylation pattern. "These loci are thought to be critical in development and are involved in disease," he said.
Moreover, much of the methylation variation in the human genome is encoded by sequence, Pastinen said. "A lot of times people have investigated these two separate features separately," he said, but now they can be analyzed in tandem, from the same sample.
Methylation data can also be helpful in plant and animal genome research, a popular application of PacBio sequencing.
"We found that the CpG methylation patterns detected in tomato and maize genomes using HiFi sequencing are highly concordant to standard bisulfite sequencing, but bring power to resolve transposable elements and other sequences that are out of reach with short reads," Michael Schatz, a computational biologist at Johns Hopkins University, said in a statement.
"When combined with the incredible capabilities of HiFi sequencing for genome assembly and variation analysis, this creates an unmatched opportunity for ultra-high-quality genome and epigenome analysis of plant and vertebrate genomes from a single data type," Schatz added.
In addition to launching new studies, researchers may also want to reprocess older HiFi sequencing results, Miller said. "The signal was always there. This is extracting that and putting that into the BAM file."