A group of researchers at Children's Mercy Hospital, led by Stephen Kingsmore, is using the "rapid mode" capabilities of Illumina's HiSeq2500 combined with novel software tools to sequence the genomes of infants entering the hospital's neonatal intensive care unit, in the hope of diagnosing rare monogenic diseases that present with serious symptoms early in life.
Kingsmore described his early experience using the rapid sequencing method, which his team is calling STAT-Seq, in a webinar presented by Illumina last month. So far the researchers have analyzed samples from 16 patients with a turnaround time hovering around two days from sample preparation to results, he told Clinical Sequencing News in an email this week. He added that his team plans to continue evaluating the STAT-Seq approach, and expects to be able to reduce the time to final result to 36 hours.
In his webinar, Kingsmore detailed the group's experience with its first nine cases, comparing data quality of the STAT-Seq approach — which involves sending samples to Illumina for sequencing and using a computer method to narrow possible mutations — with the hospital's current method for sequencing monogenic disease using the HiSeq2000, a predecessor to the HiSeq2500.
According to Kingsmore, the researchers were happy to find that rapid sequencing with the HiSeq2500 gave data of very similar quality to what they have achieved with the HiSeq2000. Both technologies identified approximately 80 to 85 percent of the same variants, and among those that were in common, the group found "extremely high rates of genotype concordance," he said during the webinar.
Kingsmore also reported that the initial use of STAT-Seq allowed the researchers to identify known causes of disease, identify new disease-causing variants in genes, identify novel disease genes, and rule out diseases.
Children's Mercy has been at work since last year on validating a targeted sequencing-based test for childhood Mendelian disorders using the HiSeq2000. The hospital is expected to finish that validation and begin offering the service clinically this year (CSN 2/22/2010).
Kingsmore explained in the webinar that his team's study was prompted when the group was approached by Illumina with an opportunity to use its new HiSeq2500 platform, which the company officially launched in January. The group hit on NICU monogenic disease screening as a potential application for the 2500 rapid run mode.
He explained that while molecular diagnoses are almost never made for patients in the NICU, there is a tremendous need to diagnose these disorders as early as possible.
"The NICU is very costly, about $3,000 to $8,000 per day, not including costs of testing and treatment, and everything that occurs is an emergency, and involves tremendous urgency, so that's where we get STAT-Seq," Kingsmore said. Molecular diagnosis in the NICU would allow more targeted and rapid treatment, reducing morbidity and mortality, he added. And for infants with diseases that have no curative treatment, answers earlier in life could help avoid unnecessary treatment and inform care that better preserves patients' dignity.
The HiSeq2500 offers two run modes: a high-output configuration similar to the HiSeq2000 and a rapid run configuration that can generate 120 gigabases in about one day from 2x100-base reads.
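To put the rapid run yield in perspective, the arithmetic can be sketched as below. This is a rough back-of-the-envelope calculation, not a figure from the webinar: the human genome size of about 3.1 gigabases is an assumption supplied here, the function names are illustrative, and the result is nominal raw depth before any read filtering or duplicate removal.

```python
def mean_coverage(total_bases: float, genome_size: float = 3.1e9) -> float:
    """Nominal average depth: total sequenced bases divided by genome size.

    genome_size defaults to ~3.1 Gb, an assumed approximate human genome size.
    """
    return total_bases / genome_size


def read_pairs(total_bases: float, read_length: int = 100) -> float:
    """Number of paired-end fragments needed; each pair contributes 2 * read_length bases."""
    return total_bases / (2 * read_length)


# 120 gigabases from 2x100-base reads, as described for the rapid run mode.
print(round(mean_coverage(120e9), 1))  # 38.7 (roughly 39x nominal depth)
print(read_pairs(120e9))               # 600000000.0 read pairs
```

Under these assumptions, a single rapid run yields on the order of 39x raw depth for one human genome.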
In an introduction to Kingsmore's webinar presentation, Joel Fellis, Illumina's senior product manager of sequencing systems, said that the company's internal experiments from isolated DNA to final variant calls now regularly run under 50 hours and that Illumina believes it can reduce this further by the end of the year.
Kingsmore's STAT-Seq method takes advantage of this rapid run capability by using software tools that narrow the analysis to a set of target genes, making it more specific and faster. In his webinar presentation, he reported that the group saw a sequencing run time of around 25 hours for seven of the nine samples that could be analyzed in the rapid run mode. The group shipped samples to Illumina's Chesterford, UK offices for sequencing and aimed to arrive at a final result in about two days, he said.
The group's STAT-Seq process starts with preparing samples for subsequent sequencing on the HiSeq2500. While waiting for sequencing results, Kingsmore said the group uses a software system called SSAGA — symptom and sign assisted genome analysis — to map clinical features to a set of target genetic diseases, which helps restrict NGS analyses to a smaller target set.
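The general idea of mapping clinical features to a restricted gene set can be sketched as follows. This is a hypothetical illustration, not the SSAGA software itself: the lookup tables, the function name `candidate_genes`, and the `min_matches` parameter are invented for the example, and the feature lists are simplified placeholders (the gene assignments HEXA for Tay-Sachs disease and ATP7A for Menkes syndrome are well established).

```python
from typing import Dict, Iterable, Set

# Hypothetical lookup tables for illustration only (not SSAGA's real data).
DISEASE_FEATURES: Dict[str, Set[str]] = {
    "Tay-Sachs disease": {"developmental regression", "cherry-red spot", "seizures"},
    "Menkes syndrome": {"kinky hair", "seizures", "hypotonia"},
}
DISEASE_GENES: Dict[str, Set[str]] = {
    "Tay-Sachs disease": {"HEXA"},
    "Menkes syndrome": {"ATP7A"},
}


def candidate_genes(features: Iterable[str], min_matches: int = 1) -> Set[str]:
    """Return genes for every disease sharing at least min_matches observed features."""
    observed = set(features)
    genes: Set[str] = set()
    for disease, known_features in DISEASE_FEATURES.items():
        if len(observed & known_features) >= min_matches:
            genes |= DISEASE_GENES[disease]
    return genes


# A patient with seizures and kinky hair; requiring two matches narrows the set.
print(candidate_genes(["seizures", "kinky hair"], min_matches=2))  # {'ATP7A'}
```

A real system would draw on curated disease and phenotype databases rather than hand-written tables, but the effect is the same: downstream variant analysis only needs to consider the returned genes.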
Kingsmore said this approach allows the group to prioritize genomic analyses and also fits current guidelines for genetic testing in children.
To date, the researchers have analyzed 16 patients. In the webinar, Kingsmore discussed nine cases, two of which were retrospective, with known molecular diagnoses, and seven of which were prospectively analyzed. He said the group compared data quality and results it received using the STAT-Seq method with its current program of targeted sequencing with the HiSeq2000 at 800x coverage.
Overall, the 2500 results were highly concordant, according to Kingsmore. "The finding was that we had about 80 to 85 percent variants in common to both technologies. Both are looking at the genome in different ways so we would expect each would find variants the other would miss," he said. Among the shared variants, genotype concordance between the two methods was almost 100 percent.
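The two figures measure different things: how many variant sites both platforms report, and how often the genotype calls agree at the shared sites. A minimal sketch of that comparison is below. It is illustrative only: the data layout, the function name `compare_calls`, and the choice of the union of both call sets as the denominator for overlap are assumptions, since the webinar summary does not specify how the percentages were computed.

```python
from typing import Dict, Tuple

# A variant site is keyed by (chromosome, position, reference allele, alternate allele).
VariantKey = Tuple[str, int, str, str]


def compare_calls(
    calls_a: Dict[VariantKey, str], calls_b: Dict[VariantKey, str]
) -> Tuple[float, float]:
    """Return (site overlap, genotype concordance) for two call sets.

    Overlap is shared sites divided by the union of sites (an assumed definition).
    Concordance is the fraction of shared sites with identical genotype strings.
    """
    shared = calls_a.keys() & calls_b.keys()
    union = calls_a.keys() | calls_b.keys()
    overlap = len(shared) / len(union) if union else 0.0
    matching = sum(1 for key in shared if calls_a[key] == calls_b[key])
    concordance = matching / len(shared) if shared else 0.0
    return overlap, concordance


# Toy example with made-up genotype calls from two platforms.
platform_a = {("1", 100, "A", "G"): "0/1", ("1", 200, "C", "T"): "1/1", ("2", 50, "G", "A"): "0/1"}
platform_b = {("1", 100, "A", "G"): "0/1", ("1", 200, "C", "T"): "0/1", ("3", 10, "T", "C"): "1/1"}
print(compare_calls(platform_a, platform_b))  # (0.5, 0.5)
```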
Kingsmore said in the webinar that for all nine subjects, the group found that each child had about 4 million variants after base calling and characterizing variants with the STAT-Seq method. Only about 10,000 of these were associated with a gene, he said. And only a tenth to a fifth of these had an allele frequency consistent with causing a rare genetic illness. "Less any known silent or not-causative genes, we're then down to between 800 and 1,300 across an entire genome," he said.
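The successive narrowing Kingsmore describes can be sketched as a simple filter chain. This is an illustration under stated assumptions, not the group's pipeline: the `Variant` fields, the function name `filter_candidates`, and the 1 percent allele frequency cutoff are hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    gene: Optional[str]       # None when the variant is not associated with a gene
    allele_frequency: float   # population frequency of the alternate allele
    known_benign: bool        # True for known silent or non-causative variants


def filter_candidates(variants: List[Variant], max_af: float = 0.01) -> List[Variant]:
    """Apply the three filters in order: gene-associated, rare, not known benign.

    max_af is an assumed rarity threshold; the source does not state the cutoff used.
    """
    in_genes = [v for v in variants if v.gene is not None]
    rare = [v for v in in_genes if v.allele_frequency <= max_af]
    return [v for v in rare if not v.known_benign]


# Toy input: only the last variant passes every filter.
variants = [
    Variant(gene=None, allele_frequency=0.001, known_benign=False),
    Variant(gene="GENE_A", allele_frequency=0.30, known_benign=False),
    Variant(gene="GENE_B", allele_frequency=0.002, known_benign=True),
    Variant(gene="GENE_C", allele_frequency=0.0005, known_benign=False),
]
print([v.gene for v in filter_candidates(variants)])  # ['GENE_C']
```

Each stage only removes variants, so the order affects cost but not the final set; putting the cheapest and most selective test first keeps later stages small.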
After this, the group used the SSAGA results to further narrow its analyses. For both retrospective subjects, the STAT-Seq method identified the correct variant and associated diagnosis — in one case Tay-Sachs, and in the other Menkes syndrome.
In one prospective patient the researchers found a single variant in a gene called BRAT1 to be the cause of an infant girl's epileptic symptoms — a variant the team had never seen in any other patients.
"When we looked up the literature it turned out just one month earlier this gene, BRAT1, had been incriminated as a cause of rigidity and multifocus seizures, as a lethal neonatal disorder," Kingsmore said. "So although we originally didn’t have a diagnosis, we were able to identify that this was indeed a pathogenic variant in a known disease gene."
Two other children, a NICU infant and an older sibling, both presented with heterotaxy, or swapped organ placement. Kingsmore said 14 candidate genes were culled using SSAGA, but none showed variants known to cause disease. The STAT-Seq process identified a mutation in the gene HTX6, which the group believes is a new cause of heterotaxy.
"This shows that not only can we recapitulate a known diagnosis or a known disease gene using STAT-Seq, we can also in some cases identify a new disease gene," Kingsmore said.
Two other children were found to have a candidate variant, but when the parents' genomes were examined, the gene was ruled out as causative. "These things are difficult to diagnose," Kingsmore said. "We do not expect to find the diagnosis in every patient [because] there may be certain regions of genes we can't analyze [and] novel disease genes we don't recognize."
Overall, Kingsmore said, the group is "very excited about the potential of the 2500 to deliver results in a realistic time frame."
"[Neonatal intensive care] is a good place to use this [approach] initially given the high cost," he wrote in his email to CSN. "[But] in the future we anticipate it will be used everywhere that sequence information is sought as part of medical practice. It's just a matter of cost."
Kingsmore said the group's initial results show that the method can "identify known causes of disease, new disease causing variants in known genes, and in one case, we believe we've identified a new disease-associated gene … This is the first example we've come across where we are able to get actionable information in a time frame highly relevant to an acutely ill child," he said.
According to Kingsmore, STAT-Seq could help reduce unnecessary treatment and testing, and also allow for more personalized drug choices and dosing. He said the group has been analyzing the whole genome sequence results from the STAT-Seq process for drug metabolism implications, which could be used to inform treatment decision making.
"The nice thing about sequencing an entire genome is that you get several types of information," he said.