NEW YORK (GenomeWeb) – Folding in transcriptomic data from blood samples could help increase the diagnostic rate for patients with rare diseases and identify new disease-linked candidate genes, according to a new study appearing in Nature Medicine this week.
Some 350 million people around the world have a rare disease, but the current molecular diagnostic rate from exome sequencing falls around 50 percent. In their new paper, Stanford University's Stephen Montgomery and his colleagues from the Undiagnosed Diseases Network examined RNA-seq data from blood samples of nearly 100 people with undiagnosed rare diseases spanning 16 disease categories. They found that the approach had a 7.5 percent diagnostic rate, suggesting RNA sequencing could help diagnose those diseases.
"[T]his work demonstrates the utility of performing RNA-seq on peripheral blood, which is a readily available specimen type in clinical practice," the researchers wrote in their paper.
In total, they collected RNA-seq and whole-exome or whole-genome data from whole blood on 143 individuals, 94 of whom had a rare disease and 49 of whom were unaffected family members.
They then compared RNA-seq data from patients to that of 49 family-based controls and 1,594 external controls from three cohorts: the Depression Genes and Network (DGN), the Prospective Investigation of the Vasculature in Uppsala Seniors (PIVUS) project, and the Genotype-Tissue Expression consortium.
The researchers first confirmed that many known rare disease genes were indeed expressed in the blood — 70 percent of disease genes from the Online Mendelian Inheritance in Man database and 76 percent of a panel of neurological disorder genes.
For each rare disease sample, the researchers uncovered an average of 343 genes with aberrant expression. They then narrowed the candidate gene list by filtering on loss-of-function tolerance, allele-specific expression, and other parameters. On average, those filters whittled the list of genes with outlier expression to fewer than 10 genes per case.
At the same time, they identified an average of 540 splicing outlier genes. Filtering for genes relevant to the phenotype, along with other criteria, brought the number of candidate genes down to about 10 per case.
Additionally, the researchers uncovered an average of 94 allele-specific expression events per case that could be disease related.
By combining these three signals — expression, splicing, and allele-specific expression — the team reported it could identify the causal gene in six of the 80 independent cases, a diagnostic rate of 7.5 percent. In addition, they could identify a candidate gene in five of the 30 cases with candidate gene information, a rate of 16.7 percent.
However, they were unable to find relevant candidate genes for 69 cases, or about 86 percent of the 80 independent cases.
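The rates quoted above follow directly from the reported case counts. As a quick check, the minimal Python sketch below recomputes them. The function and variable names are illustrative and do not come from the paper, and the last step assumes the six solved cases and five candidate-gene cases do not overlap.

```python
def percent(part, whole):
    """Return part / whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100 * part / whole

# Case counts as reported in the article.
independent_cases = 80
solved_cases = 6            # causal gene identified
candidate_info_cases = 30   # cases with candidate gene information
candidate_hits = 5          # candidate gene identified among those 30
unresolved_cases = 69       # no relevant candidate gene found

solved_rate = percent(solved_cases, independent_cases)
candidate_rate = percent(candidate_hits, candidate_info_cases)
unresolved_rate = percent(unresolved_cases, independent_cases)

# Assumption: the solved and candidate-hit cases are distinct, so the
# remainder of the 80 independent cases is what went unresolved.
remaining = independent_cases - solved_cases - candidate_hits

print(f"solved: {solved_rate:.1f}%")
print(f"candidate: {candidate_rate:.1f}%")
print(f"unresolved: about {unresolved_rate:.0f}%")
print(f"remaining cases: {remaining}")
```

Under that assumption, the remainder matches the 69 unresolved cases the article reports.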
Still, the researchers noted that their findings underscore that RNA-seq data could aid in identifying disease genes. For instance, they were able to home in on biallelic heterozygous pathogenic variants in the MECR gene in a pair of brothers who presented with delayed motor milestones and hypertonia that progressed to include spasticity, ataxic gait, and a progressive loss of motor skills.
Similarly, they identified a splice-loss variant in ASAH1 in a patient with a form of sporadic spinal muscular atrophy and a synonymous mutation in KCTD7 that created a splice junction in a patient with developmental regression combined with tremors and seizures.
Taken together, the researchers said, these cases suggest RNA-seq can help identify disease genes. "We can expect that combining information from multiple 'omics' sources will only further improve diagnosis of unsolved rare-disease cases in the future," the researchers wrote in their paper.