Researchers from the genomics arm of the University of Padua’s CRIBI Biotechnology Center have developed a web-based platform, dubbed QueryOR, that helps users annotate variant data from human exome sequencing projects to identify likely disease-causing genes.
QueryOR’s developers presented the tool during a poster session at Cold Spring Harbor Laboratory’s Personal Genomes & Medical Genomics meeting, held last month.
Usually, “when you do exome or whole-genome analysis you find so many different SNPs,” some of which aren’t present in reference genomes, Alessandro Vezzi, team leader of CRIBI’s functional genomics group and one of QueryOR’s developers, told BioInform. “The question is how to make sense of all the data,” he said.
QueryOR uses information from public resources like the Single Nucleotide Polymorphism Database and Ensembl to associate genetic variations with the pathology or phenotype being studied. It also links to external resources such as the Variant Effect Predictor, which provides the SIFT and PolyPhen scores for each variant.
Prior to using QueryOR, scientists doing genome or exome resequencing projects can use any software of their choice to perform basic mapping steps and to call variants, Claudio Forcato, a doctoral student in agrobiotechnology at the University of Padua and co-developer of the platform, explained to BioInform.
Users then upload their VCF files to the QueryOR platform and select filtering criteria such as coverage, Phred score, or variant type. They can then run advanced queries or, for example, cross-reference personal genomic data with disease ontologies and SNP frequencies.
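To make the filtering step concrete, here is a minimal sketch of how variant records from a VCF file might be screened by coverage, quality score, and variant type. It is not QueryOR's code: the thresholds, the use of the `DP` INFO key for coverage, the use of the `QUAL` column as the Phred-scaled score, and the names `Variant`, `parse_vcf`, and `keep` are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, NamedTuple, Sequence


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float   # Phred-scaled QUAL column
    depth: int    # read depth, taken from INFO/DP (assumed to be present)

    @property
    def kind(self) -> str:
        # Single-base REF and ALT is an SNV; anything else is treated as an indel here.
        return "snv" if len(self.ref) == 1 and len(self.alt) == 1 else "indel"


def parse_vcf(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per data line, skipping headers and blank lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, qual, _filter, info = fields[:8]
        info_map = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
        yield Variant(
            chrom, int(pos), ref,
            alt.split(",")[0],                    # only the first ALT allele, for simplicity
            float(qual) if qual != "." else 0.0,
            int(info_map.get("DP", 0)),
        )


def keep(v: Variant, min_qual: float = 30.0, min_depth: int = 10,
         kinds: Sequence[str] = ("snv", "indel")) -> bool:
    """Hypothetical filter: all three criteria must hold."""
    return v.qual >= min_qual and v.depth >= min_depth and v.kind in kinds


# Made-up records, for illustration only.
SAMPLE_VCF = """##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
1\t1000\t.\tA\tG\t50\tPASS\tDP=25
1\t2000\t.\tC\tT\t12\tPASS\tDP=40
2\t3000\t.\tG\tGA\t60\tPASS\tDP=8
2\t4000\t.\tT\tTC\t45\tPASS\tDP=30
"""

variants = list(parse_vcf(SAMPLE_VCF.splitlines()))
passing = [v for v in variants if keep(v)]
snvs_only = [v for v in variants if keep(v, kinds=("snv",))]
```

In this toy input, the record at position 2000 fails the quality threshold and the one at position 3000 fails the coverage threshold, so `passing` holds the variants at positions 1000 and 4000.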
QueryOR returns results within a few hours, the developers said. It displays its findings in a matrix in which rows represent individual genes and columns represent the criteria selected in the previous steps.
These results include general information about the genes in question as well as a chart that provides the intron/exon structures of the gene transcripts along with the positions of the variants along the transcript.
While a number of other programs are available for filtering and prioritizing variant data, QueryOR’s developers believe their system is unique because “rather than apply filters to progressively remove neutral variants and irrelevant background,” the tool performs a “general ranking derived from a composite search of the selected criteria,” they explained via email.
Furthermore, because its results are provided “as a gene-centered ranked list” that shows “the features satisfied by each gene,” it’s possible to “evaluate and improve the query, focusing and giving appropriate weights to the most relevant criteria,” they said.
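The difference between ranking and progressive filtering can be sketched in a few lines. The developers did not describe how QueryOR computes its ranking, so the weighted-sum score below is an assumption, and the gene names, criteria, and weights are hypothetical. Each gene is a row, each criterion a column, and no gene is discarded; genes are simply ordered by how much weighted evidence they satisfy.

```python
from typing import Dict, List, Tuple


def rank_genes(features: Dict[str, Dict[str, bool]],
               weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Score each gene by the summed weights of the criteria it satisfies.

    This weighted sum is an illustrative assumption, not QueryOR's documented method.
    Ties are broken alphabetically by gene name so the order is deterministic.
    """
    scored = []
    for gene, satisfied in features.items():
        score = sum(w for criterion, w in weights.items() if satisfied.get(criterion, False))
        scored.append((gene, score))
    return sorted(scored, key=lambda gs: (-gs[1], gs[0]))


# Hypothetical gene-by-criterion matrix: True means the gene satisfies the criterion.
features = {
    "GENE_A": {"rare_in_population": True, "predicted_damaging": True, "disease_ontology_match": False},
    "GENE_B": {"rare_in_population": True, "predicted_damaging": False, "disease_ontology_match": True},
    "GENE_C": {"rare_in_population": False, "predicted_damaging": True, "disease_ontology_match": False},
}

# Initial weights chosen by the user (made-up values).
weights = {"rare_in_population": 1.0, "predicted_damaging": 2.0, "disease_ontology_match": 3.0}
ranking = rank_genes(features, weights)

# Refining the query: give more weight to the predicted-damaging criterion.
reweighted = rank_genes(features, {**weights, "predicted_damaging": 5.0})
```

With the initial weights, GENE_B ranks first because it matches the disease ontology. After the reweighting, GENE_A moves to the top, which mirrors the kind of query refinement the developers describe.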
Prior to uploading their data, users can test the system with sample data from a trio provided by the CRIBI Genomics team.
Once they load their data, users can store files on QueryOR’s servers for about a month, after which the variant files are erased and must be loaded again.
The group has tested the platform in some in-house projects. For example, an early version of QueryOR was used to annotate the grape genome. It has also been used in hemophilia, epilepsy, kidney, and lysosomal disease studies.
For their next steps, the researchers said they will work on integrating more resources that have been developed for functional annotations of variants and genes.
“This will offer more possibilities to identify potentially pathogenic allelic variants,” they said.
Additionally, the group is looking into developing capabilities that will allow the comparison of two or more individuals based on haplotype.
They plan to publish a paper describing QueryOR but have not yet submitted a manuscript for publication.