With New Biofx Approach, Researchers Find 300 Novel Human Genes


Researchers at Cornell University recently discovered roughly 300 previously unidentified human genes and several hundred extensions of already known genes. The group, led by Adam Siepel, assistant professor of computational biology and biological statistics at Cornell, used three specially scripted algorithms to compare alignments among human, rat, chicken, and mouse genomes in varying configurations in order to identify conserved genes.

The research project demonstrated the value of a computational gene-finding approach over traditional sequencing methods using mRNAs or cDNA libraries, which can miss genes expressed at lower levels, only in certain tissues, or only at particular stages of development. “With our comparative computational approach, you do not rely at all on the presence of mRNAs that you sequence by these random methods,” says Siepel. “Instead, you look for statistical signatures through comparative sequence analysis to find regions that are evolving in gene-like ways by comparing the human genome and these other mammalian genomes.”

According to Siepel, the biggest challenge the researchers faced during the three-year study was that the set of known genes is always a moving target. “Every day you go back to the database and there are new genes in there, and so we had to work out a fairly complicated way of assessing novelty by comparing what we had done to the database of known, publicly available genes,” he says. Some of the genes they found are involved in motor activity, cell adhesion, and central nervous system development.

Siepel says that this computational approach has broad applications, and he is now applying it to single-exon genes. “If there are missing human genes, there is a good bet that a lot of them will be these single exon genes because of these challenges in identifying them,” he says. “We are using these comparative methods to identify single exon genes systematically across the genome with comparative sequence data we already have.”

The group is also focused on identifying functional sequences that are not protein-coding genes, as well as genes that have been gained or lost in different species.
