Irish Researchers Sleuth Out Unique Human Genes Originating from Non-Coding DNA

NEW YORK (GenomeWeb News) – In a paper appearing online today in Genome Research, a pair of Irish researchers identified three genes that appear to be unique to the human genome.

Researchers David Knowles and Aoife McLysaght of the University of Dublin's Smurfit Institute of Genetics compared chimp and human protein and DNA sequences, identifying three human genes that lack orthologues in other species. The researchers tracked down DNA sequences resembling the genes in chimps and other primates. But those sequences don't code for proteins, suggesting the trio of human-specific genes found in the study may have sprung up from non-coding DNA.

Past studies have identified numerous genes that have been duplicated or rearranged throughout evolution, taking on distinct characteristics and functions in different lineages. But less is known about whether — or how — new genes originated from non-coding sequences.

Knowles and McLysaght used BLASTP, the protein version of the BLAST search tool, to build and compare blocks of conserved synteny in chimps and humans, each containing sequences that are orthologous between the two species.

Initially, the researchers found 644 proteins in the human genome that had no BLASTP hits in chimp. They excluded hundreds of genes that corresponded to assembly gaps in the chimp or macaque genome from their subsequent analyses, as well as genes with known or suspected orthologues in other species. The pair also tossed out potentially spurious human genes or annotation artifacts.
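The winnowing described above amounts to a series of set exclusions. The following is only a toy sketch of that idea, not the researchers' actual pipeline: the function name, the gene identifiers, and the contents of each exclusion list are invented placeholders.

```python
# Toy sketch of successive exclusion filters.
# All gene IDs and list contents below are invented placeholders,
# not data from the study.

def filter_candidates(no_hit, in_assembly_gap, has_orthologue, spurious):
    """Return the genes left after removing every excluded category."""
    remaining = set(no_hit)
    for excluded in (in_assembly_gap, has_orthologue, spurious):
        remaining -= set(excluded)
    return sorted(remaining)


# Hypothetical starting set: human proteins with no BLASTP hit in chimp.
no_hit = {"geneA", "geneB", "geneC", "geneD", "geneE"}

# Hypothetical exclusion categories, mirroring the ones named in the article.
in_assembly_gap = {"geneA", "geneB"}  # falls in a chimp or macaque assembly gap
has_orthologue = {"geneC"}            # known or suspected orthologue elsewhere
spurious = {"geneD"}                  # likely annotation artifact

candidates = filter_candidates(no_hit, in_assembly_gap, has_orthologue, spurious)
print(candidates)
```

With these placeholder inputs, only one gene survives all three filters, in the same way the study's 644 starting proteins were narrowed to three.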

In the end, they were left with three genes: CLLU1 (chronic lymphocytic leukemia upregulated gene 1), as well as C22orf45 and DNAH10OS, which are less well characterized.

A dozen nucleotide substitutions spanned these three genes. Seven of these substitutions (four synonymous and three non-synonymous) appear to have occurred in the chimp genome, where sequences for the genes are present but non-coding. Meanwhile, five substitutions — three non-synonymous — occurred in the human genome.

Based on these findings and their subsequent analyses, the team concluded the genes originated in parts of the genome that are non-coding in other primates.

Although the functions of the genes are poorly understood, the researchers noted that all three overlap with genes on the opposite DNA strand. In addition, each contains an intronless ORF coding for a short protein.

"They are unlike any other human genes and have the potential to have a profound impact," McLysaght, a molecular evolution researcher at the University of Dublin, said in a statement.

Based on these findings, the team estimates that about 0.075 percent of human genes — roughly 18 of the 24,000 — are human-specific and arose from formerly non-coding sequence.
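As a quick check on the figures quoted above, the percentage follows directly from the two counts. The variable names in this short sketch are illustrative only.

```python
# Check that 18 of 24,000 genes corresponds to the quoted percentage.
estimated_human_specific = 18   # estimate quoted in the article
total_genes = 24_000            # gene count quoted in the article

fraction_percent = estimated_human_specific / total_genes * 100
print(f"{fraction_percent:.3f} percent")  # prints "0.075 percent"
```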

"The three genes reported here are the first well-supported cases of protein-coding genes that arose in the human lineage and are not found in any other organism," Knowles and McLysaght concluded. "It is tempting to infer that human-specific genes are at least partly responsible for human-specific traits and it will be very interesting to investigate the functions of these novel genes."
