
Finished Human Genome Sequence Published in Nature

NEW YORK, Oct. 20 (GenomeWeb News) - The International Human Genome Sequencing Consortium today announced that the finished human genome sequence will appear in the Oct. 21 issue of the journal Nature. According to the consortium, the human genome contains an estimated 20,000-25,000 protein-coding genes, fewer than the approximately 35,000 estimated when the working draft of the sequence was completed three years ago.

 

Researchers on the project confirmed the existence of 19,599 protein-coding genes and another 2,188 DNA segments that are predicted to be protein-coding genes, according to the findings.

 

"The analysis found that some of the earlier gene models were erroneous due to defects in the unfinished, draft sequence of the human genome," said Jane Rogers, head of sequencing at the Wellcome Trust Sanger Institute in Hinxton, England, which is part of the consortium. "The task of identifying genes remains challenging, but has been greatly assisted by the finished human genome sequence, as well as by the availability of genome sequences from other organisms, better computational models and other improved resources."

 

The Nature paper also provides a peer-reviewed description of the finishing process and an assessment of the quality of the finished human genome sequence. According to that assessment, the finished sequence covers more than 99 percent of the euchromatic portion of the human genome and was sequenced to an accuracy of 99.999 percent.

 

Since the working draft was completed in 2002, the contiguity of the sequence has been improved. The average DNA letter now sits on a stretch of 38.5 million base pairs of uninterrupted sequence - about 475 times longer than the 81,500 base-pair stretch available before. The human genome sequence still contains 341 gaps, the consortium noted, compared to the 150,000 gaps in the sequence when the working draft was completed. It said closing the remaining gaps would require more research and new technologies.

 

The finished sequence provides a much clearer view of certain phenomena, according to the researchers, such as duplication of DNA segments and the birth and death of genes. For example, their analysis found that the distribution of segmental duplications varies widely across human chromosomes. The Y chromosome is the most extreme case, with segmental duplications occurring along more than 25 percent of its length.

 

In addition, researchers found that some segmental duplications tend to be clustered near the centromeres and telomeres of each chromosome. Researchers speculate that these segmental duplications may be used by the genome as an evolutionary lab for creating genes with new functions.

 

Authorship of the Nature paper is shared by more than 2,800 researchers who took part in the consortium, which includes scientists located at 20 institutions in France, Germany, Japan, China, Great Britain and the United States.

 

The finished sequence and its annotations can be accessed through several public genome browsers listed here.
