WASHINGTON--Human Genome Project coordinators announced two major accomplishments in recent weeks. The US National Human Genome Research Institute celebrated reaching the one-billion-base-pair sequence mark, and, more significantly, the Sanger Centre said it had completed, with the exception of some gaps, the entire sequence of chromosome 22. The chromosome is the first to be completed in the decade since the Human Genome Project began.
The effort to sequence chromosome 22 was led by the Sanger Centre with assistance from investigators at Keio University in Japan, the University of Oklahoma, and Washington University. A report was published in the December 2 issue of Nature. The researchers' findings on the chromosome included: 545 genes, 298 of them previously unknown, and 134 pseudogenes; 247 genes that computer analyses revealed to be identical to previously identified human genes or protein sequences; genes ranging in size from 1,000 to 583,000 bases, with a mean size of 190,000 bases; unexpected long-range complexity and an elaborate array of repeat sequences near the centromere; and several regions where recombination is increased, and others where it is suppressed.
Francis Collins, director of the NHGRI, likened the full chromosomal sequence to an ocean liner breaking through the fog where only rowboats were visible before.
"It's pretty powerful because it gives us the data not only to find genes, but to start understanding how chromosomes are structured and what regions might be important for gene regulation, chromosome replication, and evolution," said Richard Wilson, codirector of the Genome Sequencing Center at Washington University School of Medicine in St. Louis and a participant in the decoding of chromosome 22.
Wilson stressed that the public effort's chromosome-by-chromosome approach to sequencing the human genome benefits researchers trying to pinpoint where on a particular chromosome a genetic disease originates. For diseases that are known to be associated with chromosome 22, the sequence information will give investigators a virtual "encyclopedia volume" of data that will help them discover candidate genes that may be involved in those conditions, said Wilson. This will enhance understanding of such diseases, revealing more about what goes wrong and how to diagnose and treat them.
Washington University's researchers, who have been concentrating their efforts on chromosome 7, did about five percent of the sequencing on chromosome 22, said Wilson. Work on chromosome 22 began at about the same time as work on some other chromosomes, but because 22 is the second-smallest, with only 33.5 million bases, it was finished first. A few small gaps around some repetitive DNA regions remain, but those don't detract much from the scientific value of the sequence, maintained Wilson.
Wilson said the St. Louis facility uses Phrap for sequence assembly and Gap and Consed for observing and editing assemblies.