While Washington Cheers, Labs Continue to Generate and Finish Chromosome Data

While heads of the major US genome sequencing centers joined President Bill Clinton, Francis Collins, and Craig Venter in Washington last week to celebrate the completion of a working draft of the human genome sequence, their underlings remained back at their laboratories finishing what many experts said are the most challenging aspects of the project.

When Celera Genomics and the public Human Genome Project jointly announced that they are done generating a “working draft sequence” for nearly 90 percent of the genome, scientists nationwide agreed that the milestone is an important one. But they also acknowledged that almost 80 percent of the genome remains in “unfinished” form.

Rick Wilson, co-director of the genome sequencing center at Washington University in St. Louis and head of a committee that has been created to oversee genome-wide finishing work, said, “The easy part of sequencing is doing what we’ve done so far. What we’ve got to do now is polishing, filling in the gaps, making sure ambiguities are solved, making the sequence perfect. That’s really what the next couple of years is going to do.”

Steve Scherer, assistant professor of molecular and human genetics at Baylor College of Medicine, said he was delighted by the announcement in Washington. “This is what we’ve been working our tails off on for years now. Everybody is walking around with a little extra bounce in their step today.” Still, Scherer said he and his colleagues at Baylor’s genome sequencing center, who are responsible for chromosomes 3, 12, and part of X, have at least another year’s worth of work to do.

In fact, of 24 chromosomes, only numbers 21 and 22 are in finished form. Others are in various states of completeness. According to the National Center for Biotechnology Information’s website, less than 10 percent of the finishing work has been done on each of chromosomes 3, 4, 8, 9, 10, 11, 13, 15, 18, and X (see chart below). The public project has predicted that all 24 chromosomes will be in final finished form by April 2003.

"Finishing" the Human Genome
The major genome sequencing centers said they aim to finish their work on the 24 human chromosomes by April 2003:

CHR    CENTER                                     % FINISHED  TARGETED COMPLETION
1      Sanger Centre                              12.1        2002
2      Washington University                      12.2        2002
3      Baylor College of Medicine                 6.3         late 2001
4      Washington University                      6.8         2003
5      Joint Genome Institute                     11.5        late 2001
6      Sanger Centre                              29.7        early 2001
7      Washington University                      50.0        early 2001
8      Whitehead Institute                        5.9         2003
9      Sanger Centre                              5.3         2002
10     Sanger Centre                              6.9         2002
11     Whitehead Institute                        6.5         2003
12     Baylor College of Medicine                 17.1        mid 2001
13     Sanger Centre                              3.7         2002
14     Genoscope                                  52.4        late 2000
15     Whitehead Institute                        2.1         2003
16     Joint Genome Institute                     21.4        2001
17     Whitehead Institute                        31.4        2003
18     Whitehead Institute                        5.0         2003
19     Joint Genome Institute                     27.9        late 2000
20     Sanger Centre                              53.0        2000
21     Sanger Centre                              103.5       done
22     Sanger Centre                              97.4        done
Y      Washington University                      8.8         2000
X      Sanger Centre; Baylor College of Medicine  38.0        NA; Xp early 2001
TOTAL                                             18.1        April 2003

“You have to read between the lines to understand that done ain’t done,” remarked Lee Rowen, coordinator of the Institute for Systems Biology’s effort to sequence regions of chromosomes 14 and 15. “At one extreme, done would mean you have 24 contiguous sequences, one for each chromosome. But what done means in the context of this announcement is that the draft is done. They have sequence coverage of some sort or another for about 90 percent of the human genome.”

Rowen said her group had completed its work on chromosome 14, which will be finished by the French lab Genoscope later this year, and will now focus on getting a region of chromosome 15 into finished form. Massachusetts Institute of Technology’s Whitehead Institute will finish the rest of chromosome 15, which is presently 2.1 percent finished.  

At Baylor, Scherer said chromosome 3 is going together easily, but that chromosome 12 has been “troublesome” due to areas in which repeat sequences are stacked up against each other. He said that although the lab hopes to finish chromosome 12 in the next 3 to 5 months, “we may run into intractable regions, that may require even some new techniques to come down the road before we can pull it off.”

Wilson told BioInform that the Washington University center aims to finish chromosome Y by the end of this year. Target completion dates for chromosomes 7, 2, and 4 are in 2001, 2002, and 2003, respectively.

He compared the working draft sequence to the outline a reporter might prepare before writing a news article. The finishing process, he said, “is what you would do between your rough draft version and the final story: run spellchecker and grammar checker, maybe add a few thoughts here and there.” The genomic equivalent of running spellchecker, he said, is to use other chemistries in the sequencing process, apply new methods to find uncovered regions, and use different computational tools to sort out sequence data.

Having completed the C. elegans genome sequence telomere to telomere, Wilson said scientists at Washington University and the Sanger Centre are well equipped to tackle this next step. “We learned a lot from doing the worm, so I think we’ve got everything in hand we need to finish the human.”

Because the human genome contains so many gene-less regions, Wilson said the human sequence would not have to be finished to the same standards. “We’ve learned as we’ve sequenced C. elegans and the yeast genome that there are regions of the genome that are completely noninformative and are different between individuals: repeat regions where it’s a question of whether there are 20 Ts or 25 Ts. The bottom line is that you don’t care, you just want to figure out where those Ts are and then you can go back later and find out if the number makes a difference.”

Wilson said his lab would celebrate the completion of the working draft at its annual summer picnic next month. Then, sometime later this year or early next, “when we finish chromosomes Y and 7, we’ll party,” he said.

--Adrienne Burke
