The National Human Genome Research Institute has launched a large-scale medical sequencing study focusing on heart disease that eventually plans to sequence the genomes of 1,000 individuals using next-generation sequencing technologies.
The study, called ClinSeq, has started modestly but will scale up over time. Since February, researchers from the NHGRI, the National Heart, Lung, and Blood Institute, and other NIH institutes have enrolled about 50 participants.
NIH’s Intramural Sequencing Center has been using conventional Sanger capillary sequencing to decode 39 genes in these samples — including coding regions and regulatory regions — that are known to be involved in heart disease.
Over the next two years, the researchers are hoping to enroll 1,000 people, including patients with varying degrees of cardiovascular disease and controls. The initial goal is to sequence between 200 and 400 genes that are known to, or suspected to, play a role in cardiovascular disease, according to Leslie Biesecker, chief of NHGRI’s genetic disease research branch, who heads the study.
All participants will give their consent to have their entire genome sequenced. This will happen once the technology becomes affordable — a milestone that will depend on new sequencing technologies, he said.
The researchers have not yet decided which technology they will use for whole-genome sequencing, or at what cost per genome they will switch over to it. “We haven’t settled on that answer yet,” Biesecker told In Sequence last week.
“I don’t think any of the technologies are ready today for doing the ClinSeq-type project on the whole genome,” said Jim Mullikin, who heads the comparative genomics unit at NHGRI and is in charge of sequencing the ClinSeq samples. “We’d consider, as time goes on, any of the platforms that will arise.”
But existing next-generation technologies, such as those from 454 Life Sciences, Illumina, or Applied Biosystems, could potentially already help with the current phase of targeted resequencing of PCR products, which NISC currently performs on its ABI 3730 capillary instruments, he said.
Illumina might have the best chance of getting involved: In the near term, NISC “won’t have direct experience with anything but the Solexa machine,” Mullikin said, adding that “of course we’ll keep our eyes open to any other technologies that come along.”
NISC has just ordered an Illumina Genome Analyzer, which it expects to receive later this year, and which the scientists initially plan to use for other projects. “But we will start to get experience with that technology and see what we can do with it,” possibly including targeted sequencing for ClinSeq, Mullikin said.
Mullikin and his colleagues have already had experience with Illumina’s platform from a collaboration with Aravinda Chakravarti’s group at Johns Hopkins University and Solexa in which they tested the instrument’s ability to resequence a 140 kb genomic region from 15 large PCR products (see GenomeWeb Daily News, In Sequence’s sister publication, 10/30/2006).
“Once the long-range primer products are properly designed, it’s a very nice way to get sequence from a targeted region,” Mullikin said. “But one of the challenges is making sure that you have very robust long-range PCR primer sets, and that you don’t get allelic bias from that,” he explained.
It remains to be seen if designing long-range PCR primers is worth the effort, given that “we know we can [generate] the short-range primer products, and get very good products from that right away, for Sanger-style sequencing,” said Mullikin.
Because they deliver a different type of data than Sanger sequencing, all new sequencing technologies will require researchers to develop new approaches and analysis pipelines, Mullikin said.
“There is hope that you get to a place where you could call just from a single read variations with confidence, but I don’t think we are there yet with any of the new technologies like we [are] with capillary-type machines,” he said.
In the meantime, scientists using the new technologies have to oversample to make sure they do not miss any sequence variations. “When you sequence with a new technology, it’s typically clonal-based, and then you need to have sampled that individual enough times at that base position to know that you have had a good chance of seeing both alleles,” Mullikin said. “And you also need to see both alleles enough times to know that you have seen the variation, and not just noise.”
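The oversampling argument Mullikin describes can be illustrated with a simple binomial calculation. The sketch below is not NISC's method. It assumes a heterozygous site, independent reads, each read equally likely to come from either allele, and no sequencing error. The function names and the `min_obs` parameter (the number of times each allele must be observed to count as seen rather than noise) are hypothetical, introduced only for illustration.

```python
from math import comb


def prob_both_alleles_seen(depth: int, min_obs: int = 1) -> float:
    """Probability that each of two alleles is read at least `min_obs` times.

    Assumes `depth` independent reads at a heterozygous position, each drawn
    from either allele with probability 0.5 (no allelic bias, no errors).
    """
    if depth < 2 * min_obs:
        return 0.0  # too few reads for both alleles to reach min_obs
    # k = number of reads from the first allele; the second allele gets depth - k.
    favorable = sum(comb(depth, k) for k in range(min_obs, depth - min_obs + 1))
    return favorable / 2 ** depth


def min_depth(target: float, min_obs: int = 1) -> int:
    """Smallest read depth at which prob_both_alleles_seen reaches `target`."""
    depth = 2 * min_obs
    while prob_both_alleles_seen(depth, min_obs) < target:
        depth += 1
    return depth


if __name__ == "__main__":
    for required in (1, 2, 3):
        depth = min_depth(0.99, required)
        print(f"each allele seen >= {required}x with 99% probability: depth {depth}")
```

Under these idealized assumptions, eight reads are enough to see each allele at least once with 99 percent probability, and the required depth rises as `min_obs` increases. Real data would need more coverage, since allelic bias and sequencing errors both violate the assumptions above.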
Mullikin and his colleagues are also interested in adopting non-PCR-based techniques for selecting genomic segments for sequencing. “If that technology gets robust enough to put into a production facility like we have at NISC, then we would adopt that,” he said. “That would be a nice fit with these other [new] sequencing technologies.”
Several research groups are already working on such methods, including teams at Stanford University’s Genome Technology Center and Harvard University (see In Sequence 5/22/2007).
According to Biesecker, all of the samples and “essentially all of the data” will become available to other research groups within and outside of NIH.
The researchers chose heart disease as their test disease for several reasons: It is clear that genetics plays a major role in the disease, they found “strong and enthusiastic clinical partners” at NHLBI, the phenotype can be assayed directly, and the disease is of great public health and epidemiological interest, Biesecker said. Once whole-genome sequencing is underway, the data might also reveal susceptibility to other diseases.
Study participants will not receive their raw sequence data but only specific test results revealing disease risk. “What will be made available to the patients is a very delimitated set of results … that can be verified and meet the standards of a CLIA-certified testing laboratory,” Biesecker said. “What we are trying to model is how real-world clinicians will deal with results that involve sequence data,” he explained.