A new project sponsored by the US National Human Genome Research Institute has an ambitious goal: planning for the personalized medicine that high-throughput medical sequencing is expected to make possible, including how to handle the information, what to do with it, and how to build the infrastructure to use it.
"What I hope is that we can address some questions so others can do whole-genome medical sequencing to answer whatever questions they want to answer in human subjects," Les Biesecker, a senior investigator at NHGRI and director of the new ClinENCODE project, said in an interview with Pharmacogenomics Reporter.
"One of those [sequencing projects] could be clinical studies run by pharma," he said. "And some of the protocols and paradigms that we work out — if we can demonstrate that they're appropriate or efficacious, then they can adopt those or modify them to their needs instead of starting from scratch."
The five-year ClinENCODE project will sequence, in about 400 people, the regions of the genome previously identified by NHGRI's ENCODE (Encyclopedia of DNA Elements) project. It will explore three areas integral to medical sequencing: dealing with subjects and their expectations; defining diseases, phenotypes, and their relation to genotype; and developing methods to handle and interpret diverse sequencing data. After the project is over, the ClinENCODE researchers hope, drug makers and diagnostic companies will be able to collect and use sequence data in clinical trials in valid and meaningful ways.
The intramural research is envisioned as both a technology "push" and a "pull" project that will give companies involved in personalized medicine a window onto the platform and healthcare markets of the next decade. That is, the project should push the limits of current technology, thus opening new areas for research and new market opportunities.
Data produced by the project will be made publicly available both in a trace repository, which will contain small segments of DNA that users will not be able to match to the source subjects, and in full continuity through an IRB-approved protocol that will allow access to … data [in which the DNA segments are linked in their original form] for approved research projects. "We will absolutely make that available for appropriate downstream uses of the data," Biesecker said. "We're going to make cell lines for other projects downstream, etc."
Financial support for the project comes from NIH intramural funding, but Biesecker said the total amount of funding still hasn't been determined. The researchers should begin enrolling subjects by mid-fall, and within 18 months may finish sequencing the ENCODE regions of all 400 participants using ABI sequencers at the US National Intramural Sequencing Center, according to Biesecker.
The first "leg" of the study, the psycho-social portion, will try to piece together how healthy, voluntary subjects feel about medical genetic data, an integral component of any future market for personalized medicine. "We want to understand what motivates people to want to undergo this, and what would preclude people from undergoing this, because five to 10 years from now, when we're talking about whole-genome studies, we would want to be able to know [that] this is how you do it," said Biesecker.
Also, no one truly knows how subjects will react when they are told that they have a genetic propensity to disease, that some genes may cause certain phenotypes, or that scientists simply don't know what effect some polymorphisms may have. ClinENCODE will try to clear that up too, and in the process perhaps learn to communicate with subjects "in a way that allows them to go forward with their life," said Biesecker.
Second is the medical leg. When dealing with sequencing data, "What is a disease? What is a phenotype? What evaluation should we do? How do you do the correlative study?" Biesecker asked rhetorically. To develop algorithms that answer these questions, the ClinENCODE project will work with the small number of disease-related genes in the ENCODE regions, which together span about 30 megabases, or about 1 percent of the human genome, Biesecker said.
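For readers who want to check the scale, the figures quoted above can be worked through in a few lines. The sketch below is only back-of-the-envelope arithmetic: the 30-megabase and 400-participant numbers come from the article, while the genome size of roughly 3.1 billion base pairs and the function names are assumptions introduced here for illustration. It counts a single pass over each participant's ENCODE regions and ignores the redundant coverage that real sequencing requires.

```python
# Back-of-the-envelope check of the scale described in the article.
# ASSUMPTION (not from the article): a haploid human genome of ~3.1 billion bp.
HUMAN_GENOME_BP = 3_100_000_000

# Figures reported in the article.
ENCODE_REGION_BP = 30_000_000  # "about 30 megabases"
PARTICIPANTS = 400             # "about 400 people"


def encode_fraction(region_bp=ENCODE_REGION_BP, genome_bp=HUMAN_GENOME_BP):
    """Fraction of the genome covered by the ENCODE regions."""
    return region_bp / genome_bp


def total_target_bases(region_bp=ENCODE_REGION_BP, participants=PARTICIPANTS):
    """Bases in the target regions summed over all participants (single pass)."""
    return region_bp * participants


if __name__ == "__main__":
    print(f"ENCODE share of genome: {encode_fraction():.2%}")
    print(f"Target bases across cohort: {total_target_bases() / 1e9:.0f} billion")
```

Under these assumptions the ENCODE regions come to just under 1 percent of the genome, consistent with the figure Biesecker cites, and the cohort as a whole represents about 12 billion bases of target sequence before any allowance for repeat reads.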
This part of the program will also collect phenotypic information on subjects, including a physical exam, medical history, and a family medical history. The team is hoping to find out what the informational needs of whole-genome medical sequencing will be.
Among the few disease genes in the ENCODE regions are those related to cystic fibrosis, sickle-cell anemia, color blindness, and oto-palatal-digital syndrome. As the sequencing project moves forward, said Biesecker, it may well encounter disease gene variants with no known clinical significance. How the group decides to deal with that situation may serve as a model, or an example, for those who follow in clinical sequencing.
Technical issues are the target of the third leg. The NHGRI researchers will try to work out how to decide which genomic regions to sequence in the future and how to interpret that sequence. They will also establish quality-control measures, instrument parameters, and sample-management and data-management systems for anonymized sequences.
In parallel with the project, the NIH extramural program is developing four proposals for separate medical sequencing projects that will be "taken up by the extramural sequencing centers," said Biesecker. "This project is part of a larger portfolio of medical sequencing that is going to open up quite quickly here in the next few years," he said.
Chris Womack