NEW YORK – As genomics becomes more prevalent in society — powering personalized medical approaches, new diagnostics, and society's response to major crises like the COVID-19 pandemic — it also becomes increasingly important for researchers to determine how the science is succeeding and failing, and how to steer its course in the future.
Researchers at the National Human Genome Research Institute (NHGRI) led by Director Eric Green published a paper in Nature on Wednesday in which they laid out a strategy for advancing cutting-edge human genomics research, as well as 10 aspirational predictions for how they hope genomics will advance in the next decade.
GenomeWeb spoke with Green about the paper, how genomics is shaping society, and how the successful dissemination of genomics across biomedicine makes it challenging to strategically focus on what priorities should be at the forefront for the next 10 years. Below is a transcript of the interview edited for length and clarity.
Which technological advancements in genomics over the last five years are you most impressed with, and how do you hope to see genomics technologies advance in the next five to 10 years?
Looking back, the obvious thing is these new methods for sequencing DNA that get a tremendous reduction in the cost of genome sequencing. Beyond DNA sequencing technologies, probably the thing I'm second most impressed with over the last decade, although we're still very much in the midst of it, would have to be related to genomic data science. In fact, they're interrelated, because obviously, if you generate prodigious amounts of DNA sequence, you have huge amounts of data, you have huge data challenges. But it's not even just analyzing the genomic data. What I think has been impressive is a growing capacity to grab different datasets and to deal with the scale of them, not by doing business as usual, but to be increasingly going to a cloud-based ecosystem, where the data is highly distributed. We're figuring out more and more ways to be able to integrate that data and cross-analyze across different datasets.
And I actually think that's going to be the big thing of the coming decade. There is both a huge challenge and a huge opportunity, and it's really the way of the future. It's very clear that the great discoveries are going to come from studies done at scales that we never anticipated being able to do. And that's going to be by merging datasets, probably across different countries' worth of data, and really doing things to the point where you get enough statistical power to tease out very small contributions to biological processes caused by genomic variants that you'll never see until you get the scale up, which I think means our enterprise is going to be increasingly computationally heavy.
[Also], I think we've a long way to go to fully close the gap [in cost between sequencing and data interpretation]. Are we talking about that gap closing among the most advanced and proficient of genomic data scientists, as opposed to the gap closing for the casual practitioners or the routine practitioners of genomics? Right now, almost anybody can sequence a human genome, can operate one of these next-generation DNA sequencing instruments. But most people will get overwhelmed by the data quickly. The most effective practitioners won't get overwhelmed by the data because they're at the cutting edge. What we need to do is bring a greater swath of biomedical researchers to the current cutting edge of genomic data science. I think we will do that slowly but surely. But I do think it's going to require a lot of effort, a lot of development, a lot of investment and probably more than just a few years.
Is there still a disconnect in the translation of genomic results from bench to clinic, and how can it be overcome? Should more researchers be starting their own companies? Should there be stronger partnerships between the research world and life science companies? And can NHGRI facilitate any of these solutions?
There's this whole thing around genomic medicine implementation, and to me, it's not a company, per se, that's going to fill that role. Maybe here or there, but they're not trying to develop a product. What we are talking about in this area is actually getting genomics into the healthcare ecosystem, which is not companies, but healthcare systems and insurance companies. What it relates to is developing more and more evidence that genomic information is useful in clinical practice and that you actually can improve clinical care. But in addition to that is once you have a discovery, you need to say, 'Alright, how do we operationalize that across all the different heterogeneous ways in which medicine is delivered in a place like the United States?'
Implementation science speaks to [that]. How do you take something that seems to work in one place and implement it in other places? And that directly relates to learning the healthcare system. We are very explicit in this strategic vision, much more so than we were nine years ago with our 2011 strategic vision. We talked about we need to be implementing genomic medicine and we're going to stick our toes and even our ankles in the water to get up there. Now, nine years later, we are at a much more sophisticated way to describe it, because now we see genomic medicine really does work. Now we've got to take our successes, we need to learn how to implement them more broadly, and we need to find more success.
And with that has to come uses demonstrating that [genomics] improves healthcare, reduces cost. We get insurance companies, get health systems onboard. And the point that I hope comes across in the paper is that we can't cede those activities to somebody else. We can't say, 'We're just genomicists, we're not going to be involved in implementation.' Part of being at the forefront of genomics, of being responsible stewards of the field, is to take partial responsibility to catalyze those things to happen, because without our involvement, they may not happen. We've got to push. We've got to fund. We've got to catalyze. It's all part of what we should be doing.
In one of your predictions for the next 10 years, you write, "Research in human genomics will have moved beyond population descriptors based on historic social constructs such as race." If not race, what kinds of descriptors should researchers be focusing on in order to ensure that databases and studies are as diverse as possible, or are painting as complete a picture of humanity as possible?
In many ways, this is a conversation going on even at the NIH level about how we do population kinds of research and how descriptors such as race are mixing social constructs with biological constructs. And the more we learn, the more it just is not logical. It doesn't have a scientific basis and it actually gets distracting. The COVID-19 pandemic is also bringing to light a whole lot of issues around health disparities that I think are going to continue to reveal some of the things that epidemiologists have been telling us for years. For example, you can learn far more about a person's vulnerabilities to disease if you have their zip code as opposed to if you have some information about their ethnic or racial background. It's just far more informative. There are so many other social determinants of health that are relevant. And yet we tend to check a box, which is race, that ends up having just all sorts of complexities that actually, in many cases, distract us from the science as opposed to being helpful.
If you look at a genomic analysis of a person relative to their ancestry — and you could do this either looking at biomedical research or looking at what ancestry.com or 23andMe provides to their consumers — genomics, in one simple experiment, gives you many orders of magnitude more high-resolution information about your ancestry, that it's almost laughable that we would go and use race in a scientific experiment. And the more we have genomic data from people, we have incredibly powerful tools to be able to say many things about people's history of their families and so forth under their geographic origins and so forth. Genomics tells us the answer to a lot of important aspects of this that give us biological information, as opposed to social construct information.
I think it's one of these things that we're going to look back on and just be very surprised that this remains part of what we were doing in research, let alone genomics research. Genomics is bringing the tools to give us the kind of information that would allow us to bypass these other descriptors. We're getting much more sophisticated at appreciating the social determinants of health. And as you get better at quantitating and understanding and characterizing these, that is far better than someone's self-reported racial categorization, which is fraught with problems and historically gets overinterpreted in ways that are counterproductive.
How do we make sure that advances in genomics and personalized medicine are reaching communities that have been left out of many scientific advancements thus far? How do we make sure that the gap between advantaged and disadvantaged communities doesn't get any bigger, specifically in terms of issues like access to technologies such as CRISPR or NIPT or advances in cancer diagnostics?
It is an amazingly complicated set of issues. They're not unique to genomics. Many other aspects of medicine suffer the same thing. There is not a single step that will be the solution; it's going to have to be multifaceted. For starters, we need to admit when we are not being successful, and we need to immediately put attention to the problem to be able to fix it. And I would say we still do not have enough underrepresented groups in the genomic studies that we do. That goes across the board from what genomes we've characterized to which human disease studies we [conduct]. Only recently are we beginning to fix this through programs like the All of Us research program at NIH. So, first thing is we need to recruit more underrepresented groups to all of our studies. But it turns out, that's very difficult. And genetic studies are unique in that way, in that we carry some burdens of our past misdeeds in genetics and the eugenics movement, and so forth. You've got to be willing to invest and do the hard stuff. So, it's going to cost us more to do it, but we're going to have to do that. We also have to be willing to spend effort to help people understand genetics, so they don't fear it the way some people do now. This speaks to issues around genomic literacy. It also speaks to issues about our workforce. If everybody in the genomics workforce looks like me, it's going to be very difficult to recruit individuals from minority groups.
The institute's very committed. We're putting our money where our mouth is — we are investing in studies to improve diversity among our research participants and putting our efforts to improve diversity in our workforce. And we are trying to improve genomic literacy across the board, and that includes everybody, including underrepresented groups.
There's also a genuine interest in trying to broaden the access to the genomic technologies, to genomic data science, sometimes referred to as the democratization of genomics, and making it more within reach.
In another of the predictions, you wrote, "The clinical relevance of all encountered genomic variants will be readily predictable, rendering the diagnostic designation 'variant of uncertain significance (VUS)' obsolete." That's a bold prediction — genomics researchers have been struggling with VUS for decades. What's needed to make this prediction come true in the next 10 years?
That whole area of uncertain significance is sort of like just admitting, 'I don't know.' And so, we want to get rid of that and strive to get to where we could understand every variant in every position in the genome. Maybe it's going to have to be a much greater advance of understanding and probably computer models to help us get there. But, let's make that aspirational, and let's try to see, can we at least develop predictive models so that maybe we can give a score. 'We think there's an 80 percent chance this is relevant, or a 20 percent chance.' But just to say variant is of uncertain significance seems like a punt.
We're not the only ones who will be doing this. We have a major program, we have a funding announcement out on the street, and the grants are just coming in. We're going to develop a big program to look at how variation functions in disease, which is basically developing all sorts of approaches to try to characterize variants that are part of the genome, how they influence genome function, and how that, therefore, plays a role or not in human health and disease. This is sort of a project that is following on the ENCODE program. ENCODE was figuring out all the functional elements in the human genome. This is now asking the questions, 'What's the role of variants when they occur in those functional elements? How do they influence function? And how could they influence human health and disease?' And so, we're going to put a big consortium together, a lot of smart people. They're going to work on this, maybe for a decade, maybe more. Ten years from now, I think [we might have] a little more sophisticated approach than VUS. I don't know what it will look like, but we've got to fix this.
We fully expect some of these [projects] will be developing high-throughput experimental methods to characterize variants. But we also think it'll involve a lot of modeling. The idea is, can you build predictive models that look at the stretch of DNA sequence, throw down every possible variant across that stretch and have a reliable model that predicts which of the variants are functionally consequential.
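[Editor's illustration: The modeling idea Green describes, placing every possible variant across a stretch of DNA and asking a model which ones matter, can be sketched in a few lines of Python. This is only a toy sketch and not NHGRI's method. The function names are hypothetical, and `toy_score` is a stand-in (it simply measures GC content) for the trained predictive model a real project would use.]

```python
from typing import Callable, Iterator, List, Tuple

BASES = "ACGT"


def enumerate_snvs(seq: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, reference base, alternate base) for every
    possible single-base substitution along the sequence."""
    for pos, ref in enumerate(seq.upper()):
        for alt in BASES:
            if alt != ref:
                yield pos, ref, alt


def apply_snv(seq: str, pos: int, alt: str) -> str:
    """Return a copy of seq with the base at pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def toy_score(seq: str) -> float:
    """Hypothetical placeholder for a trained model: the GC fraction.
    A real predictor of functional consequence would go here."""
    return sum(base in "GC" for base in seq) / len(seq)


def scan(
    seq: str, score: Callable[[str], float] = toy_score
) -> List[Tuple[int, str, str, float]]:
    """Score every substitution against the unmodified sequence and
    rank variants by the size of the predicted change."""
    seq = seq.upper()
    baseline = score(seq)
    results = [
        (pos, ref, alt, score(apply_snv(seq, pos, alt)) - baseline)
        for pos, ref, alt in enumerate_snvs(seq)
    ]
    return sorted(results, key=lambda r: abs(r[3]), reverse=True)


if __name__ == "__main__":
    # Print the five substitutions the toy model ranks as most consequential.
    for pos, ref, alt, delta in scan("ACGTTGCA")[:5]:
        print(f"{ref}{pos + 1}{alt}: change in score {delta:+.3f}")
```

[A stretch of N bases yields 3N candidate substitutions, so the hard part is not the enumeration but building a scoring model reliable enough to trust, which is the research challenge Green points to.]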
How has the coronavirus pandemic affected genomic research projects and initiatives? Do you see a long-term shift to infectious disease genomics or is this only temporary?
I think there is a chance there'll be more money put into [viral research]. But I've seen this play out before. It's no different than when there was the anthrax scare or when Zika hit — money went into certain areas. It's no different than any time there's a public health emergency — augmented money goes in. The one thing that COVID-19 revealed and continues to reveal is broad-scale appreciation for having a strong basic science and translational science and clinical research enterprise. The dirt, dirt, dirt cheap sequencing — the idea that we are now going to be doing surveillance of waste water as a means to follow the SARS-CoV-2 virus is only feasible because we've had a technology development program that reduced the cost of DNA sequencing by a millionfold. The idea that we have better and better ways of synthesizing DNA is helping vaccine development through synthetic constructs. The fact that we are getting better and better in cloud computing and having big datasets, that means that very large studies get off the ground very quickly to try to interrupt the pandemic. You need to have a broad base in biomedicine to be able to have a strong foundation across a diverse set of [scientific research] areas to have them properly situated to jump in and help.