This article has been updated to clarify that the All of Us program still intends to use genotyping arrays on all participants to generate data for research and for returning non-health-related results.
NEW YORK – The National Institutes of Health's All of Us research program is working on an investigational device exemption (IDE) submission to the US Food and Drug Administration to return results to participants next year and is adjusting the mix of genomic technologies to generate data for its study cohort.
Last week, the program announced a $7 million one-year award from the National Center for Advancing Translational Sciences (NCATS) to the HudsonAlpha Institute for Biotechnology to generate long-read whole-genome sequencing data for at least 6,000 samples from All of Us participants.
Over the next several months, HudsonAlpha researchers plan to conduct a pilot study to decide which technology to use. In addition, the program will likely drop genotyping array data for returning health-related results to participants and focus entirely on sequencing data for that.
The HudsonAlpha grant complements $28.6 million in NIH funding that went to three genome centers last year, led by Baylor College of Medicine, the Broad Institute, and the University of Washington. Those centers will generate and analyze genomic data for the project, both for research and for the return of results to participants. In addition, the program awarded $4.6 million this summer to genetic testing company Color, which will build a genetic counseling resource for the program.
In the meantime, the project is working on submitting an IDE application to the FDA so it can start returning results to participants next year. "We're trying to lock down when we're going to have the first return of results," said Brad Ozenberger, genomics program director for All of Us. "One of the most common questions we get from participants is, 'When am I getting my genetic results?' We want to communicate that pretty quickly."
Earlier this year, the program had said it planned to return results from a pilot study to 20,000 participants by the end of this year, but data production has not started yet.
"We won't start returning results until the IDE is complete," Ozenberger said. "We still intend to pilot the return of results, and it won't take us too long to do that. We're going to pilot, pause, evaluate our genetic counseling and the response by the participants, and then scale that up."
According to Ozenberger, discussions with the FDA about the types of results the program plans to return — a report on heritable disease risk that will focus on the ACMG 59 genes recommended by the American College of Medical Genetics and Genomics and a list of pharmacogenomic results — started about a year ago.
The program explained that it was planning to generate data in a CLIA/CAP environment but was going to label the results as research, not clinical, asking participants to seek confirmation from a healthcare provider before making any changes to their medical care. FDA decided back then that this protocol would fall under its purview and that an IDE submission would be required.
This summer, as part of the pre-submission process, All of Us submitted a first version of its protocol to the FDA. "We got lots of feedback from them," Ozenberger said, including suggestions and questions the program needs to address for the formal IDE submission.
"We have a great relationship with their review team and their devices group and they've been working very closely with us to make sure we fulfill the requirements they need," Ozenberger said, adding that the plan is to complete the IDE application within the next six months.
FDA's requirements focus on reducing the risk to participants. On the one hand, that means making sure that participants receive the same variant results regardless of which genome center processes their sample. For that, the three centers (HudsonAlpha, for now, will not generate health-related data for return of results) are in the midst of an analytical validation study, for which they are sequencing a number of cell lines with known variants in ACMG or pharmacogenomic genes, he said. They are also discussing with the FDA what kinds of changes to the protocol the program can make in the future and what changes would require additional approval.
All of Us has also found that using two platforms to generate health-related results for return — one based on genotyping arrays, the other on whole-genome short-read sequencing — will probably be too complex. "We're very likely going to remove genotyping from the equation and just focus on one assay" for the IDE application, Ozenberger said.
However, the program still plans to use Illumina genotyping arrays to generate data for research and for the return of non-health-related results. Illumina said last year that it would donate Infinium Global Diversity Array genotyping arrays, which it had developed for the program but also recently made commercially available, for up to 1 million samples at no cost to the program.
The other aspect of risk mitigation relates to communicating results to participants. Ozenberger said that All of Us is currently working with its own institutional review board on a protocol for returning results, noting that some of this work had to wait until the award to Color was made so the company could contribute to the discussion. It involves the design of reports, what types of warnings they contain, and how they explain the results, including the difference between research results and clinical tests, he said.
Recently, the FDA has taken a tough stance against laboratories offering pharmacogenetic testing without its approval, and has reportedly asked the All of Us program to only return PGx results that are supported by FDA-approved drug labeling. However, Ozenberger said nothing has been finalized yet. "We're in very active discussions with FDA about the pharmacogenomics report and what it contains, and we've gone back and forth, and we really haven't come to a final stage that I can talk about," he said. "We haven't come to a final decision yet."
Reporting results from the ACMG 59 genes will be a little less contentious because they will be communicated by a genetic counselor, which mitigates the risk to participants. In addition, pathogenic variants in ACMG genes will be validated in a second sample before they are returned. Findings in ACMG 59 genes are also expected to occur at a much lower frequency than actionable PGx findings, which more than 90 percent of participants will have, according to a report by All of Us investigators in the New England Journal of Medicine this summer.
One aspect of the program the FDA has not reviewed in full yet is the consent package, which will be part of the full IDE submission. Technically, the program should not have begun enrolling participants prior to obtaining the IDE, but it has already accepted more than 200,000 people into the study, who have contributed biological samples, and is signing up more than 3,000 per week. "Our horse is out of the barn, as they say," Ozenberger said.
While the program and its three genome centers are working on the IDE submission, HudsonAlpha investigators plan to conduct a pilot study before the end of this year to decide which long-read sequencing technology they will use to analyze 6,000 to 7,000 participant samples in 2020.
According to Shawn Levy, a faculty investigator at the institute, the goal of the project funded under the award is to determine what types of structural variants can only be detected by long-read data. "So far, there has not been an effort to start evaluating those at a population scale to start determining what the frequencies for those kinds of variants are," he said. "This is really an effort to expand our knowledge of the genome from a structural variant perspective."
The long-read data will likely complement structural variants gleaned from short reads, he said, and samples that appear to harbor complex structural rearrangements, based on short-read data, could in the future be prioritized for long-read analysis.
In addition, long-read data might even help with calling single-nucleotide variants in genes that have pseudogenes or repeat structures and are difficult to analyze with short-read data alone. Levy said this was recently demonstrated by Mark Ebbert from the Mayo Clinic, who gave a talk about his findings, involving both PacBio and Oxford Nanopore data, during an Oxford Nanopore-sponsored workshop at the American Society of Human Genetics annual meeting last week.
HudsonAlpha's pilot project will help determine "what is the most appropriate method to get 6,000 genomes done in a year," he said. The plan is to sequence 96 well-characterized samples using the PacBio Sequel II and Oxford Nanopore's PromethION 48. A subset of the samples will also be analyzed with Bionano Genomics optical mapping technology and with linked reads from 10x Genomics and Illumina data, he said, noting that HudsonAlpha has all four technology platforms in house.
Levy said work by other groups, including a study by members of the Human Genome Structural Variation Consortium published earlier this year in Nature Communications and another on structural variation in cancer genomes by an international team, published in Nature Genetics last year, laid the foundation for his team's pilot study.
Since those studies came out, both Oxford Nanopore and PacBio have made "some pretty significant advances in terms of their output and the cost per genome, especially from a structural variant perspective," Levy said. "Both manufacturers have made statements as well as provided example datasets showing that both technologies are really moving at a very encouraging pace." For example, he said, PacBio's chemistry and chip updates have improved output "very nicely" and the platform's accuracy has continued to improve. Likewise, Oxford Nanopore has significantly improved its basecalling algorithms to increase the accuracy of its data and the platform's performance on homopolymers, he said.
The pilot project will "put that to the test," he said. Based on the data and input from the three All of Us genome centers, a platform will be selected for the program, ideally by the end of this year. "We want to start the production phase of the project as early in 2020 as possible, but at the same time, we're not going to rush through it," he said.
Both PacBio and Oxford Nanopore "stand by their ability to deliver a depth of coverage that will resolve structural variants at the accuracy that the program needs," Levy said. While it is premature to say what coverage depth that will require, he said that if everything goes as planned, whichever technology is selected will be used at around 20X to 30X coverage.
While each platform might have advantages and disadvantages, the program will probably not choose more than one. "Just given the scale and the scope of the experiment, I think that would be unlikely," Levy said. "However, I certainly can't take that off the table."
The program has not determined yet which 6,000 samples out of the 200,000 participants will undergo long-read sequencing, he said. None of the long-read data will be produced in a CLIA environment and as of now, none of it will be returned to participants, he added.