NEW YORK (GenomeWeb) – Almost five years after the UK government first announced plans to sequence 100,000 genomes of patients and their families, the project has returned results for thousands of rare disease patients and hundreds of cancer patients to the National Health Service.
The 100,000 Genomes Project is now expected to wrap up by the end of 2018, later than originally planned because of a number of challenges encountered along the way, including generating high-quality cancer genomes and streamlining the analytical pipeline.
In the meantime, the NHS is preparing to commission whole-genome sequencing services to provide WGS as a routine diagnostic test for certain rare diseases and cancers, starting in October of 2018.
During a workshop last month at the American Society of Human Genetics annual meeting in Orlando, Florida, Mark Caulfield, chief scientist of Genomics England, the Department of Health-owned company that organizes the 100,000 Genomes Project, and Sue Hill, NHS chief scientific officer for England, provided an update on the project and future plans. Caulfield provided additional information during an interview last week.
The goal of the project is to sequence 100,000 genomes from a total of about 70,000 patients and family members, a mix of rare disease and cancer patients.
In 2014, Genomics England chose Illumina as its sequencing partner for the project and announced £311 million ($409 million) in funding for the project from various sources, including an investment from Illumina. Later that year, NHS England chose 11 Genomic Medicine Centres (GMCs) — now expanded to 13 — to recruit patients and deliver genomic results for use in their healthcare. Following several pilot studies, the project took off officially in 2015.
As of this week, the 100,000 Genomes Project had collected more than 63,500 samples and sequenced 39,500 genomes, more than 31,000 of them from rare disease families. So far, it has returned about 6,000 genome reports for almost 3,000 rare disease families, as well as more than 600 cancer reports.
One of the reasons so few cancer genomes have been generated so far, Caulfield said, is that the project had to change the way tumor samples were collected. Initially, the samples were formalin fixed, resulting in DNA damage and poor-quality genomes. Following a number of pilot projects to test new protocols, the vast majority of samples are now either fresh or fresh-frozen tissues, which yield high-quality genomes. "We had to get the NHS to reengineer the whole way they do molecular pathology to handle this, and this has proven challenging for them because this is in routine NHS care, it's not a research project," Caulfield explained.
All sequencing for the project is performed at a dedicated sequencing center on the Wellcome Genome Campus that is run by Illumina on behalf of Genomics England. "We've done lots of things together that, if we were in a traditional customer-supplier relationship and just bought a load of X Tens from them, we could not have done," Caulfield said.
Currently, the center uses HiSeq X Ten sequencers and the reads are mapped to the human reference genome build 38 (GRCh38). Some earlier genomes were aligned to build 37, Caulfield said, but those are in the process of being converted to build 38. Rare disease samples are sequenced to at least 30-fold coverage, though the usual coverage is 35-fold to 40-fold, and cancer genomes are sequenced to at least 75-fold coverage, though it is mostly 85-fold to 90-fold, he said.
The raw sequence data as well as the variant call files go back to Genomics England, which has enlisted a few companies to help it with the annotation and interpretation of the data. Following a bake-off, Genomics England in 2015 selected four companies and contracted with three of them: Congenica, Fabric Genomics (at the time called Omicia), and WuXi Nextcode.
However, Caulfield stressed that Genomics England controls the final report that goes back to the GMCs, and it is up to the NHS clinicians to decide what information to return to patients. "This is an analysis of the whole genome, so it's not a diagnosis that they are required to return to the patients," he said.
The report for rare disease patients groups variants into three tiers: known pathogenic variants in known disease genes, potentially pathogenic variants in known disease genes, and likely pathogenic variants in genes not yet known to be associated with disease.
As of last week, the project had returned just over 6,000 genome reports for almost 3,000 rare disease families to the GMCs, and about 20 percent to 25 percent of those contained a potential molecular diagnosis. Clinicians "look at it and they either agree with that or they don't agree with it, or maybe they say, 'I know what you're thinking here, but actually, we think it's this mutation over here' and they actually come up with something different," Caulfield said.
In addition to the genome reports, NHS clinicians get access to the BAM files with the raw sequencing data, as well as to the variant call files — sometimes even before the report is completed, "so if they want, they can have a go at doing the report themselves … as a sort of learning exercise," Caulfield said.
Genomics England is in the process of analyzing genomes from a batch of 2,500 families, which will probably go out to the GMCs later this month, and more genome reports are scheduled to go out in December or early January. "Then we will have cleared most of the families where we've got enough data to do so," he said, adding that for some families, more clinical data is needed.
Good phenotyping information has been helpful for obtaining a diagnosis. The NHS supplies the project with clinical information from rare disease patients on a regular basis, in the form of Human Phenotype Ontology (HPO) terms that describe both the presence and the absence of certain clinical features. Using HPO terms, Caulfield explained, allows the project to share clinical data with other researchers around the world in order to find other patients with similar symptoms, which has already helped to solve several cases.
So far, the project has not updated any genome reports based on new research results "but it is our intention to do that," Caulfield said. It has already established the Genomics England Clinical Interpretation Partnership (GeCIP), where researchers and clinicians collaborate in disease-specific and function-specific domains to refine the clinical interpretation of the genomes generated by the project. To date, more than 2,600 researchers from 342 institutions in 24 countries have signed on, and the first 34 researchers were given access to the data in June. "They do research and drive up the value for clinical interpretation, so we will send that information back to the NHS when we get it," Caulfield said.
For cancer patients, NHS clinicians receive a report that lists actionable mutations. Also provided are hyperlinks to a clinical trials database. Based on the first few cases, about 60 percent of patients have a match to a clinical trial, which is more than Genomics England had expected, but Caulfield cautioned that this number is likely going to drop once results are returned at scale. "But even if it was 20 percent, or 10 percent, it could change the outcome or the opportunity for the patient, and that will be well worth it," he said.
Clinical data collected for cancer patients from the NHS includes a lot of routine data, such as cancer type, cancer stage, and tumor cellularity, Caulfield said. In addition, the project collects data about the disease course.
As of last week, the project had returned about 600 cancer reports, a number that will increase to 900 by mid-November. About 30 of these analyses were completed in less than 20 days, he said. Building a semi-automated pipeline for analyzing the sequencing data has helped the project to lower the turnaround time for cancer genomes, he added.
The project also plans to return a limited number of secondary findings and carrier status information to participants who opted into receiving them, starting in the coming months. According to a survey, 88 percent of those enrolled in the rare disease program have opted in, Caulfield said.
The list of genes for which secondary findings will be reported is currently much shorter than the American College of Medical Genetics and Genomics gene list and includes genes implicated in colorectal cancer, breast and ovarian cancer, von Hippel-Lindau syndrome, multiple endocrine neoplasia syndromes, retinoblastoma, and familial hypercholesterolemia.
Caulfield said Genomics England decided to exclude certain genes with variants that appear to have incomplete penetrance. "We didn't want to give that information back because we felt that might cause unnecessary worries to some people," he said. "Until we understand more about that information, we prefer to give back things where there is a high degree of likelihood that if you knew about it, there is something you could do."
Also, the project will not return results for adult-onset diseases, such as mutations in the BRCA1 or BRCA2 genes, to pediatric patients.
In terms of carrier status, the project will return mutations for eight conditions, including cystic fibrosis, sickle cell anemia, congenital adrenal hyperplasia, Duchenne muscular dystrophy, adrenoleukodystrophy, and certain forms of thalassemia and hemophilia.
No pharmacogenomic variants will be returned initially but those will be added in the near future, Caulfield said.
While the 100,000 Genomes Project progresses, the NHS is preparing to take the next step — bringing whole-genome sequencing into routine diagnostics.
The health service is currently preparing to commission whole-genome sequencing services, Caulfield said, with the aim of making them available to patients in October 2018.
Genomics England has already assisted the NHS in forming a test directory — a hierarchical set of tests the health service is going to commission — that includes not only whole genomes but also other genomic tests, such as microarrays, exomes, and gene panels. Whole-genome sequencing will likely be available for certain groups of rare diseases and a limited number of cancer types, he said. Because there is more work to do in cancer, "we will probably preserve a research cancer pipeline until we drive that into healthcare," he added.
Genomics England's mission until 2021 is to provide the NHS with research informatics solutions and concentrate the UK's knowledgebase of genomics in one location for all genomic testing, Caulfield said, especially where patients have given consent for research purposes. "And we will longitudinally follow those patients using electronic health records, and that will allow us to continue to grow the resource and to continue to add value for future patient diagnoses," he added.
In addition, Genomics England will continue to work with researchers and industry "to try and drive up the value of this for patient care."
The difference from the 100,000 Genomes Project, he said, will be that "instead of us paying for sequencing the whole genomes, the NHS will."
Overall, Caulfield said, it has been exciting to see how the project has delivered molecular diagnoses to many rare disease patients and their families. "They don't expect you to come up with a treatment, actually, they are just incredibly grateful that for the first time, they know why they are like they are," he said.
"If you want to know what keeps us up at night, we worry most about retaining public and patient trust, and that's uppermost in our minds: looking after the people who are in the program," he said.