NEW YORK – The national Estonian Biobank will use Pacific Biosciences' Revio HiFi sequencing system to analyze 10,000 human genomes from its biorepository, with a longer-term goal of sequencing its entire 200,000-sample bank, PacBio said this week.
According to the company, the Institute of Genomics at the University of Tartu has purchased three Revio instruments and plans to sequence the 10,000 samples in about two and a half years.
The biobank, which is located at the university, will use the data generated to inform various personalized medicine projects focused on cardiovascular disease, pharmacogenomics, cancer, and rare diseases.
According to Lili Milani, head of the Estonian Biobank and professor of pharmacogenomics at the University of Tartu, sequencing 10,000 genomes is one of the goals of the recently launched TeamPerMed project.
Initiated last year with a budget of €30 million ($33 million), contributed equally by the EU and the Estonian government, the six-year TeamPerMed project is being led by the University of Tartu and Tartu University Hospital to create a personalized medicine research and development center of excellence. The Institute for Molecular Medicine Finland at the University of Helsinki and Erasmus University Medical Center in Rotterdam, Netherlands, are also taking part.
"The genomes will serve several purposes," noted Milani. "Firstly, we will build an improved reference panel for genotype imputation for the rest of the biobank samples." The University of Tartu has already genotyped its entire biobank, which contains around 212,000 samples — about a sixth of the country's population of 1.3 million — mostly using the Illumina Global Screening Array.
Mait Metspalu, director of the University of Tartu's Institute of Genomics and the principal investigator on the TeamPerMed project, said that the bulk of the genetic data used to implement personalized medicine in Estonia will still be genotyping data, but that it will require imputation from the new whole-genome sequencing dataset.
Milani said the researchers are eager to look into complex genes relevant for pharmacogenetics, noting that long reads are "crucial for sequencing through complex regions of the genome, which are poorly mappable with short-read technology, and for accurate phasing of genetic variants."
"We were initially planning to sequence at least 10 percent of the [10,000] genomes with long reads, but based on the offers that we got from different sequencing technology providers, we realized that we could sequence all 10,000 genomes with long reads and get epigenetic information on top of that, too," she said.
According to Milani, the Estonian Biobank evaluated PacBio's Revio and Oxford Nanopore's PromethION before settling on the Revio system. On both systems, the researchers ran samples containing complex structural variants that they had struggled to resolve using short-read and array data, and they found the data quality to be largely similar.
"But for genetic insertions and deletions we observed a difference," said Milani. "On average, we were able to call considerably more indels per genome in the Revio sequence data, and with a higher concordance when comparing to existing Illumina short-read sequence data."
Given that TeamPerMed aims to produce a high-quality, long-read-based imputation reference panel for the Estonian population, including genomic regions with complex structures, the researchers opted for the Revio, Milani said.
Currently, the biobank is setting up three Revios in its laboratory. "We are of course eager to sequence the entire biobank as soon as more funding becomes available," Milani added, noting that the TeamPerMed researchers are also keeping an eye on new sequencing companies that are entering the market, such as Ultima Genomics and Element Biosciences.
The deal with PacBio was a "major first step" for TeamPerMed, which has already started recruiting staff, she said. The project has several pilots planned, and its research teams are currently preparing clinical studies to evaluate the effectiveness of polygenic risk scores for the prevention of cardiovascular disease and the use of pharmacogenetic information prior to drug prescription, according to Milani.
The sequencing data generated at the biobank will also help support the European 1+ Million Genomes initiative.
Neil Ward, PacBio's VP and general manager for Europe, the Middle East, and Africa, said that his company will work with the team in Tartu to "ensure they reach their target of 24 genomes per day at 20X coverage." PacBio will implement its latest high-throughput library preparation workflow, he added, automating as much of the process as possible with liquid-handling robots and using the company's newly launched HiFi Prep Kit 96. The Estonian researchers will also be able to run the sequencing data through PacBio's freely available WGS Variant Pipeline.
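For a rough sense of scale, the quoted throughput target can be set against the 10,000-genome goal and the roughly two-and-a-half-year timeline. The sketch below is a back-of-the-envelope calculation only: it assumes the 24-genomes-per-day figure applies to the three instruments combined and that they run every day without downtime, neither of which is stated here, and the function names are hypothetical.

```python
import math

TARGET_GENOMES = 10_000   # genomes planned under TeamPerMed
TARGET_PER_DAY = 24       # quoted daily target at 20X coverage (assumed lab-wide)
PLANNED_YEARS = 2.5       # "about two and a half years"


def days_to_sequence(n_genomes: int, genomes_per_day: float) -> int:
    """Whole days needed at a constant daily rate, assuming no downtime."""
    return math.ceil(n_genomes / genomes_per_day)


def required_daily_rate(n_genomes: int, years: float, days_per_year: int = 365) -> float:
    """Average genomes per day needed to finish within the given number of years."""
    return n_genomes / (years * days_per_year)


if __name__ == "__main__":
    print(days_to_sequence(TARGET_GENOMES, TARGET_PER_DAY))
    print(round(required_daily_rate(TARGET_GENOMES, PLANNED_YEARS), 1))
```

Under those assumptions, 10,000 genomes would take 417 days of uninterrupted sequencing at the target rate, while the stated timeline corresponds to an average of about 11 genomes per day, which would leave room for ramp-up, maintenance, and reruns.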
Asked whether the company had similar deals up its sleeve, Ward responded that PacBio is "always looking to partner with organizations large and small to support their research efforts."