SAN FRANCISCO (GenomeWeb) – After a long delay, researchers on the Faroe Islands have begun their effort to eventually sequence some 50,000 residents.
The Faroe Genome Project, or FarGen, was first announced in 2011, but has since faced logistical hurdles related to obtaining the appropriate regulatory approvals with regard to informed consent and how to manage the data.
Researchers have now broken up the project into several phases in order to demonstrate that the infrastructure and pipeline are robust. In the first phase, the team will sequence 1,500 exomes using both Illumina sequencing technology and 10x Genomics' Chromium platform, the latter to generate long-range information. Already, researchers have consented around 1,200 individuals and have prepared libraries for 200 of those.
The project has also secured funding of around DKK 15 million ($2.4 million), which will enable it to complete the entire first phase and part of the second phase, Noomi Gregersen, project manager for FarGen, said. Funding is from a mix of private and public entities, including the Faroese and Danish governments, she said.
An initial goal of the first phase will be to generate a Faroese reference genome, Gregersen said. For that, the researchers will use exome data, select a subset of participants for whole-genome sequencing, and use the long-range data from 10x Genomics to identify population-specific structural variants and haplotype information.
The Faroe Islands, located between Norway and Iceland but part of the Kingdom of Denmark, has a unique population due to its relative isolation. A population-specific reference could help inform future genomic studies and identify haplotypes specific to the Faroese, as well as clinically relevant variants that are more or less common in the population than elsewhere. Previous research groups have already demonstrated the power of population-specific reference genomes in identifying novel structural variants. For instance, the team that built a Korean reference genome identified more than 11,000 novel structural variants.
Even within the Faroe Islands, which is actually a grouping of around 18 individual islands, there are subpopulations that have evolved relatively independently of each other with their own dialects, Gregersen said. So, to construct a Faroese reference genome, researchers will have to carefully identify participants in order to get a good representation of the whole population and not just one subpopulation, she said.
The Faroe Islands is well suited for genomic research not only because of its unique and relatively small population, but also because its public health system keeps extensive data. The Faroese public health services manage a Genetic Biobank, which includes a genealogy registry, a diagnosis registry, and a tissue registry. The genealogy registry is a family tree that includes all Faroese spanning from 1650 to present day. The diagnosis registry includes all diagnoses that are made in conjunction with research projects by physicians within the public health services.
Gregersen said that the unique regulatory framework in the Faroe Islands, by which research data can end up as part of the clinical record, created logistical hurdles that delayed FarGen. Over the past few years the FarGen organizers have been working on ensuring that the sequencing workflow, from sample collection through analysis, produces high-quality data. In addition, they developed infrastructure and the pipeline through which the genomic data will be stored, accessed for research, and integrated within clinical records.
Genomic data will be stored in the tissue registry of the Genetic Biobank. Individuals who are interested in participating in FarGen consent to having their genomes or exomes sequenced and the data deposited in the Genetic Biobank. Participants initially consent to having that data be used to construct the Faroese reference genome, Gregersen said. Researchers who then wish to use any of the genomic data for a research project have to apply to the Genetic Biobank to access the data, and participants then reconsent to having their data be used for that specific research project.
For instance, she said, one research group has already received permission to use FarGen samples in a hereditary breast cancer study. The goal of that study, she said, is to identify variants that increase the risk of heritable breast cancer.
Women in the Faroe Islands who appear to have hereditary breast cancer due to their family history have been mostly negative for known breast cancer variants, Gregersen said. "We haven't identified any of the known BRCA mutations in these women," she said. So, the study will use exome sequencing to try to find risk variants that may be specific to the Faroese population.
For FarGen samples that are being studied as part of the hereditary breast cancer study, any diagnostic variants that are identified as part of that study and related to breast cancer would then be recorded in the diagnosis registry, Gregersen said. In addition, physicians of FarGen participants can access the full exome data, but only for specific diagnostic purposes, she said. However, individuals themselves have a right to their own data under Faroese law, so they could request their genome file.
Gregersen said she anticipates that the first phase of the project will be completed by late next year. The second phase will include the sequencing of 5,000 individuals, she said, and the researchers are already working on obtaining the appropriate permissions for that phase.
Another key question will be whether to move from exome to whole-genome sequencing.
Currently, the sequencing lab is equipped with Illumina's NextSeq and MiSeq instruments. Gregersen said that the researchers would test whether it would be feasible to sequence genomes on the NextSeq. She said that while it may be possible for a small number of genomes, sequencing thousands on a NextSeq would likely not be cost-effective.