NEW YORK – Researchers at the University of Copenhagen in Denmark are working on integrating genomics data from various research cohorts into a new tool they developed that allows them to explore disease patterns based on data from more than 7 million patients collected over a period of 25 years.
The free tool, called the Danish Disease Trajectory Browser (DTB), was discussed at length in a recent Nature Communications article.
"We have a number of quite large-scale projects where we have hundreds of thousands of genotypes," said Søren Brunak, a professor at the Novo Nordisk Foundation Center for Protein Research at the University of Copenhagen and corresponding author on the paper. "We have approval for using this data in large-scale projects and will of course get some of this data into the browser at the summary level where they are not person-sensitive."
It is clear that researchers see a role for the DTB in the Scandinavian country's ongoing personalized medicine programs. In 2017, Denmark began implementing its National Strategy for Personalized Medicine, dubbed Per Med, part of which called for the establishment of a national genome center, a goal that was fulfilled the next year. Also in 2018, the Novo Nordisk Foundation agreed to invest DKK 990 million ($150 million) in the center through 2022.
Per Med organizers have forecast that as many as 60,000 people might undergo whole-genome sequencing in the first five years of the center's operations, all data that could also be fed into the browser, Brunak noted, though he acknowledged that not all genomic data generated in Denmark will mesh with the browser's focus on disease trajectory.
"We will of course transfer summary statistics from whole-genome sequencing data to the browser when relevant," he said. "Yet not all data can easily be related to progression. A lot of genetic data concerns disease risk, but the browser focuses on going from one disease to the next, and for that, you need progression biomarkers computed as averages over large patient subgroups."
Highways of disease
Denmark's new foray into personalized medicine benefits from having electronic health records linked to national identity numbers, with health record data stretching back to 1968. The concept underlying the browser was to see if certain diseases were associated with others over time. As people live longer, they develop comorbidities, and it is therefore possible to track whether having one illness might lead to having a seemingly unconnected illness later on. The researchers termed these trajectories "disease highways," and, as an example in the paper, they used the tool to model how Down syndrome patients were more likely to develop Alzheimer's disease later in life.
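The idea of one diagnosis preceding another can be illustrated with a minimal sketch. It is not the researchers' published method, only a toy count of how often a first diagnosis of one disease comes before a first diagnosis of another in the same patient. The function name `count_directed_pairs`, the record layout, and the sample records are all hypothetical and invented for illustration. The codes are ICD-10 codes (Q90 for Down syndrome, G30 for Alzheimer's disease, I10 for hypertension).

```python
from collections import Counter
from itertools import combinations


def count_directed_pairs(records):
    """Count ordered diagnosis pairs (earlier_code, later_code) across patients.

    records: iterable of (patient_id, disease_code, year) tuples.
    This layout is a hypothetical simplification, not the browser's data model.
    """
    # Keep only the earliest year each patient received each code.
    first_seen = {}
    for patient, code, year in records:
        per_patient = first_seen.setdefault(patient, {})
        if code not in per_patient or year < per_patient[code]:
            per_patient[code] = year

    pairs = Counter()
    for per_patient in first_seen.values():
        # Sort a patient's diagnoses by year so each pair is (earlier, later).
        ordered = sorted(per_patient.items(), key=lambda item: item[1])
        for (code_a, year_a), (code_b, year_b) in combinations(ordered, 2):
            if year_a < year_b:  # skip same-year pairs, where order is unknown
                pairs[(code_a, code_b)] += 1
    return pairs


# Invented toy records, not real patient data.
records = [
    ("p1", "Q90", 1994), ("p1", "G30", 2015),
    ("p2", "Q90", 1996), ("p2", "G30", 2017),
    ("p3", "G30", 2010), ("p3", "I10", 2012),
]
pairs = count_directed_pairs(records)
print(pairs[("Q90", "G30")])
```

A real analysis would also need to test whether such an ordering occurs more often than chance, which this sketch does not attempt.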
The researchers have put the Danish Disease Trajectory Browser online, where it is free for anyone to access. The DTB is arguably unique in an international context in that it enables users to explore longitudinal disease progression patterns for an entire country. It relies on data collected between January 1994 and April 2018 and includes electronic health data on 7.2 million Danes, collected during 122 million hospital admissions over that period. Users can search for one or more disease codes, using World Health Organization classifications, to explore progression patterns using diverse functionalities. Data from the Danish Registry for Causes of Death is also included.
"The underlying method highlights the potential value of studying an entire nation that epidemiologically is considered an open, dynamic cohort where follow-up is technically complete," noted Amalie Dahl Haue, a researcher at the Novo Nordisk Foundation Center for Protein Research and a coauthor on the paper.
"Often patients are diagnosed with more than one disease, but today, few guidelines take the sequence of comorbidities into account in risk-stratifying patients," she said. "To target diseases with greater precision, we need to understand disease development better, and one way to do this is to map nationwide trends in electronic health data, which is a domain that is just beginning to take speed."
Haue agreed with Brunak that in the future, these disease trajectories will be linked with other data types, such as genomic data, complementing classical epidemiological studies. Such work, she said, will "pave the way for better treatment and delivery of more efficient healthcare."
Haue also noted that while the browser was built using Danish data, it could technically be applied elsewhere, though there are challenges in doing so. "We do have international collaborations, where we have tried to apply the method in a non-Danish dataset," she said. "This is not always easy, as some countries work with legal restrictions that are more disease-specific."
Different kettles of fish
The commencement of Denmark's Per Med program has coincided with similar strategies across Europe. In Northern Europe, in particular, countries like the UK, Finland, and Estonia have developed large biobanks of genotyped and sequenced cohorts that are being used to inform ongoing personalized medicine strategies. The UK Biobank initially genotyped its repository of 500,000 individuals, followed by whole-genome and whole-exome sequencing in recent years. Finnish researchers and private partners are, meanwhile, working to genotype 500,000 Finns by 2023. And in Estonia, all 200,000 samples housed in the Estonian Biobank have been genotyped using Illumina arrays.
To date, however, the Danes have not opted to build a similar national resource, and genotyping and next-generation sequencing data is being collected via various cohort studies instead. For instance, researchers at the University of Copenhagen and the Capital Region of Denmark partnered in 2018 with DeCode Genetics in Iceland to genotype 120,000 Danish cardiology patients using arrays, data that could be integrated with the browser, according to Brunak.
There is also the data being collected through the National Genome Center, which will store whole-genome sequencing data generated as part of routine care in the country. According to Brunak, following a lengthy review process, officials have just finished selecting patient groups that will receive whole-genome sequencing, which is a step toward generating new data.
"That's another potential source of data," he said, noting that the National Genome Center now employs about 60 people at two sites, one in Copenhagen, the other in Aarhus. It has also recently concluded the design of a national workflow for calling variants. "The infrastructure is emerging," he said.
Brunak underscored that researchers will mostly stick with Danish summary genetic data in future projects that use the browser, to match variation in the Danish population to data gleaned from the new tool. "It is just a different kettle of fish, since we don't have population-wide data," he said, referring to the various cohorts that might be used, "but we will have to do it little by little, in a partial way."
As for when this integration will occur, Brunak said that nothing is set in stone. Much depends on when pilot projects around such data integration will be approved. Still, he confirmed that it will happen. It should give researchers a better idea of the relationship between developing different diseases over time, and the role genetics plays in that.
"The whole idea with the browser is to condense the disease progression patterns so that we see the most frequent ones, and then we will actually understand what is driven by genetics and what is driven by other exposures," he said. "This is where the genetics come in."