NEW YORK (GenomeWeb) – The recent decision by a top UK research institute to outsource its bioinformatics data storage to a lower-cost, clean energy facility in Iceland has observers taking notice.
The Earlham Institute announced in September that it had partnered with Verne Global to study the benefits of storing genomics data at the company's 44-acre data center in Keflavik, Iceland. London-based Verne Global has operated the site, a former NATO base located near the country's main international airport, since 2012.
Timothy Stitt, head of scientific computing at the Norwich-based Earlham Institute, which specializes in crop genomics, said the institute has been looking to reduce its data storage costs as it tussles with a deluge of next-generation sequencing data. EI's flagship research into bread wheat, for instance, has researchers generating and analyzing fresh sequencing data on a genome that is five times the size of the human genome and is more complex.
"Here at Earlham we probably generate about 10 terabytes [of data] per week," Stitt told GenomeWeb. "We can see that number growing and growing over the next few years as the cost of next-generation sequencing platforms get lower and more people generate more data," he said. "It's certainly a big data problem."
While other European genome centers face a similar challenge, EI is apparently the first of its kind to "take the plunge" and partner with Verne Global to locate some of its high-performance computing infrastructure in Iceland. "We certainly know of other academic institutes in the UK and the US that have been thinking of it for a while but haven't signed any agreements yet," said Stitt.
That may be the case. The heads of scientific computing at two other genome centers in the UK said that they too have discussed outsourcing some data storage to Iceland. Both declined to comment publicly, as no decisions have been reached on the topic.
Data is "likely to be something that is outsourced," said Rob Davidson, a data scientist for online academic journal GigaScience and co-founder of the advocacy group Scientists for EU.
"Biological data is fast outstripping all the massive data from space telescopes; it's become astronomical in size," he told GenomeWeb. "You need big servers, storage facilities, cooling, and mega-fast Internet connections with constant up time," said Davidson. "Outsourcing is definitely going to be a part of this mix."
According to Davidson, though, it is price, rather than environmental concern, that will likely become the main determinant of where genomic data is stored.
"If history is anything to go by, we won't see green concerns guiding where the money goes," said Davidson. "The only thing that counts with public spending is doing more with less money," he said. "If Iceland can be cheaper, perhaps because of its energy source, then Iceland will get the business."
Davidson also cautioned that concern over sharing data across national borders might slow the outsourcing of genomic data to Icelandic data centers or elsewhere. "Iceland will probably get a lot of EU business, but we will need to see how political attitudes shift before seeing it as a global leader," he said.
Stitt said that EI had been in contact with Verne Global for about 18 months before deciding on the partnership. The institute was ultimately convinced by the promise of reduced cost and clean energy provided by both geothermal and hydroelectric sources, as well as year-round free cooling.
"We are in an academic setting, a lot of our money comes from the government, and we want to reduce our operational costs," said Stitt. "Obviously, hosting a large HPC data center, we spend a lot of money each year just to pay the energy and the cooling bills," he said. By moving data to Iceland, EI should be able to reduce its energy cost by about 70 percent, Stitt estimated.
"In the UK we pay 14 pence ($0.17) per kilowatt-hour, but in Iceland, with a hundred percent renewable energy, it's about 4 pence per kilowatt-hour," he said. "That is a huge cost saving for us."
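As a rough check, the two per-kilowatt-hour rates Stitt quoted can be compared directly with his estimate of a roughly 70 percent reduction. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the same amount of energy is consumed at either site and that nothing else, such as connectivity charges, enters the bill.

```python
def energy_saving_fraction(current_rate: float, new_rate: float) -> float:
    """Return the fractional saving when the per-kWh rate drops from current_rate to new_rate.

    Assumes identical energy consumption at both rates (an assumption, not a
    figure from the article).
    """
    if current_rate <= 0:
        raise ValueError("current_rate must be positive")
    return (current_rate - new_rate) / current_rate


# Rates quoted by Stitt, in pence per kilowatt-hour.
UK_PENCE_PER_KWH = 14.0
ICELAND_PENCE_PER_KWH = 4.0

saving = energy_saving_fraction(UK_PENCE_PER_KWH, ICELAND_PENCE_PER_KWH)
print(f"Estimated energy cost reduction: {saving:.0%}")  # prints 71%
```

On those two figures alone, the reduction comes to about 71 percent, in line with Stitt's estimate.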
He said that EI will now move some of its "noncritical HPC infrastructure" to Iceland and then will determine how it can remotely manage it. The institute aims to study the cost benefits of outsourcing data storage to Iceland and will publish its findings in order to "see if the results are beneficial to the wider academic community, in the UK and farther afield."
Jorge Balcells, director of technical services at Verne Global, said that the deal with EI is a "first step" toward encouraging other genome centers to outsource data storage to its Icelandic campus. Balcells has engaged these clients at conferences, and said that he has noticed a "pattern of interest" coming from such data centers.
He noted that Verne Global has a partnership with NORDUnet, a Northern European academic network provider, and that EI will be relying on NORDUnet in part to manage its relocated data.
Marius Olafsson, who administers RHnet, the Icelandic University Research Network, a participant in NORDUnet, said that his country does offer less expensive green energy and natural cooling, but cautioned that connectivity can be expensive, and that there are only three fiber sea cables connecting Iceland with North America and Europe.
"Any commercial client of the Icelandic data centers must take the high cost of connectivity into account, which is of course offset by low electricity costs," Olafsson told GenomeWeb. He noted that there is "a lot of unused capacity on the sea-fibers to and from Iceland."
Ægir Magnússon, divisional manager of sales and business development at Advania, another Icelandic data center, said that, like Verne, the company has seen an increase in recent deals with academic clients.
"We are both offering pure colocation space for these types of clients and now we are also starting to offer end-to-end HPC as a service solution for the HPC market," Magnússon told GenomeWeb. "This means that we can provide a hardware platform optimized for various high performance calculations, such as genomics, where clients can tap into our environment when needed," he said.
"We see a lot of interest in this solution and the fact that it is run in Iceland on renewable energy resources is a large factor in the decision making as well," Magnússon noted.
While some institutions are investing in or looking at Iceland as a possible solution for their data needs, others are turning elsewhere. Lee McGuire, a spokesperson for the Broad Institute in Cambridge, Massachusetts, for instance, said that the institute is actually in the process of moving its data to secure cloud storage and processing, rather than on-premises storage and processing, using cloud service providers, such as Google.
"I imagine we are not alone in this," he said.
Stacey Gabriel, director of the genomics platform at the Broad, discussed the decision to work with Google on Google Genomics' blog.
James Hamilton, vice president and distinguished engineer at Amazon Web Services and author of the blog Perspectives, which focuses on data center design and operations, told GenomeWeb that customers like the Broad often want to have their data stored close by, even if they are accessing via a cloud provider.
"There is a customer demand for data centers close to where they operate, so cloud providers place data centers in all major regions worldwide," Hamilton told GenomeWeb. He stressed that his opinions were his own and did not reflect his employer.
"I've noticed that most customers favor data centers near their data sources or close to their customers for low latency data access," Hamilton continued. "Iceland is somewhat remote, but it does offer good power pricing as a potential draw for those customers that don't mind the additional latency," he added.
Stitt said that EI had special data needs that made using a cloud service impractical.
"For some of our wheat analysis, each run takes three or four weeks, and it can take six to 11 terabytes of RAM just to do the assembly of the wheat," Stitt said. "That is a type of technology that we just can't get in the cloud," he said. "That's why we have to host our own hardware to do that."
For now, EI will proceed with moving some data to Iceland and embarking on its cost analysis study.
"We do hope the publication will encourage other academic institutions in the UK to follow suit to house their HPC resources in a green, environmentally friendly way, and at a much lower cost than in the UK," Stitt said.