Amazon Web Services Aims to Remove Computational 'Heavy Lifting' for Genomics Customers


CHICAGO – Amazon Web Services has been expanding its reach into genomics in recent years and continues to grow in areas like molecular diagnostics through new initiatives.

As might be expected of one of the world's three largest commercial cloud platforms, AWS counts some well-known names in the genomics and bioinformatics world among its customers, including Ares Genetics, Regeneron, Melbourne Genomics, Australia's Commonwealth Scientific and Industrial Research Organisation (CSIRO), Seven Bridges, BC Cancer, Fabric Genomics, Genomics England, the Global Alliance for Genomics and Health, DNAnexus, the UK Biobank, Konica Minolta Precision Medicine, Illumina, and the Broad Institute.

"Over the past 15 years, AWS has helped remove the undifferentiated heavy lifting so that customers are able to figure out what's the differentiating value for them," Wilson To, AWS global head of healthcare, life sciences, and genomics, said this week during AWS' Healthcare & Life Sciences Virtual Symposium. "Genomics" is a recent addition to his title, indicating the field's increased importance to the company.

Pat Combes, AWS worldwide technical leader for healthcare and life sciences, said that the firm's history in genomics goes back to the launch of its cloud platform in 2006. He described himself as the "bridge" between customer-facing technical staff and AWS' internal engineering and service teams in life sciences.

Combes said that this "heavy lifting" in genomics includes mundane, time-consuming computing tasks such as genome assembly, read alignment, and analysis of sequencing runs. "Wherever we see it, wherever we encounter it, we will build out services and solutions necessary to help relieve that from our customers so that they can focus on the very specific and really exciting work that they do," he said.

For example, he said, Munich Leukemia Laboratory, or MLL, in Germany has a private cloud that is hosted by AWS. In concert with Illumina's Dragen platform and BaseSpace environment, the lab has been able to reduce the time to process a genome sequence from 20 hours to three hours.

MLL bioinformatics head Niroshan Nadarajah said that the lab has been an AWS customer since it embarked on its 5,000 Genomes project in 2017.

"It required an enormous amount of compute resources," Nadarajah said via email. MLL did look into building its own high-performance computing infrastructure to analyze whole genomes, but it did not make economic sense.

"It would have required huge upfront investment costs," Nadarajah said. "Not only for the hardware, but we would have also needed more space in the lab [and] more people to maintain the resources."

MLL also wanted its data to remain in Germany to meet strict privacy laws in that country. At the time, Nadarajah said, the two largest AWS competitors, Microsoft Azure and Google Cloud Platform, did not have their German data centers operational yet.

The AWS cloud is essentially a giant network, Combes noted, which people can use as they choose. The cloud has long been attractive to genomics users because the on-demand scalability is a lower-cost alternative to running high-performance computing centers. Also, the sharing and collaboration that cloud computing facilitates grew in importance last year as the COVID-19 pandemic forced so many people to work from home.

"Most of the computational work in genomics is pretty much a massively parallel task," Combes said. "That is a central advantage of any cloud platform."
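To illustrate what "massively parallel" means in this context, here is a minimal Python sketch, not anything AWS or its customers are known to run. It assumes a toy task (computing the GC fraction of a set of reads) and uses made-up read strings; the function names `count_gc`, `chunked`, and `parallel_gc_fraction` are hypothetical. The point is only that each chunk can be processed without reference to any other, so the work spreads across as many workers as are available.

```python
from concurrent.futures import ThreadPoolExecutor


def count_gc(reads):
    """Return (gc_bases, total_bases) for one independent chunk of reads."""
    gc = sum(read.count("G") + read.count("C") for read in reads)
    total = sum(len(read) for read in reads)
    return gc, total


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_gc_fraction(reads, chunk_size=2, workers=4):
    """Process chunks independently, then combine the partial results."""
    chunks = chunked(reads, chunk_size)
    # Local threads stand in for what would be separate machines in a cloud setting.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_gc, chunks))
    gc = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    return gc / total if total else 0.0


reads = ["ACGT", "GGCC", "ATAT", "CGCG", "TTAA"]  # made-up toy reads
print(parallel_gc_fraction(reads))
```

Because no chunk depends on another, doubling the number of workers roughly halves the wall-clock time for the mapping step, which is the property that makes on-demand scaling attractive for this kind of workload.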

Combes also said that genomics workflows are becoming more sophisticated as research advances, specifically mentioning CSIRO's development of a CRISPR/Cas9 guide RNA design tool as a complicated undertaking.

He also spoke of the diversification of many of AWS' life sciences customers, including Illumina.

Illumina, for example, fully migrated its BaseSpace informatics platform to AWS in 2018, though the platform had partially operated on the Amazon cloud before then.

"They were sort of refashioning it for native use on AWS," Combes said of BaseSpace. Illumina also was incorporating functionality from its 2018 acquisition of Edico Genome, which brought the Dragen analysis platform into the fold.

Combes also spoke of increased use of the cloud in diagnostic applications of genomics, such as at MLL, which saw 5,000 Genomes as its vehicle for bringing WGS-based diagnostics into routine patient care, according to Nadarajah.

In March, diagnostics firm Konica Minolta Precision Medicine, via its Ambry Genetics subsidiary, also entered into a five-year deal for AWS to become the company's "preferred cloud provider" for the global build-out of its integrated, multiomic diagnostic data platform, Lattice. As part of the partnership, Amazon made an undisclosed investment in KMPM.

In an emailed statement, Ambry Genetics CEO Tom Schoenherr said that the company chose to partner with AWS because of the global reach and "breadth and depth of services" the Amazon cloud offered.

"AWS will enable Lattice to seamlessly scale operations globally, enabling clinical-grade performance while focusing on building new tools," Schoenherr said. "Further, the virtual endless storage capacity of AWS reduces processes, saving time for employees and accelerating turnaround times on test results."

A month after the KMPM partnership, Amazon announced the next phase of its AWS Diagnostic Development Initiative, in which it plans to distribute $12 million this year to fund projects for SARS-CoV-2 testing, as well as for other infectious disease diagnostics. The company first launched the initiative in March 2020 as a way to accelerate research, innovation, and development of diagnostics for the detection of the coronavirus that causes COVID-19.

While the initiative traces its roots to the early days of the pandemic, Combes sees potential far beyond COVID-19. "It really jump-started a lot of interesting diagnostic use cases on AWS that are multiomic," including public health surveillance, early detection of diseases, and rapid test development, he said. "There's a lot of … expanded efforts into virology and the use of genomics in virology research."

Combes said that the pandemic has changed how customers in genomics approach the cloud. "It's not just capability but recognition of the ease of acquiring high-powered resources," as well as cost savings over investing in high-performance computing infrastructure, he said.

The cloud, of course, supported all the remote collaboration on "virtual desktops" necessitated by office and lab closures, but that is true across all industries that AWS serves, Combes noted.

From Amazon's perspective, COVID-19 has also triggered decentralization of laboratories, which Combes said he expects to remain after the pandemic is over. "A lot of larger labs that had been centralized in a single location really diversified a lot of their work across smaller locations that were able to remain open and do direct work" with proper social distancing, he said. "That meant that a lot of data consolidation and analysis took place on AWS as a result."

Another growth area is related to the Amazon Registry of Open Data, a collection of public-domain datasets across multiple industries that AWS has made available on its cloud based on requests from customers. This collection includes the Genome Aggregation Database, the Cancer Genome Atlas, and the Sequence Read Archive. Combes said that the number of genomics-focused datasets in the registry has grown tenfold in the last year.

"I think what we're seeing now is the combination of two other things, the dramatic increase in dataset size … and the need for those to be leveraged in a lot of advanced research programs," Combes said.

He said he expects the future to call for "greater and greater focus on aggregating larger and larger datasets and cross-utilization between research applications."

Combes also mentioned the liquid biopsy cancer test of Grail, a company Amazon has invested in. Grail is expanding detection to about 50 cancers through its Galleri screening test, which is set to launch this year.

Going forward, users in genomics and molecular diagnostics can expect additional enhancements from AWS that are tailored to their work. "I think what you'll begin to see are [software] tools that make AWS easier to use, more accessible in this space for researchers and clinicians alike," Combes said.