CHICAGO – This month, the Broad Institute, Verily Life Sciences, and Microsoft announced a strategic partnership to extend the Terra bioinformatics analysis platform to the Microsoft Azure cloud.
Terra, a cloud-based bioinformatics analysis and collaboration platform codeveloped by the Broad and Google sibling Verily Life Sciences, had been available only on the Google Cloud Platform since it launched in late 2017. Among other things, Terra serves as the researcher workbench for the National Institutes of Health's All of Us and Accelerating Medicines Partnership for Parkinson's Disease programs.
About 168,000 data scientists, biomedical researchers, and clinicians now use the platform, the partners said.
By bringing Microsoft on board, Verily and the Broad hope to address some of the issues that users have raised over the last three years.
Clare Bernard, senior director of the Broad Institute Data Sciences Platform and product manager of the Terra platform, said that challenges include the explosive growth in the volume and diversity of genomic data.
"It's becoming more and more important to do integrative analysis where you're looking not just at exome and genome sequencing data, but you're also combining that with epigenomic data, imaging data, and proteomics data," Bernard said. That, she said, calls for collaborative tools that work across laboratories and across institutions.
For these reasons, Terra has been built to be open source and standards based, drawing on standards including those set by the Global Alliance for Genomics and Health (GA4GH), of which the Broad is one of three host institutions.
In addition to the cloud platform itself, Microsoft is contributing services that come with the platform, including Azure Synapse Analytics, Azure Machine Learning, and Azure Cognitive Services. Microsoft Healthcare Managing Director Desney Tan said that the tech giant also will offer expertise in life sciences-focused technology research and innovation.
Microsoft will draw on such resources from its Health Next initiative — created in 2017 to advance the use of artificial intelligence and cloud computing in healthcare — and within the Microsoft Research division, Tan said.
While Microsoft and Google are fierce competitors in the broader technology space, they see each other as partners in the context of Terra. "This is going to be very complementary," the Broad's Bernard said.
Tan called Verily and the Broad's previous work on Terra "visionary" and "prescient" in how they have engaged researchers.
"We were seeing exactly the same things with our life sciences customers," Tan said of Microsoft. "So when we got together and told our respective stories, it just became extremely obvious that we should put our hats in the ring together."
"If you look at the three parties, you'll immediately see the complementary nature and the overlaps that exist that allow us to conceive of something quite as audacious and ambitious as this project," Tan said.
"The notion of data-centric, compute-driven R&D innovation in the life sciences has always been an assumption of ours," he added. "It was never a matter of if that would happen. It was a matter of when and how we could help catalyze and accelerate."
David Glazer, CTO for Terra at Verily — which, like Google, is under the Alphabet corporate umbrella — called Terra "a necessary piece of advancing Verily's mission, which is putting biomedical data to work to help people live longer, healthier lives." That requires an open, extensible, and scalable platform.
"You need a platform that can connect researchers to the tools and the data they need, whatever those tools are, built by whomever, whatever that data is and wherever it lives, and in a collaborative environment that speeds the time to insights," Glazer said.
Glazer noted that the bioinformatics world is relatively small. "When we share a space we work in and we share a vision and we share an intent, [we saw that] there were multiple touch points between our organizations, and the conversation gradually built to the point where, yeah, we should do something together," he said.
Microsoft has a research facility basically across the street from the Broad in Cambridge, Massachusetts, and the two entities have collaborated on and off in the past.
According to Glazer, the addition of Microsoft to the Terra project will allow developers to add technical resources and know-how to the platform, and thus reach more researchers in varied environments.
"What we're excited about now is working with Microsoft to double down on all of those attributes," Glazer said. "We think that the partnership with Microsoft will accelerate the rate at which we can bring new features to our users."
With Terra now running on two major clouds, Glazer said that the informatics platform will be seamless from the perspective of researchers. All of the Azure tools will be available to Terra users no matter how they access the platform.
From the beginning, the underlying message to researchers from Terra has been to "focus on your science," he said.
"If you want to ask or address the scientific question, you should not be thinking about infrastructure," Glazer explained. "The job of infrastructure is to be invisible, and that same principle applies to as we extend the infrastructure and capabilities to give researchers more power without more seams or more speed bumps."
Glazer said that while the notion of collaboration has been part of Terra from the start, the COVID-19 pandemic has accelerated it by forcing people into remote working environments.
As an example of how Terra facilitates collaboration, he cited a Broad-led paper published in Science last month that analyzed SARS-CoV-2 genomes to trace the spread of COVID-19.
"When I picture collaboration, an example of what I think is great is when researchers do work using Terra, publish a paper based on that work, and in the methods section of the paper link back to the Terra workspace, which allows other researchers to build on their methods, build on their data, build on their work, and reuse, reproduce, and extend," Glazer said.
For that study, the Terra team worked closely with the Broad researchers, who collected and sequenced viral samples in the Boston area, then processed the data in a Terra workspace. All of the data is publicly available on the Terra website. "The workspace can be cloned so that other researchers can do very similar analysis," Bernard said.
Tan said there are plenty of similar examples among Microsoft's life sciences customers, who want to share their data and work with information generated by others but lack a platform like Terra.
"We see this happening over and over again without the open standards-based collaborative piece of it, which necessarily constrains the exercise," Tan said. "So folks are building and rebuilding infrastructures over and over again in different places and different ways and in silos."
Terra has not yet released a timeline for introducing new features and functionality, but Bernard said that the partners will provide more details later in 2021.
Bernard said that some groups have asked about adding their own analytics tools to the Terra ecosystem. Notably, Johns Hopkins University contributed the Galaxy platform.
"We're also seeing an increase in tools used for more machine learning applications," she said.
The current trend in bioinformatics is toward performing computation in the cloud where data resides so users do not have to download massive datasets onto their own infrastructure.
Tan said that Terra is "intrinsically a cloud-based workload," which makes it flexible and scalable.
"I think you'll continue to see scientific results come out from the use of the platform," Tan said. "I think you'll see new growth in the platform itself. And then a third bucket will likely be the users or the enterprises and the scientific users that we expect will continue to grow on the platform as well."