This article has been updated to correct the involvement of Ginkgo's different operations in the Google Cloud partnership and other programs.
NEW YORK – Ginkgo Bioworks is betting on artificial intelligence to give its pathogen surveillance business a boost. In late August, the Boston-based firm announced a five-year partnership with Google Cloud worth at least $289 million to develop large language models (LLMs) for genomics, protein function, and cell engineering. The collaboration involves all of Ginkgo's operations, including Ginkgo's Concentric business unit, which provides pathogen surveillance or biosecurity services.
Under the deal, the biotech firm will make Google Cloud its primary cloud service provider and gain access to the internet giant's collection of high-performance tensor processing units. Ginkgo is making minimum commitments of $8.0 million in year one, $28.0 million in year two, $54.0 million in year three, $86.0 million in year four, and $113.0 million in year five, according to a filing with the US Securities and Exchange Commission.
In turn, Google will provide discounted hosting services as well as $56.3 million in cash payments to Ginkgo over the next three years for meeting certain unspecified milestones.
The companies said they will combine Ginkgo's library of biological code from its genetic engineering programs with Google Cloud's Vertex AI platform to create and train AI models for "core biological engineering challenges," including the design of more effective vaccines to control future outbreaks. Ginkgo cited the scalability, data security, and ability of Vertex AI to understand nuance as factors in its selection of the Google platform.
Ginkgo will also seek to improve centralized data repositories such as public and private sequence libraries and functional readouts from experiments, hoping to break down data silos with Google Cloud's BigQuery search technology.
Casandra Philipson, director of bioinformatics for Concentric by Ginkgo, said that this access to Vertex AI will assist in "supercharging" Ginkgo's bioinformatics infrastructure. "We get to actually run things on the infrastructure that nobody else has access to, to do it faster, and at a scale that nobody else is doing," she said.
The partnership will also allow Ginkgo Bioworks to "teach AI to [understand] DNA," according to Matt McKnight, Ginkgo's general manager for biosecurity. "We're essentially securing access at really good rates for us to … be building the AI-speaking DNA infrastructure," he said.
Philipson expects the technology to become faster and more sensitive at detecting genetic variants as the company, with Google Cloud access, builds apps on top of its existing software stack, starting with protein engineering and enzyme design and expanding to pathogen detection.
The Google deal seems to fit well with Ginkgo's continued evolution in pathogen testing and surveillance.
When the COVID-19 pandemic hit in early 2020, Ginkgo created Concentric as an end-to-end SARS-CoV-2 testing service to fill an urgent need at the time. "We're not a vaccine maker or a therapeutics maker or a diagnostic player, but we [thought we] should be able to do something to help, like so many others did," McKnight explained.
He said that the firm always viewed Concentric's development through "the lens of what … will be the modern tools of biotech applied towards the essentially defensive mission [of] biosecurity or biodefense." That is where Concentric by Ginkgo wants to be now, primarily serving governmental agencies around the world.
"Our core focus today is filling the technology hole for … detecting biothreats, detecting pathogens early," McKnight said, adding that the company's early warning system turns "environment into data" through DNA sequencing of wastewater samples.
Ginkgo is building what is internally known as a bioradar system, he said. One of Ginkgo's longstanding biosurveillance programs is a COVID-19 monitoring system through which the firm supports the US Centers for Disease Control and Prevention (CDC) at seven major US airports.
Since the national COVID-19 public health emergency ended in May, Ginkgo has been among the top three pathogen sequencing contributors to the Global Initiative on Sharing All Influenza Data (GISAID) in the US.
McKnight said that the CDC airports program is one of the few such biosecurity early-warning surveillance efforts that have stayed in place since the end of the national emergency.
Ginkgo is helping the CDC collect wastewater samples as well as anonymized nasal swabs from volunteer travelers to gauge outbreaks of COVID-19 and other pathogens based on viral sequences. Since building this system at the outset of the pandemic, Ginkgo has deployed the technology to schools and has expanded to serve national programs in Qatar, Botswana, Rwanda, and Ukraine.
"We're essentially deploying … radar stations at airports around the world. The radars turn the environment into sequence data," McKnight said. Bioinformatics supports the bioradar stations with early-warning detection.
Last month, Philipson and colleagues posted a preprint to bioRxiv detailing how Ginkgo's technology was able to predict when the currently dominant BA.2.86 variant of SARS-CoV-2 first emerged. The manuscript appeared less than a month after the first genome sequence of this strain was published to GISAID.
While Concentric by Ginkgo primarily focuses on infectious diseases, McKnight said that the division has several ideas on how to leverage AI in new ways in the future, such as detecting whether a sample has been genetically engineered. He also said that the company would like to build large language models called "foundation" models from its genomic datasets to underpin generative AI.
"With these models, we'll be able to more accurately and quickly determine when an anomaly is present in a biological sample and help program teams design better RNA therapeutics, enzyme pathways, and other similar applications," he explained.
Philipson said that Concentric by Ginkgo has built both DNA and protein models. The next step is to make the computational system multimodal by integrating omics data with epidemiological knowledge gathered from across the internet, including news about reported outbreaks and publicly available databases.
"A place where these foundational models and apps on top will really thrive is integrating across those types of information," Philipson said.
McKnight noted that the US Department of Defense recently released its first-ever "Biodefense Posture Review," which talked about the utility of sequencing-based technologies for early warning. A few months earlier, the UK issued a policy paper on biological security strategy that mentioned a "biothreats radar" as a top priority.
"You're seeing a very strong change moment in this convergence of public health and national security," McKnight said. "Bioradar is the first step in improving our defenses against threats."
Last month, Ginkgo and Ceres Nanosciences also announced a partnership to bring pathogen monitoring and analysis capabilities to laboratory customers.
The companies will provide on-site training and materials to labs to implement a wastewater testing workflow from Ceres. Lab customers will also gain access to Concentric by Ginkgo data and bioinformatics services, as well as sequencing.
Ceres and Concentric have already set up labs in the Middle East and Africa and plan to make wastewater testing capabilities available in countries where Concentric has programs, including Australia, Botswana, Qatar, Rwanda, Ukraine, and the US.
"SARS-CoV-2 has no seasonality yet that we know of, and it still continues to mutate in ways that are unpredictable and ways that even the best tools don't detect right now," Philipson said. "So we know we need to leverage some of our AI/ML collaborations now to get better at finding variants first."
Ginkgo has also been involved with the US government's Intelligence Advanced Research Projects Activity (IARPA), starting in 2018 as a partner of IARPA contractor Battelle, and later as a contractor in its own right to develop Engineered Nucleotide Detection and Ranking (ENDAR), a computational platform for detecting genetic engineering.
McKnight said that Ginkgo was the only participant in an IARPA challenge to take a computational approach toward bioengineering detection.
Under another IARPA award in July, the firm will deploy what it called a cellular "flight recorder" to develop a biosensor that can continuously record and store microbial sequencing data in chronological order, which Ginkgo software can then analyze in search of clues about pathogenic outbreaks.
According to McKnight, evidence of genetic engineering offers clues to whether there may have been "malicious acts among bad actors."
Philipson said that Ginkgo originally built ENDAR as a classification system to validate the work of the company's cell engineering team. "We're now deploying it externally to be able to identify novel signatures or engineered signatures where we don't expect them to be," she said.