Google, Stanford to Rework Research Algorithms for Planned Clinical Genomics Service


NEW YORK (GenomeWeb) – In partnership with Google's genomics arm, Stanford University School of Medicine will be evaluating and testing a number of established algorithms and solutions as part of efforts to put together the requisite informatics infrastructure for its planned clinical genomics service.

Stanford Medicine expects to launch the service, which will provide genome sequencing services for patients at Stanford Healthcare and Stanford Children's Health, in the spring of next year. The institution is collaborating with Google to develop cloud-based applications for exploring genomic and other healthcare datasets from patients with rare diseases and inherited disorders who are recommended for testing.

Specifically, the partners will draw on the technologies and methods that power resources like Google Search and Google Maps to build cloud-based applications for securely storing, processing, exploring, and sharing the genomic data generated by the service, along with other healthcare datasets.

As part of these efforts, the partners will explore various established research software solutions and rework them for clinical use, Euan Ashley, a Stanford associate professor of medicine and genetics, told GenomeWeb. They will evaluate algorithms for calling small insertions and deletions and structural variations, among other tasks, and ensure that the results these methods generate are "non-inferior" to current clinical standards.

"Many of the tools that have been developed to date that have been very successful have been developed for research utilization," he explained. These tools are designed to identify meaningful patterns in data from hundreds or thousands of samples, but the questions that clinicians have to answer are quite different.

"You have a patient in front of you, [and] the question is 'can you find an answer to the problem they have and is that actionable?'" he said. As such, algorithms have to be far more sensitive and specific than would be required for research use cases. If, for example, "your indel detection algorithm isn't good enough … that's a major problem because the difference between discovering and not discovering the cause of that individual's syndrome has much higher consequences than simply failing to discover another variant that could be relevant."

The decision to work with the cloud provider was in part due to "the scope and ambition of the service that we are putting together," Ashley told GenomeWeb. The demonstrated utility of whole-genome sequencing in the research domain has "really compell[ed] us to think about doing this for patients," he explained. So while the initial focus for the service is on rare diseases, it is designed to be part of the broader Stanford healthcare system, and "we want to plan for that," he said.

The decision to use the cloud was also influenced by the quantity and size of genomic datasets, Ashley said. Over time, Stanford expects to generate thousands of genome sequences through its service. Furthermore, Google has implemented safety measures that offer far more security for patient data than a single university like Stanford could provide on its own.

For example, "Google has a team of people whose job is to constantly hack and attack the Google servers," he noted. "The fact is that no university … has that sort of resource." Moreover, Stanford uses Google infrastructure to support other institutional activities, and so the current arrangement was a natural extension of that pre-existing partnership, Ashley added.

Furthermore, unlike some competing cloud vendors, Google has a dedicated genomics arm that has been involved in several genomics-based initiatives, Ashley noted. For example, last year Google Genomics partnered with the Broad Institute to offer cloud-based access to the Genome Analysis Toolkit and to jointly develop other data analysis services on the Google cloud. Also, the Institute for Systems Biology selected the Google cloud to build its proposed system for the National Cancer Institute's Cancer Genomics Cloud initiative.

Google also collaborated with Autism Speaks and BioTeam on the MSSNG project, an open repository of whole-genome sequence, phenotype, and clinical information from 10,000 individuals and families with autism. "They have shown more interest than other cloud providers in genomics as a source of big data," Ashley said.

In addition to building up the appropriate informatics infrastructure for the service, the Stanford team will also assess the value of different sequencing technologies for clinical use, Ashley noted. "We have to be open to a variety of approaches to deriving that data [including] both short and long read, because there are many clinical situations where short-read approaches are just not able to do what we need," he said. One example is a disease characterized by a large number of repeats.

Initially, Stanford's service is focused on rare and inherited disorders, but eventually it will be expanded to include cancer and other areas, Ashley said. The service will also be combined with other genomics initiatives at the institution, including a pharmacogenomics clinic led by Russ Altman, a Stanford professor of bioengineering, genetics, and medicine. Patients who undergo testing will also have an opportunity to enroll in and contribute their data to research projects at Stanford. "Our conception is that genomics will come over the next decade to touch every patient, and so we are building for that," Ashley said.