Oak Ridge, Georgetown Join Forces To Bolster Biomedical Data Analysis


Against the backdrop of the expanding computational needs of biomedical research, Georgetown University Medical Center has signed a Cooperative Research and Development Agreement (CRADA) with Oak Ridge National Laboratory that gives its researchers access to ORNL’s considerable computational resources to support systems biology research.
“At ORNL, we have been building problem-solving environments that integrate experiment, theory, and simulation for a variety of disciplines, and our hope is to do something similar for clinical and translational science,” Jeff Nichols, director of the Computer Science and Mathematics Division at ORNL, told BioInform in an e-mail this week.
“ORNL brings to the table supercomputing capabilities that will allow us to analyze, manage, and visualize complex molecular data that is collected at Georgetown,” said Howard Federoff, executive vice president for health sciences and executive dean of the school of medicine at Georgetown University Medical Center, in a statement.
Three projects are already underway and more are set to follow in this collaboration. One project, between oncologist Stephen Byers at GUMC and ORNL’s analytical chemist Gary Van Berkel, is using atmospheric pressure surface sampling and ionization methods in conjunction with mass spectrometry to map biomarker metabolites in paraffin-embedded tumor tissue samples.
Another undertaking between ORNL’s Ram Datar, a cancer molecular pathologist, and Minetta Liu, an oncologist at GUMC, is geared toward using an engineered membrane, a microfabricated device developed at ORNL, to detect the earliest signs of metastasis.
A third project, between GUMC’s Zofia Zukowska, who chairs the department of physiology and biophysics, and ORNL systems geneticists Brynn Voy and Elissa Chesler, is integrating large amounts of biochemical, physiological, and molecular data to study stress and obesity.
This is ORNL’s first CRADA with an academic institution, Kristina Thiagarajan, NIH program manager at ORNL, told BioInform sister publication Biotech Transfer Week. The CRADA supports the project for five years.
Park the Data
As Andrew Deubler, vice president for enterprise development at Georgetown University Medical Center, told Biotech Transfer Week, Georgetown is seeking to leverage its clinical capacity for large-scale biomedical science. “In order to be able to capture the load of data necessary, and reduce and visualize it, the decision we made at Georgetown — and I think a lot of universities are going to have to make this decision — [was,] ‘Do we want to invest the cost it would take to invest in that kind of infrastructure, or are there partners we can pull into the mix?’”
Georgetown lacks the computational facilities that are housed at ORNL, he said (see below for details of ORNL’s IT infrastructure).
Deubler said the partners believe that “we have only begun to scratch the surface of using supercomputing to actually employ systems biology.”
As ORNL’s Thiagarajan explained, the projects are being funded with internal money from the individual institutions. The first three projects needed essentially seed funding, so this project provides them “with a stepping stone to the next stage, which would be to go after competitive funding from the NIH, in particular,” she said. “It is very hard to get funded at NIH without preliminary data or proof of principle. So in these three projects we’ve been able to do that.”
This partnership is part of a larger context: GUMC and ORNL, along with Howard University, the Baltimore and Washington, DC-based healthcare organization Medstar, and the Veteran’s Administration in Washington, DC, have also just submitted a proposal for a clinical translational science award, she said.
It’s a Cross
GUMC’s Zukowska told BioInform in an e-mail that the collaboration “will capitalize on [the] computational [and] bioinformatics expertise of ORNL,” as well as ORNL’s “Collaborative Cross” collection of mouse strains that represent the genetic diversity of the world population.
The plan for this project is to use novel models of stress and obesity. Those conditions will be “studied at multiple levels, from in vitro to in vivo, combined with our knowledge of novel mechanisms that may mediate them, and innovative ways to test potential new therapeutic avenues,” she said.
There is demand for new therapeutics in this area. Earlier this month, Merck abandoned an obesity drug in late-stage clinical trials, and sales of Sanofi-Aventis’ Acomplia, an obesity drug rejected by the US Food and Drug Administration last year, were suspended in the European Union last week.

Systems genetics is building on six decades of mouse genetics at ORNL, Voy said, noting that systems biology has gained importance due to the Collaborative Cross, a large-scale effort supported by many research institutions including the Jackson Laboratory, Pennsylvania State University, the University of Tennessee, and the University of North Carolina. It includes around 1,000 recombinant inbred lines of mice that required 25 generations of mouse breeding. 
As the scientists outlined in a Nature Genetics paper in 2004, “a set of 1,000 RI strains can generate as many as one million distinct but genetically well-defined and reproducible mice that will represent a vast resource for the discovery of new animal models of human diseases.”  
The mice are housed at ORNL, where all the maintenance and breeding is taking place, Voy said. The Cross will eventually deliver a large common set of genetically defined mice to help study mammalian biology at a systems level and to help scientists obtain integrated data on a large population.  
“What we wanted to do is be able to take something that is a semblance of the type of genetic variation you get in a human population, where some people are susceptible, some people are resistant, and everything in-between,” she said.
“With that genetic backdrop [the question is], ‘Can we start to identify individuals, in this concept, strains of mice, who are susceptible to stress or susceptible to stress-induced obesity because of genetic recombinations that they carry?’” she said.
The mouse genetic reference population gives researchers a large and genetically diverse reference population so they can examine combinations of genes responsible for diseases.
The discerning factor to start will be phenotype. “We’ll have genotype [data] for 13,000 SNPs and ideally we will have molecular phenotypes like microarray-based gene expression, and then intermediary phenotypes linked to obesity and stress, such as [from] body fat and insulin levels,” Voy said.
Those results mean layers of data, which is “where the data explosion really comes in,” she said. The researchers will be using the Illumina array platform for SNP detection and have designed a custom array to interrogate the population, which was based on the sequence data of the eight parental [mouse] strains that are being used to create the Collaborative Cross, she said.
The first milestone in the partnership, Voy said, will be to perform quantitative trait locus mapping to look at the genetic architecture in response to stress. It will be “the QTL mapping for what regions of the genome seem to play a role in stress-induced obesity,” she said. 
ORNL brings large-scale biological and computational tools to the partnership. It’s about “having the mouse population, the genotype information from those mice, and then the analytical infrastructure to make sense of that,” said Voy, who is a molecular physiologist by training but has acquired bioinformatics skills for her research.
As this study progresses, computational tools at ORNL are going to help find patterns in a sea of data. For example, a secure database with a web-based sharing infrastructure has been set up so Zukowska can access data from Georgetown.
Nichols said ORNL wants to bring its extensive hardware, storage, and networking resources to bear on projects such as this attempt to identify the molecular interactions and genetic risk factors underlying adverse responses to stress and risk for obesity.
“We are proposing to use a broad range of computational science software — some exist today and others [will be] developed specifically for new projects of mutual interest,” said Nichols. “The multidisciplinary nature of these problems and the wide variety of available user facilities is what makes this partnership exciting.”
“Computational science has supported experimental efforts in neutron, nano, and bio sciences for several years. Now the theories and simulation capabilities have matured such that computational efforts can guide, lead, or in some cases replace experiment, thus reducing the time to solution in a very significant fashion,” said Nichols.

ORNL Computational Tools for the GUMC Partnership
ORNL has a broad range of hardware, storage, and networking resources that can be leveraged to solve problems of particular interest to the scientific community. The following is a list of some resources that will be available to Georgetown researchers under the CRADA, as provided by ORNL’s Jeff Nichols.
  • Jaguar, a 263-teraflop/s Cray XT4 computer resource for open science research. The Department of Energy’s National Center for Computational Sciences is currently upgrading Jaguar to over a petaflop/s with more than 250 terabytes of memory.
  • Smoky, an 80-node Linux cluster dedicated to application development, specifically application scaling development for petascale applications.
  • Kraken, a 166-teraflop/s Cray XT4 computer resource for open science research. Kraken is currently being upgraded to 615 teraflop/s. In 2009, the University of Tennessee’s National Institute for Computational Sciences will upgrade Kraken to 2.4 GHz 6-core processors, powering up to 962 teraflop/s and 100,224 cores.
  • Eugene, ORNL’s 27-teraflop/s IBM Blue Gene/P computer. Access is limited to ORNL staff and university partner members.
  • ORNL also supports the Oak Ridge Institutional Cluster, a collection of SGI Xeon clusters that provides almost 20 teraflop/s of peak performance.
Storage Systems:
  • NCCS provides 10 petabytes of a Lustre-based parallel file system called Spider, a center-wide shared file system that saves all files to one location at a rate of 250 gigabytes/s.
  • Hierarchical tape storage is provided to both NCCS and NICS. Data from simulations are first written to disks by high-speed data movers and then migrated to tape drives. Storage capacity is added incrementally as needed. Current capacity is 20 petabytes with growth expected to be 40 petabytes in 2009.
  • Wide Area Network: The ORNL campus is connected to every major research network at rates of 10 gigabits/s or greater. Connections into ORNL include TeraGrid, Internet2, ESnet, and Cheetah at 10 gigabits/s as well as UltraScience Net and National Lambda Rail at 20 gigabits/s. ORNL operates the Cheetah research network for the National Science Foundation and the UltraScience Net research network for the Department of Energy.
  • Local Area Network: ORNL deploys a center-wide InfiniBand fabric, so users can move large data sets from the simulation platforms to other platforms such as the Lustre file system, data storage, and analysis and visualization with an aggregate performance of 100 gigabytes/s.
