It has been a rough few years in the biobanking world since it was first weighed and found woefully wanting. That wake-up call came in 2005, when the National Cancer Institute's much-anticipated The Cancer Genome Atlas project kicked off, only for its biobanking component to fall flat on its face. The project hinged on the procurement of 500 samples for five major tumor types, but soon after the call went out to biobanks across the US — and then around the world — it became apparent that only a small fraction of the samples being submitted were of suitable quality for research.
"When The Cancer Genome Atlas started, what they thought was going to be easy failed miserably — less than 1 percent were functional for even extraction of nucleic acids," says Allison Hubel, director of the Biopreservation Core Resource at the University of Minnesota. "So that kind of tells you that there is still a lot of work that needs to be done in biospecimen science to ensure high-quality samples that can be used for not only the techniques that we have available now, but the techniques that are in development."
And therein lies the rub — the science of biobanking is more like an art of anticipation. The inherent challenge is ensuring that today's samples will be good enough for tomorrow's analytic techniques.
While fresh-frozen samples are typically regarded as optimal for array CGH analyses, sometimes only tissue preserved using the formalin-fixed, paraffin-embedded method is available. FFPE samples can be stored indefinitely at room temperature, which makes the technique an attractive archival option, particularly since nucleic acids in fresh-frozen samples can degrade, especially during freeze-thaw cycles. And as 'omics techniques and approaches continue to evolve, the differences in yield quality between the two preservation methods can sometimes be mitigated.
In February, a group of researchers from the Netherlands Cancer Institute published a paper in PLoS One comparing FFPE samples to fresh-frozen samples for cancer biomarker discovery, an application that has traditionally required fresh-frozen tissue. The researchers compared data from 20 FFPE tissues and 20 matched fresh-frozen tissues using Illumina's whole-genome cDNA-mediated annealing, selection, extension, and ligation — or DASL — assay, which is typically used only for FFPE tissue profiling. They demonstrated that the FFPE data is comparable to its fresh-frozen counterpart, a result that could allow investigators to use both FFPE and fresh-frozen material in gene expression studies, thereby increasing the size and scope of those studies.
However, as The Cancer Genome Atlas debacle demonstrated, simply freezing samples with the hope of harvesting DNA is not a problem that has been solved. "Other than the dry-state storage, there has been very little understanding of the process of storing DNA and, in fact, we know that freezing of DNA, which is the most common method of storage [for] purified nucleic acid, results in severing of the DNA chain. We make up for it because we do [PCR] amplification," Hubel says. "Tomorrow's techniques may require us to have intact DNA, and could we develop improved methods of stabilizing that would not result in breakage of the strand and would make the sample more amenable to new and emerging sequencing technologies?"
FFPE and RNA
Stephen Hewitt, a clinical investigator at the National Cancer Institute's pathology lab, is heading up efforts to maximize RNA extraction from FFPE samples. There, the primary challenge is keeping up with high-throughput, next-generation sequencing platforms. "There's no doubt that we really have to improve and standardize our specimen handling and storing processes, otherwise you're always having to tweak your analytics protocols and then you're never certain of its ramifications," Hewitt says. "In many ways the reason it's becoming so fit-for-purpose is that the next-generation sequencing technologies are so sensitive that variations show up much more rapidly than many of the other protocols."
To establish the quality of extracted RNA from FFPE samples, Hewitt and his team take a multi-step approach that starts with a NanoDrop spectrophotometer reading, followed by analysis on Agilent's Bioanalyzer RNA chip, where quality is expressed by the RNA integrity number, or RIN. However, Hewitt is quick to point out that RIN is not a valid quality measure for RNA derived from FFPE tissue, because it is based on the 18S and 28S ribosomal RNA peaks, which FFPE samples typically lack. He is currently working to develop a better metric of quality. "The Bioanalyzer provides a quantitative electrophoretic analysis of an RNA sample — basically, how much RNA there is at any time point in the electrophoretic retention by means of microcapillary electrophoresis," he says. "The Bioanalyzer is programmed to automatically calculate the RIN for any RNA sample. However, this number is meaningless for FFPE-derived RNA, where there are no discernible ribosomal RNA peaks in 99 percent of FFPE samples."
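Hewitt's caveat can be illustrated with a short sketch. This is not Agilent's RIN algorithm — it is a hypothetical check, with an assumed 1.8 cutoff, showing why a metric built on the 18S and 28S ribosomal peaks falls apart when those peaks are absent, as in most FFPE-derived RNA:

```python
from typing import Optional

def ribosomal_ratio(area_18s: float, area_28s: float) -> Optional[float]:
    """Return the 28S/18S peak-area ratio, or None if either peak is absent."""
    if area_18s <= 0 or area_28s <= 0:
        return None
    return area_28s / area_18s

def assess_trace(area_18s: float, area_28s: float) -> str:
    """Classify an electropherogram from its ribosomal peak areas (illustrative)."""
    ratio = ribosomal_ratio(area_18s, area_28s)
    if ratio is None:
        # No ribosomal peaks: any RIN reported here would be meaningless.
        return "no ribosomal peaks (FFPE-like); RIN not informative"
    # ~2.0 is the textbook 28S/18S ratio for intact total RNA.
    return "intact" if ratio >= 1.8 else "partially degraded"

print(assess_trace(120.0, 230.0))  # fresh-frozen-like trace -> intact
print(assess_trace(0.0, 0.0))      # FFPE-like trace -> RIN not informative
```

The point of the sketch is the `None` branch: a peak-ratio metric has nothing to compute on a trace with no peaks, which is exactly the situation Hewitt describes for FFPE-derived RNA.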
In February, Hewitt and his team at NCI published a paper in the Journal of Histochemistry & Cytochemistry on the degradation of FFPE samples over time in storage that emphasizes the need for good biorepository practices. FFPE material preserves both proteins and nucleic acids well, but it can still degrade in storage, just as fresh-frozen samples do during freeze-thaw cycles, introducing errors into downstream results. In its report, Hewitt's team observed how humidity can cause antigenicity loss in FFPE samples over time. Endogenous water retention in FFPE tissue sections is hypothesized to be the result of inadequate tissue processing or exposure to high humidity during storage. While the exact parameters for optimal storage of FFPE samples still need to be defined, at this point the only effective way to limit degradation's impact on analysis is to implement good biorepository practices that allow biobank managers to determine the state of each sample and its history.
"Most of it is really at this point people adopting and practicing best practices. The NCI has a series of best practices, the International Society for Biological and Environmental Repositories has a series of best practices, and they're fairly congruent," Hewitt says. "What we anticipate is that those will be adopted and then, over time, they'll be changed, but you'll be able to reference the different practices. And if you're performing an analytical experiment, you'll have some reference of what your storage conditions were and how you should either interpret your results and perform your assay."
What, then, is the best way to modernize biobanking? Some biobankers say that the answer lies in developing a uniform set of best practices while others say that advancing storage technology is the key. "It's kind of the Wild West right now. There are a lot of people who still believe that all you have to do is have a minus 80 freezer and a computer tracking system, and you have a biobank," says Minnesota's Hubel. "But those perceptions are going to be changed with time because there's going to be more and more demand on the quality and the information provided with each biospecimen."
Hubel and her colleagues at the University of Minnesota are trying to sketch a roadmap for the future. In addition to providing services that assist researchers in preserving samples using different modalities, their big initiative over the summer was the Biopreservation Resource Consortium. The consortium is a joint effort between the Biopreservation Core Resource at the University of Minnesota and the State University of New York at Binghamton, and aims to bring together biobanks to address preservation needs with an eye toward the development of standard methods of biobanking or improved biobanking for specific biospecimens.
"We're trying to advance the field and enable or create a platform ... or publicly available knowledge that can help people create biobanks that have higher-quality specimens and preserve biospecimens that have tremendous clinical potential but that are not currently amenable to current preservation techniques," Hubel says. "As the requirements for training and equipment monitoring for quality continue to rise, it will not be a function of whether the biobank is new or old, but whether or not [it is] willing to continue to evolve with evolving best practices, and that's going to be the critical delineation between the biobanks in terms of tiers of prominence and quality."
Wilma Lingle, a researcher who oversees biospecimen collection and storage at the Mayo Clinic in Rochester, Minn., says that the advent of GWAS made her rethink how a biobank should be run. "When people started doing GWAS studies where you have thousands upon thousands of samples, you really needed to make sure that all those are processed as similarly as possible," Lingle says. "So that really means that you want to have one SOP that's in place for when all of those samples are collected. So when people started doing these massive analyses with thousands of patients, rather than 20 or 100 patients, that's when it really started to hit us."
Lingle currently uses automated equipment with standardized chemistry to extract RNA and DNA from FFPE material, and then uses Illumina's DASL platform for analysis, which she says works quite well for RNA from FFPE material. With FFPE, however, the RNA is shorter than it would be from fresh-frozen material, so she is unable to use an Affymetrix platform, for example. "What you can do is quantitative RT-PCR, as long as you design your primers right so that your amplicon is going to be shorter than 100 bases. And then you can use the Illumina DASL assay platform, which works quite well with these shorter lengths of RNA," Lingle says. "We've just gotten through running samples through the DASL platform from 1,500 FFPE blocks of breast cancer that were collected for a clinical trial where we had women enrolled from all across the US — these would be blocks and materials prepared at hundreds of hospitals, with hundreds of variations on tissue processing and formalin fixation — and it turns out to be pretty robust."
Unlike The Cancer Genome Atlas, which could use only about 1 percent of the samples it initially received, Lingle says her group was unable to use only around 10 percent of its samples. And even samples in that unusable 10 percent were still able to provide quality data for other applications.
Lingle and her team are also working on an informatics solution to help biobanks based on NCI's caBIG project, which aimed to provide researchers with IT solutions to share data and programs. "It proved to be too difficult to manage, but what's come out of it is an application called caTissue, and I think that boils down some of the essential elements you might need to capture the tissues when you might be sharing derivatives of the tissues and even results of the tissues across institutions," she says. "So we've tried to design our information management system here to utilize the same data fields or elements, so that we could communicate with other institutions who are also aligned with caTissue, and use it as a common language."
The caTissue Suite is caBIG's bio-repository tool, permitting users to enter and retrieve data concerning the collection, storage, quality assurance, and distribution of biospecimens. It is designed to be sufficiently scalable and configurable for deployment across biospecimen resources of varying size and function, and can manage multiple types of biospecimens, including tissue, biofluids, and nucleic acids.
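A minimal sketch conveys the kinds of elements such a system tracks. This is not caTissue's actual schema — the class and field names below are hypothetical — but it reflects the categories the suite is described as managing: collection, storage, quality assurance, and distribution, across multiple specimen types:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Specimen:
    """Illustrative biospecimen record (not caTissue's real data model)."""
    specimen_id: str
    specimen_type: str           # e.g. "tissue", "biofluid", "nucleic acid"
    collected_on: str            # ISO date of collection
    storage_condition: str       # e.g. "-80C freezer", "FFPE, room temperature"
    qa_events: List[str] = field(default_factory=list)
    distributions: List[str] = field(default_factory=list)

    def record_qa(self, note: str) -> None:
        """Append a quality-assurance observation to the sample's history."""
        self.qa_events.append(note)

    def distribute(self, recipient: str) -> None:
        """Log a shipment of the specimen (or a derivative) to a recipient."""
        self.distributions.append(recipient)

s = Specimen("BC-0001", "tissue", "2011-03-14", "FFPE, room temperature")
s.record_qa("section reviewed; fixation protocol documented")
s.distribute("collaborating-institution-A")
```

Using shared field names like these across institutions is what makes the "common language" Lingle describes possible: two biobanks can exchange records only if they agree on what a specimen record contains.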
CLIA for biobanks?
There is growing momentum among biobanking centers and their staffs to create some type of unity. Unlike clinical laboratories, which have CLIA certification as the gold standard for maintaining operations, biobanks have no such accreditation. The NCI-supported Group Banking Committee, an initiative intended to expand the quality and accessibility of specimen collections, has devised a set of best practices which the Mayo Clinic and nine other cooperative cancer treatment groups in the US are currently implementing.
Maximizing biospecimen samples also includes analyzing the samples alongside de-identified electronic medical records. The Mount Sinai Biobank, one of the largest in the US, takes this approach and has so far acquired DNA and plasma samples from roughly 18,000 patients, with a goal of 100,000 donors by the end of 2011. "We have data going back to 2003 on these patients, so we have longitudinal clinical data, complete clinic data, DNA samples, and genotyping information," says Erwin Böttinger, director of the Charles R. Bronfman Institute for Personalized Medicine at Mount Sinai Medical Center in New York City. "In addition to using this resource for the discovery of new genotype-phenotype associations, and for local validation of genotype-phenotype associations that have been reported elsewhere, we can now move to a clinical care implementation of genomic medicine."
In late August, the National Human Genome Research Institute awarded the Mount Sinai School of Medicine a four-year, $3.4 million grant to help facilitate the creation of a new biobank database and expand Mount Sinai Biobank's infrastructure. The study, called the "Biorepository for Genomics Medicine in Diverse Communities," is part of a consortium of seven leading genomic medicine institutions called Electronic Medical Records and Genomics.
Böttinger says that until best practices have been established and are as widespread as CLIA certification, "mega biobanks" like the Mount Sinai Biobank will still be few and far between. But as more and more mega biobanks become established, all adhering to the same operating procedures, they can be linked together to facilitate massive studies. "There are few examples of 'mega biobanks' — most are at a level of a few thousand samples — and each biobank individually, for many of the studies that we're looking forward to undertaking, does not have enough power to do that," Böttinger adds. "What would enhance the utility of biobanks nationally and globally would be to link them effectively, such that samples can be combined to increase the sample size for a given study. That's a key issue that we'll have to address going forward."
Proof is in the papers
Nowhere is the lack of attention to quality control more evident than in the literature — at least, that's what Daniel Simeon-Dubach, CEO of the nonprofit Foundation Biobank-Suisse, said in a recent Nature paper. Having analyzed 125 papers retrieved using the keywords "biomarker discovery" in a PubMed search of open-access articles published between 2004 and 2009, Simeon-Dubach and his team found that more than half of those papers contained no information on the biospecimens used. The team also noted that four such papers were published in Nature in 2009. The missing specimen data spanned a range of metrics — among them, how many freeze-thaw cycles a frozen sample had undergone, how long a sample was exposed to room temperature during processing, and whether it had been frozen or stored in formalin.
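The kind of audit Simeon-Dubach's team performed amounts to checking each paper's methods against a list of required preanalytical details. The sketch below is hypothetical — the field names are illustrative, not a formal reporting standard — but the fields themselves are the ones named above:

```python
# Preanalytical details the text says were often missing from papers.
REQUIRED_FIELDS = {
    "freeze_thaw_cycles",      # how many freeze-thaw cycles the sample underwent
    "room_temp_exposure_min",  # minutes at room temperature during processing
    "preservation_method",     # frozen vs. formalin-fixed
}

def missing_biospecimen_info(reported: dict) -> set:
    """Return the required preanalytical fields a paper fails to report."""
    return {f for f in REQUIRED_FIELDS if reported.get(f) is None}

papers = [
    {"preservation_method": "frozen", "freeze_thaw_cycles": 2,
     "room_temp_exposure_min": 30},   # fully reported
    {"preservation_method": "FFPE"},  # partially reported
    {},                               # no biospecimen information at all
]
underreported = sum(1 for p in papers if missing_biospecimen_info(p))
print(f"{underreported} of {len(papers)} papers lack biospecimen details")
# prints "2 of 3 papers lack biospecimen details"
```

A checklist like this is, in effect, what Simeon-Dubach argues reviewers, publishers, and health authorities should be running before a biomarker paper is accepted.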
"We did a very small analysis," Simeon-Dubach says. "We analyzed the information available and there is, for us, a clear message here that we need to be much more careful about the information we are publishing about the biospecimens we are using."
While fixing the gaps and inconsistencies in these data has to begin with the biobanks themselves, Simeon-Dubach says it is ultimately going to take the whole research community to correct the problem. "Researchers should ask for this information, the publishers and reviewers should also ask for this information before publishing the paper, and the health authorities — when they register a new biomarker — they should ask for it," he says. "I've talked to the different biomarker companies who say that they do ask for information, but not this very specific preclinical type of data. So the take-home message is that all of the stakeholders involved should make sure that this information is correct."