NEW YORK (GenomeWeb News) – Researchers from the University of California at Los Angeles reported online today in PLoS Genetics that they have sequenced the genome of a glioblastoma multiforme brain cancer cell line.
The team sequenced the GBM cell line, called U87MG, to about 30 times coverage. During their subsequent analyses, the team identified numerous SNPs, structural variants, translocations, and small insertions and deletions not found in the human reference genome — many affecting protein-coding sequences.
"This was the most thorough sequencing analysis of an individual cancer cell line that has been performed to date," senior author Stan Nelson, a genetics researcher and director of the UCLA Jonsson Comprehensive Cancer Center's gene expression shared resource, said in a statement.
Several genomic studies have focused on GBM, including sequencing work by members of The Cancer Genome Atlas. But, Nelson and his colleagues explained, the new paper represents the first time a GBM cell line has had its whole genome sequenced.
"Lots of biology is based on cell lines," Nelson told GenomeWeb Daily News, "but we have a rather incomplete view of what these cell lines are." Finding mutations within cell lines should provide researchers with an additional resource for putting findings based on these cell lines in context, he added.
For the current paper, Nelson and his co-workers selected U87MG, a well-studied line that was derived from a grade IV glioma and has appeared in nearly 2,000 research papers. Previous karyotyping and fluorescence in situ hybridization studies suggest U87MG's genome is extremely aberrant with many rearrangements, though a complete picture of sequence and structural mutations in the genome has remained elusive.
The team used the Applied Biosystems SOLiD 3 sequencing system to generate 50-base mate-pair reads from a library with a mean insert size of around 1,400 bases, and aligned these reads with software called BFAST. Using this approach, the researchers obtained more than 30 times coverage of the U87MG genome at an estimated cost of about $35,000.
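For a rough sense of scale, the coverage figure can be turned into a read count with the standard relationship coverage = (number of reads x read length) / genome size. The sketch below is a back-of-the-envelope illustration, not the team's own calculation: the 3.1-gigabase genome size is an assumption, the variable names are hypothetical, and it treats every sequenced base as aligned, which real data never achieves.

```python
# Back-of-the-envelope coverage arithmetic (illustrative only).
# From the article: ~30x coverage, 50-base reads, ~$35,000 total cost.
# ASSUMPTION: a haploid human genome size of about 3.1 billion bases.

GENOME_SIZE = 3.1e9      # bases (assumed, not stated in the article)
COVERAGE = 30            # fold coverage reported
READ_LENGTH = 50         # bases per read
TOTAL_COST = 35_000      # US dollars, estimated in the article


def reads_for_coverage(coverage, genome_size, read_length):
    """Number of aligned reads needed for a given mean coverage."""
    return coverage * genome_size / read_length


reads_needed = reads_for_coverage(COVERAGE, GENOME_SIZE, READ_LENGTH)
mate_pairs_needed = reads_needed / 2            # two reads per mate pair
gigabases = COVERAGE * GENOME_SIZE / 1e9        # total aligned sequence
cost_per_gigabase = TOTAL_COST / gigabases

print(f"aligned reads needed: {reads_needed:,.0f}")
print(f"mate pairs needed:    {mate_pairs_needed:,.0f}")
print(f"cost per gigabase:    ${cost_per_gigabase:,.0f}")
```

Under those assumptions, 30 times coverage corresponds to roughly 1.86 billion aligned 50-base reads, or about 930 million mate pairs, at something like $376 per aligned gigabase.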
In their subsequent analyses, they found nearly 2.38 million SNPs in the U87MG genome that aren't found in the hg18 assembly of the human reference genome, 89.8 percent of which overlapped with SNPs in build 129 of the dbSNP database. The researchers verified their sequence variant findings with exon capture sequencing of 5,253 genes on the Illumina Genome Analyzer II.
By comparing SNPs in the cell line genome with those found in dbSNP and two previously published genomes — the Watson genome and first Asian genome — they found that the prevalence of SNPs in the U87MG genome was comparable to that in normal genomes.
"Most of the variation is still dominated by the inherited polymorphisms," Nelson explained.
Even so, the team detected mutations affecting 512 protein-coding genes, including PTEN, a gene previously implicated in brain cancer. Many of these involve small insertions and deletions.
Overall, the team found 191,743 small insertions and deletions (including 116,964 not found in dbSNP), more than 1,300 structural variants, and 35 inter-chromosomal translocations in the U87MG genome.
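The reported counts imply quite different novelty rates for SNPs and small indels. The short calculation below works those out from the figures quoted in this article; it is only arithmetic on rounded published numbers (the SNP total is taken as 2.38 million even though the article says "nearly"), and the variable names are hypothetical.

```python
# Novelty rates implied by the counts quoted in the article (illustrative).
# ASSUMPTION: "nearly 2.38 million" SNPs is rounded to 2,380,000.

snps_total = 2_380_000
snp_known_fraction = 0.898        # share overlapping dbSNP build 129

indels_total = 191_743
indels_novel = 116_964            # not found in dbSNP

# SNPs absent from dbSNP: the remaining 10.2 percent.
snps_novel = round(snps_total * (1 - snp_known_fraction))

# Fraction of small indels absent from dbSNP.
indel_novel_fraction = indels_novel / indels_total

print(f"SNPs not in dbSNP:   about {snps_novel:,}")
print(f"indels not in dbSNP: {indel_novel_fraction:.1%}")
```

On these numbers, only about one SNP in ten (roughly 243,000) was absent from dbSNP, compared with about 61 percent of the small insertions and deletions.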
Down the road, the researchers plan to sequence additional cancer cell lines as well as patient-derived tumor samples, including multiple samples from the same individual. In the long term, Nelson predicts, such sequencing efforts should provide members of the research community with the opportunity to work with a host of genetically characterized cell lines.
In addition, he and his team noted, the current study highlights the potential for a relatively small lab to do whole-genome sequencing. "These data demonstrate that routine generation of broad cancer genome sequence is possible outside of genome centers," they wrote. "The sequence analysis of U87MG provides an unparalleled level of mutational resolution compared to any cell line to date."
The team plans to make data from the U87MG sequencing project available online through a database that lets researchers query the data and search for variants.