Cold Spring Harbor Laboratory To Manage Bioinformatics For Pharma SNP Consortium


NEW YORK--The SNP Consortium, a nonprofit entity, was officially formed this month by 10 pharmaceutical companies and the UK's Wellcome Trust to identify 300,000 single-nucleotide polymorphisms (SNPs) in the human genome and make them available in a public database. Four leading genome sequencing centers--the Whitehead Institute, Washington University School of Medicine, Stanford's Human Genome Center, and the Wellcome Trust's Sanger Centre--will generate genome-wide SNP data for the venture. A bioinformatics team led by Lincoln Stein at the Cold Spring Harbor Laboratory in New York will curate the information, hold it in escrow for quarterly releases to a public website, and submit it to the US National Center for Biotechnology Information, which will merge the so-called TSC Database with its own publicly funded SNP database.

Pharmaceutical companies participating in the $45 million initiative are: AstraZeneca, Bayer, Bristol-Myers Squibb, Glaxo Wellcome, Hoechst Marion Roussel, Hoffmann-La Roche, Novartis, Pfizer, Searle, and SmithKline Beecham. Arthur Holden, the consortium's CEO and former CEO of UK diagnostics company Celsis, said the project, which plans to map at least 150,000 of the SNPs, "will help answer questions about genetic factors that contribute to disease susceptibility and response to treatment, and suggest directions for future investigation."

Stein told BioInform that the TSC Database will be available to users in several formats: as an Oracle database, as an ACeDB database, and in flatfile format. "We'll have interactive web pages that are nicely formatted HTML text," Stein added, "but people who want complete dumps of the data can get flatfiles."

The major informatics challenge of the initiative, according to Stein, will be to reconcile contradictions that he said will inevitably turn up in the data. "We may find that the same SNP appears on different chromosomes in different people's maps," he explained. "So we will need to develop some combination of computational tools and probably human elbow grease to resolve those contradictions. Most of them will be small, but some are going to require human attention," Stein added.
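The kind of contradiction Stein describes, one SNP placed on different chromosomes by different submitters, can be flagged automatically before a curator looks at it. The following is only a minimal sketch of that first pass, not the consortium's actual tooling: the function name, the tuple layout, and the sample identifiers are all hypothetical.

```python
from collections import defaultdict


def find_chromosome_conflicts(placements):
    """Return {snp_id: sorted chromosome list} for SNPs placed on more than one chromosome.

    `placements` is an iterable of (snp_id, chromosome, source) tuples.
    This record layout is an assumption made for illustration only.
    """
    chromosomes_by_snp = defaultdict(set)
    for snp_id, chromosome, _source in placements:
        chromosomes_by_snp[snp_id].add(chromosome)
    # Keep only SNPs whose submissions disagree about the chromosome.
    return {
        snp_id: sorted(chromosomes)
        for snp_id, chromosomes in chromosomes_by_snp.items()
        if len(chromosomes) > 1
    }


# Hypothetical submissions from three labs.
placements = [
    ("snp_0001", "7", "lab_A"),
    ("snp_0001", "7", "lab_B"),
    ("snp_0002", "3", "lab_A"),
    ("snp_0002", "12", "lab_C"),
]
conflicts = find_chromosome_conflicts(placements)
print(conflicts)
```

A check like this only surfaces the disagreements; deciding which placement is right is the part Stein expects to need human attention.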

Stein and four others at Cold Spring Harbor will use standard bioinformatics tools such as BLAST to identify overlaps between candidate SNPs and the genomic sequence in GenBank. To map the SNPs, Stein said he will rely on a 30,000-gene map that was published last year through the collaborative efforts of the Sanger Centre, Stanford, Whitehead, Genethon, and Oxford University. "It's a good source because it tells us what nearby genes and expressed sequences are," Stein commented. "If we don't get a hit on that map, we will use secondary sources such as Washington University's BAC physical map. If that fails we'll have to go to GenBank features, which may be cytogenetic location or just a chromosome," he elaborated.
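The fallback order Stein describes (gene map first, then the BAC physical map, then GenBank features) amounts to trying sources in priority order and recording which one produced the hit. A rough sketch under stated assumptions: each source is modeled here as a plain dictionary keyed by a hypothetical SNP identifier, which is a simplification and not how the real maps are stored or queried.

```python
def locate_snp(snp_id, gene_map, bac_map, genbank_features):
    """Return (source_name, location) from the first source that knows the SNP, else None.

    The three lookup tables are hypothetical stand-ins for the gene map,
    the BAC physical map, and GenBank feature annotations.
    """
    sources = (
        ("gene_map", gene_map),                  # preferred: nearby genes and expressed sequences
        ("bac_map", bac_map),                    # secondary: physical map position
        ("genbank_features", genbank_features),  # last resort: cytogenetic band or chromosome only
    )
    for source_name, lookup in sources:
        location = lookup.get(snp_id)
        if location is not None:
            return source_name, location
    return None


# Hypothetical lookup tables.
gene_map = {"snp_0001": "chr7, near gene X"}
bac_map = {"snp_0002": "chr3, BAC contig 41"}
genbank_features = {"snp_0003": "chr12"}

for snp in ("snp_0001", "snp_0002", "snp_0003", "snp_0004"):
    print(snp, locate_snp(snp, gene_map, bac_map, genbank_features))
```

Returning the source name alongside the location keeps track of how precise each placement is, since a GenBank-only hit may be no finer than a whole chromosome.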

Labs will submit data in a common format that Stein developed based on the National Center for Biotechnology Information's draft SNP data-submission standard. "The only thing that makes this project possible is that we're all using a standard submission format and data representation formats," he remarked.

Technological demands at the end of the pipeline are another matter. Pharmaceutical companies that receive the SNP data will face challenges in putting it to use. Stein said, "The major bottleneck for using this information is going to be developing robust and inexpensive assays for polymorphic alleles. That is, we know where the SNPs are, but some technique such as microarrays has to be applied to detecting these in mass scale on actual human samples."

In terms of informatics, the problem for pharmaceutical companies isn't as daunting, Stein contended. "Once SNPs are identified and mapped, they become no different from other genetic markers and can be treated as classic markers, such as RFLPs or phenotypic traits," he said, adding, "They'll just go into the standard linkage and analysis program."

--Adrienne Burke
