
NIH's $15M to PharmGKB Will Improve Curation Capabilities, Facilitate Annotation of Whole Genomes


Originally published Sept. 7.

By Turna Ray

With an infusion of funding from the National Institutes of Health, researchers involved in developing the Pharmacogenomics Knowledge Base are planning to develop tools to speed up curation of PGx data in published literature and annotate drug response data from the genomic sequence of an entire family.

The NIH announced this week that it will invest $15 million over the next five years to expand PharmGKB, an online hub for PGx data that is managed by researchers from Stanford University and curated from research consortia in the US and abroad. The database currently includes information on around 1,500 genes, 400 drugs, and 300 diseases.

With the additional funding, "we are going to develop literature-mining tools to extract information automatically to speed up our curation," Russ Altman, chair of Stanford University School of Medicine's bioengineering department and principal investigator of PharmGKB, told PGx Reporter this week. Altman added that even though PharmGKB will be investing the NIH grant money in tools to speed up the mining of data from published literature, "most information will still be examined and vetted by human curators."

PharmGKB, launched by Stanford in 2000, is freely available to the scientific community. The hub contains data about biochemical pathways influenced by specific drugs and provides summaries of genes that influence individuals' response to various treatments. PharmGKB is run by six staff scientists and six software engineers, who conduct research, collaborate with outside investigators, and build the software infrastructure supporting the database. PGx data in PharmGKB is annotated and cross-referenced with related research data.

In addition to improving PharmGKB's data-curation capabilities, the NIH funds will go toward annotating sequence data from the genomes of individuals and at least one family. Earlier this year, a team of Stanford researchers published the annotated genome of Stephen Quake, co-chair of the bioengineering department at the university. PharmGKB was one of the main resources the researchers used in that work to gather data on validated SNPs linked to drug response (PGx Reporter 05/12/10).

Although Altman could not yet identify the family whose genomes PharmGKB researchers will analyze, he noted that the team is working on annotating the genomes of these individuals in the same way it worked with Quake's genome.

"You can expect similar, but more automated annotations of genomes in the future," he said. Although the sequencing will be performed by "many different technologies," PharmGKB researchers are focusing their attention on "interpreting the genome for what can be learned/predicted (and with what level of confidence) from the sequenced genome," Altman said via e-mail.

Currently, researchers have the most confidence in SNPs and "non-conservative mutations" in the PharmGKB database. PharmGKB has developed a list of "Very Important Pharmacogenes," or VIP Genes, for which researchers are most certain of the impact on drug response.

After five additional years of NIH funding, "we should show a marked improvement in that capability," Altman said. "We want to make the capability available for research purposes to other researchers." However, he added that when PharmGKB releases its data on the annotated genomes of this family, it will make sure to emphasize that "more work needs to be done before [the data] can be used for clinical purposes."

The "additional work" required will involve validation of the observed gene-drug response and physician education to ensure that doctors can accurately interpret this data. PharmGKB is part of the NIH's Pharmacogenetics Research Network, which has formed the Clinical Pharmacogenetics Implementation Consortium — an effort that will convene groups of physicians with expertise in certain areas to write PGx guidelines for doctors. These guidelines will be published in peer-reviewed journals, as well as placed on the PharmGKB website.


These guidelines will differ from the bulk of annotations in the database "because they are a small subset where the data is sufficient already to make clinical recommendations," Altman explained.

The effort to develop and publish guidelines through CPIC is being led by Mary Relling of St. Jude's Medical Center and PharmGKB. Guidelines will first be developed for the "greatest hits" of pharmacogenetics, Altman said, including warfarin and the anti-platelet drug Plavix. "As CPIC gets experience writing these guidelines, we will expand to the next list of variations that deserve clinical attention," he said.

PharmGKB is also part of the Genomic Medicine Academy being developed by Eric Topol, director of the Scripps Translational Science Institute. The Genomic Medicine Academy is working on an online genomics education curriculum for doctors that is slated for launch by year end (PGx Reporter 05/12/10).

Over the next five years, PharmGKB researchers expect to gather gene response information for drugs such as warfarin, tamoxifen, and certain selective serotonin reuptake inhibitors and other depression drugs.

Previously, the PharmGKB team used clinical and genetic information in the database to develop an algorithm for dosing warfarin (PGx Reporter 02/19/09). The algorithm is being tested in a large-scale clinical trial sponsored by the NIH's National Heart, Lung, and Blood Institute. The study is slated for completion in September 2011, and is currently recruiting participants.

In general, PharmGKB researchers are casting a wide net for their research efforts. "We still are looking at all published data, no matter what the area," Altman said. "We had our first success with these data-sharing consortia with the warfarin paper [published in February 2009 in the New England Journal of Medicine] and so we are now trying to expand these consortial activities to other areas that need data sharing."

He added that the team is getting "pretty good at it, because we have resources to apply to the task of bringing data sets together and harmonizing them," including tools for curating, aggregating, and integrating the data.

PharmGKB is part of a broader NIH pharmacogenomics initiative that includes various PGx research projects and a nationwide research consortium, the NIH Pharmacogenomics Research Network, or PGRN. This week the NIH also announced it is providing more than $160 million to the PGRN to advance new PGx research projects (see related story, this issue).

"PharmGKB has become a powerful resource not only by providing high-quality information, but also by bringing researchers together to share ideas and collaborate," Rochelle Long, head of pharmacogenomics research programs at NIH's National Institute of General Medical Sciences, said in a statement. "We anticipate that such collaborations will continue to grow for the next five years."

In addition to the NIH grant, another source of funding for PharmGKB may come from licensing contracts. According to Altman, Stanford is making PharmGKB available for licensing to companies on a non-exclusive basis. "The terms of licensing are being drawn up now, but we are seeing lots of interest from the private sector as they see PharmGKB as a good head start for getting all the known human variations relevant to drug response," Altman said.
