Director, Center for Bioinformatics and Computational Biology
National Institute of General Medical Sciences
With a remit to foster scientific collaboration, and a goal of finding ways for bioinformatics researchers to play a role in the “a-ha” moments of scientific discovery and data analysis, Karin Remington brings energy, as well as experience juggling and visualizing large datasets, to the Center for Bioinformatics and Computational Biology at the National Institute of General Medical Sciences.
As director of the NIGMS CBCB since September 2007, Remington oversees approximately 375 active awards, mainly investigator-initiated R01 projects, along with funding for the National Centers for Systems Biology and the National Centers for Biomedical Computing, interdisciplinary projects such as MIDAS, the Models of Infectious Disease Agent Study, and various training programs. All told, this adds up to an annual budget of between $140 million and $150 million (see below for some funding opportunities available through the NIGMS CBCB).
Remington also serves as chairperson of the Biomedical Information Science and Technology Initiative Consortium, or BISTIC, which comprises representatives from each of the NIH institutes and aims to coordinate bioinformatics and computational biology efforts across all the institutes and centers.
BioInform recently spoke to Remington about the progress of her quest to foster collaboration and create a bioinformatics knowledge network across NIH. An edited version of the interview follows.
What is the focus of BISTIC these days?
Each institute is really its own organization but we also are one. … We want to make sure that the programs [that] each individual center or the Roadmap programs launch take best advantage of each other and are synergistic and not competing.
There is a lot of science that is competitive; our scientists are very competitive by nature. That level of competition is good. But when you are talking about large programs and large investments in these areas you do want to make sure that you are as complementary across the whole organization as you possibly can be.
[BISTIC] is intended to cover the spectrum: hardware and software developments, infrastructure developments that are needed by either our intramural researchers or the extramural community. … So one of my goals for BISTI is to re-spark its original line and engage the whole community in sharing knowledge, being collaborative, and communicating across campus.
It cuts across a whole swath of things: knowing who is developing tools in what area, what is the NIH portfolio in this area, whether it is intramural science or extramural science, pooling those together. … There is a real awakening to the fact that it is important to understand NIH’s portfolio from a high-level perspective, not just from an individual institute’s perspective. We program officers want to understand what are some fundamental gaps, where we need to build up some new programs.
What role do standards and interoperability play?
Standards and interoperability are an easily graspable way to think about it. [For example], the tools that the centers are producing in the National Centers for Biomedical Computing program [should] be readily adoptable by the centers in the Clinical Translational Science Awards program, so that we build active connections there.
You want to rely on things to self-assemble and for researchers to pick up on things that are going on. But there are an awful lot of people who are awfully busy and have so much on their plates they might not even realize another program is going on.
We do bear some responsibility for enabling connections; we can’t enforce that on grantees, but it is something we want to really encourage.
You have mentioned that a new BISTIC web site will be launched in the foreseeable future where scientists will also be able to forage for new funding opportunities. One bugaboo for scientists is finding the funds to maintain databases they have developed and launched. What can you tell them?
It is a common cry and it is one all the people in similar roles across the campus realize is a problem. … There is a difficulty in getting continued support for databases; they don’t sound all that glamorous to a study section looking at proposals.
There are some areas of research where the data needs to be maintained behind a secure wall and that is only for a limited group of investigators. But there is a lot of data that’s collected that could be of more widespread use, [so it needs to be] known to people that it’s available, readily accessible, and that the databases can be maintained over a usable length of time.
If applications like that have to go to a traditional science study section interested in hypothesis-driven science, they look at a database and say, ‘This is nice and useful but yawn in terms of science.’ What [BISTI] does is try to initiate funding mechanisms for things that might fall through the gaps.
[We have come up] with specific funding mechanisms whereby these proposals get reviewed separately from investigator-led, hypothesis-driven research, so they compete on a level playing field amongst each other.
The one specific funding initiative that is particularly applicable to [databases] is the Continued Development and Maintenance of Software program. … This came on before I was here, so I can’t take credit for it except in being a believer in it and continuing it. They were able to get a lot of institute support and dedicated funds for this program, recognizing that these databases are really valuable resources for any number of scientific communities.
Now that sequencing allows labs to generate and play with so much data, how do you plan to support tool development in that area?
[One] program is the Innovations in Biomedical Computational Science and Technology Initiative. That is intended to be sort of a high-risk endeavor, coming up with new tools and ways of looking at data that are out-of-the-box but at the same time promise to help make the data usable for scientists.
To me that really reflects what I see as my mission here at NIGMS and my mission here at NIH in general — to give every scientist the tools to think beyond what they are doing at their bench, whether that means to do laboratory experiments in silico that they can’t somehow accomplish because of technological reasons … [or to be] able to look at whole genome sequence in a way some of the big labs like the Venter Institute or the Broad [Institute] can do pretty routinely.
I just think of how much potential there is in this data not just for the big labs to make exciting aggregate huge-data-scale discoveries, but if every scientist can have access to that same thing and the tools to move around the data. ... That’s a bigger challenge right now, trying to bring that scale of data into everybody’s reach.
A lot of scientists don’t have the infrastructure to do a lot of that really large-scale research on their own without some experts. Or [that] might mean aggregating their data with data from a lot of other labs to see a bigger picture than the one they have locally.
That is how I see my role, here in my center and then broadly across NIH: to enable scientists to get beyond what they can see in their own labs.
Everyone knows about this deluge of data; it is bad or good depending on whether you are a pessimist or an optimist. … We have all this data and we’re going to have only more and more. We need to make sure that the data is usable not just by the big labs, the big computing facilities, and the really high-powered scientists, but by everybody.
You worked on the Celera effort to sequence the human genome and on Craig Venter’s Global Ocean Sampling Expedition. How is that experience shaping your plans at NIH?
That is really how I got interested in this NIH program, working at the Venter Institute, working with large amounts of data. I had a really great team of bioinformaticists and computer scientists creating these really fascinating-to-look-at views of tons and tons of data and then we would try and engage with scientists who might be able to tell us what some of the data meant.
We were trying to come up with science stories we could tell beyond just the fact of, ‘Hey, we collected the data and did some gene discovery on it.’ We wanted to have some real science impact come out of this. … to show the kinds of things that other people could begin to look at with this data. The way we were looking at the data was really just with brute force, hacking away at large datasets without any pre-canned tools to help us out.
One of the things I can do here in the center is invest in training programs for getting people talking together, getting scientists and computer scientists to speak the same language. In a real cultural shift, I would really like all the computer scientists and bioinformaticists in this field to have the same feeling of love for science that I have found over the last decade or so and to be really engaged not only in building the tools and creating new algorithms but in making scientific discoveries.
Part of that is training programs and really encouraging interdisciplinary training from the undergraduate level through graduate school and into careers as academic investigators. [We are] trying to deliberately bring computer scientists and [experimental] scientists together on projects rather than have them try to forge that kind of connection on their own, and [to] really reward people who are going through this thought process together, rather than just [have] a scientist using a pre-developed tool from a bioinformaticist who created it because they know how to write a cool tool.
What does this kind of collaborative effort involve?
It requires a culture change, too, in our academic institutions, and this has been something that has held back a lot of progress in this area. Computer scientists in academia are rewarded for computer science publications, publishing in their journals. If they are being really helpful in a scientific exploration, by providing not a novel, unique algorithm but just a basic grab-the-data-and-here-you-go, a … fundamental thing they can help a scientist do, they wind up with very little academic credit for that.
They can get a really nice feeling of participation in a scientific discovery process, but they aren’t professionally rewarded for that — at least that is how it has been for a long time. At NIH it’s widely recognized [that] it is a struggle to try and change it because it is something that is built into our academic institutions in this country, where the departments are very compartmentalized and stove-piped.
Maybe there are going to have to be a few different sorts of roles for computer scientists. We always need people who are going to be working on new high-speed algorithms for particular well-known computational problems and we have a very talented workforce of computer scientists who can do that.
But what’s really lacking are the people who are perfectly competent in that area but are also really good at communicating with the scientists: understanding what it is they need, translating it back into what the data has to offer, trying to find a way to extract that knowledge from the data, and having a really close collaborative connection. Whether it requires them to be doing true innovative computer science, we need to make it so that doesn’t matter to them; that is not their driving force. What they are driven by is being able to participate in scientific discovery and enable the scientists to do what they need to do.
One of the real concrete things in the last couple of years has been the institution of multiple principal investigator applications. There had been a tradition here to have a single PI and then anybody else would be a co-principal investigator. When you are up for tenure at your home institution and you are co-PI as a computer scientist, despite the fact that you are providing an absolutely vital component of the work, it is not as rewarding to you professionally as being a principal investigator, earning your own way, and having full credit for that participation.
So having the multiple PI line on our applications … means a lot to people who are going back to their home institutions needing to get grant money, coming [into projects] where they are full peers, [and] not just a co-investigator, key administrator, or collaborator. That is one small thing we can do here at NIH to encourage those interactions.
Funding Opportunities Through the NIGMS Center for Bioinformatics and Computational Biology
The Continued Development and Maintenance of Software (R01) program is here.
The Innovations in Biomedical Computational Science and Technology (R01) program is here.
More information about the Models of Infectious Disease Agent Study is here.
New MIDAS funding opportunities are here.
Other funding opportunities in biomedical informatics and computational biology are listed here.