NHGRI's Teri Manolio on the Importance of Public Input in Creating a GWAS Data Repository

On Aug. 30, the National Institutes of Health issued a request for information to gather feedback on a proposed repository for data gathered from large-scale genome-wide association studies.
A number of such studies are already underway within NIH, including the GAIN (Genetic Association Information Network) project, which plans to genotype as many as 2,000 patient samples, and the SHARE (SNP Health Association Resource) project, which is analyzing 9,000 samples from the Framingham Heart Study.
NIH has proposed a policy that will require NIH-funded GWAS investigators to “quickly” submit genotypes and phenotypes into a centralized NIH data repository. This data will be “submitted in a form that protects the privacy and confidentiality of research participants,” and will be made freely available to approved researchers, NIH said.
However, NIH noted in the RFI that GWAS data raises a number of issues, including participant privacy concerns, professional recognition for investigators submitting data, intellectual property rights, and technical details related to the repository itself, that should be discussed with the scientific community as well as the general public.
Comments on the RFI can be submitted through Oct. 31.
BioInform spoke this week to Teri Manolio, senior advisor to the director for population genomics at NHGRI, about the issues involved and NIH’s vision for the proposed repository.
What are some of the issues at stake regarding genome-wide association studies that you are seeking feedback on?
The main issues are outlined in the RFI. We basically are proposing a policy for widespread sharing of data, and we want to know from the scientific community how we can make that more useful to them, and from the lay community if they have any concerns about their data being used in this way.
As for the repository, it’s my understanding that NCBI is already developing a database for this kind of information as part of the GAIN project [BioInform 04-28-06].
It is. GAIN, and also what’s been called the SHARE project in Framingham. The [National] Heart, Lung, and Blood Institute has decided to do genome-wide association testing in all of those participants, and the members of the study in the Framingham community have agreed to make those widely available. So both of those will go into the NCBI database as sort of the first installment of what we hope will be many data sets.
So in terms of feedback and information on what would be required in this sort of repository, it sounds like that’s proceeding anyway in parallel to this RFI.
Pretty much, but we can always make it better. One of the things we’d like to know is how it can be more useful, more accessible, more user-friendly. The fact that people haven’t really had access to it yet makes it a little bit hard [to answer] those questions, but if they have ideas up front as to what would make it useful for them, that would be very good to know.
This RFI was released a few weeks ago, so what kind of feedback have you gotten so far?
People tend not to respond when you first put something out. They often wait a little bit — like until the deadline. So I’m not aware of a lot of feedback that we’ve gotten. But it wouldn’t come to me. It would go to the people in the office of the director. 
Ideally, what would you like to put in place for a GWAS repository? What is NIH’s vision?
It’s a little hard to say because we really are trying to shape this for the maximum scientific use. So whatever we can do to make the investment in genome-wide studies — which is going to be large, because they’re expensive studies — whatever we can do to make them maximally scientifically useful so that we can actually identify genes that are related to health and disease, that’s what we want out of it.
How much of a concern do you anticipate privacy will be — both from the scientific community and the general public?
I think it is a concern. Very often we tend to get a little more excited about this than we need to be, mainly because there aren’t large databases out there that can identify somebody through their genome. Now, someday there may be, and when there are then it may be a much larger concern. But at present, it’s probably more of a perceived issue than a real one, and we would be much better off, and I think everyone agrees, if we had genetic antidiscrimination legislation in place now, as we’ve been trying to get at the Genome Institute and NIH for many years.
Would the holdup on that legislation have any impact on putting a GWAS repository in place?
It might. It certainly doesn’t help. I think if we were able to go to the public and say, ‘There are laws that protect your genes or any adverse-risk alleles that might be identified,’ I think that would be somewhat reassuring to people, although a lot of times people are not reassured even by that. But I think we in the scientific community might feel a little more sanguine about doing this kind of work.
On the other hand, we’re really at such an early stage of identifying at-risk alleles that we don’t think there’s going to be anything terribly damaging that is identified in these early studies, but later on there might be.
I suppose that the technology for privacy protection and security is also progressing, even as these issues are being debated.
It is, and we’re not going to put these data up on the web unless we get a strong response from the scientific community and the lay public that that would be OK. What we’re anticipating is providing some access controls to be sure that they’re being used for a legitimate scientific purpose by researchers who agree that they will keep the data confidential and not try to identify anybody and that sort of thing.
How enforceable those agreements are is sort of a matter of question, but we actually have a fairly strong moral suasion that we can use from within NIH, and also just in general terms, and I think if we advise people that these really are research data that were donated by participants just like you and me and we have some responsibilities to them, most people tend to follow that.
Is there anything else that you think is worth noting about what’s envisioned for the GWAS repository?
I think it is important to note that we’re trying to be careful about how we develop these policies and we want to be sure that we get broad public input on them. And we’re going to listen to that input. It’s not just that we’re looking for rubber stamps, so we hope that we’ll get a lot of comments back on this.

