Most bioinformaticists consider unrestricted access to genomic data a fundamental right, but heightened security concerns in the US are forcing the scientific community to rethink that assumption. At a workshop hosted by the National Research Council on Oct. 1, a panel of scientists and security experts discussed the implications of open access to genomic data for potential agents of bioterrorism, and considered mechanisms for restricting access to that information in the interests of national security.
“The idea that there are benefits to openness in science is very abstract to the general public, but the threat of bioterrorism is very concrete,” said Tara O’Toole, director of the Center for Civilian Biodefense Strategies at the Johns Hopkins Bloomberg School of Public Health, during the workshop. “There’s an imbalance in terms of those perceptions that the scientific community will have to face very squarely.”
Taking steps to address that imbalance, the NRC formed a committee to recommend guidelines for sharing genomic information in an era of mounting bioterrorism risk. The committee will base its recommendations on the workshop, which gathered more than 40 participants — from academia, federal funding agencies, government research labs, industry, and several national security agencies — at NRC’s Washington, DC, headquarters.
The NRC report should be published by February or March, according to Stan Falkow, professor of Microbiology and Immunology at Stanford University and chairman of the NRC committee on Genomics Databases for Bioterrorism Threat Agents. Other committee members include Corrie Brown of the University of Georgia, David Franz of the Southern Research Institute, Claire Fraser of the Institute for Genomic Research, Paul Keim of Northern Arizona University, Erin O’Shea of the University of California San Francisco, and Terence Taylor of the International Institute for Strategic Studies.
Whatever the committee ultimately decides, its overall recommendations for data release are unlikely to depart drastically from the status quo. Although the day’s discussion touched on a number of contentious topics, there was a general consensus that the benefits of open access to genomic data far outweigh the potential risks, with the exception of a few specific examples. Exactly what those examples are, however, and how the scientific community should address them, is still up for debate.
Shades of Gray
During the course of the workshop, the consensus evolved from near-unanimity on the concept of openness as “an absolute value,” as one participant put it, to general acknowledgement of a “gray zone” of genomic data that could be dangerous in the wrong hands. But, as several workshop participants pointed out, defining that gray zone can be extremely tricky. As an example, Art Friedlander, a senior scientist at the US Army Medical Research Institute of Infectious Diseases, proposed the hypothetical case of a strain of smallpox that is found to be resistant to all known vaccines. Do you release that information into the public domain so that the international scientific community can work on developing a new vaccine, he asked, or do you hide the vulnerability so that terrorists don’t learn about it? Even if the information is only made available to a restricted set of “secure” researchers, “you limit the number of people who can address the problem,” he said.
Friedlander and David Relman, an assistant professor in the division of infectious diseases at Stanford University, each pointed out several examples in the published literature of biological research that could be used for nefarious purposes, ranging from directed molecular evolution of whole genomes to select for specific properties, to an amino acid mutation in the influenza virus to give it higher virulence in humans, to a method for making anthrax resistant to antibiotics. O’Toole said that her group at Johns Hopkins identified “thousands” of such examples in a recent survey of the scientific literature.
Such examples spurred discussion of whether sequences that code for specific traits, such as hypervirulence, antibiotic resistance, vaccine resistance, improved environmental survival, or increased transmissibility, should be restricted. Participants also suggested that sequences of naturally occurring organisms should continue to be deposited in the public domain, while access should be restricted for sequences of engineered microorganisms, or even for strain variants of microorganisms for which a “reference” genome is already available. Relman noted that reference genomes for most of the world’s most dangerous pathogens are already in the public domain, and “can’t be retracted.” However, he added, “There is no argument for releasing every strain variant.”
Paul Keim pointed out that the CIA and FBI already limit access to sequence data that they generate in the course of their work, as in the FBI’s investigation of the 2001 anthrax scare, so “no new processes need to be put in place” to keep some data out of the public domain.
Publish and Perish?
For some scientists at the workshop, the issues under discussion weren’t just abstract policy decisions. For example, Jim LeDuc, an epidemiologist at the Centers for Disease Control and Prevention, said his lab at CDC is preparing the results of a large-scale genomic analysis of several previously unreleased virulent smallpox strains. Although the smallpox genome was originally published in the 1990s, “we now have 20 or so full-length smallpox strains under analysis and ready to be released,” he said. “I don’t want to publish that data and then have the NRC report come out saying that I shouldn’t have,” he added.
“I wouldn’t publish that data,” said James Kvach of the Defense Intelligence Agency. “Do we know whether those strains are more or less virulent [than the publicly available genome]?” he asked. While acknowledging that the genome sequence of an organism isn’t quite the blueprint for crafting a superpathogen from scratch that some fear it might be, “even if we can’t put it together today, someday we might,” he said.
Limitations of Limiting Access
If the NRC committee does recommend limiting or monitoring access to some genomic data, the means of implementing those restrictions remain unclear. David Lipman, director of the National Center for Biotechnology Information, explained that “if you make a system that requires registration, a lot of people won’t want to give you their name,” and usage of the publicly funded resources would drop dramatically. There are also technological concerns, he said. NCBI currently links seamlessly to over 1,200 other resources around the world, and requiring users to register would throw up “gates” that would close off many of those links. In addition, some participants said, regardless of what the US decides, enforcing a more restrictive policy internationally will be very difficult, if not impossible.
Restricting access to the data may be a moot point altogether, some said. As Falkow noted, if terrorist groups “are sophisticated enough to weaponize microorganisms, then they’re probably sophisticated enough to sequence the suckers.”
The NRC committee has its work cut out for it in crafting a report that balances the scientific culture of openness with the bioterrorism concerns of the security community and the general public. TIGR’s Fraser acknowledged that coming up with a set of recommendations would be a challenge, but told BioInform that the committee plans to look at the issue from “as broad a view as possible” in order to avoid appearing like “scientists just talking about themselves, again.”
Indeed, workshop participants noted at several points during the day that no matter what the committee ultimately recommends, it will be crucial to demonstrate that it has fully explored both sides of the issue. “The scientific community must not appear arrogant or cavalier about this,” said Relman. In particular, the boundaries of the “gray zone” of genomic information that is ultimately deemed dangerous will have to be drawn more precisely, O’Toole pointed out. “We don’t know enough yet to set standards and create rules,” she said.
Participants noted it will also be important for the final report to take into account the international scientific community, which may not share the same security concerns as the US but does depend on many of its biological data resources.
Finally, as O’Toole and others pointed out, there should be a “temporal component” to the committee’s recommendations. Noting that genomic technology is increasingly accessible, and that “research that would have won you an award 10 years ago won’t even get you a PhD today,” O’Toole said that the scientific community must make policies today with an eye toward what may be technologically feasible in five or 10 years. Falkow agreed, noting that scientists have good reasons to feel secure that the genomic data currently in the public domain doesn’t place the nation directly in harm’s way, but, he asked, “will we be as comfortable in the future when the technologies change?”