The US National Institutes of Health is looking into the removal of SARS-CoV-2 data from a gene sequence database it oversees, the Wall Street Journal reports.
In June, Fred Hutchinson Cancer Research Center's Jesse Bloom reported in a preprint posted to bioRxiv that he was able to reconstruct from Google Cloud data some early SARS-CoV-2 sequences that had been deposited in the Sequence Read Archive but later removed. Bloom, as Science reported at the time, suggested that the sequences had been removed from the database to "obscure their existence."
The New York Times further reported at the time that the researchers who added the sequences to the database had asked the SRA to remove them because they were being updated and would be deposited in a different database. The Times reported in August that those sequences were now housed in a database maintained by the China National Center for Bioinformation and that the researchers attributed their removal request to a misunderstanding.
As the Journal reports, the episode drew concern from three US lawmakers, who wrote to NIH Director Francis Collins to seek further explanation. The Journal adds that, in response, Collins said a review at NIH was underway to examine "whether appropriate steps were taken to assess this withdrawal request."