The SARS-CoV-2 genome sequences that disappeared from a database about a year ago are back online at a different database, according to the New York Times.
In June, Jesse Bloom from the Fred Hutchinson Cancer Research Center reported in a preprint posted to bioRxiv that, though he was unable to find a number of viral genomic sequences that were supposed to be at the Sequence Read Archive (SRA), he was able to reconstruct 13 missing sequences by recovering files from Google Cloud. According to Bloom, the reconstructed data suggested that early pandemic viral samples had characteristics similar to bat coronaviruses and that the data may have been removed from the SRA to "obscure their existence." Others, though, were skeptical of a cover-up, as Science reported then.
The Times now reports that the viral genome sequences were uploaded in early July to a China National Center for Bioinformation database. It adds that the sequences' disappearance seems to stem from an editorial error: the journal Small, which published the initial viral sequencing work, accidentally deleted a data availability statement, leading the researchers to think the data did not have to be stored at the SRA. The journal tells the Times that it is issuing a correction and a link to where the data is now kept.