BALTIMORE – Having raised €5 million in seed funding last November, French DNA data storage startup Biomemory says it is poised to scale up its “biocompatible and biosecure DNA synthesis technology” to store data at large scale.
To achieve this, the Paris-based company said it is developing a petroleum-free technology to produce DNA nucleotides based on synthetic biology. Additionally, Biomemory said it has found a way to synthesize DNA enzymatically without using terminal deoxynucleotidyl transferase (TdT), a type of enzyme that has been essential for most other enzymatic DNA synthesis technologies.
“We don't use petroleum-based chemistry [to produce nucleotides]; we are only using synthetic biology,” said Biomemory CEO Erfane Arwani. “Today, we're the only company and laboratory to do that in the world.”
Established in 2021, Biomemory was founded by Arwani, who has a computer science background, and two cofounders, Stéphane Lemaire and Pierre Crozet, both researchers affiliated with Sorbonne University.
The company’s core technology involves two patents from Centre National de la Recherche Scientifique (CNRS) and Sorbonne University, which were co-invented by Lemaire and Crozet along with their collaborators. According to Arwani, Biomemory has secured an exclusive worldwide license, with an option to buy, to both patents.
Compared with traditional digital data storage, DNA data storage comes with many potential advantages, Arwani said.
For one, DNA can be an extremely compact medium for data storage. “On a hard drive, if you want to store one bit of information, you use 1 million atoms,” he explained. “With DNA, you use only 50 atoms.”
Additionally, Arwani said DNA data storage can potentially be energy-efficient and environmentally friendly. “With DNA, you don't need any energy to preserve the data; you just need to synthesize DNA, then you don't need any more energy,” he said. “Whereas today with electronic devices, you need energy at least to maintain the temperature in the correct range.”
Furthermore, since DNA molecules can be “very sturdy,” the information stored in the molecules can potentially be preserved for a long time, Arwani added.
Because of such promise, the field of DNA data storage has been booming for the last several years. In 2020, for example, Twist Bioscience, Illumina, Western Digital, and Microsoft teamed up to create the DNA Data Storage Alliance, taking the lead to establish interoperability and an industry roadmap for the nascent DNA data storage field. The alliance has since grown to include dozens of companies, including DNA Script, Molecular Assemblies, and Biomemory, which became a member in 2021.
Despite the field’s gain in popularity, Arwani said one major bottleneck to achieving large-scale DNA data storage is cost. “It is very expensive, because [companies] are relying on fossil fuels to [produce] nucleotides that you need to assemble DNA,” he noted.
To circumvent this problem, Biomemory said it is designing “bio-sourced, biocompatible, and bio-secure DNA fragments” using nucleotides that don't rely on petroleum materials. “What we're doing at Biomemory is tackling this price,” Arwani said. “To do that, we removed everything related to fossil fuels.”
Arwani declined to disclose the exact mechanism of Biomemory’s nucleotide synthesis process, citing the company’s ongoing patent application efforts. “What I can tell you is, to design those blocks, we are using sugar and bacteria only,” he said.
After the nucleotides are synthesized, Arwani said, the company deploys an enzymatic method to piece the DNA building blocks together to form double-stranded DNA with desired sequences. However, unlike other enzymatic DNA synthesis approaches, such as the ones used by Twist Bioscience, DNA Script, Ansa Biotechnologies, Molecular Assemblies, and others, Biomemory’s method does not require TdT, a vertebrate polymerase that can add single nucleotides to the 3'-end of single-stranded DNA without a template. Instead, Arwani said, the company only uses ligases and restriction enzymes.
In a preprint published in August 2022, Biomemory cofounders Crozet and Lemaire offered some insights into the company’s DNA data storage strategy.
According to the article, information stored in a binary file is converted to a biosafe and biocompatible DNA sequence based on the Church-Gao-Kosuri (CGK) encoding scheme, a strategy to encode digital information in DNA that was originally devised by Harvard University biologist George Church and his team.
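As a rough illustration of the one-bit-per-base idea behind the CGK scheme, the Python sketch below maps each 0 to A or C and each 1 to G or T, then reverses the mapping. The function names, the seeded random choice between the two candidate bases, and the byte packing are assumptions made for illustration. This is not Biomemory's encoder, and it does not model the biosafety or biocompatibility properties described in the preprint.

```python
import random

# One bit per base: 0 -> A or C, 1 -> G or T.
BASES_FOR_BIT = {"0": "AC", "1": "GT"}
BIT_FOR_BASE = {"A": "0", "C": "0", "G": "1", "T": "1"}


def bytes_to_dna(data: bytes, seed: int = 0) -> str:
    """Encode bytes as a DNA string, one base per bit (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    bits = "".join(f"{byte:08b}" for byte in data)
    # Either candidate base carries the same bit; pick one at random.
    return "".join(rng.choice(BASES_FOR_BIT[bit]) for bit in bits)


def dna_to_bytes(seq: str) -> bytes:
    """Decode a DNA string produced by bytes_to_dna back into bytes."""
    bits = "".join(BIT_FOR_BASE[base] for base in seq)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    encoded = bytes_to_dna(b"DNA")
    print(encoded, dna_to_bytes(encoded))
```

Because two bases map to each bit, an encoder has freedom to choose among many DNA sequences for the same data, which is what lets a scheme of this kind steer around unwanted sequence patterns.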
Subsequently, the DNA sequence is synthesized and assembled into plasmids, followed by bacterial amplification. The DNA molecules can then be extracted and stored in data blocks named DNA Drive.
Arwani did not disclose any detailed performance metrics other than saying that the enzymes engineered by the company are “very sturdy” and can be used with “a lot of efficiencies.”
So far, Arwani said, the longest molecule Biomemory has produced is a 10 kb double-stranded DNA, which took the company “several weeks” to make. As for the error rate, he said that when the team copied the synthesized DNA by cloning, it saw one error in a billion bases.
To retrieve the information stored in the DNA, the molecule is sequenced and the data is converted back to a binary file. While the DNA synthesized for data storage is readable by any sequencer on the market, Arwani said the company has optimized the molecules to be read by sequencers from Oxford Nanopore Technologies due to their ease of use and fast turnaround.
Arwani noted, however, that the company designed its encoding scheme to avoid runs of three identical bases, because nanopore sequencing is prone to errors in repetitive regions.
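The constraint Arwani describes, no three identical bases in a row, can be sketched in a few lines of Python. The example assumes a one-bit-per-base mapping (0 to A or C, 1 to G or T) with a fixed preferred base for each bit; the function names and the fallback rule are hypothetical and are not taken from Biomemory's encoder.

```python
# Preferred and fallback base for each bit (illustrative assumption).
OPTIONS = {"0": ("A", "C"), "1": ("G", "T")}


def max_run(seq: str) -> int:
    """Return the length of the longest run of identical bases in seq."""
    longest = current = 0
    prev = None
    for base in seq:
        current = current + 1 if base == prev else 1
        longest = max(longest, current)
        prev = base
    return longest


def encode_bits(bits: str, max_allowed_run: int = 2) -> str:
    """Encode a bit string, switching base to avoid overlong runs."""
    out = []
    for bit in bits:
        preferred, fallback = OPTIONS[bit]
        tail = out[-max_allowed_run:]
        # If the last bases already form a full run of the preferred base,
        # use the other base that encodes the same bit.
        if len(tail) == max_allowed_run and all(b == preferred for b in tail):
            out.append(fallback)
        else:
            out.append(preferred)
    return "".join(out)


if __name__ == "__main__":
    seq = encode_bits("0000011111")
    print(seq, max_run(seq))
```

Since each bit has two candidate bases, the encoder can always break a run without changing the stored data, so decoding stays a simple base-to-bit lookup.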
With €5 million in seed funding in hand, the company will continue to scale up its technology while lowering the cost, Arwani said. “To decrease the price is the North Star of everyone working on the DNA data storage problem,” he said, adding that the company plans to deliver DNA for $1 per megabyte within the next several months.
To get there, Arwani said the company will focus on the miniaturization, automation, and parallelization of its microfluidic DNA assembly device. Biomemory currently has the capability to write one DNA molecule at a time, but by the end of this year, he thinks the company will be able to produce about 3,000 to 4,000 molecules in parallel.
Besides making technological advances, Biomemory, which currently has 10 employees, also expects to double in size this year, Arwani said.
On the intellectual property front, in addition to the two patents from CNRS/Sorbonne University, Biomemory is working on two new patent applications this year, Arwani said. According to the US Patent and Trademark Office database, a patent application titled “Biocompatible Nucleic Acids for Digital Data Storage” (US20220351807A1) that lists Lemaire and Crozet as co-inventors was published in November 2022.
As for its business strategy, Arwani said he envisions Biomemory first and foremost taking on “cold” storage of data, meaning inactive data that is rarely used or accessed. “What we want to do is to build a computing device that can build DNA directly in data centers,” he said. “It is a machine that will look like a computer, but inside you will have biology.”
While the company’s technology can supposedly produce customized DNA enzymatically, Arwani said Biomemory is not interested in tapping the technology to produce oligos for biomedical and academic research applications, as Twist and DNA Script do.
“We don't have any plans for that,” he said. “We are a pure player of data storage.”