OpGen, the sole commercial provider of optical restriction-mapping technology, will resume its optical mapping service in June, and in the next few months plans to double its mapping capacity both for internal projects and for service work, In Sequence has learned.
“There will be enough capacity to meet all demand for mapping” services, said OpGen CSO Colin Dykes.
The company’s optical-restriction maps, combined with next-generation sequencing data, may allow researchers to generate high-quality bacterial genome assemblies more quickly and cost-effectively than traditional methods allow, according to scientists who developed methods for automatically using optical mapping information for bacterial genome assembly.
OpGen, based in Madison, Wis., suspended the service last year, following a $23.6 million Series A financing round, in order to develop its technology into a commercial instrument for microbial identification in clinical settings (see In Sequence 9/18/2007). But continuing demand for optical mapping services, in particular for sequence assembly and comparative genomics, has convinced the company to resume its service.
“Frankly, there is so much demand from labs for sequence assembly and a lot of genomics [applications] that the board finally approved making it available as a service business,” said Dykes.
The company continues developing the instrument and plans to deliver prototypes to first customers around the end of the year, Dykes said.
Mihai Pop, an assistant professor in computer science at the University of Maryland who developed a method to automatically incorporate optical mapping data in genome assemblies, said that technologies like OpGen’s, as well as new sequencing technologies that can also generate mate pair information, “combined together, would allow you to actually push draft genomes more towards almost-finished [genomes].”
Last week he and his colleagues at the Naval Medical Research Center published an article online in Bioinformatics describing their approach.
The scientists were looking for a method that would allow them to assemble contigs from unpaired 454 data. “The question for us was, ‘How can we go from the raw data from the 454 machine, which are good, but not for finishing genomes, and get as close as possible to a real finished genome?’” said Pop.
Optical maps seemed like a good solution, but “we realized that there are no tools to automatically take the 454 data and combine it with the optical map,” Pop said. He and his colleagues developed such tools and tested them, initially, on artificial datasets to optimize their ability to deal with a variety of errors that occur in real data, both in the optical map and in the 454 contigs.
Later, Pop and his colleagues used their method on real data from two Yersinia genomes and were able to order between 80 percent and 90 percent of the genomes. However, they could not place small contigs, which do not have enough restriction sites to match them with the map. Since the publication, Pop and his colleagues have applied their method to another eight genomes, but with similar results, he said.
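The small-contig limitation can be illustrated with a minimal Python sketch. It is not the method Pop's group published; it assumes a naive exact-site digest, a forward-orientation-only search, and a simple relative size tolerance. The function names, the EcoRI site `GAATTC` used as the enzyme, the toy sequences, and the `min_internal` threshold are all hypothetical choices made for illustration.

```python
def in_silico_fragments(sequence, site):
    """Cut a sequence at the start of every occurrence of `site`
    and return the resulting fragment lengths, in order."""
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def find_placements(internal, map_fragments, tolerance=0.1, min_internal=2):
    """Return map indices where the contig's internal fragments line up with
    consecutive map fragments, each within a relative size tolerance.
    Contigs with fewer than `min_internal` internal fragments are treated as
    unplaceable (hypothetical threshold). Forward orientation only."""
    if len(internal) < min_internal:
        return []
    hits = []
    for start in range(len(map_fragments) - len(internal) + 1):
        window = map_fragments[start:start + len(internal)]
        if all(abs(c - m) <= tolerance * m for c, m in zip(internal, window)):
            hits.append(start)
    return hits


SITE = "GAATTC"  # EcoRI recognition site, chosen only as an example enzyme

# Toy contig with three restriction sites (real contigs span kilobases).
contig = "A" * 10 + SITE + "A" * 14 + SITE + "A" * 24 + SITE + "A" * 5
fragments = in_silico_fragments(contig, SITE)
# End fragments are bounded by contig ends, not cut sites, so only the
# internal fragments are comparable with the optical map.
internal = fragments[1:-1]

optical_map = [50, 21, 29, 40]  # made-up map fragment sizes
placements = find_placements(internal, optical_map)

# A contig with a single site has no internal fragments and cannot be placed.
small_contig = "A" * 10 + SITE + "A" * 10
small_internal = in_silico_fragments(small_contig, SITE)[1:-1]
small_placements = find_placements(small_internal, optical_map)

print(fragments, placements, small_placements)
```

In this toy case the larger contig yields two internal fragments that match one position on the map, while the single-site contig yields none and is left unplaced, which mirrors the behavior the researchers described for small contigs.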
An alternative assembly approach would have been to combine the 454 data with mate-pair sequence data, generated by either Sanger or 454, but “based on lots of experience with doing these kinds of assemblies before, you never get one single scaffold that spans an entire chromosome when you do mate pairs,” Pop said. “You usually get multiple smaller scaffolds, and then you still have to order and orient those.”
Also, Sanger mate pairs would push costs “well above” the approximately $3,000 he and his colleagues paid for an optical map, he said. In the past, OpGen has charged approximately $5,000 to $7,000 for a bacterial optical map (see GenomeWeb Daily News 5/8/2006). Dykes said last week that prices will be “negotiable” and “driven by market forces.”
The other reason Pop and his colleagues were more interested in using optical maps and 454 sequencing than a hybrid Sanger-454 approach is that the data would potentially be available more quickly, an important factor in an emergency situation.
“When we are attacked by a new strain [of a pathogen], or there is an emerging disease, … it would be ideal if we could do it all in the same timeframe that we need for doing the sequencing,” Pop said.
If researchers had an OpGen instrument in-house, “the moment you get the DNA and you put it on the 454 machine, you could also put it on the OpGen machine, [and] the next day, you have both the optical map and the 454 assembly,” he added.
By comparison, building mate-pair libraries for Sanger sequencing, he said, would take “at least a week.”
However, in the absence of commercial optical-mapping instruments, researchers usually have to wait a little longer to obtain a map via OpGen’s service. Although “we can knock out a map in a day or two,” Dykes said, the company typically has a turnaround time of about two to three weeks, depending on its production schedule.
Ideally, Pop said, researchers would combine optical maps with mate-pair data. “The two would allow you to do complementary things,” he said. “Mate pairs will allow you to bring into the scaffold the very little contigs that don’t have restriction sites on them, whereas the optical map will give you the big global picture of the genome.”
Also, Pop is looking into using paired-end data from short-read technologies, such as Illumina’s Genome Analyzer or ABI’s SOLiD, in combination with optical maps.
Contigs from unpaired Illumina reads are unsuitable for this approach, he said, because they are too small and do not contain enough restriction sites to place on the optical map. But “if you have something like Illumina mate pairs, and you build little scaffolds, those scaffolds are going to be significantly larger than the contig size,” Pop said. “The question is, ‘Can I use the same approach that I used for [aligning] contigs to an optical map to align scaffolds to an optical map?’ That would probably be the way to go for those technologies.”
Others agree that combining optical maps with new sequencing technologies is a good idea. “This might be a promising addition to the new tech tool box the scientific community is working on,” Alla Lapidus, leader of the microbial genomics finishing group at the Department of Energy’s Joint Genome Institute, told In Sequence.
However, she pointed out, there are still some potential challenges with using the approach for genome sequencing and finishing. For example, for some microbes it might be difficult to produce sufficient DNA for the optical maps.
According to Dykes, OpGen recommends using a similar amount of DNA as for pulsed-field gel electrophoresis. He said a milliliter of overnight bacterial culture, or about 10^8 to 10^9 cells, would be more than sufficient. For a 5-megabase genome, this would translate to 0.5 to 5 micrograms of DNA. However, the company has generated maps from “as few as a few thousand cells,” he said.
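The DNA estimate can be checked with a short back-of-the-envelope calculation, sketched here in Python. It assumes one genome copy per cell and an average mass of roughly 650 grams per mole per double-stranded base pair, a commonly used approximation; neither assumption comes from OpGen, and the function name is hypothetical.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one double-stranded base pair


def dna_mass_micrograms(genome_bp, cells):
    """Estimate total genomic DNA in micrograms, assuming one genome copy per cell."""
    grams_per_genome = genome_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams_per_genome * cells * 1e6  # grams to micrograms


GENOME_BP = 5_000_000  # the 5-megabase genome from the text

low = dna_mass_micrograms(GENOME_BP, 1e8)
high = dna_mass_micrograms(GENOME_BP, 1e9)

print(f"{low:.2f} to {high:.2f} micrograms")
```

Under these assumptions the result is roughly half a microgram to a little over 5 micrograms, in line with the range cited. Actively dividing cells can carry more than one genome copy, so real yields may be somewhat higher.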
Lapidus also pointed to the “difficulties with choosing the optimal restriction enzymes to construct the optical maps, especially when dealing with new, uncharacterized genomes,” as well as the challenge of generating large numbers of optical maps, the need for automated data analysis, and data accuracy.
At present, to finish microbial genomes, her JGI group uses a combination of 5-fold coverage with Sanger, 20-fold with 454, and 20-fold with Illumina sequencing. “That’s what we currently do, but that is not the final answer,” she said. “We are currently working on an improvement to this approach because we believe that we can optimize this and use less Sanger,” which is the most expensive of the platforms.
“With increased quality of 454 and Solexa, we will revise our standards and will optimize the finishing process. Our goal is to produce the same quality projects for a significantly lower price,” she said.
Pop said he believes that a combination of different technologies, including optical maps, will ultimately make it affordable to finish more genomes. “Of course we should finish everything, it’s just a question of whether it’s cost-effective, and I think that these technologies are helping us get there,” he said.