By Monica Heger
This article was originally published Feb. 27.
OpGen plans to launch a human chromosome mapping service in the second half of this year that will enable the detection of large structural variations, including balanced rearrangements, large inversions and translocations, and novel insertions: "things you wouldn't see with next-generation sequencing," Richard Moore, OpGen's chief scientific officer, told In Sequence.
The service builds on upgrades the company introduced to its Argus Whole Genome Mapping platform last fall, and marks an increased focus on its services business.
Moore said that moving forward, the company plans to expand its service business to accommodate researchers who do not have the funds to make large capital purchases. Not only is it increasing its physical space by around 50 percent, but the company is also hiring in the areas of bioinformatics and software development.
"We think that sequence finishing is a big part of what we'll be doing," Moore said. "And people want to do that as a service," not buy an instrument.
Additionally, OpGen is moving into the outbreak surveillance space. It has joined the European Union's Patho-NGen-Trace program, a project that will combine next-generation sequencing with whole-genome mapping to generate accurate sequences of model pathogens and to characterize genetic markers for drug resistance, virulence, and whole-genome evolution. OpGen is also working with 11 US state laboratories to implement its technology in outbreak screening.
Structural Variation Detection
The whole-chromosome mapping service builds on upgrades to the Argus system that the company introduced last fall (IS 10/18/2011). Since then, OpGen has further increased the density of the disposable cards used in the system, so that one card can generate 3 gigabases of mappable data. Additionally, it has created algorithms that are specific to human chromosomes, which allow for the de novo assembly of a single chromosome arm.
So far, it has tested the technology on publicly available human genome sequence data from healthy individuals. When comparing those genomes to the human reference genome, the technology has uncovered a lot of structural variation, Moore said.
In every genome, for instance, he said that there are several "megabase-sized balanced events, and many tens to hundreds of indels," as well as "quite a few novel insertions."
Balanced events, in particular, cannot be detected with other technologies because they do not result in copy number changes. So, for instance, if a five-megabase piece of DNA is simply flipped around, that would not be detected with array-CGH or next-generation sequencing.
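The point about balanced events can be illustrated with a toy model. The sketch below is not OpGen's algorithm; it assumes a chromosome can be represented as an ordered list of hypothetical segment labels, and the function names are invented for illustration. It shows that flipping a stretch of segments leaves every copy-number count unchanged while the order along the chromosome, which is what a map captures, differs.

```python
from collections import Counter

def invert(segments, start, end):
    """Return a copy of `segments` with segments[start:end] reversed.

    This models a balanced inversion: no segment is gained or lost.
    """
    return segments[:start] + segments[start:end][::-1] + segments[end:]

def copy_number(segments):
    """Count how many times each segment label occurs (toy copy-number profile)."""
    return Counter(segments)

# Hypothetical reference chromosome made of six labeled segments.
reference = ["A", "B", "C", "D", "E", "F"]

# Sample genome in which segments B through E have been flipped.
sample = invert(reference, 1, 5)

# Copy-number profiles are identical, so a count-based comparison sees nothing.
same_copy_number = copy_number(reference) == copy_number(sample)

# The ordered layout differs, which is the signal an ordered map can pick up.
same_order = reference == sample

print(sample)
print("same copy number:", same_copy_number)
print("same order:", same_order)
```

In this toy case the two profiles match exactly, so only a method that records the order of segments along the chromosome can tell the sample from the reference.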
Many researchers have attempted to use next-gen sequencing with large insert sizes to detect structural variation, and while that works to some extent, Moore said it is still very difficult to detect large, megabase-sized events. The "high-end" of what can be detected with sequencing is "probably around a couple of kilobases," he said.
Additionally, for resequencing studies, novel insertions will not be detected because they "don't line up to anything" on the reference genome.
Even as read lengths of next-gen technologies increase, genome mapping and chromosome mapping will still be relevant, he said. Pacific Biosciences, whose single-molecule sequencer currently generates the longest read lengths of the next-gen technologies at more than 2 kilobase pairs, still cannot detect the large-scale rearrangements and novel insertions that can be seen with OpGen's technology, Moore said.
Once sequencers are able to generate 20-kilobase to 40-kilobase reads, "then the need for independent structural variation validation will go down dramatically," he said. And while Oxford Nanopore recently said that it will be able to generate accurate reads of that length, currently no available technology comes close, he added.
Being able to detect these large-scale structural variations will have many applications in human disease research and agriculture, he noted.
Moore said the company is currently working with well-characterized disease samples from George Church's lab and other labs to validate the technology on known samples. He expects it will be useful for mining genome sequences from disease cohorts in order to come up with novel events that may be disease-causing. Structural variations are suspected to play a role in diseases like autism and other developmental disorders, but so far, existing technology has not been able to detect these events.
OpGen's chromosome mapping service can be used in conjunction with sequence data or independent of it, Moore said. The technologies are very complementary to each other, he added, and OpGen's technology could even be used as a first-pass screen to choose which samples to study further with deep sequencing.
OpGen is also applying its technology to outbreak monitoring and public health surveillance. Moore said that the technology could be used as an initial screen, a "first-priority triage technology to understand which samples are part of the outbreak and which samples are part of the background."
The maps can be applied to the sequence data to create better assemblies and also to figure out which isolates should be sequenced in the first place, he added.
For instance, he said, Escherichia coli is present everywhere, but not all strains are pathogenic. So, when people are sick, you want to "differentiate between the ones that are associated with the outbreak."
Starting with whole-genome mapping can reduce the number of samples that would then need to be sequenced, he said.
In addition, the company has already demonstrated the use of its technology for improving the assembly of outbreak pathogen genomes. Last summer during the German E. coli outbreak, combining OpGen's mapping technology with sequence data helped to generate a more complete assembly to identify the strain's origins and how it acquired its pathogenicity (IS 6/14/2011).
The original whole-genome sequence, generated on the Ion Torrent PGM, was highly fragmented, and while the sequencing itself took just three days, the bioinformatics and assembly took much longer.
Meanwhile, using whole-genome mapping technology, OpGen researchers were able to generate a report for around half a dozen samples in less than 48 hours, Moore said.
The company is now working with 11 US state public health laboratories to implement the technology as a screening tool for outbreaks and with researchers from the University of Maryland's Institute for Genome Sciences to develop protocols that combine whole-genome mapping with next-gen sequencing.
Additionally, it is part of the EU's Patho-NGen-Trace research consortium, which is looking to develop pathogen surveillance protocols that combine whole-genome mapping with next-gen sequencing.
In particular, the consortium is looking to develop bioinformatics tools and technologies that combine sequencing and whole-genome mapping for sensitive and early detection of drug resistance and the spread of Mycobacterium tuberculosis, the bacterium that causes tuberculosis; methicillin-resistant Staphylococcus aureus, a major cause of hospital-acquired infections; and Campylobacter strains, which are a major cause of diarrhea.
The overall goal of the four-year project, which began this January, is to develop efficient pathogen epidemiological surveillance and early warning systems.