Multi-Institute Team in UK Rolls out Innovative Genome Assembler


A collaboration among British researchers has produced the first de novo assembly software capable of assembling multiple eukaryotic genomes simultaneously. The team includes researchers from the European Molecular Biology Laboratory-European Bioinformatics Institute, the University of Oxford, and the Genome Analysis Centre in the United Kingdom.

Called Cortex, the new software has already been used to jointly assemble more than 150 genomes from the 1000 Genomes Project, showing that each individual has roughly 1.4 million DNA bases that differ from the reference genome. According to Zamin Iqbal, a postdoctoral researcher at the Wellcome Trust Centre for Human Genetics at Oxford, the impetus for developing Cortex was the large amount of memory (hundreds of gigabytes or even terabytes) that standard assemblers typically use when processing next-generation sequence data.

"We were sure that a careful and efficient design could reduce this overhead, which made assembly of large genomes almost impossible — certainly no one was thinking about assembly of more than one eukaryote genome, as they could only barely do one," Iqbal says. "However, we were able to make dramatic improvements, which opened up possibilities for not just looking at two or three genomes, but hundreds. Once you open the door to simultaneous assembly of many individuals, you can bring to bear the full power of population genetics into assembly. We show in our paper how effectively this can be used to get a good call set, even when you don't have a reference genome for your species." The team's paper was published online in Nature Genetics in January.

The Cortex team is hopeful that its solution will open up several new avenues of research including the analysis of microbes and pathogens like methicillin-resistant Staphylococcus aureus and Escherichia coli. "Suppose you have a longitudinal study of blood samples from a patient with a disease caused by a pathogen. It's becoming more common to use sequencing to study evolution of the pathogen in the host," Iqbal says. "However, often there is either no good reference genome, or it is reasonably diverged from your samples. So what you really want to do is just compare your samples directly and watch the mutations appearing — Cortex makes this easy. You compare samples directly, without using a reference as an intermediate."
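As a rough illustration of what comparing samples directly can mean, the toy sketch below breaks the reads from two samples into short fixed-length words (k-mers) and reports the words found in only one sample. This is not Cortex's algorithm or interface, which the article does not describe; the function names, the example reads, and the choice of k = 4 are all assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of one read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sample_kmers(reads, k):
    """Collect the k-mers seen in any read of a sample."""
    seen = set()
    for read in reads:
        seen |= kmers(read, k)
    return seen


def private_kmers(reads_a, reads_b, k):
    """Return (k-mers only in sample A, k-mers only in sample B).

    No reference genome is involved: the two samples are compared
    with each other directly.
    """
    a = sample_kmers(reads_a, k)
    b = sample_kmers(reads_b, k)
    return a - b, b - a


# Hypothetical reads from an early and a late time point of the same
# pathogen; the late read carries a single A-to-T substitution.
early = ["ACGTACGTGA"]
late = ["ACGTTCGTGA"]

only_early, only_late = private_kmers(early, late, k=4)
print(sorted(only_early))  # k-mers lost between time points
print(sorted(only_late))   # k-mers that appear with the mutation
```

In this toy case the k-mers unique to each sample are exactly those overlapping the changed base, so the mutation stands out without aligning either sample to a reference. Real data would also require handling sequencing errors, reverse complements, and memory use, none of which this sketch attempts.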
