NEW YORK – The Vertebrate Genomes Project has generated near-complete genomes for more than two dozen vertebrate species.
The effort, which grew out of the Genome 10K project, aims to generate essentially gapless chromosome-scale reference genome assemblies for all existing vertebrate species. The first phase of the project is focusing on generating assemblies for one species from each of the 260 vertebrate orders. In 2018, researchers from the project released the first 15 vertebrate genomes they assembled, which included animals from all five vertebrate classes.
In a suite of new papers appearing in Nature and related journals, researchers from the VGP reported that they have now generated 25 high-quality vertebrate genomes and described the pipeline they developed to obtain them. The new assemblies have also enabled the researchers to uncover, for instance, previously unknown chromosomes in the platypus and zebra finch, as well as to home in on key differences between marmoset and human brain-related genes. These genomes could further help in the conservation of some species.
"We expect researchers to use the genomes for all areas of biology," Rockefeller University's Erich Jarvis, the VGP chair, said in an email. "They can access them from … public databases, including our GenomeArk."
To determine the best approach to generate near-complete genomes, Jarvis and his colleagues first evaluated various sequencing approaches and assembly methods using one species, the Anna's hummingbird, Calypte anna. They chose Anna's hummingbird to develop their pipeline because it has a small genome of about 1 Gb, is heterogametic, and already has a reference genome generated by short-read sequencing.
Following their benchmarking analysis, the researchers built an iterative assembly pipeline in which haplotype-separated long-read contigs are arranged into scaffolds using linked reads, optical maps, and Hi-C data, which together place the contigs in the correct order and organize them into chromosomes. That process was then followed by gap filling, base call polishing, and manual curation.
They applied this pipeline to generate genome assemblies for 15 other vertebrate species representing the major vertebrate classes of mammals, birds, reptiles, amphibians, teleost fish, and cartilaginous fish.
For some species, these new assemblies changed what was known about their genomes. For platypus, the researchers identified 18 structural differences in 13 scaffolds between the new assembly and its previous Sanger sequencing-based reference and filled in a number of large gaps. They further identified eight additional chromosomes within the platypus genome and seven in the zebra finch genome.
The new assemblies also enabled a number of additional studies. In a related study appearing in Nature, Jarvis and his colleagues relied on 35 genomes from across the vertebrate lineage, including ones generated through the VGP, and four invertebrate genomes to examine the relatedness of the neurotransmitters oxytocin and arginine vasopressin. Using BLAST and BLAT analyses, they identified six major oxytocin-vasotocin receptors among vertebrates that they then analyzed in further detail.
Their analysis suggested that these receptors arose from a single receptor, shared with invertebrates, and that they proliferated in vertebrate genomes through a combination of whole-genome and large segmental duplications.
At the same time, in another Nature paper, Jarvis and his colleagues focused in detail on the genome of the common marmoset (Callithrix jacchus), a widely used model organism. When they compared the marmoset genome to the human genome, they found that while most brain-related genes are conserved between marmoset and human, four marmoset genes, including APOE, carry variants that are pathogenic in humans, suggesting that care should be taken when using marmosets as models.
These new VGP assemblies could also aid conservation efforts, the researchers noted. Among their new sequence assemblies is that of the kākāpō, a critically endangered parrot from New Zealand. An analysis that is to appear in Cell Genomics found that this parrot has been able to purge deleterious mutations from its genome despite low genetic diversity.
The VGP next plans to complete Phase I of the project by finishing the sequencing of representative species from each of the 260 vertebrate orders before embarking on Phase II, which entails sequencing representatives from each vertebrate family. The effort also includes additional projects, such as one Jarvis is involved with that aims to identify genes that enable spoken language.
"We will get a spectacular picture of how nature actually filled out all the ecosystems with this unbelievably diverse array of animals," co-author David Haussler, a computational geneticist at the University of California, Santa Cruz, said in a statement.