NEW YORK (GenomeWeb News) – The nuclear and mitochondrial genomes of Amborella trichopoda, published as a trio of papers in Science today, offer researchers glimpses into the evolution of flowering plants.
Because Amborella, a small tree found in New Caledonia, is the only living descendant of a sister lineage to all other living flowering plants, an international team of researchers assembled the plant's genome and compared it to those of other angiosperms to better understand the origin and diversity of flowering plants. In addition, a separate group of researchers sequenced the plant's multiple mitochondrial genomes to get a handle on how horizontal gene transfer influences evolution.
"In the same way that the genome sequence of the platypus — a survivor of an ancient lineage — can help us study the evolution of all mammals, the genome sequence of Amborella can help us learn about the evolution of all flowers," said Victor Albert from the University of Buffalo in a statement. Amborella diverged from other angiosperms about 160 million years ago.
To assemble the draft Amborella genome, researchers with the Amborella Genome Project had to turn to a mix of technologies. Once they assembled its genome, they found that it shows synteny with the genomes of other flowering plants, though with marked differences in the size of microRNAs as well as limited evidence of recent transposon activity. Additionally, they noted that the Amborella genome had undergone a duplication event that traced back to before its split with other angiosperms. Meanwhile, the plant's mitochondrial genomes indicated that it acquired foreign DNA from green algae, mosses, and other angiosperms.
"This work provides the first global insight as to how flowering plants are genetically different from all other plants on Earth," added the University of Florida's Brad Barbazuk in a statement.
Researchers led by Barbazuk generated more than 23 gigabases of DNA sequence from single and paired-end 454 FLX and FLX+ reads, paired-end Illumina HiSeq reads, and Sanger-sequenced bacterial artificial chromosome-end reads. They assembled these reads into about 5,750 scaffolds totaling 706 megabases.
Their sequence, the researchers reported, has about a 31x average depth of coverage, and the assembly covers more than 94 percent of the genome. Flow cytometry had estimated the Amborella genome to be about 870 megabases, while Barbazuk and his team placed it closer to 748 megabases.
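As a rough check, the reported figures are consistent with one another. The short Python sketch below is not from the papers; it uses only the rounded numbers quoted in this article (more than 23 gigabases sequenced, a 706-megabase assembly, and the team's 748-megabase genome size estimate), and the function and variable names are illustrative.

```python
def depth_of_coverage(total_bases, genome_size):
    """Average number of times each base of the genome was sequenced."""
    return total_bases / genome_size


def fraction_assembled(assembly_size, genome_size):
    """Share of the estimated genome represented in the assembly."""
    return assembly_size / genome_size


# Rounded figures as quoted in the article (assumed exact for this sketch).
SEQUENCED_BASES = 23e9   # "more than 23 gigabases", so this is a lower bound
ASSEMBLY_BASES = 706e6   # total scaffold length
GENOME_BASES = 748e6     # the team's genome size estimate

depth = depth_of_coverage(SEQUENCED_BASES, GENOME_BASES)
covered = fraction_assembled(ASSEMBLY_BASES, GENOME_BASES)
print(f"{depth:.1f}x depth, {covered:.1%} assembled")
```

With these inputs the depth comes out just under 31x and the assembled fraction just over 94 percent, in line with the figures the researchers reported.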
Barbazuk and his colleagues noted that coverage of the two BAC contigs by the assembled sequence contigs indicated that the regions were well represented in the assembly, and all 155 N90 scaffolds included physically mapped BAC sequences. In addition, the researchers assessed the accuracy of the assembly through cytogenetic FISH analysis of the BACs assembled into 104 scaffolds comprising 430 megabases.
However, the researchers noted that gaps remained in their assembly, and to close them, they worked with OpGen on a whole-genome mapping approach. They then compared the scaffold sequences to the single-molecule restriction maps and identified some 30 joins between scaffolds, which they merged into superscaffolds. This, they noted, increased both their N50 and N90 about twofold.
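For readers unfamiliar with the metric, N50 is the scaffold length at which scaffolds of that size or larger account for at least half of the assembly, and N90 is the same at 90 percent. The Python sketch below illustrates the standard definition on made-up scaffold lengths; it is not the project's code, and the toy numbers are chosen only to show how joining scaffolds raises both values.

```python
def nx(lengths, fraction):
    """Return the Nx statistic: the length L such that scaffolds of
    length >= L together make up at least `fraction` of the total."""
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths (arbitrary units), before any joins.
before = [50, 40, 30, 20, 10, 10]
# The same sequence after two hypothetical joins: 30+20 and 10+10.
after = [50, 50, 40, 20]

print("N50:", nx(before, 0.5), "->", nx(after, 0.5))
print("N90:", nx(before, 0.9), "->", nx(after, 0.9))
```

The total length is unchanged by the joins, but because fewer and longer pieces now make up the assembly, both statistics rise.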
With the Amborella genome in hand, Amborella Genome Project researchers noted that the plant's genome harbors evidence of an ancient genome duplication event, though no indication of any further whole-genome duplication events since it diverged from the other angiosperms. This event, the team said, is the most ancient known genome duplication for which there is still structural evidence. Still, many of the genes in the Amborella genome fall in an order that resembles that of other flowering plants.
This synteny, the researchers noted, enabled them to reconstruct the ancestral gene arrangement in eudicots, which comprise about three quarters of all angiosperms. This ancestral plant contained some 14,000 protein-coding genes, the researchers determined.
Looking more broadly at a reconstructed ancestral angiosperm, and using Amborella as an anchor group in their analysis, the researchers uncovered some 1,179 gene lineages that arose with angiosperms. These include a number of lineages related to reproduction as well as MADS-box genes that regulate flower development. The researchers noted that a number of these gene lineages continued to evolve, especially with regard to reproductive processes.
"These observations suggest that orthologs of most floral genes existed long before their specific roles were established in flowering, and that they were later co-opted to serve floral functions," the researchers wrote. "After the origin of angiosperms, new genes originated or were recruited to refine or more narrowly parse functions associated with flower development."
Meanwhile, the Amborella genome has some quirks of its own. The team found that the transposons in Amborella are markedly older than those in other angiosperms, which they speculate may be due to the presence of an effective silencing mechanism or to the loss of an active transposase.
"Amborella is unique in that it does not seem to have acquired many new mobile sequences in the past several million years," said Sue Wessler from the University of California, Riverside. "Insertion of some transposable elements can affect the expression and function of protein-coding genes, so the cessation of mobile DNA activity may have slowed the rate of evolution of both genome structure and gene function."
In addition, the Amborella genome contains larger-than-usual regulatory small RNAs. Through small RNA-seq, the researchers found some 56,000 loci that generate regulatory small RNAs of 20 to 24 nucleotides in length. Most of the lineage-specific small RNAs were in the 23- to 24-nucleotide range, at a frequency more than twice that of other land plants.
The Amborella genome also provided insight into the plant's own population history, and an analysis of variation in 12 Amborella individuals indicated that there are four distinct Amborella populations in New Caledonia that went through a series of population bottlenecks, including one about 100,000 years ago.
Amborella also contains a large, 3.9 megabase mitochondrial genome, which researchers led by Jeffrey Palmer at Indiana University noted includes the equivalent of six foreign mitochondrial genomes. That DNA was acquired through horizontal gene transfer.
By sequencing these Amborella mitochondrial genomes, the researchers found that they contain large chunks of DNA, some approaching full mitochondrial genomes in size, from mosses, green algae, and other angiosperms, all in relatively unchanged form.
The researchers suspect that Amborella takes up such large amounts of foreign DNA when it is wounded and its cells, along with those of the organisms growing on it, are broken open. Some of the cells that heal could then be incorporated into the meristem and the plant germline.
The mitochondrial fusion that then occurs, the researchers noted, is similar to what happens in other land plants and green algae.
"[T]he Amborella genome has swallowed whole mitochondrial genomes, of varying sizes, from a broad range of land plants and green algae. But instead of bursting from all this extra, mostly useless DNA, or purging the DNA, it's held on to it for tens of millions of years," Palmer said. "So you can think of this genome as a constipated glutton, that is, a glutton that has swallowed whole genomes from other plants and algae and also retained them in remarkably intact form for eons."