NEW YORK (GenomeWeb News) – More and better organism sampling aimed at achieving greater genomic depth will be necessary to flesh out the tree of life, a new analysis suggests.
In a paper appearing online in Science yesterday, University of Arizona evolutionary biologist Michael Sanderson assessed the current state of the eukaryotic tree of life, looking at the phylogenetic signal present in the available data. He found that while there is a relatively strong signal for well-studied groups, such as vertebrates and non-vertebrate animal models, the information available for the other eukaryotes is very broad and mostly insufficient for creating one unified tree of life.
In the article, Sanderson argued that creating a eukaryotic tree of life remains a lofty but ultimately achievable goal. “Construction of a high-resolution phylogenetic tree containing all eukaryotic species in the database is a grand challenge that is substantially more tractable than inferring the entire tree of life,” he wrote, “but to succeed, strategies will have to overcome serious sampling impediments.”
First, though, Sanderson emphasized the need to understand the strength and distribution of the phylogenetic data currently available in the NCBI taxonomy tree. To do this, he looked at the phylogenetic signal found in 1,127 higher taxa representing 14,289 phylogenies and 2.6 million GenBank sequences.
The analysis revealed a strong phylogenetic signal for vertebrates, especially humans, and for model organisms such as Drosophila. Overall, though, just 12 percent of the operational taxonomic units tested reached the minimal level of phylogenetic support used in Sanderson’s analysis.
In general, the groups of eukaryotes with more species diversity were slightly less likely to have achieved minimal phylogenetic support. “Some taxa with surprisingly low support exemplify how biological diversity can overwhelm substantial and sustained phylogenetic efforts,” Sanderson wrote.
He also noted that the sequence data available so far “are enriched for taxonomic diversity to the relative exclusion of some high-throughput genomics data, which, though presently available for only a small fraction of eukaryotic taxa, ultimately should enable stronger phylogenetic inferences.”
Sanderson predicted that better phylogenetic inference tools will eventually improve the information that can be gleaned from available data, but he also emphasized the need for new sampling strategies and appropriately targeted sequencing projects.
“[S]ampling protocols guided by quantitative assessments of the phylogenetic distribution of data will improve the efficiency of emerging phylogenomic strategies for building the tree of life of known organisms,” Sanderson wrote.