NEW YORK (GenomeWeb) – An international group of researchers has used mass spec-based protein analysis to unravel the evolutionary history of a pair of South American ungulates, members of the taxa Toxodon and Macrauchenia.
Detailed in a paper published this week in Nature, the study provides insight into mammalian development while also demonstrating the potential of proteomics as a tool for evolutionary analysis of ancient samples, Ian Barnes, a researcher at London's Natural History Museum and corresponding author on the paper, told GenomeWeb.
Traditionally, such studies have focused on genomic evidence, Barnes noted. However, he said, after around a million years, DNA is typically too degraded to be used for analysis. Collagen, on the other hand, can remain intact for around 10 million years, making it a potentially useful sample source for researchers interested in studying ancient organisms.
Additionally, collagen makes up a large percentage of bone, further adding to its potential suitability as a sample source.
Nonetheless, relatively little research has been done using collagen – or other proteins – for analysis of ancient samples, Barnes said, noting that to his knowledge only one or two previous studies have used the approach specifically to answer evolutionary questions.
In large part, he said, this has been due to technological limitations, with sequence coverage being a particular challenge.
"Sequence coverage has been a big problem," Barnes said. "Not just sequencing certain parts of the protein which are perhaps better preserved or easier to infer from the data. I think that is probably the major barrier to using these molecules in a phylogenetic context."
In the Nature study, the researchers sequenced collagen samples on both a Thermo Fisher Scientific Q Exactive instrument and a Bruker Maxis HS QTOF system, combining data from eight runs to assemble type I collagen (COL1) protein sequences that were 89.4 percent complete for Macrauchenia and 91 percent complete for Toxodon.
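As a rough illustration of what "percent complete" means in this context, the sketch below computes sequence coverage: the fraction of residues in a reference sequence covered by at least one identified peptide, pooled across runs. The function name, the toy reference sequence, and the peptide lists are all hypothetical, and exact string matching is a simplifying assumption; the article does not describe the study's actual assembly pipeline.

```python
def sequence_coverage(reference, peptides):
    """Return the fraction of reference residues covered by at least one peptide.

    Peptides are placed by exact string match (an assumption made for
    illustration; real searches tolerate modifications and substitutions).
    """
    if not reference:
        return 0.0
    covered = [False] * len(reference)
    for peptide in peptides:
        if not peptide:
            continue
        start = reference.find(peptide)
        while start != -1:
            # Mark every residue spanned by this occurrence of the peptide.
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = reference.find(peptide, start + 1)
    return sum(covered) / len(reference)


# Hypothetical toy data: a 20-residue reference and peptides from two runs.
reference = "GPAGPPGPIGNVGAPGAKGA"
run_a = ["GPAGPPGP", "GAPGAK"]
run_b = ["PIGNVG"]

single_run = sequence_coverage(reference, run_a)
pooled = sequence_coverage(reference, run_a + run_b)
print(f"run A alone: {single_run:.0%}, runs pooled: {pooled:.0%}")
```

In this toy case, pooling the second run fills the gap between the two peptides from the first run, which is the general reason combining runs can raise coverage.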
To validate these sequences, the researchers compared them to analyses of other fossils and modern samples, finding that amino acid variations in the COL1 chains were located at positions similar to those seen in genomic-based collagen sequences. Additionally, they manually validated selected peptide matches, confirming that these matched the peptides assigned by database searching.
For their phylogenetic analysis, Barnes and his colleagues assembled a set of 76 additional mammalian COL1 sequences along with one out-group sequence and the Macrauchenia and Toxodon sequences.
Such analyses, Barnes explained, "basically revolve around looking at the degree of similarity in the sequences you have available."
For instance, he said, researchers examine changes in amino acids between the sequences of related organisms and score these changes based on how "expensive" they are: the more biochemically different an amino acid is from its substitute, the more costly that change is considered to be.
Based on these scores, researchers can build evolutionary models, establishing when and where various organisms and groups of organisms branched off from one another.
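The scoring idea Barnes describes can be sketched in a few lines. The example below is a minimal illustration, not the method used in the Nature paper: the grouping of amino acids into four chemical classes, the 0/1/2 cost scheme, the taxon labels, and the short aligned sequences are all assumptions invented for the example. Real phylogenetic analyses rely on fitted substitution models and tree-building algorithms rather than a simple average cost.

```python
# Simplified, illustrative grouping of amino acids by side-chain chemistry.
CLASSES = {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "positive": "KRH",
    "negative": "DE",
}
RESIDUE_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}


def substitution_cost(a, b):
    """Toy cost: 0 if identical, 1 within a class, 2 across classes."""
    if a == b:
        return 0
    if RESIDUE_CLASS[a] == RESIDUE_CLASS[b]:
        return 1
    return 2


def alignment_distance(seq1, seq2):
    """Average substitution cost per compared position of two aligned sequences.

    Positions where either sequence has a gap or unknown residue are skipped,
    mirroring the incomplete coverage typical of ancient protein data.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    total = 0
    compared = 0
    for a, b in zip(seq1, seq2):
        if a not in RESIDUE_CLASS or b not in RESIDUE_CLASS:
            continue
        total += substitution_cost(a, b)
        compared += 1
    return total / compared if compared else 0.0


# Hypothetical aligned fragments; the labels do not correspond to real taxa.
sequences = {
    "taxon_a": "GPAGPKGEAG",
    "taxon_b": "GPAGPRGEAG",
    "taxon_c": "GPSGPKGDTG",
}

names = sorted(sequences)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        d = alignment_distance(sequences[first], sequences[second])
        print(f"{first} vs {second}: {d:.2f}")
```

Under this toy scheme, taxon_a and taxon_b differ only by a lysine-to-arginine swap within the same class, so they score as much closer to each other than either does to taxon_c.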
In the case of Macrauchenia and Toxodon and the larger group of 280 genera classified as South American native ungulates, questions have persisted, the Nature authors wrote, regarding when they arose, whether they had one or several origins, and whether they should be classified with the superorder Afrotheria, which includes animals like elephants and manatees, or the superorder Laurasiatheria, which contains horses and cattle.
Researchers, they noted, have struggled to place the organisms because of the inconclusive nature of morphology-based analyses and because of a lack of success finding suitable ancient DNA samples.
Using the protein-based analysis, Barnes and his colleagues were able to shed some light on these questions, developing a model that places the South American ungulates in a monophyletic group most closely related to Perissodactyla, which contains horses and rhinoceroses and is part of Laurasiatheria.
Of the findings, Barnes said that "in terms of our understanding of how mammals evolved and ended up looking quite similar in certain respects to things they are distantly related to, I think [classification of these ungulates] is quite a big question."
Additionally, he said, "the use of the technology is still pretty novel and very well deployed here."
Barnes and his colleagues' success with the proteomic approach notwithstanding, DNA is still the preferred molecule for such analyses when it is available, he said.
"In general, DNA sequence data is often more informative than amino acid sequence data from a protein like collagen, which is heavily constrained and doesn't change very much," he said.
In the Nature paper, the authors also noted that basing their analysis on the COL1 sequences meant they were essentially looking at material from just two genes. Barnes said, however, that this was actually reasonably robust compared to traditional DNA-based analyses.
"In the past with ancient DNA we have almost entirely worked on mitochondrial DNA, and even there only a short section of it," he said. He added that with the advent of next-generation sequencing, researchers can now look at both mitochondrial and nuclear DNA but that "there are relatively few papers that actually do that."
"So the fact that two nuclear genes are represented here is actually pretty good by comparison," he said.
The Nature authors suggested that "with ongoing improvements in instrumentation and analytical procedures, proteomics may produce a revolution in systematics such as that achieved by genomics, but with the possibility of reaching much further back in time."
Collagen, Barnes said, is the obvious candidate for such work due both to its abundance and durability and to the large database of collagen sequences from various mammals that researchers now have at their disposal.
He said, though, that he thought in the future the field would expand its research to other proteins.
"I suspect we will be looking at a large range of proteins in some very well preserved samples and possibly other longer-surviving proteins in other types of fossil material," he said.