
Using Mass Spec to Sequence Protein of a 68-Million-Year-Old T. Rex: Apr 13, 2007


Who: John Asara
Position: Director, Beth Israel Deaconess Medical Center Mass Spectrometry Core, 2003 to present; instructor in pathology, Harvard Medical School, 2005 to present.
Background: Research mass spectrometrist, Harvard Microchemistry and Proteomics Analysis Facility, 2000 to 2002; postdoc work in proteomics, Harvard University; scientist, Beyond Genomics, 2002 to 2004.

In 2005, Mary Schweitzer, a paleontology professor at North Carolina State University, found and confirmed the existence of protein in soft tissue recovered from a 68-million-year-old Tyrannosaurus rex bone fossil.
After learning about Schweitzer’s discovery, John Asara, who had worked with her in the past, set out to sequence the proteins by mass spectrometry. When he did, he found that certain T. rex sequences matched sequences found in chickens, frogs, and newts.
His findings are being published in the April 13 edition of Science.
Below is an edited version of a conversation ProteoMonitor had this week with Asara.
How did you get started on this work?
This stems from a previous collaboration with Mary Schweitzer from North Carolina State University, who in 2005 discovered soft tissues from … T. rex fossils, which are really amazing. They were basically flexible and they even showed sort of the possibility of blood vessels and such things like that.
In 2002 when I was in [Harvard University Director of microchemistry and proteomics analysis] Bill Lane’s lab, we had sequenced collagen from a 100,000- to 300,000-year-old mammoth skull with Mary. And she did some immunohistochemistry and she showed antibody reactivity to the fossil. In this study [which appears in the April 13 Science], … we show the protein sequences from a 160,000- to 600,000-year-old mastodon and a 68-million-year-old T. rex.
And this was done by LC-MS/MS using an LTQ ion trap mass spectrometer. We found [that] for the mastodon we got very significant protein coverage where we got almost three-quarters of the amount of collagen protein that we found from modern ostrich, which was amazing since it was approximately half-a-million years old.
When doing a database search against the all-species protein database from the National Center for Biotechnology Information … of course mastodon is not in the database, so we would have to rely on identical matches to related species such as cow. We [also] got dog, we got elephant — although the elephant database was incomplete because that’s from the DNA sequencing — and also some human. So basically mammalian species which were similar.
What we also did was try to find unique sequences, de novo sequences which are unique to the mastodon. And we used ostrich as a control for this because ostrich has modern bone, and ostrich was also not present in the public protein databases.
We aligned known collagen sequences from species that evolved before the species of interest and afterward. We took three different species. The thing is [that] collagen is very highly conserved. In fact, from cow to human, it’s about 97-percent identical.
When we did this alignment, we looked for all the spots where we had a misalignment. Anything that aligned perfectly, we basically left alone. And any place where we had a misalignment, we sort of generated theoretical protein sequences that we could populate the protein database with — basically guesses that we could then search the data against in order to uncover unique sequences.
We used a consensus algorithm that we developed and the point accepted mutation [PAM] matrices to basically make these guesses. By searching the mastodon, we actually uncovered four unique protein sequences which matched nothing else in the protein database, which are sort of unique to the mastodon. Then we validated those by having synthetic peptides made in order to match the MS/MS spectra.
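[Editor's illustration: The alignment-and-guess step described above might be sketched roughly as follows. This is not the consensus algorithm Asara's group developed, and it applies no mutation-matrix weighting. It simply takes pre-aligned, equal-length sequences, finds the columns where they disagree, and enumerates every combination of residues seen at those columns as candidate entries for a search database. The fragments and species labels are made-up toy values, not data from the study, and gaps are not handled.]

```python
from itertools import product


def percent_identity(a, b):
    """Percent of aligned columns at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def misaligned_columns(aligned):
    """Indices of alignment columns where the sequences do not all agree."""
    return [i for i, col in enumerate(zip(*aligned)) if len(set(col)) > 1]


def candidate_sequences(aligned):
    """Enumerate sequences built from the residues observed at each column.

    Columns that align perfectly contribute a single choice, so only the
    misaligned columns multiply the number of candidates.
    """
    choices = [sorted(set(col)) for col in zip(*aligned)]
    return ["".join(residues) for residues in product(*choices)]


# Hypothetical aligned fragments (illustrative only, not real collagen data).
species = {
    "species_a": "GPAGPQGPR",
    "species_b": "GPSGPQGPR",
    "species_c": "GPAGPQGAR",
}
aligned = list(species.values())

identity_ab = percent_identity(aligned[0], aligned[1])
mismatches = misaligned_columns(aligned)
candidates = candidate_sequences(aligned)
# Candidates not already present in any known species are the new "guesses".
novel = [c for c in candidates if c not in aligned]

print(f"identity a vs b: {identity_ab:.1f}%")
print("misaligned columns:", mismatches)
print("novel candidates:", novel)
```

[In practice the number of candidates grows multiplicatively with each misaligned column, which is presumably one reason a scoring scheme such as a mutation matrix would be used to prioritize guesses.]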
We also did that with ostrich before [doing it with the] mastodon, and we found seven unique sequences to ostrich. So we knew with the modern bone that this was a good approach to use with the fossils.
Now for the T. rex, we found far [fewer] proteins, barely enough to break the threshold of the mass spectrometer. I would say low to sub-femtomole levels of peptides.
Why is that?
Because it has degraded so long, over 68 million years compared to the mastodon, which is only half-a-million years old.
What gave us a chance is that this was so very well preserved. This basically was the T. rex that was shown in Mary Schweitzer’s 2005 paper that showed all this excellent preservation. So that gave us a shot.
The thing is after doing the bone extractions, there was still a large amount of this gritty, brown contaminant, and in order to clean up that contaminant and purify the peptide and concentrate the peptide in order to get the sequences, we had to go through several purification steps, which involved solid-phase extraction as well as microchromatography, including ion-exchange chromatography.
In the end, over five extractions and a year and a half, we got seven sequences and the majority of these collagen sequences matched chicken, which supports previous reports over the last several years that birds may have evolved from dinosaurs.
And we also got matches to frog as well as newt, unique matches to frog, unique matches to newt, as well as several matches to chicken, which were unique.
And these all make sense when you think about the evolution of the dinosaur and where it might have fallen in relation to these modern organisms. And we knew that it was not contamination from a modern source or from a single source, or else all the sequences would have matched a single organism.
The fact that we got all of these unique matches from different organisms really made us confident that we had T. rex sequences and then we synthesized peptides in order to validate the MS/MS spectra. Of course this was all done using nanoflow LC-MS/MS at low flow rates.
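[Editor's illustration: The interview does not say how the fossil spectra were scored against the synthetic-peptide spectra. One common, generic way to compare two MS/MS peak lists is a cosine similarity over binned m/z values, sketched below. The bin width, function names, and peak lists are all assumptions made for illustration and are not taken from the study.]

```python
import math
from collections import defaultdict


def bin_peaks(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity between two binned spectra.

    Returns 0.0 when either spectrum has no signal; for non-negative
    intensities the result lies between 0 and 1.
    """
    a = bin_peaks(peaks_a, bin_width)
    b = bin_peaks(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up (m/z, intensity) peak lists standing in for a fossil-derived
# spectrum and the spectrum of a synthetic peptide with the same sequence.
fossil = [(175.1, 30.0), (272.2, 55.0), (400.2, 100.0)]
synthetic = [(175.1, 28.0), (272.2, 60.0), (400.2, 95.0)]
unrelated = [(147.1, 80.0), (310.3, 100.0)]

print("fossil vs synthetic:", round(cosine_similarity(fossil, synthetic), 3))
print("fossil vs unrelated:", cosine_similarity(fossil, unrelated))
```

[A score near 1 means the two spectra share the same fragment peaks in similar proportions. Unit-width bins are a coarse choice that suits low-resolution ion trap data better than high-resolution instruments.]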
Now that we have sequenced from several extinct organisms which we never had before, we can start to make evolutionary relationships with modern organisms that we know the sequence for, and also make relationships among extinct organisms. And it’s truly amazing that you have this organic material, you have strings of amino acids that have actually survived so long. No one thought that a protein would ever last more than a million years.
We’ve now proven that hypothesis to be false.
So your research provides fodder for both arguments then — that both birds and amphibians descended from dinosaurs?
These are arguments that you can now make not just based on the architecture of the bones, but actually based on molecular tags.
Why did you choose to pursue this research?
Lewis Cantley [a professor of systems biology at Harvard Medical School] first saw [Schweitzer’s] publication in 2005, and he knew that I had worked with her in the past with Bill Lane in order to get sequences on the mammoth. So when he saw this preservation of soft tissue in dinosaurs, he said ‘Well, maybe you should contact her,’ just kind of half jokingly.
I took the bait and I actually did. We’ve been working on this now for almost two years, and we actually have some amazing data.
Can you talk about the unique challenges that you faced working with a sample that is so old and so degraded?
The biggest challenge is basically just finding the proteins. You start with two to five grams of bone material. And that is extracted, it’s sort of demineralized and then washed, concentrated and dried. And then you have 30 to 40 milligrams of bone extract.
From that, we were only finding very low femtomole levels of peptide or even sub-femtomole levels. The biggest challenge was just to sequence the peptide. What we did to overcome it was we tried various chromatographic techniques and solid phase extractions and we figured out this brown contamination, whatever it was. It was something like 99.9 percent of everything in this extract, if not more.
It followed the peptide contents. So you could never purify it by reverse phase, but then when we put it on strong cation exchange, we seemed to be able to wash away the brown contamination, and then we were able to purify the peptides, and we, of course, got rid of the salts, using another reverse phase technique … and then we were able to concentrate it down to a very low volume where we had pure peptide. And that gave us the best chance to do this.
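[Editor's illustration: The quantities quoted in this answer can be put on a common scale with a little arithmetic. The sketch below uses only the figures given above (2 to 5 grams of bone, 30 to 40 milligrams of extract, roughly 99.9 percent contaminant) plus Avogadro's number. The function names are hypothetical, and because the contaminant share was "99.9 percent ... if not more," the non-contaminant mass is an upper bound, not a measured peptide amount.]

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def extract_yield_percent(extract_mg, bone_g):
    """Extract mass as a percentage of the starting bone mass."""
    return 100.0 * extract_mg / (bone_g * 1000.0)


def non_contaminant_upper_bound_ug(extract_mg, contaminant_fraction):
    """Upper bound, in micrograms, on extract mass that is not contaminant."""
    return extract_mg * (1.0 - contaminant_fraction) * 1000.0


def molecules_in_femtomoles(fmol):
    """Number of molecules in a given number of femtomoles."""
    return fmol * 1e-15 * AVOGADRO


# Bounding cases built from the ranges quoted in the interview.
low_yield = extract_yield_percent(extract_mg=30.0, bone_g=5.0)
high_yield = extract_yield_percent(extract_mg=40.0, bone_g=2.0)
print(f"extract yield: {low_yield:.1f}% to {high_yield:.1f}% of bone mass")

low_ug = non_contaminant_upper_bound_ug(30.0, 0.999)
high_ug = non_contaminant_upper_bound_ug(40.0, 0.999)
print(f"non-contaminant material: at most about {low_ug:.0f} to {high_ug:.0f} micrograms")

print(f"1 femtomole = {molecules_in_femtomoles(1.0):.3e} molecules")
```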
Were you setting out specifically to figure out where the dinosaur fits in the evolutionary chain?
I think we need a little bit more sequence in order to accurately do that, and we would also like to get alternative proteins other than collagen.
But the specific goal was to see if we could get any protein. Being a mass spectrometry core at a medical center, we are focused on developing very sensitive technologies to actually probe low-level proteins in human tumors. We thought, ‘We have the most sensitive mass spectrometry technology in place that could possibly detect proteins in these old bones.’ So we thought, ‘What the hell? This would be a great test of the technology.’ And it would be a great test to actually test protein longevity. And the result of that is some evolutionary studies.
This also shows the cross-disciplinary nature of these technologies, how we can now apply these technologies which are actually in place for biomedicine, and now apply these to alternative fields.
You say that your research does not mean that you will be able to map the entire T. rex proteome.
No. Don’t forget that we are dealing with a bone, and the thing is bone cells are more than 90 percent collagen as well as other proteins that we’re going after. We cannot find the whole proteome from a human where we have plenty of materials. From something this old where you have so little material, we are really just going one protein at a time, trying to get a little bit more sensitive, trying maybe to get additional proteins.
I don’t think you’re ever going to, in the near-future, given the current technologies, get the entire proteome of a dinosaur. I don’t think that’s possible right now.
What can other proteomics researchers learn from what you did?
I think the application is, ‘If you don’t think you have enough proteins, you might.’ Because the technology is so sensitive, it can actually surprise you at times. And we also got modifications of collagen, hydroxylation being the most common.
So I think any sample is worth a try. I think that this really says that anything that you’re looking for protein in is worth trying, given the sensitivity of the current state of mass spectrometry.
Is there anything that someone doing proteomics research into cancer or neurodegenerative diseases, for example, can take from your study?

We have learned a little bit more about preparing a protein and concentrating low levels of proteins by various techniques. The thing is when trying to extract low-level proteins from tumors, which may be contaminated by non-cancerous cells, we can learn some tricks how to isolate and concentrate those proteins of interest.
