An international team of researchers has identified nearly 1,100 proteins in mitochondria, resulting in what they said is the most “comprehensive and accurate” map of the mitochondrial proteome ever developed.
The protein library provides a foundation for the systematic investigations of mitochondria and mitochondrial dysfunctions, which have been associated with a “broad spectrum” of human diseases, the researchers write in an article describing their findings.
The team of scientists from Boston and Australia used in-depth protein mass spectrometry, microscopy, and machine learning to achieve its results, published in the July 11 print edition of Cell.
Included in their protein library, called MitoCarta and available online, are 19 proteins that could be important to the function of complex 1 (C1) of the electron transport chain, including one, C8orf38, that had not previously been associated with C1.
According to the authors, mitochondrial dysfunction is responsible for more than 50 diseases including some resulting in neonatal fatalities and adult-onset neurodegeneration. It also is likely to contribute to certain cancers and type 2 diabetes.
While the human mitochondrial genome — comprising 13 genes that encode proteins and 24 non-protein coding genes — was decoded in 1981, until now researchers have been able to identify fewer than half of the estimated 1,200 to 1,500 nuclear-encoded mitochondrial proteins that make up the functional mitochondria.
A complete library of mitochondrial proteins, the authors said, “would provide a molecular framework for the investigation of mitochondrial biology and pathogenesis.”
To that end, MitoCarta represents a list of proteins identified in mitochondria with high confidence, from which proteomics researchers can begin to gain information about disease pathways through phylogenetic associations, one of the authors of the Cell article told ProteoMonitor this week.
“This paper really shows you a representation of what can be done given a high-quality compendium like this,” said Steven Carr, director of proteomics at the Broad Institute of the Massachusetts Institute of Technology and Harvard University. “It’s meant to be illustrative, not to be the only thing that can be done. So we see researchers doing what researchers always do — and that is to come up with novel and exciting and creative ways of using high-quality sources of data, mining it, and then doing follow-up, hypothesis-driven experiments.”
Earlier efforts to map the mitochondrial proteome have used technologies including mass spectrometry, epitope tagging combined with microscopy, and computation. Carr and his colleagues, however, said that each method has suffered from “intrinsic technical limitations” that compromise results. For example, MS-based approaches have sometimes confused genuine mitochondrial proteins with co-purifying contaminants, and published reports have had false-positive rates of up to 41 percent.
In addition, MS-based methods are poor at detecting low-abundance proteins and proteins expressed only in specific tissues or developmental states, and thus capture only 23 percent to 40 percent of known mitochondrial components.
For their work, which pulled together experts from a broad swath of scientific disciplines, the researchers used MS-based proteomics, then integrated the data with six other genome-scale datasets of mitochondrial localization using a Bayesian framework. They also performed “the most extensive” green fluorescent protein tagging study focused on mammalian mitochondria.
While individually none of the methods they used were novel, what is new is the integration of “very high quality mass spectrometry derived protein identifications with computationally derived predictions of mitochondrial proteins,” Carr said. “It is that Bayesian integration of the data, which involves weighing the contributions and figuring out what the log likelihood or p-value is for each of these components, and coming up with an integrated value to build this compendium,” he said. “That’s the novel part of this.”
For the proteomics work, the goal was to get at only true mitochondrial proteins, so the team designed a two-phase approach in order to identify as many mitochondrial proteins as possible while parsing out co-purification contaminants: the discovery phase, and the subtractive phase.
In the discovery phase, mass spectrometry was done on 14 organs in mice: cerebrum, cerebellum, brainstem, spinal cord, kidney, liver, heart, skeletal muscle, white adipose tissue, stomach, small intestine, large intestine, testis, and placenta. Highly purified mitochondria were isolated from each organ, and each sample was separated by SDS-PAGE.
Analysis with a Thermo Fisher Scientific LTQ Orbitrap Hybrid MS platform captured 4.7 million tandem mass spectra, and searching against the mouse RefSeq protein database resulted in the confident identification of products from 3,881 genes. In total, the authors estimated they identified 88 percent of previously known mitochondrial proteins, including 93 percent of oxidative phosphorylation proteins.
In the subtractive phase, the team performed LC-MS/MS on both crude and purified mitochondria from 10 of the 14 tissues. “This approach is based on the observation that bona fide mitochondrial proteins should become enriched during the purification process, and likewise contaminants should become depleted,” the authors said.
This resulted in 2,565 gene products, of which 709 were more abundant in purified samples, meaning they were “gold-standard” mitochondrial proteins, Carr said.
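The enrichment logic of the subtractive phase can be illustrated with a minimal sketch. It is not the authors' pipeline: the abundance values, the protein names, the pseudocount, and the choice of a log2 ratio are all assumptions made for illustration.

```python
import math

def log2_enrichment(purified: float, crude: float, pseudocount: float = 1.0) -> float:
    """Log2 ratio of a protein's abundance in purified vs. crude mitochondria.

    The pseudocount (an assumption here) avoids division by zero for
    proteins that are missing from one of the two preparations.
    """
    return math.log2((purified + pseudocount) / (crude + pseudocount))

# Hypothetical abundance values (for example, spectral counts); not study data.
abundance = {
    "protein_A": {"purified": 31.0, "crude": 7.0},   # more abundant after purification
    "protein_B": {"purified": 3.0, "crude": 15.0},   # depleted, behaves like a contaminant
    "protein_C": {"purified": 9.0, "crude": 9.0},    # unchanged
}

enrichment = {
    name: log2_enrichment(values["purified"], values["crude"])
    for name, values in abundance.items()
}

# Proteins with a positive score became enriched during purification.
enriched = sorted(name for name, score in enrichment.items() if score > 0)
print(enrichment)
print(enriched)
```

A positive score marks a protein that became more abundant on purification, the behavior the authors expect of genuine mitochondrial proteins, while a negative score marks one that was depleted.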
Data from the discovery and subtractive phases were combined to assign a probability that each MS/MS discovered protein was truly mitochondrial. The research team compiled training sets of 591 known mitochondrial genes and 2,519 non-mitochondrial genes. They calculated the likelihood ratio that a protein is truly mitochondrial “on the basis of its discovery MS/MS protein abundance and its subtractive MS/MS enrichment,” they write in the article.
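As a rough illustration of how a likelihood ratio can be derived from positive and negative training sets, the sketch below bins a single feature and compares how often each bin occurs among known mitochondrial and known non-mitochondrial genes. The bin labels, the toy training data, and the pseudocount smoothing are assumptions; the article does not describe how the authors binned or smoothed their data.

```python
from collections import Counter

def likelihood_ratios(positives, negatives, pseudocount=1.0):
    """Per-bin ratio P(bin | mitochondrial) / P(bin | non-mitochondrial).

    `positives` and `negatives` are lists of bin labels, one per training gene.
    A pseudocount (an assumption) keeps empty bins from producing 0 or infinity.
    """
    bins = sorted(set(positives) | set(negatives))
    pos_counts, neg_counts = Counter(positives), Counter(negatives)
    pos_total = len(positives) + pseudocount * len(bins)
    neg_total = len(negatives) + pseudocount * len(bins)
    return {
        b: ((pos_counts[b] + pseudocount) / pos_total)
        / ((neg_counts[b] + pseudocount) / neg_total)
        for b in bins
    }

# Toy training sets (hypothetical): each entry is the enrichment bin of one gene.
known_mito = ["enriched"] * 7 + ["unchanged"] * 2 + ["depleted"] * 1
known_non_mito = ["enriched"] * 1 + ["unchanged"] * 3 + ["depleted"] * 6

ratios = likelihood_ratios(known_mito, known_non_mito)
print(ratios)
```

A ratio above 1 means the bin is more common among known mitochondrial genes than among non-mitochondrial ones, so a protein falling in that bin gains support for mitochondrial localization.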
The research team then integrated the MS analysis with six complementary computational, homology-based, and experimental techniques to determine mitochondrial localization. They did this, they said, because while discovery/subtractive proteomics is “extremely powerful” for the discovery of true mitochondrial proteins, the approach is not sensitive or specific enough to pick up very low-abundance proteins, which “lack tryptic peptides amenable to MS, [and] localize to mitochondria only under specific conditions.”
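One common way to combine several independent lines of evidence in a Bayesian framework is to add their log likelihood ratios to a prior log-odds. The sketch below shows that idea under a naive independence assumption. The dataset names, the ratio values, and the prior are hypothetical placeholders, and the article does not say that the authors used exactly this formulation.

```python
import math

def integrated_log_odds(evidence_ratios, prior_odds):
    """Posterior log2-odds = prior log2-odds + sum of log2 likelihood ratios.

    Assumes the evidence sources are conditionally independent
    (a naive Bayes simplification made for this sketch).
    """
    return math.log2(prior_odds) + sum(
        math.log2(ratio) for ratio in evidence_ratios.values()
    )

# Hypothetical likelihood ratios for a single gene, one per evidence source.
gene_evidence = {
    "ms_proteomics": 8.0,   # strong support from discovery/subtractive MS
    "dataset_2": 2.0,       # weak support
    "dataset_3": 4.0,       # moderate support
    "dataset_4": 0.5,       # weak evidence against
}

prior_odds = 1 / 16  # assumed prior odds that a random gene is mitochondrial

score = integrated_log_odds(gene_evidence, prior_odds)
posterior_probability = 2**score / (1 + 2**score)
print(score, posterior_probability)
```

Working in log space means each dataset simply adds or subtracts support, so a gene missed by mass spectrometry can still score well if the other datasets agree.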
The combined methods resulted in the high-confidence identification of 1,098 genes and their protein products, including about one-third that had not been previously linked to mitochondria. The authors estimate the library is 85 percent complete and contains about 10 percent false positives. They claim that MitoCarta also “distinguishes itself from other catalogs by providing strong experimental support for 87 percent of genes on the basis of mass spectrometry, GFP studies, and/or literature curation.”
To study the function of some of the newly linked proteins, the researchers performed phylogenetic profiling on these proteins, comparing their corresponding gene sequences across hundreds of species, an approach that is “likely to be particularly applicable to the mitochondrion, given its unique evolutionary history of descending from a Rickettsia-like endosymbiont early in eukaryotic evolution,” the authors said.
They focused on identifying factors essential to respiratory chain complex 1 because of “its well-known association with a large number of rare and common diseases,” Carr said. Only three known assembly factors exist currently for the complex.
They built a phylogenetic tree showing the relationships of the proteins newly linked to mitochondria.
Starting with a set of 15 C1 proteins that are absent from several yeast species and are ancestral bacterial subunits that have been independently lost at least four times in eukaryotic evolution, “they asked the question, ‘What other proteins present in our MitoCarta set also showed similar loss in those particular species?’” Carr said.
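The question Carr describes amounts to comparing presence/absence profiles across species. A minimal sketch follows, assuming each protein is reduced to a binary vector (1 if a homolog is present in a species, 0 if absent) and that candidates are ranked by Hamming distance to the C1 query profile. The species labels, the candidate names, and the profiles are invented for illustration.

```python
def hamming_distance(profile_a, profile_b):
    """Number of species in which two presence/absence profiles disagree."""
    return sum(a != b for a, b in zip(profile_a, profile_b))

def matching_candidates(query, profiles, max_mismatches=0):
    """Names of proteins whose profile is within `max_mismatches` of the query."""
    return sorted(
        name
        for name, profile in profiles.items()
        if hamming_distance(query, profile) <= max_mismatches
    )

# Hypothetical species panel; 1 = homolog present, 0 = homolog absent.
species = ["human", "mouse", "species_X", "species_Y", "species_Z"]

# Profile shared by the query set of C1 subunits (lost in species_X and species_Y).
c1_query_profile = (1, 1, 0, 0, 1)

candidate_profiles = {
    "candidate_1": (1, 1, 0, 0, 1),  # lost in the same species as the C1 subunits
    "candidate_2": (1, 1, 1, 1, 1),  # present everywhere
    "candidate_3": (1, 1, 0, 1, 1),  # one mismatch
}

exact = matching_candidates(c1_query_profile, candidate_profiles)
near = matching_candidates(c1_query_profile, candidate_profiles, max_mismatches=1)
print(exact, near)
```

Proteins that were lost in the same lineages as the C1 subunits are the ones flagged as likely to be functionally associated with the complex.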
He and his co-researchers found 19 other proteins in MitoCarta that shared this profile, strongly suggesting they may be associated with C1. The researchers chose four of the 19 to test for involvement in C1 activity “by creating stable knockdowns in human fibroblasts via lentiviral-mediated RNAi,” the authors said in the article.
High levels of knockdown were achieved for all four. Each was then assessed for its effect on C1, and it was demonstrated that “Complex 1 subunit based on immunoblots and Complex 1 activity based on a biochemical assay were in fact knocked down. That more or less validated that these proteins are involved in assembly function of Complex 1,” Carr said.
One of the 19, C8orf38, showed the strongest reduction of C1 abundance and activity as a result of knockdown, an effect comparable to that of NDUFAF1, a recently identified C1 assembly factor.
“That really suggested that that protein, which had no prior biological association with activity or assembly of Complex 1, was, in fact, critical to the assembly of Complex 1,” Carr said.
Furthermore, he and his colleagues said in the article, their work demonstrates that “the mitochondrial compendium can help highlight specific candidates within linkage regions of any Mendelian mitochondrial disease.”
The researchers are continuing work on the other 18 proteins that they suspect are associated with C1, but Carr said he and his colleagues weren’t ready to share any data yet.
While Carr could not comment about what specific follow-up research will be done, he said one area that may be looked at is modulation of the proteins in patient samples. “We would use targeted MS-based methods to get to the requisite sensitivity for the analysis of patient-derived samples,” he said.
Work on developing and expanding MitoCarta will continue, but there is no specific timeline, Carr said.