The Proteoform

In a correspondence in Nature Methods, Lloyd Smith, Neil Kelleher, and the Consortium for Top Down Proteomics suggest the term "proteoform" to describe all the molecular forms that the protein product of a single gene can take. "Given the importance of capturing this protein variation in basic and translational research, and that technologies now exist to reveal it, we point out an ongoing problem in nomenclature regarding what to call it," they write.

Smith and his colleagues add that many of the currently used terms are imprecise: 'isoform' actually refers to genetic, not protein, variations, while the phrase 'protein species' cannot "distinguish between proteins originating from different genes and those originating from a single gene."

'Proteoform', they say, will help solve these issues, and they have begun using it in their work. "We find it to be intuitive and readily grasped by readers and audiences. It has an aesthetic appeal, as the simple protein analog of the genetic term 'isoform'," they add.

"It just catches on … it fills a void and rolls right off the tongue at conferences and sits well in the gut while digesting text," Kelleher tells Nature Methods' Methagora blog.


The proposal by Smith et al.

The proposal by Smith et al. in their Nature Methods correspondence to adopt the term "proteoform" would somewhat reduce the ambiguity of what is meant by a protein "isoform." Isoforms would primarily arise from alternative splicing of the same gene or, I presume, from highly related genes that arose through gene duplication and remain functional homologues. Under the proposed definition, "proteoforms" would encompass all amino acid differences arising from genetic mutations, together with the post-translational modifications, of a protein translated from an mRNA.

The difficulty with this suggestion is that the term "isoform" has also been used historically to distinguish between proteins that reside in complexes with different subunits. For example, many of the cyclin-dependent protein kinases can each bind to different cyclins. In another example, the catalytic subunit of protein phosphatase 2A can bind to a wide diversity of regulatory subunits that can affect its cellular location and substrate specificity. As defined in the Smith et al. Nature Methods correspondence, these isoforms would not be reclassified as proteoforms. However, if the word "proteoform" were extended to include protein complexes, I think that the meaning of the term would become diluted and overly broad.

The vast majority of proteins in cells appear to exist in either homogeneous or heterogeneous multimeric complexes. Perhaps it would be a good idea to adopt a third term to distinguish proteins on the basis of their quaternary structures. Carrying on with the intent of this initiative, I suggest the generalized term "proteocomplex" to refer to the specific composition and stoichiometry of subunits in a protein complex. From a quick Google search, this appears to be a novel word, but I believe that its meaning is conceptually obvious.