This is part one of a two-part interview. Part two is available here.
Name: Janne Lehtiö
Position: Platform Manager for Mass Spectrometry, Science for Life Laboratory; Head of Clinical Cancer Proteomics, Karolinska Institute and Hospital
Background: PhD, Royal Institute of Technology, Stockholm, Sweden
In 2010, the Swedish government established the Science for Life Laboratory, a $75 million life sciences institution developed as a collaboration between Uppsala University, Stockholm University, Karolinska Institutet, and the Royal Institute of Technology, with branches in Stockholm and Uppsala.
The institution, commonly known as SciLifeLab, focuses heavily on omics research, housing such projects as the sequencing of the Norway spruce genome and the Human Protein Atlas project, led by Mathias Uhlén, a researcher at the Royal Institute of Technology and director of SciLifeLab, Stockholm.
As platform manager for mass spectrometry at the laboratory's Stockholm facility, Janne Lehtiö is closely involved in various aspects of its proteomics work, and in its cancer proteomics and proteogenomics research, in particular. ProteoMonitor spoke to him about these efforts as well as advances and challenges in the field.
Below is an edited version of the interview.
Could you give an overview of your proteogenomics research?
Proteogenomics is annotating the protein-encoding genome. The way to do that is to look at the proteins [in an] unbiased [way] and see what parts of the genome code the proteins. Currently, it is not done like that; it is done by analy[zing] DNA sequence. But, of course, that is circumstantial evidence. The real evidence [for a protein-coding gene] is that you have a protein that is coded by that sequence.
That is one of the things that we do [at SciLifeLab], and for that you need to have very, very good proteome coverage … [and] you need to be able to search your proteome data [in an] unbiased [way].
What techniques do you use to obtain the level of coverage you need?
We use high-resolution peptide isoelectric focusing as a first pre-fractionation step, followed by the normal LC-MS/MS workflow. The reason we do that is because with high-resolution isoelectric focusing you can generate additional data points with the [isoelectric point] of each peptide. You separate the peptides in a pH gradient and then you take fractions of this pH gradient and inject them into the mass spectrometer for peptide sequencing. Now we know what the [isoelectric point] of each peptide is, and that will tell us something about the sequence composition, which will help us to go back into the sequence that we identify, verify that, and then go back to the genomic sequence and see what part of the genome coded that [protein] sequence.
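[The idea of using the isoelectric point as an extra check on a peptide identification can be sketched in a few lines of Python. This is an illustration only, not the SciLifeLab pipeline: the pKa values, the function names, and the 0.3 pH tolerance are all assumptions chosen for the example, and published pKa scales differ slightly. The sketch estimates a peptide's isoelectric point from its sequence and asks whether that estimate is compatible with the pH range of the fraction in which the peptide was found.]

```python
# Illustrative pKa values (an assumption; published scales differ slightly).
PKA_POSITIVE = {"Nterm": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"Cterm": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def net_charge(peptide, ph):
    """Approximate net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    # Free N-terminus contributes positive charge, free C-terminus negative.
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE["Nterm"]))
    charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE["Cterm"] - ph))
    for residue in peptide:
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def estimate_pi(peptide, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH where net charge is zero by bisection.

    Net charge falls monotonically as pH rises, so bisection converges.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(peptide, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def consistent_with_fraction(peptide, frac_low, frac_high, tolerance=0.3):
    """True if the estimated pI lies within the fraction's pH range.

    The tolerance (hypothetical value) allows for error in the pI estimate.
    """
    pi = estimate_pi(peptide)
    return frac_low - tolerance <= pi <= frac_high + tolerance


if __name__ == "__main__":
    # Hypothetical peptide sequences and fraction boundaries.
    for seq in ("DDDEEE", "ACDEK", "KKKRRR"):
        print(seq, round(estimate_pi(seq), 2))
    print(consistent_with_fraction("DDDEEE", 3.0, 4.0))
```

[A candidate identification whose estimated isoelectric point falls far outside the pH range of its fraction would be treated with more suspicion than one that falls inside it. A real workflow would use an experimentally calibrated pI predictor rather than this simple charge model.]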
And there are other things for which this high-resolution isoelectric focusing is very good. Peptide digestion creates huge [sample] complexity. The problem is that the mass spectrometer can only handle 5 to 10 peptides each second, and you feed the mass spectrometer thousands and thousands of peptides. So you need to fractionate [the digest], and isoelectric focusing — especially the way we do it in high resolution — is a very powerful fractionation method.
We've compared it to the latest mRNA sequencing, and we are getting very close with proteomics [using isoelectric focusing] to the exact same coverage as with transcriptomic sequencing. The interesting thing here is [that in some cases] we see transcripts but we don't see proteins. That could be a sensitivity issue. It could also be that not all of the transcripts are coding for protein, that there is translational regulation and different things involved.
But this part is much more interesting: we also see proteins that are not seen by the transcriptomics, and that is of course totally against the central dogma, where you have to have the transcript in order to code the protein. So this data just proves that even with modern sequencing, and this is with 300 million sequence reads, we still miss some genes. And when we analyze that fraction of the proteins where we only see the protein product, we see that they are enriched for, for example, ion channels and membrane proteins.
So you have small amounts of transcripts that are missed with deep sequencing. That is why we see this type of proteomics as an interesting complement at [SciLifeLab], [using] proteomics to annotate the protein-coding [gene] sequences.
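[The comparison described in this answer amounts to sorting genes into three groups: those seen at both the protein and transcript level, those seen only as transcripts, and those seen only as proteins. A minimal sketch of that bookkeeping follows. The function name and the gene identifiers are hypothetical placeholders, not data from the study.]

```python
def compare_detection(protein_genes, transcript_genes):
    """Split genes into three groups by where they were detected."""
    protein_genes = set(protein_genes)
    transcript_genes = set(transcript_genes)
    return {
        # Detected by both proteomics and RNA sequencing.
        "both": protein_genes & transcript_genes,
        # Transcript seen, no protein: sensitivity or translational regulation.
        "transcript_only": transcript_genes - protein_genes,
        # Protein seen, no transcript: likely low-abundance transcripts missed.
        "protein_only": protein_genes - transcript_genes,
    }


if __name__ == "__main__":
    # Hypothetical gene identifiers for illustration only.
    proteomics_hits = ["GENE_A", "GENE_B", "GENE_D"]
    rnaseq_hits = ["GENE_A", "GENE_B", "GENE_C"]
    groups = compare_detection(proteomics_hits, rnaseq_hits)
    for name, genes in groups.items():
        print(name, sorted(genes))
```

[The "protein_only" group is the one the answer singles out: it would then be passed to an enrichment analysis to ask which protein classes are over-represented in it.]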
What does this sort of deep proteome coverage bring to your more applied, clinical research?
Well, we also use this analytical depth in clinical and biological research. In [a recent] Nature Methods paper we used the [isoelectric focusing, LC-MS/MS workflow] to discover targets of the ubiquitin ligase Fbw7. Fbw7 is very important in cancer because it's a tumor suppressor, and in that paper we looked at which proteins accumulate if you knock out [Fbw7]. Then we used bioinformatics to identify a list of [Fbw7] targets, and then we validated NF-κB, which is a very central regulator, as a novel target for this ubiquitin ligase. Proteomics is a very nice way to look at ubiquitin-ligase targets since it's [a part of] protein-level regulation.
What are the implications of that paper's findings?
The broader significance is that it shows the use of quantitative proteomics in studying protein internal regulation, which is of course the whole ubiquitination system. So, in order to study that proteome-wide, or genome-wide, you need a method where you can cover the whole proteome and look at the effects of protein turnover.
But that particular paper has another interesting implication, and that is that Fbw7 has been shown to be mutated in myelomas and leukemia, and in those systems NF-κB is driving the cancer growth. So that particular protein could be [useful] in determining prognosis or selecting [myeloma or leukemia] patients [who are good candidates] for inhibiting the NF-κB signaling. Another thing is that the paper has a list of 100 other putative targets, with some very interesting proteins included there. It is one of the first proteome-wide evaluations of an E3 [ubiquitin] ligase, and it shows the sort of magnitude of regulation that these E3 ligases can have.
What other questions is your lab working on?
Another paper is more on the clinical side. It was published in Molecular & Cellular Proteomics, and there we looked at gynecological cancers, mainly vulvar cancer. We looked at two different [types of cancers], cancers that are [human papillomavirus]-infected and those that are not, [and] about 50 percent of those cancers are [infected]. And it is quite well known what is driving the tumorigenesis and development [in HPV-positive tumors], but in the HPV-negative tumors we need to find molecular targets for treatment. We did proteomics profiling to look at those, and we also looked at [cancers] that had good prognosis and bad prognosis, dividing them between relapsing and non-relapsing patients. We did a general in-depth proteomics profiling of that.
The other thing [the researchers found] in that paper is that we can generate such good proteomics data that we can do a thorough individual pathway activation analysis of each individual tumor based on the proteomics data. We don't rely on single markers; we look at the proteomics on a pathway level in each individual tumor and see if we can draw a general conclusion of pathways activated in bad prognosis versus good prognosis, or in HPV-infected tumors versus non-infected tumors.
We also have several ongoing projects — one, for instance … looking at endocrine therapy-resistant breast cancer. The other one that we are working on heavily is [a project] on neuroendocrine tumors and endocrine tumors, meaning thyroid cancer, for example, or neuroendocrine tumors in the midgut — that's the disease that killed Steve Jobs.
Similarly, we're trying to understand the molecular profile and landscape of those tumors, looking in depth at the proteomics and particularly looking at what determines the metastasis location and growth rate of those tumors. These tumors are quite slow growing, which is good in a sense because the patients live longer. But [it is] bad in the sense that it is very difficult to treat them since all our cancer treatments are hitting the fast-growing cells. These tumors just keep growing [slowly] and eventually the tumor will just eat up the patient inside. We need to find particular molecular targets for these diseases, and we can do that by proteomics screening.