This is part two of a two-part interview. Part one appeared in last week's issue.
Name: Janne Lehtiö
Position: Platform Manager for Mass Spectrometry, Science for Life Laboratory; Head of Clinical Cancer Proteomics, Karolinska Institute and Hospital
Background: PhD, Royal Institute of Technology, Stockholm, Sweden
In 2010, the Swedish government established the Science for Life Laboratory, a $75 million life sciences institution developed as a collaboration between Uppsala University, Stockholm University, Karolinska Institutet, and the Royal Institute of Technology, with branches in Stockholm and Uppsala.
The institution, commonly known as SciLifeLab, focuses heavily on omics research, housing such projects as the sequencing of the Norway spruce genome as well as the Human Protein Atlas project, led by Mathias Uhlén, a researcher at the Royal Institute of Technology and director of SciLifeLab, Stockholm.
As platform manager for mass spectrometry at the laboratory's Stockholm facility, Janne Lehtiö is closely involved in various aspects of its proteomics work, and in its cancer proteomics and proteogenomics research, in particular. ProteoMonitor spoke to him about these efforts as well as advances and challenges in the field.
Below is an edited version of the interview.
Where are you in terms of moving the findings from your research into clinical validation and implementation?
In breast cancer, where we have worked the longest, we are coming to the phase where we have [validated] some of the proteomics findings, and we are at the moment going into a larger clinical validation in retrospective material. And if that is successful then we will design a treatment trial based on the findings. It's very important to work on these long-term translational projects, because otherwise the benefits are not coming to the patients; we don't disseminate the results to the actual level of patient benefit.
We have predictive [markers] to select patients for endocrine therapy like estrogen receptor expression, but one third of these patients will relapse … So we need to find that one-third of patients and offer them additional therapies. And this is something where we need the biomarkers and we need the molecular profiling. This is one of the problems … we have applied to biomarkers, and we can select these early-relapse patients and offer them additional tailored therapy.
Where did you work before you came to SciLifeLab?
Before I moved to SciLifeLab, I was heading the clinical proteomics research lab at the Karolinska University Hospital associated with Karolinska Institute. Before that I worked at [Ciphergen, now Vermillion] for a couple of years.
It's interesting, because what [Ciphergen's SELDI MS technology, acquired by Bio-Rad] was good at was to run large series of samples, which is the way they could validate the clinical significance of their findings. What it was not good at was to dig deep into the proteome. But they ran lots of different projects and some of them were successful. In the end [Vermillion's OVA1 ovarian cancer diagnostic] turned out to be the first [US Food and Drug Administration]-approved proteomics [test]. And I think the time I spent there was very interesting in the sense that I definitely know the drawbacks and pitfalls of trying to do clinical proteomics and learned lots of lessons from working there.
Are you using mass spec for your validation work, as well, or do you move to an immunoassay platform for this phase?
When we find something in proteomics in the screening work – and this is general in any proteomics lab – we say, "How are they going to validate this finding?" And the exciting thing here is that we have several projects where we have taken 20 or 40 of our proteins and then gone into the facility of [Peter Nilsson, director of the affinity proteomics platform at SciLifeLab] and done the Luminex-based validation using affinity proteomics.
Also, for proteogenomics we do parallel sequencing, mRNA, and proteomics and couple those datasets. In this center we can select the hundred most important findings [from proteomics data] and go into the other platforms and do the validation. We can also use the data resources that the other platforms have generated to prioritize our findings. We run the same cell lines; we run the same model systems, and sometimes the same clinical materials, and that type of layering of data will help in selecting the key findings.
That is very exciting, and we're just at the beginning of learning how to do that. It's not easy learning to couple these [different] high-throughput omics technologies and data [types] together. The data structure is very different between the different platforms. The data has different types of strengths and weaknesses, so you need to actually understand the data from each platform in order to be able to generate that fusion. But we are doing a similar project where we have done the discovery on the proteomics side and have gone on to the screening and validation and sequencing verification of mutations and things like that. So that [sort of multiomics work] I think will prove to be … the best fruit of this center in the long run, provided that we can collaborate.
How are you investing in your bioinformatics infrastructure to further these ambitions?
The bioinformatics investment helps us, definitely, because the bioinformatics data flow from genomics and proteomics, after you have done the raw data analysis, is very similar – at least from transcriptomics and proteomics, since it's quantitative analysis of the gene products. But we also need to realize that the proteomics-related bioinformatics is far behind the genomics bioinformatics, and I think that major investment is needed there, and it is up to us in the proteomics field to try to attract clever bioinformaticians into the field.
[Most bioinformaticians] have been basically doing genomics bioinformatics. And I think genomics bioinformatics is close to computer science in a funny way because it is almost digital. You have four different bases that are like zeroes and ones, and it is easy to handle that. In proteomic bioinformatics, you have to handle 20 amino acids folding in different structures, decorated with post-translational modifications, raw data that is spectral data – so it is not at all zeroes and ones in a row, and I think it poses different types of challenges. You need to be able to understand how the raw data is generated; what are the post-translational modifications; what is the occupancy rate of post-translational modifications; how does protein folding depend on or affect the activity; how is protein interaction affecting the biological functions? There are more layers to it, definitely … And we need to improve these technologies.
What sort of bioinformatics skills do you need? Are there specialists in this sort of multiomics analysis, or do people typically specialize in one omics discipline or another?
I think it depends. Of course it is a very broad field, and there are specialists in proteomics bioinformatics and specialists in genomics bioinformatics and there are also specialists in pathway analysis and that sort of thing, who can combine the data from both sides. There are not really specialists yet looking [specifically] at transcriptomics and proteomics data or genomics and proteomics data. I think that will come in the coming years when these bioinformaticians are exposed to this data and the problems that we have with analyzing this data. We need to inform them what the possibilities and problems are; what the needs [are]. And I think that eventually this type of bioinformatician, this field, will develop.