NEW YORK (GenomeWeb) – Last week, scientists led by Stockholm Royal Institute of Technology (KTH) researcher Mathias Uhlén released the latest version of the Human Protein Atlas.
This new release – the thirteenth edition of the HPA – contains information on more than 16,900 human proteins across 44 different normal tissues, with complementary RNA-seq data available for 32 of the 44 tissues. It also contains data on 20 different cancer types and 46 human cell lines. Using more than 24,000 antibodies, Uhlén and his colleagues have generated immunohistochemistry-based profiles offering data on protein expression in these tissues at a single-cell level.
Launched in 2003, the HPA project currently involves more than 150 researchers worldwide. The effort is one of the primary pillars of the Human Proteome Organization, providing a catalog of protein expression and localization data supporting a variety of HUPO initiatives, including the ongoing Chromosome-Centric Human Proteome Project.
GenomeWeb spoke to Uhlén this week about the recent HPA release and future plans for the project.
Below is an edited version of the interview.
Where do things stand in terms of tissue coverage after this recent release?
We have 32 tissues where we have done both transcriptomics and antibody-based profiling, and then we have an additional 12 tissues that we have only analyzed at the protein level. So we will try to close that gap so that we have 44 tissues with both RNA and protein, but right now it is 32 [with both].
How do you integrate the protein and RNA data you've generated?
For us it has been a wonderful experience to work with the new next-generation sequencing way of doing RNA profiling. What we do with the RNA-seq is take the tissues and organs that we analyzed with the antibodies and we do quantitative measurements on the mRNA level, and that provides us with quantitative levels across all genes. The problem with it is that it gives you a sort of average expression of RNA level in that tissue or organ, so, you know, you have a mix of cells in all tissues, but it does provide us with very good quantitative data on this sort of mixture of cell types in a tissue, and that is very nice. So we can then give sort of quantitative levels of different genes in different organs. But I think the most powerful thing with the Protein Atlas is that we can then take these lists and go into these specific tissues and see pretty exactly where are these proteins more or less on the single-cell level.
So you're able to generate protein data with single-cell resolution with your antibodies?
Absolutely. With the immunohistochemistry and the microscopes we are using we can see on the single-cell level. We can see roughly if it is in the nucleus or in the cytoplasm – that is kind of the resolution we have. It's important that we can sort of see the heterogeneity of cells and so on. The [recent] launch provides this type of information across the different tissues and organs, and in two years we will hopefully launch a complete set of analyses all using confocal microscopy and then we will be looking at where are the proteins inside the cell – we will have much, much higher resolution.
What are the challenges to achieving this higher resolution? Is it just a matter of doing it, or are there technical hurdles you need to overcome?
There are major challenges around the use of antibodies. I think everyone who works with antibodies knows they are notoriously cross-reactive and that is, of course, why we have now tested more than 70,000 antibodies and picked only 24,000 that we are using. So almost two-thirds of the antibodies that we have tried have failed in our hands. And even many of the antibodies that we use we know give rise to cross-reactivity. So in quite a few genes or proteins we are actually seeing cross-reactivity, and that is something that we need to clean up, and we are doing a lot of work now to improve the quality of the protein staining. A lot of the protein [data] is excellent. But for a lot of proteins we are still seeing if we can get antibodies with less cross-reactivity, and that is the major challenge.
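The "almost two-thirds" figure in the answer above can be checked with a few lines of arithmetic. This is only a rough sketch: the interview gives the counts as approximations ("more than 70,000" tested, 24,000 in use), and the variable names are ours.

```python
# Approximate counts quoted in the interview (rounded, not exact totals).
tested = 70_000    # antibodies tried ("more than 70,000")
selected = 24_000  # antibodies kept for the atlas

failed = tested - selected
failure_rate = failed / tested

print(f"{failed} of {tested} failed ({failure_rate:.1%})")
# prints "46000 of 70000 failed (65.7%)"
```

At roughly 66 percent, the failure rate sits just under two-thirds, consistent with the figure quoted.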
Have you looked into using some sort of sandwich assay, perhaps, to improve the specificity of your measurements?
I think that is a very good question. I am a big fan of using different types of sandwich assays, but unfortunately all of these assays take a long time to develop and you have to spend a lot of time to develop even one assay for one protein. We want to analyze 20 new proteins every day, and so we have decided to go for a single antibody type of method, but I think at the end of the day it would be very nice to use sandwich assays to increase the specificity of the data. But what we are also trying to do is have several antibodies and compare the results of them, and if you have the same staining [from both] then you are more or less sure it is the correct staining.
One of the commonly cited limitations in protein research is the lack of good antibodies. You, though, seem to have tens of thousands of quality antibodies. Are these only validated for IHC? Or could these be used by researchers doing Western blots and other types of assays?
All the antibodies we have used are validated for immunohistochemistry, Western blots, confocal microscopy, and protein array, and all of the data for those you can go in and look at in Protein Atlas. Almost half of the budget of the project has been the validation of the antibodies.
It's also very interesting to use these reagents for mass spectrometry and classical proteomics, and we do that in two ways. One is to use all the proteins we have produced, the antigens, as spike-in reagents when we do different types of targeted proteomics. This is a collaboration [we have] with Matthias Mann that has been very fruitful. We have 55,000 [protein] fragments, and you can use those to develop SRM assays and SWATH assays, for instance. The other thing, which I think is very interesting, is to use the antibodies for capture of proteins and then use mass spectrometry [for analysis].
How useful are antibodies for looking at protein isoforms and variants? What can they contribute in this area?
We have tried many antibodies that claim to be isoform-specific, and we haven't been very successful, especially not in immunohistochemistry assays. So we are a little bit discouraged from using these isoform-specific antibodies. For us our focus has been to look at the sort of representative proteins from a gene and have an antibody that would hopefully capture all the different isoforms. I think that the way we would like to look at isoforms is to use the antibodies as capture reagents and use mass spectrometry for isoform analysis. But this is relatively hard to scale up with the kind of throughput we are doing, so I think this is more suited [to situations] when you have specific targets you want to analyze.
Generally speaking, what is the quality level of commercial antibodies on the market today?
We have received antibodies from almost 50 different commercial suppliers, more than 25,000 antibodies, and we have tested those, and to our disappointment the majority of those antibodies have not worked in our hands, though some have been excellent. So I was quite annoyed about this in the early days of the project, but I have since realized that whether an antibody works or not is very context dependent. You have to try it in a particular assay. An antibody could work fine in a Western blot and not at all in immunohistochemistry. So what we decided to do is start an effort where we would have all the commercial suppliers send in information to a database where you can then compare antibodies. We call it Antibodypedia, and we now have 1.4 million antibodies to human proteins in that portal, and researchers can compare antibodies to their favorite protein and see the results in different assays including Western blots and so on.
But this question about antibodies is very relevant. Everyone working in the field, this is what we worry about. Usually what happens is that the antibody recognizes the right target but it also recognizes other proteins, and it is also concentration dependent – so even if you have a very specific antibody, if your target protein is a thousand times less abundant than the other proteins, you might still have the antibody binding to these much more abundant proteins in a cross-reactive way.
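The concentration dependence described above can be illustrated with a simple one-site equilibrium binding model. This is a minimal sketch under our own assumptions, not a method described in the interview: the function names, the concentrations, and the dissociation constants (Kd) are all hypothetical, and real tissue staining is considerably more complicated.

```python
def bound(protein_conc, kd, free_antibody):
    """Equilibrium antibody-protein complex for simple one-site binding.

    All quantities share one concentration unit (e.g. nM). Hypothetical model.
    """
    return protein_conc * free_antibody / (kd + free_antibody)


def off_target_fraction(target_conc, off_conc, kd_target, kd_off, free_antibody):
    """Fraction of the total bound antibody that sits on the off-target protein."""
    on = bound(target_conc, kd_target, free_antibody)
    off = bound(off_conc, kd_off, free_antibody)
    return off / (on + off)


# Hypothetical numbers: the antibody binds its target 1,000 times more
# tightly (Kd of 1 vs 1,000), with free antibody at 1 (same unit).
equal_abundance = off_target_fraction(1.0, 1.0, 1.0, 1000.0, 1.0)
rare_target = off_target_fraction(1.0, 1000.0, 1.0, 1000.0, 1.0)

print(f"equal abundance: {equal_abundance:.1%} of signal is off-target")
print(f"target 1,000x rarer: {rare_target:.1%} of signal is off-target")
```

Under these assumed values, a 1,000-fold preference for the target keeps off-target binding negligible when both proteins are equally abundant, but most of the bound antibody ends up on the off-target protein once that protein is 1,000 times more abundant than the target.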
Are researchers careful enough about validating the antibodies they use? Do you think the field is aware enough of the various potential pitfalls?
There are of course thousands and thousands of publications where people have used antibodies and the results they are showing are actually not what they think they are looking at, and of course this is annoying. So there are quite a lot of question marks around the use of antibodies. But I think in most cases people are aware that you have to be careful with antibodies and, of course, it is always good to have two independent antibodies – you can verify results of one antibody with the second.
Do journals typically require data from researchers to show that their antibodies are working the way they think they are?
There are not strict guidelines about this. There are different policies in different journals. Some journals require that you show that the staining goes away if you remove the antigen, but it is a very crude way of showing that your antibody is specific, and it is very often a grayscale kind of situation. We also see in the Protein Atlas that sometimes you have an antibody that is beautifully specific but then in other tissues you get background staining that is probably false. So this is something you have to be aware of.
What is the next area of focus for the HPA project?
We have two areas where we think that our data is not very high resolution. One is the hematopoietic system, blood cells. This is something that is very interesting for a lot of people, and there we think we are not contributing in a very good way so far. So we would like to work more on that. The second area is the brain, which is very hard for us to do [because] it is so hard to get fresh tissue from the brain. So next year we hope to launch a Rodent Brain Atlas, which has a lot of similarities to the human brain. And then two years from now we will hopefully launch a more complete subcellular atlas.