Senior research associate
Institute of Zoology, University of Zurich
At A Glance
Name: Erich Brunner
Position: Senior research associate in Ernst Hafen's lab at the Institute of Zoology in the University of Zurich. Coordinator of the Competence Center of Model Organism Proteomes.
Background: Senior research associate in Adriano Aguzzi's lab at the Institute of Neuropathology, Department of Pathology, University Hospital of Zurich, 2001-2004.
Postdoc in Ruedi Aebersold's lab, Institute for Systems Biology in Seattle, 2001.
Postdoc in Bernhard Odermatt's lab, Department of Pathology, University Hospital of Zurich, 1997-2001.
PhD in developmental genetics, Konrad Basler's lab, Institute of Zoology, Department of Developmental Genetics, University of Zurich, 1997.
A database describing the complete proteomes of the model organisms Drosophila, Caenorhabditis elegans and Arabidopsis is currently being developed by researchers at the University of Zurich. ProteoMonitor caught up with Erich Brunner, the lead developer of the database, to find out more about the database and about his background.
What is your research background?
I got my diploma with Charles Weissmann. Basically there we did some cell culture work and work on interferons. Then I did my PhD and switched to Drosophila work. I did my PhD in Konrad Basler's lab. And then I actually made a detour and went to work for the university hospital in Zurich. I worked on thyroid cancer. Then in 2001, I joined Ruedi Aebersold's group and switched to proteomics. That's where we initiated the work on the Flycat proteome database. Then I had to go back to Switzerland, where I joined Adriano Aguzzi's group as a senior research associate. I worked there on prions. Then I was asked to reinitiate this [Flycat] project that we had started in Seattle because Ruedi Aebersold came to Zurich. I said yes. Actually, the project grew bigger because we included C. elegans and Arabidopsis. So I'm coordinating that project and I do the fly work.
Can you describe the Flycat project?
What we try to do is annotate the entire proteome of Drosophila and see which gene models are true. We also try to use our data to improve genome annotation. We still have a lot of unassigned spectra which we can map back to the genomic sequence and actually define areas where there could be additional genes. We have found such loci. And we could also refine some existing gene models because some exons seem to be mispredicted or not predicted at all. So we can add some value to the gene models. In the long run, we're trying to achieve complete coverage of the Drosophila proteome. Also, later on, we're going to go for specific modifications within the proteins. We'll try to pull them out and characterize them so that people get a complete picture of the dynamics of protein modification when the proteins are expressed. We'll go for specific tissues, for instance, and different developmental stages, and try to isolate the proteins there, then put that information back into the database, saying 'These proteins are expressed during that time.'
All the information that people have now comes from the genome itself, but protein evidence is very often lacking, so we're trying to make a protein database where all this information is stored.
How do you go about studying the modifications?
There are some papers out that allow you to specifically pull on these modifications. So you target, for example, glycosylations or phosphorylations, then you look to see what they are, and on what part of the protein these modifications sit. Basically, you specifically target the modifications, then look to see which proteins actually carry these modifications.
How far have you gotten with the Flycat database?
We reinitiated the project in the beginning of the year, and it will last at least three years. We have several test versions and they work pretty well. The idea is to make all the information that we have public as soon as possible. But of course to make a database like that public, we have to have all the infrastructure to go public and to connect people, and this is not yet done.
Initially, this was developed at the ISB in Seattle, and we have just mirrored that information here.
Do you have any background in computer science?
Not really. I'm pretty familiar, but I'm not really into programming. I work together with the programmers. We define what we would like to have, then we check the readout and see if it's what we wanted. So it's pretty interactive between different scientific groups. We have specialists for the databases, for data mining, and for extracting proteins and measuring them on a mass spec. We have different people who are specifically doing certain things, but they're interconnected and talking to each other.
Are you involved with other research projects aside from the Flycat project?
At the moment, there are other research projects ongoing, very small ones, very specific ones, but I'm not hands-on there. It's more that I'm involved through my diploma student. We're really focused on the database.
Why did you decide to get involved with the Drosophila project?
My brother did his diploma and thesis with Ernst Hafen, whose group I have now joined, and he was working with Drosophila. That's how I actually got interested in the model organism. That's why I actually did my thesis on Drosophila. I looked into different fields, and the combination of proteomics and Drosophila was the perfect match for me.
What did you do your thesis on?
On wing signaling: the development of wings and legs in Drosophila. It's also an important pathway that's activated in colon cancer. We did some pretty basic research on that: genetic screens to find new components of this pathway, and so on.
Would it have been helpful to have had such a database while you were doing your thesis?
Yes. Actually, it would have been very helpful because most of the components of this signaling pathway hadn't been identified, and we had a hard time cloning some of the genes that we found, because they weren't even annotated, and a protein database then would have been very beneficial. It would have been very useful for geneticists.
Do you have any plans to do disease research?
Probably in the long run, yes. One of the benefits that we have is that we work not only on Drosophila, but also on C. elegans and Arabidopsis. We will try to do cross comparisons between the three organisms and we'll try to see [what will be beneficial] from the data that we have from genome annotations. And then we'll probably extrapolate to humans and possibly find new genes, maybe genes that are involved in diseases. But this is a future goal. In the long run, since I have that background (I was working on prion diseases and thyroid tumors), I'll probably sooner or later get back to disease.
What kind of findings have come from comparing the different model organisms?
This I cannot say yet because [the Drosophila] is by far the furthest along, and the other ones have just started. We hope to do these analyses probably in the middle of next year.
Are there a lot of people already using the database?
Quite a few, but actually it's still for internal use at the moment. As soon as possible, we're going to go public, and we'll see how many hits we have per day on the site.
The final goal is that we improve genome annotation, define the elements within the genome and proteome, and then go towards systems biology to understand how these genes are arranged, transcribed, and translated into proteins, and how the proteins interact to finally build an organism. Also to understand how the pathways cross-talk to define networks. But the first step toward systems biology, we think, is defining the elements. We know that some annotations in the Drosophila genome are still wrong, so people are probably working at the moment with wrong annotations.
How can you tell the annotations are wrong?
Because for some of the predicted gene models, we have evidence that they have to be fused into one. There's a big stretch that is predicted not to encode any protein, and we have found peptides that actually match the genomic sequence. If you look at the gene model more carefully, you can see that these two adjacent gene models actually fuse into one. People working with one or the other gene would probably not be successful in their experiments.
We find quite a few things that have to be adjusted. We also find areas where there's no gene predicted, and we have evidence that some peptides would map there, and they're unique. So we think that there's a gene that has not yet been identified. And how many there are, we don't know.
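[Editor's illustration: the reasoning Brunner describes, matching identified peptides against the raw genomic sequence and checking whether they fall inside an annotated gene model, can be sketched roughly as follows. This is a minimal toy example and not the Flycat pipeline. The function names, the toy sequence, and the gene-model tuples are all hypothetical, and a real analysis would also have to handle introns, splice junctions, and search scoring.]

```python
# Toy sketch (hypothetical names, not the Flycat code): locate a peptide in the
# six-frame translation of a genomic sequence, then check gene-model overlap.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in T, C, A, G order.
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    return "".join(CODONS.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def map_peptide(genome, peptide):
    """Return (strand, start, end) hits in 0-based, half-open forward coordinates."""
    hits = []
    n = len(genome)
    for strand, dna in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(dna[frame:])
            idx = protein.find(peptide)
            while idx != -1:
                start = frame + 3 * idx
                end = start + 3 * len(peptide)
                if strand == "-":
                    # Convert reverse-strand positions back to forward coordinates.
                    start, end = n - end, n - start
                hits.append((strand, start, end))
                idx = protein.find(peptide, idx + 1)
    return hits


def overlapping_genes(hit, gene_models):
    """Names of gene models (name, start, end) overlapping a hit on any strand."""
    _, start, end = hit
    return [name for name, g_start, g_end in gene_models if start < g_end and g_start < end]


# Toy data: a peptide encoded outside the only annotated (hypothetical) gene model.
toy_genome = "CCATGAAAACTGCTTATGG"
toy_models = [("hypothetical_gene", 30, 60)]
for hit in map_peptide(toy_genome, "MKTAY"):
    genes = overlapping_genes(hit, toy_models)
    print(hit, genes if genes else "no annotated gene: candidate new locus")
```

[A peptide that maps uniquely and overlaps no gene model corresponds to the unannotated loci described above. Peptides that land in the predicted non-coding stretch between two adjacent gene models would be the kind of evidence that suggests the two models should be fused.]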
What other projects do you have planned for the future?
We're really focused at the moment on that project. I want to see where it carries me.