At A Glance
Name: Cheryl Arrowsmith
Position: Professor of medical biophysics, University of Toronto, and Senior Scientist, Ontario Cancer Institute, since 1992.
Chief Scientist, Canadian branch, Structural Genomics Consortium.
Background: Post-doc in magnetic resonance, Stanford University, 1986-91.
PhD in physical organic chemistry, University of Toronto, 1986.
BS in chemistry, Allegheny College, Meadville, Pa., 1981.
How did you first get involved in proteomics?
My background is in determining the 3D structures of proteins, and when genome sequencing projects were starting to finish in the late ’90s, it was clear that we needed to rethink how we went about doing our kind of science. Traditionally what we do is wait for a molecular biologist to identify an important gene or protein to work on, and then we say, “OK, that’s important. Let’s determine its three-dimensional structure, because the structure tells you how the protein works — how it carries out its enzymatic activity, or how it interacts with other proteins or small molecules.” But now the whole genome was out there, and in many cases we only knew what a fraction of these proteins did. So we thought that it might be time to try to apply structural biology on a more genome-wide scale. That has only become feasible because, also over the last 10 to 20 years, recombinant DNA techniques that allow you to express proteins from cDNA or from genomic DNA have become much more robust, and the tools of structural biology — X-ray diffraction and NMR spectroscopy — have become much more sophisticated, so it all kind of came together at the right time [such] that one can think about doing structural biology in a genome-wide context. It’s still not nearly as high-throughput as genomics or the more classical definition of proteomics, where you’re screening cells for protein expression and complexes, but it is becoming faster due to initiatives around the world that are trying to make it more high-throughput.
Tell me about the formation of the Structural Genomics Consortium and its progress.
The Structural Genomics Consortium was initially conceived by a group of pharmaceutical companies and the Wellcome Trust in the UK. The concept there was to focus on pharmaceutically relevant and medically relevant proteins. The idea came about probably four years ago or so when many structural genomics projects throughout the world were being initiated in Europe, the US, Japan, and [in] our group here in Canada. Almost all those groups were focused on the whole genome or proteins whose structure had not been solved before, or couldn’t be modeled. The Trust and pharma companies wanted to focus the targets a little bit more and perhaps not determine as many but have the structures that were determined have higher impact and be directly relevant to drug development or to medicine for understanding disease processes.
It started in the UK, and in the summer of 2002 they were interviewing for directors. Aled Edwards, my colleague here in Toronto who’s been co-directing the Toronto effort so far, applied for the directorship of the SGC and was offered the position. His strategy was to try to leverage our success so far in Canada and make it a joint Canadian-UK project. So it was really Aled’s vision for joining the two sides of the Atlantic. He was also able to get several of the Canadian granting agencies interested in this, and they essentially invested in the SGC. So it’s really a joint Canadian-UK project now.
What is your involvement in the SGC?
It’s a two-site operation — Oxford University and the University of Toronto. My title is chief scientist of the Canadian enterprise. Aled directs the whole project on both sides. They’re sort of mirror sites — they each have similar but complementary capabilities. They both do high-throughput structural biology — particularly X-ray crystallography — but the classes of proteins that will be worked on will be different on each side of the Atlantic. The project as a whole will have to develop technologies to make the process all go faster. So we’ll each try different strategies, compare notes, and then see which methodology works the best.
What sorts of strategies are you trying?
One strategy is to focus on gene families. The proteins in an enzyme family, for example, all perform a very similar type of chemistry, and therefore inhibitors or co-factors of that enzyme family would all be applicable, and binding of these molecules can help stabilize the proteins and make them crystallize better. So we’ll have several groups on each side specializing in one particular area of enzymology or group of gene families. They’ll become real expert[s] in those types of proteins, as opposed to some of the other efforts, and what we’ve done in the past here in Toronto, which is to try to look at everything in the genome. What happens there is you come across every kind of biochemical activity, and you can’t be expert in it all. That makes it more difficult to work on the harder proteins. One can always scan and screen for the easy-to-work-with proteins — what we call the low-hanging fruit — but with this project the goal is to solve a lot of these structures that are important pharmaceutically and might not be as easy to work with.
We’ll have a small effort in membrane proteins — they’re certainly a very important class. One of the PIs hired in the UK is Declan Doyle, who is a well-known membrane protein biochemist.
So one strategy is to focus on gene families. What are some other strategies?
We’re exploring a number of technical approaches. The project is not quite up and running yet — we’re still in [a] heavy recruiting phase. We’ve identified most of our principal investigators. The way the project is structured, there will be five principal investigators in the UK and five in Toronto.
The strategy will be to explore ways to parallelize and make more efficient the many strategies that structural biologists normally use to bring a gene to fruition as a structural target. These include things like exploring close sequence homologs or orthologs of the protein of interest, which may have slightly different surface properties and therefore may crystallize better; using various biophysical techniques to evaluate the stability of the proteins — [like] identifying small molecules that might bind to them and make them more stable; and exploring different constructs of the proteins to find the most stable construct that will crystallize the best. Finding the way and the order in which to apply each of those strategies is what we’re going to be doing — we don’t know what the answer is yet. That’s what we have to figure out.
We expect by this summer we’ll have enough critical mass that we’ll really be up and running and starting to produce structures.
Tell me a little about some other things you’re working on in your lab.
I also maintain a laboratory here at the OCI which is focused on cancer — on proteins and pathways that go wrong in cancer. We’ve worked for a number of years on the tumor suppressor p53. We’re looking at a number of proteins that bind to p53 and regulate its activity. For example, one of the proteins is involved in taking a ubiquitin off the protein [to] help stabilize p53.
We’re [also] working on the breast cancer susceptibility protein BRCA1, and we’ve got some preliminary data on that protein. The central region is about 1,500 amino acids, and it is the site of a lot of mutations and a lot of interactions with other proteins. But we’ve shown that in the absence of binding to anything else, this region is largely unstructured, and we expect that it may be important for sensing DNA damage and signaling to the DNA damage repair pathway, and also that it may act as a sort of clothesline for other proteins to come along and bind to and participate in some of these DNA damage foci that the protein forms. Those are two of the most recent but unpublished things we have in the lab.
When doing high-throughput structure determination, how do NMR and X-ray crystallography compare?
NMR spectroscopy [is] my background and training, and people often come to my lab to train in that area. But sometimes the proteins are not amenable to NMR analysis, so we try crystallization. Very exciting recent developments in NMR make it applicable to very large systems. If you want to ask a very specific question of a very large protein, there are NMR strategies that allow you to address those questions and answer them. In terms of determining a high-resolution structure for larger proteins — say 50 kDa and larger — it’s not fast by NMR. There are groups that are working towards making NMR applicable to larger proteins, but right now it’s not a fast method. NMR can be fast if you have very good data for proteins of 150 residues and smaller. But a lot of pharmaceutically relevant proteins are enzymes. They’re usually 300 to 400 amino acids, and [for] those proteins it’s quicker to try to crystallize them. So it depends on the protein you’re looking at. A lot of signaling domains are quite amenable to NMR. And what we’ve found is that if the protein is small and amenable to NMR, you can assess that very quickly, and then you can put your resources into solving that structure knowing you’re going to get a structure in the end. Whereas with crystallography, you have to go through the crystallization stage, and you never know if it’s going to work until you get a diffracting crystal. So there you’re putting a lot of resources into proteins that don’t always work out to give a structure. So the NMR approach, if you can make a good call about what you’re going to apply it to, can actually be quite economical.
[X-ray crystallography] is tedious, and it’s kind of a black box — you don’t know ahead of time what’s going to work, so you have to try lots of different protein constructs. But all these structural genomics projects have resulted in commercially available equipment to help do this faster. So there are crystallization robots now, and there are crystal imaging systems that automatically take an image of each little well to see if there’s a crystal in there or not, and the SGC will be using these sorts of things. Still, you can generate more data this way, but then you have to worry about sorting through all that data. So this is part of what the SGC will have to do — figure out how to make that work efficiently.