At A Glance
Mark Bedford, assistant professor, University of Texas M.D. Anderson Cancer Center, department of carcinogenesis.
PhD in developmental biology, Weizmann Institute of Science, Israel: Analyzed the developmental function of a growth factor and its receptor
Postdoc, Harvard Medical School: Studied protein domains and signal transduction
What are your research interests, and what role do protein arrays play in your research?
My basic interests are in protein modifications and protein-interacting domains. We work predominantly with WW domains and SH3 domains, and we are interested in how protein interactions are affected by posttranslational modifications.
How did you construct the protein interaction array you described in your recent Biochemical Journal paper?
The arrays were generated on a glass slide that was coated with nitrocellulose; we basically just used a cDNA arrayer. We worked out empirically that we should array about 250 ng of fusion protein, which is quite high. We found it was quite critical to get high concentrations of protein for those interactions. These were GST fusions of mouse and human proteins; what we focused on were protein-interacting domains, the modules that are most likely to generate protein interactions. We arrayed WW, SH3, SH2, PH, 14-3-3, PDZ, FHA, and FF domains. All of these are known to interact fairly strongly with ligands. We also had a separate section on the array where we [put] full-length proteins fused to GST, which did not contain domains but [that] were of general interest, for example p53, p73, E2F1, and HDACs. In total we arrayed 212 purified GST fusions. We put some work into purifying the fusions, making sure that we got clean protein and that the proteins were at similar concentrations. That could be a problem in the future when you do these sorts of arrays; it’s going to be difficult doing it in a high-throughput fashion, because each protein functions differently: some are more soluble than others, some are made at higher levels.
The first thing we did was to test whether the protein domains we had arrayed were actually functional. Each of these domains has a known ligand. For example, WW domains and SH3s will bind certain types of proline-rich motifs. We synthesized peptides that represent these motifs; the peptides were biotinylated and then labeled with streptavidin-Cy3 or Cy5. We developed a labeling technique which is described in the paper. We just probed the arrays with these peptides. From the literature we could predict which domains these peptides should interact with. In most cases, we got interactions with the correct spots. That was the control section, and what came out of [it] is that a peptide from a splicing factor, SmB, [that] has been reported to bind WW domains in the literature, … could bind extensively with SH3 domains on the array [as well]. That raises the possibility that SH3s could be involved in splicing.
The next thing we went on to show is that the arrays could be used to obtain binding profiles of endogenous proteins. We looked at two proline-rich proteins, a protein called Sam68 and also the splicing factor SmB. Depending on how many proteins you have arrayed, this can be fairly complex. We have over 200 proteins arrayed, so we can screen for 200 potential interactions of a single protein. This experiment takes 3 hours – it’s very quick as opposed to two-hybrid systems or other protein interaction screens. The problem is, you are limited to what you have arrayed.
You also assessed the role of methylation?
We used peptides that are known to interact with SH3s and WWs. We knew from blot overlay experiments that arginine methylation would prevent SH3 binding, and we showed in this format that we could get the same result. The nice thing about that sort of experiment is, you can label one peptide, for example the unmodified peptide, with Cy3, and a modified peptide with Cy5, mix the two peptides and then do the probing. In much the same way that you would read a cDNA array, you can now read a protein array. We have also looked at phosphorylation, in particular with the 14-3-3 section. It worked, and that’s unpublished [data].
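The two-color readout described above can be sketched in a few lines of Python. This is only an illustration, not the analysis used in the paper: the spot names, the intensity values, and the `log2_ratio` helper are hypothetical, and the sketch assumes background-subtracted intensities with the unmodified peptide in the Cy3 channel and the methylated peptide in the Cy5 channel.

```python
import math

def log2_ratio(cy3, cy5, floor=1.0):
    """Return log2(Cy5 / Cy3), flooring both intensities to avoid dividing by zero."""
    return math.log2(max(cy5, floor) / max(cy3, floor))

# Hypothetical background-subtracted intensities per spot:
# (Cy3 = unmodified peptide, Cy5 = methylated peptide)
spots = {
    "SH3_example": (8000.0, 500.0),
    "WW_example": (4000.0, 4000.0),
}

ratios = {name: log2_ratio(cy3, cy5) for name, (cy3, cy5) in spots.items()}
for name, ratio in ratios.items():
    print(f"{name}: log2(Cy5/Cy3) = {ratio:+.2f}")
```

Under these assumptions, a strongly negative value means the methylated peptide bound less than the unmodified one at that spot, and a value near zero means the domain showed no preference.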
What are the main applications for that chip, and does it have commercial value?
The first area would be basic science where someone is just interested in knowing what their protein binds to. There are web-based search engines, for example there is a site called Scansite [http://scansite.mit.edu/] that will allow you to put your protein in and search it for potential protein interaction motifs or phosphorylation motifs. And then you can just synthesize a peptide from that region and see whether it actually does bind what it is predicted to bind.
A lot of people involved in signal transduction research are interested in the modular domain interactions of signaling molecules [and] this would be a quick way of allowing you to map potential interactions in signal transduction pathways. Alternatives would be to do pull-down assays with GST fusions, but then you would be limited to doing a few experiments at a time. This way allows you to screen many more domains at once.
The other alternative is to … do a profiling experiment. You could either overexpress your protein in cells, or, if your protein is fairly abundant, you could just use the endogenous protein and screen for potential binding partners. You just take the total cell lysate and use that as a probe and then go in with an antibody to your protein and see if it has concentrated on any dots. The one limitation is that you need high expression of your protein. We show in the paper that we have done it for Sam68 and SmB.
What other applications can this chip have?
One possibility is to try and probe whole proteomes. You could imagine labeling total protein extract from a tumor versus adjacent normal tissue. You could label one with Cy3 and the other with Cy5, then mix the two proteomes, screen the array, and see if there was any slant towards green or red. You are going to need quite a big spike above background because you are going to look at a lot of protein interactions at one dot. You wouldn’t be able to identify the proteins directly off the array, but you could use it as a first step in screening. And depending on how complex the array is, you could have 1,000 domains arrayed out and then you would see two or three of them showing a change, and those … you would use in pull-down assays, and follow up if you see any differences by mass spec identification of your band.
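As a rough sketch of that first-pass screen, the Python below flags spots that clear background and show a marked shift toward one channel. Everything in it is an assumption made for illustration: the `flag_spots` helper, the spot names, the intensities, the choice of Cy3 for normal tissue and Cy5 for tumor, and the thresholds (mean background plus three standard deviations, and a twofold change).

```python
import math
from statistics import mean, stdev

def flag_spots(spots, background, min_fold=2.0, n_sd=3.0):
    """Flag spots that clear background and differ by at least min_fold between channels."""
    cutoff = mean(background) + n_sd * stdev(background)
    flagged = {}
    for name, (cy3, cy5) in spots.items():
        if max(cy3, cy5) < cutoff:
            continue  # neither channel is clearly above background
        log_ratio = math.log2(max(cy5, 1.0) / max(cy3, 1.0))
        if abs(log_ratio) >= math.log2(min_fold):
            flagged[name] = log_ratio
    return flagged

# Hypothetical intensities from empty (background) spots
background = [100.0, 120.0, 90.0, 110.0, 80.0]

# Hypothetical spot intensities: (Cy3 = normal tissue, Cy5 = tumor)
spots = {
    "domain_A": (2000.0, 8000.0),  # strong shift toward the tumor channel
    "domain_B": (3000.0, 3300.0),  # bright, but no real difference
    "domain_C": (60.0, 130.0),     # large ratio, but too close to background
}

hits = flag_spots(spots, background)
print(hits)
```

The spots that survive such a filter would be the two or three candidates to carry into pull-down assays, which is where the actual binding partners get identified.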
Where do you see the protein chip field going? Where do you see new interesting technical developments coming from?
We have actually used arrays in a separate study to screen for enzyme targets. And that is going to be a very effective use, I think, of protein arrays, where you use a recombinant enzyme to either methylate or phosphorylate potential substrates, and identify [them]. We have an EMBO Reports paper from earlier this year using protein macroarrays [from a group in Germany]. We used those to screen for methylated targets. We did large-scale enzyme reactions on the filters and then looked to see where there was a focus [of] activity. We have used that as well for protein interactions, also screening for kinase substrates and methyltransferase substrates. That array is fairly complex; [it] has 37,000 human brain proteins arrayed.
I think in the future, it will be interesting to try and make a protein domain array that contains an example of every domain family identified. And those families are fairly well described in Pfam, which is a protein family database. And that may be a good first sweep if you just take a few examples from each of those families and array them out in clusters. The full proteome would be needed for different experiments, like substrate screens. I think there are going to be different angles in developing protein arrays, and you are going to have to choose exactly what your experimental approaches are or what question you are trying to answer. And I think often you don’t need full representation of the proteome arrayed out.
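The idea of taking a few examples from each domain family and arraying them in clusters could be planned along these lines. This is a hypothetical sketch: the `pick_representatives` helper and the domain identifiers are made up for illustration, and the family labels are typed in by hand rather than retrieved from Pfam.

```python
from collections import defaultdict

def pick_representatives(domains, per_family=2):
    """Group (family, domain_id) pairs by family and keep the first few of each."""
    by_family = defaultdict(list)
    for family, domain_id in domains:
        if len(by_family[family]) < per_family:
            by_family[family].append(domain_id)
    return dict(by_family)

# Hypothetical candidate domains, each tagged with its family
domains = [
    ("WW", "ww_1"), ("WW", "ww_2"), ("WW", "ww_3"),
    ("SH3", "sh3_1"), ("PDZ", "pdz_1"), ("SH3", "sh3_2"),
]

layout = pick_representatives(domains)
for family, members in layout.items():
    print(family, members)  # one cluster of spots per family
```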