At A Glance
- Professor at the Eugene McDermott Center for Human Growth and Development; The Center for Biomedical Inventions; and Departments of Biochemistry and Internal Medicine, UT Southwestern.
- Co-Founder of with Glen Evans of Human Genome Center at UTSW
- Private sector researcher at General Atomics until 1990
- PhD, Physics, University of Wisconsin
- Latest Paper: “ARROGANT: an application to manipulate large gene collections,” Bioinformatics 2002 November;18(11):1410-1417
How did you come to develop Array Organizing Tool or “Arrogant,” the microarray analysis program you published in Bioinformatics this month?
Very early on, when microarrays were first being developed by Pat Brown and his group, Ron Butow, who is a yeast geneticist, [and I] got an NIH grant. This was before spotted arrays could be bought or there were scanners. My role was to build the robotics to spot the arrays, then make scanners, and make software. One of the things we quickly found out is that there’s a big gap. The furthest that analysis takes you is doing clustering to identify trends in the experimental data from the large dataset, then leaves it up to the individual, to the gray matter in the scientist’s head, to have knowledge of different groups of genes, the names of genes, and what they do, [in order] to put together a model to understand what’s going on. What we wanted to do was to build additional software to bridge that gap between the clustering of the data and the group of data, then add information in an organized way so that we can have enough information associated with the DNA such that scientists can quickly rip through it and make interpretations. That’s what Arrogant does.
Since then, we’ve actually gone further to build codes that we hope will be published some time soon. The most recent one is called Iridescent. It is more of a hypothesis-generating and data mining and discovery code. It works on an entirely different principle. It can handle anything from one data point to microarray data.
Arrogant allows researchers to compare data from different types of arrays, such as Affymetrix, spotted oligo arrays, or cDNA arrays. How do you go about doing that, given that the data is so different on the different platforms?
The essence of the code is not only to run arrays but, really, to work with giant collections of genes as objects.
There’s no question that the quality and the reproducibility of the data is better on Affymetrix arrays. That’s one of the reasons, of course, why they’re so popular. The other reason is that Affymetrix arrays are manufactured with very stringent quality control and everybody who gets them basically has the same design, whereas [with] people who are spotting their own arrays, it’s kind of kitchen science, and the quality is a function of not only where they’re made but who made them that day. But at the end of the day, when you’ve generated data from an Affymetrix array or a spotted array, you nonetheless would like to be able to compare that data. That’s one of the functions of our code. The other thing that we’ve done in this particular code is that we realized that there were some fairly popular array designs: the Affymetrix Hu U133 and Hu U95 chips, the Operon oligo sets, and some things like that. So we’ve actually pre-computed our annotation for these arrays. It’s already on the web page. People can paste their data in and work with all of the annotation. We did that by getting sequences and names of the genes from these companies, Affy, Operon, etc.
They provided you with that information?
Yes. Because all these companies realized that the more information that people can get out of their [product], the more popular it will be, and they just can’t keep up with new tools. People who use the Affymetrix analysis suite can get so far, and [then] they plunk [the data] into our code, and they get a different view of things. They can do different types of analysis or thinking about their data.
In the past you’ve been very focused on inventions that people could actually commercialize. Do you plan to explore commercialization of Arrogant or Iridescent?
Of course. I am one of the founders of our Center for Biomedical Inventions at UT Southwestern. At this center, we realize that we are not only academics, but also have a desire to be somewhat commercial, and the [projects] that we work on oftentimes we would like to see end up in the commercial arena. You can do that with software as well as you can with other things, [but] software has a whole different set of issues. We’ve seen software companies come and go — big software companies. With software, there has to be a very clear application for it where there’s a clear audience that would need it and benefit from it. Because Arrogant is designed to help design arrays and work with array data, it would probably be best to be licensed and installed into some company that does that. In the meantime, if you go to my website, http://innovation.swmed.edu, you’ll see that we’ve got a substantial number of codes. We’ve pre-computed all primers to amplify up every exon in the genome, for example. What ends up happening is when our codes are very popular, it’s hard to have enough compute power to respond to the demand of the users. In that case, that’s when it really should go commercial.
Are you reaching that level with some of your codes?
Yes. With this paper coming out, probably the Arrogant code will end up being saturated for a while. We prepared for that by buying some additional servers to speed up the operation. But when we get a whole bunch of users, we’re going to probably discover a bunch more bugs and other things like that. Before we go broke, we’ll continue to add power.
Who in your lab actually developed Arrogant?
It was Amit Kulkarni [the first author on the Bioinformatics paper]. There are other people who are straightforward biologists, and the postdoc who is maintaining it is Yuen, who is a computational biologist.
Now, on another subject, in last April’s Genome Technology you said you were developing new micro-mirror microarray technology from Texas Instruments. Is that the same technology being developed by the Houston company, Xeotron?
There are several companies. None of them have gotten together but we have been in discussions with Xeotron, which wants to make peptide arrays, because we have an issued patent on this [technology], and the issued patent and some of our other technologies have actually been assigned to a company that’s a startup out of UT Southwestern called Light Biology. We have a collaboration with Affymetrix and a few others as well.
So Affymetrix is developing these micromirror technologies?
Affymetrix is interested in making custom arrays and they have been collaborating with us for years to help advance this technology. Exactly who will commercialize it I don’t know. There are other companies, Febit in Germany and Nimblegen in Wisconsin, that all had the same idea. Nimblegen actually has their manufacturing off in Iceland, helping to escape the wrath of Affymetrix. I think if they ever make it on the radar screen, and they ever have any substantial sales, they’ll certainly have to talk to both Affymetrix, about the patents from Affymetrix that dominate in this area, and to us for our patent.
So when do you think there’s going to be a micro-mirror based custom array on the market?
There already is. Nimblegen says you can send them material and they will take it and they’ll go and analyze it over in Iceland and send the data back.
Have you tried this service?
Oh no. We manufacture more than enough arrays to take care of our work here. That manufacturing technology is supported by both the NCI and NHLBI here, as well as some DARPA money. But, for expression, we just use Affymetrix arrays. They have the most complete set, and the most cost-effective, most accurate way to do it. We use our custom arrays for other things, [such as] measuring methylation status, resequencing, genome annotation, and DNA packing assays for humans. We have a substantial number of researchers at UT Southwestern for whom we make arrays in response to biothreat needs. So we make arrays for anthrax, for the plague, for tularemia, TB, and things like that. Anybody can come up to us with a need, and tell us the genome, and we can have them an array in a day.
Are there any potential glitches that need to be worked out with micromirror arrays?
I’d say the major glitch is intellectual property. Most of the people are still trying to focus on measuring expression, and since Affymetrix sells arrays that are complete and sort of standard, we just don’t go into that arena, unless we want to measure expression on an organism that Affymetrix doesn’t sell a chip for. As far as technical glitches, any time you have a new type of application like that, there’s lots of stuff to be worked out. I don’t call them glitches. I call them a development project.
What other work are you doing now related to microarrays?
We’re submitting a paper next week on measuring the chromatin packing state at a gene resolution. I think that if that paper gets accepted, people will understand the value of that measurement.