MENLO PARK, Calif.--At Progenitor here, lab technicians spend their morning online, looking through yesterday's data for hints of disease genes. Each technician is a "chromosome master" of about four chromosomes, charged with monitoring all linkage data for that region. They get to do this work online thanks to the nine-person bioinformatics group led by Senior Director of Bioinformatics Yannick Pouliot.
Progenitor's bioinformatics group has grown since Pouliot's arrival in late 1996. When he came to Progenitor from a job as senior product development scientist at Molecular Simulations, Progenitor's bioinformatics group had two full-time employees and two consultants. "It wasn't until I came on board that things started moving," Pouliot said. "Until then, they didn't have a bioinformatics director."
Pouliot said that people had been complaining that data analysis wasn't happening fast enough. "Today that is not an issue," he said.
Pouliot estimated that the company analyzes its data 10 times faster than before the bioinformatics group kicked in. "That's probably an underestimate," he said. "It's more like 50 times in genetics."
Pouliot said his group focuses on turning data into information and accelerating the discovery process. He said, "There is a big difference between data and something that is useful for scientific discovery purposes."
Relying on bioinformatics
Until now, he has been hiring people for pure tool development, but that focus is shifting. "I think what we need is to extend the scientific component of the department, so what we are doing is hiring people who are going to be on the pure discovery side of bioinformatics," Pouliot said.
However, he said the department should be fairly static for the next few months. "We are already about 10 percent of the company, which is a large group for a company of this size." Pouliot added, "I think we have a unique group. We have both genetics and analysis; very few places deal with both."
Douglass Given, president and CEO of Progenitor, said the company has become dependent on the bioinformatics group. He said people can focus on generating more data rather than on analyzing the data they have. "If I asked employees where they would want new hires, they'd ask me to put people in bioinformatics," he said.
Progenitor focuses on identifying new disease genes using two parallel approaches, both of which rely on bioinformatics tools. One group uses cell lines from patients with a disease and looks for genes that are expressed differently in these patients than in normal cell lines.
Winston Thomas, senior scientist at Progenitor, said that the advantage of this approach is that any gene they identify has a good chance of being involved in the disease. "It's more of a candidate-gene approach," he said.
Asthma is one disease the group is studying with this approach. "It's a good disease to be involved in," Thomas said. "There's no surgery," he added, so any pharmaceutical approach they identify will be useful. Using bioinformatics tools, they search for similarities between genes they identify and known genes, looking for clues about those genes' roles.
In labs down the hall, other groups look for gene polymorphisms in affected individuals that are linked to known markers. In one project, Progenitor and its collaborators have blood samples from members of more than 1,000 families with a history of asthma. Technicians fluorescently label the markers green, blue, and yellow so a detector at the end of the gel can identify each marker. Each lane can have as many as 12 to 14 markers for a single patient, appearing as colorful striped bands when displayed on the computer.
Data collected from these runs are compiled every night by a set of computers buried in a small storage room amidst stored lawn furniture and discarded boxes. Despite these lowly settings, the computers churn through what would amount, on paper, to a stack of unorganized data more than three inches tall. By the next morning, employees have access to all the data collected the day before.
Progenitor employees view these data using a tool called Genetic Visualizer, developed by three members of Pouliot's bioinformatics group. Genetic Visualizer is a Java-based tool accessed through the web that provides a way to view polymorphism data along with family relationships and information about which samples came from affected versus unaffected patients.
It's the visualizer's ability to sort and display data that makes it so useful, Pouliot claimed. In one mode, it shows a chromosome with the genetic markers labeled along it. All markers carry a score for how likely they are to be close to a disease gene based on accumulated data. If a marker carries a high score, a researcher can move into a mode where Genetic Visualizer displays which families or ethnic groups contributed to that high score. In another mode, they can look at all the data for one family or ethnic group, or at a single day's data. In the past, Thomas said, each different way of viewing the data would have involved re-sorting the data in a computer, then printing another stack of paper.
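To make the idea of switching between views concrete, here is a minimal Python sketch of regrouping one set of linkage records by marker, by family, or by day. It is not Progenitor's code, and Genetic Visualizer's internals are not described here: the record fields, the sample values, and the assumption that a marker's score is a simple sum of per-family contributions are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkageRecord:
    """One hypothetical linkage result; every field name is an assumption."""
    marker: str    # marker identifier
    family: str    # family identifier
    run_day: str   # label for the day the gel was run
    score: float   # this family's contribution to the marker's score


def total_by(records, key):
    """Group records by the named attribute and sum their scores."""
    totals = defaultdict(float)
    for rec in records:
        totals[getattr(rec, key)] += rec.score
    return dict(totals)


def contributions(records, marker):
    """Per-family contributions to one marker's score, largest first."""
    selected = [rec for rec in records if rec.marker == marker]
    per_family = total_by(selected, "family")
    return sorted(per_family.items(), key=lambda item: item[1], reverse=True)


# Invented sample data, for illustration only.
records = [
    LinkageRecord("M1", "fam_A", "day_1", 1.5),
    LinkageRecord("M1", "fam_B", "day_1", 0.5),
    LinkageRecord("M2", "fam_A", "day_2", 0.25),
    LinkageRecord("M1", "fam_A", "day_2", 1.0),
]

# Chromosome view: one summed score per marker.
marker_scores = total_by(records, "marker")
top_marker = max(marker_scores, key=marker_scores.get)

# Drill-down view: which families drove the top marker's score.
top_families = contributions(records, top_marker)

# Single-day view: the same records regrouped by run day.
daily_totals = total_by(records, "run_day")
```

Each view is just a different grouping key over the same records, which is why an interactive tool can replace the re-sort-and-print cycle Thomas described.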
"If we didn't have Genetic Visualizer we would be in trouble," Thomas said. The tool is not publicly available, and Pouliot said, "there's nothing else like it." He said that not only is Genetic Visualizer a good tool for identifying genes, it is useful in lab meetings and in presentations. "It's much better than making overheads or slides," he said, because you can show the data in real time on a laptop. If a meeting member wants to see the data presented differently, they can look at them right then.
Using the visualizer to locate linkages and cell culture studies to identify potential disease genes, Progenitor can narrow its search to specific regions of the chromosome. Thomas said that if the group finds a possible disease gene, they can locate it in the human chromosome and focus their attention in that area. "This is where the two approaches come together," he said. Pouliot said they have identified hundreds of asthma candidate genes using this dual approach, and are narrowing the group down to the most promising few genes, while continuing their search for additional targets.
Using a similar approach, they have identified several other genes involved in weight regulation, blood vessel growth, and hemochromatosis, a blood disease that can lead to cirrhosis, liver cancer, diabetes, and heart damage.
Progenitor's discoveries have led to a collaboration with Cambridge Antibody Technology (CAT). Progenitor supplies gene targets it has identified to CAT, which will develop antibodies to the protein target. Progenitor will then have access to the antibody for further research. Pouliot said that the bioinformatics tools Progenitor brings to the project have helped identify genes that CAT will begin working with, and that Genetic Visualizer provides a good way to demonstrate why a particular gene is worth looking into. "We use it to demonstrate to them the importance of a particular target," he said. However, Progenitor does not provide collaborators with direct access to its bioinformatics tools.