LONDON--"Bioinformatics has moved from high potential to key operational technology," remarked Dominic Clark of Glaxo Wellcome at the start of a conference he chaired here February 15-16. Proceedings of the meeting, "Effectively Applying Bioinformatics to Maximize the Potential of High-Throughput Technologies," organized by IIR Bio/Technology Conferences, confirmed Clark's premise.
The meeting provided a forum for discussion of recent developments aimed at accelerating drug discovery. Pfizer's Geoff Johnson opened with the prediction that target discovery will be revolutionized in the new millennium. He said completion of the human genome project will radically change the current drug discovery scenario, in which, just three years ago, only six molecular mechanisms contributed to 70 percent of sales of Pfizer's top 20 drugs.
The transformation will require sophisticated techniques for storing and retrieving data as well as for deriving necessary information from it, Johnson contended, adding that each function will probably require developing partnerships between pharmaceutical companies and specialist providers.
Bob Gordon of the Janssen Research Foundation discussed how his company has used bioinformatics to identify a neurotrophic factor and its receptor. "Plenty of lessons were learned in the process, in particular the need for discrimination in the face of potentially endless hardware expenditure and an acute shortage of programmers with appropriate skills," he concluded.
The bioinformatics skill deficit was a recurring theme. The growing need to ask increasingly complex questions of data resources is highlighting the shortage of suitably trained personnel, attendees agreed.
In this context, Robert Stevens from the University of Manchester introduced the TAMBIS Project, whose aim is to allow working biologists sophisticated access to many bioinformatics resources even if they don't have the skills of a trained bioinformatician. The approach is to use a conceptual description of the domain knowledge to provide a homogenizing layer over the disparate sources. "This conceptual model can be used to generate a user interface over many resources, enabling biologically sensible questions to be asked," Stevens said, adding that it is now possible to ask complex, multi-resource queries using TAMBIS.
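TAMBIS itself rests on a description-logic model of molecular biology; purely to illustrate the mediation idea Stevens describes, the toy sketch below shows how a single conceptual query might be translated into source-specific queries and the results merged. All source names, terms, and accession numbers here are invented.

```python
# Hypothetical sketch of a TAMBIS-style "homogenizing layer": one
# biologically phrased query is mapped onto source-specific terms,
# dispatched to each resource, and the hits are pooled.

CONCEPT_MAP = {
    # conceptual term -> per-source query term (all values invented)
    "kinase": {"swissprot": "KW-0418", "prosite": "PS00107"},
}

def query_source(source, term):
    """Stand-in for a real wrapper around one bioinformatics resource."""
    fake_data = {
        ("swissprot", "KW-0418"): ["P00533", "P06239"],
        ("prosite", "PS00107"): ["P06239", "Q05655"],
    }
    return set(fake_data.get((source, term), []))

def conceptual_query(concept):
    """Answer one conceptual question by querying every mapped source."""
    hits = set()
    for source, term in CONCEPT_MAP[concept].items():
        hits |= query_source(source, term)
    return sorted(hits)

print(conceptual_query("kinase"))
```

The point of the conceptual layer is that the biologist asks about "kinases," not about keyword codes or pattern identifiers in any particular database.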
Alvis Brazma from the European Bioinformatics Institute explained the significance of datamining and visualization in bioinformatics. The institute explores the applicability of general-purpose datamining tools to bioinformatics tasks. Brazma described a case study of Quadstone's tool, Decisionhouse, for mining gene expression data. Other general-purpose tools used include Silicon Graphics' MineSet, Daisy Analysis's Daisy, United IS's GVA, Belmont's Cross-Graphs, and Diamond from SPSS. Such tools offer flexibility, good user interfaces, and support from their providers, Brazma said, but their limitations are also evident: when visualization is inappropriate, or when the most popular datamining algorithms fail on a task, general-purpose tools are simply not applicable.
Cambridge Antibody Technology's ProAb Database, which allows mining and visualization of its gene-linked expression data, was described by Simon Brocklehurst, who demonstrated the company's proprietary multitier client/server system, Continuity, offering enterprise-wide integration of target and drug discovery. Andrew Lyall, who recently joined Oxford Glycosciences from Glaxo Wellcome, further explained how public genome and protein databases can most usefully be combined with internal data sources.
Lyall elucidated the roles of genetics, genomics, and proteomics in target discovery and validation, commending approaches such as TAMBIS and commenting that "providing access to the burgeoning number of databases could be the most significant competitive advantage."
Sanger Centre's Alex Bateman described the use of the Pfam database, a collection of more than 1,400 protein domain families from SWISSPROT and TrEMBL, for automatic sequence annotation. The database contains curated annotation and alignments for common domains and matches the majority of proteins. Sanger researchers employ GeneWise, a single algorithm that compares a protein domain to genomic DNA, to use Pfam for genome annotation, he said.
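GeneWise proper aligns a protein domain model to genomic DNA while allowing for introns and frameshifts; the toy Python below illustrates only the core idea Bateman describes of comparing protein-level information against DNA, here reduced to six-frame translation and an exact motif search. The codon table is deliberately tiny and the sequence and motif are invented.

```python
# Toy sketch of protein-vs-genomic-DNA comparison: translate the DNA in
# all six reading frames and scan each translation for a domain motif.
# Real GeneWise uses profile alignment, not exact matching.

CODON = {"ATG": "M", "TGG": "W", "TTT": "F", "TTC": "F",
         "GGT": "G", "AAA": "K", "TAA": "*"}  # tiny subset of the code

def revcomp(dna):
    """Reverse complement, giving the minus-strand sequence."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate in frame 0; unknown codons become 'X'."""
    return "".join(CODON.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def scan_six_frames(genomic, motif):
    """Report (strand, frame, protein position) wherever the motif occurs."""
    hits = []
    for strand, seq in (("+", genomic), ("-", revcomp(genomic))):
        for frame in range(3):
            pos = translate(seq[frame:]).find(motif)
            if pos != -1:
                hits.append((strand, frame, pos))
    return hits

print(scan_six_frames("AAAATGTGGTTTTAA", "MW"))
```

A profile-based method generalizes this by scoring every position against a statistical model of the domain family rather than demanding an exact match.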
The use of microarray technology in identifying target genes was also discussed. According to Burkhard Morgenstern of Rhone Poulenc Rorer, "Since comparison of gene expression between normal and diseased tissues may lead to the discovery of novel drug targets, this technology has a significant impact on the drug-discovery process in pharmaceutical companies."
Another application area is toxicological studies, where expression of certain genes may indicate toxicity of chemical compounds. Rhone Poulenc currently uses two microarray systems, from Affymetrix and Amersham. Affymetrix provides customers with ready-made chips, while Amersham microarrays can be built in-house. Resources for constructing microarrays are derived from Rhone Poulenc's proprietary expressed sequence tag database and Incyte's LifeSeq database, as well as from public-domain databases.
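The comparison Morgenstern describes, between expression in normal and diseased tissue, can be caricatured in a few lines. In the hedged sketch below, a simple log2 fold-change threshold stands in for the statistics a real analysis pipeline would apply, and all gene names and expression values are invented.

```python
# Toy differential-expression screen: flag genes whose measured
# expression differs strongly between normal and diseased tissue.
import math

normal   = {"GENE_A": 120.0, "GENE_B": 15.0, "GENE_C": 200.0}
diseased = {"GENE_A": 118.0, "GENE_B": 95.0, "GENE_C": 24.0}

def candidate_targets(normal, diseased, threshold=2.0):
    """Return genes with |log2 fold change| >= threshold."""
    hits = {}
    for gene in normal:
        fold_change = math.log2(diseased[gene] / normal[gene])
        if abs(fold_change) >= threshold:
            hits[gene] = round(fold_change, 2)
    return hits

print(candidate_targets(normal, diseased))
```

Genes passing the screen, strongly up- or down-regulated in disease, become candidates for the kind of target validation and toxicology follow-up discussed above.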
Janet Thornton of University College here outlined how the next 10 years will see protein sequence and structure classifications simplified. "Extra data will clarify relationships and allow us to explore in more depth than ever the evolution of sequence and structure," she said. These data are of central importance in several aspects of the drug-discovery process, the major aim being to use sequence and structural relationships to identify homologues in the hope of assigning biological function. Knowledge of the 3-D structure must also be used to determine whether two related proteins perform the same biochemical function.
"Tools to facilitate this analysis have been developed," Thornton said, adding, "The end-point is the identification of potential targets for drug design. Research to define the criteria for prioritizing such targets is important."