Andrew Lyall, IT Director and CIO of Oxford GlycoSciences


AT A GLANCE: PhD in computational molecular biology from Edinburgh University. Joined OGS in 1999 from Glaxo Wellcome, now GlaxoSmithKline, where he was head of the advanced technology and informatics department. Enjoys cooking and listening to music.

Q: Where will bioinformatics be in two years? Five years?

A: I think all the action in bioinformatics over the next couple of years will be in finding which genes produce which proteins. There are maybe between 30,000 and 60,000 genes, but there may be as many as a million proteins, so the informatics efforts around that sort of thing are going to be very exciting.

I think within two years we will know most important proteins as far as the pharmaceutical industry is concerned. I would say that genomics and DNA expression chips will essentially be over because those techniques are really pointless when you have direct access to the proteins. So my prediction for bioinformatics over the next two years is that in essence it’s going to be support for proteomics.

If you go out to five years, I think all the action is going to be around building larger structures out of the information we have about proteins.

Q: What are the biggest challenges bioinformatics must overcome?

A: There’s hardly any kinetic data about substrates binding to proteins or proteins binding to each other. What’s available in the literature is also generally unusable because the data was collected under different circumstances, so I think the real challenge is actually still a biology one, which is developing techniques that can generate sufficient data.

Mathematical modeling of pathways is still a purely academic dispute because of the lack of data. It’s a bit like trying to do sequence analysis 20 years ago, when there were only one million bases of DNA in the EMBL and GenBank databases and the techniques were way ahead of the data. In the case of sequence analysis, it’s the other way around now, of course.

Q: Which databases do you use?

A: We use public domain databases, we subscribe, and we also make our own. We supply proteomics databases to our collaborators that integrate genomics and genetics and cDNA databases into our proteomics databases.

Q: What bioinformatics software do you use?

A: We buy whenever possible, and develop when it’s not available. We have a program called PC Rosetta that we use with our customers in order to deploy proteomics data.

Q: How do you integrate your data?

A: We have very active software development to integrate data and we use data warehousing, data federation, and data marts. We have relational databases with almost a billion rows in tables.

Q: How large is your bioinformatics staff?

A: Almost 50. There are about 10 doing infrastructure support and about 30 doing software development. The others do a variety of things: statistics, project management, production support, database administration, things like that.

Q: How is your bioinformatics unit organized within the framework of the company?

A: I’m responsible for the entire IT department. Because we’re a proteomics company, IT and bioinformatics are the same thing. It’s very tightly integrated. We think that’s essential.

Q: What projects are you working on now?

A: We’ve just commissioned a first-generation high-throughput proteomics factory, and we’re setting up relationships to start setting up the second generation, based on ICAT and TOF-TOF. We’re also working on the informatics for that.

Q: What made you decide to enter a career in bioinformatics?

A: I had always been interested in computing and did a degree in biochemistry in 1977. At about that time molecular biology was really taking off and a lot of my friends had sequences that they couldn’t analyze, so I started to write some simple computer programs in Basic to analyze the sequences. The rest, as they say, is history.
