Director of the bioinformatics center
Institute for Chemical Research, Kyoto University
Minoru Kanehisa is the director of the bioinformatics center at the Institute for Chemical Research, at Kyoto University. He is also a professor at the Human Genome Center, Institute of Medical Science, at the University of Tokyo.
Kanehisa’s research group oversees the Kyoto Encyclopedia of Genes and Genomes, or KEGG, collection of databases. When KEGG launched in 1996, it included around 80 diagrams representing metabolic pathways. The current version includes 69,341 pathways generated from 354 reference pathways, and nearly 3 million genes from 55 eukaryotes, 576 bacteria, and 49 archaea.
Kanehisa began his career in 1976 at the Johns Hopkins School of Medicine and then moved to Los Alamos National Lab, where he was a cofounder of GenBank. These days he’s more interested in KEGG’s latest twist: the KEGG Atlas, a new graphical interface for the KEGG databases.
This week, Kanehisa gave a keynote address on KEGG’s applications for medical and pharmaceutical purposes at the Sixth Asia Pacific Bioinformatics Conference in Kyoto. BioInform caught up with him by phone afterwards. The following is an edited version of the discussion.
You gave the keynote at this year’s Asia Pacific Bioinformatics Conference. Your subject was KEGG for medical and pharmaceutical applications. Can you provide an overview of what you discussed?
We started this KEGG project about 12 years ago now, but basically, this started as … basic sciences, so I was involved in the Genome Project and I wanted a resource that would be used for understanding the foundations of special higher-level functions … functional information encoded in the genome.
KEGG has been mostly a resource for basic science, but now we are more interested in applying this to practical things. KEGG Drug for pharmaceutical applications was started three years ago, but we’ve just started the disease portion, so I wanted to [provide] an overview of recent developments and of how this resource can be utilized. So I gave an overview of that application side.
It looks like it has been more of a traditional research tool, and now you are targeting medical applications. What was the impetus? When did you start going in that direction?
I think about three or four years ago. There is a project called [the] Center of Excellence program in Japan, funded by the Ministry of Education, and I got funding. I belonged to a bioinformatics center at Kyoto University, but I got funding together with people in the school of pharmaceutical sciences, so with this funding I could start thinking about drugs and medical applications. That was the beginning, so we sort of started focusing more on the application side.
So the KEGG Drug database is now in pretty good shape. KEGG Disease is still very preliminary, but [in the] KEGG Drug database, for example, we have all the approved drugs … It’s a chemical-structure-based database … so this information can be linked to the other resources like the target information in the context of KEGG Pathways, for example.
When you talk about drugs, there is a research side. Researchers, scientists are interested because there are chemical compounds that interact with [the] body or biological systems. But of course there is another side where it’s prescribed or used by doctors and patients, for [the] general public or consumers, so … there have been kind of two worlds, for the research community and also for the … practitioners. And they have been using different databases – doctors and pharmacists have their own database where they can identify drugs. But we wanted to link between these two communities, or two worlds, so first of all we researchers can go more into the patient side, and then consumers – if they are interested – can see the link to the science.
Why do you think this is a compelling area of interest for you and for the community right now?
[I’ve been] involved in the database and genome project for many years. Scientists have made promises – for example, when you sequence the genome, [saying] there’s this benefit and this benefit, and very often, the benefits are related to diseases or human health. I think we should have [had] them much earlier, but I think it’s time for me to actually accomplish those types of goals, so at least we can make an informatics resource that can help to achieve that goal.
What’s your pet project right now?
I think the chemical – the pathway network analysis, chemical analysis – that’s also related to this drug process. Genes and proteins are of course important for making up human organisms, but people, in general, their concern is more, for example, regarding drugs, foods, environmental, and toxic compounds. They are all chemical materials. The chemicals are always interacting with genes and proteins, but the chemical substance is more familiar to general people. And also they are very, very important for the maintenance of the biological system. But traditional biology is focused [on] … DNAs, RNAs, and proteins … These days there is this field of chemical biology where they are focusing more on small molecules, small chemicals. But the KEGG database has these chemical components, which it started with 12 years ago. It’s called [the] KEGG Ligand database.
But we are now trying to link chemical information with genomic information – in other words, the environmental information and the genomic information.
For example, there are some bacteria that can degrade toxic environmental compounds – in other words, those environmental compounds can be an energy source for microorganisms, but they can be toxic to humans or mammals. So there [are] different types of interactions between those chemicals and the biological system determined by the genome.
So if you, for example, sequenced a genome and know all the genes a bacterium has, then you know what types of enzymes are present and what types, in this case, [of] biochemical degradation are possible, then you know what type of prediction can be made from the genomic information. And also from the chemical side, if you have enough knowledge, given a certain chemical compound, then you analyze a chemical structure, then you can say this is going to be degraded by this organism [and discover] what would be toxic for humans, for example. People have been doing analysis from the sequence side or protein 3D structure information, but now you can start from the chemical structure information.
Of course, this is an area where cheminformatics people have been working, but what is different in our approach is we integrate both chemical information and genomic information and the interactions. So that area, I think, is our pet project.
Can you talk about the KEGG Atlas? According to your website, that’s the new thing you offer.
KEGG has many components. There are already now 20 databases, smaller components, so it has lots of different types of information – genes, proteins, chemicals, pathways. And now we have ontologies, and so forth. So it’s becoming too much. It’s becoming more difficult for people to utilize the KEGG resource. People are using certain aspects of KEGG, but not the entire integrated part. So [with] KEGG Atlas we are going to make this an easy interface to access the information. We want this to be like Google Maps, so you have [the] earth and you go to any place and then you … get more detailed information. So we now have this type of ‘global map’ of the metabolism, but this way you can zoom in and see a better picture.
So it sounds like it’s this zooming-in feature that makes it unique compared to what you’ve offered before?