The University of Luxembourg has hired Reinhard Schneider to set up and lead a new central bioinformatics core that will support biomedical research efforts at the Luxembourg Center for Systems Biomedicine.
As part of his new responsibilities, Schneider will lead a team tasked with designing data analysis techniques as well as methods for using laboratory data in in silico models of disease and, in turn, using these models to improve wet lab experiments.
Schneider expects that within a few years, LCSB will have the largest computer and data storage infrastructure in an academic setting in Luxembourg. He also plans to have a team that is adept at "merging and combining high-throughput methods in biology and medicine with the immense requirements for hard- and software" — a point that he believes will be crucial for systems biology in the future.
Schneider joins the LCSB from the European Molecular Biology Laboratory in Heidelberg, Germany, where he has led a data integration and knowledge management team since 2004.
Prior to that, he co-founded bioinformatics firm Lion Bioscience and served as its vice president and CEO for several years. During his time at Lion, he set up and led a team in early drug research at Bayer.
Schneider spoke with BioInform this week about setting up the compute infrastructure for the new bioinformatics core as well as his plans for developing software to support LCSB's research goals. Below is a version of the conversation that has been edited for clarity and length.
When will you officially take office at the LCSB?
I started in May. Right now it's part time, so I am two days in Luxembourg and three days at the EMBL. By early next year, I will be in Luxembourg [full time].
At EMBL you were team leader of the data integration and knowledge management group. With the move to the University of Luxembourg you will be setting up and managing a bioinformatics core. How will your duties differ?
At the EMBL it’s more research oriented but we do a lot of data analysis as part of our job. We get large-scale sequence data and systems-biology-related data from groups at the EMBL but also from collaborators all over Europe and the world. We are developing automatic pipelines and visualization and applying text-mining tools to analyze these data. That will be very similar to what we do in Luxembourg — we will apply these tools and get other tools in place to analyze large data sets arising in the systems biomedicine context.
What will be an addition — and my former life as a manager at Lion Bioscience plays a role — is more the managing, setting up groups, and more support-oriented issues and tasks. I will try to serve not only the LCSB, [but] we [will] also try to be a nucleation site for the other groups in the public research institutes in Luxembourg and the Integrated Biobank of Luxembourg. The hope is to get a critical mass for bioinformatics in Luxembourg and get fruitful collaboration between these institutes up and going.
Can you provide some details about the team you will lead at the university?
We start with a group of around eight people and that will grow to probably 15 to 20 people next year. In my group, I have physicists, biologists, a pharmacologist, and bioinformaticians; the typical mix of people, some coming more from the computational sciences and others with a focus on the life sciences. Some people will be heavily involved in analyzing disease-related data and network reconstructions and really try to understand biology or medicine, and other people will be more on the software engineering side.
What kind of expertise are you looking for in the new hires?
I have advertisements out there and applications are coming in at all kinds of levels. We need people with experience more on the project-management side, dealing with all the partners and issues arising in large research collaboration projects. I am also looking for computer scientists to develop pipelines, and for biologists or medically oriented people with an interest in computer science.
What has to be in place to get the bioinformatics core up and running?
Everything! We have a brand new building [and] the first challenge is to get hardware into the building. The call for hardware is now out, so I hope we get the first boxes in the September/October time frame and then shortly afterwards have a stable production environment. We then will set up a high-throughput environment to handle the huge amounts of data we are expecting from our experimental groups.
How much hardware do you have on the ground and how much are you moving up to?
Right now, I don’t have anything. In the first phase, when we move into the building, we will probably start with half a petabyte of data storage and about 600 compute cores. This system will then grow substantially over the next few years.
In terms of software, what do you currently have and what are your plans on that front?
We have a large-scale data analysis system developed in Heidelberg, which we will move to Luxembourg. Our Luxembourg center also has a close strategic collaboration with the Institute for Systems Biology in Seattle, so we can use software and analysis pipelines that are developed there. And then, of course, we will hire people with a lot of experience [who] also bring tools and specific scientific software with them.
[Currently] we are building up the basic pipelines and then we add whatever data we get and whatever projects are running. Some projects are in the early design phase, some are already sketched out, and some will produce data at the end of the summer, so we are also phasing up on the data front.
What are some research areas that the university is involved in that require bioinformatics tools? Related to that, are there any specific informatics challenges that you are looking to address?
The focus of the LCSB will be Parkinson's disease, but we also have projects in diabetes, epilepsy, and lung cancer as part of our collaborations locally in Luxembourg and in Europe.
We have started with Parkinson's as the flagship challenge for the LCSB. We get a lot of data from next-generation sequencing projects, but we [also] have a close collaboration with a proteomics group and we have a metabolomics group in-house. We will drive the experiments according to what data we need to build up the molecular disease networks we are interested in. On the computational biology side we have our own research group at LCSB, which is focused on network reconstructions, the dynamics of networks, and so on.
We go for the systems approach for diseases and we picked Parkinson's as our focus, but we try to set up our data analysis pipelines in a generic way. This allows us to apply them to other diseases of interest. Our strategy is to explore the complete knowledge landscape that is known today and then identify promising smaller disease-related subnetworks. From there we will go into the details and then design experiments to find out if the hypothesis we have is right or wrong.