Tarrytown, NY — IBM last week cemented two key partnerships in the nascent biobanking field. Big Blue will provide the IT infrastructure for a new biobank at the Karolinska Institute in Sweden, as well as the “interim” hardware system for the UK Biobank, which is slated to begin recruiting patients next fall.
Michael Svinte, vice president of Information Based Medicine at IBM Healthcare and Life Sciences, said that biobanking is becoming an important component of the company’s overall information-based medicine strategy. “We’ve seen a lot of interest over the last six months,” he said. After years of planning, he added, a number of large-scale biobanking initiatives “are working through their IT strategy and are starting to put the infrastructure in place.”
This is good news for IBM and other IT vendors. While the primary goal of biobanking is to gather, store, and distribute tissues, blood, and other specimens for biomedical research, such projects require sophisticated information systems for sample tracking and management, for handling the clinical data and patient records associated with each sample, and for managing the vast amounts of experimental data that will ultimately be derived from those samples.
“You can’t separate out the IT from the whole biobanking trend,” Svinte told BioInform at Biobank Summit II, a conference IBM hosted here last week. “At its core, it’s about the information, and information is our business.”
UK Biobank: Big Project Starts Out Small
The UK Biobank — a project initially proposed in 1999 and only now getting underway — aims to create a biorepository and database that will track the health of 500,000 participants between the ages of 45 and 69 over the next 10 to 30 years. The Wellcome Trust, the UK’s Medical Research Council, and several other parties have ponied up around £60 million ($112 million) to fund the recruitment phase of the project, which is slated to begin next fall and last for three to five years.
Steven Walker, CIO of the UK Biobank, said the IT problems associated with the task are “extremely daunting.”
The risk associated with the project is “immense,” Walker said. “We cannot approach this in an amateur way. …We cannot compromise in the areas of confidentiality and security.”
Walker was hired six months ago, and was thrown into an extremely tight schedule that required a pilot system to be up and running by next month. In that time, he said, the project team has hammered out an ethics and governance framework, a sample-handling protocol, and a scientific protocol. On the informatics side, a multi-tier system architecture that will offer several different degrees of access to the repository’s highly sensitive information — ensuring that the core databases are off-limits to external users — has been worked out and proposed to the Biobank’s board of directors.
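To make the idea of graded access concrete, here is a minimal sketch of how a tiered permission check could look. It is not the UK Biobank's design: the tier names, role names, and the `can_read` function are all hypothetical. The only property taken from Walker's description is that external users never reach the core databases.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical access tiers, ordered from least to most sensitive."""
    PUBLIC_SUMMARY = 1    # aggregate, non-identifying statistics
    RESEARCH_EXTRACT = 2  # de-identified data for approved researchers
    CORE_DATABASE = 3     # identifiable records and sample links


# Highest tier each (invented) role may read.
# No external role is mapped to CORE_DATABASE.
MAX_TIER = {
    "external_public": Tier.PUBLIC_SUMMARY,
    "external_researcher": Tier.RESEARCH_EXTRACT,
    "internal_data_manager": Tier.CORE_DATABASE,
}


def can_read(role: str, tier: Tier) -> bool:
    """Return True if the role may read data held at the given tier."""
    max_tier = MAX_TIER.get(role)
    # Unknown roles are denied by default.
    return max_tier is not None and tier <= max_tier


if __name__ == "__main__":
    for role in MAX_TIER:
        allowed = [t.name for t in Tier if can_read(role, t)]
        print(role, "->", allowed)
```

Ordering the tiers numerically keeps the check to a single comparison, and denying unknown roles by default matches the project's stated emphasis on confidentiality.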
The only IT equipment that Walker has settled on so far is a Nautilus LIMS from Thermo, and “a few servers and several terabytes of storage” from IBM to support the LIMS. Everything else, he said, is still under consideration.
IBM’s business consulting group has been collaborating with the UK Biobank to determine the optimal IT architecture for the system, but Walker said that this relationship — and the fact that the project has placed an initial hardware order with the company — doesn’t signal that the UK Biobank is committing to IBM for the long term.
“We run a tough procurement competition” for every phase of the project, Walker said. IBM won out for the initial phase of the system, which has been designed to start out small and scale out as required. Walker said that there’s really no way to even guess the amount of compute power and storage that will ultimately be required to support the Biobank.
“Predicting is extremely difficult,” he said. Since the Biobank will capture patient records from the UK’s national healthcare IT system, the storage requirements will ultimately depend on how diligently doctors enter patient data. Some physicians, he said, are “rigorous” and upload each image from every test that a patient undergoes, which would require a huge amount of storage, while others may only jot down a few brief notes on major health events. As a rough estimate, he said, the initial data from existing patient records is expected to require between 4 TB and 5 TB.
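For a sense of scale, the sketch below divides the 4 TB to 5 TB estimate by the project's 500,000-participant target. Two assumptions here are not from Walker: that the estimate covers the full cohort, and that a terabyte is 10^12 bytes. The function name is hypothetical.

```python
# Rough per-participant storage implied by the figures quoted above.
# Assumptions (not stated by Walker): the 4-5 TB estimate covers all
# 500,000 participants; 1 TB = 10**12 bytes and 1 MB = 10**6 bytes.

PARTICIPANTS = 500_000
BYTES_PER_TB = 10**12
BYTES_PER_MB = 10**6


def mb_per_participant(total_tb: float, participants: int = PARTICIPANTS) -> float:
    """Average megabytes of record data per participant."""
    return total_tb * BYTES_PER_TB / participants / BYTES_PER_MB


if __name__ == "__main__":
    for total_tb in (4, 5):
        print(f"{total_tb} TB -> {mb_per_participant(total_tb):.0f} MB per participant")
```

Under those assumptions the average comes to roughly 8 MB to 10 MB per participant, a figure that would hide wide variation between image-heavy records and those holding only brief notes.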
One of the most important considerations for the project will be the database management system that Walker and his team ultimately choose, a decision that is also up in the air, he said. IBM’s DB2 is one candidate, as are Oracle, MySQL, and even a custom storage solution from First Genetic Trust.
Security will be the primary criterion for the majority of the project’s procurement decisions, Walker said. Beyond that, he said, the considerations are the same as they would be in any other IT project: “How well will they be able to support me locally, how much will it cost, and how well does it meet the requirements of the project?”
Financial terms of the UK Biobank’s initial hardware order with IBM were not provided.
Karolinska Biobank: Strong IT Underpinnings
It’s not clear whether the Karolinska Institute Biobank will be similar in scope to the UK Biobank, because the project’s planners have not settled that question themselves. Jan-Eric Litton, director of informatics for the project, told BioInform that the total number of patients to be enrolled has not yet been determined.
But that hasn’t prevented Litton and his colleagues from developing a solid IT infrastructure upon which the biobank will be built. The Karolinska team has designed a middleware system called the Biobank Information Management System, or BIMS, to manage sample tracking, link samples with phenotype databases, and handle donor consent records. BIMS will be developed using IBM’s WebSphere platform, along with the DB2 Information Integrator (formerly called DiscoveryLink).
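The three jobs attributed to BIMS (tracking samples, linking them to phenotype databases, and recording donor consent) can be pictured with a small data model. This is a sketch only, not Karolinska's design or any IBM API: every class, field, and method name is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRecord:
    donor_id: str
    permitted_uses: set[str]  # invented categories, e.g. {"cancer_research"}


@dataclass
class Sample:
    sample_id: str
    donor_id: str
    location: str  # storage position, used for tracking
    # Keys pointing into external phenotype databases (hypothetical format).
    phenotype_keys: list[str] = field(default_factory=list)


class SampleRegistry:
    """Toy in-memory registry tying samples to consent and phenotype links."""

    def __init__(self) -> None:
        self._samples: dict[str, Sample] = {}
        self._consents: dict[str, ConsentRecord] = {}

    def register_consent(self, consent: ConsentRecord) -> None:
        self._consents[consent.donor_id] = consent

    def register_sample(self, sample: Sample) -> None:
        # Refuse samples from donors with no consent record on file.
        if sample.donor_id not in self._consents:
            raise ValueError(f"no consent on file for donor {sample.donor_id}")
        self._samples[sample.sample_id] = sample

    def move(self, sample_id: str, new_location: str) -> None:
        self._samples[sample_id].location = new_location

    def usable_for(self, use: str) -> list[str]:
        """IDs of samples whose donor consented to the given use."""
        return sorted(
            s.sample_id
            for s in self._samples.values()
            if use in self._consents[s.donor_id].permitted_uses
        )


if __name__ == "__main__":
    registry = SampleRegistry()
    registry.register_consent(ConsentRecord("D1", {"cancer_research"}))
    registry.register_sample(Sample("S1", "D1", "freezer-A/rack-2", ["pheno:D1"]))
    print(registry.usable_for("cancer_research"))
```

A production system would keep this in a database rather than in memory. The point of the sketch is only that consent is checked before a sample enters the registry and again whenever samples are selected for a given use.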
Last week, IBM announced that the KI Biobank will use pieces of its recently launched Clinical Genomics Solution — a suite of tools for integrating, storing, and analyzing genotypic and phenotypic data. In addition to WebSphere and DB2II, the KI Biobank will also use IBM’s Data Discovery and Query Builder, which allows users to perform complex queries without knowledge of SQL.
Litton said that the KI Biobank is a long-term project “that we are building for the next generation of researchers.” While many of the details of the project remain in question, “we knew we had to get the infrastructure in place first.”
Financial terms for the Karolinska deal were not provided.
The UK and KI biobanks are among the first to have begun building large-scale IT systems, but a number of other initiatives are in the pipeline that “could create a windfall for IT infrastructure firms,” according to Anna Barker, deputy director for advanced technologies and strategic partnerships at the National Cancer Institute. Barker said that a “common bioinformatics network and a common language system” would be a “key piece” in creating a national biobank in the US — and just as important as high-quality samples.
Barker is leading an effort at the NCI to explore the possibility of creating such a repository in the US, and said that NCI “is taking a leadership role to make this a national priority.”
The proposed system, dubbed the National Biospecimen Network, is envisioned as a collaboration between NCI, academic medical centers, and the biopharmaceutical industry — along the lines of other public/private endeavors such as the SNP Consortium. Rather than a centralized repository, the NBN would link distributed regional biobanks via an informatics network, Barker said.
NCI expects to sort out some of the open questions surrounding such a system and put together a more concrete proposal within the next year, she said.
Another project, called the Public Population Project in Genomics, or P3G, aims to “foster harmonization — not necessarily standardization” among international biobanks, according to P3G founder Bartha Knoppers. P3G received its first seed funding last week, Knoppers said, and has set as one of its short-term goals the launch of the P3G Knowledgebase, which will serve as a resource for biobanks to share strategies, ethics codes, consent forms, protocols, vocabularies, and other materials that would otherwise be duplicated as more biobanks are launched.
The effort is a first step toward enabling biobanks to share data, Knoppers said, and could also serve as “an engine for the transfer of this knowledge to the healthcare system.”