CHICAGO (GenomeWeb) – The cBioPortal for Cancer Genomics, a web-based software platform to help researchers visualize and analyze large datasets from The Cancer Genome Atlas, began as an in-house system at Memorial Sloan Kettering Cancer Center in New York.
Following pressure from outside the institution, MSK researchers announced at the American Association for Cancer Research annual meeting in 2014 that they were adapting cBioPortal for clinical decision support and making the software open-source.
Although cBioPortal's development and maintenance teams today include representatives from MSK, Dana-Farber Cancer Institute in Boston, Princess Margaret Cancer Centre in Toronto, Children's Hospital of Philadelphia, Bilkent University in Ankara, Turkey, and pharmaceutical giant Boehringer Ingelheim, a driving force behind the technology becoming open-source was a for-profit startup called The Hyve.
"Our goal is to build and contribute to open-source communities," said Kees van Bochove, CEO of The Hyve, which is based in Utrecht, Netherlands, with a US office in Cambridge, Massachusetts.
Van Bochove launched The Hyve in 2011. The open-source bioinformatics software development company is growing rapidly, adding about one full-time-equivalent position a month of late, according to van Bochove. Now, there are six to seven full-time employees dedicated to cBioPortal, supported by grant funding and pharma money, mostly from Johnson & Johnson's Janssen Pharmaceuticals.
The company's name reflects how a beehive sits at the center of the complex system that is a bee colony, van Bochove said.
Healthcare is notorious for its large, complex, monolithic IT systems, though that seems to be changing in some circles. "Pharma software isn't as competitive anymore, and companies are increasingly opening up their data sources," noted van Bochove, who is a board member of the Pistoia Alliance, a coalition of life science organizations.
The Hyve's philosophy is to "make lots of small programs that stand alone" but can work together to support larger causes, van Bochove explained at the joint Intelligent Systems for Molecular Biology and European Conference on Computational Biology (ISMB/ECCB) meeting in Prague last month.
"Netflix has a lot of open-source technology," van Bochove said. Most is behind the scenes, in the form of "microservices" that support the main function of delivering video on demand. "This is persistent architecture that scales well," he added.
Van Bochove said that the 2015 O'Reilly Media book, "Designing Delivery," about adapting IT to the "digital service economy" provides the blueprint for open-source development of cBioPortal, which is a simple system designed to replace more complex, institutional technology for managing genomic informatics workflows. He said it is akin to Microsoft's decision to turn Office from a software package that is outdated the moment it is installed into a cloud-based subscription service.
The Pistoia Alliance is also following this philosophy for the Ontologies Mapping project, a two-year-old effort to develop better tools and establish best practices for ontologies in life sciences research and development.
The cBioPortal community today includes software engineers, bioinformatics professionals, and cancer researchers and clinicians. Demand is mostly coming from the pharma world, though van Bochove reported that there is increasing interest from national initiatives in precision medicine, including the Dutch Health Research Infrastructure and Germany's Federal Ministry of Education and Research — known in German as BMBF.
"Our overall goal is to build infrastructure to support clinical decisions for personalized cancer treatment by utilizing 'big data' of cancer genomics and patient clinical profiles," reads a poster that The Hyve has been displaying at conferences — including ISMB/ECCB 2017 — for nearly a year and a half.
The poster represents progress made through May 2016, when the display debuted at the annual Bio-IT World Conference & Expo in Boston.
"The contributions can roughly be divided into three categories: (I) new data analysis features, (II) improvement of the data loading pipeline, and (III) performance optimizations of the front end to be able to host larger studies," the poster said.
Notably, the community has added a "pan-cancer" view to visualize results of studies involving multiple types of cancer, support for newer types of genomic analysis, such as messenger RNA, and better documentation. Recently, The Hyve has augmented cBioPortal query functions, particularly on the portal's study overview page.
The newest query tool is not yet integrated with the main portal, a reminder that bioinformatics software consistently remains a work in progress.