NEW YORK (GenomeWeb) – Scientists at Phoenix Bioinformatics, an independent non-profit organization that launched to manage the Arabidopsis Information Resource (TAIR), seem to have found a viable model for generating sustainable funding to support the long-term maintenance and upkeep of biological data repositories and resources.
Investigators involved with TAIR launched Phoenix in late 2013 to explore alternative funding mechanisms to support the Arabidopsis resource. With TAIR's National Science Foundation funding running out, the investigators decided to offer subscription-based access to the database with separate pricing options for industry and academia and an additional free limited data access option for cash-strapped academics.
Establishing a non-profit allowed the investigators to pursue that model for TAIR but also to explore other forms of user-based support that would ensure that the data remains as widely available as possible, Eva Huala, TAIR's principal investigator, told GenomeWeb this week. They would also be able to offer expertise gleaned from their experiences with TAIR that could be useful for researchers seeking to implement these models and transition their users.
These alternative revenue streams could be viable options for scientists running research repositories and looking for ways to supplement or replace lost grants. Today, researchers routinely use large quantities of data in their studies with more being generated on a regular basis, but the funding that supports these repositories has largely remained flat, according to Huala. As a result, many of these repositories are underfunded and some, like TAIR, lose their funding.
Another disadvantage of relying on grant-based funding is that repositories essentially have to compete for the same research dollars as the scientists who use them, she said. Also, "the periodic, episodic nature of grant funding means that sooner or later every repository ends up with a shortfall and has to lay people off, and then expertise is lost, and it works against the ability to plan long term [and] make long-term strategy decisions for repositories." Moreover, grant funding can have the unwanted effect of orienting investigators' activities toward funding agencies' requirements rather than the needs of the actual user community, she noted, and the priorities of the two groups aren’t always identical.
"As we thought about the user-based funding, we realized that there is a lot of scope there [and] a lot of options that could be explored," Huala said. "With TAIR, we've begun to explore one of them ... but there are actually a number of others that we consider on the table for future efforts with other repositories." That list includes membership models, pledge drives, voluntary contribution campaigns, and so on.
For its part, the subscription-based model has so far worked out well for TAIR. Phoenix will disclose financial details of its first year of operation later this year, but according to Huala, "we are now at a point where we are sustaining the TAIR operations based on the user contributions." Those operations include weekly updates with new gene function information curated from published literature and the ability to accept new data contributions from the research community, she said. There's also room to improve the repository, and the Phoenix team is considering adding new data types and tools, although Huala noted it is too early to discuss these in further detail.
In choosing the pricing schemes, "we did a number of things to make sure that the data are still widely available" and to mitigate any potential negative effects of adopting a subscription model, Huala said. For example, "we made sure the price was affordable and that there were a lot of options for subscription. We also made sure that some of the content is always free, and then the content that is not free is available to a limited extent even for non-subscribers." Also, students enrolled in courses that require the use of TAIR, as well as some low-income countries, have free access to the database. Finally, all data that has been in TAIR for over a year is made freely available, she said.
The company currently offers a four-tier annual academic or non-profit license, with each tier priced according to usage. Specifically, per year it charges $7,725 for high usage, $5,150 for moderately high usage, $2,575 for moderately low usage, and $515 for low usage. For lab licenses, the company charges $88 per lab member annually if two or more subscriptions are billed together. Individual licenses cost $98 per person per year or $9.80 per month. To date, over 120 academic institutions have subscriptions. Commercial pricing for an annual license is $25,750 for a large company and $5,150 for a small company. Individual licenses for commercial users cost $2,060 per year.
Phoenix is now holding talks with researchers running other repositories who are interested in potentially adopting user-based support models for their resources, Huala said.
Also in 2013, the J. Craig Venter Institute along with its partners received a $4.5 million NSF award to create a new free resource for Arabidopsis called the Arabidopsis Information Portal (Araport), a federated system that would offer access to TAIR data. The portal grew out of a series of discussions and workshops organized by the International Arabidopsis Informatics Consortium, or IAIC, a community of researchers that mobilized following the NSF's 2009 announcement that it would phase out its funding for TAIR, with the aim of discussing new ways of storing and accessing Arabidopsis data. Besides TAIR data, the resource was intended to house additional information about the plant, including gene expression, protein interaction, and genetic variation data.
Huala told GenomeWeb that Phoenix maintains a collaborative relationship with JCVI and its partners. She said that Phoenix provides Araport with access to data that has been housed in TAIR for at least a year (the most recent dataset in Araport is from December 2013), and that both groups are exploring ways to make the most current datasets available through the JCVI portal.
The choice between the two systems depends on the research need, according to Huala. For example, JCVI's resource covers a much broader range of data than TAIR, which focuses particularly on gene function data, so it may have additional relevant data types that aren't in TAIR, she said. On the other hand, TAIR offers access to the most recent datasets, so research projects that require the most up-to-date information would be better served by Phoenix's resources.