NEW YORK, March 14 – If imitation is the sincerest form of flattery, then the Genome Database has reason to be proud.
According to people who manage the Toronto-based gene mapping database, the Weizmann Institute of Science, Celera Genomics, and DoubleTwist are among the users who account for the more than 300 million hits the database gets per year.
“The fact of the matter is we’re getting a third of a billion hits per year on our server,” said Jamie Cuticchia, director of the GDB. “And, if you look at some of the high-profile portals, such as DoubleTwist and such as [the Weizmann Institute’s] GeneCards, they are repackagers of GDB data.”
While the people behind GDB, a leading source of curated genomics data, actually enjoy the fact that their data is being raided, the popularity has its downside: Database companies and their customers can make money from the GDB’s freely available information, while GDB itself is struggling to survive.
“If Celera wants our data, I want Celera to have our data. What I’m interested in is that, if Celera or any other company gets our data, it’s done in such a way that if the pharmaceutical companies still see value in our data, we’re able to get a revenue stream from this so we can continue our operations,” said Cuticchia.
Celera did not respond to a request for comment, while DoubleTwist said the large number of hits traced back to the company actually came from users of its database who were routed to GDB.
In response to DoubleTwist's explanation, Cuticchia said that most of the hits GDB gets from DoubleTwist look more like the work of a running script than like individuals hitting the server.
But, Cuticchia added, arguing about whether or not a company uses the database misses the main point. He is more concerned about securing funding to create a recurring revenue stream from GDB than he is about which genomics companies are using, or even profiting from, the data.
Right now, Cuticchia is trying to raise $1 million to $1.5 million to revamp the database infrastructure and create new software that would allow the Hospital for Sick Children to have full rights to any future profits stemming from database licenses.
“We feel uncomfortable licensing GDB now because it’s not clear where the IP of the software GDB rests on is held. We got funding from 100 different sources,” said Cuticchia, referring to the early supporters of the database, which dates back to 1988. “Once we create a brand new, fresh database then we can run unencumbered in this.”
If GDB gets the money, the new and improved database could then be marketed to pharmaceutical companies, many of which are afraid to use the current system because it has a weak infrastructure and can only be accessed through the web, Cuticchia said.
Cuticchia argues that the pharmaceutical companies should support this effort, since they have the most to gain from a more secure, value-added GDB. So far, he has approached a number of pharmaceutical companies about supporting the effort, but none have yet made any financial commitments to GDB.
Cuticchia also noted that none of the users of the data have offered any financial support, although the developers of GeneCards, which is being licensed by DoubleTwist, did ask about obtaining a license. Marilyn Safran, a senior research engineer at the Weizmann Institute and a member of the team that developed GeneCards, said she could not comment on whether there have been any talks with GDB for a license, but she noted that GeneCards serves as a mirror site for the Canadian database.
If the funding doesn’t come through, GDB, which currently supports 11 full-time employees and has 103 editors around the world voluntarily curating genomic information, might consider either licensing the database for a fee to all non-academic users or scaling back its efforts.
“We need money,” Cuticchia said bluntly.
Of course, this would not be the first time that the database, which is housed in the Hospital for Sick Children, has faced tough times.
GDB, which was initially funded by the US Department of Energy and the NIH, was shut down in 1998 when the DOE cut support for any non-sequencing genome projects. Later that year, Cuticchia, a former editor of the database, helped to resuscitate the effort with part of a $50 million contribution the Hospital for Sick Children received for its bioinformatics initiative.
Since then, about $700,000 a year has been allocated to keep GDB up and running and to provide what Cuticchia says is a novel service.
“You have companies like Celera and sites like the NCBI that are out there providing raw sequence and in many cases ‘annotation.’ But what’s annotation? Annotation is you release a set of computer programs to take a best guess as to where genes are,” Cuticchia said.
“Think of GDB as a Gray’s Anatomy of the genome – we’ll tell you everything there is to know about cystic fibrosis.”
Now, that might be a service worth supporting.