Last month, in a posting to the ABRF electronic discussion group, a researcher from Washington University in St. Louis invited scientists to download and beta-test his new database search program for MS/MS data. The software, called STLmass, promised to be several times faster than Mascot, a popular search engine offered commercially by UK-based Matrix Science. Not only that, it would also be free and open source.
Now the program is no longer available from the researcher’s website: the link to download the software is broken. What happened? The scientist, Donald Elbert, declined to comment, but several researchers said that Thermo Finnigan, which holds an exclusive license from the University of Washington for a patent covering the Sequest search program, had sent a letter to Washington University. Thermo Finnigan declined to be interviewed for this article, but in an e-mail that Elbert sent to another researcher, and that she posted on the discussion group, he confirmed that there was “a letter from a lawyer” but said that Finnigan was not suing him. In the e-mail, Elbert asked her to confirm that she had deleted all copies of STLmass and would not distribute the program any further.
At first glance, this seems nothing out of the ordinary: A company appears to be staking out its intellectual property claims. But it might be more than that. Many researchers believe that the licensed patent at issue here could be broad enough to cover most MS/MS database search software, including programs from other vendors that have been available for years. Although Finnigan has apparently let its competitors go about their business until now, these researchers are afraid that the company might in the future try to enforce its exclusive rights to the patent not only against new free and open source software, but also against commercially available packages. “Finnigan has always been quite a good citizen, so I am somewhat surprised at this,” said Matthias Mann, a mass spectrometrist at the University of Southern Denmark in Odense.
Whether or not Finnigan is actually obliged to enforce the patent depends on the terms of its licensing agreement, which were not available from the University of Washington office of technology licensing. “Many license agreements keep the obligation on the university or on the licensor, but some turn over all the rights exclusively to the licensee and the licensor doesn’t want to be bothered with the maintenance and enforcement obligations,” said Daniel Appelman, an attorney with Heller Ehrman in Menlo Park, Calif.
But nobody seems to be quite sure how far the patent that covers Sequest actually reaches. This, some say, might hinge on hair-splitting definitions of what is a “spectrum” and what is a “database.” “It’s written fairly broadly,” said Jimmy Eng, a researcher at the Institute for Systems Biology in Seattle and one of the two inventors on the US patent (John Yates, the other one, declined to comment for this article). “It was meant to cover any technique to do peptide sequencing via tandem mass spectrometry database searching,” he said. If Eng is right, then the US patent, No. 5,538,897, entitled “Use of mass spectrometry fragmentation patterns of peptides to identify amino acid sequences in databases,” might cover approaches used by many tandem mass spectrometry search algorithms, possibly including Mascot from Matrix Science, Sonar from Genomic Solutions, and SpectrumMill, which was developed at Millennium Pharmaceuticals and will be commercialized by Agilent Technologies.
“If now they said ‘we pursue the patent to the full extent,’ that would mean that nobody could search their spectra anymore, and that would be terrible,” said Mann. “They would have a huge problem enforcing this, and they would be not very popular.”
According to Mann, there are three basic approaches to searching databases using MS/MS spectra. Sequest, he said, calculates a mass spectrum for each amino acid sequence in the database, then compares it with the experimental spectrum using a mathematical technique called cross-correlation, which places one spectrum on top of the other and calculates the overlap.
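The cross-correlation idea Mann describes can be sketched in a few lines of Python. This is an illustration only, not Sequest's code or its actual scoring function. The simplifications are assumptions made for this example: singly charged b- and y-ions, one-dalton bins, peak intensities ignored, and a score defined as the overlap at zero offset minus the average overlap at nearby offsets. All function names are hypothetical.

```python
# Illustrative sketch only: not Sequest's implementation.
# Assumptions: singly charged b/y ions, 1 Da bins, intensities ignored.

# Monoisotopic residue masses (Da) for a handful of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
}
PROTON = 1.00728
WATER = 18.01056


def fragment_mzs(peptide):
    """Return predicted singly charged b- and y-ion m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)          # b-ion (N-terminal piece)
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y-ion (C-terminal piece)
    return ions


def binned(mzs, n_bins=2000):
    """Turn a peak list into a 0/1 vector with one bin per dalton."""
    vec = [0.0] * n_bins
    for mz in mzs:
        idx = int(round(mz))
        if 0 <= idx < n_bins:
            vec[idx] = 1.0
    return vec


def overlap(a, b, shift):
    """Overlap of vector a with vector b displaced by `shift` bins."""
    n = len(a)
    return sum(a[i] * b[i + shift] for i in range(max(0, -shift), min(n, n - shift)))


def xcorr_score(observed, predicted, max_shift=75):
    """Zero-offset overlap minus the mean overlap at nonzero offsets (assumed form)."""
    background = sum(
        overlap(observed, predicted, s)
        for s in range(-max_shift, max_shift + 1) if s != 0
    ) / (2 * max_shift)
    return overlap(observed, predicted, 0) - background


# Usage: pretend the "experimental" spectrum came from GASPVK, then rank candidates.
observed = binned(fragment_mzs("GASPVK"))
database = ["KEGLVA", "GASPVK", "LKEAGS"]
scores = {pep: xcorr_score(observed, binned(fragment_mzs(pep))) for pep in database}
best = max(scores, key=scores.get)
print(best, scores)
```

Subtracting the background overlap is one way to keep a candidate from scoring well merely because both spectra are crowded with peaks.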
Mascot, on the other hand, takes all the peaks in a spectrum and calculates all the peaks from any sequence in the database. It then lines the two up, finds a number of fragment matches, and determines the probability for each match to happen.
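Mascot's scoring is proprietary, so the following Python sketch does not reproduce it. It only illustrates the general idea of counting fragment matches and asking how likely that count would be by chance. The binomial model, the tolerance, the m/z range, and every function name here are assumptions chosen for this example.

```python
# Illustrative sketch only: a generic binomial match model, not Mascot's algorithm.
import math


def count_matches(observed, predicted, tol=0.5):
    """Number of predicted fragment m/z values within `tol` of any observed peak."""
    return sum(1 for p in predicted if any(abs(p - o) <= tol for o in observed))


def chance_probability(k, n, p):
    """Probability that at least k of n predicted fragments match at random."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def match_probability(observed, predicted, mz_range=(100.0, 1000.0), tol=0.5):
    """Return (matches, probability of seeing that many matches by chance)."""
    n = len(predicted)
    k = count_matches(observed, predicted, tol)
    # Assumed chance that one random m/z lands within `tol` of some observed peak.
    p = min(1.0, len(observed) * 2 * tol / (mz_range[1] - mz_range[0]))
    return k, chance_probability(k, n, p)


# Usage with made-up peak lists.
observed_peaks = [147.1, 246.2, 343.2, 430.3, 501.3, 700.0]
good_candidate = [147.11, 246.18, 343.23, 430.27, 501.30]
poor_candidate = [120.0, 260.0, 380.0, 455.0, 610.0]

k_good, prob_good = match_probability(observed_peaks, good_candidate)
k_poor, prob_poor = match_probability(observed_peaks, poor_candidate)
print(k_good, prob_good, k_poor, prob_poor)
```

A small chance probability means the observed matches are unlikely to be a coincidence, which is the intuition behind probability-based scoring.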
The peptide sequence tag algorithm, developed by Mann himself, looks for an unambiguous short sequence in the spectrum. Combined with the total mass of the peptide, this is a specific probe to search a database for sequences that fit this pattern. Calculated fragments are then used to confirm or reject the sequence.
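The tag-plus-mass probe can be pictured with a short Python sketch. It is a simplified illustration, not Mann's published algorithm: it filters a toy peptide list by total mass and by the presence of the tag, and it leaves out the confirmation step with calculated fragments. The function names, the tolerance, and the toy database are assumptions made for this example.

```python
# Illustrative sketch only: a simplified tag-plus-total-mass filter.

# Monoisotopic residue masses (Da) for a handful of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
}
WATER = 18.01056


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def tag_search(database, tag, total_mass, tol=0.5):
    """Return peptides that contain `tag` and whose mass is within `tol` of `total_mass`."""
    return [
        pep for pep in database
        if tag in pep and abs(peptide_mass(pep) - total_mass) <= tol
    ]


# Usage: a short tag read from the spectrum plus the measured peptide mass.
toy_database = ["GASPVK", "GAVPSK", "LKEAGS", "SPVGAK"]
measured_mass = peptide_mass("GASPVK")
hits = tag_search(toy_database, "SPV", measured_mass)
print(hits)
```

In this toy case two peptides with the same composition survive the filter, which shows why a confirmation step based on calculated fragments is still needed to accept or reject each candidate.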
A researcher who requested anonymity and who is familiar with STLmass said that its analysis is based on either peptide mass fingerprinting applied to MS/MS, or on an automated de novo scheme similar to the peptide sequence tag approach, and does not use cross-correlation.
But vendors don’t believe their software infringes on the patent. Mark McDowall, marketing manager for Micromass, said his company’s MS/MS database search tool first interprets the mass spectrum, then uses amino acid sequences to search the database. “If that infringes, then anyone who uses Blast infringes,” he said. John Cottrell, director of Matrix Science, declined to comment for this article, except for saying that “because Mascot uses a completely different approach from Sequest, there are no patent infringement issues.”
But what if Finnigan thinks otherwise and starts to enforce its presumed intellectual property? Mann believes that the patent, at least in a broad interpretation, should not have been granted in the first place because he presented his peptide sequence tag algorithm in a 1992 lecture at MIT and on a poster at the 1993 ABRF meeting, prior to Eng’s and Yates’ patent applications, which were filed in the US in 1994, and in Europe in 1995. “To the extent that it covers any algorithm for peptide sequencing by mass spectrometry and database searching, it should be invalid because of this prior art,” Mann said, acknowledging that aspects of the patent, especially the specific use of a cross-correlation function, were novel at the time. Indeed, if the company started enforcing it based on this broad interpretation, “I think it would be a good idea to organize and defeat the patent once and for all,” he said.
Contesting a patent, however, is likely easier said than done. “It’s pretty difficult,” said Appelman. “There is a presumption of validity after a certain time. The [US] Patent and Trademark Office and the courts will give [the patent] examiner a lot of respect. It takes a lot for them to be convinced to overturn an already granted patent.”
Moreover, the legal costs for contesting a patent could easily amount to several hundred thousand dollars. “It’s very expensive to embark on any of this litigation, whether it’s on the prosecution side or on the defense side. There is extensive gathering of evidence, you need a lot of expert testimony from expert witnesses, and the process is very technical,” said Appelman. However, it is easier to contest the validity of a patent in Europe than in the US, he said.
At least for now, this could be bad news for proteomics researchers, because it might discourage the academic development of free and open source software, just as it appears to have halted the software development at Washington University. Other researchers working on open source alternatives to Sequest might not be able to rely on their institutions for sufficient financial backing to defend themselves in court when accused of infringement, with the possible exception of those at a large government agency like the NIH.
The NCBI is indeed developing a suite of proteomics algorithms (see Industry Briefs, p. 5), but Lewis Geer, a staff scientist who heads the project, said he is currently not thinking about including search software.
Many researchers agree that open source mass spectrometry search software is desirable, although some say the field has lived happily without it for years.
“People who are doing novel proteomics experiments will need novel search options that the existing software doesn’t provide, which means you are at the mercy of a third party to add features for you in some timely fashion,” Eng said. He also favors an inexpensive or free option that would lower the cost of entry for people who want to do high-throughput proteomics data analyses. “There is MS/MS database search software in development that we’d someday like to distribute as open source if we can,” he said. Ironically, the patent bearing his own name might prevent him from doing so. Does he think it was a good idea for the University of Washington to license the patent exclusively to a single company? “No, in hindsight, no,” he said.