By Matthew Dublin
A few weeks after its release, Janelia Farm's Jackhmmer algorithm, an iterative search method in the HMMER package similar to NCBI's PSI-BLAST, is performing as expected.
In a post on the blog Cryptogenomicon, the Jackhmmer development team says that their new algorithm takes advantage of the sophisticated probabilistic models that underpin profile Hidden Markov Models (HMMs).
Why all the fuss? Profile HMMs are among the most sensitive tools for detecting distant sequence similarity, which is why many protein family databases use them. For bench biologists, this means a fast method for discovering the likely structure of a sequence, which in turn allows them to form functional hypotheses that can guide experiments.
The authors describe an example with several Pfam families that currently have no match to a determined three-dimensional structure. By running the Pfam model through a few Jackhmmer iterations against UniProt, then searching the resulting profile HMM against the PDB structure database, users can frequently identify significant matches to some or all of that family.
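For readers who want to try this kind of two-step search, here is a minimal sketch of how it might be scripted; it is not the authors' own procedure. It assumes a current HMMER 3 installation, where `jackhmmer` takes a query sequence file and a target database, `-N` caps the number of iterations, `--chkhmm` saves a checkpoint profile for each iteration, and `hmmsearch --tblout` writes a tabular hit list. The file names (`query.fasta`, `uniprot.fasta`, `pdb_seqres.fasta`), the `family` prefix, and the helper function names are all hypothetical, and because the command-line tool starts from a sequence rather than a Pfam model, the sketch assumes a representative family member as the query. The script only builds the command lines; it does not run them.

```python
import re


def jackhmmer_command(query_fasta, target_db, max_iterations=3,
                      checkpoint_prefix="family"):
    """Build the argument list for an iterative jackhmmer search.

    -N limits the number of iterations; --chkhmm asks jackhmmer to save
    a checkpoint profile HMM per iteration as <prefix>-<n>.hmm.
    """
    return [
        "jackhmmer",
        "-N", str(max_iterations),
        "--chkhmm", checkpoint_prefix,
        query_fasta,
        target_db,
    ]


def latest_checkpoint(filenames, checkpoint_prefix="family"):
    """Pick the highest-numbered checkpoint HMM from a list of file names.

    The search can converge before the iteration cap is reached, so the
    last checkpoint is found by inspection rather than assumed.
    Returns None if no checkpoint file is present.
    """
    pattern = re.compile(re.escape(checkpoint_prefix) + r"-(\d+)\.hmm")
    numbered = []
    for name in filenames:
        match = pattern.fullmatch(name)
        if match:
            numbered.append((int(match.group(1)), name))
    if not numbered:
        return None
    return max(numbered)[1]


def hmmsearch_command(hmm_file, structure_db, table_out="pdb_hits.tbl"):
    """Build the argument list for searching a profile HMM against a
    sequence database derived from PDB (hypothetical file name)."""
    return ["hmmsearch", "--tblout", table_out, hmm_file, structure_db]


if __name__ == "__main__":
    # Step 1: iterate against UniProt (assumed local FASTA copy).
    print(" ".join(jackhmmer_command("query.fasta", "uniprot.fasta")))

    # Step 2: search the final profile against PDB sequences.
    # The file list here stands in for a directory listing.
    hmm = latest_checkpoint(["family-1.hmm", "family-2.hmm", "family-3.hmm"])
    print(" ".join(hmmsearch_command(hmm, "pdb_seqres.fasta")))
```

To actually run the searches, each argument list could be passed to `subprocess.run(cmd, check=True)` once HMMER is on the `PATH` and the databases have been downloaded.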
The blog post describes at length how Jackhmmer works, but the summary is this: "the point is that the handful of searches performed to get to these results each took a couple of seconds and allowed me to rapidly explore the sequence/structure space, without time to get distracted. It has taken me far longer to write this blog than it took to perform the analysis!"