Myriad Genetics’ in-house bioinformatics capabilities played a key role in completing the sequence of the rice genome six months ahead of schedule, under budget, and with 99.5 percent accuracy, according to Sudhir Sahasrabudhe, Myriad’s executive vice president for research and development.
Myriad, in collaboration with Syngenta subsidiary Torrey Mesa Research Institute, completed the sequencing in 14 months, netting Myriad a $3 million cash bonus.
Myriad’s high-throughput sequencing facility processed approximately 12 million base pairs per day, reaching better than 6x coverage for a total of 2.6 billion base pairs. Sahasrabudhe said that Myriad was able to squeeze additional performance out of its Amersham Pharmacia Biotech MegaBACE capillary sequencers through proprietary base-calling, BAC fingerprinting, and assembly algorithms.
Myriad estimated that its bioinformatics capabilities give the facility an average of 40 percent more sequence per machine than other users of capillary sequencing machines.
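As a rough check on the throughput figures above, the short Python sketch below works out what they imply. It assumes the 2.6 billion base pairs correspond to exactly 6x coverage (the article says "better than 6x," so the implied genome size is an upper bound) and that the rate of 12 million base pairs per day was sustained throughout. The function names are illustrative and are not drawn from Myriad's software.

```python
# Figures quoted in the article.
TOTAL_BASES = 2.6e9      # total base pairs sequenced
COVERAGE = 6             # "better than 6x" coverage, treated here as exactly 6x
BASES_PER_DAY = 12e6     # approximate daily throughput


def implied_genome_size(total_bases: float, coverage: float) -> float:
    """Genome size implied by total sequenced bases at a given fold coverage."""
    return total_bases / coverage


def sequencing_days(total_bases: float, bases_per_day: float) -> float:
    """Days of sequencing needed at a constant daily throughput."""
    return total_bases / bases_per_day


if __name__ == "__main__":
    size_mb = implied_genome_size(TOTAL_BASES, COVERAGE) / 1e6
    days = sequencing_days(TOTAL_BASES, BASES_PER_DAY)
    print(f"Implied genome size: at most about {size_mb:.0f} million base pairs")
    print(f"Sequencing time at the quoted rate: about {days:.0f} days")
```

Under these assumptions the figures imply a genome of at most roughly 433 million base pairs and about 217 days of raw sequencing, which fits within the 14-month project timeline.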
But while the completion of the rice genome is a notable accomplishment in itself, it is only the first of several projects that will put Myriad’s bioinformatics toolkit to work. Sahasrabudhe told BioInform that the tools developed for the project have given the company a substantial amount of informatics capability for full-genome comparative analysis.
“We can probably use some of the same tools in identifying regulatory regions of the genes and we can also identify genes whose regulatory regions are similar in features, so that’s going to enable us in terms of target identification as well. We are very excited about this capability,” Sahasrabudhe said.
Myriad has already begun sequencing another “fairly large, non-human” genome, which Sahasrabudhe declined to identify. The genome was chosen for its potential to aid in the identification of regulatory sequences in the human genome, he said. Sequencing of this next genome is expected to be complete in July, at which point Myriad will compare it to the human genome to identify useful targets as part of its pharmacogenomics program.
“We think we can do a wonderful job of finding regulatory variants if we do a whole-genome alignment based on sequencing another non-human genome and putting it on top of the human sequence to more adequately identify the regulatory genes,” Sahasrabudhe said.
In addition to the base-calling software, BAC fingerprinting algorithms, and assembly algorithms that Sahasrabudhe termed the “heart of the whole [rice genome sequencing] project,” Myriad also developed proprietary signal processing programs to give better quality, longer sequences from the DNA sequencers. The average read length was 400-500 base pairs, according to Sahasrabudhe.
Alun Thomas, vice president of bioinformatics at Myriad, noted that the company’s sequence assembly software was aided by “an extensive analysis of the DNA repeat structure in the genome” in order to overcome the difficulties associated with the large number of DNA repeats encountered using the shotgun approach. He also credited his team’s ability to integrate its own sequence data with BAC fingerprint data generated by Clemson University and public data on DNA markers.
Steve Briggs, president of TMRI, said that his team also developed an integrated data warehouse that was used to discover and annotate the genes.
Under the terms of the collaboration, Myriad and Syngenta jointly own the sequence data under a 50/50 profit-sharing agreement. The companies are in active discussions with several commercial entities about the sale of the data, although Sahasrabudhe said that the pricing scale for commercial agreements has not yet been established.
An important aspect of the project, according to Sahasrabudhe, is its commitment to making the data available to subsistence farmers and academic groups free of charge, provided they respect the companies’ intellectual property rights. In addition to its use of the data for in-house R&D, Syngenta intends to work with local research groups in developing countries to accelerate use of the information in a way that will improve crops.
The companies have already offered to share the data with the International Rice Research Institute, a non-profit organization formed in 1960 by the Ford and Rockefeller Foundations to promote research on the world’s most important crop.
Sahasrabudhe said that Myriad is already in discussions with other agricultural companies to sequence other crop genomes. Although he said that sequencing is not central to Myriad’s business and that “Myriad does not really aspire to be an agricultural-based company,” he explained that sequencing crop genomes adds to the company’s knowledge of the technology, “and in the process will enable some of our agricultural company partners to move forward with recognizing the functional information based on genome sequence.”
He said that Myriad intends to remain focused on developing validated targets, new therapies, and novel diagnostic biomarkers.
“Certainly we consider the whole value chain from gene to protein to function as a continuum,” Sahasrabudhe said. “And we think that our capability in the area of genome analysis allows us to fully leverage our proteomics and functional biology platform to recognize and realize a larger number of validated targets.”
— BT