Beijing Institute Credits Bioinformatics Prowess for Rice Sequencing Success


A working draft of the genome of the rice subspecies indica is now available to the public thanks to the efforts of the Genomics and Bioinformatics Center of the Beijing Genomics Institute at the Chinese Academy of Sciences.

Matthew Huang, deputy director of BGI, said the success of the project rests largely on the shoulders of the 100-strong bioinformatics team at BGI, which developed three new algorithms to meet the unique assembly and annotation demands of the rice genome.

Huang noted that the support of Sun Microsystems was also crucial to BGI’s accomplishment. The genome assembly was conducted on a Sun Enterprise 10000 server, while BGI’s status as a Sun Center of Excellence in Genomics brought additional support for the development of the algorithms that made it all possible.

“We try to support our partners through various ways,” said Stefan Unger, business development manager for computational biology in Sun’s global education and research group. “We make sure they can use our equipment well and we make sure that we develop a community of users that can talk to each other and share their experiences. That’s the purpose of the Center of Excellence program.”

The accomplishment was lauded by the scientific community. “Our Chinese colleagues have given the world a wonderful gift by deriving a highly useful draft of the instruction book for this incredibly important crop species,” said Francis Collins, director of the National Human Genome Research Institute.

“The public availability of rice genome sequence will have an immediate and salutary effect on the scientific community,” added Eric Lander, director of BGI’s sister center, the Whitehead Institute Center for Genome Research.

The draft sequence data (4X coverage with 95 percent of the coding region identified) is publicly available at cn/rice.
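For a rough sense of what 4X coverage implies, the sketch below applies the textbook Lander-Waterman idealization, which assumes reads land uniformly at random across the genome. It uses the 430-megabase genome size quoted later in this article. The function names are illustrative, and the estimate is not the same measure as the 95 percent coding-region figure reported by BGI.

```python
import math

def raw_bases_needed(genome_size_bp: int, coverage: float) -> float:
    """Total bases of raw read data implied by a given fold coverage."""
    return genome_size_bp * coverage

def expected_covered_fraction(coverage: float) -> float:
    """Lander-Waterman estimate of the fraction of bases hit by at least one read.

    Assumes reads are placed uniformly at random, which real shotgun data
    only approximates (repeats and cloning bias are ignored).
    """
    return 1.0 - math.exp(-coverage)

GENOME_SIZE_BP = 430_000_000  # 430 megabases, as quoted in the article
COVERAGE = 4.0                # 4X draft coverage

print(f"raw bases: {raw_bases_needed(GENOME_SIZE_BP, COVERAGE):.3e}")
print(f"expected fraction covered: {expected_covered_fraction(COVERAGE):.4f}")
```

Under these assumptions, a 4X draft corresponds to roughly 1.7 billion bases of raw sequence and leaves a small share of the genome unsampled.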

The Beijing rice sequencing project began in May 2000. The success of the effort is evident when compared to the current status of the International Rice Genome Sequencing Project, which began in 1998 and is not expected to be complete for several more years.

Huang said the “real sequencing effort took off in July 2001,” with the main sequencing requiring around three months’ work, a few weeks for the assembly, and around another two months for annotation.

Huang noted that while the IRGSP is taking a chromosome-by-chromosome, clone-by-clone approach that distributes the sequencing tasks among its 10 participating sequencing centers, the BGI team used a whole-genome shotgun approach. This sped up the process considerably, once the team had dealt with the substantial assembly challenges that whole-genome sequencing presents.

Putting the Pieces Together


BGI developed three new algorithms to tackle the demands of whole-genome sequencing. The first, a repeat-masking algorithm, “was essential in our successful assembly of the rice genome scaffolding,” said Huang. In addition, the team improved the Phrap assembly algorithm and wrote a specialized gene-finding algorithm.
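The article does not describe how BGI's repeat masker works. As a general illustration of the idea only, the toy sketch below masks any k-mer that occurs more often than a threshold, replacing those bases with N so that an assembler would not try to join reads through them. The function name, the k-mer size, and the threshold are all assumptions chosen for the example.

```python
from collections import Counter

def mask_repeats(seq: str, k: int = 4, max_count: int = 2) -> str:
    """Replace bases in over-represented k-mers with 'N'.

    Toy illustration of frequency-based repeat masking; it is not BGI's
    algorithm, and k and max_count are arbitrary example values.
    """
    seq = seq.upper()
    # Count every overlapping k-mer in the sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    masked = list(seq)
    for i in range(len(seq) - k + 1):
        if counts[seq[i:i + k]] > max_count:
            # Mask all k positions covered by this repeated k-mer.
            for j in range(i, i + k):
                masked[j] = "N"
    return "".join(masked)

print(mask_repeats("ACGTACGTACGTTTGCA"))
```

Real repeat maskers typically work against curated repeat libraries or genome-wide frequency tables rather than a single short sequence.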

“The current gene-finding algorithms like Grail and Genscan are good for mammalian gene-finding, but they’re less effective for plants,” said Huang. BGI tailored the new algorithm to the unique G-C content gradient of the rice genome to help it find exons and introns.
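To make the notion of a G-C content gradient concrete, the sketch below computes G-C content in fixed-size windows along a sequence. This is only a way of measuring such a gradient, not BGI's gene finder, whose details the article does not give. The function names and the window and step sizes are illustrative assumptions.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def gc_profile(seq: str, window: int, step: int) -> list[float]:
    """G-C content of each full window taken every `step` bases along seq."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]

# A made-up sequence that runs from G-C rich to A-T rich.
print(gc_profile("GGCCGCATATAT", window=4, step=4))
```

A gene finder tuned for such a genome could, for example, use window-level G-C content as one input when scoring candidate exons, though whether BGI's tool does so is not stated here.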

The BGI gene-finder identified a surprisingly large number of genes. While the rice genome is one-seventh the size of the human genome at 430 megabases, it has twice as many genes: 60,000. This may be due to the lack of alternative splicing in plant genomes, but Huang said the BGI identified more alternative splicing events in rice than initially expected. The group is now validating this experimentally.

BGI verified its assembly with data from the Chinese Academy of Sciences, which is participating in the international rice sequencing effort. “We compared our scaffolds with their assembled BAC sequence and the results are consistent,” said Huang. BGI also verified its gene predictions against publicly available data.


Two Strains are Better than One


The first public draft sequence of the world’s largest cereal crop is certainly worth celebrating, and the BGI’s selection of the indica subspecies should prove especially useful, according to Huang.

While the IRGSP and the two private rice sequencing groups at Monsanto and Syngenta are sequencing the japonica strain, Huang said the indica subspecies is older and “there are more indica planted in the world than japonica.” In addition, the indica subspecies is the paternal cultivar of a Chinese hybrid rice that has a yield per hectare 20 percent to 30 percent higher than the average of other rice crops. Researchers hope that careful study of the indica genome will help determine the source of this high yield.

In addition, Huang said the two rice strains are of particular biological interest because in an evolutionary sense, “they’re on the verge of reproduction isolation.” The indica genome is much larger than the 380 megabases of japonica, yet their coding regions are almost identical. “Having the data on both will be an enormous advantage for us to understand rice genetics and biology,” said Huang.

— BT
