This week in Genome Research, Marco Marra was senior author on a paper that reports development of a new visualization tool, called Circos, to "facilitate the identification and analysis of similarities and differences arising from comparisons of genomes," says the abstract. Circos can map variation in genome structure from data produced by sequence alignments, arrays, genome mapping, and genotyping studies, and can display data as scatter, line and histogram plots, heat maps, tiles, connectors and text.
Cold Spring Harbor's Lincoln Stein led a group of researchers who studied gene predictions in plant genomes. Automated evidence-based gene building is a rapid and cost-effective way to provide reliable gene annotations on newly sequenced genomes, they say. They report on their newly developed evidence-based gene build system, called the Gramene pipeline, which can use transcriptional data across species. Using two annotated plant genomes, Arabidopsis thaliana and Oryza sativa, they show that the "cross-species ESTs from within monocot or dicot class are a valuable source of evidence for gene predictions."
Korean scientist Sung-Min Ahn was first author on work that presents the first Korean individual genome sequence and analysis. The group used Illumina paired-end sequencing at 28.95-fold coverage to sequence the diploid genome of a Korean male. First, they found 420,083 novel SNPs that are not in the dbSNP database. Second, they say in the abstract, despite a close similarity they saw important differences between the Korean genome and the Chinese genome, the only other Asian genome available.
Finally, a large group of researchers led by first author Kevin Judd McKernan from Applied Biosystems sequenced the genome of an African individual using a novel ligation-based sequencing assay. The method, they say, improves the raw accuracy of the reads to greater than 99.9 percent. They identified 3.8 million SNPs, 226,529 intra-read indels, 5,590 indels between mate-paired reads, 91 inversions, and four gene fusions. They also found dozens of disease susceptibility mutations and thousands of novel potentially functional variants, "which suggests a higher than expected load of deleterious variants that can be tolerated in the human genome," they write.