BioMed Central's Beyond the Genome conference in Boston this week — which was held in conjunction with Genome Biology's 10th anniversary — showcased the work of several researchers whose ideas go beyond just sequencing.
The University of Maryland's Steven Salzberg kicked off the conference with a keynote speech about the work he and others are doing to estimate accurately how many genes a person has. In 1964, F. Vogel wrote a letter to Nature estimating that humans have 6.7 million genes. He was way off, Salzberg said, but producing an accurate estimate has not gotten any easier over the years. In the mid-1990s, three different papers put the count at 50,000 to 100,000; 64,000; and 80,000. Even after the draft genome was published, the estimates varied widely. The public consortium estimated the count to be between 30,000 and 40,000, while Celera and its private partners estimated 26,588, with 12,000 additional "likely" genes. So far, the most accurate estimate is 22,333 human genes, Salzberg said, but much of the human genome remains poorly understood, and RNA-seq is still revealing many new genes that may have previously been overlooked. In the end, Salzberg said, it's not as important to know how many genes there are as to know what they are and what they do.
George Church emphasized how important it is to continue to read the genome. About 2,000 genes are highly predictive and medically actionable, he said, and as the price of sequencing continues to drop, researchers will be able to find more genes they can work with to the benefit of human health. Church also stressed the importance of open-access data, and said there is a need for an open database that researchers can use to analyze each other's data.
Elaine Mardis spoke about her work in cancer genomics. In researching the way tumors work, she said, validating tumor variants is important, especially for disseminating the information to the wider scientific community for further analysis. The speed of data generation is both challenging and enabling, she added.
The University of Washington's Jay Shendure talked about his lab's work with exome sequencing in autism studies. At least some fraction of autism cases is caused by coding mutations, and exome sequencing is useful in studies of the disorder because the technique can be used to focus on a single gene instead of an entire region of the genome, Shendure said. He described a trio-based exome study done in his lab, in which 60 exomes (from 20 autistic children and both of their parents) were sequenced and then analyzed to identify Mendelian errors. The researchers found 16 de novo SNPs in the 20 autism trios, each validated by Sanger sequencing, and identified two genes, GRIN2B and FOXP1, that they think could be causative in autism.
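For readers unfamiliar with the term, a Mendelian error is a site where a child's genotype cannot be formed by taking one allele from each parent; such sites are where candidate de novo mutations show up. The short Python sketch below illustrates the general idea only. It is not the Shendure lab's pipeline, and the function names, site labels, and genotype calls are all hypothetical.

```python
from itertools import product


def is_mendelian_consistent(child, mother, father):
    """Return True if the child's genotype can be formed by taking
    one allele from the mother and one allele from the father."""
    possible = {tuple(sorted(pair)) for pair in product(mother, father)}
    return tuple(sorted(child)) in possible


def mendelian_errors(trio_calls):
    """Return the sites where the child's genotype violates
    Mendelian inheritance given both parents' genotypes."""
    return [
        site
        for site, (child, mother, father) in trio_calls.items()
        if not is_mendelian_consistent(child, mother, father)
    ]


# Hypothetical genotype calls (illustration only, not study data):
# site -> (child, mother, father), each genotype a pair of alleles.
example_calls = {
    "site_1": (("A", "G"), ("A", "A"), ("A", "G")),  # consistent
    "site_2": (("C", "T"), ("C", "C"), ("C", "C")),  # T is in neither parent
    "site_3": (("G", "G"), ("G", "G"), ("G", "T")),  # consistent
    "site_4": (("T", "T"), ("A", "A"), ("A", "T")),  # mother has no T to pass on
}

print(mendelian_errors(example_calls))  # ['site_2', 'site_4']
```

In practice, a flagged site is only a candidate: sequencing or genotype-calling mistakes produce the same pattern, which is why follow-up validation, such as the Sanger sequencing mentioned above, matters.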
The University of Colorado's Rob Knight and BGI's Jun Wang discussed their respective labs' work with microbes. Knight talked about his research with obese and lean mice, which aims to elucidate the relationship between an organism's weight and its gut microbes. Wang talked about some of the studies BGI has done with diabetic patients, and said one study of Chinese type II diabetes patients discovered more than 500,000 novel bacterial genes and found 1,306 bacterial genes associated with diabetic patients, though whether those genes are a cause or an effect of diabetes is not yet known.