Name: Spencer Wells
Title: Director of National Geographic's Genographic Project
Ancestry testing and genetic genealogy were conspicuous topics at the annual RootsTech family history and technology conference, held last week in Salt Lake City.
In addition to multiple workshops and a roundtable devoted to genetic genealogy, RootsTech's more than 10,000 registered attendees also heard a keynote address by Spencer Wells, director of National Geographic's Genographic Project and an explorer-in-residence at the Washington, DC-based society.
Wells, a population geneticist, authored the 2002 book The Journey of Man: A Genetic Odyssey, which was later made into a 2003 TV documentary. The success of both led to the creation of the Genographic Project in 2005, which aims to better understand ancient human migration patterns by analyzing DNA samples from around the world. For the first seven years of the project, microsatellite genotyping and Sanger sequencing were used to determine participants' Y chromosome and mtDNA haplogroups. That changed in 2012, though, with the debut of Geno 2.0, an autosomal DNA test that relies on a custom-designed, Illumina-made genotyping array to offer a more global interpretation of an individual's deep ancestry.
In his address, Wells talked about the increasing popularity of array-based autosomal DNA testing. He called 2013 a "year of inflection" in consumer genomics that saw the millionth person take a personal DNA test and estimated that the number of people tested could more than double by the end of 2014. BioArray News spoke with Wells about those predictions and more following his address at RootsTech. Below is an edited transcript of that interview.
When they introduced your talk, they referred to you as the Indiana Jones of genetics, which I think was fair, because National Geographic has this rugged, outdoorsy image. At the same time, one could argue that consumer genomics in the past has had more of a tech-savvy, Silicon Valley image, mostly thanks to 23andMe. So, how have you managed to integrate consumer genomics into National Geographic's mission?
National Geographic has been interested in human origins research for a long time. They are also interested in language distribution and saving ancient languages, so cultural diversity in a broad sense, and human origins as a component of that, and that is kind of in NatGeo's DNA, so to speak. So, when we started the project nine years ago, it was really about the underlying science and how we use it to study human migration history, learning how diverse cultures around the world are connected to each other. And, yes, it is a very cutting-edge, laboratory-based technology, but at the end of the day it is very much in tune with what National Geographic has been doing for generations, just blending it with newer scientific tools. National Geographic has also been shifting its idea of what exploration is. From the 19th century when the Society was founded and up into the 20th century, exploration was primarily about going out and visiting new geographic locations. But a lot of geographical locations have already been visited, of course. Now it is a question of how you use scientific tools to better understand these locations, and exploration has really taken on a mantle of science, and that is what National Geographic is most excited about, that this is an application of using cutting-edge technology to understand age-old questions.
Between 2005, when it commenced, and 2012, when Geno 2.0 launched, the Genographic Project relied on microsatellite genotyping and Sanger sequencing to determine a participant's haplotype. When did the newer array technology enter into the picture?
Arrays were much more expensive in 2005, but we did look into it. Of course, one of the key things about the project is that it is not medical. Much of the early array technology was focused explicitly on medically relevant regions, and so there were a lot of concerns about how to do this research ethically and protect people's SNP data. But as time moved on, people started to say, 'We want this additional information. We want the whole genome, in addition to our Y or mtDNA haplotypes.' And as the price of the technology went down, and the demand grew, we decided to implement it, while focusing only on ancestry, and making every effort not to include medical information.
How did you go about deciding on the content for Geno 2.0?
If there was any evidence that it was linked to or associated with any kind of disease, then we had to exclude that. It was a question of selecting different ancestry informative markers found in different populations and that had the value to distinguish between populations that are relatively closely related, for example, between different Western Eurasian populations or Southeast Asian populations. You could use higher-density chips to screen millions of markers and accomplish that, but you have to be a little more selective if you are limiting yourself to a smaller set of markers.
23andMe obviously has a health-related component that sets it apart, but how do you distinguish Geno 2.0 from the chips used by Ancestry.com or Family Tree DNA?
A big part of what we focus on is the Y chromosome, so we have between 8,000 and 9,000 markers on the Y. We also test over 3,000 mtDNA markers. So, fine-grained detail on those lineages, that's something that the other companies are not focusing on as much. And I think the other companies are focused on more recent ancestry, genealogy, while we are focused on deeper ancestry and ancient migrations.
You have no interest in offering kinship tools?
Not right now. There are companies out there that are doing it, and I think they are doing it in the right way. It's not something that we are focused on as a project, but we are interested in it. Our customers can upload their data to our partner Family Tree DNA if they are interested in that, but it's not the primary focus of what we are doing.
You have discussed plans to develop a next-generation chip.
We have talked about it, we are looking into it, but I can't really say more at this point.
I attended a talk here where Diahan Southard, a genetic genealogist, described mtDNA as the "ugly stepsister" of Y, meaning that mtDNA lineages have been neglected because there has been so much emphasis on paternal lineages. Do you agree with that?
Certainly, genealogically, surnames are passed along the paternal line, at least in most societies, and Y chromosome testing allows you to investigate those links. If you are interested in the Wellses or the Smiths or the Petrones, the Y is going to give you an insight into your paternal ancestry. In the case of mtDNA, it won't be as easy, because in most societies, historically, it is women who are marrying into families, so the surname associated with an mtDNA lineage changes with every generation, and that makes it more difficult. So, I think that is why genealogists are more focused on the Y chromosome. But I think mtDNA is very important, especially for what we do, deeper ancestry. It allows you to see farther back in time. It may be even more important because of recent events, like the Genghis Khan effect, where a few men sometimes have an out-sized genetic imprint over a relatively recent time scale, and so you have individuals like Genghis Khan, who lived less than a thousand years ago, yet today one in 10 people in Central and East Asia trace their paternal line, their Y chromosome, back to him. So, the Y chromosome shows the impact of recent events to a greater extent than I think the mtDNA does.
The millionth person was tested last year. By the end of 2014, you predicted that it will be more than two million. Will this be from just array-based autosomal DNA tests, or does it also include microsatellites and sequencing?
This includes people testing their Y chromosome and mtDNA, with Geno 1.0, for instance, but most of the work is now being done on arrays. So, primarily, these will be autosomal DNA tests.
How did you arrive at that prediction of two million people tested by 2015?
Based on the estimated samples being run, and talking to people about how likely they were to test at multiple companies, trying to account for that. Kit sales matter from a business point of view, for the companies, but for scientific purposes, it's the number of individuals tested that really matters. We are interested in population sizes, sample sizes, rather than simply test numbers.
One critique of array technology is that the content is fixed, you know what's on there, and so you aren't discovering any new SNPs. So when you are going out and testing these indigenous populations, what new information are you obtaining, and what are you able to glean from those samples using the arrays?
It's really about the distribution of particular genetic markers. When we talk about migrations in ancient history, we can only do that if we have a database of indigenous populations to compare to. The idea is that so many people have moved around in recent history, and their ancestry is mixed. I, myself, am a mutt, so to speak, and so it's hard to place me or the average mixed-ancestry person walking down the street in New York or Salt Lake City or Los Angeles into an ancestral context. You need data from indigenous populations to make sense of mixed genetic patterns. By genotyping at high resolution in the indigenous populations, we are able to piece together the deep ancestral genetic patterns.
You mentioned in your talk that people outside the US are somewhat less interested in getting tested. Do you think the outreach events you have done, in County Mayo recently, for example, are generating more interest in this kind of testing?
I would certainly hope so. I think part of it is knowledge that these tests exist. But then it's also what do they get out of it. And I think, particularly in a place like County Mayo, it's a way of getting in touch with the diaspora, all the people who left Ireland for America. For many people in the US, it's a tradition to go back and visit relatives in Ireland. But what about those people who don't know their cousins? This provides them with a way to potentially get in touch with those relatives.
A few years ago, if you used the word microarray, most people wouldn't know what you were talking about. Now, there are many people analyzing their own genotyping array data. In your talk, you called it "citizen science." What do you think about that phenomenon?
In general, I think it's positive. The more people who test themselves, the better; more data equals better resolution. Also, the information that we present via the Genographic public participation kits is based on the latest information out there, vetted by our scientific team, but there is a lot of research that is ongoing that is more experimental, that may be ahead of what we are showing on the websites, and it's nice to have people try these analytical tools out on their own. Inevitably, because we have to vet everything we put out there, we might be less experimental than individuals who don't have to report results to thousands of people, but the fact that they are experimenting allows us all to benefit, I think.