Ancestry.com Scientists Use Array Data to Reconstruct Genome of 19th Century Alabama Farmer

NEW YORK (GenomeWeb) ― Scientists at AncestryDNA, the consumer genomics wing of Ancestry.com, have used clients' microarray data to reconstruct the partial genome of a common ancestor who was born more than 200 years ago.

The feat was made possible by a new AncestryDNA feature called DNA Circles, which uses customers' genomic data to connect them to a probable common ancestor, typically one who lived during the past two centuries.

Julie Granka, a population geneticist at Provo, Utah-based AncestryDNA, told GenomeWeb that the company embarked on the genome reconstruction project to validate its underlying DNA Circles methods.

She noted that AncestryDNA has genotyped about 500,000 samples since it launched its genetic genealogy service two years ago. By combining the data from those samples with Ancestry.com's 60 million family trees using the DNA Circles tool, AncestryDNA scientists were able to identify an individual with a sufficient number of living descendants to attempt to reconstruct his genome.

The man was David Speegle, an Alabama farmer who was born in 1806. According to AncestryDNA, Speegle married twice, first to Winifred Cranford in 1830, with whom he had 19 children, and, after Cranford died, to Nancy Garren in 1870, with whom he had seven more children before he died in 1890. 

Granka said it was the Speegles' fecundity, in part, that made them "compelling candidates" for genome reconstruction.

"They were selected in part because they had such a large number of children and grandchildren," Granka said. Indeed, a newspaper article from 1890, available online, noted that at the time of his death, Speegle already had 300 living descendants.

"Since some genetic material is lost each time it is passed from parent to child, [the existence of] lots of descendants means that there were lots of chances for the DNA of David Speegle and his wives to be passed on and to persist into later generations," said Granka. "Thus, given genetic data from a collection of his descendants, that means that there was more of the Speegle genomes that could potentially be reconstructed."
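Granka's point about descendant counts can be illustrated with a back-of-the-envelope calculation. The sketch below is not AncestryDNA's method. It assumes, purely for illustration, that each descendant independently carries any given stretch of the ancestor's genome with probability 0.5 raised to the number of generations separating them. Real descendants are related to one another and inherit overlapping segments, so these figures overstate what could actually be recovered. The function name `expected_coverage` is hypothetical.

```python
def expected_coverage(generations: int, descendants: int) -> float:
    """Rough expected fraction of an ancestor's genome carried by at least
    one of `descendants` people, each `generations` generations removed.

    Illustrative assumption (not from the article): every descendant
    independently carries a given stretch with probability 0.5 ** generations.
    """
    p_single = 0.5 ** generations  # chance one descendant carries the stretch
    p_nobody = (1.0 - p_single) ** descendants  # chance no descendant carries it
    return 1.0 - p_nobody


if __name__ == "__main__":
    # More descendants raise coverage; more generations lower it.
    for generations in (3, 5, 8):
        for descendants in (10, 100, 300):
            coverage = expected_coverage(generations, descendants)
            print(f"{generations} generations, {descendants} descendants: {coverage:.3f}")
```

Under this toy model, coverage rises with the number of descendants and falls with each added generation, the two effects Granka describes.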

Another factor at play was Speegle's successive wives. "This allowed us to, in some cases, tease apart which parts of the reconstructed genome actually belonged to David Speegle, and not to either of his wives," Granka said.

According to a statement, AncestryDNA was ultimately able to piece together fragments of genetic code from David Speegle and his spouses Winifred and Nancy for roughly 50 percent of the length of the human genome. In some cases, Granka and her team were able to identify pieces of the genome that were unique to David Speegle. They also noted the presence of genes associated with certain traits, including male pattern baldness and blue eyes.

Granka credited AncestryDNA's DNA Circles tool, which became available in a beta version in November, with the company's ability to carry out the partial genome reconstruction.

"DNA Circles ... enabled us to find a collection of individuals who all seemed to be likely descendants of David Speegle," she said. "It was this collection of descendants that we used as a basis for genome reconstruction."

Granka said that DNA Circles is "unique as a feature" in the consumer genomics space in a number of ways. First, it integrates genetic and genealogical data on "a scale and scope that has not yet been done before." Second, it makes it easier for genealogists linked to a particular ancestor to collaborate.

Finally, because the randomness of genetic inheritance means that distant cousins do not necessarily share DNA, DNA Circles can unite two descendants of a common ancestor even if they share no DNA segment inherited from that ancestor. For instance, of two first cousins, one may have inherited a DNA segment from a certain ancestor while the other did not. Because the two cousins are linked to each other via DNA Circles, and the cousin who inherited the segment is linked to others who inherited it, the tool still links the cousin who did not inherit the segment to those other descendants.
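The indirect linking described above resembles finding everyone reachable through a chain of links in a graph. The sketch below illustrates that general idea only. The article does not describe how DNA Circles is implemented, and the function `circle_members`, the link list, and the person labels are all hypothetical.

```python
from collections import defaultdict


def circle_members(links, start):
    """Return everyone reachable from `start` through a chain of links.

    `links` is an iterable of (person_a, person_b) pairs. In this toy
    example a pair stands for any pairwise connection, such as a shared
    DNA segment or a pedigree match.
    """
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    seen = {start}
    stack = [start]
    while stack:  # depth-first walk over the link graph
        person = stack.pop()
        for neighbor in graph[person]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


if __name__ == "__main__":
    links = [
        ("cousin_without_segment", "cousin_with_segment"),
        ("cousin_with_segment", "descendant_c"),
        ("descendant_c", "descendant_d"),
    ]
    # The cousin without the segment still reaches the other descendants
    # through the cousin who carries it.
    print(sorted(circle_members(links, "cousin_without_segment")))
```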

Granka said that AncestryDNA has so far established more than 150,000 DNA Circles, meaning 150,000 ancestors for whom the company's scientists have inferred a likely collection of descendants. "From a database of over a half-million genetic tests, that number is very impressive," she said.

While the company is still refining its genome reconstruction methods, Granka said that AncestryDNA is "excited about the implications of this research" for genetic genealogy and the genomics industry as a whole.

There are limits, though, to what genome reconstruction can achieve, Granka cautioned. One is the number of descendants an individual has: the more descendants, the more feasible such reconstructions become.

Time is also a factor, as with each generation, more DNA from a particular ancestor is likely to be lost. "It may be difficult to recover as much of the genome from a more distant ancestor than a more recent ancestor," Granka said.

Yet even a person born 200 years prior to David Speegle might be a candidate for reconstruction, should he have enough living descendants. "In theory, with enough descendants of a particular ancestor from the 1600s, we could technically reconstruct parts of their genome," said Granka.

Ancestry.com's announcement of the partial genome reconstruction is the second from the company in recent months. In October, Ancestry.com presented the results of a large-scale study of US population genetics at the American Society of Human Genetics meeting in San Diego. Specifically, AncestryDNA's scientists predicted the genetic ethnicity of each individual tested using a curated reference panel of 3,000 single-origin individuals. The estimates were then combined with birth locations to explore how various ethnicities are distributed across the country.

"We are always looking for ways to push the boundaries of what can be done using a massive collection of genetic and genealogical data," Granka said. "This project is one example of how integrating genetic data with pedigrees can lead to exciting advancements in genetic genealogy and population genetics — especially on such a large scale."
