NEW YORK – By bringing together historical genealogical documents and genotyping profiles, a team from McGill University, the University of Oxford, INSERM, and other research centers has teased out new details on French Canadian population history and relationships.
Together, the results "bridge the gaps between family pedigrees and continental population structure as well as gaps between theoretical models and empirical demographic histories," senior and corresponding author Simon Gravel, a human geneticist affiliated with the McGill University Genome Centre, and his colleagues wrote.
For a paper published in Science on Thursday, the researchers dug into 4 million parish records and obtained spatially resolved genealogical insights from Catholic marriage records, along with genotyping profiles for 20,451 French Canadians and nearly 2,300 people from France. A subset of 4,882 genotyped individuals could be tied directly to the spatial pedigree.
"A big part of the innovation here was to take what was known historically, from the marriage records, into a comprehensive model of genetic variation," Gravel explained in an email, noting that the approach made it possible to retrace the emergence of population structure over space and time in French populations after their arrival in Quebec.
"In the future," Gravel added, "we plan to use this mathematical model to reconstruct the history of genetic transmissions through this genealogy back to the first thousands of French settlers, and learn about human biology (natural selection, mutation, recombination) and disease risk in Quebec."
When the researchers analyzed their data alongside other historical records, they saw population changes suggesting that historical migrations in the region had a more pronounced effect on French Canadian population structure than the preexisting structure found in ancestral French populations.
"The loss of ancestral French population structure and the appearance of spatial and regional structure highlights a wide range of population expansion models," the authors reported, noting that "leading axes of genetic variation among French Canadians reflect genetic drift that followed French settlement and is encoded in the spatial pedigree."
Past research found very low levels of Indigenous ancestry in French Canadians, despite extensive migration into traditional First Nations territories, the researchers noted. Instead, most French-speaking individuals living in Quebec today have ancestry that stretches back to some 8,500 French settlers who arrived in the 17th and 18th centuries.
The team's ancestry analyses suggested that early Quebec City colonial settlers originated from sites in northwestern and western France, though later migrations linked to further colonization and military efforts appeared to involve migrants from other parts of France.
When the investigators dug further into the genetic features and differentiation patterns, they found that these patterns were closely linked to geographic features in Quebec, particularly the province's river network.
"[T]he past four centuries of European colonial history has been marked by rapid frontier expansion, beginning along the shores of the St. Lawrence River and eventually expanding up its tributaries," the authors explained. "By tracing the genealogical ancestry of millions of individuals across space and time, we describe a constellation of distinct founder events arranged along geographic features that defined transportation and economic activity."
In addition to underscoring the importance of river networks in settler spread, the team's genetic analyses pointed to key migrations and founder events consistent with the historical record, from early French settlements at a narrow portion of the St. Lawrence River to efforts to secure control of the Great Lakes or expand into Iroquoian and Woodland First Nations territory along river shorelines.
Meanwhile, using spatiotemporal metadata for present-day French Canadians with grandparents related to individuals in the historical pedigree, the researchers put together simulated whole-genome sequences for more than 1.4 million individuals.
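As a rough illustration of what simulating genetic data on a fixed pedigree involves, the following Python sketch passes allele labels from founders down a toy family tree, a technique often called gene dropping. It is not the authors' pipeline: it tracks a single locus with no recombination or mutation, and the pedigree, the individual names, and the `drop_alleles` function are all hypothetical.

```python
import random


def drop_alleles(pedigree, founders, rng):
    """Simulate one locus through a pedigree by gene dropping.

    pedigree: dict mapping child -> (parent_a, parent_b), ordered so that
              parents always appear before their children.
    founders: individuals with no recorded parents.
    Returns a dict mapping each individual to a pair of founder-allele labels.
    """
    genotypes = {}
    for f in founders:
        # Each founder carries two unique allele labels, e.g. ("F1.0", "F1.1").
        genotypes[f] = (f"{f}.0", f"{f}.1")
    for child, (pa, pb) in pedigree.items():
        # Mendelian transmission: one randomly chosen allele from each parent.
        genotypes[child] = (rng.choice(genotypes[pa]), rng.choice(genotypes[pb]))
    return genotypes


# Hypothetical toy pedigree: two founder couples, one child each, one grandchild.
founders = ["F1", "F2", "F3", "F4"]
pedigree = {
    "C1": ("F1", "F2"),
    "C2": ("F3", "F4"),
    "G1": ("C1", "C2"),
}

rng = random.Random(0)  # fixed seed so the run is repeatable
genotypes = drop_alleles(pedigree, founders, rng)
print(genotypes["G1"])  # the two founder alleles carried by the grandchild
```

Because every allele keeps the label of the founder it came from, repeating the simulation many times and counting labels in present-day individuals gives a simple estimate of how much each founder contributed, and no real genotype data is needed to run it.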
"Although our simulations are based on a real pedigree, they do not contain identifying information, and we can freely share this genome-wide dataset along with spatiotemporal metadata for [more than] 1.4 million individuals," the authors explained.
Though they cautioned that the simulations rely on a pedigree that may include recording errors and does not include information on structure or natural selection in the French Canadian populations beyond insights available from the pedigree, they suggested that "a large and freely available simulated dataset with realistic population structure and mating patterns will help the design of more robust inference methods."
"Despite these caveats, we believe that the genetic model, the simulation tools, and the publicly available simulated data we describe here provide a lens to investigate population genetics at an unprecedented resolution," they concluded.