NEW YORK (GenomeWeb) – A social media-driven project to collect health survey and genetic data from volunteers is seeking more financial support to continue genotyping participants while looking for partnerships to maximize the value of its database.
As reported in an American Journal of Human Genetics paper this week, the Genes for Good Project has engaged 80,000 Facebook users via its online application and genotyped 27,000 people as of March 2019. Consenting users completed more than 2.9 million surveys, answering more than 22 million questions, according to the paper.
While the success of the project showcases the ability of social media to source health and genetic data from volunteers, organizers believe their database could be bigger.
"We want to continue, to get to 100,000 people or more genotyped," said Gonçalo Abecasis, a professor of biostatistics at the University of Michigan School of Public Health in Ann Arbor and director of the project. "We have to figure out how we will continue to pay for those spit kits and DNA tests, and we need to do a little bit of fundraising to do that," he said.
Abecasis is also an employee of Regeneron Pharmaceuticals and previously served on the scientific advisory boards of Regeneron, 23andMe, and Helix. He stressed, however, that Genes for Good is a University of Michigan project only.
Abecasis and colleagues started the Genes for Good Project in 2015. The goal of the study was to solicit Facebook users to fill out health surveys and provide DNA samples via an app. Health surveys included information about family and personal health history as well as daily tracking data. Participants had to be residents of the US and aged 18 or older.
Users who completed a certain number of surveys were mailed spit kits for free and, in return, received some data about their ancestry and other traits from the project. Genotyping was run on Illumina Infinium CoreExome-24 arrays, which were selected in part because other projects at the university already use them, and in part because of their relatively large set of about 600,000 SNPs.
"We wanted an array that gave us good coverage of the entire genome," said Abecasis.
Participation in the project grew steadily, at about 2 percent per week, Abecasis noted. "Compounding that growth, it turns out to be quite a lot of people," he said. "It was exciting to see that we got participants from all over the US."
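To put that compounding in perspective, here is a minimal sketch of the arithmetic. It assumes a perfectly constant 2 percent weekly growth rate, which is an idealization of the "about 2 percent" figure Abecasis cited, and the function names are illustrative only, not taken from the project.

```python
import math


def compound_growth(weekly_rate: float, weeks: int) -> float:
    """Return the fold increase after `weeks` of constant weekly growth."""
    return (1.0 + weekly_rate) ** weeks


def doubling_time_weeks(weekly_rate: float) -> float:
    """Return how many weeks it takes to double at a constant weekly rate."""
    return math.log(2.0) / math.log(1.0 + weekly_rate)


# Assumed constant rate of 2 percent per week (an idealization).
fold_per_year = compound_growth(0.02, 52)
weeks_to_double = doubling_time_weeks(0.02)

print(f"Fold growth over 52 weeks: {fold_per_year:.2f}")
print(f"Weeks to double: {weeks_to_double:.1f}")
```

Under that assumption, the number of participants would grow roughly 2.8-fold in a year and double about every 35 weeks.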
In addition to its regional diversity, Genes for Good's database was relatively ethnically diverse, according to the paper: roughly 75 percent of participants were of European descent, compared with the roughly 72 percent of the US population that identified as white in the 2010 census, according to the US Census Bureau. Participants, however, tended to be younger and female. In fact, about three-quarters of those who engaged with the project were women.
"In part, that's social media," commented Abecasis. "Social media users are skewed younger," he said. He attributed the high female participation rate to both social media and word of mouth. Most users actually engaged with the project on the advice of friends and relatives. Participants were also diverse economically and fell into the US middle household income bracket of $35,000 to $100,000 a year.
All of these indicators, according to Abecasis, demonstrated the ability of a social media app to generate a database more representative than other databases, which tend to be drawn from participants close to a research institution and have typically been more homogeneous and wealthier, he said.
To assess the value of the data collected, Genes for Good also set out to replicate some published genome-wide association findings in its dataset. "The point was to do some experiments," said Abecasis. "If you asked someone if they had diabetes, how do you know they are telling the truth?" he said. "What we saw is that for this set of individuals, the results are consistent with these public studies."
In addition to diabetes, Genes for Good also replicated findings from asthma, body mass index, hair color, and eye color association studies in its dataset. The results largely matched the published findings.
Genes for Good has ceased kit collection for now while it looks for more funding. "What we are trying to figure out is if the idea of sponsoring a project like this is attractive to the National Institutes of Health, for example, or a commercial partner that might fund the study so that we can continue to make it free to participants," said Abecasis. He did not provide an estimate of how much the project has cost to date.
Should it find support, Genes for Good might collect additional data on participants in the future. In the paper, the authors noted the emergence of smartphone and wearable sensor applications for measuring physical activity, heart rate, temperature, sleep patterns, and GPS location to infer environmental exposure.
"These and other novel data collection methods are developing rapidly, holding great promise in the near future for the efficient collection of large quantities of precise longitudinal data with minimal participant burden," the authors wrote.
Genes for Good is also looking to work with other projects that have their own datasets to power more discoveries. "I think we will make the most advances if we pool the data with other large studies and do very large analyses," said Abecasis. "If we are very fortunate, eventually instead of having 20,000 people, we'll have 200,000 people, [which will] allow us to make huge genetic discoveries," he said. "I think, right now, the most powerful thing would be to collaborate on a variety of research questions."