NEW YORK (GenomeWeb) – For an increasing number of consumer genomics clients, the customer experience does not end the day that they receive an email notifying them that their 23andMe, AncestryDNA, or Family Tree DNA test results are ready.
Instead, after reviewing their company-generated reports, many opt to download their raw microarray or sequencing data and upload them to third-party analysis sites where they can learn more about their ancestry and health, often for free or at minimal cost.
Promethease, GEDmatch, YFull, DNAGedcom, LiveWello, and Interpretome are just a handful of the websites that cater to customers who hope to squeeze more information out of their data than is obtainable from the major consumer genomics players. While each third-party tool provider differs in its offerings and focus, recent interviews with the makers and administrators of these services reveal a segment of the consumer genomics industry that is undergoing significant change.
As the number of people genotyped worldwide is set to pass 3 million this year, third-party tool providers have worked to improve usability, add features, and upgrade their servers for scalability. Several have also introduced subscription fees for access to particular tools in order to support growing needs. As such, providers must balance a more business-like outlook with the people-empowering credo that has allowed them to grow almost solely by word of mouth.
"The mentality is very different," said Greg Lennon, co-founder of Promethease, about the difference between the large consumer genomics companies and third-party providers. "There are companies that want to convince you that your genome will somehow cure your ailments," Lennon said. "That is pretty different from what we do."
Lennon, a geneticist, co-founded Promethease with computer programmer Mike Cariaso as an outgrowth of SNPedia, a public wiki on human genetics, nearly a decade ago. The company, which consists of the two men, charges $5 to generate gene reports by comparing customers' array and sequencing data to SNPedia. After 23andMe suspended the health portion of its Personal Genome Service following receipt of a US Food and Drug Administration warning letter in 2013, Promethease was seen by some as an alternative for consumer genomics customers seeking health-related information.
At the same time, the company operates in a way that has allowed it to avoid similar regulatory scrutiny.
"Researchers at the FDA have been in touch with us because they have used Promethease for their own research, but we have not had a discussion within a regulatory context," said Lennon. "Keep in mind that we don't do assays of any sort; Promethease is a literature retrieval system," he said. "We are trying to connect people as efficiently as possible to what the literature says about those genotypes."
As both Lennon and Cariaso currently work on Promethease as a side project, the company never set out to conquer the market as early entrants to the consumer genomics space such as 23andMe, Navigenics, and DeCode Genetics did. And yet, as Cariaso pointed out, its health reports have outlasted most others, in part because Promethease was never a commercial endeavor.
"If we had taken a pile of venture capital money, VCs would have expected a return on investment;" said Cariaso. "We only ever did things that we could do directly out of pocket and make sure that it was covering its own expenses," he said. "We are not under the commercial pressure that everybody else is. There is no marketing department or scientific advisory board," he added. "It's just Greg and I saying, ‘Does this seem cool?'"
"We don't do testing, we don't sell supplements — there's no add on," said Lennon of Promthease's approach to the market. "We also don't store users' data, and we don't sell it or mine it in anyway."
Cariaso underscored this fact as a differentiator between what Promethease offers and what large consumer genomics companies, notably 23andMe, have done with their users' data. "Everything we know about your genetics is gone after 45 days," Cariaso said. "If you have your worst horror scenario that Starbucks or Monsanto acquires us tomorrow with some sort of a hostile takeover, the answer is that if you did your Promethease 46 days ago, you're cool. No sweat."
While Promethease has maintained its crowdsourced, user-empowering mentality through the years, it has sought to continuously upgrade its service to improve users' experiences. Cariaso and Lennon both discussed projects under way to augment usability and expand applications and features. Both said that the cloud-based architecture of the site is scalable and has not faced some of the server issues that have plagued other third-party providers, even as they have increasingly been working with higher-density next-generation sequencing data in addition to the more common microarray data uploads.
"Fundamentally, that's not too difficult for us," said Lennon. "We are processing full sequence routinely," he said. "A few years ago it was extremely rare for us to see full sequence or exome and now it is not rare at all."
Promethease is also increasingly liaising between researchers and customers who happen to have certain medically relevant genotypes. "We can discuss quite privately with nonprofits and researchers conducting clinical trials about what is the right way to help people to get to clinical trials or help with them," said Lennon. "It is easy enough for us to put out a certain Promethease report tagged to a certain genotype: if you have this genotype and you might be interested in a clinical trial, here's a contact," he said. "We are increasingly working with organizations to figure this out. It's one more aspect of how people can use their genomes sensibly."
The company also continues to benefit from the crowd curation of SNPedia, which allows users themselves to update literature related to certain SNPs, providing what Lennon sees as more accurate reports compared to some commercial providers.
"The literature is full of reports that a given genotype leads to a certain consequence but it's based on relatively little data," said Lennon. "It might have been a paper that was published in 1983," he said. "We are able to compare a lot of different sources in a ways that people in a single silo can't."
Like Promethease, GEDmatch is another service that caters to consumer genomics customers that has become increasingly popular, all without significant investment or an advertising budget. "We have never advertised or promoted," said Co-founder Curtis Rogers. "Our growth is strictly by word of mouth," he said. "Every week brings thousands of new users."
According to Rogers, GEDmatch now has over 130,000 registered users, over 200,000 samples in its DNA database, and over 75 million individuals in its genealogical database. To keep up with this sizable user base, the site runs on approximately 14 servers, and its hosting fees are paid from user contributions, Rogers said.
While Promethease serves consumer genomics customers who are interested in what potentially medically relevant SNPs they or their family members might be carrying, as described in the literature, GEDmatch has served as an interface for clients of the major testing companies to further their ancestry research. Currently, AncestryDNA and 23andMe customers cannot conduct cross-company comparisons, for instance, but within GEDmatch, they can upload their data and do so, as well as reanalyze it for ancestral origin or relationship matches. There are a host of other features, such as a chromosome browser, that have made it an essential component of genetic genealogy. And yet its popularity has caused GEDmatch's founders their share of challenges. The website has been offline numerous times in the past because of scalability issues.
"A lot of our time has been tied up simply trying to keep up with the rapid increase in usage," said Rogers. "By removing some programs and going to Amazon Cloud servers, we currently have some breathing room," he said.
Together with co-founder John Olson, Rogers runs GEDmatch with the help of a few volunteers. Customers can become subscribers to the service to access certain "tier 1" genetic genealogy tools, such as a feature called Lazarus that allows users to reconstruct the genetic kit of an ancestor based on descendants' data. There are other features that GEDmatch has been forced to discontinue due to its relentless user growth, and others under development that the provider would like to introduce if it had more time and resources, Rogers noted, adding that GEDmatch is moving toward instituting a more business-oriented operating model.
"From a business perspective, we have passed the point where we need good backup and a reliable management team," said Rogers. "The two of us also have very busy outside jobs, but GEDmatch is taking more and more of our time," he said. "It is important that GEDmatch becomes less reliant on its two founders." At the same time, he said that GEDmatch hopes to stick to its initial vision to help consumer genomics customers. "Our thought has always been that we are here for our users' benefit, not ours," he said.
Like GEDmatch, the site DNAGedcom evolved out of a desire by consumer genomics customers to share data across services, rather than to only interact with other customers of the same companies. Like GEDmatch, DNAGedcom allows users of multiple services to upload their kits into a single repository. Moreover, it offers an array of tools to compare data between kits, with an emphasis on identifying genetic relationships.
"DNAGedcom is there because now that you have your data there needs to be somebody who is looking at the data, working with it very hard," said founder Rob Warthen. "The big companies are trying to match the least common denominator, to get the most people, and we are trying to reach that 10 percent of people who need specialized information," he said.
Warthen's service evolved out of DNAAdoption, another website that serves adoptees who aim to integrate genetic genealogy into their searches for biological relatives. As GenomeWeb reported last month, more and more adoptees are successfully using consumer genomics services to connect with their biological families.
While the rate of success stories has grown, it's not always easy for adoptees to make use of these services, according to Warthen.
"There are a lot of people in the adoptee community who are upset that the companies sell this product and then don't make it easier to understand it," he said. "That's where DNAGedcom came into play, taking that information and making it understandable more quickly," he said. Warthen also stays on top of data analysis needs by upgrading DNAGedcom as the major providers introduce new features.
"Any time that one of these sites make a change, I am in there gathering information, seeing how we can make better use of that data," he said.
Currently, DNAGedcom allows users to gather and analyze autosomal matches across Family Tree DNA, 23andMe, and AncestryDNA data, as well as search for surnames, names, and places among matches. Much of Warthen's current development of the site involves adding visualization tools and other data analysis features. That development work takes time, and Warthen continues to run DNAGedcom as a side project, though he has introduced a "tier 1 tools" option similar to GEDmatch's tier 1 service and contracts work out to other programmers.
"I think that's what you are going to see with third-party tools, when you have this kind of volume, and the kind of servers that you need to gather, and you are going to have to have full-time people that are dedicated to this who are going to be able to run a reasonably good site," said Warthen. "To offer a good service you have to charge some amount," he said, though he added that with DNAGedcom a large portion of the site will always remain free.
"The optimal goal is to turn this into a real business," said Warthen. "Not that I am looking to make money, but it can be and already is to a large extent a full-time job."
YFull is arguably one of the more distinctive third-party data analysis providers in that it focuses exclusively on reanalyzing full Y chromosome sequencing data generated by Family Tree DNA and Full Genomes. For four years, the Moscow-based informatics company has charged $49 per analysis.
Vadim Urasin, a team member at YFull, said that the company now processes about 200 samples per month and also provides estimates of Y haplotype subclades. Urasin said that YFull has added an estimated age for almost every subclade on the company's Y Trees, which are essentially Y chromosome pedigree charts where each branch, differentiated by certain mutations, is named with a letter and deeper branches are labeled with numbers and letters to reflect their relationship to each other. By being placed on the firm's Y Tree, clients can learn about their specific line of ancestral descent, with hints to their geographical origins.
Looking forward, Urasin said that YFull aims to improve its estimates about subclade ages, adding information about the countries of origin of known ancestral members of subclades. "Estimating age is possible owing to using analysis of mutations in DNA from ancient burials and radiocarbon dating of these burials," he noted.
While YFull remains a small company, David Mittelman, a geneticist and entrepreneur who worked on Family Tree DNA's Big Y product when he was CSO of that company, praised the service, arguing that it offers an "excellent and more comprehensive analysis of the Y chromosome than Family Tree DNA or Full Genomes."
In general, he said that services like YFull's have emerged in the consumer genomics industry because large vendors have not been offering all of the tools that users want.
"The big vendors don't always prioritize tools and features the same way the community does," Mittelman said of this industry trend. "They have their own motivations, and product roadmaps. It's a great value when small nimble startups or individual enthusiasts come up with tools to accommodate unaddressed needs in the community."
Mittelman predicted that in the future, more consumer genomics customers will begin to use paid services like those offered by Tute Genomics, where he presently serves as CSO, to analyze their data. "Tute is a well-funded company focused on helping people extract the most meaning from their genome," said Mittelman. "Today, they are serving the needs of researchers and clinicians but it's just a matter of time before they work directly with consumers," he said.
In the future, Mittelman suggested that customers may also turn to Helix, a new venture funded by Illumina that aims to build a marketplace for consumers to dig into their genetic data.
Illumina and several partners announced in August that they had formed Helix to provide services for consumer samples obtained through third-party partners. Customers will be able to manage their data and have access to a marketplace of on-demand applications provided by Helix partners in order to gain insights into their data, according to the firm.
An Illumina spokesperson this week declined to provide an update on Helix beyond what the company communicated over the summer.