NEW YORK – Gencove, a New York Genome Center spinout that is helping clients use low-coverage sequencing for genotyping, has expanded into the agriculture market as it continues to work on its platform.
Low-coverage, also called low-pass, sequencing — for Gencove, usually less than 1X depth — has several agricultural applications but mainly competes with other genotyping methods, such as microarrays, according to Joe Pickrell, the company's CEO and cofounder. "There are some really interesting breeding programs, and we've done some esoteric species," such as black soldier flies, which are insects used in feed for animals, he said. "There has been a structural shift in that industry now that people have genome sequencing hardware. There's an opportunity to take advantage of decreased cost and, operationally, low-coverage sequencing has an advantage."
So far, the company has produced data for peer-reviewed studies looking at dogs and beef cattle.
Fresh off a $10 million Series A financing round, Gencove now plans to improve the core functions of its platform and to build datasets and informatics tools to make low-pass sequencing competitive with arrays.
Curating the best datasets to feed the imputation algorithm is one of the keys to unlocking the ag-bio market. "If you want to go low-pass on soybeans, there aren't great databases you can download and go with," Pickrell said. "So, we built out heavily curated haplotype reference panels." The firm has also invested in making the software user interface easy to use.
Founded in 2017, Gencove first offered an ancestry test for the consumer genomics market. However, the firm abandoned that product in 2018 and turned toward the research market.
The company's method relies on data imputation to infer genotypes from the sparse sequencing data it produces, using known haplotypes from previously generated datasets. For human genomes, those could come from the 1000 Genomes Project or HapMap datasets.
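To illustrate the general idea of imputing from a haplotype reference panel, here is a minimal Python sketch. It is not Gencove's algorithm: it swaps in a deliberately simple nearest-haplotype copy, whereas production imputation tools use statistical models over many haplotypes. The function name `impute_from_panel`, the toy panel, and the 0/1 allele encoding are all assumptions made for illustration.

```python
def impute_from_panel(observed, panel):
    """Fill in unobserved sites by copying the best-matching reference haplotype.

    observed: dict mapping site index -> allele (0 or 1) seen in the sparse reads.
    panel:    list of reference haplotypes, each a list of 0/1 alleles.
    Toy stand-in for real imputation; all names here are hypothetical.
    """
    if not panel:
        raise ValueError("reference panel is empty")

    def mismatches(haplotype):
        # Count observed sites where this reference haplotype disagrees.
        return sum(1 for site, allele in observed.items()
                   if haplotype[site] != allele)

    best = min(panel, key=mismatches)
    # Keep observed alleles; borrow everything else from the best match.
    return [observed.get(site, best[site]) for site in range(len(best))]


# Toy reference panel of three haplotypes across five sites.
panel = [
    [0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 1, 0, 0],
]
# Low-pass reads happened to cover only sites 0 and 3.
observed = {0: 1, 3: 1}
imputed = impute_from_panel(observed, panel)
print(imputed)
```

The sketch also shows why panel quality matters: if no haplotype in the panel resembles the sample, the borrowed alleles are guesses.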
So far, the firm has run "hundreds of thousands of samples" and is growing its sample volume "at a good pace," Pickrell said. He declined to provide revenues but said they're also growing "at a pretty good clip."
At present, the firm, based in New York, has about 15 employees and is planning to "heavily expand" the team over the next year.
"What's great about low-pass sequencing is that we can start to take a little bit of imperfection in the library preparation," he said. The firm has modified its sample prep to increase throughput by cutting the total time and the number of steps, and it has made the process automatable. To reach the firm's goal of sequencing hundreds of thousands or millions of samples a year, the process has to be automated, Pickrell said.
In February, the firm published a study in Genome Research comparing its method to genotyping arrays for genome-wide association studies and generating polygenic risk scores.
"We asked, 'What aspects of the data influence genotyping accuracy?'" Pickrell said. His team identified a concept he calls "effective coverage," a combined measure of the number of sequences and their evenness across the genome. Different library preparations can affect effective coverage, and taking this into account can decrease the amount of overall sequencing needed. With this concept, a lab could do a cost-benefit calculation and determine that some unevenness is acceptable if it decreases costs or makes the workflow automatable.
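The article does not give a formula for effective coverage, so the Python sketch below uses one plausible formalization as an assumption. If reads landed uniformly at random with mean depth λ, a Poisson model predicts that a fraction 1 − e^(−λ) of sites would be covered by at least one read. Inverting that relationship turns the observed fraction of covered sites into a Poisson-equivalent depth. The function names and the toy depth lists are hypothetical.

```python
import math


def nominal_coverage(site_depths):
    """Mean read depth per site (what the sequencer 'spent')."""
    if not site_depths:
        raise ValueError("no sites given")
    return sum(site_depths) / len(site_depths)


def effective_coverage(site_depths):
    """Poisson-equivalent depth implied by the share of sites with >= 1 read.

    Assumed formalization: lambda_eff = -ln(1 - fraction_covered).
    """
    if not site_depths:
        raise ValueError("no sites given")
    covered = sum(1 for depth in site_depths if depth > 0)
    if covered == len(site_depths):
        raise ValueError("every site is covered; this estimate is unbounded")
    return -math.log(1 - covered / len(site_depths))


# Two toy libraries with the same total number of reads (4 reads, 8 sites).
even_library = [1, 0, 1, 0, 1, 0, 1, 0]
uneven_library = [4, 0, 0, 0, 0, 0, 0, 0]

for name, depths in [("even", even_library), ("uneven", uneven_library)]:
    print(name, nominal_coverage(depths), round(effective_coverage(depths), 2))
```

Both toy libraries have a nominal coverage of 0.5X, but the even one scores about 0.69 and the uneven one about 0.13 under this measure. That gap is the kind of quantity a lab could weigh against the cost savings of a faster or automated library prep.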
While expanding into the ag-bio market, Gencove continues to work on human genomics. One of its customers, Israel's Weizmann Institute of Science, is using low-pass sequencing in Project 10K, a biobanking and personalized disease prediction study with 10,000 participants.
The Gencove data performed as well as the study's genotyping arrays at lower cost, according to Iris Kalka, a graduate student at the Weizmann Institute working on the project. As a result, her team has sent more than 6,000 samples to Gencove for processing. The researchers, led by Eran Segal, a professor of computational biology, are combining the genomic data with many medical tests, including glucose levels, sleep data, and body fat distribution.
"Our hope is to use this data as a resource for drug discovery or biomarker discovery," Kalka said.
The company has been a big help in computing polygenic risk scores, she said. While Kalka calculates PRS for many conditions, Gencove's pipeline always determines a PRS for cardiovascular disease and breast cancer, which she has used to validate her own work.
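For readers unfamiliar with the calculation, a polygenic risk score is commonly computed as a weighted sum of a person's allele dosages, with weights taken from association studies. The Python sketch below shows that generic form only; it does not reproduce Gencove's pipeline or Kalka's code. The variant IDs, weights, and dosages are invented for illustration, and imputed dosages are assumed to be fractional values between 0 and 2.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of allele dosages.

    dosages: dict mapping variant ID -> expected count of the effect allele (0 to 2).
    weights: dict mapping variant ID -> per-allele effect size.
    Simplification: variants missing from `dosages` contribute nothing.
    """
    return sum(weight * dosages.get(variant, 0.0)
               for variant, weight in weights.items())


# Hypothetical effect sizes and imputed dosages, for illustration only.
weights = {"variant_a": 0.12, "variant_b": -0.05, "variant_c": 0.30}
dosages = {"variant_a": 1.8, "variant_b": 0.4, "variant_c": 1.0}

score = polygenic_risk_score(dosages, weights)
print(round(score, 3))
```

Because the score is a simple sum, two independent implementations given the same weights and genotypes should agree closely, which is what makes a second pipeline useful as a cross-check.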
She said she has been "positively" surprised at how responsive the company has been to her questions. "We wanted to implement something they wrote in a paper," she said, and after approaching the firm, she found herself working with the company's chief data scientist. "He went over our code, our methodology. I was truly pleasantly surprised at that."
But working with Gencove isn't for everyone. "Ask yourself what you want to know and whether this answers your scientific question," she said. "If you want to search for very unique variants in your data, I don't think [low-pass sequencing] has the results for you."