Personalis Seeks Edge with 'Enhanced' Exome and Genome Sequencing, High-Quality Annotations

After working quietly for a year, Stanford University startup Personalis recently started offering sequencing and clinical analysis services for human exomes and genomes to researchers. The company wants to set itself apart from competitors by providing more accurate sequencing of all medically relevant areas of the genome and by offering analysis that is based on curated proprietary annotation databases, several of which it licensed exclusively from Stanford.

Personalis was founded in 2011 by a team of four Stanford professors with expertise in human genome interpretation and former Solexa CEO John West (CSN 9/28/2011). Later that year, the firm raised about $20 million in venture capital from Abingworth, Lightspeed Venture Partners, Mohr Davidow, Stanford University, and a couple of individual investors.

West told Clinical Sequencing News at the recent Advances in Genome Biology and Technology conference in Marco Island, Fla., that over the last year, the company has grown to about 25 employees and has opened a laboratory in Menlo Park, Calif., that is equipped with two Illumina HiSeq 2500 and two MiSeq instruments. The laboratory obtained CLIA certification in January and expects to become CAP-accredited later this year, allowing the company to offer its services to pharmaceutical companies for clinical trials and, eventually, to develop clinical tests.

About three quarters of Personalis' staff works in research and development, but the firm is in the process of recruiting a sales and marketing team for its services, which it expects to grow significantly over the coming year. It has room to grow to about 75 employees in its current facility.

Over the last year, the company has been developing its sequencing approach and analysis pipeline, with an emphasis on accuracy and completeness in biomedically important areas. It currently offers services for disease case-control studies, pharmacogenomics and adverse event studies, rare pediatric syndrome studies, tumor-normal studies, and others, including family trio and proband-only analyses.

The firm currently does not provide pricing information, saying that prices differ depending on the type of sequencing and analysis required by the customer. "The goal is not to make a Lamborghini," West said. "The goal is to make something like a BMW, that can still be in the range of being mainstream but clearly is a superior product, and it's going to cost more to make that kind of a product."

On the sequencing side, Personalis provides standard exome and whole-genome sequencing as well as, at an additional cost, what it calls "Accuracy and Content Enhanced" exomes and genomes. For ACE exomes and genomes, it uses targeted capture to pull out subsets of the genome that are not covered well by standard sequencing, such as areas with high GC content, known structural variants, repeat sequences, homopolymers, or compressions. Some of these involve optimized sample prep methods, for example to capture GC-rich regions.

ACE exome sequencing also includes biomedically important areas outside the exome, for example non-exonic variants linked to disease, drug metabolism, regulatory regions, structural variant junctions, and intronic splice sites. One example is a SNP in a promoter region upstream of a gene that is important for dosing the blood thinner warfarin. While the bulk of the sequence data is generated on the HiSeq platform, the company currently uses the MiSeq to fill in the gaps and is considering long-read technologies, such as Pacific Biosciences and Oxford Nanopore, as those technologies improve.

Personalis researchers have tested the accuracy of the Illumina as well as the Complete Genomics platforms extensively by sequencing the same sample – a female HapMap sample from a three-generation CEPH pedigree – multiple times, both in house and through Illumina's and Complete's services. The goal of that effort, which generated more than 600 gigabases of data in total and cost "hundreds of thousands of dollars," West said, was to generate a "ground truth" genome to compare different technologies against. They found that both platforms missed some true variants and that two Illumina datasets generated from the same sample did not completely overlap.

West said the company used those results to identify "problem areas" in the human genome that consistently yield high error rates on the Illumina platform due to the nature of the technology, and to resequence those areas separately with other technologies. That, he said, is more economical than sequencing each genome entirely on two different platforms. "Each technology has certain things it does well and certain things that it does not do so well … and the weak points don't overlap with each other," he said.

Sequencing accuracy needs to be solved, he said, before genome sequencing can be used clinically on a broad scale, "because a single SNP can make a difference in medical diagnosis." And while accuracy can be improved by stringent filters, this also means that large numbers of true variants are filtered out, which increases the false-negative rate.
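The trade-off West describes can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the site names, quality scores, truth set, and the `filter_outcome` helper are made up for illustration and do not reflect Personalis' data or pipeline.

```python
def filter_outcome(calls, truth, min_quality):
    """Count false positives and false negatives after a quality filter.

    calls: dict mapping a variant site to its (hypothetical) quality score
    truth: set of sites that are real variants in the sample
    """
    kept = {site for site, quality in calls.items() if quality >= min_quality}
    false_positives = len(kept - truth)   # kept calls that are not real
    false_negatives = len(truth - kept)   # real variants that were dropped
    return false_positives, false_negatives


# Hypothetical ground truth and call set, for illustration only.
truth = {"chr1:100", "chr1:200", "chr2:50", "chr3:75"}
calls = {
    "chr1:100": 60,
    "chr1:200": 25,
    "chr2:50": 40,
    "chr3:75": 15,
    "chr4:10": 20,  # not in the truth set: a false call
    "chr5:5": 10,   # not in the truth set: a false call
}

for threshold in (0, 20, 30, 50):
    fp, fn = filter_outcome(calls, truth, threshold)
    print(f"min quality {threshold}: {fp} false positives, {fn} false negatives")
```

In this toy data, raising the threshold from 0 to 30 removes both false calls but also discards two of the four true variants, which is the false-negative cost described above.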

For the data analysis, Personalis has developed its own alignment and variant-calling pipeline, which uses both publicly available and proprietary algorithms and methods. The pipeline includes a proprietary database and method for structural variant calling, and an "enhanced" human reference that replaces the minor allele with the major allele at more than a million loci.
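The article does not describe how the "enhanced" reference is built. As a rough sketch of the general idea only, assuming per-site population allele frequencies are available, a major-allele reference could be derived as follows. The function name, the toy sequence, and the frequencies are all hypothetical.

```python
def major_allele_reference(reference, sites):
    """Return a copy of `reference` carrying the major allele at each site.

    reference: reference sequence as a string
    sites: dict mapping a 0-based position to a dict of
           single-base allele -> population frequency (assumed input)
    """
    bases = list(reference)
    for position, frequencies in sites.items():
        # The major allele is the one with the highest population frequency.
        major = max(frequencies, key=frequencies.get)
        if bases[position] != major:
            bases[position] = major
    return "".join(bases)


# Toy example: the reference carries the minor allele G at position 2.
reference = "ACGTACGT"
sites = {
    2: {"G": 0.3, "A": 0.7},  # reference base is the minor allele: replaced
    5: {"C": 0.9, "T": 0.1},  # reference base is already the major allele
}
enhanced = major_allele_reference(reference, sites)
print(enhanced)  # ACATACGT
```

This sketch handles single-base substitutions only; indels would shift coordinates and need separate handling.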

Personalis' annotation engine uses more than 30 databases, both public and proprietary, to ensure high accuracy. "Potentially, we are going to be making clinical decisions based on this data; every step in this process has to yield good results," said Richard Chen, the company's CSO. The engine includes annotation tools for structural variants, he noted, which sequencing projects have often ignored so far but which are widely used in array CGH-based diagnostics.

For its proprietary databases, the firm has exclusively licensed from Stanford the Varimed and Mendel DB databases of disease-associated human variants, and has been curating them further, combining them into the Personalis Disease Variant Database, which currently contains more than 600,000 variants associated with both complex and Mendelian phenotypes.

The firm also took an exclusive license to PharmGKB, a database built by one of its co-founders at Stanford that links genetic variants to drug toxicity, dosing, and efficacy. The company plans to add to it over time, calling it the Personalis Pharmacogenomics Database, and might use it as the basis for future diagnostic assays, West said. Researchers can still use PharmGKB, which continues to be maintained at Stanford, for their work, he said, but anyone wanting to incorporate it into a commercial product now needs a sublicense from Personalis.

A third database licensed by Personalis from Stanford is Regulome, which contains several hundred thousand putative transcription factor binding sites and regulatory regions. This will help the firm interpret mutations in non-coding regions that may affect gene expression or function.

West said that the company uses these databases not only for annotation but also as an upfront tool to design capture assays for sequencing projects, depending on their goal, so that nothing important gets missed. "If you don't have the data, no amount of algorithms is going to fix the problem," he said.

Personalis' first customer is a group of Stanford neurosurgeons who wanted to find the genetic cause underlying a disease called moyamoya, where arteries in the brain become constricted, causing small blood vessels to form around them that show up as "puffs of smoke" on CT scans. The team had developed a surgical technique to bypass the constriction and was interested in developing a genetic test that would be able to pick up relatives at risk in families with affected individuals. Personalis sequenced the genomes of 130 patient samples, matched them with appropriate controls, and found several "interesting candidates," West said, adding that the Stanford clinicians are planning to publish the results in the near future.

Since then, Personalis has signed a contract with a currently undisclosed customer for a research project that will involve sequencing more than 1,000 whole genomes, which it plans to announce soon.

It is also in discussions with a West Coast-based hospital system that is interested in rolling out clinical genomic tests in several hospitals, starting with pharmacogenomics. That project – if awarded to Personalis – would initially involve exome sequencing of several hundred patients, West said.