Name: Jorge Conde
Position: Co-founder and CEO, Knome
Experience and Education:
Strategic marketing and operations, MedImmune, 2006-2007
Business development, Helicos BioSciences, 2004-2005
Life sciences venture capital, Flagship Ventures, 2005
Associate, biotechnology investment banking, Morgan Stanley, 2000-2003
MBA, Harvard Business School, 2005
MS, health sciences and technology, Harvard University and Massachusetts Institute of Technology, 2006
BA in biology, Johns Hopkins University, 1998
DNA sequencing technologies have developed a lot since personal genome company Knome first launched its whole-genome sequencing and analysis service in late 2007.
Earlier this month, In Sequence caught up with the company's CEO, Jorge Conde, who was visiting the GenomeWeb office on his way to the premiere of "Faces of America with Henry Louis Gates, Jr.," a new series on PBS that explores the family histories of 12 well-known Americans, using both genealogy and genetics tools. Knome provided genome analysis services for both Gates, a professor of African American studies at Harvard, and his father, who had their genomes sequenced by Illumina. Below is an edited version of the conversation.
To someone who has not heard of Knome, how would you describe the company? Are you a services company, a sequencing company, a healthcare company, an informatics firm?
We are a genetic analysis and interpretation company. What that means is, we are getting access to reams of information in a language that we can read but don't really understand today. What we are aiming to do is to enable people to be able to access that information, and have a partner, or tools, that they can use to hopefully understand it as best as we can, given what we know today.
The first step in the genetic revolution was figuring out how to sequence in a way that's fast, cost-effective, reliable, and of high quality, and because [technology] companies have done so much innovation in this space, they have largely solved that problem. Now, you can actually see a point in time where it will be eminently doable to sequence a vast number of people in a very quick and efficient way.
Our aim was, from inception, to be thinking about that next bottleneck that gets created once that's a reality, which is, 'How are we going to make sense of all of this?' That's who we are.
Last year, you started offering your services to researchers, in addition to consumers. Does most of your business still come from the consumer side right now?
Yes, most of our business is what we were founded on, which is the consumer business, and helping individuals understand more about their own genetic information. But clearly, over the last year or so, because research projects are getting more ambitious, and there is more going on in that space, we have been approached by many groups that we have started working with. And that's actually been very enjoyable as well, because there, you are partnering in discovery.
How is Knome funded? Is the company profitable yet?
Knome is self-funded. We do not disclose our finances.
Who does the sequencing for you?
We have an agreement with BGI [formerly Beijing Genomics Institute], which just announced that they are going to buy a large number of the HiSeq 2000 machines. Most of our sequencing goes to BGI, and they have been wonderful because they have incredible scale at this point, and an incredible amount of experience moving samples through the pipeline.
We also announced an agreement with SeqWright down in Texas [last year] for the reason that some people wanted to have access to some other platforms, and SeqWright has access to 454 machines and the SOLiD platform, and also, that they are CLIA-certified, so to the extent that something needs to be CLIA-certified, that was an option for us.
Also, back in June of last year, when Illumina announced that they were going to be offering sequencing services for consumers, we hooked up with them to offer interpretation for that service.
What is the value of having a CLIA-certified lab do the sequencing?
I think the expectation is that at some point in the future, it may be required to have a CLIA-certified lab. And to the extent that it is, that's why we have that as an option. But today, it's something that, certainly from the federal perspective, is not mandated.
How did your relationship with BGI come about?
We got connected to them originally through [Knome's] co-founder, George Church [a professor of genetics at Harvard Medical School]. He made the initial recommendation, so I went out to China in 2007 to meet with [BGI representatives], we put together an agreement, and they have been supplying us that service since then. They are a wonderful group, really nice, and they have done a wonderful job building up a core sequencing facility that enables us to move samples through the pipeline in a very predictable flow.
Can customers specify who they want to sequence their DNA?
Yes, of course. On that, we are pretty flexible. Generally speaking, consumers don't have a strong preference. Obviously, everyone wants to make sure that it's good data, and it's reliable, but they don't have any specific preferences. That tends to be more researchers, who we work with as well.
Have you considered working with Complete Genomics, which has talked a lot about driving the cost of sequencing down further? Or is BGI competitive with them?
BGI has been amazing in getting scale and becoming more efficient and more cost-effective. But as I said, we are platform-agnostic, so to the extent that new platforms come online, there are more genomes that need to be analyzed, we want to be part of that.
We try to keep an eye on all the different platforms out there. That list of companies that are doing neat or interesting things in the sequencing space continues to get longer and longer, so we have to keep our eyes and ears open.
Do you believe sequencing technology today is good enough to give you all the relevant information about a person's genome?
I think the platforms that are around today are incredibly good, compared to where we were just a few years ago. But there is no question that there is a need for further improvement in the platforms, precisely so we can make sense of things like structural variants, and also making sure that the resolution for the unit of cost is high enough, so you feel much more comfortable making some calls in the clinical sphere.
What's the turnaround time for, say, BGI?
Door to door, it can be as fast as four weeks. It's usually closer to six weeks. But they are always getting faster.
Do they provide the initial analysis?
They will do the assembly work for us. We have our own assembly algorithms, so we can re-assemble. They send us both assembled data and raw reads.
Do you also offer de novo assemblies of the data, now that BGI has worked that out?
We haven't started doing that yet. There hasn't been a need for the folks we work with yet, but that's something we would be open to.
How many genomes have you sequenced so far?
We don't disclose that number. When we launched the service in November of 2007, we set this audacious goal and said 'We are going to do 20,' and people laughed and said, 'You are not going to do 20.' We did 20 a long time ago — we are well past 20. It's not as impressive in 2010 as it probably was back when we did it. Volumes have continued to steadily pick up. As the cost has come down, we have started to do more than just individuals. Now we tend to do families. In fact, I would say, more of our consumer clients today, as they come in, are multiples than singles.
Are these customers where there is a disease in the family?
Many of them are. It's become accessible to people, so they are looking for answers, and this might be a way that we can hopefully help at least shine a light into a corner that previously was dark for them.
Are you thinking about sequencing tumors from patients as well?
We have talked to some groups about doing those projects as well, and we are working on a few things there.
Can you talk about pricing? KnomeComplete — complete genome analysis — currently costs $68,500, and KnomeSelect — an analysis of the exome — costs $24,500 for an individual. How will that develop?
When we launched, our pricing was significantly higher than around $70,000 [Ed.: it was $350,000]. When we launched, sequencing was a big chunk, and interpretation was some additional chunk on top of that. [The latter] number is somewhat fixed for now, but [sequencing costs] have been coming down, and that's what's enabled us to drop our prices.
As the cost of sequencing continues to go down, we will, of course, want to adjust our prices to make it attractive, and to remain competitive. But realistically, the cost of interpretation, the amount of labor that goes into it, the amount of value that you can provide someone, is still significant. So prices will go down, but they won't be following at the same rate as the underlying sequencing cost.
Do you think the market values the interpretation enough? When it gets to the point that the sequencing component is a much smaller fraction of the overall cost, do you think your market will understand that, what they are actually paying for?
I think it's a challenge for us as a company to effectively communicate exactly what we are offering, because in the media, often times you get a very simplistic headline, 'Company announces genome costs $1,000', and people think, 'Wait a minute, this should cost me $1,000.' That's one of the things we spend time on when we talk to people; we explain to them exactly what it is that they get on the interpretation side.
One of the good things that plays to our benefit in this space is that the perceived value of interpretation will increase over time. Because as more is learned, as more is known, as there are more and more examples of how this information is helpful, I think the perceived value will go up while the cost of sequencing is coming down.
How many and what kinds of conditions do you report on?
It has got to be in the thousands. We spent a great deal of time putting together an analysis platform that takes a lot of what's known about genotype/phenotype associations in the literature, and puts it into a database that makes it more dynamic. So we can feed our database platform a unique genome, and it can identify hits that we have annotated in our database and map them onto that genome. That way, we can report on pretty much any number of conditions, as long as it's been reported in the literature.
And then we go one step above that, in cases where there is a variant, or variants, of interest for a specific condition, we will put together reports around that to say, 'Here is what we found, this is why it's of interest, this is the genetic basis of the condition,' et cetera. It's a pretty wide range of conditions and traits and pharmacogenomics associations.
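[Ed.: As a rough illustration only, the lookup described above, in which literature-derived associations are matched against one person's genome, might be sketched in Python as follows. This is not Knome's platform; the schema, the function name `find_hits`, and the example data are all hypothetical placeholders.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One genotype/phenotype association taken from the literature (hypothetical schema)."""
    chrom: str
    pos: int
    allele: str
    condition: str


def find_hits(genome, annotations):
    """Return the annotated associations whose allele is present in the genome.

    `genome` maps (chrom, pos) to the set of alleles observed at that site.
    """
    return [
        a for a in annotations
        if a.allele in genome.get((a.chrom, a.pos), set())
    ]


# Made-up data: the positions and condition names are placeholders, not real findings.
annotations = [
    Association("chr1", 1000, "A", "example condition X"),
    Association("chr2", 2000, "T", "example drug response Y"),
]
genome = {("chr1", 1000): {"A", "G"}, ("chr2", 2000): {"C"}}

hits = find_hits(genome, annotations)
for hit in hits:
    print(hit.condition)
```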
How many conditions do you not report on because of patent restrictions, for example mutations in the BRCA 1 and 2 genes?
We have taken a pretty close look at IP in this space. The IP landscape is as varied as the number of conditions that you could potentially report on. Of all the things that we have looked at, in the vast majority of cases, we feel pretty comfortable that we can report on any association that's in the public domain, in terms of publications, because most of the IP is specific for a specific assay and a specific diagnostic test, and we are obviously doing a more general screen.
In the case of BRCA 1 and 2, that's one where we feel that that is, at the moment at least, protected, so we don't touch that without actually going through and getting a confirmatory test.
Are most of the variants you report SNPs, or do you also have associations for other types of variations?
Most of them are SNPs, because most of the literature has been around those associations. That said, because we have all of the data, if more complex associations have been reported in the literature, and we have gone through that in our database, then we will show that as well. We have cases where we look at certain insertions or deletions that have been characterized. What's more difficult for us is, if we see something that's like an insertion or deletion, and it's never been characterized, how to make sense of that?
How often do you update your database?
The rate of new associations is only going up, so it's an ongoing effort to constantly update that. I don't think, at the moment, we are anywhere near capturing 100 percent, but I think we have made incredible progress in capturing the vast majority of what's been highlighted over the last couple of years.
Do you have a curation team, and how big is that?
We have a team in India and a team in Cambridge that does that, though we don't disclose the size of these groups.
Where do you set the bar for including, or not including, an association?
The rule that we go after is, if it's been replicated in at least one paper, that tends to give us some comfort that it's not some sort of a spurious association. The second one is, we also try to capture as much detail as we can about how the data was presented in the paper, the sample size, and [the type of population studied]. If, for example, we are sequencing an Asian client, and we find a paper where the association has only been shown in a small group of Northern European individuals, we will tend to give that lower priority.
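[Ed.: One way to picture the bar described above is as a filter followed by a ranking: keep associations that have been replicated at least once, then put those studied in the client's own population first. The Python sketch below is an assumption-laden illustration, not Knome's actual curation logic; the field names, the tie-break on sample size, and the example records are hypothetical.]

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Curation details recorded for one reported association (hypothetical fields)."""
    label: str
    replications: int   # papers that replicated the original finding
    sample_size: int
    population: str


def rank(evidence, client_population):
    """Drop unreplicated associations, then rank population matches first.

    Within each group, larger studies come first (an assumed tie-break).
    """
    kept = [e for e in evidence if e.replications >= 1]
    return sorted(
        kept,
        key=lambda e: (e.population == client_population, e.sample_size),
        reverse=True,
    )


# Made-up records for illustration only.
evidence = [
    Evidence("assoc-1", replications=0, sample_size=5000, population="East Asian"),
    Evidence("assoc-2", replications=2, sample_size=300, population="Northern European"),
    Evidence("assoc-3", replications=1, sample_size=1200, population="East Asian"),
]

ranked = rank(evidence, "East Asian")
print([e.label for e in ranked])
```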
Is there a need to standardize how these associations are interpreted? Some people who had their genomes analyzed by different consumer genomics companies found that the same SNPs were interpreted differently.
It will be hard to dictate a standard. I think a standard will emerge over time as to what is the most effective way to do this. This is a new space, and every one of these companies is taking great effort to do this in a thoughtful way. Obviously, opinions will differ as to what's the best way, but I think a consensus will emerge over time. In fact, some of the companies have put out white papers on their methodology, and I think that's a wonderful thing.
Are you considering different tiers of interpretation in the future? Right now, Knome customers meet with several people for an entire day to get their results — sort of a luxury version.
Yes, absolutely. Right now, there have been hundreds of genomes sequenced on the planet, maybe, and that number is going to move to the thousands by the end of this year and next year, and beyond that, it's going to be in the hundreds of thousands, and then after that, it's going to be in the millions. So this is something that is going to scale at an incredibly fast rate.
And of course, as the cost of sequencing comes down, this becomes accessible to an incredibly broad range of people, both on the research side and on the consumer side. For consumers, when the cost comes within their target range, we are absolutely going to have to be able to provide services in a more automated or simplified way, in terms of being able to deliver data remotely, so people don't have to come and spend a day with us. The timing of this is going to dovetail pretty nicely, in that I would not have felt comfortable e-mailing anyone data, or putting up a webpage with somebody's data, at the beginning because we did not know how people were going to react to information, and how people were going to interpret what you are telling them. Now we have gotten a lot more of that practice, we have a much better sense of what works and what doesn't work, and that will always continue to be improved upon, and it will evolve.
But as the cost comes down and the volume picks up, and there is more demand, we need to figure out a way to deliver our services to a greater number of people. We are certainly planning how we are going to do that.
What's the timeframe for that?
It's probably more in the medium term. But we have to be flexible because this space changes very, very quickly. If you would have asked me when we started the company where I thought we would be on Feb. 1, 2010, in terms of costs, I would have been 100 percent wrong [on the high end]. There is no doubt that we will get to the mythical '$1,000 genome' — whether that's the exact number or not. The question has always been when, and how. It's pretty remarkable how quickly things have moved. And from our perspective, that's a great thing, because the more data is out there that people can get their hands on, the more people get sequenced, the more interpretation they are going to need to make sense of it.
Is there a mechanism for your customers to allow their data to be used in research?
One of the first things as a company that we rejected was that idea that just because we sequence somebody, that we now had claim or ownership over their information. So we developed what we think is a pretty innovative platform for managing the data.
When we sequence an individual, we do the assembly and the analysis, we put the genome onto a detachable drive — a little USB stick that we call a 'genome key.' We put the data on with the browsing software application that has a code in it that enables it to receive updates from Knome's central server.
When you plug in your genome key, it calls Knome and sees when is the last time you have logged on, what updates there are, [for example an] update [on gene] ABC1, and it gets pushed down to the genome key. It is sort of like iTunes in that you continue to update without [Knome] having to know what specifically your genomic information looks like.
Because of that architecture, you can build a back-end to that. [Researchers can say] 'I'm interested in gene ABC1,' so you can formulate a query to Knome. Knome pushes that query down to the clients and says, 'Such and such researcher is interested in gene ABC1, looking at this particular position for this purpose, would you be interested in participating in the research?' [So] Knome aggregates all of the ABC1s of people who have actively opted in to participate and then takes that aggregated data and kicks it back to the researcher.
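[Ed.: The opt-in flow described above can be sketched, very loosely, as a query that is sent to every genome key and answered only by owners who have agreed to take part, with just the aggregate passed on to the researcher. The Python below is a minimal illustration under that reading; the class `GenomeKey`, the function `aggregate`, and the genotype strings are hypothetical and do not describe Knome's software.]

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class GenomeKey:
    """Stands in for one client's locally held genome data (hypothetical)."""
    owner_id: str
    genotypes: dict   # gene name -> genotype string, kept on the key itself
    opt_in: bool      # whether the owner agreed to this research query

    def answer(self, gene):
        """Return the genotype for `gene` only if the owner opted in, else None."""
        return self.genotypes.get(gene) if self.opt_in else None


def aggregate(keys, gene):
    """Count genotypes for `gene` across opted-in keys, leaving out owner identities."""
    counts = Counter()
    for key in keys:
        genotype = key.answer(gene)
        if genotype is not None:
            counts[genotype] += 1
    return dict(counts)


# Made-up clients; "ABC1" is the placeholder gene name used in the interview.
keys = [
    GenomeKey("client-a", {"ABC1": "A/G"}, opt_in=True),
    GenomeKey("client-b", {"ABC1": "A/A"}, opt_in=False),
    GenomeKey("client-c", {"ABC1": "A/G"}, opt_in=True),
]

summary = aggregate(keys, "ABC1")
print(summary)
```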
That's sort of the elegant way to do this, in our view, in the way that we are not taking ownership over people's DNA. We filed IP on that, and it's, I think, an important differentiator for us. And from a self-interested perspective as a company, that lowers our burden in terms of building an infrastructure to secure all of this information in a way that's hacker-proof.
There have been certain cases, although nobody has moved forward with this yet, where people have said, 'I'm so interested in this space that I want to donate my genome for public consumption.' We are hearing about that less and less, because there is more and more information out there, but still, that does come up sometimes, and in that case, we have George Church as our co-founder, and who better to think about how to get information out there in a responsible way that people can get access to. That's the whole basis of his Personal Genome Project. That's the second way, and the easier way for people to share information, if they are inclined to do so.
So you keep no back-up copies of the data?
We keep a dry archive copy if we get permission from the individual, but it's not networked, so it's not on a server, it's literally in a third-party facility, in case somebody loses their little genome key. And in certain cases, people have requested that we do that but that they retain custody over access to the dry-docked information.
Does BGI keep any of the data?
No. They are required to delete the information. And frankly, they always call us, 'Are you done with this project? We want to delete.'
Have you had your own genome sequenced and analyzed yet?
I am in the pipeline but have not yet been sequenced. Our pipeline remains full, and our first priority is to deliver for our clients. I will get sequenced and analyzed once we expand our capacity.