INCY Interview:
TMF: Can you briefly tell us what the business model of Incyte is? What are Incyte's products, how does it make money?
Whitfield: Incyte is the world's leading genomic information company, and there are basically three fundamental value propositions from that. One is our ability to basically pay our own way by selling subscriptions to the databases. Second, we've developed the largest portfolio of genomic patents. We already have small revenue from that, but those patents are going to be around for the next decade or so, so we think they'll be fundamental to many of the diagnostic and drug discoveries that are going to be made out of the genomic revolution. And last but not least, we are also investing heavily in our own drug and diagnostic discovery programs, which we've been doing for two or three years ... and developed additional intellectual property and discoveries in that area. Now that we've raised $500 or $600 million last year, we've got all our options open in terms of how we choose to commercialize that - whether we partner that with some of the big pharma (we've got no trouble having conversations with them, since nearly all of them work with us anyway) or co-develop them with antibody companies or ***.
TMF: Great. That sounds like a good outline for part of our discussion. Let's look at the databases first and let's start with the LifeSeq Gold Human Sequence Expression database. What information does that give customers?
Whitfield: It's the world's largest database of gene transcripts. Do you know what a gene transcript is?
TMF: Well, why don't you explain for our readers?
Whitfield: Yeah, because I think this is a key issue, particularly with a lot of the stuff that's being published today by people with the Human Genome Project. At the end of the day, what you want to know is how the genome codes for proteins - you know, the things that float around in your body and are digesting your lunch right now and everything else. But if you do genomic sequencing like the Human Genome Project, you don't get the transcripts themselves, in other words, the way the genes are strung together in order to produce the transcripts. And so you've got a database of gene transcripts.
TMF: Now by transcripts, are we talking messenger RNA [mRNA]?
Whitfield: Yes, and how they can be spliced together in different ways from a given gene. You're much closer to biology. If you are just working with genomic information, you actually don't even know how the pieces of the genes fit together. Most genes have more than one piece, and they get stitched together in different ways, so you're quite a long way off from even first knowing what the transcript is and then ultimately guessing at what the protein is. Plus, you actually don't get the physical clone. So another advantage or value to our database business is that all of our customers get access to the clone warehouse of the world, which is in St. Louis. We've got 20 million clones under management, so for any genes that are in the database, you can call us and we can send you the clone overnight. Again, from the Human Genome Project data you don't have that.
TMF: Well, let's talk a little bit about the clones. When you say 20 million, you say you've got about 100 thousand unique cDNA clones. Is that right?
Whitfield: No, we have far more clones than that, but not just human. We've got rat, dog, sheep, geese, many plant organisms - you know, all of the things that many medical researchers would want to use.
TMF: Right, and it covers also things they get in the PathoSeq database or the ZooSeq?
Whitfield: Yes, the ZooSeq. Now in many cases, one of the holy grails right now on the human side is to get a full length clone for every gene. It's going to be quite some time before anybody has that .... But obviously that's a key thing. If you've got a full length clone, then you can insert it into expression vectors and produce the protein. That's why some people order those from us. Once you've got the protein, you're off to the races for drug screening and raising antibodies and all that.
TMF: Is the clone business a big part of your business?
Whitfield: No. It's actually part of the LifeSeq subscription, so it's like an all-you-can-eat menu if you will, but we do sell individual clones. That's something we started last year, and there's a line item we call custom genomics which grew over 50% last year, and an important piece of that is the individual clone business. So we view ourselves as sort of the genomic infrastructure. You know, that's why almost everybody works with Incyte: everybody needs a comprehensive database of gene transcript information, everybody needs some clones, and so on.
TMF: Right. Now another term that some of our readers are familiar with is EST or Expressed Sequence Tag. That's what we are talking about here when we're talking about this?
Whitfield: Increasingly no. In the early days when we started, what Incyte would do is EST sequencing and then, using bioinformatics, try to stitch them all together into a full length gene, and we did have some success with that. I'd say about two years ago we started moving away from that to a high-throughput full length cloning approach. That is where we've got a piece of the gene and we know from a small part of it that it's interesting - it's a GPCR or an ion channel. Then we put it into this full length cloning approach in St. Louis and go directly to the full length clone, and then we do what's called “shotgun sequencing” on that to determine the sequence of the full length clone. So when people say EST sequencing, they usually mean what inside *** *** was doing in the early 90's, where you got all the pieces and then tried to use bioinformatics to stitch them together. What we do now is we just need a piece of a gene, enough to identify what class it's in. When we know it's an interesting class, we'll put it at the head of the full length cloning queue. Then, when we've got the full length clone, we'll go back and chop it into little pieces and sequence it all. So it's a bit like the way people talk about shotgun sequencing for the human genome, which you've probably heard of; it's obviously easier to do on full length genes because they are much shorter. So that's the whole basis of this big new project we initiated with Pfizer over a year ago.
TMF: Right, could you talk a little about that?
Whitfield: Basically, Pfizer is sitting there, and they're our oldest customer, and they're seeing that Incyte knows how to get these full lengths and, not only that, we know how to get patents on them as well. They know that for clones that come through the database part of our business, they automatically get a license for a low royalty. That's quite encouraging for Pfizer as a big drug company: they know that full length gene patents are issuing to a number of people and that we're the leader, so they'd want Incyte to get as many as possible, because by virtue of the LifeSeq agreement they are guaranteed a license at a low royalty. That's a much better scenario for them than the prospect that somebody else might get the full length genes, with the potential that they might be shut out of doing research on some of the important genes in the genome.
TMF: Right. Now that brought to mind a question about the patents. When you were talking, one of the things that's very clear is that you have the largest full length gene patent estate by far. Over 500, right?
Whitfield: Yes. Not only that, we've all got to be accurate now in this world of genomic word confusion. These are all full length gene transcript patents, so they relate to a full length gene transcript which can be converted, as I mentioned, into a protein using conventional technologies. That is as opposed to computer-derived partial gene sequence information which is derived from Human Genome Project data, and trying to get a patent on that. So one of the controversies here is people are saying, well, that's really difficult, right? You're using computers to predict a piece of a gene and you don't have a clone. Why should you get a patent on that? We haven't yet been tarred with the ***, whereas a full length gene transcript patent is quite a different animal.
TMF: Right. I mean, these are patents on gene transcripts for what we would assume would be therapeutic proteins.
Whitfield: Yes. All potential drug targets, all potential diagnostics. Because we've sequenced from thousands of cells and tissues, we have an algorithm we call "guilt by association." What it basically does is this: give me the top 50 genes that are the most highly specific to prostate cancer, right? In other words, *** *** prostate tumor samples, I don't see them anywhere else, right? Obviously that might have some diagnostic utility. For example, when we do that, we have, I think it's 52 genes that are more highly specific than prostate specific antigen, or PSA, which as you know has been a very exciting diagnostic tool for people and has also saved a lot of lives.
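The kind of query described in that answer (rank genes by how specific their expression is to one tissue) can be sketched as below. This is only an illustration under stated assumptions: the gene names, tissue names, counts, and the scoring rule (share of a gene's observed transcripts that come from the tissue of interest) are all hypothetical, and nothing here is taken from Incyte's actual algorithm.

```python
from typing import Dict, List, Tuple

# Hypothetical expression counts: gene -> {tissue: transcript count}.
# Gene names, tissues, and numbers are invented for illustration only.
EXPRESSION: Dict[str, Dict[str, int]] = {
    "GENE_A": {"prostate_tumor": 90, "liver": 0, "brain": 0},
    "GENE_B": {"prostate_tumor": 40, "liver": 40, "brain": 20},
    "GENE_C": {"prostate_tumor": 30, "liver": 5, "brain": 5},
}


def specificity(counts: Dict[str, int], tissue: str) -> float:
    """Fraction of a gene's total observed transcripts found in `tissue`."""
    total = sum(counts.values())
    return counts.get(tissue, 0) / total if total else 0.0


def top_specific(
    expression: Dict[str, Dict[str, int]], tissue: str, n: int
) -> List[Tuple[str, float]]:
    """Return the n genes most specific to `tissue`, highest score first."""
    scored = [(gene, specificity(counts, tissue)) for gene, counts in expression.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# "Give me the top 2 genes most specific to prostate tumor samples."
ranking = top_specific(EXPRESSION, "prostate_tumor", 2)
for gene, score in ranking:
    print(f"{gene}: {score:.2f}")
```

In this toy data, a gene seen only in prostate tumor samples scores 1.0, while a gene expressed broadly across tissues scores low, which is the sense in which a gene could be "more highly specific" than an existing marker.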
TMF: Right. Naturally, you know I'm going to ask this - an announcement by the consortium and Celera Genomics (NYSE: CRA) that there might actually be somewhere around 30 thousand genes as opposed to anywhere as high as 120 thousand. Does this have any impact on the information that any of your patents are based on or is this just irrelevant?
Whitfield: No, but the reason is not because it's irrelevant. It's because people, and you know, we've been as bad as everybody, so we've got to accept some of the blame, I think we've all been sloppy about what we mean when we say gene.
TMF: Well, journalists haven't done much better.
Whitfield: Well, so what they should be saying *** *** or gene ***. In other words, there are 35 to 40 thousand gene ***. I don't know if you are familiar with this - there's this great quote I was just looking at in the Nature collaboration. Let me just read this to you. This is really *** *** published today that there appear to be about 30 to 40 thousand protein coding genes in the genome. Notice how they say protein coding. In other words, each gene can actually code for more than one protein.
TMF: Sometimes five to ten, right?
Whitfield: Yes, exactly. So the 100 to 150 thousand number that we use is the number of gene transcripts; we're saying that's basically the number of proteins, the number of different ways the pieces get stitched together. So on average, it's round about three, maybe a bit more, gene transcripts per gene.
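The "round about three" figure follows from dividing the transcript range by the gene range quoted in the interview. A minimal check of that arithmetic, using only the numbers mentioned above (the midpoint calculation is an added assumption, not something stated in the interview):

```python
# Approximate ranges quoted in the interview.
TRANSCRIPTS_LOW, TRANSCRIPTS_HIGH = 100_000, 150_000
GENES_LOW, GENES_HIGH = 35_000, 40_000


def transcripts_per_gene(transcripts: int, genes: int) -> float:
    """Average number of distinct transcripts per gene."""
    return transcripts / genes


# Fewest transcripts spread over the most genes, and the reverse.
lowest = transcripts_per_gene(TRANSCRIPTS_LOW, GENES_HIGH)
highest = transcripts_per_gene(TRANSCRIPTS_HIGH, GENES_LOW)

# Midpoints of both ranges (an assumption made here for illustration).
midpoint = transcripts_per_gene(125_000, 37_500)

print(f"{lowest:.2f} to {highest:.2f} transcripts per gene, about {midpoint:.2f} at the midpoints")
```

The range works out to roughly 2.5 to 4.3 transcripts per gene, which brackets the figure of about three given in the answer.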
TMF: Right. The gene, for each protein that it codes, it has to have a different mRNA, right?
Whitfield: Well, the thing is that most genes, when they are in the genome, are not all in one place. They have what are called exons. So let's say a gene has six exons; one transcript made out of that may only have the first four exons. Another transcript may have all of them but the first one. Now that leads to a completely different protein, but they all come from the same gene. Does this make any sense?
TMF: Yes.
Whitfield: So this number has been almost grotesquely misquoted by everyone today, because everybody is saying, oh, there were all these predictions that there were going to be 100 thousand, now it's only 40, right? Whereas, certainly to my knowledge in our case, what we've been saying is that there are over 100 thousand gene transcripts, and that's totally consistent with a number of 35 to 40 thousand gene ***. So your question was, does it affect our business? No.
TMF: What you patented, your information is still your information. It doesn't change.
Whitfield: Not only that, this number is actually consistent with what we've been saying about how many transcripts ***
TMF: Roy, if you'll be patient with me, I'm going to go through this once more with you because it's absolutely key.
Whitfield: Well I'm glad to hear you say that.
TMF: I don't want to be another one of these people that throws the terms around loosely.
Whitfield: You know, I've gone on record as saying that the number one issue that's been misunderstood by a ***, the rest of my family, my neighbor across the fence, throughout this whole year is the difference between a gene and a gene transcript.
TMF: What I've been working on is to try to get people to understand. I've at least gotten as far as understanding the importance of protein therapeutics. You know, when I talk about Human Genome or ZymoGenetics or Genentech or you, which are the four companies with, as I understand it, the patent estates in these areas. But let me see if I get it. Each gene can code for more than one protein, we know that, and genes behave differently in different places in the gene, and this relates to the exons.
Whitfield: Let's take those two examples. I'm going to give you a simple example here. Let's just assume a gene had ten exons and that each exon had 100 bases, or nucleotides, and therefore this was a thousand base gene. One version of that gene would be just to have the first 900; it might be in the brain, but when it gets transcribed, the last exon gets missed. Now a protein is a three-dimensional structure, so the folding depends on how bits of different parts relate to each other. The fact that that last exon is missing may cause the protein to fold in a completely different way. It would have been better if I'd just said let's imagine that a middle exon was missing. If it's folded in a different way, it's a completely different shape, and if it's a different shape, it could have a totally different function in the body. And certainly if you tried to develop a drug to interact with that protein, it would probably be a completely different drug, because that all depends on the three-dimensional conformation. So if you're a drug company, you're trying to use gene sequence information to identify drug targets and move on to drugs. What you want to have is a gene transcript database, and having the genomic sequence, if you want to get into that, is of marginal value over having the gene transcript information.
TMF: So the gene's ability to produce different transcripts depends on which exons are getting transcribed.
Whitfield: It generally happens that it's dependent on the cell or tissue, because every gene is in every cell in your body, in the nucleus. In that example, those ten exons are in your liver and they're also in your cornea.
TMF: Oh, this is expression.
Whitfield: Yes, right. No, it's more than expression. The thing is, the form of the gene that's expressed in the cornea might be completely different from the transcript in the liver; that is often the way it happens. The key point, I think, is the one I just made, which is that in terms of trying to solve a medical problem, this is the information you really need.
TMF: And this is why everybody is saying "it's the proteins, stupid."
Whitfield: Yes. "It's the proteins, stupid," but with the gene transcript, again, you are a lot of the way there. You can very much predict the structure of a protein, or you can do a lot. You'd certainly know the whole amino acid sequence, and then you could do a lot with today's tools to predict the 3-D structure of a protein from the gene transcript information.
TMF: Does the Proteome acquisition figure in here anywhere?
Whitfield: Absolutely. We feel that with our gene transcript information and Proteome's ability to annotate whole genome databases at the proteome level, we'll be able to provide a product which is essentially a description of the human proteome, and that should be pretty interesting for a bunch of reasons.
TMF: When do you think that there will start to be products from that purchase?
Whitfield: Well, we're already starting to feed a lot of the exciting *** *** out of the Pfizer program - *** proteome. That's already happening.
TMF: Okay, I see. That gets into a whole bunch of questions people had. What are typical contract terms for the databases?
Whitfield: It varies. It could be $2 million a year all the way up to $15 million, depending on the size of the company, the number of databases you get, and the royalties you pay. Obviously, the higher the royalty you pay, the lower the subscription. Some smaller companies may have restricted intellectual property rights, like they can only pursue - we just did a deal with a company called Senomyx applying the Incyte products to the areas of taste and smell, so that's the area that they get to work on. So we can slice and dice it in a broad array of ways.
TMF: So nobody is getting it free in exchange for exorbitant royalties?
Whitfield: I wouldn't call it exorbitant. If you won't say I agreed with you, but we do have a number of academic deals.
TMF: Right, where it's more favorable that way.
Whitfield: Where it's free. We actually share the intellectual property rights.
TMF: Actually right, I wrote about a couple of those.
Whitfield: Yes, I saw that. That's the other end of the spectrum.
TMF: Yeah, that makes sense.
Whitfield: Going back to your first question about business model, you can see that these agreements actually are not *** to cover all three aspects of the business model. Those academic deals don't generate any revenue for us, right now anyway. But they're all about generating intellectual property which we may license for royalties or which we may choose to develop downstream ourselves.
TMF: Are they multi-year or are they year by year - the subscriptions?
Whitfield: It used to be that they were all multi-year, but now they are kind of all over the map. Some are multi-year and some are just three-year with renewals. We've been on a mission during the last year. We have a slogan, which we call "gene transcripts everywhere," which is to get our database out broadly beyond the pharmaceutical industry to academics. Get them out to Motorola, financial ***
TMF: For *** ***
Whitfield: Yeah, get it out there because it is the definitive database of gene transcript information.
TMF: Once they've got it, how do you add value to keep them hooked over the years?
Whitfield: We keep adding more and more information. Some of the newer versions of our databases have gene expression information in them as well, some *** data, protein expression information, and now we'll have some of Proteome's information, so each successive set of genomic information is more complex. Ultimately we'll start to add chemical information - you know, which chemical entities interact with different proteins. So we'll look back and say, hey, a gene sequence database - from a data processing point of view, that was pretty lame. You know, every single gene expression experiment we do now has ten thousand data points, and there are thousands of those.
TMF: Right. It keeps increasing over time.
Whitfield: Yes, so what that means is the business becomes a scale business in which we have to keep investing heavily in processing. We're going to have $40 or $50 million worth of depreciation next year alone, which is unusual for a biotech company, and that's because of the high capital cost of our business. But what it also means is that it almost guarantees the market will be outsourced, because people are going to say, well, why would I spend the same amount of money to generate all that information and do the data processing myself when they can do it and amortize the cost over multiple subscribers?
TMF: Sure, and you're getting too big for big pharma just to buy, and certainly all our biotechs have no other option.
Whitfield: Yes, from a gene transcript side, there really is *** ***. An interesting development in the last year, though, is that it's starting to come into focus for many of the big companies in information technology that genomics is really going to revolutionize healthcare. With healthcare being a massive piece of the economy, it's come into focus for companies like Motorola, Agilent and IBM that genomics is an interesting area for them to be involved in. We've certainly been involved in a lot of that last year, and that was a big change.
TMF: Well it'll be fascinating to see what happens to the microarrays when you have Corning and Agilent and Motorola all diving in chasing what they think is going to be a big market for them.
Whitfield: That's a great development, actually, for an information company like us, because it means that the cost of producing microarray information is going to go down and the quality of the information is going to go up, because all these guys are investing a lot of money in it, and it means you can produce bigger and better databases.
TMF: And you can sell to all of them. You don't have to pick the winner.
Whitfield: Exactly, that's the strategy.
TMF: What path to profitability do you see for Incyte? You've got a lot of cash, you're investing heavily in R&D - do you see when you'll be turning cash flow positive from operations?
Whitfield: *** *** We've projected, I think, guidance that from a cash EPS point of view we'll pretty much break even next year, with about a $50 million paper loss after depreciation and so forth, but we haven't given any more guidance than that. It's a legitimate question; that's just why I'm being careful about ***
TMF: I understand - I have to ask it but I know what you have to say. Okay, some more questions that were specifically asked by some of our readers. Are there currently any drugs in clinical trials from database subscribers, you know the kind of drugs where you'd be getting royalties in the future if they were approved or are there any about to go into clinic?
Whitfield: To our knowledge, there are none at this time. The reason we say that is because we would expect to receive a milestone when that happens. A number of our *** have told us informally, though, that they expect that to happen this year.
TMF: In these cases, is it contractually permissible that the milestones can be made public?
Whitfield: We would take the position that it's a material event. The amount of detail we'll be able to say about it, you know, like exactly what gene it is, that will depend on… Our partners vary all over the map. We've got almost all of the big drug companies. Some of them are very secretive about these types of things, but others, in fact, we're starting to see more of it, want to tell their investors about what they are doing in genomics.
TMF: Right because they're all saying, where's your pipeline, how's it filling…
Whitfield: So there are a couple of situations where we're hopeful we're going to get some good cooperation to make the sort of announcement that our investors are looking for.
TMF: Right, at least say it happened. What's the most overrated concept today in genomics? You want to take a crack at that one?
Whitfield: Overrated by who?
TMF: Overrated by others.
Whitfield: I don't think we need to go much further than what's going on today, which is the value of large scale human genomic sequencing. I mean, we could have easily sequenced the human genome here at Incyte, actually before anybody else.
TMF: Like your MegaBACE 1000s.
Whitfield: We could never figure how to get it to *** ***. In fact, here's a story for you. We actually approached my *** at *** about joining in sequencing the human genome before Celera was heard of anywhere, because we wanted to do it with him in a sort of not for profit way, where we shared the cost, *** *** *** but it was probably about six months after that that Celera was announced. But we just never saw a) the relative value of genomic sequence versus gene transcript sequence, which is what we've talked a lot about today, and b) the fact that the Human Genome Project was going to do most of it anyway. We could just never understand that.
TMF: And you've got also a full offering of software tools that go along with that.
Whitfield: Yes, so it's not like we ignored it. With the Human Genome Project data, you know, we spent a lot of money downloading that data and cleaning it up and putting it through QA/QC, and we've actually used it *** *** ***. I don't want to say it's got no value; it's actually very powerful when it's combined with our database. We can often see, when we've got a piece of a gene and we line it up with the Human Genome Project's data, that it will help us fill the gene out. We also get what's called the regulatory region, which is just outside the gene, and that becomes apparent to us now with the Human Genome Project data. So the integration of that, using it to confirm all our genes, has actually been of great value. It's something our partners really value, given the amount of effort it takes to do all the computational work in dealing with the Human Genome Project data.
TMF: Last thing, drug and diagnostic program - very quickly, where's that going or where do you see that going or operating?
Whitfield: These programs have actually been growing within Incyte for two to three years. We have developed a significant portfolio of intellectual property, diagnostic *** and drug targets. It's actually outside of LifeSeq. There are a number of things we can do. You know, working with an antibody company and putting together a patent first for the antibodies to be made and then, if we're successful, advancing them. I think it's something that we are certainly looking to do this year. We will be hiring a new president of Incyte. As you know, Randy Scott, my co-founder, has moved on. He's become chairman of Incyte. We are looking now for somebody from a pharma or large biotech who's got experience in advancing exciting new targets toward the clinic, and we've got all our options open. It would be nice to do a validation deal with one pharma in this area, where we might partner a specific indication or target, but that's not necessary because of the financial position that we're in.
TMF: Right, you are sitting on a boatload of cash.
Whitfield: We're still kind of wedded to - we believe we could create a lot of value from heading downstream, but certainly in the area of small molecule therapeutics, we still see that as clearly the realm of big pharma for the time being.
TMF: Okay, that's very comprehensive and terrific and I'm most grateful for your time.
Whitfield: Thanks for your questions.