November 16, 2016

What should non-geneticists know about genetics?

There are things that geneticists hardly ever mention when talking to their non-geneticist collaborators, probably because they take them for granted. Stating them explicitly may be helpful for those of you who frequently work with geneticists. I'm assuming that you already know a little bit about genetic variants and association studies.
  1. When it comes to genetic association studies such as GWAS, correlation really is causation. If a genetic variant is associated with a trait, it or a variant close to it causes the trait, assuming it's not a spurious correlation. Because DNA is read-only, it's not possible that the trait causes the variant. This makes genetics different from e.g. gene expression analysis, where the arrow of causation can point both ways. A differentially expressed gene can cause a disease, but the disease can also cause genes to be up- or downregulated. Typically, you have to do follow-up experiments to determine what's going on. Not so in genetics.
  2. Genotyping isn't sequencing. Frequently, people will say something like, "we have sequenced those samples" when they were really genotyped on a chip. The difference is that genotyping chips are cheap ($200 or less per sample) and typically produce data on several hundreds of thousands of known genetic variants. Sequencing is more expensive (more than $1,000) and produces data on almost all the variants in the genome, including those that haven't been observed before. Unlike genotyping chips, sequencing also delivers data on structural variants such as insertions, deletions and copy number variation.
  3. Knowing the causal variant isn't the same as knowing the causal gene. Most of the human genome isn't coding for genes, and it's not clear what it actually does, or if it does anything important at all. The majority of variants that have been associated with traits and diseases are not located in the coding parts of genes either. For those variants it's difficult to tell how they exert their effect. Some that are known to change gene expression are called expression quantitative trait loci or eQTLs. For those variants that aren't eQTLs, people often assume that one of the genes that are encoded in their vicinity is the causal one.
  4. Knowing the gene isn't knowing the effect direction. Even if you know through which gene a variant exerts its effect, you still don't know in which direction the effect goes. Take the example of a genetic variant that has two alleles, G and T. Assume the G allele is the risk allele for a disease, and it's located in the intron of a gene. This does not immediately tell you if decreased gene function is associated with higher or lower disease risk. Again, eQTLs come to the rescue, as they will tell you if the risk allele is associated with higher or lower gene expression, which are reasonable proxies for increased and decreased gene function, respectively.
  5. Genotypes are discrete, phenotypes often aren't. A genetic variant typically has several genotypes. The example variant from the previous paragraph with the two alleles G and T will, in a diploid organism like humans, have three genotypes: G/G, G/T and T/T. It may therefore be tempting to assume that genetic variants are great biomarkers, as they will unambiguously show if the trait associated with the variant is present or not, maybe with heterozygotes being something in between. Unfortunately, this is often not the case, especially for complex diseases that have many variants associated with them. Each of these variants contributes to disease risk only a little bit, and as a result, individual variants aren't very informative.
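
To put rough numbers on point 5, here is a minimal sketch of how little a single variant moves disease risk. The baseline risk of 1% and the per-allele odds ratio of 1.1 are made-up illustrative values, and the log-additive model (each copy of the risk allele multiplies the odds by the same factor) is an assumption, not something taken from a particular study.

```python
def genotype_risks(baseline_risk, odds_ratio_per_allele):
    """Disease risk for 0, 1 and 2 copies of the risk allele G.

    Assumes a log-additive model: every copy of the risk allele
    multiplies the odds of disease by the same factor.
    """
    baseline_odds = baseline_risk / (1 - baseline_risk)
    risks = {}
    for genotype, n_risk_alleles in (("T/T", 0), ("G/T", 1), ("G/G", 2)):
        odds = baseline_odds * odds_ratio_per_allele ** n_risk_alleles
        risks[genotype] = odds / (1 + odds)  # convert odds back to a probability
    return risks


# Hypothetical values, chosen only for illustration
risks = genotype_risks(baseline_risk=0.01, odds_ratio_per_allele=1.1)
for genotype, risk in risks.items():
    print(f"{genotype}: {risk:.2%}")
```

With these assumed inputs, the risk for all three genotypes stays within a fraction of a percentage point of the 1% baseline, which is why a single variant of this kind makes a poor biomarker on its own.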

There's more, but this post is already too long, so I'll save it for another time.

November 10, 2016

Is a third party candidate the solution?

This election was ugly, and the next four years are going to be traumatic. But America will get through them.

Even if Clinton had won, the election would have left scars. Just as Trump is strongly opposed by a portion of the country, she would have been hated by a different portion. True, that portion would not have had a rational reason to be as terrified as many Democrats are now, but that doesn't mean that their views don't count.

Repeating a campaign like this in four years wouldn't do anyone any favors and would contribute nothing to healing the divides. But for any campaign that involves a Democrat running against Trump in 2020, this is what would happen.

Trump has said that he's going to erase the achievements of the Obama administration, and half the country already loathes him. I can't see how a Democrat replacing Trump in four years wouldn't be loathed by his supporters in turn. A large part of the country hating their president, no matter their motivation, and each president trying to erase what the one before them has done, cannot be a healthy state of affairs.

But what if someone who is not particularly objectionable to either camp were to run as an independent candidate? They would get the support of Republicans who stood up to Trump, of moderate Democrats, and of course of independent voters. And most importantly, they would break out of the self-reinforcing cycle of partisan hatred that casts its shadow over America.

September 16, 2014

Is genomics past its peak?

In his excellent blog, Robert Plenge recently asked how far along the hype cycle we are with regard to applying genomics to drug discovery.

The absolute number of genomics papers published in 2014 is likely to be higher than in any previous year. But when normalising by the total number of biomedical papers listed on Pubmed, a search engine for biomedical publications, it seems like we may be past the Peak of Inflated Expectations. The proportion of papers that contain genomics in their title or abstract peaked in 2012.


I found the number of publications each year by searching for genomics on Pubmed. The number of papers for 2014 is an extrapolation based on the number of papers to date.
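
For anyone who wants to repeat the exercise, the two calculations can be sketched as follows. This is only one way to do it: it assumes that papers appear at a constant rate throughout the year, which may not be how the 2014 figure was actually extrapolated, and the function names are hypothetical.

```python
from datetime import date


def extrapolate_full_year(count_to_date, as_of):
    """Scale a year-to-date paper count up to a full-year estimate.

    Assumes a constant publication rate over the year (an assumption
    of this sketch, not necessarily the method used for the plot).
    """
    start = date(as_of.year, 1, 1)
    end = date(as_of.year + 1, 1, 1)
    fraction_elapsed = (as_of - start).days / (end - start).days
    if fraction_elapsed == 0:
        raise ValueError("cannot extrapolate from the first day of the year")
    return count_to_date / fraction_elapsed


def genomics_proportion(genomics_papers, all_papers):
    """Share of all indexed papers that match the genomics search."""
    return genomics_papers / all_papers
```

To compare years fairly, both the genomics count and the total count for the current year would need to be extrapolated in the same way before taking the ratio.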


August 18, 2014

Should every drug have to pass a clinical trial?

If a pharmaceutical company wants to sell a drug, it first has to prove that it works and is safe by doing a double-blind controlled trial. Everything else is quackery.

At least that is what I used to think, but now I am not so sure any more. A major reason for my new-found uncertainty has been Peter W. Huber's book The Cure in the Code. Huber is a senior fellow of the Manhattan Institute, a free market think tank. Whilst his book is political, it raises points that should transcend the political divide. His central aim is to show that the top-down drug licensing process, as it is currently implemented in the United States and other developed countries, is not in the long-term interest of patients.

Consider the rare diseases that we study at the Sanger Institute, where I am a postdoc. In many cases, they are caused by a single mutation disrupting a single gene. Because the mutation is rare, not enough people have the disease to make it profitable for anyone to invest in developing a drug.

Nevertheless, sometimes doctors are able to repurpose a drug that was originally developed for a different disease. For example, if the causal mutation disrupts a gene in a particular biochemical pathway, the doctor may know of a drug that upregulates the expression of another gene in the same pathway, thereby compensating for the mutated gene. That is great when there is a drug that can be repurposed, which in most cases there is not.


This issue is not limited to rare disorders. Complex diseases such as autism or diabetes have a large genetic component, but rather than a single causal variant there are tens or even hundreds. Each patient has a different combination of variants, and it is therefore unlikely that there will be a single drug that works for everyone. This makes a personalised medicine approach necessary, where the drugs that are prescribed are adjusted based on the patient's genotype. As with rare diseases, this means that patients with less common genotypes may lose out because there are not enough of them to warrant the development of a new drug, and because no drug that has been licensed for another disease is likely to help.

The higher the level of proof that is required before a drug can be sold, the lower the number of drugs that pass this threshold will be. Always requiring the highest level of proof may seem like the safest option, but is not necessarily in the public's interest if it means that many drugs will ultimately be unavailable. There is legislation such as the US Orphan Drug Act and a number of accelerated approval rules, but they only partly address the problem. There are still plenty of examples of drugs that may work in a subset of patients, or that may in the future be repurposed, but that are killed off in the licensing process and are therefore not available to anyone.

What is to be done? Creating legislation that allows the use of any compound whatsoever for the purpose of treating people with rare diseases or rare genetic variants could be one option. Of course, such a laissez-faire approach to drug licensing would be controversial and could encourage unscrupulous practices.

Right now, it is unclear to me what the balance between the two extremes of complete permissiveness and complete top-down control of drug licensing should be, but I doubt it is the status quo. I would welcome any views readers of this post may have.

August 3, 2014

How important is health care system quality?

A few weeks ago, I came across an article comparing the health care systems of eleven developed countries. This made me wonder how much health care system quality impacts physical wellbeing, for which I used life expectancy as a proxy. I was expecting a strong correlation between the two, but in fact health care system quality doesn't seem to be that important.

Healthcare system ranking (from best to worst) and life expectancy of eleven Western countries. The correlation (r) is -0.32 (solid line), or 0.06 when excluding the United States as an outlier (dotted line). In either case, the 95% confidence interval of the correlation spans zero, meaning that the data does not support any relationship between the two variables.
Health warning: Do not overinterpret such a small dataset. This is me messing around with a few numbers I found on the internet. I am not a health economist.
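
The confidence intervals can be roughly reproduced from the correlation and the sample size alone. The sketch below assumes a Pearson correlation and uses the standard Fisher z-transformation; since one of the variables is a ranking, a rank correlation may be the more natural choice, so treat this as an approximation rather than the exact calculation behind the plot.

```python
import math


def pearson_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation: atanh(r) is approximately
    normal with standard error 1 / sqrt(n - 3).
    """
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


with_us = pearson_ci(-0.32, 11)     # all eleven countries
without_us = pearson_ci(0.06, 10)   # United States excluded

for label, (low, high) in (("with US", with_us), ("without US", without_us)):
    print(f"{label}: {low:.2f} to {high:.2f}")
```

With eleven countries, the interval for r = -0.32 runs from about -0.77 to 0.35, so it includes zero, and so does the interval without the United States. With so few data points, only a very strong correlation would be distinguishable from no relationship at all.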

May 31, 2014

When are we going to get a 3rd generation sequencer?

Two years ago, there were at least a dozen companies trying to develop DNA sequencing technology to rival incumbents like Illumina, Life Technologies, Roche/454, PacBio, and Complete Genomics (the latter offers sequencing as a service). What has changed since?

Despite numerous optimistic announcements by start-ups (for example, here and here and here and here and here) and investments totaling more than $400 million, there hasn't yet been any great breakthrough. The only exception is Oxford Nanopore, whose MinION sequencer seems to be close to ready for prime time.

The table below lists companies that have said they are developing sequencing technology and that have received funding according to the online database Crunchbase, which tracks that sort of thing:

Company    Funding    Status
           $211.7m    Active
           $58.8m     Active
           $45.5m     Active
           $35.0m     Active
           $22.5m     Active
           $20.9m     Active
           $10.4m     Acquired by Roche
           $5.0m      Active
           $2.4m      Active
           $1.5m      Active
Total      $413.7m


Even though that's a long list, it is not complete. Companies whose funding situation or current status is unclear and who have therefore not made it onto the list are Noblegen, Base 4 Innovation, Electron Optica, the Beijing Institute of Genomics (BIG), Electronic Biosciences, Qiagen's Intelligent Bio-systems, Reveo, and Mobious Biosystems. Doubtless, some of those have quietly exited the race.

Let's summarise: There are a lot of companies trying to develop the sequencer of the future, and at least some of them have received generous funding. Most have been active for at least two years, but we haven't seen any results yet, at least in the form of a sequencing machine we can buy. Clearly, 3rd generation sequencing is a tough nut to crack.

I'd like to thank Keith Robison for pointing out omissions in an earlier version of this post, which I've now fixed.

April 3, 2014

How useful is genomics for drug discovery?

In a previous post, I discussed a number of large sequencing projects, including by the governments of Saudi Arabia and the United Kingdom, which are sequencing the genomes of 100,000 of their citizens each.

I also lamented that these projects are exclusively government-funded, and that big pharma and biotechnology companies don't appear to consider large-scale sequencing a viable approach to drug discovery. Well, actually some of them do.

Three recent developments stand out: 

Regeneron, a large biopharmaceutical company, has teamed up with Geisinger Health Systems, a local American healthcare network, to sequence 100,000 exomes at an estimated cost of $100 million. Geisinger has detailed electronic health records of its patients, which will help with the interpretation of the data.


Amgen, another large biopharma company, announced that it would partner with the Broad Institute, one of the world's biggest genomics research institutes, to discover new drug targets. Prior to that, Amgen had acquired DeCode Genetics, which despite producing first-class science had been struggling.

And finally, the pharmaceutical giant GSK, together with two European genomics research centres, the EBI and the Sanger Institute (where I'm based), launched the Centre for Therapeutic Target Validation (CTTV), which will validate drug targets using a genomic approach.

Overall, it seems that pharmaceutical and biotech companies indeed have an increasing appetite for applying genomics to drug discovery. The question remains how successful this approach will be, and how much of an advantage it confers to the companies that invest in it.