Was the Human Genome Project a success? Yes! But also no. Genetics has become an effective tool to understand, diagnose, and treat disease, but it remains a much less reliable way of predicting it. And risk prediction was exactly what the organizers of the Human Genome Project had promised: you were going to walk into your doctor’s office and get an overview of your lifetime risk for heart disease, depression, cancer, diabetes, and other common diseases. Today, for the most part, that goal remains out of reach.
What makes risk prediction so complicated is that common diseases are complex and polygenic — many genes and many non-genetic factors contribute to who does and who does not get the disease. There are single-gene variants that can greatly increase your odds of getting sick, like BRCA variants with breast cancer or familial hypercholesterolemia variants with coronary artery disease (CAD). Single-gene scenarios are less common but easier to identify, and we have tests to look for gene variants with a big impact on an individual’s risk for cancer, heart disease, or other diseases.
This is useful for those who test positive, but it doesn’t speak to the risks for the rest of us, although many of the rest of us will get sick. Most people who have cancer, diabetes, or heart attacks cannot trace their bad fortune to a single gene. To the extent that genetics contributes to their health outcomes, it is a result of hundreds or thousands of genes, none of which plays a decisive role. Our inability to quantify or define this diffuse background risk is one way in which the Human Genome Project has not lived up to the hype.
Now, in what may be a game changer for the use of genetics in medicine, a series of new studies has introduced the possibility that we can calibrate polygenic risk in a way that may be meaningful for individuals by creating polygenic risk scores (PRS). In the field, some see PRS as a breakthrough and others see it as a distraction, drawing money and attention away from well-established preventive health targets like diet and exercise.
The fact that multiple genes are involved in complex disease wasn’t a surprise — it is the very definition of complex disease, which is caused by the cumulative and interacting effects of genes (plural), environmental influences, and luck. But two things about polygenic disease risk were a surprise to researchers: how many genes are involved, and what a tiny contribution to overall risk each gene represents. Geneticists went looking to identify every voice in the choir and found it was more of a flash mob; accounting for all participants has taken far longer and required much bigger sample sizes than some overconfident early estimates had anticipated.
Case in point: CAD. There are single-gene mutations that radically increase blood cholesterol and risk of heart attack. Approximately 1 in 250 people carry a gene variant that causes a severe form of CAD called familial hypercholesterolemia (FH). All of the people with this variant get CAD, but so do lots of other people. Most people with CAD don’t have an FH variant. To assess these other people, a group of researchers in the lab of Sekar Kathiresan at Massachusetts General Hospital in Boston developed a rating scale that uses data from 6.6 million places in the genome. Tested in a sample of over 400,000 people from the U.K. Biobank, the ratings produced a fine example of the traditional bell-shaped curve, with the 5 percent in the top range three and a half times more likely than the average person to get CAD. Those at highest risk were six times more likely to have had a heart attack by age 55.
These numbers aren’t the trivial risk adjustments normally associated with polygenic disease risk estimates. While the vast majority of people fall somewhere in the big middle, outliers on the far ends have a markedly increased or markedly decreased risk of CAD. Statistically, the increase in risk is just as dramatic as that associated with single-gene familial disease. What’s more, reports Kathiresan, the CAD data are a model, not a fluke. His group is using the same methodology to develop PRS for obesity, breast cancer, and other conditions where multiple genes are in play.
How long until a PRS is a part of your annual physical? That’s a tricky question. The key is establishing medical utility. Can the information be used to prevent CAD, early heart attack, or cancer? If the only advice we have to give is to eat sensibly and exercise, do we really need to stratify the population in order to give out universal truths? Will genetic information change behavior, when nothing else has? Will using drugs or other therapies preventively work in this population? In other words, is it useful? Nobody knows the answer yet, and it is literally a billion-dollar question. PRS testing is not wildly expensive, but incorporating both testing and follow-up into routine care for everybody is a significant expenditure.
That’s a valid public health debate, but no matter how it is resolved, the success of the PRS experiment is significant news. To create a PRS, you take every gene associated with an increase or decrease in risk, no matter how tiny, and assign a positive or negative score. Then you sum it up. Many people, myself included, were skeptical that this sort of additive model would ever work. Because genes interact, it was possible that a gene might increase risk in some circumstances and decrease it in others.
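For readers who like to see the arithmetic, here is a minimal sketch of that additive recipe. The variant names and weights are invented for illustration. A real score would draw its weights from large association studies and cover millions of sites rather than three, and nothing here is meant to reproduce any research group's actual method.

```python
def polygenic_risk_score(genotype, weights):
    """Add up weight * (copies of the variant carried) across all scored variants.

    genotype: maps a variant name to the number of copies carried (0, 1, or 2).
    weights:  maps a variant name to a small positive or negative weight.
    Variants missing from the genotype count as zero copies.
    """
    return sum(weight * genotype.get(variant, 0)
               for variant, weight in weights.items())


# Hypothetical weights: positive raises the score, negative lowers it.
weights = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 0.125}

# Hypothetical person: two copies of a, one copy of b, none of c.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(person, weights)
print(score)
```

Note what the sketch leaves out: there is no term for one variant changing the effect of another. That omission is precisely why a purely additive score looked unlikely to work.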
If we can find a way to translate knowledge into disease prevention, the PRS may turn out to be the most important obscure bit of science jargon you learn this year. But I have a couple of caveats. First, what the PRS appears to do well is to identify the outliers at highest risk: the top 5 percent, 2 percent, and 0.25 percent. It’s in this outer range that risk estimates deviate significantly from normal. It is less impressive as a means of separating the 25th percentile from the 75th percentile. Everyone under the bell is pretty much in the same boat: your genetic inheritance is neither necessary nor sufficient to cause disease. For these people, diet, exposures, and other life events are probably going to be more important than the effect of genes.
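To get a feel for how far out those outliers sit, here is a rough sketch that treats scores as an idealized bell curve with mean 0 and standard deviation 1. That is a simplifying assumption for illustration, not the published distribution. It reports only where each percentile falls on the curve, not anyone's actual disease risk.

```python
from statistics import NormalDist

# Assumption: scores follow a standard normal (bell-shaped) curve.
bell = NormalDist(mu=0.0, sigma=1.0)

# The middle of the pack (25th, 75th) versus the top 5, 2, and 0.25 percent.
percentiles = [0.25, 0.75, 0.95, 0.98, 0.9975]

# inv_cdf gives the score below which the stated share of people fall.
cutoffs = {p: bell.inv_cdf(p) for p in percentiles}

for p, z in cutoffs.items():
    print(f"{p:.2%} percentile: {z:+.2f} standard deviations from the mean")
```

On this idealized curve, the 25th and 75th percentiles sit only about two-thirds of a standard deviation to either side of the average, while the top 0.25 percent begins roughly 2.8 standard deviations out.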
Finally, here are some generic warnings about probabilistic information. As useful as it may be, it’s got inherent limitations and dangers. We have to be thoughtful about labeling people as “at risk” because it can be stigmatizing and self-defining. This might not leap out when discussing PRS for CAD, but will be front and center when discussing PRS for mental illness. Or when using this technology to rank embryos. But genetics seems to deal almost exclusively in double-edged swords, and the leading edge of this sword may save some people from heart attack and stroke.