Background info
About 3 million people in the U.S. are currently living with prostate cancer. The U.S. population is approximately 320 million, roughly half of whom have prostates. Hence, let us take the prevalence of prostate cancer among those who have prostates to be approximately 3 in 160, or just under 2%.
The false positive (type I error) rate for PSA velocity screening is reported at 33%, or as high as 75%. The false negative (type II error) rate is reported as between 10% and 20%. For the purpose of this analysis, let’s give the PSA test the benefit of the doubt and attribute to it the lowest reported type I and type II error rates, namely 33% and 10%.
Skill testing question
Suppose a random person with a prostate from the United States, where the prevalence of prostate cancer is just under 2%, receives a positive PSA test result. If the test has a false positive rate of 33% and a false negative rate of 10%, what is the chance that this person actually has prostate cancer?
Bayes’ theorem
Recall Bayes’ theorem from your undergraduate Philosophy of Science class. Let us define the hypothesis we’re interested in testing and the evidence we are considering as follows:
P(h): The prior probability that this person has cancer
P(e|¬h): The false positive (type I error) rate
P(¬e|h): The false negative (type II error) rate
P(h) = 3/160
P(e|¬h) = 0.33
P(¬e|h) = 0.10
Given these definitions, the quantity we are interested in calculating is P(h|e), the probability that the person has prostate cancer, given a positive PSA test result. We can calculate this value using the following formulation of Bayes’ theorem:
P(h|e) = P(h) / [ P(h) + ( P(e|¬h) P(¬h) ) / ( P(e|h) ) ]
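If this formulation looks unfamiliar, it may help to see that it is just the more common textbook form of Bayes’ theorem with the numerator and denominator both divided by P(e|h). The following is a sketch of that step, using the same symbols as above and assuming P(e|h) is not zero:

```latex
\begin{align*}
P(h \mid e)
  &= \frac{P(e \mid h)\,P(h)}
          {P(e \mid h)\,P(h) + P(e \mid \neg h)\,P(\neg h)} \\[6pt]
  &= \frac{P(h)}
          {P(h) + \dfrac{P(e \mid \neg h)\,P(\neg h)}{P(e \mid h)}}
\end{align*}
```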
From the above probabilities and the laws of probability, we can derive the following missing quantities.
P(¬h) = 1 – P(h) = 1 – 3/160 = 157/160
P(e|h) = 1 – P(¬e|h) = 0.90
These can be inserted into the formula above. The answer to the skill-testing question is that there is a 4.95% chance that the randomly selected person in question will have prostate cancer, given a positive PSA test result.
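For anyone who would rather check the arithmetic than trust it, here is a minimal Python sketch of the calculation. The function name `posterior` and its parameter names are my own labels for illustration, not anything from the PSA literature; the numbers are the ones given above.

```python
def posterior(prior, false_positive_rate, false_negative_rate):
    """Return P(h|e) using the formulation of Bayes' theorem given above."""
    p_e_given_h = 1 - false_negative_rate  # P(e|h), here 0.90
    p_not_h = 1 - prior                    # P(¬h), here 157/160
    return prior / (prior + (false_positive_rate * p_not_h) / p_e_given_h)


p = posterior(prior=3 / 160, false_positive_rate=0.33, false_negative_rate=0.10)
print(f"P(h|e) = {p:.2%}")  # prints: P(h|e) = 4.95%
```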
What if we know more about the person in question?
Let’s imagine that the person is not selected at random. Say that this person is a man with a prostate and he is over 60 years old.
According to Zlotta et al., the prevalence of prostate cancer rises to over 40% in men over age 60. If we redo the above calculation with this base rate, P(h) = 0.40, we find that P(h|e) rises to 64.5%.
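To see how strongly the answer depends on the base rate, the two scenarios can be run side by side. This is a rough sketch that uses the textbook form of Bayes’ theorem, with the same 33% and 10% error rates assumed for both groups; the function name, the default arguments, and the scenario labels are illustrative choices of mine.

```python
def posterior(prior, false_positive_rate=0.33, false_negative_rate=0.10):
    """Return P(h|e) = P(e|h)P(h) / [P(e|h)P(h) + P(e|¬h)P(¬h)]."""
    true_positive_mass = (1 - false_negative_rate) * prior
    false_positive_mass = false_positive_rate * (1 - prior)
    return true_positive_mass / (true_positive_mass + false_positive_mass)


# Base rates taken from the text: 3/160 for a random person with a prostate,
# 0.40 for a man over 60.
scenarios = [
    ("random person with a prostate", 3 / 160),
    ("man over 60", 0.40),
]

for label, prior in scenarios:
    print(f"{label}: P(h) = {prior:.3f}, P(h|e) = {posterior(prior):.3f}")
```

The test itself is identical in both cases; only the prior changes, and that alone moves the result from roughly 1 in 20 to roughly 2 in 3.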
Take-home messages
- Humans are very bad at intuiting probabilities. See Wikipedia for recommended reading on the Base Rate Fallacy.
- Having a prostate is neither a necessary nor a sufficient condition for being a man. Just FYI.
- Don’t get tested for prostate cancer unless you’re in a higher-risk group, because the base rate of prostate cancer is so low in the general population that if you get a positive result, it’s likely to be a false positive.
I bungled the arithmetic when I tried to set it up on my own, so I might need to go spend some more time with a textbook and practice problems. Luckily, in the meantime (and forever), I can trust the USPSTF to run the numbers for me. They also recommend against testing normal-risk individuals, but in far, far more words.
http://www.uspreventiveservicestaskforce.org/Page/Document/RecommendationStatementFinal/prostate-cancer-screening