Thanks for writing it all out. I probably would have contributed to the bad look this test gave.
reply
Statistics is genuinely hard, and I didn't realize all the nuances until I started working with real life data generated by real life people.
reply
Probability is not easy
reply
I'm too stupid to do it the correct way
my way would have been simple and wrong but...
false positive is 5 percent or 1/20
actual prevalence is 1/1000
20/1000 = .02 or 2 percent which is close to the actual answer, sort of
reply
one of the comments to the article (substack) addresses your question/assumptions
P(T|D) = probability of testing positive given the disease (sensitivity)
(I'll assume this is 100% since it wasn't specified)
update:
Step 3: Calculate each component:
- P(¬D) = 1 - 0.001 = 0.999
- P(T) = 1 × 0.001 + 0.05 × 0.999 = 0.001 + 0.04995 = 0.05095
Step 4: Calculate the final probability:
P(D|T) = (1 × 0.001) / 0.05095 ≈ 0.0196 or about 1.96%
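For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same calculation. The function and parameter names are mine, not from the article, and the sensitivity of 100% is the same assumption made above.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(D|T) via Bayes' rule: P(T|D) P(D) / P(T)."""
    # P(T) = P(T|D) P(D) + P(T|not D) P(not D)
    p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
    return sensitivity * prevalence / p_positive


# Numbers from the thread; sensitivity = 1.0 is an assumption, since it wasn't specified.
p = posterior_given_positive(prevalence=0.001, sensitivity=1.0, false_positive_rate=0.05)
print(f"{p:.4f}")  # 0.0196
```

Lowering `sensitivity` below 1.0 shrinks the answer further, which is why the unstated sensitivity matters.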
reply
yeah you need P(test|disease) to solve the problem.
The one that even fewer people appreciate is that you also need the false negative rate, which is not necessarily calculable from the false positive rate.
FNR = P(test=negative | disease=true)
FPR = P(test=positive | disease=false)
They aren't the 1-minus of each other!
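A quick sketch of why one doesn't pin down the other, using made-up confusion-matrix counts (the numbers and the helper name are purely hypothetical): two tests can share the same FPR and still have very different FNRs, because the two rates are computed over different groups of people.

```python
def error_rates(tp, fn, fp, tn):
    """Return (FNR, FPR) from confusion-matrix counts."""
    fnr = fn / (tp + fn)  # share of diseased people who test negative
    fpr = fp / (fp + tn)  # share of healthy people who test positive
    return fnr, fpr


# Hypothetical counts: 100 diseased and 1000 healthy people for each test.
fnr_a, fpr_a = error_rates(tp=90, fn=10, fp=50, tn=950)
fnr_b, fpr_b = error_rates(tp=60, fn=40, fp=50, tn=950)

print(fnr_a, fpr_a)  # test A
print(fnr_b, fpr_b)  # test B: same FPR as A, different FNR
```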
reply
My friend wrote to me...
Turns out it’s not too hard to find the source. The bad news is twofold:
- The paper dates from 1978, so it's almost half a century old.
- It only involved a sample of 20 students.
The secondary source gets it wrong, reporting that there were only 10 students.
reply
haha, another good example of how urban legends spread.
I mean, I guess the overall message is true: people are bad at statistics. But the problem seems to run even deeper than what's implied... including how the original progenitors of this trial didn't give proper instructions, and how the results of the trial got propagated in incorrect ways, and then how it developed into this myth.
reply
α is the rate at which diseased people take the test, β is the rate at which non-diseased people take the test, and FNR is the false negative rate, which we also weren't told!
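Here is a rough sketch of how those three quantities could enter the calculation. This is my own reading, not a formula taken from the article: it assumes the decision to take the test depends only on disease status, and the values chosen for `alpha` and `beta` below are invented for illustration.

```python
def posterior_with_selection(prevalence, alpha, beta, fnr, fpr):
    """P(disease | took the test and tested positive).

    alpha: rate at which diseased people take the test (assumed)
    beta:  rate at which non-diseased people take the test (assumed)
    fnr:   false negative rate, so sensitivity = 1 - fnr
    fpr:   false positive rate
    """
    true_positives = alpha * prevalence * (1 - fnr)
    false_positives = beta * (1 - prevalence) * fpr
    return true_positives / (true_positives + false_positives)


# Everyone equally likely to be tested, perfect sensitivity: the textbook answer.
baseline = posterior_with_selection(0.001, alpha=1.0, beta=1.0, fnr=0.0, fpr=0.05)

# Hypothetical: sick people are far more likely to show up for testing.
selected = posterior_with_selection(0.001, alpha=0.5, beta=0.01, fnr=0.0, fpr=0.05)

print(baseline, selected)
```

When `alpha` equals `beta` the selection terms cancel and the result matches the plain Bayes calculation; when they differ, the answer can move a long way from 2%.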