reply on: Fired NIH workers fear bleak job prospects in the private sector \ stacker news ~science

52 sats \ 8 replies \ @SimpleStacker 13 Mar \ parent \ on: Fired NIH workers fear bleak job prospects in the private sector science

To be fair, the question didn't state its own assumptions very clearly. It didn't say that a person is randomly sampled from the population and given the test.

Typically, when a test is administered, it's because a patient has requested it, or come in with symptoms.

So to really answer the question, we'd need to know the rate at which people with the disease receive a test and the rate at which people without the disease receive a test.

In real-world settings, the true answer is probably closer to 95% than to the "correct" answer, which I guess was supposed to be:

 $\frac{\frac{1}{1000} \times 95\%}{\frac{999}{1000} \times 5\% + \frac{1}{1000} \times 95\%}$

However, the real answer is:

 $\frac{\frac{1}{1000} \times \alpha \times (1-FNR)}{\frac{999}{1000}\times \beta \times 5\% + \frac{1}{1000}\times \alpha \times (1-FNR)}$

where $\alpha$ is the rate at which diseased people take the test, $\beta$ is the rate at which non-diseased people take the test, and $FNR$ is the false negative rate, which we also weren't told!
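For anyone who wants to play with the numbers, here is a rough Python sketch of the formula above. The function name and the specific `alpha`/`beta` values are my own made-up illustrations, not anything from the original question, and the 5% false negative rate is only assumed because the "correct" answer above uses 95%.

```python
def posterior_given_positive(prevalence, fpr, fnr, alpha=1.0, beta=1.0):
    """P(disease | took the test and tested positive).

    alpha: rate at which diseased people take the test
    beta:  rate at which non-diseased people take the test
    With alpha == beta this reduces to the textbook Bayes answer.
    """
    true_pos = prevalence * alpha * (1.0 - fnr)
    false_pos = (1.0 - prevalence) * beta * fpr
    return true_pos / (true_pos + false_pos)


# Textbook case: everyone is equally likely to be tested.
textbook = posterior_given_positive(0.001, fpr=0.05, fnr=0.05)

# Hypothetical selective testing: half of diseased people get tested,
# but only 1 in 1000 healthy people do (both numbers are assumptions).
selective = posterior_given_positive(
    0.001, fpr=0.05, fnr=0.05, alpha=0.5, beta=0.001
)

print(f"textbook:  {textbook:.4f}")
print(f"selective: {selective:.4f}")
```

The more strongly testing is concentrated on people who actually have the disease (large `alpha` relative to `beta`), the further the answer moves away from the textbook value and toward the high end.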

So... the badness at statistics goes all the way around it seems

40 sats \ 2 replies \ @south_korea_ln OP 13 Mar

Thanks for writing it all out. I probably would have contributed to the bad look this test gave~~

10 sats \ 1 reply \ @SimpleStacker 13 Mar

Statistics is genuinely hard, and I didn't realize all the nuances until I started working with real life data generated by real life people.

0 sats \ 0 replies \ @Bell_curve 13 Mar

Probability is not easy

0 sats \ 0 replies \ @Bell_curve 13 Mar

I'm too stupid to do it the correct way

my way would have been simple and wrong but...

false positive is 5 percent or 1/20

actual prevalence is 1/1000

20/1000 = .02 or 2 percent which is close to the actual answer, sort of

0 sats \ 3 replies \ @Bell_curve 13 Mar

one of the comments to the article (substack) addresses your question/assumptions

P(T|D) = probability of testing positive given the disease (sensitivity)

(I'll assume this is 100% since it wasn't specified)

https://open.substack.com/pub/boriquagato/p/why-jay-is-the-right-guy-for-nih?r=2slf6a&utm_campaign=comment-list-share-cta&utm_medium=web&comments=true&commentId=99767215

update: Step 3: Calculate each component:

P(¬D) = 1 - 0.001 = 0.999
P(T) = 1 × 0.001 + 0.05 × 0.999 = 0.001 + 0.04995 = 0.05095

Step 4: Calculate the final probability:

P(D|T) = (1 × 0.001) / 0.05095 ≈ 0.0196 or about 1.96%
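A quick way to double-check the arithmetic in those steps is a few lines of Python. This is just a sketch: the function and argument names are mine, and the sensitivity of 1.0 is the same assumption made in the quoted comment, not something the original question stated.

```python
def bayes_positive(prior, sensitivity, false_positive_rate):
    """P(D|T) from Bayes' rule for a single positive test."""
    p_not_d = 1.0 - prior                                      # P(¬D)
    p_t = sensitivity * prior + false_positive_rate * p_not_d  # P(T)
    return sensitivity * prior / p_t                           # P(D|T)


# Sensitivity is ASSUMED to be 100%, as in the comment above.
p = bayes_positive(prior=0.001, sensitivity=1.0, false_positive_rate=0.05)
print(f"P(D|T) = {p:.4f}")  # prints P(D|T) = 0.0196
```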

10 sats \ 2 replies \ @SimpleStacker 13 Mar

yeah you need P(test|disease) to solve the problem.

The one that even fewer people appreciate is that you also need the false negative rate, which is not necessarily calculable from the false positive rate.

FNR = P(test=negative | disease=true)

FPR = P(test=positive | disease=false)

They aren't the 1-minus of each other!
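To make that concrete, here is a small Python sketch with two entirely hypothetical tests (the counts are invented for illustration). Both have the same false positive rate, yet very different false negative rates, because each rate is computed from a different group of people.

```python
def error_rates(tp, fn, fp, tn):
    """Return (FNR, FPR) from confusion-matrix counts."""
    fnr = fn / (tp + fn)  # computed among people WITH the disease
    fpr = fp / (fp + tn)  # computed among people WITHOUT the disease
    return fnr, fpr


# Hypothetical counts: 100 diseased and 1000 healthy people per test.
test_a = error_rates(tp=95, fn=5, fp=50, tn=950)
test_b = error_rates(tp=60, fn=40, fp=50, tn=950)

print("test A (FNR, FPR):", test_a)
print("test B (FNR, FPR):", test_b)
```

Knowing the FPR alone tells you nothing about which of these two tests you are holding.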

30 sats \ 1 reply \ @Bell_curve 13 Mar

My friend wrote to me...

Turns out it’s not too hard to find the source. The bad news is twofold:

The paper dates from 1978, so it's almost half a century old.
It only involved a sample of 20 students.

Primary source: https://sci-hub.ru/https://www.nejm.org/doi/full/10.1056/NEJM197811022991808

The secondary source gets it wrong, reporting that there were only 10 students.

https://www.sciencenews.org/blog/context/doctors-flunk-quiz-screening-test-math

10 sats \ 0 replies \ @SimpleStacker 13 Mar

haha, another good example of how urban legends spread.

I mean, I guess the overall message is true: people are bad at statistics. But the problem seems to run even deeper than what's implied... including how the original progenitors of this trial didn't give proper instructions, and how the results of the trial got propagated in incorrect ways, and then how it developed into this myth.