reply on: When Data Is Missing, Scientists Guess. Then Guess Again \ stacker news


37 sats \ 3 replies \ @SimpleStacker 2 Oct 2024 \ parent \ on: When Data Is Missing, Scientists Guess. Then Guess Again science

I do think some assumptions need to be made about the missing data for the procedure to be valid, though. If the data is missing completely at random, then I think the procedure would work great. If missingness is non-random but depends only on observed variables, then the procedure could also work.

But if missingness is non-random and also depends on unobserved correlates, especially unobserved correlates of the outcome variable, then I think the procedure is likely to yield biased results.
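The three cases above can be checked with a small simulation. This is only a sketch under made-up assumptions: the data-generating process, the function name `imputed_mean`, and the logistic missingness rules are all hypothetical, and it uses a single regression imputation of the outcome rather than the multiple-imputation procedure from the article.

```python
import numpy as np

def imputed_mean(mechanism, n=200_000, seed=0):
    """Estimate E[y] after regression-imputing missing y values.

    Hypothetical setup: y = 1 + x + u, so the true mean of y is 1.0.
    x is always observed; u is never observed.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    y = 1.0 + x + u

    if mechanism == "mcar":      # missingness unrelated to anything
        p_miss = np.full(n, 0.4)
    elif mechanism == "mar":     # depends only on the observed x
        p_miss = 1 / (1 + np.exp(-x))
    elif mechanism == "mnar":    # depends on the unobserved u
        p_miss = 1 / (1 + np.exp(-u))
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")

    missing = rng.random(n) < p_miss

    # Fit y ~ x on complete cases, then fill in the missing y values
    # with the fitted line (np.polyfit returns slope first).
    slope, intercept = np.polyfit(x[~missing], y[~missing], 1)
    y_filled = np.where(missing, intercept + slope * x, y)
    return y_filled.mean()

for m in ("mcar", "mar", "mnar"):
    print(m, round(imputed_mean(m), 3))
```

In this setup the first two estimates should land close to the true mean of 1.0, while the third comes out noticeably too low, because high values of `u` are the ones that go missing and nothing observed can stand in for them.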

10 sats \ 2 replies \ @Undisciplined 2 Oct 2024

In the first two scenarios, where it seems OK, are you introducing measurement error?

If so, you're going to have attenuation bias.

29 sats \ 1 reply \ @SimpleStacker 2 Oct 2024

Hmm, good point. I’ll admit I haven’t thought carefully about imputation. Why wouldn’t any procedure that imputes data without the outcome variable lead to attenuation bias, and why wouldn’t any procedure that uses the outcome lead to endogeneity?

I’m assuming there’s a good answer if I read the literature. But it’s possible I’d be disappointed as well.

13 sats \ 0 replies \ @Undisciplined 2 Oct 2024

I hadn't thought about imputation leading to attenuation bias until just now, but it seems like it would (if I understand why measurement error has that effect).

I'm also sure this has been discussed at length in the literature. It surprises me a little that none of my advisors or econometrics professors mentioned it, though.
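One way to probe the attenuation question is a quick simulation. The sketch below is an illustration under assumed conditions, not anything from the article: the linear model, the function name `slopes`, and both imputation rules are hypothetical choices. A regressor `x` goes missing completely at random and is filled in two ways: (a) with random draws from the observed `x` values, ignoring the outcome, and (b) with a prediction from `y` plus residual noise.

```python
import numpy as np

def slopes(n=200_000, p_miss=0.4, beta=2.0, seed=0):
    """Return OLS slopes of y on x after two kinds of imputation of x."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = beta * x + rng.normal(size=n)
    missing = rng.random(n) < p_miss   # missing completely at random
    obs = ~missing
    n_miss = int(missing.sum())

    def ols_slope(x_values):
        # np.polyfit returns [slope, intercept] for degree 1.
        return np.polyfit(x_values, y, 1)[0]

    # (a) Impute without the outcome: random draws from observed x.
    x_a = x.copy()
    x_a[missing] = rng.choice(x[obs], size=n_miss)

    # (b) Impute with the outcome: regress x on y in complete cases,
    #     then fill in the prediction plus noise with the residual spread.
    b, a = np.polyfit(y[obs], x[obs], 1)
    resid_sd = np.std(x[obs] - (a + b * y[obs]))
    x_b = x.copy()
    x_b[missing] = a + b * y[missing] + rng.normal(scale=resid_sd, size=n_miss)

    return ols_slope(x_a), ols_slope(x_b)

slope_without_y, slope_with_y = slopes()
print(round(slope_without_y, 2), round(slope_with_y, 2))
```

Under these assumptions, the slope from (a) should shrink toward zero by roughly the share of observed rows, landing near `(1 - p_miss) * beta`, since the filled-in values are unrelated to `y`. The slope from (b) should stay close to `beta`, because the draws preserve the joint distribution of `x` and `y` instead of adding noise that is independent of the outcome. That only speaks to this simulated setup, not to imputation procedures in general.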