pull down to refresh

I was working my way through this paper earlier and it seems as if since 3 days humanity maybe understands why softmax attention parametrization works better than linear (even though it's not used in many models anymore because it's expensive to run.)
So I don't find it crazy that the AI companies look to hire talent that can actually address open issues like that, instead of using inference. For non-ai tech companies though - yeah, inevitable.