Honestly, just grepping for em dashes or other Unicode characters may be a better first-pass detection...
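For illustration, a minimal sketch of that kind of first pass in Python. It assumes "unicode chars" means anything outside plain ASCII; the name `first_pass_flags` is made up for this example and is not from any tool mentioned in the thread.

```python
import re

# Assumption: the em dash (U+2014) is the main tell, and any other
# non-ASCII character is a secondary signal worth counting.
EM_DASH = "\u2014"
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def first_pass_flags(text: str) -> dict:
    """Count em dashes and non-ASCII characters in a piece of text."""
    return {
        "em_dashes": text.count(EM_DASH),
        "non_ascii": len(NON_ASCII.findall(text)),
    }


if __name__ == "__main__":
    sample = "This is a test \u2014 with one em dash."
    print(first_pass_flags(sample))
```

Counts like these are only a cheap screen: plenty of people type these characters themselves, so a hit is a reason to look closer, not a verdict.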
True, but I'm hoping to avoid that kind of arms race by using one of these black boxes. Bayesian filters would probably do most of the work I need, and much more cheaply, though.
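A rough sketch of what a Bayesian filter could look like here, assuming a plain naive Bayes model over word counts with add-one smoothing. The class name, the `"flag"`/`"ok"` labels, and the tokenizer are all hypothetical choices for this example, not anything the thread specifies.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesFilter:
    """Tiny two-class naive Bayes text filter (labels: 'flag' and 'ok')."""

    def __init__(self):
        self.word_counts = {"flag": Counter(), "ok": Counter()}
        self.doc_counts = {"flag": 0, "ok": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def log_odds(self, text: str) -> float:
        """Return log P(flag | text) - log P(ok | text); > 0 leans 'flag'."""
        vocab = set(self.word_counts["flag"]) | set(self.word_counts["ok"])
        v = len(vocab) or 1
        totals = {k: sum(c.values()) for k, c in self.word_counts.items()}
        n_docs = sum(self.doc_counts.values())

        # Class priors, smoothed so an untrained class never gives log(0).
        score = math.log((self.doc_counts["flag"] + 1) / (n_docs + 2))
        score -= math.log((self.doc_counts["ok"] + 1) / (n_docs + 2))

        # Per-word likelihood ratio with add-one (Laplace) smoothing.
        for w in tokenize(text):
            p_flag = (self.word_counts["flag"][w] + 1) / (totals["flag"] + v)
            p_ok = (self.word_counts["ok"][w] + 1) / (totals["ok"] + v)
            score += math.log(p_flag / p_ok)
        return score


if __name__ == "__main__":
    nb = NaiveBayesFilter()
    nb.train("delve into the rich tapestry", "flag")
    nb.train("lol this is wrong tbh", "ok")
    print(nb.log_odds("a rich tapestry") > 0)
```

Training and scoring are just counting and a handful of logarithms, which is where the "much more cheaply" part comes from. How well it works depends entirely on the labelled examples you feed it.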
Apparently people actually use emdashes out in the wild: #1406132