0 sats \ 1 reply \ @k00b OP 30 Jun \ parent \ on: Request for comments: improving our referral system meta
I'd love to do this. If it were you, how would you deal with bots/sybils being used to generate traffic?
First, I'd keep that part of the code closed-source. Treat it as a black box, an extension to the open-source parts. The goal for the extension is fast inference: using traffic metadata to predict the probability that a visitor is a bot.
Second, I'd run a few models in parallel, all competing on bot detection.
Third, I'd explore joining data sets to improve the models.
Tech-wise, it's a mix of Python, Kubeflow, maybe Snowflake.
Data-wise, it's probably a mix of vendor data, traffic-quality signals, and a training set composed of human-curated real-user activity.
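To make the "few models competing" idea concrete, here's a rough sketch in plain Python. Everything in it is made up for illustration: the `Visit` fields, the three toy scoring functions, and the tiny labeled set standing in for human-curated activity. It also runs the models side by side rather than truly concurrently, and picks the one with the lowest Brier score (mean squared error of the predicted probability), which is just one reasonable way to rank them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Visit:
    # Hypothetical traffic metadata; a real system would have many more fields.
    requests_per_minute: float
    has_referrer: bool
    session_seconds: float


# A "model" here is anything that maps a visit to a probability-of-bot in [0, 1].
Model = Callable[[Visit], float]


def rate_model(v: Visit) -> float:
    # Toy heuristic: more requests per minute looks more bot-like.
    return min(v.requests_per_minute / 60.0, 1.0)


def session_model(v: Visit) -> float:
    # Toy heuristic: very short sessions look more bot-like.
    return 1.0 / (1.0 + v.session_seconds / 10.0)


def referrer_model(v: Visit) -> float:
    # Toy heuristic: a missing referrer is weak evidence of a bot.
    return 0.3 if v.has_referrer else 0.7


MODELS: Dict[str, Model] = {
    "rate": rate_model,
    "session": session_model,
    "referrer": referrer_model,
}


def brier_score(model: Model, labeled: List[Tuple[Visit, bool]]) -> float:
    # Mean squared error between predicted probability and the 0/1 label.
    # Lower is better.
    return sum((model(v) - float(is_bot)) ** 2 for v, is_bot in labeled) / len(labeled)


def pick_champion(models: Dict[str, Model], labeled: List[Tuple[Visit, bool]]) -> str:
    # Score every competing model on the same labeled set and keep the best one.
    scores = {name: brier_score(m, labeled) for name, m in models.items()}
    return min(scores, key=scores.get)


# Stand-in for a human-curated training/validation set: (visit, is_bot).
LABELED: List[Tuple[Visit, bool]] = [
    (Visit(120.0, False, 2.0), True),
    (Visit(90.0, False, 1.0), True),
    (Visit(3.0, True, 240.0), False),
    (Visit(5.0, False, 180.0), False),
]

champion = pick_champion(MODELS, LABELED)
print("serving model:", champion)
```

In practice the challengers would keep getting re-scored as new labeled traffic comes in, so the serving model can change over time.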