reply on: Meta SN: Should we raise posting costs? \ stacker news ~bitcoin

482 sats \ 2 replies \ @cointastical 1 Oct 2022 \ parent \ on: Meta SN: Should we raise posting costs? bitcoin

The problem I think this would solve is where a second person wants to post a link for a news article that was already posted. This could be due to any of:

ignores the "dup" warning because the post can still earn sats regardless
ignores the "dup" warning in order to promote the article itself (e.g., the poster is the author of the article) or to spread the good word (e.g., a bullish story that might help pump the price)
ignores the "dup" warning because an earlier post didn't get enough upvotes for front-page visibility, and hopes a second attempt at the right time of day will do better
didn't see the "dup" warning (the dup check takes a second or two, so the post button can be hit before the warning is even shown)

This wouldn't solve the case where there are two URLs for the same content, e.g., when one URL uses Google AMP and the same article is also at a link without AMP, or when a blog post is available at both medium.com/@username and username.medium.com.

A higher price for re-posts also wouldn't prevent someone from making a Discussion post and then putting the previously posted link in it.

And then there is the case where a high fee to re-post would penalize innocent re-posts. For instance, some links have new info at the same URL. The link announcing new Electrum releases is always their /download page (there is no unique page for each release). Or a website may be just a landing page when it is first announced (e.g., with a waiting list sign-up), and then weeks later a second post of the same link is for the live launch, when the service is ready.

That being said, I like your suggestion -- with some variation to accommodate the last use case (same link, but after some amount of time has passed it could have new/different content).

25 sats \ 1 reply \ @Lux 2 Oct 2022

yes, what I was thinking, but didn't put the work to write ;) have a few sats sir

makes me think, could there be a method to detect same title or same content regardless of different url? AI maybe? content indexing like search engines? dunno

275 sats \ 0 replies \ @cointastical 2 Oct 2022

The AMP duplicate issue is easy to fix. There's already Issue #138 in SN's GitHub repo for that. The case where a Discussion post contains a link that was shared in a prior Link post has been acknowledged as a possible problem; see Issue #150 on that. And there are some other issues on SN's GitHub related to improvements on dup detection (e.g., ignoring upper/lower case in the URL, ignoring the www. subdomain, ignoring the m. subdomain (e.g., for YouTube videos), etc.).
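
To make the URL cleanup ideas concrete, here is a rough Python sketch of normalizing a link before the dup check. It is not SN's actual code: the function name `normalize_url`, the `IGNORED_PREFIXES` list, and the choices to force https and drop the port are all assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Host prefixes treated as equivalent to the bare host.
# Assumption for illustration only, not SN's actual list.
IGNORED_PREFIXES = ("www.", "m.")


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` for duplicate comparison."""
    parts = urlsplit(url.strip())
    # .hostname is already lowercased and excludes any port or credentials.
    host = parts.hostname or ""
    for prefix in IGNORED_PREFIXES:
        if host.startswith(prefix):
            host = host[len(prefix):]
            break
    # Lowercase the path and drop a trailing slash. This can merge URLs
    # that are distinct on a case-sensitive server, so it is a trade-off.
    path = parts.path.lower().rstrip("/")
    # The query string is kept as-is because values such as YouTube video
    # ids are case-sensitive. The scheme is forced to https and the
    # fragment is dropped (both assumptions).
    return urlunsplit(("https", host, path, parts.query, ""))
```

Two links would then be treated as the same when their normalized forms are equal, e.g. comparing `normalize_url(a) == normalize_url(b)`.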

could there be a method to detect same title or same content regardless of different url?

That could be done ... exact duplicate is easy. Simply store a hash of the response. If the hash for a new post matches the hash for a prior post, include the earlier as a potential dupe. That solves the case where there are two URLs that respond with identical content. That doesn't solve for something like where ZeroHedge reprints a Bitcoin Magazine article. In that instance they have a different hash but are essentially the same content.
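
A minimal sketch of the hash idea in Python, assuming the response body for each link has already been fetched as bytes. The class name `ContentHashIndex` and the in-memory dict are hypothetical; a real site would store the digest alongside the post in its database.

```python
import hashlib


class ContentHashIndex:
    """Maps a SHA-256 digest of a response body to the posts that had it."""

    def __init__(self):
        self._by_hash = {}  # hex digest -> list of post ids, oldest first

    def add(self, post_id, body: bytes):
        """Record a post and return earlier posts with identical content."""
        digest = hashlib.sha256(body).hexdigest()
        earlier = list(self._by_hash.get(digest, []))
        self._by_hash.setdefault(digest, []).append(post_id)
        return earlier
```

Usage would be something like `dupes = index.add(new_post_id, response_body)`, showing the "dup" warning when `dupes` is non-empty. Any difference in the bytes (ads, timestamps, a reprint with a different layout) gives a different hash, which is why this only catches exact duplicates.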

There are anti-plagiarism services and duplicate content detection tools that could identify occurrences of that. But I don't think that a duplicate could be detected in real-time -- as SN doesn't know what prior post(s) should be looked at when there is a new post. Maybe doing the anti-plagiarism search against prior posts with similar words from the title would find duplicates but that would take longer than is reasonable to expect the user entering a post to wait.
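
As a rough illustration of the "similar words from the title" step, here is a Python sketch that picks candidate prior posts by word overlap (Jaccard similarity) between titles. This is a stand-in technique chosen for the example, not an anti-plagiarism service, and the function names and the 0.5 threshold are assumptions.

```python
import re


def title_words(title: str) -> set:
    """Lowercased alphanumeric words of a title, as a set."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of words the two titles have in common (0.0 to 1.0)."""
    wa, wb = title_words(a), title_words(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def candidate_dupes(new_title: str, prior_titles, threshold: float = 0.5):
    """Prior titles similar enough to be worth a slower content check."""
    return [t for t in prior_titles if jaccard(new_title, t) >= threshold]
```

The slower content comparison would then only need to run against the titles this returns, rather than against every prior post.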

Or simply there could be a mechanism for users to flag a post as being a duplicate, and some handling be done for that. Reddit does this, but their approach involves moderators who investigate -- something SN is designed to not need.

Duplicate posts are a minor inconvenience that I suspect bothers only a small number of heavy users of SN. I can imagine there are other things to be done that are at a higher priority.