
Hacker News Score, Comments and Rank as Time Series

I updated @hn today to not only post when an item hits #1 on HN but also track HN metrics about each item on the front page every minute.
Since I was lazy, this means that @hn now also checks for a new item at the top every minute instead of every hour as before, so there might be more "spam" from @hn for a while.[1]
I decided to track metrics as the foundation for a more sophisticated algorithm that decides if @hn should crosspost something to SN. I have wanted to do this since #192328, where @nout mentioned that @hn basically seeds ~tech but could post less often and higher-quality content.
But how to decide if something is quality content for SN before it gets posted?
This question is basically what I am trying to answer with the new collected metrics. With them, I can now run some backtesting to see if there is some pattern on HN that @hn can exploit for the items that performed well on SN. I don't collect SN metrics yet though (but I could cheat and use my database access).
But even if such patterns are not visible, I have some ideas that should definitely be an improvement since they are way better proxies for "is this item interesting?" than simply looking at the rank at a single moment in time.
For example, I could track how long an item stays at #1 and only post it once it has been there for at least one hour. This should be better since otherwise @hn might post something that reached #1 and then fell back down the ranks just as fast.
Another idea is to track the "slope" an item has. If it rises very fast (based on rank, score and/or number of comments), @hn could post something even before it hits the top. @hn would basically predict that this item is very interesting and will reach the top soon.
That's also the reason why @hn only checked every hour: the probability that @hn finds an item at the top that reached the top at the very moment it checked (and then falls off) is lower than if it checks every minute.
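A minimal sketch of the one-hour rule, assuming the per-minute snapshots of a single item are available as (timestamp, rank) pairs. The function names and the input shape are hypothetical and not taken from @hn's code:

```python
from typing import Iterable, Tuple


def seconds_at_top(samples: Iterable[Tuple[int, int]]) -> int:
    """Length in seconds of the current unbroken run at rank 1.

    samples: (unix_timestamp, rank) pairs for ONE item, in any order.
    Returns 0 if the most recent sample is not at rank 1.
    """
    ordered = sorted(samples)
    if not ordered or ordered[-1][1] != 1:
        return 0
    latest = ordered[-1][0]
    start = latest
    # Walk backwards until the item was seen somewhere other than #1.
    for ts, rank in reversed(ordered):
        if rank != 1:
            break
        start = ts
    return latest - start


def should_post(samples: Iterable[Tuple[int, int]], min_seconds: int = 3600) -> bool:
    """True once the item has held #1 for at least min_seconds."""
    return seconds_at_top(samples) >= min_seconds
```

For example, an item sampled at #1 every minute for a full hour passes, while one that dropped to #3 a minute ago and just came back does not, because only the unbroken run counts.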
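One way to put a number on the "slope" is an ordinary least-squares fit of a metric over time. This is only a sketch: the function names, the input shape and the threshold of 5 points per minute are assumptions for illustration, not values from @hn.

```python
from typing import Iterable, Tuple


def slope_per_minute(samples: Iterable[Tuple[int, float]]) -> float:
    """Least-squares slope of a metric over time, in units per minute.

    samples: (unix_timestamp, value) pairs for ONE item, e.g. its score.
    Returns 0.0 if there are fewer than two distinct timestamps.
    """
    points = [(ts / 60.0, value) for ts, value in samples]
    n = len(points)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:
        return 0.0
    cov = sum((t - mean_t) * (v - mean_v) for t, v in points)
    return cov / var_t


def is_rising_fast(samples: Iterable[Tuple[int, float]], threshold: float = 5.0) -> bool:
    """Hypothetical trigger: the score gains more than `threshold` points per minute."""
    return slope_per_minute(samples) > threshold
```

The same function works for the number of comments. For rank, a rising item has a negative slope since its rank number shrinks, so the comparison would have to be flipped.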
All of these ideas can now be expressed with a single SQL query. For the more technical stackers, here is the query that currently decides if @hn should post something:
SELECT t.id, time, title, url, author, score, ndescendants
FROM (
    SELECT id, MAX(created_at) AS created_at
    FROM hn_items
    WHERE rank = 1
      AND id NOT IN (SELECT hn_id FROM sn_items)
    GROUP BY id
) t
JOIN hn_items
  ON t.id = hn_items.id
 AND t.created_at = hn_items.created_at;
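To see what this query returns, here is a small sketch that runs it against an in-memory SQLite database. The schema is a guess that only contains the columns the query references, the rows are made-up toy data, and the ORDER BY was added just to make the output deterministic. The real database is probably not SQLite.

```python
import sqlite3

# Hypothetical minimal schema: only the columns the query touches.
SCHEMA = """
CREATE TABLE hn_items (
    id INTEGER, created_at INTEGER, rank INTEGER, time INTEGER,
    title TEXT, url TEXT, author TEXT, score INTEGER, ndescendants INTEGER
);
CREATE TABLE sn_items (hn_id INTEGER);
"""

QUERY = """
SELECT t.id, time, title, url, author, score, ndescendants
FROM (
    SELECT id, MAX(created_at) AS created_at
    FROM hn_items
    WHERE rank = 1
      AND id NOT IN (SELECT hn_id FROM sn_items)
    GROUP BY id
) t
JOIN hn_items ON t.id = hn_items.id AND t.created_at = hn_items.created_at
ORDER BY t.id;
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up snapshots."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    snapshots = [
        # (id, created_at, rank, time, title, url, author, score, ndescendants)
        (1, 0, 2, 1000, "A", "https://example.com/a", "alice", 10, 1),
        (1, 60, 1, 1000, "A", "https://example.com/a", "alice", 20, 3),
        (1, 120, 1, 1000, "A", "https://example.com/a", "alice", 30, 5),
        (2, 120, 2, 2000, "B", "https://example.com/b", "bob", 50, 9),
        (2, 180, 1, 2000, "B", "https://example.com/b", "bob", 60, 9),
        (3, 180, 2, 3000, "C", "https://example.com/c", "carol", 5, 0),
    ]
    conn.executemany(
        "INSERT INTO hn_items VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", snapshots
    )
    # Item 2 has already been crossposted, so the query must skip it.
    conn.execute("INSERT INTO sn_items VALUES (2)")
    return conn


def pending_crossposts(conn: sqlite3.Connection) -> list:
    """Latest rank-1 snapshot of every item that is not yet in sn_items."""
    return conn.execute(QUERY).fetchall()
```

With this toy data, only item 1 comes back, with the score and comment count from its most recent snapshot at #1. Item 2 is filtered out because it is already in sn_items and item 3 never reached #1. One thing to keep in mind is that NOT IN matches nothing if the subquery ever yields a NULL hn_id.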
#464039 (24 sats, 0 comments):
#464032 (0 sats, 0 comments):
#463914 (10 sats, 0 comments):
#463813 (102 sats, 0 comments):
#463756 (41 sats, 0 comments):
#463719 (20 sats, 0 comments):
#463685 (387 sats, 0 comments):
#463921 (10 sats, 0 comments):
#463449 (20 sats, 0 comments):
#463261 (30 sats, 0 comments):
#463222 (20 sats, 2 comments):
wasn't posted for some reason:
#462642 (41 sats, 0 comments):

Footnotes

  1. Hopefully, there aren't too many different items at the top of HN within 10 minutes, since otherwise @hn is going to pay a lot of fees. ↩
I love your bot.
HN is still the gold standard for the latest good tech content. It's also always interesting to read the insider information in the comments from a senior Google engineer or someone who worked directly with a tech icon who just died. Unfortunately, the noise has become bad these days, with lots of non-constructive Reddit-style threads of wannabe comedians. Dang has a hard time moderating these days.
I hope SN will one day be the bitcoin standard for good tech content beyond just bitcoin tech. Until then, we have your bot.
reply
I hate that wannabe-comedian-style comment so much -- it's like the Gresham's law of social media, idiot comments push out good ones.
I can't fathom how people can stand to spend their time doing that or reading it. Hopefully we will have enough moderation tools to stomp the shit out of that tendency in the territories if it ever arises on SN.
reply
This is super cool.
I wonder if there's a way to augment it so that SN can benefit in particular? E.g., there's already (probably) a fruitful discussion on HN, so if you're going to comment in a general way, it makes sense to do so there, bc that's where the discussion is. Is there a SN-specific type of discussion that could ensue? And how could that be encouraged?
reply
I wonder if there's a way to augment it so that SN can benefit in particular? E.g., there's already (probably) a fruitful discussion on HN, so if you're going to comment in a general way, it makes sense to do so there, bc that's where the discussion is.
I think most stackers will simply comment here since they are more interested in what other stackers have to say than in what HN readers have to say. You are right, though: there might already be a discussion there, and it would make sense to comment there for more engagement. But I am also not sure how many stackers use HN and how many of those are not simply lurking.
Is there a SN-specific type of discussion that could ensue? And how could that be encouraged?
Good question. I used to engage with as many replies to @hn as I could, but now I mostly just scroll through the history of @hn and see if any post got an interesting comment that was not from @hn. I only reply if I feel like it, unlike before, when I felt like every post from @hn was basically a post from me and I had a "fiduciary duty" to reply if someone spent their time replying to @hn.
However, there are discussions in @hn posts that arise organically so I think it's simply a matter of what @hn posts? So it could be encouraged by tweaking the algorithm of @hn?
But maybe @hn could also fetch the top comment on HN and post that here, too. Kind of like seeding the discussion. I almost always go to HN to read the comments.
reply
The hn bot is neat. However, I feel like people here are more hesitant to comment on and discuss stuff that was auto-reposted by a bot.
reply
Definitely
reply
When you post from Hacker News, do you notify the HN author in any way?
reply
That was the original idea but the comments were shadowbanned.
I still mention in the bio of @hn that HN OPs can claim their sats here but with the price of BTC rising, I am actually not sure if I want to respect that anymore.
Maybe I'll just rug pull the OPs as a lesson. But I am interested if any OP ever shows up 🤔👀
reply
Yes I think I recall you mentioning this when you started that account. Too bad, would be a good way to SN pill HN users
reply
I mean, someone could still post a link to @hn or this post on HN 👀
But I won't. I don't care about orange-pilling HN anymore. They will learn at their own pace.
reply
Fair enough.
reply
25 sats \ 1 reply \ @mo 14 Mar
Just… wow! Simplicity at work
reply
I never thought @hn would stack as much, as fast as it did 👀
reply
Oh and I also plan to have charts like these inside every post by @hn.
I think these charts can even stay up-to-date since I can simply update the image that the image link points to and tell browsers not to cache the image. But let's see if that will work or if browsers still cache and you need to bypass the cache to see the latest chart.
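Telling browsers not to cache an image is usually done with the Cache-Control response header. A minimal sketch of the headers a chart endpoint could send, where the function name is hypothetical and the server that would use it is left out:

```python
def chart_response_headers(png: bytes) -> dict:
    """HTTP response headers for a chart image that should never be cached."""
    return {
        "Content-Type": "image/png",
        "Content-Length": str(len(png)),
        # no-store asks browsers and intermediate caches not to keep a copy,
        # so every view of the post fetches the current chart again.
        "Cache-Control": "no-store",
    }
```

Whether every cache or image proxy between the server and the reader honors this header is a separate question, so it would still need testing.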