This is a subtle but (hopefully) significant improvement to search and related posts.
Before: search and related posts relied on term matches.
After: search ranks on term matches and semantic similarity, while related posts rely exclusively on semantic similarity.
This should improve both evergreen-ness of content and rabbit-holing. IMO related posts have improved an incredible amount, but search is also much better at finding relevant matches.
For months, @elvismercury hollered about text embeddings, but I didn't know what text embeddings were. This change relies on them: it maps a search query or item to a vector of meaning and looks for items with similar vectors.
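For anyone who wants the "similar vectors" idea in concrete terms, here is a minimal sketch. It is not the code behind this change: the three-number vectors and post titles are made up for illustration (a real sentence transformer produces hundreds of dimensions), and cosine similarity is just one common way to compare vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: titles and numbers are invented for this example.
items = {
    "lightning wallet setup": [0.9, 0.1, 0.0],
    "running a bitcoin node": [0.7, 0.3, 0.1],
    "sourdough bread recipe": [0.0, 0.1, 0.9],
}

def most_similar(query_vec, items, k=2):
    """Return the k item titles whose vectors are closest to the query vector."""
    ranked = sorted(
        items,
        key=lambda title: cosine_similarity(query_vec, items[title]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embedding of a search query about lightning.
query = [0.8, 0.2, 0.0]
print(most_similar(query, items))
```

The bread post shares no "meaning" direction with the query, so it ranks last even though no keywords were compared at all.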
I've been neglecting "related posts". I'll attempt to change my ways.
reply
70 sats \ 0 replies \ @kr 16 Jan
me too, i’m optimistic these posts can help people discover entirely new parts of stacker news than they would otherwise find by browsing top/recent/hot
reply
Whoo! Nice job @k00b, this is going to change everything!
reply
Yep I understood some of those words :P
reply
5 more than me.
reply
I got "the" and that is all
reply
369 sats \ 8 replies \ @ek 16 Jan
Searching for what are you working on
Now:
Was able to immediately find the recent What are you working on this week? post now
reply
Nice.
I often use quotation marks when I'm searching for an exact phrase with Google and know the exact wording. Not a coder, is that easy to implement?
reply
It should be fairly trivial to add, but we don't support that currently.
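A rough sketch of what quoted search could look like, assuming the simplest approach: pull the quoted phrases out of the query and require each one to appear verbatim. The function names `split_query` and `matches` are hypothetical, and this says nothing about how the search backend would actually handle it.

```python
import re

def split_query(query):
    """Split a raw query into (exact_phrases, remaining_terms)."""
    # Everything between a pair of double quotes is an exact phrase.
    phrases = re.findall(r'"([^"]+)"', query)
    # What is left after removing the quoted parts is treated as loose terms.
    remainder = re.sub(r'"[^"]*"', " ", query)
    return phrases, remainder.split()

def matches(text, query):
    """True if every quoted phrase occurs verbatim (case-insensitive) in text."""
    phrases, _terms = split_query(query)
    lowered = text.lower()
    return all(phrase.lower() in lowered for phrase in phrases)

title = "What are you working on this week?"
print(split_query('"what are you working on" this week'))
print(matches(title, '"working on"'))
print(matches(title, '"working at"'))
```

Loose terms are returned but not enforced here; in this sketch only the quoted phrases act as a hard filter.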
reply
I often use quotation marks when I'm searching
So do I. Implementing this would be awesome.
reply
1138 sats \ 3 replies \ @k00b OP 17 Jan
Done. fyi
reply
Oh shiiiiiiit. thank you k00b!
reply
217 sats \ 1 reply \ @0fje0 17 Jan
Indeed, thanks @k00b!
(Am I doing this right ???)
reply
Thanks, I’m a visual learner!
reply
Very cool. I found learning about ML tech is helpful even if not implementing, just to understand how so many things are changing. Looking forward to searching on SN more.
reply
This is great! SN search functionality was pretty poor before, so hopefully this makes old posts way more searchable!
Just curious, what language model are you using to get the semantic embeddings?
reply
this sentence transformer
reply
Cool, and are you storing embeddings using pgVector?
reply
We like using OpenSearch for search stuff so we store them there. I’m not sure if Postgres supports k-nn in arbitrary dimensions.
reply
Cool. I believe pgVector supports up to 2k dimensions for knn
reply
You're right! Thanks for bringing it to my attention. I wasn't even aware of that extension.
I've been disappointed by pg's fts in the past so I didn't consider using pg for this, but we could make better use of the embeddings if they were in pg.
reply
Here are some things in Search you may want to look at. Basically it seems like URLs aren't returned in search results when you'd think they would be.
I would expect this post Need feedback on product - Speed ( www.tryspeed.com ) to show up whenever I search for tryspeed. But here are my results:
https://stacker.news/search?q=tryspeed (tryspeed without quotes, lots shows up. The first search result that comes up does NOT appear to have tryspeed in the text anywhere that I could find. The closest I could find was the word speech. The second result has my targeted post.)
https://stacker.news/search?q=%22tryspeed%22 (tryspeed in quotes, nothing shows up)
https://stacker.news/search?q=%22tryspeed.com%22 (tryspeed.com in quotes, nothing shows up)
https://stacker.news/search?q=%22www.tryspeed.com%22 (www.tryspeed.com in quotes, my targeted post shows up)
Similar thing here:
https://stacker.news/search?q=discoverpraxis without quotes, doesn't find target post
https://stacker.news/search?q=%22discoverpraxis%22 with quotes, doesn't find target post
https://stacker.news/search?q=%22discoverpraxis.com%22 with quotes, with .com, DOES find the posts I thought it should.
reply
We do support url searching but it has to be explicit right now. Good feedback.
reply
I'm really loving the quoted search, which allows you to search for the EXACT term. It makes Search a lot more useful.
Thank you!
Interestingly, I only discovered it by chance, just playing around with search. I know that on Google search, quoting a term makes it required, so I tried it out, and it worked.
Embarrassingly - I did NOT find the ability to do this by looking at the search screen, even though I search a lot. Usually on a search screen, I'm accustomed to just looking for the "advanced search" pop-up, which gives you all the keywords and tricks you can use.
Now I see that the quoted search ability IS now mentioned on the search screen. It's on the bottom, under "More Filters" in the unusual font. Which apparently was something that I just was not able to see.
Here's an idea - to make this more discoverable, how about putting a little "advanced search" link next to the magnifying glass, that pops up the advanced search options?
Also another thing - what about making the default ranking NOT be zaprank? Or at least, having an option to have it be "best match"? A lot of times when you search, it surfaces random posts that share a root word with what you're looking for and happen to be heavily zapped. Many of these aren't useful, because the longer a post is, the more likely it is to contain the root of the word you're looking for.
reply
Great feedback.
what about making the default ranking NOT be zaprank?
We could give it less weight. Currently it's applied linearly, i.e. match_score*zaprank, but it might make more sense to do match_score*log10(zaprank).
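To make the difference concrete, here is a small sketch of the two weightings. The scores and zaprank values are invented for illustration, and it assumes zaprank is a positive number; log10 is undefined at zero and gives a weight of 0 when zaprank is 1, so a real version would need to handle those cases.

```python
import math

def linear_score(match_score, zaprank):
    """Zaprank applied linearly: match_score * zaprank."""
    return match_score * zaprank

def log_score(match_score, zaprank):
    """Zaprank damped with log10; assumes zaprank > 0."""
    return match_score * math.log10(zaprank)

# Hypothetical posts: a strong match with few zaps vs. a weak match with many zaps.
strong_match = {"match_score": 0.9, "zaprank": 10}
weak_match = {"match_score": 0.2, "zaprank": 1000}

for name, score in (("linear", linear_score), ("log10", log_score)):
    strong = score(**strong_match)
    weak = score(**weak_match)
    winner = "strong match" if strong > weak else "weak match"
    print(f"{name}: strong={strong:.2f} weak={weak:.2f} -> {winner} ranks first")
```

With linear weighting the heavily zapped weak match wins by a wide margin; with log10 the 100x zap difference shrinks to 3x and the better match comes out on top.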
reply
Google has proven that improving search is not a subtle upgrade, but a game changer. Well done SN. For the record, I am not a Google fan nor a surveillance company fan, but their business model relied on focusing on making search fast and relevant and it paid off hugely for them.
reply
Good news.
reply
I think the search box should be more visible like the google home page
reply
Thanks! We agree. Our navigation bar needs a makeover.
reply
everything that improves stacker.news is good news
reply
Great Job!👍
reply
It'd be great if you make the search engine better than Reddit's, because if I want to find something on Reddit, I have to go to Google and search for "myAnswer + reddit".
reply