I’m viewing the feed for 48 Hills, a site by folks who worked on an SF Bay Area alt-weekly many years back.
I click on Discover sites. The top result is, unsurprisingly, from the East Bay Express, another SF Bay Area alt-weekly. But this feed is now low-quality, with headlines like “Best online casinos in Australia” and “Top Bitcoin Casino sites with Fast Payouts”. I bet this feed used to be relevant to SF Bay Area folks, but not so much today.
I wonder if there’s some signal that this site isn’t a great thing to spotlight in “Discover sites”.
I bet that most folks subscribed to that East Bay Express feed who view it nowadays unsubscribe soon after they see what it’s turned into. Maybe a stronger signal than subscribers-in-common is subscribers-in-common-who-have-logged-in-recently?
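Here is a rough sketch of what I mean by that signal. Everything in it is hypothetical: the function name, the `last_seen` mapping, and the 30-day window are made up for illustration, and I have no idea how subscriber data is actually stored.

```python
from datetime import datetime, timedelta


def active_overlap(subs_a, subs_b, last_seen, now, window_days=30):
    """Count shared subscribers, and how many of them logged in recently.

    subs_a, subs_b: sets of user ids subscribed to each feed (assumed shape).
    last_seen: dict of user id -> datetime of last login (assumed shape).
    """
    cutoff = now - timedelta(days=window_days)
    shared = subs_a & subs_b
    active = set()
    for user in shared:
        seen = last_seen.get(user)
        # Users with no recorded login are treated as inactive.
        if seen is not None and seen >= cutoff:
            active.add(user)
    return len(shared), len(active)
```

A feed whose shared subscribers are mostly inactive would rank lower under this idea than one whose shared subscribers still log in.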
I dunno if there’s a worth-it fix. It just struck me.
I’m about to publish the blog post for Discover tonight, and in it I try to explain how these Discover sites are calculated. It has nothing to do with personal data, and I’ll add an explainer about that to the post. It all has to do with word similarity, from the site title and description to the titles of the first few stories. So there was probably enough in the RSS feed’s description to capture the two feeds as similar.
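To give a feel for how word similarity can pair two feeds, here is a minimal sketch using plain word counts and cosine similarity. This is only an illustration, not the actual Discover implementation: the function names, the five-story cutoff, and the choice of a bag-of-words comparison are all assumptions.

```python
import math
import re
from collections import Counter


def feed_text(title, description, story_titles, max_stories=5):
    """Join the site title, description, and first few story titles."""
    return " ".join([title, description, *story_titles[:max_stories]])


def word_counts(text):
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under a scheme like this, two feeds whose descriptions both mention the SF Bay Area would score as similar even if one feed’s recent stories have drifted into spam.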
There is definitely space for some improvements.
Checking the suggestions for The Verge, the fourth one is Engadget, which hasn’t been updated since 2020.
For Ars Technica there are only two suggestions: the first is the feed I’m already subscribed to and was viewing, and the second is titled “Ars Technica - All Content”, with the last five stories of course being identical. (I guess there is no site description to go on?)
I see the blog post. Nice!
For sure there could be improvement, but I spent a while looking at hundreds of site results and created a number of catches for bad results: repeat site titles (but not subsets, as that could be interesting if you’re looking for more feeds from the same site), missing stories, and spam results. Also, I made it easy to scroll past sites of no interest.
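Roughly, the catches described above could look like the sketch below. It is not the real code: the dictionary shape, the function names, and the spam word list are hypothetical stand-ins.

```python
# Hypothetical spam markers, for illustration only.
SPAM_WORDS = ("casino", "betting")


def normalize(title):
    """Compare titles case-insensitively and ignore surrounding whitespace."""
    return title.strip().lower()


def looks_like_spam(story_titles):
    """True if any story title contains one of the spam markers."""
    return any(word in title.lower() for title in story_titles for word in SPAM_WORDS)


def filter_results(current_title, candidates):
    """Drop exact repeat titles, sites with no stories, and spammy sites.

    candidates: list of dicts with "title" and "stories" keys (assumed shape).
    Titles that merely contain the current title are kept on purpose.
    """
    seen = {normalize(current_title)}
    kept = []
    for site in candidates:
        title = normalize(site["title"])
        if title in seen:
            continue  # exact repeat of a title already shown
        if not site["stories"]:
            continue  # missing stories
        if looks_like_spam(site["stories"]):
            continue  # spam result
        seen.add(title)
        kept.append(site)
    return kept
```

With this shape, “Ars Technica - All Content” survives when viewing “Ars Technica” because only exact title matches are dropped.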
So while there are some mismatched results, the majority are fascinating dives into related stories.