Google News Alerts - article extraction?

Hello!

I’ve got a problem that Newsblur may not be able to solve, but I thought I’d try anyway.

Issue: I have a number of custom Google news alerts which were converted into RSS feeds many years ago. They continue to work just fine for their intended purpose. However, when I click on an article in the Newsblur iOS, iPad, or Web interface, I only get the article title, and maybe a sentence from the actual article.

I realise that’s likely exactly what Google wants: to see the full contents you have to click through, so they can show ads, extract your DNA, sell it to an LLM, etc.

However, a number of RSS readers work around this somehow and extract the content itself.

For example, if I point ReadKit on my iPad or iPhone at Newsblur, select an article, and put ReadKit in “Reader” mode for that feed, the article content appears. It doesn’t work 100% of the time, but it’s more than enough for me.

In Newsblur’s native apps, however, when I select an article I may see a brief error about being unable to fetch the text, and that’s it.

Since this works when Newsblur is the back-end for other RSS apps, it’s likely some client-specific thing that’s not working for this type of feed.

If this is more of a “feature request” and not a “problem” I’m happy to delete this post (or The Powers That Be can).

Got an example feed I can take a look at?

Yes, try:

https://www.google.com/alerts/feeds/13613501410405625340/12731662143552072270

Thank you for taking a look!

Found it! Google Alerts wraps every article link in a google.com/url redirect that returns a JavaScript redirect page instead of a real HTTP redirect. So when NewsBlur tries to fetch the original text, it gets a tiny Google page instead of the actual article.

Just pushed a fix that extracts the real URL from those redirects, both when storing new stories and when fetching original text for existing ones. Should be deployed shortly.
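
For anyone curious, here is a rough sketch of what unwrapping such a link can look like. It is not the actual NewsBlur code: the function name `unwrap_google_redirect` is made up for illustration, and it assumes the real article address is carried in the `url` query parameter of the google.com/url link (with `q` tried as a fallback).

```python
from urllib.parse import urlparse, parse_qs


def unwrap_google_redirect(link: str) -> str:
    """Return the target of a google.com/url redirect link, or the link unchanged.

    Hypothetical helper: assumes the destination lives in the `url`
    (or, failing that, `q`) query parameter.
    """
    parsed = urlparse(link)
    host = (parsed.hostname or "").lower()

    # Only touch links that look like Google's redirect endpoint.
    if host not in ("google.com", "www.google.com") or parsed.path != "/url":
        return link

    # parse_qs also percent-decodes the parameter values.
    params = parse_qs(parsed.query)
    for key in ("url", "q"):
        values = params.get(key)
        if values and values[0].startswith(("http://", "https://")):
            return values[0]

    # Nothing usable found, so fall back to the original link.
    return link
```

Reading the target out of the query string avoids having to fetch and parse the JavaScript redirect page at all, and links that are not Google redirects pass through untouched.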

Outstanding. Thank you!! I probably should have mentioned this nearly a year ago when I started using Newsblur. Really appreciate the insanely fast turnaround.

Is it possible to do this for the Google News feed too? Perhaps extract the full text of the article whose title matches the RSS item’s title, since the feed content only displays a list of the articles in the cluster?
This is the feed in question → https://www.newsblur.com/site/8690351/
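
To make the idea concrete, here is a minimal sketch of the matching step. It assumes the item content is an HTML list of `<a>` links whose anchor text is the headline, and that the item title either equals that headline or starts with it (for instance when a source name is appended). The names `link_matching_title` and `_LinkCollector` are hypothetical, and the link it returns may itself still be a redirect that needs resolving.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect (href, anchor text) pairs from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def _norm(text):
    # Collapse whitespace and ignore case so small differences don't matter.
    return " ".join(text.split()).casefold()


def link_matching_title(item_title, content_html):
    """Return the href of the first cluster link matching the item title, else None."""
    collector = _LinkCollector()
    collector.feed(content_html)
    collector.close()

    title = _norm(item_title)
    for href, text in collector.links:
        anchor = _norm(text)
        if anchor and (anchor == title or title.startswith(anchor)):
            return href
    return None
```

If no link matches, the function returns `None`, so the reader could fall back to showing the cluster list as it does today.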