Feed Retrieval Problems: 403 Errors

I wrote to Samuel directly yesterday, but haven’t yet received any response. I hope he’s fighting this issue.

I’ve contacted a few of the admins of pages that cause trouble for me, and they claim they don’t block news readers, even ones that hammer their sites to get the latest feed. At the same time, I must admit none of them responded directly to my request to whitelist NB in their Cloudflare environment…

Since yesterday I’ve been testing other RSS readers and they don’t seem to suffer from this issue.

I’ll switch if this stays unresolved. I’ll miss the premium archive feature, but there are similar solutions available from the competition (though I have to say, they are not as thorough).

It would be great if you could follow up with those publishers, cc’ing me, and ask them to double-check whether they are enforcing bot blocking on Cloudflare, since this is a significant enough issue that I’d love to resolve it once and for all. It’s likely that Cloudflare’s bot-blocking heuristic is blocking NewsBlur.
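For anyone who wants to check a failing feed before writing to the publisher, here is a rough sketch of how a 403 could be sorted by its response headers. It assumes Cloudflare’s usual markers (`Server: cloudflare`, a `CF-RAY` header, and `cf-mitigated: challenge` on challenge pages); the function name and labels are made up for illustration and are not anything NewsBlur actually uses.

```python
def classify_403(status, headers):
    """Roughly label a 403 response by where it probably came from.

    Hypothetical helper: the labels are illustrative, and the header
    checks are heuristics, not proof of who issued the block.
    """
    if status != 403:
        return "not-a-403"

    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): str(v).lower() for k, v in headers.items()}

    # Assumption: Cloudflare marks challenge pages with this header.
    if h.get("cf-mitigated") == "challenge":
        return "cloudflare-challenge"

    # The site sits behind Cloudflare, but the 403 may still be the
    # origin's own response passed through.
    if h.get("server") == "cloudflare" or "cf-ray" in h:
        return "behind-cloudflare"

    return "other"
```

A "behind-cloudflare" result only says Cloudflare is in the path, so it is a reason to ask the publisher about bot blocking rather than proof of it.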

It’s possible that I could spin up another task server that only takes sites returning 403s and attempts to fetch them to get around the blocked op restriction.
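As a rough illustration of that idea (not NewsBlur’s actual code), the flow could look like the sketch below: fetch everything through the normal path, collect the feeds that come back 403, and retry only those through a second fetcher. The `primary` and `fallback` callables are placeholders for whatever each server would really use.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A fetcher takes a feed URL and returns (status_code, body).
Fetcher = Callable[[str], Tuple[int, str]]


def fetch_with_fallback(
    urls: Iterable[str], primary: Fetcher, fallback: Fetcher
) -> Dict[str, Tuple[int, str]]:
    """Fetch every URL with `primary`; retry only the 403s with `fallback`."""
    results: Dict[str, Tuple[int, str]] = {}
    blocked: List[str] = []

    for url in urls:
        status, body = primary(url)
        results[url] = (status, body)
        if status == 403:
            # Queue for the second pass instead of retrying inline, so the
            # fallback server only ever sees the blocked feeds.
            blocked.append(url)

    for url in blocked:
        # Keep whatever the fallback returns, even if it is another 403.
        results[url] = fallback(url)

    return results
```

Keeping the two passes separate means the fallback machine handles a small list of feeds, which matters if it runs on different and possibly more expensive hosting.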

Yeah, this is probably it. I’ve been doing some extensive research on Cloudflare’s heuristics, and I’ve seen cases where it blocks a reader or even a search engine, even though that bot has been approved through its verified bots program.

It’s a shame, because having RSS readers like NewsBlur go to every single website owner to try to get them to add a custom WAF rule exception is just not feasible.

I’ll probably be adding a special 403 feed fetcher soon, since I agree, it isn’t tenable to ask each individual publisher to whitelist NewsBlur.

Sure, let me know if you need any help. It should probably use residential, data center, or mobile IPs, because Cloudflare will likely block anything else as well.

I think you’ve already applied and got NewsBlur approved as a verified bot, right? Out of curiosity, did they send you any notification or verification email when it was approved?

Yeah, NewsBlur has been an approved bot for a long while, but we switched servers earlier this year and that’s when the trouble started. I’m going to be spinning up a couple of servers on the old hosting provider and seeing if that helps. It’ll be a week before I can get to this, but it’s my highest priority right now.

Open RSS is cracking down on Cloudflare websites that are causing problems for RSS readers.

I’m working on a fix. It’ll take a few more days, I think.

Thanks for your efforts on this, Sam!

I see that four of the sites I reported now retrieve OK if I do the retrieval manually (which is OK by me). Will see how it goes…

NewsBlur is still giving the 403 error but that’s one I could live without if I had to… Thanks, again!

Good news: Open RSS published a post addressing this issue, and a Cloudflare employee responded on the Hacker News thread: Using Cloudflare on your website could be blocking RSS users | Hacker News

So we might see a Cloudflare-wide resolution soon.

Now that is good news!!