Comic Book Resources feeds produce duplicate posts

This doesn’t occur in all Comic Book Resources RSS feeds (http://www.comicbookresources.com/rss), but at the very least the problem persists in these two feeds:

http://www.comicbookresources.com/fee…

http://www.comicbookresources.com/fee…

In both cases, when the feeds are added to NewsBlur, every article shows up twice. One of the two links to the article works and the other is always broken.

A recent example from the columns feed (http://www.comicbookresources.com/fee…):

This one works!: www.comicbookresources.com/?page=arti…

Doesn’t work, but shows up in the feed nonetheless.: www.comicbookresources.com/?page=arti…

There is ALWAYS an entry with the additional ‘amp’ in the address and these are the ones that appear not to work.
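For anyone who wants to see what that extra ‘amp’ would mean in practice, here is a minimal sketch, assuming the broken links are simply the working links with the ampersand escaped one time too many (that is a guess, not something confirmed in this thread). The function names and the example.com URLs are made up for illustration.

```python
def normalize_link(link):
    """Collapse repeated '&amp;' escaping until only bare '&' separators remain."""
    while "&amp;" in link:
        link = link.replace("&amp;", "&")
    return link


def dedupe_by_link(entries):
    """Keep the first entry for each normalized link (entries are dicts with a 'link' key)."""
    seen = set()
    unique = []
    for entry in entries:
        key = normalize_link(entry["link"])
        if key not in seen:
            seen.add(key)
            unique.append({**entry, "link": key})
    return unique


if __name__ == "__main__":
    # Hypothetical entries: the second link differs only by the escaped ampersand.
    stories = [
        {"title": "Column", "link": "http://example.com/?page=article&id=1"},
        {"title": "Column", "link": "http://example.com/?page=article&amp;id=1"},
    ]
    for story in dedupe_by_link(stories):
        print(story["link"])
```

If the assumption holds, both entries collapse to the same working address, which would match one good link and one broken link per article.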

When going directly to the rss feeds, the duplicate entries do not appear.

This has been occurring for a long while. I had actually deleted the feed for more than a few months. I just recently re-added it and saw the problem was still persisting. I did a search and didn’t see that anyone had previously complained about it, which I found odd since there are currently 389 subscribers to the ALL feed (http://www.comicbookresources.com/fee…).

Anyway, despite this hiccup, I love the product! Just looking for a solution on this issue. Thanks!

It’s due to their feed, unfortunately. You are seeing an incorrectly published feed, probably when the cache clears on their end. NewsBlur fetches the feed so often that it will oftentimes catch bugs like this. The publisher should know that they are publishing bad data every so often.

The other issue is that NewsBlur won’t de-dupe and merge the stories because they are comics with very little text, so there just isn’t enough to judge if it’s a dupe.

Hi Samuel,

I’m looking into this for CBR after receiving the report from Ryan. However, we’re not sure what you mean by “You are seeing an incorrectly published feed.”

The feed appears to be correct and passes validation on every RSS validation site we’ve checked it against.

We’re not sure what is causing the feed to appear corrupt to the NewsBlur aggregator. Any insights you can lend as to what your system is hiccuping on would be greatly appreciated.

Also, we’re not sure what you mean by “stories are…comics with very little text.” For the most part our content is text-rich news articles, so this comment puzzles us.

  • Rob

I’ve also been experiencing this issue with NewsBlur and CBR’s news feed. As best I can tell, it only presents itself in CBR’s feed; I never see the behavior in any other feed in NewsBlur. Tonight I finally decided to see if someone else was having the problem. It’s not consistent: sometimes it looks fine with 1:1, sometimes some stories show once and some show 1:2, and later new ones appear 1:1 again. One version works fine and the other results in a 404, so it’s always a crapshoot when you open an article from CBR via NewsBlur. I’m attaching some screenshots so everyone can see what is going on inside NewsBlur.

I’ve run the feed on the site through several external XML and RSS/Atom validators, and it is valid. I have also checked the raw feed and there is no duplication. The only thing that looks remotely “weird” but still valid is a CRLF in the description element on the SilverSurfer/Thor item as I see it now. But it is a pretty plain, by-the-book RSS feed: link text, image, etc. I’m also confused about what Samuel may have seen.
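In case it helps anyone repeat the raw-feed check, here is a rough sketch of one way to look for duplicated links in a saved copy of an RSS 2.0 feed, using only the Python standard library. The function name is hypothetical, and it only reflects whatever single copy of the feed you hand it, so it would not catch a bad version that is published only some of the time.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import List


def duplicate_links(rss_xml: str) -> List[str]:
    """Return every <link> value that appears in more than one <item>."""
    root = ET.fromstring(rss_xml)
    # findtext returns the element text, with XML entities already decoded.
    links = [item.findtext("link", default="").strip() for item in root.iter("item")]
    counts = Counter(links)
    return [link for link, count in counts.items() if link and count > 1]
```

An empty list means no link is repeated in that copy of the feed. Running it on copies fetched at different times would be one way to see whether the published feed ever changes.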

It’s just really weird and kind of annoying, but it seems to only affect NewsBlur and CBR. I only get the CBR “News” feed, so I don’t know whether any of the other feeds see the same behavior.

Thanks for the detailed message, Joel. Like you, we’re still a bit puzzled as to why the NewsBlur aggregator doesn’t digest our feed properly.

CBR would be happy to resolve this for its readers on NewsBlur. We do need some guidance from the NewsBlur support people to let us know exactly what the issue is.

Hey Samuel,

I noticed today that one of my other RSS feeds showed duplicate entries. I don’t recall this feed ever exhibiting this before. But I inspected the raw feed and found no dupes, and it also checks out with several validators. As with CBR, nothing seems unusual about the feed. Maybe this will give you another point of comparison. Thanks.

http://feeds.bizjournals.com/bizj_pho…