Old items reappearing in Reddit feeds

I follow several Reddit groups by RSS feed. Lately, I have been noticing older posts showing up again as unread.

Wondering if it’s possible to do something about this? It started happening within the past few weeks.


Been noticing the same here too.

Yeah, I’ve observed this in recent weeks as well. It could be that whatever logic NewsBlur uses to detect duplicate items is no longer working for Reddit feeds.

Examples of feeds that I see this issue on:
http://www.reddit.com/r/museum/top.rss?sort=top&t=month
http://www.reddit.com/r/truereddit/top.rss?sort=top&t=month

In each of these, a subset of posts appears twice, typically in succession and never more than twice in total.

It is highly likely that this is due to the upcoming redesign, which uses Python 3 and is now serving half of NewsBlur’s feed fetching. The story checker that looks for duplicates is probably timing out. I raised the timeout from 2 to 5 seconds, but that still may not be enough.
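To illustrate the timeout theory, here is a minimal sketch of how a failed duplicate lookup could let an already stored story back in as new. This is not NewsBlur's code: `story_key`, `add_new_stories`, the `lookup_ok` flag, and the `guid`/`link` field names are all hypothetical, and it assumes that a timed-out check falls through to "treat as new".

```python
import hashlib


def story_key(story):
    """Stable key for a story: hash of its GUID, falling back to its link.

    Hypothetical helper; the field names are assumptions.
    """
    ident = story.get("guid") or story.get("link") or ""
    return hashlib.sha1(ident.encode("utf-8")).hexdigest()


def add_new_stories(stored_keys, fetched, lookup_ok=True):
    """Return the fetched stories that get treated as new.

    stored_keys is a set of keys for stories already saved; it is updated
    in place. lookup_ok=False simulates the duplicate check timing out,
    in which case (by assumption) every fetched story is saved again.
    """
    added = []
    for story in fetched:
        key = story_key(story)
        if lookup_ok and key in stored_keys:
            continue  # known story, skip it
        stored_keys.add(key)
        added.append(story)
    return added
```

With a working lookup, fetching the same feed twice adds nothing the second time. With `lookup_ok=False`, the whole batch comes back as new, which would match a group of stories being duplicated together.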

What I’m noticing is that it doesn’t happen for every story, but when it does happen, a group of stories gets duplicated. I’m wondering if there are characteristics of the feeds that get duplicated stories.
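One way to look for such characteristics would be to group a feed's stories by link and compare the copies field by field. The sketch below assumes the stories are available as plain dicts with a `link` key; `find_duplicates` and `differing_fields` are hypothetical helpers, not part of NewsBlur.

```python
from collections import defaultdict


def find_duplicates(entries):
    """Group entries by link and keep only links seen more than once."""
    by_link = defaultdict(list)
    for entry in entries:
        by_link[entry["link"]].append(entry)
    return {link: group for link, group in by_link.items() if len(group) > 1}


def differing_fields(group):
    """Names of the fields whose values differ between copies of a story."""
    fields = set().union(*(entry.keys() for entry in group))
    return sorted(
        name for name in fields
        if len({str(entry.get(name)) for entry in group}) > 1
    )
```

If the copies of a duplicated story consistently differ in one field, such as the GUID or the date, that field is a good candidate for what is confusing the duplicate check.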

Also, FYI, it comes down to this function taking too long: NewsBlur/models.py at dashboard3 · samuelclay/NewsBlur · GitHub

Still happening quite frequently for me. If increasing the timeout made a difference I’m not noticing it here.

I’m a fellow developer - is there anything I can do to help you debug this?

I usually read all my RSS feeds every day, so they’re all marked as read before I go to bed. Then, the next morning, I see many Reddit articles in my feeds again that I’d read the day before. With the way I use NewsBlur, I actually very rarely see two identical articles next to each other as shown in the screenshot above. I tend to see them a day or so apart. Would that still tie in with the timeout theory?

Honestly, I believe it’s because I have the feed fetchers split, half on www and half on beta, so it’s likely there’s some disconnect between the two. I’ll see about moving all of the feed fetchers to beta today.