Mass unread flag flip

Happened again with BoingBoing, jumped to 182, showing posts from July 18.

Wild. Seems like it’s happening around the same time, but for different feeds.

It’s becoming clear this is due to having more than 500 stories in the archive while unread counts max out at 500. Even if your unread count only goes up to 160, that can be because there were really 760 stories’ worth of unreads and the count maxed out at 500.

So the real fix is to change how unreads are calculated for non-premium archive accounts so that only the 500 most recent stories can count as unread, even if the 501st story is less than 30 days old.
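A minimal sketch of that rule, using invented names and constants (UNREAD_CAP, UNREAD_WINDOW_DAYS, and the function itself are assumptions for illustration, not NewsBlur’s actual code): the key point is that the recency cap is applied before the date window, so stories past the cap can never flip back to unread.

```python
from datetime import datetime, timedelta

UNREAD_CAP = 500          # assumed archive limit for non-premium archive accounts
UNREAD_WINDOW_DAYS = 30   # assumed unread date window

def unread_story_ids(stories, read_ids, now):
    """Hypothetical unread calculation.

    `stories` is a list of (story_id, published_datetime) tuples,
    newest first. Only the UNREAD_CAP most recent stories are even
    eligible to be unread, regardless of the date window.
    """
    cutoff = now - timedelta(days=UNREAD_WINDOW_DAYS)
    eligible = stories[:UNREAD_CAP]  # cap applied before the date cutoff
    return [sid for sid, published in eligible
            if published >= cutoff and sid not in read_ids]
```

Under this rule a feed with 760 recent stories reports at most 500 unread, and reading stories can never surface the 501st-newest story as “new” unread.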

Thanks! I can confirm I’m using the Premium sub, so that makes sense.

Is changing how unreads are calculated for Premium subscriptions something that you can implement for that particular plan?

That doesn’t make a lick of sense to me.

I frequently clear my Everything river of all unread stories. I’ve done Mark All as Read multiple times since you requested it last week. It still happened.

When this mass flag flip happens I’m seeing hundreds of stories I’ve already read.

I don’t care about the count of unread items itself; that’s just a visible symptom of the fact that NewsBlur has had another one of these mass flag flip events.

If the problem is that there are over 500 unread stories in the archive, why does NewsBlur tell me there’s nothing unread? And why does it happen on the same feeds and not every feed?

Same happening here. ~10 unread last night, ~500 this morning for The Verge - All Posts

It’s happened maybe 6 times over the last month.

Another user just reported that it happened to The Verge, so I’m wondering if there’s something fishy going on with how NewsBlur processes new stories, but it’s surprising. Did The Verge reset its count for anybody else? I’m subscribed to that feed and it still shows the correct number, so it isn’t happening to everybody.

Verge did it to me last night too.

BoingBoing (NewsBlur) did it again in the last 24 hours.
79 unread going back to Sunday July 24 04:58.

I had read everything and performed a Mark All as Read when I made my post yesterday.

EDIT: And they are definitely stories I’ve already seen and marked as read - not new unread stories from those dates.

Hi, The Verge reset for me at some point in the past 24 hours. It went from <30 stories to 500 unread.

Another instance: Kotaku

258 unread this morning, <10 last night

By Kotaku, you mean this feed?

That’s the one

Ok, I’m following Boing Boing, The Verge, and Kotaku and am watching them regularly to see if the numbers jump. We’ll figure this one out soon I hope!

Happened again with HackerNews. I got 500 unread overnight.

edit: I marked everything read from last week to last night. Refreshed, and it pulled in even more already-marked-as-read articles.

And again, just now, HackerNews pulling in 500 articles that have already been read.

Man, this is so annoying. @samuelclay what can we do to try to prioritize this issue?

It’s prioritized, I just need to track it down. These things take time, esp. when it isn’t site-wide. I’ve been following all of these sites and the unread flip hasn’t happened, so I need to figure out what’s different.

The Verge again today, ~200 unread.

I noticed that calculate_feed_scores takes very different codepaths depending on whether the subscription has training data for the classifier.

I wonder if this bug is only affecting people who have never fed data to the classifier? I haven’t.

The calls to story_hashes to get unread stories are different in interesting ways – for example, they pass different values for cutoff_date.
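To make that divergence concrete, here is a toy model (everything here is invented for illustration; only the names story_hashes and cutoff_date come from the thread, and this is not NewsBlur’s actual logic): if the two codepaths pass different cutoff_date values, the candidate-unread sets they produce differ by exactly the older stories that the deeper cutoff reaches back to, which has the shape of a mass flag flip.

```python
from datetime import datetime, timedelta

def story_hashes(stories, cutoff_date):
    # Toy stand-in: every story published after cutoff_date is a
    # candidate unread, regardless of its previous read state.
    return {h for h, published in stories if published > cutoff_date}

now = datetime(2022, 7, 25)
# One story per day for the last 60 days, newest first.
stories = [(f"s{i}", now - timedelta(days=i)) for i in range(60)]

# Hypothetical: one branch reaches 45 days back, the other only 30.
deeper = story_hashes(stories, cutoff_date=now - timedelta(days=45))
shallow = story_hashes(stories, cutoff_date=now - timedelta(days=30))

# The deeper cutoff resurfaces older stories the shallow branch never
# considers: previously read stories suddenly counted as unread again.
extra = deeper - shallow
```

If a subscription sometimes takes one branch and sometimes the other, the difference set would show up as hundreds of already-read stories flipping back to unread.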

That’s an interesting thought but remember that it counts correctly 99% of the time. It’s only breaking on large subscriber feeds very rarely. Those feeds will fetch every 5 minutes and have dozens of new stories in a day.

My guess is that the unread counter routine is being killed due to taking too long and it may stop in the middle of a recount. But it shouldn’t blow away your read stories.
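One defensive pattern against that failure mode, sketched with invented names (this is a guessed mitigation, not the actual fix): compute the entire new unread set before publishing it, so a worker killed partway through a slow recount leaves the previous count intact instead of a half-cleared one.

```python
def recount_unreads(feed_id, compute_unreads, counts):
    # Sketch: build the full result first, then publish it in one
    # assignment. If the worker is killed inside compute_unreads(),
    # the old entry in `counts` survives untouched.
    new_unreads = compute_unreads(feed_id)   # slow part; old state stays live
    counts[feed_id] = new_unreads            # single-step swap
    return new_unreads
```

The same compute-then-swap idea applies whether the counts live in a dict, a Redis key, or a database row: never clear the old state before the replacement is fully built.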

I just fixed the other issue that’s been new since July 1st, so this is the last issue to fix. I’m thinking about this constantly and may try to implement some additional checks next week.