iLounge feed missing items

NewsBlur URL: https://www.newsblur.com/site/9883/il…

NewsBlur seems to have missed all the reviews from today. I can’t see anything obviously wrong with the Atom feed.

I think I understand what’s happening here. You can see the news articles are present but most of the reviews are missing. The problem may be that most reviews start out as “first looks” and are updated with the full reviews, days or sometimes weeks later. NewsBlur seems to say “oh, I already have this item” and neither updates the first look to the review, nor marks it as new.

The other feed readers I’ve tried (was NetNewsWire 3.x above, is Safari 5.1/PubSub in this example) are not exhibiting this issue. Here’s a more current example:

Thanks.

So NewsBlur will update a story’s date as long as the new date is no more than 24 hours into the future. However, iLounge may be changing the GUID when they do this. I’m not sure why it’s reverting to the old date, though.

Here’s a specific example. Go to the site in NewsBlur (https://www.newsblur.com/site/9883/il…) and search for “Chicken”. Click on the First Look: Fuse Chicken Une Bobine… This will show the first look but open the review.

Here’s what I got out of NewsBlur for the original article (which is old enough to have aged off the feed at this point).

{
  "comment_count": null,
  "comment_user_ids": [],
  "friend_comments": [],
  "guid_hash": "24d656",
  "id": "Reviews-32418",
  "image_urls": [
    "http://assets.ilounge.com/images/reviews_fusechicken/unebobinelightning/cache/5-240x175.jpg",
    "http://feeds.feedburner.com/~ff/ilounge?d=yIl2AUoC8zA",
    "http://feeds.feedburner.com/~ff/ilounge?d=qj6IDK7rITs",
    "http://feeds.feedburner.com/~ff/ilounge?d=xQJHqnA6YJA",
    "http://feeds.feedburner.com/~ff/ilounge?d=Juqk2uonMRc"
  ],
  "intelligence": {
    "author": 0,
    "feed": 0,
    "tags": 0,
    "title": 0
  },
  "long_parsed_date": "Wednesday, November 6th 1:13pm",
  "public_comments": [],
  "read_status": 1,
  "reply_count": 0,
  "share_count": null,
  "share_user_ids": [],
  "short_parsed_date": "06 Nov 2013, 1:13pm",
  "story_authors": "Nick Guy",
  "story_content": "[...]",
  "story_date": "2013-11-06 19:13:44",
  "story_feed_id": 9883,
  "story_hash": "9883:24d656",
  "story_permalink": "http://www.ilounge.com/index.php/ipod/review/fuse-chicken-une-bobine-for-iphone-5-5c-5s/",
  "story_tags": [
    "adapters + cables - home",
    "office",
    "lightning connector - power",
    "data"
  ],
  "story_timestamp": "1383765224",
  "story_title": "First Look: Fuse Chicken Une Bobine for iPhone 5/5c/5s"
}

Everything about that refers to the original First Look rather than the current entry:

<item>
    <title>Review: Fuse Chicken Une Bobine for iPhone 5/5c/5s</title>
    <guid ispermalink="false">Reviews-32418</guid>
    <link>http://www.ilounge.com/index.php/reviews/entry/fuse-chicken-une-bobine-for-iphone-5-5c-5s/</link>
    <description>[...]</description>
    <category>Adapters + Cables - Home / Office</category><category>Lightning Connector - Power / Data</category>
    <creator>Nick Guy</creator>
    <date>2013-11-13T21:16:50+00:00</date>
</item>

Something is weird about the tags — it’s splitting on /’s somewhere, but that shouldn’t affect anything, right?
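The stored tags are at least consistent with each feed category being lowercased and split on "/". A minimal sketch of that guess follows; `split_categories` is a hypothetical helper written for illustration, not NewsBlur's actual tag-handling code.

```python
def split_categories(categories):
    """Lowercase each category and split it on '/', trimming whitespace."""
    tags = []
    for category in categories:
        for part in category.split("/"):
            part = part.strip().lower()
            if part:  # skip empty pieces such as a trailing '/'
                tags.append(part)
    return tags


# The two <category> values from the feed item above.
feed_categories = [
    "Adapters + Cables - Home / Office",
    "Lightning Connector - Power / Data",
]

print(split_categories(feed_categories))
```

Running this on the two categories yields the same four tags that appear in `story_tags`, which suggests the split is cosmetic and unrelated to the missed update.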

So the story GUID is the same and the date is not in the future, but NewsBlur is not updating it or showing it as new…?

Ah-ha, it’s old enough that the story is no longer considered updatable. I hadn’t considered this use case, since it’s a huge drain and happens very rarely. When I check for updated stories, I get all stories newer than the oldest story in the feed.

existing_stories = dict((s.story_guid, s) for s in MStory.objects(
    # story_guid__in=story_guids,
    story_date__gte=start_date,
    story_feed_id=self.feed.pk
).limit(max(int(len(story_guids)*1.5), 10)))

I used to match on GUIDs (commented out above), but it was too finicky. Unless I’m convinced otherwise, this is too much of an edge case to justify making every single feed fetch take an extra second to load the extra stories. As it stands, if a story is updated while the original is still within the RSS feed’s date range, it’ll get updated in NewsBlur. But if they update a story AND change the date, it’s going to be missed. They should be issuing new GUIDs.
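To make the difference concrete, here is a small in-memory sketch of the two lookup strategies. It only illustrates the idea; the dictionaries, the `News-1` entry, and both helper functions are made up for this example and are not NewsBlur code.

```python
from datetime import datetime

# Stories already stored for the feed: GUID -> stored story date.
stored = {
    "Reviews-32418": datetime(2013, 11, 6, 19, 13, 44),  # original First Look
    "News-1": datetime(2013, 11, 13, 12, 0, 0),          # hypothetical news item
}

# Entries in the freshly fetched feed: GUID -> date given in the feed.
fetched = {
    "Reviews-32418": datetime(2013, 11, 13, 21, 16, 50),  # now the full review
    "News-1": datetime(2013, 11, 13, 12, 0, 0),
}


def existing_by_date(stored, start_date):
    """Date-window lookup: stored stories dated on or after start_date."""
    return {guid: date for guid, date in stored.items() if date >= start_date}


def existing_by_guid(stored, guids):
    """GUID lookup: stored stories whose GUID appears in the fetched feed."""
    return {guid: stored[guid] for guid in guids if guid in stored}


# The window starts at the oldest story currently in the feed.
start_date = min(fetched.values())

by_date = existing_by_date(stored, start_date)
by_guid = existing_by_guid(stored, fetched)

print("found by date window:", sorted(by_date))
print("found by GUID:", sorted(by_guid))
```

In this sketch the stored First Look is dated a week before the oldest entry in the feed, so the date window never returns it and the updated entry has nothing to be compared against. A GUID lookup would find it, at the cost of the extra query work described above.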

Clearly a GUID match is already happening, or the new version of story would be inserted as well, right? So where would a 1-second delay (that seems like a *lot*) come from?

If the story is no longer considered updatable, couldn’t you just delete the old version? I don’t care about diffing the old and new versions; if it’s been that long, all the information about the old story could be lost, for all I care.

Independent of scalability concerns (since I don’t understand enough to make an intelligent statement), this would make sense to me, in the case where a new version of a story is received when the old version is no longer in the feed:

  1. If the new version has a later date on it, then replace the old version with the new version and mark it unread (my case).
  2. If the new version has the old date on it, silently replace the old version with the new version and don’t bother to mark it unread (it’s probably a minor update/typo correction and not worth bothering the user about).

If you could just do #1 regardless of the date, that’d also be fine (and probably easier). But NewsBlur seems to be picking “don’t update the article at all, and leave the old version without any warning to the user”, which is basically data loss and not fun at all.
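The two cases proposed above could be sketched roughly as follows. This is only an illustration of the suggested behavior, assuming a simple `Story` record with a `read` flag; the class and `apply_update` are hypothetical and do not reflect how NewsBlur stores stories or read state.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Story:
    guid: str
    title: str
    date: datetime
    read: bool = False


def apply_update(old, new):
    """Replace old with new; mark it unread only if the date moved forward."""
    if new.date > old.date:
        # Case 1: later date, treat it as a new story for the reader.
        return Story(new.guid, new.title, new.date, read=False)
    # Case 2: same (or earlier) date, replace silently and keep the read state.
    return Story(new.guid, new.title, new.date, read=old.read)


first_look = Story(
    "Reviews-32418",
    "First Look: Fuse Chicken Une Bobine for iPhone 5/5c/5s",
    datetime(2013, 11, 6, 19, 13, 44),
    read=True,
)
review = Story(
    "Reviews-32418",
    "Review: Fuse Chicken Une Bobine for iPhone 5/5c/5s",
    datetime(2013, 11, 13, 21, 16, 50),
)

updated = apply_update(first_look, review)
print(updated.title, "| read:", updated.read)
```

Doing case 1 unconditionally would amount to always returning the new story with `read=False`.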

Well, here they are:

[Nov 15 18:19:25] ---> [iLounge | All Things iPod, iPh] IntegrityError on new story: Reviews-32406 - Tried to save duplicate unique keys (E11000 duplicate key error index: newsblur.stories.$story_hash_1 dup key: { : "9883:b05a87" })
[Nov 15 18:19:26] ---> [iLounge | All Things iPod, iPh] IntegrityError on new story: Reviews-32434 - Tried to save duplicate unique keys (E11000 duplicate key error index: newsblur.stories.$story_hash_1 dup key: { : "9883:35b3f5" })
[Nov 15 18:19:26] ---> [iLounge | All Things iPod, iPh] IntegrityError on new story: Reviews-32441 - Tried to save duplicate unique keys (E11000 duplicate key error index: newsblur.stories.$story_hash_1 dup key: { : "9883:0b1943" })
[Nov 15 18:19:26] ---> [iLounge | All Things iPod, iPh] IntegrityError on new story: Reviews-32440 - Tried to save duplicate unique keys (E11000 duplicate key error index: newsblur.stories.$story_hash_1 dup key: { : "9883:ddeb9d" })
[Nov 15 18:19:26] ---> [iLounge | All Things iPod, iPh] IntegrityError on new story: Reviews-32424 - Tried to save duplicate unique keys (E11000 duplicate key error index: newsblur.stories.$story_hash_1 dup key: { : "9883:0b4960" })
[Nov 15 18:19:26] ---> [iLounge | All Things iPod, iPh] IntegrityError on new story: Reviews-32423 - Tried to save duplicate unique keys (E11000 duplicate key error index: newsblur.stories.$story_hash_1 dup key: { : "9883:2f6ab0" })
[Nov 15 18:19:26] ---> [iLounge | All Things iPod, iPh] IntegrityError on new story: Reviews-32407 - Tried to save duplicate unique keys (E11000 duplicate key error index: newsblur.stories.$story_hash_1 dup key: { : "9883:3e317e" })
[Nov 15 18:19:26] ---> [iLounge | All Things iPod, iP*] Parsed Feed: new=2 up=0 same=11 err=7 total=20
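These errors are what you would expect if each updated review is treated as a new story and then rejected by a unique index on `story_hash`, which combines the feed ID with a short hash of the GUID. The sketch below mimics that with an in-memory store. The SHA-1 prefix in `story_hash` is an assumption made for illustration (the thread does not show how the hash is computed), and `StoryStore` and `DuplicateKeyError` are hypothetical stand-ins for the database and its error.

```python
import hashlib


class DuplicateKeyError(Exception):
    """Stand-in for the database's duplicate key error."""


def story_hash(feed_id, guid):
    # Assumption: feed ID plus the first 6 hex digits of a SHA-1 of the GUID.
    digest = hashlib.sha1(guid.encode("utf-8")).hexdigest()
    return "%s:%s" % (feed_id, digest[:6])


class StoryStore:
    """Toy store that enforces uniqueness on story_hash."""

    def __init__(self):
        self._by_hash = {}

    def insert(self, feed_id, guid, title):
        key = story_hash(feed_id, guid)
        if key in self._by_hash:
            raise DuplicateKeyError("dup key: %s" % key)
        self._by_hash[key] = title


store = StoryStore()
store.insert(9883, "Reviews-32418", "First Look: Fuse Chicken Une Bobine")

try:
    # Same feed and GUID, so the same story_hash: the insert is rejected.
    store.insert(9883, "Reviews-32418", "Review: Fuse Chicken Une Bobine")
except DuplicateKeyError as error:
    print("IntegrityError on new story:", error)
```

The unique key keeps a second copy from being inserted, but since the existing story was never loaded for comparison, nothing updates it either.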

Let me see what I can do.

So I fixed it and now the story is being updated. However, the story date normally will not change, since changing it leads to unread count bugs and is generally unpleasant. I’m going to go ahead and allow story date changes for stories that change this drastically, though I’m not sure I’m going to keep this change. If I don’t hear any complaints from folks, it’ll stay. But this gets ripped out if bugs come through.

Gotcha, thanks. As long as it gets marked unread, I think I can live with the date being wrong; as I may have mentioned too many times, I don’t leave any articles unread for more than a day or two, so I’ll see it regardless.

Nope, it definitely won’t get marked as unread. That has never been a feature of NewsBlur.

That being said, if the story is older than 30 days, it will probably show up as unread, since the read story info for it has been discarded.

OK, if it’s not marked unread, you might as well undo the change — it’s not useful to me at all. I’d have to trawl through old articles in the iLounge feed looking for new reviews.