Missing entries in the InformIT deals feed


#1

Hi,

The InformIT deals feed (http://www.informit.com/deals/deal_rss.aspx) is regenerated every day with a single fresh (according to ‘pubDate’) entry, but the ‘guid’ stays the same for a particular book (I suspect it’s an internal ID) for the whole period of the deal (typically 1-2 weeks).

It also looks like when they decide to offer the same book again, they always reuse the same ‘guid’, so NewsBlur’s dedup system filters out the new entries.
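To make the suspected effect concrete, here is a minimal sketch of GUID-only deduplication. It is not NewsBlur's actual code; the function name, the `book-123` guid, and the dates are all made up for illustration.

```python
def dedup_by_guid(seen_guids, entries):
    """Return only entries whose guid has not been seen before.

    Hypothetical model of a GUID-only dedup step; pubDate is ignored.
    """
    fresh = []
    for entry in entries:
        if entry["guid"] in seen_guids:
            continue  # same guid as an earlier entry: treated as a duplicate
        seen_guids.add(entry["guid"])
        fresh.append(entry)
    return fresh


seen = set()

# First time the book is offered (illustrative values).
first_offer = [{"guid": "book-123", "pubDate": "2015-12-20"}]
# The same book offered again later with the same guid but a new pubDate.
second_offer = [{"guid": "book-123", "pubDate": "2016-03-13"}]

shown_first = dedup_by_guid(seen, first_offer)
shown_second = dedup_by_guid(seen, second_offer)

print(len(shown_first), len(shown_second))
```

Under this model the second offer never reaches the reader, no matter how much later it is published.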

I noticed this about a month ago and wrote a script that fetches the feed once a day. Here’s a manually combined file for the period from the 1st to the 26th of March: https://gist.github.com/sainaen/450cc27295c900a167a1
As you can see, four unique books were offered during this period: Dart Programming Language (01 – 05), Discovering Modern C++ (06 – 12), Multiplayer Game Programming (13 – 19) and How to Use Objects (20 – 26). Only three of them appeared in NewsBlur’s feed, though.

I suspect that this happened because _Multiplayer Game Programming_ was previously offered on December 20th 2015, but I don’t know how to get its ‘guid’ to verify that.
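For comparing guids across saved copies of the feed, something like the following could list the ‘guid’, ‘pubDate’ and title of every item. It is only a sketch: it assumes a plain RSS 2.0 document without namespaced item elements, and the sample XML below is invented, not taken from the real feed.

```python
import xml.etree.ElementTree as ET


def list_items(rss_text):
    """Return (guid, pubDate, title) for each <item> in an RSS 2.0 string."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append(
            (
                item.findtext("guid"),
                item.findtext("pubDate"),
                item.findtext("title"),
            )
        )
    return items


# Invented sample; a real snapshot would be read from a saved file instead.
SAMPLE = """<rss version="2.0"><channel><title>Deals</title>
<item>
  <title>Example Book</title>
  <guid>12345</guid>
  <pubDate>Sun, 13 Mar 2016 00:00:00 GMT</pubDate>
</item>
</channel></rss>"""

for guid, pub_date, title in list_items(SAMPLE):
    print(guid, pub_date, title)
```

Running this over snapshots from two different offers of the same book would show whether the ‘guid’ really is reused.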

So, I have two questions:

  1. Is there a way to fix this on the NewsBlur side? (Maybe show the old item as unread again if it reappears after a reasonable period of time?) If yes, is there a way for me to help with this?
  2. If a workaround would be too hard or too specific to implement, I could try to contact InformIT and ask them to fix their feed. For this, I’d like to know what I should ask for: just generate a new ‘guid’ every time? Will this work combined with their current approach of publishing the same entry with a new ‘pubDate’? I’m afraid of causing even more harm here. :)

Thanks!
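The idea from question 1 could be sketched roughly like this: remember when each ‘guid’ was last seen, and treat an entry as new again if it comes back after a long enough gap. This is only an illustration of the idea, not how NewsBlur works; `is_new_offer`, `REAPPEAR_GAP` and the 30-day threshold are assumptions.

```python
from datetime import datetime, timedelta

# Assumed threshold: a guid unseen for longer than this counts as a new offer.
REAPPEAR_GAP = timedelta(days=30)


def is_new_offer(last_seen, guid, pub_date, gap=REAPPEAR_GAP):
    """Return True if the entry should be shown as new.

    last_seen maps guid -> most recent pubDate seen for it and is
    updated on every call, so daily refreshes during one deal keep
    the gap small and are not reported as new.
    """
    previous = last_seen.get(guid)
    last_seen[guid] = pub_date
    return previous is None or pub_date - previous > gap


last_seen = {}
results = [
    is_new_offer(last_seen, "book-123", datetime(2015, 12, 20)),  # first sighting
    is_new_offer(last_seen, "book-123", datetime(2015, 12, 21)),  # same deal, next day
    is_new_offer(last_seen, "book-123", datetime(2016, 3, 13)),   # back months later
    is_new_offer(last_seen, "book-123", datetime(2016, 3, 14)),   # same deal again
]
print(results)
```

With this rule the daily ‘pubDate’ refresh during a single deal stays quiet, while a repeat offer months later shows up once.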

#2

I have the same problem: I have found that NewsBlur does not update a feed item if its pubDate changes and the guid stays the same. In my opinion this is only a speed optimization (an item that has already been displayed is not checked again). Am I right? And is it possible to change this behaviour somehow, at least for certain sites?

For example, the Firefox built-in RSS reader behaves as expected and sorts items according to the pubDate (even if the guid stays the same).