I subscribe to a ton of feeds. A lot of the feeds are from different sources. For example:
New York Times > New York
New York Times > Home Page
New York Times > Environment
The Guardian > US Edition
The Guardian > Science
The Guardian > Business
The feeds feature different content, but often feature the same news stories several times a day. For example, the New York, Home Page, and Environment feeds from the New York Times will all have the same article on fracking with the same headline and content.
It would be nice to have a way for NewsBlur to recognize if an item in a feed was previously featured in another feed and hide it, leaving only one instance of the news item in all the feeds (or under "All Site Stories"). For users who (for whatever reason) want to access the duplicate items, there could be a section under Read Items if this hypothetical option is toggled in settings.
For this to work, either (A) the "duplicated" feed items must be exact matches, or (B) there would have to be a fully general way for each user to specify what counts as "duplicated" items, and across which feeds. There would also need to be a way to specify which feeds are to be counted as one "group".
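To make option A concrete, here is a minimal sketch of exact-match deduplication within one user-defined group of feeds. It says nothing about how NewsBlur stores stories: the item fields ("feed", "title", "content") and the function names are hypothetical, chosen only for illustration.

```python
import hashlib

def fingerprint(item):
    """Hash the normalized headline and content; identical stories hash identically."""
    text = " ".join((item["title"] + " " + item["content"]).lower().split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def dedupe_group(items, group):
    """Split items into (kept, hidden).

    Only feeds named in `group` are deduplicated against each other;
    items from any other feed are always kept.
    """
    seen = set()
    kept, hidden = [], []
    for item in items:
        if item["feed"] not in group:
            kept.append(item)
            continue
        key = fingerprint(item)
        if key in seen:
            hidden.append(item)  # duplicate: could be listed under Read Items
        else:
            seen.add(key)
            kept.append(item)
    return kept, hidden
```

The first copy of a story encountered is the one that stays visible, so the order of the incoming items decides which feed "wins".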
I'd rather see IFTTT, Yahoo Pipes or similar doing the work; then it's up to the user to remove the "duplicated" (syndicated?) items.
Yeah, this is a tough problem. I could use some statistical models, but that's a very expensive feature and is certainly not easy to build. Also, the UI doesn't really support this yet. I'm not sure which is the harder problem to solve, but together they are why this hasn't been solved yet.
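For a rough sense of what fuzzy matching involves, here is a small sketch that scores two texts by Jaccard similarity over three-word shingles. This is a cheap stand-in, not the statistical model mentioned above, and the function names and the 0.8 threshold are assumptions picked for illustration. Comparing every new story against every recent story this way also hints at where the cost comes from.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(a, b):
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)

def is_near_duplicate(a, b, threshold=0.8):
    """Treat two texts as the same story if enough shingles overlap."""
    return similarity(a, b) >= threshold
```

A lower threshold hides more lightly edited copies but also risks hiding distinct stories on the same topic.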