De-duplication of syndicated content

De-duplicating planet content would be awesome.
I’m subscribed to a fair number of planets and blogs, so I see the same posts popping up multiple times, as I see them from the original site (which may also have other, unaggregated content), and from one or two planets.

Planets generally don’t mess with the article’s unique identifier, so this should be doable.
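To illustrate the idea, here is a minimal sketch (not how any particular reader does it) that assumes each item keeps the same GUID (the RSS `<guid>` or Atom `<id>`) no matter which feed delivers it. The `Entry` class, its fields, and `dedupe_by_guid` are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    guid: str    # the item's <guid> (RSS) or <id> (Atom); assumed unchanged by planets
    source: str  # hypothetical label for the feed the item arrived through
    title: str


def dedupe_by_guid(entries):
    """Keep the first entry seen for each GUID and record the other feeds that carried it."""
    first_seen = {}  # guid -> first Entry with that guid (dicts keep insertion order)
    also_in = {}     # guid -> sources of the later duplicates
    for entry in entries:
        if entry.guid in first_seen:
            also_in[entry.guid].append(entry.source)
        else:
            first_seen[entry.guid] = entry
            also_in[entry.guid] = []
    return list(first_seen.values()), also_in
```

Feeding it the original blog first means the blog's copy is the one that is kept, with the planets listed as additional sources. This only works if the planets really do leave the identifier alone.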

Are you thinking of something along the lines of Shaun Inman’s Fever°?

Fever reads your feeds and picks out the most frequently talked about links from a customizable time period. Unlike traditional aggregators, Fever works better the more feeds you follow.

If you check out the demo video, you will see how multiple blogs mentioning the same thing are collapsed into just one news item, while it still shows you which other feeds blogged about it and lets you check them out as well.

I would be all for this. I follow many link sites such as kottke.org, Daring Fireball, Waxy.org Links, and The Brooks Review. These blogs tend to overlap on certain topics, and I don’t always want to read what each of them had to say about an article.

If you are thinking of just dropping certain feed items completely, I can’t say I agree. From time to time I do wish to read both Jason’s commentary and John’s.

That also sounds interesting. I often see the same thing mentioned on multiple blogs, and yes, grouping those together would be handy. But I agree that one would want to see all the viewpoints.

I’m really just talking about planets, though. I’m involved in both Debian & Ubuntu, and several blogs are syndicated through both planets. Plus I may follow the blog itself. Seeing the same post (identical content) 3 times isn’t really useful.

I’m really just talking about planets, though. I’m involved in both Debian & Ubuntu, and several blogs are syndicated through both planets.

I guess, not being into Linux builds too much, I might not have had the right idea of what planets are. But I guess something like Fever’s item merging would collapse both your duplicates and mine into one.

Here are a few planets: http://planet.debian.net/ http://planet.ubuntu.com/ http://planet.gnome.org/

Ahh, yes, this is a classic idea. I don’t see this happening for a good long time, as much as I’d like to do it. NewsBlur is not a multi-feed aggregation tool, just yet. This is a very low priority, unfortunately, since it’s a ton of work for very little gain. Now, if this were limited to a smaller subset, eliminating only stories that are 100% exactly the same, that would reduce the complexity considerably, but it’s still an enormous problem.
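As a rough sketch of that smaller subset only (this is not NewsBlur's implementation), exact duplicates could be detected by hashing each story's title and content. The function names are hypothetical, and the sketch assumes that collapsing runs of whitespace before hashing is acceptable, so "exactly the same" here means identical apart from spacing.

```python
import hashlib


def story_fingerprint(title, content):
    """Return a SHA-256 hex digest of the whitespace-normalized title and content."""
    # Normalize each part separately and join with a NUL byte so that
    # ("a b", "c") and ("a", "b c") do not produce the same fingerprint.
    normalized = "\x00".join(" ".join(part.split()) for part in (title, content))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def drop_identical(stories):
    """Keep the first (title, content) pair for each fingerprint, preserving order."""
    seen = set()
    kept = []
    for title, content in stories:
        fingerprint = story_fingerprint(title, content)
        if fingerprint not in seen:
            seen.add(fingerprint)
            kept.append((title, content))
    return kept
```

This catches only byte-for-byte copies (after whitespace normalization). A planet that rewrites links or appends a footer would produce a different fingerprint, which is part of why the general problem is much harder.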