The text view is nice for feeds that don't have full content, but it often extracts the post's images along with the text. When those are added to the existing post image, you end up with two images you have to scroll past, frequently the same image.
This happens on almost every feed item for The Verge. Site ID 6643112.
Ok, I haven’t forgotten about this one. I see it too. It’s because the URLs of the two images are different, but NewsBlur tries hard to ensure you at least get an image in the full text. What NewsBlur is doing is checking the feed story for images and inserting each one into the full text if its URL doesn’t already show up there.
But caching servers mean a lot of image URLs change, and it’s a near guarantee that a cached RSS feed’s image URLs will be different from the full text’s image URLs.
I believe this is the better tradeoff: having two images as opposed to not having any images in some cases.
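To make the behavior described above concrete, here is a minimal sketch of an exact-URL check, not NewsBlur's actual code. The function name `merge_feed_images`, the regex, and the example.com URLs are all made up for illustration. It shows why a cached copy of the same picture slips through: the URL string differs, so the feed image gets inserted anyway.

```python
import re

# Hypothetical helper, not NewsBlur's implementation.
# Matches the src attribute of double-quoted <img> tags only.
IMG_SRC = re.compile(r'<img[^>]+src="([^"]+)"', re.IGNORECASE)

def merge_feed_images(feed_html, full_text_html):
    """Prepend feed images whose exact URL is missing from the full text."""
    missing = [url for url in IMG_SRC.findall(feed_html)
               if url not in full_text_html]
    prefix = "".join('<img src="%s">' % url for url in missing)
    return prefix + full_text_html

feed = '<p>Summary</p><img src="https://cdn.example.com/1200x800/hero.jpg">'

# Same URL in both places: nothing is added.
full_same = '<img src="https://cdn.example.com/1200x800/hero.jpg"><p>Body</p>'
same_result = merge_feed_images(feed, full_same)

# Same picture served from a cache under a different URL: it gets added again.
full_cached = '<img src="https://cache.example.com/abc123/hero.jpg"><p>Body</p>'
cached_result = merge_feed_images(feed, full_cached)
```

In the second case the result contains two image tags for what is visually one image, which is the duplicate reported at the top of this thread.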
You could maybe check if there is an image at the top (before the first paragraph) of the fetched text?
If there is, don’t include the feed image; if not, include it.
There’s a lot of detritus and formatting that goes into what’s on top. If I’m using BeautifulSoup I need to delineate the “top” versus the next paragraph/div/container/section.
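One possible way to define the "top" without relying on container structure is to walk the document in order and count how much visible text appears before the first image. The sketch below is only an illustration under stated assumptions: it uses Python's standard-library `html.parser` instead of BeautifulSoup to stay dependency-free, and the names `LeadingImageFinder`, `has_leading_image`, and the 80-character threshold are invented here and would need tuning against real feeds.

```python
from html.parser import HTMLParser

class LeadingImageFinder(HTMLParser):
    """Flags an <img> that appears before much visible text has been seen."""

    def __init__(self, min_text_chars=80):
        super().__init__()
        # Assumed threshold: short bylines or captions do not count as "body".
        self.min_text_chars = min_text_chars
        self.text_seen = 0
        self.leading_image = False
        self._skip_depth = 0  # inside <script>/<style>, whose text is not visible

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img" and self.text_seen < self.min_text_chars:
            self.leading_image = True

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_seen += len(data.strip())

def has_leading_image(html, min_text_chars=80):
    finder = LeadingImageFinder(min_text_chars)
    finder.feed(html)
    finder.close()
    return finder.leading_image

body = "<p>" + "word " * 40 + "</p>"
image_first = '<div><figure><img src="a.jpg"></figure>' + body + "</div>"
byline_first = '<div class="byline">By Jane Doe</div><img src="a.jpg">' + body
text_first = body + '<img src="a.jpg">'

# The feed image would only be inserted when no leading image is found.
insert_feed_image = not has_leading_image(image_first)
```

Counting text rather than tags sidesteps the question of which div or section counts as the first paragraph, at the cost of a threshold that is a guess. An image placed after a long lead paragraph would still be treated as "not at the top", so the feed image would be inserted in that case.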