Curly quotes in Money Stuff sometimes get munged

I’m subscribed to the Money Stuff email newsletter and forward the emails from Gmail to NewsBlur to read them. The stories look fine in NewsBlur itself, but when I click on the title of a story to read the article on its own page, the curly quotes are munged.

The munged text looks like this:

The classic description of a successful investment banking analyst is “detail-oriented.â€

The URL looks like this: “https://www.newsblur.com/newsletters/story/7043354:c61794”

This is happening to multiple email feeds for me.

Oh interesting! I would love a feed address or url where I can sign up for the email so I can see for myself.

Just deployed a fix for this. Unfortunately, it won’t be retroactive, but you’ll see all new stories come with the correct encoding. Let me know if that’s not the case and give me any URLs where it doesn’t work.

I still see encoding errors in Monday’s Money Stuff

https://www.newsblur.com/newsletters/story/7043354:a9b1aa

What is the probability that the US Supreme Court will strike down President Donald Trump’s broad tariffs? There are a number of ways of thinking about that question. One is what you might call legal analysis: You can look at the text of the US Constitution and the relevant statutes, as well as relevant legal precedents, and try to figure out whether the tariffs are legal or not. If they seem very illegal, then the probability that the Supreme Court will strike them down is high; if they seem very legal, low; if the law is genuinely ambiguous then maybe it’s 50/50. In previous columns I have sort of gestured in the direction of this analysis, but honestly this might be the worst way of thinking about the question.Â

Well I’ll be. I was so proud of this fix, because it did work, but only on normal RSS feeds. I noticed Techmeme had the same encoding issues and I was thrilled that this fixed them as well.

But of course, I neglected to cover newsletters, which take a slightly different path into becoming stories. So that’s now deployed and will fix future newsletters.

If you see any encoding issues, please holler, I’d love to fix them!

Hi, I’ve been distracted for a while but I was reading Money Stuff today and it’s still not fixed. From the December 1 email:

We have talked before about the business model that Sam Altman proposed for OpenAI in 2019, which was (1) build an artificial superintelligence, (2) ask it how to make money and (3) do that. “We will create God and then ask it for money,” as I put it.Â

This is in the standalone webpage. URL is: https://www.newsblur.com/newsletters/story/7043354:fffc97

Um, try again?

Apologies for taking so long on this, but I finally fixed it for good. It looks like we were storing the text correctly in the DB but neglected to include the character set encoding in the HTML we were serving. It’s now fixed, and I verified that even that URL looks good now.
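For anyone curious why a missing charset produces exactly these symptoms, here is a minimal Python sketch. It is not NewsBlur’s actual code: the helper names `charset_from_content_type` and `decode_like_browser` are hypothetical, and the fallback to windows-1252 is an assumption about how a typical browser behaves when no charset is declared.

```python
from email.message import Message


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decode_like_browser(body, content_type):
    """Decode bytes with the declared charset, else fall back to windows-1252.

    The fallback is an assumption for illustration; real browsers also
    look at <meta charset> tags and may sniff the content.
    """
    charset = charset_from_content_type(content_type) or "cp1252"
    return body.decode(charset, errors="replace")


# Text as stored (correctly) in the database: a curly apostrophe
# and a trailing non-breaking space, like the newsletter excerpts above.
stored = "it\u2019s 50/50.\u00a0"
body = stored.encode("utf-8")

without_charset = decode_like_browser(body, "text/html")
with_charset = decode_like_browser(body, "text/html; charset=utf-8")

print(repr(without_charset))  # the apostrophe becomes three characters, plus a stray Â
print(repr(with_charset))     # identical to the stored text
```

The stray Â at the end of the quoted paragraphs is the first byte of a UTF-8 non-breaking space read as its own character.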

Thanks so much for your persistence and patience. And if you spot anything like this again, please please please post it! Same goes for any other concerns, bugs, or feature ideas.

Looks good! Thank you.

This doesn’t seem to work for headlines; for example, the latest entry for Mieux Donner has <title>De l&#8217;INSA à Mieux Donner : quand l&#8217;esprit d&#8217;ingénieur rencontre la philanthropie</title> but is rendered like so:

Fixed! Claude wrote this summary:

Thanks for the report! I found and fixed the issue.

The Problem:
Story titles with accented characters (like à, é) were getting corrupted during feed fetching, showing up as Ã (plus an invisible non-breaking space) or Ã© instead of the correct characters. HTML entities like &#8217; (smart apostrophes) were being decoded correctly, but regular UTF-8 characters were getting double-encoded.
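The difference between the two cases can be reproduced in a few lines of Python. This is only an illustrative sketch using a shortened version of the title from the report, not the code path NewsBlur or feedparser actually runs:

```python
import html

# Title bytes as a UTF-8 feed would send them (shortened from the report above).
raw_title = "De l&#8217;INSA à Mieux Donner".encode("utf-8")

# Decoding with the wrong charset turns each byte of "à" into its own character.
wrong = html.unescape(raw_title.decode("iso-8859-1"))

# Decoding with the charset the feed declared keeps the accented letter intact.
right = html.unescape(raw_title.decode("utf-8"))

print(repr(wrong))
print(repr(right))
```

The entity survives in both cases because `&#8217;` is plain ASCII on the wire, so the wrong charset cannot damage it; only the raw two-byte à is affected.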

Root Cause:
A bug in feedparser 6.0.12 where it does a case-sensitive lookup for the content-type header. We were passing headers with Content-Type (title case), but feedparser only recognizes content-type (lowercase). When it couldn’t find the header, it defaulted to iso-8859-1 encoding instead of respecting the feed’s UTF-8 declaration.

The Fix:
Normalize all HTTP header keys to lowercase before passing them to feedparser. This ensures feedparser correctly detects the UTF-8 charset from the Content-Type header.
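A minimal sketch of that normalization step, assuming the headers arrive as a plain dict. The helper name `lowercase_header_keys` and the sample header values are hypothetical, and how the result is handed to feedparser is not shown in this thread, so it is left out here:

```python
def lowercase_header_keys(headers):
    """Return a copy of the headers with every key lowercased.

    If two keys differ only by case, the later one wins.
    """
    return {str(key).lower(): value for key, value in headers.items()}


# Sample response headers in title case, as an HTTP client might return them.
response_headers = {
    "Content-Type": "application/rss+xml; charset=utf-8",
    "ETag": '"abc123"',
}

normalized = lowercase_header_keys(response_headers)
print(normalized["content-type"])
```

Returning a new dict rather than mutating the original keeps the untouched headers available for any other code that expects the title-case keys.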

I’ve also manually corrected the affected story. New stories from this feed will now display correctly.