I’m open to hosting this! What’s the prompt you use? And if I give it a URL, does it do a reliable job of extracting story titles, dates, optional authors and tags, and the full content?
Huginn isn’t especially flexible, so I’ve basically built a chain of agents for each site. E.g., for larazon.es => catalonia, a local news site that neglects to keep its RSS feed updated, the chain is: Website Agent (scrapes XPaths to get URLs), DeDupe (so I don’t translate the same articles multiple times), Website Agent again to scrape the content, title, author, etc., then a Post Agent that calls OpenRouter, which uses the DeepSeek v3 model (the cheapest reliable model) to translate the title and body, and finally build the RSS feed.
The prompt is trivial, something like: translate to English. strip all html tags.
Or, for another site: summarize in 10 bullet points. format output as html using list (ul/li) tags
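For anyone who wants to see the shape of that chain outside Huginn, here is a rough Python sketch of the same steps (dedupe, scrape, LLM call, RSS output). It is not the actual Huginn setup: `scrape` and `llm` are hypothetical callables you would supply yourself (e.g. an XPath scraper and an OpenRouter client), and the function names and prompt wording are only illustrative.

```python
import xml.etree.ElementTree as ET

# Prompts paraphrased from the post above; the exact wording is an assumption.
PROMPTS = {
    "translate": "Translate to English. Strip all HTML tags.",
    "summarize": "Summarize in 10 bullet points. Format output as HTML using ul/li tags.",
}


def dedupe(urls, seen):
    """Return URLs not seen before (order preserved) and record them in `seen`."""
    fresh = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


def run_pipeline(urls, seen, scrape, llm, mode="translate"):
    """Scrape each new URL and pass its title and body through the LLM.

    `scrape(url)` must return a dict with "title" and "body";
    `llm(prompt, text)` must return the model's text output.
    Both are hypothetical stand-ins for real agents.
    """
    items = []
    for url in dedupe(urls, seen):
        article = scrape(url)
        items.append({
            "title": llm(PROMPTS["translate"], article["title"]),
            "link": url,
            "body": llm(PROMPTS[mode], article["body"]),
        })
    return items


def build_rss(title, link, items):
    """Serialize the items as a minimal RSS 2.0 document (string)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        ET.SubElement(node, "description").text = item["body"]
    return ET.tostring(rss, encoding="unicode")
```

Switching a site from translation to summaries is then just `mode="summarize"`; the rest of the chain stays the same.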
I imagine this could be formalized to have an “easy” mode where a robot would guess xpaths/urls, and an “expert” mode where it’s all manual.
As previously mentioned, thanks for your work over the years, but I will not be supporting NewsBlur further in light of this.
I don’t understand, what are you objecting to? The LLM feature I’m describing is basically a Web Feeds feature where we use a language model to scrape/follow a website instead of relying on a non-existent feed.
Ohh, I see above that you’re not a fan of LLM summaries. I don’t see those happening, and even if I did add that ability, it would be a separate pro feature. That’s not the kind of feature I see myself building for regular premium users.
If you’re adding Huginn (or other LLM-agent-based) monitoring of pages to generate feeds via extraction and optional translation, then presumably the “summary” capability would literally just be a case of adjusting the prompts?
In which case it could be up to the user, assuming they’re on a tier supporting custom prompts.
The only thing this single-feed approach wouldn’t automatically provide, based on what we’ve been discussing, would be deduplication (or grouping) of related stories, which, as several people have suggested, is a very valuable feature.
But if you extended the pipeline to cover folders you’d get that too.
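As a rough illustration of what folder-level grouping could look like, here is a minimal Python sketch that clusters stories from several feeds by word overlap in their titles (Jaccard similarity). This is only one simple way to do it, not anything NewsBlur or Huginn does today; the `group_stories` name, the story dict shape, and the 0.6 threshold are all assumptions.

```python
def title_tokens(title):
    """Lowercased set of whitespace-separated words in a title."""
    return set(title.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_stories(stories, threshold=0.6):
    """Greedily group stories whose titles overlap with a group's first story.

    `stories` is a list of dicts with at least a "title" key, pooled from
    every feed in a folder. The threshold is an arbitrary starting point.
    """
    groups = []
    for story in stories:
        tokens = title_tokens(story["title"])
        for group in groups:
            if jaccard(tokens, title_tokens(group[0]["title"])) >= threshold:
                group.append(story)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([story])
    return groups
```

Since the pipeline already has an LLM in the loop, a real version might compare translated titles or embeddings instead of raw words, but the folder-wide pooling step would look the same.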