How does NewsBlur retrieve historical data for an RSS feed?

I have been trying to retrieve data from an RSS feed manually using the feedparser library in Python, and it seems that an RSS feed never keeps historical posts.
My question is: how does NewsBlur retrieve historical posts? When I registered as a NewsBlur user, I could see historical data for my subscriptions.
Does it depend on a subscriber refreshing the feed to decide whether stories from that day are stored in the database, or does NewsBlur have a background daemon that continuously retrieves data for users' subscriptions?

My understanding is that it uses Celery to run “UpdateFeed” in the background:

def enable_celerybeat():
    # Make sure the data directory exists inside the project's virtualenv
    with virtualenv():
        run('mkdir -p data')
    # Upload the supervisor configs for celerybeat and the celery workers
    put('config/supervisor_celerybeat.conf', '/etc/supervisor/conf.d/celerybeat.conf', use_sudo=True)
    put('config/supervisor_celeryd_work_queue.conf', '/etc/supervisor/conf.d/celeryd_work_queue.conf', use_sudo=True)
    put('config/supervisor_celeryd_beat.conf', '/etc/supervisor/conf.d/celeryd_beat.conf', use_sudo=True)
    put('config/supervisor_celeryd_beat_feeds.conf', '/etc/supervisor/conf.d/celeryd_beat_feeds.conf', use_sudo=True)
    # Have supervisor re-read its config files and apply the changes
    sudo('supervisorctl reread')
    sudo('supervisorctl update')

Nothing special is going on here. It’s just that NewsBlur has an archive built up from previous subscribers. If you were the first to subscribe to a feed, you would only see as many stories as are currently in the RSS feed.
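
To illustrate the idea in the answer above, here is a minimal sketch of how an archive can outgrow the feed itself: each fetch is merged into a store keyed by story GUID, so stories that later drop off the feed are still kept. This is not NewsBlur's actual code. `FeedArchive`, `merge`, and the sample entries are hypothetical names made up for the example, and a real service would persist to a database rather than a dict.

```python
from dataclasses import dataclass, field


@dataclass
class FeedArchive:
    """Hypothetical in-memory archive; a real service would use a database."""
    stories: dict = field(default_factory=dict)  # guid -> story dict

    def merge(self, entries):
        """Store entries not seen before and return how many were new."""
        added = 0
        for entry in entries:
            guid = entry["guid"]
            if guid not in self.stories:
                self.stories[guid] = entry
                added += 1
        return added


# Two successive fetches of the same feed (made-up data). The feed only
# exposes its latest two items each time, so "a" is gone by the second fetch.
fetch_first = [{"guid": "a", "title": "Post A"}, {"guid": "b", "title": "Post B"}]
fetch_second = [{"guid": "b", "title": "Post B"}, {"guid": "c", "title": "Post C"}]

archive = FeedArchive()
new_first = archive.merge(fetch_first)
new_second = archive.merge(fetch_second)
archived_guids = sorted(archive.stories)
```

After both fetches the archive holds all three stories even though the feed never listed more than two at once, which is why a later subscriber can see more history than the feed currently offers.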
