Re: Custom RSS/Atom Feeds
Hi Megan.
You can build the site with PHP and MySQL. On the database side of things you will need to store some things like:
Site’s Table
Site Name (Owner of the RSS/Atom Feed)
Feed Address (The URL to the feed)
Active (just a flag for your bot).
Article’s Table
Article Name (Name, usually on the feed)
Article URL (The URL for the article on the site, so you can link to it)
Article Description (The description, an excerpt or the first couple of characters)
Article Category (Category, if not provided use regular expressions in the PHP logic to categorize)
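To make the regular expression idea concrete, here is a rough sketch of a categorizer. It is written in JavaScript for the bot side, but the same approach works in PHP with preg_match. The rule list, the field names (name, url, description, category) and the example rows are all made up for illustration, so adjust them to your own tables.

```javascript
// Hypothetical category rules: each pattern is tested against title + description.
const CATEGORY_RULES = [
  { category: "CSS", pattern: /\b(css|flexbox|sass)\b/i },
  { category: "JavaScript", pattern: /\b(javascript|node\.?js|jquery)\b/i },
  { category: "PHP", pattern: /\b(php|wordpress)\b/i },
];

// Keep the feed's own category when it has one, otherwise use the first matching rule.
function categorize(article) {
  if (article.category) return article.category;
  const text = `${article.name} ${article.description}`;
  const rule = CATEGORY_RULES.find((r) => r.pattern.test(text));
  return rule ? rule.category : "Uncategorized";
}

// Example rows shaped like the two tables above (all values are invented).
const site = { name: "Example Blog", feedUrl: "https://example.com/feed.xml", active: true };
const article = {
  name: "A Guide to Flexbox",
  url: "https://example.com/flexbox",
  description: "Laying out pages with CSS.",
  category: null, // the feed did not provide one
};

console.log(categorize(article));
```

The rules are checked in order, so put the more specific patterns first.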
Then on the site you will need the usual stuff, a page for showing the articles, some search mechanism, and so on.
To scrape the article names, descriptions and URLs you will need a bot. The bot will download and read each feed, and for this you will need to parse XML. Luckily for you, feeds follow standards (RSS and Atom), so they have common fields from site to site.
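As a rough illustration of those common fields, the sketch below pulls the title, link, description and category out of RSS 2.0 items and Atom entries. Note that it uses regular expressions only to stay dependency-free. It is not a real XML parser and does not decode HTML entities, so for production you would probably swap in a proper XML or feed parsing library. The function names and the sample feed are invented for the example.

```javascript
// Return the text of the first <tag>...</tag> inside a chunk of XML, or null.
function tagText(xml, tag) {
  const m = xml.match(new RegExp(`<${tag}\\b[^>]*>([\\s\\S]*?)</${tag}>`, "i"));
  if (!m) return null;
  // Unwrap CDATA sections and trim surrounding whitespace.
  return m[1].replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, "$1").trim();
}

// Atom stores the link in an attribute (<link href="..."/>), RSS as element text.
function linkOf(xml) {
  const atom = xml.match(/<link\b[^>]*\bhref="([^"]*)"/i);
  return atom ? atom[1] : tagText(xml, "link");
}

function parseFeed(xml) {
  // RSS 2.0 wraps each article in <item>, Atom wraps it in <entry>.
  const blocks = xml.match(/<(item|entry)\b[\s\S]*?<\/\1>/gi) || [];
  return blocks.map((block) => ({
    name: tagText(block, "title"),
    url: linkOf(block),
    description: tagText(block, "description") || tagText(block, "summary"),
    category: tagText(block, "category"),
  }));
}

// A tiny made-up RSS feed to try it on.
const sampleRss = `<rss version="2.0"><channel><title>Example Blog</title>
<item><title>First post</title><link>https://example.com/first</link>
<description><![CDATA[Hello <b>world</b>]]></description><category>News</category></item>
</channel></rss>`;

console.log(parseFeed(sampleRss));
```

In a real bot the XML string would come from downloading the Feed Address stored for each site.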
Once your bot scans a feed, it will check the database for each article to see if it has been added yet. If it does not find anything, it adds the article, then moves on to the next one.
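The check-then-add step could look roughly like this. To keep the sketch runnable on its own, a Map keyed by the article URL stands in for the Article's Table. That is an assumption for the example, not how you have to store things. With MySQL, a UNIQUE index on the URL column together with INSERT IGNORE would give you the same effect.

```javascript
// In-memory stand-in for the Article's Table, keyed by article URL.
const articlesByUrl = new Map();

// Add each scanned article only if its URL is not stored yet.
// Returns how many new articles were added.
function storeNewArticles(scanned, siteName) {
  let added = 0;
  for (const article of scanned) {
    // Skip entries without a URL and articles that were already added.
    if (!article.url || articlesByUrl.has(article.url)) continue;
    articlesByUrl.set(article.url, { ...article, site: siteName });
    added += 1;
  }
  return added;
}
```

Using the URL as the key is a design choice: titles get edited after publishing, but the article URL usually stays the same.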
The bot will use the feed URLs stored in the Site’s Table. You can set up a cron job on your server to run it once a day or once every six hours, or you can trigger it when someone visits the site. In that case, make it run as a background task, so the user doesn’t have to wait for the scan to finish.
If I were you, I would build the website in PHP and the bot, if possible, in Node.js, so it will be closer to real time without needing a trigger or cron job to activate it.
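For the Node.js route, one possible shape is a long-running process that rescans the active feeds on a timer. This is only a sketch: startPolling and scanFeed are hypothetical names, and scanFeed stands for whatever function downloads and processes a single feed URL.

```javascript
// Run one pass over every active site right away, then repeat on a timer.
// `scanFeed` is a hypothetical function that handles one feed URL (sync or async).
function startPolling(sites, scanFeed, intervalMs) {
  const runPass = () => {
    for (const site of sites) {
      if (!site.active) continue; // the Active flag tells the bot to skip this feed
      const onError = (err) => console.error(`Scan failed for ${site.name}: ${err.message}`);
      try {
        // One failing feed must not stop the others, whether it throws or rejects.
        Promise.resolve(scanFeed(site.feedUrl)).catch(onError);
      } catch (err) {
        onError(err);
      }
    }
  };
  runPass();
  // Return the timer handle so the caller can stop polling with clearInterval().
  return setInterval(runPass, intervalMs);
}

// Usage (rescan every six hours):
// const timer = startPolling(sites, scanFeed, 6 * 60 * 60 * 1000);
```

The process has to stay running for this to work, so on shared hosting the cron job may still be the simpler option.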
Good luck with your project.
I once wrote a scraper in PHP to get a list of all the articles on Smashing Magazine, so I could read the titles and choose what to read.