I need a tool that will do two separate jobs.
The first job is to crawl/search a specified site on a specific day, find all the Excel (or CSV) download links, and save them for the following week. The next week, all of those links will be accessed and their contents collated into one Excel sheet, which is then saved for the program user. Once that has been done, the program searches the site again for Excel/CSV links, ready for the following week.
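To illustrate the link-finding step, something along these lines is what I have in mind (a Python sketch using only the standard library; the class and function names are just placeholders, not requirements):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class SpreadsheetLinkFinder(HTMLParser):
    """Collects href values on a page that point at Excel or CSV downloads."""
    EXTENSIONS = (".xls", ".xlsx", ".csv")

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().endswith(self.EXTENSIONS):
                # Resolve relative links against the page's own URL
                self.links.append(urljoin(self.base_url, value))

def find_spreadsheet_links(html, base_url):
    """Return all Excel/CSV download links found in one page's HTML."""
    finder = SpreadsheetLinkFinder(base_url)
    finder.feed(html)
    return finder.links
```

The found links would be saved (e.g. to a file or small database) and only downloaded and collated a week later.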
The program user needs to be able to specify the site to be crawled, the day of the week and time at which the crawl should happen, and the folder on their computer in which the collated Excel sheet should be saved. The program should be able to handle searching more than one site. It should also search linked pages (up to a configurable number of steps). The user should be able to create a list of URLs that the crawler doesn't bother to search.
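For the "run on a given day and time" part, I'd expect the scheduler to work out the next occurrence of the user's chosen weekday/time, roughly like this (a sketch, assuming Python; 0 = Monday as in the standard library):

```python
from datetime import datetime, timedelta

def next_run(now, weekday, hour, minute=0):
    """Return the next datetime on `weekday` (0=Monday..6=Sunday) at hour:minute."""
    days_ahead = (weekday - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # The chosen time today has already passed, so wait a full week
        candidate += timedelta(days=7)
    return candidate
```

The program would sleep (or be woken by a system scheduler) until that moment, run the crawl or collation job, then compute the next run.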
The second job is to take a list of URLs that the program needs to crawl on a specific day of the week. The following week the same URLs need to be accessed again, and any differences in the text need to be copied, highlighted, and sent in an email to a specified email address.
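By "highlighted" I mean something like the following (a sketch, assuming Python and an HTML email body; the function name is just a placeholder): diff last week's saved text against this week's, and wrap the added or changed lines so they stand out in the email.

```python
import difflib

def highlight_changes(old_text, new_text):
    """Return the lines added or changed since old_text, each wrapped in
    <mark> tags so they appear highlighted in an HTML email body."""
    diff = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(), lineterm="")
    changed = [line[1:] for line in diff
               if line.startswith("+") and not line.startswith("+++")]
    return ["<mark>{}</mark>".format(line) for line in changed]
```

The resulting lines would then be assembled into a message and sent with something like smtplib, with one email covering all the URLs that changed that week.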
There will be a large number of URLs to be compared, so the program needs to be able to handle comparing 100+ URLs in a single run.
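At that scale I'd expect the pages to be fetched concurrently rather than one at a time. A minimal sketch of what I mean, assuming Python (the `fetch` callable is a placeholder for whatever HTTP client is used):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_all(urls, fetch, max_workers=16):
    """Fetch many pages concurrently; `fetch` is any callable url -> text.
    Returns {url: text} so each page can be diffed against last week's copy."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its text
        return dict(zip(urls, pool.map(fetch, urls)))
```

A bounded worker pool like this keeps 100+ URLs manageable without hammering any one site.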