Create a PHP/MySQL application that crawls a list of URLs in priority order using a master/slave system. The master and slaves will most likely run on Ubuntu/Debian EC2 instances with a LAMP stack and php5-curl installed (to make the requests). The code has to work with that setup: it can be developed on Windows, but it has to run on a Linux filesystem.
The main server/database (let's call it MAIN) will have a MySQL database with a few tables:
Urls - (Url, Priority, SlaveId)
Slaves - (SlaveId, ServerIP, QueueSize, State)
State options: Online, Offline
Priorities will be 1-5.
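The two tables above could be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the column types, constraints, and the add_url helper are assumptions rather than part of the brief. In MySQL the types would differ (for example an ENUM for State).

```python
import sqlite3

# Assumed column types; the brief only names the columns.
SCHEMA = """
CREATE TABLE Slaves (
    SlaveId   INTEGER PRIMARY KEY,
    ServerIP  TEXT NOT NULL,
    QueueSize INTEGER NOT NULL DEFAULT 0,
    State     TEXT NOT NULL DEFAULT 'Offline'
              CHECK (State IN ('Online', 'Offline'))
);
CREATE TABLE Urls (
    Url      TEXT PRIMARY KEY,
    Priority INTEGER NOT NULL CHECK (Priority BETWEEN 1 AND 5),
    SlaveId  INTEGER REFERENCES Slaves(SlaveId)
);
"""


def add_url(conn, url, priority, slave_id=None):
    """Hypothetical helper: insert a URL, returning False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO Urls (Url, Priority, SlaveId) VALUES (?, ?, ?)",
            (url, priority, slave_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO Slaves (SlaveId, ServerIP, State) VALUES (1, '10.0.0.1', 'Online')"
)
accepted = add_url(conn, "http://example.com/", 5, 1)
rejected = add_url(conn, "http://example.com/other", 6, 1)  # priority outside 1-5
```

Enforcing the 1-5 range and the two State values in the schema keeps bad rows out even if a slave script has a bug.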
Each slave reports its state to MAIN every 5 minutes, confirming it is 'Online'. If MAIN doesn't hear from a slave for 5 minutes, it marks that slave's state as 'Offline'.
URLs will be removed once completed by the slave (the slave will issue a SQL DELETE that removes the record from MAIN).
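The heartbeat rule could be checked along these lines. This is a minimal sketch of the logic only, written in Python for brevity; the last-seen timestamp per slave is an assumption, since the Slaves table as listed has no such column, and the function names are hypothetical.

```python
import time

HEARTBEAT_TIMEOUT = 5 * 60  # seconds; the 5-minute limit from the brief


def record_heartbeat(last_seen, slave_id, now=None):
    """Store the time of the latest report from a slave."""
    last_seen[slave_id] = time.time() if now is None else now


def slave_states(last_seen, now=None):
    """Return {slave_id: 'Online' | 'Offline'} based on the last report time."""
    now = time.time() if now is None else now
    return {
        slave_id: "Online" if now - seen <= HEARTBEAT_TIMEOUT else "Offline"
        for slave_id, seen in last_seen.items()
    }


last_seen = {}
record_heartbeat(last_seen, 1, now=0)    # slave 1 reported at t = 0 s
record_heartbeat(last_seen, 2, now=290)  # slave 2 reported at t = 290 s
states = slave_states(last_seen, now=301)
```

Because the reporting interval and the timeout are both 5 minutes, a report that arrives a few seconds late would flip a healthy slave to Offline, so a small grace margin on the timeout may be worth discussing.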
URLs will be added to the Urls table and can be assigned to the slaves arbitrarily (this initial assignment doesn't need to be balanced, but if there are 5 new URLs they should be spread across slave1, slave2, slave3, etc.).
The balance algorithm needs to run immediately when a slave goes offline or comes online, and also every 1 minute.
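One reading of the slave1, slave2, slave3 example is a simple round-robin over the slaves that are currently online. The sketch below shows that reading in Python; the function name and the choice to leave URLs unassigned when no slave is online are assumptions.

```python
from itertools import cycle


def assign_round_robin(new_urls, online_slave_ids):
    """Map each new URL to a slave id, cycling through the online slaves in order."""
    if not online_slave_ids:
        # Assumed behaviour: leave URLs unassigned until a slave comes online.
        return {url: None for url in new_urls}
    slaves = cycle(online_slave_ids)
    return {url: next(slaves) for url in new_urls}


assignment = assign_round_robin(["a", "b", "c", "d", "e"], [1, 2, 3])
```

With five URLs and three slaves, the first three URLs go to slaves 1, 2 and 3, and the remaining two wrap around to slaves 1 and 2.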
The MAIN server's job is to assign slaves to the URLs and balance the workload between all slaves as evenly as possible. If a slave gets marked as Offline, or a new slave comes online, all queued URLs get evenly redistributed, making sure that not only the number of URLs assigned to each slave is even but the average priority is about the same.
The SLAVE's job is to process its assigned URLs in order of priority (5 is the highest priority). The slave will use php5-curl to make a request to the URL and then save the contents of the response to a file on the hard drive. Then it will report to MAIN that its queue is 1 less, and it will delete the URL record it just processed.
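A possible balancing pass, sketched in Python: sort the queued URLs by priority, highest first, and hand each one to the slave that currently has the fewest URLs, breaking ties by the lowest priority total. This greedy approach is one option among several, not something the brief prescribes, and the function name and data shapes are assumptions.

```python
def rebalance(urls, online_slave_ids):
    """Distribute (url, priority) pairs across the online slaves.

    Returns {slave_id: [(url, priority), ...]} where queue lengths differ by
    at most one and priority totals are kept close by a greedy heuristic.
    """
    queues = {slave_id: [] for slave_id in online_slave_ids}
    if not queues:
        return queues
    totals = {slave_id: 0 for slave_id in online_slave_ids}
    for url, priority in sorted(urls, key=lambda item: -item[1]):
        # Pick the shortest queue; among equals, the one with the lowest total.
        target = min(queues, key=lambda sid: (len(queues[sid]), totals[sid]))
        queues[target].append((url, priority))
        totals[target] += priority
    return queues


queued = [("a", 5), ("b", 5), ("c", 4), ("d", 3), ("e", 2), ("f", 1)]
plan = rebalance(queued, [1, 2])
```

In this example both slaves end up with three URLs and a priority total of 10. The same function could be called from each of the three triggers: a slave going offline, a slave coming online, and the 1-minute timer.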
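The slave loop could look roughly like this. The sketch is in Python, with urllib standing in for php5-curl; the SHA-1 based file naming, the output directory, and the injectable fetcher parameter (used so the example runs without network access) are all assumptions. The brief does not say what should happen when a request fails, so no retry policy is shown.

```python
import hashlib
import os
import tempfile
import urllib.request


def fetch(url, timeout=30):
    """Download a URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def process_queue(queue, output_dir, fetcher=fetch):
    """Process (url, priority) pairs, highest priority first.

    Saves each response body to a file and returns the URLs in the order
    they were processed.
    """
    os.makedirs(output_dir, exist_ok=True)
    done = []
    for url, _priority in sorted(queue, key=lambda item: -item[1]):
        body = fetcher(url)
        # Assumed naming scheme: a hash of the URL is safe as a Linux filename.
        name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".html"
        with open(os.path.join(output_dir, name), "wb") as handle:
            handle.write(body)
        # At this point the slave would delete the row on MAIN and
        # report that its queue is 1 less.
        done.append(url)
    return done


out_dir = tempfile.mkdtemp()
queue = [("http://a.example/", 1), ("http://b.example/", 5), ("http://c.example/", 3)]
order = process_queue(queue, out_dir, fetcher=lambda url: url.encode("utf-8"))
```

Hashing the URL avoids slashes and other characters that are awkward in filenames, at the cost of filenames that are not human-readable.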
7 freelancers are bidding an average of $125 for this job
Hi, I'm interested. Could you give me more details please? Do you need all the functionality worked out? Regards
Message me before you gonna project to me
Hello, I am available for your job, I can start right now. I will provide you good quality work with fast turnaround. Please hire me for this project. Waiting for your kind reply for more discussion. Humfi
Hello, hope you are doing well. I read your project description. Let's have a technical discussion so that we understand the work, then negotiate cost and timeline, and then proceed further. Also I shall show my past work when we di...
hello I have read your requirement. I can help you to finish this work. Can you provide more information about this project? I can use python to scrape. Thank you
Hey, I've bid, but I'd also like to say that based upon what you are trying to achieve (central list of URLs, work servers visit them and store the contents somewhere) I would approach it differently. I'd need to cl...
Hello, nice to see your post. I have 5+ years of experience in development. Just share your detailed requirements with me so we can discuss more. I am sure that after the discussion you will be satisfied and we will wo...