For a project I'm currently working on as a web developer, we need to build a database of products available online. We have chosen to scrape Amazon to collect the products along with their details and pictures.
So here is the requirement:
You would write a piece of software that will:
1. Scrape [url removed, login to view]
2. Go into each department link and its sub-categories, all the way down until you see the filters; that's where the product list appears
3. Scrape the first 5 pages of that product list and get the product details. For now, we need only the product name, short description, and pictures. Do the same for all departments and sub-categories until our storage crashes :)
4. The scraped details will go into a MySQL database and the pictures will go into Amazon S3 cloud storage. I'll provide the S3 details when the work begins.
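To make steps 2 and 3 concrete, here is a rough sketch of the traversal we have in mind. It is only an illustration, not a required design. The `source` adapter and its `listSubcategories` and `listProducts` methods are hypothetical names (they are not part of noodle or any other library) and stand in for whatever actually fetches and parses the pages. The `onProduct` callback is likewise a placeholder for the code that would write to MySQL and S3.

```javascript
// Sketch only. `source` is a hypothetical adapter supplied by the caller:
//   source.listSubcategories(category) -> Promise<Array>  (empty array at a leaf)
//   source.listProducts(category, pageNumber)
//       -> Promise<Array<{ name, description, pictures }>>
const MAX_PAGES = 5; // "first 5 pages" from requirement 3

async function crawlCategory(category, source, onProduct, maxPages = MAX_PAGES) {
  const children = await source.listSubcategories(category);

  if (children.length > 0) {
    // Not at a product list yet: descend into every sub-category.
    for (const child of children) {
      await crawlCategory(child, source, onProduct, maxPages);
    }
    return;
  }

  // Leaf category: read at most `maxPages` result pages.
  for (let page = 1; page <= maxPages; page++) {
    const products = await source.listProducts(category, page);
    if (products.length === 0) break; // fewer pages than the limit

    for (const p of products) {
      // Keep only the fields we need for now.
      await onProduct({
        category,
        name: p.name,
        description: p.description,
        pictures: p.pictures,
      });
    }
  }
}
```

Passing the page fetcher in as a parameter is just one way to keep the traversal testable without hitting the live site; a real implementation would also need rate limiting and error handling.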
The tool we prefer to use is noodle ([url removed, login to view]). Using it should drastically reduce the time needed to write the software.
Looking for somebody with excellent, demonstrable experience in Node.js: not in writing web apps with Node.js (we have developers for that), but in writing complex pieces of software.
Please apply only if you have the experience and are confident that you can finish this software in less than a week. We need this done ASAP; otherwise we would do it ourselves.