I need a simple script:
step 0: you get a large file with a list of URLs (up to 1 million lines)
step 1: extract the domain names from the URLs and generate a sorted, unique list; this is not as simple as it sounds, because the function doing it must be able to tokenize any URL format and handle any form of TLD (.[url removed, login to view], .fr, .[url removed, login to view], ... for example),
step 2: use this API: [url removed, login to view] to get 2 or 3 data points per domain,
step 3: scrape 2 data points from the [url removed, login to view] page for each domain in the list,
step 4: sort and output as a flat file.
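
To illustrate why step 1 is harder than it sounds, here is a minimal Python sketch, not a finished implementation. It assumes one URL per line and uses only the standard library. The names registrable_domain, unique_sorted_domains and MULTI_LABEL_SUFFIXES are hypothetical, and the small suffix set is only a stand-in for the full Public Suffix List, which a real script would need to load to handle every TLD form.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny stand-in for the Public Suffix List.
# A real script would load the full list instead of hard-coding entries.
MULTI_LABEL_SUFFIXES = {"co.uk", "com.au", "co.jp"}


def registrable_domain(url):
    """Return the registrable domain of a URL, or None if none can be found."""
    url = url.strip()
    if not url:
        return None
    # urlsplit only recognizes the host when a scheme or leading "//" is present.
    if "://" not in url:
        url = "//" + url
    try:
        host = urlsplit(url).hostname
    except ValueError:
        # Malformed input such as an unbalanced IPv6 bracket.
        return None
    if not host:
        return None
    host = host.rstrip(".").lower()
    # Skip bare IP addresses: they have no domain name.
    try:
        ipaddress.ip_address(host)
        return None
    except ValueError:
        pass
    labels = host.split(".")
    # Keep three labels for known two-label suffixes, otherwise two.
    keep = 3 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 2
    if len(labels) < keep:
        return None
    return ".".join(labels[-keep:])


def unique_sorted_domains(lines):
    """Deduplicate and sort the domains found in an iterable of URL lines."""
    return sorted({d for d in map(registrable_domain, lines) if d})
```

Because unique_sorted_domains accepts any iterable, an open file object can be passed in directly, so a 1-million-line input is streamed line by line and only the set of unique domains is held in memory.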
Potential for long-term work with the right programmer(s).
9 freelancers have bid an average of $116 for this job
Hello. I have a template scraping script that I can adapt to your requirements. I will provide a high-quality service and stay in communication throughout.