In progress

simple scraping script work

Hello,

I need a PHP script that does the following:

step 0: you get a file with a list of URLs (hundreds or thousands); they come in all sorts of formats (subdomains, https, many SLD/TLD).

step 1: you extract the domain names from the URLs and generate a sorted list of unique domains. This is not as simple as it sounds, because the function doing it must be able to tokenize any URL format as well as any form of TLD (for example .[url removed, login to view], .fr, .[url removed, login to view], ...).

step 2: clean the list to remove some domains such as free blogs or .gov.

step 3: scrape [url removed, login to view] to get one data point about some of the domains.

step 4: scrape [url removed, login to view] to get some data for a short list of domains (without getting banned for excessive usage).

step 5: scrape two data points from the [url removed, login to view] page for each domain in the list.

step 6: sort the list and output as a flat file.
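
Steps 1, 2 and 6 above could be sketched roughly as follows. This is only an illustration of the logic, written in Python rather than the PHP the posting asks for. The suffix set, the exclusion lists and every function name are assumptions made for the example; a real script would most likely load the full Public Suffix List instead of the handful of entries shown here.

```python
import re
from urllib.parse import urlsplit

# Assumption: a tiny sample of multi-label suffixes, for illustration only.
MULTI_LABEL_SUFFIXES = {"co.uk", "org.uk", "com.au", "co.nz"}

# Assumption: hypothetical exclusion rules for step 2.
EXCLUDED_DOMAINS = {"blogspot.com", "wordpress.com"}
EXCLUDED_SUFFIXES = ("gov", "gov.uk")

SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def registrable_domain(url):
    """Return the domain plus its suffix (e.g. example.co.uk), or None."""
    url = url.strip()
    if not url:
        return None
    # urlsplit only recognises the host when a scheme or "//" is present.
    if not SCHEME_RE.match(url) and not url.startswith("//"):
        url = "//" + url
    try:
        host = urlsplit(url).hostname
    except ValueError:
        return None
    if not host:
        return None
    labels = host.rstrip(".").split(".")
    # Skip single-label hosts and anything that looks like an IPv4 address.
    if len(labels) < 2 or labels[-1].isdigit():
        return None
    suffix_len = 2 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 1
    if len(labels) <= suffix_len:
        return None
    return ".".join(labels[-(suffix_len + 1):])


def is_excluded(domain):
    """Step 2: drop free blog hosts and government domains."""
    if domain in EXCLUDED_DOMAINS:
        return True
    return any(domain == s or domain.endswith("." + s) for s in EXCLUDED_SUFFIXES)


def clean_domains(urls):
    """Steps 1 and 2: unique, filtered, sorted list of domains."""
    domains = {registrable_domain(u) for u in urls}
    domains.discard(None)
    return sorted(d for d in domains if not is_excluded(d))


def format_flat_file(domains):
    """Step 6: one domain per line, ready to be written to a text file."""
    return "".join(d + "\n" for d in domains)


sample = [
    "https://www.example.co.uk/page",
    "http://blog.example.co.uk",
    "shop.example.fr/cart?id=1",
    "HTTP://Sub.Example.COM:8080/",
    "https://someone.blogspot.com/post",
    "https://www.whitehouse.gov/",
    "localhost",
    "",
]
print(format_flat_file(clean_domains(sample)), end="")
```

The scraping steps (3 to 5) are left out because the target sites are redacted in the posting. Whatever the targets are, pausing between requests is the usual way to avoid the ban mentioned in step 4.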

Or you can propose your method.

Potential for long-term work with the right programmer(s).

Skills: HTML, MySQL, PHP


About the employer:
(9 reviews) London, United Kingdom

Project ID: #2379610

10 freelancers are bidding an average of $187 for this job

srinichal

I can deliver the script

Bid in USD, in 5 days
(107 reviews)
7.2
waelfree

Hi, I can do that

Bid in USD, in 2 days
(100 reviews)
7.0
k1ng440

Full time freelance web developer with over 8 years of commercial experience.

Bid in USD, in 5 days
(96 reviews)
6.5
procoder898

Hi, I'm an expert in scraper programming and data mining. Please check your inbox and let me handle your project.

Bid in USD, in 5 days
(29 reviews)
5.3
csharpdotnettech

Sir, I am an expert .NET and C# programmer (among the top 10% of programmers at oDesk, 91st percentile). You can view my profile [login to view URL]~~26186839692c4731. I have a 100% completion rate with a 5/5 rating (can vi ...

Bid in USD, in 3 days
(8 reviews)
5.1
yarco

But you would need to give me the domain names to be removed in step 2 ("step 2: clean the list to remove some domains such as free blogs or .gov.").

Bid in USD, in 1 day
(16 reviews)
4.8
anonymed

Ready to work [login to view URL] PM

Bid in USD, in 1 day
(12 reviews)
4.5
joeguo

Why not build the script with Python?

Bid in USD, in 3 days
(6 reviews)
4.3
Attractionnet

Hello. I'm the right programmer! I write custom programs that can do everything. PHP code runs from certain IPs and thus gets blocked; programs can't be blocked. So, I will create a program which checks the domain validity, and t ...

Bid in USD, in 5 days
(7 reviews)
2.8
vishalkain

I worked as the backbone of a web-based company and am now working as a freelancer. You said you need a programmer, and I am one. About PHP: working in PHP comes down to 3 things for me: 1. confidence 2. Google keywords 3. g ...

Bid in USD, in 2 days
(1 review)
1.0