Write some Software

I am looking to have custom software or a script built that scrapes the outgoing links (the backlinks it gives to other sites) from a particular website, which we call the "seed site".

This will be in 2 parts :

Part 1 : SCRAPER

Example: Let's consider [url removed, login to view] as the seed site. I want to scrape all the domains that [url removed, login to view] links out to, that is, all the domains it gives a backlink to. For example, a post about the domain "[url removed, login to view]" is published on bbc and carries a backlink to that domain. bbc links out to thousands of sites, and I want to extract all of them.

This should work not just for bbc but for any seed site that I enter in the software.
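
As a rough illustration of the scraping step described above, the following sketch pulls the external hosts out of one page's HTML. It assumes the page has already been downloaded, and it uses a simplified regular expression rather than a real HTML parser. The class name OutgoingLinkExtractor and the method externalHosts are hypothetical, not part of any existing tool.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the hosts that a seed page links out to.
final class OutgoingLinkExtractor {
    // Simplified href matcher; a real HTML parser would be more robust.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    /** Returns the hosts linked from the page, excluding the seed's own host. */
    static Set<String> externalHosts(String html, String seedUrl) {
        URI base = URI.create(seedUrl); // assumes an absolute URL with a host
        String seedHost = base.getHost().toLowerCase(Locale.ROOT);
        Set<String> hosts = new TreeSet<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                // Resolve relative links against the seed URL.
                URI target = base.resolve(new URI(m.group(1).trim()));
                String host = target.getHost();
                if (host == null) {
                    continue; // mailto:, javascript: and similar links have no host
                }
                host = host.toLowerCase(Locale.ROOT);
                if (!host.equals(seedHost)) {
                    hosts.add(host);
                }
            } catch (URISyntaxException e) {
                // Skip links that are not valid URIs.
            }
        }
        return hosts;
    }
}
```

Subdomains are kept as distinct hosts here; whether they should be reduced to the registrable domain is left open.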

Part 2 : Check for domain metrics by Integration with API

After it scrapes these domains, I want to check the metrics of the extracted domains, such as PA, DA, TF, etc. This means the tool should integrate with the APIs of the [url removed, login to view], [url removed, login to view] and [url removed, login to view] services. It should also check whether each domain is available for registration.

I am aware that many similar scripts have been built successfully on Freelancer. I would be glad to award this project to a developer who has built one.


Inputs to the tool


* Mandatory - 1 or more seed urls

* Optional - Crawl depth (Default value = 0, max value = 10)

* Optional - TLD list (Default values = [.org, .net, .com, .info, .biz]) If user enters TLDs, then append them to existing ones.

* Optional - Number of parallel threads to use. (Default value = 6)

* Optional - Proxy server configuration
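
The inputs above could be captured in a small configuration object. This is only a sketch under assumptions: the class name CrawlConfig, its field names, and the choice to pass null for "not provided" are all hypothetical. The defaults and limits are the ones listed above.

```java
import java.net.Proxy;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical holder for the tool's inputs; null arguments mean "use the default".
final class CrawlConfig {
    static final int DEFAULT_DEPTH = 0;
    static final int MAX_DEPTH = 10;
    static final int DEFAULT_THREADS = 6;
    static final List<String> DEFAULT_TLDS = List.of(".org", ".net", ".com", ".info", ".biz");

    final List<String> seedUrls;
    final int depth;
    final Set<String> tlds;
    final int threads;
    final Proxy proxy;

    CrawlConfig(List<String> seedUrls, Integer depth, List<String> extraTlds,
                Integer threads, Proxy proxy) {
        if (seedUrls == null || seedUrls.isEmpty()) {
            throw new IllegalArgumentException("At least one seed URL is required");
        }
        int d = depth == null ? DEFAULT_DEPTH : depth;
        if (d < 0 || d > MAX_DEPTH) {
            throw new IllegalArgumentException("Depth must be between 0 and " + MAX_DEPTH);
        }
        int t = threads == null ? DEFAULT_THREADS : threads;
        if (t < 1) {
            throw new IllegalArgumentException("At least one thread is required");
        }
        // User-supplied TLDs are appended to the defaults, not substituted for them.
        Set<String> merged = new LinkedHashSet<>(DEFAULT_TLDS);
        if (extraTlds != null) {
            for (String tld : extraTlds) {
                merged.add(normalize(tld));
            }
        }
        this.seedUrls = List.copyOf(seedUrls);
        this.depth = d;
        this.tlds = Collections.unmodifiableSet(merged);
        this.threads = t;
        this.proxy = proxy == null ? Proxy.NO_PROXY : proxy;
    }

    // Lower-cases the TLD and makes sure it starts with a dot.
    private static String normalize(String tld) {
        String s = tld.trim().toLowerCase(Locale.ROOT);
        return s.startsWith(".") ? s : "." + s;
    }
}
```

For example, new CrawlConfig(List.of("https://seed.test/"), null, List.of("io"), null, null) would crawl at depth 0 with 6 threads and the five default TLDs plus .io.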

Output from the tool


* CSV file with list of domain names scraped
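
A minimal sketch of the CSV output, assuming a single column and a header row named "domain" (the column name and the class DomainCsvWriter are assumptions, not part of the brief):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical writer for the scraped domain list.
final class DomainCsvWriter {
    /** Writes the domains, de-duplicated and sorted, one per row. */
    static void write(Path file, Collection<String> domains) throws IOException {
        List<String> lines = new ArrayList<>();
        lines.add("domain"); // header row; the column name is an assumption
        for (String domain : new TreeSet<>(domains)) {
            lines.add(escape(domain));
        }
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    // Quotes a field only when it contains a comma, a quote or a line break.
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }
}
```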




1) Take 1 or more seed urls as input via UI field or from a file

2) Take the crawl/scrape depth (e.g., 1, 2, 3 and so forth), to be set in a parameter field

3) Take the TLDs from a list, to be set in a parameter field (.org, .net, .com, .info, .biz; the customer must be able to add more of their preferred TLDs)

4) It also needs to work with subdomains

5) Crawl the URLs for backlinks, showing the progress (for example, a count of processed URLs) so the customer knows that something is happening and the tool is working

6) If the backlink is invalid (e.g., HTTP 404 not found), write it to a separate file

7) If the depth is 0, crawl only the seed URL and its domain. If the depth is 1, crawl backlink domain [url removed, login to view] depth is 3, count the backlinks of the backlinks, and so forth.

8) Option to use proxies (to be set in a parameter field)

9) Use multiple threads to scrape

10) Save the invalid backlinks to a CSV file

11) Build a web application using JSP that runs on Tomcat. The WordPress site / pop-up window

a) should display the status of the scraping

b) should work in all browsers
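
The depth, multi-threading, progress and invalid-link requirements above can be sketched as a level-by-level crawl. This is an assumption-laden outline, not a finished crawler: the class DepthLimitedCrawler is hypothetical, and the page download is abstracted into a fetchLinks function that returns the links found on a URL and throws a RuntimeException when the URL is invalid (for example, HTTP 404). Depth 0 fetches only the seeds; each further level fetches the links found on the previous one.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical crawler skeleton; fetchLinks stands in for the real download step.
final class DepthLimitedCrawler {
    final Set<String> discovered = ConcurrentHashMap.newKeySet(); // links found
    final Set<String> invalid = ConcurrentHashMap.newKeySet();    // URLs that failed
    final AtomicInteger processed = new AtomicInteger();          // progress counter

    void crawl(Collection<String> seeds, int maxDepth, int threads,
               Function<String, Set<String>> fetchLinks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Set<String> visited = new HashSet<>(seeds);
            List<String> level = new ArrayList<>(visited);
            for (int depth = 0; depth <= maxDepth && !level.isEmpty(); depth++) {
                List<Callable<Set<String>>> tasks = new ArrayList<>();
                for (String url : level) {
                    tasks.add(() -> {
                        try {
                            return fetchLinks.apply(url);
                        } catch (RuntimeException e) {
                            invalid.add(url); // candidates for the separate "invalid" file
                            return Collections.<String>emptySet();
                        } finally {
                            processed.incrementAndGet();
                        }
                    });
                }
                List<String> next = new ArrayList<>();
                // invokeAll blocks until every URL of this level has been handled.
                for (Future<Set<String>> result : pool.invokeAll(tasks)) {
                    try {
                        for (String link : result.get()) {
                            discovered.add(link);
                            if (visited.add(link)) {
                                next.add(link); // crawl each URL at most once
                            }
                        }
                    } catch (ExecutionException e) {
                        // Tasks record their own RuntimeExceptions; other failures are skipped.
                    }
                }
                level = next;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop early but keep the interrupt flag
        } finally {
            pool.shutdown();
        }
    }
}
```

A status page could poll processed, discovered and invalid while the crawl runs, since all three are safe to read from another thread.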


1) Upload all the domains via the UI or a text file

2) Check each domain for Moz DA and PA, Majestic Trust Flow and Citation Flow, and domain availability
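
One way to keep the metric checks independent of any single provider is to hide each service behind a small interface and merge the results per domain. The sketch below is purely illustrative: MetricSource and DomainReport are hypothetical names, and no real Moz, Majestic or registrar API call is shown, since those depend on each service's own documentation and credentials.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical abstraction over one metrics or availability service. */
interface MetricSource {
    /** Returns metric name to value for the domain, for example "DA" to "42". */
    Map<String, String> lookup(String domain) throws Exception;
}

// Hypothetical aggregator: one row of metrics per uploaded domain.
final class DomainReport {
    /** Queries every source for every domain and merges the answers into one row each. */
    static Map<String, Map<String, String>> build(List<String> domains,
                                                  List<MetricSource> sources) {
        Map<String, Map<String, String>> report = new LinkedHashMap<>();
        for (String domain : domains) {
            Map<String, String> row = new LinkedHashMap<>();
            for (MetricSource source : sources) {
                try {
                    row.putAll(source.lookup(domain));
                } catch (Exception e) {
                    // A failing service should not stop the other checks.
                    row.put("error", String.valueOf(e.getMessage()));
                }
            }
            report.put(domain, row);
        }
        return report;
    }
}
```

Each real integration would then be one MetricSource implementation, and the merged rows could be written out as extra CSV columns.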

Deliverables & Scopes


Following are the deliverables the developer will provide the employer

1) A standalone Java program that scrapes

2) A web page to enter the inputs mentioned above

Example of an existing, working domain scraper:

[url removed, login to view]

Skills: Java, PHP, Software Architecture


About the employer:
(13 reviews) Belgaum, India

Project ID: #8159075

1 freelancer is bidding an average of $555 for this job

