Spider/extract a directory - only for professional programmers
$30-50 USD
Cancelled
Posted more than 14 years ago
Paid on delivery
I need a coder who is experienced in spider/extraction software.
I don't need the software itself, only a coder to extract URLs from a directory. In other words,
you write a script to extract the URLs from the directory I specify, and then you give me the output files
when finished.
I want both a complete accumulated list of all URLs found in the directory and
one individual text file per "category" (say "shopping" or "art"). These individual text files
must have one URL per line, and each file must be named after the category its URLs were extracted from.
The directory lists the PageRank for each URL, and I ONLY want the URLs that have a PageRank of at least 2.
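The output requirements above could be sketched roughly as follows. This is only an illustration, not a finished tool: it assumes the crawl step has already produced `(category, url, pagerank)` tuples, that category names are safe to use as file names, and that the PageRank threshold applies to the accumulated list as well as to the per-category files. The names `write_url_lists`, `MIN_PAGERANK`, and `all_urls.txt` are hypothetical.

```python
from collections import defaultdict
from pathlib import Path

MIN_PAGERANK = 2  # threshold stated in the posting


def write_url_lists(records, out_dir, min_pagerank=MIN_PAGERANK):
    """Write one text file per category plus one accumulated list.

    `records` is an iterable of (category, url, pagerank) tuples; how they
    are crawled from the directory is outside the scope of this sketch.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    by_category = defaultdict(dict)  # dicts act as insertion-ordered sets
    all_urls = {}

    for category, url, pagerank in records:
        if pagerank < min_pagerank:
            continue  # drop URLs below the minimum PageRank
        by_category[category][url] = None
        all_urls[url] = None

    # One file per category, named after the category, one URL per line.
    for category, urls in by_category.items():
        text = "".join(u + "\n" for u in urls)
        (out_path / f"{category}.txt").write_text(text, encoding="utf-8")

    # Accumulated list (assumed here to use the same PageRank filter).
    text = "".join(u + "\n" for u in all_urls)
    (out_path / "all_urls.txt").write_text(text, encoding="utf-8")

    return {category: list(urls) for category, urls in by_category.items()}
```

Duplicate URLs are written once per category and once in the accumulated list; if the accumulated list should instead contain every URL regardless of PageRank, the filter would need to move below the `all_urls` update.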
The directory has about 20,000 URLs and is very easy for crawler/spider software to crawl, since it is all text.
ONLY bid if you have vast experience in writing software that extracts this type of information. If you have little or no experience, then don't bid. Thanks :-)
IMPORTANT: Please describe your experience with this type of job and explain why you are the right coder for this task.
## Deliverables
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
## Platform
text