Website domain crawling

$250-750 USD

Closed
Posted almost 7 years ago

Paid on delivery
I need to do some large-scale website crawling for data mining purposes. It needs to scale to potentially millions of sites. Unless you have done a project like this, please do not bid. I need an expert who has done it before. I am not looking for a custom solution; I would expect that existing solutions can be leveraged.

Specs:
- Input is a list of URLs.
- Each URL is FULLY crawled starting at the depth provided. For example, if [login to view URL] is provided, only the data under [login to view URL] is crawled; it would not see [login to view URL].
- All pages are saved as HTML in a network file system under the top-level target domain directory.
- All pages and links are crawled, including pages that require a click via JavaScript or user interaction.
- All pages are saved as their rendered versions as HTML.
- Child pages that are behind a JavaScript link: when this happens, the link is converted to an HTML link and inserted into the page. The rendered child page is saved as HTML.
- Inserted HTML links (JavaScript click) should be in a human-readable format for QA.

Another process will consume and process the directory tree after each site is done. The crucial piece is that everything is traversable by following a regular HTML link.

Please suggest what technologies you would use to accomplish this project. Please show that you have read this fully by putting your favorite color in CAPS as the first word in your bid. Do not post a generic list of projects. I want to know what you have done in site crawling specifically. Thanks
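
To make the scoping and storage requirements concrete, here is a minimal Python sketch of two helpers a crawler might use. It is not taken from the posting. The function names `in_scope` and `local_path`, the `example.com` URLs, and the `/mnt/crawl` root are all hypothetical. Using the hostname as the "top-level target domain directory" and appending `index.html` or `.html` to paths are assumptions about how the spec could be read.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def in_scope(seed_url: str, candidate_url: str) -> bool:
    """True if candidate_url is on the seed's host and at or below the seed's path."""
    seed, cand = urlsplit(seed_url), urlsplit(candidate_url)
    if cand.scheme not in ("http", "https"):
        return False
    if (cand.hostname or "").lower() != (seed.hostname or "").lower():
        return False
    # Compare with trailing slashes so "/docs" does not match "/docs2".
    seed_prefix = seed.path.rstrip("/") + "/"
    cand_path = cand.path.rstrip("/") + "/"
    return cand_path.startswith(seed_prefix)


def local_path(root: str, url: str) -> PurePosixPath:
    """Map a URL to a file path under root/<host>/ (assumed layout).

    Query strings and fragments are ignored here, so a real crawler would
    need an extra rule to keep distinct query variants from colliding.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path
    if not path or path.endswith("/"):
        path += "index.html"          # directory-style URL
    elif not path.endswith((".html", ".htm")):
        path += ".html"               # save the rendered page as HTML
    return PurePosixPath(root) / host / path.lstrip("/")


if __name__ == "__main__":
    seed = "https://example.com/docs"
    for url in ("https://example.com/docs/a/b", "https://example.com/blog"):
        print(url, in_scope(seed, url), local_path("/mnt/crawl", url))
```

A filter like `in_scope` would be applied to every discovered link before it is queued, including links synthesized from JavaScript click handlers, so that the saved tree stays under the seed URL.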
Project ID: 14450929

About the project

4 bids
Remote project
Last active 7 years ago

About the client

Oconomowoc, United States
4.8
158
Payment method verified
Member since Feb 16, 2007

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)