Python Web Crawler for large websites

I am looking for a detailed web crawl of any website.

I aim to crawl each page of a website, pick out only certain information, and store it in a database (a suitable one, to be suggested by you).

The input will be a domain. You need to find a way to compile all of its URLs and then collect the information as laid out in the Excel sheet.

- Tab “Crawled URLs” lists all the URLs of the site

- Tab “Internal Links Raw Data” lists the details of every internal link
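The two tabs above could be fed by a single breadth-first crawl. The following is a minimal sketch, not a definitive implementation: it uses only the Python standard library, and the names `LinkParser`, `crawl`, `fetch`, and `max_pages`, as well as the `source`/`target`/`anchor` fields, are assumptions for illustration (the actual columns are defined in the attached sheet). Fetching is injected as a callable so the crawl logic can be run without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect (href, anchor text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def crawl(start_url, fetch, max_pages=1000):
    """Breadth-first crawl restricted to the host of start_url.

    fetch(url) is a hypothetical callable returning the page HTML as a
    string, or None on failure. Returns (crawled_urls, internal_links).
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    crawled, internal_links = [], []

    while queue and len(crawled) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        crawled.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href, text in parser.links:
            # Resolve relative links and drop #fragments before comparing.
            target, _ = urldefrag(urljoin(url, href))
            parsed = urlparse(target)
            if parsed.scheme not in ("http", "https") or parsed.netloc != host:
                continue  # skip external, mailto:, javascript: links, etc.
            internal_links.append({"source": url, "target": target, "anchor": text})
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return crawled, internal_links
```

In practice `fetch` might wrap an HTTP client and add politeness delays, robots.txt checks, and retries; pages rendered by JavaScript would need a browser-based fetcher instead. Those parts are deliberately left out of this sketch.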

Each crawl's data should be recorded under a unique crawl ID. This is the first phase of the project. We will expand the scope once we are getting the data correctly and reliably for large websites.
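One way to keep every crawl under its own ID is a small relational schema. The sketch below assumes SQLite purely for illustration (the posting leaves the database choice open), and the table names, column names, and helper functions `open_store`, `start_crawl`, and `save_results` are hypothetical; the real columns would follow the attached sheet.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Assumed schema: one row per crawl, with both data tables keyed by crawl_id.
SCHEMA = """
CREATE TABLE IF NOT EXISTS crawls (
    crawl_id   TEXT PRIMARY KEY,
    domain     TEXT NOT NULL,
    started_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS crawled_urls (
    crawl_id TEXT NOT NULL REFERENCES crawls(crawl_id),
    url      TEXT NOT NULL,
    PRIMARY KEY (crawl_id, url)
);
CREATE TABLE IF NOT EXISTS internal_links (
    crawl_id    TEXT NOT NULL REFERENCES crawls(crawl_id),
    source_url  TEXT NOT NULL,
    target_url  TEXT NOT NULL,
    anchor_text TEXT
);
"""


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def start_crawl(conn, domain):
    """Register a new crawl and return its unique ID."""
    crawl_id = uuid.uuid4().hex
    started = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO crawls VALUES (?, ?, ?)", (crawl_id, domain, started)
        )
    return crawl_id


def save_results(conn, crawl_id, urls, links):
    """Store URLs and (source, target, anchor) link tuples for one crawl."""
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO crawled_urls VALUES (?, ?)",
            [(crawl_id, u) for u in urls],
        )
        conn.executemany(
            "INSERT INTO internal_links VALUES (?, ?, ?, ?)",
            [(crawl_id, s, t, a) for s, t, a in links],
        )
```

Because every row carries `crawl_id`, two crawls of the same domain stay separate and can be compared later. For very large sites a server database may be a better fit than SQLite; the same table layout would carry over.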

I can explain the details of the required information in the attached sheet.

To qualify for serious consideration of your proposal, you must provide the following in your bid:

- Which Python library/package you will use, and why

- Which challenges you foresee and how you will overcome them. It is extremely important to give details here. This is your chance to show how good a fit you are for this project.

- What you suggest for data storage, and why

- Which similar project you have done before, and whether I can see it in action

Please note that without the points above in your bid, we are unlikely to consider it seriously.

Skills: Python, Web Crawling

About the client:
(2 reviews) Kolkata, India

Project ID: #34030972

12 freelancers are bidding an average of $176 for this job


Hello, sir! How are you? I am a web scraping specialist with extensive experience in web scraping. I have been using bs4, selenium or scrapy... I have also scraped dozens of sites at once. At that time, there were a ...

(amount not shown) USD in 1 day
(28 reviews)
(14 reviews)

Hello sir, I am a Python developer with more than 2 years of experience. I have done many projects in the past. I can work on: 1. Web Scraping / Data Science / ML 2. Django 3. App development 4. C/C++ 5. WordPress. Let's ...

(amount not shown) USD in 4 days
(41 reviews)

Hi, the attachment shows some of your requirements. I would like to work on this project, but would like to ask some questions to make things clear. Here are my answers to your questions: 1- Python Selenium, Beautiful ...

(amount not shown) USD in 3 days
(8 reviews)

Hello, I am interested in working on this project. I plan to use libraries like requests, bs4 and selenium: requests for making HTTP requests to the page, bs4 for scraping and filtering the site HTML, selenium for dynam ...

(amount not shown) USD in 7 days
(19 reviews)

Hello: After reading in detail the requirements of your project and concluding that they match my areas of knowledge and skills, I would like to introduce myself. My name is Anthony Muñoz and I am the lead engineer ...

(amount not shown) USD in 7 days
(1 review)

Hi. I’m an experienced data engineer. I use Python and MySQL/Oracle/Hive databases in my professional life. I’m experienced in data mining, so crawling is not a big deal for me. I’m doing PhD research which includes webs ...

(amount not shown) USD in 7 days
(2 reviews)

Hi there, I am a web scraping and automation expert with more than 3 years of experience. I have seen your requirements and, based on them, I would use beautifulsoup and selenium as libraries. There might be one probl ...

(amount not shown) USD in 7 days
(1 review)

Hey dear, we are a 45-person team and deliver several services: 1. React Native Experts and Developer 2. Digital Marketing (social media & management) 3. Designing (Photoshop and Illustrator) 4. Android Development (ja ...

(amount not shown) USD in 2 days
(0 reviews)

Hello, I am willing and able to help with your web scraping project. I am a seasoned Python programmer who specialises in data extraction (selenium, BeautifulSoup, request library etc.), modeling, process ...

(amount not shown) USD in 4 days
(0 reviews)

Hi, my name is Mohamed Khaled. I'm a data analyst. I can do this job for you as an I/O console application, as I've made a similar project in Python; its goal is to scrape Amazon search results for whatever you input, such as " ...

(amount not shown) USD in 1 day
(0 reviews)

Hello there, hope you are doing well! [login to view URL] is available 24/7 for a Zoom call. We have representatives all over the world. The attachment shows some of your requirements. I would like to work on this proj ...

(amount not shown) USD in 7 days
(0 reviews)