Completed

Multi-threaded Python scraper optimization project

I have a proxy-supported Python scraper, hosted on Heroku and built with the Django framework, that does the following:

1) The user inputs a list of keywords.

2) The scraper goes to [login to view URL], searches for each keyword, and returns product information for the first page of search results (usually 60 products per keyword). The product information scraped includes title, description, features, review rating, price, seller, etc. It also uses the [login to view URL] API to return Google Keyword Planner search volume data for each search term.

3) When complete, it exports the data to a spreadsheet.
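As a rough illustration of the three steps above (not the actual scraper code, which is not shown in this post), the flow can be sketched in a memory-friendly way by yielding one product row at a time and writing it straight to the export. Every name here (`scrape_all`, `export_csv`, `fake_fetch`, `FIELDS`) is hypothetical, the field list is an assumed subset of the exported columns, and CSV stands in for the spreadsheet format.

```python
import csv
import io
from typing import Callable, Iterable, Iterator

# Assumed subset of the exported columns; the real export has more fields.
FIELDS = ["keyword", "title", "price", "seller"]


def scrape_all(keywords: Iterable[str],
               fetch_products: Callable[[str], Iterable[dict]]) -> Iterator[dict]:
    """Yield one row at a time instead of building a full list in memory."""
    for keyword in keywords:
        for product in fetch_products(keyword):
            yield {"keyword": keyword, **product}


def export_csv(rows: Iterable[dict], out) -> int:
    """Write rows as they arrive and return the number of rows written."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


def fake_fetch(keyword: str) -> Iterator[dict]:
    # Hypothetical stand-in for the real search-and-parse step.
    for i in range(3):
        yield {"title": f"{keyword} item {i}", "price": "9.99", "seller": "demo"}


buffer = io.StringIO()
written = export_csv(scrape_all(["lamp", "desk"], fake_fetch), buffer)
```

Because `scrape_all` is a generator, only one row is held at a time; whether that helps in practice depends on how the real scraper stores its results.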

The scraper is fully functional; there are no issues there.

The issue I am having is that my scraper uses too much memory and stops abruptly when it hits the usage limit on my Heroku account (1 GB). I am looking for someone with experience troubleshooting and optimizing Python scraping code who could make my scraper run more efficiently. There may be a smoking gun that is the main cause of the memory leak; I'm not sure. See the attached image of the Heroku metrics page showing the memory spike.
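One common way to look for such a smoking gun is Python's standard `tracemalloc` module, which reports the source lines that allocated the most memory between two snapshots. The sketch below is only an illustration under assumptions: `top_allocations` and `leaky_job` are hypothetical names, and `leaky_job` is a stand-in that deliberately keeps every "page" in a list, not the real scraper.

```python
import tracemalloc


def top_allocations(work, limit=5):
    """Run work() and return its result plus the largest allocation diffs."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    result = work()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Diffs are sorted from the biggest change in allocated size downwards.
    stats = after.compare_to(before, "lineno")
    return result, stats[:limit]


def leaky_job():
    # Hypothetical stand-in: holds on to every downloaded "page".
    pages = []
    for _ in range(200):
        pages.append(bytearray(10_000))
    return pages


pages, stats = top_allocations(leaky_job)
for stat in stats:
    print(stat)  # file, line number, and size difference for each hot spot
```

Run against one keyword of the real job, the top entries would point to the lines where memory accumulates, which is a starting point rather than a diagnosis.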

If it cannot be optimized further, the last resort I can think of is to restrict the number of scraped results per keyword (for example, 25 scraped products per keyword) so that the export file doesn't return all 60 results per page, thereby reducing both run time and memory use. This would also require altering the code that restricts the number of keywords the user can search on each run.
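A minimal sketch of that fallback, assuming the scraper can expose products as an iterator: `itertools.islice` stops consuming after a fixed number of items, so the remaining products on the page are never parsed or stored. The constants and function names are hypothetical, and the caps are example values only.

```python
from itertools import islice

MAX_KEYWORDS = 10              # example cap on keywords per run
MAX_RESULTS_PER_KEYWORD = 25   # example cap on products per keyword


def limit_keywords(keywords):
    """Drop blanks and duplicates, then keep at most MAX_KEYWORDS."""
    cleaned = dict.fromkeys(k.strip() for k in keywords if k.strip())
    return list(islice(cleaned, MAX_KEYWORDS))


def limit_results(products):
    """Stop consuming the product iterator after the per-keyword cap."""
    return list(islice(products, MAX_RESULTS_PER_KEYWORD))


keywords = limit_keywords(["lamp", " desk ", "", "lamp"] + [f"kw{i}" for i in range(20)])
results = limit_results(iter(range(60)))  # 60 stands in for one page of products
```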

The deliverables for this project include:

1) Optimizing the source code (currently on Heroku) so I can scrape a minimum of 10 keywords and 25 results per keyword in each run without the scraper stopping.

2) A couple of really simple tweaks to fix a couple of fields in the export data.

I'm hoping to find someone with strong experience in Python scraping, Django, and proxies who can fix my problem simply and quickly. I have a lot of potential future work for someone highly skilled in Python web scraping. Please message me with further questions.

Skills: Django, Python, Software Architecture, Web Scraping

See more: scrape website with login python, web scraping tutorial, web scraping with python, python multithreading, web scraping multiple pages python, parallel web scraping python, python web scraping projects, python multiprocessing scraping, seo site optimization project website, multi threaded myspace, multi threaded crawler, optimization project report template, webbrowser multi threaded, php multi threaded, ajax multi threaded aspnet, python multi threaded crawler, calculus optimization project, sample ecommerce project questions, vb6 multi threaded file reader, net ajax multi threaded

About the employer:
(24 reviews) Gates Mills, United States

Project ID: #19700691

Awarded to:

logos106

Hello there. I have good experience with multithreading in Python. I want to check your scraper script. I can't be sure without checking the code. Please message me. Thanks.

USD in 5 days
(11 reviews)
4.9

22 freelancers are bidding an average of $199 for this job

ValueCoders

Hi there, I can do it very quickly and effectively. I have more than 18 years of web development experience. Looking forward to working with you! Thanks!

USD in 7 days
(85 reviews)
6.8
umg536

Hello there, this is a default bid. We'll discuss the price later in the chat. After reading your project, I can do this for you perfectly. I still have a few questions. Please leave a message on my chat so we can di...

USD in 3 days
(17 reviews)
6.1
kipdev13

Hello, how are you? I have many years of experience in Python programming. I will complete your project quickly and you will get the best result. I will do my best for you and you will get the best results. Plz c...

USD in 3 days
(88 reviews)
6.3
USD in 2 days
(35 reviews)
6.1
Mickelson

Hi, nice to meet you. I have plenty of experience with Python scripting. Below are the libraries I have used in past projects: selenium, pandas, matplotlib, lxml, beautifulsoup, scipy, and other useful libraries. I have written...

USD in 3 days
(66 reviews)
6.3
lightingdavid

Hello. I'm an expert in "Django, Python, Software Architecture, Web Scraping" and I have been working in this field for 7+ years. I'm very interested in your project. I have checked your project description carefully and i...

USD in 1 day
(90 reviews)
5.8
NavyaSales

Hi, I am an ex-Microsoft employee, an expert web scraper, and an experienced all-round Python/Django developer. I am working with a small team of talented developers and am confident of optimizing your Python scrapin...

USD in 3 days
(27 reviews)
5.7
ymograi

Sir/Madam, I am an experienced Python developer. I have built an Amazon keyword scraper that is very efficient at scraping keywords from the Amazon website. I believe I can fix it for you. I look forward to working w...

USD in 7 days
(42 reviews)
5.1
Stephenrajs

(If you choose me, please at least send a "Hi" message. If you invite me to the project I can't respond to you; these are Freelancer's rules, so this is my request.) Hi there, I have scraped Amazon, AliExpress, Yellow Pages, Yelp, zoma...

USD in 7 days
(90 reviews)
5.1
BestService222

Dear Sir, nice to meet you. I read your project description carefully and am very interested in working on your project. I am able to provide the best product with good performance and offer a good resu...

USD in 3 days
(17 reviews)
4.8
kkc1985612

Hi, I looked at your description carefully and I am very interested in offering my skills in the hope of working with you. I have ample experience in web scraping. - Scrape any website using selenium, BeautifulSoup, reques...

USD in 2 days
(25 reviews)
5.0
albertpopov46

Hi, dear. I have a lot of development experience. As I am interested in your project, I want to work on it. I've read the project description and can definitely complete your task. If you hire me, I wil...

USD in 7 days
(13 reviews)
4.3
JinTaiZhe

Hi, I am pleased to submit my qualifications to you. I am very happy to work with you on this proposal. I have read your description carefully and I am very interested in your project. I have rich experience in web scr...

USD in 7 days
(5 reviews)
4.3
todo2095

Hi there, I read your description; it looks good and I am sure I can do it. I am a senior full-stack developer, so I am sure I can bring you the perfect result you want within a short time. I'm ready to start your project i...

USD in 7 days
(5 reviews)
3.5
tascoin

Hi Sir/Ma'am, => Expertise with Python, Selenium, Django, OpenCV, Flask => Experience with Node.js, APIs, MongoDB, PostgreSQL, AWS => Experience with back-end development and custom coding => Experience of working...

USD in 10 days
(2 reviews)
2.4
jinxueqiong0910

Hello, I am an experienced web developer (6+ years). I am really interested in your project and I've done similar work before. When I was a student, I started web development based only on PHP. But with time,...

USD in 3 days
(1 review)
2.0
SuperM88

Hi there. I am a Python expert, and after reading your requirements carefully, I am sure I can write a Python application for you that can scrape the Amazon sites.

USD in 3 days
(0 reviews)
0.0
sinancetinkaya

I don't know which scraping technique the previous developer used. You have to show us the code. Then we can give you ideas for optimizing your code.

USD in 7 days
(0 reviews)
0.0
Mindlarz

We are a web development and design company offering various other services, such as web content and digital marketing, across all types of specializations. Additional considerations and implementations are assisted...

USD in 3 days
(0 reviews)
0.0
apextechnomatics

Hello, I hope you are doing well. I have in-depth technical knowledge of Python, Flask, Node.js, JavaScript, PostgreSQL administration, Django, MySQL, MongoDB, XML-RPC services, stores, ERP/CRM portals and implement...

USD in 3 days
(0 reviews)
0.0