Python web scraping into a MySQL DB

I require a Python script to parse HTML files into a MySQL DB. The HTML files are located inside zip files and are identical in structure.

The Python script(s) must perform the following steps:

- Locate and open zip files located in a specified folder

- Extract the HTML files from each zip file into a temporary folder for parsing

- Parse each of the HTML files for the relevant data

- Save the relevant data to a MySQL DB
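
The four steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the real table structures and the fields to extract are not given in this posting, so the `pages` table, its `source_file` and `title` columns, and the `TitleParser` class are hypothetical stand-ins. The database step is written against the generic Python DB-API, so any MySQL driver connection could be passed in; the `placeholder` argument is an assumption that covers drivers using `%s` versus `?`.

```python
import tempfile
import zipfile
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collects the <title> text; a stand-in for the real fields to extract."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_html(zip_dir, work_dir):
    """Steps 1-2: open each zip in zip_dir and extract its HTML members."""
    extracted = []
    for zip_path in sorted(Path(zip_dir).glob("*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            for name in archive.namelist():
                if name.lower().endswith((".html", ".htm")):
                    # One subfolder per zip avoids file name collisions.
                    target_dir = Path(work_dir) / zip_path.stem
                    extracted.append(Path(archive.extract(name, target_dir)))
    return extracted


def parse_file(html_path):
    """Step 3: parse one HTML file into a dict of (hypothetical) fields."""
    parser = TitleParser()
    parser.feed(html_path.read_text(encoding="utf-8", errors="replace"))
    return {"source_file": html_path.name, "title": parser.title.strip()}


def save_rows(connection, rows, placeholder="%s"):
    """Step 4: insert rows through a DB-API connection (hypothetical table)."""
    if not rows:
        return
    sql = (
        "INSERT INTO pages (source_file, title) "
        f"VALUES ({placeholder}, {placeholder})"
    )
    cursor = connection.cursor()
    # Parameterized queries keep scraped text from being read as SQL.
    cursor.executemany(sql, [(r["source_file"], r["title"]) for r in rows])
    connection.commit()


def run(zip_dir, connection, placeholder="%s"):
    """Run all four steps and return the number of rows saved."""
    with tempfile.TemporaryDirectory() as work_dir:
        rows = [parse_file(p) for p in extract_html(zip_dir, work_dir)]
    save_rows(connection, rows, placeholder)
    return len(rows)
```

The temporary folder is removed automatically once parsing finishes, and only the parsed rows are kept for the insert.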

The successful bidder will be supplied with the following:

- several hundred sample zip files in order to complete the job

- the MySQL db table structures

The following deliverables are expected upon completion:

- One or more clearly documented/commented Python scripts

Prior HTML web scraping experience is a must. Excellent knowledge of Python, MySQL and regular expressions is essential. If you do not have these qualifications, please don't bid.

Skills: MySQL, Python, Web Scraping

About the employer:
(3 reviews) Kogarah, Australia

Project ID: #2410020

Awarded to:


Hi. I'm an experienced Python developer. My specialization is data mining and scraping. I will be happy to help you.

USD in 1 day
(3 reviews)

12 freelancers are bidding an average of $112 for this job


Hello, I can do the same for you using PHP.

USD in 1 day
(18 reviews)

Hi. I have a lot of experience in scraping data. Please check your PM for more details. Thanks

USD in 2 days
(30 reviews)

Hello, the task seems to be easy to accomplish and I am ready to start working right away.

USD in 4 days
(1 review)

Hello sir, proficient web scraper here. This task doesn't look like much trouble, and I could do it to a high standard. Please respond to the private message.

USD in 2 days
(4 reviews)

Hi, I'll give you the exact solution you require, but in PHP. Contact me if you want it in PHP.

USD in 3 days
(2 reviews)

G'day :) I'd probably prefer to have SQLAlchemy as a dependency, which will make it very easy to switch databases. Also, either lxml or BeautifulSoup, depending on how "tag soup-y" the HTML is.

USD in 2 days
(2 reviews)

Hi there, I have expertise in MySQL databases, Python and web scraping. The problem statement seems rather simple to me. I have done more complex web scraping. This job can be done easily by me. I can do this for you in Plus

USD in 2 days
(3 reviews)

I have prior experience writing production quality web crawlers.

USD in 1 day
(0 reviews)

Custom software development (Removed by Admin)

USD in 1 day
(0 reviews)

I'm new to Freelancer.com, but have worked professionally as a developer for over 5 years. Check your messages for details on me and my bid.

USD in 7 days
(0 reviews)

I'm very skilled at regex work with Python and at MySQL data management.

USD in 5 days
(0 reviews)