I require a Python script to parse html files and load the extracted data into a MySQL DB. The html files are located inside zip files and are identical in structure.
The Python script(s) must perform the following steps:
- Locate and open zip files located in a specified folder
- Extract the html files from any zip files into a temporary folder for parsing
- Parse each of the html files for the relevant data
- Save the relevant data to a MySQL db
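
The steps above can be sketched roughly as follows, using only the Python standard library for the zip handling and parsing. This is a minimal illustration under stated assumptions, not the required implementation: the real table structures and the "relevant data" are not given here, so the `pages` table, its columns, and the choice of the `<title>` text as the extracted field are all hypothetical placeholders. `save_rows` expects any DB-API 2.0 connection (for example one opened with a MySQL driver); the `%s` placeholder is the style MySQL drivers commonly use.

```python
import tempfile
import zipfile
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collects the text inside <title> (a stand-in for the real fields)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_html(text):
    """Return the hypothetical 'relevant data' (here: the page title)."""
    parser = TitleParser()
    parser.feed(text)
    parser.close()
    return parser.title.strip()


def extract_rows(zip_folder):
    """Yield (zip name, html name, title) for each html file in each zip."""
    for zip_path in sorted(Path(zip_folder).glob("*.zip")):
        # Each zip gets its own temporary folder, removed once it is parsed.
        with tempfile.TemporaryDirectory() as tmp, zipfile.ZipFile(zip_path) as archive:
            for name in sorted(archive.namelist()):
                if not name.lower().endswith((".html", ".htm")):
                    continue
                # extract() returns the normalized path of the file it created.
                html_path = Path(archive.extract(name, tmp))
                text = html_path.read_text(encoding="utf-8", errors="replace")
                yield zip_path.name, html_path.name, parse_html(text)


def save_rows(conn, rows, placeholder="%s"):
    """Insert rows via a DB-API 2.0 connection (table and columns are assumed)."""
    sql = (
        "INSERT INTO pages (zip_name, html_name, title) "
        f"VALUES ({placeholder}, {placeholder}, {placeholder})"
    )
    cur = conn.cursor()
    cur.executemany(sql, list(rows))
    conn.commit()
    cur.close()
```

With a MySQL connection `conn` already open, the whole run would be `save_rows(conn, extract_rows("/path/to/zips"))`, where the folder path is a placeholder. The file encoding (UTF-8 with replacement of bad bytes) is also an assumption and should be checked against the sample files.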
The successful bidder will be supplied with the following:
- several hundred sample zip files in order to complete the job
- the MySQL db table structures
The following deliverables are expected upon completion:
- One or more clearly documented/commented Python scripts
Prior html web scraping experience is a must. Excellent knowledge of Python, MySQL and regular expressions is essential. If you do not have these qualifications, please don't bid.
Awarded to:
12 freelancers bid an average of $112 for this job
Hello sir, proficient web scraper here. This task looks straightforward, and I could deliver it to a high standard. Please respond to the private message.
G'day :) I'd probably prefer to have SQLAlchemy as a dependency, which will make it very easy to switch databases. Also, either lxml or BeautifulSoup, depending on how "tag soup-y" the html is.