In progress

Small Python multithreaded crawler

I am looking for someone who can develop the following multithreaded Python script:

A database contains a large number of URLs that need to be visited (column page_url).

When visiting each page, the script needs to check whether a certain URL is still present in the source code of that page. This URL is also stored in the DB (column image_url).

If the image link is still found in the source code, the value "YES" is written to a certain DB column and checking continues with step 2.

If the image link is no longer found in the source code, the value "NO" is written to a certain DB column. Nothing more happens.
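
The first check could be sketched roughly as follows. This assumes the page source has already been downloaded as a string; the function name `image_link_status` is hypothetical, and the plain substring match is an assumption, since the posting does not say whether relative or URL-encoded variants of the link should also count.

```python
def image_link_status(page_source: str, image_url: str) -> str:
    """Return "YES" if image_url appears verbatim in page_source, else "NO".

    Assumption: an exact, case-sensitive substring match is sufficient.
    """
    return "YES" if image_url in page_source else "NO"


if __name__ == "__main__":
    source = '<html><body><img src="http://example.com/a.jpg"></body></html>'
    print(image_link_status(source, "http://example.com/a.jpg"))
    print(image_link_status(source, "http://example.com/b.jpg"))
```

The returned value is what would be written to the status column; only rows with "YES" would go on to step 2.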

Step 2:

For the pages where the image URL is still present, the script needs to check whether a certain text string (from the DB) is included in the source code of the page.

Here, it needs to be differentiated where the text string is found. For this task, occurrences of the text string inside the image_url must be excluded.

First check: is the text string present anywhere on the specific URL (page_url from the database)?

If NO, write "NO" to a certain column of the DB.

If YES, continue checking:

If the text string is part of an "alt-tag", write "ALT" to a certain column of the DB.

If the text string is part of a "title-tag", write "MOUSEOVER" to a certain column of the DB.

If the text string is not part of the image_url, not part of an alt-tag and not part of a title-tag, write "YES" to a certain column of the DB.
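
One possible reading of step 2 is sketched below with the standard-library HTML parser. Several points are assumptions rather than part of the posting: "alt-tag" and "title-tag" are taken to mean the `alt` and `title` attributes of any element, ALT takes precedence over MOUSEOVER when both match, matching is exact and case-sensitive, and the names `AttributeCollector` and `classify_text` are hypothetical.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collects the values of all alt and title attributes in a document."""

    def __init__(self):
        super().__init__()
        self.alt_values = []
        self.title_values = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <img ... />.
        for name, value in attrs:
            if value is None:
                continue
            if name == "alt":
                self.alt_values.append(value)
            elif name == "title":
                self.title_values.append(value)


def classify_text(page_source: str, image_url: str, text: str) -> str:
    """Return "NO", "ALT", "MOUSEOVER" or "YES" for the given text string."""
    # Exclude matches that only exist inside the image URL itself.
    remaining = page_source.replace(image_url, "")
    if text not in remaining:
        return "NO"
    collector = AttributeCollector()
    collector.feed(remaining)
    if any(text in value for value in collector.alt_values):
        return "ALT"
    if any(text in value for value in collector.title_values):
        return "MOUSEOVER"
    return "YES"


if __name__ == "__main__":
    url = "http://example.com/red-shoes.jpg"
    page = '<img src="%s" alt="red-shoes on sale">' % url
    print(classify_text(page, url, "red-shoes"))
```

Removing the image URL before searching means a string that occurs only in the image's file name is reported as "NO", which matches the exclusion rule above.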

- As there are a lot of sites that need to be checked, the Python script needs to be multithreaded.

- A list of proxies will be provided that shall be used for accessing the page_urls. The proxy shall be changed on each visit to a new page_url.

- There shall be an option to set a waiting time between two page_url requests made by the same thread.
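
The three requirements above (threads, a new proxy per page_url, a per-thread waiting time) could fit together roughly as in the sketch below. It is only an illustration under stated assumptions: the names `ProxyRotator`, `fetch_via_proxy` and `crawl` are hypothetical, proxies are assumed to be a non-empty list of strings like `http://host:port`, rotation is simple round-robin, and database access is left out entirely.

```python
import itertools
import queue
import threading
import time
import urllib.request


class ProxyRotator:
    """Hands out proxies round-robin; safe to call from several threads."""

    def __init__(self, proxies):
        self._cycle = itertools.cycle(proxies)  # assumes a non-empty list
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            return next(self._cycle)


def fetch_via_proxy(url, proxy, timeout=20):
    """Download url through the given proxy and return the decoded source."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(page_urls, proxies, num_threads=4, delay=1.0, fetch=fetch_via_proxy):
    """Fetch every page_url; returns {page_url: (proxy_used, source_or_None)}."""
    rotator = ProxyRotator(proxies)
    todo = queue.Queue()
    for url in page_urls:
        todo.put(url)
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                url = todo.get_nowait()
            except queue.Empty:
                return  # nothing left to do for this thread
            proxy = rotator.next()  # a different proxy for each page_url
            try:
                source = fetch(url, proxy)
            except Exception:
                source = None  # failed fetch; the caller decides how to record it
            with results_lock:
                results[url] = (proxy, source)
            time.sleep(delay)  # waiting time before this thread's next request

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

The `fetch` parameter is there so the download step can be swapped out, for example for a different HTTP library or for a stub during testing. How failed requests should be recorded in the DB is not specified in the posting, so the sketch only stores `None` for them.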

Skills: PHP, Python, Software Architecture

See more: code python online, java crawler multithread source, site crawler multithread, python small applikation, python small project, crawler multithread java source code download, crawler spider python, crawler multithread java, site crawler code python, crawler multithread, application multithread python

About the employer:
(105 reviews) Lüneburg, Germany

Project ID: #8493145

Awarded to:

strikovcobalt

Hi, I have extensive experience with Python and web scraping, so I believe I can handle the job. For this project I would like to use the requests library ([URL removed]) and BeautifulS…

$150 USD in 2 days
(9 reviews)
3.0

16 freelancers bid an average of $225 for this job

helmot

I have 10+ years of experience in Python programming. I am an expert in web crawling. You can find tens of such projects done successfully here on Freelancer on my profile. I have used multithreading and proxies for ma…

$200 USD in 3 days
(135 reviews)
7.7
mituld

A proposal has not yet been provided

$237 USD in 10 days
(239 reviews)
7.7
chirgeo

Hi. I read the description and it seems clear. A few questions so far: 1. On which OS will you run the script? 2. Can I have access to the DB for testing? 3. Which DB are you using? 4. Do you already ha…

$250 USD in 1 day
(69 reviews)
6.6
gangabass

I'm a Python developer with many years of experience, which is why I'm sure you'll be impressed with my work. I can create a script which will connect to the defined database (MySQL?) and get the list of URLs to check and imag…

$188 USD in 2 days
(177 reviews)
6.3
codinglover89

Hi, I have implemented a Python crawler script with proxies similar to your project. Please check my finished project: [URL removed] Cheers.

$350 USD in 3 days
(13 reviews)
5.9
livegoodlife

A proposal has not yet been provided

$250 USD in 3 days
(10 reviews)
4.5
Webbleu

Hi, thanks for your project post. We are a reputed organization with a team of 40+ developers with core expertise in web design and web development. We believe that we would best suit your requirements…

$368 USD in 3 days
(14 reviews)
3.8
nrasic

Hello, I have a lot of knowledge and experience for this job. If you hire me, this project will be done efficiently and fast. Feel free to contact me if you have any questions. Kind regards, Nino Rasic

$222 USD in 3 days
(7 reviews)
2.7
anupkelkar02

I am a Python expert with experience in web scraping, API integration and lexical analysis. I am also an Oracle Certified Professional (OCP) with experience in Oracle, MySQL, SQL Server and MongoDB. I can help you…

$334 USD in 6 days
(3 reviews)
2.6
statAnalysis

Hi, I am the founder of a small Austrian company focusing on data analysis. We can handle jobs in the fields of data wrangling, data science and data visualization. This job would be in the field of data wrangling. For…

$199 USD in 3 days
(1 review)
2.1
travissarleslee

I will write a Python script that will: - Read your database using a Python wrapper library depending on the database. - Read a file with the list of proxies. - Iterate over the page_urls and proxies. - Call the check_url…

$222 USD in 3 days
(1 review)
2.2
alexeykurinnij

I have a system administration background and practical experience with multithreading, HTTP GET and parsing. My profile is new, but I am not new to Python :) I have one question about the last line: "There shall be…

$111 USD in 7 days
(2 reviews)
1.6
visionsoft7

Dear Prospective Hiring Manager, thank you for giving me a chance to bid on your project. I am a serious bidder here, I have already worked on a similar project before, and I can deliver as you have mentioned. I have…

$147 USD in 3 days
(1 review)
0.3
alenaket

A proposal has not yet been provided

$211 USD in 8 days
(0 reviews)
0.0
hackplanettech

Dear Client, thank you for viewing my profile. Your requirements can be fulfilled by the experts of our tech team. They need to discuss your detailed requirements because the client's requirements are a priority for u…

$200 USD in 5 days
(0 reviews)
0.0
mohamadserag

I love programming with Python. I'm new to this site and I'm doing this as a hobby; I have a well-paying full-time job. I've read many books and written some applications, but I want to get involved in more projects. So…

$100 USD in 7 days
(0 reviews)
0.0