I am looking for a guy who can develope the following multithead python script:
In a database, there a tons of urls that need to be visited. (Column page_url)
Visiting the side, it need to be, whether a certain URL is still in the source text of the visited site. This URL is part of the DB as well (image_url)
If the image link is still found in the source code, the value „YES“ is printed into a certain DB column and checking will continue in step 2.
If the image link is not found any more in the source code, the value „NO“ is printed into a certain DB column. Nothing more happens.
For the websites where the image URL is still online, it needs to be checked whether a certain text string (out of DB) is included in the source code of the website.
Here, it needs to be differenciated where the text string is found. For that task, text strings, that are included in the image_url needs to be excluded.
First check: Is the text string generally available on the specific url (page_url out of database)
If NO, print „NO“ to a certain column of the DB.
If YES, continue checking:
Is the text string part of an „alt-tag“, print „ALT“ to a certain column oft he DB.
Is the text string part of an „title-tag“, print „MOUSEOVER“ to a certain column of the DB.
Is the text string not part of the image_url, not part of an alt-tag and not part of a title-tag, print „YES“ to a certain column of the DB.
- As there are a lot of sites that need to be checked, I need the python script multitheading.
- A list of proxies will be provided that shall be used for accessing the page_urls. The proxy used shall be changed each visit of a new page_url.
- There shall be the option to set a waiting time between accessing 2 page_urls by one threat.
Hi I have an extensive experience with Python and web scraping, so I believe I can handle the job. For this project I would like to use requests library ([login to view URL]) and BeautifulS Plus
16 freelance font une offre moyenne de $225 pour ce travail
I have +10 years of experience in Python programming. I am expert in web crawling. You can find tens of such projects done successfully here on freelancer on my profile. I have used multithreading and proxies for ma Plus
Hi. I read the description and seems to be clear, few question so far: 1. On which OS will you run the script ? 2. Can I have access on the DB which doing tests? 3. Which Db you are using? 4. Do you already ha Plus
I'm Python developer with many years of experience that's why I'm sure you'll be impressed with my work. I can create the script which will connect to defined database (MySQL?) and get list of URLs to check and imag Plus
Hi, I've been implemented python script crawler with proxy similar with your project please check my finished project: https://www.freelancer.com/projects/Python/Python-CURL-Project/ Cheers.
Hii, Thanks for your project post. We are a reputed organization with the team of 40+ developers having core expertise in Web design and Web development. We believe that we would best suits to your requirement. Plus
Hello, I have a lot of knowledge and experience for this job. If You hire me this project will be done efficiently and fast. Feel free to contact me if You have any questions. Kind regards, Nino Rasic
I am a Python expert with experience in Web scraping , API integration and lexical analysis. Also I am a Oracle Certified professional (OCP) with experience in Oracle,MySQL, SQL Server and MongoDB. I can help you Plus
Hi, I am the founder of a small Austrian company focusing on data analysis. We can handle jobs in the field of data wrangling, data science and data visualization. This job would be in the field of data wrangling. For Plus
I will write a Python script that will -Read your database using a Python wrapper library depending on the database. -Read a file with the list of proxies -Iterate over the page_urls and proxies -Call the check_url Plus
I have system administration background and practical experience with multi threading + http get and parsing. My profile is new, but I am not new in python :) I have one question about last string: "There shall be Plus
Dear Prospect Hiring Manager. Thank you for giving me a chance to bid on your project. i am a serious bidder here and i have already worked on a similar project before and can deliver as u have mentioned I have Plus
Dear Client, Thank You for viewing my profile and your requirements can be full filled by our experts of tech team. They need to discuss on your detailed requirements because client's requirements is priority for u Plus
I love programming with Python, I'm new to this site, and I'm doing this as a hobby, I've a well paying full time job. I've read a many books and wrote some application, but I wanna get involved in more projects. So Plus