I developed a Java program to scrape information from a website. The architecture of the solution involves: 1) using Java Selenium to send requests to the webpage via Chrome WebDriver to trigger authentication and authenticated requests; 2) routing the requests from Chrome (headless) through Java BrowserMobProxy to capture three HTTP headers (Authorization, X-CSRF-TOKEN, and Cookie) and one query string (without these, the server starts responding with 512 after some requests); and 3) using these four elements in HTTPS requests sent from Java directly to the webpage (i.e. without Selenium, Chrome, or BrowserMobProxy involved) to retrieve the desired information.
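Step 3 could look roughly like the sketch below, which uses only the JDK's built-in java.net.http API (Java 11+). The class name CapturedSession, its fields, and the example URL are hypothetical and are not taken from the existing program. It also assumes the captured query string can be appended to the target URL as-is.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical holder for the four elements captured during the browser session.
final class CapturedSession {
    private final String authorization; // value of the Authorization header
    private final String csrfToken;     // value of the X-CSRF-TOKEN header
    private final String cookie;        // value of the Cookie header
    private final String queryString;   // captured query string, e.g. "k=v" (assumed already encoded)

    CapturedSession(String authorization, String csrfToken, String cookie, String queryString) {
        this.authorization = authorization;
        this.csrfToken = csrfToken;
        this.cookie = cookie;
        this.queryString = queryString;
    }

    // Builds a GET request that replays the captured headers and query string.
    HttpRequest buildRequest(String baseUrl) {
        // Append with '&' if the URL already carries a query, otherwise with '?'.
        String separator = baseUrl.contains("?") ? "&" : "?";
        URI uri = URI.create(baseUrl + separator + queryString);
        return HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(30)) // arbitrary timeout, chosen for illustration
                .header("Authorization", authorization)
                .header("X-CSRF-TOKEN", csrfToken)
                .header("Cookie", cookie)
                .GET()
                .build();
    }
}
```

The resulting request can be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). Building the request separately from sending it keeps the header logic testable without touching the network.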
This program covers the basic functionality of extracting the information, but it has a few problems:
It depends on an external non-Java component: Chrome WebDriver
It depends on Java Selenium and Java BrowserMobProxy, two dependencies that I would like to remove
It is not optimized (too many refreshes and overly long sleep periods) relative to the threshold at which the webpage (behind Cloudflare) starts responding with 429 errors. As a result, retrieving the information takes much longer than needed.
You will receive the Java code of the current program and will need to solve the problems above. To do so, you will need to:
B. You will need to identify the threshold at which the webpage (behind Cloudflare) starts responding with 429 errors. You will need to tune the refresh frequency of the headers and the sleep periods to that threshold. You will need to demonstrate the benefit of your changes by extracting the same information the program currently extracts and measuring how long it takes.
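One possible way to approach the tuning in B is an adaptive pacer: shorten the sleep slightly after every successful response and back off sharply on a 429, remembering the delay at which throttling was last seen. The sketch below is only an illustration. AdaptivePacer, its parameters, and the 10% / doubling factors are assumptions, and it is not known whether the site sends a Retry-After header. If it does, only the delta-seconds form is handled here.

```java
// Hypothetical pacer: additive-style decrease on success, multiplicative backoff on 429.
final class AdaptivePacer {
    private final long minDelayMs;
    private final long maxDelayMs;
    private long delayMs;
    private long lastThrottledDelayMs = -1; // delay in use when the last 429 arrived, -1 if none yet

    AdaptivePacer(long startDelayMs, long minDelayMs, long maxDelayMs) {
        this.delayMs = startDelayMs;
        this.minDelayMs = minDelayMs;
        this.maxDelayMs = maxDelayMs;
    }

    // Updates the delay from the latest response and returns the number of
    // milliseconds to sleep before the next request.
    // retryAfterSeconds: parsed Retry-After value, or 0 when the header is absent.
    long onResponse(int statusCode, long retryAfterSeconds) {
        if (statusCode == 429) {
            lastThrottledDelayMs = delayMs;
            long doubled = delayMs * 2;
            long hinted = retryAfterSeconds > 0 ? retryAfterSeconds * 1000 : 0;
            // Respect the server hint when it asks for more than doubling would give.
            delayMs = Math.min(maxDelayMs, Math.max(doubled, hinted));
        } else {
            // Shrink by about 10% (at least 1 ms), never going below the floor.
            delayMs = Math.max(minDelayMs, delayMs - Math.max(1, delayMs / 10));
        }
        return delayMs;
    }

    // Rough estimate of the throttling threshold observed so far.
    long lastThrottledDelayMs() {
        return lastThrottledDelayMs;
    }
}
```

Wrapping the whole extraction run between two System.nanoTime() calls, once with the current fixed sleeps and once with the pacer, would give the before/after timing that B asks for.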
Note: you will need to create your own login/password on the webpage. Registration has no additional requirements.
9 freelancers are bidding an average of $492 for this job
Hello, I can make improvement in java web scraping. I have gone through your job posting and become very much interested to work with you. I am an expert in this field. I have already completed several projects like ...
Hello, I am pleasure with your job as detailed. Thank you for the job posting. It’s a pleasure to meet you. I’d really like to work with you on this one if possible! I do have a couple of questions, but first I’d like ...
Hi, With over 5 years of experience in Python. I’ve gone through your complete project description. I am interested in this project as it is exactly within the scope of my skill. My main skills are as follows: Python, ...
I can rewrite a clean Python Selenium automated driver code, optimized and organized without bugs, with pipeline to output json or CSV have a look into my Selenium bots in my portfolio [login to view URL] ...
Hi, sir. I have carefully checked your requirements and I was glad that I've already done this kind of projects before. I'd love to share more detail with you over chat and I'm sure that you'll be interested in them. I ...
Hi, how are you doing? I hope you're doing well! I am a professional Web Scraper for the last 7 years. I am confident to complete your project. Regards! Sergei.
Hi please hire me Relevant Skills and Experience Did the automation testing in selenium using java