I’m looking for an experienced data extraction developer to build a custom project for me.
The goal is to automatically retrieve information from a major search engine. I’m not interested in the regular search results, only in the information snippets shown for important places, which contain the place name, phone number and opening hours.
Input data (string):
- Search query
Output data (JSON):
- Place name
- Phone number
- Opening hours
I see this as a command-line script where I provide the search query as a parameter and get the JSON response on STDOUT.
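To illustrate the interface I have in mind, here is a minimal Python sketch. It is only an outline, not a required design: the `--timeout` flag, the JSON key names (`place_name`, `phone_number`, `opening_hours`) and the `lookup_place` stub are placeholders I made up for this example, and the actual extraction logic is left out.

```python
import argparse
import json
import sys


def lookup_place(query, timeout):
    """Hypothetical placeholder for the real extraction step."""
    # A real implementation would fetch and parse the place snippet;
    # this stub only shows the expected shape of the output.
    return {"place_name": None, "phone_number": None, "opening_hours": None}


def build_parser():
    parser = argparse.ArgumentParser(
        description="Look up place details for a search query."
    )
    parser.add_argument("query", help="search query string")
    parser.add_argument(
        "--timeout",
        type=float,
        default=10.0,
        help="connection timeout in seconds (assumed flag name)",
    )
    return parser


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:], as a normal script would.
    args = build_parser().parse_args(argv)
    result = lookup_place(args.query, args.timeout)
    # Write exactly one JSON document to STDOUT.
    json.dump(result, sys.stdout, ensure_ascii=False)
    sys.stdout.write("\n")
    return 0
```

Adding the usual `if __name__ == "__main__": sys.exit(main())` guard at the bottom would turn this into a script that could be called as, for example, `python lookup.py "city museum" --timeout 5` (the file name is hypothetical).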
The script must use the proxy service provided by [url removed, login to view] and include automated dead-proxy detection and proxy rotation. The connection timeout must be a parameter.
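As a rough idea of what I mean by dead-proxy detection and rotation, here is a minimal sketch using only the Python standard library. The names `ProxyPool`, `fetch_via_proxy` and `fetch_with_rotation`, the `max_failures` threshold and the round-robin strategy are my own assumptions for illustration. How the proxy list is obtained from the proxy service is not shown.

```python
import urllib.request
from collections import deque


class ProxyPool:
    """Round-robin pool that drops a proxy after repeated failures."""

    def __init__(self, proxies, max_failures=2):
        self._queue = deque(proxies)
        self._failures = {p: 0 for p in proxies}
        self._max_failures = max_failures  # assumed threshold

    def __len__(self):
        return len(self._queue)

    def next(self):
        if not self._queue:
            raise RuntimeError("no live proxies left")
        proxy = self._queue[0]
        self._queue.rotate(-1)  # move the proxy just used to the back
        return proxy

    def mark_ok(self, proxy):
        self._failures[proxy] = 0

    def mark_failed(self, proxy):
        self._failures[proxy] += 1
        if self._failures[proxy] >= self._max_failures and proxy in self._queue:
            self._queue.remove(proxy)  # treat the proxy as dead


def fetch_via_proxy(url, proxy, timeout):
    """Fetch a URL through one proxy, with a configurable timeout."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def fetch_with_rotation(url, pool, timeout, fetch=fetch_via_proxy):
    """Try proxies in turn until one succeeds or the pool is empty."""
    while len(pool):
        proxy = pool.next()
        try:
            body = fetch(url, proxy, timeout)
        except OSError:
            # URLError and socket timeouts are OSError subclasses.
            # Note that HTTP error statuses also land here.
            pool.mark_failed(proxy)
            continue
        pool.mark_ok(proxy)
        return body
    raise RuntimeError("all proxies failed")
```

The `fetch` argument is there so the rotation logic can be tested with a fake fetcher instead of real network calls. Because each process keeps its own pool, this sketch shares no state between concurrent instances.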
The search engine uses HTTPS-encrypted connections.
Interpreted languages such as PHP or Python are preferred.
The script must run on a headless Linux/Debian server. It must not depend on a web browser or any other GUI application, so Selenium, for instance, is not acceptable.
It must be possible to run multiple instances of the script concurrently.
During testing, you will provide an online demo of this script in which the search query is passed as a URL GET/POST parameter and the response contains JSON.
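The demo could be as simple as the following sketch, built on Python's standard `http.server` module. The parameter name `q`, the `respond` and `serve` helpers, the error format and the `lookup_place` stub are assumptions of mine. Any equivalent setup is fine.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def lookup_place(query):
    """Hypothetical stand-in for the real extraction function."""
    return {"place_name": None, "phone_number": None, "opening_hours": None}


def respond(params):
    """Map parsed request parameters to (status code, JSON body bytes)."""
    query = (params.get("q") or [""])[0].strip()  # "q" is an assumed name
    if not query:
        error = {"error": "missing parameter: q"}
        return 400, json.dumps(error).encode("utf-8")
    result = lookup_place(query)
    return 200, json.dumps(result, ensure_ascii=False).encode("utf-8")


class DemoHandler(BaseHTTPRequestHandler):
    def _send(self, status, body):
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # The query arrives in the URL, e.g. /?q=city+museum
        self._send(*respond(parse_qs(urlparse(self.path).query)))

    def do_POST(self):
        # The query arrives as a form-encoded request body.
        length = int(self.headers.get("Content-Length") or 0)
        form = self.rfile.read(length).decode("utf-8")
        self._send(*respond(parse_qs(form)))


def serve(host="127.0.0.1", port=8000):
    """Run the demo server until interrupted."""
    ThreadingHTTPServer((host, port), DemoHandler).serve_forever()
```

Calling `serve()` starts the demo. `ThreadingHTTPServer` handles each request in its own thread, so several queries can be served at the same time.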
When the project is finished, you must provide its full, unencrypted source code, along with build/compilation instructions if needed.