BeautifulSoup jobs
...users: registration, roles, access rights, secure authentication (JWT, OAuth or another proven method). Data to retrieve The Python script must fetch news articles from the sources I will provide, parse the title, summary, date and URL, then store everything in a shared database (PostgreSQL or MongoDB, to be discussed). The scraping must be robust (BeautifulSoup, Selenium or Playwright depending on the constraints) and scheduled via a scheduler (cron, Celery Beat…). Synchronization • Each new news item must appear instantly on the dashboard and on the app m...
...ergonomic and fast price comparison tool. • A clear, professional front end suited to a B2B audience. • Subscription management (online payment, secure customer area). Skills sought: • Full-stack development (front + back). • Command of a modern back-end framework (Node.js, Django, Laravel or equivalent). • Experience in web scraping (Python/Scrapy, Puppeteer, BeautifulSoup…). • Database (PostgreSQL or MongoDB). • Payment API integration (Stripe, PayPal, etc.). • Knowledge of security and GDPR. Expected deliverables: • Functional MVP (site + database +...
...workflow. The end goal is simple: every day, retrieve the local, regional, national and international editions, analyze their content (PDF, HTML or any other durable format), then deliver me a report that keeps only the articles containing the keywords I define myself and offers a clear summary of each. What is missing today: • a robust connector (Python, Scrapy, BeautifulSoup or other) that automatically downloads each newspaper as soon as it is published, whatever its medium; • an OCR and cleanup pipeline for scanned PDFs (Google or other); • consistent text extraction between PDFs and pages...
...- Extract the article content while preserving the HTML structure and the links to images. - Generate an HTML file containing all the articles. - Convert the HTML file into a PDF document with a table of contents and embedded images. Required skills: - Experience in web scraping and data extraction. - Python skills and use of libraries such as BeautifulSoup. - Experience with Pandoc for converting HTML files to PDF. - Attention to detail to ensure all content is correctly formatted. Deadline: - The project must be completed within 2 weeks of the offer being accepted. Budget ...
Hello Madam, I saw your profile on freelancer.com; you had responded to this call for tenders to set...scrape the HTML of this page: using selenium/python, all on AWS Lambda. I would like it to be set up on my AWS account and, above all, for you to explain to me how you did it. To be clear, the HTML including the matches (which only appear 2 seconds after the page loads, via JavaScript) is enough for me; I will handle the BeautifulSoup part. Would you be available this weekend? I think 2-3 hours will be more than enough if you have already done a project of this kind; you would just need to give me the files and tell me where to install them. V...
I am looking for someone who can parse websites with BeautifulSoup.
...date you captured each record logged beside the data. • Consistency matters: please apply uniform naming conventions (e.g., “FY2023 Gross Profit” instead of varying labels) and check subtotals or totals to be sure everything reconciles. • I’m flexible on the final file type—CSV, Excel, or Google Sheets all work—so let me know which you prefer or suggest. • If you automate with Python, BeautifulSoup, Selenium, or a comparable tool, great; just include the script so the process can be rerun later. A quick README explaining any inputs or environment setup is enough for me to replicate it. • Accuracy is non-negotiable. I will spot-check figures against the original web pages, so double-check before submitting. Once you delive...
...for a later phase—so the job is focused on clean data capture and a flawless import workflow. Descriptions must remain in plain text; no extra HTML markup. Images should arrive attached to the right variation, including separate gallery shots where available, and the colour options need to show as clickable swatches in WooCommerce, not just text labels. I’m comfortable if you use Python (BeautifulSoup, Scrapy) or another scraper, and either the WooCommerce REST API or a CSV/XML tool like WP All Import for the upload, as long as the end result feels native inside my store. Deliverables: • Complete product dataset (titles, plain-text descriptions, all images). • Variations set up so size and colour swatches behave exactly like on Furnx. • Pro...
I need a clean one-off scrape of tabular data that sits openly on a public website and have that entire dataset placed into a Google Spreadsheet. Because it is only a single extraction, I am not looking for a recurring script or scheduler—just an accurate pull of everything that appears in the table on the page today. Feel free to use your preferred stack—Python with BeautifulSoup/Requests, Apps Script, or any reliable web-scraping tool—as long as the final result lands neatly in the sheet, keeping the same column order and row count that appears online. Before we wrap up, I’ll quickly check row totals and a handful of random cells against the site to confirm accuracy; once those spot checks pass, the job is done.
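A one-off table pull like the one described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it swaps in the standard library's `html.parser` for BeautifulSoup so it runs without extra installs, it assumes the table uses explicit closing tags, and `TableRows`, `extract_rows` and `SAMPLE_HTML` are hypothetical names and data, not anything from the posting.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>, in page order."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace so cells compare cleanly with the site.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample standing in for the downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Qty</th></tr>
  <tr><td>Widget</td><td> 3 </td></tr>
  <tr><td>Gadget</td><td>12</td></tr>
</table>
"""

rows = extract_rows(SAMPLE_HTML)
```

Because the rows come back in document order, the column order and row count can be compared directly against the page before the data is pasted or uploaded into the sheet.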
...phone number, and email, nothing more. The final deliverable is a clean, well-structured Excel file ready for me to review. Speed is the priority here: please be able to start right away and turn the file around as fast as possible while still double-checking that every row is accurate and complete. If this timeline works for you and you have solid scraping experience with tools like Python, BeautifulSoup, or Scrapy, let’s move forward now. The budget is small because this is a simple task, so low bidders get first priority. Please start now and begin your bid with "Urgent". Thanks....
I have two source spreadsheets that I need merged and enriched through automated scraping: • “File 1” – 170 k Spanish local businesses with emails • “File 2” – 65 k additional businesses with websites only Phase 1 – Email extraction Using a Python script and well-known libraries (requests, BeautifulSoup, Scrapy or similar), scan every site listed in File 2, capture all working email addresses you can locate, then append them to the corresponding rows so I can produce a unified “File 3”. Phase 2 – Offer harvesting Next, visit each live site in File 3. Where an offer, deal or promotion is publicly displayed, record the details in a fresh Excel sheet with these exact columns: Business ID | Business Name ...
I have a public-facing website that I need scraped end-to-end. The site is open (no login), but the content is split across multiple pages, so your script will have to detect and follow pagination automatically. Here is exactly what I expect: • A clean, well-commented Python script (requests/BeautifulSoup, Scrapy, or Selenium—your choice) that visits every page, captures the required fields, and writes them to a neatly structured CSV. • The final CSV containing all rows pulled from the site. • A short README that tells me how to run the script and change the target URL or output path if needed. Code quality matters to me: no hard-coded absolute paths, clear variable names, and graceful error handling so the run doesn’t stop if a single page fa...
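The error-handling expectation above (one failing page should not stop the run) might look something like the sketch below. It makes several assumptions that are not in the posting: the page URLs are already known (discovering them by following pagination links is left out), the actual download and parsing live in a caller-supplied `fetch_rows` function, and `scrape_pages`, `fake_fetch` and the `example.com` URLs are hypothetical.

```python
import csv
import io


def scrape_pages(page_urls, fetch_rows, out_file, fieldnames):
    """Write rows from every page to CSV; record and skip pages that fail."""
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    failed = []
    for url in page_urls:
        try:
            rows = fetch_rows(url)
        except Exception as exc:
            # One bad page must not stop the whole run: note it and move on.
            failed.append((url, str(exc)))
            continue
        writer.writerows(rows)
    return failed


def fake_fetch(url):
    """Stand-in for the real download + parse step; page 2 simulates a timeout."""
    if url.endswith("page=2"):
        raise TimeoutError("simulated timeout")
    return [{"url": url, "title": "Item from " + url}]


buffer = io.StringIO()  # a real run would pass an open file instead
urls = ["https://example.com/list?page=%d" % n for n in (1, 2, 3)]
failed = scrape_pages(urls, fake_fetch, buffer, ["url", "title"])
```

Returning the failed URLs rather than raising keeps the CSV usable after a partial run and gives a natural list to retry later.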
...schedule; it just needs to collect every page’s copy accurately and store each page URL, headline, sub-headline, paragraph body, and any inline text in separate columns. Please make the scraper resilient to common roadblocks such as pagination, lazy-loaded sections, and basic anti-bot measures, and keep the code modular so I can rerun it myself if the site layout changes slightly. Python with BeautifulSoup, Scrapy, or Playwright is fine as long as the final CSV is UTF-8 encoded and free of HTML tags. Quantities: - we expect somewhere between 10,000 and 70,000 records - we want to pay in milestones per 5,000 - we want to pay for research work + first 5,000 in the first milestone, other amount for following milestones (in case you get blocked, problems arise) Deliverables ...
Project Description: Find school districts and charter schools who use a specif...as `"No Vendor Found"`. - If no website could be loaded, the script should log any failed connections or timeouts. Output Format (CSV) The final deliverable file should be structured with the same columns as the ones provided with the additional column to include your results. Skills Required - Expert proficiency in Python. - Deep experience with web scraping libraries (e.g., Requests, BeautifulSoup, Scrapy, and especially Selenium/Puppeteer for dynamic content). - Experience handling common web scraping challenges (redirects, user-agents, proxy usage (if necessary)). To bid, please confirm your familiarity with scraping dynamic content and provide a brief description of the scraping app...
...es or reliable associated sources. Specific sources: Euromillones: (since Feb 13, 2004) La Primitiva: (since Oct 17, 1985 – modern version) El Gordo de la Primitiva: (since Oct 31, 1993) Updates run automatically at exactly 00:02 the day after each draw, using ethical scraping (BeautifulSoup/Scrapy) with proper user-agent headers to mimic human behavior. Store data in PostgreSQL (structured) or MongoDB (flexible), including all prize categories to enable ROI calculations and backtesting. 2.2. Number Prediction Generate predictions for Euromillones, La Primitiva and/or El Gordo simultaneously using explicit advanced AI models: Machine Learning ensembles (Random Forests) for
I have three specific school-website links that list all current teachers and administrators. From each page I need a clean scrape of every staff member’s name, role, email address, plus the city/town and the school name, compiled into a single Excel workboo... Key points to keep in mind: • Final deliverable: one Excel file ready for copy-and-paste outreach. • Source material: my three school websites and the driver URLs I will supply. No other sources are required. • Required fields: Name, Role, Email, City/Town, School (or Company for drivers). • Accuracy matters; no duplicate or bounced addresses. If you normally work with Python, BeautifulSoup, Scrapy, or similar web-scraping tools, that’s perfect—as long as the end result is the...
... Once scraped, the information should be organised into a clean CSV file—one row per page—with columns for page URL, full body text, image file names, and link destinations. Please download the images themselves as well and bundle them in a separate folder (a simple ZIP is fine); the CSV should reference the exact filenames so everything lines up. I’m happy for you to use Python with BeautifulSoup, Scrapy, Selenium or whichever stack you prefer, as long as the final output meets these acceptance criteria: • Complete CSV containing text, image names, and link URLs for each page • All images successfully downloaded and accessible via the filenames listed in the CSV • No duplicates or missing pages from the target site * Images need to b...
I need a small automation script that periodically checks item availability on the Bigbasket website and pings me on Telegram the moment any of the tracked products come back in stock. You are free to choose the underlying tech stack (Python + Requests/BeautifulSoup, Selenium, Playwright, or a headless browser of your choice) as long as it works reliably with Bigbasket’s current site layout and protects my account from rate-limit blocks or captchas. The flow I have in mind is straightforward: I feed the bot a list of product URLs (or SKUs). It runs on a schedule I can change (every few minutes during peak shortages, maybe every hour otherwise), grabs the stock status, and fires a concise Telegram message whenever the status flips from “Out of Stock” to ...
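The "alert only when the status flips" part of that flow can be sketched independently of how the status is scraped. In this hedged example the posting's target status is truncated, so a flip is assumed to mean any change away from "Out of Stock"; the "In Stock" label, the `detect_restocks` name and the `example.com` URLs are all hypothetical, and sending the Telegram message is left out.

```python
OUT_OF_STOCK = "Out of Stock"


def detect_restocks(previous, current):
    """Return URLs that were 'Out of Stock' on the last check and are not now.

    Both arguments map product URL -> status string. Products seen for the
    first time are ignored, so a first run does not trigger a flood of alerts.
    """
    return sorted(
        url
        for url, status in current.items()
        if previous.get(url) == OUT_OF_STOCK and status != OUT_OF_STOCK
    )


# Hypothetical snapshots from two consecutive scheduled runs.
previous = {
    "https://example.com/p/1": "Out of Stock",
    "https://example.com/p/2": "In Stock",
    "https://example.com/p/3": "Out of Stock",
}
current = {
    "https://example.com/p/1": "In Stock",      # flipped: should alert
    "https://example.com/p/2": "In Stock",      # unchanged
    "https://example.com/p/3": "Out of Stock",  # still unavailable
    "https://example.com/p/4": "In Stock",      # new product, no history
}

restocked = detect_restocks(previous, current)
```

Comparing against the previous snapshot, rather than alerting on every in-stock reading, is what keeps the bot from repeating the same message on each scheduled run.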
I want an .xlsx file containing all 12,728 rows of the public table that appears on the INDECOPI (Peru) website. The site only sho...columns in standard format, without filters, pivot tables or other added functionality. I will give you the exact URL and the navigation steps so you can locate the paginated view. Once finished, I will check that the row total matches the official counter and that there are no empty cells in the three requested fields. If you have experience scraping with Python (requests, BeautifulSoup, Selenium) or similar tools and can generate the .xlsx without altering the original structure, that will be enough for me. Expected delivery: Excel file ...
...security The service must issue and validate JWT tokens for every request beyond the public health-check route. Token refresh, revocation, and a simple role model (“user” vs. “admin”) should be built in from the start. Flight data extraction I do not have official Iberia developer access, so we will need to pull the data ourselves. I’m open to whichever tooling you are most comfortable with — BeautifulSoup, Selenium, Scrapy, or a hybrid approach — as long as the final solution is headless, resilient to minor layout changes, and respectful of Iberia’s rate limits. Only flights that are bookable with Avios need to be captured; no hotel or car-rental data is required. Deliverables • Clean, modular Python code (FastAPI or F...
...commissions from price comparisons. - Timeline: 1–3 months part-time (flexible around your schedule). - Budget: $1,000–$3,000 (based on experience; includes milestones). **Required Skills:** - Proficiency in no/low-code platforms (e.g., Adalo or similar). - AI integration (e.g., Grok APIs for personalized suggestions). - Web scraping/APIs for cost comparison (e.g., Python/BeautifulSoup via Zapier). - Basic frontend/UI design (user-friendly quiz forms, output reports). - Knowledge of compliance (GDPR, disclaimers; experience with health/edtech apps a plus). - Optional: SEO/marketing setup, affiliate integrations. - Strong English communication; available for weekly check-ins (I'm in Dallas, TX; US time zones preferred). **Scope of Work ...
I’m expanding our Florida outreach list and need a reliable web-scraped data set of school, college, and university administrators who oversee Nursing o...Verified email address • State (always Florida) Format & delivery – Send the file in Excel (.xlsx). – First progress drop: within 5 days so I can spot-check. – Final, fully cleaned file: no later than 10 calendar days from project start. Quality matters because this list feeds straight into our marketing campaigns. I’ll spot-verify a sample for accuracy. Feel free to leverage Python, BeautifulSoup, Scrapy, or similar tooling, whatever lets you move quickly while respecting each site’s robots.txt. Let me know if anything needs clarifying before you begin, otherwise I’...
...program should visit the target site (I’ll share the URL once we start) and pull Product Details exactly as they appear online. That means every time I point the script at a category or search page it should work through all pagination, capture the data, and save it to CSV or Excel so I can sort and analyse it later. Key points to cover • Use reliable, open-source libraries such as requests, BeautifulSoup, or Selenium—whichever gives the most stable results for the site once you see it. • Build in simple settings (URL, output file name, optional delay between requests) near the top of the file so I can tweak them without touching the core logic. • Handle common edge cases: missing fields, changing layouts, or temporary time-outs, and log any skip...
I’m looking for a well-structured Python solution, built around BeautifulSoup (BS4) and any supportive libraries you deem essential, that reliably pulls both product details and customer reviews from Lazada on a daily schedule. The data will fuel ongoing competitor research, so consistency and clarity of the output are critical. I am looking specifically to get the data using bs4 while bypassing the captcha. Here’s how I picture the flow: • Input: category URL(s) or product list I supply in a CSV/JSON. • Scrape: title, price, promos, specs, images, ratings, full review texts, review dates, and reviewer scores. • Output: clean CSV or JSON dropped into a dated folder after each run. Make the script easy to tweak if Lazada changes its markup. Acceptance criter...
I need a seasoned Python developer to build a robust scraper that collects the required data and writes it straight to JSON, with no additional cleaning or processing necessary. Once we begin I’ll provide the target URL(s) and any access details; for now, assume a standard public site with pagination and occasional anti-bot checks. Core expectations • Written in Python 3 using requests/BeautifulSoup or Scrapy; resort to Selenium only if there’s no lighter workaround. • Handles pagination, retries, and polite delays gracefully so the run can complete unattended. • Config file or clear constants for headers, cookies, and start URLs, letting me tweak targets without editing core logic. • Produces a single JSON file (or one file per page if that...
...reliable, well-structured lead list and I already know exactly what it should contain. The task is to extract contact information—email addresses, phone numbers and full mailing addresses—from three sources: company and organisation websites, their public social-media profiles, and well-known online directories. I expect the data to be gathered with a solid scraping workflow (Python, Scrapy, BeautifulSoup, Selenium or an equivalent stack is fine) and then verified so that bounced emails and dead numbers are kept to an absolute minimum. Deliverables • One CSV or Excel file with separate columns for name, company, job title, email, phone, street address, city, state, ZIP/postcode, country, source URL and date collected. • No duplicates; every entry m...
I need a standalone desktop program that lets me analyse horse races by pulling fresh horse-performance data directly from www.racenet.com.au. The app...criteria • One-click scrape pulls the latest horse performance data without captchas or manual intervention. • All key fields visible on racenet for each horse populate correctly in the local database. • Basic analytical views refresh in under two seconds on a typical laptop. • No paid API keys required—everything comes from the public site. I’m flexible on the tech stack: Python (BeautifulSoup/Selenium), C# (.NET), or even Electron if it stays lightweight. What matters most is reliable scraping, clean code, and a UI I can rely on race morning. Let me know your preferred approach and any simi...
...and export the results to CSV or Google Sheets. I mainly care about item title, price, description, photos (image URLs are fine), posting date, item location and the seller’s profile link so I can trace each record back to its source. If you can collect additional fields that Facebook exposes, even better—just keep everything neatly labelled. No hard requirement on the stack: Python with BeautifulSoup / Selenium, Node with Puppeteer, Playwright, or a headless browser solution all work for me as long as it runs on Windows or a small Linux VPS and doesn’t violate Facebook’s ToS. Please build in reasonable throttling, login handling (cookie-based or mobile API, whichever is more stable) and a simple config file where I can tune delay settings or add new ac...
...dashboard to: • Manage monitoring settings • Control alerts and configurations • Implement structured and scalable automation logic. • Ensure the solution is maintainable and adaptable to future website updates. • Provide clear documentation for setup and usage. Technical Requirements • Strong experience with Python • Web automation tools such as: • Selenium / Playwright • Requests / BeautifulSoup • Backend development experience • Familiarity with notification systems (Email, Telegram, Webhooks, etc.) • Clean, well-documented, and modular code Additional Notes • This is a long-term project. • Ongoing collaboration may be required for future updates, optimizations, and feature enhancements. ...
...location coordinates directly from Google Maps. The second will crawl a set of websites I will supply and pull out product information, on-page contact details, and any user-generated content that appears alongside those products. Please structure every field into one tidy CSV per source so I can plug the results straight into my BI dashboards. I am comfortable if you lean on Python, Scrapy, BeautifulSoup, Selenium, or similar tools, provided the script is well-commented and can run headless behind rotating proxies without tripping rate limits. Deliverables: • 4 working scripts (Maps + websites) with clear setup instructions • Sample output files proving all requested fields are captured correctly • Output data must have City Name > (Excel file with ...
I need a small proof-of-concept scraper written in Python that pulls user information from a set of static website pages and exports it into a clean CSV file. The pages load without JavaScript, so a lightweight stack such as requests + BeautifulSoup (or lxml) should be all that’s required; no browser automation is necessary unless you can justify a clear advantage. I will supply the page URLs and highlight the exact fields to capture (name, profile link, location, and any other visible user meta). Your code should handle pagination where applicable, respect polite crawl rates, and be easy for me to adjust if the HTML structure shifts. Deliverables • Well-commented Python script (.py) • Sample CSV containing the extracted records • README with setup steps a...
I need clean, structured product details pu...purpose-built. I already have a clear idea of the attributes I want captured (title, price, SKU, description, availability, image URL). Once we agree on the target sites, you can build a scraper, run it, and hand back the CSV along with the script or notebook so I can reproduce the results later if needed. Please let me know: • Which language or framework you plan to use (Python, Scrapy, BeautifulSoup, Selenium, Playwright, etc.). • How you’ll handle pagination, anti-bot measures, and site structure changes. • An estimated turnaround and any milestones you suggest. Accuracy, deduplication, and clarity in the final CSV will be the acceptance criteria. If this sounds like your bread-and-butter, I’m ready...
...Excel file, encoded in UTF-8, with consistent headers and no duplicates. The file needs to be able to be used in a mail merge or address label capability. Acceptance The file must open without errors and pull all relevant permits. Please include a brief note on your chosen approach, the approximate turnaround time, and, if you automate, the language or toolset you’ll use (Python + BeautifulSoup, Selenium, etc.). Samples of the data files are attached...
I need a developer to collect data from multiple public w...solution (script or small app) that I can run on demand Basic documentation: how to run it, how to adjust settings, where outputs go Quality requirements Reliable scraping with error handling and retries Respectful request rate / throttling to avoid overloading sites Clear logging (success/fail, pages processed) Ability to adapt if page structure changes Experience with Python (Scrapy/BeautifulSoup/Selenium/Playwright) or Node.js Proxy / rotating user-agents experience (only if needed) Scheduling/automation (cron, Docker, or cloud run) Deliverables Working scraper + instructions Sample output file(s) Final dataset from agreed sources (initial run) To apply, please include Examples of similar scraping work you...
I have an urgent need for a clean, well-structured dataset containing the listing agent’s first name, last name, mailing address, and phone number for well over 500 active Zillow listings. Speed is critical, but accuracy matters just as much; the final file should be ready for immediate import into my CRM. You are free to use whichever stack you prefer—Python with BeautifulSoup or Scrapy, Selenium, residential proxies, even the unofficial Zillow API—so long as rate-limits are respected and the data is complete. I don’t need property details or price history; the focus is strictly on the agent contact fields. Deliverables • CSV or XLSX with a separate column for each required field • A short read-me explaining the script or method so I can reru...
...every piece of visible textual content I specify, and returning it in a machine-readable format. I’m flexible on the final file type; CSV, Excel, or JSON all work as long as the fields are clearly labeled and easy for me to manipulate later. A small sample first will help confirm we’re on the same page before you run the full extraction. Please use whatever stack you prefer—Python with BeautifulSoup or Scrapy, JavaScript with Puppeteer, or a tool that suits the task best—just be sure to respect and provide the code so I can rerun the process when the site updates. Deliverables: • Re-usable script or notebook with clear comments • Complete dataset containing all extracted text, delivered in my chosen format • Brief read-me exp...
I need a small, reliable script that pings the Late Show with Stephen Colbert page on 1iota every 30 seconds and fires off an SMS the moment May 21 tickets appear. The job is straightforward but time-sensitive: • Scrape or query the specific event listing without triggering 1iota’s bot protections (Python with requests/BeautifulSoup, Playwright, or Selenium are all fine—use what keeps the check time low). • Parse the response and confirm that the date equals 21 May before treating it as a positive match. • Send a single, immediate SMS alert to my phone via Twilio (or another SMS gateway you’re comfortable with). The script must run unattended on a Mac or Linux box—so include setup instructions, any required environment variables, and ...
...whatever loophole is needed to reveal the hidden contact and unit details, then replies with a single, structured template that looks something like: Property: <Title> Unit No.: <unit_number> Client: <client_phone> Owner: <owner_phone> Source: <URL> Key points • No reliance on the Bayut or Propertyfinder APIs—pure scraping with your preferred stack (Python, Node, Playwright, Selenium, BeautifulSoup, etc.). • Handle anti-scraping tactics gracefully (rotating headers, proxies, captchas if they appear). • Keep response time reasonable so a conversation still feels instant. • Deliver clean, well-commented code plus a quick guide for deploying the bot on a VPS or Docker image. Acceptance will be a short live de...
...installable; Python or PHP are preferred, but I am open to other suggestions. Acceptance criteria – ≥ 95 % correct fields for typical supplier layouts. – Scheduled or event-based retrieval without manual triggering. – Setup documentation and a short user manual. Please tell me which language / libraries (e.g. Python + imaplib, spaCy, BeautifulSoup) you want to use and how you would implement the checking tool. Examples of previous work in e-mail parsing or data import are welcome. I run a vinyl records database and regularly receive new release lists by email. To avoid manually entering this information, I’m looking for a solution that can auto...
...grab every historical and new financial report that appears on the “Filings & Disclosure” section of otcmarkets.com. At the moment I only care about the PDFs of annual, quarterly and interim filings, but the solution should be flexible enough that I can later extend it to press releases or historical data if required. Here’s what I expect: • A script (preferably in Python 3 using requests / BeautifulSoup or Selenium if necessary) that accepts a plain text list of symbols, checks each page once per day and downloads any financial report that is not already saved. • Folder or filename logic that organises the PDFs by ticker and date so nothing is overwritten. • A simple log or CSV that records the timestamp, ticker and URL of each file fet...
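The "organise by ticker and date, skip what is already saved" requirement could be approached along these lines. This is a sketch under assumptions, not the poster's layout: the `<root>/<TICKER>/<date>_<original name>` scheme, the `target_path` and `needs_download` helpers, the ticker "abcd" and the `example.com` URL are all hypothetical, and the actual HTTP download is replaced by writing placeholder bytes.

```python
import tempfile
from datetime import date
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def target_path(root, ticker, filing_date, url):
    """Build <root>/<TICKER>/<YYYY-MM-DD>_<original file name> for one report."""
    name = PurePosixPath(urlparse(url).path).name or "report.pdf"
    return Path(root) / ticker.upper() / f"{filing_date.isoformat()}_{name}"


def needs_download(path):
    """A report is fetched only if no file exists at its target path yet."""
    return not path.exists()


# Hypothetical run against a temporary folder instead of a real archive.
root = Path(tempfile.mkdtemp())
path = target_path(
    root, "abcd", date(2024, 3, 31), "https://example.com/files/annual-report.pdf"
)

first_check = needs_download(path)          # nothing saved yet
path.parent.mkdir(parents=True, exist_ok=True)
path.write_bytes(b"%PDF-1.4 placeholder")   # stands in for the downloaded PDF
second_check = needs_download(path)         # the daily rerun should skip it
```

Prefixing the filing date keeps two reports with the same original file name from overwriting each other, and the existence check makes the daily run safe to repeat.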
...to scrape public websites * Parse HTML, JSON, CSV, and PDF files * Clean and normalize messy real-world data * Write clear, maintainable utility scripts * Deliver working code (not just prototypes) --- ### Required Skills * Strong Python fundamentals * Real experience with web scraping * Data parsing and data cleaning * Comfortable working independently and async --- ### Nice to Have * BeautifulSoup, Scrapy, Playwright, or Selenium * pandas / numpy * Experience scraping government or legacy websites * Experience handling PDFs (text extraction, OCR) --- ### How We Evaluate * This role includes a **paid trial task (1–3 days)** * We care about **output and correctness**, not resumes * Clean, working code matters more than clever abstractions --- ### Important * Ple...
I need a reliable scraping solution that collects every open position from ten job-board and company-career sites in one specific country. I already have the full URL list and will share it right after kickoff. Scope • Write and schedule a se...bilingual postings, basic keyword search in the frontend, and an export button for CSV or Excel, but these are optional. Deliverables 1. Source code for all scrapers and the data pipeline. 2. Database schema or JSON structure. 3. Front-end webview ready to run locally. 4. README covering installation, configuration, and update routine. I’m happy to discuss your preferred stack—Python with BeautifulSoup/Scrapy or Node with Cheerio/Puppeteer are both fine—as long as the final result is stable and well documented....
...pages • Extract the bio text and the profile picture, storing the image locally or saving its direct link next to the bio in a CSV/JSON file • Respect , employ modest request throttling, and handle the site’s usual edge cases—lazy-loaded images, occasional 4xx/5xx responses, and any login or cookie notices that appear for anonymous visitors I’m comfortable with Python (requests, BeautifulSoup, Selenium) or Node (Puppeteer) solutions, provided the code is clean, modular, and comes with a concise README so I can run it on macOS or a Linux VPS without guesswork. Deliverables: 1. Full source code with clear setup instructions 2. A one-line command or small runner script that launches the crawl 3. A sample output file covering at least 20 profiles...
...the following fields: • Job title and full description • Company name plus location (city, state/region, country) • Employment type and any salary or rate information available Your scraper should store results in a clean, normalized CSV (or optionally a relational DB if you prefer) and be easy for me to rerun on demand. I’m comfortable with Python, so a script leveraging requests/BeautifulSoup, Scrapy, or Playwright makes sense, but if another stack delivers better reliability feel free to suggest it. Key expectations • Site recommendations presented first for my approval before you start coding • Respect , add configurable request delays, and build basic anti-block measures (user-agent rotation, retries) • Clear documentatio...
...structure • Columns: name, first line of address, state, city, postcode • Format: every column saved as plain text (no numeric or date formatting) Delivery schedule • First 5,000 fully cleaned rows required within the first 6 hours • Remainder on a rolling basis until the full 15,000 are complete I will supply a surname list to guide the searches. A straightforward Python (requests / BeautifulSoup or Selenium) or Scrapy workflow is fine as long as the final output arrives in a single Excel file (.xlsx) that opens error-free in Microsoft Excel. Accuracy matters more than speed—random spot checks will be run. Any duplicates, blanks, or malformed addresses will be sent back for correction. Once the first 5,000 pass review, I’ll green-light...
I need all publicly available customer-facing email addresses extracted from a list of e-commerce websites that I will supply once the project begins. Please crawl only the domains I provide, respect where possible, and avoid triggering any rate limits or security blocks—rotating proxies or headless browsing with tools such as Python, Scrapy, BeautifulSoup, Selenium, or similar is fine as long as the result is reliable. Deliverable • One clean, de-duplicated CSV file containing the harvested email addresses, ready for direct import into my CRM. Acceptance criteria • Every email must originate from the target e-commerce domains. • No duplicates, placeholders, or obviously invalid addresses. • File encodes as UTF-8 and opens without warnings in Exc...
...images—then converts and calculates the raw values exactly as we define before pushing them straight into WooCommerce. My customers must only ever see the WooCommerce front end, so the sync has to feel native and instant. The portal changes frequently, so please code the extractor so that selectors and credentials can be updated without touching the core logic. I am open to Python (Scrapy, BeautifulSoup, Selenium), PHP or Node as long as the finished solution talks cleanly to the WooCommerce REST API and leaves no manual steps. Deliverables • Scraper that logs in and captures product details, stock, prices and images in real time or on a schedule we agree on • Conversion layer that performs the unit/price calculations before data enters WooCommerce • Im...