Dataforest. Empower the data.

Middle Python Scraping Developer

We are looking for a Middle Python Scraping Developer to join the Dataforest team. If you want a friendly team, a healthy working environment, and a flexible schedule, you have found the right place to send your CV.

Skills & Qualifications:

  • 2+ years of commercial experience with Python.
  • Proficiency in web scraping, data extraction, cleaning, and visualization.
  • Proficient with XPath: strong ability to design robust, resilient expressions for structured and semi-structured HTML documents.
  • Experience with web automation techniques and tools.
  • Hands-on experience with relevant libraries and frameworks, including:
    • Playwright, playwright-stealth (for web automation)
    • Requests, aiohttp, hrequests (for HTTP requests)
    • lxml (for parsing and data extraction)
  • Deep understanding of anti-bot protection and evasion strategies, including:
    • IP rotation and proxy management
    • Fingerprint spoofing and headless detection avoidance
    • Human-like behavior emulation (delays, mouse movement, interaction)
    • Bypassing JavaScript challenges (e.g., Cloudflare, Akamai, PerimeterX)
    • CAPTCHA solving techniques and integration with services (e.g., 2Captcha, Anticaptcha, CapSolver)
  • Experience implementing structured logging, traceability, and monitoring pipelines, including:
    • Logging request/response cycles, failures, retry attempts
    • Integration with log aggregation platforms (e.g., Sentry, CloudWatch, Grafana Loki)
    • Designing health checks and runtime metrics (success rate, ban rate, throughput)
    • Instrumenting scraping workflows with alerts and failure diagnostics
  • Strong understanding of multiprocessing and multithreading, including process and thread management.
  • Familiarity with Linux environments, cloud services (AWS, GCP), and Docker.
  • Experience working with SQL databases (PostgreSQL, MySQL, or equivalent).
  • Experience with GUI automation tools like PyAutoGUI.
  • Knowledge of virtual display environments (e.g., xvfb, pyvirtualdisplay).
  • Experience with Flask / Flask-RESTful for API development.
     

Key Responsibilities:

  • Develop, maintain, and optimize web scraping and parsing solutions.
  • Design and implement APIs, ETL pipelines, and data integration services.
  • Work closely with Project Managers to address customer requirements and challenges.
  • Ensure performance optimization and efficiency of data collection pipelines.
  • Collaborate with team members and participate in meetings, brainstorming sessions, and code reviews.
  • Implement anti-bot evasion strategies to enhance scraping reliability.
     

Optional Skills (Nice to Have):

  • Experience with NoSQL databases (MongoDB, Redis, or equivalent).
  • Knowledge of data analysis and processing using Pandas.

We offer:

  • Networking opportunities with international clients and challenging tasks;
  • Building interesting projects from scratch using new technologies;
  • Personal and professional development opportunities;
  • Competitive salary denominated in USD;
  • Paid vacation and sick leave;
  • Flexible work schedule;
  • Friendly working environment with minimal hierarchy;
  • Team building activities, corporate events.