Web Crawling

2 weeks ago


India NewsCatcher (YC S22) Full time

🌟 We are seeking a highly skilled and motivated Web Crawling & Scraping Engineer. We crawl over 100k news websites daily, and we are looking for someone who is as passionate about web crawling as we are and curious to explore new ways of handling difficult websites and automating our crawling techniques.

Functions:

  • Crawling Platform:
    • Design, construct, test, and maintain robust, reliable, and scalable crawling pipeline infrastructure.
    • Add an automatic way of fixing non-working crawlers.
    • Provide metrics on website coverage.
  • Data Pipeline:
    • Design, construct, test, and maintain robust and reliable data pipeline infrastructure.
    • Write automation and unit tests.
  • Optimization:
    • Optimize server performance and resource utilization of the crawling infrastructure.
    • Regularly review and improve system performance and scalability.
  • Collaboration and Documentation:
    • Maintain accurate and up-to-date documentation of server configurations, procedures, and policies.
    • Provide technical support and training to team members as needed.

Example Tasks:

  • Introduce a new automatic way of crawling a website that does not work with existing techniques
  • Devise a way to diagnose why a specific crawler stopped working and fix it automatically
  • Use LLM-based methods to improve crawling techniques

Experience:

  • Proven experience as a Web Crawling & Scraping Engineer or similar role.
  • Web scraping and web crawling techniques
  • Streaming/batch data processing frameworks such as RabbitMQ
  • Solid knowledge of SQL and NoSQL databases
  • Kubernetes/Docker is a must-have
  • Strong problem-solving skills
  • Excellent communication and collaboration skills.

Nice to have:

  • Experience with ElasticSearch (OpenSearch)

Your KPIs:

  • Number of non-working crawlers per website (should be small)
  • Time from a crawler going down to us coming up with a fix (should be small)

Compensation and Perks:

  • Competitive salary and equity.
  • Up to 24 days of vacation & 16 days of sick leave/holidays (all fully paid)
  • Learning and development compensation
  • One meeting-free day per week
  • Co-working Budget
  • Training Budget
  • We provide all the necessary equipment to work comfortably and efficiently from home.
  • Yearly company retreats (2024: Canary Islands, 2023: French Alps)

Needed tools:

  • Scrapy
  • Crawlee



  • India NewsCatcher (YC S22) Full time

    We are seeking an experienced Crawling Expert to join our team at NewsCatcher (YC S22). This is a fantastic opportunity to work on scalable web data solutions, leveraging your skills in web crawling and scraping engineering. Job Overview: The successful candidate will be responsible for designing, constructing, testing, and maintaining robust, reliable, and...


  • India Black Panda Enterprises Full time

    Are you a passionate and skilled professional looking for a challenging role at Black Panda Enterprises? About the Job: We are seeking a talented Web Crawler Developer to join our team. This is a work-from-home position that requires high-speed internet connectivity, a capable business-grade computer, and a stable power connection. Job Summary: This role involves...

  • Software Engineer

    2 weeks ago


    India Forage AI Full time

    Software Engineer Role Summary - In this role, you’ll be working with an amazingly passionate and talented team of engineers and data scientists who are working at the bleeding edge of data science and data Automation. Requirements: A bachelor’s degree in Computer Science/Information Technology engineering is preferred. 0-2 years of experience in web crawling...

  • Software Engineer

    2 weeks ago


    India Forage AI Full time

    Software Engineer Role Summary - In this role, you’ll be working with an amazingly passionate and talented team of engineers and data scientists who are working at the bleeding edge of data science and data Automation. Requirements: A bachelor’s degree in Computer Science/Information Technology engineering is preferred. 2-3 years of experience in web...

  • Software Engineer

    3 days ago


    India Black Panda Enterprises Full time

    Role Summary In this role, you’ll be working with an amazingly passionate and talented team of engineers and data scientists who are working at the bleeding edge of data science and data automation. Requirements A bachelor’s degree in Computer Science/Information Technology engineering is preferred. 0-2 years of experience in web crawling using Python....

  • Software Engineer

    4 weeks ago


    India ELife Full time

    Our Company: A fast-growing start-up headquartered in San Francisco, CA, USA in the heart of Silicon Valley. We recruit worldwide as our customer base is global. Reliable ground transportation provider, any type of vehicle globally. Vision: Reliable ground transportation services globally with all types of vehicles. Mission: Empower high quality local...
