Research Crawling Engineer

Wynd Labs | Remote

Remote Full-time

Qualifications

Proficient in programming with one or more languages: Go, Rust, Python, Java, or C++. Demonstrated experience in building web crawlers or large-scale data pipelines. Strong grasp of HTTP, networking principles, and browser behavior. Familiarity with distributed systems and parallel processing techniques. Experience handling large datasets (scales of TB to PB preferred).

About the job

The company emphasizes a lean structure and rapid iteration, keeping bureaucracy to a minimum to focus on progress in open web data and AI.

Role overview

The Research Crawling Engineer (Remote) focuses on designing and running systems for large-scale web data acquisition to support research and model development. This position involves working across distributed systems, scraping frameworks, and complex data pipelines.

Main responsibilities

Develop and maintain web crawlers that operate across multiple domains at scale.
Build high-throughput, fault-tolerant systems capable of collecting data from millions to billions of URLs each day.
Address anti-bot protections, rate limiting, and the challenges of dynamic or JavaScript-heavy websites.
Design data pipelines for cleaning, deduplication, filtering, and normalization.
Create and manage datasets tailored for research and model training.
Monitor crawl performance, coverage, and data quality, iterating quickly based on feedback.
Collaborate with research teams to ensure data collection aligns with modeling needs.
Optimize infrastructure for cost efficiency, low latency, and reliability.

Requirements

Proficiency in at least one of these languages: Go, Rust, Python, Java, or C++.
Direct experience building web crawlers or large-scale data pipelines.
Strong understanding of HTTP, networking, and browser behavior.
Familiarity with distributed systems and parallel processing.
Experience working with large datasets (terabyte to petabyte scale preferred).

Comfort troubleshooting in unstable or adversarial environments is important for this role.

Preferred skills

Experience with NLP techniques and frameworks is a plus.

About Wynd Labs

Wynd Labs is at the forefront of web data infrastructure, enabling organizations to access and utilize vast amounts of information for AI model training. Our innovative solutions and rapid development processes allow us to stay ahead in the evolving landscape of data acquisition and AI technologies.

Similar jobs

Software Engineer – Web Crawling at Woflow | Remote

Woflow

Full-time|Remote|Remote

Woflow is an innovative technology startup dedicated to developing cutting-edge products and solutions that cater to the rapidly evolving on-demand economy. Our flagship offering is a comprehensive platform that empowers customers to seamlessly request and receive vital merchant data, including structured menu information, images, and store details, through …

Mar 6, 2025

Software Engineer - Web Crawling

Exa

Full-time|On-site|San Francisco, California

At Exa, we are on a mission to create a groundbreaking search engine from the ground up, designed to empower every AI application. Our team develops robust infrastructure to efficiently crawl the web, trains cutting-edge embedding models for indexing, and engineers high-performance vector databases in Rust for seamless searching. We also manage a state-of-the-art $5M H200 GPU cluster that powers tens of thousands of machines.

As a Software Engineer specializing in Web Crawling, you will take on the exciting challenge of building a web crawler that operates at a scale comparable to Google, revolutionizing the way we access and utilize information online.

Jul 23, 2025

Senior Fullstack Software Engineer

Woflow

Full-time|Hybrid|San Francisco: HQ, Hybrid

Woflow is an innovative technology startup dedicated to developing cutting-edge products and solutions for the rapidly evolving on-demand economy. Our flagship offering is a comprehensive platform that empowers our clients to seamlessly request and obtain structured merchant data, including menus, images, and store details, through a blend of web applications and public APIs. At the core of our operations, advanced machine learning and artificial intelligence models work in tandem with an automated distributed workforce management system. Proudly, we are recognized as the world's first Merchant Data Platform.

We cater to a diverse clientele, including food delivery services, online ordering systems, and e-commerce marketplaces, providing the robust data infrastructure necessary for their growth and scalability.

As a Senior Fullstack Engineer at Woflow, you will play a pivotal role in guiding the evolution of our products and the overall direction of the company. Your day-to-day responsibilities will involve planning innovative features, owning product development from inception to completion, and collaborating with the operations team to address and resolve customer concerns. To excel in this role, a passion for coding, meticulous attention to detail, and a commitment to delivering projects with pride are essential.

Key Responsibilities:

Collaborate across the technology organization, including engineering, product, and devops/data science teams, to deliver cohesive and successful products.
Engage directly with the operations team to ensure our data infrastructure remains efficient and effective.
Commit to providing an exceptional user experience by writing clean, maintainable, and efficient code.
Thrive in a fast-paced, agile environment where your contributions will directly impact our success.

Required Qualifications:

A minimum of 5 years of professional software development experience.
A strong passion for coding and a continual desire to learn and grow.
Proven ability to deliver projects with exceptional attention to detail.

Preferred Qualifications:

Experience with Ruby on Rails.
Proficiency in React.
Familiarity with Apollo, GraphQL, and web crawling technologies.

Oct 16, 2024

Research Crawling Engineer

Wynd Labs

Full-time|Remote|Remote

Wynd Labs creates infrastructure that enables organizations to access large-scale web data for advanced AI model training. The team operates Grass, a bandwidth-sharing network powering a distributed web crawler that gathers high-quality public data from around the globe. Wynd Labs also manages data pipelines that process and annotate billions of videos, transcripts, and audio files, supporting research labs in building datasets.

Apr 20, 2026

Mid-Market Account Manager at Woflow | San Francisco, CA; New York City

Woflow

Full-time|On-site|San Francisco, CA; New York City

Join Woflow as a Mid-Market Account Manager, where you will play a crucial role in our Account Management team. As the primary point of contact for our valued customers, you will act as a trusted advisor throughout the post-sales lifecycle. Your responsibilities will include onboarding, product usage guidance, partnership expansion, technical troubleshooting, and continuous account management.

In this senior role, you will manage a significant portfolio of strategic enterprise accounts, blending your expertise across three key areas:

Customer Success: As the face of Woflow, you will embody our customer-first philosophy. Your ability to empathize and build strong relationships with all stakeholders is vital. You will balance client requests with business objectives and be accountable for achieving growth targets while exceeding customer expectations.
Project Management: You will oversee multiple priorities across various workflows and serve as the essential link between account stakeholders and Woflow's internal teams, including Product, Engineering, and Operations. You'll ensure the efficient execution of strategic initiatives and monitor client reporting and communications.
Technical Delivery: Given Woflow's operational intensity, a deep understanding of our platform's technical aspects is essential. You will need to effectively communicate technical requirements, demonstrate our platform's capabilities, and collaborate with stakeholders to develop innovative solutions that align with customer needs.

Dec 3, 2025

Frontend Engineer - AI Crawl Control

Cloudflare, Inc.

Full-time|Hybrid|Hybrid

Join Cloudflare as a Frontend Engineer specializing in AI Crawl Control. In this pivotal role, you will develop innovative frontend solutions to enhance web crawling capabilities. Collaborate with a dynamic team of engineers and data scientists to create and optimize user interfaces that empower our AI systems. Your contributions will play a key role in shaping the future of web performance and security.

Mar 4, 2026

Crawling Engineer (Opportunistic Hire)

Coupang

Full-time|On-site|z-Test & Templates Only

Join us as a Crawling Engineer!

At Coupang, we are dedicated to delivering an extraordinary shopping experience for our customers. As a Crawling Engineer, you will play a crucial role in enhancing our data acquisition processes and ensuring that our systems are optimized for performance and reliability.

Application Instructions:

Please fill out the Internal Transfer Request Form and submit it using your Coupang email address.

Mar 13, 2026

Software Engineer, Data Acquisition

OpenAI

Full-time|On-site|San Francisco

Overview:

Join the dynamic Data Acquisition team at OpenAI, part of our Foundations organization, where you will play a crucial role in the data collection processes that power our model training initiatives. Our team is at the forefront of managing web crawling and GPTBot services, collaborating closely with departments such as Data Processing, Architecture, and Scaling. We are seeking a talented Software Engineer who is passionate about data acquisition and eager to make a significant impact.

Key Responsibilities:

Lead and innovate engineering projects focused on data acquisition, including web crawling, data ingestion, and search functionalities.
Collaborate effectively with cross-functional teams, including Data Processing, Architecture, and Scaling, to ensure seamless data flow and operational efficiency.
Partner with the legal team to navigate compliance and data privacy challenges.
Design and implement highly scalable distributed systems capable of processing petabytes of data.
Architect algorithms for efficient data indexing and robust search capabilities.
Build and manage backend services for data storage, including working with key-value databases and ensuring synchronization.
Implement solutions within a Kubernetes Infrastructure-as-Code environment and conduct regular system health checks.
Conduct experiments and analyze data to derive insights that enhance system performance.

Qualifications:

Bachelor's, Master's, or PhD in Computer Science or a related field.
4+ years of professional software development experience.
Familiarity with large-scale web crawlers is a plus.
Deep understanding of large stateful distributed systems and data processing methodologies.
Proficient in Kubernetes and knowledgeable about Infrastructure-as-Code principles.
Eager to explore and implement new technologies and approaches.
Proven ability to manage multiple tasks and adapt to shifting priorities.
Excellent written and verbal communication skills.

About OpenAI:

At OpenAI, we are pioneers in AI research and deployment, dedicated to ensuring that the advancements in artificial intelligence benefit humanity as a whole. Our mission is to push the boundaries of AI capabilities while adhering to safe and responsible deployment practices. Join us in our commitment to harnessing the power of AI for positive global impact.

Sep 22, 2023

Senior Web Software Engineer at Step | Remote

Step

Full-time|Remote|Remote - US

Join Step as a Senior Web Software Engineer

At Step, we empower side hustlers, creators, freelancers, and investors. Our mission is to support those who are eager to start anew, rebuild, or enhance their current financial situations. We welcome individuals who are unafraid to face challenges, particularly in their financial journeys.

Step is at the forefront of transforming financial services, aiming to create a superior banking and borrowing experience that fosters financial independence. By providing essential tools, education, and opportunities, we enable our users to manage, grow, and build their wealth confidently. Our focus on a mobile-first approach is reshaping the banking landscape, ensuring an engaging consumer experience.

With over 7 million satisfied customers, we are a fast-growing company that prioritizes a strong mission and vision that places people at the center of everything we do. If you are looking to contribute to an impactful organization, we would love to hear from you!

Step is a proud member of Beast Industries, led by the renowned Jimmy Donaldson, aka MrBeast. Known for his innovative approach to digital content creation, Beast Industries spans various sectors including digital media, philanthropy, and consumer products. Our commitment to creativity and social impact drives us to explore new possibilities and create memorable experiences.

May 2, 2026

Enterprise Account Executive

Woflow

Full-time|On-site|San Francisco, CA; New York City

About Us

At Woflow, we are revolutionizing the landscape of real-time, AI-driven commerce. Our innovative platform empowers the largest marketplaces and commerce platforms, such as Shopify, DoorDash, Uber Eats, and Square, to efficiently ingest and structure intricate supplier and product data at scale. As AI agents and dynamic commerce experiences become prevalent, Woflow is committed to helping enterprise clients remain at the forefront of this transformation.

The Role

We are seeking an exceptional Enterprise Account Executive to spearhead strategic sales initiatives and secure partnerships with key commerce platforms. You will take charge of intricate sales cycles and engage with C-level and VP stakeholders, driving revenue growth through acquiring new logos.

Your Responsibilities:

Manage the complete enterprise sales cycle, from outbound prospecting and qualification to closing deals
Develop and maintain a robust pipeline of strategic accounts, including marketplaces, SaaS platforms, and aggregators
Engage with and influence executive decision-makers across Product, Operations, and Data teams
Articulate Woflow’s unique value proposition, particularly regarding data automation, AI readiness, and scalability
Build a deep understanding of customer challenges and align them with Woflow’s specialized solutions
Collaborate closely with Solutions Engineering, Product, and Account Management to ensure seamless alignment during evaluations and handoffs
Accurately forecast and consistently meet or exceed quarterly revenue objectives
Represent Woflow at industry events and conferences as needed

Jun 12, 2025

Staff Software Engineer - Web Platform

Pantheon

Full-time|$164.2K/yr - $205.2K/yr|Remote|United States (Remote)

About Pantheon

Pantheon supports over 300,000 websites for organizations including Google, Princeton, Salesloft, Clorox, and the United Nations. Our WebOps platform helps developers and marketers build, manage, and scale WordPress, Drupal, and Next.js sites that reach billions worldwide. The company values expertise, results, and a collaborative, remote-friendly culture where individual contributions have real impact.

Role Overview

Pantheon is building a real-time, collaborative web content platform from the ground up. This product aims to change how teams create, publish, and manage content online. As a Staff Software Engineer - Web Platform (Remote, United States), you will play a central role in shaping this new product category, which blends developer tools, visual editing, and cloud-native content infrastructure. This is not a maintenance role. The work focuses on foundational engineering and new product development, with decisions that will influence thousands of developers and content teams for years to come.

What You Will Do

Lead technical direction and contribute hands-on across the stack, from backend APIs and real-time synchronization services to frontend component systems and editing interfaces.
Work closely with product and design to define architecture, set engineering standards, and make build-versus-buy decisions.
Guide and mentor a focused engineering team, helping achieve key milestones.
Balance deep technical discussions (such as distributed state management) with building user-friendly, engaging interfaces.

What Pantheon Looks For

Experience as a technical leader and hands-on contributor in web platform engineering.
Comfort working across the full stack, including backend services and frontend interfaces.
Ability to collaborate with product and design teams on architecture and product direction.
Depth in distributed systems and frontend development.

Why Join Pantheon?

Opportunity to help define a new product category in web content management.
Work on technically challenging and commercially relevant problems.
Collaborate with talented colleagues in a remote-friendly environment where results matter.

Apr 19, 2026

Senior Software Engineer - Collaborative Web Platform

Pantheon

Full-time|Remote|United States (Remote)

Role overview

Pantheon seeks a Senior Software Engineer to strengthen its collaborative web platform. The position involves building scalable applications and refining user experiences. Much of the work centers on designing, developing, and maintaining features that enable digital collaboration.

What you will do

Partner with cross-functional teams to design and implement new features
Develop and maintain scalable web applications
Help improve the platform’s usability and reliability

Location

This is a remote role based in the United States.

Apr 22, 2026

Software Engineer II - Collaborative Web Platform

Pantheon

Full-time|Remote|United States (Remote)

Pantheon is seeking a Software Engineer II to support the ongoing development of its collaborative web platform. This position is fully remote and open to candidates located in the United States.

Role overview

This role centers on building and improving features that help users collaborate more effectively on Pantheon's web platform. The work involves close coordination with colleagues across different teams and disciplines.

What you will do

Partner with cross-functional teams to design, develop, and refine new and existing features
Tackle technical challenges with creative solutions that improve the user experience
Write and deliver high-quality code to ensure the platform continues to meet user needs

Requirements

Experience working with modern web technologies
Strong problem-solving abilities and keen attention to detail
Comfort collaborating within an engineering team

Pantheon values engineers who care about building tools that empower users and want their work to make a difference.

Apr 22, 2026

Software Web Applications Engineer

dstaff

Full-time|On-site|Middletown

Join our innovative team at dstaff as a Software Web Applications Engineer. In this role, you will be responsible for developing and maintaining cutting-edge web applications that enhance user experience and drive business success. You will collaborate with cross-functional teams to design, implement, and optimize web solutions that meet our clients' needs.

Mar 5, 2015

Software Web Applications Engineer

dstaff

Full-time|On-site|Middletown

Join our innovative team at dstaff as a Software Web Applications Engineer. We are seeking a talented individual who is passionate about developing dynamic web applications and enhancing user experiences. In this role, you will collaborate with cross-functional teams to design, implement, and maintain high-quality software solutions that meet the needs of our clients.

Mar 5, 2015

Senior Web Software Engineer

Atec Spine

Full-time|$120K/yr - $130K/yr|On-site|Carlsbad, California, United States

Atec Spine is seeking an experienced Senior Software Engineer, Web, to join our dynamic team focused on enhancing the Informatix platform through innovative web applications. This role offers an exciting opportunity to engage in the full software development lifecycle, from defining requirements and designing solutions to developing, deploying, maintaining, and optimizing performance.

Key Responsibilities:

Lead the development of user interface components for web applications to meet specific project needs.
Set up and configure server environments specifically for Vue.js deployments.
Collaborate with fellow web developers and software engineers to create a robust and adaptable front-end architecture.
Perform performance testing, pinpoint optimization opportunities, and drive continuous improvements.
Estimate tasks and execute software projects in alignment with project timelines.
Mentor junior developers and oversee the comprehensive delivery of modules.
Maintain clear and proactive communication regarding project statuses.
Generate documentation pertinent to software development projects, including design artifacts and test plans.
Work effectively with distributed teams around the globe.

Nov 11, 2025

Director of Software Engineering - Node.js & Web Scraping Specialist

PortPro

Full-time|Remote|Los Angeles, California

Join our dynamic team at PortPro as a Director of Software Engineering, where you will leverage your extensive expertise in Node.js and large-scale web scraping. In this pivotal role, you will drive the engineering team towards designing and optimizing high-performance, distributed web scraping systems. We are looking for a visionary leader who possesses extensive experience in managing anti-bot measures, optimizing data pipelines, and building scalable cloud-based architectures.

Mar 11, 2025

Senior Staff Software Engineer, Consumer Engineering (Web Infrastructure)

Affirm

Full-time|Remote|Remote US

Affirm is looking for a Senior Staff Software Engineer to join the Consumer Engineering team, focusing on Web Infrastructure. This fully remote role is based in the US.

Role overview

This position centers on designing and building the web infrastructure that powers Affirm’s consumer products. The work involves collaborating with engineering, product, and design teams to create systems that scale and perform reliably.

What you will do

Lead the design and development of infrastructure supporting Affirm’s consumer web platforms.
Work alongside cross-functional teams to build scalable solutions.
Drive architectural decisions for web technologies and set technical direction for future projects.
Ensure high standards of performance and security across web platforms.
Contribute to creating seamless and dependable web experiences for Affirm’s customers.

Apr 23, 2026

Senior Software Engineer - Web Capture

FullStory

Full-time|On-site|Atlanta

Role overview

The Senior Software Engineer - Web Capture at FullStory focuses on building software to capture and analyze user interactions across web platforms. This work plays a key role in supporting efforts to improve user experience for FullStory’s customers.

What you will do

Design, develop, and implement software applications that capture web interactions
Work closely with engineering, product, and design teams to deliver high-quality solutions
Contribute to projects that analyze user behavior and help enhance usability

Location

This position is located in Atlanta.

Apr 28, 2026

Software Engineer II - Web Foundations

StubHub Inc.

Full-time|$165K/yr - $200K/yr|Hybrid|New York, New York, United States

Join StubHub, where we're on a mission to revolutionize the live event experience worldwide. Whether it's someone's first event or their hundredth, we're dedicated to providing exceptional service from the ticket search to stepping through the gate. We aspire to be the safest and most convenient platform for both ticket buyers and sellers, ranging from individual fans to worldwide promoters.

StubHub is excited to welcome a Software Engineer II to our Web Foundations team within the Consumer App Foundations (CAF) department. Our team is responsible for creating and maintaining the shared web platform that underpins StubHub's consumer experiences on both web and mobile.

As a full-stack engineer on the Web Foundations team, you will engage with both frontend and backend systems to develop reusable services, shared libraries, and platform capabilities that empower product teams to deliver faster and with enhanced quality. Your contributions will impact core systems serving millions of users, ensuring they remain performant, reliable, and easy for other engineers to utilize.

This role is ideal for engineers who thrive on building scalable systems, enhancing developer experiences, and tackling real-world challenges at marketplace scale.

Mar 2, 2026
