1 - 20 of 69,093 Jobs

Search for Senior/Staff Machine Learning Systems Engineer

Quilter
Full-time|Remote|Remote - US

Join Quilter as a Senior/Staff Machine Learning Systems Engineer
At Quilter, we are transforming the landscape of electrical engineering by streamlining the complex process of designing printed circuit boards (PCBs). Our talented team consists of specialists in electrical engineering, electromagnetic simulation, machine learning/artificial intelligence (ML/AI…

Apr 3, 2026
Bjak
Full-time|On-site|United States

About the Role
Join bjakcareer as we create an innovative AI system that understands context throughout conversations, proactively plans actions, and effectively carries tasks forward over time. In this role, you will transform research directives into operational, production-ready machine learning systems. You will oversee the execution layer of our AI intelligence, including training pipelines, inference systems, evaluation tools, and deployment processes.
What You'll Do
- Develop and maintain comprehensive ML pipelines covering data management, training, evaluation, inference, and deployment.
- Optimize and fine-tune models utilizing cutting-edge methodologies like LoRA, QLoRA, SFT, DPO, and distillation.
- Design and manage scalable inference systems, ensuring a balance of latency, cost, and reliability.
- Create and oversee data systems that ensure high-quality synthetic and real-world training data.
- Implement evaluation pipelines that assess performance, robustness, safety, and bias, collaborating with research leaders.
- Manage production deployments, focusing on GPU optimization, memory efficiency, latency reduction, and scaling strategies.
- Work closely with application engineers to seamlessly integrate ML systems into backend, mobile, and desktop applications.
- Make practical trade-offs and deliver enhancements quickly, leveraging real-world usage data.
- Operate under real production constraints such as latency, cost, reliability, and safety.
Tech Stack
- Python
- PyTorch / JAX
- GPU-based training and inference systems
Ideal Experience
- Experience in developing or deploying real-world ML systems that have been utilized by users, beyond mere demonstrations.
- Familiarity with large models and understanding their failure modes.
- Proficiency in writing robust, production-grade code with a focus on system accuracy and reliability.
- Demonstrated self-direction, pragmatism, and a strong sense of ownership over project outcomes.
- Excellent communication and collaboration skills, thriving in small, high-trust teams.

Jan 29, 2026
Affirm
Full-time|$232K/yr - $310K/yr|Remote|Remote US

At Affirm, we are reshaping the landscape of credit by making it transparent and consumer-friendly, allowing individuals the freedom to buy now and pay later without hidden fees or accumulating interest.
Join our dynamic team as a Senior Staff Machine Learning Engineer and play a crucial role in the evolution of our innovative ML team. Our commitment is to Affirm's mission of transforming financial services with a focus on transparency and inclusivity. We harness cutting-edge machine learning techniques to deliver responsible and accessible financial products.
In this influential role, you will shape the future of machine learning at Affirm. Collaborating with ML Platform, engineering, product, and risk leaders, you will design, implement, and scale advanced modeling approaches that inform critical decisions within the organization. Your expertise will elevate our modeling capabilities, influence architectural directions, and ensure our systems are equipped to handle increasingly complex workloads. As a mentor to senior engineers, you will clarify intricate problems and contribute to a cohesive long-term ML strategy. If you are enthusiastic about pioneering modern machine learning solutions and eager to drive impactful innovation in a growing organization, Affirm is the perfect place for you.

Jan 7, 2026
ServiceNow
Full-time|On-site|Mountain View

Join our dynamic team at Moveworks as a Senior Staff Machine Learning Engineer. In this pivotal role, you will leverage your expertise in machine learning and artificial intelligence to develop and enhance agentic systems that transform the way organizations interact with their customers. You will work alongside a talented group of engineers and data scientists to create innovative solutions that improve user experiences.

Feb 5, 2026
Charlie Health
Full-time|$185K/yr - $247.5K/yr|On-site|New York, NY

Why Choose Charlie Health?
Across the nation, countless individuals are facing mental health challenges, substance use disorders, and eating disorders. Unfortunately, many encounter obstacles to obtaining the care they need, such as limited local options and lengthy wait times. Behavioral healthcare often fails to provide the personalized support that individuals require, leaving them feeling overlooked and unsupported.
At Charlie Health, our mission is to transform this landscape. We strive to connect individuals to crucial behavioral health treatment through personalized, virtual care that fosters connections between clients and clinicians, care teams, loved ones, and their supporting communities. By prioritizing those with complex needs, we are enhancing access to significant care and achieving improved outcomes, all from the comfort of home.
As a rapidly expanding organization, we are reaching more communities daily, and we are assembling a team dedicated to redefining behavioral health treatment. If you are eager to leverage your skills to create meaningful change and help individuals access the care they deserve, we invite you to join us.
About the Role
Charlie Health is a leader in high-acuity virtual behavioral care, having provided life-saving treatment to over 100,000 clients across the nation. Our Matching and Outcomes team is responsible for developing products that assess and enhance clinical outcomes, including systems for provider recommendations and clinical decision support tools, ensuring effective treatment. You will design predictive models and workflows to connect patients with the most suitable therapists and peers, enabling clinicians to customize care to meet each client's unique needs. The systems you develop will empower clinicians to condense and synthesize insights from our clients and providers into actionable guidance at every step. If you are passionate about utilizing data-driven insights to enhance outcomes, this team is for you.

Mar 10, 2026
Motional
Full-time|$125K/yr - $167K/yr|On-site|Pittsburgh, Pennsylvania, United States

Motional, based in Pittsburgh, develops autonomous vehicles with a strong focus on safety, reliability, and accessibility. Supported by Hyundai Motor Group, Motional is committed to advancing physical AI and shaping the future of transportation. The company’s mission centers on making streets safer and encouraging sustainable mobility. The Systems Readiness and Performance team connects software development with real-world deployment. This group handles system design, verifies and validates the autonomy stack, and sets as well as measures performance benchmarks. Team members work closely with autonomy, infrastructure, and operations partners to build the safety case for launching fully driverless IONIQ 5 robotaxis in Las Vegas.
Role overview
The Senior Engineer, Machine Learning Systems for Autonomy, joins the Autonomy Subsystems team to design and evaluate software modules that power autonomous vehicles. The role centers on assessing machine learning subsystems using offline model evaluations, open and closed-loop re-simulations, and on-road performance analysis. Defining and validating metrics that reflect subsystem performance is a key part of the job, along with sharing insights to help machine learning developers improve the models used in vehicles.
What you will do
- Design and evaluate machine learning software modules for autonomous vehicles
- Conduct offline assessments of models and analyze results from re-simulations
- Evaluate the on-road performance of machine learning models
- Define and validate metrics to measure autonomy subsystem performance
- Collaborate with machine learning developers to support improvements in deployed models
Requirements
- Experience as a systems engineer, with a background in safety-critical systems and machine learning model evaluation
- Interest in autonomy, robotics, and machine learning
- Prepared to help build production-ready systems for robotaxi deployment

Apr 21, 2026
Reddit Inc.
Full-time|$266K/yr - $372.4K/yr|Remote|Remote - United States

Reddit is a vibrant community of communities, fostering shared interests and authentic conversations. With over 100,000 active communities and around 121 million unique visitors daily, Reddit stands as one of the largest sources of information on the internet. To learn more, visit www.redditinc.com.
We are seeking a Senior Staff Machine Learning Engineer to become a pivotal member of our Feed Relevance team. This team is at the forefront of developing the end-to-end systems that enhance personalization and ranking for Reddit’s main feeds. In this crucial role, you will combine hands-on development with technical leadership, focusing on creating scalable, extensible, and high-performance personalization systems. Your expertise will define the technical direction for the team as you collaborate closely with Product and Engineering leadership to shape our strategy. Together with other talented individual contributors, you'll work on advancing and scaling our systems to support tens of millions of users. Your efforts will empower our Machine Learning Engineers (MLEs) and other engineers to rapidly iterate and continually enhance the user experience. Additionally, you will collaborate with Infrastructure and Platform teams to guide the future direction of our core infrastructure, paving the way for Machine Learning innovations at Reddit.

Apr 8, 2026
TRM Labs
Full-time|$200K/yr - $240K/yr|On-site|San Francisco, CA

Join Us in Building a Safer World.
At TRM Labs, we specialize in blockchain analytics and AI solutions aimed at assisting law enforcement, national security agencies, financial institutions, and cryptocurrency businesses in identifying, investigating, and preventing crypto-related fraud and financial crime. Our innovative platforms leverage blockchain intelligence and AI technology to trace funds, detect illicit activity, and construct comprehensive threat profiles. Trusted by leading organizations worldwide, TRM Labs is committed to enabling a safer and more secure environment for all.
Our AI Engineering Team is dedicated to pioneering next-generation AI applications, particularly in the realm of Large Language Models (LLMs) and agentic systems. Our goal is to develop resilient pipelines and high-performance infrastructure that facilitate the swift, safe, and scalable deployment of AI systems.
We manage extensive petabyte-scale pipelines, ensuring model serving with millisecond latency while providing the necessary observability and governance to make AI production-ready. Our team actively evaluates and integrates leading-edge tools in the LLM and agent space, including open-source stacks, vector databases, evaluation frameworks, and orchestration tools to accelerate TRM’s innovation pace.
As a Senior or Staff ML Systems Engineer – LLM, you will play a pivotal role in constructing and scaling our technical infrastructure for AI/ML systems.
Your responsibilities will include:
- Creating reusable CI/CD workflows for model training, evaluation, and deployment, integrating tools such as Langfuse, GitHub Actions, and experiment tracking.
- Automating model versioning, approval processes, and compliance checks across various environments.
- Developing a modular and scalable AI infrastructure stack that encompasses vector databases, feature stores, model registries, and observability tools.
- Collaborating with engineering and data science teams to embed AI models and agents into real-time applications and workflows.
- Continuously assessing and incorporating state-of-the-art AI tools (e.g., LangChain, LlamaIndex, vLLM, MLflow, BentoML).
- Promoting AI reliability and governance while enabling experimentation, ensuring compliance, security, and continuous uptime.
- Enhancing AI/ML Model Performance and ensuring data accuracy and consistency, leading to improved model training and inference.
- Implementing infrastructure to facilitate both offline and online evaluation of LLMs and agents.

Mar 12, 2026
Airbnb, Inc.
Full-time|$244K/yr - $305K/yr|Remote|Remote - USA

Join Airbnb as a Senior Machine Learning Engineer in our Communication & Connectivity team, where you will design and deploy innovative machine learning solutions that enhance the experiences of our hosts and guests. You will be at the forefront of developing recommendation systems, intent detection models, and integrating advanced intelligence into our platforms. This role offers the unique opportunity to shape early-stage initiatives into impactful products, leading a team of data scientists and engineers to drive machine learning applications in Messaging and Notifications, a vital area for improving user experience.

Jan 26, 2026
Coupang
Full-time|$152K/yr - $277K/yr|On-site|Seattle, USA

At Coupang, our mission is to amaze our customers. We thrive on their feedback, often hearing, “How did we ever live without Coupang?” Our passion for simplifying shopping, dining, and living has positioned us as a transformative force in the multi-billion dollar e-commerce sector. As one of the fastest-growing e-commerce companies, we have established a strong reputation as a reliable leader in South Korean commerce.
We enjoy the best of both worlds: a vibrant startup culture complemented by the resources of a large, global public company. This unique blend accelerates our growth and empowers us to launch new services quickly. We are a team of entrepreneurs eager to drive new initiatives and innovate. At Coupang, you will witness the growth of yourself, your colleagues, your team, and the company on a daily basis.
Our commitment to building the future of commerce is genuine. We constantly push the limits of what is achievable to tackle challenges and redefine traditional trade-offs. Join Coupang now to help create an extraordinary experience in an always-on, high-tech, and hyper-connected world.
Role Overview
The Search & Discovery organization focuses on enhancing our customers' navigation experience and promoting long-term growth. Each week, millions of customers utilize search features on our app and coupang.com, with over one-third making purchases. A significant portion of Coupang's sales can be directly linked to search and recommendation efforts, which drive various business initiatives and growth strategies. This position offers the chance to develop scalable ML platforms, both offline and online, to support ranking, recommendations, and business decision-making. You will manage vast amounts of user interaction data and raw log files, transforming them into easily consumable data objects for engineers and scientists. You will also serve complex ML models in real-time, assisting millions of Coupang customers in product search and discovery. If you have a passion for big data engineering and scalable serving systems, a desire to create meaningful business impact, and a knack for problem-solving, this role is perfect for you.

May 1, 2026
Coupang
Full-time|On-site|Mountain View, USA

Join Coupang as a Staff Machine Learning Engineer, where you will play a pivotal role in developing innovative machine learning solutions that drive business growth and enhance customer experience. You will collaborate with a talented team of engineers and data scientists to build scalable and efficient machine learning models.

May 1, 2026
Waymo LLC
Full-time|On-site|Mountain View, CA, US

Join Waymo as a Senior Staff Machine Learning Engineer specializing in Depot Automation, where you will play a vital role in transforming the future of transportation through cutting-edge machine learning solutions. Your expertise will drive innovation and efficiency, ensuring our autonomous vehicles operate seamlessly in various environments. Collaborate with a team of passionate engineers and researchers to develop and implement advanced algorithms that enhance our depot operations.

Apr 7, 2026
River Financial, Inc.
Full-time|$150K/yr - $250K/yr|Remote|Remote Americas + Europe

About River Financial
River Financial is building a financial institution focused on Bitcoin. The team aims to help individuals and businesses manage, save, and use Bitcoin securely and easily. River’s products support buying, selling, securing, and utilizing Bitcoin, with a vision of making Bitcoin a core part of personal savings and business balance sheets.
Role Overview: Senior/Staff Machine Learning Engineer
This remote position (Americas and Europe) calls for an experienced machine learning engineer to design, build, and maintain ML-driven systems. The work covers automation and decision-making for onboarding, risk, compliance, and operations. The role involves deploying machine learning models and large language models (LLMs) in production to solve real business problems.
Key Technologies
- XGBoost
- PyTorch
- Python
- MLflow
- Postgres
- BigQuery
About River’s Growth and Transparency
River has raised over $50 million from investors including Goldcrest, Kingsway, Polychain, M13, DG, and Valor. The company values transparency and shares key information publicly. Anyone can review financials and proof of reserves to independently verify River’s business health and growth.

Apr 17, 2026
Periodic Labs
Full-time|$300K/yr - $400K/yr|On-site|Menlo Park

Periodic Labs is an AI and physical sciences company based in Menlo Park. The team focuses on advancing scientific discovery by building advanced models that drive progress in materials, energy, and related fields. The company operates with a strong sense of ownership and a drive to push scientific boundaries, supported by leading investors and a rapidly growing organization.
Role overview
The Machine Learning Systems Engineer will own the systems layer that powers model training and inference. This work is closely tied to the reinforcement learning (RL) feedback loop at the heart of Periodic Labs' research process, where models propose experiments, experiments generate data, and that data improves future models. The role blends deep infrastructure work with research collaboration, focusing on both performance and integration with the scientific workflow.
What you will do
- Develop scheduling solutions for GB series GPUs using platforms like Ray, Slurm, and Kubernetes. Aim to minimize latency and maximize resource utilization across different cluster setups.
- Create profiling tools, both online and offline, to identify and resolve bottlenecks in the training and inference stack.
- Implement direct S3 checkpoint streaming to remove I/O bottlenecks during large-scale training runs.
- Benchmark RL training configurations across model sizes, batch strategies, and hardware architectures to find optimal setups.
- Write and optimize communication and GPU kernels to increase hardware throughput.
- Design and implement zero-copy RDMA weight synchronization between training and inference systems, keeping the RL loop fast and efficient.
- Develop sandbox execution environments for rapid algorithm testing and iteration.
Key focus areas
- Scheduling, kernels, RDMA, weight synchronization, and communication primitives
- Collaboration with researchers to co-design algorithms and infrastructure
- Accelerating the RL feedback loop that drives scientific discovery at Periodic Labs

Apr 29, 2026
Datadog
Full-time|$234K/yr - $300K/yr|Hybrid|Boston, Massachusetts, USA; New York, New York, USA

The Machine Learning Observability team at Datadog is at the forefront of developing innovative tools designed to monitor, interpret, and enhance AI systems deployed in production environments, with a special focus on Large Language Models (LLMs) and generative AI technologies. Our solutions provide comprehensive and scalable observability for AI workloads, including drift detection, model evaluation, and behavior tracing, empowering our clients to deploy AI confidently. As a Staff Software Engineer, you will be instrumental in driving the development of new features and core capabilities within Datadog’s LLM Observability product. You will influence product strategy, lead experimental initiatives, and leverage your extensive knowledge of AI systems and software engineering to tackle complex challenges in the rapidly evolving AI domain. Your contributions will have a significant impact on how our customers monitor, diagnose, and optimize LLM-powered applications in production. Join us in creating the essential tools that ensure AI systems are observable, comprehensible, and dependable in real-world applications. At Datadog, we value our office culture which fosters relationships, collaboration, and creativity. We operate within a hybrid work model to enable our Datadogs to achieve a work-life balance that suits them best.

Feb 18, 2026
Airbnb, Inc.
Full-time|$244K/yr - $305K/yr|Remote|Remote - USA

Airbnb began in 2007 with two hosts and three guests in San Francisco. Since then, the platform has grown to over 5 million hosts and more than 2 billion guests worldwide. Airbnb connects people with unique places to stay and experiences, building authentic community connections across nearly every country.
The team: Growth Platform Engineering
The Growth Platform team focuses on driving sustainable, long-term growth for Airbnb. The team’s mission centers on building an agentic system and supporting capabilities to help all Airbnb offerings grow, both now and in the future. Efforts include delivering personalized and relevant content and product experiences to users, both on and off the Airbnb platform. The team is working toward a future where AI identifies opportunities, creates campaigns, personalizes experiences, and optimizes outcomes with minimal human input. This journey moves through a maturity curve: AI-assisted, agentic, and ultimately autonomous systems, always with human oversight to ensure brand safety, quality, and compliance. Growth Platform Engineering is tightly integrated with the Airbnb product, enhancing the customer journey and enabling new ways for users to engage. The platform supports a range of digital marketing channels, landing pages, email, push notifications, SMS, and digital advertising, as well as the machine learning and data infrastructure that powers these efforts.
What you will do
- Develop AI-driven solutions to shape the future of Airbnb’s agentic growth platform, using the latest AI methodologies.
- Lead and mentor engineers through brainstorming, design, and implementation of AI products and features, from initial concept to deployment.
- Work at the intersection of technical depth, architectural innovation, and mentorship as a Senior Staff Engineer.
- Collaborate with cross-functional teams to build scalable systems that operate globally.
- Help evolve the foundational elements of Airbnb’s AI-powered growth systems.

Apr 14, 2026
Playlab
Full-time|Remote|Remote

Join Playlab as a Staff Machine Learning Engineer
At Playlab, we are a pioneering tech non-profit focused on empowering educators and students to become informed consumers and innovators in the field of AI.
Our mission revolves around a community-centric, open-source approach to educational AI. By providing essential tools and hands-on professional development, we enable educators and students to design tailored AI applications that fit their unique educational contexts. To date, over 60,000 educators have successfully published apps through Playlab, and our impact continues to expand.
We view AI as a transformative design material, one that should be shaped collaboratively to realize diverse visions of learning. If you are passionate about crafting equitable and innovative futures for educators and students, we invite you to be part of our journey.
The Role
We are searching for a Staff Machine Learning Engineer to enhance our dynamic Engineering team. In this role, you will explore and experiment with cutting-edge AI technologies, transforming them into practical solutions for educators. Your work will bridge the gap between advanced machine learning and actual educational needs, ensuring that frontier AI is both safe and accessible within educational environments.
Key Responsibilities
- Design and develop evaluation systems that gauge the quality of educational AI across numerous interactions, focusing on learning outcomes, bias detection, and curriculum alignment.
- Construct ML systems that facilitate self-improving app creation, utilizing insights from high-quality apps on our platform to automatically scaffold new applications for educators.
- Create and prototype downloadable, on-device AI models that function offline, ensuring privacy and global accessibility.
- Develop systems that support dynamic interfaces that adapt to learning moments, seamlessly transitioning between chat, writing editors, and interactive simulations.
- Establish content moderation and safety systems specifically tailored for educational conversations.
- Implement agentic AI systems that enable educators to craft goal-oriented applications (e.g., guiding students through projects over two weeks).
- Build advanced RAG systems that integrate diverse educational content with semantic search and knowledge graphs.
- And more…
Expectations
- Conduct research, prototype, and implement machine learning systems that empower educators.

Nov 24, 2025
Gimlet Labs
Full-time|On-site|San Francisco

At Gimlet Labs, we are pioneering the development of the first heterogeneous neocloud designed specifically for AI workloads. As the demand for AI systems surges, traditional homogeneous infrastructures face critical limits in power, capacity, and cost. Our innovative platform effectively decouples AI workloads from their hardware foundations, intelligently partitioning tasks and orchestrating them to the most suitable hardware for optimal performance and efficiency. This strategy fosters heterogeneous systems that span multiple vendors and generations, including cutting-edge accelerators, enabling significant enhancements in performance and cost-effectiveness at scale.
In addition to this foundational work, Gimlet is establishing a robust neocloud for agentic workloads. Our clients benefit from deploying and managing their workloads via stable, production-ready APIs, without the need to navigate hardware selection or performance optimization intricacies.
We collaborate with foundation labs, hyperscalers, and AI-native companies to drive real production workloads capable of scaling to gigawatt-class AI datacenters.
We are currently seeking a Member of Technical Staff specializing in ML systems and inference. In this pivotal role, you will be responsible for designing and constructing inference systems that facilitate the execution of complete models in real production environments. You will operate at the intersection of model architecture and system performance to ensure that inference processes are swift, predictable, and scalable.
This position is perfect for engineers with a deep understanding of modern model execution and a passion for optimizing latency, throughput, and memory utilization across the entire inference lifecycle.

Mar 10, 2026
Vectara
Full-time|Remote|US Remote

At Vectara, we are revolutionizing the deployment of Enterprise AI Agents and AI Assistants, emphasizing Accuracy, Security, and Explainability like never before. Our enterprise RAG Platform stands out by utilizing advanced models for retrieval, embedding, and reranking, alongside a meticulously optimized LLM trained for quality and cutting-edge Hallucination Mitigation techniques. Our innovative approach has garnered recognition in esteemed publications such as the New York Times and Visual Capitalist, solidifying our reputation as leaders in responsible, production-ready AI solutions. With a diverse clientele of over 100 enterprises, including prominent US military organizations, financial institutions, healthcare providers, and manufacturers, we are committed to delivering exceptional results.
Our founding team comprises seasoned professionals from Google, specializing in neural information retrieval and distributed systems. We invite you to join us in our mission to empower the world to discover meaningful insights. At Vectara, our team is built on passion and expertise, featuring top talents from companies like Google, Cloudera, Splunk, MongoDB, and Elastic.

Mar 16, 2026
Motional
Full-time|$125K/yr - $167K/yr|On-site|Las Vegas, Nevada, United States

Location: Las Vegas, Nevada, United States
Motional develops autonomous driving technology aimed at making driverless vehicles safe, reliable, and accessible. Backed by Hyundai Motor Group, Motional works to advance transportation safety and sustainability through real-world deployment of AI-powered vehicles.
Team: Systems Readiness and Performance
This group bridges software development and real-world application. The team leads system design, validates the autonomy stack, and sets measurable performance targets. Collaboration with autonomy, infrastructure, and operations teams is key to building the safety case for launching fully driverless IONIQ 5 robotaxis in Las Vegas.
Role overview
The Senior Engineer, Autonomy Machine Learning Systems, joins the Autonomy Subsystems team, which designs and evaluates autonomy software modules. This role centers on improving machine learning subsystems within Motional's autonomy stack.
- Evaluate machine learning models using offline testing, open and closed-loop re-simulations, and on-road performance analysis
- Define and track metrics for autonomy subsystem performance
- Validate machine learning model performance against established metrics
- Deliver insights to machine learning developers and teams to help improve model performance in deployed vehicles
Requirements
- Experience designing and evaluating safety-critical systems and machine learning models
- Interest in autonomy, robotics, and machine learning
- Motivation to help build production-ready systems for scaled robotaxi deployment

Apr 21, 2026
