Staff ML Performance Engineer - Training Efficiency

Wayve Technologies | Sunnyvale

On-site Full-time

Qualifications

We are looking for candidates with a strong background in machine learning and performance engineering. Ideal candidates will have:
A Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Proven experience in ML model optimization and performance tuning.
Strong programming skills in Python, C++, or similar languages.
Experience with machine learning frameworks such as TensorFlow or PyTorch.
Excellent analytical and problem-solving abilities.
Strong communication skills and the ability to work in a collaborative environment.

About the job

Join Wayve Technologies as a Staff Machine Learning Performance Engineer, specializing in Training Efficiency. In this pivotal role, you will be responsible for enhancing the performance of our machine learning models and algorithms, ensuring they operate at peak efficiency. You will collaborate with cross-functional teams to develop innovative solutions that improve training processes, optimize model performance, and drive impactful results in autonomous vehicle technology.

About Wayve Technologies

Wayve Technologies is at the forefront of autonomous vehicle technology, dedicated to developing innovative solutions that push the boundaries of machine learning and artificial intelligence. Our team is passionate about transforming how vehicles perceive and interact with the world. Join us in shaping the future of transportation!

Similar jobs

Senior Site Reliability Engineer for AI/ML Innovations

Intuitive Surgical, Inc.

Full-time|On-site|Sunnyvale

Join our dynamic team as a Senior Site Reliability Engineer focused on AI/ML solutions. In this role, you will leverage your expertise to enhance the reliability, scalability, and performance of our cutting-edge AI-driven products. You will work collaboratively with cross-functional teams to design, implement, and maintain robust systems that support our mis…

Dec 25, 2025

Senior Site Reliability Engineer

Illumio

Full-time|On-site|Sunnyvale, California - HQ

Illumio’s Senior Site Reliability Engineer role is based at the company’s Sunnyvale, California headquarters. This is an on-site position, requiring presence in the office five days a week.
Role overview
This position focuses on building and maintaining reliable, scalable infrastructure for Illumio’s applications and services, with an emphasis on Azure cloud solutions. The Senior SRE supports both SaaS and on-premises offerings, working closely with engineering teams to ensure operational resilience and security across hybrid environments.
What you will do
Design, deploy, and maintain highly available infrastructure on Azure for Illumio’s products.
Automate provisioning and configuration management using Infrastructure as Code tools such as Terraform or ARM templates.
Develop and manage CI/CD pipelines to improve software delivery and deployment processes.
Monitor system and application health using Azure monitoring and logging tools, and optimize for performance and availability.
Lead incident response, perform root cause analysis, and document findings to drive continuous improvement.
Collaborate with development teams to design scalable, reliable architectures and provide guidance on cloud-native best practices.
Engineering at Illumio
The engineering team values autonomy, ownership, and collaboration. Work centers on advancing cybersecurity with scalable SaaS services and solutions for on-premises environments. The team emphasizes disciplined engineering, quality, and a supportive culture.

Apr 22, 2026

Site Reliability Engineer II at Illumio | Sunnyvale, California

Illumio

Full-time|On-site|Sunnyvale, California - HQ

Join Us on Our Mission!
At Illumio, we are pioneering the way organizations combat ransomware and data breaches. Our innovative breach containment platform, driven by the Illumio AI Security Graph, enables businesses to effectively identify and mitigate threats across hybrid multi-cloud environments, preventing attacks from escalating into severe crises.
As a recognized leader in the Forrester Wave™ for Microsegmentation, Illumio's solutions empower organizations to adopt Zero Trust models, enhancing cyber resilience for the critical infrastructure that sustains the global economy.
On-Site Work:
This position requires 5 days a week on-site presence at our Sunnyvale, CA headquarters.
Our Vision:
Our Engineering team thrives on a culture of visionary leadership, autonomy, and ownership, fostering an innovative environment that propels us through the dynamic landscape of cybersecurity.
By joining our team, you will contribute to the forefront of Zero Trust Segmentation, utilizing an advanced technology stack that encompasses diverse operating systems, distributed applications, and cutting-edge UI/visualization tools.
Together, we are shaping the future of cybersecurity, committed to developing world-class products guided by diverse perspectives and a shared dedication to innovation amidst unprecedented cyber threats.
Your Role:
As a Site Reliability Engineer II, you will oversee our multi-cloud infrastructure on platforms such as Azure, AWS, and/or GCP. Your responsibilities will include designing new cloud services and applications, collaborating closely with Engineering, SRE/OPS, and Security teams to transition these projects from development to production.
Daily tasks will involve enhancing the reliability and scalability of Illumio's SaaS products while driving continuous improvement initiatives.
We seek candidates with a strong passion for cloud technology, automation, and collaboration, as well as a solid understanding of the Azure cloud platform and related DevOps practices.

Feb 7, 2026

Site Reliability Engineer II at Illumio | Sunnyvale, California

Illumio

Full-time|On-site|Sunnyvale, California - HQ

Join Us in Securing the Future!
At Illumio, we are pioneers in ransomware and breach containment, transforming how organizations defend against cyberattacks and fortifying operational resilience. Our innovative Illumio AI Security Graph powers a breach containment platform that swiftly identifies and neutralizes threats across hybrid multi-cloud environments, preventing minor issues from escalating into catastrophic events.
As a recognized leader in the Forrester Wave™ for Microsegmentation, we enable Zero Trust, bolstering the cyber resilience of the infrastructures, systems, and organizations that keep the world functioning smoothly.
Location: This role requires on-site presence in our Sunnyvale, CA headquarters five days a week.
Vision of Our Team:
Our Engineering team flourishes within a culture that champions visionary leadership, autonomy, and ownership. This dynamic synergy propels us forward in the constantly evolving realm of cybersecurity.
As a member of our team, you will be at the forefront of Zero Trust Segmentation, working with an advanced technology stack that encompasses operating systems, distributed applications, and immersive UI/visualization tools.
We're not just shaping the future of cybersecurity; we’re committed to developing world-class products led by diverse perspectives, backgrounds, and an unwavering commitment to innovation amidst unprecedented cybersecurity challenges.
Your Role:
As a Site Reliability Engineer II, you will oversee and optimize our multi-cloud infrastructure across Azure, AWS, and/or GCP. You will have the opportunity to design new services and applications in the cloud, guiding them from development to production while collaborating closely with Engineering, SRE/Operations, and Security teams.
Your daily responsibilities will include enhancing the reliability and scalability of Illumio's SaaS offerings and spearheading continuous improvement initiatives.
The ideal candidate is driven by a passion for cloud technology, automation, and collaboration, coupled with a solid foundation in Azure cloud platforms and relevant DevOps practices.
Design, deploy, and maintain robust cloud infrastructure solutions on Azure, AWS, and/or GCP to support our applications and services.
Implement Infrastructure as Code (IaC) principles using tools such as Terraform, ARM templates, or CloudFormation to automate provisioning and configuration management.
Develop and maintain CI/CD pipelines for automated software delivery and deployment, utilizing tools like Azure DevOps, AWS CodePipeline, or Jenkins.
Monitor system performance and availability, ensuring optimal operational efficiency.

Mar 23, 2026

Performance & Reliability Engineer

Cerebras Systems

Full-time|On-site|Sunnyvale, CA; Toronto, Ontario, Canada

Cerebras Systems is at the forefront of AI technology, developing the world’s largest AI chip that is 56 times larger than conventional GPUs. Our innovative wafer-scale architecture delivers the computational power of dozens of GPUs within a single chip, simplifying programming and enhancing performance. This unique capability enables Cerebras to provide unparalleled training and inference speeds, allowing machine learning practitioners to execute large-scale ML applications seamlessly without the complexities of managing extensive GPU or TPU infrastructures.
Cerebras serves a diverse clientele, including top-tier model labs, global enterprises, and pioneering AI-native startups. OpenAI has recently partnered with Cerebras to leverage 750 megawatts of power, significantly enhancing key workloads through ultra high-speed inference.
Our cutting-edge wafer-scale architecture has made Cerebras Inference the fastest Generative AI inference solution globally, achieving speeds over ten times faster than GPU-based hyperscale cloud inference services. This revolutionary speed is transforming the user experience of AI applications, facilitating real-time iteration and boosting intelligence through enhanced computational capabilities.
About The Role
We invite you to join Cerebras as a Performance & Reliability Engineer within our dynamic Co-Design and Next Generation Team. Our groundbreaking CS-3 system has established benchmarks for high-performance ML training and inference solutions, utilizing a chip the size of a dinner plate with 44GB of on-chip memory that exceeds traditional hardware capabilities. In this role, you will focus on characterizing and optimizing the performance and reliability of state-of-the-art AI models operating on Cerebras' revolutionary hardware.
Responsibilities
Characterize and enhance the performance and reliability of advanced ML hardware/software systems, focusing on minimizing power and thermal fluctuations.
Analyze ML workloads, software kernels, and hardware architecture for their power and performance impacts, synthesizing high-level insights across these layers.
Develop innovative software solutions to enhance system performance and efficiency.

Feb 17, 2026

Engineering Manager - Inference ML Runtime

Cerebras Systems

Full-time|On-site|Sunnyvale CA or Toronto Canada

Join Cerebras Systems as an Engineering Manager specializing in Inference ML Runtime, where you will lead a dedicated team in developing groundbreaking machine learning solutions. Your expertise will guide the design and implementation of our inference runtime, ensuring efficiency and performance at scale.
As a pivotal leader in our innovative environment, you will collaborate with cross-functional teams, driving the development of state-of-the-art algorithms and systems that push the boundaries of artificial intelligence.

Mar 24, 2026

Engineering Manager, Kernel Reliability

Cerebras Systems

Full-time|On-site|Sunnyvale CA or Toronto Canada

Cerebras Systems is at the forefront of AI technology, having developed the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the AI computing power equivalent to dozens of GPUs on a single chip, simplifying programming to a single device. This revolutionary design enables Cerebras to provide unmatched training and inference speeds, empowering machine learning practitioners to seamlessly execute large-scale ML applications without the complexities of managing multiple GPUs or TPUs.
Our clientele includes elite model labs, global corporations, and pioneering AI-native startups. Notably, OpenAI recently entered into a multi-year partnership with Cerebras to deploy 750 megawatts of scale, significantly enhancing key workloads with ultra high-speed inference.
Thanks to our groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution globally, achieving speeds over 10 times faster than GPU-based hyperscale cloud inference services. This substantial speed boost is transforming user experiences in AI applications by enabling real-time iterations and enhancing intelligence through additional agentic computation.
The Role
We are seeking a highly technical and hands-on Engineering Manager to lead our on-field Kernel Reliability team. You will guide a high-performing team in addressing a critical challenge: enhancing the reliability of our advanced compute clusters along with the associated inference, training, and internal production services. In this influential role, you will define the technical vision while remaining closely engaged with the code, crafting scalable solutions for our rapidly expanding system production and software service offerings. If you possess proven expertise in software or hardware reliability, diagnostic tool development, or failure analysis and debugging, we invite you to connect with us.
Responsibilities
Provide hands-on technical leadership, owning the technical vision and roadmap for kernel-centric reliability concerning both internal and customer-facing systems.

Feb 17, 2026

Software Engineer - Kernel Reliability

Cerebras Systems

Full-time|On-site|Sunnyvale CA or Toronto Canada

Cerebras Systems is revolutionizing the AI landscape with the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the computational power of multiple GPUs on a single chip, simplifying programming and enabling unparalleled training and inference speeds. This technology allows our users to run extensive machine learning applications seamlessly, eliminating the complexities associated with managing numerous GPUs or TPUs.
Our clientele includes leading model labs, global corporations, and pioneering AI startups. Recently, OpenAI announced a multi-year collaboration with Cerebras, aiming to deploy 750 megawatts of power, significantly enhancing their workloads with ultra-fast inference capabilities.
With our groundbreaking wafer-scale architecture, Cerebras Inference provides the fastest Generative AI inference solution globally, outperforming GPU-based hyperscale cloud services by over tenfold. This remarkable speed enhancement is transforming user experiences in AI applications, facilitating real-time iterations and amplifying intelligence through advanced computational capabilities.
About The Role
We are in search of a highly technical and hands-on Software Engineer to join our Kernel Reliability team. In this pivotal role, you will address the crucial task of enhancing the reliability of our advanced compute clusters, along with the inference, training, and internal production services. You will work closely with the code to develop solutions that scale alongside our rapidly evolving production systems and software services. If you possess strong foundations in systems, debugging, and failure analysis and have a passion for creating tools and solving complex reliability challenges, we would love to connect with you. New graduates are encouraged to apply.

Mar 5, 2026

AI/ML Research Scientist in Advanced Technology

Cerebras Systems

Full-time|On-site|Sunnyvale, CA; Toronto, Ontario, Canada; Vancouver, British Columbia, Canada

Join Cerebras Systems as an AI/ML Research Scientist and be part of a pioneering team at the forefront of advanced technology. In this role, you will leverage your expertise in artificial intelligence and machine learning to develop innovative solutions that will revolutionize the field. Collaborate with top-tier researchers and engineers to push the boundaries of what's possible.

Apr 7, 2026

Principal Engineer, AI Inference Reliability

Cerebras Systems

Full-time|Remote|Remote Office; Sunnyvale CA or Toronto Canada

Cerebras Systems is at the forefront of AI innovation, manufacturing the largest AI chip in the world, which is 56 times bigger than conventional GPUs. Our cutting-edge wafer-scale architecture provides the computational power equivalent to dozens of GPUs on a single chip, simplifying programming to the level of a single device. This pioneering approach enables us to offer unmatched training and inference speeds, allowing machine learning practitioners to smoothly execute large-scale ML applications without the complexity of managing numerous GPUs or TPUs. Our clientele includes leading model laboratories, major global corporations, and innovative AI-native startups. Notably, OpenAI has recently partnered with Cerebras to leverage 750 megawatts of scale, revolutionizing critical workloads with ultra-high-speed inference. Our advanced wafer-scale architecture makes Cerebras Inference the fastest Generative AI inference solution available, outperforming GPU-based hyperscale cloud inference services by over tenfold. This remarkable speed enhancement is reshaping the user experience of AI applications, enabling real-time iterations and enhanced intelligence through additional agentic computation.
In late 2024, we launched Cerebras Inference, setting a new standard for Generative AI inference speed. Since its launch, we have rapidly scaled our services to meet the rising demand from AI labs, enterprises, and a vibrant developer community.
In October 2025, we celebrated our Series G funding round, successfully raising $1.1 billion USD to accelerate the growth of our product offerings and services to satisfy global AI demand.
About the Team
The Cerebras Inference team is dedicated to delivering the most efficient, secure, and reliable enterprise-grade AI service. We design and manage expansive distributed systems that facilitate AI inference with unparalleled speed and efficiency. Join us in scaling our inference capabilities to new heights!

Feb 17, 2026

Fleet Reliability Engineer

Applied Intuition

Full-time|On-site|Sunnyvale, California, United States

As a Fleet Reliability Engineer at Applied Intuition, you will be at the forefront of ensuring the reliability and performance of our advanced fleet systems. Your expertise will play a crucial role in the development and deployment of our cutting-edge technology, optimizing fleet operations to guarantee safety and efficiency.

Mar 25, 2026

Staff ML Performance Engineer - Training Efficiency

Wayve Technologies

Full-time|On-site|Sunnyvale

Feb 27, 2026

Software Engineer - Specializing in Axion Data Engine and ML Ops

Applied Intuition

Full-time|On-site|Sunnyvale, California, United States

Applied Intuition is hiring a Software Engineer in Sunnyvale, California, with a focus on the Axion Data Engine and machine learning operations. This role centers on building and supporting the systems that power advanced data processing and ML workflows.
Key Responsibilities
Collaborate with cross-functional teams to design, build, and deploy data solutions for the Axion Data Engine.
Maintain and enhance machine learning operations, aiming to improve system reliability and performance.
Develop data processing capabilities that meet high standards for efficiency and accuracy.
Team and Impact
This position works closely with engineers and specialists from multiple areas. The work directly supports the quality and precision needed in industries that rely on advanced data and machine learning tools.

Apr 28, 2026

Engineering Manager, AI at Coram AI | Sunnyvale

Coram AI

Full-time|On-site|Sunnyvale

At Coram AI, we are transforming the landscape of video security in the digital age. Our innovative cloud-native platform leverages advanced computer vision and artificial intelligence to empower businesses with enhanced safety, smarter decision-making capabilities, and accelerated operational efficiency through features like real-time alerts, effortless clip sharing, and comprehensive multi-site visibility.
Join our dynamic and agile team that prioritizes clarity, craftsmanship, and impactful contributions. Every team member plays a crucial role, delivering significant results and shaping the future of AI-driven security solutions.
We are seeking an experienced Engineering Manager to lead our talented AI team at Coram. This team, although small, is exceptionally skilled and operates at the forefront of real-time systems, computer vision, and generative AI.
In this hands-on leadership role, you will blend technical guidance, architectural oversight, recruitment, and team management. The ideal candidate will possess up-to-date knowledge of modern deep learning and generative AI, along with substantial experience in building and leading high-performance teams.

Mar 3, 2026

Research Engineer - Robotics and AI Innovations

Applied Intuition, Inc.

Full-time|$126K/yr - $423K/yr|On-site|Sunnyvale, California, United States

Discover Applied Intuition
Founded in 2017 and currently valued at $15 billion, Applied Intuition, Inc. is at the forefront of advancing physical AI technologies. Our mission is to establish the digital infrastructure that will integrate intelligence into moving machines worldwide. We cater to diverse sectors including automotive, defense, trucking, construction, mining, and agriculture, focusing on three key areas: tools and infrastructure, operating systems, and autonomy. Our solutions are trusted by leading global automakers and the United States military. Headquartered in Sunnyvale, California, we also have offices in major cities worldwide including Washington, D.C., San Diego, Ft. Walton Beach, Ann Arbor, London, Stuttgart, Munich, Stockholm, Bangalore, Seoul, and Tokyo. Explore more at applied.co.
As an in-office company, we expect our employees to work from the Applied Intuition office five days a week. We also value flexibility, allowing for responsible management of schedules, which may include occasional remote work or adjusted hours for family commitments.
Role Overview and Team Dynamics
We are excited to invite multiple passionate Research Engineers to join our Research Group at Applied Intuition. Our mission is to pioneer groundbreaking technology that will drive the next generation of physical AI, focusing on challenging applications such as end-to-end autonomous driving and robotic generalists. The team comprises leading experts recognized for their academic and industry contributions, including several Best Paper awards from premier conferences like CVPR and ICRA. Learn more about our research initiatives at appliedintuition.com/research.
With access to industry-leading tools and infrastructure, our researchers can utilize millions of miles of data from extensive fleets, deploying innovative methods across various autonomous and robotic systems, including self-driving vehicles.

Feb 17, 2026

AI Research Engineer in Robotics

Coram AI

Full-time|On-site|Sunnyvale

At Coram AI, we are revolutionizing video security for the contemporary landscape. Our innovative cloud-native platform leverages advanced computer vision and artificial intelligence to empower businesses to enhance safety, facilitate informed decision-making, and accelerate operations. This includes features such as real-time alerts, effortless clip sharing, and comprehensive visibility across multiple locations.
Joining our agile and dynamic team means being part of a collaborative environment that prioritizes clarity, excellence, and impactful contributions. Every team member has a voice, delivers significant work, and plays a crucial role in shaping how AI can foster a safer and more interconnected world.
We are seeking engineers who thrive at the nexus of robotics, real-time systems, and deep learning. This position focuses on implementing high-performance vision and multimodal models on robotic platforms, where factors such as latency, reliability, and hardware limitations are paramount.

Mar 11, 2026

Software Engineer - Robotics at Coram AI | Sunnyvale

Coram AI

Full-time|On-site|Sunnyvale

At Coram AI, we are revolutionizing video security for the contemporary landscape. Our innovative cloud-native platform leverages computer vision and artificial intelligence to empower businesses to enhance safety, make informed decisions, and accelerate operations, featuring real-time alerts, effortless clip sharing, and multi-site visibility.
Joining our dynamic and agile team means becoming part of a culture that prioritizes clarity, quality, and impactful contributions. Every team member has a voice, delivers significant work, and plays a crucial role in shaping how AI can foster a safer and more interconnected world.
We seek an exceptionally skilled software engineer to develop high-performance, real-time software that operates on edge devices while adhering to stringent latency and memory limitations. This position emphasizes deterministic execution, distributed system architecture, and low-level performance enhancements. You will focus on constructing the infrastructure and runtime systems that enable real-time robotics applications.

Mar 11, 2026

R&D Engineer - Advanced Technology in AI/ML and HPC

Cerebras Systems

Full-time|On-site|Sunnyvale, CA; Toronto, Ontario, Canada; Vancouver, British Columbia, Canada

Join Cerebras Systems as an R&D Engineer specializing in Advanced Technology, focusing on Artificial Intelligence (AI) and Machine Learning (ML) within High-Performance Computing (HPC). In this pivotal role, you will contribute to cutting-edge projects that drive innovation in AI and ML technologies.
As part of our dynamic team, you will collaborate with top-tier engineers and researchers to develop revolutionary solutions that enhance computing capabilities. Your expertise will be instrumental in shaping the future of AI and HPC technologies.

Apr 7, 2026

Site Selection Program Manager

CoreWeave

Full-time|$143K/yr - $191K/yr|On-site|Livingston, NJ / New York, NY / Sunnyvale, CA / San Francisco, CA / Bellevue, WA / Richmond, VA / Dallas, TX

Locations: Livingston, NJ; New York, NY; Sunnyvale, CA; San Francisco, CA; Bellevue, WA; Richmond, VA; Dallas, TX
About CoreWeave
CoreWeave is a cloud platform built for AI innovation. Since 2017, the company has provided technology, tools, and expert teams to help organizations develop and scale AI solutions. CoreWeave supports AI labs, startups, and global companies with high-performance infrastructure and technical expertise. The company became publicly traded on Nasdaq (CRWV) in March 2025. Learn more at www.coreweave.com.
Role Overview
The Site Selection Program Manager joins the global Site Selection team to manage the full lifecycle of data center site selection projects. This role reports to Site Selection leadership and plays a key part in ensuring data center opportunities move efficiently from market assessment to lease execution and delivery readiness.
Key Responsibilities
Oversee all stages of site selection, from initial screening through lease execution and delivery preparation.
Maintain and update program trackers, dashboards, and reporting tools to monitor deal status, milestones, risks, and dependencies.
Coordinate activities across Site Selection, Engineering, Capacity Planning, Finance, Legal, Operations, and external partners.
Ensure Salesforce and internal systems reliably reflect current program activities and metrics.
About the Work
This position helps keep CoreWeave’s site selection pipeline organized and efficient. The Program Manager brings structure and clarity to projects involving many teams, helping the business move quickly while maintaining accuracy and reducing risk. The role requires careful tracking of multiple transactions and clear communication across departments and with external partners.

Apr 16, 2026

Senior Staff AI Engineer | Technical Lead in AI Modeling

LinkedIn Corporation

Full-time|On-site|Sunnyvale

Join our dynamic team as a Senior Staff AI Engineer, where you will lead cutting-edge AI modeling initiatives that drive innovation and excellence. In this pivotal role, you will collaborate with cross-functional teams to architect, design, and implement state-of-the-art AI solutions. Your expertise will guide the development of robust algorithms and models that enhance user experiences and optimize performance.

Mar 25, 2026
