About the job
As a Fleet Reliability Engineer at Applied Intuition, you will be at the forefront of ensuring the reliability and performance of our advanced fleet systems. Your expertise will play a crucial role in the development and deployment of our cutting-edge technology, optimizing fleet operations to guarantee safety and efficiency.
Full-time|On-site|Sunnyvale, California, United States
Cerebras Systems is at the forefront of AI technology, developing the world’s largest AI chip that is 56 times larger than conventional GPUs. Our innovative wafer-scale architecture delivers the computational power of dozens of GPUs within a single chip, simplifying programming and enhancing performance. This unique capability enables Cerebras to provide unparalleled training and inference speeds, allowing machine learning practitioners to execute large-scale ML applications seamlessly without the complexities of managing extensive GPU or TPU infrastructures.
Cerebras serves a diverse clientele, including top-tier model labs, global enterprises, and pioneering AI-native startups. OpenAI has recently partnered with Cerebras to leverage 750 megawatts of power, significantly enhancing key workloads through ultra high-speed inference.
Our cutting-edge wafer-scale architecture has made Cerebras Inference the fastest Generative AI inference solution globally, achieving speeds over ten times faster than GPU-based hyperscale cloud inference services. This revolutionary speed is transforming the user experience of AI applications, facilitating real-time iteration and boosting intelligence through enhanced computational capabilities.
About The Role
We invite you to join Cerebras as a Performance & Reliability Engineer within our dynamic Co-Design and Next Generation Team. Our groundbreaking CS-3 system has established benchmarks for high-performance ML training and inference solutions, utilizing a chip the size of a dinner plate with 44GB of on-chip memory that exceeds traditional hardware capabilities. In this role, you will focus on characterizing and optimizing the performance and reliability of state-of-the-art AI models operating on Cerebras' revolutionary hardware.
Responsibilities
Characterize and enhance the performance and reliability of advanced ML hardware/software systems, focusing on minimizing power and thermal fluctuations.
Analyze ML workloads, software kernels, and hardware architecture for their power and performance impacts, synthesizing high-level insights across these layers.
Develop innovative software solutions to enhance system performance and efficiency.
Illumio’s Senior Site Reliability Engineer role is based at the company’s Sunnyvale, California headquarters. This is an on-site position, requiring presence in the office five days a week.
Role overview
This position focuses on building and maintaining reliable, scalable infrastructure for Illumio’s applications and services, with an emphasis on Azure cloud solutions. The Senior SRE supports both SaaS and on-premises offerings, working closely with engineering teams to ensure operational resilience and security across hybrid environments.
What you will do
Design, deploy, and maintain highly available infrastructure on Azure for Illumio’s products.
Automate provisioning and configuration management using Infrastructure as Code tools such as Terraform or ARM templates.
Develop and manage CI/CD pipelines to improve software delivery and deployment processes.
Monitor system and application health using Azure monitoring and logging tools, and optimize for performance and availability.
Lead incident response, perform root cause analysis, and document findings to drive continuous improvement.
Collaborate with development teams to design scalable, reliable architectures and provide guidance on cloud-native best practices.
Engineering at Illumio
The engineering team values autonomy, ownership, and collaboration. Work centers on advancing cybersecurity with scalable SaaS services and solutions for on-premises environments. The team emphasizes disciplined engineering, quality, and a supportive culture.
Cerebras Systems is at the forefront of AI technology, having developed the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the AI computing power equivalent to dozens of GPUs on a single chip, simplifying programming to a single device. This revolutionary design enables Cerebras to provide unmatched training and inference speeds, empowering machine learning practitioners to seamlessly execute large-scale ML applications without the complexities of managing multiple GPUs or TPUs.
Our clientele includes elite model labs, global corporations, and pioneering AI-native startups. Notably, OpenAI recently entered into a multi-year partnership with Cerebras to deploy 750 megawatts of scale, significantly enhancing key workloads with ultra high-speed inference.
Thanks to our groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution globally, achieving speeds over 10 times faster than GPU-based hyperscale cloud inference services. This substantial speed boost is transforming user experiences in AI applications by enabling real-time iterations and enhancing intelligence through additional agentic computation.
The Role
We are seeking a highly technical and hands-on Engineering Manager to lead our on-field Kernel Reliability team. You will guide a high-performing team in addressing a critical challenge: enhancing the reliability of our advanced compute clusters along with the associated inference, training, and internal production services. In this influential role, you will define the technical vision while remaining closely engaged with the code, crafting scalable solutions for our rapidly expanding system production and software service offerings. If you possess proven expertise in software or hardware reliability, diagnostic tool development, or failure analysis and debugging, we invite you to connect with us.
Responsibilities
Provide hands-on technical leadership, owning the technical vision and roadmap for kernel-centric reliability concerning both internal and customer-facing systems.
Cerebras Systems is revolutionizing the AI landscape with the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the computational power of multiple GPUs on a single chip, simplifying programming and enabling unparalleled training and inference speeds. This technology allows our users to run extensive machine learning applications seamlessly, eliminating the complexities associated with managing numerous GPUs or TPUs.
Our clientele includes leading model labs, global corporations, and pioneering AI startups. Recently, OpenAI announced a multi-year collaboration with Cerebras, aiming to deploy 750 megawatts of power, significantly enhancing their workloads with ultra-fast inference capabilities.
With our groundbreaking wafer-scale architecture, Cerebras Inference provides the fastest Generative AI inference solution globally, outperforming GPU-based hyperscale cloud services by over tenfold. This remarkable speed enhancement is transforming user experiences in AI applications, facilitating real-time iterations and amplifying intelligence through advanced computational capabilities.
About The Role
We are in search of a highly technical and hands-on Software Engineer to join our Kernel Reliability team. In this pivotal role, you will address the crucial task of enhancing the reliability of our advanced compute clusters, along with the inference, training, and internal production services. You will work closely with the code to develop solutions that scale alongside our rapidly evolving production systems and software services. If you possess strong foundations in systems, debugging, and failure analysis and have a passion for creating tools and solving complex reliability challenges, we would love to connect with you. New graduates are encouraged to apply.
Full-time|$193.8K/yr - $285K/yr|On-site|Sunnyvale, CA; San Francisco, CA; Seattle, WA
Join DoorDash as an Engineering Manager, where you will spearhead two pivotal teams: Fleets and Delivery Experience. The Fleets team is dedicated to developing a top-tier Fleets product that drives growth across diverse sectors, including prescription services, automotive, healthcare, and catering. Meanwhile, the Delivery Experience team is committed to creating exceptional experiences for Dashers and Drive Customers through seamless order tracking and pickup/drop-off processes. As a leader, you will enhance delivery quality metrics, scale our Fleets product, and venture into new verticals while fostering innovation and driving engineering excellence.
Join Us on Our Mission!
At Illumio, we are pioneering the way organizations combat ransomware and data breaches. Our innovative breach containment platform, driven by the Illumio AI Security Graph, enables businesses to effectively identify and mitigate threats across hybrid multi-cloud environments, preventing attacks from escalating into severe crises.
As a recognized leader in the Forrester Wave™ for Microsegmentation, Illumio's solutions empower organizations to adopt Zero Trust models, enhancing cyber resilience for the critical infrastructure that sustains the global economy.
On-Site Work:
This position requires 5 days a week on-site presence at our Sunnyvale, CA headquarters.
Our Vision:
Our Engineering team thrives on a culture of visionary leadership, autonomy, and ownership, fostering an innovative environment that propels us through the dynamic landscape of cybersecurity.
By joining our team, you will contribute to the forefront of Zero Trust Segmentation, utilizing an advanced technology stack that encompasses diverse operating systems, distributed applications, and cutting-edge UI/visualization tools.
Together, we are shaping the future of cybersecurity, committed to developing world-class products guided by diverse perspectives and a shared dedication to innovation amidst unprecedented cyber threats.
Your Role:
As a Site Reliability Engineer II, you will oversee our multi-cloud infrastructure on platforms such as Azure, AWS, and/or GCP. Your responsibilities will include designing new cloud services and applications, collaborating closely with Engineering, SRE/OPS, and Security teams to transition these projects from development to production.
Daily tasks will involve enhancing the reliability and scalability of Illumio's SaaS products while driving continuous improvement initiatives.
We seek candidates with a strong passion for cloud technology, automation, and collaboration, as well as a solid understanding of the Azure cloud platform and related DevOps practices.
Join Us in Securing the Future!
At Illumio, we are pioneers in ransomware and breach containment, transforming how organizations defend against cyberattacks and fortifying operational resilience. Our innovative Illumio AI Security Graph powers a breach containment platform that swiftly identifies and neutralizes threats across hybrid multi-cloud environments, preventing minor issues from escalating into catastrophic events.
As a recognized leader in the Forrester Wave™ for Microsegmentation, we enable Zero Trust, bolstering the cyber resilience of the infrastructures, systems, and organizations that keep the world functioning smoothly.
Location: This role requires on-site presence in our Sunnyvale, CA headquarters five days a week.
Vision of Our Team:
Our Engineering team flourishes within a culture that champions visionary leadership, autonomy, and ownership. This dynamic synergy propels us forward in the constantly evolving realm of cybersecurity.
As a member of our team, you will be at the forefront of Zero Trust Segmentation, working with an advanced technology stack that encompasses operating systems, distributed applications, and immersive UI/visualization tools.
We're not just shaping the future of cybersecurity; we’re committed to developing world-class products led by diverse perspectives, backgrounds, and an unwavering commitment to innovation amidst unprecedented cybersecurity challenges.
Your Role:
As a Site Reliability Engineer II, you will oversee and optimize our multi-cloud infrastructure across Azure, AWS, and/or GCP. You will have the opportunity to design new services and applications in the cloud, guiding them from development to production while collaborating closely with Engineering, SRE/Operations, and Security teams.
Your daily responsibilities will include enhancing the reliability and scalability of Illumio's SaaS offerings and spearheading continuous improvement initiatives.
The ideal candidate is driven by a passion for cloud technology, automation, and collaboration, coupled with a solid foundation in Azure cloud platforms and relevant DevOps practices.
Design, deploy, and maintain robust cloud infrastructure solutions on Azure, AWS, and/or GCP to support our applications and services.
Implement Infrastructure as Code (IaC) principles using tools such as Terraform, ARM templates, or CloudFormation to automate provisioning and configuration management.
Develop and maintain CI/CD pipelines for automated software delivery and deployment, utilizing tools like Azure DevOps, AWS CodePipeline, or Jenkins.
Monitor system performance and availability, ensuring optimal operational efficiency.
Join our dynamic team as a Senior Site Reliability Engineer focused on AI/ML solutions. In this role, you will leverage your expertise to enhance the reliability, scalability, and performance of our cutting-edge AI-driven products. You will work collaboratively with cross-functional teams to design, implement, and maintain robust systems that support our mission to revolutionize surgical technology.
At Wayve, we are dedicated to fostering a diverse and equitable workplace that respects and includes everyone, regardless of their unique skills, perspectives, or backgrounds, including but not limited to sex, race, religion, ethnic origin, disability, age, citizenship, marital status, sexual orientation, gender identity, or veteran status.
About Us
Founded in 2017, Wayve is at the forefront of developing groundbreaking Embodied AI technology. Our cutting-edge AI software and foundational models empower vehicles to perceive, comprehend, and navigate complex environments, significantly enhancing the safety and utility of automated driving systems.
Our mission is to create autonomous solutions that drive progress globally. Our innovative, mapless, and hardware-agnostic AI products are tailored for automakers, facilitating the transition from assisted to fully automated driving. In our dynamic environment, we thrive on tackling significant challenges, embracing uncertainty to deliver pioneering solutions. We set ambitious goals and remain humble in our journey toward excellence, continually learning and adapting as we forge a smarter and safer future.
Your contributions at Wayve are invaluable. We champion diversity, welcome new perspectives, and cultivate an inclusive workplace where we support each other in making a substantial impact.
Join Wayve and shape the experience that will define your career!
The Role
We are seeking a Director of Fleet Operations Execution to spearhead the global delivery of on-road testing across all programs and regions. This pivotal role is central to our operations, translating technical testing requirements into practical implementations, managing regional test teams, and collaborating with Engineering and Product teams to ensure our autonomous and assisted driving systems are rigorously tested, validated, and poised for scalability. You will lead the execution with urgency and precision while influencing the methodologies for testing advanced technologies in real-world scenarios.
We are seeking a talented and motivated Reliability Manager to join our team at dstaff, focusing on Photonic Integrated Components. In this role, you will lead initiatives to enhance the reliability of our products and ensure that they meet the highest quality standards. You will collaborate with cross-functional teams to conduct reliability testing, analyze data, and implement improvements.
Full-time|Remote|Remote Office; Sunnyvale CA or Toronto Canada
Cerebras Systems is at the forefront of AI innovation, manufacturing the largest AI chip in the world, which is 56 times bigger than conventional GPUs. Our cutting-edge wafer-scale architecture provides the computational power equivalent to dozens of GPUs on a single chip, simplifying programming to the level of a single device. This pioneering approach enables us to offer unmatched training and inference speeds, allowing machine learning practitioners to smoothly execute large-scale ML applications without the complexity of managing numerous GPUs or TPUs.
Our clientele includes leading model laboratories, major global corporations, and innovative AI-native startups. Notably, OpenAI has recently partnered with Cerebras to leverage 750 megawatts of scale, revolutionizing critical workloads with ultra-high-speed inference.
Our advanced wafer-scale architecture makes Cerebras Inference the fastest Generative AI inference solution available, outperforming GPU-based hyperscale cloud inference services by over tenfold. This remarkable speed enhancement is reshaping the user experience of AI applications, enabling real-time iterations and enhanced intelligence through additional agentic computation.
In late 2024, we launched Cerebras Inference, setting a new standard for Generative AI inference speed. Since its launch, we have rapidly scaled our services to meet the rising demand from AI labs, enterprises, and a vibrant developer community.
In October 2025, we celebrated our Series G funding round, successfully raising $1.1 billion USD to accelerate the growth of our product offerings and services to satisfy global AI demand.
About the Team
The Cerebras Inference team is dedicated to delivering the most efficient, secure, and reliable enterprise-grade AI service. We design and manage expansive distributed systems that facilitate AI inference with unparalleled speed and efficiency. Join us in scaling our inference capabilities to new heights!
Wayve values a workplace where every individual’s skills and perspectives are respected. The company is committed to diversity, equity, and inclusion, and welcomes candidates from all backgrounds.
About Wayve
Founded in 2017, Wayve develops Embodied AI technology for vehicles. The team builds AI software and foundational models that help vehicles understand and navigate complex environments, supporting safer and more practical automated driving systems. Wayve partners with automotive manufacturers, offering intelligent, mapless, and hardware-agnostic AI solutions. The company’s goal is to help the industry move from assisted to fully automated driving. The team thrives on solving complex problems, learning quickly, and adapting to new challenges. Wayve supports an inclusive culture where different perspectives are valued. Team members are encouraged to learn from each other and make a real impact.
Role overview: Process Excellence Manager
Wayve is hiring a Process Excellence Manager for the Systems & Insights team in Sunnyvale. This position focuses on designing and scaling operational systems that support global Fleet Operations. The role suits a Lean Six Sigma Black Belt (or equivalent) with a track record of delivering measurable improvements in operational performance, especially in fast-changing, high-growth settings.
What you will do
Lead major process improvement projects across Wayve’s test network, including sites in the UK and other international locations.
Work with teams across the company to reduce waste, improve process stability, and support reliable, high-quality delivery at scale.
Play a key part in Wayve’s shift from research and development to a delivery-grade autonomous vehicle (AV) testing organization.
Who succeeds in this role
Lean Six Sigma Black Belt certification (or similar experience).
Experience driving operational improvements in fast-growing organizations.
Strong ability to collaborate across functions and geographies.
Join SpaceX as a Senior RFIC Design Engineer in our Silicon Engineering team. In this pivotal role, you will be responsible for designing innovative RF integrated circuits that drive our next-generation space technologies. Collaborate with a team of experts to push the boundaries of technology while ensuring the highest standards of quality and performance.
Cerebras Systems is revolutionizing artificial intelligence with the world's largest AI chip, 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers unparalleled AI compute power, equating to dozens of GPUs on a single chip, all while maintaining the programming simplicity of a single device. This unique solution enables Cerebras to achieve unmatched training and inference speeds, allowing machine learning practitioners to seamlessly execute large-scale ML applications without the complexities of managing multiple GPUs or TPUs.
We proudly serve a diverse clientele that includes leading model labs, multinational corporations, and pioneering AI-native startups. Notably, OpenAI has recently entered into a multi-year partnership with Cerebras, harnessing 750 megawatts of scale to transform critical workloads with ultra-high-speed inference.
Our cutting-edge wafer-scale architecture powers the fastest Generative AI inference solution globally, boasting speeds over ten times faster than GPU-based hyperscale cloud inference services. This remarkable acceleration is reshaping the user experience of AI applications, facilitating real-time iterations and enhancing intelligence through advanced agentic computation.
About The Role
As a Kernel Engineer, you will be pivotal in crafting high-performance software solutions at the convergence of hardware and software. Your primary responsibility will be to implement, optimize, and scale deep learning operations that fully utilize our custom, massively parallel processor architecture.
You will collaborate with a world-class team focused on designing, tuning for performance, and validating foundational ML and HPC kernels. This role includes building a comprehensive library of parallel and distributed algorithms aimed at maximizing compute utilization and enhancing training efficiency for state-of-the-art AI models. Your contributions will be crucial in unlocking the full capabilities of our hardware and accelerating the advancements in AI.
Full-time|$170K/yr - $235K/yr|On-site|Sunnyvale, CA
Founded on the vision of making humanity a multi-planetary species, SpaceX is pioneering the technologies to enable human life on Mars. Our mission extends beyond the stars; we are also transforming global connectivity through Starlink, the most advanced broadband internet system in the world.
SENIOR RTL DESIGN ENGINEER (SILICON ENGINEERING)
At SpaceX, we leverage our extensive experience in rocket and spacecraft development to successfully deploy Starlink, the world's largest satellite constellation. This initiative is providing high-speed, reliable internet to millions across the globe. We design, build, test, and operate all components of the system, from thousands of satellites to consumer receivers that enable users to connect with ease, and the software that integrates it all. As we continue to expand Starlink's global reach, we seek exceptional engineers to enhance its utility for communities and businesses worldwide.
We are in search of a proactive and intellectually curious Senior RTL Design Engineer to collaborate with our elite, cross-disciplinary teams, including systems, firmware, architecture, design, validation, product engineering, and ASIC implementation. In this role, you will be at the forefront of developing next-generation FPGAs and ASICs that will be deployed in both space and terrestrial infrastructures. Your contributions will facilitate connectivity in areas that have previously lacked affordable and reliable access, thereby enhancing the capabilities of the Starlink network.
Full-time|$130K/yr - $180K/yr|On-site|Sunnyvale, CA
At SpaceX, we believe in a future where humanity explores the stars, and we are dedicated to developing the technologies to make that vision a reality. Our ultimate goal is to enable human life on Mars.
RFIC DESIGN ENGINEER (SILICON ENGINEERING)
As part of SpaceX, you will leverage our extensive experience in building rockets and spacecraft to support the deployment of Starlink, the world's most advanced broadband internet system. Starlink features the largest satellite constellation globally, providing fast and reliable internet to millions of users. We design, build, test, and operate all system components, including thousands of satellites and user-friendly consumer receivers that allow connectivity within minutes. Join us in maximizing Starlink's global impact for communities and businesses.
We are seeking a highly motivated and proactive RF/Analog IC engineer to collaborate with world-class cross-disciplinary teams, including systems, firmware, architecture, design, validation, and product engineering. In this position, you will develop cutting-edge next-generation RFICs for deployment in both space and terrestrial infrastructures. These innovations will enable connectivity in previously unreachable, unaffordable, or unreliable areas. Help us deliver next-generation solutions to enhance the Starlink network's performance and capabilities.
Full-time|On-site|Sunnyvale, California, United States
Join Applied Intuition as a Software Engineer specializing in AI Engineering, where you'll have the opportunity to work on cutting-edge technology and contribute to innovative projects that shape the future of artificial intelligence. As part of our dynamic team, you will collaborate with talented professionals to design, develop, and implement AI solutions that address real-world challenges.
Full-time|On-site|Sunnyvale, California, United States
LotusWorks is a premier Engineering Services provider, excelling in the management of Commissioning, Construction Services, Calibration, Operations & Maintenance across global manufacturing facilities. With operations spanning EMEA and North America, we collaborate with industry leaders in the Semiconductor, Pharmaceutical & Biologics, Medical Device, and Data Centre sectors. Our Engineering and Technical professionals are at the forefront of cutting-edge technologies and innovations. At LotusWorks, we are dedicated to fostering a diverse and inclusive workplace that is central to our people-first philosophy.
We are on the lookout for a skilled Mechanical Commissioning Engineer to spearhead and execute the commissioning processes for HVAC mechanical systems across intricate facility projects. The ideal candidate will ensure that all mechanical systems are installed, tested, and function in alignment with design specifications and ASHRAE commissioning protocols. This position necessitates robust technical expertise, exceptional documentation capabilities, and the ability to liaise effectively with various project stakeholders.
Position Overview
Join our dynamic manufacturing engineering team at Intuitive Surgical, where your mechanical engineering design and analysis expertise will play a crucial role in creating cutting-edge equipment used for testing, measuring, and ensuring the quality of our innovative instruments for minimally invasive robotic surgery. Your project management skills will be invaluable as you collaborate with our software engineering team to enhance electro-mechanical components, assemblies, process documentation, tooling, and testing methods, driving improvements in efficacy, reliability, manufacturability, and cost.
Key Responsibilities
Develop, maintain, and optimize high-volume manufacturing assembly lines, refining BOMs, workflow processes, and detailed work instructions.
Design, document, procure, qualify, implement, and enhance fixtures, tools, and equipment that include both mechanical and electronic components, as well as control algorithms and programming.
Analyze equipment performance and trends, addressing root causes of issues to guarantee equipment accuracy, reliability, repeatability, and reproducibility.
Modify equipment drawings and implement updates for large-scale production lines across various plants.
Conduct risk analysis (PFMEA) on the instrument-manufacturing line, validating critical tests utilized in manufacturing.
Oversee the validation and qualification of manufacturing processes and equipment, employing standard qualification methodologies (IQ/OQ/DQ/PQ).
Support production lines by resolving day-to-day engineering challenges related to core instruments, including emergency repairs.
Identify and execute continuous improvement initiatives focused on first pass yield, cycle time reduction, product reliability, capacity enhancement, and cost reduction.
Provide DFx (Design for Manufacturing and Assembly) insights to improve the manufacturability of core instruments.
Ensure compliance with medical device quality systems, including the closure of corrective actions and implementation of ECOs.
Conduct failure analysis for production discrepancies and field returns, offering technical support as necessary.
Willingness to travel periodically to suppliers or our Intuitive Mexicali plant.
Exhibit leadership through knowledge sharing, mentoring, and training others.
Proactively seek and implement improved technologies for production processes.
Enhance documentation related to equipment installation, repair, and maintenance.
Full-time|On-site|Sunnyvale, California, United States
As a Fleet Reliability Engineer at Applied Intuition, you will be at the forefront of ensuring the reliability and performance of our advanced fleet systems. Your expertise will play a crucial role in the development and deployment of our cutting-edge technology, optimizing fleet operations to guarantee safety and efficiency.
Cerebras Systems is at the forefront of AI technology, developing the world’s largest AI chip that is 56 times larger than conventional GPUs. Our innovative wafer-scale architecture delivers the computational power of dozens of GPUs within a single chip, simplifying programming and enhancing performance. This unique capability enables Cerebras to provide unparalleled training and inference speeds, allowing machine learning practitioners to execute large-scale ML applications seamlessly without the complexities of managing extensive GPU or TPU infrastructures.Cerebras serves a diverse clientele, including top-tier model labs, global enterprises, and pioneering AI-native startups. OpenAI has recently partnered with Cerebras to leverage 750 megawatts of power, significantly enhancing key workloads through ultra high-speed inference.Our cutting-edge wafer-scale architecture has made Cerebras Inference the fastest Generative AI inference solution globally, achieving speeds over ten times faster than GPU-based hyperscale cloud inference services. This revolutionary speed is transforming the user experience of AI applications, facilitating real-time iteration and boosting intelligence through enhanced computational capabilities.About The RoleWe invite you to join Cerebras as a Performance & Reliability Engineer within our dynamic Co-Design and Next Generation Team. Our groundbreaking CS-3 system has established benchmarks for high-performance ML training and inference solutions, utilizing a chip the size of a dinner plate with 44GB of on-chip memory that exceeds traditional hardware capabilities. 
In this role, you will focus on characterizing and optimizing the performance and reliability of state-of-the-art AI models operating on Cerebras' revolutionary hardware. Responsibilities: Characterize and enhance the performance and reliability of advanced ML hardware/software systems, focusing on minimizing power and thermal fluctuations. Analyze ML workloads, software kernels, and hardware architecture for their power and performance impacts, synthesizing high-level insights across these layers. Develop innovative software solutions to enhance system performance and efficiency.
Illumio’s Senior Site Reliability Engineer role is based at the company’s Sunnyvale, California headquarters. This is an on-site position, requiring presence in the office five days a week. Role overview: This position focuses on building and maintaining reliable, scalable infrastructure for Illumio’s applications and services, with an emphasis on Azure cloud solutions. The Senior SRE supports both SaaS and on-premises offerings, working closely with engineering teams to ensure operational resilience and security across hybrid environments. What you will do: Design, deploy, and maintain highly available infrastructure on Azure for Illumio’s products. Automate provisioning and configuration management using Infrastructure as Code tools such as Terraform or ARM templates. Develop and manage CI/CD pipelines to improve software delivery and deployment processes. Monitor system and application health using Azure monitoring and logging tools, and optimize for performance and availability. Lead incident response, perform root cause analysis, and document findings to drive continuous improvement. Collaborate with development teams to design scalable, reliable architectures and provide guidance on cloud-native best practices. Engineering at Illumio: The engineering team values autonomy, ownership, and collaboration. Work centers on advancing cybersecurity with scalable SaaS services and solutions for on-premises environments. The team emphasizes disciplined engineering, quality, and a supportive culture.
Cerebras Systems is at the forefront of AI technology, having developed the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the AI computing power equivalent to dozens of GPUs on a single chip, simplifying programming to a single device. This revolutionary design enables Cerebras to provide unmatched training and inference speeds, empowering machine learning practitioners to seamlessly execute large-scale ML applications without the complexities of managing multiple GPUs or TPUs. Our clientele includes elite model labs, global corporations, and pioneering AI-native startups. Notably, OpenAI recently entered into a multi-year partnership with Cerebras to deploy 750 megawatts of scale, significantly enhancing key workloads with ultra high-speed inference. Thanks to our groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution globally, achieving speeds over 10 times faster than GPU-based hyperscale cloud inference services. This substantial speed boost is transforming user experiences in AI applications by enabling real-time iterations and enhancing intelligence through additional agentic computation. The Role: We are seeking a highly technical and hands-on Engineering Manager to lead our on-field Kernel Reliability team. You will guide a high-performing team in addressing a critical challenge: enhancing the reliability of our advanced compute clusters along with the associated inference, training, and internal production services. In this influential role, you will define the technical vision while remaining closely engaged with the code, crafting scalable solutions for our rapidly expanding system production and software service offerings.
If you possess proven expertise in software or hardware reliability, diagnostic tool development, or failure analysis and debugging, we invite you to connect with us. Responsibilities: Provide hands-on technical leadership, owning the technical vision and roadmap for kernel-centric reliability concerning both internal and customer-facing systems.
Cerebras Systems is revolutionizing the AI landscape with the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the computational power of multiple GPUs on a single chip, simplifying programming and enabling unparalleled training and inference speeds. This technology allows our users to run extensive machine learning applications seamlessly, eliminating the complexities associated with managing numerous GPUs or TPUs. Our clientele includes leading model labs, global corporations, and pioneering AI startups. Recently, OpenAI announced a multi-year collaboration with Cerebras, aiming to deploy 750 megawatts of power, significantly enhancing their workloads with ultra-fast inference capabilities. With our groundbreaking wafer-scale architecture, Cerebras Inference provides the fastest Generative AI inference solution globally, outperforming GPU-based hyperscale cloud services by over tenfold. This remarkable speed enhancement is transforming user experiences in AI applications, facilitating real-time iterations and amplifying intelligence through advanced computational capabilities. About The Role: We are in search of a highly technical and hands-on Software Engineer to join our Kernel Reliability team. In this pivotal role, you will address the crucial task of enhancing the reliability of our advanced compute clusters, along with the inference, training, and internal production services. You will work closely with the code to develop solutions that scale alongside our rapidly evolving production systems and software services. If you possess strong foundations in systems, debugging, and failure analysis and have a passion for creating tools and solving complex reliability challenges, we would love to connect with you. New graduates are encouraged to apply.
Full-time|$193.8K/yr - $285K/yr|On-site|Sunnyvale, CA; San Francisco, CA; Seattle, WA
Join DoorDash as an Engineering Manager, where you will spearhead two pivotal teams: Fleets and Delivery Experience. The Fleets team is dedicated to developing a top-tier Fleets product that drives growth across diverse sectors, including prescription services, automotive, healthcare, and catering. Meanwhile, the Delivery Experience team is committed to creating exceptional experiences for Dashers and Drive Customers through seamless order tracking and pickup/drop-off processes. As a leader, you will enhance delivery quality metrics, scale our Fleets product, and venture into new verticals while fostering innovation and driving engineering excellence.
Join Us on Our Mission! At Illumio, we are pioneering the way organizations combat ransomware and data breaches. Our innovative breach containment platform, driven by the Illumio AI Security Graph, enables businesses to effectively identify and mitigate threats across hybrid multi-cloud environments, preventing attacks from escalating into severe crises. As a recognized leader in the Forrester Wave™ for Microsegmentation, Illumio's solutions empower organizations to adopt Zero Trust models, enhancing cyber resilience for the critical infrastructure that sustains the global economy. On-Site Work: This position requires 5 days a week on-site presence at our Sunnyvale, CA headquarters. Our Vision: Our Engineering team thrives on a culture of visionary leadership, autonomy, and ownership, fostering an innovative environment that propels us through the dynamic landscape of cybersecurity. By joining our team, you will contribute to the forefront of Zero Trust Segmentation, utilizing an advanced technology stack that encompasses diverse operating systems, distributed applications, and cutting-edge UI/visualization tools. Together, we are shaping the future of cybersecurity, committed to developing world-class products guided by diverse perspectives and a shared dedication to innovation amidst unprecedented cyber threats. Your Role: As a Site Reliability Engineer II, you will oversee our multi-cloud infrastructure on platforms such as Azure, AWS, and/or GCP.
Your responsibilities will include designing new cloud services and applications, collaborating closely with Engineering, SRE/OPS, and Security teams to transition these projects from development to production. Daily tasks will involve enhancing the reliability and scalability of Illumio's SaaS products while driving continuous improvement initiatives. We seek candidates with a strong passion for cloud technology, automation, and collaboration, as well as a solid understanding of the Azure cloud platform and related DevOps practices.
Join Us in Securing the Future! At Illumio, we are pioneers in ransomware and breach containment, transforming how organizations defend against cyberattacks and fortifying operational resilience. Our innovative Illumio AI Security Graph powers a breach containment platform that swiftly identifies and neutralizes threats across hybrid multi-cloud environments, preventing minor issues from escalating into catastrophic events. As a recognized leader in the Forrester Wave™ for Microsegmentation, we enable Zero Trust, bolstering the cyber resilience of the infrastructures, systems, and organizations that keep the world functioning smoothly. Location: This role requires on-site presence in our Sunnyvale, CA headquarters five days a week. Vision of Our Team: Our Engineering team flourishes within a culture that champions visionary leadership, autonomy, and ownership. This dynamic synergy propels us forward in the constantly evolving realm of cybersecurity. As a member of our team, you will be at the forefront of Zero Trust Segmentation, working with an advanced technology stack that encompasses operating systems, distributed applications, and immersive UI/visualization tools. We're not just shaping the future of cybersecurity; we’re committed to developing world-class products led by diverse perspectives, backgrounds, and an unwavering commitment to innovation amidst unprecedented cybersecurity challenges. Your Role: As a Site Reliability Engineer II, you will oversee and optimize our multi-cloud infrastructure across Azure, AWS, and/or GCP.
You will have the opportunity to design new services and applications in the cloud, guiding them from development to production while collaborating closely with Engineering, SRE/Operations, and Security teams. Your daily responsibilities will include enhancing the reliability and scalability of Illumio's SaaS offerings and spearheading continuous improvement initiatives. The ideal candidate is driven by a passion for cloud technology, automation, and collaboration, coupled with a solid foundation in Azure cloud platforms and relevant DevOps practices. Design, deploy, and maintain robust cloud infrastructure solutions on Azure, AWS, and/or GCP to support our applications and services. Implement Infrastructure as Code (IaC) principles using tools such as Terraform, ARM templates, or CloudFormation to automate provisioning and configuration management. Develop and maintain CI/CD pipelines for automated software delivery and deployment, utilizing tools like Azure DevOps, AWS CodePipeline, or Jenkins. Monitor system performance and availability, ensuring optimal operational efficiency.
Join our dynamic team as a Senior Site Reliability Engineer focused on AI/ML solutions. In this role, you will leverage your expertise to enhance the reliability, scalability, and performance of our cutting-edge AI-driven products. You will work collaboratively with cross-functional teams to design, implement, and maintain robust systems that support our mission to revolutionize surgical technology.
At Wayve, we are dedicated to fostering a diverse and equitable workplace that respects and includes everyone, regardless of their unique skills, perspectives, or backgrounds, including but not limited to sex, race, religion, ethnic origin, disability, age, citizenship, marital status, sexual orientation, gender identity, or veteran status. About Us: Founded in 2017, Wayve is at the forefront of developing groundbreaking Embodied AI technology. Our cutting-edge AI software and foundational models empower vehicles to perceive, comprehend, and navigate complex environments, significantly enhancing the safety and utility of automated driving systems. Our mission is to create autonomous solutions that drive progress globally. Our innovative, mapless, and hardware-agnostic AI products are tailored for automakers, facilitating the transition from assisted to fully automated driving. In our dynamic environment, we thrive on tackling significant challenges, embracing uncertainty to deliver pioneering solutions. We set ambitious goals and remain humble in our journey toward excellence, continually learning and adapting as we forge a smarter and safer future. Your contributions at Wayve are invaluable. We champion diversity, welcome new perspectives, and cultivate an inclusive workplace where we support each other in making a substantial impact. Join Wayve and shape the experience that will define your career! The Role: We are seeking a Director of Fleet Operations Execution to spearhead the global delivery of on-road testing across all programs and regions. This pivotal role is central to our operations, translating technical testing requirements into practical implementations, managing regional test teams, and collaborating with Engineering and Product teams to ensure our autonomous and assisted driving systems are rigorously tested, validated, and poised for scalability.
You will lead the execution with urgency and precision while influencing the methodologies for testing advanced technologies in real-world scenarios.
We are seeking a talented and motivated Reliability Manager to join our team at dstaff, focusing on Photonic Integrated Components. In this role, you will lead initiatives to enhance the reliability of our products and ensure that they meet the highest quality standards. You will collaborate with cross-functional teams to conduct reliability testing, analyze data, and implement improvements.
Full-time|Remote|Remote Office; Sunnyvale CA or Toronto Canada
Cerebras Systems is at the forefront of AI innovation, manufacturing the largest AI chip in the world, which is 56 times bigger than conventional GPUs. Our cutting-edge wafer-scale architecture provides the computational power equivalent to dozens of GPUs on a single chip, simplifying programming to the level of a single device. This pioneering approach enables us to offer unmatched training and inference speeds, allowing machine learning practitioners to smoothly execute large-scale ML applications without the complexity of managing numerous GPUs or TPUs. Our clientele includes leading model laboratories, major global corporations, and innovative AI-native startups. Notably, OpenAI has recently partnered with Cerebras to leverage 750 megawatts of scale, revolutionizing critical workloads with ultra-high-speed inference. Our advanced wafer-scale architecture makes Cerebras Inference the fastest Generative AI inference solution available, outperforming GPU-based hyperscale cloud inference services by over tenfold. This remarkable speed enhancement is reshaping the user experience of AI applications, enabling real-time iterations and enhanced intelligence through additional agentic computation. In late 2024, we launched Cerebras Inference, setting a new standard for Generative AI inference speed. Since its launch, we have rapidly scaled our services to meet the rising demand from AI labs, enterprises, and a vibrant developer community. In October 2025, we celebrated our Series G funding round, successfully raising $1.1 billion USD to accelerate the growth of our product offerings and services to satisfy global AI demand. About the Team: The Cerebras Inference team is dedicated to delivering the most efficient, secure, and reliable enterprise-grade AI service. We design and manage expansive distributed systems that facilitate AI inference with unparalleled speed and efficiency. Join us in scaling our inference capabilities to new heights!
Wayve values a workplace where every individual’s skills and perspectives are respected. The company is committed to diversity, equity, and inclusion, and welcomes candidates from all backgrounds. About Wayve: Founded in 2017, Wayve develops Embodied AI technology for vehicles. The team builds AI software and foundational models that help vehicles understand and navigate complex environments, supporting safer and more practical automated driving systems. Wayve partners with automotive manufacturers, offering intelligent, mapless, and hardware-agnostic AI solutions. The company’s goal is to help the industry move from assisted to fully automated driving. The team thrives on solving complex problems, learning quickly, and adapting to new challenges. Wayve supports an inclusive culture where different perspectives are valued. Team members are encouraged to learn from each other and make a real impact. Role overview: Process Excellence Manager. Wayve is hiring a Process Excellence Manager for the Systems & Insights team in Sunnyvale. This position focuses on designing and scaling operational systems that support global Fleet Operations. The role suits a Lean Six Sigma Black Belt (or equivalent) with a track record of delivering measurable improvements in operational performance, especially in fast-changing, high-growth settings. What you will do: Lead major process improvement projects across Wayve’s test network, including sites in the UK and other international locations. Work with teams across the company to reduce waste, improve process stability, and support reliable, high-quality delivery at scale. Play a key part in Wayve’s shift from research and development to a delivery-grade autonomous vehicle (AV) testing organization. Who succeeds in this role: Lean Six Sigma Black Belt certification (or similar experience). Experience driving operational improvements in fast-growing organizations. Strong ability to collaborate across functions and geographies.
Join SpaceX as a Senior RFIC Design Engineer in our Silicon Engineering team. In this pivotal role, you will be responsible for designing innovative RF integrated circuits that drive our next-generation space technologies. Collaborate with a team of experts to push the boundaries of technology while ensuring the highest standards of quality and performance.
Cerebras Systems is revolutionizing artificial intelligence with the world's largest AI chip, 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers unparalleled AI compute power, equating to dozens of GPUs on a single chip, all while maintaining the programming simplicity of a single device. This unique solution enables Cerebras to achieve unmatched training and inference speeds, allowing machine learning practitioners to seamlessly execute large-scale ML applications without the complexities of managing multiple GPUs or TPUs. We proudly serve a diverse clientele that includes leading model labs, multinational corporations, and pioneering AI-native startups. Notably, OpenAI has recently entered into a multi-year partnership with Cerebras, harnessing 750 megawatts of scale to transform critical workloads with ultra-high-speed inference. Our cutting-edge wafer-scale architecture powers the fastest Generative AI inference solution globally, boasting speeds over ten times faster than GPU-based hyperscale cloud inference services. This remarkable acceleration is reshaping the user experience of AI applications, facilitating real-time iterations and enhancing intelligence through advanced agentic computation. About The Role: As a Kernel Engineer, you will be pivotal in crafting high-performance software solutions at the convergence of hardware and software. Your primary responsibility will be to implement, optimize, and scale deep learning operations that fully utilize our custom, massively parallel processor architecture. You will collaborate with a world-class team focused on designing, tuning for performance, and validating foundational ML and HPC kernels. This role includes building a comprehensive library of parallel and distributed algorithms aimed at maximizing compute utilization and enhancing training efficiency for state-of-the-art AI models.
Your contributions will be crucial in unlocking the full capabilities of our hardware and accelerating the advancements in AI.
Full-time|$170K/yr - $235K/yr|On-site|Sunnyvale, CA
Founded on the vision of making humanity a multi-planetary species, SpaceX is pioneering the technologies to enable human life on Mars. Our mission extends beyond the stars; we are also transforming global connectivity through Starlink, the most advanced broadband internet system in the world. SENIOR RTL DESIGN ENGINEER (SILICON ENGINEERING): At SpaceX, we leverage our extensive experience in rocket and spacecraft development to successfully deploy Starlink, the world's largest satellite constellation. This initiative is providing high-speed, reliable internet to millions across the globe. We design, build, test, and operate all components of the system, from thousands of satellites to consumer receivers that enable users to connect with ease, and the software that integrates it all. As we continue to expand Starlink's global reach, we seek exceptional engineers to enhance its utility for communities and businesses worldwide. We are in search of a proactive and intellectually curious Senior RTL Design Engineer to collaborate with our elite, cross-disciplinary teams, including systems, firmware, architecture, design, validation, product engineering, and ASIC implementation. In this role, you will be at the forefront of developing next-generation FPGAs and ASICs that will be deployed in both space and terrestrial infrastructures. Your contributions will facilitate connectivity in areas that have previously lacked affordable and reliable access, thereby enhancing the capabilities of the Starlink network.
Full-time|$130K/yr - $180K/yr|On-site|Sunnyvale, CA
At SpaceX, we believe in a future where humanity explores the stars, and we are dedicated to developing the technologies to make that vision a reality. Our ultimate goal is to enable human life on Mars. RFIC DESIGN ENGINEER (SILICON ENGINEERING): As part of SpaceX, you will leverage our extensive experience in building rockets and spacecraft to support the deployment of Starlink, the world's most advanced broadband internet system. Starlink features the largest satellite constellation globally, providing fast and reliable internet to millions of users. We design, build, test, and operate all system components, including thousands of satellites and user-friendly consumer receivers that allow connectivity within minutes. Join us in maximizing Starlink's global impact for communities and businesses. We are seeking a highly motivated and proactive RF/Analog IC engineer to collaborate with world-class cross-disciplinary teams, including systems, firmware, architecture, design, validation, and product engineering. In this position, you will develop cutting-edge next-generation RFICs for deployment in both space and terrestrial infrastructures. These innovations will enable connectivity in previously unreachable, unaffordable, or unreliable areas. Help us deliver next-generation solutions to enhance the Starlink network's performance and capabilities.
Full-time|On-site|Sunnyvale, California, United States
Join Applied Intuition as a Software Engineer specializing in AI Engineering, where you'll have the opportunity to work on cutting-edge technology and contribute to innovative projects that shape the future of artificial intelligence. As part of our dynamic team, you will collaborate with talented professionals to design, develop, and implement AI solutions that address real-world challenges.
Full-time|On-site|Sunnyvale, California, United States
LotusWorks is a premier Engineering Services provider, excelling in the management of Commissioning, Construction Services, Calibration, Operations & Maintenance across global manufacturing facilities. With operations spanning EMEA and North America, we collaborate with industry leaders in the Semiconductor, Pharmaceutical & Biologics, Medical Device, and Data Centre sectors. Our Engineering and Technical professionals are at the forefront of cutting-edge technologies and innovations. At LotusWorks, we are dedicated to fostering a diverse and inclusive workplace that is central to our people-first philosophy. We are on the lookout for a skilled Mechanical Commissioning Engineer to spearhead and execute the commissioning processes for HVAC mechanical systems across intricate facility projects. The ideal candidate will ensure that all mechanical systems are installed, tested, and function in alignment with design specifications and ASHRAE commissioning protocols. This position necessitates robust technical expertise, exceptional documentation capabilities, and the ability to liaise effectively with various project stakeholders.
Position Overview: Join our dynamic manufacturing engineering team at Intuitive Surgical, where your mechanical engineering design and analysis expertise will play a crucial role in creating cutting-edge equipment used for testing, measuring, and ensuring the quality of our innovative instruments for minimally invasive robotic surgery. Your project management skills will be invaluable as you collaborate with our software engineering team to enhance electro-mechanical components, assemblies, process documentation, tooling, and testing methods, driving improvements in efficacy, reliability, manufacturability, and cost. Key Responsibilities: Develop, maintain, and optimize high-volume manufacturing assembly lines, refining BOMs, workflow processes, and detailed work instructions. Design, document, procure, qualify, implement, and enhance fixtures, tools, and equipment that include both mechanical and electronic components, as well as control algorithms and programming. Analyze equipment performance and trends, addressing root causes of issues to guarantee equipment accuracy, reliability, repeatability, and reproducibility. Modify equipment drawings and implement updates for large-scale production lines across various plants. Conduct risk analysis (PFMEA) on the instrument-manufacturing line, validating critical tests utilized in manufacturing. Oversee the validation and qualification of manufacturing processes and equipment, employing standard qualification methodologies (IQ/OQ/DQ/PQ). Support production lines by resolving day-to-day engineering challenges related to core instruments, including emergency repairs. Identify and execute continuous improvement initiatives focused on first pass yield, cycle time reduction, product reliability, capacity enhancement, and cost reduction. Provide DFx (Design for Manufacturing and Assembly) insights to improve the manufacturability of core instruments. Ensure compliance with medical device quality systems, including the closure of corrective actions and implementation of ECOs. Conduct failure analysis for production discrepancies and field returns, offering technical support as necessary. Willingness to travel periodically to suppliers or our Intuitive Mexicali plant. Exhibit leadership through knowledge sharing, mentoring, and training others. Proactively seek and implement improved technologies for production processes. Enhance documentation related to equipment installation, repair, and maintenance.