Senior Site Reliability Engineer jobs in Sunnyvale – Browse 664 openings on RoboApply Jobs

Senior Site Reliability Engineer

Illumio | Sunnyvale, California - HQ

On-site | Full-time

Experience Level

Senior

About the job

Illumio’s Senior Site Reliability Engineer role is based at the company’s Sunnyvale, California headquarters. This is an on-site position, requiring presence in the office five days a week.

Role overview

This position focuses on building and maintaining reliable, scalable infrastructure for Illumio’s applications and services, with an emphasis on Azure cloud solutions. The Senior SRE supports both SaaS and on-premises offerings, working closely with engineering teams to ensure operational resilience and security across hybrid environments.

What you will do

Design, deploy, and maintain highly available infrastructure on Azure for Illumio’s products.
Automate provisioning and configuration management using Infrastructure as Code tools such as Terraform or ARM templates.
Develop and manage CI/CD pipelines to improve software delivery and deployment processes.
Monitor system and application health using Azure monitoring and logging tools, and optimize for performance and availability.
Lead incident response, perform root cause analysis, and document findings to drive continuous improvement.
Collaborate with development teams to design scalable, reliable architectures and provide guidance on cloud-native best practices.

Engineering at Illumio

The engineering team values autonomy, ownership, and collaboration. Work centers on advancing cybersecurity with scalable SaaS services and solutions for on-premises environments. The team emphasizes disciplined engineering, quality, and a supportive culture.

Similar jobs

1 - 20 of 664 Jobs

Senior Site Reliability Engineer

Illumio

Full-time|On-site|Sunnyvale, California - HQ

Illumio’s Senior Site Reliability Engineer role is based at the company’s Sunnyvale, California headquarters. This is an on-site position, requiring presence in the office five days a week. Role overview: This position focuses on building and maintaining reliable, scalable infrastructure for Illumio’s applications and services, with an emphasis on Azure cloud…

Apr 22, 2026

Senior Site Reliability Engineer for AI/ML Innovations

Intuitive Surgical, Inc.

Full-time|On-site|Sunnyvale

Join our dynamic team as a Senior Site Reliability Engineer focused on AI/ML solutions. In this role, you will leverage your expertise to enhance the reliability, scalability, and performance of our cutting-edge AI-driven products. You will work collaboratively with cross-functional teams to design, implement, and maintain robust systems that support our mission to revolutionize surgical technology.

Dec 25, 2025

Site Reliability Engineer II at Illumio | Sunnyvale, California

Illumio

Full-time|On-site|Sunnyvale, California - HQ

Join Us on Our Mission!

At Illumio, we are pioneering the way organizations combat ransomware and data breaches. Our innovative breach containment platform, driven by the Illumio AI Security Graph, enables businesses to effectively identify and mitigate threats across hybrid multi-cloud environments, preventing attacks from escalating into severe crises. As a recognized leader in the Forrester Wave™ for Microsegmentation, Illumio's solutions empower organizations to adopt Zero Trust models, enhancing cyber resilience for the critical infrastructure that sustains the global economy.

On-Site Work: This position requires 5 days a week on-site presence at our Sunnyvale, CA headquarters.

Our Vision: Our Engineering team thrives on a culture of visionary leadership, autonomy, and ownership, fostering an innovative environment that propels us through the dynamic landscape of cybersecurity. By joining our team, you will contribute to the forefront of Zero Trust Segmentation, utilizing an advanced technology stack that encompasses diverse operating systems, distributed applications, and cutting-edge UI/visualization tools. Together, we are shaping the future of cybersecurity, committed to developing world-class products guided by diverse perspectives and a shared dedication to innovation amidst unprecedented cyber threats.

Your Role: As a Site Reliability Engineer II, you will oversee our multi-cloud infrastructure on platforms such as Azure, AWS, and/or GCP. Your responsibilities will include designing new cloud services and applications, collaborating closely with Engineering, SRE/OPS, and Security teams to transition these projects from development to production. Daily tasks will involve enhancing the reliability and scalability of Illumio's SaaS products while driving continuous improvement initiatives. We seek candidates with a strong passion for cloud technology, automation, and collaboration, as well as a solid understanding of the Azure cloud platform and related DevOps practices.

Feb 7, 2026

Site Reliability Engineer II at Illumio | Sunnyvale, California

Illumio

Full-time|On-site|Sunnyvale, California - HQ

Join Us in Securing the Future!

At Illumio, we are pioneers in ransomware and breach containment, transforming how organizations defend against cyberattacks and fortifying operational resilience. Our innovative Illumio AI Security Graph powers a breach containment platform that swiftly identifies and neutralizes threats across hybrid multi-cloud environments, preventing minor issues from escalating into catastrophic events. As a recognized leader in the Forrester Wave™ for Microsegmentation, we enable Zero Trust, bolstering the cyber resilience of the infrastructures, systems, and organizations that keep the world functioning smoothly.

Location: This role requires on-site presence in our Sunnyvale, CA headquarters five days a week.

Vision of Our Team: Our Engineering team flourishes within a culture that champions visionary leadership, autonomy, and ownership. This dynamic synergy propels us forward in the constantly evolving realm of cybersecurity. As a member of our team, you will be at the forefront of Zero Trust Segmentation, working with an advanced technology stack that encompasses operating systems, distributed applications, and immersive UI/visualization tools. We're not just shaping the future of cybersecurity; we’re committed to developing world-class products led by diverse perspectives, backgrounds, and an unwavering commitment to innovation amidst unprecedented cybersecurity challenges.

Your Role: As a Site Reliability Engineer II, you will oversee and optimize our multi-cloud infrastructure across Azure, AWS, and/or GCP. You will have the opportunity to design new services and applications in the cloud, guiding them from development to production while collaborating closely with Engineering, SRE/Operations, and Security teams. Your daily responsibilities will include enhancing the reliability and scalability of Illumio's SaaS offerings and spearheading continuous improvement initiatives. The ideal candidate is driven by a passion for cloud technology, automation, and collaboration, coupled with a solid foundation in Azure cloud platforms and relevant DevOps practices.

Design, deploy, and maintain robust cloud infrastructure solutions on Azure, AWS, and/or GCP to support our applications and services.
Implement Infrastructure as Code (IaC) principles using tools such as Terraform, ARM templates, or CloudFormation to automate provisioning and configuration management.
Develop and maintain CI/CD pipelines for automated software delivery and deployment, utilizing tools like Azure DevOps, AWS CodePipeline, or Jenkins.
Monitor system performance and availability, ensuring optimal operational efficiency.

Mar 23, 2026

Performance & Reliability Engineer

Cerebras Systems

Full-time|On-site|Sunnyvale, CA; Toronto, Ontario, Canada

Cerebras Systems is at the forefront of AI technology, developing the world’s largest AI chip that is 56 times larger than conventional GPUs. Our innovative wafer-scale architecture delivers the computational power of dozens of GPUs within a single chip, simplifying programming and enhancing performance. This unique capability enables Cerebras to provide unparalleled training and inference speeds, allowing machine learning practitioners to execute large-scale ML applications seamlessly without the complexities of managing extensive GPU or TPU infrastructures.

Cerebras serves a diverse clientele, including top-tier model labs, global enterprises, and pioneering AI-native startups. OpenAI has recently partnered with Cerebras to leverage 750 megawatts of power, significantly enhancing key workloads through ultra high-speed inference.

Our cutting-edge wafer-scale architecture has made Cerebras Inference the fastest Generative AI inference solution globally, achieving speeds over ten times faster than GPU-based hyperscale cloud inference services. This revolutionary speed is transforming the user experience of AI applications, facilitating real-time iteration and boosting intelligence through enhanced computational capabilities.

About The Role

We invite you to join Cerebras as a Performance & Reliability Engineer within our dynamic Co-Design and Next Generation Team. Our groundbreaking CS-3 system has established benchmarks for high-performance ML training and inference solutions, utilizing a chip the size of a dinner plate with 44GB of on-chip memory that exceeds traditional hardware capabilities. In this role, you will focus on characterizing and optimizing the performance and reliability of state-of-the-art AI models operating on Cerebras' revolutionary hardware.

Responsibilities

Characterize and enhance the performance and reliability of advanced ML hardware/software systems, focusing on minimizing power and thermal fluctuations.
Analyze ML workloads, software kernels, and hardware architecture for their power and performance impacts, synthesizing high-level insights across these layers.
Develop innovative software solutions to enhance system performance and efficiency.

Feb 17, 2026

Fleet Reliability Engineer

Applied Intuition

Full-time|On-site|Sunnyvale, California, United States

As a Fleet Reliability Engineer at Applied Intuition, you will be at the forefront of ensuring the reliability and performance of our advanced fleet systems. Your expertise will play a crucial role in the development and deployment of our cutting-edge technology, optimizing fleet operations to guarantee safety and efficiency.

Mar 25, 2026

Engineering Manager, Kernel Reliability

Cerebras Systems

Full-time|On-site|Sunnyvale CA or Toronto Canada

Cerebras Systems is at the forefront of AI technology, having developed the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the AI computing power equivalent to dozens of GPUs on a single chip, simplifying programming to a single device. This revolutionary design enables Cerebras to provide unmatched training and inference speeds, empowering machine learning practitioners to seamlessly execute large-scale ML applications without the complexities of managing multiple GPUs or TPUs.

Our clientele includes elite model labs, global corporations, and pioneering AI-native startups. Notably, OpenAI recently entered into a multi-year partnership with Cerebras to deploy 750 megawatts of scale, significantly enhancing key workloads with ultra high-speed inference.

Thanks to our groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution globally, achieving speeds over 10 times faster than GPU-based hyperscale cloud inference services. This substantial speed boost is transforming user experiences in AI applications by enabling real-time iterations and enhancing intelligence through additional agentic computation.

The Role

We are seeking a highly technical and hands-on Engineering Manager to lead our on-field Kernel Reliability team. You will guide a high-performing team in addressing a critical challenge: enhancing the reliability of our advanced compute clusters along with the associated inference, training, and internal production services. In this influential role, you will define the technical vision while remaining closely engaged with the code, crafting scalable solutions for our rapidly expanding system production and software service offerings. If you possess proven expertise in software or hardware reliability, diagnostic tool development, or failure analysis and debugging, we invite you to connect with us.

Responsibilities

Provide hands-on technical leadership, owning the technical vision and roadmap for kernel-centric reliability concerning both internal and customer-facing systems.

Feb 17, 2026

Software Engineer - Kernel Reliability

Cerebras Systems

Full-time|On-site|Sunnyvale CA or Toronto Canada

Cerebras Systems is revolutionizing the AI landscape with the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the computational power of multiple GPUs on a single chip, simplifying programming and enabling unparalleled training and inference speeds. This technology allows our users to run extensive machine learning applications seamlessly, eliminating the complexities associated with managing numerous GPUs or TPUs.

Our clientele includes leading model labs, global corporations, and pioneering AI startups. Recently, OpenAI announced a multi-year collaboration with Cerebras, aiming to deploy 750 megawatts of power, significantly enhancing their workloads with ultra-fast inference capabilities.

With our groundbreaking wafer-scale architecture, Cerebras Inference provides the fastest Generative AI inference solution globally, outperforming GPU-based hyperscale cloud services by over tenfold. This remarkable speed enhancement is transforming user experiences in AI applications, facilitating real-time iterations and amplifying intelligence through advanced computational capabilities.

About The Role

We are in search of a highly technical and hands-on Software Engineer to join our Kernel Reliability team. In this pivotal role, you will address the crucial task of enhancing the reliability of our advanced compute clusters, along with the inference, training, and internal production services. You will work closely with the code to develop solutions that scale alongside our rapidly evolving production systems and software services. If you possess strong foundations in systems, debugging, and failure analysis and have a passion for creating tools and solving complex reliability challenges, we would love to connect with you. New graduates are encouraged to apply.

Mar 5, 2026

Senior Quality Engineer - On-Site Opportunity in Sunnyvale

360IT Professionals

Full-time|On-site|Sunnyvale

Join our dynamic team at 360IT Professionals as a Senior Quality Engineer. In this role, you will play a pivotal part in ensuring the highest standards of quality and reliability in our products. Your expertise will be essential in developing and implementing robust testing strategies, conducting thorough analysis, and collaborating closely with cross-functional teams to drive continuous improvement.

We are looking for a passionate individual who thrives in a fast-paced environment and is committed to delivering exceptional results. If you are ready to take your career to the next level and make a significant impact, we encourage you to apply!

Apr 3, 2017

Reliability Manager for Photonic Integrated Components

dstaff

Full-time|On-site|Sunnyvale

We are seeking a talented and motivated Reliability Manager to join our team at dstaff, focusing on Photonic Integrated Components. In this role, you will lead initiatives to enhance the reliability of our products and ensure that they meet the highest quality standards. You will collaborate with cross-functional teams to conduct reliability testing, analyze data, and implement improvements.

May 14, 2015

Site Selection Manager

CoreWeave

Full-time|On-site|Livingston, NJ / New York, NY / Sunnyvale, CA / San Francisco, CA / Bellevue, WA / Richmond, VA

The Site Selection Manager at CoreWeave identifies and secures locations for new data centers. This position shapes the company’s expansion across the United States by evaluating sites that align with operational requirements and growth plans.

Role overview

This role focuses on analyzing potential sites, factoring in market trends and strategic priorities. The Site Selection Manager works with cross-functional teams to gather insights and builds recommendations for senior leadership.

Collaboration

Success in this position depends on strong teamwork. The Site Selection Manager partners with colleagues from various departments to ensure each location supports CoreWeave’s business goals and operational standards.

Locations

This position is based in one of several locations: Livingston, NJ; New York, NY; Sunnyvale, CA; San Francisco, CA; Bellevue, WA; or Richmond, VA.

Apr 28, 2026

Site Selection Program Manager

CoreWeave

Full-time|$143K/yr - $191K/yr|On-site|Livingston, NJ / New York, NY / Sunnyvale, CA / San Francisco, CA / Bellevue, WA / Richmond, VA / Dallas, TX

Locations: Livingston, NJ; New York, NY; Sunnyvale, CA; San Francisco, CA; Bellevue, WA; Richmond, VA; Dallas, TX

About CoreWeave

CoreWeave is a cloud platform built for AI innovation. Since 2017, the company has provided technology, tools, and expert teams to help organizations develop and scale AI solutions. CoreWeave supports AI labs, startups, and global companies with high-performance infrastructure and technical expertise. The company became publicly traded on Nasdaq (CRWV) in March 2025. Learn more at www.coreweave.com.

Role Overview

The Site Selection Program Manager joins the global Site Selection team to manage the full lifecycle of data center site selection projects. This role reports to Site Selection leadership and plays a key part in ensuring data center opportunities move efficiently from market assessment to lease execution and delivery readiness.

Key Responsibilities

Oversee all stages of site selection, from initial screening through lease execution and delivery preparation.
Maintain and update program trackers, dashboards, and reporting tools to monitor deal status, milestones, risks, and dependencies.
Coordinate activities across Site Selection, Engineering, Capacity Planning, Finance, Legal, Operations, and external partners.
Ensure Salesforce and internal systems reliably reflect current program activities and metrics.

About the Work

This position helps keep CoreWeave’s site selection pipeline organized and efficient. The Program Manager brings structure and clarity to projects involving many teams, helping the business move quickly while maintaining accuracy and reducing risk. The role requires careful tracking of multiple transactions and clear communication across departments and with external partners.

Apr 16, 2026

Cassandra Engineer - On-Site Opportunity

360IT Professionals

Full-time|On-site|Sunnyvale

Join 360IT Professionals as a Cassandra Engineer where your expertise in managing and optimizing Cassandra databases will play a crucial role in our technology stack. In this position, you will work closely with our development teams to ensure database performance, availability, and security. This is an exciting opportunity for an individual eager to contribute to innovative projects in a collaborative environment.

Feb 22, 2017

Senior Mechanical Engineer

Intuitive Surgical, Inc.

Full-time|On-site|Sunnyvale

Join our innovative team at Intuitive Surgical, Inc. as a Senior Mechanical Engineer, where your expertise will drive the development of cutting-edge robotic surgical systems. You will be instrumental in designing, analyzing, and testing mechanical components that enhance patient outcomes and surgical precision.

Mar 24, 2026

Senior Mechanical Engineer

Intuitive Surgical, Inc.

Full-time|On-site|Sunnyvale

Join Intuitive Surgical, a pioneering company in robotic-assisted surgery, as a Senior Mechanical Engineer. In this role, you will be instrumental in the design and development of innovative mechanical systems that enhance surgical procedures. Collaborate with cross-functional teams to create cutting-edge technologies that improve patient outcomes.

May 1, 2026

Senior RFIC Design Engineer - Silicon Engineering

SpaceX

Full-time|On-site|Sunnyvale, CA

Join SpaceX as a Senior RFIC Design Engineer in our Silicon Engineering team. In this pivotal role, you will be responsible for designing innovative RF integrated circuits that drive our next-generation space technologies. Collaborate with a team of experts to push the boundaries of technology while ensuring the highest standards of quality and performance.

Apr 1, 2026

Senior Engineering Manager

Illumio

Full-time|On-site|Sunnyvale, California - HQ

Join Us in Shaping the Future!

At Illumio, we are pioneers in ransomware and breach containment, transforming how organizations manage cyber threats and ensure operational resilience. Our advanced Illumio AI Security Graph empowers our breach containment platform to identify and contain threats across hybrid multi-cloud environments, effectively halting the spread of attacks before they escalate into crises. As acknowledged leaders in the Forrester Wave™ for Microsegmentation, we facilitate Zero Trust frameworks that enhance cyber resilience for critical infrastructures and organizations worldwide.

Location: 5 days a week on-site at our Sunnyvale, CA Headquarters.

Our Team's Vision: Our Engineering team thrives on a culture fueled by visionary leadership, autonomy, and a strong sense of ownership. This dynamic synergy propels us forward in the rapidly evolving realm of cybersecurity. When you join us, you become part of a leading force in Zero Trust Segmentation, engaging with a cutting-edge technology stack that spans operating systems, distributed applications, and immersive UI/visualization tools. We are committed to shaping the future of cybersecurity and building world-class products driven by diverse perspectives and a dedication to innovation in these unprecedented times of cyber threats.

Your Impact: As the Senior Manager of Cloud Engineering, you will lead a talented team focused on developing scalable, distributed cloud services through containerized microservices within a multi-cloud environment. You will steer the design, development, and delivery of cloud solutions, ensuring high standards for automation, observability, and operational excellence.

Lead and manage a cloud engineering team dedicated to developing distributed microservices for a multi-tenant, scalable platform.
Supervise cloud service development, prioritizing performance, security, and reliability.
Establish coding standards, engage in code and design reviews, and ensure high-quality automation and testing.
Collaborate with Product Management to align development efforts with business objectives.
Oversee the entire software development lifecycle, from requirements gathering to deployment and operations.
Cultivate a culture of continuous learning and innovation, embracing best practices in cloud engineering.

Mar 3, 2026

Senior Mechanical Engineer

Intuitive Surgical, Inc.

Full-time|On-site|Sunnyvale

Join our innovative team at Intuitive Surgical, where we are redefining the landscape of surgical robotics. As a Senior Mechanical Engineer, you will play a pivotal role in the design, development, and enhancement of cutting-edge robotic systems that are transforming patient care. Your expertise will contribute to creating solutions that improve surgical precision and outcomes.

Apr 2, 2026

Senior Electrical Engineer

Intuitive Surgical, Inc.

Full-time|On-site|Sunnyvale

About the Role

Intuitive Surgical, Inc. is hiring a Senior Electrical Engineer in Sunnyvale. This role focuses on designing and developing electrical systems for advanced surgical technologies. The work supports the full product development cycle, from initial concept through to production release.

What You Will Do

Design and develop electrical systems for surgical products.
Work closely with teams across disciplines to move products from idea to manufacturing.
Support quality and performance standards throughout development.

Apr 17, 2026

Senior Software Engineer

Ceribell

Full-time|$141K/yr - $190K/yr|On-site|Sunnyvale, CA

About Ceribell

Ceribell is at the forefront of medical technology, dedicated to revolutionizing the diagnosis and management of patients with serious neurological conditions. Our innovative Ceribell System is a cutting-edge, point-of-care electroencephalography (EEG) platform that meets the critical needs of patients in acute care settings. Already in use at hundreds of community hospitals, large academic institutions, and major integrated delivery networks across the nation, our team shares a collective mission to enhance critical care with our rapid seizure detection technology. Join us in making a difference!

Position Overview: We are seeking a talented Senior Software Engineer with a strong backend focus to join our dynamic team in developing the next generation of EEG web applications that cater to vital medical use cases. In this role, you will be instrumental in designing, maintaining, and enhancing the backend systems for our EEG Portal web application, which is essential for healthcare providers, researchers, and clinical teams to access, monitor, and analyze EEG data. You will collaborate closely with fellow engineers, product managers, and stakeholders to ensure that our backend systems are robust, secure, and scalable within a medical environment.

Key Responsibilities:

Backend Development & Maintenance:
Design, develop, and maintain backend systems to support the EEG Portal application, ensuring dependable performance and adherence to healthcare standards.
Implement new features and enhancements to meet clinical and research demands, prioritizing efficiency and scalability.
Troubleshoot, debug, and optimize backend systems to guarantee maximum uptime and reliability for users.

Database Management:
Write optimized database queries and execute data migration strategies.
Monitor and fine-tune database performance, including indexing, replication, and backup processes.

API Development & Integration:
Develop and maintain RESTful APIs that interact with the frontend and other systems.
Ensure APIs are secure, well-documented, and capable of handling large volumes of sensitive medical data.
Integrate third-party services and platforms as needed to enhance functionality.
Ensure backend services comply with regulatory standards, including data encryption, authentication, and auditing.

Mar 2, 2026
