Senior Site Reliability Engineer - Platform Engineering
Landbot
Full-time | Remote

We are excited to announce an opportunity for a Senior Site Reliability Engineer to join our dynamic team at Landbot. This is a remote position, and we are specifically looking for candidates located in time zones between UTC-1 and UTC+2.

About Landbot

Operating in over 150 countries, Landbot provides an innovative platform that enables businesses to craft outstanding chatbot and AI agent interactions across various channels, including Web, WhatsApp, and Messenger. We are passionate about delivering exceptional customer experiences and are committed to engineering excellence.

At Landbot, we foster a high-performance culture that merges engineering expertise with a product-focused mindset and a dedication to customer satisfaction. We believe that quality and speed are essential for success, and we are looking for a Senior Site Reliability Engineer to help us enhance our platform and drive meaningful results.

About the Team

As part of our Platform Engineering team, you will collaborate with a small, dedicated group responsible for the development and maintenance of the Landbot Engineering Platform, Data Platform, and Security frameworks. Our mission is to empower Landbot teams to deliver value efficiently, reliably, and at scale.

We value:
- A product-oriented approach to platform development
- Autonomy and accountability
- Collaborative efforts over bureaucratic barriers

About the Position

Your Role

As a Senior Site Reliability Engineer, you will embody the principles of Systems Engineering within a Platform team, treating infrastructure as a product.
Your focus will be on addressing developer needs, minimizing operational challenges, and creating self-service capabilities that simplify processes, allowing teams to concentrate on feature development.

Key Responsibilities:

Develop and Maintain the Internal Developer Platform
- Design and implement essential platform services, including CI/CD pipelines, infrastructure provisioning, and observability systems.
- Create developer-facing tools, APIs, and automation that empower application teams to independently deploy, scale, and manage services.

Manage and Enhance Platform Operations
- Optimize cloud resources, Kubernetes clusters, databases, and networking for enhanced reliability, scalability, and cost-effectiveness.
- Establish SLIs, SLOs, and error budgets to ensure a balance between reliability and feature velocity.
- Design and implement observability solutions for real-time monitoring and proactive issue resolution.
- Develop alerting strategies that minimize noise and highlight actionable insights.
- Lead incident responses, conduct blameless postmortems, and drive continuous enhancements.

Improve Developer Experience and Influence Platform Strategy
- Collaborate with application teams to understand their workflows and challenges, gather feedback, and prioritize enhancements that align with business goals.
- Create and maintain comprehensive documentation, runbooks, and knowledge bases to support teams effectively.
Dec 2, 2025