About the job
About Us
Coalition is pioneering Active Insurance, which aims to prevent digital risks before they materialize. Founded in 2017, we combine comprehensive insurance coverage with cybersecurity tools to help businesses manage and mitigate cyber threats.
At Coalition, opportunities to make a significant impact through innovative thinking are not just possible; they happen every day.
About the Role
We are looking for a Site Reliability Engineer II to strengthen our Platform SRE team. In this role, you will build and manage the infrastructure, tools, and standardized processes that enable our developers to deliver scalable, secure, and reliable software quickly and confidently.
Your work will span the full stack, from infrastructure automation and observability to developer enablement and system reliability. You will collaborate closely with software engineering and security teams to advance our Infrastructure as Code (IaC), refine CI/CD pipelines, and scale our internal developer platform. We value pragmatism and engineering excellence, and we primarily use Python, Go, and AWS to reduce toil and enable self-service.
Responsibilities
- Infrastructure Automation: Architect, build, and scale production environments using AWS and Terraform.
- System Reliability: Elevate the resilience and operability of our platform through failure-based testing and automated recovery strategies.
- Developer Enablement: Design and implement reusable platform components and self-service tools to enhance the developer experience.
- Observability: Establish and uphold robust observability practices, including system metrics, distributed tracing, and SLO management.
- Mentorship & Standards: Mentor junior engineers, maintain high infrastructure quality, and contribute to the team’s evolving best practices.
- Collaboration: Engage in technical design discussions, provide constructive feedback, and adapt strategies based on team input and changing requirements.

