About the job
About ClickHouse
Featured on the 2025 Forbes Cloud 100 list, ClickHouse is one of the most innovative and fastest-growing private cloud companies. With more than 3,000 customers and annual recurring revenue growing over 250% year-on-year, ClickHouse is at the forefront of real-time analytics, data warehousing, observability, and AI workloads.
Our recent momentum has been reinforced by a substantial $400M Series D funding round. Notable clients such as Capital One, Lovable, Decagon, Polymarket, and Airwallex have recently adopted or expanded their use of our platform. These organizations join a distinguished roster of AI trailblazers and global leaders including Meta, Cursor, Sony, and Tesla.
Join us on our mission to change how companies use their data. Be part of our exciting journey!
About the Team
The Release Team is responsible for the secure and continuous delivery of ClickHouse Cloud, our managed database platform that operates thousands of ClickHouse clusters. We ensure smooth upgrades and maintenance of these clusters at scale, develop the internal tools that facilitate this, and serve as the critical last line of defense when unexpected issues arise.
About the Role
This position combines operational execution with software development. You will coordinate and execute upgrades, handle edge cases that fall outside the normal path, and maintain the health of tens of thousands of clusters in production. At the same time, you will improve and extend the systems that make each rollout safer and more automated than the last. If you enjoy both planning and hands-on execution, along with the challenges that come with them, this role is a strong fit for you.
What You'll Do
- Plan and execute rolling upgrades across thousands of ClickHouse clusters, ensuring safety, accuracy, and minimal impact to customers.
- Manage the entire release pipeline: from pre-upgrade validation and staged rollouts to post-upgrade monitoring and incident response.
- Investigate and resolve production issues as part of a regular on-call rotation, including unique clusters and edge cases that defy automation.
- Develop and enhance internal tools and automation to ensure reliable and repeatable large-scale database operations.
- Collaborate closely with core database and cloud infrastructure teams to identify operational challenges and implement effective solutions.
- Support other engineering teams and teach them how to use our internal tools.

