About the job
About TensorWave
At TensorWave, our mission is straightforward: to provide seamless, secure, reliable, and resilient AI compute at scale. Our innovative cloud platform removes infrastructure barriers, allowing creators to concentrate on innovation instead of battling technical obstacles. We believe that transformative AI should progress at the speed of ideas, not be held back by infrastructure.
About the Role
As we develop the next generation of GPU cloud infrastructure, our Global Operations Center (GOC) provides essential support for 24/7 operations across multiple data centers. As Lead Operations Engineer, you will be the technical anchor of the GOC, acting as the liaison between our frontline operations engineers and the engineering teams that build and maintain our platform.
You will make shift teams more effective by refining and validating operational runbooks, analyzing significant incidents to drive systemic improvements, and collaborating with engineering leads to improve alerting and identify tasks that can be delegated to the operations floor. Working alongside the Head of Global Operations, you will play a pivotal role in raising the operational maturity of the GOC and moving it from reactive measures to proactive, standardized operations.
