About the job
Key Responsibilities:
- Design, develop, and enhance RESTful APIs for efficient management of GPU clusters, virtual machines, and dedicated servers.
- Implement a modern tech stack using FastAPI, Python, OpenStack, REST API, Asyncio, PostgreSQL, Docker, relational databases, RabbitMQ, Redis, and Kubernetes to ensure the delivery of robust and maintainable code.
- Facilitate the integration of development and operations, improving CI/CD workflows and infrastructure reliability through thorough service monitoring and prompt issue resolution.
- Participate actively in the lifecycle management of compute products, from initial concept through to implementation, with a keen focus on scalability and user requirements.
- Architect and develop systems that prioritize high availability and disaster recovery strategies.
- Collaborate with cross-functional teams to align development and operational goals, fostering a culture of DevOps.
- Encourage innovation by staying abreast of industry trends and emerging technologies, applying insights to refine internal processes and solutions.
