About the Role:
As a pivotal member of Wrike’s Backend Reliability (BRE) team, you will play a critical role in maintaining our backend infrastructure and ensuring exceptional uptime. Our goal is to achieve and sustain 99.99% availability while developing essential tools, components, and safety measures relied upon by the entire engineering organization. In your capacity as a Senior / Staff Backend Engineer, you will not only address issues but also design core reliability solutions that influence how Wrike scales, performs, and recovers from failures.
Your Impact:
- Design, implement, and maintain vital reliability components, including HTTP rate limiters, internal database migration tools, circuit breakers, and distributed Redis-based caching.
- Diagnose intricate production issues, optimize PostgreSQL performance, and ensure our distributed systems operate efficiently and stably under peak loads.
- Lead initial investigations during critical production incidents: identify root causes, assess impact, and propose mitigation strategies, while the responsible teams carry out the long-term solutions.
- Develop scalable, reusable tools and frameworks that empower other engineering teams to construct more resilient services.
- Use AI-driven tools and coding agents to speed up development, analyze architectures, and automate repetitive or error-prone tasks.
- Promote reliability best practices across the engineering division through knowledge sharing, design reviews, and setting high technical standards by example.
