About the job
At Braze, we pride ourselves on our exceptional team—approachable, kind, and passionate individuals dedicated to our mission.
We strive to foster this passion by setting high expectations, promoting collaboration, and ensuring a harmonious work-life balance as we grow rapidly on a global scale, all while advocating for equity and opportunities within and beyond our organization.
To thrive in this environment, you should be ready to hold yourself and others to high standards. There are always opportunities to contribute: exercising independence, taking responsibility, and welcoming new viewpoints are vital for our sustained success.
Our inherent curiosity and willingness to share diverse interests create a vibrant culture that is uniquely ours.
If you're motivated to tackle exciting challenges and take a proactive approach to change, you will be empowered to make a significant difference alongside a dedicated and enthusiastic team. If Braze resonates with you, we look forward to connecting!
WHAT YOU'LL DO
As a Platform Software Engineer (PSWE), you will design and develop the distributed systems that underpin Braze's extensive background processing platform. You will manage Sidekiq at Braze, where it processes over a trillion jobs daily across Kubernetes clusters worldwide. Your role will encompass autoscaling systems, metrics pipelines, reliable job execution, and the internal frameworks that make distributed processing safe for application teams.
Operating at a colossal scale, Braze serves 3.3 billion monthly active users, collects hundreds of billions of data points monthly, and sends billions of messages each day. Our technology stack includes Ruby on Rails, Go, MongoDB, Redis, and Kafka. As a PSWE, you will collaborate with application teams to enhance the Sidekiq platform they rely on and improve its reliability, performance, and developer experience.
Main responsibilities:
- Develop Braze’s embedded frameworks that enable large-scale distributed processing.
- Design, build, and operate internal software frameworks that support Braze’s asynchronous and background processing systems at scale.
- Enhance and extend frameworks built on technologies like Sidekiq to reliably execute over a trillion jobs per day across a globally distributed platform.
- Oversee scaling behavior, reliability guarantees, failure modes, and operational safety of these systems.
- Provide structured abstractions, tooling, and guardrails that allow application teams to use distributed processing safely while minimizing complexity.
- Improve observability, debuggability, and overall system performance.

