About the job
Role overview
GetYourGuide connects travelers with memorable experiences worldwide. As a Senior Engineer in Operational Excellence, you will support teams in delivering reliable, efficient, and trustworthy services for millions of users. Based in Berlin, the role focuses on minimizing disruptions, strengthening user trust, and supporting a culture of operational excellence as GetYourGuide expands into AI-driven travel experiences.
What you will do
- Collaborate closely with product teams to improve system reliability, speed, and confidence.
- Reduce incident frequency, Mean Time to Detect (MTTD), and Mean Time to Recovery (MTTR).
- Lead post-incident reviews and use findings to drive systemic improvements.
- Develop tools and runbooks that help teams quickly diagnose and resolve production issues.
- Promote a blameless culture around incident management and continuous improvement.
- Participate in the infrastructure on-call rotation.
- Advance observability using Datadog, including metrics, logs, traces, dashboards, and alerts.
- Help teams define meaningful Service Level Objectives (SLOs) and set up actionable alerts to avoid alert fatigue.
- Provide engineers with production debugging tools and processes so they can troubleshoot independently.
- Improve change management practices to ensure high-quality software releases.
Team mission
The Operational Excellence team works to prevent incidents and helps teams resolve them quickly when they occur. This group champions reliability, observability, and cost efficiency, building the tools and practices that make operational excellence a shared goal across all product teams. The team’s work ensures GetYourGuide’s engineering organization can move quickly and confidently, delivering consistent, high-quality experiences to travelers.
Location
This position is based in Berlin, with the option to join colleagues in other global offices.
Learn more about working at GetYourGuide at getyourguide.careers.
