About the job
About Us
At Abacus Insights, we are revolutionizing the way healthcare data is utilized for health plans. Our mission is to transform healthcare data into actionable insights, enabling decision-makers to act swiftly and confidently. We work tirelessly to dismantle data silos, facilitating the creation of a unified and reliable data foundation that enhances decision-making processes, ultimately improving health outcomes, minimizing waste, and enriching experiences for both members and providers.
Supported by $100 million from leading investors, we are committed to addressing significant challenges in a healthcare sector ripe for innovation. Our platform empowers Generative AI applications by providing clean, interconnected, and dependable healthcare data that supports automation and streamlined decision-making workflows, positioning us at the forefront of the industry.
Our innovation stems from our people. We foster a culture of boldness, curiosity, and collaboration, as we believe the best solutions arise from teamwork. Are you ready to make a difference? Join us in shaping the future.
About the Role
The Site Reliability Engineer plays a pivotal role as an experienced individual contributor, ensuring seamless production operations, managing incident responses, and maintaining reliability across the Abacus Insights platform.
This position encompasses:
- Overseeing production operations and reliability engineering
- Engaging in advanced technical problem-solving during incidents and escalations
- Providing customer-facing technical support during deployments and troubleshooting
- Hands-on software development and automation responsibilities
In this role, you will tackle intricate production challenges involving AWS infrastructure, Databricks workloads, and complex data pipelines, collaborating closely with senior engineers and platform teams. You will interact directly with customers during escalations and deployments, translating insights into product and operational enhancements. This is a hands-on position where you will actively participate in incident management, debug live systems, and write production code aimed at enhancing system reliability and operability under the mentorship of senior technical leaders.
Your Day-to-Day Responsibilities
- Serve as a technical escalation point during production incidents, leading real-time triage, mitigation, and recovery efforts.
