Site Reliability Engineer at Mercor | San Francisco

Mercor, San Francisco
On-site, Full-time


Experience Level

Mid to Senior

Qualifications

5+ years of relevant experience in Site Reliability Engineering. Deep understanding of best practices in SRE. Experience managing large-scale, high-availability systems. Strong problem-solving skills and a collaborative mindset.

About the job

Join the Mercor Team

At Mercor, we stand at the dynamic intersection of labor markets and AI research. Collaborating with premier AI labs and enterprises, we empower the human intelligence that is crucial for AI's evolution.

Our expansive talent network plays a vital role in training cutting-edge AI models, akin to the way educators impart knowledge to their students, by sharing insights, experiences, and contextual understanding that code alone cannot convey. Currently, our network of over 30,000 experts generates more than $2 million daily.

We are pioneering a novel category of work where expertise fuels AI progress. Achieving this vision necessitates an ambitious, fast-paced, and deeply dedicated team. You will collaborate with researchers, operators, and AI firms that are at the forefront of transforming societal structures.

Mercor is a thriving Series C company with a valuation of $10 billion. We work in person five days a week at our new headquarters in San Francisco.

About the Role

As a Site Reliability Engineer (SRE) at Mercor, you will take ownership of production reliability for our critical systems, working closely with our infrastructure leadership. You will play a pivotal role in establishing our SRE function and defining how Mercor manages large-scale, high-availability systems.

Your Responsibilities

  • Ensure the reliability and safety of production for key shared services and customer-facing systems.
  • Collaborate directly with infrastructure leadership to outline SRE priorities, reliability benchmarks, and the production safety roadmap.
  • Enhance the structure of our production systems to ensure stability, resource efficiency, isolation, and observability.
  • Advocate for and implement modern SRE methodologies (e.g., incident management, postmortems, SLIs/SLOs) across engineering teams.
  • Work alongside engineering and applied AI teams to facilitate sustainable growth.
  • Promote SRE best practices internally, helping teams onboard to production in a safe, scalable, and consistent way.

Who We Seek

The ideal candidate will have:

  • Extensive experience in genuine SRE roles (not merely operations) across various positions or organizations.
  • A deep understanding of SRE methodologies popularized by Google (e.g., error budgets, reliability vs. risk trade-offs, large-scale distributed systems).
  • 5+ years of SRE experience; ideally 15+ years of total experience for this inaugural SRE position.
  • A proven track record of managing systems at scale, with a strong grasp of the complexities involved.

About Mercor

Mercor is reshaping the future of AI and labor markets by harnessing the power of human intelligence. Partnering with top AI labs and enterprises, we are dedicated to advancing AI development through expertise and collaboration.
