Apollo Research

Research Scientist/Engineer in AI Evaluations

On-site, full-time

Qualifications

Experience in AI research, model evaluation, or a related field. Proficiency in programming languages such as Python or similar. Strong analytical skills and attention to detail. Proven ability to collaborate effectively with cross-functional teams.

About the job

Application Deadline: We are actively interviewing candidates and aim to fill this position promptly with the right individual.

ABOUT THE ROLE

At Apollo Research, we conduct evaluations that meticulously assess the risks associated with advanced AI systems. Collaborating with leading laboratories such as OpenAI, Anthropic, and Google DeepMind, you will have the unique opportunity to engage with cutting-edge AI models ahead of their public release. The ideal candidate possesses a passion for rigorously testing state-of-the-art AI technologies and excels in creating and automating efficient evaluation pipelines.

YOUR RESPONSIBILITIES WILL INCLUDE

- Conducting pre-deployment evaluation campaigns on the world's most advanced AI systems. Our partnerships with various labs provide access to a wide range of models that no single lab can offer, allowing you to be among the first to engage with new models.
- Exploring AI cognition by analyzing extensive model transcripts to identify behavioral patterns that have not yet been documented. These findings can be surprising, such as the non-standard language and reward-seeking reasoning discussed in our anti-scheming paper.
- Developing new evaluations for frontier risks, creating innovative test environments, and scaling these across multiple scenarios.
- Collaborating with leading AI developers to share your insights, receive feedback, and ensure your evaluations influence the deployment strategies of the most advanced AI systems.
- Optimizing and automating the evaluation pipeline. We utilize automation in building, executing, and analyzing evaluations. As agent capabilities evolve rapidly, you will have the autonomy to reshape the pipeline to keep pace with these advancements.

KEY REQUIREMENTS

About Apollo Research

Apollo Research is at the forefront of AI safety evaluations, dedicated to understanding and mitigating risks associated with advanced artificial intelligence. We partner with leading AI labs to pioneer innovative testing methodologies.
