Machine Learning Scientist at Arena Intelligence | Bay Area

On-site Full-time

Qualifications

To be successful in this role, you should possess:

  • Experience with large-scale model training, including reward and preference models.

  • A strong grasp of machine learning principles and statistical techniques.

  • Expertise in designing training objectives and evaluation frameworks.

  • Familiarity with experimental methodologies from dataset design to evaluation.

About the job

Join the Arena Intelligence Team

Arena Intelligence is a cutting-edge platform dedicated to evaluating the performance of AI models in real-world scenarios. Founded by a team of researchers from UC Berkeley’s SkyLab, our mission is to push the boundaries of AI through comprehensive measurements and advancements.

Every month, millions turn to Arena Intelligence to gain insights into the performance of pioneering AI systems. Our community-driven feedback loop helps us create transparent, rigorous, and human-centered evaluations. Major enterprises and AI laboratories trust our assessments for their reliability, alignment, and impact. Our leaderboards have become the benchmark for AI performance, influencing the global discourse on model efficacy and innovation.

Our team comprises top researchers, engineers, and builders from prestigious institutions like UC Berkeley, Google, Stanford, DeepMind, and Discord. We prioritize truth, agility, craftsmanship, curiosity, and impactful work over traditional hierarchies, fostering an environment where diverse talents can thrive. Our office is a hub of excellence, energy, and focus.

Your Role as a Machine Learning Scientist

We are looking for a skilled Machine Learning Scientist to enhance our methods for evaluating and understanding AI models. You will design and analyze experiments that use human preference signals to reveal what makes models useful, trustworthy, and capable. Your contributions will lay the groundwork for scalable AI understanding.

This interdisciplinary role involves close collaboration with engineers, product teams, marketing, and the wider research community to develop innovative methodologies for comparing models, analyzing preference data, and disentangling the factors behind performance, such as style, reasoning, and robustness. Your work will directly impact our public leaderboard and the resources we provide to model developers.

If you are intrigued by open-ended challenges, rigorous evaluations, and impactful research, we invite you to apply. We are looking for candidates with:

  • Hands-on experience in training large-scale models, including reward and preference models, as well as fine-tuning LLMs using methodologies such as RLHF, DPO, and contrastive learning.

  • A solid foundation in machine learning and statistics, with proven experience in designing innovative training objectives, evaluation schemes, or statistical frameworks to enhance model reliability and alignment.

  • Proficiency in the entire experimental pipeline, from dataset design and large-batch training to thorough evaluation and ablation, with an understanding of scalability for production.

About Arena Intelligence

Arena Intelligence is a pioneering platform transforming the evaluation of AI models through rigorous and transparent methodologies. Our mission is to advance AI understanding and performance, guided by feedback from a vibrant community of users and experts. We are committed to fostering an inclusive environment where innovation thrives.
