Research Engineer In Machine Learning Reinforcement Learning jobs in London – Browse 2,578 openings on RoboApply Jobs

Research Engineer in Machine Learning - Reinforcement Learning

Anthropic | London, UK

On-site Full-time

Qualifications

We are looking for candidates who have a strong foundation in machine learning, particularly in reinforcement learning. Ideal candidates should possess:

A solid understanding of AI and machine learning principles.
Experience with programming languages such as Python or similar.
Familiarity with deep learning frameworks and libraries.
Research experience in reinforcement learning or related fields.
Excellent problem-solving skills and the ability to work collaboratively in a team environment.

About the job

About Anthropic

At Anthropic, we are dedicated to developing reliable, interpretable, and controllable AI systems. Our goal is to ensure that AI technology is safe and beneficial for both users and society. Our rapidly expanding team consists of passionate researchers, engineers, policy experts, and business leaders collaborating to create advantageous AI systems.

About the Teams

The Reinforcement Learning teams at Anthropic spearhead our research and development in reinforcement learning, playing an essential role in enhancing our AI systems. We have made significant contributions to all Claude models, particularly impacting the autonomy and coding capabilities of Claude Sonnet 4.5 and Opus 4.5. Our work encompasses several critical areas:

Creating systems that empower models to utilize computers effectively.
Enhancing code generation through reinforcement learning techniques.
Conducting pioneering RL research for large language models.
Establishing scalable RL infrastructure and training methodologies.
Improving model reasoning capabilities.

We work closely with Anthropic's alignment and frontier red teams to ensure our systems are both capable and secure. Additionally, we collaborate with the applied production training team to seamlessly integrate research advancements into deployed models, demonstrating our commitment to implementing research at scale. Our Reinforcement Learning teams operate at the intersection of cutting-edge research and engineering excellence, dedicated to building high-quality, scalable systems that expand the possibilities of AI.

About the Role

As a Research Engineer in the Reinforcement Learning domain, you will partner with a diverse group of researchers and engineers to enhance the capabilities and safety of large language models. This position merges research and engineering responsibilities, requiring you to implement innovative approaches while contributing to the research strategy. You will engage in fundamental research in reinforcement learning, developing 'agentic' models capable of tool use for open-ended tasks such as computer usage and autonomous software generation, improving reasoning skills in disciplines like mathematics, and creating prototypes for internal applications, productivity, and evaluation.

Representative Projects:

Design and optimize core reinforcement learning infrastructure, from clean training abstractions to distributed experiment management across GPU clusters, scaling our systems to manage increasingly complex research workflows.
Invent, implement, and evaluate novel training environments, evaluations, and methodologies for reinforcement learning.

About Anthropic

Anthropic is at the forefront of AI development, committed to creating systems that are not only powerful but also safe and interpretable. Our diverse team works tirelessly to push the boundaries of what AI can achieve, ensuring that our innovations benefit society as a whole.

1 - 20 of 2,578 Jobs

Research Engineer in Machine Learning - Reinforcement Learning

Anthropic

Full-time|On-site|London, UK

About Anthropic

At Anthropic, we are dedicated to developing reliable, interpretable, and controllable AI systems. Our goal is to ensure that AI technology is safe and beneficial for both users and society. Our rapidly expanding team consists of passionate researchers, engineers, policy experts, and business leaders collaborating to create advantageous AI sys…

Feb 12, 2026

Research Scientist in Reinforcement Learning

DeepMind Technologies Limited

Full-time|On-site|London, UK

At Google DeepMind, we prioritize diversity in experience, knowledge, backgrounds, and perspectives, leveraging these attributes to create remarkable impact. We are dedicated to providing equal employment opportunities regardless of sex, race, religion, belief, ethnic or national origin, disability, age, citizenship, marital status, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or any other condition protected by applicable law. If you have a disability or additional needs that require accommodation, please feel free to reach out to us.

Snapshot

We seek talented Research Scientists to advance fundamental research and technology in Artificial Intelligence, as part of our interdisciplinary and collaborative Reinforcement Learning team.

About Us

DeepMind’s RL team is a cohesive and collaborative group of scientists and engineers, led by Tom Schaul. We address large-scale research challenges in reinforcement learning, designing, refining, and scaling RL algorithms to yield significant scientific and product impact. Over the past decade, our RL team has been pivotal in developing innovations such as DQN, AlphaGo, Rainbow, AlphaZero, MuZero, AlphaStar, AlphaProof, and Gemini. Join us in creating the next groundbreaking advancement!

The Role

As a Research Scientist, you will leverage your machine learning expertise and technical skills to innovate, spearhead research projects, and apply findings to impactful challenges. You will be responsible for implementing code, conducting experiments, owning results from start to finish, communicating findings effectively, and collaborating with fellow team members to empower others.

Your work may involve:

Initiating and pursuing novel research directions by proposing and testing hypotheses.
Implementing algorithmic ideas and executing comprehensive experiments, including setup, execution, analysis, and iteration.
Sharing your expertise and insights with other researchers.
Building or enhancing infrastructure to support scalable research.
Designing evaluations and ablations that address real questions and influence perspectives.
Meticulously analyzing results, including debugging and failure analysis.
Effectively communicating through visualizations, write-ups, and publication-ready narratives and figures.
Contributing to a culture of first-principles thinking, high standards, and constructive feedback.

Our projects encompass a broad spectrum of state-of-the-art machine learning and AI domains, including large language models, distributed machine learning, and more.

Mar 14, 2026

Anthropic Fellows Program - Focus on Reinforcement Learning

Anthropic

Full-time|Hybrid|London, UK; Ontario, CAN; Remote-Friendly, United States; San Francisco, CA

Join the prestigious Anthropic Fellows Program, where you'll have the opportunity to delve into cutting-edge research in Reinforcement Learning. This program is designed for individuals passionate about advancing AI safety and developing innovative solutions. As a fellow, you will collaborate with a team of experts, engage in impactful projects, and contribute to a progressive research environment.

Apr 10, 2026

Machine Learning Software Engineer - Research

PhysicsX

Full-time|On-site|London, United Kingdom

PhysicsX is looking for a Machine Learning Software Engineer to join its research division in London. This position centers on developing new machine learning solutions in collaboration with researchers and engineers.

Role overview

The work involves applying advanced algorithms and data analysis techniques to address complex challenges spanning several fields. The team values creative problem-solving and technical depth.

Collaboration

Expect to work closely with colleagues from diverse backgrounds, combining expertise to create practical and innovative solutions. The environment encourages sharing ideas and learning from each other.

Location

This role is based in London, United Kingdom.

Apr 27, 2026

Research Engineer in Machine Learning - RL Velocity

Anthropic

Full-time|On-site|London, UK

Anthropic seeks a Research Engineer specializing in Machine Learning, with a focus on Reinforcement Learning (RL Velocity), for its London office. This role supports ongoing AI research and contributes to building advanced machine learning systems.

Key responsibilities

Work alongside researchers and engineers to solve complex reinforcement learning problems
Participate in designing and developing new machine learning models and systems
Shape solutions that directly influence Anthropic’s research objectives

Collaboration and team environment

Join a team of skilled colleagues dedicated to AI advancement. Team members regularly exchange ideas, review each other's work, and support one another to create effective solutions.

Apr 23, 2026

Senior Research Scientist in Reinforcement Learning at Canva | London

Canva

Full-time|On-site|London

At Canva, our mission is to empower individuals to unleash their creativity through design. We are innovating AI technology that not only feels intuitive but also creates meaningful impacts for millions, enabling everyone to design with confidence. We are seeking a Senior Research Scientist passionate about reinforcement learning, agentic systems, and mixture of experts (MoE) models to advance our capabilities in reasoning, tool utilization, latency, and reliability.

About the Team

Our team delves into multimodal agentic architectures, establishing robust training and evaluation frameworks. We collaborate closely with product and platform teams to transform groundbreaking research into engaging product features. As a pioneering post-training team, we are dedicated to developing advanced multimodal agentic systems. We cover a wide array of topics, including multimodal modeling, post-training strategies, and agent design.

About the Role

In this role, you will influence research directions and engage in hands-on initiatives across the agent stack, from reward design and policy optimization to planning, memory management, tool orchestration, dataset construction, and the innovation of post-training methodologies. You will create meticulously designed experiments, iterate rapidly, and derive reliable conclusions, all while ensuring that research translates into safe, high-quality product experiences.

Key Responsibilities

Design and develop agent systems focused on planning, multimodal tool usage, retrieval, innovative training methods, and modeling experiments for real-world applications in design, vision, and language.
Implement scalable post-training and reinforcement learning solutions across distributed systems (using PyTorch), optimizing data loaders, telemetry, and stable training of MoE architectures while ensuring reproducibility.
Contribute to the reinforcement learning and agentic systems research agenda that aligns with Canva’s product vision; quickly identify and prioritize high-impact projects.
Create reward models and learning loops, including RLHF/RLAIF, preference modeling, DPO/IPO-style objectives, offline/online RL, and curriculum learning.
Develop simulation tasks that expose failure modes (planning errors, tool-use weaknesses, hallucinations, unsafe actions) and establish measurable targets for improvement.
Lead rigorous evaluations for agents, focusing on task success, reliability, latency, safety, and regression testing. Set up offline suites and conduct online A/B testing; favor straightforward experiments that yield generalizable results.
Collaborate closely with product, design, safety, and platform teams to successfully integrate research findings into reliable product features.

Feb 25, 2026

Machine Learning Research Engineer

Recraft

Full-time|On-site|London, UK

About Us

Established in the United States in 2022 and now operating from London, UK, Recraft is an innovative AI platform tailored for designers, illustrators, and marketers, setting a new benchmark in image generation excellence.

Our cutting-edge tool empowers creators to swiftly generate and refine original images, vector art, illustrations, icons, and 3D graphics using advanced AI technology. With over 3 million users in 200 countries, our community has produced hundreds of millions of images with Recraft, and we are just beginning our journey.

Become part of a world filled with professional opportunities, contribute to large-scale projects, and be a pioneer in the creative industry’s future. We are dedicated to making Recraft a vital tool for every designer, striving to set the industry standard. Our mission focuses on ensuring that creators maintain complete control over their creative processes with AI, offering them innovative tools to transform their ideas into reality.

If you have a passion for pushing the limits of AI technology, we would love to have you join our team!

Sep 2, 2025

Staff Machine Learning Software Engineer - Research

PhysicsX

Full-time|On-site|London, United Kingdom

PhysicsX is seeking a Staff Machine Learning Software Engineer to join its Research team in London. This position centers on building advanced algorithms and machine learning solutions to drive forward ambitious research initiatives.

Role overview

The main focus is on designing and implementing machine learning models that push the boundaries of current technology. Projects will involve exploring new methods and supporting research efforts that aim to solve complex problems.

Location

This role is based in London, United Kingdom.

What you will do

Develop and refine machine learning algorithms for research applications
Collaborate with researchers to support innovative projects
Work on solutions intended to advance the capabilities of existing technology

Apr 27, 2026

Research Scientist in Machine Learning

Isomorphic Labs

Full-time|On-site|London

Isomorphic Labs is pioneering the application of advanced AI technologies to unlock profound scientific insights, expedite breakthroughs, and develop transformative medicines with a vision to eradicate all diseases.

The future is on the horizon, made possible by the remarkable capabilities of machine learning. We envision a world where diseases are effectively managed or eliminated through accelerated and enhanced drug discovery.

Join our interdisciplinary team to spearhead groundbreaking innovations and play a pivotal role in our mission to achieve ambitious health-related goals, all while contributing to a vibrant and collaborative workplace culture.

The future we aspire to is being shaped today. It begins with our culture at Isomorphic Labs. It starts with you.

About Isomorphic Labs

Founded in 2021, Isomorphic Labs (IsoLabs) strives to enhance human health by building upon the Nobel Prize-winning AlphaFold system. Our diverse team of experts in drug discovery and machine learning has developed groundbreaking predictive and generative AI models that accelerate scientific discovery at an unprecedented pace.

Our name reflects our belief in the inherent symmetry between biology and information science. By leveraging AI's robust capabilities, we can model complex biological processes to design innovative molecules, predict drug performance, and develop novel treatments for some of the world's most challenging diseases.

Our state-of-the-art drug design engine integrates AI models capable of addressing various therapeutic areas and drug modalities. We are dedicated to continuously innovating model architectures and creating cutting-edge solutions to advance rational drug design.

With each significant advancement, we are progressing towards the promise of digital biology, moving closer to our ambitious mission of ultimately solving all diseases through the power of AI.

Mar 20, 2026

Machine Learning Researcher at Wintermute Trading | London

Wintermute Trading

Full-time|On-site|London

Join Our Innovative Team as a Machine Learning Researcher

At Wintermute, we are at the forefront of digital asset trading, leveraging cutting-edge technology to provide unparalleled liquidity and OTC solutions across the cryptocurrency landscape. As a technology unicorn founded in 2017, we are a leading player in algorithmic trading, committed to supporting high-profile blockchain projects and ushering traditional financial institutions into the exciting world of crypto. Our venture arm also invests in early-stage DeFi initiatives, fostering innovation in the blockchain space.

Our culture merges the rigor of high-frequency trading with the innovative spirit of tech startups, creating an environment that champions both technological excellence and entrepreneurial ambition. We believe in the transformative potential of blockchain technology, maintaining a long-term vision for the digital asset market while prioritizing compliance and innovation.

Your Role and Responsibilities

As a Machine Learning Researcher, you will harness your expertise in applied deep learning to develop robust alpha signal generation pipelines. Your work will encompass everything from data ingestion and feature engineering to model training and deployment, all in close collaboration with our dynamic trading and infrastructure teams.

Nov 18, 2025

Machine Learning Researcher at Jane Street | London

Jane Street

Full-time|On-site|London, England, United Kingdom

About the Position

We are seeking intelligent and inquisitive individuals to enhance our dynamic Machine Learning team at Jane Street. This role offers the opportunity to develop cutting-edge deep learning models that underpin our trading strategies, utilizing our expansive computing cluster equipped with tens of thousands of high-performance GPUs. The trading environment presents unique challenges, including large-scale models and nonstationary datasets within a competitive multi-agent landscape, prompting innovative solutions.

At Jane Street, our researchers, engineers, and traders collaborate closely, sharing knowledge and expertise just a few feet apart. Your day may involve analyzing market data, optimizing hyperparameters, diagnosing distributed training performance, or exploring our models' behaviors in live trading scenarios. Your extensive knowledge of the machine learning domain and familiarity with diverse methodologies (ranging from LLMs, image recognition systems, RL agents, and recommendation frameworks to classical ML techniques) will be invaluable in advancing ML at Jane Street.

You will play a crucial role in training models for our next-generation deep learning trading strategies and developing the insights necessary to navigate new markets and challenges. Furthermore, you will be involved in recruiting new team members, attending industry conferences, and sharing your expertise with colleagues, all of which are considered significant contributions to our mission.

Feb 5, 2026

Machine Learning Researcher - Generative Modeling Expert

latentlabs

Full-time|On-site|London

Join latentlabs as a distinguished Machine Learning Researcher specializing in generative modeling. You will collaborate with a diverse team of machine learning experts, protein engineers, and biologists, all dedicated to revolutionizing biological control and disease treatment. Your primary focus will be to design innovative generative models aimed at creating new proteins that demonstrate functionality in wet lab assays.

Who You Are

Expert in Machine Learning Research. You possess substantial experience in generative modeling and have contributed to significant machine learning projects, as evidenced by your involvement in prominent open-source libraries, impactful product launches, or high-caliber publications at conferences such as NeurIPS, ICML, or ICLR.
Skilled ML Developer. You produce robust, well-tested, and maintainable ML code. Your familiarity with version control and code review practices ensures high-quality development. You excel at rapid prototyping while also crafting polished production-ready code, with experience in running training and inference on cloud infrastructure.
Data Engineering Proficient. You have a strong background in creating ML data pipelines for training and evaluating deep learning models. You’re skilled at analyzing raw data, creating effective dataset splits, and building scalable pipelines.
Passionate About Model Performance. Your deep understanding of ML libraries, hardware interactions, and data optimization enables you to enhance the performance of deep learning models, focusing on training and inference speed and validation metrics.
Driven and Inquisitive. You have a strong desire to effect positive change, whether for patients or customers. You adapt flexibly to diverse methodologies and are curious about challenges, no matter their scale. Your ability to thrive in a fast-paced environment allows you to achieve goals effectively.

What Sets You Apart

Experience in Computational Biology or Protein Design. You have engaged in ML-driven projects within the biological domain.
Natural Sciences Background. Your foundation in natural sciences supports your work in machine learning applications.

Feb 19, 2026

Junior Quantitative Researcher - Machine Learning Alpha Research

Squarepoint Capital

Full-time|$150K/yr - $150K/yr|On-site|London, New York, Singapore, Boston, Paris, Zug, Geneva, Hong Kong, Bangalore

We invite candidates to apply for the position that best aligns with their expertise. If we find a better fit for your profile, we will contact you about alternative opportunities.

Position Overview:

Conduct research utilizing statistical techniques, including time-series analysis, machine learning, and natural language processing to derive insights from data.
Analyze extensive data sets employing advanced statistical and machine learning methodologies to uncover trading opportunities.
Collaborate in developing statistical and machine learning tools to address complex data-related challenges across the organization.

Typical Day:

Engage in researching innovative statistical and machine learning techniques and exploring various datasets throughout the day.
Present and discuss research findings with fellow researchers.
Deploy and oversee models utilized for generating trading signals.

Mar 10, 2026

Junior Machine Learning Engineer

Recraft

Full-time|On-site|London, UK

About Us

Founded in the United States in 2022 and now headquartered in London, UK, Recraft is revolutionizing the creative landscape with its cutting-edge AI tool designed specifically for professional designers, illustrators, and marketers. Our platform sets a new benchmark in image generation excellence.

Our innovative tool empowers creators to swiftly generate and refine original images, vector art, illustrations, icons, and 3D graphics using AI technology. With over 3 million users across 200 countries, who have collectively produced hundreds of millions of images, Recraft is just at the beginning of its journey.

Join us and explore a universe of professional opportunities! Contribute to large-scale projects and help shape the future of creativity. We are dedicated to making Recraft an indispensable daily tool for every designer, setting the industry standard. Our mission is to enable creators to fully control their creative process with AI, equipping them with innovative tools to turn their ideas into reality.

If you are enthusiastic about pushing the boundaries of AI, we welcome you to join our team!

Jan 7, 2026

Senior Machine Learning Engineer

datatonic

Full-time|On-site|London Consulting

Join datatonic as a Senior Machine Learning Engineer and be a pivotal part of our innovative team focused on transforming data into actionable insights. In this role, you will leverage your expertise in machine learning to develop and enhance algorithms, contribute to data-driven solutions, and collaborate with cross-functional teams to push the boundaries of data analytics.

Mar 18, 2026

Machine Learning Engineer

datatonic

Full-time|On-site|London Consulting

Join datatonic as a Machine Learning Engineer and be part of a dynamic team dedicated to leveraging data to drive intelligent decision-making. You will design and implement machine learning models, optimize algorithms, and collaborate with cross-functional teams to deliver innovative solutions.

Mar 18, 2026

Director of Machine Learning Engineering

Trainline

Full-time|On-site|London

Join Trainline as the Head of Machine Learning Engineering, where you'll lead our innovative machine learning initiatives. In this pivotal role, you will spearhead the development and implementation of advanced ML solutions that enhance our customer experience and optimize our operations.

Mar 13, 2026

Staff Machine Learning Engineer at Deliveroo | London

Deliveroo

Full-time|Hybrid|London - The River Building HQ

Join Our Team as a Staff Machine Learning Engineer

Become an integral part of Deliveroo’s mission to revolutionize the shopping and dining experience with a focus on impact, innovation, and growth. Our talented Engineering teams tackle intricate technical challenges within a global, three-sided marketplace, developing and scaling systems that cater to millions of customers, riders, and partners daily.

From advanced real-time logistics to robust infrastructure and marketplace optimization, we create and manage the technology that fuels Deliveroo’s expansive growth.

We are seeking a Staff Machine Learning Engineer to join our dynamic London team (working in a hybrid model, 3 days in the office). In this pivotal role, you will design and construct intelligent decision-making systems that operate at a large scale, directly influencing the experiences of consumers, riders, and merchants.

Explore our Engineering team and discover our motivations, work culture, and what you can expect as part of our community.

Your Responsibilities

Join the Consumer Pricing team to tackle complex pricing challenges at Deliveroo. We are advancing towards dynamic, personalized pricing strategies that take into account consumer behavior, loyalty programs, and real-time market conditions.

Your daily tasks will include:

Designing and developing high-performance machine learning and optimization systems that guide Deliveroo’s core decisions at scale.
Leading the technical advancement of our pricing infrastructure, resolving architectural bottlenecks and integrating enhanced model inputs and decision variables.
Creating algorithms to enhance real-time marketplace efficiency, including personalized elasticity models and delivery time forecasts.
Collaborating closely with Product Managers and Data Scientists to convert complex business challenges into effective algorithmic solutions.
Providing technical leadership across various product domains, identifying and prioritizing high-impact algorithmic enhancements.
Mentoring and guiding fellow engineers, elevating technical standards and advancing our machine learning and optimization practices across the company.

Qualifications for Success

The ideal candidate will possess strong expertise in several of the following areas, along with a desire to grow in others:

Jan 9, 2026

Machine Learning Engineer

Faculty

Full-time|On-site|London

Join Faculty as a Machine Learning Engineer and be at the forefront of innovation in artificial intelligence. In this role, you will leverage your expertise in machine learning to develop and implement cutting-edge algorithms and models that drive impactful solutions across various industries.

As a part of our dynamic team, you will work collaboratively to transform complex data sets into actionable insights, optimizing our products and enhancing client offerings. If you are passionate about technology and eager to make a difference, we would love to hear from you!

Mar 20, 2026

Senior Machine Learning Researcher at Longshot Systems | London

Longshot Systems Ltd

Full-time|Hybrid|London, England, United Kingdom

Join Longshot Systems, where we are at the forefront of developing sophisticated platforms for sports betting analytics and trading.

We are in search of talented Machine Learning Researchers to join our quantitative modeling team. This team’s primary objective is to enhance the predictive capabilities of our models using historical event and market data. The quality of our models is paramount, as improvements directly influence our company’s success.

In this role, you will design, test, and implement innovative machine learning models using Python, continuously enhancing our existing state-of-the-art solutions. As a small, focused company, we offer you the chance to be involved in every aspect of the R&D process, from high-level design to production implementation.

The ideal candidate will possess high creativity and enjoy developing new, innovative approaches to problem-solving, and will have the autonomy to explore the most suitable methods for the challenges at hand. A strong mathematical foundation in machine learning and core statistics is essential. While knowledge of sports betting is not required, experience with modeling sports, particularly in-play football, basketball, or tennis, is advantageous.

We embrace a hybrid working model, requiring in-office presence on Thursdays at our London (Farringdon) office, while allowing flexibility for the remainder of the week. Our standard working hours are from 10 am to 6 pm UK time, Monday to Friday, with support for flexible schedules to help our team achieve their goals.

Our interview process includes:

Introductory call (30 mins) - discussing your background and interests
Technical interview (60 mins) - focusing on modeling questions and a coding exercise
Full assessment day (10:30 am – 5 pm) - encompassing a comprehensive modeling exercise and team interactions

Feb 3, 2026
