AI Agent Testing Specialist at Mindrift | Madrid, Spain | RoboApply Jobs

This job posting is no longer active and is not accepting applications.

AI Agent Testing Specialist - Evaluation Scenario Writer

Mindrift | Remote | Madrid, Community of Madrid, Spain

Remote | Part-time | $0/hr - $21/hr

Qualifications

Computer Science or Software Engineering degree.
5+ years of software development experience, mainly with Python.
Full-Stack development expertise, particularly in React and back-end systems.
Experience in writing functional and integration tests.
Proficiency with Docker containers.
Understanding of CI/CD, especially GitHub Actions.
B2 level English proficiency.

About the job

We kindly ask that you submit your CV in English and specify your English proficiency level.

At Mindrift, we connect talented professionals with exciting project-based AI opportunities at leading technology firms, focusing on the testing, evaluation, and enhancement of AI systems. Please note that participation is project-based and does not constitute permanent employment.

What You Will Do

In this role, you will design challenging coding test cases that thoroughly assess the capabilities of AI coding systems:

Critically evaluate and enhance realistic coding tasks derived from existing production codebases, ensuring they align with realistic scope, requirements, and information sources.
Create detailed functional tests that validate comprehensive end-to-end behaviors and edge cases, going beyond superficial checks.
Devise “fair yet challenging” scenarios where the AI has access to all necessary context but must engage in complex reasoning across multiple files and external sources.
Examine AI failures to identify where the model struggles and where it performs well.
Refine your work based on evaluations from expert QA reviewers who will assess your output against seven quality criteria.

Who We Are Looking For

This opportunity is well-suited for seasoned developers, software engineers, or test automation specialists seeking part-time, non-permanent engagements. Ideal candidates will possess:

A degree in Computer Science, Software Engineering, or a related field.
5+ years of experience in software development, with a primary focus on Python (including pytest, async/await, subprocess, and file operations).
A solid background in Full-Stack development, with balanced expertise in both React-based interfaces and robust back-end systems.
Proven experience in writing tests (functional and integration) rather than simply executing them.
Familiarity with Docker containers for running evaluations locally.
An understanding of CI/CD processes, particularly with GitHub Actions (involving triggers, labels, and result interpretation).
English proficiency at a minimum of B2 level.

Application Process

Apply → Complete qualifications → Join a project → Execute tasks → Get compensated

Estimated Effort

The tasks for this project are estimated to take approximately 20 hours to complete, depending on their complexity. This is an estimate only. There are no fixed schedules, so you are free to set your own working hours. All tasks must be submitted by the deadline and meet the acceptance criteria to be validated.

Compensation

Paid contributions, with rates of up to $21/hour*
Compensation may be structured as a fixed project rate or individual rates, depending on the project.
Some projects may offer additional incentive payments.

*Note: Rates vary based on expertise, skills assessment, location, project requirements, and other factors. Highly specialized experts may receive higher rates. Lower rates might apply during onboarding or non-core project phases. Payment details are project-specific.

About Mindrift

Mindrift is dedicated to connecting specialists with innovative project-based AI opportunities. We focus on enhancing AI systems for leading technology companies, ensuring that our contributors engage in meaningful and impactful work.
