About the job
Please submit your CV in English and indicate your level of English proficiency.
Mindrift connects talented professionals with project-based opportunities in AI for major technology firms, emphasizing the testing, evaluation, and enhancement of AI systems. Please note that participation is project-based rather than a permanent employment position.
Role Overview
This role is ideal for a Senior Python Engineer with extensive experience in functional testing. The candidate should have strong Linux and Docker skills, be able to read code in several programming languages (such as C, Rust, and Go) with assistance from LLMs, and be adept at translating requirements for migration tasks. Proficiency in tools like Roo Code or Claude Code to facilitate iterative development is also essential.
Key Responsibilities
- Develop functional black box tests for substantial codebases in multiple source languages.
- Create and administer Docker environments to guarantee 100% reproducible builds and test executions across diverse platforms.
- Track code coverage and configure automated scoring criteria to align with industry benchmark standards.
- Use LLM-based tools (such as Roo Code or Claude Code) to speed up development cycles, automate repetitive tasks, and improve overall code quality.
Qualifications
- Minimum of 5 years of experience as a Software Engineer, predominantly in Python.
- In-depth expertise with pytest (including fixtures, session scope, and timeouts) and with writing black-box functional tests for CLI tools.
- Advanced Docker capabilities (such as reproducible Dockerfiles, user contexts, and secure workspaces).
- Strong Linux & Bash scripting skills, with a comfort level in debugging within containers.
- Proficiency with contemporary Python tools (uv, pyproject.toml, packaging).
- Ability to read and understand code in multiple languages (e.g., C, C++, Rust, or Go) with LLM support.
- Experience with LLM-based coding tools (Claude Code, Roo Code, Cursor) to accelerate iterative development and test-case generation.
- English proficiency at B2 level or higher.
Preferred Qualifications
- Previous experience with agent evaluation platforms and MCP CLI.
Tech Stack: Python (pytest, uv, Pillow), Docker, Bash, Git Submodules, C/C++/Rust/Go (reading), Dagger, GitHub Codespaces, LLMs (Claude Code, Roo Code, Cursor), coverage.py, gcov, kcov.
What We Offer
- Freelance project-based collaboration through the Mindrift platform (powered by Toloka AI).
- Fully remote and flexible participation: you decide when and how much to contribute (20-30 hours per week).
- Task-based compensation of up to $50/hour*, depending on performance and workload.
- Engagement in innovative AI projects for leading tech companies.

