AI Agent Testing Specialist Jobs in Germany

3,596 jobs found

Agentic AI Software Engineer

Machine Learning Reply

Full-time|On-site|Munich, Bayern, Germany

Role overview: Machine Learning Reply is looking for an Agentic AI Software Engineer to join the team in Munich, Bayern. This role centers on building intelligent, production-ready systems using large language models (LLMs), autonomous agents, and scalable cloud infrastructure. The position involves close collaboration with clients to deliver practical AI solut…

Apr 16, 2026
Sopra Steria
Full-time|On-site|Ulm

Sopra Steria seeks a Generative AI/Agentic AI Engineer (d/f/m) for its Ulm office. This role centers on developing and refining AI models and solutions that enhance user experiences and make operations more efficient. Main focus: design and improve generative and agentic AI systems; work on projects that aim to simplify processes and deliver smoother user interactions. Location: this position is based in Ulm. Who will succeed: engineers interested in advancing AI capabilities; those who enjoy building solutions that have a direct impact on users and business operations.

Apr 23, 2026
god
Full-time|On-site|Ratingen, Nordrhein-Westfalen, Germany

As a forward-thinking IT service provider, we deliver sophisticated software solutions. Our teams assist clients across various industries in the digital transformation, modernization, and cloud migration of their applications. Are you passionate about the quality of software systems? Do you excel in consulting, designing, and executing tests, and even more so in their automation?

Jan 8, 2026
Agentic AI Software Engineer

Machine Learning Reply

Full-time|On-site|Munich, Bayern, Germany

Join our innovative team as an Agentic AI Software Engineer. We are seeking a talented and driven individual who possesses a strong foundation in software engineering, complemented by practical experience with cutting-edge AI technologies. Your role will involve partnering with clients to design and implement sophisticated, production-ready systems that utilize LLMs, autonomous agents, and scalable cloud infrastructure. Key Responsibilities: Architect, develop, and deploy scalable, production-grade applications on leading cloud platforms such as AWS, GCP, or Azure. Create and manage agentic AI frameworks, incorporating multi-step workflows, tool integrations, and autonomous decision-making capabilities. Oversee the complete lifecycle of AI feature implementation, from proof of concept to full-scale deployment. Develop high-performance APIs and backend services for AI applications using technologies like FastAPI, Flask, or Spring Boot. Integrate generative AI solutions, including LLMs, vector databases, and RAG pipelines, into enterprise environments. Design and implement robust orchestration and retrieval mechanisms for scalable AI applications. Establish and maintain MLOps/LLMOps pipelines and CI/CD workflows for seamless integration, evaluation, and deployment. Ensure software quality through rigorous testing, monitoring, observability, and performance tuning. Collaborate effectively with clients and cross-functional teams to gather requirements and deliver impactful AI solutions.

Mar 30, 2026
Intercom
Full-time|On-site|Berlin, Germany

Join Intercom as a Principal Engineer focusing on Fin, our AI agent for customer service. You will lead the design and development of AI-driven solutions that enhance customer engagement and streamline support processes.

Mar 26, 2026
Helsing
Full-time|On-site|Munich - Berlin - London - Paris

About Us: At Helsing, we are pioneering the use of artificial intelligence in defense, dedicated to ensuring the protection of our democratic values. Our mission is to establish technological leadership that empowers open societies to make autonomous decisions and maintain ethical standards in technology. As advocates for democracy, we recognize the critical responsibility that comes with developing and deploying advanced technologies like AI. This commitment is at the core of our operations. We are an enthusiastic team comprising engineers, AI experts, and program managers, united by a mission-driven ethos. We invite passionate individuals to join our European teams to tackle complex and significant challenges. Our culture promotes open dialogue about the ethical implications and advantages of technology in defense. The Role: Define and deliver tools that enable engineers to trust, test, and deploy AI with confidence. Contribute to the development of a new product line centered on simulation, validation, and assurance for advanced airborne systems. Your strategic roadmap will have a direct influence on flight testing campaigns and accelerate development processes. Daily Responsibilities: Investigate user requirements within engineering and autonomy teams. Establish product strategy and feature roadmap. Collaborate with engineers to define and deliver minimum viable products (MVPs). Balance technical debt with new capabilities. Work alongside UX and simulation teams for optimal product design. Provide updates and communicate progress to leadership. Ideal Candidate: Possesses a solid understanding of software validation, testing, or AI assurance. Thrives in technical and evolving product landscapes. Can simplify complexity within intricate systems. Communicates effectively across engineering and non-engineering teams.

Feb 26, 2026
Andercore
Full-time|On-site|Berlin

About the Company: Andercore is an innovative AI-driven trading platform revolutionizing industrial supply chains in sectors such as infrastructure, energy, and construction materials. Our platform seamlessly connects vetted suppliers from Asia, Europe, and the GCC region with local demand, offering a single integrated solution. Leveraging our proprietary AI stack, we digitize and automate the entire lifecycle of materials trade, from procurement and pricing to inventory management, logistics, and embedded financing, eliminating countless manual and relationship-driven processes in favor of real-time orchestration. Our platform provides buyers with instant quotes, dependable availability, and predictable delivery through a unified operating system. Suppliers benefit from precise forecasting, disciplined demand management, and seamless integration into cross-border fulfillment networks. By partnering with leading brands, we facilitate both dropship and cross-dock fulfillment for large-scale transactions, transforming global supply chains into predictable, software-like workflows. Scaling to triple-digit-million GMV and operating in six international markets, we are rapidly moving towards profitability. With a passionate team of over 80 professionals based in Berlin (HQ) and Asia, we are developing the world's first industrial-grade AI operating system for materials, redefining operations in one of the globe's largest and most essential industries. Backed by top-tier investors and global institutional banking partners, we recently secured $40 million in Series B funding from Atomico, Project A, and Invent Capital, with institutional backing from Commerzbank and KfW. Join our tech team and collaborate with a group of ambitious and knowledgeable engineers, tackling complex problems while gaining exposure to commercial aspects. The Tech Stack: Python, LangGraph, LLM APIs (OpenAI, Anthropic, Gemini), AWS, GitHub Actions, Node.js, Java 21 (Spring Boot 3), Postgres, Kinesis, Terraform, TypeScript, Next.js, React.

Mar 5, 2026
Toloka AI
Contract|Up to $50/hr|Remote|Stuttgart, Baden-Württemberg, Germany

Please submit your CV in English and indicate your English proficiency level. This Senior Python Systems Developer role is a project-based, freelance opportunity with Toloka AI (via Mindrift), supporting top technology companies in testing and improving AI systems. The position is remote and based in Stuttgart, Baden-Württemberg, Germany, but open to candidates working from anywhere. Role overview: This position centers on building and maintaining functional black box tests for large and diverse codebases. Responsibilities include managing Docker environments, using language models to interpret code in C, Rust, and Go, and translating migration requirements into actionable development tasks. Tools such as Roo Code and Claude Code are used to streamline workflows and automate repetitive tasks. What you will do: Design and implement functional black box tests for projects written in multiple programming languages. Set up and manage Docker environments to ensure reproducible builds and consistent testing across platforms. Monitor code coverage and automate scoring to meet industry standards. Apply large language models (LLMs) like Roo Code and Claude to accelerate development, automate routine work, and enhance code quality. Requirements: Minimum 5 years of experience as a Software Engineer, with a strong focus on Python. Deep knowledge of pytest, including fixtures, session-scoped testing, timeouts, and black box functional test design for CLI tools. Advanced experience with Docker: writing reproducible Dockerfiles, managing user contexts, and securing workspaces. Expertise in Linux and Bash scripting, including debugging within containers. Familiarity with modern Python tooling (uv, pyproject.toml, packaging). Ability to read and understand code in C, C++, Rust, or Go with the help of LLMs. Hands-on experience using LLMs (Claude Code, Roo Code, Cursor) to speed up development and generate tests. English proficiency at B2 level or higher.
Preferred qualifications: Background with agent evaluation platforms and MCP CLI. Key tools & technologies: Python (pytest, uv, Pillow), Docker, Bash, Git Submodules, C/C++/Rust/Go (for code reading), Dagger, GitHub Codespaces, LLMs (Claude Code, Roo Code, Cursor), coverage.py, gcov, kcov. Benefits & work arrangement: Freelance, project-based contract through Mindrift (powered by Toloka AI). Fully remote position with flexible scheduling. Work 20-30 hours per week and set your own hours. Compensation varies by project and experience, with potential earnings up to $50 per hour for this engagement.

Apr 24, 2026
anyone-ai
Contract|Up to $40/hr|Remote|Germany

anyone-ai develops STEM training data that supports the advancement of AI models at leading research labs. The team’s work centers on producing high-quality, technically rigorous datasets that help improve AI reasoning in complex scientific domains. This remote position, based in Germany, focuses on designing and writing advanced physics problems for AI training and evaluation. Each problem must be deterministic, have a single correct answer, and include a full, validated solution. The role requires clear documentation of reasoning and technical accuracy throughout the process. What you will do: Develop challenging physics problems that require deep conceptual understanding and multi-step analysis. Ensure every problem has one verifiable correct answer and is suitable for advanced AI systems. Write thorough solutions, documenting each step and validating the reasoning process. Use Python or other specialized tools as needed to build simulations, models, or computational workflows. Produce outputs that are reproducible, technically sound, and clearly written in English.

Apr 29, 2026
Delivery Hero SE
Full-time|On-site|Berlin

Role overview: Delivery Hero SE seeks a Senior AI Scientist for the Vendor Data Team in Berlin. This role centers on Agentic AI, with a focus on enhancing how vendor data supports the company’s operations. What you will do: create AI-driven solutions to improve vendor data processes and capabilities; collaborate with colleagues from various backgrounds to design, build, and refine AI models; use expertise in artificial intelligence to address business challenges and inform important decisions.

Apr 22, 2026
MOIA
Full-time|On-site|Wolfsburg, Germany

Become a key player as a (Senior) Quality Specialist (all genders) in our innovative Pre-Integration and Testing team and contribute to the evolution of autonomous mobility! In your role as a Quality Specialist – System Testing within the Pre-Integration and Testing (PIT) Team, you will be at the forefront of evaluating our next-generation Mobility Platform solution. Working in the Vehicle Integration Domain, you will thrive in an agile, cross-functional, and collaborative environment, where you will lead system-level testing and automation initiatives while enhancing quality standards throughout the domain.

Feb 18, 2026
Robert Bosch GmbH
Full-time|On-site|Renningen

At Bosch, we are pioneers of innovation, dedicated to shaping the future of intelligent systems that are efficient, safe, and improve lives worldwide. We are looking for a forward-thinking Research Engineer to join our vibrant team, specializing in the dynamic intersection of reinforcement learning (RL) and agentic AI. Your innovative contributions will significantly influence critical Bosch sectors, such as automated driving, smart home solutions, energy management, and advanced manufacturing. As an integral part of our team, you will elevate the capabilities of AI technology, spearheading foundational research and developing core systems that will directly enhance Bosch's product offerings. Join us in converting complex challenges into tangible solutions. In this role, you will lead the design and architecture of cutting-edge AI systems, focusing on the seamless integration of reinforcement learning with agentic AI systems and multi-modal foundational models. Your responsibilities will include training and fine-tuning multi-modal large models to ensure their behavior aligns with Bosch's product specifications and real-world applications. You will push the boundaries of AI by improving existing machine learning frameworks and integrating innovative data-driven, generative techniques with secure, scalable reinforcement learning methodologies. A major component of your work will involve conducting original research, leading impactful projects that address complex scientific and practical issues at the intersection of RL and agentic AI, thereby making a significant contribution to Bosch's product line. Collaboration will be essential, as you will work closely with internal stakeholders and product teams to thoroughly understand their needs, conceptualize groundbreaking solutions, and deliver high-quality Minimum Viable Products (MVPs). Additionally, you will have the opportunity to share your insights by publishing and presenting your research findings in top-tier academic forums, while actively contributing to the larger scientific community.

Mar 4, 2026
Sixt SE
Full-time|On-site|Munich

Join Sixt as a Senior Product Owner for our Agentic AI B2B team in Munich. In this role, you will lead the development of innovative AI-driven solutions tailored for our business clients. You will collaborate closely with cross-functional teams to define product vision, prioritize features, and drive the roadmap. Your strategic insight and experience will be pivotal in shaping our product offerings and ensuring they meet market demands. Embrace the opportunity to make a significant impact in a dynamic industry.

Mar 25, 2026
IFS
Full-time|On-site|Düsseldorf

IFS seeks a Sales Executive in Düsseldorf to focus on IFS Loops Agentic AI. This role centers on building strong client relationships and increasing sales for AI-driven solutions. Role overview: The Sales Executive will work to grow the adoption of IFS Loops Agentic AI products. Success in this position depends on understanding client needs, expanding market presence, and supporting customers as they integrate AI offerings. What you will do: develop and manage relationships with clients to encourage adoption of IFS Loops Agentic AI; pursue new business opportunities and drive sales activities; apply established sales strategies to strengthen IFS’s position in the market; collaborate with clients to understand their requirements and enhance their experience with AI products. Requirements: background in sales, ideally with technology or AI solutions; strong customer engagement and relationship management skills; proven ability to identify opportunities and close sales; comfortable in a client-facing capacity.

Apr 27, 2026
Cresta
Full-time|Remote|Germany (Remote)

Cresta is dedicated to transforming every customer interaction into a strategic advantage by harnessing the full potential of the contact center. Our innovative platform integrates advanced AI with human expertise, enabling contact centers to extract valuable customer insights, streamline processes, and empower teams to operate more efficiently. Founded by the visionary Sebastian Thrun, known for his groundbreaking work with Google X, Waymo, and Udacity, Cresta's leadership team includes CEO Ping Wu, co-creator of Google Contact Center AI, and Tim Shi, an early member of OpenAI. Be part of our exciting journey to revolutionize the workforce with AI. The future of work is at Cresta!

Mar 2, 2026
Bedachungen Schmidt GmbH
Full-time|On-site|Weißenthurm, Rheinland-Pfalz, Deutschland

Join Us as a Developer/Programmer for AI Agents! At Bedachungen Schmidt GmbH, one of the fastest-growing roofing companies in Germany, we are on the lookout for talented individuals in IT Management, Digitalization & Processes to assist us in planning and implementing our IT infrastructure. If you have a background in digital topics and are familiar with Low-Code/No-Code solutions as well as API tools like Zapier or Make, or if you are eager to learn quickly and independently, then our Digital & IT department is the perfect challenge for you. As a Developer/Programmer, you will design and automate our processes using modern tools, ensuring seamless integration of systems, smooth data flows, and enhanced operational efficiency. Key Responsibilities: Expand the automation of our business processes using No-Code/Low-Code tools such as n8n, Zapier, and Make. Build, maintain, and manage Airtable databases. Develop front-ends and interfaces using No-Code/Low-Code tools. Plan, implement, and execute new business processes. Document and conceptualize digital business processes. Provide user training and support. Conduct error analysis and troubleshooting. We at Bedachungen Schmidt are a diverse and high-performing roofing company dedicated to the professional roofing craft for over 160 years. We help Germany tackle the energy crisis while offering a fantastic workplace with excellent transport links, currently employing over 100 staff at our location in Weißenthurm. In short, we are looking for positive individuals who love what they do. As a family-owned business, we strive to do our best together and cherish teamwork. Are you ready to join us?

Dec 1, 2025
anyone-ai
Contract|$40/hr|Remote|Germany

Overview of the Role: At anyone-ai, we are dedicated to developing high-quality STEM training data that empowers pioneering AI models utilized by top-tier AI laboratories. Our mission is to enhance model reasoning capabilities within scientific disciplines. We invite Biology experts to join our team in crafting intricate, deterministic problems with one definitive solution. These problems should mirror real-world scientific and analytical methodologies and must be accompanied by thoroughly validated solutions. Your expertise may span various specializations such as molecular biology, genetics, systems biology, computational biology, bioinformatics, or closely related quantitative biology areas. Your Responsibilities: Develop sophisticated biology problems that challenge leading-edge AI systems. Design deterministic tasks that yield a single correct answer. Provide complete, verified solutions for the designed problems. Create problems that involve experimental reasoning, biological systems, computational analyses, or bioinformatics workflows. Utilize Python and, as appropriate, specialized biology or bioinformatics tools. Maintain rigorous standards of precision, reproducibility, and technical clarity. Qualifications: Bachelor's, Master's, or PhD in Biology or a related life sciences discipline. Experience in research or industry with a focus on computational or quantitative biological analysis. Proficient in Python; experience with data analysis or bioinformatics workflows is a plus. Strong analytical skills and comfort with multi-step scientific problem-solving. Proven ability to devise original, challenging problems rooted in practical biological applications. Excellent written communication skills in English and meticulous attention to detail. Preferred Qualifications: Familiarity with additional programming languages or bioinformatics tools.

Mar 31, 2026
Causa Prima
Full-time|On-site|Munich

As a pivotal member of our team, you will play an essential role in shaping our product by designing, developing, and maintaining the autonomous agents that drive Causa Prima forward. Your work will encompass a variety of tasks, from validating invoices and resolving disputes to negotiating across companies and optimizing cash flow. Key Responsibilities: Agent Architecture: Craft the multi-agent system, defining agent boundaries, event contracts, orchestration patterns, and delineating between LLM reasoning and deterministic enforcement. LLMs will inform while rules will enforce. Document Processing Pipeline: Manage the ingestion, classification, and structured extraction of financial documents (such as invoices, contracts, purchase orders) utilizing LLM-based parsing and vision models, with a focus on sandboxed processing for untrusted documents. Validation & Anomaly Detection: Implement business rule checks, perform three-way matching, detect pricing discrepancies, and identify anomalies over time. Integration Framework: Integrate with platforms such as Gmail, Google Drive, Slack, banking APIs, crypto wallets, and accounting systems, focusing on authentication models, credential lifecycle, and data normalization. Knowledge Graph: Develop a Neo4j context graph that encapsulates implicit business knowledge, providing insights into the reasoning behind decisions, including entity extraction, GraphRAG patterns, and provenance tracking. Multi-LLM Strategy: Establish per-agent model selection, create structured output contracts using Pydantic schemas, and implement an evaluation framework along with a fallback strategy. Compliance & Security: Design agents to operate within a zero-trust model, ensuring independent verification against source data, outbound content review, and prevention of cascading failures from prompt injections.

Apr 6, 2026
Bosch Group
Internship|On-site|Stuttgart

Join Bosch Group as an intern in our innovative AI-driven automation team! This internship offers a unique opportunity to work with cutting-edge agent systems that are revolutionizing the automation landscape.

Feb 18, 2026
Intercom
Full-time|$250/yr - $250/yr|On-site|Berlin, Germany

Intercom builds AI-driven customer service tools that help businesses offer strong support experiences. Our AI agent, Fin, powers always-on customer care and integrates with the Intercom Helpdesk as part of the Customer Service Suite. This platform handles complex queries and connects customers to human support when needed. Since 2011, nearly 30,000 businesses have chosen Intercom to improve their customer service operations. Role Overview: The Senior Engineering Manager, Fin AI Agent, joins the Service Agent pillar in Berlin. This pillar includes 12 teams and continues to grow. The role involves leading one or two of the most strategically important work streams within this group. Team structures shift frequently, so flexibility and adaptability are essential. The broader market moves quickly, with AI-focused startups competing for the same ground, so high standards and comfort with rapid change are key. What You Will Do: Own critical work streams from start to finish, ensuring meaningful impact beyond just meeting deadlines or maintaining team morale. Engage directly with the product and its challenges; this could include writing code, performing analyses, collaborating across functions, or helping team members overcome blockers. Form and lead new teams as priorities shift, often bringing together engineers who have not worked together before. Set clear expectations and build momentum, even under tight deadlines. Measure success by the real-world results your teams achieve. Who Thrives Here: This role suits someone who enjoys high-pressure, fast-moving environments and is motivated by delivering real results. The work demands strong leadership, adaptability, and a willingness to get involved directly in technical and organizational challenges.

Apr 20, 2026
