Machine Learning Engineer, ML Platform
Experience Level
Entry Level
Qualifications
About Zaimler
Zaimler is a pioneering company focused on creating the essential context infrastructure for the new era of autonomous AI agents. Our team is composed of experienced professionals dedicated to addressing the challenges of fragmented enterprise data and transforming it into actionable insights through advanced machine learning techniques.
Similar jobs
About Us: Artera is an innovative AI startup dedicated to creating advanced medical artificial intelligence tests aimed at personalizing therapy for cancer patients. Our mission is to revolutionize medical decision-making, making it more tailored for patients and physicians worldwide.
In the role of Machine Learning Engineer at Artera, you will be an integral…
Join Afresh, the forefront AI innovator in the fresh food sector, collaborating with major retailers like Albertsons, Wakefern, Meijer, and Stater Bros to streamline the ordering of billions of dollars worth of fresh food across over 12,000 grocery departments nationwide.
After achieving an impressive 70% growth in 2025, we have expanded our platform to encompass all fresh departments, introduced our full store suite, and launched DC Fresh Buying.
We are dedicated to eliminating food waste and ensuring fresh food is accessible to everyone. In 2025 alone, our software contributed to the saving of 200 million pounds of food waste. If you are seeking a role where your contributions lead to significant impact and social good, this is the ideal opportunity to join our team that is shaping the future of fresh food.
The ML Platform Engineering team at Afresh is tasked with creating and sustaining the foundational infrastructure and tools that drive our machine learning and applied science solutions. We develop the shared components and services that empower our teams to design, deploy, and scale robust ML models. This encompasses a high-performance data API, customizable featurization, dependable forecasting systems, advanced optimization engines, and scalable training pipelines, along with deep experimentation capabilities. As our product offerings and customer base grow, so does the complexity and scale of our platform's requirements, adeptly managing predictions and simulations across various timeframes (hours, days, weeks), intricate data hierarchies (pallets on trucks, shelves of produce in stores, pieces of fruit in bowls), and limitless configuration options (average shelf fullness, backroom loads, truck capacities).
About the Role
As a Machine Learning Platform Engineer on the ML Platform Engineering team, you play a pivotal role in enhancing our core ML platform's performance, reliability, and scalability. You will focus on the essential infrastructure that directly facilitates the innovation and impact of Afresh's Machine Learning and Applied Science teams. Your efforts will empower our product suite, including our flagship Prediction Engine, which informs replenishment decisions for over 15% of all produce sold in the United States.
What You Will Accomplish
In your first 3 months, you might deliver a feature that generalizes model configuration, enables no-code model deployments for our various ML solutions, or significantly enhances integration testing across our ML systems.
By the end of your first 6 months, you will have taken ownership of implementing major scalability improvements and additions to our core systems.
Join Strava as a Platform Engineer specializing in Generative AI and Machine Learning. In this pivotal role, you will drive the development of innovative platforms that enhance user experiences and push the boundaries of technology. Collaborate with a dynamic team of engineers and data scientists to create scalable solutions that utilize advanced AI techniques. Your work will directly influence the future of our products and services, making a significant impact on athletes and fitness enthusiasts worldwide.
Schrödinger, Inc.
Join Schrödinger as a Machine Learning Platform Engineer, where you will play a crucial role in developing innovative machine learning solutions that drive advancements in scientific research and technology. You will collaborate with a talented team of scientists and engineers, leveraging cutting-edge tools and methodologies to enhance our computational platforms.
Your expertise in machine learning algorithms and software engineering will be vital in creating scalable and efficient systems that facilitate data-driven decision-making. If you are passionate about harnessing the power of AI in scientific endeavors, we invite you to apply!
Ambience Healthcare
About Us:
At Ambience Healthcare, we are not just another scribe; we are pioneering an AI intelligence platform that reinvigorates the human touch in healthcare while delivering significant ROI for health systems nationwide.
Our innovative technology enables healthcare providers to concentrate on delivering exceptional care by alleviating the administrative burdens that detract from patient interactions and their most impactful work. Ambience provides real-time, coding-aware documentation and clinical workflow support in ambulatory, emergency, and inpatient settings across leading health systems in North America.
Our team is driven by a relentless pursuit of excellence and extreme ownership, dedicated to crafting the best solutions for our health system partners. We champion transparency, positivity, and thoughtful engagement, holding each other accountable because we understand the significance of the challenges we tackle.
Ambience has earned accolades such as being ranked #1 for Improving the Clinician Experience in the KLAS Research Emerging Solutions Top 20 Report, being recognized by Fast Company as one of the Next Big Things in Tech, and being named one of the best AI companies in healthcare by Inc. We were also selected as a LinkedIn Top Startup in 2024 and 2025. Our esteemed investors include Oak HC/FT, Andreessen Horowitz (a16z), OpenAI Startup Fund, and Kleiner Perkins, and our journey is just beginning.
The Role:
As a Staff Machine Learning Engineer, you will play a crucial role in advancing clinical AI that impacts millions of patient encounters across the largest health systems in the nation. Your contributions will directly influence the speed at which we enhance our AI capabilities through the platform you will oversee.
You will design and implement evaluation and release processes that empower teams to deliver with confidence, create observability tools to identify quality issues proactively, and develop debugging tools that facilitate rapid issue reproduction. Additionally, you’ll work on the chart context retrieval layer that transforms patient history into model-ready inputs.
Our goal is to enable teams to iterate on quality within days, not weeks, ensuring that every enhancement you implement adds value across all product teams each quarter.
Please note that our engineering roles operate in a hybrid model from our San Francisco office (3 days per week).
What You’ll Own:
Evaluation & Release Infrastructure: Developing automated grading systems and release gates that function seamlessly across product teams, creating a unified evaluation dataset with version control to replace fragmented workflows. Implementing production-quality monitoring that includes end-to-end tracing, shared metrics, and automated alerts.
Debugging Tools: Building encounter replay features that reconstruct precise inference inputs (including retrieved chart context, packed prompts, and model versions) to allow teams to troubleshoot issues without sifting through logs. Creating differential views to compare known good states with regressions.
Join our innovative team at artafinance as a Machine Learning Engineer to work on our cutting-edge AI Agent Platform. In this role, you will be instrumental in developing and deploying machine learning models that enhance the capabilities of our AI agents, transforming the financial landscape.
You will collaborate closely with data scientists, software engineers, and product managers to identify opportunities for leveraging AI to optimize our platform's performance. Your expertise will help us deliver intelligent solutions that meet the diverse needs of our clients.
FieldAI is at the forefront of reimagining how robots engage with the world around them. Our Irvine team specializes in developing cutting-edge, risk-aware AI systems tailored for real-world applications. Nestled in the vibrant robotics hub of Southern California, we tackle complex challenges in robotics, unleashing the full power of embodied intelligence. If you're eager to see your innovations take flight in real hardware and enhance their capabilities through practical deployments, then our Irvine location is the ideal environment for you. We prioritize a comprehensive engineering approach that transcends traditional data-driven methods, integrating proven learning systems to yield globally deployed solutions that continuously evolve with each interaction in the field.
About the Job
As a part of our Insight Team, you will lead the development of the Field-insight Foundation Model (FiFM), which empowers a global fleet of autonomous robots to gather extensive multimodal data from diverse environments daily. Your role as a Senior Machine Learning Platform Engineer will encompass ownership of the infrastructure that supports FiFM, overseeing everything from model hosting and distributed training pipelines to data systems and observability. This position uniquely blends systems engineering with machine learning, where you will design and manage large-scale ML platforms to facilitate our mission of deriving actionable insights from complex data.
Instacart is looking for a Senior Software Engineer II to strengthen its Machine Learning and AI Platform. This remote position supports Instacart’s products and services throughout the United States by building and refining the systems behind them.
Key responsibilities
Collaborate with engineers and data scientists to design and launch new features for the ML/AI platform
Influence technical decisions that guide the platform’s growth and functionality
Develop scalable solutions that improve efficiency and enhance user experience
Team culture
Join a team that values both creativity and technical depth. Open communication and new ideas are encouraged as the platform continues to evolve.
Be a Part of the Revolution in E-Commerce with Whatnot!
Whatnot stands as the leading live shopping platform across North America and Europe, where you can buy, sell, and explore the items you cherish. We are transforming the landscape of e-commerce by merging community engagement, shopping, and entertainment into a unique experience tailored just for you. As a remote-first team, we are driven by innovation and firmly rooted in our core values. With operational hubs in the US, UK, Germany, Ireland, and Poland, we are collaboratively crafting the future of online marketplaces.
From fashion and beauty to electronics and collectibles like trading cards, comic books, and live plants, our live auctions cater to a diverse audience.
And this is just the beginning! As one of the fastest-growing marketplaces, we are on the lookout for innovative, forward-thinking problem solvers in all areas of our business. Stay updated with the latest from Whatnot through our news and engineering blogs, and join us in empowering individuals to transform their passions into successful ventures while fostering community through commerce.
The Role
We are seeking passionate builders: intellectually curious, entrepreneurial engineers who are ready to pioneer the future of AI and ML at Whatnot. You will be responsible for designing and scaling the foundational infrastructure that supports machine learning and self-hosted large language model applications throughout the organization. Collaborating closely with machine learning scientists, you will facilitate the deployment of cutting-edge models into production, creating entirely new product experiences. Your work will involve constructing systems that ensure advanced machine learning is reliable and efficient at scale, from low-latency model serving to distributed training and high-throughput GPU inference.
Your Responsibilities:
Lead the infrastructure that powers AI and ML models across vital business domains, enhancing growth, trust and safety, fraud detection, seller tools, and more.
Prototype, deploy, and operationalize innovative ML architectures that significantly influence user experience and marketplace dynamics.
Design and scale inference infrastructure capable of managing large models with minimal latency and maximal throughput.
Construct distributed training and inference pipelines utilizing GPUs, as well as model and data parallelism.
Push the boundaries of your expertise and explore new technologies and methodologies.
Saris AI, based in San Francisco with teams in Montreal and Toronto, develops advanced agentic AI systems for the banking industry. The company focuses on automating complex workflows that require long-context reasoning, integration with legacy systems, and strict compliance. With live AI agents already supporting real customer operations, Saris AI is expanding quickly and seeking technical leaders who want to shape the future of work in banking.
Role overview
This is a hands-on leadership position within the core engineering team in San Francisco. The Machine Learning Engineering Lead will guide machine learning systems from initial concept through scaling, helping define both the technical vision and the supporting infrastructure.
What you will do
Oversee the ML/AI function end to end, setting technical direction and standards across the company.
Design and supervise development of multi-modal, agentic AI systems that power live customer workflows.
Build and manage evaluation frameworks, datasets, and metrics to improve agent performance.
Drive productionization of ML systems with an emphasis on reliability, scalability, and compliance.
Recruit, develop, and mentor a high-performing ML team, fostering strong practices in modeling, experimentation, and deployment.
Requirements
8+ years of experience in machine learning or AI engineering, including time as a technical lead or manager.
Proven track record leading ML projects from concept to production deployment.
Expertise with large language models (LLMs) and/or agentic systems, especially in customer-facing products.
Strong grasp of ML fundamentals: deep learning, transformers, model evaluation, and trade-offs.
Hands-on experience scaling ML systems in production, with a focus on monitoring, iteration, and reliability.
Ability to lead engineering teams, influence architecture, and set technical direction.
Comfort working in early-stage, ambiguous, and rapidly changing environments.
About the Role
Moveworks is seeking experienced Senior and Staff Engineers to enhance our Generative AI Search Platform. Join our dedicated team as we innovate on search-based question answering systems by integrating traditional machine learning methods with the latest advancements in generative AI technologies. Our goal is to construct conversational search systems that deliver immediate responses to enterprise user inquiries.
As part of the Search Platform team, you will lead efforts to refine our search capabilities, collaborating closely with ranking, product, design, infrastructure, and data science teams. Together, we aim to elevate our agentic search products to new heights of performance and enterprise readiness.
At Moveworks, we value a fast-paced, problem-solving culture that prioritizes providing exceptional value to our customers. Your role will involve working with diverse teams to identify, define, and implement elegant solutions that meet user needs.
Design robust abstractions to empower ranking and product teams to effectively contribute to the codebase.
Develop algorithmic frameworks for various conversational search applications, including multi-turn, multi-hop question answering systems.
Lead the evolution of our search platform by integrating new LLM-enabled generative AI features, ensuring scalability, quality, security, and privacy.
Create systematic evaluation metrics and methodologies for our search models, utilizing LLMs to solve unique enterprise challenges.
As a technical leader, foster a high-performing, inclusive team environment while mentoring engineers at various levels.
Contribute to our hiring processes and help grow our engineering team.
Toloka AI
Toloka AI is hiring a Freelance Machine Learning Engineer for a remote contract role based in Wisconsin, United States. This position centers on building and improving machine learning models that directly support product development and help shape the user experience.
Responsibilities
Create and fine-tune machine learning models for practical, real-world use
Use data science techniques to enhance product features
Work with other team members to solve technical challenges
Requirements
Solid background in machine learning and data science
Proven ability to tackle complex problems using technical approaches
Comfortable working independently as well as collaborating with a team
Remote Work
This contract role is fully remote but requires residence in Wisconsin, United States.
Roblox is a dynamic platform where millions of users venture daily to explore, create, engage, and connect with friends in immersive 3D digital environments crafted by our vibrant community of developers and creators.
At Roblox, we are dedicated to developing tools and platforms that empower our community to transform their imaginative ideas into reality. Our vision is to redefine global connectivity, enabling people to come together from anywhere in the world, accessible on any device. We are on a mission to unite a billion people with positivity and respect, and we are seeking exceptional talent to help us achieve this goal.
A career at Roblox means you will play a pivotal role in shaping the future of human interaction, tackling unique technical challenges on a grand scale, and contributing to the creation of safer, more respectful shared experiences for all.
The Foundation AI Group aims to position Roblox as the benchmark for 3D foundational models (3DFMs), making it easier for anyone to craft high-quality, immersive 3D experiences using AI. The AI Platform team is integral to this vision, managing hundreds of machine learning use cases and handling billions of inferences daily across areas like Discovery, Safety, and Engine. We are searching for exceptional PhD graduates to spearhead innovation in three vital areas: AI Platform, Distributed Inference Systems, and Generative AI Information Retrieval.
Your Role
As a Senior Machine Learning Engineer on the AI Platform team, you will be a vital contributor to the development of innovative systems that drive AI at Roblox. You will concentrate on one of three significant tracks:
Track 1: AI Platform Projects
Lead the development of next-generation AI tools that improve the efficiency, cost, and usability of ML@Roblox.
Build and sustain essential platform components: Serving Layer, Model Registry, Pipeline Orchestrator, and Training/Inference control planes.
Craft outstanding developer experiences (templates, tools, visualizations) to reduce time-to-production and ensure foundational AI systems are scalable and reliable.
Track 2: Distributed Inference & Systems Optimization
Design and implement scalable distributed inference systems for efficiently serving large language models (LLMs) and Large Recommender Models at unprecedented scale.
Perform in-depth, low-level performance analysis to optimize systems.
About Zaimler
At Zaimler, we believe that AI agents need a profound understanding of data to operate effectively. In today’s fragmented enterprise data landscape, where information is scattered across various systems without a unified context, traditional AI solutions often fall short. We are pioneering the transition from basic copilots to fully autonomous agents, creating a revolutionary infrastructure layer that makes this transition possible.
Our platform serves as the foundational context infrastructure for the agentic era. We automatically uncover domain knowledge, establish meaningful relationships, and endow AI agents with the semantic comprehension necessary for precise operations at scale. Imagine knowledge graphs that not only facilitate real-time inference but are also designed for systems that require reasoning capabilities beyond mere retrieval.
Founded by industry veterans Biswajit Das, former VP of Engineering at Truera and Chief Architect at Visa, and Sofus Macskassy, ex-Director of Engineering at LinkedIn and creator of one of the largest production knowledge graphs, Zaimler is a close-knit, senior team currently in the seed stage. We are collaborating with major enterprises across sectors such as insurance, travel, and technology. If you are passionate about shaping the infrastructure that will power the next decade of AI innovation, we would love to connect with you.
About the Role
As part of our Machine Learning team, you will focus on transforming raw enterprise data into structured, contextualized knowledge graphs and embeddings. Your responsibilities will include developing innovative and scalable algorithms for machine learning and data engineering aimed at enhancing system efficiency. You will also explore new methodologies to compress large models into more efficient versions, improve retrieval and reasoning performance through feedback mechanisms, and prototype techniques that enable large language models (LLMs) to extract and utilize real-world knowledge efficiently.
We seek a candidate who thrives in a fast-paced environment, values meticulous work, and is eager to learn from some of the top engineers and researchers in the field.
Join Avride as a Machine Learning Platform Engineer, where you will be at the forefront of developing cutting-edge machine learning solutions. In this pivotal role, you will collaborate with data scientists and software engineers to build robust ML infrastructure, driving innovation and efficiency.
Join Hinge – The Dating App Designed to Be Deleted
In an era where forging genuine connections is increasingly challenging, Hinge is committed to fostering meaningful relationships and reducing loneliness. Our obsession lies in understanding user behavior to facilitate love, with our success measured by a straightforward yet vital metric – helping users set up great dates. With millions of users worldwide, we've established ourselves as the most reliable platform for finding relationships.
About the Opportunity
As a key member of Hinge’s AI Platform Core team, you will play a pivotal role in shaping and enhancing our evolving Machine Learning platform. This position focuses on accelerating the development, deployment, and operation of AI-driven features within Hinge. If you are passionate about tackling challenges such as optimizing our training and serving frameworks for real-time predictions, implementing automatic model performance monitoring systems, and architecting generative AI solutions, we invite you to explore this opportunity!
Our mission is to democratize AI at Hinge, ensuring it remains accessible, robust, scalable, cost-effective, and trustworthy. You will collaborate closely with product and platform teams and engage with external stakeholders, including data scientists and backend engineers. By understanding the needs of our internal customers, you will deliver incremental value aligned with their challenges. In a small, impactful team, you’ll have a broad scope of responsibilities and the chance to significantly influence the future of Machine Learning at Hinge.
Join us in creating the backbone of data infrastructure for real-world robotic operations.
As robotics transitions from research labs to real-world applications across factories, warehouses, vehicles, and field deployments, understanding the intricacies of robotic performance becomes critical. When robots encounter failures or unexpected behaviors, data analysis is key to deciphering the underlying issues.
At Foxglove, we are at the forefront of building tools for observability, visualization, and data infrastructure that empower robotics and autonomous systems teams to manage, analyze, and derive insights from vast amounts of multimodal sensor data collected from operational systems and production fleets.
Role Overview
We are seeking a passionate ML Platform Engineer with robust infrastructure expertise to design, deploy, and scale our data platform systems. This platform-centric role will allow you to take charge of the infrastructure layer that facilitates machine learning in production environments, going beyond just the models themselves.
Your responsibilities will encompass ensuring the reliability, scalability, and performance of the ML platform, including areas such as inference serving, pipeline orchestration, training infrastructure, and evaluation frameworks. You will be tackling substantial challenges such as managing petabyte-scale multimodal robotics data and optimizing high-throughput retrieval and embedding pipelines in a hands-on infrastructure capacity.
Key Responsibilities
Design and operationalize production inference infrastructure, focusing on model serving, autoscaling, load balancing, and cost efficiency across cloud environments.
Own the platform architecture for embedding and retrieval pipelines that enable semantic search across multimodal robotics data (image, video, point cloud, and time series).
Develop and sustain the training and evaluation infrastructure that supports rapid model performance iteration, including job orchestration, experiment tracking, and dataset versioning.
Lead decisions on cloud infrastructure (AWS/GCP) that affect latency, throughput, reliability, and scalability.
Establish platform abstractions and internal tools that empower product engineers to deliver ML-enhanced features without managing infrastructure directly.
Assess, integrate, and operationalize third-party ML infrastructure components while establishing clear build vs. buy frameworks for the team.
tvScientific powered by Pinterest
tvScientific, powered by Pinterest, develops a connected TV (CTV) advertising platform designed for performance marketers. The platform combines media buying, optimization, measurement, and attribution to automate and improve TV advertising. Built by professionals in programmatic advertising, digital media, and ad verification, tvScientific aims to deliver measurable results for advertisers.
Role overview
As a Machine Learning Platform Engineer, you will join a team that operates where Site Reliability Engineering meets low-latency distributed systems. This team advances Pinterest’s real-time machine learning and measurement infrastructure, focusing on sub-millisecond decision-making and high-throughput data access. Seamless integration with Pinterest’s core stack is central to the work.
What you will do
Design and build systems to keep queries and RPCs fast and reliable, even during periods of heavy demand.
Develop and enhance the foundation of the machine learning training and serving stack.
Address challenges in storage, indexing, streaming, fan-out, and managing backpressure and failures across services and regions.
Collaborate with software engineering, data infrastructure, and SRE teams to ensure systems are observable, debuggable, and ready for production.
Key areas of focus
I/O scheduling and batching
Lock-free or low-contention data structures
Connection pooling and query planning
Kernel and network tuning
On-disk layout and indexing strategies
Circuit-breaking and autoscaling
Incident response and failure management
NixOS
Defining and maintaining SLIs and SLOs
This position is a strong fit for engineers interested in building and operating large-scale infrastructure, particularly those who enjoy working on real-time systems, observability, and reliability.
About Faire
Faire is a transformative online wholesale marketplace, driven by the conviction that local businesses are the future. Independent retailers around the globe generate more revenue than massive corporations like Walmart and Amazon combined, yet individually, they remain small. At Faire, we harness technology, data, and machine learning to connect this vibrant community of entrepreneurs. Think of your favorite local boutique: we empower them to discover and sell the best products from around the world. With our innovative tools and insights, we aim to level the playing field, enabling small businesses to thrive against larger competitors.
By championing the growth of independent businesses, Faire positively impacts local economies on a global scale. We’re in search of intelligent, resourceful, and passionate individuals to join us in fueling the shop local movement. If you value community, we invite you to be part of ours.
About this Role
As the Senior Staff Machine Learning Platform Engineer, you will spearhead the technical vision and evolution of Faire's ML platform. You will establish standards, influence organization-wide architecture, and lead intricate, cross-functional initiatives that enhance data science velocity at scale. This position is crucial for adapting ML workflows to leverage modern AI productivity tools. You will not only develop models but also design the systems that enable those models to empower tens of thousands of small retailers in competing and growing their local businesses.
Shipping & Handling Responsibilities
Develop and execute a comprehensive technical strategy, guiding the multi-quarter roadmap for ML platform capabilities that align with Shippo’s strategic business objectives.
Lead cross-team architectural decisions, reviews, and requests for comments (RFCs) pertaining to the ML lifecycle and inference processes.
Enhance engineering standards through mentorship, establishing production readiness guidelines, and creating reusable platform components.
Take responsibility for platform adoption, ensuring reliability and optimizing cost-performance outcomes.
Design, build, and maintain essential components of the ML platform, including:
Foundational elements of ML lifecycle (such as experiment tracking, artifact management, model registry, and versioning) utilizing MLflow or similar tools.
Facilitating training and experimentation through standardized environments, reusable templates, and efficient workflows that empower data scientists to transition from exploration to production seamlessly.
