Qualifications
To excel in this role, candidates should possess the following qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Experience with machine learning frameworks and tools.
- Proficiency in programming languages such as Python, Java, or C++.
- Familiarity with cloud platforms and MLOps practices.
- Excellent problem-solving skills and a collaborative mindset.
- Strong communication skills, both verbal and written.
About the job
Field AI is revolutionizing the interaction between robots and the physical world. Our mission is to create AI systems that are risk-aware, dependable, and ready for real-world applications, tackling the intricate challenges of robotics to unlock the true potential of embodied intelligence. By moving beyond standard data-driven methods and purely transformer-based models, we are pioneering innovative solutions that are already deployed globally, yielding tangible results and continually refining our models through actual field applications.
We are on the lookout for a talented and driven MLOps Engineer to bolster our engineering team. In this pivotal role, you will be responsible for designing and maintaining the infrastructure and tools essential for the comprehensive lifecycle of machine learning systems utilized in robotics. Collaborating closely with machine learning engineers, robotics experts, and infrastructure teams, you will ensure the reliable training, evaluation, deployment, and monitoring of ML models. This position presents a thrilling opportunity to operationalize machine learning within dynamic and rapidly evolving robotic systems.
About Field AI
At Field AI, we are at the forefront of transforming robotic capabilities through advanced artificial intelligence. Our innovative approach not only enhances machine learning applications but also ensures that our technologies are ready for real-world deployment, setting us apart from conventional methodologies.
Join General Assembly as a Lead Instructor specializing in MLOps and AI Platform Engineering. In this role, you will lead engaging online classes, guiding aspiring data professionals through the intricacies of machine learning operations and AI engineering. You will leverage your expertise to create a dynamic learning environment, fostering both theoretical …
At dv01, we are unveiling the intricacies of the world's largest financial market: structured finance. Our platform serves as the backbone for vital activities that promote financial independence, ranging from consolidating credit card debt to refinancing student loans, purchasing homes, and launching small businesses. dv01's cutting-edge data analytics platform offers unparalleled insights into investment performance and risk for lenders and Wall Street investors involved with structured products.

As a data-centric organization, we meticulously manage essential loan data and develop state-of-the-art analytical tools that empower strategic decision-making for responsible lending. In essence, we are committed to preventing a recurrence of the 2008 global financial crisis by providing the data and tools necessary for informed, data-driven decisions that contribute to a safer financial environment for everyone.

With over 400 of the largest financial institutions relying on dv01, we cover more than 100 million loans across various sectors, including mortgages, personal loans, auto loans, buy-now-pay-later schemes, small business loans, and student loans. dv01 is continually broadening its market coverage, adding new loans monthly, and innovating new technologies for the structured products landscape.
General Assembly seeks an MLOps / AI Platform Engineer Subject Matter Expert to support a reskilling initiative for Customer Success and Account Management professionals. This part-time, remote role focuses on ensuring the technical accuracy and practical relevance of training materials designed to help professionals transition into MLOps and AI Platform Engineering. The engagement is flexible, up to 35 hours per week, and runs through July 2026.

Key responsibilities
- Review and validate technical content, governance frameworks, and monitoring exercises for the reskilling program.
- Confirm that asynchronous learning materials are accurate and practical for those moving into MLOps and AI Platform Engineering roles.
- Work with General Assembly's team to deliver training that guides learners in operationalizing AI systems from pilot to production.
- Engage with corporate leaders who depend on General Assembly's workforce transformation solutions.

Role details
- Location: U.S. Remote
- Start: ASAP
- Hours: Part-time, up to 35 hours per week (engagement ends by July 2026)
- Compensation: $60 - $80 per hour

This position centers on technical review and collaboration to ensure learning assets meet the needs of professionals entering the MLOps and AI Platform Engineering fields.
About Tribe AI:
At Tribe AI, we are dedicated to empowering enterprises to unlock the full potential of artificial intelligence. Many large organizations aspire to leverage AI for operational transformation, yet lack the necessary capabilities. This presents a unique opportunity for us.

As a pioneering AI-native services firm, we collaborate with enterprises to design and implement top-tier AI solutions that yield tangible business results. Our partnerships with industry leaders like OpenAI and Anthropic provide us with unparalleled insights into the latest models, strategies, and market approaches.

Role Overview:
As Lead AI Engineer within our Platform Team, you will play a pivotal role in enhancing our internal foundational codebase to expedite project delivery with superior quality. Your expertise will serve as a crucial link between our sales and project teams, ensuring the delivery of exceptional client outcomes while sustaining high profit margins.

Key Responsibilities:
Platform Development & Support
- Enhance and expand our foundational codebase, known as 'Foundry', to provide reusable components for project teams.
- Assist project teams in maximizing the use of our platform components, achieving 50-70% code reuse.
- Identify and integrate successful project implementations back into the platform for future utilization.
- Proactively resolve bugs, performance issues, and feature gaps within the platform.

Training & Enablement
- Create training materials and documentation to lower barriers to productivity with platform components.
- Design onboarding processes to facilitate rapid acclimatization for team members using the codebase.
- Offer hands-on support through pairing sessions and code reviews with project teams.

Cross-functional Collaboration
- Engage with sales teams to grasp upcoming project needs and provide insights on platform capabilities.
- Partner with project teams throughout the delivery lifecycle to assess effectiveness and identify areas for improvement.
About the Role:
At Auditdata, we recognize that over 250 million individuals globally suffer from hearing loss and often lack the essential care they need. Our cloud platform is designed specifically for hearing care professionals, empowering them to fit devices, manage patient care, and efficiently run their clinics.

We are excited to launch a new AI Platform team aimed at integrating advanced intelligence into our workflow. This includes features such as real-time transcription of clinical sessions, automated clinical documentation, and AI-driven processes that minimize administrative burdens, allowing clinicians to concentrate on providing top-notch patient care. This is a unique opportunity to work on greenfield projects that have a direct positive impact on patients.

We are seeking a proactive and technical leader to lay the groundwork for this initiative: launching production-ready AI services, defining the system architecture, and developing a small, dynamic team.

Key Responsibilities:
- Develop and manage AI services in production on Azure, ensuring they are reliable, observable, and cost-effective.
- Oversee the transcription pipeline that transforms clinical discussions into structured, actionable data.
- Implement LLM-powered features for documentation and data transformation, incorporating safety controls suitable for clinical environments.
- Establish a reusable "AI workflow layer" so product teams can integrate AI capabilities without duplicating orchestration, permissions, and traceability work.
- Define platform interfaces (APIs/SDKs) to ensure easy adoption across the engineering organization.
- Lead by example: mentor fellow engineers, evaluate design proposals, and elevate the overall quality of work (approximately 30% leadership and 70% hands-on work).

Success Metrics:
- The AI Platform is recognized as a trusted internal resource utilized by multiple product teams for various AI functionalities.
- The transcription, LLM document generation, and agent framework are efficient, reliable, and cost-effective in a production setting.
- The AI platform has a clear roadmap that aligns with product objectives and architectural goals, with transparent trade-offs and priorities.
- You are acknowledged as the primary technical authority within the organization.
Join General Assembly as a Lead Instructor in Agentic AI Engineering, where you will be at the forefront of educating the next generation of tech leaders. In this role, you will guide students through the complexities of AI engineering, empowering them with the skills needed to thrive in a rapidly evolving industry.

As a seasoned professional, you will leverage your expertise to create dynamic learning experiences, mentor aspiring engineers, and contribute to curriculum development. Your passion for education and technology will inspire students to innovate and excel in their careers.
Full-time|$150K/yr - $200K/yr|Remote|Remote, US; DMV; McLean, VA; Boston, MA; San Antonio, TX; Colorado Springs, CO; Tampa, FL; Honolulu, HI
Location: Remote (U.S.); DMV; McLean, VA; Boston, MA; San Antonio, TX; Colorado Springs, CO; Tampa, FL; Honolulu, HI
Eligibility: U.S. citizenship required. All work must be performed within the continental United States.

Raft delivers AI, machine learning, and data-driven solutions to U.S. military and government clients. The team specializes in autonomous data fusion, Agentic AI, distributed data systems, and building scalable platforms. Headquartered in McLean, VA, Raft partners with federal and public sector organizations to create digital solutions that reach millions of Americans, using design thinking and cloud-native technology.

Role overview
The Lead Principal MLOps Engineer plays a key part in supporting clients and collaborating with a team focused on complex, high-impact challenges. This position centers on building and maintaining mission-critical AI and data platforms for the Department of Defense.

What you will do
- Work on platforms that process large volumes of real-time data from diverse sensors and operational sources.
- Transform raw data into actionable intelligence, supporting operators with timely decisions through mission applications and operational dashboards.
- Support a large-scale platform that manages billions of events daily, using low-latency data pipelines and cloud-native infrastructure.
- Advance the development of an end-to-end machine learning platform for model development, evaluation, deployment, monitoring, and lifecycle management across both cloud and constrained operational environments.
SewerAI Corporation creates AI-driven tools for managing underground infrastructure. Their platform helps contractors, engineering firms, and utilities transform sewer inspection data into practical insights, reducing the need for manual video review. After doubling its customer base over the past year, SewerAI is growing rapidly.

Role overview
This remote MLOps Engineer (AI) position centers on building and maintaining the machine learning operations behind SewerAI's products. The role involves designing, improving, and scaling systems that support machine learning models for sewer and underground infrastructure assessment. The MLOps Engineer connects research and production, ensuring models transition smoothly from development to reliable deployment. The work involves managing training and inference pipelines, strengthening cloud infrastructure, and developing CI/CD processes to keep models secure and dependable for defect detection and maintenance.

Key responsibilities
- Cloud infrastructure: Audit, secure, and optimize AWS environments to support both training and production workloads, focusing on high availability and fault tolerance.
- Model deployment & inference: Build and maintain scalable systems for serving deep learning models with PyTorch and TensorFlow, tuned for low latency and high throughput on complex infrastructure data.
- CI/CD for machine learning: Develop automated pipelines for model testing, validation, deployment, and rollback.
- Training infrastructure: Create efficient compute environments for training computer vision and time-series models on large datasets.
- Monitoring & observability: Set up monitoring for model drift, data quality, and system health to quickly identify and address performance issues.
About Tribe AI:
At Tribe AI, our mission is to empower enterprises to unlock the potential of artificial intelligence within their operations. Many large organizations aspire to leverage AI for transformative operational efficiencies, yet often lack the necessary expertise. This gap presents a unique opportunity for us to assist them.

As an AI-native services provider, we collaborate with leading enterprises to create and implement cutting-edge AI products that drive significant business results. Our strategic partnerships with OpenAI and Anthropic give us unparalleled access to the latest models and go-to-market strategies in the industry.

About the Role:
We are looking for a seasoned Product Manager to lead our AI Platform team. In this key role, you will be responsible for ensuring that all projects utilize our platform's capabilities to significantly enhance both profitability and quality across the organization. You will spearhead our platform strategy, collaborating closely with project teams to optimize platform usage and iteratively enhance our AI functionalities.

Key Responsibilities:
- Connect to Sales Pipeline: Work in tandem with sales teams to grasp upcoming project requirements, prioritize platform features in line with market needs, and develop value propositions that facilitate deal closure.
- Co-pilot Support: Collaborate directly with active project teams through one-on-one sessions to provide platform expertise and ensure effective implementation.
- Feedback Collection & Impact Measurement: Monitor essential metrics to quantify the platform's value, gather user feedback, and continually enhance the platform experience.
- Standardize Features: Identify successful project-specific implementations that can be standardized and integrated into the core platform.
- ROI Analysis: Create and maintain detailed ROI models that illustrate the business advantages of platform integration across various projects.
- Ensure Platform Reliability: Lead efforts to enhance production reliability by managing issue resolution, coordinating with engineering teams, communicating with impacted clients, and incorporating reliability insights into the platform roadmap.

About You:
- 7+ years of product management experience, with a strong track record of successful technical product delivery.
- Proficient understanding of AI technologies and their applications in business contexts.
- Exceptional communication and collaboration skills, with a knack for translating complex concepts into actionable insights.
Part-time|$75/hr - $75/hr|Remote|Remote, United States
At CodePath, we're revolutionizing higher education to cultivate the next generation of AI-native engineers, CTOs, and tech founders. Our industry-aligned courses and dedicated career support are tailored for first-generation and low-income students. Here, students learn alongside seasoned engineers, intern at leading tech companies, and collaboratively ascend to become the tech industry's future leaders.

With a thriving community of over 40,000 students and alumni hailing from 1,000 colleges and now contributing to 4,050 companies, CodePath is actively reshaping the tech workforce and the future of various industries. Our sponsors include prominent names like Amazon, Andreessen Horowitz, Anthropic, Comcast, Google, JP Morgan Chase, Knight Foundation, Meta, New Profit, Salesforce, and The Studio at Blue Meridian Partners.

About the Role
- Location: Remote, United States
- Role Type: Seasonal Part-Time, W2 Employee (up to 10 hours/week)
- Duration: May 25, 2026 - August 7, 2026 (Training start and Summer Academic Term)
- Reports To: Program Manager
- Compensation: Starting at $75/hour

Many computer science students graduate without gaining real-world experience in AI projects or receiving mentorship from industry professionals. At CodePath, we aim to bridge this gap. As a Lead Instructor, you will play a pivotal role in providing students who may not have access to elite universities with the technical training they need.

Lead instructors serve as the "face" of CodePath's virtual classes. Your responsibilities will include developing engaging lessons based on CodePath's curriculum, tracking course performance indicators, integrating student feedback, and collaborating with co-instructors and teaching assistants throughout the program. Each session you lead will reach between 300 and 450 students, many of whom will express that this is the most relevant coursework they have experienced in college.
Hayden AI builds mobile perception systems that help transit agencies and city governments address real-world challenges. The team focuses on computer vision to improve bus lane and bus stop enforcement, modernize transportation technology, and support safer, more sustainable streets.

The MLOps Engineer will join the Perception Deep Learning team in San Francisco, working in a hybrid model (at least three days per week in the office). This role centers on building and advancing Hayden AI's machine learning platform, collaborating with perception, deep learning, and platform engineers to create infrastructure for training and deploying ML models. The position involves shaping the architecture for data ingestion, training pipelines, deployment, monitoring, and governance.

What you will do
- Design, implement, and maintain cloud-based workflows for deploying and managing AI models.
- Work with cross-functional teams to identify workflow bottlenecks and deliver solutions that improve efficiency.
- Deploy new features and updates quickly while maintaining quality and reliability, and apply cost-saving strategies to optimize infrastructure spending.
- Stay up to date with new MLOps tools and technologies, integrating them to improve ML workflows.
- Participate in the team's software development process, including design and code reviews, brainstorming, and maintaining accurate documentation.

Requirements
- Bachelor's degree and 3-4 years of experience in a related field.
Correlation One is at the forefront of developing workforce skills for the AI-driven economy. We collaborate with enterprises and government entities to cultivate talent and bridge critical gaps in data, digital, and technological skills. Our global initiatives include training programs and data competitions that also empower underrepresented communities, facilitating career acceleration. Our mission is to ensure equitable access to the data-centric jobs of tomorrow. We work alongside leading employers and government institutions such as Amazon, Coca-Cola, Johnson & Johnson, the U.S. State Department, and the U.S. Department of Defense to turn this vision into reality. All of our skill development programs are completely free for participants and are conducted online by industry professionals, thereby reducing traditional barriers to career advancement. We are dedicated to fostering supportive, human-centered group learning environments that enhance technical skills and build confidence among learners. Join us in shaping the AI Economy!
About the Role
As a key member of the Galileo team, you will significantly contribute to the design, development, and expansion of our innovative products. We are in search of a talented Senior Software Engineer with a strong interest in tackling intricate challenges at the interface of Data and Machine Learning, and a deep passion for enhancing Observability and Reliability in Generative AI.

Your Responsibilities
- Technical Design and Architecture: Lead the effort in establishing scalable and dependable architectures while securing stakeholder alignment.
- Planning and Execution: Collaborate with your team to outline and implement the project roadmap.
- Peer Reviews: Ensure engineering excellence by conducting thorough reviews of your colleagues' pull requests.
- Team Collaboration: Work closely with Product Managers, designers, and technical leads to build a cohesive strategy and maximize collaborative efforts.
- Continuous Improvement: Engage in design reviews, on-call duties, support tasks, and contribute to tech discussions and learning sessions. Assist in the interview process for prospective engineering candidates.
Plasmidsaurus helps scientists worldwide by streamlining sequencing. Researchers from leading institutions and companies rely on this platform daily. With a global network of labs, the company delivers fast, affordable sequencing results, and has recently expanded into RNA-seq to broaden its genomics reach. The team is focused on building a universal sequencing platform designed for efficiency and global scale.

Role overview
The Lead Engineer for AI Infrastructure in Platform Engineering sets both technical direction and management strategy for the company's compute, data, AI, and security infrastructure. This position oversees the entire sequencing operation, from laboratory devices to data delivery.

What you will do
- Oversee core services that coordinate laboratory devices, including robots, sequencers, and on-premises Linux servers, as well as the data ingestion pipeline.
- Develop cloud infrastructure and data pipelines for storing, processing, and delivering terabytes of sequencing data.
- Design systems to manage millions of bioinformatics tasks, handling queue management, workflow orchestration, and scheduling.
- Build AI infrastructure and internal tools to support autonomous systems, including:
  - Quality Scientist Agents: Monitor operations, detect anomalies, and escalate quality or reliability concerns.
  - Logistics Agents: Coordinate global transportation of samples to labs and carriers.
  - Bioinformatics Coding Agents: Run adaptive analyses on varied sample types with different data distributions.

Culture
The team values initiative and a strong sense of ownership. High agency and responsibility shape how work gets done at Plasmidsaurus.
About Stack AI
Stack AI is an innovative no-code platform that empowers users to design, test, and deploy AI workflows using large language models. Our intuitive drag-and-drop interface enables teams to seamlessly connect data sources with AI models, facilitating the development of production-grade applications such as chatbots, document processing pipelines, and database Q&A tools, all without the need for coding.

As we transition from thousands to millions of users, we require a dedicated technical leader to enhance our backend systems. This is where you come into play.

The Role
We are in search of a Staff Backend Engineer to take ownership of our platform's core infrastructure. In this pivotal technical leadership role, you will influence our architectural framework, define engineering standards, and make key infrastructure decisions that will shape the scalability of Stack AI. You will collaborate directly with our founders and product leaders, navigating the balance between speed and sustainability within our fast-paced startup environment, where your contributions will have a significant impact on the entire product.

What You'll Do
- Design and scale core systems: Architect, develop, and manage the backend services that drive our no-code AI platform, encompassing everything from the API layer to the data pipeline.
- Lead identity and authentication infrastructure: Oversee our SSO, SAML, and OIDC integrations, ensuring our authentication framework meets enterprise-level standards and scales with our growing customer base.
- Establish technical direction: Set forth design patterns, service boundaries, data modeling strategies, and long-term infrastructure plans.
- Mentor and uplift the team: Enhance team performance through design reviews, collaborative coding sessions, knowledge sharing, and the establishment of best practices throughout backend engineering.
- Enhance reliability and observability: Proactively identify performance bottlenecks, optimize efficiency, and advance our infrastructure to prevent issues before they escalate.

What We're Looking For
Required Qualifications
- 6+ years of backend engineering experience, demonstrating a history of operating at the Staff level or serving as a technical lead on complex, production systems.
- Extensive knowledge of SSO, SAML, and OIDC, with hands-on experience in building or managing identity and authentication systems.
- Comprehensive understanding of identity provider (IdP) architecture and enterprise authentication methodologies.
About Us
TetraScience is a pioneering leader in Scientific Data and AI, driving the Scientific AI revolution by creating and industrializing AI-native scientific datasets. Our innovative lab data management solutions are transforming scientific use cases and yielding AI-driven outcomes.

As a prominent figure in this emerging market, TetraScience has surpassed all competitors in revenue generation. In the past year, leading companies in computing, cloud, data, and AI infrastructure have recognized TetraScience as the gold standard, engaging in co-innovation and strategic partnerships. For more details, visit our Newsroom.

As part of your application process, we invite you to explore the Tetra Way letter, penned by our co-founder and CEO Patrick Grady. This document is essential for understanding our core values and ethos, and we encourage you to reflect on its content to determine your alignment with our company culture. If you join our team, embodying its principles will be expected on a daily basis.

Position Overview
We are on the lookout for a Lead Software Engineer to advance our scientific search platform, moving beyond conventional keyword search to unveil cutting-edge capabilities in chemical search, semantic search, and natural language processing. You will engage in a dynamic intersection of AI/ML, cheminformatics, knowledge representation, and distributed systems, empowering scientists to access and interpret complex experimental datasets, chemical entities, assay results, and unstructured laboratory documents.

As the technical leader of the Search Platform team, you will architect, design, and implement innovative search functionalities while managing the supporting infrastructure. You will set an exemplary standard for system development, mentor fellow engineers, and contribute to the strategic roadmap for search capabilities and platform enhancements. This role entails maintaining scalable and reliable services, constantly refining the platform as we broaden our search offerings. Collaboration with Applied AI Scientists, platform engineers, and product teams will be crucial in delivering high-performance search services that facilitate discovery, analysis, and decision-making throughout the bio-pharma R&D lifecycle.

If you are driven by the passion to develop scalable search systems, enhance scientific retrieval, and support production-scale AI workloads, we would be thrilled to connect with you.
Full-time|$150K/yr - $200K/yr|Remote|Remote, US; DMV; McLean, VA; Boston, MA; San Antonio, TX; Colorado Springs, CO; Tampa, FL; Honolulu, HI
Raft is seeking a Principal MLOps Engineer to help build and scale advanced AI and data platforms for the Department of Defense. This position plays a key role in transforming large volumes of real-time sensor and operational data into actionable intelligence, supporting critical decision-making for operators through mission applications and operational displays.

What you will do
- Advance mission-critical AI and data platforms that process billions of daily events with low-latency pipelines and cloud-native architecture.
- Support the development of an end-to-end machine learning platform for model development, evaluation, deployment, monitoring, and lifecycle management.
- Ensure the platform operates effectively across both cloud infrastructure and resource-constrained environments.

Location
This role is open to remote work within the United States as well as in the DMV area; McLean, VA; Boston, MA; San Antonio, TX; Colorado Springs, CO; Tampa, FL; and Honolulu, HI.

Eligibility
U.S. citizenship is required. All work must be performed within the continental United States.
Fortytwo is a cutting-edge decentralized AI protocol built on Monad, which utilizes underused consumer hardware for swarm inference. Our innovative approach allows Small Language Models to execute complex multi-step reasoning at a reduced cost, outperforming the capabilities and scalability of existing leading models.

Key Responsibilities:
- Deploy robust, scalable machine learning services with optimized infrastructure and automated Kubernetes clusters.
- Enhance GPU resource utilization through Multi-Instance GPU (MIG) and Node Offloading System (NOS).
- Oversee cloud storage solutions (e.g., S3) to guarantee availability and performance.
- Incorporate state-of-the-art ML methods, including Low-Rank Adaptation (LoRA) and model merging, into operational workflows:
  - Adapt leading ML codebases to align with organizational requirements.
  - Implement LoRA methodologies and model merging processes.
- Manage and deploy large language models (LLM), small language models (SLM), and large multimodal models (LMM).
- Utilize technologies such as Triton Inference Server for model serving.
- Leverage advanced serving frameworks like vLLM and Text Generation Inference (TGI).
- Optimize models using ONNX and TensorRT to ensure efficient deployment.
- Develop Retrieval-Augmented Generation (RAG) systems that integrate spreadsheet, mathematics, and compiler processing.
- Establish monitoring and logging systems with tools such as Grafana, Prometheus, Loki, Elasticsearch, and OpenSearch.
- Create and maintain CI/CD pipelines via GitHub Actions for streamlined deployment.
- Develop Helm templates for swift Kubernetes node deployment.
- Automate workflows using cron jobs and Airflow Directed Acyclic Graphs (DAGs).

Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related discipline.
- Expertise in Kubernetes, Helm, and containerization technologies.
- Experience in GPU optimization (MIG, NOS) and familiarity with cloud platforms (AWS, GCP, Azure).
- Proficient in monitoring tools (Grafana, Prometheus) and scripting languages (Python, Bash, etc.).
Location: Austin, Texas area / On-site preferred
Project: 7MW Phase I AI Datacenter -> 50MW Campus Expansion
Reports to: Founders / Executive Team

About the Project
We are in the process of constructing a cutting-edge, high-density AI datacenter campus situated just outside of Austin, Texas, commencing with an initial capacity of approximately 7MW of NVIDIA GB300 NVL72 infrastructure and anticipating a scale-up to over 50MW. Our primary aim is to focus on real-time inference, reasoning, and high-value AI serving workloads, effectively monetizing our infrastructure in active markets rather than merely leasing out space.

This role transcends traditional datacenter operations. We are seeking a visionary leader who will strategically transform our GPU racks into a profitable inference operation. As the head of this initiative, you will be responsible for defining and executing strategies that enhance revenue, uptime, and utilization through careful selection of models, orchestration stacks, pricing strategies, customer segments, and marketplace partnerships. The ideal candidate will appreciate that the essence of our business lies beyond mere computation; rather, it encompasses monetized tokens, latency-adjusted utilization, and gross margins.

The Role
We are in search of a senior operator-builder capable of bridging multiple domains:
- AI infrastructure
- Inference performance engineering
- Model serving and routing
- Marketplace monetization
- Customer and partner integration
- Revenue optimization

You will architect and manage the inference platform that dictates how our GB300 NVL72 racks are monetized in real time. This could involve direct enterprise workloads, marketplace distribution, API-based reselling, model hosting, fine-tuned/private deployments, and novel inference channels.

You should possess a keen understanding of profitable applications on modern inference hardware, and be prepared to answer critical questions such as:
- Which open-weight and commercially viable models should be prioritized on this hardware?
- How should workloads be balanced across premium low-latency serving, bulk throughput, reserved capacity, and experimental capacities?
- Should we leverage third-party marketplaces for routing?