Machine Learning Infrastructure Software Engineer jobs in San Mateo – Page 3 | RoboApply Jobs

Results 41–60 of 494 for “Machine Learning Infrastructure Software Engineer” in San Mateo.

Roblox Corporation logo
Full-time|On-site|San Mateo, CA, United States

Roblox Corporation seeks a Senior Software Engineer focused on Data Infrastructure and Safety in San Mateo, CA. This position plays a key part in maintaining the reliability and performance of the Roblox platform, with a strong focus on user protection and a secure environment.
Role overview: This engineer will design and build scalable data infrastructure to…

Apr 27, 2026
Verkada logo
Full-time|$120K/yr - $220K/yr|On-site|San Mateo, CA, United States

About Us: At Verkada, we are revolutionizing the way organizations safeguard their people and property through an integrated, AI-powered platform. As a frontrunner in cloud physical security, Verkada empowers more than 30,000 organizations globally, including over 100 Fortune 500 companies, to enhance their safety and operational efficiency via a unified software platform that offers solutions for video surveillance, access management, air quality monitoring, alarms, intercoms, and visitor management. Founded in 2016, Verkada has experienced rapid growth, boasting 15 offices and a dedicated team of over 2,200 employees.
The Role: Join our innovative cloud infrastructure team, where you will play a crucial role in designing, building, and maintaining highly scalable, reliable systems that power Verkada’s services. You will have the chance to work on exciting projects such as scaling microservice clusters, automating serverless deployments, adopting a full service mesh, and enhancing system observability. Take charge of a subdomain and lead collaborative efforts across teams. This position requires your presence at our headquarters located in San Mateo, CA, as we are dedicated to fostering a vibrant in-office culture.

Feb 9, 2026
Fireworks AI logo
Full-time|On-site|New York, NY; San Mateo, CA

About Us: At Fireworks AI, we are at the forefront of creating next-generation generative AI infrastructure. Our cutting-edge platform is recognized for delivering the highest-quality models with unparalleled speed and scalability in inference. Independently benchmarked as a leader in LLM inference speed, we drive significant advancements through innovative projects, including our proprietary function calling and multimodal models. As a Series C company valued at $4 billion and backed by leading investors such as Benchmark, Sequoia, Lightspeed, Index, and Evantic, we are a dynamic team of builders, comprised of veterans from Meta PyTorch and Google Vertex AI.
The Role: We are seeking a talented Software Engineer to join our AI Infrastructure team. In this pivotal role, you will contribute to designing and developing the foundational systems that power Fireworks AI’s generative AI platform. Your focus will be on building robust infrastructure and tools that guarantee the reliability, performance, quality, and availability of our AI systems. Our mission is to establish Fireworks AI as the most dependable and user-friendly generative AI platform globally. You will collaborate closely with our cloud infrastructure, product, and performance teams to create infrastructure solutions that connect our customers with the high-performance proprietary Fireworks inference engine.
Key Responsibilities: Design and develop scalable backend infrastructure supporting distributed training, inference, and data pipelines. Build and maintain essential backend services, including LLM CI/CD pipelines, control planes, and model serving systems. Enhance performance optimization, cost efficiency, and reliability across compute, storage, and networking layers. Create frameworks and safeguards to ensure Fireworks AI maintains the highest model quality in the industry. Work alongside performance, training, and product teams to translate research and product requirements into effective infrastructure solutions. Engage in code reviews, technical discussions, and continuous integration and deployment processes.

Mar 5, 2026
Zaimler logo
Full-time|On-site|San Mateo, CA

About Zaimler: In a world where AI agents struggle to reason over fragmented data, Zaimler emerges as the solution. Our mission is to unify disparate enterprise data across countless systems, providing a shared context, meaning, and structure. This transformation is essential as we transition from traditional copilots to fully autonomous agents, necessitating a new infrastructure layer that we are dedicated to building. At Zaimler, we are pioneering context infrastructure for the agentic era: a platform that autonomously discovers domain knowledge, maps intricate relationships, and equips AI agents with the semantic understanding required for precise and scalable operations. Envision knowledge graphs that facilitate real-time inference, tailored for systems that need to reason rather than merely retrieve data. Founded by industry veterans Biswajit Das (former VP Engineering at Truera and Chief Architect at Visa) and Sofus Macskassy (ex-Director of Engineering at LinkedIn), who notably built one of the largest knowledge graphs in production, Zaimler is a small, senior team at the seed stage, collaborating with major enterprises in sectors like insurance, travel, and technology. If you are passionate about creating the infrastructure that will support the next decade of AI advancements, we are eager to connect with you.
The Role: We are in search of a talented Data Infrastructure Engineer to establish the foundational distributed data layer that will power our semantic platform. In this role, you will be responsible for designing, building, and scaling systems that enable high-throughput data ingestion, transformation, and real-time processing.

Sep 3, 2025
Notable logo
Full-time|On-site|San Mateo, CA

Join Notable as a Staff Software Engineer specializing in Cloud Infrastructure and Applications. In this pivotal role, you will lead the design, development, and implementation of scalable cloud solutions that drive our innovative projects forward. Collaborate with cross-functional teams to optimize application performance and enhance user experience.

Mar 17, 2026
Genesis Molecular AI logo
Full-time|On-site|San Mateo, CA

Join Our Team as a Senior / Staff Fullstack Software Engineer
At Genesis Molecular AI, we are assembling a premier software engineering team dedicated to revolutionizing drug discovery utilizing machine learning, biophysical simulations, and computational chemistry. We seek engineers passionate about developing innovative medicines and contributing significantly to our software platform.
Your Responsibilities: Take ownership of our comprehensive full-stack platform utilized by computational chemists, enhancing functionalities ranging from data exploration to real-time visualizations, while delivering AI-driven workflows that expedite the hypothesis-to-insight cycle. Develop and enhance tools for visualizing molecular structures and proteins, analyzing machine learning models, and managing intricate chemical workflows. Collaborate closely with machine learning engineers and computational chemists to create and implement novel computational methods for predicting molecular properties. Expand our data infrastructure capabilities to accommodate billions of data points and thousands of parallel deep learning and molecular dynamics tasks.
Your Profile: A critical thinker who approaches challenges from fundamental principles, adeptly balancing detail-oriented tasks with overarching architectural considerations, possessing a strong investigative drive to uncover root causes. Proficient in managing challenges across the full technology stack, including frontend interfaces, backend services, data pipelines, and overall infrastructure. Demonstrated experience in delivering production-level code and leading projects from inception to completion with minimal oversight. Eager to operate at the nexus of software engineering, drug discovery, and machine learning.
What We Offer: An opportunity to work on impactful tools and products that are swiftly implemented to accelerate the discovery of new medications. A collaborative environment with skilled colleagues in AI, software, and chemistry, fostering a culture of intellectual curiosity and humility. The team engages in weekly discussions on 1-2 relevant machine learning or chemistry papers to stay abreast of advancements and inspire innovative ideas. Competitive compensation package, including salary and equity, along with comprehensive medical, dental, and vision insurance, and a 401(k) plan.

Mar 3, 2026
Verkada logo
Full-time|On-site|San Mateo, CA, United States

Verkada seeks a Senior Staff Software Engineer to join the Platform Infrastructure team in San Mateo, CA. This senior technical position focuses on shaping and improving the core infrastructure that underpins Verkada's products, with particular attention to performance and scalability.
Key Responsibilities: Lead the design, building, and ongoing support of platform infrastructure. Partner with engineering teams to deliver software solutions that enable the platform to grow and adapt. Spot opportunities to improve reliability and efficiency, and guide initiatives to address them.
Collaboration: Frequent collaboration with cross-functional teams is central to this role. The goal is to ensure that infrastructure evolves to meet the demands of both current and future products.

Apr 27, 2026
Generalist logo
Full-time|On-site|San Francisco Bay Area (San Mateo) or Boston (Somerville)

About the Role: Join our dynamic team, affectionately known as MBMB (More Big More Better), where you will play a crucial role in optimizing our training and on-robot inference stacks. We are seeking bold innovations that drive substantial improvements rather than incremental changes.
Your Responsibilities Will Include: Maximizing GPU performance through innovative strategies; deploying machine learning, hardware, and software modifications that yield significant advancements; enhancing both inference and training stacks for optimal performance.
Ideal Candidates Will: Possess proficiency in the latest machine learning techniques, particularly for training and inference optimizations within transformer and diffusion-based architectures; have a relentless pursuit of ML optimizations across various domains, including CUDA kernels, ML architecture, frontend and backend network bottlenecks, CPU inefficiencies, NVLink, and communication protocols, as well as optimizations in libraries such as Torch, NumPy, and Python.

Feb 12, 2026
Roblox logo
Full-time|$195.8K/yr - $242.1K/yr|On-site|San Mateo, CA, United States

Every day, millions of users engage with Roblox, immersing themselves in an expansive world of creativity, learning, and connection through 3D digital experiences crafted by our diverse community of developers and creators. At Roblox, we are dedicated to developing the tools and platform that enable our community to turn their imaginative visions into reality. Our mission is to revolutionize how individuals connect, irrespective of their location or device. We strive to unite a billion people with positivity and respect, and we are actively seeking exceptional talent to join us on this journey. A career at Roblox means you will be at the forefront of shaping future human interactions, tackling complex technical challenges at scale, and contributing to the creation of safer, more respectful shared experiences for all.
Natural Language Processing (NLP) plays a crucial role in facilitating extensive communication, creation, and safety across the Roblox platform. This position offers an exciting opportunity to develop and implement state-of-the-art NLP, speech, and generative AI models that function on an unprecedented scale, impacting hundreds of millions of daily users. You will address a remarkably diverse array of high-scale language challenges, from real-time moderation of voice and text to automating experience localization and empowering users with LLM-driven creation tools. Our approach merges cutting-edge research with large-scale engineering to connect experimentation and production, designing algorithms that will define the future of language services for our immersive, user-generated content platform.
Teams Hiring for This Role:
Safety AI Systems: Focused on creating comprehensive ML systems to uphold community standards and safety across our platform at massive scale. This includes:
Real-time Moderation: Developing top-tier NLP and speech models for real-time moderation of voice and text (handling over 6 billion messages daily) and innovative interventions that significantly enhance user civility.
Critical Harms & Advanced Detection: Crafting specialized LLM agents, behavioral analysis, and graph systems aimed at identifying and preventing rare, high-risk situations (e.g., child safety, terrorism), which necessitate adversarial thinking.

Feb 10, 2026
Verkada logo
Full-time|$200K/yr - $300K/yr|On-site|San Mateo, CA, United States

About Us: At Verkada, we are revolutionizing the way organizations safeguard their personnel and properties through an integrated, AI-driven platform. As a frontrunner in cloud-based physical security, we empower over 30,000 organizations worldwide, including more than 100 Fortune 500 companies, to enhance safety and operational efficiency via a single, connected software solution. Our offerings encompass video security, access control, air quality sensors, alarms, intercoms, and visitor management. Established in 2016, Verkada has experienced remarkable growth, now boasting 15 offices and a dedicated workforce of over 2,200 employees.
Role Overview: The Verkada Security Team, primarily composed of software engineers, is dedicated to establishing optimal software security practices. In this pivotal role, you will enhance Verkada’s security throughout the software development lifecycle (SDLC) by utilizing automation, libraries, tools, and frameworks. Your responsibilities will span various technology stacks and involve collaborating with engineering teams across Verkada’s Command platform.

Feb 9, 2026
Moonlake logo
Full-time|On-site|San Mateo

Welcome to Moonlake, where we harness AI to create immersive world simulations.
Our Mission: To enhance throughput, latency, and cost efficiency, deploying our models 2–10 times faster and more affordably, all while maintaining quality.
Key Responsibilities: Optimize GPU performance through CUDA/Triton kernels, FlashAttention, paged attention, and CUDA Graphs. Manage the serving stack, including TensorRT-LLM/Triton Inference Server, vLLM/TGI; implement continuous batching and on-GPU KV reuse; explore speculative decoding/medusa and mixture-of-agents routing. Enhance parallelism via FSDP/ZeRO, TP/PP/expert parallel, and fine-tune NCCL. Implement quantization/PEFT techniques such as AWQ/GPTQ/FP8 and LoRA/DoRA serving. Oversee systems like Ray/k8s/Argo, ensuring observability with Prom/Grafana/OpenTelemetry, autoscaling, A/B testing infrastructure, and canary deployments with rollback capabilities.
Ideal Candidate Profile: Candidates should have prior experience in infrastructure-heavy startups, particularly at companies like Databricks or Roblox. We are dedicated to fostering a collaborative, in-person team environment in San Mateo.

Sep 27, 2025
Roblox logo
Full-time|$195.8K/yr - $242.1K/yr|On-site|San Mateo, CA, United States

At Roblox, millions of users engage daily to explore, create, play, learn, and connect within immersive 3D digital experiences crafted by our vibrant global community of developers and creators. We are dedicated to building innovative tools and platforms that empower our community to actualize any experience they envision. Our mission is to revolutionize the way people connect from anywhere in the world, regardless of the device they use. We aim to unite a billion people with positivity and respect, and we are in search of extraordinary talent to help us achieve this goal. Embarking on a career at Roblox means you will be instrumental in shaping the future of human interaction, tackling unique technical challenges at scale, and contributing to the creation of safer, more respectful shared experiences for everyone.
As a Senior Machine Learning Engineer focusing on Recommendation Systems, you will play a pivotal role in enhancing user retention, engagement, and monetization for our vast user base. This position presents a remarkable opportunity to transform how users search and discover captivating immersive experiences and digital avatars within our Marketplace, as well as personalize advertising. You will address a wide array of high-scale ranking, retrieval, and personalization challenges across our platform. We integrate pioneering research (including deep learning, generative AI, and reinforcement learning techniques) with large-scale engineering to bridge the gap between experimentation and production. You will design algorithms that operate at a massive scale, shaping the future of recommender systems for user-generated content.

Feb 10, 2026
Roblox Corporation logo
Full-time|On-site|San Mateo, CA, United States

Join Roblox as a Senior Machine Learning Engineer in the Account Identity team, where you will leverage your expertise in machine learning to enhance user experiences and security. You will collaborate with a talented group of engineers and researchers to develop innovative solutions that contribute to the growth and safety of our platform.

Mar 26, 2026
IXL Learning logo
Full-time|$116K/yr - $150K/yr|Hybrid|San Mateo, CA

Join IXL Learning, a leading developer of personalized learning solutions that have positively impacted millions worldwide. We are on the lookout for passionate Software Engineers who are eager to enhance our successful educational products and pioneer innovative new solutions. Being a part of our team means contributing to products that make a real difference in education. This full-time position is based in our San Mateo, CA headquarters. Our work schedule is Monday to Friday in the office, with the flexibility to work from home one day per week.

Jan 26, 2026
Fireworks AI logo
Full-time|$175K/yr - $220K/yr|On-site|San Mateo, CA

About Us: At Fireworks AI, we are at the forefront of developing innovative generative AI infrastructure. Our platform is recognized for delivering top-tier models and the industry's fastest, most scalable inference capabilities. As an industry leader in LLM inference speed, we are pushing boundaries with groundbreaking projects, including our own function calling and multimodal models. Fireworks is a Series C startup valued at $4 billion, supported by premier investors such as Benchmark, Sequoia, Lightspeed, Index, and Evantic. Our passionate and collaborative team is comprised of seasoned professionals from Meta PyTorch and Google Vertex AI.
The Role: We are seeking a Training Infrastructure Engineer to design, build, and optimize the infrastructure that underpins our large-scale model training operations. Your contributions will be pivotal in establishing high-performance AI training infrastructure. You'll work closely with AI researchers and engineers to develop robust training pipelines, optimize distributed training workloads, and guarantee the reliability of model development.

Mar 5, 2026
Roblox logo
Full-time|$195.8K/yr - $242.1K/yr|On-site|San Mateo, CA, United States

Every day, millions of users engage with Roblox to explore, create, play, learn, and connect with friends through immersive 3D digital experiences, all developed by our vibrant global community of creators. At Roblox, we're crafting tools and platforms that empower our community to bring their imaginative concepts to life. Our vision is to transform how people connect from anywhere in the world on any device. We're on a mission to unite a billion individuals with optimism and civility, and we are seeking exceptional talent to help us achieve this goal. A career with Roblox means you will help shape the future of human interaction, tackle unique technical challenges at scale, and contribute to creating safer, more respectful shared experiences for all.
In this role, you will leverage computer vision and graphics technologies to enhance how our global community discovers, creates, and interacts within our virtual platform. You will be responsible for developing advanced models to interpret and analyze various content forms: experiences, text, images, videos, and 3D avatars. Your contributions will significantly impact core systems driving the next generation of creation, search, recommendations, and trust & safety initiatives across our expansive ecosystem. We are in search of PhD candidates who are enthusiastic about the convergence of computer vision, graphics, and generative modeling. You will engage in applied research and engineering projects with tangible production impact, enabling innovative ways for creators and players to realize their ideas.

Feb 10, 2026
Skydio logo
Full-time|$170K/yr - $170K/yr|On-site|San Mateo, California, United States

Skydio is a pioneering force in the drone industry, recognized as the leading autonomous flight company in the United States and globally. Our team merges advanced expertise in artificial intelligence with cutting-edge hardware and software development, operational excellence, and an unwavering commitment to customer satisfaction. We empower a diverse range of drone users, from utility inspectors and first responders to soldiers in complex battlefield situations.
About the Role: We are in search of a skilled Software Engineer to spearhead the development of innovative tools that enhance the autonomy lifecycle. In this role, you will be responsible for creating and refining essential internal platforms that enable engineers to test new concepts, analyze system behaviors, and comprehend intricate interactions between software and the physical environment. Your work will encompass both backend systems and front-end visualization, necessitating a strong foundation in software design, developer experience, and a genuine passion for empowering others through effective tooling. The systems you develop will be utilized daily by autonomy developers, test engineers, and various stakeholders throughout the organization.
Areas of Responsibility: Design and construct robust replay and analysis systems that allow engineers to inspect and replicate recorded autonomy behaviors with detailed control over system states, perception outputs, and decision-making processes across the entire stack. Develop scalable infrastructure for automated testing and failure triage, enhancing our simulation and log-driven test coverage while expediting root-cause analysis through automated log processing and diagnostics. Collaborate at the intersection of autonomy software and core robotics middleware, establishing clear APIs, data contracts, and performance benchmarks for messaging, state propagation, and inter-subsystem coordination, while partnering closely with downstream teams to facilitate their implementation and integration. Create and implement high-quality developer infrastructure and tools that emphasize reliability, performance, and usability, fostering rapid iteration, safe experimentation, and sustained productivity across the autonomy division.
What You’ll Do: Lead the design and execution of scalable tools utilized throughout autonomy development and testing workflows. Engage with autonomy, QA, and infrastructure teams to gather requirements, prioritize tasks, and deliver impactful solutions.

Mar 4, 2026
Dyneti logo
Full-time|On-site|San Mateo

About Us: At Dyneti, we are committed to ensuring that digital payments are both seamless and secure. We have developed DyScan, an innovative software library that empowers digital merchants to prevent fraud while enhancing conversion rates by simply taking a photo of a credit card. Founded by a fraud prevention specialist from Uber, Dyneti has secured backing from a distinguished group of investors, including Y Combinator. Our technology has facilitated the processing of hundreds of millions of credit card scans globally, serving a diverse clientele that includes Fortune 100 companies and rapidly growing tech unicorns.
Job Overview: We invite you to join Dyneti as a member of our founding engineering team. This is a unique opportunity to work closely with our CEO in developing new features and launching innovative products within the deep learning domain.

Jul 24, 2025
embedding-vc logo
Full-time|On-site|San Francisco Bay Area

Join us at Moonlake, where we leverage AI to craft immersive world simulations that push the boundaries of technology.
Role Overview
Enhancing Training Efficiency: Implement advanced dataloaders, fusion techniques, activation rematerialization, and gradient checkpointing strategies. Utilize FSDP/ZeRO/tensor+pipeline parallelism and fine-tune NCCL settings for optimal performance.
Boosting GPU and Kernel Performance: Conduct Nsight profiling and develop Triton/CUDA kernels along with fused operations. Implement flash-attention-style optimizations, sequence packing, and KV-cache improvements.
Optimizing Inference Processes: Facilitate low-latency serving, continuous batching, and speculative decoding techniques. Engage in quantization methods (GPTQ/AWQ), model distillation, and pruning practices.
Infrastructure and Reliability Enhancements: Manage SLURM/K8s multi-node jobs and ensure checkpoint hygiene. Focus on determinism, environment pinning, and effective GPU failure management.
We pride ourselves on being an on-site, collaborative team located in San Mateo.

Jan 15, 2026
Roblox Corporation logo
Full-time|On-site|San Mateo, CA, United States

Join Roblox as a Principal or Senior Machine Learning Scientist in our Search and Discovery team, where you will harness your expertise in machine learning to enhance user experience and drive innovative solutions. Your contributions will directly impact how users discover and interact with content on our platform.

Feb 26, 2026
