Technical Staff Member ML Infrastructure Performance jobs in San Mateo – Browse 160 openings on RoboApply Jobs

Technical Staff Member - ML Infrastructure & Performance

On-site Full-time

Qualifications

Candidates should possess a strong background in machine learning infrastructure, performance optimization, and practical experience with the technologies outlined in the scope of work. A passion for pushing the boundaries of AI technology and a collaborative mindset are essential.

About the job

Join the innovative team at Moonlake, where we harness the power of AI to create real-time interactive content.

Mission: Elevate performance metrics by enhancing throughput, reducing latency, and optimizing costs - deploying our models 2–10 times faster and at lower costs without compromising quality.

Scope of Work:

GPU Performance: Expertise in CUDA/Triton kernels, FlashAttention family, paged attention, and CUDA Graphs.
Serving Stack: Proficiency with TensorRT-LLM/Triton Inference Server, vLLM/TGI; continuous batching; on-GPU KV reuse; speculative decoding/medusa; and mixture-of-agents routing.
Parallelism: Experience with FSDP/ZeRO, TP/PP/expert parallel; NCCL tuning.
Quantization/PEFT: Familiarity with AWQ/GPTQ/FP8; LoRA/DoRA serving.
Systems: Knowledge of Ray/k8s/Argo, observability tools (Prom/Grafana/OpenTelemetry), autoscaling, A/B infrastructure, and canary + rollback.

Tech Signals:

Ideal candidates will have previous experience at infrastructure-heavy startups such as Databricks or Roblox.

We are dedicated to maintaining an on-site, in-person team based in San Mateo.

About embedding-vc

Moonlake is at the forefront of AI-driven innovation, specializing in the development of real-time interactive content. Our mission is to deliver cutting-edge technology solutions that enhance user experiences and streamline operations.

1 - 20 of 160 Jobs

Technical Staff Member - ML Infrastructure & Performance

embedding-vc

Full-time|On-site|San Mateo, CA

Join the innovative team at Moonlake, where we harness the power of AI to create real-time interactive content. Mission: Elevate performance metrics by enhancing throughput, reducing latency, and optimizing costs - deploying our models 2–10 times faster and at lower costs without compromising quality. Scope of Work: GPU Performance: Expertise in CUDA/Triton ker…

Dec 12, 2025

Technical Staff Member - ML Infrastructure & Performance

Moonlake

Full-time|On-site|San Mateo

Welcome to Moonlake, where we harness AI to create immersive world simulations.

Our Mission: To improve throughput, latency, and cost efficiency, deploying our models 2–10 times faster and more affordably, all while maintaining quality.

Key Responsibilities:

Optimize GPU performance through CUDA/Triton kernels, FlashAttention, paged attention, and CUDA Graphs.
Manage the serving stack, including TensorRT-LLM/Triton Inference Server, vLLM/TGI; implement continuous batching and on-GPU KV reuse; explore speculative decoding/medusa and mixture-of-agents routing.
Enhance parallelism via FSDP/ZeRO, TP/PP/expert parallel, and fine-tune NCCL.
Implement quantization/PEFT techniques such as AWQ/GPTQ/FP8 and LoRA/DoRA serving.
Oversee systems like Ray/k8s/Argo, ensuring observability with Prom/Grafana/OpenTelemetry, autoscaling, A/B testing infrastructure, and canary deployments with rollback capabilities.

Ideal Candidate Profile:

Candidates should have prior experience in infrastructure-heavy startups, particularly at companies like Databricks or Roblox. We are dedicated to fostering a collaborative, in-person team environment in San Mateo.

Sep 27, 2025

Technical Staff Member - AI Training Infrastructure

Fireworks AI

Full-time|$175K/yr - $220K/yr|On-site|San Mateo, CA

About Us: At Fireworks AI, we are at the forefront of developing innovative generative AI infrastructure. Our platform is recognized for delivering top-tier models and the industry's fastest, most scalable inference capabilities. As an industry leader in LLM inference speed, we are pushing boundaries with groundbreaking projects, including our own function calling and multimodal models. Fireworks is a Series C startup valued at $4 billion, supported by premier investors such as Benchmark, Sequoia, Lightspeed, Index, and Evantic. Our passionate and collaborative team is comprised of seasoned professionals from Meta PyTorch and Google Vertex AI.

The Role: We are seeking a Training Infrastructure Engineer to design, build, and optimize the infrastructure that underpins our large-scale model training operations. Your contributions will be pivotal in establishing high-performance AI training infrastructure. You'll work closely with AI researchers and engineers to develop robust training pipelines, optimize distributed training workloads, and guarantee the reliability of model development.

Mar 5, 2026

Technical Staff Member - Cloud Infrastructure

Fireworks AI

Full-time|On-site|New York, NY; San Mateo, CA

Join Fireworks AI as a Technical Staff Member specializing in Cloud Infrastructure. In this pivotal role, you will be at the forefront of our innovative cloud solutions, collaborating with a dynamic team of professionals dedicated to advancing cloud technologies. Your expertise will contribute to building and maintaining robust infrastructure that supports our evolving business needs.

May 1, 2026

Technical Staff Member - Advanced Machine Learning Optimization

Moonlake

Full-time|On-site|San Mateo

Join Moonlake, a pioneering company harnessing AI to develop immersive world simulations.

Role Overview:

Enhancing Training Efficiency: Implement data loaders, fusion techniques, activation rematerialization, and gradient checkpointing. Optimize training with FSDP/ZeRO/tensor+pipeline parallelism and NCCL tuning.
Improving GPU and Kernel Performance: Conduct Nsight profiling, develop Triton/CUDA kernels, and create fused operations. Implement flash-attention style accelerations, sequence packing, and KV-cache optimizations.
Optimizing Inference: Focus on low-latency serving, continuous batching, and speculative decoding strategies. Apply quantization methods (GPTQ/AWQ), distillation, and pruning techniques.
Infrastructure and Reliability: Manage SLURM/Kubernetes multi-node jobs and ensure checkpoint hygiene. Maintain determinism, environment pinning, and effectively handle GPU failures.

Our dedicated team thrives on collaboration in our San Mateo office.

Nov 25, 2025

Technical Staff Member, Cluster Management

Fireworks AI

Full-time|On-site|San Mateo, CA

About Us: At Fireworks AI, we are pioneering the next generation of generative AI infrastructure. Our innovative platform is designed to deliver the highest quality models with unparalleled speed and scalability in inference. Recognized as a leader in LLM inference speed, we continuously push the boundaries of technology through transformative projects, including our proprietary function calling and multimodal models. As a Series C company valued at $4 billion, we are supported by esteemed investors like Benchmark, Sequoia, Lightspeed, Index, and Evantic. Our team is a dynamic blend of visionaries and builders, with a strong foundation from Meta PyTorch and Google Vertex AI.

The Role: As a Member of the Technical Staff specializing in Cluster Management at Fireworks AI, you will be pivotal in ensuring our world-scale virtual AI cloud operates reliably, efficiently, and at peak performance. Your expertise in large-scale distributed systems, cloud infrastructure, and operational excellence will be essential as you collaborate with top-tier software engineers and AI specialists to elevate our advanced AI platforms, addressing the rapid growth and evolving needs of our applications. This position is ideal for individuals passionate about building resilient, observable, and automated systems that drive customer success.

Mar 5, 2026

Technical Staff Member - Diffusion Model

embedding-vc

Full-time|On-site|San Francisco Bay Area

Join us at Moonlake, where we harness the power of AI to create immersive world simulations.

Model Development & Architecture: Design and refine innovative 2D/3D/image/video/audio diffusion architectures. Engage in conditioning tasks involving text, images, poses, layouts, and control signals with multi-modal encoders and guidance strategies.
Training & Optimization: Conduct large-scale diffusion training to enhance model performance. Focus on improving sample quality while optimizing computational resources through advanced objectives, distillation, and consistency models.
Control & Alignment: Implement cutting-edge techniques such as ControlNet, LoRA, and IP-Adapters to manage style, identity, geometry, and control. Develop robust editing, inpainting, and personalization pipelines, including DreamBooth and custom subject/style tuning.

Our vibrant team is dedicated to working on-site, currently located in San Mateo.

Jan 15, 2026

Technical Staff Member - Code Generation

Moonlake

Full-time|On-site|San Mateo

Welcome to Moonlake, where we leverage AI to craft immersive world simulations.

Mission: Join us as an Applied AI Research Engineer focused on designing and coding intelligent agents (post-training and systems).

Scope of Work:

Design agentic systems: Develop tool catalogs, function calls, program synthesis, repair loops, and control mechanisms such as ReAct, Reflexion, ToT, and LangGraph, along with self-verification and sandboxed execution.
Evaluation mindset: Create comprehensive task suites for multi-step coding, including full-stack LLM engineering, prompt libraries, routing, retrieval, KV-cache management, streaming, and telemetry.
Security and isolation: Implement Docker/firejail, manage network egress controls, maintain secrets hygiene, and ensure dependency pinning for supply-chain integrity.
Strong post-training capabilities: Conduct supervised fine-tuning, preference and trace reinforcement learning (DPO/RLAIF/RLHF), dataset curation, reward shaping, and safety filtering.

Technical Signals:

Experience shipping agents that successfully navigate real repository test suites from start to finish.
Published research in the fields of agentic systems and code generation, contributing to frameworks or open-source evaluations such as LangGraph, AutoGen, Guidance, LEAP, and SWE-bench variants.
Developed datasets from execution traces, demonstrating significant enhancements from data over parameters.

We are committed to maintaining an on-site, collaborative team environment based in San Mateo.

Oct 1, 2025

Technical Staff Member - Embodied Agents

embedding-vc

Full-time|On-site|San Francisco Bay Area

Join Moonlake, the forefront of AI technology for crafting immersive world simulations.

Your Role: Design and train advanced embodied agents capable of perceiving their environment through vision, depth, and language; reasoning through memory and planning; and acting with precision in both continuous and discrete control.

We are dedicated to fostering a collaborative, in-person team environment based in San Mateo, CA.

Jan 15, 2026

Technical Staff Member - Embodied Agents

Moonlake

Full-time|On-site|San Mateo

Join Moonlake, a pioneering company at the forefront of artificial intelligence dedicated to crafting immersive world simulations.

As a member of our Technical Staff, you will design and train advanced embodied agents capable of perception through vision, depth, and language, along with reasoning abilities including memory and planning, to execute actions through both continuous and discrete control methods.

We pride ourselves on maintaining a collaborative and innovative team environment, currently operating from our San Mateo headquarters.

Dec 22, 2025

Technical Staff Member - Diffusion Model Development

Moonlake

Full-time|On-site|San Mateo

Join Moonlake, where we harness the power of AI to craft immersive world simulations.

Modeling & Architecture: Develop and enhance diffusion architectures for 2D/3D images, videos, and audio. Focus on conditioning techniques: text/image/pose/layout/control signals, multi-modal encoders, and guidance strategies.
Training & Optimization: Engage in large-scale diffusion training processes. Enhance sample quality while managing compute trade-offs through improved objectives, distillation, and consistency models.
Control & Alignment: Utilize techniques such as ControlNet, LoRA, and IP-Adapters for style, identity, geometry, and control adaptations. Design editing, inpainting, and personalization workflows, including DreamBooth and custom subject/style tuning.

We are dedicated to maintaining an on-site, collaborative environment in San Mateo.

Nov 25, 2025

Technical Staff Member - Optimizing Machine Learning Efficiency

embedding-vc

Full-time|On-site|San Francisco Bay Area

Join us at Moonlake, where we leverage AI to craft immersive world simulations that push the boundaries of technology.

Role Overview:

Enhancing Training Efficiency: Implement advanced dataloaders, fusion techniques, activation rematerialization, and gradient checkpointing strategies. Utilize FSDP/ZeRO/tensor+pipeline parallelism and fine-tune NCCL settings for optimal performance.
Boosting GPU and Kernel Performance: Conduct Nsight profiling and develop Triton/CUDA kernels along with fused operations. Implement flash-attention-style optimizations, sequence packing, and KV-cache improvements.
Optimizing Inference Processes: Facilitate low-latency serving, continuous batching, and speculative decoding techniques. Engage in quantization methods (GPTQ/AWQ), model distillation, and pruning practices.
Infrastructure and Reliability Enhancements: Manage SLURM/K8s multi-node jobs and ensure checkpoint hygiene. Focus on determinism, environment pinning, and effective GPU failure management.

We pride ourselves on being an on-site, collaborative team located in San Mateo.

Jan 15, 2026

Product Design Engineer - Member of Technical Staff

embedding-vc

Full-time|On-site|San Mateo, CA

Join our innovative team at Moonlake, where we harness the power of AI to create dynamic, real-time interactive content.

Our Mission: To deliver exceptional product-level user experiences and full-stack solutions, transforming research insights into captivating, market-ready products swiftly.

Key Qualifications:

Exemplary product acumen; experience in rapid prototyping and ownership from inception to launch.
Technical stack familiarity: Next.js/React, Tailwind/shadcn, edge/serverless architectures (Vercel/Cloudflare), tRPC/GraphQL, data synchronization, and authentication protocols.
Ability to design and implement model endpoints, observability hooks, feature flags, and experimentation toggles.

Technical Competencies:

A portfolio showcasing polished AI applications in a production environment.
Experience in developing and deploying design systems with a strong focus on performance metrics.

We pride ourselves on being an on-site team, located in the vibrant city of San Mateo.

Dec 12, 2025

Technical Staff Member - Evals & Post-Training Product

Fireworks AI

Full-time|On-site|San Mateo, CA

About Us: At Fireworks, we are at the forefront of generative AI infrastructure. Our platform stands out by delivering the highest-quality models with the fastest and most scalable inference capabilities in the industry. Recognized as a leader in LLM inference speed through independent benchmarking, we are spearheading innovation with projects like our proprietary function calling and multimodal models. As a Series C company valued at $4 billion and supported by leading investors such as Benchmark, Sequoia, Lightspeed, Index, and Evantic, we are a dynamic and collaborative team founded by veterans from Meta PyTorch and Google Vertex AI.

We are in search of a Technical Staff Member for Evals & Post-Training Product to play a pivotal role in shaping how developers enhance models on our Fireworks platform. This position operates at the crossroads of product engineering, developer experience, and model quality.

In this role, you will create products and workflows that establish a continuous connection between evaluation and post-training. You will facilitate internal teams in executing evaluations at scale, empower external developers through our open-source Eval Protocol SDK, and oversee critical product experiences aimed at fine-tuning custom models on Fireworks.

Your work will span the entire stack, from APIs, SDKs, and backend systems to user-facing components in the web application, enabling users to easily create evaluations, interpret results, fine-tune models, and iterate swiftly. Additionally, you will engage directly with customers and internal teams to identify pain points, support real-world use cases, and transform recurring challenges into efficient product features.

Mar 24, 2026

Staff Software Engineer - Cloud Infrastructure and Applications

Notable

Full-time|On-site|San Mateo, CA

Join Notable as a Staff Software Engineer specializing in Cloud Infrastructure and Applications. In this pivotal role, you will lead the design, development, and implementation of scalable cloud solutions that drive our innovative projects forward. Collaborate with cross-functional teams to optimize application performance and enhance user experience.

Mar 17, 2026

Applied ML Scientist in Cheminformatics (Staff / Principal)

Genesis Molecular AI

Full-time|On-site|San Mateo, CA

Team Overview

Be part of an elite team leading the charge in AI and biochemistry innovation.

At Genesis Molecular AI, our close-knit group comprises distinguished deep learning researchers, software engineers, and pioneers in drug discovery. Together, we are on a mission to create next-generation AI foundation models that will pioneer transformative therapies for patients battling severe diseases.

We transcend traditional machine learning applications in biology; we engage in fundamental research at the dynamic intersection of machine learning, physics, and computational chemistry, continuously pushing the boundaries of each discipline. Collaborating with top multidisciplinary researchers, you will design and develop generative foundation models at scale, benefitting from extensive computational resources and large-scale simulations.

Role Overview

This distinctive role is tailored for a scientist eager to drive the application of cutting-edge AI in addressing real-world drug discovery challenges. You will serve as a vital link between our long-term research initiatives and our active drug discovery programs. Your objective will be to build, evaluate, monitor, and enhance our advanced models directly within active drug programs, spearheading model validation, deployment, and analysis to inform the discovery of new medicines.

You will function as both a translator and a strategist, aligning our research efforts with critical challenges and ensuring our drug hunters maximize the potential of our top-tier AI platform. This position necessitates a profound understanding of cheminformatics, computational chemistry, and experimental techniques, alongside robust data science skills and an ability to convey complex concepts to a diverse, multidisciplinary team.

We have openings available at various seniority levels: Senior, Staff, and Principal.

Jul 30, 2025

Senior Staff Software Engineer, Platform Infrastructure

Verkada

Full-time|On-site|San Mateo, CA United States

Verkada seeks a Senior Staff Software Engineer to join the Platform Infrastructure team in San Mateo, CA. This senior technical position focuses on shaping and improving the core infrastructure that underpins Verkada's products, with particular attention to performance and scalability.

Key Responsibilities:

Lead the design, building, and ongoing support of platform infrastructure.
Partner with engineering teams to deliver software solutions that enable the platform to grow and adapt.
Spot opportunities to improve reliability and efficiency, and guide initiatives to address them.

Collaboration: Frequent collaboration with cross-functional teams is central to this role. The goal is to ensure that infrastructure evolves to meet the demands of both current and future products.

Apr 27, 2026

Data Platform Engineer - Member of Technical Staff

Fireworks AI

Full-time|$175K/yr - $220K/yr|On-site|San Mateo, CA

About Us: At Fireworks, we are pioneering the future of generative AI infrastructure. Our platform is recognized for delivering the highest-quality models with the industry's fastest and most scalable inference capabilities. Independently benchmarked as the leader in LLM inference speed, we are at the forefront of innovation, working on advanced projects including our proprietary function calling and multimodal models. As a Series C company valued at $4 billion, we are backed by esteemed investors such as Benchmark, Sequoia, Lightspeed, Index, and Evantic. Our team is made up of ambitious builders, including veterans from Meta PyTorch and Google Vertex AI.

The Role: We are seeking a skilled Data Platform Engineer specializing in Order-to-Cash (OTC) Revenue Transformation and AI Application Enablement. This role will involve taking ownership of and evolving the comprehensive billing, revenue, and business data pipeline, from usage metering and invoice generation to revenue recognition and financial reporting. You will be positioned at the nexus of Engineering, Finance, and Data, ensuring precise capture, billing, recognition, and reconciliation of every dollar generated across our five revenue streams.

This impactful, cross-functional role requires hands-on engagement with our billing platform (e.g., Orb), accounting systems, data warehouse (BigQuery), and cloud marketplaces (AWS, GCP). Ultimately, you will contribute to the design of AI-enabled workflow agents that automate reconciliation, anomaly detection, and revenue operations once the core data infrastructure is fortified.

Apr 7, 2026

Senior/Staff/Principal ML Research Engineer - Foundation Models

Genesis Molecular AI

Full-time|On-site|San Mateo, CA

ML Research Engineer, Foundation Models

About Our Team

Become part of an elite team that is pioneering advancements in AI and biochemistry. At Genesis Molecular AI, we are a cohesive group of accomplished deep learning researchers, software engineers, and innovators in drug discovery.

Our mission is transformative: to develop next-generation AI foundation models that will enable new therapies for patients facing severe diseases. We engage in groundbreaking research at the convergence of machine learning, physics, and computational chemistry, continually challenging the limits of these disciplines.

The Genesis AI team is at the helm of this revolution, creating extensive generative models that are trained on a wide array of molecular data, backed by robust computational infrastructure and simulation processes.

This position integrates machine learning research, structural biology, and computational chemistry, necessitating rigorous technical proficiency and strong collaborative skills across disciplines.

About the Role

We are seeking an exceptional ML Research Engineer who excels at merging fundamental research with production-level engineering.

As a vital member of the Genesis AI team, you will be instrumental in inventing, scaling, and deploying our next generation of foundation models for molecular science.

You will collaborate closely with ML researchers, computational chemists, and drug discovery scientists to transform cutting-edge model concepts into operational systems that drive real drug discovery initiatives.

Your responsibilities will include:

Enhancing model pretraining pipelines
Advancing reinforcement learning and post-training systems
Optimizing the performance of large molecular models
Implementing structure prediction models like Pearl in production environments utilized by chemists and drug initiatives

We require a professional who can bridge the gap between ML and computational chemistry, facilitating swift transitions from research insights to practical applications.

Jul 30, 2025

Staff Technical Recruiter at Skydio | San Mateo, CA

Skydio

Full-time|$174K/yr - $227K/yr|Hybrid|San Mateo, California, United States

Skydio stands at the forefront of the drone industry in the United States and is recognized globally for its cutting-edge autonomous flight technology, which is pivotal for the advancement of drones and aerial mobility. Our team merges profound knowledge in artificial intelligence with superior hardware and software development, operational excellence, and an unwavering commitment to customer satisfaction. We aim to enable a wider and more diverse audience of drone users, ranging from utility inspectors to first responders and military personnel in various scenarios.

About the Role: We are seeking a Staff Technical Recruiter to play a vital role in assembling the teams that are defining the future of autonomous flight. In this position, you will drive essential engineering searches, collaborate directly with organizational leaders, and significantly influence our hiring processes as we expand. This senior, hands-on recruiting opportunity is perfect for an individual who excels in dynamic environments and possesses the skills to attract top-tier technical talent.

PLEASE NOTE: Skydio operates under a hybrid work model, with teams working in-office three days a week. This position will be located at our headquarters in San Mateo, CA.

Your Impact:

Act as a strategic recruiting partner to senior engineering leaders, shaping organizational design, headcount planning, and recruitment strategies.
Manage end-to-end talent acquisition strategies and execution for one or more of our core engineering divisions.
Develop and fine-tune diverse, high-performing talent pipelines for specialized skill sets in emerging technology sectors.
Lead executive-level recruitment efforts from initial discussions to successful closure, ensuring a top-notch candidate experience at every stage.
Collaborate with People Operations, Compensation, and Finance to streamline headcount planning and recruitment processes for scalability and efficiency.
Promote structured hiring methodologies, maintaining rigor and fairness in all recruitment decisions.
Mentor fellow recruiters, sharing insights and best practices to enhance Skydio’s recruiting culture and continuously elevate our talent attraction and retention strategies.

Your Qualifications:

Extensive experience in recruiting within at least two of the following areas: Cloud, Mobile, Infrastructure, Embedded, Wireless, Camera/Optics, or Hardware Engineering.
Demonstrated success in building technical teams at high-growth or impactful technology organizations.
Strong ability to thrive in fast-paced environments and deliver results.

Oct 7, 2025
