AI Research Engineer - Scaling at 1X | Palo Alto, CA

1X | Palo Alto, California, United States
On-site | Full-time | $180K/yr - $300K/yr


Key Responsibilities

- Lead the scaling of distributed training and inference systems
- Optimize compute resources to prioritize data as the primary constraint
- Facilitate extensive training runs (1000+ GPUs) using robot-generated data, ensuring robust fault tolerance and effective experiment tracking
- Enhance inference throughput for datacenter applications, including world models and diffusion engines
- Minimize latency and improve performance for on-device robot policies using techniques such as quantization, scheduling, and distillation

Essential Qualifications

- Proficient programming skills in Python and/or C++
- In-depth understanding of training and inference performance bottlenecks and scaling principles
- A foundational belief in the importance of scalability in humanoid robotics
- Bachelor's degree in Computer Science or a related field
- Experience with distributed training frameworks (e.g., TorchTitan, DeepSpeed, FSDP/ZeRO) and multi-node debugging
- Demonstrated ability to optimize inference performance using graph compilers, batching/scheduling, and serving systems like TensorRT
- Familiarity with quantization methods (PTQ, QAT, INT8/FP8) and associated tools
- Experience developing or optimizing CUDA or Triton kernels, with a focus on hardware-level optimization techniques

About the job

AI Research Engineer, Scaling | Infrastructure

Location: Palo Alto, CA (on-site)

At 1X, we are pioneering the development of humanoid robots designed to collaborate with humans, addressing labor shortages and fostering abundance across various sectors.

The Role: As an AI Research Engineer specializing in Scaling, you will architect and implement the infrastructure for large-scale training, evaluation, and deployment across our fleet of robots. You will turn experimental systems into production-ready platforms optimized for throughput and latency in both datacenter and edge environments. This work determines how efficiently our models learn and run inference, and so directly shapes the capabilities of our general-purpose humanoid robots.

About 1X

1X is at the forefront of creating humanoid robots that work alongside humans to alleviate labor shortages and enhance productivity in various industries. We are committed to innovation and excellence in robotics technology.
