Technical Staff Member - Optimizing Machine Learning Efficiency

embedding-vc, San Francisco Bay Area
On-site, Full-time

Qualifications

Strong background in machine learning and AI. Experience with Python and deep learning frameworks. Familiarity with GPU programming and performance optimization techniques. Ability to work collaboratively in a fast-paced environment.

About the job

Join us at Moonlake, where we leverage AI to craft immersive world simulations that push the boundaries of technology.

Role Overview

Enhancing Training Efficiency

  • Build high-throughput dataloaders, apply operator fusion, and use activation rematerialization (gradient checkpointing) strategies.

  • Use FSDP, ZeRO, and tensor and pipeline parallelism, and tune NCCL settings for optimal performance.

Boosting GPU and Kernel Performance

  • Profile with Nsight and develop Triton/CUDA kernels, including fused operations.

  • Implement flash-attention-style optimizations, sequence packing, and KV-cache improvements.

Optimizing Inference Processes

  • Build low-latency serving with continuous batching and speculative decoding.

  • Apply quantization (GPTQ/AWQ), model distillation, and pruning.

Infrastructure and Reliability Enhancements

  • Manage SLURM/K8s multi-node jobs and ensure checkpoint hygiene.

  • Maintain determinism and environment pinning, and handle GPU failures effectively.

We pride ourselves on being an on-site, collaborative team located in San Mateo.

About embedding-vc

At embedding-vc, we are at the forefront of artificial intelligence, pioneering innovative solutions that create realistic world simulations through our product, Moonlake. We are dedicated to fostering a collaborative and dynamic work environment.
