Applied Machine Learning Engineer for Real-Time Video Generation

On-site, Full-time

Qualifications

  • Proven experience in machine learning, particularly in video generation.

  • Strong programming skills in Python and familiarity with deep learning frameworks (e.g., TensorFlow, PyTorch).

  • Solid understanding of model optimization techniques and real-time systems.

  • Experience with large datasets and efficient data pipeline construction.

  • Ability to work collaboratively in a team-oriented environment.

About the job

About Cantina:

Cantina Labs is an innovative social AI company dedicated to developing cutting-edge real-time models that redefine expression, personality, and realism. Our mission is to animate characters and revolutionize storytelling, connections, and creativity. Our flagship platform, Cantina, is just the beginning of a transformative journey in social AI.

If you are passionate about harnessing AI to enhance human creativity and social interactions, we invite you to join us in shaping the future!

About the Role:
We are seeking an Applied Machine Learning Engineer with extensive hands-on experience in building large-scale video generation models, from data collection and training to distillation and acceleration into production-ready models. Our models are designed to be human-centric and product-oriented: envision interactive characters that can respond to text, audio, and image inputs while generating video with minimal latency.

This role combines applied research and engineering: you will focus on training runs, data management, model optimization, and the crucial process of transforming a capable research model into a real-time experience.

Typical time allocation (approximately):

  • 60–75% dedicated to training, fine-tuning, and distillation of large video models.

  • 15–25% focused on inference optimization (latency, memory, cost) and model runtime enhancements.

  • 10–15% allocated to prototyping and product integration (transitioning demos into shipped features).

Your Responsibilities:

  • Train and scale video generation models: Execute large-scale training and fine-tuning in multi-GPU (and, as necessary, multi-node) environments; manage the training loop, maintain stability and checkpointing, and improve iteration speed.

  • Manage video modeling data: Develop and enhance video datasets and pipelines (including decoding, sampling, filtering, quality control, conditioning alignment, and storage formats), ensuring the pipeline remains efficient and reliable at scale.

  • Distill and compress large models into efficient ones: Implement teacher-student distillation, reduce steps, simplify architectures, and balance quality and speed to meet real-time constraints.

  • Achieve real-time model performance: Conduct profiling, optimize memory usage, apply quantization-aware techniques when suitable, improve kernels and runtime, and focus on practical throughput and latency enhancements.

About Cantina Labs

Cantina Labs is at the forefront of social AI innovation, crafting advanced real-time models that are reshaping how stories are told and connections are made. Our commitment to pushing the envelope in character animation and human interaction offers a dynamic work environment where creativity and technology converge.
