About the job
Mission Summary:
At Motional, we are changing how autonomous vehicles extract intelligence from vast amounts of multimodal sensor data. Our autonomous driving stack relies on identifying rare edge cases, long-tail scenarios, and critical model errors. Our ML-powered multimodal data mining framework, Omnitag, serves as the backbone of this discovery process.
As a Senior Machine Learning Engineer on the Data Mining team, you will build the "Brain" of this framework. This involves designing large-scale multimodal Teacher models that capture the complexity of the world, then distilling them into highly efficient Student models capable of processing exabytes of data in near real time. Your work will sit at the intersection of large-scale representation learning, retrieval optimization, and reasoning systems. You will directly shape how we compress knowledge into effective encoders for fast search and how we use reinforcement learning to improve data discovery workflows and intelligent querying. By developing smarter mining tools, you will accelerate the entire model improvement lifecycle for teams engaged in post-training analysis, error diagnosis, and dataset curation.
What You’ll Do:
- Architect and Train Distilled Models: Design and implement teacher-student model frameworks for multimodal sensor data. Develop efficient training pipelines for knowledge distillation while ensuring that student models achieve high accuracy with minimal inference latency and memory usage.
- Reinforcement Learning for Data Discovery: Create RL-based policy learning and reasoning systems tailored for autonomous driving applications. Implement and scale RL training workflows (e.g., PPO, DQN, actor-critic methods) in both simulation and real-world settings. Investigate reward shaping, environment modeling, and multi-agent RL where relevant.
- Optimize Model Deployment for Real-Time Inference: Collaborate with backend engineers to deploy distilled and RL models into production, optimizing for latency, throughput, and hardware efficiency across GPU/CPU clusters. Implement model versioning, A/B testing, and performance regression monitoring.
- Research and Integrate Agentic Systems: Explore and prototype workflows for autonomous reasoning, chain-of-thought prompting, and goal-directed behavior. Integrate these systems into our broader autonomy stack as experimental or production components.
- Drive Production Reliability: Establish patterns and protocols that enhance the reliability of our production systems.
