About the job
Join Our Team of Innovators
At 42dot, we are at the forefront of autonomous driving technology and are seeking exceptional Senior ML Platform Engineers to develop our core data and machine learning training platforms. Our team builds a distributed, scalable data platform that handles vast datasets, including millions of scenes, along with high-performance data-serving SDKs essential for ML model training and evaluation. These platforms are designed to significantly improve the efficiency of the ML model development lifecycle, encompassing training, evaluation, deployment, and monitoring in a cloud environment.
Key Responsibilities
Define and implement the technical strategy for a robust, large-scale data platform that effectively manages, visualizes, and serves extensive datasets for machine learning model training and validation.
Develop a comprehensive data lakehouse for autonomous driving scene datasets, integrating sensor, calibration, and annotation data.
Lead the development of the Autonomous Driving Data SDK, which includes functionality for scene data searching, dataset preparation, and loading.
Identify and resolve performance bottlenecks across data pipelines, with a focus on processing and search latencies as well as Test Procedure (TP) coverage.
Establish and maintain the infrastructure for Data Platform components, including the Data Processing Pipeline, Database, Data Lakehouse, and Data Serving.
Collaborate with cross-functional teams, including those focused on ML algorithms, applications, and Cloud Infrastructure, to ensure alignment of ML Platforms with the overarching Autonomous Driving System Architecture.
Qualifications
A Bachelor’s degree or higher in Computer Science, Engineering, Robotics, or a related technical field.
At least 7 years of experience in Data Engineering or ML Platform roles.
Expert-level proficiency in Python, with substantial experience in developing Python SDKs.
Proficiency with databases such as MongoDB and PostgreSQL.
Deep understanding of modern AI frameworks like PyTorch and TensorFlow, particularly regarding distributed data loaders for model training.
Hands-on experience orchestrating data pipelines using tools such as Databricks Workflows or Apache Airflow, along with integrating these pipelines with machine learning models.
