
Lead AI Platform Engineer

Integrant · Cairo, Cairo Governorate, Egypt
On-site · Full-time



Experience Level

Manager

Qualifications

Required Qualifications:

  • 8+ years of experience in AI/ML systems, HPC, and AI infrastructure.
  • Strong proficiency in Python programming.
  • Extensive experience with GPU-based AI workloads and performance optimization.
  • In-depth knowledge of model optimization techniques, including quantization, pruning, and batching.
  • Hands-on experience with PyTorch, ONNX / ONNX Runtime, TensorRT / TensorRT-LLM, and Triton Inference Server.
  • Familiarity with CUDA, cuDNN, and GPU architecture fundamentals.
  • Experience with distributed systems (multi-GPU / multi-node).
  • Knowledge of NCCL communication, NVLink / InfiniBand, and orchestration tools such as Kubernetes or Slurm.
  • Proven track record of deploying AI models in production settings.
  • Ability to analyze system bottlenecks related to compute, memory, and network.
  • Experience with profiling tools such as Nsight and the TensorRT profiler.
  • Understanding of cost optimization strategies for GPU workloads.
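To give a sense of the Triton Inference Server and batching experience this role calls for: serving-side batching is typically configured per model. The fragment below is an illustrative `config.pbtxt` sketch only (the model name, batch sizes, and queue delay are hypothetical, not values from this posting).

```
name: "example_model"
platform: "tensorrt_plan"
max_batch_size: 8

# Let Triton coalesce individual requests into larger batches
# to improve GPU throughput, at a small latency cost.
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```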

About the job

Join Integrant as a Lead AI Platform Engineer and be a pivotal force in revolutionizing our AI capabilities.

The Lead AI Platform Engineer will be instrumental in integrating AI workloads with robust production-grade infrastructure, focusing on leveraging the NVIDIA AI stack to deliver high-performance, scalable, and optimized AI systems.

This role emphasizes model optimization, runtime efficiency, and GPU utilization to ensure that AI workloads are production-ready, cost-efficient, and highly effective in enterprise environments.

Key Responsibilities:

  • Transform AI/ML workloads into optimized infrastructure and deployment strategies.
  • Enhance model performance within GPU environments, focusing on latency, throughput, and memory utilization.
  • Design and execute inference and training pipelines utilizing NVIDIA stack tools like TensorRT, Triton, and NIM.
  • Convert and optimize models across various frameworks (e.g., PyTorch to ONNX to TensorRT).
  • Identify and mitigate performance bottlenecks through the use of profiling tools (GPU, memory, network).
  • Boost GPU utilization and scheduling efficiency across clusters.
  • Architect scalable distributed training and inference frameworks.
  • Collaborate with clients to outline AI infrastructure strategies and deployment models.
  • Oversee production deployments, ensuring effective monitoring, rollback, and performance validation.
  • Conduct applied research to enhance model efficiency and infrastructure utilization.
  • Guide team members in AI infrastructure, optimization, and GPU systems.
  • Utilize experiment tracking tools (MLflow, W&B, Neptune) to log parameters, metrics, and artifacts for analysis.
  • Detect model degradation post-deployment due to issues like concept drift, data pipeline alterations, and traffic pattern changes.
  • Perform root cause analysis (RCA) on ML systems by isolating variables and reproducing issues.
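The drift-detection responsibility above is often backed by a simple statistical signal comparing a reference (training-time) feature distribution against live traffic. The `psi` helper below is a minimal pure-Python sketch of one common such signal, the Population Stability Index; it is an illustration of the idea, not a description of Integrant's actual monitoring stack.

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between two 1-D samples.

    Values near 0 mean the distributions match; larger values
    (commonly > 0.1-0.25) are treated as a drift warning.
    """
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against zero-width range

    def bin_fractions(sample):
        counts = [0] * bins
        for x in sample:
            i = min(int((x - lo) / width), bins - 1)
            counts[i] += 1
        # eps avoids log(0) for empty bins
        return [c / len(sample) + eps for c in counts]

    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

In practice a check like this would run on a schedule against recent inference inputs, with alerts feeding the RCA workflow described above.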

About Integrant

Integrant is a forward-thinking company that embraces innovation and technology. We are committed to pushing the boundaries of AI and machine learning to deliver exceptional solutions to our clients. Join us and be part of a dynamic team that values collaboration, creativity, and continuous improvement.
