About the job
About Etched
At Etched, we are pioneering the first AI inference system designed specifically for transformers, delivering more than 10x the performance of traditional solutions while significantly reducing cost and latency. Our ASICs enable products that were previously unimaginable with GPUs, such as real-time video generation models and highly complex reasoning agents. Backed by substantial investment from leading venture capitalists and a team of top engineering talent, Etched is setting new benchmarks in AI infrastructure, a sector experiencing unprecedented growth.
Job Overview
We are seeking a visionary Head of Performance Profiling to lead performance measurement and analysis across our next-generation AI accelerator systems.
Our machine learning accelerator platform encompasses custom silicon, advanced supercomputing software, compiler stacks, runtime libraries, and distributed inference environments. Performance analysis at this scope goes beyond device-level questions; it is a complex distributed systems challenge. You will define performance metrics that link raw hardware signals to distributed workload contexts, ML cluster behaviors, communication patterns, and emerging bottlenecks.
This position goes beyond traditional telemetry. You will create new abstractions, structured counter ontologies, frameworks for cross-layer event correlation, distributed time-alignment strategies, and scalable reasoning systems that operate across nodes, racks, and clusters. Engaging at the crossroads of hardware design, driver architecture, runtime systems, and ML infrastructure, you will influence how we interpret and utilize performance intelligence. This foundational role will shape our tooling and the platform's approach to efficiency, scalability, and system behavior for years to come.

