About the job
Cerebras Systems is revolutionizing AI with the world's largest AI chip, 56 times larger than a traditional GPU. Our wafer-scale architecture delivers the computational power of many GPUs on a single chip, combining unmatched performance with the simplicity of a single device. This approach lets Cerebras provide leading-edge training and inference speeds, so machine learning professionals can run large-scale ML applications without the complexity of managing multiple GPUs or TPUs.
Cerebras counts among its clients top-tier model laboratories, major global enterprises, and pioneering AI-native startups. Recently, OpenAI announced a multi-year collaboration with Cerebras to deploy 750 megawatts of power to transform critical workloads with ultra-high-speed inference.
The wafer-scale architecture behind Cerebras Inference delivers the world's fastest generative AI inference, more than ten times faster than GPU-based hyperscale cloud inference services. This dramatic speedup is reshaping the user experience of AI applications, enabling real-time iteration and richer intelligence through agentic computation.

