About the job
Cerebras Systems is at the forefront of AI technology, having engineered the world’s largest AI chip, 56 times the size of a traditional GPU. Our innovative wafer-scale architecture integrates the computational power of numerous GPUs into a single chip, making it dramatically simpler to program. This approach enables Cerebras to achieve industry-leading speeds in training and inference, allowing machine learning practitioners to run large-scale ML applications efficiently without the complexity of managing fleets of GPUs or TPUs.
Cerebras is proud to serve a diverse clientele that includes prestigious model labs, global enterprises, and pioneering AI-native startups. Recently, OpenAI announced a multi-year partnership with Cerebras, focusing on deploying 750 megawatts of scale to transform critical workloads through ultra-high-speed inference.
Built on our wafer-scale architecture, Cerebras Inference delivers the fastest generative AI inference available, more than ten times faster than GPU-based hyperscale cloud inference services. This leap in speed is transforming the AI application user experience, enabling real-time iteration and greater intelligence through increased agentic computation.
