About the job
Cerebras Systems is at the forefront of AI, building the world's largest AI chip, 56 times larger than a conventional GPU. Our wafer-scale architecture delivers the compute of dozens of GPUs on a single device that programs like one, radically simplifying machine learning development. The result is unmatched training and inference speed, letting users run large-scale machine learning applications without the complexity of orchestrating clusters of GPUs or TPUs.
Our customers include leading model labs, global enterprises, and pioneering AI startups. OpenAI recently announced a multi-year partnership with Cerebras, deploying 750 megawatts of compute capacity to accelerate critical workloads with ultra-fast inference.
Built on this wafer-scale architecture, Cerebras Inference is the fastest generative AI inference solution in the world, more than 10 times faster than GPU-based hyperscale clouds. That speed is reshaping how users experience AI applications, enabling real-time iteration and more intelligent results through greater compute at inference time.

