About the job
Cerebras Systems is at the forefront of AI technology, developing the world's largest AI chip, 56 times larger than the largest conventional GPU. Our wafer-scale architecture delivers the computational power of dozens of GPUs on a single chip while presenting it to the programmer as one unified device. This approach enables Cerebras to achieve unparalleled training and inference speeds, allowing machine learning practitioners to run large-scale ML applications seamlessly without the complexity of managing fleets of GPUs or TPUs.
Cerebras serves a diverse clientele of leading model labs, global corporations, and pioneering AI-focused startups. Recently, OpenAI announced a multi-year collaboration with Cerebras to leverage 750 megawatts of power, transforming key workloads with ultra-high-speed inference.
Our wafer-scale architecture has made Cerebras Inference the fastest generative AI inference solution in the world, more than ten times faster than GPU-based hyperscale cloud inference services. This speed advantage improves the user experience of AI applications, enables real-time iteration, and unlocks greater intelligence through additional agentic computation.