About the job
Embrace the Future of E-Commerce with Whatnot!
Whatnot stands as the premier live shopping platform across North America and Europe, revolutionizing how people buy, sell, and discover their favorite items. Our mission is to reshape e-commerce by seamlessly merging community, shopping, and entertainment into a personalized experience for our users. Operating as a remote co-located team, we are driven by innovation and firmly rooted in our core values. With operational hubs in the US, UK, Germany, Ireland, and Poland, we are collaboratively crafting the future of online marketplaces.
Our platform offers a diverse array of products, from fashion and beauty to electronics and collectibles, ensuring that everyone finds something they love during our live auctions.
And this is just the beginning! As one of the fastest-growing marketplaces, we are in search of innovative, bold thinkers across all functions. Stay updated with our latest insights on our news and engineering blogs, and join us in empowering individuals to transform their passions into thriving businesses while uniting people through commerce.
Your Role
We are looking for creators: intellectually curious, highly entrepreneurial engineers passionate about shaping the future of AI and ML at Whatnot. In this role, you will architect and scale the foundational infrastructure that supports large language model applications across the organization. Working closely with machine learning scientists, you will bring cutting-edge models into production, enabling entirely new product experiences. That means building systems that keep AI reliable and fast at scale, from retrieval systems that ground LLM responses in Whatnot's business context to scalable LLM evaluation frameworks and human-in-the-loop feedback mechanisms.
Key Responsibilities:
Own the infrastructure that powers LLMs across key business domains, including growth, recommendations, trust and safety, fraud prevention, and seller tools.
Develop robust, scalable LLM evaluation frameworks that measure model performance, guide iteration, and catch regressions in CI/CD.
Implement RAG systems and MCP servers to effectively ground LLM responses in Whatnot’s business context, while maintaining stringent PII controls.
Design efficient human-in-the-loop feedback pipelines to inform scalable model improvements.
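To make the evaluation responsibility above concrete, here is a minimal sketch of the kind of LLM evaluation harness that can gate a CI/CD pipeline. The model stub, test cases, scoring rule, and threshold are all illustrative stand-ins, not Whatnot's actual stack.

```python
"""Minimal sketch of an LLM eval harness used as a CI regression gate.
All names and data here are hypothetical examples."""

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    expected: str  # reference answer for this prompt


def exact_match(expected: str, actual: str) -> float:
    """Score 1.0 on a case-insensitive exact match, else 0.0."""
    return float(expected.strip().lower() == actual.strip().lower())


def run_eval(model_fn: Callable[[str], str],
             cases: List[EvalCase],
             threshold: float = 0.9) -> dict:
    """Run every case through the model and aggregate a pass/fail verdict.

    A CI job would fail the build when `passed` is False, blocking
    a regressed model from shipping.
    """
    scores = [exact_match(c.expected, model_fn(c.prompt)) for c in cases]
    mean = sum(scores) / len(scores)
    return {"mean_score": mean, "passed": mean >= threshold}


# Usage with a stub "model" that echoes canned answers:
canned = {"capital of France?": "Paris", "2 + 2?": "4"}
report = run_eval(
    lambda p: canned.get(p, ""),
    [EvalCase("capital of France?", "Paris"),
     EvalCase("2 + 2?", "4")],
)
```

In a real deployment the exact-match scorer would typically be replaced by semantic or rubric-based scoring, but the shape of the gate (score every case, aggregate, compare to a threshold) stays the same.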
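The RAG responsibility above can also be sketched in miniature: retrieve business context for a query, redact PII, and assemble a grounded prompt. The keyword-overlap retriever, the sample documents, and the regex-based email redaction are simplified assumptions for illustration; production systems would use embedding-based retrieval and far more thorough PII controls.

```python
"""Illustrative sketch of grounding an LLM prompt in retrieved context
with a basic PII redaction pass. Documents and logic are toy examples."""

import re

# Hypothetical knowledge-base snippets standing in for business context.
DOCS = [
    "Sellers can schedule live auctions from the seller dashboard.",
    "Buyers earn loyalty credit on completed purchases.",
    "Refunds are processed within 5 business days.",
]

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def redact_pii(text: str) -> str:
    """Strip email addresses before text can reach a prompt (toy PII control)."""
    return EMAIL_RE.sub("[REDACTED]", text)


def retrieve(query: str, docs, k: int = 2):
    """Rank documents by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]


def build_prompt(query: str) -> str:
    """Assemble a prompt whose answer must come from redacted context."""
    context = "\n".join(redact_pii(d) for d in retrieve(query, DOCS))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


prompt = build_prompt("How long do refunds take?")
```

The design point is the ordering: redaction sits between retrieval and prompt assembly, so sensitive fields never reach the model regardless of what the retriever surfaces.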

