About the job
About Us
At TwelveLabs, we are at the forefront of creating innovative multimodal foundation models that can interpret videos with human-like understanding. Our groundbreaking models elevate the standards in video-language modeling, enabling intuitive interactions and comprehensive analyses of diverse media formats.
With over $110 million secured in Seed and Series A funding, we are supported by leading venture capital firms such as NVIDIA’s NVentures, NEA, Radical Ventures, and Index Ventures, along with esteemed AI pioneers like Fei-Fei Li, Silvio Savarese, and Alexandr Wang. Our headquarters are based in San Francisco, with a significant presence in Seoul, reflecting our dedication to global innovation.
Our collaboration with NVIDIA and AWS grants us access to cutting-edge chips, including B300s, empowering us to expand the possibilities in video AI.
We cherish the diverse journeys of our team members, believing that our varied cultural, educational, and life experiences are key to challenging the status quo. We seek passionate individuals eager to impact the technology landscape and help us redefine video understanding and multimodal AI.
Team Overview
The Embedding & Search team is integral to TwelveLabs' video understanding initiatives. We build unified embedding spaces that span video, audio, text, and other modalities, and we develop retrieval systems that return precise results matching user intent across large video catalogs.
Our research spans several areas: multimodal representation learning with contrastive and probabilistic methods; temporal video understanding, with a focus on hierarchical segmentation and boundary detection; neural ranking architectures for multi-stage retrieval; and user behavior modeling to understand how people search for and engage with video content. We prioritize both algorithmic advances that set new benchmarks and human-centric insights that make our systems more useful.
Our research team has access to state-of-the-art chips like NVIDIA B300s, which helps us move work from research to production faster.

