Twelve Labs develops video-native multimodal artificial intelligence, creating foundation models that enable machines to understand visual, auditory, and linguistic information within video. Their platform provides enterprise-grade AI for comprehensive video analysis, supporting advanced search and insights by processing vision, audio, and language. This technology aims to bring human-like video understanding to diverse applications.
Jae Lee, CEO, and Dave Chung, COO, co-founded the company. Lee's background in large multimodal neural networks provides the technical foundation for Twelve Labs. The founders identified the critical need for sophisticated AI to intelligently process the rapidly increasing volume of global video data, thereby unlocking deeper understanding.
Twelve Labs serves enterprises, researchers, and developers, enabling them to extract valuable intelligence from extensive video archives. Their vision is to embed comprehensive, human-level video understanding into various applications, fundamentally transforming how organizations interact with and leverage video for strategic objectives.
Twelve Labs has raised $107.0M in total across 5 funding rounds. Most recently, it raised $30.0M in an Other Equity round in December 2024.
Twelve Labs's investors include New Enterprise Associates, NVentures, Coatue, Daffy, Fenway Summer, First Round Capital, Indeed.com, Pareto Holdings, Greg Bettinelli, Vera Equity, Amit Agarwal, and Gokul Rajaram.
Twelve Labs is a San Francisco-based AI startup founded in 2020-2021 that builds a multimodal video understanding platform, enabling developers and enterprises to search, summarize, and analyze large video archives with human-like comprehension of visuals, audio, and context.[1][2][3][4] Its proprietary foundation models, such as Marengo and Pegasus, power applications including semantic search, content moderation, and Retrieval-Augmented Generation (RAG) over video data. The company serves sectors such as media and entertainment, advertising, automotive, and security, and its clients include the NFL.[2][4][5] With over 30,000 developers and companies using its API, Twelve Labs addresses the limitations of general-purpose AI by focusing on video-native intelligence, and it recently secured $30M in funding to scale its models and make them more adaptable.[2][4][6]
The platform addresses the problem of extracting value from petabyte-scale video libraries, which were previously hard to query beyond keywords, by offering high accuracy, customization on proprietary data, and flexible deployment across cloud, private cloud, or on-premise environments.[3][5] Adoption has grown quickly alongside rising video data volumes and advances in AI.[4]
Twelve Labs was co-founded in 2020 by Jae Lee, a UC Berkeley computer science graduate who previously served in South Korea's Ministry of National Defense Cyber Operations Command.[2][4] While in the military, Lee and colleagues who shared his interest in AI discussed emerging research papers and spotted an untapped opportunity in video AI at a time when the field was focused on text and images.[2][4] This led to the company's inception as a South Korean startup with offices in Seoul and San Francisco, bootstrapping with limited resources to pioneer "perceptual reasoning" in video understanding.[4]
A pivotal early breakthrough was Pegasus, their first model capable of analyzing videos and answering content-specific questions, marking a shift from text-centric AI.[2] Headquartered in San Francisco with 21-40 employees, Twelve Labs quickly gained traction, partnering with the NFL to monetize video archives and launching a public API to empower developers.[1][2][3][4]
Twelve Labs benefits from the growth of multimodal AI and from video's position as the internet's fastest-growing data type, projected to comprise 82% of traffic, while general-purpose models lag in video depth.[2][4][5] The timing is favorable: since 2020, foundation-model attention has shifted from text and images toward video, amplified by generative AI and by enterprise demand for intelligent archives in media, security, and other sectors.[2][3][4] Market forces such as surging video production (e.g., sports, ads) and regulatory demands for moderation favor its scalable, customizable technology. The company in turn influences the ecosystem by making video AI accessible to developers and by setting benchmarks that push incumbents toward specialization.[2][5][6]
Twelve Labs is positioned for rapid growth, expanding into automotive, security, and defense via In-Q-Tel ties while recruiting top AI talent in video understanding.[2][6] Trends such as agentic AI, RAG proliferation, and petabyte-scale data are expected to favor its models, potentially establishing the "clear and sustainable leadership" that CEO Jae Lee envisions.[4][6] Its role may evolve from developer tool to enterprise standard, in line with its stated mission of redefining how machines "see, listen, and understand the world."[1][3]