Nvidia-backed Twelve Labs is building AI that understands videos like humans

The Chosun Daily

Twelve Labs, a South Korean AI startup, aspires to achieve a 'ChatGPT' moment for video

Apr 8, 2024

3 min

Twelve Labs, a South Korean generative artificial intelligence (AI) startup, made headlines last year after securing investment from U.S. tech giant Nvidia. Based in Seoul and San Francisco, the company specializes in AI technology that analyzes and understands video. Nvidia, Intel and two other companies jointly invested $10 million in Twelve Labs last October.

“Just like OpenAI’s ChatGPT pioneered the realm of text-based generative AI, Twelve Labs aims to pave the way for the advancement of video AI,” said Twelve Labs co-founder and CEO Lee Jae-sung, 30, in a video interview with the Chosunilbo on April 8.

Twelve Labs is developing a multimodal AI that understands videos. The company’s AI model analyzes all the images and sounds in a video and matches them with human language. For instance, the model can locate a scene showing “a man holding a pen in the office” in an hour-long video within seconds.

When Lee founded Twelve Labs in 2020, the burgeoning AI market was mainly focused on text and images. “AI startups were receiving astronomical funding for developing large language models like ChatGPT,” said Lee. “We believed video was a field where we could make a difference even with limited investment.”

Lee, who majored in computer science at UC Berkeley and interned at Samsung Electronics and Amazon, returned to Korea to fulfill his mandatory military service. There, he met his future Twelve Labs co-founders. While serving in the Ministry of National Defense’s Cyber Operations Command, Lee and his colleagues, who were equally passionate about AI, spent time discussing research papers and exploring AI technologies, and eventually started Twelve Labs together in 2020.

“My co-founder, who was the first to finish military service, was so dedicated that he regularly visited us to study AI together,” Lee reflected. “Starting this company based on passion without worrying too much about the future turned out to be a good idea.”

Twelve Labs currently operates two models: Pegasus, a video language foundation model that can summarize long videos into text and answer users’ questions about videos, and Marengo, a multimodal AI model that understands videos, images and audio. Over 30,000 developers and companies are using these AI models. One of the company’s most prominent partnerships is with the National Football League (NFL).

“Organizations like the NFL have amassed a treasure trove of video content that spans over a century, but monetizing such content requires advanced video search technology,” Lee said. “Companies with extensive data archives are seeking out Twelve Labs’ AI technology.”

By Park Ji-min, Lee Jae-eun

Published 2024.04.08. 16:04

© 2021 - 2025 TwelveLabs, Inc. All Rights Reserved