Research-first and human-centered.

TwelveLabs builds video-native AI that understands the way people do.
We help machines and technology make sense of the world,
and make it easier for people to find, capture, and share the stories inside their videos.

Our brains continually process sensory input – helping us understand what has happened and predict what might happen next. This ability, known as perceptual reasoning, forms the basis of human intelligence.

AI, as rolled out so far, has bypassed a crucial learning step: creating a robust world representation through video, which closely resembles the sensory input that gives rise to human perception.

At TwelveLabs, we’re bridging this gap by training cutting-edge foundation models to learn rich, multimodal representations from video data, then using these representations for high-level reasoning tasks involving language.

Through video-native AI, we’re helping machines learn about the world – and enabling humans to retrieve, capture, and tell their visual stories better. 

The Art of Detail

Perception: Capturing the sensory details through a video-native encoder

With AI optimized for video, we outperform both the major cloud providers and open-source models.
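
To give a concrete sense of what "video-native" means here, the sketch below shows a toy encoder that attends over patches from all frames jointly, rather than embedding frames one at a time. It is only a minimal PyTorch illustration of the general pattern; the class name, dimensions, and layer choices (ToyVideoEncoder, embed_dim, patch) are invented for this example and do not describe Marengo's actual architecture.

```python
# Illustrative sketch only: a generic "video-native" encoder, not Marengo itself.
import torch
import torch.nn as nn


class ToyVideoEncoder(nn.Module):
    """Toy spatio-temporal encoder: joint attention over patches from all frames."""

    def __init__(self, embed_dim: int = 256, patch: int = 16):
        super().__init__()
        # Spatial patch embedding, shared across frames.
        self.patch_embed = nn.Conv2d(3, embed_dim, kernel_size=patch, stride=patch)
        # Self-attention over the patches of all frames at once, so the model can
        # pick up motion and cross-frame context, not just per-frame content.
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, video: torch.Tensor) -> torch.Tensor:
        # video: (batch, time, channels, height, width)
        b, t, c, h, w = video.shape
        x = self.patch_embed(video.reshape(b * t, c, h, w))  # (b*t, d, h', w')
        x = x.flatten(2).transpose(1, 2)                     # (b*t, patches, d)
        x = x.reshape(b, t * x.shape[1], -1)                 # all frames' patches in one sequence
        x = self.encoder(x)                                  # joint spatio-temporal attention
        return x.mean(dim=1)                                 # one clip-level embedding per video


if __name__ == "__main__":
    clips = torch.randn(2, 8, 3, 64, 64)   # 2 clips, 8 frames each, 64x64 RGB
    print(ToyVideoEncoder()(clips).shape)  # torch.Size([2, 256])
```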

THE POWER OF ALIGNMENT

Reasoning: Inducing the perceptual reasoning capability through video and language alignment

True video understanding requires the ability to reason about what is perceived. This is where our video-language model, Pegasus, comes into play. 

Pegasus merges the reasoning skills learned from large language models (text data) with the perceptual understanding gained from our video encoder model (video data). By aligning these two modalities, Pegasus can perform cross-modal reasoning, inferring meaning and intent from Marengo's rich, multimodal representations.

It’s the synergy between Marengo and Pegasus – the alignment of video and language – that enables perceptual reasoning capabilities in our AI systems. Building on the strengths of both models, we can develop systems that not only perceive and understand the visual world, but also reason about it in a way that resembles human cognition.
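
To make the idea of alignment concrete, here is a minimal, hypothetical sketch of the generic pattern: a small adapter projects a clip embedding from a video encoder into the embedding space of a language model, so visual context and text share one sequence the language model can reason over. The names, dimensions, and prefix length (ToyVideoLanguageAligner, lm_dim, n_prefix) are assumptions for illustration; this is not Pegasus itself.

```python
# Illustrative sketch only: the generic video-language alignment pattern, not Pegasus.
import torch
import torch.nn as nn


class ToyVideoLanguageAligner(nn.Module):
    """Toy adapter: map a clip embedding into a language model's token space."""

    def __init__(self, video_dim: int = 256, lm_dim: int = 512, n_prefix: int = 4):
        super().__init__()
        self.n_prefix = n_prefix
        # One clip embedding becomes a few "soft" visual tokens in LM space.
        self.adapter = nn.Linear(video_dim, lm_dim * n_prefix)

    def forward(self, clip_emb: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # clip_emb: (batch, video_dim); text_emb: (batch, seq_len, lm_dim)
        b = clip_emb.shape[0]
        visual_prefix = self.adapter(clip_emb).reshape(b, self.n_prefix, -1)
        # The language model attends over visual context and text in a single
        # sequence, which is what allows it to reason about what the video shows.
        return torch.cat([visual_prefix, text_emb], dim=1)


if __name__ == "__main__":
    clip_emb = torch.randn(2, 256)      # e.g. clip embeddings from a video encoder
    text_emb = torch.randn(2, 10, 512)  # already-embedded question tokens (stand-in)
    print(ToyVideoLanguageAligner()(clip_emb, text_emb).shape)  # torch.Size([2, 14, 512])
```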

Recognition

Our science team has spent their careers working on video and language, with 5+ wins in global competitions and 100+ publications at top AI conferences.

ECCV
ICLR

Rethinking how an AI thinks.

We’re not just developing state-of-the-art models — we’re rethinking how AI systems learn and reason. Explore our publications to learn more about our research and discoveries.

Perception & Reasoning

Experience the new possibilities of video for yourself, today.

Test it yourself in the Playground and start the real experience of video intelligence.
