Twelve Labs
Tokens Never Sleep: A Three-Month Record of What Twelve Labs Learned Working Alongside AI Agents

김수 (Sue Kim)
Over the past three months, through our Tokens Never Sleep initiative, Twelve Labs has been integrating AI copilots and agents into our actual workflows. From IT operations and product development to research, recruiting, and office administration, AI has taken over repetitive tasks and context synthesis. This allows our team to focus on critical decision-making, relationships, and structural design. This post is a record of how Twelve Labs is defining what it means to work in an AI-native way—and how it transforms speed, trust, and the human role.

May 11, 2026
10 min read
The System That Runs While We Sleep
How Twelve Labs Compounds Execution Velocity
There is a phrase you hear a lot around Twelve Labs these days:
“Tokens Never Sleep.”
While it sounds like a slogan now, it didn't start that way. It began with an incredibly practical question.
Earlier this year, a team member used an LLM-powered financial tool to complete an analysis in minutes that previously would have taken weeks. While some might have seen that and thought, "That's a neat demo," our CEO saw something else. It wasn't about simply tacking AI onto existing workflows; it was about designing workflows around AI from day one.
This sparked a series of questions:
"Can we connect this to other data and business systems?"
“Can we make it accessible to everyone?”
“Can we make it run while we sleep?”
And so, Tokens Never Sleep (TNS) was born.
The concept is simple: repetitive but mentally taxing cognitive labor—like ticket triage, account provisioning, update tracking, meeting prep, and research monitoring—is offloaded to AI copilots and agents operating 24/7. This frees human minds to focus entirely on decisions that require true human judgment.
This initiative was guided by clear, uncompromising principles from the start:
Ship and iterate with feedback. Every piece of execution makes the next run faster.
Be skeptical of ideas that cannot be demoed.
Before starting anything, ask: “Can AI do 80% of this work?”
Twelve Labs has been running Tokens Never Sleep in our daily operations for about three months now. If the initial question was "How much can we delegate to AI?", the question we ask ourselves today has evolved.
That is because we realized that burning more tokens does not automatically equate to higher productivity. Having moved past the phase of peak token consumption, we now know that what matters isn't how many tokens we use, but rather the context they grasp, the ground truth they operate on, and how they seamlessly loop in human judgment at the right moments.
It All Started with a Single Copilot
This paradigm shift started with a copilot built by our IT Manager, Rick Mondragon.
No matter how advanced our video AI models are, productivity halts if an engineer gets blocked by a permissions error in the middle of the night, or if internal tools stall during timezone handoffs. To solve this, Rick built a copilot that fully understands his role, the tools he manages, and his priorities. By equipping it with skills like ticket triage, access provisioning, and security monitoring, he successfully delegated a massive chunk of operations.
The results were immediate and measurable:
Standard access requests: Cut from 8 hours to 15 minutes
New hire onboarding: Compressed from a full day to a 10-minute review
60% of routine IT tickets: Resolved entirely without human intervention
For the remaining 40%, the copilot pre-triages the issue—summarizing what it checked, identifying the bottleneck, and recommending the next action—before handing it off to Rick.
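As a rough illustration of the split described above, the Python sketch below shows one way a copilot might either close a routine ticket or hand it off with a pre-triage summary. Everything here is an assumption made for the example: the `Ticket` and `Handoff` types, the `ROUTINE_KINDS` allowlist, and the `triage` function are hypothetical names, and the post does not describe how Rick's copilot is actually implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical allowlist of ticket kinds the copilot may resolve on its own.
ROUTINE_KINDS = {"access_request", "password_reset"}


@dataclass
class Ticket:
    id: str
    kind: str
    description: str


@dataclass
class Handoff:
    """Pre-triage summary handed to a human when the copilot cannot close a ticket."""
    ticket_id: str
    checked: List[str] = field(default_factory=list)  # what the copilot looked at
    bottleneck: str = ""                               # why it could not proceed
    recommended_action: str = ""                       # suggested next step


def triage(ticket: Ticket) -> Optional[Handoff]:
    """Return None if the ticket is handled automatically, else a Handoff."""
    if ticket.kind in ROUTINE_KINDS:
        return None  # routine ticket: resolved without human intervention
    return Handoff(
        ticket_id=ticket.id,
        checked=["ticket kind", "routine-kind allowlist"],
        bottleneck=f"'{ticket.kind}' is not a routine ticket kind",
        recommended_action="review manually",
    )
```

The point of the `Handoff` shape is that the human receives what was checked, where the bottleneck is, and a recommended next action, rather than a raw ticket.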
Seeing a single copilot deliver such clear results naturally led to the next question: What happens when copilots start talking to each other?
From a Single Copilot to a Connected Network
Once the power of a single copilot was proven, the next logical step was: What if everyone had their own personal copilot? To make this a reality, Twelve Labs built a communication layer that links copilots together. This allows every team member to have a dedicated companion that understands their specific role, active projects, tools, and permission levels.
For example, imagine it is 2:00 AM in San Francisco and 6:00 PM in Seoul. A Seoul-based engineer hits a permissions error on a new internal tool. Their personal copilot detects the blocker and calls upon Rick’s copilot’s provisioning skills to check the permission matrix. Simultaneously, it cross-checks with security monitoring skills to ensure there is no system-wide incident.
Within minutes, the access request is resolved. Previously, the engineer would have had to wait until San Francisco woke up. Now, the blocker is cleared before they even lose their flow.
Of course, having copilots collaborate doesn't mean letting them run wild outside of human control. Every inter-copilot action is routed through Slack for human approval. When a decision requires judgment, the copilots package up the necessary context so a human can make the call instantly.
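The approval step described above can be sketched as a simple gate in Python. This is a minimal illustration under stated assumptions, not the Twelve Labs communication layer: `ActionRequest` and `run_with_approval` are hypothetical names, and the `approve` callable stands in for whatever mechanism posts the request to Slack and waits for a human response.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ActionRequest:
    requester: str  # copilot asking for the action
    target: str     # copilot whose skill is being invoked
    skill: str      # name of the skill to run
    context: str    # packaged context shown to the human approver


def run_with_approval(
    request: ActionRequest,
    approve: Callable[[ActionRequest], bool],
    execute: Callable[[ActionRequest], str],
) -> str:
    """Run the requested skill only if the human approver accepts it."""
    if not approve(request):
        return f"denied: {request.skill}"
    return execute(request)


# Example usage with stand-in callables (a real approver would be a person).
request = ActionRequest(
    requester="engineer-copilot",
    target="it-copilot",
    skill="provision_access",
    context="Permissions error on a new internal tool; no incident detected.",
)
result = run_with_approval(request, lambda r: True, lambda r: "access granted")
```

Keeping approval and execution as separate callables means the skill itself cannot run unless the gate returns a yes.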
This network went beyond operations; it accelerated how we build. The finance team built dashboards, and the recruiting team built candidate sourcing apps. Ideas that previously would have started with "let's review this" and ended up buried in a backlog are now turned into functional internal tools in a matter of hours.
The Team Building AI, Working with AI
While we were developing Pegasus 1.5, similar dynamics were unfolding in product development. The team building state-of-the-art AI models was also the team leveraging AI most aggressively to do so.
Sam, an ML Research Scientist, hit a structural bottleneck early in training. The Flash Attention package, which drastically speeds up training, couldn't be applied to Twelve Labs’ model architecture out of the box. Rather than finding a workaround, Sam used Claude Code to write and custom-compile GPU kernels from scratch.
“I heavily used Claude Code to write and customize GPU kernels myself. Working in an AI-native environment really opened my eyes to what's possible.” - Sam, ML Research Scientist
Development speed skyrocketed. Genie, a Backend Engineer, built Pegasus 1.5’s core Time-Based Metadata (TBM) feature in just two days. Once the PRD was ready, she wrote a detailed technical document and delegated the implementation to Claude as an agent. Test automation, platform integration, and bug detection were all handled seamlessly within that loop.
The numbers speak for themselves. In a previous project working across time zones, two to three weeks of development resulted in roughly 30 QA bugs. For Pegasus 1.5, with only two days of development, we found just 6 bugs.
“The actual time spent writing code dropped dramatically. Instead, we had much more time for constructive design discussions, such as how to make the API specs cleaner and more user-friendly.”
This isn't just about "writing code faster." When individuals are empowered with higher ownership and faster feedback loops, AI ceases to be a mere productivity tool and becomes the foundational premise of how we work.
“With coding agents advancing rapidly, individual productivity has soared. To leverage this fully, we must empower individuals to make decisions and run their own execution loops quickly.”
Turning Team Memory into System Operations: Dan’s Dual Agents
Dan, who leads our Marengo and Search teams, faced a classic leadership challenge. With team members split between Seoul and San Francisco, multiple parallel initiatives, and a constant stream of Linear tickets, GitHub commits, and WandB experiments, keeping track of the team's entire context while managing his own individual contributions was a massive cognitive load.
Dan decided to solve this structurally, building two specialized agents with distinct purposes.
The first is Dot. Acting as the team's knowledge coordinator, Dot wakes up early every morning to sync and synthesize Linear initiatives, GitHub commits, pull requests, WandB runs, and Notion PRDs. When someone asks, "Where do we stand on this initiative?", Dot instantly weaves together information across these disparate systems, sparing engineers from having to constantly repeat context.
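To make the idea of synthesizing across tools concrete, here is a small Python sketch of grouping records from several sources under a shared initiative and answering a status question from the merged view. It is an assumption-laden illustration, not Dot's implementation: the `Event` type, the `initiative` key that links records across tools, and the `build_index` and `status` functions are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    source: str      # e.g. "Linear", "GitHub", "WandB", "Notion"
    initiative: str  # hypothetical shared key linking records across tools
    summary: str


def build_index(events):
    """Group events from every source under the initiative they belong to."""
    index = defaultdict(list)
    for event in events:
        index[event.initiative].append(event)
    return index


def status(index, initiative):
    """Answer 'Where do we stand on this initiative?' from the merged index."""
    events = index.get(initiative, [])
    if not events:
        return f"No recorded activity for {initiative}."
    return "\n".join(f"[{e.source}] {e.summary}" for e in events)
```

The sketch assumes every record can be tagged with an initiative; in practice, establishing that link between tools is likely the hard part.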
The second is Dan’s OS. Serving as Dan's personal workflow engine, this agent processes his overnight Slack mentions to prioritize his morning, packages up stakeholder context and recent decision histories ahead of syncs, and evaluates how newly published research papers map to current projects or product roadmaps. Crucially, it tracks not just what was decided, but the underlying 'why' behind each decision.
Together, Dot manages the team's collective memory, while Dan's OS maintains the manager's decision-making context. Individually specialized, they integrate to function like a cohesive operating system.
Dan even keeps a scratchpad in Slack specifically for Dot. When he writes down unorganized, raw thoughts, Dot captures them, structures them, and maps them to the correct projects. Ideas that once lived exclusively in a lead's head are codified into structured system knowledge, ready to be used by the team.
Dan notes that utilizing these agents has drastically reduced the cost of management. By "cost," he doesn't just mean hours saved on a calendar. He means the cognitive debt of constantly holding mental models of who is doing what, why a specific decision was made, how a new paper fits into the active workstream, and which topics need immediate alignment. Because Dot holds the context, Dan is freed from being the human router of information, allowing him to step in as a high-leverage decision maker on top of synthesized data.
Three Months In: What We've Learned
Three months have passed since we launched Tokens Never Sleep in late January.
Initially, our expectation was simple: writing more prompts means doing more things, faster. And indeed, many processes accelerated. But three months of real-world operation have yielded far more nuanced, valuable lessons.
First, productivity does not scale linearly with token consumption.
The fundamental reason we deploy agents is that human time and attention are finite. To safeguard these finite resources, we delegate repetitive, context-heavy sorting tasks to agents. The goal is never to maximize token usage for its own sake, but to direct tokens toward high-leverage outcomes.
Second, data must precede agents.
With the rise of interoperability protocols like MCP, connecting agents to tools like Slack, Linear, GitHub, Notion, and WandB has become remarkably straightforward. However, ease of connection does not guarantee quality of output.
No matter how sophisticated your agent is, it is useless if the underlying data it reads is messy or out of date. It is the equivalent of giving a brilliant mind a book filled with errors.
An agent's capability is capped by the quality of its ground truth. Aligning as a team on which document is the source of truth, what data is the baseline, which decisions are finalized, and which scratchpads are merely brainstorming is the hard work that makes the entire system viable.
Building great agents is ultimately about building a reliable, pristine data layer.
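One way to picture this alignment work is as an explicit label on every document plus a freshness check, so that an agent only reads material the team has agreed to treat as ground truth. The Python sketch below is illustrative only: the `Document` type, the status labels, and the 30-day freshness threshold are assumptions made for the example, not details from Twelve Labs' setup.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Document:
    title: str
    status: str         # hypothetical labels: "source_of_truth", "draft", "scratchpad"
    last_updated: date


def usable_as_ground_truth(doc, today, max_age_days=30):
    """Accept only documents labeled as source of truth and updated recently."""
    is_fresh = (today - doc.last_updated) <= timedelta(days=max_age_days)
    return doc.status == "source_of_truth" and is_fresh


def select_ground_truth(docs, today, max_age_days=30):
    """Filter a document set down to what an agent should be allowed to rely on."""
    return [d for d in docs if usable_as_ground_truth(d, today, max_age_days)]
```

Scratchpads and stale documents are excluded here rather than down-weighted, which keeps the rule easy to reason about at the cost of occasionally hiding useful context.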
Third, the talent bar is changing.
Hyemin, our Korea Country Manager, has seen this shift manifest directly in hiring.
“The criteria I look for in candidates today are completely different from what they were three months ago. Previously, we sought people who could assist with manual data preparation and operational groundwork. Now, agents handle a vast portion of those tasks. Today, we look for architects: people who can design workflows, apply critical judgment, delegate to agents effectively, and interpret synthesized outputs.”
When the nature of work shifts, the profile of the ideal team member shifts with it.
Where we once needed individuals to handle basic data operations and manual tasks, we now delegate those chores to agents. The skills that have become premium are uniquely human: designing system architectures, formulating high-quality inputs and standards, identifying where human-to-human empathy and alignment are indispensable, and ultimately taking responsibility for decisions.
At Twelve Labs, being "AI-native" doesn't mean knowing how to use a list of AI tools. It means knowing how to break down complex problems structurally, define clean ground truths, build repeatable systems, and take ultimate ownership of outcomes.
We Use AI So We Can Think About Bigger Problems
Tokens Never Sleep is not a mandate to "use more AI."
While the name implies keeping systems running while we sleep, the purpose isn't to force humans to work longer hours—it is to preserve human focus and decision-making energy for what truly matters.
Tickets can be triaged by copilots.
Documents can be structured by agents.
Meeting context can be assembled by systems.
Research papers can be ingested overnight.
Follow-ups can be automated.
But defining which problems are worth solving, deciding on strategic directions, and building authentic human trust remain exclusively human domains.
What Twelve Labs has learned over the last three months is simple:
AI does not replace humans; it crystallizes what humans must excel at.
Tokens never sleep, but the critical judgments that guide those tokens will always be ours to make.
Therefore, building great agents is, at its core, about building an organization that multiplies leverage.
We do not wait for perfect systems.
We ship, run them in production, and refine them under real-world pressure.
We question ideas that can't be demoed, and whenever we find a human performing repetitive manual work, we ask:
“Can AI do 80% of this?”
“If so, what must the human do to make the remaining 20% exceptional?”
Tokens Never Sleep started with that very question, and it continues to reshape how we operate every day.
Tokens never sleep.
Which is exactly why we can focus on the thoughts that matter.
Join us on this journey → Twelve Labs Careers




