
Product
Bring Video Intelligence to Your Agents with TwelveLabs MCP Server


James Le
The TwelveLabs MCP Server provides seamless integration with the TwelveLabs platform. This server enables AI assistants and applications to interact with TwelveLabs' powerful video understanding capabilities through a standardized MCP interface.


Sep 17, 2025
7 Minutes
Ever wanted an AI assistant that could watch and understand videos for you? How about one that can summarize a 30-minute meeting recording in seconds? Or even search through hours of footage to find specific moments on demand? With the new TwelveLabs Model Context Protocol (MCP) server, now it can.
We are excited to announce the launch of the TwelveLabs MCP Server – a bridge between our video understanding platform and AI assistants, built on the open Model Context Protocol standard. This server acts as a universal adapter that lets large language model (LLM) agents (like Anthropic’s Claude) tap directly into TwelveLabs’ powerful video search and analysis capabilities. In practice, that means with simple natural language prompts, your AI applications of choice can now index videos, find relevant scenes, generate summaries, and more – by invoking our tools through a standardized interface.
Whether you are using the Claude Desktop app, an IDE assistant like Cursor or Windsurf, or a custom AI agent, the TwelveLabs MCP server makes integration frictionless. You no longer need to write custom API calls or glue code – just spin up our MCP server and plug it into your AI environment. The server exposes a suite of Resources, Tools, and Prompts representing TwelveLabs features, so any MCP-compatible client can discover and use them out of the box. (In fact, we've already verified compatibility with popular clients like Claude Desktop, Cursor, and Goose.) In this post, we will explain what the TwelveLabs MCP server is, show some of the exciting things it can do, and walk you through how to get started.
What is the TwelveLabs MCP Server?
The TwelveLabs MCP Server is a packaged server that exposes our video understanding capabilities (indexing, semantic search, analysis, embeddings) as standard MCP tools, resources, and prompts—so any MCP-compatible client (like Claude Desktop) can use them with zero custom glue code.
To use it, you install and configure the server via our Installation Guide. Once connected, your AI assistant can discover TwelveLabs’ tools at runtime and invoke them deterministically—e.g., “search this library for all fourth-quarter three-pointers” or “summarize this video in 3 sentences,” with results returned in structured form the client understands.
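To see what that discovery step looks like programmatically, here is a minimal sketch that uses the official MCP Python SDK to connect to a locally running server over stdio and list its tools. The launch command, package name, and the `TWELVELABS_API_KEY` variable name are illustrative placeholders, not the server's documented configuration – the Installation Guide covers the real setup for each client.

```python
# Minimal sketch: connect to an MCP server over stdio and list its tools.
# The command, args, and env var name below are hypothetical placeholders.
import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(
    command="npx",                    # hypothetical launch command
    args=["-y", "twelvelabs-mcp"],    # hypothetical package name
    env={"TWELVELABS_API_KEY": os.environ["TWELVELABS_API_KEY"]},
)

async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # MCP's built-in discovery
            for tool in tools.tools:
                print(tool.name, "-", tool.description)

asyncio.run(main())
```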

Use Cases & Possibilities
What does this enable you to do? In short, it unlocks a range of powerful video understanding use cases for AI agents. Here are just a few scenarios made possible by the TwelveLabs MCP server:
Semantic Video Search as a Tool: Imagine asking an AI agent, “Find where the presenter shows the final chart in this 2-hour webinar,” and getting back the exact timestamp. The MCP server exposes our semantic video search engine as a tool that can be invoked with natural language. This makes “video search” a first-class action for any LLM agent – it can comb through your video index to find moments or scenes that match a description, using the power of our Marengo embeddings model behind the scenes.
Automatic Video Summaries & Q&A: You can have your AI assistant summarize a video or answer questions about its content on the fly. The MCP server wraps our Pegasus video language model into an easy call. An agent could invoke this to get a concise summary of a lengthy video, or even perform Q&A by feeding the video ID into a prompt. Because MCP allows structured prompts with dynamic context, the agent can retrieve a video’s metadata as a resource and include it in the prompt for a more accurate answer. In plain terms: your AI can “watch” a video and tell you all about it, just like a human would, but in seconds.
Chainable Video Analysis (RAG workflows): The real magic happens when you chain tools together. For instance, an agent might first call a search tool to retrieve relevant video clips (using your query), then pass those clips into an analysis tool to produce a detailed answer or report. This is essentially Retrieval-Augmented Generation (RAG) for videos – an agent orchestrating our search and analysis in sequence. With MCP, these multi-step workflows are seamless: the agent automatically uses the TwelveLabs tools in the right order, enabling complex tasks like "Find me all the scenes where this product appears and then generate a highlight reel summary of those scenes." Workflows that used to require manual coding and API stitching, an MCP-enabled agent can now handle autonomously (see the sketch after this list).
Interactive Video Assistants: Because the MCP server operates in real-time with your AI assistant, you can build truly interactive video workflows. For example, an agent could prompt you for input (“Which of these clips do you want to explore further?”), fetch additional information via a TwelveLabs tool, and loop back into the conversation. Use cases span from media & news (summarizing breaking news videos) to sports (finding and compiling game highlights), security (scanning surveillance footage for events), and beyond. The uniform interface means developers can easily experiment with creative ideas—like a personal video DJ agent that finds clips to match your mood—without worrying about the complexity of video processing under the hood.
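To make the chaining concrete, here is a rough sketch of the search-then-analyze pattern from the RAG item above, driven manually through an MCP ClientSession (in an agent setting, the model issues these calls on its own). The tool names "search" and "analyze" and their argument shapes are assumptions for illustration – check the tool list your client actually discovers.

```python
# Sketch of a chained search -> analyze workflow over MCP tool calls.
# Tool names and argument shapes are assumed, not the server's documented API.
from mcp import ClientSession

async def highlight_summary(session: ClientSession, index_id: str, query: str) -> str:
    # Step 1: semantic search over the index for moments matching the query.
    search_result = await session.call_tool(
        "search",  # assumed tool name
        arguments={"index_id": index_id, "query": query},
    )
    # Step 2: pass each matched clip into the analysis tool and collect notes.
    notes = []
    for block in search_result.content:  # results arrive as content blocks
        analysis = await session.call_tool(
            "analyze",  # assumed tool name
            arguments={"prompt": f"Summarize this moment: {block.text}"},
        )
        notes.append(analysis.content[0].text)  # assuming text blocks
    return "\n\n".join(notes)
```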
Claude Desktop in Action
One of the most exciting ways to experience the TwelveLabs MCP server is through Claude Desktop, Anthropic’s powerful chat-based AI desktop client. In our internal testing, we connected Claude Desktop to a locally running TwelveLabs MCP server – and watched Claude gain video superpowers instantly.
After starting the server and entering the connection details in Claude Desktop, the app automatically discovered our available tools (thanks to MCP's built-in discovery). Take a look at the demo recording below for the details. Behind the scenes, Claude used the MCP interface to call the TwelveLabs tools: listing videos in an index, searching for specific clips based on semantic queries, and performing a content analysis of a given video.

This example highlights how effortless it now is to integrate video understanding into AI workflows. With a few clicks, Claude gained the ability to search and analyze videos using natural language – no custom plugin, no prompt hacking, just the MCP server bridging the gap. And it’s not limited to Claude: you can plug the TwelveLabs MCP server into other MCP-compatible platforms too, like Cursor (for AI-assisted coding with video data) or your own Python scripts using an MCP client SDK. The possibilities for AI + video applications are endless when your tools speak a common protocol.
Getting Started with TwelveLabs’ MCP Server
Follow these steps to get up and running—no source checkout required:
Open the Installation Guide 👉 https://mcp-install-instructions.alpic.cloud/servers/twelvelabs-mcp. The guide walks you through adding the server to your preferred MCP client.
Grab your TwelveLabs API key: Sign in to your TwelveLabs account and copy your API key. You’ll paste it during setup so the server can securely call TwelveLabs APIs on your behalf.
Choose your MCP client: The guide includes client-specific steps (where to add the server, how to provide the API key, and how to approve tool calls).
Connect and verify: After adding the TwelveLabs MCP Server in your client, confirm that the tools are visible. Most clients show a tool list/registry UI.
Run a first workflow: See the demonstration below with the Windsurf IDE for some ideas.

One thing we want to call out is that robust tool-call error messages help agents recover automatically. Alpic's guide on intelligent MCP error handling is a great reference for production setups.
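As a rough illustration of that pattern, here is a hedged server-side sketch using FastMCP from the MCP Python SDK: the tool catches failures and returns actionable error text the agent can act on, rather than failing opaquely. The video_summary tool and its helper are illustrative stand-ins, not the TwelveLabs server's actual implementation.

```python
# Sketch: return structured, actionable error text from an MCP tool so the
# calling agent can self-correct. The tool and helper below are illustrative.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("twelvelabs-demo")

def fetch_summary_from_twelvelabs(video_id: str) -> str:
    # Placeholder for a real call to the TwelveLabs Analyze API.
    raise NotImplementedError("wire up the TwelveLabs API here")

@mcp.tool()
def video_summary(video_id: str) -> str:
    """Summarize a video by ID (illustrative)."""
    if not video_id.strip():
        # Tell the agent exactly how to recover instead of failing opaquely.
        return "Error: video_id is empty. Call the video-listing tool first, then retry with a valid ID."
    try:
        return fetch_summary_from_twelvelabs(video_id)
    except Exception as exc:
        return f"Error summarizing {video_id}: {exc}. Verify the ID exists in your index and retry."

if __name__ == "__main__":
    mcp.run()
```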
Conclusion & Next Steps
The release of the TwelveLabs MCP server marks an exciting step forward for AI developers and product builders. For the first time, it’s easy to give your AI agent eyes on video content – not by hacking together APIs and custom code, but by simply adding a standardized tool to its toolbox. We believe this will unlock a new wave of multimodal applications, from smarter virtual assistants that can understand meeting recordings, to creative generative agents that mix video context into their outputs.
In other words, our MCP Server makes video intelligence a first-class capability inside your AI workflows—without SDK sprawl or custom integration code. Because it’s distributed as a ready-to-use, install-only server, teams can connect from popular MCP clients in minutes and start building:
Media & news: instant summaries and highlights
Sports: moment search and compilation
Marketing: logo/moment detection and social-ready cuts
👉 Start Now
Use the Installation Guide to add the TwelveLabs MCP Server to your client: https://mcp-install-instructions.alpic.cloud/servers/twelvelabs-mcp
Follow our API documentation: https://docs.twelvelabs.io/docs/advanced/model-context-protocol
Or, if you are a developer who wants to build your own MCP server, consider launching a hosted server on Alpic (free beta): https://app.alpic.ai/
© 2021-2025 TwelveLabs, Inc. All Rights Reserved