James Le
Date Published
May 29, 2024
Multimodal AI
Media and Entertainment

Bullet Points

  • Co-organized by Twelve Labs, the Multimodal AI in Media & Entertainment Hackathon in Los Angeles, CA, on June 8-9, 2024, brings together the brightest minds in AI and the most innovative thinkers in media and entertainment.
  • With over 100 AI engineers, data scientists, and entertainment professionals expected to attend, this event is a unique opportunity to explore the cutting-edge applications of multimodal AI in the media and entertainment industry.
  • Participants will have the chance to work on cool projects, network with industry leaders, and compete for exciting prizes.
  • The hackathon will focus on developing AI-powered solutions for video editing, highlight reel generation, sports press conference summarization, and more.

The Exciting World of Multimodal AI in Media & Entertainment

Multimodal AI is revolutionizing the media and entertainment industry by enabling new possibilities in content creation, production, and user engagement. By integrating data types such as text, images, audio, and video, multimodal AI systems can interpret and generate content with much of the contextual nuance that a human viewer would pick up. This technology is transforming everything from scriptwriting and location scouting in pre-production to object removal and scene stabilization in post-production, making the creative process more efficient and innovative.

One of the most exciting applications of multimodal AI is in video understanding, where multimodal foundation models analyze and interpret video content to provide deeper insights and more accurate recommendations. For instance, Twelve Labs' technology can automatically search and classify digital assets, streamline post-production workflows, and enhance user engagement through personalized content recommendations. By leveraging the power of multimodal AI, M&E companies can create more engaging and interactive experiences for their audiences, ultimately driving higher levels of satisfaction and retention.


The Emerging AI Ecosystem in Los Angeles


Los Angeles is rapidly becoming a hub for AI innovation, thanks to its unique position at the intersection of Silicon Valley tech and Hollywood creativity. The city's vibrant AI community includes researchers, engineers, artists, and entrepreneurs who are working together to drive the future of AI-powered M&E. With a rich ecosystem of startups, established tech companies, and leading research institutions, Los Angeles is poised to become a global leader in AI development and application.


About Twelve Labs and the Co-organizer

Twelve Labs' innovative solutions are designed to streamline various aspects of video production and management. Our technology supports applications such as asset management, post-production workflow optimization, user engagement enhancement, and contextual advertising. For instance, our Search API enables users to find specific moments within vast video libraries quickly, our Classify API organizes videos into predefined categories, and our Generate API generates open-ended text about the input video. Additionally, our new Embed API and conversational agent Jockey further enhance the user experience and operational efficiency in the M&E industry.

The hackathon's co-organizer is a venture studio dedicated to pioneering AI-driven solutions for the entertainment industry. Born from the AI LA community, it focuses on bridging the gap between AI innovation and practical applications in media and entertainment. It supports early-stage AI startups by providing resources for product development, sales, and business challenges, while also fostering a community of AI entrepreneurs and creatives. Its mission is to ensure the ethical and responsible development of AI technologies, helping storytellers and content creators leverage AI to produce immersive and interactive experiences.


Media and Entertainment Use Cases of Twelve Labs

Asset Management: Our technology transforms asset management by making video archives easily searchable. This uncovers new value in your media library, monetizing previously unused content. The automatic retrieval and classification of videos allow media companies to manage digital assets efficiently and locate specific clips quickly, creating new revenue opportunities.

Post-Production Workflows: Our platform enhances efficiency in media production by streamlining post-production workflows. It uses state-of-the-art foundation models to quickly locate relevant clips across all footage, which reduces the time spent on manual searching and sorting and allows editors to focus on the creative aspects of their work. Applicable to film, TV, or digital content, it accelerates the editing workflow and helps deliver polished, engaging final products.

User Engagement: Enhancing user engagement is crucial for media platforms looking to retain and grow their audience. Our video embedding model creates multimodal embeddings that enable semantic search and content recommendation features. Such personalized recommendations improve user experience, increase platform usage, and drive higher engagement.

Contextual Advertising: Our solutions maximize ad revenue by analyzing video content to identify optimal ad placement moments, ensuring that they are relevant and non-intrusive. This targeted approach increases ad effectiveness, improves ROI for advertisers, and boosts revenue for media companies through precise contextual targeting.
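
To make the User Engagement use case above more concrete, here is a minimal sketch of how embedding-based semantic search or recommendation can work in principle. It does not call the Twelve Labs APIs: the three-dimensional vectors, the video names, and the helper functions `cosine_similarity` and `rank_by_similarity` are all hypothetical stand-ins for illustration, and real multimodal embeddings would have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, library, top_k=2):
    """Return the IDs of the top_k library items closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), vid)
              for vid, vec in library.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [vid for _, vid in scored[:top_k]]


# Hypothetical toy embeddings (assumption: not produced by any real model).
library = {
    "surf_documentary": [0.9, 0.1, 0.0],
    "cooking_show": [0.0, 0.2, 0.9],
    "beach_travel_vlog": [0.8, 0.3, 0.1],
}

# Hypothetical embedding of a user's query or of a video they just watched.
query = [1.0, 0.2, 0.0]

recommendations = rank_by_similarity(query, library)
print(recommendations)
```

The same ranking step serves both features: embed a text query to get semantic search, or embed a recently watched video to get recommendations.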


Hackathon Challenges

The hackathon will feature four exciting challenges:

  1. Video Editing with Johnny Harris: Develop an AI-powered video editing tool that can analyze Johnny Harris' Switzerland bunker footage and script to automate the process of finding relevant clips and creating montages based on the script. This challenge aims to streamline the video editing process, making it faster and more efficient.
  2. Highlight Reel Generation with Drew Binsky: Create an AI tool that can analyze Drew Binsky's travel footage and automatically generate engaging highlight reels. The goal is to develop a system that can identify the most captivating moments from hours of travel videos, making it easier to produce compelling content.
  3. Sports Press Conference Summarization and Highlight Generation: Build an AI tool that can generate concise summaries and highlight reels from sports press conference videos. The goal is to help fans and media quickly grasp the key takeaways and most memorable moments without watching the entire press conference.
  4. AWS-Powered Video Q&A Chatbot with RAG: Develop an AI-powered chatbot that can answer questions about a library of movie and TV show trailers using the Retrieval-Augmented Generation (RAG) approach. The chatbot should provide an engaging and informative Q&A experience for users interested in learning more about upcoming releases.
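
For readers new to the Retrieval-Augmented Generation approach mentioned in the fourth challenge, the sketch below shows its two stages in a deliberately simplified form: retrieve the most relevant trailer descriptions, then assemble them into a prompt for a language model. Everything here is an assumption for illustration: the trailer descriptions are invented, retrieval uses plain word overlap rather than embeddings, and the generation step is left out because it depends on the model a team chooses.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of punctuation-free words."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & tokenize(d["text"])),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, retrieved):
    """Combine retrieved context and the question into one prompt string."""
    context = "\n".join(f"[{d['id']}] {d['text']}" for d in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical trailer descriptions (assumption: invented for this example).
documents = [
    {"id": "trailer_a", "text": "A heist crew plans a robbery aboard a moving train."},
    {"id": "trailer_b", "text": "Two chefs compete to open a restaurant in Paris."},
]

question = "Which trailer features a train robbery?"
top = retrieve(question, documents)
prompt = build_prompt(question, top)
print(prompt)  # this string would be sent to a language model of your choice
```

In a hackathon project, the word-overlap retriever would likely be replaced by a search over video embeddings or indexed video descriptions, and the prompt would be passed to a hosted LLM to produce the final answer.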


Resources for Hackathon Participants


Participants in the Multimodal AI in Media & Entertainment Hackathon will have access to a wealth of resources from Twelve Labs to help them succeed in their projects. These resources include comprehensive API documentation, tutorials, and SDKs that provide detailed guidance on how to leverage our video understanding platform.

  • The API documentation covers all aspects of Twelve Labs' technology, including the Search API, Classify API, Generate API, and the new Embed API, which is currently in limited Beta and accessible exclusively to a select group of users.
  • Additionally, participants will have the unique opportunity to work with Jockey, a conversational video agent built on top of the Twelve Labs APIs and LangGraph. Although Jockey has not been publicly released yet, it will be available for hackathon participants, providing them with advanced capabilities for interacting with video content through other LLMs.

Besides Twelve Labs, we have additional sponsors including AWS and Fireworks. Details about sponsors and judges can be found on the hackathon page.


Registration Information

Register now to secure your spot and help drive the future of AI in entertainment! Spaces are limited, so don't wait. Join us in Los Angeles, CA, on June 8-9, 2024.

