CASE STUDY
TwelveLabs and UNICEF Korea: 8TB of Field Records, Instantly Searchable
The Challenge
UNICEF Korea Committee holds over 8TB of media accumulated across decades of fundraising campaigns and child rights programs — tens of thousands of hours of video and millions of images spanning relief activities, donor events, and nonprofit operations across dozens of countries.
Despite its value, this archive had remained largely inaccessible. Content was scattered across individual PCs and network storage systems with no unified way to search or retrieve it. Locating a specific clip often meant manually reviewing thousands of folders, with file names offering little indication of what they contained. Footage buried for up to a decade sat unused — not because it lacked value, but because finding it was simply too slow.
The team needed a way to make this content findable: not just technically indexed, but actually searchable by the people who use it — communications and campaign staff who think in scenes and stories, not file paths.
What Failed
The Limits of Traditional Approaches
Standard approaches to archive management hit a wall with this kind of content:
Manual cataloging at 8TB scale is not realistic — it requires months of dedicated resources, and any new content immediately falls out of scope.
Keyword-based search fails for visual content. File names and tags cannot capture what is happening in a scene: a query such as "a mother holding a child near a water source in South Sudan" will not match any file name.
Frame-by-frame AI tools treat video as disconnected images, missing the narrative context and temporal relationships that give footage its meaning.
The archive needed something that could understand what was happening in a video — not just what pixels appeared in a frame.
Why TwelveLabs
TwelveLabs approaches video differently — understanding it as a unified medium where visual content, motion, sound, and temporal relationships are analyzed together. Two core models power this:
Marengo
The encoder. Compresses video, audio, image, and text into shared multimodal embeddings — enabling semantic search across all content types simultaneously.
Pegasus
The reasoning model. Localizes events and understands what is happening across a video's full duration — not just in isolated frames.
For a nonprofit archive, this architecture matters. Field footage is rarely clean or labeled. A search for "children receiving vaccines" or "flood relief distribution" needs to work even when files are unnamed, metadata is absent, and the relevant moment lasts only a few seconds in a longer clip. TwelveLabs makes exactly these searches possible.
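The idea of a shared embedding space can be illustrated with a minimal sketch. This is not the TwelveLabs API or Letsur's implementation: the three-dimensional vectors, the file names, and the `search` helper are all made up for illustration, and a real system would obtain its vectors from an embedding model. The sketch only shows why one text query can rank video segments, images, and documents together, and how storing video per segment lets a hit point at a few seconds inside a longer file.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy vectors standing in for real embeddings; every value here is invented.
# Video is stored per segment, with a (start, end) span in seconds.
ARCHIVE = [
    {"source": "clip_001.mp4", "kind": "video", "span": (42.0, 48.0), "vector": [0.9, 0.1, 0.0]},
    {"source": "clip_001.mp4", "kind": "video", "span": (48.0, 54.0), "vector": [0.1, 0.9, 0.1]},
    {"source": "photo_417.jpg", "kind": "image", "span": None, "vector": [0.8, 0.2, 0.1]},
    {"source": "report_2019.pdf", "kind": "document", "span": None, "vector": [0.0, 0.1, 0.9]},
]

def search(query_vector, items, top_k=2):
    """Rank every item, whatever its type, by similarity to the query vector."""
    ranked = sorted(items, key=lambda item: cosine(query_vector, item["vector"]), reverse=True)
    return [(item["source"], item["kind"], item["span"]) for item in ranked[:top_k]]

# Hypothetical embedding of a text query such as "children receiving vaccines".
query = [1.0, 0.0, 0.0]
results = search(query, ARCHIVE)
```

Because all items live in the same vector space, the ranking needs no file names, tags, or per-type logic. In this toy data the top two results are one six-second video segment and one photograph.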
The solution was built by Letsur, a Korean AI integration partner, using TwelveLabs APIs — and delivered to UNICEF Korea Committee as a production-ready archive system.
IMPLEMENTING TWELVELABS' VIDEO UNDERSTANDING
1
Centralized Storage and Indexing
Existing content on personal PCs and NAS drives was migrated to AWS S3, creating a centralized, scalable foundation. TwelveLabs indexed the entire archive — approximately 200 hours and 2TB of video, plus associated images and documents — generating multimodal embeddings across all content.
2
Natural Language Search
Staff can now search the archive the way they think: by describing a scene, a situation, or a subject in plain language. Queries like "children collecting water at a field site in Africa" or "end-of-year fundraising campaign clips" return relevant clips, images, and documents — instantly, with precise timestamps.
3
Automatic Ingestion
New content — from field operations, relief activities, and global missions — is automatically indexed as it enters the system. There is no manual classification step. Every new video, photo, or document is immediately discoverable through the same natural language interface.
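One common way to get this kind of hands-off ingestion on AWS S3 is to react to the bucket's object-created notifications. The case study does not say which mechanism the deployed system uses, so the sketch below is an assumption throughout: the extension table, the `submit_for_indexing` stub, and the bucket and key names are hypothetical. Only the shape of the S3 event record (`Records`, `eventName`, `s3.bucket.name`, `s3.object.key`, with URL-encoded keys) follows the standard notification format.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote_plus

# Hypothetical mapping from file extension to the kind of asset to index.
MEDIA_KINDS = {
    ".mp4": "video", ".mov": "video",
    ".jpg": "image", ".jpeg": "image", ".png": "image",
    ".pdf": "document", ".docx": "document",
}

def submit_for_indexing(bucket, key, kind):
    """Hypothetical stub: a real system would call its indexing service here."""
    return {"bucket": bucket, "key": key, "kind": kind, "status": "queued"}

def handle_s3_event(event):
    """Queue an indexing job for each newly created object of a known type."""
    jobs = []
    for record in event.get("Records", []):
        # Ignore deletions and other non-creation events.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded, with spaces sent as "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        kind = MEDIA_KINDS.get(PurePosixPath(key).suffix.lower())
        if kind is None:
            continue  # Not a media type this sketch knows how to index.
        jobs.append(submit_for_indexing(bucket, key, kind))
    return jobs

# Example notification with invented bucket and key names.
sample_event = {"Records": [
    {"eventName": "ObjectCreated:Put",
     "s3": {"bucket": {"name": "archive-bucket"},
            "object": {"key": "field/2024/vaccination+drive.MP4"}}},
    {"eventName": "ObjectCreated:Put",
     "s3": {"bucket": {"name": "archive-bucket"},
            "object": {"key": "notes/readme.txt"}}},
    {"eventName": "ObjectRemoved:Delete",
     "s3": {"bucket": {"name": "archive-bucket"},
            "object": {"key": "old/clip.mp4"}}},
]}
jobs = handle_s3_event(sample_event)
```

In this design no one classifies a file by hand: the upload itself triggers the indexing job, and unsupported or deleted objects are skipped.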
1
Donor Communications
Communications staff can locate field photography and footage relevant to a specific donor story in seconds. What previously required manual browsing across multiple drives now surfaces immediately, freeing up time that staff can redirect toward their core communications work.
2
Campaign Planning and Content Development
Past campaign materials are now easier to search and reference. Teams planning new campaigns can quickly surface relevant footage and images from prior work — significantly reducing the time spent on asset discovery in the early stages of content development.
3
Ongoing Field Documentation
Field teams continuously generate documentation — from vaccination drives and emergency response efforts to long-term development programs. All of this material is now indexed and searchable automatically, without any manual tagging or classification.
4
Improving Access to Historical Archives
Footage that sat unused for up to a decade can now be retrieved through the same search interface. Years of documentation, including field footage, campaign materials, and relief records, can be found and reused when relevant.
Impact
The impact of the deployment is measurable and immediate:
95% reduction in content retrieval time — from manual folder browsing to natural language results in seconds.
8TB+ of fragmented data transformed into a structured, searchable digital asset library.
Thousands of manual folder reviews eliminated — campaign planning and content production workflows now move dramatically faster.
Approximately 200 hours / 2TB of video indexed and instantly accessible to all staff.
Zero manual classification required for new content — the archive updates automatically as new material is added.
Beyond the quantitative gains, the deeper value is organizational: UNICEF Korea Committee now has an institutional memory. Communications and campaign teams spend time on the work that matters — storytelling, donor engagement, campaign strategy — rather than on content retrieval.




