Advanced Video Search: Using Twelve Labs and Milvus for Semantic Search

James Le, Manish Maheshwari

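By integrating Twelve Labs' Embed API with Milvus, developers can store multimodal video embeddings in a Milvus collection and find relevant video segments through vector similarity search, including hybrid text-and-video queries. The workflow is built on multimodal video embeddings generated with Marengo.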

August 2, 2024

11 min read

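TLDR: Learn how to build a semantic video search application by integrating Twelve Labs' Embed API, which generates multimodal embeddings, with Milvus, an open-source vector database for efficient storage and retrieval. The tutorial covers the whole process, from setting up the development environment to implementing advanced features such as hybrid search and temporal video analysis, and gives you a foundation for building sophisticated video content analysis and retrieval systems. Many thanks to the Zilliz team (Jiang Chen and Chen Zhang) for collaborating on this integration guide.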

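Introduction

Welcome to this tutorial on implementing semantic video search with the Twelve Labs Embed API and Milvus, the open-source vector database developed by Zilliz. In this guide, we combine Twelve Labs' multimodal embeddings with Milvus' efficient vector storage to build a video search solution. Together, these technologies let developers build applications such as content-based video retrieval, recommendation systems, and search engines that understand the nuances of video data.

The tutorial walks through the whole process, from setting up the development environment to implementing a working semantic video search application. It covers the key concepts: generating multimodal embeddings from videos, storing them efficiently in Milvus, and running similarity searches to retrieve relevant content. Whether you are building a video analytics platform or a content discovery tool, or adding video search to an existing application, this guide gives you the knowledge and practical steps to use Twelve Labs and Milvus together in your project.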

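Prerequisites

Before you begin, make sure you have the following:

A Milvus server installed and running (see the Milvus installation guide for detailed instructions)
A Twelve Labs API key (if you don't have one, sign up at https://playground.twelvelabs.io)
Python 3.7 or later installed on your system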

Setting Up the Development Environment

Create a new directory for the project and change into it:

mkdir video-search-tutorial
cd video-search-tutorial

Set up a virtual environment (optional but recommended):

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

Install the required Python libraries:

pip install twelvelabs pymilvus

Create a new Python file for the project:

touch video_search.py

This video_search.py file will be the main script used in this tutorial. Next, set your Twelve Labs API key as an environment variable so that it stays out of the source code:

export TWELVE_LABS_API_KEY='your_api_key_here'

Connecting to Milvus

To connect to Milvus, we use the MilvusClient class. This simplifies the connection process and lets us work with a local, file-based Milvus instance, which is all this tutorial needs.

from pymilvus import MilvusClient

# Initialize the Milvus client
milvus_client = MilvusClient("milvus_twelvelabs_demo.db")

print("Successfully connected to Milvus")

This code creates a new Milvus client instance that stores all data in a file named milvus_twelvelabs_demo.db. The file-based approach is well suited to development and testing.

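Creating a Milvus Collection for Video Embeddings

Now that we are connected to Milvus, let's create a collection to store the video embeddings and their metadata. The code below drops any existing collection with the same name and then creates a fresh one, passing only the collection name and the vector dimension.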

# Initialize the collection name
collection_name = "twelvelabs_demo_collection"

# Check if the collection already exists and drop it if it does
if milvus_client.has_collection(collection_name=collection_name):
    milvus_client.drop_collection(collection_name=collection_name)

# Create the collection
milvus_client.create_collection(
    collection_name=collection_name,
    dimension=1024  # The dimension of the Twelve Labs embeddings
)

print(f"Collection '{collection_name}' created successfully")

The code first checks whether the collection already exists and drops it if so, which gives us a clean slate. It then creates the collection with 1024 dimensions to match the output dimension of the Twelve Labs embeddings.
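
Because the collection is created with dimension=1024, a vector of any other length is expected to be rejected at insert time. A small pre-insert check can surface that mismatch earlier and with a clearer message. The following is a minimal sketch; the helper name check_dimensions and the placeholder vectors are assumptions for illustration and are not part of pymilvus or the Twelve Labs SDK.

```python
EXPECTED_DIM = 1024  # same value as the `dimension` passed to create_collection


def check_dimensions(vectors, expected_dim=EXPECTED_DIM):
    """Raise ValueError if any vector does not have the expected length."""
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise ValueError(
                f"vector {index} has {len(vector)} dimensions, expected {expected_dim}"
            )
    return len(vectors)


# Placeholder vectors; real ones would come from the Embed API
sample_vectors = [[0.0] * 1024, [0.1] * 1024]
checked = check_dimensions(sample_vectors)
print(f"{checked} vectors match the collection dimension")
```

Running a check like this just before the insert step keeps a dimension mismatch from showing up as a less obvious database error.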

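Generating Embeddings with the Twelve Labs Embed API

To generate embeddings for a video, we call the Embed API through the Twelve Labs Python SDK. The process involves creating an embedding task, waiting for it to complete, and retrieving the results. Here's how to implement it.

First, make sure the Twelve Labs SDK is installed, then import the required modules: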

from twelvelabs import TwelveLabs
from twelvelabs.models.embed import EmbeddingsTask
import os

# Retrieve the API key from environment variables
TWELVE_LABS_API_KEY = os.getenv('TWELVE_LABS_API_KEY')

Initialize the Twelve Labs client:

twelvelabs_client = TwelveLabs(api_key=TWELVE_LABS_API_KEY)
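
Note that os.getenv returns None when the variable is missing, so a forgotten export only shows up later as a failed API call. One option is to fail fast with a clear message. This is a minimal sketch; require_env is a hypothetical helper name, not something provided by the SDK.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; export it before running the script"
        )
    return value
```

With this helper, the key lookup above could be written as require_env('TWELVE_LABS_API_KEY') instead of os.getenv('TWELVE_LABS_API_KEY').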

Create a function that generates embeddings for a given video URL:

def generate_embedding(video_url):
    """
    Generate embeddings for a given video URL using the Twelve Labs API.

    This function creates an embedding task for the specified video URL using
    the Marengo-retrieval-2.6 engine. It monitors the task progress and waits
    for completion. Once done, it retrieves the task result and extracts the
    embeddings along with their associated metadata.

    Args:
        video_url (str): The URL of the video to generate embeddings for.

    Returns:
        tuple: A tuple containing two elements:
            1. list: A list of dictionaries, where each dictionary contains:
                - 'embedding': The embedding vector as a list of floats.
                - 'start_offset_sec': The start time of the segment in seconds.
                - 'end_offset_sec': The end time of the segment in seconds.
                - 'embedding_scope': The scope of the embedding (e.g., 'clip', 'video').
            2. EmbeddingsTaskResult: The complete task result object from Twelve Labs API.

    Raises:
        Any exceptions raised by the Twelve Labs API during task creation,
        execution, or retrieval.
    """

    # Create an embedding task
    task = twelvelabs_client.embed.task.create(
        engine_name="Marengo-retrieval-2.6",
        video_url=video_url
    )
    print(f"Created task: id={task.id} engine_name={task.engine_name} status={task.status}")

    # Define a callback function to monitor task progress
    def on_task_update(task: EmbeddingsTask):
        print(f"  Status={task.status}")

    # Wait for the task to complete
    status = task.wait_for_done(
        sleep_interval=2,
        callback=on_task_update
    )
    print(f"Embedding done: {status}")

    # Retrieve the task result
    task_result = twelvelabs_client.embed.task.retrieve(task.id)

    # Extract and return the embeddings
    embeddings = []
    for v in task_result.video_embeddings:
        embeddings.append({
            'embedding': v.embedding.float,
            'start_offset_sec': v.start_offset_sec,
            'end_offset_sec': v.end_offset_sec,
            'embedding_scope': v.embedding_scope
        })
    
    return embeddings, task_result

Use the function to generate embeddings for a video:

# Example usage
video_url = "https://example.com/your-video.mp4"

# Generate embeddings for the video
embeddings, task_result = generate_embedding(video_url)

print(f"Generated {len(embeddings)} embeddings for the video")
for i, emb in enumerate(embeddings):
    print(f"Embedding {i+1}:")
    print(f"  Scope: {emb['embedding_scope']}")
    print(f"  Time range: {emb['start_offset_sec']} - {emb['end_offset_sec']} seconds")
    print(f"  Embedding vector (first 5 values): {emb['embedding'][:5]}")
    print()

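This implementation lets you generate embeddings for any video URL using the Twelve Labs Embed API. The generate_embedding function handles the whole process, from creating the task to retrieving the results. It returns a list of dictionaries, each containing an embedding vector and its metadata (time range and scope), along with the full task result. In production, handle potential errors such as network failures and API rate limits; depending on your use case, consider adding a retry mechanism or more robust error handling.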

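Inserting Embeddings into Milvus

After generating embeddings with the Twelve Labs Embed API, the next step is to insert them into the Milvus collection together with their metadata. This stores and indexes the video embeddings so that similarity searches can run efficiently later.

Here's how to insert the embeddings into Milvus: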

def insert_embeddings(milvus_client, collection_name, task_result, video_url):
    """
    Insert embeddings into the Milvus collection.

    Args:
        milvus_client: The Milvus client instance.
        collection_name (str): The name of the Milvus collection to insert into.
        task_result (EmbeddingsTaskResult): The task result containing video embeddings.
        video_url (str): The URL of the video associated with the embeddings.

    Returns:
        MutationResult: The result of the insert operation.

    This function takes the video embeddings from the task result and inserts them
    into the specified Milvus collection. Each embedding is stored with additional
    metadata including its scope, start and end times, and the associated video URL.
    """
    data = []

    for i, v in enumerate(task_result.video_embeddings):
        data.append({
            "id": i,
            "vector": v.embedding.float,
            "embedding_scope": v.embedding_scope,
            "start_offset_sec": v.start_offset_sec,
            "end_offset_sec": v.end_offset_sec,
            "video_url": video_url
        })

    insert_result = milvus_client.insert(collection_name=collection_name, data=data)
    print(f"Inserted {len(data)} embeddings into Milvus")
    return insert_result

# Usage example
video_url = "https://example.com/your-video.mp4"

# Assuming this function exists from previous step
embeddings, task_result = generate_embedding(video_url)

# Insert embeddings into the Milvus collection
insert_result = insert_embeddings(milvus_client, collection_name, task_result, video_url)
print(insert_result)

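This function prepares the data to insert, including the embedding vector, the time range, the embedding scope, and the source video URL. It then uses the Milvus client to insert the data into the specified collection.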
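
One thing to watch when indexing more than one video: insert_embeddings uses the loop index as the id, so the segments of a second video would reuse the ids 0, 1, 2, and so on. A possible way around this is to derive the id from the video URL and the segment index. The sketch below assumes the id field holds a signed 64-bit integer; segment_id is a hypothetical helper, and hashing is only one of several reasonable choices.

```python
import hashlib


def segment_id(video_url, segment_index):
    """Derive a stable, non-negative 64-bit integer ID from URL and segment index."""
    key = f"{video_url}#{segment_index}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Keep 63 bits so the value fits in a signed 64-bit integer field
    return int.from_bytes(digest[:8], "big") & 0x7FFFFFFFFFFFFFFF


ids_a = [segment_id("https://example.com/a.mp4", i) for i in range(3)]
ids_b = [segment_id("https://example.com/b.mp4", i) for i in range(3)]
print(len(set(ids_a + ids_b)), "distinct ids")
```

Because the id depends only on the URL and the index, re-processing the same video yields the same ids, which makes it easier to spot or replace previously inserted segments.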

Performing Similarity Search

Once the embeddings are stored in Milvus, we can run a similarity search to find the video segments most relevant to a query vector. Here's how to implement it:

def perform_similarity_search(milvus_client, collection_name, query_vector, limit=5):
    """
    Perform a similarity search on the Milvus collection.

    Args:
        milvus_client: The Milvus client instance.
        collection_name (str): The name of the Milvus collection to search in.
        query_vector (list): The query vector to search for similar embeddings.
        limit (int, optional): The maximum number of results to return. Defaults to 5.

    Returns:
        list: A list of search results, where each result is a dictionary containing
              the matched entity's metadata and similarity score.

    This function searches the specified Milvus collection for embeddings similar to
    the given query vector. It returns the top matching results, including metadata
    such as the embedding scope, time range, and associated video URL for each match.
    """
    search_results = milvus_client.search(
        collection_name=collection_name,
        data=[query_vector],
        limit=limit,
        output_fields=["embedding_scope", "start_offset_sec", "end_offset_sec", "video_url"]
    )

    return search_results
    
# define the query vector
# We use the embedding inserted previously as an example. In practice, you can replace it with any video embedding you want to query.
query_vector = task_result.video_embeddings[0].embedding.float

# Perform a similarity search on the Milvus collection
search_results = perform_similarity_search(milvus_client, collection_name, query_vector)

print("Search Results:")
for i, result in enumerate(search_results[0]):
    print(f"Result {i+1}:")
    print(f"  Video URL: {result['entity']['video_url']}")
    print(f"  Time Range: {result['entity']['start_offset_sec']} - {result['entity']['end_offset_sec']} seconds")
    print(f"  Similarity Score: {result['distance']}")
    print()

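This implementation does the following:

Defines a perform_similarity_search function that takes a query vector and searches the Milvus collection for similar embeddings.
Uses the Milvus client's search method to find the most similar vectors.
Specifies the output fields to return, including the metadata of the matching video segments.
Shows the function in use with an example query vector; here the first embedding inserted earlier is reused as the query.
Prints the search results, including the relevant metadata and similarity scores.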

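With these functions in place, you have a complete workflow for storing video embeddings in Milvus and running similarity searches over them. This setup lets you efficiently retrieve similar video content based on the multimodal embeddings generated by the Twelve Labs Embed API.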
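
To make the similarity score more concrete, the sketch below scores a query against every stored row by brute force, which is conceptually what a vector search returns without the index that makes it fast. It assumes the collection uses cosine similarity as its metric (check the metric type of your own collection); the two-dimensional vectors and the row layout are placeholders for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_top_k(query, rows, k=5):
    """Score every stored row against the query and return the k best."""
    scored = [(cosine_similarity(query, row["vector"]), row) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


rows = [
    {"id": 0, "vector": [1.0, 0.0], "start_offset_sec": 0, "end_offset_sec": 6},
    {"id": 1, "vector": [0.0, 1.0], "start_offset_sec": 6, "end_offset_sec": 12},
    {"id": 2, "vector": [0.7, 0.7], "start_offset_sec": 12, "end_offset_sec": 18},
]
for score, row in brute_force_top_k([1.0, 0.1], rows, k=2):
    print(row["id"], round(score, 3))
```

With a cosine metric, a higher score means a closer match, so the segment used as the query should appear at or near the top of its own results.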

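Optimizing Performance

Now let's take the application a step further. Performance matters most when you work with large video collections. To improve it, apply batch processing to both embedding generation and insertion into Milvus. This lets you process multiple videos at once and reduces overall processing time. You can also use Milvus' partitioning feature to organize data more efficiently, for example by video category or time period. Queries then search only the relevant partitions, which speeds them up further.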
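
The batching idea can be sketched as follows. The insert call is passed in as a function so the example runs without a database; in the tutorial's setup it would be a small wrapper around milvus_client.insert. The helper names and the batch size of 500 are assumptions chosen for illustration, not recommended values.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def insert_in_batches(insert_fn, rows, batch_size=500):
    """Call insert_fn once per batch and return the number of rows sent."""
    total = 0
    for batch in chunked(rows, batch_size):
        insert_fn(batch)
        total += len(batch)
    return total


# Stand-in for:
#   lambda batch: milvus_client.insert(collection_name=collection_name, data=batch)
sent_batches = []
rows = [{"id": i, "vector": [0.0] * 4} for i in range(1200)]
print(insert_in_batches(sent_batches.append, rows, batch_size=500))
```

Collecting the rows for several videos first and sending them in a few larger requests avoids one round trip per video.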

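Another optimization is to cache frequently accessed embeddings or search results, which can noticeably improve response times for popular queries. Also remember to tune Milvus' index parameters for your specific dataset and query patterns; small adjustments here can have a large effect on search performance.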

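Advanced Features

Next, let's add some features that set the application apart. You can implement hybrid search that combines text and video queries; the Twelve Labs Embed API also supports generating text embeddings for text search. Imagine a user who provides both a text description and a sample video clip. By generating embeddings for both and running a weighted search in Milvus, you can return much more precise results.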
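
One simple way to approximate a weighted search is to run one search per modality and fuse the scores afterwards. The sketch below assumes each hit is a dictionary with id, distance and entity keys, that a higher distance means a closer match (as with a cosine metric), and that both searches target the same collection. The function name and the sample hits are illustrative only.

```python
def fuse_results(text_hits, video_hits, text_weight=0.5, limit=5):
    """Combine two hit lists by a weighted sum of their similarity scores.

    A segment that is missing from one list contributes 0 for that modality.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    video_weight = 1.0 - text_weight
    combined = {}
    for weight, hits in ((text_weight, text_hits), (video_weight, video_hits)):
        for hit in hits:
            entry = combined.setdefault(
                hit["id"], {"id": hit["id"], "entity": hit["entity"], "score": 0.0}
            )
            entry["score"] += weight * hit["distance"]
    ranked = sorted(combined.values(), key=lambda e: e["score"], reverse=True)
    return ranked[:limit]


text_hits = [
    {"id": 1, "distance": 0.9, "entity": {"start_offset_sec": 0, "end_offset_sec": 6}},
    {"id": 2, "distance": 0.4, "entity": {"start_offset_sec": 6, "end_offset_sec": 12}},
]
video_hits = [
    {"id": 2, "distance": 0.8, "entity": {"start_offset_sec": 6, "end_offset_sec": 12}},
    {"id": 3, "distance": 0.7, "entity": {"start_offset_sec": 12, "end_offset_sec": 18}},
]
for hit in fuse_results(text_hits, video_hits, text_weight=0.5):
    print(hit["id"], round(hit["score"], 2))
```

Segments that match both the text and the clip rise to the top, and text_weight controls how much each modality counts.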

Another useful addition is temporal search within videos. You can split long videos into smaller segments, each with its own embedding, so that users can find specific moments inside a video rather than only whole clips. You could also add basic video analytics: use the embeddings to cluster similar video segments, detect trends, or identify outliers in a large video collection.
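
When several neighbouring segments of the same video match a query, it is often more useful to present them as one continuous moment. The following sketch merges hits whose time ranges touch or overlap; it assumes each hit carries the video_url, start_offset_sec and end_offset_sec fields stored earlier, and merge_segments is a hypothetical helper name.

```python
def merge_segments(hits, max_gap_sec=0.0):
    """Merge hits from the same video whose time ranges touch or overlap.

    Returns a list of (video_url, start, end) tuples sorted by video and start.
    """
    ordered = sorted(hits, key=lambda h: (h["video_url"], h["start_offset_sec"]))
    merged = []
    for hit in ordered:
        url = hit["video_url"]
        start, end = hit["start_offset_sec"], hit["end_offset_sec"]
        if merged and merged[-1][0] == url and start - merged[-1][2] <= max_gap_sec:
            last_url, last_start, last_end = merged[-1]
            merged[-1] = (last_url, last_start, max(last_end, end))
        else:
            merged.append((url, start, end))
    return merged


hits = [
    {"video_url": "https://example.com/a.mp4", "start_offset_sec": 12.0, "end_offset_sec": 18.0},
    {"video_url": "https://example.com/a.mp4", "start_offset_sec": 0.0, "end_offset_sec": 6.0},
    {"video_url": "https://example.com/a.mp4", "start_offset_sec": 6.0, "end_offset_sec": 12.0},
    {"video_url": "https://example.com/b.mp4", "start_offset_sec": 30.0, "end_offset_sec": 36.0},
]
for url, start, end in merge_segments(hits):
    print(f"{url}: {start}-{end}s")
```

Raising max_gap_sec also joins segments separated by a short pause, which can help when a scene is interrupted briefly.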

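Error Handling and Logging

Things can go wrong in both development and production, and the application needs to be prepared for that. Robust error handling is essential: wrap API calls and database operations in try-except blocks so that failures produce useful information for the user. For network-related problems, a retry strategy with exponential backoff helps the application recover from transient failures.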
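
A retry with exponential backoff can be sketched in a few lines. The helper below is an illustration under stated assumptions: the name retry_with_backoff, the default delays and the 10% jitter are arbitrary choices, and the sleep function is injectable so the demo runs instantly.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (capped, jittered) and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay * 0.1))


# Demo with a function that fails twice before succeeding
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


waits = []  # records the delays instead of actually sleeping
result = retry_with_backoff(flaky_call, base_delay=0.5, sleep=waits.append)
print(result, len(waits))
```

In the tutorial's code, a call such as generate_embedding(video_url) could be wrapped in a lambda and passed as func. It is usually better to narrow retry_on to the transient exceptions your client actually raises, so that permanent errors such as an invalid API key fail immediately.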

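Logging is your main tool for debugging and monitoring. Use Python's built-in logging module to track important events, errors, and performance metrics throughout the application. Set log levels appropriately: DEBUG for development, INFO for general operational information, and ERROR for serious problems. It is also a good idea to set up log rotation to keep file sizes under control. With proper logging in place, you can identify and resolve problems quickly and keep the video search application reliable as it grows.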
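
The logging setup described above might look like the following sketch, which uses RotatingFileHandler from the standard library. The logger name, file size limit and backup count are assumptions for illustration, and the log file is written to a temporary directory so the example leaves nothing behind in the project folder.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def build_logger(name, log_path, level=logging.INFO,
                 max_bytes=1_000_000, backup_count=3):
    """Return a logger that writes to a size-rotated file."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = RotatingFileHandler(
            log_path, maxBytes=max_bytes, backupCount=backup_count, encoding="utf-8"
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


log_path = os.path.join(tempfile.mkdtemp(), "video_search.log")
logger = build_logger("video_search", log_path)
logger.info("Inserted %d embeddings", 12)
logger.debug("This line is dropped because the level is INFO")
try:
    raise ConnectionError("Milvus unreachable")
except ConnectionError:
    logger.exception("Insert failed")  # logs at ERROR level with the traceback
```

Once the file reaches max_bytes, the handler renames it and starts a new one, keeping at most backup_count old files.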

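Conclusion

Congratulations! You have now built a semantic video search application using the Twelve Labs Embed API and Milvus. This integration lets you process, store, and retrieve video content with high accuracy and efficiency. By using multimodal embeddings, the system captures the nuances of video data, which opens up possibilities in content discovery, recommendation systems, and sophisticated video analytics.

As you continue to improve and extend the application, the combination of Twelve Labs' embedding technology and Milvus' scalable vector storage gives you a solid foundation for tackling more complex video understanding challenges. We encourage you to experiment with the advanced features covered in this guide and push the boundaries of what is possible in video search and analysis.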

Appendix

For further practice and reference, see the following resources.

We look forward to seeing what you build! Please share your projects and experiences with the Twelve Labs and Milvus communities. Happy coding!
