TwelveLabs Unveils AI Video Intelligence Platform at NAB | Martech Edge
EIN Presswire

Published on: Apr 21, 2026

At NAB Show 2026, TwelveLabs introduced a new generation of video intelligence tools, signaling its transition from a model provider to a full-stack AI platform capable of transforming how enterprises and creators analyze and operationalize video data.

Video has long been one of the most valuable—and least accessible—forms of enterprise data. Although video represents the majority of digital content globally, extracting meaningful insights from it has traditionally required manual tagging, time-intensive review processes, and fragmented workflows. TwelveLabs is aiming to change that equation.

At NAB Show 2026, the company unveiled a series of product and ecosystem updates centered on a single idea: making video data as searchable, structured, and actionable as text. The announcement marks a strategic shift toward becoming a full-stack video intelligence platform, combining foundation models, applications, and ecosystem integrations.

At the core of this evolution is Pegasus 1.5, TwelveLabs’ latest video foundation model. The model introduces what the company describes as time-based metadata extraction—a capability that allows users to define specific criteria and automatically identify relevant segments within video content, complete with timestamps and structured outputs.

Unlike traditional video analysis tools, which rely heavily on pre-defined tags or manual annotation, Pegasus 1.5 dynamically interprets context. It can identify transitions, key events, and objects within a video in a way that mirrors how human editors review footage. This enables organizations to move from raw video to structured data with minimal intervention.
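To make the idea of "raw video to structured data" concrete, the sketch below shows what a time-based metadata extraction result could look like downstream: a list of timestamped, labeled segments that code can filter and route without any manual tagging. The `Segment` shape, labels, and numbers are illustrative assumptions, not the actual TwelveLabs output schema.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start_s: float      # segment start time, in seconds
    end_s: float        # segment end time, in seconds
    label: str          # user-defined criterion the segment matched
    confidence: float   # model confidence in the match

# Hypothetical structured output for one video, in the spirit of what a
# model like Pegasus 1.5 returns: segments matching user-defined criteria,
# each with timestamps. Values here are invented for illustration.
segments = [
    Segment(12.0, 18.5, "goal", 0.93),
    Segment(45.2, 61.0, "interview", 0.81),
    Segment(102.4, 109.9, "goal", 0.88),
]

def find(segments, label, min_conf=0.8):
    """Return segments matching a criterion above a confidence threshold."""
    return [s for s in segments if s.label == label and s.confidence >= min_conf]

goals = find(segments, "goal")
print([(s.start_s, s.end_s) for s in goals])  # -> [(12.0, 18.5), (102.4, 109.9)]
```

Once video is in this form, indexing highlights or purging archival dead time becomes ordinary data processing rather than a review task.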

The implications are significant for industries that rely heavily on video. Media companies can transform archival footage into searchable assets, sports broadcasters can automatically index plays and highlights, and enterprises can eliminate manual tagging workflows that often consume thousands of hours annually. TwelveLabs claims that early benchmarks show Pegasus 1.5 outperforming competing models, including Google Gemini 2.5 Pro, in segmentation quality.

While the model itself represents a technical leap, TwelveLabs is also focusing on usability. The introduction of Rodeo, its first application-layer product, brings AI agents directly into the creative workflow. Designed as a co-pilot for video production, Rodeo allows users to search, edit, and assemble footage using natural language commands.

This approach reflects a broader trend in AI: moving from tools that assist with tasks to systems that actively participate in workflows. With Rodeo, AI agents can surface relevant clips, suggest edits, and help assemble sequences, reducing the time required to produce content from hours or days to minutes.
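As a toy illustration of the natural-language retrieval step in a Rodeo-style workflow, the snippet below ranks clips against a free-text query. Real systems use learned video-text embeddings; a simple word-overlap score stands in here, and the clip names and descriptions are invented for the example.

```python
# Toy clip library: filename -> text description of the footage.
# In a real system the descriptions would be model-generated embeddings,
# not hand-written strings.
clips = {
    "clip_01.mp4": "drone shot of a city skyline at sunset",
    "clip_02.mp4": "close-up interview with the lead engineer",
    "clip_03.mp4": "crowd celebrating a goal in the stadium",
}

def search(query: str, library: dict) -> list:
    """Rank clips by how many query words their description shares."""
    terms = set(query.lower().split())
    scored = [
        (len(terms & set(desc.lower().split())), name)
        for name, desc in library.items()
    ]
    # Highest overlap first; drop clips with no matching words at all.
    return [name for score, name in sorted(scored, reverse=True) if score > 0]

print(search("city skyline sunset", clips))  # -> ['clip_01.mp4']
```

The point is the interaction model: an editor types intent in plain language and gets back candidate clips, rather than scrubbing timelines by hand.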

The company’s ecosystem strategy further extends its reach. Through a partnership with Autodesk, TwelveLabs’ video intelligence capabilities are now embedded into Autodesk Flow Capture, a platform used in film and television production. This integration introduces features such as Smart Search and Smart Actions, enabling production teams to locate specific moments within footage and automate media organization.

For creative industries, this integration addresses a longstanding challenge: the fragmentation of production and post-production workflows. By embedding AI directly into tools already used by professionals, TwelveLabs reduces the need for additional systems and simplifies adoption.

The broader significance of these announcements lies in how they reposition video within enterprise data strategies. Historically, video has been treated as unstructured data—valuable but difficult to analyze at scale. By enabling structured extraction and real-time interaction, platforms like TwelveLabs are effectively turning video into a first-class data source.

This shift aligns with wider trends across the AI and cloud ecosystem. Companies such as Microsoft, Google, and Amazon are investing heavily in multimodal AI, where systems can process text, images, and video simultaneously. Video intelligence is emerging as a key component of this evolution.

From a market perspective, the demand for video analytics is growing rapidly. According to Gartner, multimodal AI is expected to become a core capability for enterprise platforms, while IDC highlights the increasing importance of unstructured data in digital transformation initiatives.

TwelveLabs’ strategy reflects these dynamics. By combining advanced models with application-layer tools and ecosystem integrations, the company is positioning itself as a comprehensive solution for video intelligence. This approach contrasts with competitors that focus primarily on either infrastructure or end-user applications.

For enterprises, the value proposition is straightforward. Faster access to video insights can improve decision-making, reduce operational costs, and unlock new use cases—from content monetization to compliance monitoring. For creators, the ability to interact with video through natural language could fundamentally change how content is produced and edited.

However, challenges remain. Scaling video intelligence requires significant computational resources, and ensuring accuracy across diverse content types is complex. There are also questions around data privacy and governance, particularly when dealing with sensitive or proprietary footage.

Even so, the direction is clear. As video continues to dominate digital content, the ability to analyze and act on it efficiently will become a competitive differentiator. TwelveLabs’ latest announcements suggest that the industry is moving closer to that reality.

Market Landscape

The video intelligence market is evolving alongside advances in multimodal AI. Gartner identifies multimodal systems as a key trend shaping enterprise AI adoption, while IDC emphasizes the growing role of unstructured data, including video, in analytics and automation.

Major technology providers such as Google, Microsoft, and Amazon are expanding capabilities in video and AI, increasing competition in this space. TwelveLabs’ full-stack approach positions it within a rapidly emerging category focused on operationalizing video data at scale.

Top Insights

  • TwelveLabs launches Pegasus 1.5, introducing time-based metadata extraction that enables structured, searchable video data without manual tagging or re-indexing workflows.
  • Rodeo brings AI agents into video production, allowing creators to search, edit, and assemble footage using natural language, significantly reducing production time.
  • Integration with Autodesk Flow Capture embeds video intelligence into professional workflows, improving collaboration and efficiency in media production environments.
  • The shift toward full-stack video intelligence platforms reflects growing demand for multimodal AI solutions capable of transforming unstructured video into actionable insights.
