Pegasus 1.5 by TwelveLabs is a video intelligence platform that enables machines to understand, search, and analyze video content across vision, audio, and language.
Stork Quadrant
Replaceable as a UI, but kept alive as the API the agents call.
“TwelveLabs built a capable multimodal video understanding API before the frontier labs caught up. That window is closing. GPT-4o, Gemini 1.5 Pro, and Claude already handle video natively, and they're getting faster and cheaper. There's no proprietary data, no network, no regulatory gate — just a specialized model that bigger players will commoditize.”
An LLM alone could replace
Score history · +5 pts over 3 re-scores
Go vertical and own the liability: pick one industry where wrong video analysis has real consequences — insurance claims, legal evidence, broadcast compliance — and become the vendor that signs the contract and bears the risk. That's the only move that creates a moat here.
Similar Tools
Other tools you might consider
Mixpeek
Mixpeek is a multimodal data warehouse that decomposes video, images, and audio into searchable features and reassembles them through multi-stage retrieval pipelines.
Azure AI Video Indexer
Azure AI Video Indexer is a cloud and edge service that automatically extracts deep insights from video and audio content, integrated within the Microsoft Azure ecosystem.
Moments Lab
Moments Lab is an AI-powered video discovery platform that indexes visuals, audio, and metadata to help organizations find, repurpose, share, and monetize video content.
Memories.ai
Memories.ai offers advanced AI video understanding technology that effortlessly analyzes every frame to detect objects, interpret context, recognize emotions, and extract meaningful insights.
overview
Pegasus 1.5 is a video-first language model from TwelveLabs that lets developers and enterprises turn raw video into structured, queryable data. It analyzes visual, audio, and speech information together to generate contextually relevant text and structured metadata. Generally available since April 20, 2026, Pegasus 1.5 specializes in Time Based Metadata (TBM) extraction: users define a custom JSON schema and receive timestamped, structured metadata for videos up to two hours long in a single API call, with no prior indexing or preprocessing. As of May 6, 2026, it supports synchronous analysis of video supplied as a URL, an asset, or a base64 string. An update on May 28, 2026 increased the context window to 261,120 tokens and the maximum response length to 98,304 tokens.
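To make the schema-driven TBM idea concrete, here is a minimal sketch of what defining a segment schema and checking returned metadata against it might look like on the client side. Everything named here is an assumption for illustration: the field names (`start_sec`, `end_sec`, `label`), the `validate_segment` helper, and the sample response are hypothetical and are not taken from the TwelveLabs API. Only the two-hour video limit comes from the overview above.

```python
import json

MAX_VIDEO_SECONDS = 2 * 60 * 60  # two-hour limit stated in the overview

# Hypothetical schema: field names are illustrative, not the TwelveLabs API.
SEGMENT_SCHEMA = {
    "type": "object",
    "properties": {
        "start_sec": {"type": "number"},
        "end_sec": {"type": "number"},
        "label": {"type": "string"},
    },
    "required": ["start_sec", "end_sec", "label"],
}

# Maps the schema's type names to the Python types accepted for them.
PYTHON_TYPES = {"number": (int, float), "string": str}


def validate_segment(segment: dict, schema: dict = SEGMENT_SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the segment passes."""
    problems = []
    for name in schema["required"]:
        if name not in segment:
            problems.append(f"missing field: {name}")
    for name, rule in schema["properties"].items():
        if name in segment and not isinstance(segment[name], PYTHON_TYPES[rule["type"]]):
            problems.append(f"wrong type for {name}")
    # Only check timestamp ordering once the fields exist with the right types.
    if not problems:
        if not 0 <= segment["start_sec"] < segment["end_sec"] <= MAX_VIDEO_SECONDS:
            problems.append("timestamps out of range or out of order")
    return problems


# Hypothetical response body, standing in for what a TBM call might return.
sample_response = json.loads("""
[
  {"start_sec": 0, "end_sec": 42.5, "label": "opening titles"},
  {"start_sec": 42.5, "end_sec": 30, "label": "interview"},
  {"start_sec": 90, "label": "b-roll"}
]
""")

results = [validate_segment(segment) for segment in sample_response]
for segment, problems in zip(sample_response, results):
    print(segment.get("label"), "->", problems or "ok")
```

Validating on the client side is a design choice rather than a requirement: it catches missing fields or reversed timestamps before the metadata reaches a downstream index or database.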
quick facts
| Attribute | Value |
|---|---|
| Developer | TwelveLabs |
| Business Model | Freemium with usage-based tiers |
| Pricing | Free tier; paid usage from $0.001 per 1k input text tokens and $0.007 per 1k output text tokens |
| Platforms | API |
| API Available | Yes |
| Founded | 2020 |
| HQ | San Francisco, USA |
| Funding | Series A |
features
Pegasus 1.5 by TwelveLabs offers video understanding and structured data extraction across vision, audio, and language. Its capabilities are available through an API and SDK for integration into enterprise workflows.
use cases
Pegasus 1.5 by TwelveLabs is aimed at users who need to automate video analysis, improve content management, and extract insights from video data at scale, in both technical and content-focused roles.
pricing
TwelveLabs offers Pegasus 1.5 on a freemium model with Free, Developer, and Enterprise tiers and usage-based rates for API consumption. Rate limits apply on every plan and vary by usage type: duration-based for video and audio, token-based for text, and request-based for endpoints.
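As a rough illustration of the usage-based rates, the sketch below estimates text-token cost from the listed prices of $0.001 per 1,000 input text tokens and $0.007 per 1,000 output text tokens. The function name `estimate_text_cost` is hypothetical, and the calculation covers text tokens only: any duration-based video or audio charges, tier discounts, or free-tier allowances are not modeled.

```python
INPUT_USD_PER_1K = 0.001   # listed rate per 1,000 input text tokens
OUTPUT_USD_PER_1K = 0.007  # listed rate per 1,000 output text tokens


def estimate_text_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate text-token cost in USD; other charges are not included."""
    input_cost = input_tokens / 1000 * INPUT_USD_PER_1K
    output_cost = output_tokens / 1000 * OUTPUT_USD_PER_1K
    return input_cost + output_cost


# A modest request: 10,000 input tokens and 2,000 output tokens.
print(f"${estimate_text_cost(10_000, 2_000):.3f}")

# Upper bound for one call at the stated limits:
# a 261,120-token context and a 98,304-token response.
print(f"${estimate_text_cost(261_120, 98_304):.3f}")
```

At these rates, output tokens cost seven times as much as input tokens, so long structured responses are likely to dominate the text-token portion of a bill.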
competitors
Pegasus 1.5 by TwelveLabs is positioned as a specialized video reasoning model, designed to outperform general-purpose models on segmentation quality and structured output reliability, two areas that matter in production video workflows. It competes with several platforms offering video intelligence and multimodal analysis.
Mixpeek is a multimodal data warehouse that decomposes video, images, and audio into searchable features and reassembles them through multi-stage retrieval pipelines.
Similar to TwelveLabs, Mixpeek offers a full-stack video intelligence platform with composable pipelines for various extractors (vision, audio, OCR, face), providing retrieval-ready output. It also offers a freemium model with 1,000 free credits, aligning with TwelveLabs' freemium offering.
Azure AI Video Indexer is a cloud and edge service that automatically extracts deep insights from video and audio content, integrated within the Microsoft Azure ecosystem.
Azure AI Video Indexer provides similar multimodal analysis capabilities (object detection, OCR, transcription, sentiment analysis) but is a service within the broader Azure ecosystem, potentially appealing to organizations already using Azure, whereas TwelveLabs is a specialized platform. It offers a free trial with up to 2,400 minutes of free indexing.
Moments Lab is an AI-powered video discovery platform that indexes visuals, audio, and metadata to help organizations find, repurpose, share, and monetize video content.
Moments Lab directly competes in the enterprise video discovery and content monetization space, offering similar multimodal indexing and search capabilities to TwelveLabs, but with a stronger emphasis on repurposing and monetizing content. Pricing is likely enterprise-focused, requiring a demo.
Memories.ai offers advanced AI video understanding technology that effortlessly analyzes every frame to detect objects, interpret context, recognize emotions, and extract meaningful insights.
Memories.ai provides AI video understanding for insights, content tagging, and scene analysis, directly overlapping with TwelveLabs' core offering of understanding video across vision, audio, and language. Its pricing appears to be trial-based or freemium.
Pegasus 1.5 by TwelveLabs is available on a freemium model: a Free tier includes basic limits at no cost, and paid Developer and Enterprise plans add usage-based pricing for API consumption, such as $0.001 per 1,000 input text tokens and $0.007 per 1,000 output text tokens.
Key features include searching, analyzing, and understanding video across vision, audio, and language; transforming video into Time Based Metadata (TBM) using custom JSON schemas; ingesting multimodal data through a single pipeline; indexing an hour of video in approximately one minute; and offering an API and SDK for integration. It also supports synchronous analysis and multimodal prompting with reference images.
Pegasus 1.5 by TwelveLabs is intended for developers, enterprises, media companies, sports organizations, marketing and advertising agencies, security operators, and government agencies. It is particularly useful for those needing to automate video analysis, extract structured insights, and enhance content management workflows.
Pegasus 1.5 by TwelveLabs is a specialized video reasoning model that outperforms general-purpose models like Gemini 3 Pro/3.1 Pro in segmentation quality and structured output reliability. Compared to competitors like Mixpeek, Azure AI Video Indexer, Moments Lab, and Memories.ai, Pegasus 1.5 focuses on its video-first language model for deep understanding and structured metadata extraction, offering a specialized solution for transforming raw video into queryable data.