An MCP server that lets Claude, GPT, and Gemini edit videos via tool calls. 17 tools. YAML projects. Apple Silicon native. Think CapCut for AI.
No timeline dragging. No manual edits. Just tell your AI what you want.
Your AI agent (Claude, GPT, Gemini) receives a natural language brief describing the video you want. It plans the edit using Vidya's 17 MCP tools.
Vidya ingests your clips, transcribes audio with Whisper, detects scenes, and builds a .vidya YAML project the AI can read, reason about, and modify.
The agent assembles the timeline, adds captions, transitions, and music. A self-critique loop reviews the edit. Then Vidya renders the final video.
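The flow above revolves around the `.vidya` project file. A minimal sketch of what such a file might look like is below; the schema and field names are illustrative assumptions, not the actual format:

```yaml
# Illustrative sketch -- field names are assumptions, not the real schema
name: product-demo
clips:
  - id: intro
    src: footage/intro.mov
  - id: demo
    src: footage/screen.mp4
timeline:
  - clip: intro
    trim: [0.0, 4.5]
    transition: dissolve      # one of the built-in GPU transitions
  - clip: demo
captions:
  preset: tiktok_bold         # auto-synced via Whisper timestamps
export:
  preset: youtube_shorts      # 9:16, encoding handled
```

Because it is plain YAML, the agent can diff, review, and patch the edit like any other text file.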
Built for machines, readable by humans. Every feature designed for agent workflows.
17 tools exposed via the Model Context Protocol. Any MCP-compatible agent (Claude, GPT, Gemini) can call them directly. No custom API, no SDK, no glue code.
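Under MCP, a tool invocation is a standard JSON-RPC `tools/call` request. The tool name and arguments below are hypothetical, shown only to illustrate the shape of a call any MCP client can make:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "add_clip",
    "arguments": { "src": "footage/intro.mov", "trim": [0.0, 4.5] }
  }
}
```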
Every edit lives in a .vidya YAML file the AI can read, reason about, and modify. Version-controlled, diffable, and human-readable. No opaque binary formats.
Hardware-accelerated via Metal and VideoToolbox on M1/M2/M3/M4. Up to 2x faster than generic FFmpeg rendering. CoreML Whisper for real-time transcription.
tiktok_bold, cinematic, karaoke, minimal, news_ticker, subtitle. Auto-synced to speech via Whisper timestamps. Styled and positioned automatically.
dissolve, glitch, whip_pan, zoom_in, zoom_out, slide_left, slide_right, fade_black, cut. GPU-rendered on Metal for smooth real-time playback.
After assembly, the AI reviews its own edit: checks pacing, audio levels, caption sync, and transition coherence. Fixes issues before you even see the output.
One-click export presets for TikTok, Instagram Reels, YouTube Shorts, YouTube Long, LinkedIn, Twitter/X, and custom resolution. Aspect ratios and encoding handled.
Fully open source under the MIT license. No vendor lock-in, no black boxes. Fork it, extend it, self-host it. The core MCP server is free forever.
The MCP server is fully open source. Pay only for the native app and cloud features.
Open source MCP server
macOS app for power users
Teams + cloud rendering
All paid plans include a 14-day free trial. Cancel anytime.
Everything you need to know about Vidya.
MCP (Model Context Protocol) is an open standard by Anthropic that lets AI models call external tools. Vidya exposes 17 video editing tools via MCP, so any compatible agent (Claude, GPT, Gemini) can edit videos by calling these tools directly -- no custom integration needed.
No. The free open-source MCP server runs on any platform with FFmpeg (macOS, Linux, Windows). Apple Silicon (M1/M2/M3/M4) unlocks hardware acceleration via Metal and VideoToolbox for up to 2x faster rendering, plus on-device Whisper transcription via CoreML. These accelerated features are included in the Solo plan.
Yes. Vidya uses the MCP standard protocol, which is agent-agnostic. Any AI that supports MCP tool calls can use Vidya. This includes Claude (native MCP), GPT (via MCP bridge), and Gemini (via MCP bridge). You are not locked into any single AI provider.
The MCP server is fully open source under the MIT license and free forever. All 17 tools, all caption presets, all transitions, FFmpeg rendering, and .vidya project format are included. The paid plans add the native macOS app, Metal GPU rendering, cloud features, and team collaboration.
Vidya supports all formats FFmpeg can handle as input: MP4, MOV, MKV, AVI, WebM, and more. For export, you get optimized presets for TikTok (9:16 H.264), Instagram Reels (9:16 H.264), YouTube Shorts (9:16), YouTube Long (16:9), LinkedIn (16:9 or 1:1), Twitter/X (16:9), and custom resolutions. ProRes export is available in the Studio plan.
Clone the repo, configure your MCP client, and let your AI agent handle the rest. Open source. Free forever.
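For a client like Claude Desktop, configuration is typically one entry in `claude_desktop_config.json`; the `vidya-mcp` command below is a placeholder for wherever you installed the server:

```json
{
  "mcpServers": {
    "vidya": {
      "command": "vidya-mcp",
      "args": []
    }
  }
}
```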
Star on GitHub