claude-video-vision adds adaptive video frame extraction and audio transcription to Claude Code, so that natural-language queries about a video drive dynamic ffmpeg processing. It supports multiple audio backends and runs on Node.js 20+.
claude-shorts extracts viral-ready vertical clips from long videos using AI scoring, GPU transcription, and adaptive video reframing. It refines cuts with audio-aware snapping and applies platform-specific encoding.
OmniStream uses a multi-frame transformer to process continuous video streams with patch-level temporal indexing, supporting downstream vision-language-action tasks.