Captions
Freemium
AI-powered video editing app: automatic captions, eye contact correction, AI avatars, and professional editing from your phone
What is Captions?
Captions is an AI-powered video creation and editing app built primarily for creators shooting talking-head content on their phone. It auto-generates animated captions in 100+ languages, corrects your gaze so you appear to look directly at the camera, dubs videos into 28+ languages with matching lip-sync, removes filler words and awkward pauses, and can even generate entire videos with AI "Twins" or synthetic actors — so you never have to film yourself at all.
The app first went viral on the strength of two features: near-flawless automatic subtitles and AI Eye Contact, which digitally adjusts your eyes so you look at the lens even when you were reading off a teleprompter or script. Both features are now table stakes for short-form social content — TikTok, Instagram Reels, and YouTube Shorts creators use them every day to lift watch time and comprehension. Captions has been downloaded well into the tens of millions on iOS and Android, and the company (originally Captions, now operating as Mirage) raised $75M from General Catalyst in 2025 to push further into AI video generation.
In 2026, Captions has grown from a caption utility into a full AI video platform. The iOS and Android apps remain the flagship experience, but Captions also offers a web app, AI Edit (command-based editing where you type what you want and the app executes it), an AI Script and AI B-Roll pipeline, AI Dubbing with lip-sync correction, and creator-focused features like a teleprompter with gaze correction. Compared to long-form editors like Descript, Captions is narrower and faster: it is designed to take you from raw selfie video to publish-ready social clip in minutes.
⚡ Quick Verdict
Best for: Mobile-first creators who need professional captions and eye contact correction for talking-head videos
Not for: Professional video editors who need full timeline editing or teams producing long-form documentary content
Standout: AI eye contact correction plus automatic captions, two killer features no competitor matches together
Limitation: Primarily a mobile app; the desktop experience is secondary
Bottom line: Captions scores 4.3/5. It is best suited to social media creators, educators, and marketers who produce talking-head video content and want automatic captions, eye contact correction, and quick editing from their phone.
Captions Pricing
Free: Basic editing, automatic captions, and access to a limited pool of lifetime credits (typically around 60–200 starter credits) for trying out AI Edit, AI Dubbing, eye contact, and other AI features. Exports include a Captions watermark. Good for testing the product and casual use.
Pro — $9.99/month: 200 monthly credits that renew each billing cycle, watermark-free exports, unlimited projects, and full access to AI features including AI Eye Contact, AI Dubbing with lip-sync, AI Edit, AI Twins, Background Noise Remover, and teleprompter. This is the plan most solo creators land on.
Scale — $69.99/month: 1,400 monthly credits for creators who ship high-volume content or agencies producing videos for multiple clients, plus everything in Pro. Captions also publishes higher-tier plans up to around $279.99/month for very heavy AI generation usage, along with an Enterprise tier for teams and brands. See the official Captions pricing page for the latest tier list and credit allocations.
Note: Credits are consumed by AI-powered actions (eye contact, dubbing, AI Edit, Twins). Basic editing and captions on already-recorded videos generally do not consume credits.
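To compare the paid tiers on equal footing, it can help to work out the effective price of a single credit. The sketch below uses only the Pro and Scale figures quoted above; the `PLANS` table and the `cost_per_credit` helper are illustrative names, not anything Captions publishes, and the numbers should be rechecked against the official pricing page.

```python
# Monthly price (USD) and monthly credits, as quoted in this pricing section.
PLANS = {
    "Pro": (9.99, 200),
    "Scale": (69.99, 1400),
}


def cost_per_credit(price_usd: float, credits: int) -> float:
    """Return the effective price of one credit in US dollars."""
    if credits <= 0:
        raise ValueError("credits must be positive")
    return price_usd / credits


for name, (price, credits) in PLANS.items():
    # Format to four decimals so small differences between tiers stay visible.
    print(f"{name}: ${cost_per_credit(price, credits):.4f} per credit")
```

On these listed prices both tiers come out at roughly five cents per credit, so Scale buys more credits rather than cheaper ones.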
Key Features
- Automatic captions in 100+ languages: Captions listens to your audio and generates accurate animated subtitles with word-by-word styling, customizable fonts, colors, emoji, and positioning presets.
- AI Eye Contact correction: The signature feature. Digitally adjusts your gaze so you appear to look directly into the camera even if you were reading from a script or glancing off-screen, while keeping the result natural-looking.
- AI Dubbing with lip-sync: Translate your video into 28+ languages, generate a new voice track in your own cloned voice, and automatically tweak lip movements to match the new language — ideal for creators expanding internationally.
- AI Edit (command-based editing): Type what you want the app to do ("remove filler words", "cut the long pause at 0:14", "add broll of a city skyline") and Captions performs the edit automatically.
- AI Twins and AI Actors: Create a video avatar of yourself from a short recording, or generate videos from a cast of synthetic actors so you can publish without being on camera.
- AI Script and AI B-Roll: Generate a video script from a prompt, then let Captions suggest or fetch B-roll and stock footage that matches the narrative.
- Teleprompter with gaze correction: Built-in teleprompter that scrolls while you read, combined with eye contact correction so the finished clip still looks like you're talking straight to camera.
- Background noise remover: One-tap audio cleanup that strips background noise, hum, and room echo to make home-recorded audio sound closer to studio quality.
- Background replacement: Remove or change the background of your video for branded looks or green-screen effects without needing an actual green screen.
- Platform-ready exports: Vertical-first aspect ratios and templates tuned for TikTok, Instagram Reels, YouTube Shorts, and LinkedIn.
Best For
Short-form social creators: If you publish talking-head clips to TikTok, Reels, or Shorts, Captions gives you animated subtitles, eye contact correction, and fast editing from your phone. It's the fastest path from raw selfie video to publish-ready clip.
Founders, coaches, and creators who hate being on camera: AI Twins and AI Actors let you generate videos without filming yourself. Combined with AI Script and AI Dubbing, one person can ship multiple videos a week in multiple languages.
Global creators and marketers: AI Dubbing with lip-sync in 28+ languages is genuinely useful for reaching non-English audiences without re-filming. Captions handles voice cloning and mouth-shape correction in a single workflow.
Educators and corporate trainers: Teleprompter plus eye contact correction is ideal for course recordings, internal training videos, and explainer content where you need to look authoritative while still reading from a script.
Pros & Cons
Pros
- AI Eye Contact is uniquely good — no other mainstream app matches it
- Automatic captions are accurate, fast, and stylish out of the box
- Mobile-first workflow — record, edit, and publish without ever touching a laptop
- AI Dubbing with lip-sync in 28+ languages opens up international audiences
- AI Edit lets you edit by typing commands instead of dragging clips
- Generous set of features at the $9.99 Pro tier
- Optimized exports for every major short-form platform
- Active development — new AI features shipped regularly in 2025–2026
Cons
- Credit-based system can feel restrictive if you generate a lot of AI video
- Higher tiers ($69.99+) get expensive fast for heavy users
- Desktop and web experience is still catching up to the mobile apps
- AI Twins and synthetic actors can occasionally feel uncanny
- Not a replacement for timeline editors like Premiere, Final Cut, or Descript for long-form work
- Free tier uses a lifetime credit pool, not a monthly allowance
- Some advanced features are iOS-first and roll out to Android later
- Eye contact correction quality can drop at extreme head angles
FAQ
What exactly does the Captions app do?
Captions is an AI video app for people who shoot talking-head content. Its core job is to take raw selfie video and make it look professional in minutes: animated captions in 100+ languages, AI Eye Contact correction so you look at the lens, filler-word removal, background noise cleanup, and export presets for TikTok, Reels, and Shorts. Beyond editing, Captions also does AI Dubbing with lip-sync in 28+ languages, AI Edit (command-based editing where you type what to do), and AI Twins / AI Actors that generate videos of you or synthetic presenters without you ever filming.
Is Captions free to use?
There is a free tier with basic editing and automatic captions, plus a small pool of lifetime credits you can use on AI features like eye contact or dubbing. Exports include a watermark and the credit pool does not refill. Pro is $9.99/month and includes 200 monthly credits, no watermark, and full access to AI features. Scale is $69.99/month with 1,400 monthly credits for high-volume creators, and higher tiers up to around $279.99/month exist for heavy AI generation usage. There's also an Enterprise option for teams. Always check the official pricing page for the latest tier structure.
How good is AI Eye Contact really?
For most talking-head footage shot roughly facing the camera, it's excellent — the corrected eyes look natural, the transition is smooth, and viewers typically can't tell. The effect is especially noticeable when you combine it with the teleprompter: you can literally read a script and still look like you're locked on the lens. Quality drops at extreme angles, when the subject turns their head sharply, or in low-light footage with blurry eye detail. For standard vertical phone video shot face-on, it's genuinely one of the best AI eye contact implementations available.
Does Captions work for multilingual content?
Yes — this is one of its strongest use cases. Automatic caption generation supports 100+ languages, and AI Dubbing can translate and revoice your video in 28+ languages using a cloned version of your voice. On top of the voice swap, Captions adjusts your lip movements to match the new language so the finished clip looks like it was filmed in that language. That makes Captions a strong tool for creators repurposing one recording across English, Spanish, Portuguese, French, German, and more.
Captions vs Descript vs Opus Clip — which should I pick?
Captions is the best pick for mobile-first talking-head creators who want eye contact correction, fast captions, and AI dubbing. Descript is stronger for long-form podcasts, screen recordings, and multi-track editing where you treat video like a text document. Opus Clip is purpose-built for chopping long YouTube or webinar videos into short clips automatically. Pick Captions for phone-first short-form, Descript for long-form desktop work, and Opus Clip for automated clip generation from existing long videos.
What are credits and how quickly do they run out?
Credits are the metering system for AI-powered actions like eye contact correction, dubbing, AI Edit, and AI Twin generation. Basic editing and captioning of already-recorded videos generally don't consume credits. On Pro, 200 monthly credits are enough for most creators who post a few videos a week. On Scale, 1,400 monthly credits comfortably cover heavy daily posting, multi-language dubbing, and frequent AI video generation. If you're running an agency producing videos for several clients, or you lean heavily on AI Twins and dubbing, Scale or higher is the realistic tier.
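A rough way to judge whether a plan fits is to divide the monthly allowance by what a typical video costs you in credits. The sketch below assumes a flat per-video cost, which is a simplification: the per-video figure used here is a placeholder, not a published Captions rate, and real usage depends on which AI actions you run. `videos_per_month` is an illustrative helper name.

```python
def videos_per_month(monthly_credits: int, credits_per_video: int) -> int:
    """Return how many whole videos a monthly credit allowance covers."""
    if credits_per_video <= 0:
        raise ValueError("credits_per_video must be positive")
    # Floor division: a partially funded video does not count.
    return monthly_credits // credits_per_video


# ASSUMPTION: 10 credits per video is a placeholder for illustration only.
# Replace it with the credit usage you actually observe in the app.
ASSUMED_CREDITS_PER_VIDEO = 10

pro_videos = videos_per_month(200, ASSUMED_CREDITS_PER_VIDEO)     # Pro allowance
scale_videos = videos_per_month(1400, ASSUMED_CREDITS_PER_VIDEO)  # Scale allowance
print(f"Pro: {pro_videos} videos/month, Scale: {scale_videos} videos/month")
```

Under that placeholder rate, Pro would cover 20 videos a month and Scale 140. Swap in your own observed per-video cost before using the result to choose a tier.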
Is Captions safe for privacy-sensitive content?
Captions processes most AI features in the cloud, which means your footage is uploaded to Captions' infrastructure for transcription, dubbing, eye contact, and AI generation. That's standard for tools at this level of quality, but it does mean confidential footage should be treated the same way you'd treat any cloud SaaS. Check Captions' privacy policy and terms — particularly around whether your video and voice data can be used to improve their models — and avoid uploading footage that contains regulated personal information or trade secrets unless you're on a business agreement that specifies otherwise.
📋 Good to know
Getting started: Download from the App Store or Google Play. Sign up, record or upload a video, and AI processing starts automatically. A web version is available at captions.ai.
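Privacy: AI features such as transcription, dubbing, eye contact, and AI generation run in the cloud, so footage is uploaded for processing rather than staying on your phone. Uploads are encrypted, and Captions is described as SOC 2 compliant; confirm the current status in its security documentation before relying on it.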
When to upgrade: Move to Pro ($9.99/mo) as soon as you want watermark-free exports and full AI feature access.
Learning curve: Very low. Record → AI processes → export. The app handles most of the work automatically, and most users are productive within minutes.