Open Source CLI Tool

One voice for all your demos

Anyone can record a demo. DemoVoice replaces the narration with a consistent AI voice, so all your videos sound like they were made together.

$ go install ...demovoice@latest

$ demovoice render demo.mp4 --output demo.demovoice.mp4

Validating ffmpeg and ffprobe...

Extracting audio from source video...

Transcribing narration with OpenAI Whisper...

Splitting into 12 timed segments...

Synthesizing voice with gpt-4o-mini-tts...

Adjusting timing to match original...

Muxing audio with video stream...

Done! Output saved to demo.demovoice.mp4

The Problem

Demo videos are great docs. But they're a mess to produce.

10 people recording demos

10 different voices, no consistency

Developers uncomfortable on camera

They just don't record, so you get no video

Great content, inconsistent delivery

Videos feel disjointed, not like a cohesive brand

The Solution

Let anyone record. DemoVoice handles the voice.

  • Record your demo with any voice, or no voice at all
  • DemoVoice transcribes and re-records with a consistent AI voice
  • Every video sounds like it belongs to the same brand

One voice.
Every demo.

Built for software demo creators

DemoVoice focuses on one thing: re-voicing demo videos while keeping the new narration aligned with the original timing.

Timing Preserved

AI-generated speech is automatically stretched or compressed to fit the exact timing windows of your original narration.
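As a rough illustration of the stretch-or-compress step, the sketch below computes a speed factor from the generated and target durations and turns it into an ffmpeg `atempo` filter chain. This assumes ffmpeg's `atempo` filter is used for the adjustment, which the page does not confirm; `tempoFactor` and `atempoChain` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// tempoFactor returns the speed factor that makes a clip lasting
// `generated` seconds fit a window lasting `target` seconds.
// Above 1 the clip is sped up (compressed); below 1 it is slowed (stretched).
func tempoFactor(generated, target float64) (float64, error) {
	if generated <= 0 || target <= 0 {
		return 0, fmt.Errorf("durations must be positive: generated=%v target=%v", generated, target)
	}
	return generated / target, nil
}

// atempoChain expresses factor as a comma-separated chain of atempo filters.
// Each stage is kept within 0.5 to 2.0, a conservative range that also works
// on older ffmpeg builds. (Assumption: DemoVoice may do this differently.)
func atempoChain(factor float64) (string, error) {
	if factor <= 0 {
		return "", fmt.Errorf("factor must be positive, got %v", factor)
	}
	var parts []string
	for factor > 2.0 {
		parts = append(parts, "atempo=2.0")
		factor /= 2.0
	}
	for factor < 0.5 {
		parts = append(parts, "atempo=0.5")
		factor /= 0.5
	}
	parts = append(parts, fmt.Sprintf("atempo=%.4f", factor))
	return strings.Join(parts, ","), nil
}
```

For example, a 6-second clip that must fit a 4-second window gives a factor of 1.5, and the resulting string could be passed to ffmpeg's `-filter:a` option.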

AI Text-to-Speech

Leverage OpenAI's gpt-4o-mini-tts with multiple voice options to create natural-sounding narration for your demos.

Highly Configurable

Fine-tune pace, emotion, timing tolerances, and segment boundaries through YAML configuration files.
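To give a feel for what such settings might look like once loaded, here is a hedged Go sketch of a config struct with defaults and validation. Every field name, key, and default value is hypothetical; DemoVoice's real YAML schema is not shown on this page. The `mapstructure` tags are the convention Viper uses when unmarshalling, included only as an illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical shape for the settings named above
// (pace, emotion, timing tolerances, segment boundaries).
type Config struct {
	Voice           string  `mapstructure:"voice"`             // TTS voice name
	Pace            float64 `mapstructure:"pace"`              // 1.0 = normal speed
	Emotion         string  `mapstructure:"emotion"`           // free-form style hint
	TimingTolerance float64 `mapstructure:"timing_tolerance"`  // allowed drift, as a fraction of the window
	MinPauseSeconds float64 `mapstructure:"min_pause_seconds"` // pause length that starts a new segment
}

// DefaultConfig returns illustrative defaults, not DemoVoice's actual ones.
func DefaultConfig() Config {
	return Config{
		Voice:           "alloy",
		Pace:            1.0,
		Emotion:         "neutral",
		TimingTolerance: 0.15,
		MinPauseSeconds: 0.5,
	}
}

// Validate rejects values that would make the later pipeline steps meaningless.
func (c Config) Validate() error {
	if c.Voice == "" {
		return errors.New("voice must not be empty")
	}
	if c.Pace <= 0 {
		return fmt.Errorf("pace must be positive, got %v", c.Pace)
	}
	if c.TimingTolerance < 0 || c.TimingTolerance >= 1 {
		return fmt.Errorf("timing_tolerance must be in [0, 1), got %v", c.TimingTolerance)
	}
	if c.MinPauseSeconds < 0 {
		return fmt.Errorf("min_pause_seconds must not be negative, got %v", c.MinPauseSeconds)
	}
	return nil
}
```

Validating early keeps a typo in a config file from surfacing halfway through a long render.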

CLI-First Design

Built with Go, Cobra, and Viper for fast, scriptable workflows that integrate into your existing toolchain.

Smart Retries

When generated speech doesn't fit the timing window, DemoVoice automatically rewrites the segment text and regenerates the audio until it fits.

BYOK Security

Bring your own API keys via environment variables. Provider secrets are never stored in config files.
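A minimal sketch of the bring-your-own-key pattern in Go follows. The variable name `OPENAI_API_KEY` is an assumption (it is the name OpenAI's own SDKs read by default); the page does not say which variable DemoVoice expects. `loadAPIKey` and `redact` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// apiKeyEnvVar is an assumed variable name, not confirmed by DemoVoice.
const apiKeyEnvVar = "OPENAI_API_KEY"

// loadAPIKey reads a key through the given lookup function, so the key
// lives only in the process environment and never in a config file.
func loadAPIKey(name string, lookup func(string) (string, bool)) (string, error) {
	key, ok := lookup(name)
	key = strings.TrimSpace(key)
	if !ok || key == "" {
		return "", fmt.Errorf("environment variable %s is not set", name)
	}
	return key, nil
}

// apiKeyFromEnv is the production entry point, backed by os.LookupEnv.
func apiKeyFromEnv() (string, error) {
	return loadAPIKey(apiKeyEnvVar, os.LookupEnv)
}

// redact returns a form of the key that is safe to print in logs.
func redact(key string) string {
	if len(key) <= 4 {
		return "****"
	}
	return "****" + key[len(key)-4:]
}
```

Passing the lookup function in makes the helper easy to test without touching the real environment.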

How it works

A streamlined pipeline for timing-preserved voice replacement.

01. Extract & Transcribe

DemoVoice extracts the audio track from your video and transcribes it using OpenAI Whisper, capturing the exact timing of each phrase.
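The extraction step could look roughly like the sketch below, which checks that ffmpeg and ffprobe are on the PATH and then pulls the audio track out as a mono 16 kHz WAV file. The ffmpeg flags are standard, but the chosen sample rate, codec, and the function names are assumptions rather than DemoVoice's actual behaviour.

```go
package main

import (
	"fmt"
	"os/exec"
)

// requireTools returns an error naming the first tool missing from PATH.
func requireTools(names ...string) error {
	for _, n := range names {
		if _, err := exec.LookPath(n); err != nil {
			return fmt.Errorf("%s not found on PATH: %w", n, err)
		}
	}
	return nil
}

// extractAudioArgs builds the ffmpeg arguments for audio extraction.
func extractAudioArgs(videoPath, wavPath string) []string {
	return []string{
		"-y",            // overwrite the output file if it exists
		"-i", videoPath, // input video
		"-vn",                  // drop the video stream
		"-acodec", "pcm_s16le", // 16-bit PCM WAV
		"-ar", "16000", // 16 kHz sample rate (assumed)
		"-ac", "1", // mono
		wavPath,
	}
}

// extractAudio runs ffmpeg and includes its output in any error.
func extractAudio(videoPath, wavPath string) error {
	if err := requireTools("ffmpeg", "ffprobe"); err != nil {
		return err
	}
	out, err := exec.Command("ffmpeg", extractAudioArgs(videoPath, wavPath)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("ffmpeg failed: %w: %s", err, out)
	}
	return nil
}
```

Keeping argument construction separate from execution makes the command easy to inspect or log before it runs.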

02. Segment & Synthesize

The transcription is split into timed segments. Each segment is synthesized using AI text-to-speech with your chosen voice and settings.
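One plausible way to split a transcript into timed segments is to start a new segment wherever the speaker pauses. The sketch below assumes word-level timestamps are available from the transcription step and uses a hypothetical `minPause` threshold; DemoVoice's actual segmentation rules are not described on this page.

```go
package main

import "strings"

// Word is one transcribed word with its start and end time in seconds.
type Word struct {
	Text       string
	Start, End float64
}

// Segment is a run of words that will be synthesized as one clip.
type Segment struct {
	Text       string
	Start, End float64
}

// splitOnPauses groups words into segments, closing a segment whenever
// the gap before the next word is at least minPause seconds.
func splitOnPauses(words []Word, minPause float64) []Segment {
	var segs []Segment
	var cur []string
	var start float64
	for i, w := range words {
		if len(cur) == 0 {
			start = w.Start // first word of a new segment
		}
		cur = append(cur, w.Text)
		last := i == len(words)-1
		if last || words[i+1].Start-w.End >= minPause {
			segs = append(segs, Segment{
				Text:  strings.Join(cur, " "),
				Start: start,
				End:   w.End,
			})
			cur = nil
		}
	}
	return segs
}
```

Each segment's `Start` and `End` define the timing window that the synthesized audio later has to fit.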

03. Fit & Align

Generated audio is stretched or compressed to fit the original timing window. If needed, the text is rewritten and regenerated for better fit.
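The fit-then-rewrite behaviour can be sketched as a bounded retry loop. In this hedged example the synthesizer and rewriter are passed in as functions, since the real calls to the speech and text models are not shown here; `fitSegment`, `Clip`, and the tolerance semantics are all assumptions.

```go
package main

import "errors"

// Clip is a synthesized piece of audio, reduced here to its text and length.
type Clip struct {
	Text     string
	Duration float64 // seconds
}

// Synthesizer turns text into a clip; Rewriter shortens or lengthens text
// so that its spoken form is closer to target seconds. Both are stand-ins.
type Synthesizer func(text string) (Clip, error)
type Rewriter func(text string, target float64) (string, error)

// ErrNoFit is returned when no attempt lands inside the timing window.
var ErrNoFit = errors.New("no attempt fit the timing window")

// fits reports whether duration is within tolerance (a fraction, e.g. 0.2)
// of target, i.e. close enough to be stretched or compressed cleanly.
func fits(duration, target, tolerance float64) bool {
	ratio := duration / target
	return ratio >= 1-tolerance && ratio <= 1+tolerance
}

// fitSegment synthesizes text, and rewrites and retries up to maxAttempts
// times until the clip fits the target window.
func fitSegment(text string, target, tolerance float64, maxAttempts int, synth Synthesizer, rewrite Rewriter) (Clip, error) {
	if target <= 0 || maxAttempts < 1 {
		return Clip{}, errors.New("target and maxAttempts must be positive")
	}
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		clip, err := synth(text)
		if err != nil {
			return Clip{}, err
		}
		if fits(clip.Duration, target, tolerance) {
			return clip, nil
		}
		if attempt == maxAttempts {
			break // out of attempts, do not pay for another rewrite
		}
		if text, err = rewrite(text, target); err != nil {
			return Clip{}, err
		}
	}
	return Clip{}, ErrNoFit
}
```

Capping the number of attempts bounds both latency and API cost when a segment simply cannot be made to fit.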

04. Assemble & Mux

All segments are assembled with proper silence gaps and muxed with the original video stream, creating your final re-voiced demo.
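Assembling with silence gaps amounts to walking the segments in order and padding the space between them. The sketch below builds such a timeline from clip start times and durations; the types and the `buildTimeline` function are hypothetical, and the actual audio concatenation and muxing (done with ffmpeg) are left out.

```go
package main

import "fmt"

// Placed is a fitted clip with the start time it must keep in the output.
type Placed struct {
	Start, Duration float64 // seconds
}

// Piece is one entry in the assembled audio track: either silence or
// the clip at position Index in the input slice.
type Piece struct {
	Silence  bool
	Duration float64
	Index    int // -1 for silence
}

// buildTimeline interleaves clips with the silence needed to keep each
// clip at its original start time, and pads the end up to total seconds.
// Clips must be sorted by Start and must not overlap.
func buildTimeline(clips []Placed, total float64) ([]Piece, error) {
	var pieces []Piece
	cursor := 0.0
	for i, c := range clips {
		gap := c.Start - cursor
		if gap < 0 {
			return nil, fmt.Errorf("clip %d starts at %.3fs, before the previous clip ends at %.3fs", i, c.Start, cursor)
		}
		if gap > 0 {
			pieces = append(pieces, Piece{Silence: true, Duration: gap, Index: -1})
		}
		pieces = append(pieces, Piece{Duration: c.Duration, Index: i})
		cursor = c.Start + c.Duration
	}
	if total > cursor {
		pieces = append(pieces, Piece{Silence: true, Duration: total - cursor, Index: -1})
	}
	return pieces, nil
}
```

Because every clip has already been fitted to its window, the resulting track has the same length as the original audio and can be muxed against the untouched video stream.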

Ready to get started?

Install DemoVoice in minutes and give your entire video library a consistent, professional voice.