Introduction

DemoVoice is an open-source CLI tool that re-records voice tracks in demo videos using AI text-to-speech while preserving the timing of the original narration.

Installation

Install DemoVoice via Go or build from source.

Quick Start

Re-voice your first video in under 5 minutes.

Configuration

Customize voices, timing, and provider settings.

Why DemoVoice?

Demo videos are a powerful form of documentation, but creating them at scale presents challenges:

1. Inconsistent voices: When 10 team members record demos, you get 10 different voices with varying audio quality, accents, and speaking styles.
2. Recording reluctance: Many developers are uncomfortable recording their voice, which means great demos never get made.
3. No brand cohesion: Without a consistent voice, your video library feels disjointed rather than professional.

DemoVoice solves this by letting anyone record a demo with any voice (or no voice at all), then replacing the audio track with a consistent AI voice that preserves the original timing.

How It Works

1. Extract & Transcribe: Audio is extracted and transcribed with word-level timestamps using Whisper.
2. Segment & Synthesize: Text is split into segments and synthesized with your chosen TTS voice.
3. Fit & Align: Each segment is time-stretched or compressed to match original timing.
4. Assemble & Mux: Segments are concatenated and muxed back with the original video.
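
To make the Fit & Align step concrete, here is a minimal sketch of how a tempo factor could be computed for one segment and turned into an ffmpeg filter string. It is not DemoVoice's actual implementation: the function names `tempoFactor` and `atempoChain` are hypothetical, and the use of ffmpeg's `atempo` filter is an assumption made for illustration. Each stage is kept between 0.5 and 2.0, a range the `atempo` filter accepts, with larger or smaller factors split across chained stages.

```go
package main

import (
	"fmt"
	"strings"
)

// tempoFactor returns how much faster (>1) or slower (<1) a synthesized
// segment must play to fill the time slot of the original narration.
// Hypothetical helper, not part of DemoVoice.
func tempoFactor(synthSeconds, targetSeconds float64) (float64, error) {
	if synthSeconds <= 0 || targetSeconds <= 0 {
		return 0, fmt.Errorf("durations must be positive: synth=%v target=%v",
			synthSeconds, targetSeconds)
	}
	return synthSeconds / targetSeconds, nil
}

// atempoChain expresses a tempo factor as a comma-separated chain of
// ffmpeg atempo filters, keeping every stage within [0.5, 2.0].
// Assumption: the stretch is done with ffmpeg's atempo filter.
func atempoChain(factor float64) (string, error) {
	if factor <= 0 {
		return "", fmt.Errorf("tempo factor must be positive, got %v", factor)
	}
	var stages []string
	for factor > 2.0 {
		stages = append(stages, "atempo=2.0")
		factor /= 2.0
	}
	for factor < 0.5 {
		stages = append(stages, "atempo=0.5")
		factor /= 0.5
	}
	stages = append(stages, fmt.Sprintf("atempo=%.4f", factor))
	return strings.Join(stages, ","), nil
}

func main() {
	// Hypothetical segment: TTS produced 3.0 s of audio for a 2.0 s slot.
	factor, err := tempoFactor(3.0, 2.0)
	if err != nil {
		panic(err)
	}
	chain, err := atempoChain(factor)
	if err != nil {
		panic(err)
	}
	fmt.Println(chain) // prints "atempo=1.5000"
}
```

A string like this could be passed to ffmpeg through its `-filter:a` option. A real tool would probably also cap the factor, since heavily stretched speech tends to sound unnatural.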

Requirements

Go 1.22 or later
ffmpeg and ffprobe on PATH
OpenAI API key (for Whisper transcription and TTS)
