VoxCPM2

🎨 Voice Design Mode

No reference audio required. Describe the target voice characteristics (gender, age, tone, emotion, pace …) in Control Instruction and VoxCPM2 will craft a unique voice from your description alone.

🎛️ Controllable Cloning

Upload a reference audio clip, then optionally use Control Instruction to steer emotion, pace, and style while preserving the original timbre.

🎙️ Ultimate Cloning

Enable Ultimate Cloning Mode and provide the reference audio's transcript (auto-filled via ASR). The model treats the clip as a spoken prefix and continues from it, reproducing every vocal nuance faithfully.
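
The three modes above differ only in which inputs are supplied. As a rough illustration, the choice can be sketched as a small function over the form fields. Everything here is hypothetical: `VoiceRequest`, `select_mode`, and the field names are made up for this sketch and are not part of any VoxCPM2 API. The sketch also assumes the transcript is passed in explicitly, since it has no ASR step to auto-fill it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceRequest:
    """Hypothetical container for the form fields (not a VoxCPM2 type)."""
    target_text: str
    control_instruction: str = ""
    reference_audio: Optional[str] = None  # path to an uploaded clip, if any
    reference_transcript: str = ""         # transcript of the reference clip
    ultimate_cloning: bool = False         # the "Ultimate Cloning Mode" toggle


def select_mode(req: VoiceRequest) -> str:
    """Return which of the three described modes a request falls into."""
    if not req.target_text.strip():
        raise ValueError("target_text must not be empty")

    if req.reference_audio is None:
        # No clip: the voice comes from the Control Instruction alone.
        if req.ultimate_cloning:
            raise ValueError("Ultimate Cloning needs a reference audio clip")
        return "voice_design"

    if req.ultimate_cloning:
        # Assumption: this sketch has no ASR, so the transcript must be given.
        if not req.reference_transcript.strip():
            raise ValueError("Ultimate Cloning needs the reference transcript")
        return "ultimate_cloning"

    # Clip supplied, toggle off: the Control Instruction is optional steering.
    return "controllable_cloning"
```

For example, a request with only `target_text` and `control_instruction` falls into Voice Design, while adding `reference_audio` moves it to Controllable Cloning.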

Supported reference audio formats: WAV · MP3 · M4A · FLAC · OGG

🗣️ English Accent & Dialect Tips

Write Target Text in the target dialect's vocabulary and expressions, then describe the accent in Control Instruction.

✅ Correct (Southern American): Y'all fixin' to head out? I reckon we oughta grab some sweet tea before we go, bless your heart.

❌ Wrong (neutral Standard American): Are you all about to leave? I think we should get some iced tea before we go.

Not sure how to write a regional dialect? Ask ChatGPT or Claude to rewrite your text in that accent first, then paste it in.

© 2026 WisVid  ·  Powered by VoxCPM2