A Roundup of Voice Cloning Tools - 出家如初，成佛有余

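Voice cloning has long been one of the biggest use cases for AI. The Reddit post "Stop searching for free voice cloning tools — here are the ones that actually work (2026)" recommends a number of open-source projects that can be self-hosted locally, along with free online services (some of which charge for certain features).

This post supplements and annotates that Reddit thread, and collects the open-source or free services with voice cloning support that I personally recommend.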

Open-Source Projects

Qwen3-TTS

https://github.com/QwenLM/Qwen3-TTS

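Alibaba's open-source TTS model, supporting professional-grade voice cloning, voice design, and emotion control. It is currently the undisputed king of open-source voice cloning.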

Voicebox

https://voicebox.sh/

A local-first, open-source voice cloning solution with macOS and Windows clients, built on Qwen3-TTS.

VibeVoice

https://github.com/microsoft/VibeVoice

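Microsoft's open-source speech generation framework, well suited to long-form conversations (up to 90 minutes), multi-speaker podcasts, and highly realistic speech.

Out of concern about deepfake abuse, Microsoft restricts VibeVoice voice prompts to an embedded format (see the official notes).

For voice cloning, you can use the community fine-tuned version instead.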

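VibeVoice community edition: https://github.com/vibevoice-community/VibeVoice

The official VibeVoice repository once removed part of its code over Responsible AI risks. With the community fork, voice cloning can be achieved through fine-tuning plus speaker embeddings.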
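To make the speaker-embedding idea more concrete, here is a minimal and purely illustrative sketch. It is not the API of VibeVoice or of the community fork. The function names (`speaker_embedding`, `cosine_similarity`) and the toy feature values are hypothetical. It assumes a speaker embedding can be approximated as the L2-normalized average of per-frame feature vectors, and it uses cosine similarity to check how close a cloned voice sits to the reference speaker.

```python
import math
from typing import List, Sequence

Vector = List[float]


def speaker_embedding(frames: Sequence[Sequence[float]]) -> Vector:
    """Mean-pool per-frame features, then L2-normalize.

    Toy stand-in for a real speaker encoder (assumption, not a library API).
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    mean = [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in mean]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional frame features; real encoders use far more dimensions.
reference_frames = [[0.9, 0.1, 0.0], [1.1, -0.1, 0.0]]
cloned_frames = [[1.0, 0.05, 0.0], [1.0, -0.05, 0.0]]
other_frames = [[0.0, 0.1, 1.0], [0.0, -0.1, 1.0]]

ref = speaker_embedding(reference_frames)
sim_clone = cosine_similarity(ref, speaker_embedding(cloned_frames))
sim_other = cosine_similarity(ref, speaker_embedding(other_frames))

print(f"clone vs reference: {sim_clone:.3f}")
print(f"other vs reference: {sim_other:.3f}")
```

In this toy setup the cloned voice scores close to 1 against the reference while the unrelated speaker scores close to 0. A real pipeline would replace the hand-written frames with the output of a trained speaker encoder.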

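For ease of use, the mainstream VibeVoice-based voice cloning setup is the VibeVoice community edition plus VibeVoice-ComfyUI.

Other solid open-source projects that support voice cloning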

index-tts: https://github.com/index-tts/index-tts (open-sourced by Bilibili)

GPT-SoVITS: https://github.com/RVC-Boss/GPT-SoVITS

F5-TTS: https://github.com/SWivid/F5-TTS

Fish Speech: https://github.com/fishaudio/fish-speech

CosyVoice: https://github.com/FunAudioLLM/CosyVoice

KokoClone: https://huggingface.co/PatnaikAshish/kokoclone

VoxCPM: https://github.com/OpenBMB/VoxCPM

MOSS-TTS: https://github.com/OpenMOSS/MOSS-TTS

ChatTTS: https://github.com/2noise/ChatTTS

Higgs Audio: https://github.com/boson-ai/higgs-audio

Chatterbox TTS: https://github.com/resemble-ai/chatterbox

Pocket-TTS: https://github.com/kyutai-labs/pocket-tts

Free Online Services

Twoshot: https://twoshot.app/coproducer (based on Qwen3-TTS)

NiceVoice: https://nicevoice.org/

TTSMaker: https://ttsmaker.com/

KikiVoice: https://kikivoice.ai/

Fish Audio: https://fish.audio (free quota available)

MiniMax Voice Clone: https://www.minimax.io/audio/voices-cloning (free quota available)

ElevenLabs: https://elevenlabs.io (limited free quota; the king of voice cloning and lip sync)