A multi-voice TTS system trained with an emphasis on quality. Text-to-speech. https://github.com/neonbjb/tortoise-tts

天问 ffc1736cef Update 'README.md' 1 year ago


tortoise-tts

Text-to-speech

Develop

Run the following:

conda create --name tortoise python=3.9 numba inflect
conda activate tortoise

conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
conda install transformers=4.29.2


git clone https://github.com/neonbjb/tortoise-tts.git
cd tortoise-tts
python setup.py install

These commands install PyTorch, the NVIDIA CUDA packages, and transformers, then install this project.
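
To confirm the environment is complete before running anything, a small check like the following can help. It is a minimal sketch, not part of the project: the helper name `missing_packages` and the `REQUIRED` list are illustrative, and the list assumes the usual import names (for example, the conda package `pytorch` is imported as `torch`).

```python
import importlib.util

# Assumed top-level import names for the packages installed above.
REQUIRED = ["torch", "torchaudio", "transformers", "numba", "inflect", "tortoise"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing:", missing if missing else "none")
```

Run it inside the activated `tortoise` environment; any name it reports points at an install step that did not complete.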

Run:

python tortoise/do_tts.py --text "I'm going to speak this" --voice random --preset fast

# Large text files
python tortoise/read.py --textfile <your text to be read> --voice random
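
Long texts are usually synthesized piece by piece rather than in one call. The sketch below illustrates one way to split a text at sentence boundaries; it is not the splitting logic used by `read.py`, and the function name `split_into_chunks` and the `max_len` default are assumptions made for this example.

```python
import re


def split_into_chunks(text, max_len=200):
    """Greedily pack whole sentences into chunks of at most max_len characters.

    A single sentence longer than max_len is kept whole, so a chunk can
    exceed the limit in that case.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_len:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to the synthesizer separately and the resulting audio concatenated.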

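Programmatic use:

from tortoise import api
from tortoise.utils.audio import load_audio
# clips_paths: list of paths to reference clips of the target voice
reference_clips = [load_audio(p, 22050) for p in clips_paths]
tts = api.TextToSpeech()
pcm_audio = tts.tts_with_preset("your text here", voice_samples=reference_clips, preset='fast')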

Initialize a TextToSpeech instance, then call tts.tts_with_preset to convert text to speech.
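
The call above can be wrapped into a small script that gathers the reference clips from a directory and writes the result to disk. This is a sketch under stated assumptions: `collect_clips` and `synthesize` are hypothetical helpers, and the 24 kHz output rate and the leading batch dimension on the returned tensor are assumptions that should be checked against the installed version.

```python
from pathlib import Path


def collect_clips(voice_dir):
    """Return the sorted .wav paths found directly inside voice_dir."""
    return sorted(str(p) for p in Path(voice_dir).glob("*.wav"))


def synthesize(text, voice_dir, out_path, preset="fast"):
    """Generate speech for text in the voice found in voice_dir."""
    # Imported lazily so collect_clips works without the heavy dependencies.
    import torchaudio
    from tortoise import api
    from tortoise.utils.audio import load_audio

    clips = [load_audio(p, 22050) for p in collect_clips(voice_dir)]
    tts = api.TextToSpeech()
    pcm_audio = tts.tts_with_preset(text, voice_samples=clips, preset=preset)
    # Assumption: the output is a 24 kHz tensor with a leading batch dimension.
    torchaudio.save(out_path, pcm_audio.squeeze(0).cpu(), 24000)
```

For example, `synthesize("your text here", "voices/<your_subdirectory_name>", "out.wav")` would read every WAV file in that directory as a reference clip.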

Using your own voice

Cut your audio into clips of about 10 seconds each, save them in a subdirectory under voices/, then run the tortoise utilities with --voice=<your_subdirectory_name>
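
Planning the cuts for a longer recording is simple arithmetic. The sketch below computes the start and end times of consecutive 10-second clips; `segment_bounds` is a hypothetical helper, and dropping the short remainder at the end is a design choice made for this example, not a project requirement.

```python
def segment_bounds(total_seconds, clip_seconds=10.0):
    """Return (start, end) pairs covering total_seconds in clip_seconds steps.

    A trailing remainder shorter than clip_seconds is dropped.
    """
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")
    count = int(total_seconds // clip_seconds)
    return [(i * clip_seconds, (i + 1) * clip_seconds) for i in range(count)]
```

The resulting pairs can be handed to any audio tool that trims by timestamp.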

The author has not open-sourced the model training code.

Source code analysis

The project contains only inference code. An analysis of the source follows:

Downloading the models:


Initialization downloads the models automatically. You can also download them in advance and place them in the .models directory.
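
When the models are downloaded in advance, it is useful to verify that the directory is complete before initializing. The following is a minimal sketch: `missing_model_files` is a hypothetical helper, and the file names in `EXPECTED` are placeholders for illustration, so check the project's own download list for the actual names.

```python
from pathlib import Path

# Placeholder names for illustration only; not an authoritative list.
EXPECTED = ["autoregressive.pth", "vocoder.pth"]


def missing_model_files(models_dir, expected):
    """Return the expected file names not yet present in models_dir."""
    root = Path(models_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    print("missing:", missing_model_files(".models", EXPECTED))
```

An empty result suggests that initialization will not need to download those files again.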

Execution