9 Commits

Author SHA1 Message Date
Vega
2ccf97b6e9 Update README.md 2024-11-14 21:00:05 -08:00
Vega
4b8fa992b7 Update README.md (#1009) 2024-11-01 12:57:10 +08:00
Bob Conan
42789babd8 Update README.md, fix a typo (#1007) 2024-10-22 10:21:44 +08:00
Vega
2354bb42d1 Update README.md (#1005) 2024-10-16 22:48:15 +08:00
Vega
4358f6f353 Update README.md 2024-08-29 17:52:56 +08:00
xxxxx
5971555319 Update requirements.txt (#747)
Ubuntu 20.04.1 CUDA 11.3: missing dependencies, plus dependency conflicts

Co-authored-by: Vega <babysor00@gmail.com>
2024-08-22 15:06:40 +08:00
Emma Thompson
6f84026c51 Env update: add environment requirement notes (#660)
* Update Readme Doc

Add environment requirement notes

* Update Readme Doc

Add environmental requirement notes

---------

Co-authored-by: Limingrui0 <65227354+Limingrui0@users.noreply.github.com>
2024-07-06 10:13:09 +08:00
Terminal
a30657ecf5 fix:preprocess_audio.py--The .npy file failed to save (#988) 2024-07-06 10:12:36 +08:00
Terminal
cc250af1f6 fix requirements monotonic-align error (#989) 2024-07-06 10:12:06 +08:00
4 changed files with 16 additions and 12 deletions


@@ -29,6 +29,7 @@
 > If you see `ERROR: Could not find a version that satisfies the requirement torch==1.9.0+cu102 (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2)` when installing with pip, your Python version is probably too old; 3.9 installs successfully.
 * Install [ffmpeg](https://ffmpeg.org/download.html#get-packages).
 * Run `pip install -r requirements.txt` to install the remaining required packages.
+> The recommended environment here is `Repo Tag 0.0.1`, `Pytorch1.9.0 with Torchvision0.10.0 and cudatoolkit10.2`, `requirements.txt`, and `webrtcvad-wheels`, because `requirements.txt` was exported a few months ago and does not work with newer versions.
 * Install webrtcvad: `pip install webrtcvad-wheels`
 or


@@ -1,3 +1,5 @@
+> 🚧 While I no longer actively update this repo, you can find me continuing to push this tech forward, for good and as open source. I'm also building an optimized, cloud-hosted version at https://noiz.ai/; it's free but not yet ready for commercial use.
+>
 ![mockingbird](https://user-images.githubusercontent.com/12797292/131216767-6eb251d6-14fc-4951-8324-2722f0cd4c63.jpg)
@@ -29,6 +31,7 @@
 > If you get `ERROR: Could not find a version that satisfies the requirement torch==1.9.0+cu102 (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2)`, your Python version is probably too old; try 3.9, which installs successfully.
 * Install [ffmpeg](https://ffmpeg.org/download.html#get-packages).
 * Run `pip install -r requirements.txt` to install the remaining necessary packages.
+> The recommended environment here is `Repo Tag 0.0.1`, `Pytorch1.9.0 with Torchvision0.10.0 and cudatoolkit10.2`, `requirements.txt`, and `webrtcvad-wheels`, because `requirements.txt` was exported a few months ago and doesn't work with newer versions.
 * Install webrtcvad with `pip install webrtcvad-wheels` (if needed)
 or


@@ -116,14 +116,13 @@ def preprocess_general(speaker_dir, out_dir: Path, skip_existing: bool, hparams,
         print(f"No word found in dict_info for {wav_fpath.name}, skip it")
         continue
     sub_basename = "%s_%02d" % (wav_fpath.name, 0)
-    mel_fpath = out_dir.joinpath("mels", f"mel-{sub_basename}.npy")
-    wav_fpath = out_dir.joinpath("audio", f"audio-{sub_basename}.npy")
+    mel_fpath_out = out_dir.joinpath("mels", f"mel-{sub_basename}.npy")
+    wav_fpath_out = out_dir.joinpath("audio", f"audio-{sub_basename}.npy")
-    if skip_existing and mel_fpath.exists() and wav_fpath.exists():
+    if skip_existing and mel_fpath_out.exists() and wav_fpath_out.exists():
         continue
     wav, text = _split_on_silences(wav_fpath, words, hparams)
-    result = _process_utterance(wav, text, out_dir, sub_basename,
-                                False, hparams, encoder_model_fpath) # accelerate
+    result = _process_utterance(wav, text, out_dir, sub_basename, mel_fpath_out, wav_fpath_out, hparams, encoder_model_fpath)
     if result is None:
         continue
     wav_fpath_name, mel_fpath_name, embed_fpath_name, wav, mel_frames, text = result
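
The hunk above appears to fix a name collision: before the change, `wav_fpath` (the source recording being iterated over) was reassigned to the output `.npy` path, so the later `_split_on_silences(wav_fpath, ...)` call would receive the output path rather than the input. A minimal sketch of the before/after pattern, assuming only the path-building lines shown in the diff; `shadowed_paths` and `separate_paths` are hypothetical helper names for illustration, not functions from the repository.

```python
from pathlib import Path
from typing import Tuple


def shadowed_paths(out_dir: Path, wav_fpath: Path) -> Path:
    """Pre-fix pattern: the input path variable is overwritten by the output path."""
    sub_basename = "%s_%02d" % (wav_fpath.name, 0)
    # Reusing the name discards the path of the source recording.
    wav_fpath = out_dir.joinpath("audio", f"audio-{sub_basename}.npy")
    return wav_fpath


def separate_paths(out_dir: Path, wav_fpath: Path) -> Tuple[Path, Path, Path]:
    """Post-fix pattern: output paths get their own names, the input path survives."""
    sub_basename = "%s_%02d" % (wav_fpath.name, 0)
    mel_fpath_out = out_dir.joinpath("mels", f"mel-{sub_basename}.npy")
    wav_fpath_out = out_dir.joinpath("audio", f"audio-{sub_basename}.npy")
    return wav_fpath, mel_fpath_out, wav_fpath_out


if __name__ == "__main__":
    src = Path("data/a.wav")
    print(shadowed_paths(Path("out"), src))  # an output path, not the source
    print(separate_paths(Path("out"), src))  # source path is still the first item
```

With distinct names, anything that later needs to read the recording still has the original path available.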


@@ -2,7 +2,8 @@ umap-learn
 visdom
 librosa
 matplotlib>=3.3.0
-numpy
+numpy==1.19.3; platform_system == "Windows"
+numpy==1.20.3; platform_system != "Windows"
 scipy>=1.0.0
 tqdm
 sounddevice
@@ -12,8 +13,8 @@ inflect
 PyQt5
 multiprocess
 numba
-webrtcvad
-pypinyin
+webrtcvad; platform_system != "Windows"
+pypinyin==0.44.0
 flask
 flask_wtf
 flask_cors
@@ -25,9 +26,9 @@ PyYAML
 torch_complex
 espnet
 PyWavelets
-monotonic-align==0.0.3
-transformers
 fastapi
 loguru
-typer[all]
-click
+click==8.0.4
+typer
+monotonic-align==1.0.0
+transformers
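
The `; platform_system == "Windows"` suffixes in this file are PEP 508 environment markers: pip evaluates them at install time, and `platform_system` corresponds to Python's `platform.system()`. As a rough illustration of which of the marker-guarded pins apply on a given platform, the sketch below mirrors the marker logic by hand. `selected_pins` is a hypothetical helper written for this note, and pip's real marker evaluation is more general than this.

```python
import platform
from typing import List


def selected_pins(platform_system: str) -> List[str]:
    """Hand-written mirror of the markers in this diff (not pip's own evaluator)."""
    pins = []
    if platform_system == "Windows":
        # numpy==1.19.3; platform_system == "Windows"
        pins.append("numpy==1.19.3")
    else:
        # numpy==1.20.3; platform_system != "Windows"
        pins.append("numpy==1.20.3")
        # webrtcvad; platform_system != "Windows"
        pins.append("webrtcvad")
    return pins


if __name__ == "__main__":
    # platform.system() returns e.g. "Windows", "Linux", or "Darwin".
    print(selected_pins(platform.system()))
```

On Windows the plain `webrtcvad` requirement is skipped entirely, which matches the README's separate `pip install webrtcvad-wheels` step.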