Created a fixed version of the Colab, because recent Google Colab updates have broken this one. It downgrades the affected packages, and I can verify that it works at this moment: colab.research.google.com/drive/1sqQqzupo2pdjgggkrbM60sU6sBFYo3su?usp=sharing
@exelyugure 10 days ago
great stuff man
@professourcecode 6 days ago
Thanks, but I think it doesn't work anymore.
@mmm-c9p 4 days ago
This is also not working.
@mohammed6560 2 months ago
The data processing was interrupted due an error !! Please check the console to verify the full error message! Error summary:
Traceback (most recent call last):
  File "/content/TTS/TTS/demos/xtts_ft_demo/xtts_demo.py", line 215, in preprocess_dataset
    train_meta, eval_meta, audio_total_size = format_audio_list(audio_path, target_language=language, out_path=out_path, gradio_progress=progress)
  File "/content/TTS/TTS/demos/xtts_ft_demo/utils/formatter.py", line 75, in format_audio_list
    segments = list(segments)
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 426, in generate_segments
    encoder_output = self.encode(segment)
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 610, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: Library libcublas.so.11 is not found or cannot be loaded
@kingroy8800 2 months ago
Not working, sir. Make a new tutorial.
@hornachos 3 months ago
you are a good girl, thanks coqui
@Pastafariste 3 months ago
Now I have a new error during the processing of the dataset:
The data processing was interrupted due an error !! Please check the console to verify the full error message! Error summary:
Traceback (most recent call last):
  File "/content/TTS/TTS/demos/xtts_ft_demo/xtts_demo.py", line 215, in preprocess_dataset
    train_meta, eval_meta, audio_total_size = format_audio_list(audio_path, target_language=language, out_path=out_path, gradio_progress=progress)
  File "/content/TTS/TTS/demos/xtts_ft_demo/utils/formatter.py", line 75, in format_audio_list
    segments = list(segments)
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 426, in generate_segments
    encoder_output = self.encode(segment)
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 610, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: Library libcublas.so.11 is not found or cannot be loaded
@Pastafariste 3 months ago
The second cell does not work for me:
Traceback (most recent call last):
  File "/content/TTS/TTS/demos/xtts_ft_demo/xtts_demo.py", line 15, in <module>
    from TTS.demos.xtts_ft_demo.utils.gpt_train import train_gpt
  File "/content/TTS/TTS/demos/xtts_ft_demo/utils/gpt_train.py", line 8, in <module>
    from TTS.tts.layers.xtts.trainer.gpt_trainer import GPTArgs, GPTTrainer, GPTTrainerConfig, XttsAudioConfig
  File "/content/TTS/TTS/tts/layers/xtts/trainer/gpt_trainer.py", line 13, in <module>
    from TTS.tts.configs.xtts_config import XttsConfig
  File "/content/TTS/TTS/tts/configs/xtts_config.py", line 5, in <module>
    from TTS.tts.models.xtts import XttsArgs, XttsAudioConfig
  File "/content/TTS/TTS/tts/models/xtts.py", line 10, in <module>
    from TTS.tts.layers.xtts.gpt import GPT
  File "/content/TTS/TTS/tts/layers/xtts/gpt.py", line 10, in <module>
    from transformers import GPT2Config
  File "/usr/local/lib/python3.10/dist-packages/transformers/__init__.py", line 26, in <module>
    from . import dependency_versions_check
  File "/usr/local/lib/python3.10/dist-packages/transformers/dependency_versions_check.py", line 57, in <module>
    require_version_core(deps[pkg])
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/versions.py", line 117, in require_version_core
    return require_version(requirement, hint)
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/versions.py", line 111, in require_version
    _compare_versions(op, got_ver, want_ver, requirement, pkg, hint)
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/versions.py", line 44, in _compare_versions
    raise ImportError(
ImportError: tokenizers>=0.19,<0.20 is required for a normal functioning of this module, but found tokenizers==0.14.1.
Try: `pip install transformers -U` or `pip install -e '.[dev]'` if you're working with git main
@922parchive3 3 months ago
Note: for anyone still trying this and getting "RuntimeError: Library libcublas.so.11 is not found or cannot be loaded", add an extra code cell between the installs and before running the webui itself, and fill it with this line: "!apt install libcublas11"
@alexpolidini2327 4 months ago
The data processing was interrupted due an error !! Please check the console to verify the full error message! Error summary:
Traceback (most recent call last):
  File "/content/TTS/TTS/demos/xtts_ft_demo/xtts_demo.py", line 215, in preprocess_dataset
    train_meta, eval_meta, audio_total_size = format_audio_list(audio_path, target_language=language, out_path=out_path, gradio_progress=progress)
  File "/content/TTS/TTS/demos/xtts_ft_demo/utils/formatter.py", line 75, in format_audio_list
    segments = list(segments)
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 426, in generate_segments
    encoder_output = self.encode(segment)
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 610, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: Library libcublas.so.11 is not found or cannot be loaded
@corpse2222 4 months ago
Not sure where or how to properly submit a ticket to the Colab creators, but the Colab has been broken for weeks now. I check every couple of days, and it's only getting worse, stopping with errors sooner and sooner in the process.
@drewthomasson9492 2 days ago
I'm currently working on a fixed Colab version; I'll give you a heads-up if I get it working. It seems we need to force-downgrade all the packages that were changed in recent Google Colab updates.
Why do we need to use Gradio for fine-tuning? It seems like a waste of resources. AutoTrain-Advanced has a notebook using markdown, and it works great.
@pylotlight 4 months ago
Got kicked out by Google due to free-tier limits...
@user-nb9zd2ft7m 5 months ago
I fine-tuned the model and saved it on my device, but every time I want to use the model I have to provide a speaker_wav from the data used for fine-tuning, and this process (analyzing the recording) takes a long time. So how can I use the model with my own speaker ID, to avoid providing the speaker wav???
@aurelianobuendia24 5 months ago
It would be amazing to know how to load the model in a local environment now that I've trained it.
@sayedyasser2 5 months ago
Any idea on this? I've got the fine-tuned model and the JSON files; where do I load it for future use?
@torusx8564 4 months ago
Download the model. You can transfer it into your Google Drive lol. If you couldn't, this video would make no sense. @@sayedyasser2
@aigaming6310 5 months ago
Great notebook & video! Unfortunately, it seems the notebook's Gradio UI does not support Thai yet. Could you please also guide me on how to do it in another language?
@agtcbio9738 6 months ago
FIX your EyES
@theentirecircus6623 6 months ago
Great video. Once the model is saved, can we run inference locally using the tts module?
@handcraft.corner 6 months ago
Is this fine-tuning for XTTS v1 or v2?
@jakobejensen6765 6 months ago
This might sound like a dumb question, but how would you load the fine-tuned model in other Python programs? I know we get config.json, vocab.json and model.pth files after the fine-tuning process, but would we use TTS.api?
@starbuck1002 6 months ago
Shoutout Two Minute Papers, I guess :D
@yklandares 6 months ago
)))
@yklandares 6 months ago
Guys, I repeated your lesson, but in the files of all the CMS models, wherever I look, there is an index file. How do I create it, or is it not needed???))))
@torusx8564 4 months ago
? wdym
@filip1998220 7 months ago
Is there an example of a narration script that produces the best cloning results? Perhaps one that includes all the phonemes?
@GS195 7 months ago
I want prompt-to-voice, please. Do you know how hard it is to find a voice that matches my specifications?
@xiunianwang 7 months ago
When I ran cell 1, I got this error message:
Building wheel for docopt (setup.py) ... done
ERROR: pip's legacy dependency resolver does not consider dependency conflicts when selecting packages. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires fastapi, which is not installed.
lida 0.0.10 requires kaleido, which is not installed.
lida 0.0.10 requires python-multipart, which is not installed.
lida 0.0.10 requires uvicorn, which is not installed.
librosa 0.10.1 requires numpy!=1.22.0,!=1.22.1,!=1.22.2,>=1.20.3, but you'll have numpy 1.22.0 which is incompatible.
plotnine 0.12.4 requires numpy>=1.23.0, but you'll have numpy 1.22.0 which is incompatible.
pywavelets 1.5.0 requires numpy<2.0,>=1.22.4, but you'll have numpy 1.22.0 which is incompatible.
tensorflow 2.15.0 requires numpy<2.0.0,>=1.23.5, but you'll have numpy 1.22.0 which is incompatible.
gruut 2.2.3 requires networkx<3.0.0,>=2.5.0, but you'll have networkx 3.2.1 which is incompatible.
@user-wr2cd1wy3b 7 months ago
Also, it doesn't seem to run locally after downloading model.pth, vocab.json and config.json. Do you need to download the Whisper model for it to work locally, or is that just for training? Edit: No, the Whisper model didn't change anything. I was desperate and figured maybe it needed the training requirements installed in order to do inference, but that didn't do it. Removing the quotation marks from the paths made it look like it was loading for a moment, but after 4 seconds it just says "error" when loading the fine-tuned model.
@joeyhandles 7 months ago
Probably just drop those files over the XTTS model you have installed locally.
@YevgeniyChannel 7 months ago
I need help please
@torusx8564 4 months ago
lol, with what?
@YevgeniyChannel 4 months ago
To make effects and AI voices. @@torusx8564
@HyperUpscale 7 months ago
I love the Coqui performance, results and ease of use, but is it possible to make it even easier? Like button 1 for the input file (or microphone input) for training, button 2 for training, and button 3 to type and speak. I am not sure why in the year 2024 we still need to copy and paste text...
@james-hunter-carter 7 months ago
The thing you are looking at is not meant for end users; it's for developers.
@BlenderBeanie 7 months ago
This is currently the peak of the technology, the top of what's available to the public. It's also the very first iteration of the UI, so it might get easier in the future as more people want to use it. Automatic1111's UI used to be barebones and hard to use, but now it's become a lot more user friendly. Just a few months ago all of this was pure command line.
@HyperUpscale 7 months ago
Maybe you just found out about it 😄 People are already making money from this same peak technology. I paid for this type of peak technology months ago, and months in the age of AI is a long time :)
@BlenderBeanie 7 months ago
@@HyperUpscale Perhaps I should have clarified more. I meant the peak of the open-source versions that are accessible to everyone for free. Take the older Coqui models, for example: just a few months ago it would have taken you many hours to train a proper model that works well in any way. A year ago, even a basic voice was considered a big step for open-source AI technology. I am well aware that AI voices have been in use for many years now; however, technology of this caliber was not yet accessible to the everyday user for free, only through paid alternatives.
@HyperUpscale 7 months ago
👍@@BlenderBeanie
@michal5869 8 months ago
Are there any options for fine-tuning much longer, e.g. a few hours, for better results?
@maker_pt 8 months ago
It seems to work really nicely. But how can I run the model in Coqui, e.g. using the Python API or the tts server?
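For the server half of the question above, one possible route is Coqui's bundled demo server. This is a guess at the invocation (flags may differ across TTS versions, and the paths are placeholders for the files the fine-tuning demo produces):

```shell
# Serve the fine-tuned checkpoint with Coqui's built-in demo server,
# then open http://localhost:5002 in a browser.
tts-server --model_path /path/to/model.pth --config_path /path/to/config.json
```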
@YevgeniyChannel 7 months ago
Me too.
@danemmer9686 6 months ago
For the API, replace the files on your computer with the files you've downloaded. @@YevgeniyChannel
@ameerazam3269 8 months ago
amazing @coqui
@captainlavenderVHS 8 months ago
Very very cool!!!! It doesn't work when the dataset language is set to ja on the 1st tab, though; it doesn't seem to be able to populate metadata_eval.csv.
@Gobolinn 8 months ago
Encountered the same issue; it looks like ja isn't supported yet.
@Otome_chan311 6 months ago
@@Gobolinn Disappointing. I've been looking for a good ja->en voice-clone TTS. The best I've found so far is MoeGoe, which ends up sounding a bit weird with the pacing when doing inference in English (but the sound of the voice is spot on). Every other voice-clone tool I've tried doesn't seem to match the voice at all. I was hoping this would work, but it seems not?
@user-ng4fk5hd6m 8 months ago
I gave up on it. I tried to train it on a 2:30 audio clip that was cleaned properly, and it was still training after 20 minutes on the default settings.
@erogol 8 months ago
Fine-tuning takes time. You need to wait a bit.
@torusx8564 4 months ago
It depends; it takes around 5 minutes with 10 minutes of audio for fine-tuning. Just make sure you use a T4 GPU. @@erogol
@pylotlight 4 months ago
@@torusx8564 Got kicked out by Google due to some error about free limits, sadly.
@MoatasimAlg 8 months ago
Does it support Arabic?
@YaBegitulah1 9 months ago
Can Coqui be used commercially? Can the free package from Coqui be used for commercial purposes? Because I need AI.
@tesitest378 9 months ago
Eleutherodactylus
@VongolaChouko 9 months ago
How do we keep track of our remaining credits?
@NaruHinn 10 months ago
Please support arabic
@couldbejake 10 months ago
It just doesn't sound like me. We need this for an application.
@redfield126 11 months ago
Is emotional control available in the Python API? Sorry for the newbie question; I am exploring the crazy world of Coqui TTS.
@rthd 10 months ago
Wondering the same!
@redfield126 11 months ago
I can't stop playing with it. It is so impressive. Bravo!
@ptitlouis3389 1 year ago
French voice, please!
@mrkaliski 1 year ago
Portuguese voices?
@SiddharthTripathi365 1 year ago
Hi. Has anyone here trained something for Hindi or any other Indian language? If yes, I need some guidance.
@bmw335hdk2 1 year ago
Does this have library dependency support for Android Studio?