AI Voice Cloning for Singing with RVC

AI Voice Cloning for Singing with RVC - Guide and Set-up

Views: 278,936

1 year ago

Links referenced in the video:
RVC Github - github.com/RVC-Project/Retrie...
Curate and Record Data Samples - • Complete Guide: AI Voi...
Download UVR - • Complete Guide: AI Voi...
Come join The Learning Journey!
Discord - / discord
Github - github.com/JarodMica
TikTok - / jarodsjourney
If you found anything helpful, please consider supporting me and the content I am trying to produce!
www.buymeacoffee.com/jarodsjo...
Hardware for my PC:
Graphics Card - amzn.to/3pcREux
CPU - amzn.to/43O66Ir
Cooler - amzn.to/3p98TwX
RAM - amzn.to/3NBAsIq
SSD Storage - amzn.to/42NgMFR
Power Supply (PSU) - amzn.to/3NBAsIq
PC Case - amzn.to/447499T
Mother Board - amzn.to/3CziMXI
Alternative prebuilds:
Corsair Vengeance i7400 - amzn.to/3p64r22
MSI MPG Velox - amzn.to/42MnJHl
Cheapest and minimum specs recommended:
Cyberpower 3060 - amzn.to/3XjtZoP

Comments: 912

@TantuBeats 9 months ago

so much respect to everyone who is making this work.. the amount of problems I'm running into is insane, haha. I hardly know where to start after hours of being into this.

@wektorus 1 year ago

Finally a tutorial that even I can understand. It's so stupid that most tutorials are made as if everyone were that tech savvy. Thank you so much.

@Jarods_Journey 1 year ago

Appreciate it 🤟🤟

@smokinmoose2 1 year ago

I wish i could say the same. I'm just a singer. I want a program that installs, I hit the .exe file, it opens, I put the source files in and voila, new voice. Don't know why that should be so hard.

@linuxtuxvolds5917 1 year ago

@@Jarods_Journey I can't stress enough how important it is to absolutely tell people that the training process will take a long time. I thought my progress was just stuck but no, it's just taking a long while!

@LovelyNyx7 1 year ago

@@linuxtuxvolds5917 I will wait as long as it takes, if it means I get to sound like someone whose voice I really enjoy!

@paleguywithdonuts 11 months ago

@@Jarods_Journey it says "No supported Nvidia GPU found, use CPU instead" but it still opened

@raykrislianggi 7 months ago

For those of you looking for the "weights" folder in the main RVC directory, as of RVC1006, it's inside the "assets" folder.

@pingusmcdingus5124 7 months ago

Nothing is placed here after training a model though. Do I manually copy the D_*.pth or G_*.pth over from logs, or something? If I try that and click Refresh Voice List and Index Path, the new model appears in the Inferencing Voice list, but when I select it I just see a red 'Error' all over the UI: i.imgur.com/QNQUpmq.png

@raykrislianggi 7 months ago

@@pingusmcdingus5124 In my case, the .pth file is placed there automatically if it successfully finished the training without any errors. If it's not the case for you, there might be something wrong in the middle of the process. You might want to try retracing the steps or redo it from scratch. The one thing I did differently from this video is that my audio file for training is not split up into multiple short .wav files, but I just combine them into a single 20-minute file. I've compared both the cut and uncut audio and the result is much better with the uncut 20-minute audio.

@realon 6 months ago

Thx for advice

@ohheyvoid 4 months ago

thanks! :)
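
For the folder question in this thread, here is a minimal Python sketch that looks for the weights folder in both places mentioned above ("assets/weights" in newer builds, "weights" in the main directory in older ones) and lists any .pth models it finds. The function names and the search order are assumptions made for illustration; they are not part of RVC itself.

```python
from pathlib import Path
from typing import List, Optional

# Candidate locations, newest layout first. Both folder names come from the
# comments above; the search order is an assumption.
CANDIDATES = ("assets/weights", "weights")


def find_weights_dir(rvc_root: str) -> Optional[Path]:
    """Return the first existing weights folder under rvc_root, or None."""
    root = Path(rvc_root)
    for rel in CANDIDATES:
        candidate = root / rel
        if candidate.is_dir():
            return candidate
    return None


def list_models(rvc_root: str) -> List[str]:
    """List the .pth file names found in the weights folder (empty if none)."""
    weights = find_weights_dir(rvc_root)
    if weights is None:
        return []
    return sorted(p.name for p in weights.glob("*.pth"))
```

If `list_models` returns an empty list after training, that matches the situation described above: training probably did not finish cleanly, so no final model was written.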

@luqmanhaqim97 1 year ago

Nice one, keep up the good work. Your instructions are very clear and helpful compared to others. 👍 ✨

@stevecommand77 1 year ago

Well convinced after the preview. Hope you can have video on text to own vocal speech soon.😊

@JeanIbarz 7 months ago

Thanks for sharing ! Small tip: using cut/paste instead of copy/paste allows moving the folder instantaneously ;)

@TheDailyMemesShow 10 months ago

I'm going crazy with Jarod's channel 😂 I'm that off the cliff with it that I'm running into rewatching old videos😂

@solm8212 5 months ago

thank you sooo much, all the other tutorials were so confusing and this was simple and fast, encountered some problems while running the rvc command prompt since i dont have a gpu, but i installed cuda and python and that fixed it. its like now you need to know programming and stuff but this tutorial was easy, fast and simple. keep up the good work.

@arhythwrith 11 months ago

For those who would like to know about the harmony bit at 5:11: harmony is when more than one note is being sung at the same time. It's kind of like chords, but for vocals. HP5 helps with separating harmony, but the voice will be less clear compared to HP2. The newer RVC2 also has dereverb and deecho, which I highly recommend using to make the vocal separation even clearer for songs where the voice has a lot of reverb or echo. I'd say just mess around with it a bit and choose to your liking depending on the song. Anyways, have a nice day :D

@matthewedwards904 1 year ago

@8:03 If your process fails when you try to process the input data, one possible explanation is that the path for your folder includes a space. That is what hung up my first couple of attempts. Make sure your file path doesn't include any spaces for easiest handling.

@Hestia3332 11 months ago

thank you! I took the spaces out of the song name and it worked for me!

@Primesky 11 months ago

Thank you m8

@ChaseEverything 9 months ago

Still not working for me. It says :( ['trainset preprocess_pipeline_print.py', 'C:\\RVC-beta-0528\\RVC- beta0717\\voice\\me', '40000', '12', 'C:\\RVC-beta-0528\\RVC- beta0717/logs/me', 'False'] C:\RVC-beta-0528\RVC-beta0717\voice\me/myself.m4a->Suc. end preprocess C:\RVC-beta-0528\RVC-beta0717\voice\me/myself.m4a->Suc. end preprocess
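
The "no spaces in the path" advice from this thread can be checked before starting a run. Below is a small sketch with two helper names invented for illustration: one flags a path that contains a space, the other suggests an underscore-separated replacement for a folder or experiment name. It only addresses the space problem, which is one possible cause among several.

```python
def path_has_spaces(path: str) -> bool:
    """True if the path contains at least one space character."""
    return " " in path


def suggest_safe_name(name: str) -> str:
    """Replace each run of whitespace with a single underscore."""
    return "_".join(name.split())


if __name__ == "__main__":
    # Hypothetical folder name, used only to show the helpers in action.
    folder = r"C:\RVC\voice\my training data"
    if path_has_spaces(folder):
        print("Path contains spaces; consider renaming, for example:")
        print(suggest_safe_name("my training data"))
```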

@LucasMarak 1 year ago

RVC is best for me thanks Jarod take care

@Tarbard 1 year ago

Thanks for the videos, they are fascinating.

@the3fe245 11 months ago

thanks mate, all of the other people i looked up as tutorials were too complicated, a month ago i viewed your so vits svc fork tutorial too, you are one of the best teachers in the world, i can understand your videos perfectly and my native language isnt even english!

@SplicerTv 11 months ago

Thanks for the great tutorial! I found a couple of things that might be helpful to others. For extracting the archive I use the official 7-Zip software; it's free and open source and will save you some hassle. The next thing is regarding batch size. I have a 3090 Ti, which has 24GB of VRAM, and I find a value of 32 makes use of 21.7GB of the VRAM and leaves a bit for OS-related stuff. You don't want to go overboard with a batch size of 40, or the GPU will start swapping to system RAM and significantly increase the time it takes to train. Even if you have fast RAM, it's still an I/O cycle between GPU and system RAM that you can avoid. I recommend looking at Task Manager or using a tool like nvidia-smi to check the GPU VRAM use, and experimenting with batch size to find the best value for your card in order to get much faster training.
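
To make the nvidia-smi suggestion above concrete, here is a hedged sketch that reads used and total VRAM while training runs, so you can see how much headroom a given batch size leaves. It relies on nvidia-smi's `--query-gpu` and `--format=csv,noheader,nounits` options, which report values in MiB; the helper names are made up for this example, and the "right" amount of headroom is something to find by experiment.

```python
import subprocess
from typing import List, Tuple

# nvidia-smi query: one CSV line per GPU, values in MiB without units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_line(line: str) -> Tuple[int, int]:
    """Parse a line such as '21700, 24576' into (used_mib, total_mib)."""
    used, total = (int(part.strip()) for part in line.split(","))
    return used, total


def headroom_mib(line: str) -> int:
    """Free VRAM in MiB for one nvidia-smi CSV line."""
    used, total = parse_memory_line(line)
    return total - used


def read_gpu_memory() -> List[Tuple[int, int]]:
    """Run nvidia-smi and return (used, total) per GPU. Needs an NVIDIA driver."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_memory_line(l) for l in out.splitlines() if l.strip()]
```

If the reported usage sits right at the card's total while training, lowering the batch size is the first thing to try.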

@obamabinbiden9762 1 year ago

This worked perfectly. Thank you.

@ScorgeRudess 10 months ago

Dude, you are amazing! Thanks for your great work!

@RobertJene 1 year ago

12:42
1. Open File Explorer to the folder that has the file whose path you want
2. Press Alt+D
3. Press End
4. Type a backslash \
5. Start typing the name of the file, look for the autocomplete with the correct name, and press the down arrow until the correct file is highlighted
6. Press Ctrl+C

@Optimus97 1 year ago

Or you could Shift-Rightclick to unhide "Copy As Path" option

@RobertJene 1 year ago

@@Optimus97 I prefer to use the mouse as little as possible

@fluffsquirrel 1 year ago

@@RobertJene I can kinda see what you're saying, especially with the delay of the context menu in Windows 10/11.

@RobertJene 1 year ago

@@fluffsquirrel any keyboard sequence you do will save time not reaching for the mouse

@fluffsquirrel 1 year ago

@@RobertJene I think this is generally true, although the less sequences the better, if possible.

@gabrielmorgan3369 11 months ago

For those who are having trouble choosing where the download goes you can right click it and choose save link as

@shaysilver203 1 year ago

Great one! Finally works!

@321Engage28 1 year ago

It worked. Thanks so much!

@animeui_es 1 year ago

Great job! I have a question for you... How much audio do you recommend for generating the model, and is it a problem if the audio has some background sound?

@Jarods_Journey 1 year ago

10 minutes or more of high quality audio. You need to split the background from the audio samples and can check my latest video on that

@nycdweller4287 1 year ago

Hi, thanks for your video. Are there already some pre-trained models for RVC? Also, is there a reason you prefer to train locally rather than on Colab?

@Jarods_Journey 1 year ago

I'm not sure about fully pre-trained models, you'll have to take a look around the internet to see. Colab is a nightmare to work with for debugging, etc and unless you made the code, trying to debug it isn't that fun. If I can work locally, I much prefer it and my hardware allows for it.

@DPIConnor 1 year ago

oh my god this is so awesome

@Snackbarry 2 months ago

damn as a complete beginner coming to this channel to have it being explained like this was really..... interesting....

@RobertJene 1 year ago

9:33 when I train embeddings for stable diffusion (image generation) I have it save an embedding file every 50 steps so I can check the loss and strength of them with scripts and test a few

@Jarods_Journey 1 year ago

I've been finding with these speech models that the intermediary saves don't really exhibit abilities better than the final model, so I really just save the last one only in order to save space. I haven't found one yet that has been overtrained.

@Cyborg11 1 year ago

Thanks for your very good tutorial Jarod. I still have a question. What do the values "loss_disc", "loss_gen", "loss_fm", "loss_mel" and "loss_kl" mean when training? Which values are indicating a good trained model? Are lower values better?

@Jarods_Journey 1 year ago

A downward slope on the graph is better, or lower values. You wanna look at total loss and train till that's as low as possible, preferably.

@darksydeflow 1 year ago

niiiice thank you for the video :D

@warsin8641 1 year ago

This absolute legend amongst men

@michaelteuber7362 11 months ago

Thanks a lot for the video! One question: 40kHz is a pretty unusual sample rate so I want to use 48kHz (which now also seems to work with v2). Also I slice up the training vocals manually with a DAW (Cubase) into up-to-10-seconds snippets. Do I have to export the snippets in 48 kHz already from the DAW or would the usual 44,1 kHz be alright and only the output (the resulting file) would be in 48 kHz?

@Jarods_Journey 11 months ago

It'll be fine, I believe RVC already resamples your audio to the correct SR using ffmpeg. I actually haven't verified this, but since it handles my datasets when using either 40k or 48k, that means it doesn't really matter :)

@michaelteuber7362 11 months ago

@@Jarods_Journey Thanks for your fast reply! So there's a tiny bit of hope that if you feed it 48kHz already, it might skip the resampling, which could probably result in higher quality output 🙂.
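
If you would rather control the sample rate yourself than rely on whatever RVC does internally (which the reply above says is unverified), one option is to resample the snippets with ffmpeg before training. The sketch below only builds the command; `-i` (input), `-ar` (audio sample rate) and `-y` (overwrite output) are standard ffmpeg options, while the function name and file names are placeholders. Whether pre-resampling actually improves the result is an open question in this thread.

```python
import subprocess
from typing import List


def build_resample_cmd(src: str, dst: str, sample_rate: int = 48000) -> List[str]:
    """Build an ffmpeg command that writes dst at the given sample rate."""
    return ["ffmpeg", "-y", "-i", src, "-ar", str(sample_rate), dst]


def resample(src: str, dst: str, sample_rate: int = 48000) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_resample_cmd(src, dst, sample_rate), check=True)


if __name__ == "__main__":
    # Placeholder file names; this only prints the command it would run.
    print(" ".join(build_resample_cmd("snippet_44k.wav", "snippet_48k.wav")))
```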

@joemmaama 1 year ago

For the voice me folder, is it just audio recordings of my own voice? if so how many do i need to include and what length? Thanks In advance youre a massive help dude

@Jarods_Journey 1 year ago

Yup, as shown, make sure the folder contains all of the audio files without subfolders. Then just use that path for those and you should be fine

@SirMato 11 months ago

bro tysm my brain could not process how to do that on my own

@welachutmelexcel 8 months ago

Since i’m relatively new to this, how would you use this rvc for just cloning a voice? Do I just leave out the parts in model inference about the pitch and music related things?

@denblindedjaligator5300 1 year ago

i have some questions. When I download other people's voice modules, there is a file called something like traint.index, it's a file you have to use. The same goes for total_fea. I have also seen that there are pth files in the log folder itself.

@Jarods_Journey 1 year ago

These should go into the log folder underneath the "experiment" or "speaker" name that you want to use. So if the name is john, the john.pth goes into weights and the index goes into the log/ where you have to create a john directory and place the index into.

@denblindedjaligator5300 1 year ago

@@Jarods_Journey but I mean the traint.index. And why is there a model in the log folder, and a detail file?

@321Engage28 1 year ago

Great tutorial! Unfortunately, I seem to be having a problem with step 2a: My attempt to process the data was unsuccessful, and the output message came up blank! What am I doing wrong?

@Yumegipsu 1 year ago

This happened to me too but it worked when I removed spaces from my experiment name. If it's not that then idk

@paarthsingh 1 year ago

Can someone please help fix this error? Jarod, please advise. Error: ValueError: invalid literal for int() with base 10: 'voice'. I get this error at step 2a (process data) when I put my local folder path into the path field.

@HyperbolicArachnid 10 months ago

Finally, a tutorial that doesn't fly 5 miles over my head

@shep9194 1 year ago

Have you tried the realtime voice changing? Ive been trying to get that working but had some issues, i think its an svc fork though

@Jarods_Journey 1 year ago

Have not gotten to try that yet on either repos unfortunately :/

@krysidian 1 year ago

That was very nice to follow along, thanks! Any interest in showcases bark ai? I think it's a pretty interesting way of doing tts but I don't think it's very well explained in many places or left out a lot that kinda confused me. Especially when it comes to getting decent results. Do think the prompting idea is really intriguing though

@Jarods_Journey 1 year ago

My quick experience with bark is that it's still in very early stages, excited to see where it goes though! I might have to do a more thorough test of it, but Tortoise TTS is by far the most promising and easiest to use

@krysidian 1 year ago

@@Jarods_Journey That's definitely true. Tortoise is incredible! Really hope bark will update or get some cool successors with a similar but more stable approach. Making it generate laughs, sighs etc. is spooky and very fun.

@Jarods_Journey 1 year ago

@@krysidian I'm definitely interested in the laughing part. That's one additional touch to AI that is lacking in voices and when that gets fleshed out, things are gonna get interesting xD!

@djdocq8963 8 months ago

@13:28 you pick an index file v2, in my drop down box it only has 3 different v1 files to choose from? It doesnt seem to create an index file when I train my voice.

@lalalala99661 7 months ago

Quick tip at 13:03: you can Shift + right-click, then another menu pops up, and you can click on "copy path" in that menu

@hariom2580 7 months ago

I have succesfully trained voice but there is no index file in voice name folder, in weights folder pth file is there what to do...nice video

@Retro-zn2jt 1 year ago

Thanks for your video. There are a few other videos on the subject and I find that yours is better explained; nevertheless, I still have to deal with several errors. First I had "Cuda Out of memory", so I lowered the batch size to the minimum. Now I have another error: "RuntimeError GET was unable to find an engine to execute this computation". My audio samples are a bit long (a few minutes) and they are 32-bit float at 44.1kHz, but I only have 4 samples... should I divide them into several parts? Thanks in advance. Edit v1: I tried many times, also cutting them into different parts and reducing the size, and I still get the RuntimeError, even with 2 small samples (16-bit, 44.1kHz) of less than 10 seconds… I don't understand. Edit v2: Also, I wonder if you know how to do text-to-speech with this tool?

@Jarods_Journey 1 year ago

You might have to reinstall, or make sure the CUDA being installed is compatible with your GPU

@necrovolo 8 months ago

I'm having the same issue.

@M4rt1nX 1 year ago

Those high notes though!!! We love local!!!!!!!!!!!!!!

@Jarods_Journey 1 year ago

Haha I wishhhhh xD. Local installation yields less issues, and is much easier to debug lol.

@VongolaChouko 8 months ago

Is 1000 epochs overkill? Will it have diminishing returns compared to just keeping it up to 300? I really don't see a standard recommended epoch total anywhere, the answer varies. I usually use 500, but I honestly don't know if that's fine since I just use RVC for SillyTavern and haven't tried it just on itself yet, hence I don't know how to evaluate if the results are better or not .___.

@Nangel2 1 year ago

Thank you for taking the time to make this tutorial! It was so easy to follow. :) Could I ask you to make a comment or tutorial on how to re-train a previously trained voice? I can't find that information anywhere.

@Jarods_Journey 1 year ago

Let me know if this was what you were thinking about: kzfaq.infoeO0gvi_RXTc?feature=share

@Nangel2 1 year ago

@@Jarods_Journey That's exactly what I was looking for, tysm!

@Beary_TheBear 11 months ago

Hi, thanks for the tutorial. I got stuck at the training process. I received a message saying this: RuntimeError: The expanded size of the tensor (12800) must match the existing size (4040) at non-singleton dimension 1. Target sizes: [1, 12800]. Tensor sizes: [4040] Before I got this message, I was getting the "Cuda out of memory", even though I have 32GB of RAM. I cut the audio samples into smaller bits under 10 seconds, and now I have the expanded size of the tensor error. What did I do wrong?

@gabrielmorgan3369 11 months ago

same issue

@gabrielmorgan3369 11 months ago

It means that if it finishes, it's going to take up too much space, so just turn the batch size down to fix it

@MrSix-1 5 months ago

CUDA memory is VRAM. It's different from regular RAM.

@AImusikindo 1 year ago

Thanks bro, from Australia

@Jarods_Journey 1 year ago

That's awesome, appreciate it!

@AImusikindo 1 year ago

@@Jarods_Journey i just made some cup of coffee for you lol

@Jarods_Journey 1 year ago

@@AImusikindo Haha thank you, each coffee keeps me going! 🤟🤟

@CamelliaWings07 11 months ago

Thank u. This is the best explanation video I've ever seen on YT. Very clear ☺ I successfully made it because of your detailed contents! (I failed many times before Kkkkk)

@SNYCHANNEL 1 year ago

Thank you for this video!! When I try to train I get this error: sr = int(sys.argv[2]) ValueError: invalid literal for int() with base 10: 'Yona\\Desktop\\RVC-beta\\RVC-beta-v2-0528\\voice\\Me' Do you know what I'm doing wrong?

@Jarods_Journey 1 year ago

RVC: Invalid Literal or File Not Found error
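
A likely reading of the traceback in this thread: the script takes the sample rate from `sys.argv[2]`, and an unquoted folder path containing a space gets split into two arguments, so the second half of the path lands where the number should be. The sketch below simulates that with a plain whitespace split. It is a simplification of real command-line parsing, and the script name and the user folder "Some Yona" are hypothetical.

```python
from typing import List


def simulate_argv(command_line: str) -> List[str]:
    """Split on whitespace, roughly how an unquoted command line is split."""
    return command_line.split()


def parse_sample_rate(argv: List[str]) -> int:
    """Mirror the failing line from the traceback: sr = int(sys.argv[2])."""
    return int(argv[2])


# Hypothetical commands: the first path has a space in the user folder name.
bad = r"preprocess.py C:\Users\Some Yona\Desktop\voice\Me 40000"
good = r"preprocess.py C:\RVC\voice\Me 40000"

if __name__ == "__main__":
    print(simulate_argv(bad))             # the path is split into two arguments
    print(parse_sample_rate(simulate_argv(good)))
    try:
        parse_sample_rate(simulate_argv(bad))
    except ValueError as err:
        print("ValueError:", err)
```

In the bad case the second argument is the tail of the path, which is the same shape as the value shown in the error message above. Moving the RVC folder and the dataset to a path without spaces avoids the split.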

@ericleigh007 7 months ago

if you want to move the folder faster, just rename the top folder, then cut and paste the lower into the top-level. When you cut and paste the contents, explorer knows it only MOVES the folder, so no copy wait.

@trubyart6193 11 months ago

im having a lot of trouble... opening the go-web file doesnt show the language option, and then has lots of stuff and at the end says to press any button to continue. After i do that it closes, and when i searched up the localhost:7897 it says i cant reach the page..

@scedolin 1 year ago

thx for this good tutorial Unfortunatly I had a an error after 2 s and I don't understand why I did wrong. if data.dtype in [np.float64, np.float32, np.float16]: AttributeError: 'NoneType' object has no attribute 'dtype'

@Jarods_Journey 1 year ago

Another commenter had this issue but I haven't encountered it yet and haven't found a way to reproduce it. You might be able to find others who are looking to get this issue resolved here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues?q=is%3Aissue+AttributeError%3A+%27NoneType%27+object+has+no+attribute+%27dtype%27+is%3Aopen Could be related to the training process, trying to find files, etc

@scedolin 1 year ago

@@Jarods_Journey I applied the advice from your good short, "File Not Found: feature_768", and I managed to avoid this error. Thanks a lot - I started following your channel in the last few days, but your subjects are very interesting - great job

@EthanWinters176 11 months ago

If you can read this:
.pth files go in the folder "weights"
.index and others go to "logs" under the voice name, e.g. Logs\EthanWinters
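
The placement rule above (.pth into "weights", .index into "logs" under the voice name) can be scripted. This is only a sketch: `install_model` and its arguments are invented for illustration, and it uses the older "weights" location, whereas another comment on this page reports "assets/weights" for newer builds.

```python
import shutil
from pathlib import Path
from typing import Tuple


def install_model(rvc_root: str, name: str, pth_file: str, index_file: str) -> Tuple[Path, Path]:
    """Copy a downloaded model into the folders described above.

    <rvc_root>/weights/<name>.pth      receives the model weights
    <rvc_root>/logs/<name>/<index>     receives the .index file
    """
    root = Path(rvc_root)
    weights_dir = root / "weights"
    log_dir = root / "logs" / name
    weights_dir.mkdir(parents=True, exist_ok=True)
    log_dir.mkdir(parents=True, exist_ok=True)

    pth_dst = weights_dir / f"{name}.pth"
    index_dst = log_dir / Path(index_file).name
    shutil.copy2(pth_file, pth_dst)      # copy and keep file metadata
    shutil.copy2(index_file, index_dst)
    return pth_dst, index_dst
```

After copying, refreshing the voice list in the web UI should pick up the new entry, assuming the files really are RVC models.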

@AIAsiaSinger 1 year ago

thanks bro generous sharing! one quick question, when we restore the previous model, how can we continue the training? do we need to go through all step 1 to step 3 ? should we update the "Load pre-trained base model G path" ?

@Jarods_Journey 1 year ago

Check this short to see if it answers your question! kzfaq.infoeO0gvi_RXTc?feature=share

@AIAsiaSinger 1 year ago

@@Jarods_Journey thank you so much!!

@AI_arab_world_maroc 1 year ago

Hi Jarod, should it train less with V2 48k , what is the best combination to train a model when it comes to V1 , V2 , 40k 48k? Thank you

@Jarods_Journey 1 year ago

V2 48k is the best quality

@androidgameplays4every13 1 year ago

Thank you, thanks to your tutorial I finally succeed at creating my own models! even with only 4gb of memory in my gtx 1650 Super.

@Jarods_Journey 1 year ago

Awesome! They do say that it can work on smaller amounts of VRAM so glad that this worked!

@schoodst6095 1 year ago

how did you get it to work on low ram? mine is eating up 6gb really quick and shuts down cause it run out, do I lower batch size?

@titrecords2294 1 year ago

Mine keeps running out of memory how did you do it? Please help

@schoodst6095 1 year ago

@@titrecords2294 lower the batch size, like a lot

@MohamedAdel-kw4hx 11 months ago

Thx , but I can't find pth file after training.

@LillianGreenHiLilly 11 months ago

Jarods Journey Why cant we just upload for example an existing split song file from inside the folder that is just the singing voice with no music. Also Why copy and paste the whole address? Please answer, because I dont usually get a response when i ask a simple question.

@el-bicente 9 months ago

Thank you for this great tutorial! I was wondering if there was any tool to separate the vocals when they are different singers, because I want to apply several models. I can get clean vocals with UVR5, but I don't know what to do next. I tried to use whisperX but I think it's not really suitable for singing and overlapping voices...

@TheAimax 1 year ago

following your advice to use RVC I have a question, if I stop training at 250 epochs, if I want to start training again to reach 500 epochs I must put a total training epochs of 250 and they will be added to the 250 that I already had or put 500 ? I know that maybe it's a silly question but the question really arose, thanks for the attention to each one

@Jarods_Journey 1 year ago

Gotta do 500. These models store checkpoints so if you did 250 (assuming you didn't delete them), it'll start up from the checkpoint
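
To illustrate the answer above: the epoch field is the total target, not the number of extra epochs, and training picks up from the newest saved checkpoint. The sketch below finds the newest checkpoint in an experiment's logs folder. It assumes checkpoints are named like "G_<number>.pth" and "D_<number>.pth"; the G_/D_ prefixes appear elsewhere in these comments, but the numeric suffix and both function names are assumptions.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint(log_dir: str, prefix: str = "G") -> Optional[Path]:
    """Return the <prefix>_<number>.pth file with the highest number, or None."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.pth$")
    best: Optional[Path] = None
    best_step = -1
    for path in Path(log_dir).glob(f"{prefix}_*.pth"):
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            best = path
    return best


def total_epochs_setting(already_trained: int, extra_wanted: int) -> int:
    """Value to enter in the total-epochs field when resuming."""
    return already_trained + extra_wanted
```

For the case asked about, `total_epochs_setting(250, 250)` gives 500, which is the value to enter. If `latest_checkpoint` returns None, the checkpoints were deleted or never saved, and training would start over.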

@andivax 1 year ago

Thank you very much! My Inferencing voice list is empty. Where to put the downloaded voice models? And epochs. It's it worth to use 1000 epochs instead of 200 to increase the quality?

@Jarods_Journey 1 year ago

I believe downloaded voice models should go into the weights folder, as long as they're from RVC. As for epochs, if you get good results at 200, I don't see much reason to go to 1k. If you have enough voice samples, 200 should be relatively good. I would listen to them per 100 epochs and see what you think is best (as it's always dependent on your data and how much of it you have)

@kuroboticuse 9 months ago

Thanks for the tutorial! Though a slight question: is there any general advice on making the training go through epochs faster? For me, training one epoch takes 11 minutes, so it would take a whole day and a half to reach 200. I have tried both "Train Model" and "One-Click Training", though the rate of training is still the same. Thank you

@gasparmxm 6 months ago

A faster GPU
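
The "day and a half" estimate in the question checks out: 11 minutes per epoch times 200 epochs is 2,200 minutes, or roughly 36.7 hours. A tiny helper (the name is made up) makes it easy to redo the arithmetic for other settings, assuming the time per epoch stays roughly constant.

```python
def training_time_hours(minutes_per_epoch: float, epochs: int) -> float:
    """Total training time in hours, assuming a constant time per epoch."""
    return minutes_per_epoch * epochs / 60


if __name__ == "__main__":
    print(round(training_time_hours(11, 200), 1))  # hours for 200 epochs
    print(round(training_time_hours(11, 100), 1))  # hours for 100 epochs
```

Besides faster hardware, the only levers mentioned on this page are fewer epochs and a batch size that fits in VRAM.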

@KennaLovesGouda 10 months ago

8:11 it will not process data. It starts but then stops and there is an orange line around the output any reason why?

@Xivlex 10 months ago

Hello thanks for the video. It piqued my curiosity and now I want to try RVC myself. Unfortunately, I'm running an AMD GPU (6800xt) but upon checking the releases an option for AMD users is present in updated0814v2. My problem now is that when I try to follow your steps, RVC does not detect my GPU. For example, at step 2b as in 8:21, the options to select a GPU are not present. The option to input a GPU index is there and I've tried putting in "0", "1", "2" and "0-1-2" but when pressing one-click training it says: "NO GPU DETECTED: falling back to CPU - this may take a while" Do you know a way for it to detect my GPU?

@Jarods_Journey 10 months ago

I'm not too sure unfortunately, you might have to check on their githubs issue area to see if anyone else is running into it.

@Nangel2 1 year ago

Hello again! Do you know what the issue could be when the preprocess stops going in the middle? No error message or anything shows up, after successfully processing many of the vocal samples it just stops going.

@Jarods_Journey 1 year ago

Check the logs folder, should have 0 and 1 and if there are contents in there, it finished. Not sure if it stops in the middle, it would output an error.

@Nangel2 1 year ago

@@Jarods_Journey Ty for taking the time to reply again! The logs folder did have 0 and 1 in it and the feature extraction worked, so I tried to train the model but it never progressed past step 1 (ie it never reached the epoch count). I've trained several models before with no problem, so I'm not sure what the issue is. I'll try to experiment a bit and see if I can figure out what the issue is, and if I figure it out I'll report back.

@samphelps856 1 year ago

Thank you

@TheBlueRage 2 months ago

Thanks. I just have to make voice samples. I guess I am supposed to sing something. Is that correct? The UVR software works great. I was able to stem Suno AI. I'm also looking at Jen music AI and Lalal AI, which uses celebrities. This was more intense than what I expected. I see that Mac has a download app. I just found an app on the Google App Store. I will look through your other videos for more lessons. Thanks.

@aatkins2002 8 months ago

The program after a while replaces certain fields, usually the big buttons and their output fields with "Error" and a popup appears in the top right saying "Connection Errored Out" The command console isn't reporting anything unusual, but when I tried to proceed like nothing was wrong, one click training didn't seem to react well. What's causing this "Connection Errored Out" message?

@todhold2673 10 months ago

Am i missing something? Did he go over how to add the newly converted vocals back to the instrumental?

@djsaquib 1 year ago

While making dataset, if i am taking vocals from a singer! Do i keep to keep key of vocals same? Or i can add multiple audios of different songs to train model of particular singer?

@Jarods_Journey 1 year ago

As long as its the same singer, you can add as many songs from them as you like

@djsaquib 1 year ago

@@Jarods_Journey thank you for clarifying 🙏🏻

@KoalaTeaGuy 11 months ago

EDIT: I'm dumb. I was forgetting to include the .wav when putting in the path of my vocals. If I'm getting 'RuntimeError: Failed to load audio: ffmpeg error', is that an issue with the isolated vocals in the song I'm trying to use, or is it the trained model?
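
The mistake described above (leaving ".wav" off the path) is easy to catch before pasting a path into the inference field. Here is a hedged sketch; the list of extensions is illustrative rather than a statement of what RVC accepts, and the function name is made up. It also strips the surrounding quotes that "Copy as path" adds on Windows.

```python
from pathlib import Path
from typing import List

# Illustrative set of audio extensions, not an official list.
KNOWN_AUDIO_EXTS = {".wav", ".mp3", ".flac", ".m4a"}


def check_audio_path(path: str) -> List[str]:
    """Return a list of problems found with the path (empty list means OK)."""
    problems = []
    cleaned = Path(path.strip().strip('"'))
    if cleaned.suffix.lower() not in KNOWN_AUDIO_EXTS:
        problems.append("missing or unknown file extension")
    if not cleaned.is_file():
        problems.append("file not found")
    return problems
```

A path that ends at the file name without ".wav" fails both checks, which is the situation that produced the ffmpeg error here.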

@enricopileggi7909 9 months ago

How can I solve the error message " Unfortunately, there is no compatible GPU available to support your training" in step 2b? (My GPU is MX250). Thank you

@philerasmus 8 months ago

Excellent tutorial. Running the gui I have found that the inference does use the GPU but the Vocal extraction task just relies on CPU. Is there a solution? Thanks

@looooool3145 4 months ago

Hey man, thanks for the tutorial. I was wondering how to match the key of the instrumental to the output voice? I converted a male song to a female cover, but I don't know how to change the instrumental pitch to match with the female voice.

@Winterbliss-sg7qg 1 year ago

Keep putting more tutorials!!!!

@ElChapoDel8 5 months ago

If i don't have any problems but i want to keep training my model i just do the same thing that you said on the minute 10:50 but increasing the epoch, right?

@Jarods_Journey 5 months ago

Correct :)!

@K-pop2024 1 year ago

What should I do if I'm taking other artist voice to put in someother artist song? Will this work that way? And also for voice cloning should I take a acapella version of them singing in their song to clone their voice? Hope you help me ☁

@Jarods_Journey 1 year ago

Acapella dataset works the best, this works by converting the vocals of a song you provided and basically turning it into the voice that you train.

@Jefersen 1 year ago

Hello, thank you so much for the tutorial, everything worked fine except the last button: When i click one click training i get all this messages for every file: mp3_10.wav->Suc. and then it just stopps, nothing happens any more. any suggestions ?

@rae8379 5 months ago

Thanks for sharing. But now I run into a problem. Could I just use pretrained models instead of training models myself? But on RVC WebUI, I couldn't figure out how.

@Odinsdottir 11 months ago

Hi there, I have a question, which is probably a stupid one. I followed this tutorial and ran my first process, but it's taken 2 DAYS! Is that normal? I have a GPU with a higher benchmark than the minimum, but well below yours. I've never done this before so if that's normal, I guess it is what it is, but if it's not, any suggestions on how to speed up the process without killing the results?

@SunPrime_Nexus 10 months ago

If you enable the option to save every so many epochs, for example every 20 epochs, your model's progress will be saved at those points and you can cancel training without any problem. As for the waiting time, it depends on how much data you use for training and, of course, how many epochs you set as the training goal.

@denblindedjaligator5300 1 year ago

can I make a recording with my screen reader where I try to explain what I do because I don't understand what happens after it has split my audio files, there is nothing in the weights folder. do you have to have a recording of a vocal before it saves the module or what? I just thought you could make a module for later use.

@Jarods_Journey 1 year ago

If you hopped in my discord, I could try and help you out there as you would be able to send recordings of the screen reader there. Towards this message, you have to specify what the output folder of the vocals and instrumentals is going to be. After it splits the vocals, they should be located in the path you specified

@outlast2fan535 1 year ago

Is it possible to stop training at like 100 epochs (I typed 200 epochs as goal in the web gui, as you suggested) and infer one of the checkpoint models to see if it's going well?

@Jarods_Journey 1 year ago

Well, as long as it's going to save at the 100th epoch. You can technically stop the training at any epoch, but you wanna make sure that it saved

@OravinCZ 10 months ago

Hello! I've started using EasyGUI RVC now, but do you have any idea where the models I've already trained with this method should be stored? I don't know how to make them appear; they don't show up in inference at all... 😐

@Mago497 9 months ago

In the folder you downloaded there should be one called "weights"; that's where your models are stored. At the time of writing this there were a few extra that came with the download.

@DreamboyyHD 1 year ago

When I use Vocals/Accompaniment it always shows this message: "clean_empty_cache". In my folder there is only one mp3. (I tried moving it to another drive and making a new one, and it still does not work.) Or did I do something wrong?

@fsForward 3 months ago

I *love* that he says "Don't trust me blindly", good! But now I do trust you blindly😂

@pcgg-kb4eg 1 year ago

Thanks for making this tutorial, it took me forever, so thanks. Could I ask you to comment on how to use a trained voice (one that wasn't trained in RVC) in RVC?

@Jarods_Journey 11 ай бұрын

Drag the files to the folders they need to be in, i.e. weights, and then a folder named after the speaker in logs.
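The same drag-and-drop step could be scripted roughly like this. It is a sketch under assumptions: the comment does not say which file goes where, so sending the .pth file to weights and the .index file to logs/<speaker> is a guess, and `install_model` and `rvc_root` are hypothetical names.

```python
import shutil
from pathlib import Path


def install_model(pth_file: str, index_file: str, rvc_root: str, speaker: str) -> tuple[Path, Path]:
    """Copy a model into the layout described above and return both destinations.

    Assumed layout: <rvc_root>/weights/<model>.pth and <rvc_root>/logs/<speaker>/<index file>.
    """
    root = Path(rvc_root)
    weights_dir = root / "weights"
    speaker_dir = root / "logs" / speaker
    # Create the folders if they are missing so the copies cannot fail on that.
    weights_dir.mkdir(parents=True, exist_ok=True)
    speaker_dir.mkdir(parents=True, exist_ok=True)
    model_dest = Path(shutil.copy2(pth_file, weights_dir))
    index_dest = Path(shutil.copy2(index_file, speaker_dir))
    return model_dest, index_dest
```

Copying instead of moving keeps the original download intact in case the files end up in the wrong place.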

@ultimamage3 7 ай бұрын

thank you for the video, it's really informative but i have an issue: when training the voice it doesn't generate ".pth" files in the weights folder, any way to fix that?

@pingusmcdingus5124 7 ай бұрын

The checkpoints are under logs\[YourModelName] however if you copy them to assets\weights it won't load them properly, so ¯\_(ツ)_/¯.

@denblindedjaligator5300 Жыл бұрын

I have to set my GPU index to 0 and then it works. But once I have trained my model I cannot find it, only wav files. Can I send you my project folder?

@gummywormee41 10 ай бұрын

Hello! One issue I've been having is that it says that it cannot find an NVIDIA GPU and to use GPU instead, but then it says that there's no GPU to support training. Do you know what would be a good solution to this?

@universalator 11 ай бұрын

I have a weaker GPU (GTX 1660 Ti) and it's taking about half an hour for each epoch. I set everything to match the recommended starting settings (at 9:13). Is this normal? Thanks

@EricNoneless 7 ай бұрын

When I click in the bat file it says it was not possible to find the determined path so when I try to search the localhost on the web it gives an error...

@POPMAGStudios Жыл бұрын

i put all the settings and used one-click training and i got: "added_IVF985_Flat_nprobe_1_IbrahemHefny_v2.index All processes have been completed!" but i couldnt find the model from the "Inferencing voice" slider even though i found the voice model in the log file please help

@mrdeadmemes Жыл бұрын

i've managed to go through the entire data training process, but i get an error at the very end when it attempts to create the file in weights. it's an 'unexpected pos' error ("unexpected pos [long number] vs [long number]"). i'm not sure how to fix it. i trained for 5 epochs instead of 200, and it created a .pth file (which was found in the model inference section), so this only seems to happen on higher epoch values. i'm not sure why

@Jarods_Journey Жыл бұрын

Hmm, I'm not too sure on why this might be happening. I can imagine that maybe something got corrupt or messed up somewhere along the line, causing the position to be wrong. Have you tried training a new model with all new folders?

@VaibhavShewale 10 ай бұрын

why this one is good

@ImaCreepyCreeper Жыл бұрын

4:13 I can't seem to get into the localhost page, also, is localhost necessary to make the custom vocal models? I haven't really gone through the whole video more or less skimmed it just to see how to get the custom voice models. -_-

@Jarods_Journey Жыл бұрын

-_- To get the local host page, you'll need to instantiate it via the python script

@azfarmcalpha875 Жыл бұрын

I failed to get the myself python file in weights folder after one-click training. Should I reset the process at 7:19 and if so do I need to delete certain files? Correct me if I'm wrong but I think this is the error? I'm not familiar with coding.
RuntimeError: Calculated padded input size per channel: (2). Kernel size: (3). Kernel size can't be greater than actual input size
98_1.wav-contains nan
9_2.wav-contains nan
all-feature-done

@Jarods_Journey Жыл бұрын

I would rerun the preprocess again for all of your data and then try again, but check this out here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/484

@ERR0RR 11 ай бұрын

Any tips on how to better isolate the vocals from a song? Background singers or sometimes even saxophones also get picked up, and when I use them, the output singing is very raspy and glitches out

@Jarods_Journey 11 ай бұрын

Check this out: github.com/Anjok07/ultimatevocalremovergui/issues/344

@givehead 9 ай бұрын

i don't see a pitch extraction algorithm, everything else is there. Any solutions?

@user-fg9nv9oh6z 11 ай бұрын

hi thanks for your tutorials. i got this error: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 6.00 GiB total capacity; 5.11 GiB already allocated; what should i do?

@sreshkhsreshkh3872 6 ай бұрын

same error,any fixes?

@Odyssey_ACNH Жыл бұрын

"The expanded size of the tensor (12800) must match the existing size (0) at non-singleton dimension 1." Anyone know a fix to this problem?

@SunPrime_Nexus 10 ай бұрын

Yeah bro, I had this problem recently. It happens because you are probably trying to train your AI on data that isn't the original data it started training with. To solve it, create another AI by renaming the experiment, then load the new data you want to use at step 2a and process the data. Then go to step 2b and extract the features normally. Then go to the drive where you saved your RVC files, open logs, find the folder of the old AI, and copy all the files that say D and G, then paste them into the folder of your new AI. Once all this is done you can train as normal and everything should be fine.
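The copy step in that workaround could be scripted as below. This is only a sketch: it assumes the files "that say D and G" are checkpoints named like D_*.pth and G_*.pth, which the comment does not spell out, and `copy_checkpoints` plus the example folder names are hypothetical.

```python
import shutil
from pathlib import Path


def copy_checkpoints(old_experiment: str, new_experiment: str) -> list[str]:
    """Copy D_*.pth and G_*.pth files from the old experiment folder to the new one.

    Returns the names of the copied files. The naming pattern is an assumption.
    """
    src, dst = Path(old_experiment), Path(new_experiment)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in ("D_*.pth", "G_*.pth"):
        for ckpt in sorted(src.glob(pattern)):
            shutil.copy2(ckpt, dst / ckpt.name)
            copied.append(ckpt.name)
    return copied


if __name__ == "__main__":
    # Example folder names only; use your own experiment names under logs.
    print(copy_checkpoints("logs/old_voice", "logs/new_voice"))
```

Other files in the old folder are left alone, so the freshly preprocessed data in the new folder is not overwritten.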

@ShyGun78 10 ай бұрын

I reach the final stages and move into the inference step after completing the training process. However, the model I have created doesn't appear in the Inferencing voice section and there are no options or indications - it's like a blank state. I have followed all the correct steps and the issue remains unresolved, which seems to be a problem that many individuals are encountering. Please help me out with this. :)