How to Use Hugging Face's Wav2Vec for Speech Recognition in Python

16,725 views

Hi guys! Welcome to another video. In this video I'll show you how to download and use a pretrained model named Wav2Vec to do speech recognition. Wav2Vec is a state-of-the-art model for speech recognition. It uses a training strategy similar to word2vec's: it learns speech representations from unlabeled data and is then fine-tuned on labeled data. It is also built on a Transformer architecture. With the Hugging Face library called transformers you can use or fine-tune a variety of models; today we'll focus on Wav2Vec, since our goal is to have one of the best models available for speech recognition.
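As a rough illustration of the download-and-use flow described above, here is a minimal sketch. It assumes the `facebook/wav2vec2-base-960h` checkpoint (the video may use a different one), 16 kHz mono audio, and that `torch`, `numpy` and `transformers` are installed. The helper names `pcm16_to_float` and `transcribe` are made up for this example.

```python
MODEL_ID = "facebook/wav2vec2-base-960h"  # assumed checkpoint, not confirmed by the video
SAMPLE_RATE = 16_000  # this checkpoint expects 16 kHz mono audio


def pcm16_to_float(samples):
    """Scale signed 16-bit PCM integers to floats in the range [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def transcribe(audio, model_id=MODEL_ID):
    """Return the greedy (argmax) transcription of a sequence of float samples."""
    # Heavy dependencies are imported lazily so the helper above works without them.
    import numpy as np
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # The first call downloads the files and caches them locally.
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)

    inputs = processor(
        np.asarray(audio, dtype=np.float32),
        sampling_rate=SAMPLE_RATE,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Pick the most likely token per frame; the tokenizer collapses repeats and blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe(pcm16_to_float(raw_samples))` on 16-bit samples recorded at 16 kHz should return an uppercase transcript for this checkpoint.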

Comments: 26

@dr.mikeybee 2 years ago

Is there a scorer? That is, something that puts the logits into sentences, then gets logits for the sentences as opposed to the tokens? That way it would choose the results that make the most sense. It's a pretty common architecture. Deep Speech does that, for example. And I think Vosk does it under the surface. That's why it shows partials and then text.

@NathanWHill 1 year ago

Thank you for this excellent video! Can you point me to information about how to build a new model? I am a linguist working on very low resource languages and want to get my PhD students to learn how to do ASR on the languages they study. Thanks!!

@alejandrootiniano10 1 year ago

I got a file-not-found error with AudioSegment.from_file(data). I don't know why it happened, but I fixed it by using AudioSegment.from_wav(data).

@speechjjong7065 2 years ago

Thank you for sharing!

@Jay-pj3wm 2 years ago

Hi, it's not working on my Ubuntu 20.04. It gets stuck at "you can start speaking now". Can you help, please?

@harshj6153 1 year ago

How would you change the code if you had to run it without an internet connection, with the model already downloaded to the local machine? Thanks

@manasomali 2 years ago

Hi, I am working on a transcription project in a specific domain, with uncommon words. I want to take a base model like this one and reinforce the training with some domain-specific data and expressions. Is that doable?

@codigo_logo 2 years ago

Yes, it's possible.

@yohannesayana7613 1 year ago

How can I train a new model for another language from scratch? And how can I fine-tune the pretrained model?

@ak-wk7qp 1 year ago

Did you get an answer? I'm searching for the same thing.

@manasomali 2 years ago

On another subject, is there a Telegram group for the channel, or something similar? I think it would be nice to be able to discuss this topic with the Código Logo community.

@codigo_logo 2 years ago

We have a Discord server.

@titusfx 1 year ago

Is it possible to get the timestamp of each word, i.e. where it starts and ends?

@ArchitAnant 1 year ago

DeepSpeech2 does that; not sure about Wav2Vec2.
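On the word-timestamp question: the Wav2Vec2 CTC tokenizer in transformers accepts `output_word_offsets=True` when decoding, which returns start and end frame indices for each word. The sketch below assumes the `facebook/wav2vec2-base-960h` checkpoint and 16 kHz input. The function names are invented for this example, and this is not necessarily how the video's code works.

```python
def offsets_to_seconds(word_offsets, seconds_per_frame):
    """Convert per-word frame offsets into start/end times in seconds."""
    return [
        {
            "word": item["word"],
            "start": round(item["start_offset"] * seconds_per_frame, 2),
            "end": round(item["end_offset"] * seconds_per_frame, 2),
        }
        for item in word_offsets
    ]


def transcribe_with_timestamps(audio, model_id="facebook/wav2vec2-base-960h"):
    """Return (text, word timings) for a sequence of 16 kHz float samples."""
    # Imported lazily so offsets_to_seconds can be used on its own.
    import numpy as np
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    rate = processor.feature_extractor.sampling_rate

    inputs = processor(
        np.asarray(audio, dtype=np.float32),
        sampling_rate=rate,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    pred_ids = torch.argmax(logits, dim=-1)[0]

    # Ask the CTC tokenizer to also report which frames each word spans.
    decoded = processor.tokenizer.decode(pred_ids, output_word_offsets=True)

    # Each output frame covers inputs_to_logits_ratio input samples.
    seconds_per_frame = model.config.inputs_to_logits_ratio / rate
    return decoded.text, offsets_to_seconds(decoded.word_offsets, seconds_per_frame)
```

For this checkpoint one output frame covers 320 input samples, so at 16 kHz each frame is about 0.02 s and the timings are only accurate to roughly that resolution.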

@harshj6153 1 year ago

Is it possible for this code to work without an internet connection?

@codigo_logo 1 year ago

It does work without internet, but you need to download the models!

@harshj6153 1 year ago

@codigo_logo Yes, but after I download the model, where do I specify the path in the code? I couldn't figure that part out.
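On where the path goes: in transformers, `from_pretrained` accepts a local folder in place of a model name, so one possible approach is to save the model to disk once and point to that folder afterwards. The sketch below is an assumption-laden example, not the video's code: the folder `models/wav2vec2-base-960h` and the function names are hypothetical, and the checkpoint name is assumed.

```python
from pathlib import Path

LOCAL_DIR = Path("models/wav2vec2-base-960h")  # hypothetical folder, choose your own


def download_once(model_id="facebook/wav2vec2-base-960h", target=LOCAL_DIR):
    """Run this once while online: fetch the model and save it to a folder."""
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    Wav2Vec2Processor.from_pretrained(model_id).save_pretrained(target)
    Wav2Vec2ForCTC.from_pretrained(model_id).save_pretrained(target)


def looks_like_saved_model(folder):
    """Cheap sanity check: a saved model folder contains a config.json file."""
    return (Path(folder) / "config.json").is_file()


def load_offline(folder=LOCAL_DIR):
    """Load the processor and model from disk without contacting the Hub."""
    if not looks_like_saved_model(folder):
        raise FileNotFoundError(f"No saved model found in {folder}")

    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Pass the folder path where the model name would normally go.
    processor = Wav2Vec2Processor.from_pretrained(folder, local_files_only=True)
    model = Wav2Vec2ForCTC.from_pretrained(folder, local_files_only=True)
    return processor, model
```

With this layout, the only change to existing code would be replacing the model name string passed to `from_pretrained` with the local folder path.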

@shashanksadafule4870 2 years ago

Is using this free of cost?

@jindhu2608 2 years ago

Can it work offline?

@codigo_logo 2 years ago

It works offline, but you need quite a lot of RAM to load the model into memory.

@jindhu2608 2 years ago

@codigo_logo Thanks for replying

@jindhu2608 2 years ago

I want to use it for a Jarvis-style assistant, and I'm trying to make it work in offline mode. For now I have to use a Vosk model for it. The Vosk model doesn't work too well, but it's better than the others.

@jindhu2608 2 years ago

@codigo_logo Is there any way I can run a Python file in the background? For example, I want the Python file to run in the background so that when I say "Jarvis, activate" or a similar command, Jarvis starts replying to me. But it should work in offline mode. The main problem is keeping the Python file running in the background throughout the day. It should also be able to recognise my commands and answer them. Is there any way to do it? Please help.

@harshj6153 1 year ago

@jindhu2608 Hey, I am working on something similar. Have you figured it out? I might need some help.