Adding a Custom Layer to a HuggingFace Pretrained Model | BERT | NLP | Machine Learning | PyTorch

  8,443 views

Rohan-Paul-AI

1 year ago

🔥🐍 Check out the MASSIVELY UPGRADED 2nd Edition of my Book (with 1300+ pages of dense Python knowledge), covering 350+ core Python 🐍 concepts
🟠 Book Link - rohanpaul.gumroad.com/l/pytho...
---------------------
I am a Machine Learning Engineer | Kaggle Master. Connect with me on 🐦 TWITTER: / rohanpaul_ai - for daily in-depth coverage of Machine Learning / LLM / OpenAI / LangChain / Python intricacies.
Code in GitHub - github.com/rohan-paul/LLM-Fin...
======================
You can find me here:
**********************************************
🐦 TWITTER: / rohanpaul_ai
👨🏻‍💼 LINKEDIN: / rohan-paul-ai
👨‍🔧 Kaggle: www.kaggle.com/paulrohan2020
👨‍💻 GITHUB: github.com/rohan-paul
🧑‍🦰 Facebook Page: / rohanpaulai
📸 Instagram: / rohan_paul_2020
**********************************************
Other Playlist you might like 👇
🟠 Natural Language Processing with Deep Learning : bit.ly/3P6r2CL
🟠 Machine Learning & Deep Learning Concepts & Interview Questions Playlist - bit.ly/380eYDj
🟠 Data Science | Machine Learning Projects Implementation Playlist - bit.ly/39MEigt
🟠 Computer Vision / Deep Learning Algorithms Implementation Playlist - bit.ly/36jEvpI
#NLP #machinelearning #datascience #textprocessing #kaggle #tensorflow #pytorch #deeplearning #deeplearningai #100daysofmlcode #pythonprogramming #100DaysOfMLCode

Comments: 25
@CppExpedition
@CppExpedition 1 year ago
Spectacular project!! I'll be sharing this video!
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Very satisfying to know it was helpful, @cexploreful.
@raniaoslh6106
@raniaoslh6106 2 months ago
AMAZING tutorial, thank you so much for this detailed explanation!!!
@RohanPaul-AI
@RohanPaul-AI 2 months ago
Very happy to know you liked it.
@caiyu538
@caiyu538 1 year ago
Great tutoring.
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Thanks @caiyu538
@eddyandrade4567
@eddyandrade4567 1 year ago
How can I use the newly trained model to make predictions? I have tried with pipeline, but it didn't work.
@DLwithShreyas
@DLwithShreyas 1 year ago
Sir, I have a translation model which I want to push to Hugging Face, but it gives an error: "Functional" object not supported.
@jaujau363
@jaujau363 1 month ago
Hello, thanks for the video. Is it possible to train this model with the Hugging Face Trainer?
@ahatsham8538
@ahatsham8538 9 months ago
Hi, thank you for making this video. If I have to do multi-task fine-tuning, what should I do? Do I need to create a custom loss function that takes care of both tasks?
@RohanPaul-AI
@RohanPaul-AI 9 months ago
You'll typically use a shared base model, for instance a BERT model. On top of this, you add task-specific heads. For example, if you plan to do both NER and sentiment classification, you might have a dense layer producing token-level predictions for NER and another dense layer producing a single prediction for sentiment. Each batch of data should contain labels for both tasks.
For the loss calculation: each task gets its own loss (e.g., cross-entropy for both NER and sentiment). The total loss for a batch is then the sum of these losses, possibly weighted if you want to give more importance to one task over the other. It would look something like this in PyTorch:
```py
loss_ner = criterion_ner(predictions_ner, targets_ner)
loss_sentiment = criterion_sentiment(predictions_sentiment, targets_sentiment)
total_loss = loss_ner + loss_sentiment
```
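For illustration, here is a minimal sketch of such a shared-base, two-head model. All names (MultiTaskBert, num_ner_labels, num_sentiment_labels) are hypothetical and not from the video:
```py
import torch.nn as nn
from transformers import BertModel

class MultiTaskBert(nn.Module):
    """Sketch only: one shared BERT body with two task-specific heads."""
    def __init__(self, num_ner_labels=9, num_sentiment_labels=2):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-cased")
        hidden = self.bert.config.hidden_size  # 768 for bert-base
        self.ner_head = nn.Linear(hidden, num_ner_labels)              # token-level head
        self.sentiment_head = nn.Linear(hidden, num_sentiment_labels)  # sequence-level head

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        sequence_output = outputs.last_hidden_state         # (batch, seq_len, hidden)
        ner_logits = self.ner_head(sequence_output)          # one prediction per token
        cls_output = sequence_output[:, 0, :]                # [CLS] token embedding
        sentiment_logits = self.sentiment_head(cls_output)   # one prediction per sequence
        return ner_logits, sentiment_logits
```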
@SarikaSarika-ge5cu
@SarikaSarika-ge5cu 10 months ago
Is there any GitHub repo or notebook for the question-answering task with a custom head? If there is, please share the source.
@user-zk1gt1nb3p
@user-zk1gt1nb3p 7 months ago
Hello, this is a very useful video and I am following it. Thank you so much. One thing: can you please show how to save the custom model "MyTaskSpecificCustomModel"? It raises errors for me. The original Transformer model can be saved with model.save_model or the training arguments, but a custom model (like one with layers added at the end of BERT) does not save well with many saving methods (missing config.json with the Trainer API / cannot save at all with native PyTorch). Can you please give some suggestions on how to save our custom model, load it back, and run inference from the reloaded model?
@navneetgupta4669
@navneetgupta4669 1 year ago
Can we use this same architecture for models like DistilGPT2? And how do we use the model that you created if we want to do multiclass classification?
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Yes, absolutely. You just need to extend the existing model and add your layers in the newly defined class.
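As an illustration (not from the video), a minimal sketch of extending a DistilGPT2 backbone with a multiclass head; the class name and num_classes are hypothetical:
```py
import torch.nn as nn
from transformers import AutoModel

class CustomDistilGPT2Classifier(nn.Module):
    """Sketch only: wrap a pretrained backbone and add a multiclass head on top."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.backbone = AutoModel.from_pretrained("distilgpt2")
        self.dropout = nn.Dropout(0.3)
        self.classifier = nn.Linear(self.backbone.config.hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        outputs = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        # GPT-style models have no [CLS] token; here we naively take the last position's
        # hidden state (for padded batches you would index the last non-padding token).
        last_hidden = outputs.last_hidden_state[:, -1, :]
        return self.classifier(self.dropout(last_hidden))
```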
@saluangja8470
@saluangja8470 1 year ago
What's the difference between using BERT with a custom layer (added Dropout & Linear) like in this video and using it without a custom layer, e.g., for text classification?
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Even for text classification you have to add a custom head layer, which may be a combination of Dropout layers and Linear layers (assuming you are just adding a plain vanilla neural network as the head). The body of the BERT model just outputs an embedding vector of size 768 for each of the tokens. Then, to use these vectors as input for our different kinds of NLP applications (e.g. text classification, next sentence prediction, Named-Entity Recognition (NER), or question answering), we need to add custom head layers.
For example, for a **text classification task**, we focus on the embedding vector output from the special [CLS] token. This means we're going to use the embedding vector of size 768 from the [CLS] token as the input for our classifier, which will then output a vector whose size is the number of classes in our classification task. And that's why we do something like the implementation below:
```
import torch.nn as nn
from transformers import BertModel

class BertClassifier(nn.Module):
    def __init__(self, dropout=0.5):
        super(BertClassifier, self).__init__()
        self.bert = BertModel.from_pretrained('bert-base-cased')
        self.dropout = nn.Dropout(dropout)
        self.linear = nn.Linear(768, 5)
        self.relu = nn.ReLU()
```
The above code assumes I am doing text classification and the number of classes is 5. So the Linear layer starts with a 768-dimensional vector (the output from the body of the BERT model), and at the end of the Linear layer we have a vector of size 5, where each element corresponds to a category of our labels or classes.
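A minimal forward pass consistent with the explanation above could look like this (a sketch to add inside the same class; variable names are illustrative):
```py
    # Hypothetical forward method for the BertClassifier sketched above
    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls_embedding = outputs.last_hidden_state[:, 0, :]  # (batch, 768): the [CLS] token
        x = self.dropout(cls_embedding)
        logits = self.relu(self.linear(x))                  # (batch, 5): one score per class
        return logits
```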
@emmamon3647
@emmamon3647 1 year ago
At 21:32: I don't understand why we discard most of the embedding output and take only the first row? I understand we want the logits to be of size (batch_size, num_labels), but aren't we excluding information?
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Thanks for watching. For the part `sequence_outputs[:, 0, :]`: here, `sequence_outputs` is a tensor of shape (batch_size, sequence_length, hidden_size), where batch_size is the number of input examples, sequence_length is the length of the input sequence, and hidden_size is the size of the hidden state for each token in the sequence. The notation [:, 0, :] selects the hidden state corresponding to the first token in each sequence (i.e., the [CLS] token in models that use the BERT architecture). The resulting tensor has shape (batch_size, hidden_size).
Going back to the fundamentals, the last hidden state output is the sequence of hidden states at the output of the last layer of the model. The output is usually [batch, seq_len, hidden_state], and it can be narrowed down to [batch, 1, hidden_state] for the [CLS] token, since the [CLS] token is the first token in the sequence. Here, [batch, 1, hidden_state] can equivalently be treated as [batch, hidden_state]. And that's what I am doing with `sequence_outputs[:, 0, :].view(-1, 768)`.
Since Transformers are contextual models, the idea is that the [CLS] token will have captured the entire context and will be sufficient for simple downstream tasks such as classification. Hence, for tasks such as classification using sentence representations, you can use [batch, hidden_state].
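A tiny shape check illustrating that slicing (the dimensions below are made up for the example):
```py
import torch

sequence_outputs = torch.randn(8, 128, 768)   # (batch_size, seq_len, hidden_size)
cls_states = sequence_outputs[:, 0, :]        # hidden state of the first ([CLS]) token
print(cls_states.shape)                       # torch.Size([8, 768])
print(cls_states.view(-1, 768).shape)         # torch.Size([8, 768]) -- equivalent
```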
@emmamon3647
@emmamon3647 1 year ago
@@RohanPaul-AI It's super clear, thanks ! 🙂
@hemalshah1410
@hemalshah1410 1 year ago
I have a question. In the DataLoader section, "get_scheduler" is giving me "TypeError: get_scheduler() got an unexpected keyword argument 'num_warmup_step'". Has something recently changed with this?
@RohanPaul-AI
@RohanPaul-AI 1 year ago
@Hemal, I would suggest uninstalling the current transformers library and then doing a fresh reinstall from source: pip install git+https://github.com/huggingface/transformers.git. Also check out this GitHub issue: github.com/huggingface/transformers/issues/1878#issuecomment-558238488
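One more thing worth checking (an assumption on my part, not from the original reply): in the transformers versions I'm aware of, the keyword is spelled num_warmup_steps (plural). A minimal usage sketch:
```py
import torch
from torch.optim import AdamW
from transformers import get_scheduler

model = torch.nn.Linear(10, 2)                 # placeholder model, just for illustration
optimizer = AdamW(model.parameters(), lr=5e-5)
lr_scheduler = get_scheduler(
    name="linear",
    optimizer=optimizer,
    num_warmup_steps=0,                        # note the plural: num_warmup_steps
    num_training_steps=1000,
)
```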
@hemalshah1410
@hemalshah1410 1 year ago
@@RohanPaul-AI hey 👋 Thank you for responding but I tried that already and it didn't work. However I'll give a fresh try soon. 😅
@venkateshr6127
@venkateshr6127 1 year ago
Please fine-tune a speech recognition model 🙏
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Thanks for the suggestion. That may come a while later in the future.