Adding a Custom Layer to a HuggingFace Pretrained Model | BERT | NLP | Machine Learning | PyTorch

  8,443 views

Rohan-Paul-AI

1 year ago

🔥🐍 Check out the MASSIVELY UPGRADED 2nd Edition of my Book (with 1300+ pages of dense Python knowledge), covering 350+ core Python 🐍 concepts
🟠 Book Link - rohanpaul.gumroad.com/l/pytho...
---------------------
I am a Machine Learning Engineer | Kaggle Master. Connect with me on 🐦 TWITTER: / rohanpaul_ai - for daily in-depth coverage of Machine Learning / LLM / OpenAI / LangChain / Python intricacies.
Code in GitHub - github.com/rohan-paul/LLM-Fin...
======================
You can find me here:
**********************************************
🐦 TWITTER: / rohanpaul_ai
👨🏻‍💼 LINKEDIN: / rohan-paul-ai
👨‍🔧 Kaggle: www.kaggle.com/paulrohan2020
👨‍💻 GITHUB: github.com/rohan-paul
🧑‍🦰 Facebook Page: / rohanpaulai
📸 Instagram: / rohan_paul_2020
**********************************************
Other Playlist you might like 👇
🟠 Natural Language Processing with Deep Learning : bit.ly/3P6r2CL
🟠 Machine Learning & Deep Learning Concepts & Interview Questions Playlist - bit.ly/380eYDj
🟠 Data Science | Machine Learning Projects Implementation Playlist - bit.ly/39MEigt
🟠 Computer Vision / Deep Learning Algorithms Implementation Playlist - bit.ly/36jEvpI
#NLP #machinelearning #datascience #textprocessing #kaggle #tensorflow #pytorch #deeplearning #deeplearningai #100daysofmlcode #pythonprogramming #100DaysOfMLCode

Comments: 25
@CppExpedition
@CppExpedition 1 year ago
Spectacular project!! I'll be sharing this video!
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Very satisfying to know it was helpful, @cexploreful.
@raniaoslh6106
@raniaoslh6106 2 months ago
AMAZING tutorial, thank you so much for this detailed explanation!!!
@RohanPaul-AI
@RohanPaul-AI 2 months ago
Very happy to know you liked it.
@caiyu538
@caiyu538 1 year ago
Great tutoring.
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Thanks @caiyu538
@eddyandrade4567
@eddyandrade4567 1 year ago
How can I use the newly trained model to make predictions? I have tried with pipeline, but it didn't work.
@DLwithShreyas
@DLwithShreyas 1 year ago
Sir, I have a translation model which I want to push to Hugging Face, but it gives an error: "Functional" object not supported.
@jaujau363
@jaujau363 1 month ago
Hello, thanks for the video. Is it possible to train this model with the Hugging Face Trainer?
@ahatsham8538
@ahatsham8538 9 months ago
Hi, thank you for making this video. If I have to do multi-task fine-tuning, what should I do? Do I need to create a custom loss function that takes care of both tasks?
@RohanPaul-AI
@RohanPaul-AI 9 months ago
You'll typically use a shared base model, for instance a BERT model. On top of this, you add task-specific heads. For example, if you plan to do both NER and sentiment classification, you might have a dense layer producing token-level predictions for NER and another dense layer producing a single prediction for sentiment. Each batch of data should contain labels for both tasks.
For the loss calculation: each task gets its own loss (e.g., cross-entropy for both NER and sentiment). The total loss for a batch is then the sum of these losses, possibly weighted if you want to give more importance to one task over the other. It would look something like this in PyTorch:
```py
loss_ner = criterion_ner(predictions_ner, targets_ner)
loss_sentiment = criterion_sentiment(predictions_sentiment, targets_sentiment)
total_loss = loss_ner + loss_sentiment
```
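For illustration, here is a minimal sketch of such a shared-base, two-head model. All names (MultiTaskBert, num_ner_labels, num_sentiment_labels) are hypothetical and not from the video:
```py
import torch.nn as nn
from transformers import BertModel

class MultiTaskBert(nn.Module):
    """Sketch only: one shared BERT body with two task-specific heads."""
    def __init__(self, num_ner_labels=9, num_sentiment_labels=2):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-cased")
        hidden = self.bert.config.hidden_size  # 768 for bert-base
        self.ner_head = nn.Linear(hidden, num_ner_labels)              # token-level head
        self.sentiment_head = nn.Linear(hidden, num_sentiment_labels)  # sequence-level head

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        sequence_output = outputs.last_hidden_state         # (batch, seq_len, hidden)
        ner_logits = self.ner_head(sequence_output)          # one prediction per token
        cls_output = sequence_output[:, 0, :]                # [CLS] token embedding
        sentiment_logits = self.sentiment_head(cls_output)   # one prediction per sequence
        return ner_logits, sentiment_logits
```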
@SarikaSarika-ge5cu
@SarikaSarika-ge5cu 10 months ago
Is there any GitHub repo or notebook for the question-answering task with a custom head? If there is, please share the source.
@user-zk1gt1nb3p
@user-zk1gt1nb3p 7 months ago
Hello, this is a very useful video and I am following it. Thank you so much. One thing: can you please show how to save the custom model "MyTaskSpecificCustomModel"? It raises errors for me. The original Transformer model can be saved with model.save_model or the training arguments, but a custom model (like one with layers added at the end of BERT) does not save well with many saving methods (missing config.json with the Trainer API / cannot save at all with native PyTorch). Can you please give some suggestions on how to save our custom model, load it back, and run inference from the reloaded model?
@navneetgupta4669
@navneetgupta4669 1 year ago
Can we use this same architecture for models like DistilGPT2? And how do we use the model that you created if we want to do multiclass classification?
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Yes, absolutely. You just need to extend the existing model and add your layers in the newly defined class.
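As an illustration (not from the video), a minimal sketch of extending a DistilGPT2 backbone with a multiclass head; the class name and num_classes are hypothetical:
```py
import torch.nn as nn
from transformers import AutoModel

class CustomDistilGPT2Classifier(nn.Module):
    """Sketch only: wrap a pretrained backbone and add a multiclass head on top."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.backbone = AutoModel.from_pretrained("distilgpt2")
        self.dropout = nn.Dropout(0.3)
        self.classifier = nn.Linear(self.backbone.config.hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        outputs = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        # GPT-style models have no [CLS] token; here we naively take the last position's
        # hidden state (for padded batches you would index the last non-padding token).
        last_hidden = outputs.last_hidden_state[:, -1, :]
        return self.classifier(self.dropout(last_hidden))
```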
@saluangja8470
@saluangja8470 1 year ago
What's the difference between using BERT with a custom layer (added Dropout & Linear) like in this video and using it without a custom layer, e.g., for text classification?
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Even for text classification you have to add a custom head layer, which may be a combination of Dropout layers and Linear layers (assuming you are just adding a plain vanilla neural network as the head). The body of the BERT model just outputs an embedding vector of size 768 for each of the tokens. Then, to use these vectors as input for our different kinds of NLP applications (e.g. text classification, next sentence prediction, Named-Entity Recognition (NER), or question answering), we need to add custom head layers.
For example, for a **text classification task**, we focus on the embedding vector output from the special [CLS] token. This means we're going to use the embedding vector of size 768 from the [CLS] token as the input for our classifier, which will then output a vector whose size is the number of classes in our classification task. And that's why we do something like the implementation below:
```
import torch.nn as nn
from transformers import BertModel

class BertClassifier(nn.Module):
    def __init__(self, dropout=0.5):
        super(BertClassifier, self).__init__()
        self.bert = BertModel.from_pretrained('bert-base-cased')
        self.dropout = nn.Dropout(dropout)
        self.linear = nn.Linear(768, 5)
        self.relu = nn.ReLU()
```
The above code assumes I am doing text classification and the number of classes is 5. So the Linear layer starts with a 768-dimensional vector (the output from the body of the BERT model), and at the end of the Linear layer we have a vector of size 5, where each element corresponds to a category of our labels or classes.
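A minimal forward pass consistent with the explanation above could look like this (a sketch to add inside the same class; variable names are illustrative):
```py
    # Hypothetical forward method for the BertClassifier sketched above
    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls_embedding = outputs.last_hidden_state[:, 0, :]  # (batch, 768): the [CLS] token
        x = self.dropout(cls_embedding)
        logits = self.relu(self.linear(x))                  # (batch, 5): one score per class
        return logits
```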
@emmamon3647
@emmamon3647 1 year ago
At 21:32: I don't understand why we discard most of the embedding output and take only the first row? I understand we want the logits to be of size (batch_size, num_labels), but aren't we excluding information?
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Thanks for watching. For the part `sequence_outputs[:, 0, :]`: here, `sequence_outputs` is a tensor of shape (batch_size, sequence_length, hidden_size), where batch_size is the number of input examples, sequence_length is the length of the input sequence, and hidden_size is the size of the hidden state for each token in the sequence. The notation [:, 0, :] selects the hidden state corresponding to the first token in each sequence (i.e., the [CLS] token in models that use the BERT architecture). The resulting tensor has shape (batch_size, hidden_size).
Going back to the fundamentals, the last hidden state output is the sequence of hidden states at the output of the last layer of the model. The output is usually [batch, seq_len, hidden_state], and it can be narrowed down to [batch, 1, hidden_state] for the [CLS] token, since the [CLS] token is the first token in the sequence. Here, [batch, 1, hidden_state] can equivalently be treated as [batch, hidden_state]. And that's what I am doing with `sequence_outputs[:, 0, :].view(-1, 768)`.
Since Transformers are contextual models, the idea is that the [CLS] token will have captured the entire context and will be sufficient for simple downstream tasks such as classification. Hence, for tasks such as classification using sentence representations, you can use [batch, hidden_state].
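A tiny shape check illustrating that slicing (the dimensions below are made up for the example):
```py
import torch

sequence_outputs = torch.randn(8, 128, 768)   # (batch_size, seq_len, hidden_size)
cls_states = sequence_outputs[:, 0, :]        # hidden state of the first ([CLS]) token
print(cls_states.shape)                       # torch.Size([8, 768])
print(cls_states.view(-1, 768).shape)         # torch.Size([8, 768]) -- equivalent
```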
@emmamon3647
@emmamon3647 1 year ago
@@RohanPaul-AI It's super clear, thanks ! 🙂
@hemalshah1410
@hemalshah1410 1 year ago
I have a question. In the DataLoader section, "get_scheduler" is giving me "TypeError: get_scheduler() got an unexpected keyword argument 'num_warmup_step'". Has something recently changed with this?
@RohanPaul-AI
@RohanPaul-AI 1 year ago
@Hemal, I would suggest uninstalling the current transformers library and then doing a fresh reinstall from source: pip install git+https://github.com/huggingface/transformers.git. Also check out this GitHub issue: github.com/huggingface/transformers/issues/1878#issuecomment-558238488
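One more thing worth checking (an assumption on my part, not from the original reply): in the transformers versions I'm aware of, the keyword is spelled num_warmup_steps (plural). A minimal usage sketch:
```py
import torch
from torch.optim import AdamW
from transformers import get_scheduler

model = torch.nn.Linear(10, 2)                 # placeholder model, just for illustration
optimizer = AdamW(model.parameters(), lr=5e-5)
lr_scheduler = get_scheduler(
    name="linear",
    optimizer=optimizer,
    num_warmup_steps=0,                        # note the plural: num_warmup_steps
    num_training_steps=1000,
)
```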
@hemalshah1410
@hemalshah1410 1 year ago
@@RohanPaul-AI hey 👋 Thank you for responding but I tried that already and it didn't work. However I'll give a fresh try soon. 😅
@venkateshr6127
@venkateshr6127 1 year ago
Please fine-tune a speech recognition model 🙏
@RohanPaul-AI
@RohanPaul-AI 1 year ago
Thanks for the suggestion. That may come a while later in the future.