Deep dive: model merging, part 2
Duration: 32:15
Comments
@donkeroo1 · 8 hours ago
What is the going rate for an AI scientist who can actually cobble together a functioning solution with significant return on investment? It has to be at least 500k. I'm not talking about those subpar Chemist/Engineer PhD "Data Scientists", but an actual AI scientist who can build from concept to production. Curious.
@sheikhshafayat6984 · 17 hours ago
The explanation was excellent. Thanks a lot!
@juliensimonfr · 15 hours ago
Glad it was helpful!
@SeasonBible · 2 days ago
Hello Julien, great video! I still have a question: if there is no table with a clear content description, as in a CV, can Amazon Textract extract the information from a PDF file and arrange it into boxes that could then be fed into different tools?
@juliensimonfr · 15 hours ago
Maybe, maybe not. Textract can pick up "loose" relationships between a label and an item living outside a table, but it's not going to invent labels or column names.
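For reference, here is a minimal boto3 sketch of pulling those "loose" key-value pairs out of a document with Textract's FORMS feature. The file name is a placeholder, and the synchronous API only handles single-page documents (multi-page PDFs need the async StartDocumentAnalysis API).

```python
import boto3

textract = boto3.client("textract")

# FORMS detects key-value pairs even outside tables (e.g. "Name: Jane Doe").
with open("resume.pdf", "rb") as f:  # placeholder file, single page
    response = textract.analyze_document(
        Document={"Bytes": f.read()},
        FeatureTypes=["FORMS"],
    )

# Key-value pairs come back as KEY_VALUE_SET blocks linked by relationships;
# here we just count them to keep the sketch short.
kv_blocks = [b for b in response["Blocks"] if b["BlockType"] == "KEY_VALUE_SET"]
print(f"Found {len(kv_blocks)} key/value blocks")
```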
@xspydazx · 3 days ago
Will you do a tutorial on DistilKit, please? And on evolutionary merging? I think pruning is a great way to extract a strong model from a larger one, but how? What are the dos and don'ts? How do you evaluate the information inside the tensors to select the correct layers to extract? Is there a way to preview the result before applying a prune, i.e., how do you choose a subset of layers to run instead of the whole stack? (See the sketch after this comment.)

In some pre-training runs people have expanded the model layer by layer, while others started with the full layer stack. If a model was expanded layer by layer, pruning is a viable option; but if the whole stack was trained together, pruning is less prudent because the layers are tightly interconnected across their whole range, whereas expanded models are heavily connected locally. Hence the need to identify the data held in specific layers, and to be able to run a subset of layers. I would expect it to work like extracting a PEFT adapter from the selected subset of layers and running or merging that adapter with a model. I'm not fully versed in these tasks, but if you're working on merging technology, that implies a deeper understanding of what happens inside a pre-trained model and its layers; in theory it should not be hard to extract a specific subset of layers to make a new model. Only alignment would be required afterwards, since the layers still contain their probabilities; they are just no longer highly connected. Alignment should be done with a dataset the model was already trained on, to re-align existing knowledge and the tensors with the new layer stack and recover that hidden data.

So before distilling, I would suggest fine-tuning the model on a specific dataset for 1,000 steps, deliberately over-training on those examples, and then, after distillation, retraining on the same dataset to re-align. That gives you a baseline to match: if the previously trained data aligns, you can infer that the rest of the previous data also aligned to the new layer stack. Papers are always nice, but experiments and actually doing it are better; it's nice to see the final result of the distillation.

What we glean from this is that data does not need to travel through many layers to produce a prediction. In fact, I personally believe that the longer the context you want to use, the more layers you need to hold the sequence. So small models with great pre-training will be great at pre-trained tasks with conversation-sized contexts, performing the same as a larger stack on the same task set given a small context, but losing ground with larger contexts. For simple tasks they are perfect: a slim model is an ideal agent for a dedicated tool, e.g. an entity detector or a documenter (smaller essays rather than large summaries), etc.
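For readers wondering how a layer subset can be extracted in practice, here is a minimal sketch using the transformers library. It assumes a Mistral-style decoder whose blocks live in model.model.layers; which layers to keep is an arbitrary illustration, not a recommendation, and the pruned model still needs "healing" (re-alignment) afterwards.

```python
import torch.nn as nn
from transformers import AutoModelForCausalLM

# Load the donor model (Mistral-style decoder-only architecture).
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Keep the first 24 of 32 decoder blocks: an arbitrary choice for illustration.
keep = range(24)
model.model.layers = nn.ModuleList(model.model.layers[i] for i in keep)
model.config.num_hidden_layers = len(keep)

# The pruned model still runs, but needs re-alignment on familiar data.
model.save_pretrained("mistral-7b-pruned-24L")
```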
@noinktechnique · 4 days ago
Every few weeks I have to read some threads in r/machinelearning and search out content like this to flush the buzzwords and singularity fanfiction that is so prevalent these days out of my brain. Thanks for posting! I'm looking forward to listening to the full discussion.
@juliensimonfr · 3 days ago
You're welcome. The full discussion is at www.twitch.tv/videos/2170990579
@xspydazx · 4 days ago
Before I watch this (I will comment afterwards as well; I like surprises): how do you train a model for function calling? I see that you had a dataset with fake calls inside. I have used these datasets before and prompted models the way they prompt them, so yes, my model can call functions. But during training, should we actually set the model up to perform the real function call, or provide a set of canned results that fake the call, so the model can access those results during training? It seems function calling can only really be trained on conversational histories containing the full back-and-forth, and function-calling datasets do not always provide this; often they provide only some of the calls, and not exactly as they were committed. We see the verbose outputs these models produce, and I would expect we should be training on the full verbose data, which is not the case with these datasets. The problem I'm highlighting is that in such data the model never pauses between function calls or attempts the internal conversation, so it might not reflect a true possible answer from the model (as is typical of synthetic data), which means the model is never really trained on function calling.

I also found that if you want to do function calling, the model should be correctly aligned. Most function calling is done on a hosted model, so the model needs to be trained as a messaging model (ChatML), and if you convert it to GGUF it needs to stay that way (for Ollama or LM Studio). But for general unhosted usage in GPT4All it needs to be in the ChatQA format or it will not work. If you have the raw tensors, you can change the template on the fly to suit your needs. So when creating a model for a specific purpose (e.g. ReAct), these are also the considerations (the whole ReAct process, not just the ins and outs of the data), as well as making sure to train in ChatML. I'm not sure your dataset was ChatML-styled, so I had to wrangle it into the Q/A prompt method! As always, I expect a great video!
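To make the "full back-and-forth" point concrete, here is a sketch of what a single function-calling training record can look like when it includes both the call and its result. The role names and JSON layout are illustrative assumptions; different chat templates use different conventions.

```python
# One training sample covering the whole loop: user request, the model's tool
# call, the tool's (possibly canned) result, and the final grounded answer.
sample = [
    {"role": "user", "content": "What's the weather in Paris right now?"},
    {"role": "assistant",
     "content": '{"name": "get_weather", "arguments": {"city": "Paris"}}'},
    {"role": "tool",  # the executed call's output, seen during training
     "content": '{"temperature_c": 18, "conditions": "partly cloudy"}'},
    {"role": "assistant",
     "content": "It's currently 18°C and partly cloudy in Paris."},
]
```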
@xspydazx · 4 days ago
Also, one more question: what is the actual difference between tools and functions? I noticed that with OpenAI, a function call is singular while tool calls are plural! I also recently found that the graph method is very good and helps create the agentic workflow (passing your state down the path), so the model can have fewer functions or tools, using the graph entry points as decision trees. Oops... chains of thought, lol!
@arnabsinha2339 · 5 days ago
Amazing performance! So, should this model be used for RAG as-is, or also fine-tuned? Looking forward to your video on distillation. Thanks for the awesome content.
@juliensimonfr · 5 days ago
Thank you! Not sure how fine-tuning would affect distillation, that's an interesting idea :)
@giedrel1s · 5 days ago
Sadly, anything under 8-12B is garbage for anything other than transcription or other lightweight menial tasks.
@juliensimonfr · 5 days ago
Bold statement, but what do I know? Thanks for watching :)
@oryxchannel · 6 days ago
19:55 I think one or all of the creators (Pala Tej Deep, Rishabh Bhardwaj, Soujanya Poria) are also fans of J Dilla, the legendary music "sampler".
@juliensimonfr · 5 days ago
I have no idea, but I'll trust you on that :*)
@butcher977 · 7 days ago
So basically the LLM just redirects to the corresponding Python functions based on predefined rules and prints the results? Is this how agents generally work?
@juliensimonfr · 6 days ago
Yes, then you extract the function call from the answer, run it, and often feed the result back to the LLM for story writing.
@xspydazx · 4 days ago
Yes, you can prompt the model with an example of a function-call output and parse the call from the response, or you can use the function-calling and tool-calling support in these libraries. Local models don't have this, so we use Pydantic to try to force the model to produce structured output. I found that with local models it's very slow, but if you host the same model the function calling is much faster. The outputs are generally JSON-based: that is the real requirement for structured output, just being able to parse the data from the response. A model can produce calls anywhere in the response and we can still find them easily with a regex, so it's still parseable. I use this method mostly to learn how the models work (meta-prompting, for example, is very powerful), and it's transferable to other programming languages. I personally use VB.NET: after I figured out how to build a transformer, and after I watched Karpathy load GPT-2 by writing some code, I was able to go back to my original programming language and even load a Mistral model!
@xspydazx · 4 days ago
Also, in this model setup every response is a function call, which makes it easier: your straight response is also a function call!
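A minimal sketch of the loop described in this thread: extract a JSON function call from the model's raw text with a regex, dispatch it to a local Python function, and feed the result back. The function names and the call format are assumptions for illustration, not any particular library's API.

```python
import json
import re

def get_weather(city):
    # Stand-in for a real tool; returns a canned result.
    return f"18C and partly cloudy in {city}"

TOOLS = {"get_weather": get_weather}

def run_tool_call(model_output):
    # Find the first JSON object in the response, wherever it appears.
    match = re.search(r"\{.*\}", model_output, re.DOTALL)
    if not match:
        return None  # plain-text answer, no tool call
    call = json.loads(match.group(0))
    result = TOOLS[call["name"]](**call["arguments"])
    # In a real agent, this result is appended to the chat history
    # and sent back to the LLM to write the final answer.
    return result

print(run_tool_call('Sure! {"name": "get_weather", "arguments": {"city": "Paris"}}'))
```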
@hemangnehra7389 · 8 days ago
Finance has always adopted technological breakthroughs first. I work at a tech company, and, fortunately or unfortunately, finance will not adopt AI because, as the person said, there is too much risk.
@jamescash4065 · 8 days ago
Very important ❤
@juliensimonfr · 8 days ago
Yes. Different country, different culture, different rules.
@darkmatter9583 · 11 days ago
Hi, I have a dataset of JSON files totaling 5 GB. How can I use that data? Uploading it inside OpenAI only accepts very little data.
@user-wr4yl7tx3w · 15 days ago
no audio
@juliensimonfr · 15 days ago
There isn't any. The demo speaks for itself :)
@bhanuchirutha · 15 days ago
Great, I agree. Sometimes you have to spend more time on IAM than on the original problem. What a mess.
@juliensimonfr · 15 days ago
Yes, even if you know what you're doing, it's difficult to be 100% sure 🤣
@mourady5588 · 16 days ago
Thank you very much Julien for this high-quality excerpt! Could you please attach the slides in the description, as well as under the other videos?
@juliensimonfr · 15 days ago
Hi, you'll find the slides at fr.slideshare.net/slideshow/julien-simon-deep-dive-optimizing-llm-inference/270920916. I'll share the other ones in the next week or so.
@mourady5588 · 15 days ago
@juliensimonfr thanks a lot!
@pavansaitadepalli6097 · 17 days ago
Julien, this was a great video
@juliensimonfr · 16 days ago
Thank you!
@alexis91459 · 18 days ago
Super cool! But why, in speculative decoding, is the validation step performed by the bigger model faster? I don't understand how validation works.
@juliensimonfr · 15 days ago
Good question. The main reason is that verification by the larger model only requires a single forward pass per candidate sequence. This is much faster than the usual text generation process, which requires one forward pass per new token. If the larger model disagrees on a particular token, it generates a better one and continues from there; however, all the tokens generated up to that point by the smaller model are kept as-is. So in the end we get large-model generation quality, only quicker :) Makes sense? Here's a detailed example: huggingface.co/blog/whisper-speculative-decoding
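For the curious, the transformers library exposes this as assisted generation: you pass the small draft model as assistant_model to generate(). A minimal sketch; the model choices below are illustrative assumptions (the two models must share a tokenizer).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Large target model and a small draft model from the same family,
# so they share a tokenizer (illustrative choices).
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
target = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
draft = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

inputs = tokenizer("Speculative decoding works because", return_tensors="pt")

# The draft proposes several tokens; the target verifies them all
# in a single forward pass and keeps the longest agreeing prefix.
outputs = target.generate(**inputs, assistant_model=draft, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```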
@arnabsinha2339 · 18 days ago
Awesome video Julien. When is part 3 coming?
@juliensimonfr · 18 days ago
Thank you. Which algos are you interested in?
@francoisdev · 20 days ago
Thank you, Julien!
@juliensimonfr · 20 days ago
With pleasure!
@AI-Projects24 · 20 days ago
Is there any chance to get the slides? It's very well organized and presented. Thank you so much for your work ✨🔥🔥
@juliensimonfr · 15 days ago
Hi, you can find the slides on Slideshare at fr.slideshare.net/slideshow/julien-simon-deep-dive-quantizing-llms/270921785
@Lifelessons-sv7pr · 20 days ago
Will it work on stabilityai/sd-turbo? I am unable to make it work 😢
@juliensimonfr · 20 days ago
I don't know. You should ask for help at discuss.huggingface.co or create an issue in the Optimum Intel repository.
@arnabsinha2339 · 20 days ago
Julien, this was a great video and a walk down memory lane. Thank you for taking the time to do this. One question: if the PyTorch folks natively support CPU/GPU/TPU via torch.compile, is the OpenXLA integration only there to support future AI accelerators?
@juliensimonfr · 18 days ago
Thank you. Yes, I think the purpose of OpenXLA is to provide a unified interface hiding the complexity of custom accelerators (AWS Trainium, etc.).
@xspydazx · 23 days ago
I think the quantization step is very important, as it reduces the size of the model. It would be good to be able to push it to Hugging Face so it can be quantized to GGUF in my repo; the model would then be about 4.5 GB, as the full version is a very large download (especially for us slow consumers: I used to be in a fast-internet country, but now I'm in a slow one). Hence cloud dev is the only way to have the same capabilities without requiring up-to-date hardware, and a pay-as-you-go service is very important! Don't forget the little guys: how can I teach my local learners, who don't even have laptops? Computing can even be done from a good phone or tablet, so I feel the market could be EMEA countries, as well as India, a big consumer of cloud services, and of course students!
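Quantization in one picture: a sketch of loading a model in 4-bit with bitsandbytes through transformers, which cuts a 7B model from roughly 14 GB in fp16 to around 4-5 GB. This assumes a CUDA GPU is available; GGUF conversion for llama.cpp is a separate workflow.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit quantization with fp16 compute (assumes a CUDA GPU).
config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=config,
    device_map="auto",
)
print(model.get_memory_footprint() / 1e9, "GB")  # roughly 4-5 GB
```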
@xspydazx · 23 days ago
Good series! Quite calm, followable, and (nearly) repeatable.
@xspydazx · 23 days ago
OK, I think the code platform for operating on your model is very good, as this will have to be the way to get custom models and datasets up. Good stuff.
@xspydazx · 23 days ago
I really hope we can use Hugging Face datasets with this, as I have already placed my best datasets there (so convenient).
@juliensimonfr · 23 days ago
You can use HF datasets for alignment, see upload_hugging_face_dataset_qa_pairs() in the Arcee SDK. For pretraining, we don't have that option yet. I'll share your feedback, thank you.
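For readers wondering what that might look like, a rough sketch follows. Only the function name comes from the reply above; the import path and parameter names are guesses on my part, so check the Arcee SDK docs for the real signature.

```python
from datasets import load_dataset
from arcee import upload_hugging_face_dataset_qa_pairs  # import path assumed

# Pull a QA dataset from the Hugging Face Hub (real datasets API)...
ds = load_dataset("squad", split="train[:1000]")

# ...then hand it to Arcee for alignment. The parameter names below are
# hypothetical; only the function name is confirmed by the reply above.
upload_hugging_face_dataset_qa_pairs(
    dataset=ds,
    question_column="question",
    answer_column="answers",
)
```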
@xspydazx · 23 days ago
@juliensimonfr I think it's a lovely, simple site, exactly what is needed: just do your simple tasks and leave. In truth you don't need to store everything on the site, though it does help to have a space. But I also see storage as an expense, so I would expect pay-as-you-go to have some limitations on this.
@xspydazx · 23 days ago
That merge, sir! It's a good merge; I have also done it, in many ways, all experiments to learn how merges work (before your great video on merging techniques). I would have put Mistral 2 as the base model, since you could justify that you are upgrading BioMistral to the Instruct v2 version of Mistral: by taking the original base (Mistral 1) and setting the new desired base model (Mistral 2), the merger will use the new tokenizer and config from Mistral 2 and merge the rest of the model list into that model. I can see that you used Mistral 1 as the base because it was the original base model for BioMistral, but I would not treat the raw base model as the merge base; rather as a task vector for aligning BioMistral to the new chosen base. I could have chosen BioMistral's own base model, but that would have kept BioMistral essentially the same, since it would only grab the deltas between the two Mistral models: not much improvement, although it would preserve BioMistral's tokenizer, which may carry conceptual meaning and relations for its unique biomedical domain data. If I were mixing a BioMistral and a code Mistral (DolphinCoder), I would choose a neutral base model, i.e. the most common base among the collection of models being merged. The base model is the model that takes priority as the base? Maybe? Nice video as usual.
@xspydazx · 23 days ago
In the guided mode, a default YAML could probably be offered for each type of merge, as there are some new ones whose YAML syntax I do not know!
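For reference, here is a sketch of a mergekit YAML config for a DARE-TIES merge in the spirit of the thread above. The model choices and parameter values are illustrative assumptions; check the mergekit README for the exact options each merge method supports.

```yaml
# Sketch: DARE-TIES merge of BioMistral and Mistral Instruct onto a common base.
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: BioMistral/BioMistral-7B
    parameters:
      density: 0.5
      weight: 0.5
  - model: mistralai/Mistral-7B-Instruct-v0.2
    parameters:
      density: 0.5
      weight: 0.5
dtype: float16
```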
@xspydazx · 23 days ago
Yes, but the tiers are very expensive. They need a pay-as-you-go option so we can load just the credits required for a single job; it's very restricted, and I didn't even see an easy way to pay. How do I get my current Mistral onto the site for training? I base-trained it with my personal data and changed my settings, so I can't train on the base and output a PEFT adapter for attachment. Will this connect directly to Hugging Face models and datasets, allowing import and export of the final models? For me it would be good to store (not host) my model while a training project is underway (a month or two), test it with the deployment feature (very good), then upload it to Hugging Face when complete, delete it, and import the next model. I primarily use 7B models, but I would like to try mixing modalities, e.g. adding an encoder from an audio/image input model, so optional encoder or decoder steps would be welcome. The test deployment is vital for this, which is where Hugging Face falls short: we need free endpoints for models up to 7-8B, perhaps only for short visits, but at least to test. The site looks easy to use without having to sit in a Jupyter notebook for daily tasks, so a pay-as-you-go feature is vital for rapid expansion; a monthly fee at that price is more than an open-source user can afford (by comparison, Colab at 10-20 euros lasts quite a long time and can handle a serious training run if need be; its only downfalls are that it's monitored and has availability issues for the best instances). I hope you overcome these usability issues, as I think they are key to living up to the phrase "helping the open-source developer". I think the test rig is the real service, not the hosting (Hugging Face Spaces is excellent for hosting, and I expect its performance to keep improving, allowing a single GPU to manage 7-8B models; for training and development HF is not good, so each has its purpose). I also like that the user does not need to understand the machine configuration requirements, which has blocked me from spending on cloud services before (even Hugging Face). Colab's pricing tiers are actually quite fair, and Google Cloud, if you can get it connected, looks well crafted, despite being censored and restricted in various countries due to political affinities; your data is at risk with Google, whereas independent open-source providers are safer since you choose what to share, and anything can be deleted and re-uploaded at will: freedom in the hands of the user. So there is a lot to consider in my post, sorry, but I wish you luck with this and I'm excited to watch the whole series before deciding (but 300... phew, no way, I wish).

My current goal with my models is recall and personality, as well as implementing methodologies such as chain of thought and visual/spatial reasoning; this is what I train my Mistral for. I have noticed you do not need a lot of data to train for such functionality: 1,000-2,000 samples are enough to teach a model a new trick. First over-fit the task on 100 samples (loss 0.2/0.1/0.0064), then train it on 1,000-2,000 samples to a good fit (0.5 loss), training the LM head with most layers activated (pushing around 345M parameters), then train normally with a LoRA (rank 8-16) on a large dataset of 10,000, fitting anywhere under a loss of 1, ranging from 1.2-1.5 down to 0.5 (this lets the model become a predictor). Now the task is trained: the prompt can be removed and the model realigned to Alpaca, allowing the new methodology to be absorbed into the model. Remembering the prompt, you can activate the task using that same prompt! So: lots of small training sessions.
@RemekKinas · 25 days ago
Great series!
@juliensimonfr · 25 days ago
Thank you!
@divyagarh · 29 days ago
Hi Julien, I followed the exact steps for Meta's newer model, Llama 3.1 8B Instruct, and get this error on SageMaker: "The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. The tokenizer class you load from this checkpoint is 'PreTrainedTokenizerFast'. The class this function is called from is 'LlamaTokenizer'." Any thoughts? Please help.
@juliensimonfr · 28 days ago
Is this the deployment failing? Which cell gives you this error?
@divyagarh · 28 days ago
@juliensimonfr The deploy cell. I found these errors in the log. A lot of people are experiencing the same issue.
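A common workaround for this class of warning, in case it helps: let AutoTokenizer pick the right class from the checkpoint instead of instantiating LlamaTokenizer directly. A sketch only, not a confirmed fix for this specific deployment failure (the warning is often benign and the real error may be elsewhere in the logs).

```python
from transformers import AutoTokenizer

# AutoTokenizer reads tokenizer_config.json and returns the class the
# checkpoint was saved with (PreTrainedTokenizerFast for Llama 3.1),
# avoiding the LlamaTokenizer mismatch warning. Gated repo: requires access.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
print(type(tokenizer).__name__)
```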
@Renozilla · a month ago
Thanks for sharing, amazing content
@juliensimonfr · a month ago
Thank you!
@melikanobakhtian6018 · a month ago
That was great and it helped me so much! Is there any possibility of getting the presentation slides?
@juliensimonfr · 15 days ago
Hi, you can find the slides on Slideshare at fr.slideshare.net/slideshow/julien-simon-deep-dive-model-merging/270921708
@FushigiMigi · a month ago
I need to know how to communicate from Python code with chat models that are running. I'm struggling to find this information.
@juliensimonfr · a month ago
Check out the Inference Endpoints documentation. The format is simple JSON.
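For example, a minimal sketch of calling an Inference Endpoint from Python. The endpoint URL and token are placeholders; the {"inputs": ...} payload follows the documented format for text-generation endpoints.

```python
import requests

# Placeholders: copy the real URL from your endpoint's page, and use a
# Hugging Face token that has access to it.
ENDPOINT_URL = "https://xxxxxx.endpoints.huggingface.cloud"
HF_TOKEN = "hf_..."

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {HF_TOKEN}",
             "Content-Type": "application/json"},
    json={"inputs": "Write a haiku about model merging.",
          "parameters": {"max_new_tokens": 64}},
)
print(response.json())
```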
@itayatelis2898 · a month ago
Naw Julien you left HF??
@juliensimonfr · a month ago
Yep :)
@JayPrakash-py3sh · 7 days ago
Mm @juliensimonfr
@CMAZZONI · a month ago
Hello, thank you so much for making this video. The only question I have is that some models do not have a deploy option in their model card (for example, the GLiNER models). Is there a way to use these? Many thanks!
@juliensimonfr · a month ago
You're welcome! You mean these, right: huggingface.co/urchade ? Not 100% sure, but they don't seem to be supported by the transformers library (see huggingface.co/docs/transformers/main/en/model_doc/bert), so this would explain why they can't be deployed in the standard way. The alternative would be to deploy it in a pytorch environment with the appropriate dependencies, see cloud.google.com/blog/topics/developers-practitioners/pytorch-google-cloud-how-deploy-pytorch-models-vertex-ai.
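In case it is the GLiNER family: those models ship with their own small library rather than plain transformers. Here is a minimal sketch based on the project's README, assuming `pip install gliner`; the example text and labels are made up.

```python
from gliner import GLiNER

# GLiNER models load through their own library, not AutoModel.
model = GLiNER.from_pretrained("urchade/gliner_base")

text = "Julien Simon recorded a deep dive on model merging for Arcee."
labels = ["person", "organization", "topic"]

for entity in model.predict_entities(text, labels):
    print(entity["text"], "->", entity["label"])
```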
@Hotboy-q7n · a month ago
Is there a course to help understand this?
@juliensimonfr · a month ago
What would you like to understand? Building models? Running them?
@Hotboy-q7n · a month ago
@juliensimonfr Thanks for answering, Julien!!! I want to know how to use Arcee/Scribe to develop a bot like Character AI or Replika, in a chatting style. I am willing to do whatever it takes. I want to know how to create and train it.
@xspydazx · 23 days ago
@Hotboy-q7n I agree. Character building means trying to imprint a personality, or a specific mode of speech and reaction to situations; not necessarily role-play models. I have been experimenting with movie scripts and subtitles to give the model conversations that carry some form of character. I removed personal references like names; I now realize you can replace them with a tag like [name], and if you chat with the model and tell it your name, it will fill in the slot with your actual name on the fly, so I will probably use this technique more and more. I think the main issue is handling the conversations, or converting existing step-by-step data into conversations, i.e. giving the model dialogues to personalize it. I used the Samantha dataset (it does reduce raw performance, but I replaced the name with my own), and I also used my chat histories from other apps, since they contain my style; by giving the model some personal data, it essentially becomes my own character. When I fine-tune, I change the PEFT settings and run many epochs until I get the desired loss for the information (the lower the loss, the higher the priority). It makes a big difference.
@ChouaibNemri · a month ago
Congrats on joining Arcee! Great demo! Keep inspiring and empowering us! <3
@juliensimonfr · a month ago
Thank you so much!
@yacinezahidi7206 · a month ago
Thank you, Julien! Does this work with Vision Language Models as well?
@juliensimonfr · a month ago
Never tried it, sorry. Ask at discuss.huggingface.co :)
@brightworld7550 · a month ago
Very helpful, thank you so much. One question about pricing: does Vertex AI bill based on usage (how many seconds the model is running) or for as long as the endpoint is up, 24/7?
@juliensimonfr · a month ago
It's instance-based, so you pay for instance time as long as it's up.
@samyrockstar · a month ago
Is a SageMaker endpoint a good option for Llama 3 in production?
@juliensimonfr · a month ago
Sure! SageMaker is solid :)
@masanmola · a month ago
Finally, I found an awesome video that shares the story of Hugging Face and presents their products. Thank you for creating this!
@juliensimonfr · a month ago
Glad you enjoyed it!
@fantasyapart787 · a month ago
I can see that this video was uploaded 3 years ago. Is it still valid? I mean the features and navigation.
@juliensimonfr · a month ago
The UI has changed, the SDK is probably still very very similar.
@SO-vq7qd · a month ago
Is there a way to connect this to a custom domain? I want to create a simple chat-interface web app.
@juliensimonfr · a month ago
A Vertex endpoint is a vanilla HTTP API. Given the right credentials, network permissions, etc., you can invoke it from your apps just like any web API.
@SO-vq7qd · a month ago
Thank you!
@juliensimonfr · a month ago
You're welcome!
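For instance, with the google-cloud-aiplatform SDK. The project, region, and endpoint ID below are placeholders, and the expected instance schema depends on how the model was deployed.

```python
from google.cloud import aiplatform

# Placeholders: use your own project, region, and endpoint ID.
aiplatform.init(project="my-project", location="us-central1")
endpoint = aiplatform.Endpoint("1234567890")  # hypothetical endpoint ID

# The instance format depends on the deployed serving container.
prediction = endpoint.predict(
    instances=[{"inputs": "Tell me about model merging."}]
)
print(prediction.predictions)
```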
@jiegong529 · a month ago
Thanks so much for the crystal clear explanations! You understand them so well and it's even more amazing how you show them in bullet points and graphs to make your audience understand as well!
@juliensimonfr · a month ago
Glad you liked it, thank you!
@josephazhikannickel4188 · a month ago
Hi Julien, thank you for the valuable contribution; your videos are impressive, love them. May I ask one question: does deploying Llama 3 on SageMaker meet HIPAA compliance? The data is highly confidential. Hope you can help with an answer. Thank you!
@juliensimonfr · a month ago
This should help :) aws.amazon.com/about-aws/whats-new/2018/05/Amazon-SageMaker-Achieves-HIPAA-Eligibility/
@user-ff2tf3cw9j · a month ago
Can I use these transformer models for the Persian language?
@juliensimonfr · a month ago
Yes, check them out on the Hugging Face hub: huggingface.co/models?language=fa
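For example, a quick sketch with a community Persian BERT from the Hub; the model ID is one popular option and is assumed to still be available.

```python
from transformers import pipeline

# HooshvareLab publishes widely used Persian (Farsi) BERT models.
fill_mask = pipeline("fill-mask", model="HooshvareLab/bert-fa-base-uncased")

# "We [MASK] in artificial intelligence."
for pred in fill_mask("ما در هوش مصنوعی [MASK] می کنیم."):
    print(pred["token_str"], pred["score"])
```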
@billykotsos4642 · a month ago
Very informative as always!
@juliensimonfr · a month ago
Glad it was helpful!
@user-vo5ce6kn5t · a month ago
Hi, I have a question. I have deployed a fine-tuned Llama 3 on AWS, but it generates repeated answers, and if I adjust the payload it cuts off the end of the response. Please advise; I have changed the max lengths and max token sizes multiple times and still face the issue.
@nb9t7 · a month ago
Hey Julien, where can we find the training video for the food dataset? Also, I am trying to deploy a model on Hugging Face Inference, but it errors out saying I need a config.json file, and I'm not sure how to create it. Any leads would be really helpful. Thanks!
@juliensimonfr · 15 days ago
Hi, I think this is the right video: kzfaq.info/get/bejne/q6yop89ottu5pqM.html. Yes, your model repository needs a config.json file, which is generated automatically when you save your trained model. See the docs at huggingface.co/docs/inference-endpoints/index