Step-by-step guide to deploying E5-large-v2 (a top-ranking embedding model on the Hugging Face MTEB leaderboard; OpenAI's ada-002 embedding model sits at No. 6) on your own servers using AWS SageMaker.
By the end, you will have created a web-facing API that anyone with the right credentials can call to create embedding vectors from sentences or paragraphs. These embeddings can be used to retrieve information from your company's database and incorporate it into a ChatGPT prompt, so GPT can answer questions about new information it did not have access to when it was trained.
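The deployment flow the video walks through can be sketched with the SageMaker Python SDK. This is a hedged outline, not the video's exact notebook: the framework versions and the ml.g4dn.xlarge instance type are assumptions, and the code needs a real AWS execution role to actually run, which is why the deploy step is wrapped in a function.

```python
# Sketch of deploying E5-large-v2 as a SageMaker real-time endpoint.
# Versions and instance type below are illustrative assumptions.
HUB_ENV = {
    "HF_MODEL_ID": "intfloat/e5-large-v2",  # model id on the Hugging Face Hub
    "HF_TASK": "feature-extraction",        # pipeline that returns embeddings
}

def deploy_endpoint(role: str):
    """Build and deploy the model; returns a Predictor for the live endpoint."""
    from sagemaker.huggingface import HuggingFaceModel  # deferred: needs AWS creds

    model = HuggingFaceModel(
        env=HUB_ENV,
        role=role,                    # IAM role with SageMaker permissions
        transformers_version="4.26",  # assumed versions; use a supported combo
        pytorch_version="1.13",
        py_version="py39",
    )
    return model.deploy(
        initial_instance_count=1,
        instance_type="ml.g4dn.xlarge",  # GPU instance; adjust for budget
    )
```

Calling deploy_endpoint from a SageMaker notebook (where sagemaker.get_execution_role() supplies the role) creates the hosted endpoint that the rest of the video tests.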
Key moments:
0:00 - Hugging Face embedding leaderboard - how to find E5-large-v2
3:48 - Looking through Phil Schmid's blog
5:00 - Creating a notebook instance in AWS SageMaker
6:47 - Jupyter notebook walkthrough - 7 steps to deploying the embedding model endpoint
11:48 - What is average pooling?
17:39 - Checking the deployed model and endpoint, and getting the API URL
18:25 - Using Postman to test the API
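The average-pooling step at 11:48 is what turns the model's per-token embeddings into a single sentence vector: sum the token vectors that the attention mask marks as real (skipping padding) and divide by their count. A minimal NumPy sketch of the idea (the video's notebook works with PyTorch tensors, but the arithmetic is the same):

```python
# Average pooling: collapse (tokens, dim) token embeddings into one
# (dim,) sentence vector, ignoring padding via the attention mask.
import numpy as np

def average_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """token_embeddings: (tokens, dim); attention_mask: (tokens,) of 0/1."""
    mask = attention_mask[:, None].astype(float)    # (tokens, 1) for broadcasting
    summed = (token_embeddings * mask).sum(axis=0)  # sum over real tokens only
    count = mask.sum()                              # number of real tokens
    return summed / count

# Two real tokens and one padding token:
emb = np.array([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]])
mask = np.array([1, 1, 0])
print(average_pool(emb, mask))  # → [2. 3.]  (padding row is ignored)
```

Without the mask, padding tokens would drag the sentence vector toward whatever values the model happens to emit for pad positions, which is why the mask-weighted mean is used instead of a plain mean.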
Useful links:
Hugging Face MTEB Leaderboard: huggingface.co...
Phil Schmid Blog: www.philschmid...
Average Pooling: www.kaggle.com...
AWS access & secret key: aws.amazon.com...
Postman: www.postman.com
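The Postman test at 18:25 is a SigV4-signed POST against the endpoint's invocations URL; the same call can be made from Python with boto3's SageMaker runtime client. A hedged sketch (the endpoint name is a placeholder, and the call needs AWS credentials configured, so it is wrapped in a function):

```python
# Invoking the deployed embedding endpoint, equivalent to the Postman call.
import json

ENDPOINT_NAME = "e5-large-v2-endpoint"  # placeholder; use your endpoint's name

def embed(texts):
    """POST sentences to the endpoint; returns the model's embedding output."""
    import boto3  # deferred import: requires AWS credentials to be configured

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps({"inputs": texts}),
    )
    return json.loads(response["Body"].read())
```

This is the piece you would drop into a retrieval pipeline: embed your documents and the user's question, find the nearest documents, and paste them into the ChatGPT prompt.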