5,890 views
Correction: at 1:53, I said that an embedding is an x-digit string. That is not correct; an embedding is a list of x numbers.
Google Colab Code: colab.research.google.com/dri...
Link to Diagram: link.excalidraw.com/readonly/...
Why build your own retrieval-augmented generation (RAG) pipeline when OpenAI's custom GPTs can do it out of the box? As of the making of this video, the OpenAI solutions do not scale to large knowledge bases. Having your own pipeline also gives you much more control over the design, which you will need if you are building an enterprise-grade system.
In this tutorial, we will talk through a number of advanced techniques, such as sentence window retrieval, hierarchical auto-merging retrieval, returning top-k results vs. greedy search, and reranking.
We will also work through some code and run a real comparison of basic chunking against sentence window retrieval.
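To give a feel for the difference before opening the notebook, here is a minimal sketch of basic chunking next to sentence window retrieval. It is not the code from the Colab notebook. It assumes a toy bag-of-words vector in place of a real embedding model, and every function name (`embed`, `basic_chunks`, `top_k`, `sentence_window_retrieve`) is hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def basic_chunks(sentences, size=2):
    """Basic chunking: fixed groups of `size` sentences, matched and returned as-is."""
    return [" ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]


def top_k(query, candidates, k=1):
    """Indices of the k candidates most similar to the query, best first."""
    q = embed(query)
    scores = [cosine(q, embed(c)) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)[:k]


def sentence_window_retrieve(query, sentences, k=1, window=1):
    """Match on single sentences, then return each hit with `window` neighbours per side."""
    hits = top_k(query, sentences, k)
    return [" ".join(sentences[max(0, i - window): i + window + 1]) for i in hits]


doc = ("RAG pipelines retrieve context before generation. "
       "Embeddings are lists of numbers. "
       "Similar texts get similar embeddings. "
       "Reranking reorders the retrieved results.")
query = "What are embeddings?"

sentences = split_sentences(doc)
chunks = basic_chunks(sentences, size=2)

# Basic chunking: the matched chunk is exactly what gets passed to the LLM.
basic_result = [chunks[i] for i in top_k(query, chunks, k=1)]

# Sentence window: match on one sentence, pass its surrounding window to the LLM.
window_result = sentence_window_retrieve(query, sentences, k=1, window=1)

print(basic_result)
print(window_result)
```

The design point is that sentence window retrieval separates what is matched (one sentence, so the similarity score is not diluted by unrelated text) from what is returned (the sentence plus its neighbours, so the model still gets context). A real pipeline would swap `embed` for an actual embedding model and a vector store.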