
RAG explained step-by-step up to GROKKED RAG sys

  5,899 views

code_your_own_AI

A day ago

Today I try to answer all my subscribers' questions about my last three videos, with a focus on the new Grokked LLM integration into traditional RAG systems.
I'll cover a wide array of questions, from the ARM graph-based re-ranker for optimal RAG systems to the new "Buffer of Thoughts" (BoT) reasoning method for LLMs (so we have Chain-of-Thoughts, Tree-of-Thoughts, Graph-of-Thoughts, and now Buffer-of-Thoughts, to kind of force the LLM to solve a causal reasoning task, really?).
With the final answer on GROKKED RAG systems. Smile.
So in this video there is only a little bit of new AI research, but a lot of explanations for optimizing currently operating RAG systems that depend on a vector store, and how to improve their overall performance with grokked transformers (grokked LLMs).
#airesearch
#grokkedLLM
#grokkedRAG

Comments: 33
@MBR7833 2 months ago
Never commented on a YouTube video in 15 years, but I need to now. Thank you so much for your work and for questioning the status quo.
@code4AI 2 months ago
You are so welcome!
@wibulord926 2 months ago
Well, actually, your channel was created just 10 years ago.
@gileneusz 2 months ago
44:59 I came to the conclusion that maybe I should take 3 months off from AI and just go on a long holiday; I'll come back when it's all sorted out lol
@manslaughterinc.9135 2 months ago
The examples you give are RAG specifically for causal reasoning. The gains are much higher when RAG is used for domain-specific knowledge.
@code4AI 2 months ago
Causal reasoning works on domain-specific knowledge.
@mulderbm 2 months ago
Love the undertone, not sure everyone gets it 😅 Interesting take on the VCs trying to force the scientists to make it work, to get their ROI out of failed tech.
@gileneusz 2 months ago
38:10 1,000 dimensions and 10% efficiency. We need more dimensions, like 10,000 or 100,000,000,000, to get 15% efficiency 😵‍💫
@LamontCranston-qh2rv 2 months ago
A commenter on an earlier video suggested using quaternions instead of vectors. I wonder if that approach might actually save the VCs and startups from total ruin? Maybe worth a try? I do love that we are full circle back to building models though in the meantime! Outstanding work professor, as always! Thank you so much for creating and sharing these videos!
@BjornHeijligers 2 months ago
Nice idea. Quaternions solve the problem of mathematical singularities when calculating angles between vectors. But since we are not working with the actual angles, only with inner products or similarity scores, quaternions, as far as I can see, are not a natural fit for high-dimensional semantic vectors. Would love to be proven wrong, though.
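To make the commenter's point concrete, here is a minimal sketch (toy random vectors, not real embeddings) showing that cosine similarity is computed as an inner product of normalized vectors; no explicit angle is ever involved unless you take the arccos yourself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy "semantic" vectors in a high-dimensional embedding space.
a = rng.normal(size=1024)
b = rng.normal(size=1024)

# Cosine similarity is just the inner product of the unit vectors --
# no angle (and hence no angular singularity) appears in the computation.
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)
cosine = float(a_unit @ b_unit)

# The angle itself only shows up if we explicitly take the arccos.
angle = np.arccos(np.clip(cosine, -1.0, 1.0))
print(f"cosine similarity: {cosine:.4f}, implied angle: {angle:.4f} rad")
```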
@agsystems8220 2 months ago
The use of an 'apply logical reasoning' step should not be thought of as a weakness of the approach, because reasoning is inherently recursive. Any setup that doesn't have something like this is fundamentally bounded, no matter how well built or trained. That can be extremely powerful in an idiot-savant way, but can never really be called intelligent. There is no finite machine that could answer any question in one step.

I am completely with you that we are not doing it quite right, but I don't agree with you that parametric memory is the way forward (you didn't bring it up here, but you did in the last one). It has a fixed size, it doesn't scale efficiently, and I don't think you can really call a system that needs to train heavily on a reasoning technique before using it a reasoning engine. For that moniker it needs to be able to have a logical technique explained to it and immediately apply it to a problem without retraining. It needs to be able to reason about reasoning, and do it in a one-shot setting.

The reason it needs to be able to one-shot reasoning is that complex reasoning is not bounded in complexity, so it needs some way of unbounding its memory and run time. The obvious way of doing this is to let it fill in more tokens. At that point you might have some tokens saying something like "we should try induction, induction is done by ...", and the model needs to be able to follow that recipe. That will probably involve searching for further decompositions and relevant facts, often hitting dead ends, though training can improve how this search occurs. Importantly, you need to be able to mix information from global memory with local, context-specific information, so it absolutely makes sense for them to be in the same 'language'. Maintaining knowledge as a separate database is the only way to build something that really scales, and having this database also hold all but a minimal set of foundational logical tools seems sensible.

The specifics of how you do this are hard, though.
@iham1313 2 months ago
In regards to your aversion to RAG (which I can relate to): how do you build an AI system that is able to cite back from texts (or video/audio), if not by using embeddings, metadata & RAG? Let's say you want to build a domain-specific AI tool. It should gather information from different sources, and when asked about something within that domain, it should provide an answer including the reference (page of a document, timestamp from audio/video, text block from a webpage). I struggle to see when to use which strategy.
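One way to frame the question above: citation support is a property of how chunks are stored, not of the retrieval strategy. A minimal sketch (the class, corpus, and keyword-overlap scorer are all hypothetical stand-ins) that keeps a source locator with every chunk and returns it alongside the answer:

```python
from dataclasses import dataclass

# Each chunk carries the metadata needed to cite back to its source
# (page, timestamp, URL), independent of how retrieval is done.
@dataclass
class Chunk:
    text: str
    source: str   # e.g. file name or URL
    locator: str  # e.g. "page 12" or "00:14:32"

corpus = [
    Chunk("Grokking is delayed generalization.", "paper.pdf", "page 3"),
    Chunk("RAG retrieves context before generation.", "talk.mp4", "00:05:10"),
]

def retrieve(query: str) -> Chunk:
    # Stand-in scoring: keyword overlap. Swap in embedding similarity,
    # BM25, or a graph walk -- the citation mechanism stays the same.
    def score(c: Chunk) -> int:
        return len(set(query.lower().split()) & set(c.text.lower().split()))
    return max(corpus, key=score)

hit = retrieve("what does RAG do before generation")
answer = f"{hit.text} [{hit.source}, {hit.locator}]"
print(answer)
```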
@mattager5548 2 months ago
I don't think the end game of grokking is to get really good at giving users existing data; the hope is to be able to reason about novel things that humans might be unable to, or just haven't gotten around to yet. RAG seems like the best solution for our current generation of AI, which isn't that reliable.
@_Han-xk1zv 2 months ago
Are you familiar with the reasons for conducting re-ranking step? Specifically, given the premise of extracting relevant document candidates using only DPR, I'm curious about your perspective on why we'd need to conduct re-ranking using a cross-encoder, in addition to extracting relevant document candidates by computing cosine similarity with a bi-encoder.
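The usual rationale: the bi-encoder is cheap because document embeddings can be precomputed offline, but it scores query and document independently; the cross-encoder attends over the (query, document) pair jointly, which is more accurate but too slow to run over the whole corpus, so it is applied only to the top-k candidates. A minimal sketch of that two-stage pipeline, with toy deterministic stand-ins for both models:

```python
import zlib
import numpy as np

DIM = 64

def bi_encode(text: str) -> np.ndarray:
    # Toy stand-in for a bi-encoder: a deterministic pseudo-embedding
    # per text, computable for each document independently (and offline).
    seed = zlib.crc32(text.encode())
    v = np.random.default_rng(seed).normal(size=DIM)
    return v / np.linalg.norm(v)

def cross_score(query: str, doc: str) -> float:
    # Toy stand-in for a cross-encoder: scores the (query, doc) PAIR
    # jointly; a real one attends over both token sequences at once.
    return len(set(query.split()) & set(doc.split())) / (len(doc.split()) + 1)

docs = [f"doc {i} about topic {i % 5}" for i in range(100)]
doc_vecs = np.stack([bi_encode(d) for d in docs])  # precomputed index

query = "doc about topic 3"
q = bi_encode(query)

# Stage 1: fast candidate retrieval over ALL docs (cosine = inner product).
candidates = np.argsort(doc_vecs @ q)[-10:]

# Stage 2: slow, accurate cross-encoder re-ranking over 10 candidates only.
reranked = sorted(candidates, key=lambda i: cross_score(query, docs[i]),
                  reverse=True)
best = docs[reranked[0]]
print(best)
```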
@gileneusz 2 months ago
20:11 This is a framework for large companies or suicidal startups
@ngbrother 2 months ago
With a long enough pre-prompt and context window, can you trigger a grokked transformer phase transition through in-context learning only?
@sfsft11 2 months ago
For in-context learning there is no actual learning or updating of model parameters, so no grokking there.
@En1Gm4A 2 months ago
Just a basic question: isn't reasoning also able to be done via search in a semantic graph, as shown in that one paper? They were able to visually show the trace needed to solve the task. Why does one need a grokked transformer? Shortest-path search between semantic concepts, or something like that.
@En1Gm4A 2 months ago
q* ?
@code4AI 2 months ago
Sure, we have all the graph-based message passing in the world, like this video I did 2 years ago: kzfaq.info/get/bejne/n8WEoJaLtrnHpmw.html&t But here we are talking about a different technology ... take a minute and think about it.
@En1Gm4A 2 months ago
@@code4AI you mean a more expensive way to do the same thing?
@jaredgreen2363 2 months ago
Yes, but it's inherently slow. If the graph of inferred facts branches at all, it will require time at least exponential in the length of the resulting line of reasoning. Providing heuristics to pick the most promising path to extend reduces that significantly. That is what a language model can be used for.
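The heuristic-guided idea can be sketched as best-first search over a toy inference graph, where the `promise` table is a hypothetical stand-in for a language model scoring how promising a partial reasoning path looks:

```python
import heapq

# Toy inference graph: an edge is a one-step inference between "facts".
graph = {
    "A": ["B", "C"], "B": ["D", "E"], "C": ["F"],
    "D": ["G"], "E": ["G"], "F": ["G"], "G": [],
}

# Stand-in heuristic (lower = more promising); a language model would
# produce these scores for partial lines of reasoning.
promise = {"A": 0, "B": 1, "C": 3, "D": 1, "E": 2, "F": 1, "G": 0}

def best_first(start: str, goal: str) -> list[str]:
    # Expand the most promising frontier node first, instead of
    # enumerating every branch (which grows exponentially with depth).
    frontier = [(promise[start], [start])]
    seen = set()
    while frontier:
        _, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph[node]:
            heapq.heappush(frontier, (promise[nxt], path + [nxt]))
    return []

print(best_first("A", "G"))
```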
@fire17102 2 months ago
So you have to train a transformer for this? Can we fine-tune a base model on our data? Is this what X's Grok is doing?
@code4AI 2 months ago
Unfortunately I have no insights into Musk.
@fire17102 2 months ago
Is there a grokked GPT-4-level model on Ollama?
@code4AI 2 months ago
Smile.
@fire17102 2 months ago
@@code4AI How about a GPT-3.5-level model? Is this all purely hypothetical? Or is everyone just in stealth mode?
@code4AI 2 months ago
Stealth mode???? Meta is publishing about it. Google is publishing about it. Microsoft is publishing about it. OpenAI is publishing about it ......
@fire17102 2 months ago
@@code4AI Maybe I misunderstood... Basically I'm asking if there's a grokked model to play with, or not yet. Thanks 🙏
@artur50 2 months ago
Any GitHub project on that?
@code4AI 2 months ago
About 1000 on RAG ... for Grokked LLM you have to go with the research published by OpenAI, Microsoft, Meta and Google, just to name a few ... I haven't heard back from Apple yet.
@michaelmcwhirter 2 months ago
Are you monetized on YouTube yet? 😃
@wibulord926 2 months ago
??? what do you mean