GraphRAG: LLM-Derived Knowledge Graphs for RAG

61,871 views

Alex Chao

1 month ago

Watch my colleague Jonathan Larson present on GraphRAG!
GraphRAG is a research project from Microsoft exploring the use of knowledge graphs and large language models for enhanced retrieval augmented generation. It is an end-to-end system for richly understanding text-heavy datasets by combining text extraction, network analysis, LLM prompting, and summarization.
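As a rough illustration of the stages named above (extraction, network analysis, summarization), here is a minimal Python sketch. It is not the GraphRAG implementation: the triples are hard-coded where an LLM would extract them from text, connected components stand in for real community detection, and the "summary" is a plain string join where an LLM prompt would go. All function names and sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical (source, relation, target) triples. In the described system
# an LLM would extract these from raw text; here they are hard-coded.
TRIPLES = [
    ("Alice", "works_with", "Bob"),
    ("Bob", "reports_to", "Carol"),
    ("Dave", "founded", "Acme"),
]


def build_graph(triples):
    """Build an undirected adjacency map from the triples."""
    graph = defaultdict(set)
    for source, _relation, target in triples:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def find_communities(graph):
    """Group nodes by connected component.

    Assumption: this is only a stand-in for a real community detection
    algorithm, chosen because it needs no third-party library.
    """
    seen, communities = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        communities.append(sorted(members))
    return communities


def summarize(community, triples):
    """Placeholder summary: join the facts that fall inside one community.

    A real pipeline would pass these facts to an LLM prompt instead.
    """
    members = set(community)
    facts = [f"{s} {r} {t}" for s, r, t in triples if s in members and t in members]
    return "; ".join(facts)


graph = build_graph(TRIPLES)
communities = find_communities(graph)
summaries = [summarize(c, TRIPLES) for c in communities]
```

The point of the sketch is the shape of the data flow: per-community summaries are computed once at indexing time, so a broad question can be answered from the summaries rather than from individual text chunks.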
For more details on GraphRAG check out aka.ms/graphrag
Read the blogpost: www.microsoft.com/en-us/resea...
Check out the arxiv paper: arxiv.org/abs/2404.16130
And follow me on other platforms so you’ll never miss out on my updates!
💌 Sign up for my free AI newsletter Chaos Theory: alexchao.substack.com/subscribe
🐦 Follow me on Twitter / alexchaomander
📷 And Instagram! / alexchaomander
🎥 And TikTok! / alexchaomander
👥 Connect with me on LinkedIn / alexchao56

Comments: 97
@alexchaomander 27 days ago
What scenarios do you see GraphRAG being useful for?
@jtjames79 27 days ago
Using GraphRAG to make GraphRAGs. Because AI should be able to go down the rabbit hole.
@alexanderroodt5052 27 days ago
Profiling people
@Sergio-rq2mm 27 days ago
Anywhere relationships are important: abstract associations between data sets, perhaps laws, policies, etc., and things that are very narrative-driven, such as stories. Atypical datasets, basically.
@alexanderroodt5052 27 days ago
@Sergio-rq2mm I choose to go the 1984 route
@ktbumjun 26 days ago
Bible study
@alexanderbrown-dg3sy 27 days ago
This is basically causal grounding. We figure semantic symbolic reasoning, from an architectural perspective. Add a powerful model… something very compelling and AGI-like would be the result, I would assume (plus MCTS sampling lol). Causal grounding is a huge hole in current models. This is dope research. Kudos.
@user-dk8dm8db8t 27 days ago
Looking forward to the code for this!
@jcourson8 25 days ago
I've been doing work in the area of creating knowledge graphs for codebases. The nice thing about generating them for code (as opposed to text) is that you don't have to rely on LLM calls to recognize and generate relationships, but you can utilize language servers and language parsers for that.
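To make the parser-based approach in this comment concrete, here is a small sketch that derives "calls" relationships from Python source using the standard-library `ast` module, with no LLM involved. The sample source and the `call_edges` helper are hypothetical, and the sketch only handles calls to plain names inside top-level functions; method calls and imports would need more work or a language server.

```python
import ast

# Hypothetical source file to analyze.
SOURCE = '''
def load(path):
    return open(path).read()

def parse(path):
    text = load(path)
    return text.split()

def main():
    parse("data.txt")
'''


def call_edges(source):
    """Return sorted (caller, callee) pairs for plain-name calls
    made inside each top-level function definition."""
    edges = set()
    tree = ast.parse(source)
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            # Only direct calls such as load(path); attribute calls
            # such as text.split() are skipped in this sketch.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges.add((func.name, node.func.id))
    return sorted(edges)


edges = call_edges(SOURCE)
```

Each `(caller, callee)` pair can be loaded into a graph as a directed edge, which gives exact relationships at parse speed rather than LLM-inferred ones.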
@lalamax3d 28 days ago
Glad I didn't skip this and watched the video. Thanks for sharing the knowledge; seems very impressive.
@iukeay 15 days ago
The last 5 min of the video was epic!!!!! Dude, amazing stuff!!! Also thanks for the tip on having the LLM generate the graph
@ChetanVashistth 26 days ago
This seems very powerful. Thanks for sharing it and explaining it well.
@peteredmonds1712 27 days ago
This was so well explained, nicely done. My first thoughts:
1. I'd be curious to see benchmarks with cheaper LLMs. From my experience, even much smaller models like llama-3-8b can come close to GPT-4 in this use case (entity and relationship extraction). A little fine-tuning could likely match or surpass GPT-4 for much less money.
2. I wonder how this could be augmented with data sources that already have some concept of relationships, i.e. Wikipedia, dictionaries, hypertext.
@mrrohitjadhav470 22 days ago
i was having thoughts🙂
@Rkcuddles 6 days ago
GPT-4 not understanding these deep relationships is by far the biggest bottleneck in me using it. This is super exciting
@andydataguy 28 days ago
That final streamlit app was awesome!!
@mvasa2582 28 days ago
While RAG is a good process for reducing hallucinations, GraphRAG makes the retrieved context richer with its relationship-building techniques. The expense is worth it. Is the result set then re-graphed, or will running the same query twice be just as expensive?
@TomBielecki 22 days ago
I really like the addition of hierarchical agglomerative summarization, which gives holistic answers similar to the RAPTOR RAG strategy but with the better data representation of knowledge graphs. I'll need to read the paper to understand whether embeddings are used at all in this, and whether relationships are labelled or just have a strength value.
@filippomarino861 26 days ago
This could be a game-changer in both public and private-sector intelligence analysis (as I am sure you figured out). Looking forward to additional info, but what about the private dataset's format? Is it vectorized? If so, can we assume that there are optimal and sub-optimal approaches? (IOW, is it fair to assume vectorization can significantly impact GraphRAG's performance?)
@dhirajkhanna-thebeardedguy 27 days ago
This is outstanding stuff!
@lifedownunderse 11 days ago
I really enjoyed this video! What tool did you use to visualise the podcast graph?
@pablof3326 26 days ago
Great work! I was thinking of using a system like this to build the memory of an AI companion as it talks to the user. In this case the knowledge graph would start empty and grow dynamically with every conversation. Do you see this as a good use case for GraphRAG?
@escanoxiao6871 22 days ago
Fabulous work! Wondering how long it takes to build a whole vector DB, and how many tokens it takes?
@Rkcuddles 6 days ago
Please let me play with this! Impressive work !
@Aditya_khedekar 27 days ago
Hi, I am working on solving the same problem: vector-search RAG is not good. Can you please share the code? A tutorial would be even better!!
@ghostwhowalks2324 13 days ago
This is just brilliant
@knaz7468 2 days ago
Run this on the Lex Fridman podcast library!
@heterotic 27 days ago
How is this any different from Self-Organizing Maps for RAG?
@sairajpednekar8049 29 days ago
May I know the underlying technology used for hosting the graph database? Was it Cosmos DB?
@nas8318 28 days ago
Likely Neo4j
@alexchaomander 27 days ago
It's graph database agnostic! You can use your choice of Graph DB. The technique is general enough to support multiple
@LadharAmir 22 days ago
It's not about the database, it's about the methodology. RDF or PL graphs should both work
@Mrbeastifed 12 days ago
Is there an open-source implementation of this, or how could I build it into my own app?
@jasonjefferson6596 12 days ago
Does the repeated term “regular RAG” refer to setups using vector databases?
@mrstephanwehner 28 days ago
Is there no standard comparison approach? For example, one could take academic literature reviews, collect their references, throw in some more, and ask the LLM system. Then compare the result with the original review. There might be summaries available in the accounting and legal world that could also be used.
@alexchaomander 27 days ago
Comparison is tough! It's another area of research we're heavily invested in. But I like the ideas that you're bringing up!
@sathyanarayanbalaji2971 26 days ago
True, validation would be required to compare the results.
@olegpopov3180 28 days ago
What is the technology stack for that?
@JasonSun386 22 days ago
Seems like the video was incomplete. Is there another part?
@GigaFro 14 days ago
Excuse me if I’m wrong (I listened to this while exercising), but the main issue explored here for each question was that questions like “what are the top themes?” cannot be answered by the LLM with vanilla RAG. Is this correct? If so, then if context size grows large enough this will be less necessary, right? Furthermore, by introducing a graph that has communities premised on topics/themes or whatever you decide, doesn’t that reduce the degrees of freedom of your system?
@phillipmaire8637 4 days ago
Would love the opportunity to contribute to this project, super interesting. How easy is it to update existing knowledge graphs periodically when new data comes in? Is there a “reindexing” cost?
@joserfjunior8940 21 days ago
GraphRAG Perfect !
@En1Gm4A 28 days ago
pls provide the code
@alexchaomander 27 days ago
Code will be shared soon!
@SamuelJunghenn 27 days ago
+1 🙏
@En1Gm4A 27 days ago
@alexchaomander Great! I have signed up for your newsletter. Will you announce the code release there?
@Lutz1985 26 days ago
le dot
@bejn5619 25 days ago
+1
@hjl1045 26 days ago
When will it be open sourced? :)
@Thrashmetalman 23 days ago
Is there source code anywhere for this?
@SDAravind 1 day ago
What's the database used?
@DefenderX 24 days ago
Great, this is something I also thought about when AI had difficulty finding relevant information a while back. Basically have filters to determine how the AI will maneuver the training data depending on what is prompted and relevance. This is something I thought about after reading a paper on the discovery of a new hybrid braincell type that acted as a trigger that could turn on and off pathways. So the context in the prompt is what's important. Because that decides which tags in the training data should be turned on and off. Which in the end will give you a unique pathway for the AI to retrieve data.
@DefenderX 24 days ago
Also, the next step would create overarching filters between several AI agents. After you have all this, the next step is for AI to implement statistics in its reasoning.
@malikanaser8251 27 days ago
Hi, are you going to share the code?
@FitoreKelmendi-fm1tg 3 days ago
Does ChatGPT (paid version) use GraphRAG?
@NobleCaveman 21 days ago
Would be a great tool for rapid and more reliable meta analysis
@ABG1788 1 day ago
I don't understand. Why do we need GraphRAG when an LLM can summarise the text and find relationships?
@pabloe1802 27 days ago
To understand semantic search, first you need to understand how HNSW works; then you realize it's no wonder it doesn't work. I ended up building a data structure to combine vector search and entities
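The comment does not describe the data structure, so the following is only one guess at what combining vector search with entities could look like: an index that first narrows candidates by entity and then ranks them by cosine similarity. The `HybridIndex` class and its methods are hypothetical, and the similarity search is brute force rather than HNSW.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HybridIndex:
    """Hypothetical index: entity lookup to filter, vectors to rank."""

    def __init__(self):
        self.vectors = {}    # doc_id -> embedding vector
        self.by_entity = {}  # entity -> set of doc_ids mentioning it

    def add(self, doc_id, vector, entities):
        self.vectors[doc_id] = vector
        for entity in entities:
            self.by_entity.setdefault(entity, set()).add(doc_id)

    def search(self, query_vector, entity=None, k=3):
        """Return up to k doc_ids, most similar first.

        If an entity is given, only documents tagged with it are ranked.
        """
        if entity is not None:
            candidates = self.by_entity.get(entity, set())
        else:
            candidates = set(self.vectors)
        scored = [(cosine(query_vector, self.vectors[d]), d) for d in candidates]
        # Sort by descending score, breaking ties by doc_id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for _score, doc_id in scored[:k]]


index = HybridIndex()
index.add("d1", [1.0, 0.0], ["Alice"])
index.add("d2", [0.9, 0.1], ["Bob"])
index.add("d3", [0.0, 1.0], ["Alice"])

unfiltered = index.search([1.0, 0.0])
alice_only = index.search([1.0, 0.0], entity="Alice")
```

With the entity filter, `d2` drops out even though it is the second-closest vector, which is the kind of behavior pure nearest-neighbor search cannot express.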
@MahmoudAtef 28 days ago
But knowledge graphs are very slow to query. I wonder if we can encode those graphs in the gpt model by building graph transformers.
@damianlewis7550 28 days ago
I don’t think that’s the case. Optimized graph query engines can return results in milliseconds e.g. WikiMedia, Google etc. at a fraction of the computational cost of an LLM. The reason that GraphRAG is slow-ish is because the LLMs are slow.
@MrDonald911 28 days ago
Google, Facebook, and LinkedIn all use graph databases; it's actually much faster than relational DBs
@nas8318 27 days ago
Slower than LLMs?
@Sri_Harsha_Electronics_Guthik 13 days ago
implementations?
@RickySupriyadi 11 days ago
Oh hey, that's the Obsidian style of note-making. It's interesting that AI can actually remember better with the help of Zettelkasten, like humans do!? Can't wait until Japanese researchers conclude their research using chemical reactions in a tube to emulate emotions, so machines can feel emotions through chemical reactions, like humans do... To me, emotions are also the best way to learn and remember things.
@RickySupriyadi 11 days ago
So what if, instead of a tube of chemical reactions, important information and frequently asked questions had an emotional-cue graph to create some kind of importance profile, so that the profile serves as a marker wherever the AI is the expert in that field (strong retrieval in a specific field, leading to the future of MoE)?
@user-wr4yl7tx3w 28 days ago
But don't you lose information in the process of making a knowledge graph, given that only a subset of the textual information is extracted and retained in the KG?
@computerrockstar2369 27 days ago
I don't think the LLM really needs the graph to make any decisions. It's more valuable for human users to find related information
@LadharAmir 22 days ago
You can use ETL to build your knowledge graph yourself from RDBMSs; then you will not lose information
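One way to read the ETL suggestion above: when the data already lives in relational tables, rows and foreign keys map directly onto graph triples, so nothing has to be inferred by an LLM. A minimal sketch using the standard-library `sqlite3` module follows; the schema, the sample rows, and the `member_of` relation name are all hypothetical.

```python
import sqlite3

# Hypothetical in-memory database standing in for an existing RDBMS.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT,
        department_id INTEGER REFERENCES department(id)
    );
    INSERT INTO department VALUES (1, 'Research');
    INSERT INTO employee VALUES (1, 'Ada', 1), (2, 'Grace', 1);
    """
)


def rows_to_triples(connection):
    """Turn each employee -> department foreign key into one graph triple."""
    query = """
        SELECT e.name, d.name
        FROM employee AS e
        JOIN department AS d ON d.id = e.department_id
        ORDER BY e.id
    """
    return [(emp, "member_of", dept) for emp, dept in connection.execute(query)]


triples = rows_to_triples(conn)
```

Because every triple comes from an explicit row, the graph is exactly as complete as the source tables, which is the sense in which this route avoids the extraction loss raised earlier in the thread.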
@tacticalgaryvrgamer8913 4 days ago
I assume it's open source because why would someone pay to have gpt4 parse and organize their data. Takes 2 seconds to roll your own.
@Walczyk 24 days ago
What's a RAG?
@IlyaDenisov 23 days ago
Retrieval Augmented Generation (use that as an input to your favourite search engine or AI companion)
@nickfleming3719 4 days ago
Okay... we know GraphRAG is good. Duh. How is it implemented, how do you feed it to the LLM, how do you store the data?
@knutoletube 14 days ago
Is the rest of this conversation available somewhere, @alexchaomander?
@lanc3carr 20 days ago
Police, FBI, CIA, etc... investigations (CSI AI)
@user-pd2pd1ho2h 26 days ago
American princess Google Plex SEO Sandra Mitra watching.....
@ross9263 16 days ago
The content is very political..
@timjrgebn 2 days ago
Haha, and skewed... Crickets for Gaza... but Odessa is worth mentioning? This is why it's best to avoid politics when we're trying to stay on task, especially when dealing with tech that's literally forming and pruning knowledge graphs based on topics/themes...