Custom LLM Fully Local AI Chat - Made Stupidly Simple with NVIDIA ChatRTX

15,818 views

Gamefromscratch


2 months ago

NVIDIA has recently updated ChatRTX, a free local LLM chatbot for people with NVIDIA graphics cards. The key features of ChatRTX are:
1 - it's local, so it's safer for your personal documentation AND query history
2 - it's free
3 - it's configurable
4 - it's easy
5 - it runs multiple LLM models
We look at how to use ChatRTX to search your own data, perfect for making your own local AI companion that runs entirely on your own machine.
This is of course not the only option out there for running LLMs locally, so we briefly mention some of the alternatives available.
Links
gamefromscratch.com/nvidia-ch...
-----------------------------------------------------------------------------------------------------------
Support : / gamefromscratch
GameDev News : gamefromscratch.com
GameDev Tutorials : devga.me
Discord : / discord
Twitter : / gamefromscratch
-----------------------------------------------------------------------------------------------------------

Comments: 98
@gamefromscratch 2 months ago
Links gamefromscratch.com/nvidia-chatrtx-easy-local-custom-llm-ai-chat/ ----------------------------------------------------------------------------------------------------------- *Support* : www.patreon.com/gamefromscratch *GameDev News* : gamefromscratch.com *GameDev Tutorials* : devga.me *Discord* : discord.com/invite/R7tUVbD *Twitter* : twitter.com/gamefromscratch -----------------------------------------------------------------------------------------------------------
@Theraot 2 months ago
I have used LM Studio; it will run on lower-end hardware, but expect very poor performance. You can try models that have been quantized, which will perform better but will be less precise (they can degenerate into random text). And I do not remember it having an easy way to reference files. About RAG, be aware that you want a model that has been trained on the general subject. That is: if a model is specialized in poetry, it probably won't do well with code even if you give it all the textbooks. Why? Because it is trying to rhyme; that is the pattern it learned. On the other hand, with the convenience of ChatRTX, you should be able to give it your project files - those that you would not dare upload to an online AI - and have it give you results based on them, specific to what you are doing. And let that be another reason to put in comments and choose good variable names: the better the context you can give the AI, the better. Finally, do not forget: garbage in, garbage out.
@TheRealAfroRick 2 months ago
RAG does not go to the web. With retrieval-augmented generation, the local data you provide is embedded (converted to numerical form) and stored in a vector database of some sort. Then, when you make a request to the chatbot, your query is also embedded using the same method as the data you previously embedded. A semantic search is performed (generally with cosine distance between the vectors) and the relevant data is sent to the LLM in its context window so it can base its response on the content in your data. This is done specifically to reduce hallucinations by the LLM, since it has never seen the data in your documents.
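To make the retrieval step described above concrete, here is a minimal sketch of embedding documents and ranking them against a query by cosine similarity. It assumes the sentence-transformers package and the all-MiniLM-L6-v2 embedding model purely for illustration; it is not ChatRTX's actual pipeline.

```python
# Minimal RAG retrieval sketch: embed documents, embed the query,
# rank by cosine similarity. Illustrative only; not ChatRTX's implementation.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "GDScript signals are declared with the 'signal' keyword.",
    "Vertex shaders run once per vertex before rasterization.",
    "A NavMesh lets agents plan paths around static obstacles.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")                    # embedding model (assumption)
doc_vectors = model.encode(documents, normalize_embeddings=True)   # this acts as the "vector database"

query = "How do I declare a signal in Godot?"
query_vector = model.encode([query], normalize_embeddings=True)[0]

# With normalized vectors, cosine similarity reduces to a dot product.
scores = doc_vectors @ query_vector
for score, doc in sorted(zip(scores, documents), reverse=True):
    print(f"{score:.3f}  {doc}")
```

The highest-scoring chunks are what get handed to the LLM in its context window.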
@Matlockization 1 month ago
'Vector database' sounds like the cloud, or 3rd-party data-mining operatives who are only too happy to pay for the privilege. People also have to understand that one of the AIs here is linked to Meta, which is owned by Mark Zuckerberg, who is famous for sharing people's data.
@micmacha 2 months ago
As a man who has countless useful EPUBs and PDFs, this looks very useful. I especially like that it will give you a list of its sources; not exactly a full citation yet, but very usable. However, I'm not terribly keen on it being Windows-only, and it's asking for a hefty graphics card and a lot of disk space for something I can do by hand. I think this is good news, and it shows that Nvidia is, if haphazardly, listening to the real concerns with LLMs. Otherwise it's becoming an extremely tired subject.
@MurphyArtPrints 2 months ago
What's your primary source for said PDFs and files? I need to start building a collection with the way things are going.
@micmacha 1 month ago
@@MurphyArtPrints Oh, I've scanned a number of them, and many others are from independent epub sellers like Humble Bundle and a few (legal) torrents. I'm with you on proprietary ebook viewers; it may be more durable and portable than paper but you never know when someone's going to pull the plug.
@D3bugMod3 2 months ago
Yoh, I will definitely spend some time playing with this. I was using Chat to help me write a story & lore bible, but Chat can only remember so much before you have to start a new conversation. Not to mention Chat's constant need to equivocate over nuanced or political ideas. I spent so much time getting it to see holes in its logic. No doubt this system will still have issues, but at least I won't have to keep starting over. Thanks as always.
@sergiofigueiredo1987 1 month ago
AnythingLLM is probably one of the best RAG chat-with-your-documents applications. It's open source, the developers are dedicated, and it packs a TON of configuration options.
@SimeonRadivoev 2 months ago
It's not training on your documentation/dataset, it's document retrieval. It literally just takes pieces of your documents and inserts them into the prompt.
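As a rough illustration of that "insert pieces into the prompt" step, a retrieval-augmented prompt might be assembled like the sketch below. The template, file names, and source labels are made up for the example; they are not taken from ChatRTX.

```python
# Hypothetical prompt assembly: stuff retrieved chunks (plus their sources)
# into the prompt so the model answers from them and can cite where they came from.
retrieved = [
    ("godot_docs.pdf", "Signals are declared with the 'signal' keyword."),
    ("gdscript_notes.txt", "Use emit_signal() or signal.emit() to fire a signal."),
]

context = "\n\n".join(f"[source: {name}]\n{chunk}" for name, chunk in retrieved)
question = "How do I declare and emit a signal?"

prompt = (
    "Answer the question using only the context below and list the sources you used.\n\n"
    f"{context}\n\nQuestion: {question}\nAnswer:"
)
print(prompt)  # this string is what actually gets sent to the LLM
```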
@VarietyGamerChannel 1 month ago
No it doesn't.
@tmanook 2 months ago
Interesting usage for AI. Seems like it could be handy. For me, I really want an easier time localizing my game. I still need to figure out the optimal way to do that.
@shotelco 28 days ago
Excellent! Thanks.
@AscendantStoic 2 months ago
LM Studio is great ... I use it quite often ... there is also Ollama, but as far as I know it doesn't have a UI; it's still easy to use though.
@a.aspden 2 months ago
You mentioned Copilot. Does this work as well as Copilot if you give it your code folder to train on?
@AnnCatsanndra 2 months ago
Easy to install and use, easy to train on my own data? Man this thing is gonna be killer for brainstorming and worldbuilding!
@samwood3691 2 months ago
The quote from HAL made me have to check this video out. A local LLM is a cool idea. Regarding Nvidia, they basically bailed on Linux, which really sucks, but hopefully that won't stop this from being made available on Linux soon.
@strangeboltz 2 months ago
Awesome video! thank you for sharing
@bdeva029 2 months ago
Nice video. This is good content
@Stealthy_Sloth 2 months ago
Llama 3 with Pinokio works great for this as well.
@mascot4950 2 months ago
My experience is that these small models fall apart really quickly, especially when it comes to generalized questions. For programming they seem to do a bit better, but the difference is still quite noticeable if you ask small and large models the exact same question. The first "oh, hey, this actually feels pretty close to at least ChatGPT 3.5" moment for me was Llama 3 70B, clocking in at 42GB in size. I can only fit about half of that on my GPU, and with the rest running on the CPU it's pretty slow. Like 2 tokens per second.
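For anyone wondering where numbers like "42GB" and "about half on the GPU" come from, here is a back-of-the-envelope sizing sketch; the bits-per-weight and VRAM figures are assumptions for illustration, not measurements.

```python
# Rough model-size arithmetic (illustrative assumptions, not measurements).
params = 70e9               # Llama 3 70B parameter count
bits_per_weight = 4.8       # typical ~4-5 bits after quantization (assumption)
model_gb = params * bits_per_weight / 8 / 1e9
print(f"Quantized model size: ~{model_gb:.0f} GB")            # ~42 GB

vram_gb = 24                # e.g. a 24 GB consumer GPU (assumption)
on_gpu = min(1.0, vram_gb / model_gb)
print(f"Fraction of the model that fits in VRAM: ~{on_gpu:.0%}")  # ~57%; the rest runs on CPU
```

Whatever spills over to system RAM is served at CPU memory bandwidth, which is why generation speed drops to a couple of tokens per second.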
@refractionpcsx2 2 months ago
Can confirm this does *not* install on a 2000 series RTX card. Tried on my 2080Ti and the installer goes nope.
@jefreestyles 2 months ago
Thanks for showing this! It seems one other downside is that you shouldn't have too many editors open or in use while using it. I wonder what would break first when local compute is maxed out.
@phizc 2 months ago
Ollama is also an interesting option. It supports Linux, Windows, and Mac. AMD support is in preview on Linux and Windows. It sets up a server that can be accessed via an API or a simple cli chat interface.
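For reference, here is a minimal sketch of hitting that local server from Python. It assumes Ollama is running on its default port (11434) and that the llama3 model has already been pulled; the endpoint and field names follow Ollama's documented REST API, but double-check against the current docs.

```python
# Minimal sketch: query a local Ollama server over its REST API (assumptions noted above).
import json
import urllib.request

payload = {
    "model": "llama3",                                   # any locally pulled model name
    "prompt": "In one sentence, what is a vertex shader?",
    "stream": False,                                     # single JSON reply instead of a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])           # the generated text
```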
@rob679 2 months ago
I use the Linux version of Ollama through WSL with OpenWebUI as the frontend; it already has RAG functionality, and everything is installed with basically 3 commands. Llama 3 8B works great, and I can hook it into VS Code through the Continue extension and have a personal local Copilot.
@13thxenos 2 months ago
Came here to comment something similar. Now if OpenWebUI adds the functionality to fine-tune models further...
@rewindcat7927 2 months ago
Is there a good resource for a smooth-brain to get started on this track? Thanks!!! 🙏
@TrolleyTrampInc 2 months ago
@@rewindcat7927 NetworkChuck has recently done a video explaining everything. Set up Ollama and then simply install the Continue extension in VS Code.
@jimmiealencar7636 1 month ago
Would it run well with a 6GB RTX?
@rob679 1 month ago
@@jimmiealencar7636 Yes, but unless you run it natively under Windows, you also need 16GB of system RAM. Llama 3 8B uses about 3.5GB of VRAM on my 3050. And if everything fails, you can always run it on CPU only, but it will be slow.
@quantumgamer6208 2 months ago
Does it work with PyCharm code, like Python and Lua, for game and game engine development?
@Matlockization 1 month ago
That RAG or 'sanity checker' means that it's possible your data is being distributed to 3rd parties for analysis.
@kurtisharen 2 months ago
How does it handle cross-referencing? What happens if you ask the math question, then ask how to calculate the same thing in Godot? It would need to know and understand the first question and how it applies to the second question instead of just looking up a direct answer in the documentation you give it.
@youMEtubeUK 2 months ago
I have this, and while it runs nicely with my 4090, I still use online tools for PDFs and general research. Understandable if you want to keep files private, but with a Chrome extension I can use all the main AI platforms across multiple devices. I also found Gemini Pro 1.5 better for large 700-page PDFs.
@etherealregions2676 2 months ago
This is very interesting 🤔
@djumeau 2 months ago
Does it read image-based PDFs? Or do you have to convert the PDFs into a readable format?
@rob679 2 months ago
Most likely it doesn't; it doesn't state so anywhere, and some people commented on the NVIDIA page that it doesn't see the files.
@MrHannatas 2 months ago
Need this with agents
@nightrain472 2 months ago
I use GPT4All for local LLMs.
@OriginRow 2 months ago
How can I fetch the Unreal Engine docs as PDFs? 🤔
@UltimatePerfection 2 months ago
Getleft or another website downloader, then an HTML-to-PDF converter.
@OriginRow 2 months ago
@@UltimatePerfection Recently they moved the docs to forums LMAO 🥵 It's not working.
@JaxonFXPryer 2 months ago
Dang it... I have so much text in markdown format that is useless for this training data 😭
@scribblingjoe 2 months ago
This actually sounds pretty cool.
@josemartins-game 1 month ago
Turn off the internet. What does it respond with then?
@FusionDeveloper 1 month ago
Neat idea, but the unnecessarily high system requirements make it prohibitive for most people. I can run Ollama with Llama 3 on lower system requirements and make my own GUI.
@dariusz.9119 2 months ago
One thing to add is that ChatRTX requires Windows 11. 70% of the market is on Windows 10, so it's only for a limited number of users.
@MonsterJuiced 2 months ago
Thanks for that lmao, I'm on Win 10 because 11 breaks my dev software and kills my performance. Shame this is Win 11 only.
@sean7221 2 months ago
LMDE 6 is the future, Windows can go to hell
@habag1112 2 months ago
It runs fine on Win 10 for me (using an RTX 3070).
@varughstan 1 month ago
I am running this on Windows 10. Working fine.
@TheSleepJunkie 1 month ago
Get real. I haven't seen a single Windows 10 PC on the market. Not even the cheap ones competing with Chromebooks.
@Saviliana 2 months ago
So Kobold but Nvidia?
@0AThijs 2 months ago
No API, no custom model loading, just a simple UI, no updater... (I have already downloaded it three times to update it, ~30GB each time - yes, 30GB FOR MISTRAL!)
@user-yi2mo9km2s 2 months ago
ChatRTX's installer has lots of bugs that never get fixed. My PC has Win 11 24H2, 192GB DDR5, and a 4090 installed.
@RoughEdgeBarb 2 months ago
This might be the first use case of LLMs I'm interested in. Local is necessary to address the huge environmental cost of GenAI, and the ability to parse your own documentation is interesting.
@gokudomatic 2 months ago
I have a feeling that this thing needs NVIDIA hardware with RTX. My GTX 1060 won't run that.
@FromagioCristiano 2 months ago
At 1:20 the system requirements are shown: GeForce RTX 30/40 Series, RTX Ampere (ones like the RTX A2000, RTX A4000), and the Ada Generation GPUs (but those are not for us mere peasants).
@hipflipped 2 months ago
the 1060 is ancient.
@gokudomatic 2 months ago
@@hipflipped yes, it is. What's your point?
@JARFAST 1 month ago
Does it support the Arabic language?
@kyryllvlasiuk 2 months ago
I've got 2060 with 6 GB :(
@nangld 2 months ago
Too slow, given it is a 7B model running on a GPU.
@judasthepious1499 2 months ago
AI : hallo user, what are you doing? please upgrade your nvidia graphics card.. or you can't continue using our AI service
@ionthedev 2 months ago
Why do they hate Linux so much?
@vi6ddarkking 2 months ago
Well, they've been stupidly simple for a couple of years now with WebUIs like Oobabooga. So this isn't exactly anything impressive.
@pm1234 2 months ago
They're late to the party: no Llama 3, only Windows, only a basic chat interface. Open-source RAG tools are already here.
@r6scrubs126 2 months ago
Did you even watch the first 30 seconds? It's an easier alternative to all the open-source, build-it-yourself ones. I think that's great.
@pm1234 2 months ago
@@r6scrubs126 It would have been great (and still late) if it had all the things I mentioned in my comment. I watched it, THEN commented. The menu shows Llama 2 13B (@2:05), no Llama 3; it's only for Window$ (@1:17), and the chat UI is basic (not even sure it does markdown tables). RAG tools are getting common now. If you're happy because you don't know the open-source tools, no problemo!
@gabrielesilinic 2 months ago
Llama 3 is not even open source by definition, Mistral is doing a better job
@claxvii177th6 2 months ago
Llama 3 isn't open source??
@claxvii177th6 2 months ago
Seriously, I was using it for an entrepreneurial application.
@JasonBrunner-SM 2 months ago
Any HELPFUL comments from those who are already experts on this topic about the better LLMs to use with this from the standpoint of game dev? Since Mike admitted this is not his area of expertise.
@24vencedores11 1 month ago
Nice! but you're too fast man!
@aa-xn5hc 2 months ago
That is a bad app. For example, it cannot take the previous chat into account when answering a follow-up question.
@user-ym6gt8zz4v 2 months ago
Train it on Unreal Engine 5.
@gamefromscratch 2 months ago
You can if you can get a text or PDF version of the documentation, or enough PDF Unreal Engine books. Really it's a matter of dumping as much documentation into your training model folder as you can source.
@jefreestyles 2 months ago
Can one add multiple file/folder locations? Or is it really just one folder that has to be the root? Can it use symbolic links or folder/file shortcuts?
@PurpleKnightmare 2 months ago
OMG, this is way cooler than I thought.
@thesteammachine1282 2 months ago
Win 11 only? Lol, no...
@ryanisanart 2 months ago
more ai stuff pls this is awesome
@impheris 2 months ago
I like some things about AI, but this is getting pretty boring now.
@ronilevarez901 2 months ago
That's like telling a new parent that watching their child breathe must be boring. This is a new type of life developing in front of your eyes. This is history. I do find history boring, but seeing it happen every day is on a different level.