AnythingLLM | The easiest way to chat with your documents using AI | Open Source!

30,696 views

Tim Carambat

1 year ago

Open-source GitHub repo: github.com/Mintplex-Labs/anyt...
AnythingLLM is by far the easiest open-source tool to get started chatting with your documents. Using simple off-the-shelf services like OpenAI's API and a free Pinecone instance, you can be up and running in seconds.
No local LLM setup, no crazy RAM requirements: AnythingLLM will run on a potato but will give you unlimited power.
I built this because running a local LLM that isn't anywhere close to as performant or good as GPT-3.5/4 is just ridiculous. Let's do the easy thing first, shall we?
Comes with data-collection scripts, an awesome UI, and easy setup tooling.
Submit an issue or PR: github.com/Mintplex-Labs/anyt...
#openai #chatgpt4 #chatwithdocs #localgpt #privategpt #gpt #useGPT #gpt4all #ai #aitools #aitools2023 #aitoolsforbusiness #opensource #openaiapi #nodejs #reactjs #opensourcesoftware

Comments: 144
@mmarrotte101 · 1 year ago
Dude THANK YOU. Very excited to test this out today, big time props to you for making this happen!
@bzzt88 · 1 year ago
Excellent project, thanks! This is perfect for ingesting public info I want to use, like making summaries of KZfaq videos, etc. OR chatting with my own data that’s already in public channels.
@heather.zenplify · 1 year ago
Looks amazing! Excited to give it a try. Thank you! 💕
@vitalis · 8 months ago
Wow, for someone who has zero experience with LLMs but was looking for an easy solution to chat with documents this looks fantastic. I really like the folder watch so we can just drop all the documents there at once. Keep it up!
@TimCarambat · 8 months ago
Got some more cool stuff cooking up right now
@Neolisk · 1 year ago
Private documents need to stay private. Sending them to ChatGPT API defeats the purpose.
@Heynmffc · 1 year ago
Not to mention potentially leaking sensitive data 🫠
@vaisakhkm783 · 1 year ago
I would love to see PrivateGPT get a little more polished... this is honestly perfect for that. I don't think running an LLM only when needed is a huge issue; there's just no need to keep it running in the background all the time.
@RandommCatt · 1 year ago
Then... Don't use... Gpt for it...?
@Neolisk · 1 year ago
@@RandommCatt There are GPTs you can run locally. They are just not as good (=complete and utter junk). But this is where the work should be going. Make it good enough to be usable and avoid leaking data to big corps.
@VishnuVardhanS · 1 year ago
@@Neolisk nothing local can match proper cloud GPTs but many offline LLMs are getting good for home users
@uhtexercises · 1 year ago
Great project. Really like it. Thanks so much for open sourcing it.
@dubnet69 · 1 year ago
Epic, bro!!!! OUCH!! I think two more arms just sprouted out of my back. I think I can get used to this!!
@Versole · 8 months ago
Great explanation, Great Video, and Great product. Thank you for making it open source. Not many are willing to release this for free.
@jacobusstrydom7017 · 1 year ago
Dude, that was great. Thanks for this; I'm definitely going to test it out.
@AaronMcCloud_Me · 1 year ago
This looks awesome, and so useful!!! Getting this going right now - I was looking for something like this 🙏🙇‍♂🥳
@JohnDoe-ie9iw · 1 year ago
I was looking for something like this and was going to build something similar. Can't wait to try it out; it looks great.
@ThomasJanzen · 11 months ago
I am looking for a developer who can implement this for me or our firm. Interested?
@vocabotics · 1 year ago
Awesome. Well done on making this.
@ArielTavori · 1 year ago
This looks incredible! Looking forward to local model support. FYI, there is a tool in private beta that will simplify instruction chains, and handles 1 click installs, CPU queue, and memory management across various AI tools. Public release coming soon, possibly within days...
@blackwhite6681 · 1 year ago
Where can we learn more about this?
@bzzt88 · 1 year ago
Please post when available, thanks!
@user-zr3zp2yg4m · 1 year ago
Nice!
@frknue · 1 year ago
You did a great job mate 👏
@toddbrous_untwist · 1 year ago
Awesome work!
@daycentmodel9086 · 10 months ago
Great content. Loved it!!!!!!!!!!!!!!!!! Subscribed!
@Aaronius_Maximus · 9 months ago
Looking forward to checking this project out; it looks awesome! Does it have to use an API from an online service, or can you still host your own model and use this? Thanks for the video!
@mohitbansalism · 9 months ago
Tim - Thanks for the wonderful work. Any plans for including Llama support?
@webdancer · 1 year ago
Great project, I've been looking for something like this. I have a couple of questions though.
@dazzaofficial3591 · 1 year ago
Oh man, please make it work with the Orca LLM! And thank you so much for making a Docker setup!
@yossefdawoad4311 · 1 year ago
I wouldn't say it's the easiest; it's like localGPT v100, but using OpenAI 😢. I like the level of organization; it seems you put in a ton of work. Good job 👍 I'm trying to build something like it, backend-only for now at least, called privategptserver. I'd definitely use some of the awesome work you've done 😂❤
@liog4127 · 1 year ago
Amazing!!!
@justMeFromDe · 1 year ago
The documents are only private with a locally running LLM, not when you are using OpenAI.
@xox14 · 1 year ago
thank you so much!
@MrCyqgz · 1 year ago
Awesome!
@TheLonelyPanther · 5 months ago
Marvelous
@NiranjanAkella · 10 months ago
Hey Tim, I had a question. Say I have a PDF of 1,000 pages; it gets vectorized and stored in the vector database. When I query something, is it checked for similarity against the vector database to retrieve the top contexts, and are those contexts sent along with the query to GPT-3.5 for an answer? If so, then if the document is too big, wouldn't the model crash due to the max sequence limit? How are you handling this? When I try to run inference on Llama 2 with huge texts, it errors out with max sequence length. So how do you handle this?
@crossray974 · 8 months ago
Hi @NiranjanAkella, you only send parts of the document to the model, in chunks.
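A minimal sketch of that idea in Python (chunk size and overlap here are illustrative, not AnythingLLM's actual settings): the document is split into overlapping chunks, each chunk is embedded, and at query time only the top-matching chunks plus the question are sent to the model, so the prompt always fits the context window.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks so each piece fits in the
    model's context window. Overlap keeps sentences that straddle a
    chunk boundary retrievable from either side."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last chunk already reached the end of the document
    return chunks

# A 1,000-page PDF becomes many small chunks; only the few most
# similar ones are ever sent to GPT-3.5, never the whole document.
chunks = chunk_text("x" * 2500, chunk_size=1000, overlap=100)
print(len(chunks))  # 3 chunks: [0:1000], [900:1900], [1800:2500]
```

This is why a huge document never hits the max-sequence-length error: the model only ever sees a handful of chunk-sized excerpts.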
@PatrickSteil · 1 year ago
So if you have an entire website that is like a knowledge base, with lots of articles linked from the homepage, can we give that website as input to AnythingLLM, or what would you suggest for using such a website?
@TimCarambat · 1 year ago
Oh, you mean like a sitemap? Would that work? Or do you want to be able to provide a URL list?
@rustyxof · 1 year ago
Thank you
@leandro_chad · 1 year ago
thanks
@QuOUseTERSEa · 1 year ago
Thank you for your great project! I was wondering how this differs from the AskYourPDF plugin, if I don't care about privacy issues?
@TimCarambat · 1 year ago
It does more than PDFs! The actual file is never sent to a third party, and you can bring your own vector DB, so even your embeddings only go where you want. Among other benefits as well.
@danquixote6072 · 1 year ago
Enjoyed the video and the concept, and although I have limited experience with code, I think I've almost got it up and running. However, perhaps I missed it in the video, but where exactly do you put the OpenAI and Pinecone API keys? Is it the .env.example file in the server directory?
@TimCarambat · 1 year ago
It should exist in a file called .env.development in the /server folder! Don't use .env.example for your real keys! It's just there to show you all the available options!
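For anyone hunting for the file, it looks roughly like this; the variable names below are placeholders of my own, so copy the exact key names from server/.env.example:

```ini
# server/.env.development -- example placeholders only; take the real
# key names from server/.env.example
OPEN_AI_KEY=sk-your-openai-key-here
PINECONE_API_KEY=your-pinecone-key-here
PINECONE_INDEX=your-index-name
```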
@danquixote6072 · 1 year ago
@@TimCarambat Thanks for the response. I'll have a search for it in the morning. Looking forward to giving it a try.
@danquixote6072 · 1 year ago
@naseefmazumder I couldn't get it to work. I'm not a coder, though, and couldn't figure it out.
@sandmancode4787 · 1 year ago
This looks amazing! Are you currently looking for repo maintainers? I would love to help!
@TimCarambat · 1 year ago
Jump in! Plenty of bugs to fix
@user-zr3zp2yg4m · 1 year ago
Just got access to Anthropic... Any plans to expand this to work with multiple LLMs?
@ThomasJanzen · 11 months ago
Thank you for your good work. Is there any way to "train" or "fine tune" the bot to make it respond even better for certain documents?
@TimCarambat · 11 months ago
There are some tips and tricks. It comes down to how the document is split, prompt engineering specific to the document, and also separating documents by scope in a workspace. I was going to make a video on this to show what I mean!
@rossanobr · 11 months ago
Awesome project. It would be nice to make it work in a Vercel deployment.
@0ZeroTheHero · 1 year ago
Can't get it to work on Windows 10. It would need more thorough setup instructions.
@luizbueno5661 · 1 year ago
Wonderful work! I almost got a taste of it, but when I send a chat message I get "Could not send chat." I checked the API keys and they are RED, with "Ensure all fields are green before attempting to use AnythingLLM or it may not function as expected!" in red. I've tried everything I could think of. Could anyone help?
@CarlosHernandez-ki6tv · 1 year ago
I can't download "pyobjc-framework-AddressBook" because it is for macOS. Is there a Windows version?
@galgol23 · 10 months ago
It's a great project, thanks :) Question: the collector can ingest a YouTube channel, which is great, but what if I want to insert only specific YouTube videos and not scrape all of a channel's videos? Most authors have videos on subjects different from what I want. What can I do? Thanks
@MalikNfkt · 1 year ago
Free accounts for Pinecone are on a waitlist, and the paid version starts at $70 a month. Is there any way to run this without Pinecone? If not, what is the best open-source LLM to run on a non-GPU PC without access to Pinecone?
@TimCarambat · 1 year ago
Yes, you can run Chroma (locally) and also LanceDB (also local).
@MalikNfkt · 1 year ago
@TimCarambat Thank you, Tim! I tried to install SuperAGI and the only option it gives is Pinecone. What do you think is the best LLM right now that will run with Chroma or LanceDB? Also, do I need a GPU to run those databases locally?
@TimCarambat · 1 year ago
@MalikNfkt Great questions. Neither Chroma nor Lance requires a GPU; they simply need a CPU and a nominal amount of RAM, nothing that a low-to-mid-tier machine would be lacking. Open-source LLMs almost all require a GPU, aside from GPT4All. Those CPU LLMs do have crazy resource overheads, however, so running both on the same machine will likely be a problem. LLM choice has no impact on which vector DB you choose; it's more about the LLM's context window size limit.
@Canna_Science_and_Technology · 1 year ago
Well, I guess I'm too new to this. I was excited and following along until, unfortunately, the beginning: setting up the server. What window are you in? Yarn? zsh? I use Python in VS Code, so I'm thinking this is not for me. So many different ways to do this nowadays.
@TimCarambat · 1 year ago
I'll make a pure dev-setup tutorial so it's easier to follow; this was just a high-level overview!
@sitedev · 1 year ago
@@TimCarambat that would be great.
@shubham900100 · 10 months ago
@TimCarambat Reminder!!!
@erickbravo5800 · 4 months ago
So can this only be run locally? What if I want to include it in a web app for employees to use?
@TimCarambat · 4 months ago
No, you can run this in Docker in an EC2 instance or behind a private gateway/IP and it would be available via a browser!
@dawsongrattidge1695 · 1 year ago
Can you add support and setup instructions for Milvus?
@rosszhu1660 · 1 year ago
This is cool! However, I just couldn't make it run on Windows.
@user-zr3zp2yg4m · 1 year ago
Broskillet, how do I get the dark theme? 👽
@thumperhunts6250 · 7 months ago
Where is the link to your discord?
@Therapills-oj2gv · 10 months ago
This is such a powerful tool; I installed it using Docker on my Windows computer. But I face two important problems: when I upload a PDF file, it adds it page by page, unlike in the video, and when I try to upload the same PDF again, it doesn't tell me that it has already been uploaded and processes it all over again. Is there anything I did wrong?
@TimCarambat · 10 months ago
As of about 4 days ago, all PDFs are grouped into their own folder. Re-uploading them will not overwrite the existing files but will make a new folder. Once files are embedded they are cached, so unless you are trying to embed the same duplicate file, you wouldn't have to re-embed.
@Yewbzee · 1 year ago
So how much does it cost initially for it to go through documents? For example, if you have thousands of pages of PDFs.
@TimCarambat · 1 year ago
It depends on the amount of text. AnythingLLM will tell you **exactly** how much it will cost before you pay. To ballpark the total all-in cost, do $(0.0004/1000) * (word count). With AnythingLLM this is a one-time fee because we store the results (locally), so you don't have to re-embed when you want to use them in multiple workspaces.
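That ballpark formula as a quick sketch (embedding is actually priced per token, and a word is roughly 1.3 tokens, so treat this as a rough lower-bound estimate, not exact pricing):

```python
def estimate_embedding_cost(word_count: int, price_per_1k: float = 0.0004) -> float:
    """Ballpark one-time embedding cost: (price per 1k tokens / 1000) * words.
    Real pricing is per token, so the true cost is usually a bit higher."""
    return (price_per_1k / 1000) * word_count

# Thousands of pages, say 500,000 words, is still only about 20 cents:
print(f"${estimate_embedding_cost(500_000):.2f}")  # $0.20
```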
@Yewbzee · 1 year ago
@TimCarambat OK, thanks 👍
@GreenKeysLLC · 1 year ago
Did I miss something? I could not get this installed on Windows. All kinds of errors: I had to install Node to install Yarn, then "yarn setup" attempted to use the cp command, which is not available on Windows, etc. One thing after another; it feels like I'm going down a rabbit hole trying to get this installed on a Windows 11 machine.
@TimCarambat · 1 year ago
You can use Windows WSL; there will be a cloud version soon so you can spin up your own isolated instance. Hopefully Dockerized too, for easy local running cross-OS.
@webdesignleader · 1 year ago
@@TimCarambat Cloud version! Dockerized multi-OS local app! Yay! Looking forward to it. Thanks bud!
@tarmicachiwara2973 · 1 year ago
This is so good! May I have Windows instructions?
@TimCarambat · 1 year ago
It should function similarly if you have a command line (not PowerShell) available. Something like Git Bash or the like should work here. I can work on that aspect soon.
@theawebster1505 · 1 year ago
@TimCarambat "error: PyObjC requires macOS to build". The idea is great, but it's a pain to install on Windows right now. And if I were a Linux power user, I'd connect to OpenAI's API myself / with ChatGPT's help. Thanks for... the idea; it's currently not accessible to the regular Windows guy.
@CVolkwein · 11 months ago
@TimCarambat It says that LanceDB doesn't support Windows operating systems, and I can't seem to make it use other vector databases.
@TimCarambat · 11 months ago
@CVolkwein Do you have a Pinecone.io account? This tool works with Pinecone, Chroma, and Lance.
@CVolkwein · 11 months ago
@TimCarambat I added the Pinecone server details in the .env, including the API key, and chose Pinecone as the vector database. I also tried using the configuration menu in the user interface to apply the same information; it worked neither time. Sorry to bother you; any help is appreciated. Thank you for replying.
@DeanDogDXB · 1 year ago
I tried to run the installer and setup on Windows and got errors from installing the requirements. It turns out that it needs macOS to build the PyObjC dependencies, so I can't run this on a Windows machine?
@TimCarambat · 1 year ago
Removing that dependency this week so you can!
@DeanDogDXB · 1 year ago
@TimCarambat Appreciate it. How and where can I follow to see updates?
@thenewdesign · 1 year ago
What's the Discord link?
@snehasissnehasis-co1sn · 6 months ago
How do I deploy AnythingLLM with Docker on a Windows PC? Please make a step-by-step video that includes all the commands.
@mohitsethi8934 · 1 year ago
Can the chat be embedded into a website?
@TimCarambat · 1 year ago
It's locally hosted, so that wouldn't be possible. We'll have a cloud solution soon, so yes, but only then.
@adrianpetrescu8583 · 1 year ago
Man, it's a good idea, but... say I have 5 books of 200 pages each. What will the cost be? And also the cost of GPT-4... why not try to get the same with open and free options?
@TimCarambat · 1 year ago
Chroma is free and can embed this for free (Chroma is now supported, btw!). As for encoding, that is not GPT-4! That is done with text-embedding-ada-002, which is $0.0004/1k tokens, and with AnythingLLM's vector caching it's a one-time cost. AnythingLLM will also estimate the cost before embedding, so it's not a surprise! As for chatting, you can use Davinci, GPT-3.5 Turbo, or GPT-4. Local LLMs leave a lot to be desired, but support is coming for some, so it's totally no-cost!
@user-th7gd7ge4p · 1 year ago
Are you conducting a symphonic orchestra hidden behind the camera? Or is your karate sparring partner hiding behind the cam?
@PatrickMetzdorf · 1 year ago
OK, but you cannot "run it privately" as you claim, which is the whole point of the other tools you compare this to; they allow you to run everything locally and not send your files to a third party like OpenAI or Pinecone. So yes, this is an easy UI wrapper around LangChain, the simplest possible LangChain setup, because it just uses the third-party providers. But that's not an advantage for most people who need to deal with sensitive docs.
@TimCarambat · 1 year ago
Run your own Chroma on AWS + use the Azure OpenAI API, and you get the same results and performance but totally compliant (ISO-xyz, SOC 2, etc.). Will be supporting local-LLM APIs soon. I don't disagree with your sentiment at all; ideally everything is local or isolated so no data leaks.
@PatrickMetzdorf · 1 year ago
@TimCarambat Yeah, nowadays most LangChain-based solutions run locally fine, including Chroma and some vector DBs. The issue is running the model, but with Vicuna, Falcon, and the like, even that can execute on a local machine these days. Self-hosting is the way to go for most companies; private individuals are probably fine with third-party providers, though.
@TimCarambat · 1 year ago
@PatrickMetzdorf Agreed. Enterprise will always want their own non-shared, private instance of whatever their AI-tooling stack requires; that's when I think those tools become very powerful and necessary. For the layperson, APIs work, **for now.
@tuliomop · 1 year ago
Sweet work, Tim. Maybe the local docs link could point to localhost:8000, e.g. "python -m SimpleHTTPServer 8000" (or "python -m http.server 8000" on Python 3; I haven't Python-ed much lately). Subscribed.
@crytex1747 · 1 year ago
Why don't you Dockerize it?
@TimCarambat · 1 year ago
Just not a Docker maxi. Probably a good idea in this case, though.
@mjackstewart · 1 year ago
Do you have an OF?
@snehasissengupta2773 · 6 months ago
How do I install AnythingLLM on a Windows PC? Please make a video with all the steps clearly explained.
@abs884 · 1 year ago
But I still need to pay for OpenAI?
@TimCarambat · 1 year ago
Yes, but in the future it will support local LLMs.
@mediastreamview9528 · 10 months ago
Don't trust anything that requires the Internet or the OpenAI API. If you can demonstrate how to use this with LM Studio running locally as an OpenAI-compatible server, then it would interest me.
@shephusted2714 · 1 year ago
It would be better to run everything locally, with only local API calls and unlimited tokens. This approach is the way to go, particularly for the SMB sector, where they have lots of documents. Things are shifting away from OpenAI and big-tech AI to fully open-source AI.
@TimCarambat · 1 year ago
The resources and overhead for an SMB to do this are a bit much. Any LLM with decent performance runs on a GPU, which you'd have to provision or spot-instance. At the end of the day, is an SMB realistically going to hire an architect to engineer this, or are they just going to use an API? Also, the Azure cloud offers provisioned and compliant GPT models, no setup required. I don't disagree with your angle at all. Ideally it's all local with low resource requirements, and with advancements we might get there, but today that isn't the case, so this tool works with what's easiest and lowest-resource on local machines.
@webdesignleader · 1 year ago
How to switch to Dark mode?
@TimCarambat · 1 year ago
It's based on system preferences, so if you use dark mode in your system or browser it should appear. Can make a toggle for it as well.
@webdesignleader · 1 year ago
@TimCarambat Nice! Not sure how that works cross-platform. I'm on Linux Mint (a popular Ubuntu fork from France), which is simply dark by default, but many apps require reconfiguration and sometimes have visual glitches when they try their light styles... Yours looks fine, but I didn't see in the code where it switches from light to dark, so I just left it for now.
@harryknoll · 1 year ago
Where’s your discord?
@TimCarambat · 1 year ago
Top of the GitHub README.
@zen-studio · 1 year ago
Can multiple users use this?
@TimCarambat · 1 year ago
This is a single-user project only at this time.
@iseverynametakenwtf1 · 1 year ago
0 stars? I'm gonna sit it out for a while
@adityamahakali7703 · 1 year ago
Anything using an LLM is not open source unless the model itself is available for unrestricted commercial usage.
@dik9091 · 1 year ago
OpenAI key = bye bye, smell you later.
@tuapuikia · 1 year ago
Those who consider running a private GPT must either be wealthy or have no idea how much it would cost to host it locally. 😅😅😅
@FreddieMare · 10 months ago
I'm still looking for a truly free system; everything costs money to date. A totally offline system, totally free.
@TimCarambat · 10 months ago
To accomplish that you would need to run an embedding service, an LLM, and a local vector DB, all at the same time. Not impossible, but also not exactly something that is easy to set up or that you can run without a GPU. That wasn't the goal of this project, but it's certainly possible; you would just have to download the GBs of models required to run that stuff locally.
@marcinros2390 · 1 year ago
Just a slightly crappy frontend to ChatGPT. In a couple of months these kinds of tools will be history.
@TimCarambat · 1 year ago
Yep, you're right. That's why it's open source and not a business I'm charging for.
@SkyyySi · 1 year ago
As soon as you said "OpenAI API", you lost me. I personally really don't want to send all my documents to OpenAI. I mean, you use a Mac, so it doesn't surprise me that you don't care, but I'd rather not use something like this at all than do that.
@TimCarambat · 1 year ago
Ok
@bubelevakalisa7313 · 1 year ago
👌🏾
@Hoopscircle · 1 year ago
That is dope. I would like to collaborate; I am building a mobile app that lets users upload their game stats, and based on what you showed, this is a perfect solution to integrate into my app. Let's chat!